Papers

  • Fine-tuning with LoRA

    December 18, 2024

    Introduces LoRA (Low-Rank Adaptation), a method for efficiently adapting large pre-trained language models. By freezing the main model and training small, low-rank matrices, LoRA significantly reduces the number of trainable parameters (by up to 10,000 times compared with full fine-tuning) while maintaining or improving task performance. It lowers memory requirements and introduces no extra inference latency. LoRA matches or outperforms full fine-tuning on various tasks, showing that large models can be adapted with far fewer resources. The method's simplicity allows seamless integration with existing architectures without compromising quality.
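    The core mechanic can be sketched in a few lines of NumPy. This is a minimal illustration of the low-rank update, not the paper's implementation; the layer sizes, rank, and names below are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4  # rank r is much smaller than the layer dims

# Frozen pre-trained weight: never updated during adaptation.
W = rng.standard_normal((d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially behaves exactly like the frozen one.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))

def forward(x):
    # Adapted layer: W x + B A x; only A and B receive gradients.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(forward(x), W @ x)  # identical before any training

# Trainable-parameter comparison: full fine-tuning vs. LoRA factors.
full_params = W.size              # 4096
lora_params = A.size + B.size     # 512: an 8x reduction even at this tiny size
print(full_params, lora_params)
```

    Once training is done, the product B A can be folded into W, which is why the method adds no extra inference latency.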
  • Categorizing Languages Through Metadata

    December 11, 2024

    Presents LinguaMeta, a project that unifies metadata for over 7,500 languages, covering language codes, speaker counts, writing systems, and regions. While metadata for widely spoken languages is robust, gaps remain for smaller, endangered languages. LinguaMeta offers comprehensive, traceable data aimed at supporting technology development for underrepresented languages.
  • AI Medical Chatbot

    This paper was actually written by me! It introduces an AI medical chatbot trained on real patient-doctor conversations. Using advanced language models, it combines generative AI with retrieval-based methods like BERT to deliver accurate and context-aware responses. The chatbot aims to provide accessible preliminary healthcare insights, emphasizing that it is not a substitute for professional advice. This study highlights how fine-tuning large models for specific domains, paired with ensemble techniques, can improve conversational AI for real-world healthcare applications.
  • Subgraphs in Road Networks

    November 27, 2024

    Explores creating compact yet versatile road network models for navigation systems. It proposes algorithms for extracting minimal subgraphs that preserve near-optimal routes under diverse travel preferences (e.g., shortest time, avoiding highways). The proposed greedy algorithm can reduce subgraph size by up to 60% compared to existing methods while maintaining accuracy. The research highlights how smaller subgraphs make computationally intensive tasks, like dynamic routing for vehicle logistics, more tractable, making navigation systems more adaptable and efficient.
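    The idea of keeping only edges that lie on some optimal route can be illustrated with a toy sketch. This is a heavily simplified stand-in for the paper's actual algorithm; the graph, costs, and function names below are invented:

```python
import heapq

def dijkstra_edges(adj, src, dst, pref):
    # adj: {node: [(neighbor, {preference: cost, ...}), ...]}
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, costs in adj[u]:
            nd = d + costs[pref]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    edges, node = [], dst
    while node != src:             # walk back along the optimal route
        edges.append((prev[node], node))
        node = prev[node]
    return edges[::-1]

def union_subgraph(adj, od_pairs, prefs):
    # Keep only edges that appear on an optimal route for some
    # origin-destination pair under some travel preference.
    kept = set()
    for s, t in od_pairs:
        for p in prefs:
            kept.update(dijkstra_edges(adj, s, t, p))
    return kept

# Toy network with per-edge costs under two preferences.
adj = {
    "A": [("B", {"time": 1, "no_hwy": 5}),
          ("C", {"time": 4, "no_hwy": 1}),
          ("D", {"time": 10, "no_hwy": 10})],  # never optimal: pruned
    "B": [("D", {"time": 1, "no_hwy": 5})],
    "C": [("D", {"time": 4, "no_hwy": 1})],
    "D": [],
}
sub = union_subgraph(adj, [("A", "D")], ["time", "no_hwy"])
print(sorted(sub))  # the direct A-D edge never survives
```

    The edges no preference ever uses drop out, which is the source of the size reduction.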
  • Optimization of Quantum Measurement

    November 20, 2024

    Illustrates a way to improve how accurately we measure tiny quantum systems used in advanced computing. These systems, called superconducting circuits, are extremely sensitive and prone to errors during measurements. The researchers developed a smarter, faster method to fine-tune the measurement process, reducing errors to just 1.5% while keeping the system stable. This breakthrough could make quantum computers more reliable and able to handle bigger, more complex problems in the future.
  • Connecting Languages with Technology

    November 13, 2024

    This research explores the vast, untapped potential of data for thousands of languages, emphasizing that the main barrier to building language technology isn’t scarcity but the scattered nature of resources. A key insight reveals that, with better aggregation and community involvement, we could harness existing data to support language technologies for many under-resourced languages, making digital tools more accessible globally.
  • Improvements with Beam Search

    November 6, 2024

    Discusses methods to improve confidence estimation in generative sequence labeling, a process used for tasks like entity extraction in AI. Traditional models rely on token-level probabilities, an approach that may miss the full scope of uncertainty. The authors propose using beam search statistics, leveraging multiple prediction candidates to better gauge model confidence. A key finding shows that methods like "Aggregated Sequence Probability" and "Adaptive Aggregated Sequence Probability" reduce errors in confidence estimation, making predictions more reliable across various datasets. This improvement has practical implications for applications needing precise AI outputs, like virtual assistants or search engines.
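    One plausible reading of the aggregation idea, sketched with invented numbers (the paper's exact estimators may differ): sum the probability mass of every beam candidate that agrees with the top prediction, normalized by the total mass the beam captured.

```python
from collections import defaultdict

def aggregated_confidence(beams):
    """beams: (label_sequence, sequence_probability) pairs from beam
    search, best first. Rather than trusting only the top beam's own
    probability, aggregate the mass of all beams that predict the
    same labels, normalized by the total mass the beam captured."""
    mass = defaultdict(float)
    for labels, p in beams:
        mass[tuple(labels)] += p
    top_labels = tuple(beams[0][0])
    total = sum(p for _, p in beams)
    return mass[top_labels] / total

# Toy beam output for a three-token entity-tagging query.
beams = [
    (["ORG", "O", "LOC"], 0.40),
    (["ORG", "O", "LOC"], 0.15),  # distinct beam, same label sequence
    (["ORG", "O", "O"],   0.25),
    (["PER", "O", "LOC"], 0.10),
]
conf = aggregated_confidence(beams)
print(round(conf, 3))  # 0.55 / 0.90 = 0.611
```

    The top beam alone would report 0.40; pooling agreeing candidates gives a better-calibrated estimate of how much the model actually favors that labeling.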
  • Human-Algorithm Collaboration

    October 30, 2024

    Explores human-algorithm collaboration, specifically the setting where an algorithm provides a shortlist and a human makes the final choice. The study finds that collaboration (where the algorithm suggests a subset of options rather than making a solo decision) often improves outcomes when human and algorithmic errors are independent. Presenting a shortlist of items, rather than a single option, can improve accuracy thanks to complementary strengths in decision-making, especially when neither the human nor the algorithm is perfect.
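    The arithmetic behind this can be made concrete with stylized, invented numbers (not taken from the paper):

```python
# Suppose the algorithm's single best guess is right 70% of the time,
# its 3-item shortlist contains the right answer 95% of the time, and
# an independent human shown a shortlist containing the right answer
# picks it 85% of the time. All three rates are hypothetical.
p_top1, p_topk, p_human = 0.70, 0.95, 0.85

solo = p_top1                 # algorithm decides alone
joint = p_topk * p_human      # algorithm filters, human chooses
print(solo, round(joint, 4))  # 0.7 0.8075 -> collaboration wins

# If the human is too unreliable, the shortlist step cannot compensate
# and the algorithm alone does better.
weak_joint = p_topk * 0.60
print(round(weak_joint, 2))   # 0.57 < 0.7
```

    The comparison illustrates the paper's condition: collaboration pays off only when the human's within-shortlist accuracy is high enough relative to the algorithm's solo accuracy.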
  • Accessible AI for Creative Tasks

    October 23, 2024

    Describes how generative AI (GAI) is transforming creative practices, especially for people with disabilities. Through interviews with 10 creatives with various disabilities, the study reveals how they adapt GAI to enhance both accessibility and artistic expression across different mediums, from painting to audio engineering. The participants shared insights on balancing creative practices with accessibility hacks, offering a unique perspective on integrating AI into artistic workflows.
  • Length Generalization in Large Language Models

    October 16, 2024

    Explores how large language models (LLMs) handle "length generalization," the ability to solve longer problems using knowledge from shorter ones. The study finds that fine-tuning alone is ineffective at improving this skill, even with larger models. However, a "scratchpad" method, where models break tasks down into explicit intermediate steps, significantly enhances performance, especially when combined with in-context learning (learning from a few examples). LLMs improve length generalization more through in-context learning than through traditional fine-tuning, offering a new approach to reasoning tasks like math and code execution.
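    A scratchpad-style prompt might look like the sketch below (the format is invented for illustration; the paper's exact scratchpad format may differ). Each in-context example spells out the digit-by-digit procedure rather than just the answer:

```python
def scratchpad_example(a, b):
    """Render one addition example with its intermediate steps written
    out, so the model sees the procedure, not just the result."""
    lines = [f"Input: {a} + {b}", "Scratchpad:"]
    carry = 0
    da, db = str(a)[::-1], str(b)[::-1]          # least-significant first
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        carry, digit = divmod(x + y + carry, 10)
        lines.append(f"  digit {i}: {x} + {y} + carry -> write {digit}, carry out {carry}")
    if carry:
        lines.append(f"  final carry -> write {carry}")
    lines.append(f"Answer: {a + b}")
    return "\n".join(lines)

# A few short in-context examples followed by a longer query, the
# setting where scratchpad plus in-context learning helps most.
prompt = "\n\n".join([
    scratchpad_example(12, 39),
    scratchpad_example(85, 7),
    "Input: 6428 + 1975\nScratchpad:",
])
print(prompt)
```

    The short worked examples teach the step format; the model is then asked to apply the same procedure to an input longer than anything it saw.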
  • Merging Large Language Models

    October 9, 2024

    Focuses on efficiently creating high-performing large language models (LLMs) by merging multiple existing fine-tuned models rather than training new models from scratch. The challenge is finding methods that combine capabilities from various specialized models to generalize across tasks without extensive retraining, saving significant computational resources. An accompanying competition encourages participants to merge models under an 8-billion-parameter limit using minimal compute. Merging expert models can potentially outperform existing models while dramatically reducing the cost and complexity of training large models from scratch.
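    The simplest merging baseline, uniform parameter averaging, can be sketched as follows (a generic illustration, not the competition's required method; all names are hypothetical):

```python
import numpy as np

def merge_state_dicts(models, weights=None):
    """Uniform (or weighted) parameter averaging, one common merging
    baseline. Assumes all models share an architecture, so their
    state dicts have identical keys and shapes."""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    return {k: sum(w * m[k] for w, m in zip(weights, models))
            for k in models[0]}

# Two toy "expert" checkpoints with the same parameter layout.
expert_a = {"layer.w": np.array([1.0, 2.0]), "layer.b": np.array([0.0])}
expert_b = {"layer.w": np.array([3.0, 4.0]), "layer.b": np.array([1.0])}
merged = merge_state_dicts([expert_a, expert_b])
print(merged["layer.w"], merged["layer.b"])  # [2. 3.] [0.5]
```

    The merge costs only one pass over the weights, no gradient steps, which is why merging is so much cheaper than retraining; more sophisticated schemes weight or align parameters before combining them.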
  • Transformer-based Video Generation

    October 2, 2024

    Introduces VideoPoet, a model for generating high-quality videos from various input signals like images, text, and audio. It uses a transformer-based architecture similar to large language models, allowing it to generate videos in a 'zero-shot' manner, meaning it can create videos without being specifically trained for each task. A key finding is that VideoPoet outperforms existing video generation models, especially in creating fluid, realistic motions by incorporating multimodal inputs and a two-stage training process.
  • Attention is All You Need

    September 25, 2024

    Introduces the Transformer, a groundbreaking architecture in AI for processing sequences. Unlike traditional models that rely on complex recurrence or convolution, the Transformer uses only attention mechanisms to handle dependencies in data. This design allows for faster training and better performance in tasks like language translation. One key finding is that the Transformer outperforms previous state-of-the-art models in translation while requiring significantly less computational time. This is one of the most important papers to be released in recent history!
  • Bias in Natural Language Processing

    September 18, 2024

    Addresses the issue of bias in Natural Language Processing (NLP) models, which can produce harmful stereotypes and unequal outcomes for different social groups. Despite efforts to assess and mitigate these biases, current measurement methods are flawed. The authors propose using psychometrics, a field that measures abstract concepts, to improve how biases in NLP are evaluated. They focus on two key psychometric concepts: construct validity (ensuring measures capture the intended bias) and reliability (ensuring consistent results). One key finding is that NLP bias measures need more reliable and valid tools to prevent hidden biases from causing unintended societal harm.
  • Text Summarization

    September 11, 2024

    Examines the balance between redundancy and cohesion in extractive summarization, particularly for long, redundant texts like scientific papers. Two systems are introduced: one reward-based, optimizing for cohesion and informativeness, and another unsupervised, using psycholinguistic theories to simulate human memory. The reward-guided approach produces more cohesive summaries, though sometimes at the cost of informativeness. A key finding is that models focusing on cohesion create more structured and readable summaries, and can maintain or even improve informativeness, compared to those aimed solely at reducing redundancy.
  • Language Models as a Service

    September 4, 2024

    Discusses 'Language Models as a Service' (LMaaS), which refers to the use of powerful language models offered through APIs or web interfaces. LMaaS presents challenges such as limited accessibility, reproducibility, reliability, and trustworthiness due to its black-box nature and commercial restrictions. These challenges hinder efforts to understand and control these models. A key finding is that LMaaS exacerbates inequalities, as its pay-per-use model disproportionately affects lower-resource users, making it difficult for these groups to benefit from advances in AI.