
Advanced LoRA Fine-Tuning: How to Pick LoRA, QLoRA, DoRA, PiSSA, OLoRA, EVA, and LoftQ for LLMs

Deep Dives

Explore related topics with these Wikipedia articles, rewritten for enjoyable reading:

  • Low-rank approximation

    LoRA fundamentally relies on low-rank matrix factorization to reduce parameter count. Understanding the mathematical foundations of low-rank approximation helps readers grasp why decomposing weight matrices into smaller components preserves model capability while dramatically reducing memory.

  • Quantization (signal processing)

    QLoRA, QDoRA, and LoftQ all involve quantizing model weights to reduce memory. This article explains the core concept of mapping continuous values to discrete levels, which is essential for understanding the precision tradeoffs these techniques make.

  • Singular value decomposition

    SVD is the mathematical technique underlying several LoRA variants like PiSSA and OLoRA, which use principal components for initialization. Understanding SVD helps readers appreciate how these methods identify the most important directions in weight space (a short sketch after this list illustrates the low-rank and quantization ideas).
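To make these ideas concrete, here is a toy NumPy sketch (an illustration only, not code from LoRA or any of the variants discussed below): it builds a rank-r approximation of a weight matrix with a truncated SVD, counts how many parameters the two low-rank factors need compared with the dense matrix, and applies a crude 4-bit absmax quantization to show the precision loss that quantized fine-tuning trades for memory.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 1024, 1024, 8                       # toy weight matrix and a small rank

W = rng.standard_normal((d, k)).astype(np.float32)

# Low-rank approximation via truncated SVD: keep only the r largest singular values.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
B = U[:, :r] * S[:r]                          # d x r factor
A = Vt[:r, :]                                 # r x k factor
W_r = B @ A                                   # best rank-r approximation of W

print("dense params:   ", d * k)              # 1,048,576
print("low-rank params:", r * (d + k))        # 16,384 (~1.6% of the dense count)

# Crude symmetric 4-bit (absmax) quantization, to illustrate the precision loss
# that QLoRA-style methods trade for memory; real NF4 quantization is more refined.
scale = np.abs(W).max() / 7                   # map weights to integer levels -7..7
W_dequant = np.round(W / scale).clip(-7, 7) * scale
print("mean abs quantization error:", float(np.abs(W - W_dequant).mean()))
```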


By Benjamin Marie


When it’s done well, LoRA can match full fine-tuning while using a fraction of the memory.

It was introduced in 2021, when open LLMs were scarce and relatively small. Today, we have plenty of models, from a few hundred million to hundreds of billions of parameters. On these larger models, LoRA (or one of its variants) is often the only practical way to fine-tune without spending $10k+.

Originally, LoRA was meant to train small adapters on top of the attention blocks of LLMs. Since then, the community has proposed many optimizations and extensions, including techniques that work with quantized models.
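As a minimal sketch of that original recipe (assuming Hugging Face transformers and PEFT, and a Llama-style checkpoint whose attention projections are named q_proj, k_proj, v_proj, and o_proj; the model ID below is a placeholder to replace with your own):

```python
# Classic LoRA: freeze the base model and train small rank-r adapters
# on the attention projections only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_id = "your-base-model"  # placeholder: any causal LM checkpoint

model = AutoModelForCausalLM.from_pretrained(model_id)

lora_config = LoraConfig(
    r=16,                      # adapter rank
    lora_alpha=32,             # scaling applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention blocks
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

Modern recipes often also target the MLP projections, but the attention-only setup above is the original one.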

In this article, we’ll look at the most useful, modern approaches to LoRA for adapting LLMs to your task and budget. We’ll review (Q)LoRA, (Q)DoRA, PiSSA, EVA, OLoRA, and LoftQ, compare their performance (with and without a quantized base model, when that’s relevant), and discuss when to pick each method. All of them are implemented in Hugging Face PEFT and can be used for fine-tuning with TRL.
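To give a feel for how these methods map to configuration, here is a hedged sketch based on recent versions of PEFT; the exact option strings (for example init_lora_weights="pissa") and the EVA/LoftQ helper configs should be checked against the documentation of your installed version, and the model ID is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, LoftQConfig, EvaConfig

common = dict(r=16, lora_alpha=32, target_modules="all-linear", task_type="CAUSAL_LM")

lora  = LoraConfig(**common)                                 # plain LoRA
dora  = LoraConfig(use_dora=True, **common)                  # DoRA (QDoRA on a 4-bit base)
pissa = LoraConfig(init_lora_weights="pissa", **common)      # PiSSA: SVD-based init
olora = LoraConfig(init_lora_weights="olora", **common)      # OLoRA: QR-based init
eva   = LoraConfig(init_lora_weights="eva",                  # EVA: data-driven init
                   eva_config=EvaConfig(), **common)         # (needs an extra init pass on your data)
loftq = LoraConfig(init_lora_weights="loftq",                # LoftQ: quantization-aware init
                   loftq_config=LoftQConfig(loftq_bits=4), **common)

# For the "Q" variants (QLoRA, QDoRA), the frozen base model is loaded in 4-bit:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "your-base-model",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
```

Any of these configs can then be handed to TRL’s SFTTrainer through its peft_config argument; pairing them with the 4-bit base model above is what turns LoRA and DoRA into QLoRA and QDoRA.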

You can find my notebook showing how to use these techniques here:

Get the notebook (#187)
