QDoRA

How does QDoRA enhance QLoRA?

QDoRA replaces LoRA with DoRA inside the QLoRA framework, taking a more refined approach to fine-tuning: each pretrained weight matrix is decomposed into two components, a magnitude and a direction, and the low-rank update is applied to the directional component. The approach was introduced in the Answer.AI project "Efficient finetuning of Llama 3 with FSDP QDoRA," led by Kerem Turgutlu. It also uses FSDP (Fully Sharded Data Parallel) to shard the model and train it across multiple GPUs, making the setup even more efficient.
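To make the decomposition concrete, here is a minimal sketch of a DoRA-style linear layer (illustrative only, not the official QDoRA implementation; the class and variable names are invented for this example). The frozen base weight supplies the initial direction, a pair of low-rank factors updates that direction, and a small learnable vector carries the magnitude:

```python
import torch


class DoRALinearSketch(torch.nn.Module):
    """Illustrative DoRA-style linear layer: frozen base weight W0,
    trainable low-rank factors (A, B) for the direction, and a
    trainable per-output-channel magnitude m."""

    def __init__(self, W0: torch.Tensor, rank: int = 8):
        super().__init__()
        out_features, in_features = W0.shape
        # Frozen base weight; in QDoRA this would be the quantized pretrained weight.
        self.register_buffer("W0", W0)
        self.A = torch.nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(out_features, rank))
        # Magnitude starts as the row-wise norm of the base weight, so the
        # layer is initially equivalent to the pretrained one.
        self.m = torch.nn.Parameter(W0.norm(p=2, dim=1, keepdim=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Direction: base weight plus low-rank update, normalized per output row.
        V = self.W0 + self.B @ self.A
        norm = V.norm(p=2, dim=1, keepdim=True).detach()  # treated as a constant to save memory
        W = self.m * (V / norm)  # recombine learned magnitude with normalized direction
        return x @ W.T
```

Because the magnitude and direction are trained separately, the optimizer can rescale an output channel without rotating it (or vice versa), which a single low-rank update cannot do.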

This decomposition allows more precise weight adjustments during fine-tuning than QLoRA, whose single low-rank update cannot adjust magnitude and direction independently. By combining the decomposition with quantization of the frozen base weights, QDoRA approaches the flexibility of full fine-tuning while retaining the memory efficiency of QLoRA.

This enables QDoRA to outperform QLoRA on a range of tasks by making more granular updates to the model weights, bringing its quality closer to full fine-tuning without significantly increasing memory or compute requirements.
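The project described above provides its own FSDP-based training scripts; as a rough approximation of the same recipe with the standard Hugging Face stack (assuming recent transformers, peft, and bitsandbytes versions that support DoRA on 4-bit layers), one could pair a 4-bit quantized base model with a DoRA-enabled LoRA adapter. The model id and hyperparameters below are placeholders, not values from the original project:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit NF4 quantization (the QLoRA-style frozen base).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # example model id
    quantization_config=bnb_config,
    device_map="auto",
)

# A LoRA adapter with use_dora=True switches on the magnitude/direction
# decomposition, giving a QDoRA-like setup on top of the quantized weights.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
    use_dora=True,
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the adapter and magnitude parameters train
```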

In the project's tests, QDoRA was used to fine-tune Llama 2 7B and Llama 3 8B on the Orca-Math dataset, outperforming both QLoRA and full fine-tuning while using less memory. This makes QDoRA well suited for developers with limited GPU resources.