Takeaways from Hundreds of LLM Finetuning Experiments with LoRA

LoRA (low-rank adaptation) is a widely used technique for finetuning custom LLMs. Drawing on hundreds of experiments, the author offers practical guidance on questions such as whether QLoRA's memory savings are worth the tradeoffs, whether AdamW can be replaced with SGD, what learning rate schedulers contribute, and how to tune LoRA's hyperparameters. The article walks through the evaluation tasks and dataset, the code framework, choosing a base model, LoRA's default settings, memory savings with QLoRA, iterating over the dataset multiple times, and hyperparameter tuning, and closes with the settings that performed best in the author's experiments.
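For context on the hyperparameters the article tunes (rank r and scaling alpha), the core idea of LoRA can be sketched in a few lines: a frozen pretrained weight W is augmented with a trainable low-rank pair A, B, so the effective weight becomes W + (alpha / r) * (B @ A). This is a minimal NumPy illustration with made-up dimensions, not code from the article:

```python
import numpy as np

# Illustrative sizes only; real LLM layers are much larger.
d_out, d_in, r, alpha = 512, 512, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, rank r
B = np.zeros((d_out, r))                    # trainable, initialized to zero

def lora_forward(x):
    # Base path plus scaled low-rank path; because B starts at zero,
    # the adapted layer initially matches the pretrained one exactly.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identical at init

full = W.size            # parameters a full finetune would update
lora = A.size + B.size   # parameters LoRA actually trains
print(f"trainable params: {lora} vs full finetune: {full} ({lora / full:.1%})")
```

At rank 8 on a 512x512 weight, only about 3% of the parameters are trainable, which is where LoRA's memory savings come from; QLoRA then additionally quantizes the frozen base weights.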

https://lightning.ai/pages/community/lora-insights/