Sharp Generalization Bounds for Foundation Models with Asymmetric Randomized Low-Rank Adapters
Abstract
Low-Rank Adaptation (LoRA) has emerged as a widely adopted
parameter-efficient fine-tuning (PEFT) technique for foundation models. Recent
work has highlighted an inherent asymmetry in the initialization of LoRA's
low-rank factors, a choice that has persisted since the method's inception and
was presumably settled on empirically. This paper provides a comprehensive
theoretical characterization of asymmetric LoRA with frozen random factors.
First, while existing research provides upper-bound generalization guarantees
based on averages over multiple experiments, the behaviour of a single
fine-tuning run with specific random factors remains an open question. We
address this by investigating the concentration of the typical LoRA
generalization gap around its mean. Our main upper bound shows that this gap
scales as $\tilde{\mathcal{O}}\left(\frac{\sqrt{r}}{\sqrt{N}}\right)$ with
high probability for rank-$r$ LoRA adapters trained on $N$ samples.
Additionally, we determine the fundamental limits of sample efficiency by
establishing a matching lower bound of
$\Omega\left(\frac{1}{\sqrt{N}}\right)$. By more closely reflecting the
practical scenario of a single fine-tuning run, our findings offer crucial
insights into the reliability and practicality of asymmetric LoRA.
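For concreteness, the following is a minimal sketch of the setting described above, in illustrative notation ($W_0$, $A$, $B$, $\Delta_N$) that is ours rather than the paper's: asymmetric LoRA adapts a pre-trained weight $W_0 \in \mathbb{R}^{k \times d}$ as
\[
W \;=\; W_0 + B A, \qquad A \in \mathbb{R}^{r \times d}\ \text{drawn at random and frozen}, \qquad B \in \mathbb{R}^{k \times r}\ \text{trained on the } N \text{ samples},
\]
and the quantity of interest is the generalization gap of a single fine-tuning run, $\Delta_N := L_{\mathrm{test}}(B) - L_{\mathrm{train}}(B)$. In this notation the upper bound above reads $\Delta_N = \tilde{\mathcal{O}}\left(\frac{\sqrt{r}}{\sqrt{N}}\right)$ with high probability, while the matching lower bound states that no rate better than $\Omega\left(\frac{1}{\sqrt{N}}\right)$ is achievable in the worst case.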