[HN Gopher] Implementing Weight-Decomposed Low-Rank Adaptation (...
___________________________________________________________________
Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from
Scratch
Author : rasbt
Score : 47 points
Date : 2024-02-18 18:50 UTC (4 hours ago)
(HTM) web link (magazine.sebastianraschka.com)
(TXT) w3m dump (magazine.sebastianraschka.com)
| jasonjmcghee wrote:
| This is a bit of a misleading title. Why not use the original?
|
| "Improving LoRA: Implementing Weight-Decomposed Low-Rank
| Adaptation (DoRA) from Scratch"
|
| (If it's too long, just drop the "Improving LoRA: " part)
| rasbt wrote:
| Thanks, fixed!
| murkt wrote:
| Hooray, no more confusion with LoRa the radio!
| 3abiton wrote:
| I'm waiting for physicists to have their gripe with the
| acronym.
| stavros wrote:
| Yes, but think of the explorer!
| sorenjan wrote:
| Speaking of LoRA, what happened with ZipLoRA? It's supposed to be
| a better way of merging multiple LoRAs, and the results look good
| in their examples. Is it being used anywhere?
|
| https://ziplora.github.io/
| rasbt wrote:
| Not sure, but in general it looks like ZipLoRA is only useful
| in specific contexts, e.g., when you have two different tasks
| you want to optimize for (like style and content in a vision
| context). DoRA is more general: it basically normalizes and
| scales the LoRA matrices to get much better performance.
| According to the paper, it even works well at low ranks, which
| effectively makes it even more parameter-efficient than OG
| LoRA.
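|
| In code, the reparameterization looks roughly like this (a
| minimal PyTorch sketch, not the exact code from the article;
| the class name and defaults are just illustrative):
|
|     import torch
|     import torch.nn as nn
|     import torch.nn.functional as F
|
|     class DoRALinear(nn.Module):
|         # W' = m * (W + alpha*(A@B).T)
|         #        / ||W + alpha*(A@B).T||_c  (column-wise norm)
|         def __init__(self, linear, rank=8, alpha=16):
|             super().__init__()
|             self.linear = linear  # pretrained layer
|             for p in linear.parameters():
|                 p.requires_grad_(False)  # keep W and bias frozen
|             self.A = nn.Parameter(
|                 torch.randn(linear.in_features, rank) / rank**0.5)
|             self.B = nn.Parameter(
|                 torch.zeros(rank, linear.out_features))
|             self.alpha = alpha
|             # trainable magnitude vector m, initialized to the
|             # column norms of the pretrained weight
|             self.m = nn.Parameter(
|                 linear.weight.norm(p=2, dim=0, keepdim=True))
|
|         def forward(self, x):
|             lora = self.alpha * (self.A @ self.B).T  # (out, in)
|             v = self.linear.weight + lora  # direction matrix V
|             v = v / v.norm(p=2, dim=0, keepdim=True)
|             return F.linear(x, self.m * v, self.linear.bias)
|
| The magnitude and direction are trained jointly while the
| pretrained weight itself stays frozen.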
| sorenjan wrote:
| I just read the article, nice write-up! I think it would
| benefit from a short explanation of what the magnitude vector
| (m) and the directional matrix (V) are; I'm not familiar with
| that kind of decomposition.
|
| Not directly related to the article but tangentially relevant:
| would it be possible to train a LoRA or DoRA with a high rank,
| and then use SVD to see if the rank is too high and truncate
| to a better value of r? Maybe use different ranks for
| different layers after some training?
| rasbt wrote:
| Thanks for the feedback. Clarifying definitely wouldn't
| hurt. Added a paragraph and new figure at the top of the
| DoRA section: https://magazine.sebastianraschka.com/i/14179
| 7214/introducin...
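|
| In short, DoRA reparameterizes a pretrained weight matrix as
| W = m * (V / ||V||_c), where ||.||_c is the column-wise vector
| norm: m is a trainable magnitude vector (one entry per column)
| and V is the direction matrix, which receives the usual LoRA-
| style low-rank update.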
|
| I haven't tried what you were suggesting, but it actually
| sounds plausible. Interesting idea!
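|
| Off the top of my head, a quick experiment could look
| something like this (an untested sketch; the function name
| and energy threshold are made up):
|
|     import torch
|
|     def truncate_lora(A, B, energy=0.90):
|         # A: (in, r), B: (r, out) from a trained high-rank LoRA
|         delta = A @ B  # merged low-rank update, (in, out)
|         U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
|         # smallest rank that keeps `energy` of the squared
|         # singular-value mass
|         cum = torch.cumsum(S**2, dim=0) / (S**2).sum()
|         r_new = int((cum < energy).sum()) + 1
|         # re-factor at the smaller rank; A_new @ B_new ~ delta
|         A_new = U[:, :r_new] * S[:r_new]
|         B_new = Vh[:r_new, :]
|         return A_new, B_new, r_new
|
| Looking at the singular-value spectrum per layer would also
| hint at whether different layers want different ranks.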
| gliched_robot wrote:
| This is very cool and will change the way we do LoRA now.
___________________________________________________________________
(page generated 2024-02-18 23:00 UTC)