  1. [2202.10054] Fine-Tuning can Distort Pretrained Features and ...

    Feb 21, 2022 · We prove that the OOD error of fine-tuning is high when we initialize with a fixed or random head -- this is because while fine-tuning learns the head, the lower layers of the neural … (a short derivation sketch of this argument follows the results below)

  2. • Pretrained models give large improvements in accuracy, but how we fine-tune them is key. • LP-FT is just a starting point; are there better methods? • What to do when linear probing is not so good?

  3. Fine-Tuning Distorts Pretrained Features - Emergent Mind

    Feb 21, 2022 · Fine-tuning alters pretrained features due to simultaneous optimization of the head and lower layers, causing distortions that compromise OOD performance.

  4. Key takeaway: a larger change in parameters can distort pretrained features. How to retain information beyond the limited data used for adaptation?

  5. Fine-Tuning without Distortion: Improving Robustness to ... - NeurIPS

    Our analysis suggests the easy two-step strategy of linear probing then full fine-tuning (LP-FT), which improves pretrained features without distortion, and leads to even higher accuracies. (A code sketch of this two-step recipe follows the results below.)

  6. Fine-Tuning can Distort Pretrained Features and Underperform...

    Jan 28, 2022 · We prove that the OOD error of fine-tuning is high when we initialize with a fixed or random head---this is because while fine-tuning learns the head, the lower layers of the neural …

  7. span of the training data when using “good” pretrained features. Even with an infinitesimally small learning rate, fine-tuning distorts pretrained features

  8. Can we refine features without distorting them too much? +10% over fine-tuning! What to do when linear probing is not so good?

  9. overparameterized two-layer linear networks. We prove that the OOD error of fine-tuning is high when we initialize with a fixed or random head -- this is because while fine-tuning learns the head, the lower …

  10. Fine-tune ViT-G/14 (pretrained on JFT-3B) many times with LP-FT using different hyperparameters, average their weights in a greedy strategy (add a new model to the “soup” if ID validation accuracy …
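
Several of the results above (1, 6, 7, 9) quote the same mechanism: fine-tuning moves the lower layers only within the span of the ID training data, so the jointly learned head ends up mismatched with the feature directions that never change. A compact way to see this in the overparameterized two-layer linear setting the paper analyzes (the notation and squared loss below are illustrative assumptions, not copied from the paper):

```latex
% Two-layer linear model with pretrained features B_0 and linear head v.
\[
  f_{v,B}(x) = v^{\top} B x, \qquad
  \ell(v, B; x, y) = \tfrac{1}{2}\,\bigl(v^{\top} B x - y\bigr)^{2}.
\]
% The gradient in B is rank one, with row space spanned by the input x:
\[
  \frac{\partial \ell}{\partial B} = \bigl(v^{\top} B x - y\bigr)\, v\, x^{\top}
  \;\;\Longrightarrow\;\;
  \operatorname{rowspace}(B_t - B_0) \subseteq \operatorname{span}\{x_1, \ldots, x_n\}.
\]
% So for any direction x_perp orthogonal to the ID training span, B_t x_perp = B_0 x_perp:
% those features never move, while the head v_t keeps adapting to the distorted in-span
% features, which is the head/feature mismatch behind the high OOD error quoted above.
```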
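
Result 5's two-step LP-FT recipe (referenced above) is simple to express in code. The following is a minimal PyTorch sketch, not the authors' implementation: the backbone/head split, optimizer, learning rates, and epoch counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

def lp_ft(backbone, head, train_loader, probe_epochs=10, ft_epochs=10,
          probe_lr=1e-3, ft_lr=1e-5, device="cpu"):
    """LP-FT sketch: (1) linear-probe a head on frozen pretrained features,
    (2) fully fine-tune all parameters starting from the probed head."""
    loss_fn = nn.CrossEntropyLoss()
    backbone.to(device)
    head.to(device)

    # Step 1: linear probing -- freeze the backbone so only the head is trained.
    for p in backbone.parameters():
        p.requires_grad_(False)
    probe_opt = torch.optim.Adam(head.parameters(), lr=probe_lr)
    for _ in range(probe_epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            loss = loss_fn(head(backbone(x)), y)
            probe_opt.zero_grad()
            loss.backward()
            probe_opt.step()

    # Step 2: full fine-tuning -- unfreeze everything and continue from the
    # probed head, typically with a much smaller learning rate.
    for p in backbone.parameters():
        p.requires_grad_(True)
    ft_opt = torch.optim.Adam(
        list(backbone.parameters()) + list(head.parameters()), lr=ft_lr
    )
    for _ in range(ft_epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            loss = loss_fn(head(backbone(x)), y)
            ft_opt.zero_grad()
            loss.backward()
            ft_opt.step()
    return backbone, head
```

The point of step 1 is that a reasonable head keeps the early fine-tuning gradients from distorting the pretrained lower layers, which is how LP-FT avoids the OOD drop described in the other results.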
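
Result 10 combines LP-FT with a greedy "model soup": fine-tune many times with different hyperparameters, then average weights, keeping each candidate only if ID validation accuracy is not hurt. That snippet is truncated, so the acceptance rule below is an assumption, and evaluate_id_accuracy is a hypothetical helper supplied by the caller.

```python
import torch

def greedy_soup(state_dicts, evaluate_id_accuracy):
    """Greedy weight averaging over fine-tuned checkpoints.
    state_dicts: checkpoints sorted by ID validation accuracy, best first.
    evaluate_id_accuracy: hypothetical helper that loads a state_dict into
    the model and returns ID validation accuracy as a float."""
    def average(sds):
        # Parameter-wise mean over all ingredient checkpoints.
        return {k: torch.stack([sd[k].float() for sd in sds]).mean(dim=0)
                for k in sds[0]}

    ingredients = [state_dicts[0]]
    best_acc = evaluate_id_accuracy(state_dicts[0])
    for candidate in state_dicts[1:]:
        trial_acc = evaluate_id_accuracy(average(ingredients + [candidate]))
        # Assumed acceptance rule: keep the candidate only if averaging it in
        # does not reduce ID validation accuracy.
        if trial_acc >= best_acc:
            ingredients.append(candidate)
            best_acc = trial_acc
    return average(ingredients)
```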