

Title: Model Merging — a biased overview | Donato Crisostomi

Description: A friendly tour of model merging, suspiciously aligned with my own research.

Domain: crisostomi.github.io

Author: Donato Crisostomi

Links:

Donato Crisostomi https://crisostomi.github.io/blog/
blog https://crisostomi.github.io/blog/index.html
Motivation https://crisostomi.github.io/blog/2025/model_merging/#motivation
Intro to model merging https://crisostomi.github.io/blog/2025/model_merging/#intro-to-model-merging
Merging models trained from scratch on the same task https://crisostomi.github.io/blog/2025/model_merging/#merging-models-trained-from-scratch-on-the-same-task
Mode connectivity https://crisostomi.github.io/blog/2025/model_merging/#mode-connectivity
Neuron permutation symmetries https://crisostomi.github.io/blog/2025/model_merging/#neuron-permutation-symmetries
Neuron matching https://crisostomi.github.io/blog/2025/model_merging/#neuron-matching
Entering cycle-consistency https://crisostomi.github.io/blog/2025/model_merging/#entering-cycle-consistency
Merging models finetuned from the same base model on different tasks https://crisostomi.github.io/blog/2025/model_merging/#merging-models-finetuned-from-the-same-base-model-on-different-tasks
Task arithmetic https://crisostomi.github.io/blog/2025/model_merging/#task-arithmetic
Task vectors and gradients https://crisostomi.github.io/blog/2025/model_merging/#task-vectors-and-gradients
Structure-aware merging methods https://crisostomi.github.io/blog/2025/model_merging/#structure-aware-merging-methods
Routing and MoErging https://crisostomi.github.io/blog/2025/model_merging/#routing-and-moerging
LLMs and Evolutionary Merging https://crisostomi.github.io/blog/2025/model_merging/#llms-and-evolutionary-merging
What comes next? https://crisostomi.github.io/blog/2025/model_merging/#what-comes-next
Workshop on Weight Space Learning https://weight-space-learning.github.io/
Estimathon https://estimathon.com/
according to Wikipedia. https://en.wikipedia.org/wiki/Taxis_of_New_York_City
WayBack Machine https://web.archive.org/
The Information https://www.theinformation.com/articles/ex-openai-cto-muratis-startup-plans-compete-openai-others
Lucas Beyer’s excellent summary https://x.com/giffmana/status/1924849877634449878
extra overhead for routing https://crisostomi.github.io/blog/2025/model_merging/#routing-and-moerging
better merging coefficients https://crisostomi.github.io/blog/2025/model_merging/#llms-and-evolutionary-merging
Hungarian algorithm https://en.wikipedia.org/wiki/Hungarian_algorithm
standard weight matching equation https://crisostomi.github.io/blog/2025/model_merging/#neuron-matching
comments powered by giscus. http://giscus.app/?ref_noscript
Jekyll https://jekyllrb.com/
al-folio https://github.com/alshedivat/al-folio
GitHub Pages https://pages.github.com/
Unsplash https://unsplash.com
