
Gradient-sign Masking for Task Vector Transport Across Pre-Trained Models / Rinaldi, Filippo; Panariello, Aniello; Salici, Giacomo; Liu, Fengyuan; Ciccone, Marco; Porrello, Angelo; Calderara, Simone. - (2026). (The Fourteenth International Conference on Learning Representations (ICLR), Rio de Janeiro, Brazil, April 23rd - 27th, 2026).

Gradient-sign Masking for Task Vector Transport Across Pre-Trained Models

Filippo Rinaldi; Aniello Panariello; Giacomo Salici; Fengyuan Liu; Marco Ciccone; Angelo Porrello; Simone Calderara
2026

Abstract

When a new release of a foundation model is published, practitioners typically need to repeat fine-tuning, even if the same task was already tackled in the previous version. A promising alternative is to reuse the parameter changes (i.e., task vectors) that capture how a model adapts to a specific task. However, these vectors often fail to transfer across different pre-trained models because their parameter spaces are misaligned. In this work, we show that successful transfer depends strongly on the gradient-sign structure of the new model. Based on this insight, we propose GradFix, which approximates the ideal sign structure and leverages it to transfer knowledge using only a handful of labeled samples. Notably, this requires no additional fine-tuning: we only compute a few target-model gradients without parameter updates and mask the source task vector accordingly. This yields an update that is locally aligned with the target loss landscape, effectively rebasing the task vector onto the new pre-training. We provide a theoretical guarantee that our method ensures first-order descent. Empirically, we demonstrate significant performance gains on vision and language benchmarks, consistently outperforming naive task vector addition and few-shot fine-tuning. We further show that transporting task vectors improves multi-task and multi-source model merging. Code is available at https://github.com/fillo-rinaldi/GradFix.
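The masking step described in the abstract can be sketched in a few lines. The exact masking rule below is an assumption (the paper's precise criterion may differ): it keeps only the task-vector components whose sign agrees with the descent direction (the negative gradient) of the target model, which is sufficient to make the masked update a first-order descent direction.

```python
import numpy as np

def gradfix_mask(task_vector, target_grad):
    """Hypothetical sketch of gradient-sign masking.

    Keeps only components of the source task vector whose sign matches
    the descent direction (-gradient) of the target model. Every
    surviving component then satisfies g_i * tau_i <= 0, so the inner
    product <g, masked tau> is non-positive (first-order descent).
    """
    mask = np.sign(task_vector) == np.sign(-target_grad)
    return task_vector * mask

rng = np.random.default_rng(0)
tau = rng.normal(size=8)   # source task vector (illustrative values)
g = rng.normal(size=8)     # few-shot gradient of the target model
tau_fixed = gradfix_mask(tau, g)

# First-order descent check: the masked update opposes the gradient.
assert float(np.dot(g, tau_fixed)) <= 0.0
```

In an actual transfer setting, `g` would be estimated from a handful of labeled samples on the new pre-trained model, and the masked vector `tau_fixed` added to its weights; no parameter updates via fine-tuning are needed.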
The Fourteenth International Conference on Learning Representations (ICLR), Rio de Janeiro, Brazil, April 23rd - 27th, 2026
Files in this record:
File: 16873_Gradient_Sign_Masking_fo.pdf
Access: Open access
Type: AAM - Author's version, peer-reviewed and accepted for publication
Size: 435.2 kB
Format: Adobe PDF

Creative Commons License
Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.
In case of copyright violation, contact Iris Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1396829
Citations
  • PMC: not available
  • Scopus: not available
  • Web of Science: not available