With the growing accessibility of artificial intelligence and machine learning technologies, modern learning systems operate in increasingly dynamic environments, where data distributions, tasks, and objectives evolve over time. Traditional static learning paradigms struggle to keep pace with this evolution, often leading to degraded performance, the loss of previously acquired knowledge, or inefficient retraining from scratch. Addressing these challenges calls for learning mechanisms that can transfer knowledge across temporal and structural dimensions: from sequential data, through streams of tasks, to the reuse and recombination of entire models. This thesis investigates how learning systems can evolve alongside their environments by leveraging structured information and prior knowledge at multiple scales. The work follows a journey that begins with temporal data understanding, progresses through continual learning, and culminates in model merging and composition. Across these stages, it explores the central question of how information learned in one context can be reused or adapted in another.

The first part focuses on temporal learning from visual data, treating video streams as structured time series. It introduces a consistency-based formulation for temporal anomaly localization (CSL-TAL) in a setting where frame-level annotations are unavailable. It then turns to multi-object tracking, introducing flow-based and probabilistic representations (TrackFlow) that improve the modeling of temporal dynamics in complex scenes. Finally, it tackles monocular per-object distance estimation (DistFormer), which integrates object-centric reasoning into temporal learning pipelines. Together, these studies show how structured temporal information can be exploited to learn more generalizable and interpretable visual representations.

The second part investigates continual learning, where data arrive as a stream. CHARON presents an efficient continual learning framework for skeleton-based action recognition that combines masking and compression to improve memory usage and stability. CGIL introduces a method for continual adaptation of large vision-language models via generative latent replay, preserving zero-shot capabilities while enabling incremental prompt learning. Collectively, these contributions reinterpret continual learning as a form of structured temporal progression: a time series of evolving tasks.

The final part explores knowledge transfer across models through the lens of model merging and task arithmetic. Here, instead of adapting a single model over time, the goal is to combine multiple pre-trained models to construct new capabilities dynamically. The PASTA framework for modular tracking shows how specialized model components can be composed in parameter space to generalize across domains. Subsequently, the thesis focuses on low-rank (MoDER and Core Space) and gradient-based (GradFix) techniques for model merging, allowing the creation of new models through direct parameter operations rather than retraining. These methods enable the synthesis of task-specific networks, representing a new form of temporal evolution, one that occurs in model space rather than data space.

Overall, this thesis offers a coherent perspective on knowledge transfer in evolving systems. By connecting temporal learning, continual adaptation, and model composition, it reinterprets time series analysis as a broader principle of knowledge transfer across evolving representations. The resulting framework highlights the role of structure, modularity, and reuse in building scalable, adaptive, and resilient learning systems, capable not only of understanding the world as it changes, but of changing themselves in response.
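The abstract describes model merging via "direct parameter operations rather than retraining." As background only, here is a minimal NumPy sketch of the standard task-arithmetic formulation that this family of methods builds on; it is not the thesis's own MoDER, Core Space, or GradFix techniques, and the arrays below are invented stand-ins for flattened model weights.

```python
import numpy as np

# Toy stand-ins for flattened network parameters (assumption: real methods
# operate on millions of weights; 8 values suffice to show the arithmetic).
rng = np.random.default_rng(0)
theta_pre = rng.normal(size=8)                             # pre-trained weights
theta_task_a = theta_pre + rng.normal(scale=0.1, size=8)   # fine-tuned on task A
theta_task_b = theta_pre + rng.normal(scale=0.1, size=8)   # fine-tuned on task B

# A "task vector" is the parameter delta produced by fine-tuning.
tau_a = theta_task_a - theta_pre
tau_b = theta_task_b - theta_pre

# Merging adds scaled task vectors back to the pre-trained weights,
# producing a combined model with no gradient-based retraining.
lam = 0.5
theta_merged = theta_pre + lam * (tau_a + tau_b)
```

The scaling coefficient `lam` is a hypothetical choice here; in practice it is tuned, and the low-rank and gradient-based methods mentioned above can be read as more sophisticated ways of constructing or projecting these deltas before recombination.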


Learning Across Time, Tasks, and Models: Knowledge Transfer in Evolving Systems / Aniello Panariello, 20 Apr 2026. 38th cycle, Academic Year 2024/2025.

Learning Across Time, Tasks, and Models: Knowledge Transfer in Evolving Systems

PANARIELLO, Aniello
2026

Learning Across Time, Tasks, and Models: Knowledge Transfer in Evolving Systems
20-apr-2026
CALDERARA, Simone
Files in this item:
File: Panariello.pdf
Description: Panariello.Aniello.pdf
Type: Doctoral thesis
Format: Adobe PDF
Size: 3.4 MB
Access: Open access

Creative Commons License
Metadata in IRIS UNIMORE are released under a Creative Commons CC0 1.0 Universal license, while publication files are released under an Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact IRIS Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1402269