FPGA-based heterogeneous systems are a popular choice for accelerating Deep Neural Networks (DNNs), but efficiently integrating and orchestrating HW and SW tasks remains challenging. FPGA overlay architectures have been proposed to simplify accelerator management, yet state-of-the-art solutions suffer from performance bottlenecks caused by frequent CPU-FPGA interactions. We introduce a novel overlay-based methodology enabling the Proxy Computing paradigm, leveraging a local orchestrator and shared memory to (i) reduce accelerator control overhead and (ii) minimize unnecessary data movements. As a case study, we integrate the AMD/Xilinx Deep Learning Processing Unit (DPU) with additional accelerators for unsupported layers. Experiments show that our approach significantly reduces memory transfers, achieving up to a 4× speedup in the proposed case study.

Enabling the Proxy Computing Paradigm on DPU-based FPGA Acceleration / Brilli, Gianluca; Capotondi, Alessandro; Burgio, Paolo; Marongiu, Andrea. - 1:(2025), pp. 80-83. (22nd ACM International Conference on Computing Frontiers 2025, CF 2025, Italy, 2025) [10.1145/3719276.3725192].

Enabling the Proxy Computing Paradigm on DPU-based FPGA Acceleration

Brilli, Gianluca; Capotondi, Alessandro; Burgio, Paolo; Marongiu, Andrea
2025



Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1389788