Extending a Moldable Computer Architecture to Accelerate DL Inference on FPGA

IRIS

Mariotti, Mirko;Bianchini, Giulio;Neri, Igor;Spiga, Daniele;Ciangottini, Diego;Storchi, Loriano

2025-01-01

Abstract

Over the past years, the field of Machine Learning (ML) and Deep Learning (DL) has seen strong developments in both software and hardware, with a growing number of specialized devices. One of the biggest challenges in this field is the inference phase, where the trained model makes predictions on unseen data. Although computationally powerful, traditional computing architectures face limitations in efficiently managing inference requests, especially from an energy point of view. For this reason, the need arose to find alternative hardware solutions, among them Field Programmable Gate Arrays (FPGAs): their reconfigurability, combined with parallel processing capability, low latency, and low power consumption, makes these devices uniquely suited to accelerating inference tasks. In this paper, we present a novel approach to accelerate the inference phase of a multi-layer perceptron (MLP) using the BondMachine framework, an open-source framework for the design of hardware accelerators for FPGAs. An analysis of latency, energy consumption, and resource usage is presented, together with comparisons against standard architectures and other FPGA approaches, highlighting the strengths and critical points of the proposed solution. The present work is an exploratory study to validate the proposed methodology on MLP architectures, establishing a foundation for future work on scalability and the acceleration of more complex neural network models.

Year: 2025
Journal: ELECTRONICS
ISI code: WOS:001569649500001
Language(s): English
Volume: 14
Issue: 17
DOI: https://dx.doi.org/10.3390/electronics14173518
Keywords: BondMachine; DL accelerator; hardware-software co-design; low latency DL inference on FPGA; low-power DL inference on FPGA; reconfigurable computing
Scopus code: 2-s2.0-105015962688
International co-authors: no
Number of authors: 6
Type: info:eu-repo/semantics/article
Faculty site type: 262
All authors: Mariotti, Mirko; Bianchini, Giulio; Neri, Igor; Spiga, Daniele; Ciangottini, Diego; Storchi, Loriano
Type: 1 Journal contribution::1.1 Journal article
Full text: none
Project identifier:
Project title: Parallelizzazione e Porting su GPU di Codici Scientifici
Acronym: PaGUSci
Funder name: Istituto Nazionale di Astrofisica

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11564/886553

Warning: the data shown have not been validated by the university.
