21-25 March 2022
Academia Sinica
Europe/Zurich timezone

Machine Learning inference using PYNQ environment in an AWS EC2 F1 Instance

24 Mar 2022, 12:00
20m
Room 1

Oral Presentation Track 1: Physics (including HEP) and Engineering Applications Physics & Engineering

Speaker

Dr Marco Lorusso (Alma Mater Studiorum - University of Bologna)

Description

In the past few years, Machine Learning (ML) and Deep Learning techniques have become increasingly viable, thanks to tools that allow people without specific knowledge of data science and complex networks to build AI models for a variety of research fields. This has encouraged the adoption of such techniques: in High Energy Physics (HEP), for example, new ML-based algorithms are being tested for event selection in trigger operations, end-user physics analysis, computing metadata-based optimizations, and more. Time-critical applications can benefit from implementing algorithms on low-latency hardware such as purpose-built ASICs and programmable microelectronic devices known as FPGAs. The latter offer a unique blend of the benefits of hardware and software: they implement circuits just as dedicated hardware does, providing power, area and performance benefits over software, yet they can be reprogrammed cheaply and easily to perform a wide range of tasks, at the cost of some performance relative to ASICs.

To fit ML models into the usual workflow for programming FPGAs, a variety of tools have been developed. One example is the HLS4ML toolkit, developed by the HEP community, which translates Neural Networks (NNs) built with tools like TensorFlow into a High-Level Synthesis (HLS) description (e.g. C++) so that this kind of ML algorithm can be implemented on FPGAs.
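As a rough illustration of this translation step, the sketch below wraps the hls4ml Python call that converts a Keras model into an HLS project. It is a minimal example under stated assumptions, not the configuration used in this work: the helper names `make_hls_config` and `keras_to_hls`, the fixed-point precision and the reuse factor are all chosen here for illustration, and the target `part` must be supplied by the user.

```python
def make_hls_config(precision="ap_fixed<16,6>", reuse_factor=1):
    """Build a model-level hls4ml configuration dictionary.

    The precision and reuse factor are illustrative assumptions,
    not values taken from the presented study.
    """
    return {"Model": {"Precision": precision, "ReuseFactor": reuse_factor}}


def keras_to_hls(model, output_dir, part, hls_config=None):
    """Convert a trained Keras model into an hls4ml model object.

    `model` is a trained Keras model, `output_dir` is where the HLS
    project is written, and `part` is the FPGA part string.
    hls4ml is imported lazily so this file loads without it installed.
    """
    import hls4ml

    return hls4ml.converters.convert_from_keras_model(
        model,
        hls_config=hls_config or make_hls_config(),
        output_dir=output_dir,
        part=part,
    )


# Typical follow-up calls on the returned object (hls4ml required):
#   hls_model.compile()      # build a bit-accurate C++ emulation
#   hls_model.predict(x)     # run the emulation on a NumPy array
#   hls_model.build()        # launch HLS synthesis (needs vendor tools)
```

Comparing `hls_model.predict(x)` with the original Keras predictions is a common first check before any hardware is involved.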

This paper presents and discusses the activity started at the Physics and Astronomy department of the University of Bologna and INFN-Bologna, devoted to preliminary studies for the trigger systems of the CMS experiment at the CERN LHC accelerator. PYNQ, a broader-purpose open-source project from Xilinx (a major FPGA producer), is being tested in combination with the HLS4ML toolkit. The purpose of PYNQ is to let designers exploit the benefits of programmable logic and microprocessors using the Python language. This software environment can be deployed on a variety of Xilinx platforms, from IoT devices like the ZYNQ-Z1 board to high-performance ones like Alveo accelerator cards and AWS EC2 F1 instances in the cloud. The use of cloud computing in this work allows us to test the capabilities of the whole workflow, from the creation and training of a Neural Network and the creation of an HLS project using HLS4ML, to managing NN inference with custom Python drivers.
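To give an idea of what a Python driver for NN inference might look like, here is a minimal sketch built around PYNQ-style buffers. It is not the driver developed in this work. The function `run_inference`, the kernel argument order (input buffer, output buffer) and the stand-in classes `FakeBuffer` and `FakeKernel` are assumptions made for illustration; the stand-ins only exist so that the sketch runs without an FPGA.

```python
import numpy as np


def run_inference(kernel, allocate, x, n_outputs):
    """Send one batch to an accelerator kernel and return its outputs.

    `kernel` needs a `.call(in_buf, out_buf)` method and `allocate`
    must return buffers offering `sync_to_device()` and
    `sync_from_device()`, as PYNQ buffers do on Alveo-like platforms.
    The argument order of `call` depends on the HLS kernel signature
    and is assumed here.
    """
    in_buf = allocate(shape=x.shape, dtype=np.float32)
    out_buf = allocate(shape=(x.shape[0], n_outputs), dtype=np.float32)
    in_buf[:] = x
    in_buf.sync_to_device()        # host memory -> device memory
    kernel.call(in_buf, out_buf)   # blocking kernel execution
    out_buf.sync_from_device()     # device memory -> host memory
    return np.array(out_buf)


class FakeBuffer(np.ndarray):
    """Software stand-in for a PYNQ buffer (hypothetical, for testing)."""

    def sync_to_device(self):
        pass

    def sync_from_device(self):
        pass


def fake_allocate(shape, dtype):
    """Stand-in for pynq.allocate that returns a zeroed FakeBuffer."""
    return np.zeros(shape, dtype=dtype).view(FakeBuffer)


class FakeKernel:
    """Stand-in kernel: sums each input row instead of running a NN."""

    def call(self, in_buf, out_buf):
        out_buf[:] = in_buf.sum(axis=1, keepdims=True)


# With real hardware one would pass PYNQ objects instead, e.g.:
#   import pynq
#   overlay = pynq.Overlay("myproject.awsxclbin")   # hypothetical file name
#   y = run_inference(overlay.myproject_1, pynq.allocate, x, n_outputs)
# where `myproject_1` is a hypothetical kernel name.
```

Passing the kernel and the allocator as arguments keeps the driver logic testable on a machine without an accelerator.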

The hardware and software set-up will be presented, together with performance tests on various baseline models used as benchmarks. Whether any overhead that increases latency is present will be investigated. Finally, the consistency of the NN predictions will be verified against a more traditional way of interacting with the FPGA, via the Vivado Software Development Kit using C++ code.
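The two checks described above, latency overhead and prediction consistency, can be sketched with a few lines of standard-library Python. This is only an illustration of the kind of measurement involved, not the procedure used in the study: the helper names `time_calls`, `overhead` and `agreement`, the number of repeats and the tolerance are all assumptions.

```python
import statistics
import time


def time_calls(fn, batch, repeats=100):
    """Return the wall-clock latency in seconds of each call fn(batch)."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(batch)
        latencies.append(time.perf_counter() - start)
    return latencies


def overhead(latencies, reference_latencies):
    """Median latency difference between a path under test and a reference."""
    return statistics.median(latencies) - statistics.median(reference_latencies)


def agreement(pred_a, pred_b, tol=1e-3):
    """Compare two flat sequences of predictions.

    Returns the fraction of outputs that differ by at most `tol`
    and the largest absolute difference.
    """
    diffs = [abs(a - b) for a, b in zip(pred_a, pred_b)]
    within = sum(d <= tol for d in diffs)
    return within / len(diffs), max(diffs)
```

Here `pred_a` could hold outputs obtained through the Python driver and `pred_b` outputs from the C++ path for the same inputs; the median is used rather than the mean so that occasional slow calls do not dominate the estimate.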

As a next step for this study, Alveo accelerator cards are expected to be tested with the presented workflow as well, and a local server devoted to testing NNs in a fast, reliable and easy-to-use way will likely be assembled.

Primary author

Dr Marco Lorusso (Alma Mater Studiorum - University of Bologna)

Co-authors

Dr Riccardo Travaglini (INFN - Bologna Division)
Prof. Daniele Bonacorsi (University of Bologna)
