To enhance the discovery potential of the Large Hadron Collider (LHC) at CERN and to improve the precision of Standard Model measurements, the High-Luminosity LHC (HL-LHC) project was launched in 2010 to extend the machine's operation by another decade and to increase its luminosity roughly tenfold beyond the design value.
In this context, the use of Machine Learning, and of Artificial Neural Network algorithms in particular, has expanded rapidly thanks to their potential to improve the efficiency and effectiveness of data processing, notably for novel trigger-level event selection in searches for physics Beyond the Standard Model (BSM). This work explores Autoencoders (AEs): unbiased algorithms that select events according to their degree of anomaly, without relying on theoretical assumptions about the signal. However, the stringent latency and power constraints of a HEP trigger system require tailored software development and deployment strategies, aimed at making the best use of the on-site hardware, with a specific focus on Field-Programmable Gate Arrays (FPGAs).
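To make the idea concrete, the sketch below builds a toy autoencoder in Keras and defines an anomaly score as the per-event reconstruction error; the input dimensionality, layer sizes, and variable names are assumptions chosen here for illustration, not the configuration used in this work.

```python
# Minimal sketch of trigger-level anomaly detection with an autoencoder.
# All sizes below are illustrative assumptions, not this work's setup.
import numpy as np
from tensorflow.keras import layers, Model

n_features = 57  # e.g. a flattened vector of trigger-level object kinematics (assumed)

inp = layers.Input(shape=(n_features,))
h = layers.Dense(32, activation="relu")(inp)
z = layers.Dense(8, activation="relu")(h)           # bottleneck
h = layers.Dense(32, activation="relu")(z)
out = layers.Dense(n_features, activation="linear")(h)
autoencoder = Model(inp, out)

# Trained to reproduce its own (mostly Standard Model) input:
# no signal hypothesis enters the training.
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(x_train, x_train, epochs=10, batch_size=1024)

def anomaly_score(x):
    # Events are selected on their reconstruction error: the larger the
    # error, the more anomalous the event relative to what the AE learned.
    x_hat = autoencoder.predict(x, verbose=0)
    return np.mean((x - x_hat) ** 2, axis=1)
```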
This is why a technique called Knowledge Distillation (KD) is studied in this work. It consists of using a large, well-trained "teacher" model, such as the aforementioned AE, to train a much smaller "student" model that can be implemented efficiently on an FPGA. Optimizing this distillation process involves exploring several aspects, such as the student's architecture and the quantization of weights and biases, through a strategy that includes hyperparameter searches to find the best compromise between accuracy, latency, and hardware footprint.
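A minimal sketch of what such a distillation step could look like, assuming the common formulation in which the student directly regresses the teacher's anomaly score (so that only the small student needs to run on the FPGA); the regression-on-score choice and all architectures below are illustrative assumptions, not the exact recipe of this work.

```python
# Hedged sketch of knowledge distillation: a small "student" MLP learns to
# reproduce the teacher AE's per-event anomaly score.
from tensorflow.keras import layers, Model

def build_student(n_features, hidden=(16, 8)):
    # Candidate widths/depths are exactly the kind of hyperparameters
    # one would scan to trade accuracy against latency and footprint.
    inp = layers.Input(shape=(n_features,))
    h = inp
    for width in hidden:
        h = layers.Dense(width, activation="relu")(h)
    out = layers.Dense(1, activation="linear")(h)   # predicted anomaly score
    return Model(inp, out)

# Teacher targets are computed once, offline, at no FPGA cost:
# teacher_scores = anomaly_score(x_train)   # from the teacher sketch above
student = build_student(n_features=57)
student.compile(optimizer="adam", loss="mse")
# student.fit(x_train, teacher_scores, epochs=20, batch_size=1024)
```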
The strategy followed to distill the teacher model will be presented, together with considerations on the difference in performance observed when quantization is applied before, rather than after, the best student model has been found (i.e. quantization-aware training versus post-training quantization).
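The two orderings can be sketched as follows, assuming the QKeras library commonly used in FPGA-oriented (e.g. hls4ml) workflows; the bit widths and layer sizes are again illustrative assumptions.

```python
# "Quantize first" (quantization-aware training): weights, biases and
# activations are constrained to fixed-point precision already during
# training, so the hyperparameter search optimizes the model that will
# actually run on the FPGA.
from tensorflow.keras import layers, Model
from qkeras import QDense, QActivation, quantized_bits

def build_quantized_student(n_features, bits=8, integer=0):
    q = quantized_bits(bits, integer, alpha=1)
    inp = layers.Input(shape=(n_features,))
    h = QDense(16, kernel_quantizer=q, bias_quantizer=q)(inp)
    h = QActivation(f"quantized_relu({bits},{integer})")(h)
    h = QDense(8, kernel_quantizer=q, bias_quantizer=q)(h)
    h = QActivation(f"quantized_relu({bits},{integer})")(h)
    out = QDense(1, kernel_quantizer=q, bias_quantizer=q)(h)
    return Model(inp, out)

# "Quantize after" (post-training quantization): train the float student
# first, then cast its weights to fixed point when converting for the FPGA,
# e.g. through the precision settings of the conversion tool. This is
# simpler, but accuracy can degrade at low bit widths because the training
# never saw the quantization error.
```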