Machine Learning (ML) techniques are now widely adopted in many areas of High-Energy Physics (HEP) and will certainly play a significant role in the upcoming High-Luminosity LHC (HL-LHC) upgrade foreseen at CERN. The LHC will produce a huge amount of data to be collected by the experiments, posing challenges at the exascale.
Here, we present a Machine Learning as a Service solution for HEP (MLaaS4HEP) that performs an entire ML pipeline (reading data, processing data, training ML models, and serving predictions) in a completely model-agnostic fashion, directly using ROOT files of arbitrary size from local or distributed data sources.
With the new version of the MLaaS4HEP code, based on uproot4, we provide new features that improve users’ experience with the framework and their workflows; for example, users can provide preprocessing functions to be applied to ROOT data before the ML pipeline starts.
We then extend our approach to use local and cloud resources via an HTTP proxy, which allows physicists to submit their workflows using the HTTP protocol. We discuss how we enabled this pipeline on the INFN Cloud provider and supplement the discussion with real use-case examples from HEP analysis.
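
As an illustration of the user-supplied preprocessing step mentioned above, the following minimal sketch applies a preprocessing function to data read in fixed-size chunks before it would reach a training stage. It is not the MLaaS4HEP implementation: the names `iter_chunks`, `select_high_pt`, and `preprocessed_chunks`, the branch names `pt` and `eta`, and the threshold value are all hypothetical, and a small in-memory dictionary stands in for the branches of a ROOT file so that the example runs without external libraries.

```python
from typing import Callable, Dict, Iterator, List

# One chunk maps a branch name to the values of the events in that chunk.
Chunk = Dict[str, List[float]]


def iter_chunks(events: Chunk, chunk_size: int) -> Iterator[Chunk]:
    """Yield successive slices of at most chunk_size events from every branch."""
    n_events = len(next(iter(events.values()), []))
    for start in range(0, n_events, chunk_size):
        yield {name: values[start:start + chunk_size]
               for name, values in events.items()}


def select_high_pt(chunk: Chunk, threshold: float = 20.0) -> Chunk:
    """Hypothetical user preprocessing: keep events whose pt exceeds threshold."""
    keep = [i for i, pt in enumerate(chunk["pt"]) if pt > threshold]
    return {name: [values[i] for i in keep] for name, values in chunk.items()}


def preprocessed_chunks(events: Chunk, chunk_size: int,
                        preprocess: Callable[[Chunk], Chunk]) -> Iterator[Chunk]:
    """Apply the user function to each chunk lazily, one chunk at a time."""
    for chunk in iter_chunks(events, chunk_size):
        yield preprocess(chunk)


# Stand-in for branches read from a ROOT file (assumed toy values).
events = {
    "pt": [5.0, 25.0, 40.0, 10.0, 30.0],
    "eta": [0.1, -1.2, 2.0, 0.5, -0.3],
}

result = list(preprocessed_chunks(events, chunk_size=2, preprocess=select_high_pt))
```

Because the chunks are produced by a generator, only one chunk is held in memory at a time, which is the property that makes input of arbitrary size tractable in this kind of design; the preprocessing function stays independent of whatever model consumes the output.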