31 March 2019 to 5 April 2019
Academia Sinica
Asia/Taipei timezone

Boosting the CMS computing efficiency for data taking at higher luminosities

5 Apr 2019, 09:20
20m
Conference Room 2 (Academia Sinica)

Conference Room 2

Academia Sinica

Oral Presentation Physics (including HEP) and Engineering Applications Physics & Engineering Application

Speaker

Mr Leonardo Cristella (INFN Section of Bari)

Description

Thousands of physicists continuously analyze data collected by the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) using the CMS Remote Analysis Builder (CRAB) and the CMS global pool to exploit the resources of the World LHC Computing Grid. Efficient use of such an extensive and expensive system is crucial and the previous design needs to be upgraded for the next data taking at unprecedented luminosities. Supporting varied workflows while preserving efficient resource usage poses special challenges, like: scheduling of jobs in a multicore/pilot model where several single core jobs with an undefined runtime run inside pilot jobs with a fixed lifetime; avoiding that too many concurrent reads from same storage push jobs into I/O wait mode making CPU cycles go idle; monitor user activity to detect low-efficiency workflows and automatically provide them with advices for a smarter usage of the resources tailored on the use case. In this talk we report on two novel complementary approaches adopted in CMS to improve the scheduling efficiency of user analysis jobs: job automatic splitting, and job automatic estimated running time tuning. They both aim at finding an appropriate value for the scheduling runtime. With the automatic splitting mechanism, an estimation of the runtime of the jobs is performed upfront so that an appropriate value can be estimated for the scheduling runtime. With the automatic time tuning mechanism instead, the scheduling runtime is dynamically modified by analyzing the real runtime of jobs after they finish. We also report on how we used the flexibility of the global computing pool to tune the amount, kind, and running locations of jobs exploiting remote access to the input data. We discuss the strategies, concepts, details, and operational experiences, highlighting the pros and cons, and we show how such efforts have helped to improve the computing efficiency in CMS. Keyword Computing efficiency, High luminosity, CMS

Primary author

Mr Leonardo Cristella (INFN Section of Bari)

Co-authors

Mr Giacinto Donvito (INFN) Giorgio Pietro Maggi (Politecnico di Bari & INFN Section of Bari)

Presentation materials