21-25 March 2022
Academia Sinica
Europe/Zurich timezone

Open-source and cloud-native solutions for managing and analyzing heterogeneous and sensitive clinical Data

25 Mar 2022, 13:50
20m
Room 2

Oral Presentation Track 6: Data Management & Big Data

Speaker

Dr Daniele Spiga (INFN-PG)

Description

The need to handle and manage heterogeneous and possibly confidential data effectively keeps growing across many scientific domains.
PLANET (Pollution Lake ANalysis for Effective Therapy) is an INFN-funded research initiative that aims to carry out an observational study assessing a possible statistical association between environmental pollution and Covid-19 infection, symptoms and course. PLANET follows a "data-centric" approach that takes into account clinical components and environmental and pollution conditions, complementing primary data with many potential confounding factors such as population density, commuter density, socio-economic metrics and more. Beyond the scientific challenge, the main technical challenge of the project is collecting, indexing, storing and managing many types of datasets while guaranteeing FAIRness and adherence to the prescribed regulatory frameworks, such as the GDPR.
In this contribution we describe the open-source DataLake platform we developed, detailing its key features: the event-based storage system centered on MinIO, which automates metadata processing; the data pipeline, implemented via Argo Workflows; the GraphQL-based mechanisms to query object metadata; and the seamless integration of the platform with a multi-user compute environment. We also show how all these frameworks are integrated in the Enhanced PrIvacy and Compliance (EPIC) Cloud partition of the INFN-Cloud federation.
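
As an illustration of the event-based metadata step mentioned above, the following is a minimal sketch and not the PLANET implementation. It assumes the storage layer emits S3-style bucket notifications (a `Records` list with `eventName` and `s3.bucket` / `s3.object` fields, with URL-encoded object keys), which is the format MinIO follows for S3 compatibility. The `ObjectMetadata` record, the `extract_metadata` function, and the bucket and key names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List
from urllib.parse import unquote_plus


@dataclass(frozen=True)
class ObjectMetadata:
    """Hypothetical metadata record for one stored object."""
    bucket: str
    key: str
    size_bytes: int
    etag: str
    event_name: str


def extract_metadata(event: Dict[str, Any]) -> List[ObjectMetadata]:
    """Turn an S3-style bucket notification into metadata records.

    Only object-creation events are kept; other events (for example
    removals) are skipped.
    """
    records: List[ObjectMetadata] = []
    for record in event.get("Records", []):
        event_name = record.get("eventName", "")
        if not event_name.startswith("s3:ObjectCreated:"):
            continue
        s3 = record["s3"]
        obj = s3["object"]
        records.append(
            ObjectMetadata(
                bucket=s3["bucket"]["name"],
                # Object keys arrive URL-encoded in the notification payload.
                key=unquote_plus(obj["key"]),
                size_bytes=int(obj.get("size", 0)),
                etag=obj.get("eTag", ""),
                event_name=event_name,
            )
        )
    return records


# Hand-written sample event; bucket and key names are made up.
sample_event = {
    "Records": [
        {
            "eventName": "s3:ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "clinical-raw"},
                "object": {"key": "site-a/visit+01.csv", "size": 2048, "eTag": "abc123"},
            },
        },
        {
            "eventName": "s3:ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "clinical-raw"},
                "object": {"key": "site-a/old.csv"},
            },
        },
    ]
}

for meta in extract_metadata(sample_event):
    print(meta)
```

In a deployment along the lines described in the abstract, records like these could be handed to a workflow engine or written to an index that a GraphQL layer then queries; how the actual platform wires these parts together is not specified here.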

Primary authors

Alessandro Costantini (INFN-CNAF)
Barbara Martelli (INFN - CNAF)
Cristina Duma (Istituto Nazionale di Fisica Nucleare, CNAF)
Davide Salomoni (INFN)
Dr Daniele Spiga (INFN-PG)
Diego Ciangottini (Istituto Nazionale di Fisica Nucleare, Sezione di Perugia)
Elisabetta Ronchieri (INFN CNAF)
Mrs Giusy Sergi (INFN-CNAF)
Mr Jacopo Gasparetto (INFN-CNAF)
Prof. Loriano Storchi (Dipartimento di Farmacia Universita' G. d'Annunzio)
Mirco Tracolli (Istituto Nazionale di Fisica Nucleare, Sezione di Perugia)
Dr Pasquale Lubrano (INFN -Sezione Perugia)
Sara Cutini (INFN Perugia)
