Description
The rapid growth of the data available to scientists and scholars – in terms of Velocity and Variety as well as sheer Volume – is transforming research across disciplines. Increasingly, these data sets are generated not just through experiments, but as a byproduct of our day-to-day digital lives. This track explores the consequences of this growth and encourages submissions relating to two aspects in particular: first, the conceptual models and analytical techniques required to process data at scale; second, approaches and tools for creating and managing these digital assets throughout their lifecycle.
A further significant dimension is the automated generation and provisioning of metadata, whether from simulated data such as Digital Twins or from experiments that produce volumes of data far beyond manual annotation capacity. Automating metadata creation and making the resulting records available in searchable catalogues is crucial for aligning with the FAIR Data Principles, ensuring that data is findable and reusable. It is also pivotal in making data usable for machine-driven applications, notably in AI training scenarios.
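Purely as an illustration of the kind of automated metadata harvesting this implies (the file name, group layout, and attribute keys below are invented for the example and do not refer to any particular facility or catalogue), a minimal Python sketch might look as follows, producing a JSON record of the kind a searchable catalogue could ingest:

import json
import h5py
import numpy as np

# Create a small example file standing in for an instrument or simulation
# output. Every name and attribute key here is purely illustrative.
with h5py.File("run_0001.h5", "w") as f:
    dset = f.create_dataset("detector/image",
                            data=np.zeros((256, 256), dtype="float32"))
    dset.attrs["sample_id"] = "illustrative-sample-42"
    f.attrs["instrument"] = "example-beamline"
    f.attrs["start_time"] = "2025-04-15T08:00:00Z"

def harvest_metadata(path):
    # Walk the file and collect descriptive metadata automatically, so the
    # resulting record can be pushed to a catalogue without manual annotation.
    record = {"file": path, "datasets": []}
    with h5py.File(path, "r") as f:
        record.update({k: str(v) for k, v in f.attrs.items()})
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset):
                record["datasets"].append({
                    "name": name,
                    "shape": list(obj.shape),
                    "dtype": str(obj.dtype),
                    "attributes": {k: str(v) for k, v in obj.attrs.items()},
                })
        f.visititems(visit)
    return record

print(json.dumps(harvest_metadata("run_0001.h5"), indent=2))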
Data Management Planning within the EOSC CZ - Czech National Data Infrastructure for Research Data
Author: Jiří Marek, Open Science Manager at Masaryk University, Head of EOSC CZ Secretariat, Czech Republic
The rapid expansion of data availability is reshaping research methodologies across various disciplines. This surge, characterized by its Velocity, Variety, and Volume, is driven not...
Run2025 for sPHENIX brings higher data throughput and data volume requirements.
The sustained data throughput required for sPHENIX in 2025 is 20 GB/s. Once data taking starts in mid-April, this sustained data stream will remain constant, with no breaks, through December. The projected data volume is 200 PB.
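As a rough back-of-envelope illustration of what these figures imply (only the 20 GB/s rate and the 200 PB volume are taken from the requirements above; the decimal unit conversion and the full-duty-cycle assumption are simplifications for the example):

# Back-of-envelope check of the quoted Run2025 figures. Only the 20 GB/s
# sustained rate and the 200 PB projected volume come from the requirements;
# the decimal unit conversion (1 PB = 10^6 GB) and the assumption of running
# at the full sustained rate are illustrative simplifications.
sustained_rate_gb_per_s = 20
projected_volume_pb = 200

seconds_per_day = 86_400
pb_per_day = sustained_rate_gb_per_s * seconds_per_day / 1_000_000
days_at_full_rate = projected_volume_pb / pb_per_day

print(f"At {sustained_rate_gb_per_s} GB/s the experiment writes about "
      f"{pb_per_day:.2f} PB per day.")
print(f"The projected {projected_volume_pb} PB corresponds to roughly "
      f"{days_at_full_rate:.0f} days of data taking at the full sustained rate.")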
To meet these data throughput and volume requirements, we must rebuild...
CLUEstering is a versatile clustering library based on CLUE, a density-based, weighted clustering algorithm optimized for high-performance computing. The library offers a user-friendly Python interface and a C++ backend to maximize performance. CLUE’s parallel design is tailored to exploit modern hardware accelerators, enabling it to process large-scale datasets with exceptional scalability...
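By way of illustration only, and not as a reflection of CLUEstering's actual API, backend, or accelerated implementation, the self-contained sketch below shows the core ideas behind a CLUE-style density-based weighted clustering: weighted local densities, nearest-higher-density assignment, and seed/follower/outlier classification. All function and parameter names (clue_like_clustering, dc, rhoc, outlier_dc) and the threshold values are invented for the example.

import numpy as np

def clue_like_clustering(points, weights, dc=1.0, rhoc=2.0, outlier_dc=2.0):
    """Toy O(n^2) sketch of CLUE-style density-based weighted clustering.

    Parameter names and thresholds are illustrative only; this is not
    CLUEstering's interface and has none of its optimizations.
    """
    n = len(points)
    idx = np.arange(n)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # 1. Local density: weighted count of neighbours within dc (self included).
    rho = (weights[None, :] * (dist <= dc)).sum(axis=1)

    # 2. Distance to, and index of, the nearest point with higher density
    #    (ties broken by index so every non-maximum has a parent).
    delta = np.full(n, np.inf)
    nearest_higher = np.full(n, -1)
    for i in range(n):
        higher = np.where((rho > rho[i]) | ((rho == rho[i]) & (idx < i)))[0]
        if higher.size:
            j = higher[np.argmin(dist[i, higher])]
            delta[i], nearest_higher[i] = dist[i, j], j

    # 3. Seeds are dense points far from any denser point; outliers are
    #    sparse points far from any denser point.
    labels = np.full(n, -1)
    seeds = np.where((rho >= rhoc) & (delta > outlier_dc))[0]
    outliers = set(np.where((rho < rhoc) & (delta > outlier_dc))[0])
    for cluster_id, s in enumerate(seeds):
        labels[s] = cluster_id

    # 4. Followers inherit the label of their nearest higher-density
    #    neighbour, propagated from the densest points downwards.
    for i in np.argsort(-rho, kind="stable"):
        if labels[i] == -1 and i not in outliers and nearest_higher[i] != -1:
            labels[i] = labels[nearest_higher[i]]
    return labels

# Tiny usage example: two blobs of unit-weight points plus one stray point.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
                 rng.normal(5.0, 0.3, (20, 2)),
                 [[10.0, 10.0]]])
print(clue_like_clustering(pts, np.ones(len(pts))))

The quadratic all-pairs distance computation above is kept only for readability; it is precisely the kind of step an optimized, accelerator-oriented implementation would organize differently.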
The 14 beamlines of phase I of the High Energy Photon Source (HEPS) will produce more than 300 PB of raw data per year. Efficiently storing, analyzing, and sharing this huge amount of data presents a significant challenge for HEPS.
The HEPS Computing and Communication System (HEPSCC), also called the HEPS Computing Center, is an essential work group responsible for IT R&D and services for the facility,...