Speaker
Prof.
Xuebin Chi
(Computer Network Information Center, Chinese Academy of Sciences)
Description
Over the past 20 years, CNGrid, the national high-performance computing environment in China, has developed remarkably on several fronts: computing and storage capacity, the number of software packages and user accounts, and the projects and research papers it supports. Because HPC systems are widely used to accelerate scientific experiments and simulations, the main goal of CNGrid is to provide computing resources and services to researchers across disciplines. An HPC environment has several advantages over a single supercomputer. A single supercomputer offers a limited range of software and sometimes has to be shut down for maintenance, whereas an environment composed of multiple HPC systems can offer much more software covering different disciplines and can continue to provide services when any single system in it is offline.
During the operation and maintenance of CNGrid, we found that only a small fraction of users can use supercomputer systems skilfully; the vast majority are very familiar with their own research but know almost nothing about how to use HPC systems to solve their problems. To help these novice users, CNGrid has built a platform service that aggregates HPC resources from the infrastructure pool and the application pool and provides easy access through both command-line consoles and web portals. Moreover, by providing APIs to developers, CNGrid can support communities and platforms for specific research subjects, and can gather more resources to run simulation tasks for large scientific research facilities.
CNGrid now stands at the threshold of the new era of exascale computing, which brings new challenges. HPC environments must answer two categories of questions: the construction of the environment and user services. The specific questions are listed below:
- How to support the aggregation of HPC resources and data at exascale?
- How to design the task scheduler to improve the HPC resource usage and efficiency?
- How to overcome the bottleneck of internet transmission speed?
- How to ensure that the environment works stably and securely?
- How to give users easier access to HPC resources?
- How to support communities and platforms of various subjects in a unified way?
- How to evaluate the resource level and the service level in the environment?
We will present the considerations and efforts that have been made, and are being made, to address these questions during the ongoing construction of CNGrid. We will also introduce some plans for the development directions of the next stage of HPC environment construction in China.
Primary author
Prof.
Xuebin Chi
(Computer Network Information Center, Chinese Academy of Sciences)
Co-authors
Mr
Haili XIAO
(Computer Network Information Center, Chinese Academy of Sciences)
Dr
Yining Zhao
(Computer Network Information Center, Chinese Academy of Sciences)