Access and manage data storage efficiently and at scale on exascale systems


April 18, 2022 — Rob Farber is a global technology consultant and author with extensive experience in HPC and developing machine learning technologies that he applies in national labs and commercial organizations. The following is an excerpt from his article regarding the ExaIO project which is part of the Exascale Computing Project (ECP).

As the word exascale implies, next-generation exascale supercomputer systems will provide 10^18 flop/s of scalable computing capacity. All of this computing capacity will be wasted if the storage hardware and software I/O stack cannot meet the storage needs of applications running at scale, leaving applications to drown in data as they attempt to write to storage or starve while waiting to read data from storage.

Suren Byna, ExaIO Project Principal Investigator in the Exascale Computing Project (ECP) and Computing Staff Scientist at Lawrence Berkeley National Laboratory, emphasizes the need to be ready to meet the I/O needs of exascale supercomputers, noting that storage is usually the last subsystem available for testing on these systems. Because the storage-focused ExaIO project addresses the I/O needs of many ECP software technology (ST), application development (AD), and hardware integration (HI) projects, Byna observes that it must prepare now to be ready when these systems go into production. “The success of the ExaIO project means addressing three trends that are becoming a triggering factor at exascale,” Byna said: “(1) too much data being generated, (2) too much data being consumed, and (3) storage performance becoming a trigger for many applications. Additionally, exascale-enabled hardware solutions involve both new and complex I/O and storage architectures that require enhancements to existing I/O libraries. We address these hardware trends and needs in ExaIO through the HDF5 [Hierarchical Data Format version 5] library and UnifyFS.”

Byna emphasized the importance of adapting I/O technologies to be exascale-ready, noting that “without the funding provided by DOE and ECP to improve I/O libraries such as HDF5, applications using HDF5 will not be able to take advantage of the new exascale storage architectures. The funding gives us the ability to develop new systems (like UnifyFS) that push I/O technologies to the next generation.” Byna also spoke about the breadth of the technical effort that comes from recognizing the general need for high-performance storage. “We are adding new features to HDF5, a popular data model, file format, and I/O library. The ExaIO team is also developing a new file system, called UnifyFS, to take advantage of fast storage layers distributed across compute nodes in a supercomputing system.” The project involves members of Lawrence Berkeley National Laboratory, The HDF Group (THG), which is the lead developer and maintainer of HDF5, Argonne National Laboratory, Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, and North Carolina State University.
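To give a flavor of the HDF5 data model mentioned above, here is a minimal sketch using the widely used h5py Python bindings (an assumption; ExaIO's own new features are not shown here). It writes a small dataset with metadata into an in-memory HDF5 file and reads it back; the group, dataset, and attribute names are illustrative only.

```python
import h5py
import numpy as np

# The 'core' driver keeps the HDF5 file entirely in memory;
# backing_store=False means nothing is written to disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("simulation")                 # hierarchical namespace
    data = np.arange(12, dtype=np.float64).reshape(3, 4)
    dset = grp.create_dataset("temperature", data=data)
    dset.attrs["units"] = "K"                          # metadata as an attribute
    readback = f["simulation/temperature"][:]          # read the array back
    units = dset.attrs["units"]
```

The same hierarchical model (groups, datasets, attributes) scales from this toy example to the parallel, multi-terabyte writes that exascale applications perform through HDF5's MPI-IO backend.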

To read more, visit this link.

Source: Rob Farber, contributing editor at ECP

