High-performance computing (HPC) will reach the exascale level within the next few years. Single compute clusters will combine the performance of millions of cores to perform more than 10^18 operations per second. Besides the challenges of moving to exascale computing, HPC is in transition from being compute-centric to being data-centric. The management and handling of data are becoming increasingly important, and it will be crucial to scale data capacity and data bandwidth.
Our group, Efficient Computing and Storage at the Johannes Gutenberg University Mainz, focuses on storage systems and scalable computing.
We work on both block-level and file-level storage. We develop protocols and architectures that use the underlying storage media efficiently, and we integrate these architectures into scalable environments. New storage technologies, such as solid-state disks (SSDs), are integrated into these environments and help us deliver optimized storage systems, e.g., in the context of data deduplication and backup.
In the context of next-generation HPC, we investigate how to combine the performance of accelerators and standard processors within a single framework. Our compiler extensions automatically extend scientific source codes so that the resulting applications can be moved seamlessly between CPUs and GPUs and the operating system can optimize the nodes’ throughput. We also investigate optimized utilization and energy efficiency in the context of cloud computing, where we simplify access to scientific applications and HPC.
Most recent publications
- Alvaro Frank, Manuel Baumgartner, Reza Salkhordeh, and André Brinkmann. 2021. Improving checkpointing intervals by considering individual job failure probabilities. In 35th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 209–309.
- Frederic Schimmelpfennig, Marc-André Vef, Reza Salkhordeh, Alberto Miranda, Ramon Nou, and André Brinkmann. 2021. Streamlining distributed Deep Learning I/O with ad hoc file systems. In 2021 IEEE International Conference on Cluster Computing (CLUSTER), 169–180.
- Nicolas Krauter, Patrick Raaf, Peter Braam, Reza Salkhordeh, Sebastian Erdweg, and André Brinkmann. 2021. Persistent Software Transactional Memory in Haskell. Proc. ACM Program. Lang. 5.
- Petra Berenbrink, André Brinkmann, Robert Elsäßer, Tom Friedetzky, and Lars Nagel. 2021. Randomized renaming in shared memory systems. Journal of Parallel and Distributed Computing 150: 112–120.
- Wen Cheng, Chunyan Li, Lingfang Zeng, Yingjin Qian, Xi Li, and André Brinkmann. 2021. NVMM-Oriented Hierarchical Persistent Client Caching for Lustre. ACM Transactions on Storage 17: 6:1–6:22.