Challenges and innovation in the age of Exascale
High-performance computing (HPC) simulations provide unparalleled insight into new scientific discoveries and are essential tools for industrial product design. Over the last decades, HPC technology development has been fuelled by the scientific and engineering communities’ unquenchable thirst for ever more computing power. Exascale has been on everyone’s mind ever since the first petaflop system was deployed in 2008. The target was clear: 1 exaflops within a 20-megawatt (MW) power envelope by 2020. As of today, a first Exascale system has been installed in the USA, more are coming around the globe, and post-Exascale supercomputers are already being planned. The focus is clearly on meaningful application performance (Exascale = exaflops delivered to HPC applications), and this still entails multiple challenges well beyond raw hardware performance (exaflops).
HPC applications have evolved to deliver more performance through unprecedented levels of parallelism, but also through new techniques. Notably, (Big) Data Analysis was introduced to refine computer models through the mining of real-life physical observations. More recently, Artificial Intelligence (AI) frameworks have made possible the use of surrogate models, drastically accelerating a significant range of HPC applications and considerably improving the quality of the simulations.
The diversity challenge
Alongside the evolution of HPC application software, HPC hardware architecture has also changed significantly. Ten years ago, the HPC ecosystem looked quite uniform, with most supercomputers based on x86 CPUs. By contrast, today’s supercomputer architectures are quite diverse. HPC systems are now commonly composed of several partitions, each featuring different types of compute nodes. On the CPU side, instruction set architectures (ISAs) other than traditional x86 are in use, particularly ARM and, possibly in the future, RISC-V. GPUs have so far been the HPC accelerators of choice, and are now available from multiple providers. In addition to GPUs, other accelerators are on offer, such as FPGAs or dedicated AI processing units (IPUs, TPUs …). This unprecedented wave of innovation in processor technology gives developers the opportunity to boost HPC application performance, while also confronting them with the challenge of such heterogeneous environments.
Energy efficiency at Exascale
Even though each new generation of computing elements delivers more performance per watt thanks to new architectures and advances in electronics manufacturing, the overall power consumption of Exascale systems is nevertheless reaching costly levels. Exascale HPC datacenters are now commonly configured to provide 20 MW of electrical power, or more. With rising electricity prices, each MW now costs more than $1 million per year. Hence, over a period of 5 years, the electricity bill for a 20 MW system will add up to roughly $100 million. With these considerations in mind, the supercomputer and datacenter utilities, and most importantly the cooling system, must be carefully optimized. Additionally, the power consumption of individual GPUs and CPUs has been growing steadily, and will soon exceed 500 W or even reach 1000 W. With such power concentration, the heat dissipation requirements far exceed the capacity of classical air-cooled servers. Liquid cooling has proven to be a practical solution for such requirements at Exascale.
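The electricity-bill figure above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a constant power draw and a flat cost of $1 million per MW per year, which is the rounded figure quoted in the text rather than a measured tariff.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours, ignoring leap years


def electricity_cost_usd(power_mw, usd_per_mw_year, years):
    """Total electricity cost for a constant power draw over a number of years."""
    return power_mw * usd_per_mw_year * years


def implied_price_per_kwh(usd_per_mw_year):
    """Price per kWh implied by a yearly cost per MW of continuous draw."""
    kwh_per_mw_year = 1000 * HOURS_PER_YEAR  # 1 MW = 1000 kW
    return usd_per_mw_year / kwh_per_mw_year


# Figures from the text: 20 MW, about $1 million per MW per year, 5 years.
total = electricity_cost_usd(power_mw=20, usd_per_mw_year=1_000_000, years=5)
print(f"5-year bill: ${total / 1e6:.0f} million")  # 5-year bill: $100 million
print(f"Implied price: ${implied_price_per_kwh(1_000_000):.3f} per kWh")
```

Under these assumptions, $1 million per MW per year corresponds to a little over 11 cents per kWh, so the real bill scales directly with the local electricity price.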
Improving on previous generations, the newly introduced BullSequana XH3000 platform by Atos greatly expands the power supply and cooling capacities of each rack. As a result, a higher inlet temperature is admissible, and the datacenter free-cooling range is further extended. Chilled water, a strong requirement for classical air-cooled servers, is not necessary with Direct Liquid Cooling (DLC), which allows a Power Usage Effectiveness (PUE) as low as 1.05 in most datacenters, all year round. On average, DLC reduces the overall electricity bill of HPC datacenters by 40%.
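To put the PUE figure in concrete terms: PUE is the ratio of total facility power to the power drawn by the IT equipment itself, so a PUE of 1.05 means 5% overhead for cooling and other utilities. The sketch below is a minimal illustration; the function names are hypothetical, and the 20 MW IT load is an assumption borrowed from the datacenter figure quoted earlier, not a measured value.

```python
def facility_power_mw(it_power_mw, pue):
    """Total facility power implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_power_mw * pue


def overhead_power_mw(it_power_mw, pue):
    """Power spent on cooling and other non-IT utilities."""
    return facility_power_mw(it_power_mw, pue) - it_power_mw


it_load_mw = 20.0  # assumed IT load, for illustration only
print(round(facility_power_mw(it_load_mw, 1.05), 2))  # 21.0
print(round(overhead_power_mw(it_load_mw, 1.05), 2))  # 1.0
```

In this illustration, only about 1 MW out of 21 MW goes to anything other than computing.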
The future of Exascale is hybrid: the role of quantum and AI
Atos sees the future coupling of HPC and quantum computing as important. Within the framework of the EuroHPC HPCQS project, a first prototype will allow researchers to explore these possibilities. The Atos QLM (Quantum Learning Machine) software environment will ensure a smooth integration of quantum computing with the HPC platform.
AI, which is now used to accelerate HPC applications, also plays an important role in energy optimization, resource scheduling, data management, performance optimization and preventive system maintenance. These management software tools complement the hardware technology to improve the overall energy efficiency of the system. Overall, AI-augmented solutions are the backbone of making advanced computing faster and more efficient, from hardware to software.
We believe that the future of supercomputing is in hybrid platforms, combining cutting-edge processing units to deliver unparalleled computing power at any scale. These solutions will prove crucial in tackling the emerging engineering and scientific challenges of the years and decades to come.
Future of Compute Week 2022
During this week we will deep-dive into a number of themes that, if addressed, could develop our large-scale compute infrastructure to support the UK’s ambitions as a science and technology superpower.