Department of Energy ready to use Frontier supercomputer to solve 24 science problems
The Department of Energy is poised to use its Frontier supercomputer to tackle 24 initial science and engineering problems with its applications and software stack.
The system, which is a keystone of the agency’s Exascale Computing Project, was late last month declared — albeit with some caveats — to have retaken the position as the world’s fastest exascale system.
The Department of Energy now plans to scale to about 9,400 nodes of the Oak Ridge Leadership Computing Facility’s exascale computer to simulate problems of national interest over the next 18 months. Nodes are the separate computers that make up a high-performance computing (HPC) cluster, in this case one processing at a speed of 1,880 petaflops.
The Exascale Computing Project (ECP) has been running on about 200 nodes of Frontier hardware since January, with one-and-a-half of the exascale computer’s 74 cabinets set aside for development teams to prepare for the transition and identify last-minute issues with it.
“It’s crunch time, and we’re very excited,” Doug Kothe, director of the ECP, told FedScoop. “Within the next couple of months we’re going to be getting on Frontier and demonstrating very specific application capabilities and performance capabilities.”
Working with DOE sponsors, ECP chose the 24 initial science and engineering problems to focus on, though future possibilities number in the hundreds or thousands, Kothe said.
Problems span key national pillars: economic, national and energy security; scientific discovery; and even health care — so long as they’re exascale problems.
“An exascale problem is one that was really not solvable or approachable without this kind of power,” Kothe said. “It might mean that my concept-to-design cycle is a certain period, and to be able to simulate some phenomena in a design — with all the complexity I need to make a good solution — takes much longer than that design cycle.”
Engineers want results in days, hours or minutes, so months-long simulations aren’t practical. Exascale computers can simulate more complex phenomena with higher confidence while still allowing for much quicker hypothesis cycles.
Energy production is a core focus for DOE, so ECP will use Frontier to simulate fusion and nuclear fission reactors, wind energy farms, power grids, the clean combustion of fossil fuels like coal, and internal combustion engines used by land-based turbines and gas power plants. ECP has a power grid app for energy transmission and hopes to simulate a large portion of the nation’s interconnections.
ECP will also use Frontier to answer fundamental science questions: the origin of the elements in the universe, studied through astrophysics simulations of neutron star mergers and supernovae; the evolution of the universe, known as cosmology; and the fundamental forces of nature, as described by theories like quantum chromodynamics.
SLAC National Accelerator Laboratory has an ECP project simulating how light sources interact with matter, in which photons from a free-electron laser are shone through biological samples or metal alloys to reveal their structure.
Nontraditional apps include simulations of genome assembly for microbiomes, of how materials respond in extreme conditions like a radiation environment, and of COVID-19 virus docking scenarios.
“I’m pretty confident that we’re going to see some major discoveries come out of this exascale era with the applications that we’ve developed,” Kothe said.
ECP is part of the broader National Strategic Computing Initiative (NSCI) begun by DOE in 2016, and its apps and software aren’t exascale-specific — meaning they run on laptops, desktops and clusters too.
The Extreme Scale Scientific Software Stack (E4S), released more than three years ago at E4S.io, comprises the libraries that apps need to come up with solutions, and it sits atop a Hewlett Packard Enterprise operating system.
DOE leads NSCI, and its Industry and Agency Council targets five agencies: the Department of Defense, National Institutes of Health, National Science Foundation, National Oceanic and Atmospheric Administration, and NASA.
DOE is working with NIH on a machine learning project to address cancer, called the CANcer Distributed Learning Environment (CANDLE), using Frontier.
Frontier apps may be built on models of multiple physical phenomena, such as fluid flow, heat transfer and material science, that are exportable to apps used by agencies outside the big five.
Developing Frontier was a codesign effort between ECP and the Oak Ridge Leadership Computing Facility, as is the case with the Aurora and El Capitan exascale computers at the Argonne and Lawrence Livermore national labs, respectively. ECP set requirements for the systems.
“We understand very well the hardware that’s being deployed,” Kothe said. “And we’re tailoring our applications to run well and exploit that hardware.”