Mechanical engineer Baskar Ganapathysubramanian is developing an app that would use artificial intelligence (AI) to help farmers identify pests and advise on how to combat them. Computer scientist Anuj Karpatne sees AI as the key to forecasting how climate change, land use, and increased demand will affect water quality in U.S. lakes. Biochemist David Baker has enlisted AI to design novel molecules that could lead to new drugs.
All three scientists already receive ample government funding to support their labs. But they lack access to the advanced computers they need to train their AI systems.
On 6 May, the National Science Foundation (NSF) announced that the trio were among the 35 winners of government-funded supercomputer time in a 2-year pilot project that aims to boost AI-driven research across many disciplines and improve the safety, reliability, and trustworthiness of AI systems. The awards are the first from the National Artificial Intelligence Research Resource (NAIRR) program, which President Joe Biden asked NSF to lead as part of an October 2023 executive order on AI.
“Computational biologists have never had a way to get access to [computing] at this level,” says Baker, who leads the Institute for Protein Design at the University of Washington. “It’s hard for academics to keep up with industry.”
NAIRR is designed to narrow that gap. “We want to broaden access to large-scale computing so that more people can tackle key societal challenges,” says Katie Antypas, who leads NSF’s Office of Advanced Cyberinfrastructure, which is managing the program. “We also want to give students hands-on access to these tools.”
NAIRR’s road map comes from a 6-year, $2.6 billion plan for growing the nation’s academic AI research capacity that a blue-ribbon panel proposed in January 2023. NSF expects to make several dozen more awards in the coming weeks from the initial pool of 150 proposals, Antypas adds, and this week it announced a second competition, with awards made on a rolling basis.
The NAIRR-funded scientists will have access to supercomputing facilities supported by NSF and the Department of Energy (DOE). Those machines are already in high demand, says Dan Stanzione, director of the NSF-funded Texas Advanced Computing Center, which hosts the Frontera and Lonestar machines. But he and other center directors have agreed to set aside time for the NAIRR projects.
The supercomputer time “is coming out of my discretionary fund for advanced scientific computing because NAIRR is a priority,” Stanzione says. “But we also hope to add capacity.” NSF’s 2025 budget proposal includes a request for a $150 million down payment on a new half-billion-dollar machine at the Texas center. DOE’s recent decision to extend the life of the 6-year-old Summit supercomputer at Oak Ridge National Laboratory, which has been eclipsed by an even more powerful machine, will also be a boon to NAIRR scientists.
So far, NSF has reallocated internal resources to administer the program, with help from a dozen other federal agencies. NAIRR will get its own funding, however, if Congress approves NSF’s request for $30 million to continue the awards through 2025. In addition, some 26 companies, including such AI heavyweights as Microsoft, Amazon Web Services, and NVIDIA, have agreed to provide computing resources. “Whatever money we get from Congress will not be enough, so we will need partners,” NSF Director Sethuraman Panchanathan admitted during a White House AI event on Monday that featured several NAIRR awardees.
Karpatne, a professor at Virginia Polytechnic Institute and State University, hopes the NAIRR grant will accelerate progress on a water quality model he has dubbed Lake-GPT. Unlike traditional models, which are based on detailed monitoring of one or two lakes and apply only to those settings, Lake-GPT aims to predict the fate of water quality in thousands of lakes across the country. The model is being trained on the vast amounts of environmental data collected by projects such as the NSF-funded National Ecological Observatory Network.
That training will require lots of computing time, so Karpatne asked for 12 million GPU hours on any available machine. (Graphics processing units, or GPUs, are the computer chips favored in most AI work.) He wound up with less than 10% of that amount, some 750,000 GPU hours, on Summit. That’s enough to do forecasts of 20 lakes over the next 6 months, he says, and to obtain results that he hopes will earn him additional computer time in subsequent rounds.
In 2021, Ganapathysubramanian won a 5-year, $20 million grant from the U.S. Department of Agriculture (USDA) to lead an AI Institute for Resilient Agriculture at Iowa State University. He says the USDA money goes to support personnel and research projects that don’t need “heavy-duty” scientific computing.
But heavy-duty computing is what he needs to scale up InsectNet, a model that can identify pest species in mobile phone pictures. Its current version was trained on some 10 million images of insects, but Ganapathysubramanian hopes to improve its accuracy by expanding the training to 150 million images. He’s been given 920,000 GPU hours on Frontera to do that. He also wants to add a trustworthy chatbot that farmers can access on their phones to get advice on dealing with the pests. Eventually he’d like to use the farmers’ real-time requests, along with data from drones, to identify agricultural “hot spots”: infestations that may require immediate intervention.
For Shu Hu, a computer scientist at Purdue University, the chance to train his students is just as important as the 4300 GPU hours he will get on the Lonestar machine to train his AI model to detect fake images. Two of his graduate students are hoping to learn how to stay one step ahead of the forgers by searching for common features in the growing universe of deepfakes. At the same time, two undergraduates will gain access to DOE-funded course materials on AI.
The first cohort of winners comes from 17 states, Antypas notes, suggesting that NAIRR has already taken a small step toward making AI tools more accessible. She says federal officials are also discussing how NAIRR might provide researchers with not just computer time, but also access to vast training data sets, such as imaging data from federally funded clinical trials that have been curated by subject matter experts. “We want NAIRR to become a true national resource,” she says.
