Sage is a cyberinfrastructure (CI) testbed to support artificial intelligence (AI) explorations at the intersection of edge computing, real-time streaming sensor data, and interconnections to other National Science Foundation (NSF) CI, including high-performance computing (HPC). The testbed is part of the National Discovery Cloud for Climate (NDC-C). Sage NDC-C leverages prior NSF investments in Mid-Scale Research Infrastructure. Sage gives students and scientists access to an open, national-scale CI to explore and advance AI algorithms and techniques, develop real-time AI workflows for climate and natural disasters, and design new CI to support advanced experimental instruments and sensors with AI-enabled computation at the edge. Equally important, the testbed provides insights into how a future national-scale computing resource can be used to advance AI and the understanding of the coupled, complex adaptive systems that constitute the Earth’s climate and biosphere.
Sage NDC-C: A Testbed Supporting Artificial Intelligence Research Spanning the Computing Continuum
Project leads
Pete Beckman (Northwestern University, PI), Ilkay Altintas (University of California San Diego, co-PI), Nicola Ferrier (Northwestern University, co-PI), Eugene Kelly (Colorado State University, co-PI), Michael Papka (University of Illinois Chicago, co-PI)
Key team members
Jason Leigh (University of Hawaii), Jim Olds (George Mason University), Manish Parashar (University of Utah), Dan Reed (University of Utah), Rajesh Sankaran (Northwestern University), Sean Shahkarami (Northwestern University), Helen Taaffe (Northwestern University), Valerie Taylor (Northwestern University), Doug Toomey (University of Oregon)
Advancing AI Research
Sage NDC-C provides a distributed, real-world system for exploring new AI methods and technologies. It is a unique cyber-physical AI platform that supports a wide range of AI inquiry using streaming environmental data. For example, Sage has allowed scientists to develop and evaluate new, use-inspired AI@Edge models to track and classify cloud movement, as well as new self-supervised learning approaches to characterize birdsong. Sage users have advanced several aspects of AI research, including learning at the edge, federated learning, self-supervised learning, and the exploration of multimodal large language models deployed to the edge.
Broader Impacts
Sage NDC-C helps diverse groups of students, researchers, and community scientists to learn about, protect, and even guide restoration of their natural resources, integrating ethics with AI systems in real-world settings. The testbed benefits the research community in three ways. First, it provides a unique testbed that facilitates computing research and scientific advancement that require access to advanced programmable sensors networked with HPC and cloud systems. Second, the data generated from the programmable sensors will be made available to the wider research community to continue to advance research in AI. Third, Sage NDC-C facilitates multidisciplinary research by providing a common platform into which science teams with different sensors can bring their unique data for more robust analysis. Through collaborative design and development, Sage NDC-C addresses the National Science Board’s Vision 2030 “Missing Millions” challenge.
Innovative Partnerships
- The Great Lakes Indian Fish & Wildlife Commission (GLIFWC) represents eleven Ojibwe tribes in Minnesota, Wisconsin, and Michigan who reserved hunting, fishing, and gathering rights in the 1836, 1837, 1842, and 1854 Treaties with the United States government. Sage NDC-C works with the Ojibwe and GLIFWC to deploy Sage nodes to aid understanding of Manoomin (wild rice) and the impacts of climate disruption.
- ARM designs low-power CPUs, from microcontrollers to AI accelerators. Sage has partnered with ARM to build open-source software and support student activities.
- The National Ecological Observatory Network (NEON) is a continental-scale research project to measure and understand our ecosystem. AI-enabled Sage nodes, deployed into the NEON infrastructure, have provided new ecosystem and climate-related measurements.
- The City of Chicago has partnered with Sage to deploy AI-enabled nodes on city streets, under a published privacy policy and data use agreement, to advance AI and privacy-enhancing AI research.
- Sage has partnered with wildfire detection networks in Oregon and California to explore new AI algorithms, running on Sage nodes, that show promise for the early detection of wildfires.
Advancing Domain Science
To test and validate Sage NDC-C, we target an exemplar problem: the increasing effects of climate disruption on communities and ecosystems. The rapid pace of climate disruption is triggering extremes in weather activity, resulting in an increase in the intensity and frequency of flooding, droughts, and wildfires throughout much of the country. Understanding these extreme events and mitigating their impacts on society, air and water quality, and ecosystem health require coupling diverse data from HPC models, remote sensing satellites, and ground-based instruments and sensor networks. Bridging and synthesizing these data streams, which are diverse in their temporal and spatial resolution, will require developing an AI-enabled computing continuum that supports model-driven experimentation and AI@Edge observations. Sage NDC-C explores the key CI features required to design AI-driven observation-to-HPC CI that can be used by scientists to couple real-time observational data, weather and climate models, and new AI techniques.