Sustainability Modeling and Management for Heterogeneous AI Systems (Supplement)

As we aim to advance the study of Safe, Secure, and Trustworthy AI, we must do so in a sustainable and efficient manner. The recent advent of Large Language Models has accelerated the need to design efficient and environmentally responsible AI systems. For example, training GPT-3 alone emitted over 500 tons of carbon, roughly the average annual emissions of 100 cars. This trend is only accelerating, with data centers alone projected to account for as much as 14% of U.S. energy consumption by 2030 (see the attached figure, with data from https://www.semianalysis.com/p/ai-datacenter-energy-dilemma-race ). Recent AI accelerators promise workload acceleration, often touting high FLOP counts, large memory bandwidth, or novel dataflow architectures. However, with the plethora of specialized non-GPU accelerators available today, researchers are often left wondering which is the optimal choice for a given workload. This poses a high barrier to entry, as it is often unknown how a given workload will perform on a novel system. A suboptimal selection can incur costly overheads, including a large sustainability footprint that could otherwise be avoided.

In response, we propose an integrated research plan with three vectors (carbon attribution, heterogeneous processor management, and heterogeneous model management) that together form a comprehensive approach to sustainable computing. First, power and carbon analysis estimates the environmental impact of artificial intelligence (AI), enabling intelligent decision-making when scheduling workloads and allocating resources. Second, provisioning heterogeneous processors allows a system to leverage diverse capabilities, matching computational tasks to the appropriate hardware to optimize performance and energy efficiency. Third, training heterogeneous AI models allows a system to perform computation judiciously, matching inference queries to the appropriate model to optimize accuracy and resource efficiency.
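
To make the first two vectors concrete, the sketch below illustrates (in simplified form, not our actual scheduler) the kind of decision they enable: given per-platform estimates of runtime and power for a workload, together with the grid's carbon intensity, choose the platform with the lowest operational carbon. The platform names and numbers are illustrative assumptions.

```python
# Illustrative sketch: pick the hardware platform with the lowest operational
# carbon for a workload, given predicted runtime and power per platform.
from dataclasses import dataclass

@dataclass
class PlatformEstimate:
    name: str
    runtime_s: float     # predicted runtime of the workload (seconds)
    avg_power_w: float    # predicted average power draw while running (watts)

def pick_platform(estimates, carbon_intensity_g_per_kwh=400.0):
    """Return the platform estimate with the lowest operational carbon."""
    def grams_co2(e):
        kwh = e.avg_power_w * e.runtime_s / 3.6e6   # convert W*s to kWh
        return kwh * carbon_intensity_g_per_kwh
    return min(estimates, key=grams_co2)

candidates = [
    PlatformEstimate("gpu-a", runtime_s=1200, avg_power_w=350),
    PlatformEstimate("gpu-b", runtime_s=900, avg_power_w=500),
    PlatformEstimate("accelerator-c", runtime_s=1500, avg_power_w=180),
]
print(pick_platform(candidates).name)   # "accelerator-c" in this toy example
```

In practice, the runtime and power inputs would come from the measurement and prediction frameworks described below, and the carbon intensity would vary with time and location.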

This project is a supplement to Carbon Connect, our NSF Expedition in Computing that is (i) developing carbon accounting strategies for computing technology; (ii) mitigating embodied carbon by re-thinking hardware design; (iii) reducing operational carbon by re-thinking systems management; and (iv) balancing the complex interplay between embodied versus operational carbon. This supplement through NAIRR will allow for the exploration of the performance, efficiency, and sustainability of diverse AI architectures and hardware platforms.

David Brooks (Professor, Harvard University), Emma Strubell (Assistant Professor, Carnegie Mellon University), Benjamin C. Lee (Professor, University of Pennsylvania)

We will develop frameworks for extensible carbon telemetry and fair carbon attribution. Carbon attribution in turn relies on power attribution, which we will ground in a study of energy efficiency and energy proportionality across NAIRR's diverse hardware resources, characterizing the relationship between resource utilization and power consumption. We will also microbenchmark common machine learning operators across diverse hardware and develop machine learning models that predict performance and energy efficiency from the microbenchmark data. This research has the potential to significantly enhance NAIRR infrastructure, contributing to its utility and sustainability. First, by developing frameworks for power and carbon analysis, we lay the foundations for a carbon dashboard. This tool would allow users to understand the resource and environmental intensity of their AI computations, promoting awareness and encouraging the adoption of sustainable practices. Second, by developing frameworks for performance analysis, we lay the foundations for a recommendation system. This system could guide users to the most suitable hardware platforms for their specific AI tasks, optimizing performance and reducing energy use.
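
As a flavor of what operator-level microbenchmarking with power telemetry looks like, the minimal sketch below times a dense matrix multiply while polling GPU power. It assumes an NVIDIA GPU exposed through NVML (via the pynvml package) and PyTorch; the operator, sizes, and polling scheme are illustrative, not our benchmarking harness, and a production harness would sample power on a background thread at a fixed interval.

```python
# Illustrative sketch: measure energy efficiency (FLOPs per joule) of a matmul
# by coarsely polling GPU board power through NVML while the kernel runs.
import time
import pynvml
import torch

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

def sample_power_w():
    # NVML reports instantaneous board power in milliwatts.
    return pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0

def microbenchmark_matmul(n=4096, iters=50):
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")
    torch.cuda.synchronize()
    samples, start = [], time.time()
    for _ in range(iters):
        _ = a @ b
        torch.cuda.synchronize()          # wait for the kernel to finish
        samples.append(sample_power_w())  # coarse power sample per iteration
    elapsed = time.time() - start
    avg_power_w = sum(samples) / len(samples)
    energy_j = avg_power_w * elapsed      # energy = average power x time
    flops = 2 * n**3 * iters              # 2*n^3 FLOPs per dense matmul
    return flops / energy_j

print(f"{microbenchmark_matmul():.3e} FLOPs/J")
```

Repeating such measurements across operators, problem sizes, and platforms produces the data needed to fit predictive models of performance and energy efficiency.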

We will provide performance and power benchmarks for emerging AI model architectures, driving research in ensembles of efficient models. First, we will benchmark energy efficiency across a range of model architectures that have shown promise in generative artificial intelligence, including well-known models such as transformers as well as emerging variants such as state space models. Using NAIRR's diverse hardware resources, we will measure energy efficiency and identify the relative strengths of each for sparse versus dense computation, for control flow, and for memory locality. Based on these findings, we will then experiment with ensembles of heterogeneous models or model components (e.g., mixture-of-experts). We will apply different compression strategies to different parts of the model, producing heterogeneous experts that offer greater efficiency. We will then explore how these experts could be mapped to different hardware platforms.
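
To illustrate the kind of heterogeneous ensemble we have in mind, the sketch below shows a toy mixture-of-experts layer in PyTorch whose experts have different hidden widths, so that routing a token to a smaller expert trades accuracy for lower compute and energy. The expert widths and the hard top-1 router are illustrative assumptions, not our model design.

```python
# Illustrative sketch: token-level routing across experts of different sizes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeterogeneousMoE(nn.Module):
    def __init__(self, d_model=512, expert_widths=(2048, 1024, 256)):
        super().__init__()
        # Experts with different hidden widths: wider experts are more capable
        # but cost more FLOPs and energy per routed token.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, w), nn.GELU(), nn.Linear(w, d_model))
            for w in expert_widths
        ])
        self.router = nn.Linear(d_model, len(expert_widths))  # token-level gate

    def forward(self, x):                        # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        top1 = gate.argmax(dim=-1)               # hard top-1 routing
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top1 == i
            if mask.any():
                # Scale each expert's output by its router probability.
                out[mask] = gate[mask, i].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 512)
print(HeterogeneousMoE()(tokens).shape)          # torch.Size([8, 512])
```

Because each expert has a distinct compute and memory profile, experts like these could in principle be placed on the hardware platform that executes them most efficiently, which is the mapping question we plan to explore.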

The overall Carbon Connect project will broaden how computer scientists view environmental sustainability to encompass both embodied and operational emissions. We will develop rigorous foundations and practical methods for measuring, understanding, and reducing both kinds of emissions individually and for balancing their combination. Our research will set the standard for carbon accounting and measurement in the ICT industry, influencing environmental law and public policy. The solutions we develop will help to radically reduce the carbon footprint of the ICT sector. The NAIRR supplement broadens the scope of these impacts from industry datacenters to open, academic datacenters and shared research resources, as well as more diverse hardware platforms.

Visit our website to learn more about the Carbon Connect project at https://carbonconnect.eco/

This work is supported by supplemental funding to National Science Foundation Grant Nos. 2326605, 2326606, and 2326610.