Introduction to AI Energy Consumption
Artificial intelligence (AI) systems, especially large language models (LLMs), are becoming increasingly powerful and widely used. However, their energy consumption is a growing concern. Researchers have been working to measure and understand the energy usage of these systems.
Measuring Energy Consumption
A team of researchers used setups with Nvidia’s A100 and H100 GPUs to measure the energy consumption of various AI systems, including LLMs and diffusion models. They tested models like Meta’s Llama 3.1 405B, an open-source chat-based AI with 405 billion parameters. The results showed that the model consumed 3352.92 joules of energy per request running on two H100 GPUs, which is equivalent to around 0.93 watt-hours. This is significantly less than the 2.9 watt-hours quoted for ChatGPT queries.
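The joules-to-watt-hours conversion behind these figures is simple to verify. A minimal sketch (the function name is ours; the 3352.92 J figure is the one reported above):

```python
# Convert a per-request energy figure in joules to watt-hours.
# 1 watt-hour = 3600 joules (1 watt sustained for 3600 seconds).
def joules_to_watt_hours(joules: float) -> float:
    return joules / 3600.0

# Figure reported for Llama 3.1 405B running on two H100 GPUs:
per_request_j = 3352.92
print(f"{joules_to_watt_hours(per_request_j):.2f} Wh per request")  # ~0.93 Wh
```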
Comparing Energy Efficiency
The team also compared the energy efficiency of different models and hardware. For example, the Mixtral 8x22B model was run on both Ampere and Hopper platforms. The results showed that running the model on two Ampere GPUs resulted in 0.32 watt-hours per request, compared to just 0.15 watt-hours on one Hopper GPU. This demonstrates the improvements in energy efficiency of newer hardware.
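The generation-over-generation improvement can be quantified directly from the two figures above. A small sketch, using the per-request numbers reported in the text (the variable names are ours):

```python
# Per-request energy for the same model on two hardware generations,
# as reported in the text (watt-hours per request).
measurements_wh = {
    "Mixtral 8x22B on two Ampere GPUs": 0.32,
    "Mixtral 8x22B on one Hopper GPU": 0.15,
}

ampere_wh, hopper_wh = measurements_wh.values()
improvement = ampere_wh / hopper_wh
print(f"Hopper uses {improvement:.1f}x less energy per request")  # ~2.1x
```

Note that the comparison also reflects a halving of the GPU count, so the per-device story is even more favorable to the newer hardware.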
The Unknown: Proprietary Models
However, the energy consumption of proprietary models like GPT-4, Gemini, or Grok remains unknown. The lack of transparency from companies like Google and OpenAI makes it difficult for researchers to assess the energy efficiency of these models. Without accurate consumption figures, the research community cannot properly diagnose, let alone solve, the energy efficiency problem.
The Need for Transparency
The most pressing issue is the lack of transparency from companies. They have no incentive to release power consumption numbers, as it could harm their business. However, researchers believe that people should understand what is actually happening, and companies should be encouraged to release some of those numbers.
Where Rubber Meets the Road
According to Nvidia’s Harris, energy efficiency in data centers follows a trend similar to Moore’s Law, but only at very large scale. Power consumption per rack is going up, while performance-per-watt keeps improving. In other words, the energy cost of serving an individual request is falling, but the total energy consumption of data centers is still rising because of the sheer scale of operations.
Conclusion
In conclusion, the energy consumption of AI systems is a growing concern, and researchers are working to measure and understand it. Hardware and models have become more energy efficient, but the lack of transparency from companies makes it hard to gauge the true scale of the problem. Encouraging companies to release power consumption numbers is an essential first step toward addressing it.
FAQs
- Q: What is the energy consumption of large language models?
  A: It varies by model and hardware. For example, Meta’s Llama 3.1 405B consumes around 0.93 watt-hours per request running on two H100 GPUs.
- Q: Why is it difficult to measure the energy consumption of proprietary models?
  A: The lack of transparency from companies like Google and OpenAI makes it difficult to measure the energy consumption of proprietary models such as GPT-4, Gemini, and Grok.
- Q: What can be done to address the energy efficiency problems?
  A: Encouraging companies to release power consumption numbers, and developing more energy-efficient hardware and models, can both help.