Approximately 34 million AI-generated images are created every day, releasing over
120,000 pounds of CO₂

Data from Hugging Face and CMU research

But we can take action to reduce that.


Hi! My name is Alvin Chen. I created this website to share my research on "Reducing Carbon Emissions in AI Image Generation at the Inference Stage."

AI image generation is a groundbreaking technology that produces stunning visuals and drives innovation. However, it comes with a hidden cost: its environmental impact. With approximately 34 million AI-generated images created daily, the energy required results in substantial carbon emissions—about 120,000 pounds of CO₂ every day. This is equivalent to about 54 metric tons daily!

While much attention has been given to the energy demands of training AI models, most of the carbon footprint actually occurs during the inference stage—the process of generating images. Surprisingly, this area remains underexplored in research. My goal was to address this gap by evaluating strategies to reduce emissions without sacrificing performance.

In my research, I focused on the following key factors:

Model Choice: Smaller models like DreamShaper were far more energy-efficient, consuming up to 95% less energy compared to larger models.
GPU Optimization: Using newer, power-efficient GPUs like the NVIDIA H100 reduced energy usage by up to 18%.
Spatial and Temporal Shifting: By shifting computational tasks to regions or times with lower carbon intensity, emissions were reduced significantly—up to 97% with global shifting and 77% with temporal shifting in places like California.

To achieve this, I tested five open-source models, ran simulations with real-world data patterns, and calculated emissions using regional carbon intensity metrics. My findings offer actionable insights for both individuals and companies generating AI images.

What key factors influence carbon emissions from AI image generation models?

1

Model

The model used to generate the image affects the number of operations necessary to complete the image generation. Along with the GPU choice, the model choice determines how much energy is consumed during generation.

2

GPU

The GPU used to generate the image affects the speed at which the operations are processed as well as the power usage. Along with the model choice, the GPU choice determines how much energy is consumed during generation.

3

Location

Carbon intensity is defined as the amount of greenhouse gas emissions per unit of energy consumed. The location where the image is generated affects the carbon intensity of image generation since some regions have an energy mix containing more clean energy sources than others.

4

Time

The time when the image is generated also affects the carbon intensity of image generation due to the inherent variability of renewable energy caused by the unpredictability of weather conditions. For example, when it's cloudy, carbon intensity increases since less solar energy would be produced and more fossil fuels would have to be burned.

To calculate the carbon emissions from AI image generation, use this formula:

Emissions = Power Consumption × Time × Carbon Intensity

Power consumption is measured in watts. Different GPUs consume different amounts of power. For example, the A10 GPU consumes up to 150 watts of power, while the H100 GPU consumes up to 700 watts of power.

Time is the number of seconds it takes to generate the image. The longer it takes, the more energy it consumes, and the higher the emissions are.

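Carbon intensity is measured in grams of CO₂-equivalent per kilowatt-hour (gCO₂eq/kWh). Areas using renewable energy like wind and solar have lower carbon intensity, while those relying on fossil fuels have higher values. Because power is in watts and time is in seconds, their product is energy in joules; dividing by 3,600,000 converts it to kilowatt-hours before multiplying by carbon intensity.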
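As a rough illustration of the formula, here is a minimal Python sketch. The function name and the example inputs are my own placeholders, not measurements from the study; the only fixed fact is the unit conversion (1 kWh = 3,600,000 joules).

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 1000 W * 3600 s


def emissions_grams(power_watts: float, seconds: float, intensity_g_per_kwh: float) -> float:
    """Return grams of CO2-equivalent for one image generation."""
    # Watts * seconds gives joules; convert to kWh to match the intensity unit.
    energy_kwh = power_watts * seconds / JOULES_PER_KWH
    return energy_kwh * intensity_g_per_kwh


# Illustrative inputs only (hypothetical average power, duration, and grid intensity).
example = emissions_grams(power_watts=300, seconds=10, intensity_g_per_kwh=400)
print(f"{example:.3f} g CO2e")
```

Note that the function expects the average power actually drawn during generation, which is usually below the GPU's maximum rating.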

Here's a short video with a more in-depth explanation of the formula.

Comparison of Five Popular Open-source Image Generation Models

Image Scoring Criteria:
1 - Unacceptable
2 - Needs improvement
3 - Acceptable
4 - Good quality
5 - Accurate to the prompt, high quality

Model

DeepFloyd / IF-I-XL-v1.0

Black Forest Labs / FLUX.1 [dev]

StabilityAI / SD-XL 1.0-base

StabilityAI / Stable Diffusion 3 Medium

Lykon / DreamShaper

Prompt 1:
A bustling medieval marketplace with people in period clothing, merchants selling goods from wooden stalls, and a castle looming in the background.

Score: 3

Score: 5

Score: 4

Score: 5

Score: 4

Prompt 2:
A mystical forest with glowing mushrooms, towering ancient trees, and a crystal-clear stream running through it. The sky is filled with vibrant auroras.

Score: 2

Score: 5

Prompt 3:
A futuristic city with towering skyscrapers made of glass and metal, flying cars zooming by, and a vibrant, bustling marketplace filled with alien creatures.

Score: 2

Score: 4

Score: 3

Score: 5

Score: 4

Prompt 4:
A colorful, swirling pattern of geometric shapes and lines, with a focus on bright blues, reds, and yellows, evoking a sense of motion and energy.

Score: 4

Score: 5

Score: 3

Prompt 5:
A detailed, realistic portrait of a young woman with curly hair, wearing a vintage dress, sitting by a window with soft sunlight illuminating her face.

Score: 4

Score: 5

Score: 4

Score: 5

Prompt 6:
A serene lakeside scene at dawn, with mist rising from the water, a family of ducks swimming by, and a fisherman in a small boat casting his line.

Score: 1

Score: 5

Score: 4

Score: 5

Score: 4

Prompt 7:
A gritty city street with vibrant graffiti covering the walls, a breakdancer performing in the foreground, and bystanders watching and taking pictures.

Score: 2

Score: 5

Score: 3

Score: 2

Prompt 8:
A majestic dragon with shimmering scales, large wings, and piercing eyes, perched on a mountain peak with a stormy sky in the background.

Score: 2

Score: 4

Score: 5

Score: 4

Prompt 9:
A dream-like scene with floating islands, a giant clock melting over a tree branch, and a man in a suit with a fishbowl for a head walking on a checkerboard path.

Score: 1

Score: 4

Score: 2

Score: 3

Score: 4

Prompt 10:
A dynamic action scene with a superhero in a colorful costume flying through the air, about to clash with a menacing villain, with bold lines and vibrant colors.

Score: 3

Score: 5

Score: 4

Score: 5

To evaluate the impact of model choice on carbon emissions, let's analyze the properties of these AI image generation models.

DeepFloyd / IF-I-XL-v1.0

IF-I-XL-v1.0, developed by the AI research lab DeepFloyd, is the largest and most carbon-intensive model among the five. Despite its scale, it generates low-quality images and is not recommended for use.

Black Forest Labs / FLUX.1 [dev]

FLUX.1 [dev], developed by Black Forest Labs, is their most advanced open-source model. While it delivers high-quality images, its large model size results in significant emissions and slower response times.

StabilityAI / SD-XL 1.0-base

SD-XL 1.0-base is one of the latest Stable Diffusion models. With a medium-sized architecture, it delivers good quality images at a reasonable speed. However, it has a high carbon footprint and performs slightly worse than Stable Diffusion 3 Medium across all metrics.

StabilityAI / Stable Diffusion 3 Medium

Stable Diffusion 3 Medium is another one of the latest Stable Diffusion models. It offers high image quality with decent speed, although its carbon emissions remain relatively high.

Lykon / DreamShaper

DreamShaper is an improved and fine-tuned version of Stable Diffusion models. As the smallest in model size, it is highly carbon-efficient, producing images quickly with good quality. This model is strongly recommended for all users.

Data Collection and Results Table:

Flowchart for Model and GPU Choice Evaluation

This flowchart outlines the data collection process for evaluating the impact of GPU and model choice. It begins with setting up an A100 GPU instance on Lambda Cloud and importing the necessary Python libraries. Models are loaded sequentially, and each is tested for energy and power usage during image generation. If no prior dataset exists, a new file, "emissions.csv", is created to store the data. After each test, energy consumption and time are logged and saved to "emissions.csv". The process repeats for all models, and the same approach is used on other GPU instances such as the H100.
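The logging step of this flowchart could be sketched roughly as follows. This is not the code used in the study: the column names, the `log_run` and `timed` helpers, and the stand-in workload are all assumptions for illustration, and the average power is assumed to be supplied by whatever measurement tool is in use.

```python
import csv
import tempfile
import time
from pathlib import Path

# Hypothetical column layout; the real emissions.csv may differ.
FIELDS = ["model", "gpu", "seconds", "avg_power_watts", "energy_kwh"]


def log_run(csv_path, model, gpu, seconds, avg_power_watts):
    """Append one measurement, creating the file with a header if it is new."""
    path = Path(csv_path)
    is_new = not path.exists()
    energy_kwh = avg_power_watts * seconds / 3_600_000  # W*s -> kWh
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "model": model,
            "gpu": gpu,
            "seconds": seconds,
            "avg_power_watts": avg_power_watts,
            "energy_kwh": energy_kwh,
        })
    return energy_kwh


def timed(generate):
    """Run a zero-argument callable and return the elapsed seconds."""
    start = time.perf_counter()
    generate()
    return time.perf_counter() - start


# Demo: write to a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    csv_file = Path(folder) / "emissions.csv"
    seconds = timed(lambda: sum(range(100_000)))  # stand-in for image generation
    log_run(csv_file, "example-model", "A100", seconds, avg_power_watts=250.0)
    print(csv_file.read_text())
```

In a real run, the stand-in workload would be replaced by the call that generates an image, and the loop would repeat over each model and GPU instance.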

Emissions Data for SD-XL 1.0-base

This screenshot displays part of the emissions.csv file, which was used to collect data from different models. Specifically, the image shows all the data collected for one model, SD-XL 1.0-base.

Example of Data Collected

The screenshot displays all the data collected per image generated, including the average power consumption, the number of seconds the image took to generate, the carbon intensity of the server used, the zone of the server used, the time the image was generated, and finally the emissions from the image generation.

Results Data Table

This data table contains the final results for each model, which illustrate the impact of model and GPU choice on energy consumption. The three columns highlighted in green demonstrate a clear positive correlation between model size and energy consumption. The two green columns labeled "Energy" show that the H100 GPU has lower energy consumption than the A100.

Model Size vs Energy Consumption Line Chart

This line chart illustrates how energy consumption increases with model size for both NVIDIA A100 and H100 GPUs. Larger models require more computational resources and, consequently, more energy.

Energy Consumption Comparison Bar Chart

This chart shows the energy consumption of each model on A100 and H100 GPUs. The H100 consistently consumes less energy than the A100 for models of the same size, showing its higher energy efficiency. The difference in energy consumption is more noticeable for larger models, making H100 the better choice for larger models.

Spatial Shifting: Simulations and Findings

Flowchart for the Process of the Request Simulator

This flowchart illustrates the process of the request generator, which creates all the data used for the spatial shifting simulation. It begins by creating a Gaussian distribution to randomly select request times, favoring times when people are typically awake. The process incorporates regional data, such as population, AI search interest, and time zones, to calculate weighted probabilities for each region. These weights are determined by multiplying population data with AI search interest and normalizing them across all regions.
The request generation process uses these probabilities to assign each request a random region and time. The requests, along with their corresponding regions and times, are saved in a CSV file. The process continues until a target of one million requests is generated, providing a realistic simulation of global AI model usage patterns.
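A simplified sketch of the request generator described above is shown below. The three regions, their numbers, the peak hour, and the spread are hypothetical placeholders (the study used real regional data and one million requests), and a single Gaussian is used here for brevity.

```python
import random

# Hypothetical regions: (population in millions, AI search interest 0-100, UTC offset in hours)
REGIONS = {
    "region_a": (100, 50, 0),
    "region_b": (50, 100, 8),
    "region_c": (10, 20, -5),
}


def region_weights(regions):
    """Multiply population by search interest, then normalize across regions."""
    raw = {name: pop * interest for name, (pop, interest, _) in regions.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def generate_requests(n, regions, peak_local_hour=15.0, spread_hours=4.0, seed=0):
    """Return n (region, utc_hour) pairs drawn from the weighted distribution."""
    rng = random.Random(seed)
    weights = region_weights(regions)
    names = list(weights)
    probs = [weights[name] for name in names]
    requests = []
    for _ in range(n):
        region = rng.choices(names, weights=probs, k=1)[0]
        # Gaussian around an assumed waking-hours peak, wrapped into one day.
        local_hour = rng.gauss(peak_local_hour, spread_hours) % 24
        utc_hour = (local_hour - regions[region][2]) % 24
        requests.append((region, utc_hour))
    return requests


sample = generate_requests(5, REGIONS)
for region, hour in sample:
    print(region, f"{hour:.2f}")
```

Fixing the seed keeps the generated file reproducible, which makes it easier to compare shifting scenarios on identical requests.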

Flowchart for the Spatial Shifting Simulation

This flowchart simulates carbon emissions for generating AI images under three scenarios: no spatial shifting (X), regional shifting (Y), and global shifting (Z).

Initialize Variables: X, Y, Z start at 0.
Load Data: Load pre-measured energy consumption and the file with 1 million requests.
Process Requests:
- No Shifting (X): Use the carbon intensity of the request's region and time.
- Regional Shifting (Y): Use the lowest carbon intensity on the same continent at that time.
- Global Shifting (Z): Use the lowest carbon intensity anywhere in the world at that time.
- Multiply each carbon intensity by the energy consumption and add to the respective variable.
Repeat for All Requests: Continue until all 1 million requests are processed.
End: The simulation concludes after all requests have been processed, resulting in the total emissions for each scenario (X, Y, Z).
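The steps above can be sketched in a few lines of Python. The carbon intensity values, zone list, and per-image energy below are made-up placeholders chosen only to show the mechanics of the three scenarios.

```python
# Hypothetical carbon intensity (gCO2eq/kWh) per zone at two hours of the day.
INTENSITY = {
    "SE": {0: 20, 12: 25},
    "DE": {0: 400, 12: 300},
    "IN": {0: 700, 12: 650},
    "SG": {0: 450, 12: 480},
}
CONTINENT = {"SE": "Europe", "DE": "Europe", "IN": "Asia", "SG": "Asia"}


def simulate(requests, energy_kwh, intensity, continent):
    """Return total emissions (grams) for no, regional, and global shifting."""
    x = y = z = 0.0  # X: no shifting, Y: regional shifting, Z: global shifting
    for zone, hour in requests:
        local = intensity[zone][hour]
        regional = min(
            intensity[other][hour]
            for other in intensity
            if continent[other] == continent[zone]
        )
        worldwide = min(intensity[other][hour] for other in intensity)
        x += local * energy_kwh
        y += regional * energy_kwh
        z += worldwide * energy_kwh
    return x, y, z


demo_requests = [("DE", 0), ("IN", 12), ("SE", 12)]
x, y, z = simulate(demo_requests, energy_kwh=0.001, intensity=INTENSITY, continent=CONTINENT)
print(f"no shifting: {x:.3f} g, regional: {y:.3f} g, global: {z:.3f} g")
```

Because each scenario picks a minimum over a larger set of zones, the totals can only stay the same or fall from X to Y to Z.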

Ten Recommended Regions Across the Globe

This table compares the average carbon intensity, standard deviation in carbon intensity, and population data for the ten recommended regions. All ten regions host a Google Cloud, Microsoft Azure, or Amazon Web Services (AWS) server. Each continent represented has at least two servers to avoid a single point of failure.

Comparison of Actual vs. Expected Request Time Distribution Using Gaussian Mixture (GMT)

This graphic illustrates the Gaussian distribution utilized in the request generator. The histogram (blue) represents the actual number of requests generated per hour over the course of the day in GMT, while the red curve depicts the expected distribution modeled using a Gaussian mixture. This alignment demonstrates how the Gaussian distribution underpins the temporal allocation of requests in the simulation.

Request Location Distribution Across U.S. Regions

This bar chart illustrates the locational distribution of the 1 million requests by the request generator. Areas with greater populations, such as California and Texas, have many more requests than areas with lower populations such as Alaska or Maine.

Standard Deviation in Carbon Intensity Across Complete Zones

This bar chart displays the standard deviation in carbon intensity data across complete zones, where each zone is represented on the x-axis. Zones with higher standard deviations on the left side of the chart exhibit significant temporal variability in carbon intensity, suggesting a greater potential for temporal shifting.
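One way to produce such a ranking is sketched below. The zone names and hourly values are invented for illustration, and the use of the population standard deviation is my assumption; the source does not say which variant was used.

```python
from statistics import pstdev

# Hypothetical hourly carbon intensity samples (gCO2eq/kWh) for three zones.
HOURLY = {
    "zone_a": [100, 300, 100, 300],
    "zone_b": [200, 200, 200, 200],
    "zone_c": [50, 60, 50, 60],
}


def rank_by_variability(series):
    """Return (zone, standard deviation) pairs, most variable first."""
    scored = [(zone, pstdev(values)) for zone, values in series.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for zone, spread in rank_by_variability(HOURLY):
    print(zone, spread)
```

Zones near the top of this list are the ones where waiting for a cleaner hour is most likely to pay off.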

Relationship Between Queueing Delay and Processing Rate

This chart shows an inverse relationship: as the number of requests processed per second increases, the average queueing delay decreases. At low processing rates, the queueing delay is very high. As the processing rate improves, the delay drops significantly, approaching zero. This chart could be used to determine the optimal processing capacity required to minimize delays and reduce user wait times.
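The shape of this relationship can be reproduced with a small first-in, first-out queue model. This is a simplified sketch, not the simulation behind the chart: it treats the whole server as one worker that finishes `rate_per_second` requests each second, and the arrival pattern is made up.

```python
def average_queueing_delay(arrival_times, rate_per_second):
    """Average wait (seconds) before service starts in a single FIFO queue."""
    service_time = 1.0 / rate_per_second
    free_at = 0.0  # time at which the server can start the next request
    total_wait = 0.0
    for arrival in sorted(arrival_times):
        start = max(arrival, free_at)
        total_wait += start - arrival
        free_at = start + service_time
    return total_wait / len(arrival_times)


# Hypothetical load: 10 requests per second for 10 seconds.
arrivals = [i * 0.1 for i in range(100)]
for rate in (5, 10, 20):
    delay = average_queueing_delay(arrivals, rate)
    print(f"{rate} requests/s -> average delay {delay:.2f} s")
```

Once the processing rate reaches the arrival rate, requests no longer pile up and the delay falls to essentially zero.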

Temporal Distribution of Requests in Sweden Using Regional Spatial Shifting

This graph illustrates the number of requests received by Sweden per second during a simulation of regional spatial shifting. The data shows that Sweden experienced a peak load of 14 requests per second, occurring around noon (between 40,000 and 60,000 seconds after the day started). Understanding this peak load can help companies determine the number of GPUs required in a server to process user requests without delays. For example, if all the requests were for the Lykon / DreamShaper model, which the H100 GPU processes in 1.2 (6/5) seconds, each GPU can handle 5/6 of a request every second. To process 14 requests per second without queueing, the calculation is:
GPUs required = 14 / (5/6) = 16.8
Rounding up, 17 H100 GPUs would be needed to handle all of Sweden's requests without introducing a queueing delay.
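The same sizing calculation can be written as a small helper. The function name is my own, and exact fractions are used only to avoid floating-point rounding surprises when the result lands on a whole number.

```python
from fractions import Fraction
from math import ceil


def gpus_required(peak_requests_per_second, seconds_per_image):
    """Smallest number of GPUs that keeps up with the peak request rate."""
    # Each GPU completes 1 / seconds_per_image requests every second.
    per_gpu_rate = 1 / Fraction(str(seconds_per_image))
    return ceil(Fraction(peak_requests_per_second) / per_gpu_rate)


print(gpus_required(14, 1.2))  # 14 / (5/6) = 16.8, rounded up to 17
```

The result covers the peak only; it does not include spare capacity for failures or unexpected spikes.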

Visual Representation of Spatial Shifting

As India had higher carbon intensity than other areas in Asia, requests from India were shifted to Japan or Singapore in the regional spatial shifting, depending on which zone had lower carbon intensity at the time of the request. Specifically, 88% of India's requests were shifted to Singapore and 12% were shifted to Japan according to simulated data.

Potential Carbon Savings Through Spatial Shifting

This chart outlines the effectiveness of spatial shifting in reducing carbon emissions, with global shifting achieving a 97% reduction and regional shifting achieving a 39% reduction. While global shifting offers significant carbon savings, it may face limitations due to government restrictions. Regional shifting presents a viable alternative as it (1) mitigates single points of failure, (2) reduces latency and enhances service quality, and (3) complies with data locality regulations imposed by governments.

Temporal Shifting: Analysis and Findings

Carbon Intensity Over Time of Germany, California, and New York

This line chart shows the variation in carbon intensity for Germany, California, and New York, emphasizing regions with significant fluctuations. Germany (DE) and California (US-CAL-CISO) exhibit relatively high variability, reflecting significant changes in carbon intensity over time and suggesting a strong potential for temporal shifting to reduce emissions.

Carbon Intensity Trend in California

This line chart shows the variation in carbon intensity over a 24-hour period in the CAISO zone, which covers most of California's electricity grid. Carbon intensity was notably lower between 8 AM and 4 PM compared to other times, primarily due to the increased availability of renewable energy sources like solar power during this period. For example, carbon intensity measured 291 gCO2eq/kWh at 5 AM but dropped to just 68 gCO2eq/kWh at 12 PM, representing a 77% reduction. This suggests that scheduling energy-intensive tasks in California within the 8 AM to 4 PM window could significantly reduce carbon emissions.

Carbon Intensity Trend in Texas

This line chart depicts the fluctuation in carbon intensity over a 24-hour period in the ERCOT zone, which covers most of the Texas electricity grid. As seen in California, carbon intensity was significantly lower between 8 AM and 4 PM compared to other times. The outlier at 11 PM likely reflects a brief period of strong winds that enabled a surge in wind energy generation. The data reveals that carbon intensity was 414 gCO2eq/kWh at 8 PM but dropped to 171 gCO2eq/kWh at 11 AM, representing a 59% reduction. This suggests that adopting temporal shifting could be an effective strategy for reducing carbon emissions in Texas.
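The reductions quoted for California and Texas follow directly from the high and low readings. A short check, using only the four values given in the text (the helper name is my own):

```python
def percent_reduction(high, low):
    """Percentage drop from the high reading to the low reading."""
    return (high - low) / high * 100


california = percent_reduction(291, 68)   # 5 AM vs 12 PM
texas = percent_reduction(414, 171)       # 8 PM vs 11 AM

print(f"California: {california:.0f}%")  # 77%
print(f"Texas: {texas:.0f}%")            # 59%
```

Since emissions scale linearly with carbon intensity for a fixed amount of energy, the same percentages apply to the emissions of a task moved from the high hour to the low hour.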

Carbon Intensity Trend in New York

Unlike Texas and California, New York does not exhibit a trend of lower carbon intensity during the morning and afternoon. This may be attributed to New York's reliance on fewer renewable energy sources compared to California and Texas. As a result, the temporal shifting technique may be less effective in reducing carbon emissions in New York.

Carbon Intensity Values Across Regions

This box plot presents the carbon intensity values across various regions. While Sweden and France exhibit the lowest carbon intensity, zones with wider ranges—such as California and Texas—are more suitable for temporal shifting strategies. This variability allows energy-intensive tasks to be scheduled during periods of lower carbon intensity, offering significant potential for emissions reduction.

Summary

Here are the key observations and findings from my research:

1

The energy consumption of AI image generation can be reduced by selecting smaller image generation models and more power-efficient GPUs.

Model choice has a large impact on the energy consumption of AI-generated images. In general, smaller models consume less energy. When using the A100 GPU, the smallest model (DreamShaper) consumed 95% less energy than the largest model (IF-I-XL-v1.0).
Using more power-efficient GPUs can significantly reduce the energy consumption of AI image generation, with larger models benefiting more than smaller ones. When upgrading from the A100 to the H100 GPU, FLUX.1 [dev] showed about an 18% decrease in energy consumption, compared to a 7% reduction for DreamShaper.

2

The carbon intensity of AI image generation can be reduced by employing spatial shifting or temporal shifting.

Compared with the baseline data (no spatial shifting), global spatial shifting achieved a 97% reduction in carbon emissions, and regional spatial shifting achieved a 39% reduction.
Temporal shifting can save significant amounts of carbon, but the amount varies by region. California could save up to 77%, while Texas could save up to 59%.

Recommendations

Here are the recommendations based on my research:

1

Recommendations for individuals generating images and companies developing or deploying image generation models:

Choose smaller, more carbon-friendly models. Note that small models can still produce high-quality images. For instance, among the five models in this study, the Lykon / DreamShaper model stands out for having the smallest size, excellent energy savings, and great image quality.
Use more power-efficient GPUs, such as the NVIDIA H100 GPU.
Choose servers located in regions with lower carbon intensity. For example, Sweden and France are ideal choices for global shifting. This study also provides a list of ten recommended regions for regional shifting.
Use servers during times of lower carbon intensity, such as between 8 AM and 4 PM in California and Texas, especially for larger workloads.

2

Recommended Regions for Spatial Shifting:
(out of the 398 regions evaluated in this study)

Europe: Sweden, France
Sweden and France consistently have the lowest carbon intensity in the world. For instance, Germany's carbon intensity is 21 times that of Sweden. Choosing cloud services in these regions is an environmentally friendly option. Both Amazon (AWS) and Microsoft (Microsoft Azure) have servers in Sweden and France, whereas Google (Google Cloud) only operates a server in France.
America: California, New York, Texas
California, New York, and Texas are the regions with the lowest carbon intensity for server locations in the United States. Their large populations also ensure lower latency for a greater number of users. Among the major cloud providers, Amazon operates a server in California, Google has servers in California and Texas, and Microsoft has servers in all three regions.
Asia: Singapore, Japan, India
Singapore and Japan, while having higher carbon intensity compared to Europe and America, maintain relatively low carbon intensity levels within the Asia region. This makes them favorable choices for selecting GPU servers when considering carbon intensity. On the other hand, India, with its massive population, offers the advantage of lower latency times, making it another good option for GPU server deployment. Microsoft, Amazon, and Google all have servers in the three regions.
Oceania: New Zealand, Australia
New Zealand and Australia are the regions with the lowest carbon intensity in Oceania. In addition, both countries have large populations, making them ideal locations for deploying GPU servers to achieve lower latency for a broader user base. Amazon has servers in both regions, whereas Google and Microsoft have servers exclusively in Australia.
