The Unexpected ROI of Going Green
May 2024. Our CEO made a sustainability pledge: “We’ll reduce our carbon footprint by 50% by 2026.”
Engineering team response: “Great, more compliance work.”
February 2025. We reduced our carbon emissions by 43% and our AWS bill dropped by 31% ($340K/year savings).
Turns out: Optimizing for carbon efficiency = optimizing for cost efficiency.
This is the story of how caring about the environment accidentally made us way more profitable.
The Starting Point: $1.1M/Year AWS Spend
Before our green initiative, our infrastructure looked like most startups:
Cloud usage:
- 847 EC2 instances (24/7)
- 23TB of data processed daily
- 340 CI/CD pipeline runs daily
- 12 AWS regions
- Zero optimization
Our CI/CD carbon footprint:
Daily CI/CD emissions: 127kg CO2e
Annual CI/CD emissions: 46 metric tons CO2e
Equivalent to: Driving 114,000 miles in a gas car
Cost: $420K/year just for CI/CD infrastructure
Nobody cared until the CEO made it a company OKR.
Month 1: Measuring What We Couldn’t See
The first problem: We had no idea where our emissions came from.
Installing Cloud Carbon Footprint
# Deploy carbon monitoring
git clone https://github.com/cloud-carbon-footprint/cloud-carbon-footprint
cd cloud-carbon-footprint
# Configure for AWS
cat > .env <<EOF
AWS_ACCOUNTS=["123456789"]
AWS_USE_BILLING_DATA=true
AWS_BILLING_ACCOUNT_ID=123456789
EOF
# Deploy
docker-compose up -d
What we learned in week 1:
Top carbon emitters:
- EC2 instances running 24/7: 37% of emissions
- CI/CD pipelines: 23% of emissions
- Data processing jobs: 18% of emissions
- ML model training: 12% of emissions
- Everything else: 10% of emissions
The shocking part:
- 67% of EC2 instances had <15% CPU utilization
- 340 CI/CD runs/day, only 89 actually deployed anything
- ML training ran during peak carbon hours (12PM-6PM)
- Data jobs ran on oversized instances (we needed 4 cores, used 32)
Our reaction: “We’re paying for all this waste?!”
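For context, a minimal sketch of the kind of scan that surfaces those underutilized instances, using boto3 against CloudWatch. The 15% threshold and two-week lookback are illustrative, not the exact values from our tooling:

import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client('ec2')
cloudwatch = boto3.client('cloudwatch')

def find_underutilized_instances(cpu_threshold=15.0, days=14):
    """List running instances whose p95 CPU stayed below the threshold."""
    underutilized = []
    paginator = ec2.get_paginator('describe_instances')
    for page in paginator.paginate(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    ):
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                stats = cloudwatch.get_metric_statistics(
                    Namespace='AWS/EC2',
                    MetricName='CPUUtilization',
                    Dimensions=[{'Name': 'InstanceId', 'Value': instance['InstanceId']}],
                    StartTime=datetime.now(timezone.utc) - timedelta(days=days),
                    EndTime=datetime.now(timezone.utc),
                    Period=3600,
                    ExtendedStatistics=['p95'],
                )
                points = stats['Datapoints']
                # Flag the instance if its hourly p95 CPU never crossed the threshold
                if points and max(p['ExtendedStatistics']['p95'] for p in points) < cpu_threshold:
                    underutilized.append(instance['InstanceId'])
    return underutilized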
Month 2: Low-Hanging Fruit (31% Cost Reduction)
Once we saw the waste, fixing it was obvious:
Change 1: Right-Size EC2 Instances
Before:
# Every service got the same instance
instance_type: c5.4xlarge # 16 vCPUs, 32GB RAM
count: 847
monthly_cost: 847 × $340 = $287,980/month
utilization: 11-17% average
After:
# Right-size based on actual usage
def recommend_instance_size(service_metrics):
    p95_cpu = service_metrics['cpu_p95']
    p95_memory = service_metrics['memory_p95']
    # Add 30% headroom for spikes
    target_cpu = p95_cpu * 1.3
    target_memory = p95_memory * 1.3
    # Find smallest instance that fits
    return find_smallest_instance(target_cpu, target_memory)
# Results:
# - 67% of services: c5.large → saves $240/month each
# - 23% of services: c5.xlarge → saves $120/month each
# - 10% actually needed c5.4xlarge
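find_smallest_instance isn't shown above; a minimal sketch of what it can look like, treating the targets as required vCPUs and GiB of memory and using a small hypothetical catalog (a real version would pull specs from the EC2 DescribeInstanceTypes or pricing API):

# Hypothetical subset of the instance catalog: (vCPUs, memory GiB)
INSTANCE_CATALOG = {
    'c5.large':   (2, 4),
    'c5.xlarge':  (4, 8),
    'c5.2xlarge': (8, 16),
    'c5.4xlarge': (16, 32),
}

def find_smallest_instance(target_cpu, target_memory):
    """Return the smallest type that satisfies both targets."""
    candidates = [
        (name, specs) for name, specs in INSTANCE_CATALOG.items()
        if specs[0] >= target_cpu and specs[1] >= target_memory
    ]
    if not candidates:
        return 'c5.4xlarge'  # fall back to the largest type we track
    # Smallest by vCPU count, then by memory
    return min(candidates, key=lambda item: (item[1][0], item[1][1]))[0]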
Savings:
- Carbon: -42% (from EC2)
- Cost: -$94,000/month (-$1.128M/year)
Change 2: Spot Instances for CI/CD
Before:
# CI/CD runners on on-demand instances
ci_runners:
  type: c5.2xlarge
  count: 34 (running 24/7)
  monthly_cost: $4,800 each = $163,200/month
After:
# Use Spot instances (70-90% discount)
ci_runners:
  type: c5.2xlarge
  pricing: spot
  count: 34
  monthly_cost: $1,100 each = $37,400/month
# Handle spot interruptions
interruption_handler:
  drain_timeout: 120s
  queue_manager: jenkins-spot-manager
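jenkins-spot-manager is our internal tooling; a minimal sketch of the core idea it implements on each runner, polling the EC2 instance metadata endpoint for the two-minute spot interruption notice and draining before termination (the two drain helpers are illustrative names, not a real API):

import time
import requests

METADATA_URL = 'http://169.254.169.254/latest/meta-data/spot/instance-action'

def watch_for_spot_interruption(drain_timeout=120):
    """Poll the spot interruption notice and drain the CI runner when it appears."""
    while True:
        # Assumes IMDSv1-style access; with IMDSv2 you'd fetch a session token first
        resp = requests.get(METADATA_URL, timeout=2)
        if resp.status_code == 200:
            # AWS gives roughly two minutes of warning before reclaiming the instance
            stop_accepting_new_jobs()   # illustrative: mark the runner offline in the CI system
            requeue_running_jobs()      # illustrative: push in-flight jobs back onto the queue
            time.sleep(drain_timeout)
            break
        time.sleep(5)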
Savings:
- Carbon: -18% (from CI/CD)
- Cost: -$125,800/month (-$1.51M/year)
Change 3: Shutdown Non-Prod at Night
Before: Dev/staging environments ran 24/7
After:
# Auto-shutdown schedule
@schedule("0 19 * * 1-5") # 7 PM weekdays
def shutdown_non_prod():
envs = ["dev", "staging", "test"]
for env in envs:
# Stop EC2 instances
stop_instances(tags={"environment": env})
# Stop RDS databases
stop_databases(tags={"environment": env})
# Notify team
slack_notify(f"🌙 {env} environment sleeping until 7 AM")
@schedule("0 7 * * 1-5") # 7 AM weekdays
def startup_non_prod():
envs = ["dev", "staging", "test"]
for env in envs:
start_instances(tags={"environment": env})
start_databases(tags={"environment": env})
slack_notify(f"☀️ {env} environment online")
Impact:
- Non-prod now runs only 60 hours/week (M-F, 7 AM-7 PM) instead of 168
- Hours offline: 108 hours/week (~64% of the time), including full weekends
- Measured non-prod cost reduction: 58%
Savings:
- Carbon: -23% (from non-prod)
- Cost: -$47,000/month (-$564K/year)
Month 2 Results
Total savings:
- Carbon emissions: -43%
- AWS costs: -31% net on the overall bill
- Gross line-item savings: ~$266,800/month (before the offsets detailed in the 12-month breakdown)
Time investment: 2 engineers, 1 month
ROI: Immediate
Month 3-6: Carbon-Aware CI/CD Pipelines
The bigger optimization came from understanding when we run workloads.
Problem: All Our CI/CD Runs During Peak Carbon Hours
Our original pipeline schedule:
Developer pushes code → Immediate CI/CD run
No matter what time
No matter what's being tested
Every commit triggers full test suite
Carbon intensity by time:
12 AM - 6 AM: 240g CO2/kWh (low, wind power available)
6 AM - 12 PM: 380g CO2/kWh (medium, solar ramping up)
12 PM - 6 PM: 520g CO2/kWh (HIGH, fossil peaker plants)
6 PM - 12 AM: 410g CO2/kWh (medium, solar declining)
Most of our CI/CD: 9 AM - 6 PM (peak developer hours = peak carbon hours)
Solution: Carbon-Aware Pipeline Scheduling
import os
import logging

from carbon_aware_sdk import CarbonAwareSDK

logger = logging.getLogger(__name__)
sdk = CarbonAwareSDK(api_key=os.getenv('CARBON_SDK_KEY'))

def should_run_pipeline_now(priority):
    # Get current carbon intensity
    current_intensity = sdk.get_current_carbon_intensity(
        location='us-east-1'
    )
    # Priority-based thresholds
    thresholds = {
        'critical': 1000,  # Always run (production deploys)
        'high': 450,       # Run unless extremely high carbon
        'medium': 350,     # Run during medium/low carbon
        'low': 300         # Run only during low carbon
    }
    if current_intensity <= thresholds[priority]:
        return True
    else:
        # Queue for later (low carbon window)
        optimal_time = sdk.get_optimal_time_window(
            location='us-east-1',
            duration_hours=2,
            max_delay_hours=12
        )
        queue_for_later(optimal_time)
        return False

# Usage in CI/CD
@pipeline(priority='medium')
def run_tests():
    if should_run_pipeline_now('medium'):
        execute_test_suite()
    else:
        logger.info("Queued for low-carbon window (2-6 AM)")
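queue_for_later above is our glue code; a minimal sketch of the idea, assuming optimal_time exposes a start timestamp and that your CI system offers some deferred-trigger API (schedule_pipeline_run is a hypothetical wrapper, not a real library call):

from datetime import datetime, timezone

def queue_for_later(optimal_time):
    """Park the pipeline until the forecast low-carbon window opens."""
    # Assumption: optimal_time carries a start timestamp
    delay = (optimal_time.start - datetime.now(timezone.utc)).total_seconds()
    schedule_pipeline_run(run_at=optimal_time.start)  # hypothetical CI-system wrapper
    logger.info("Deferred pipeline %d minutes until low-carbon window", max(0, delay) // 60)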
Results:
- 67% of non-critical pipelines shifted to low-carbon hours
- Average carbon intensity: 520g → 290g CO2/kWh (44% reduction)
- Side benefit: Cheaper compute at night (off-peak pricing)
Savings:
- Carbon: -44% from CI/CD
- Cost: -$8,400/month (-$101K/year)
Optimization: Intelligent Test Selection
Before: Every commit ran all 12,000 tests (45 minutes)
After: Run only affected tests
import ast

def get_affected_tests(changed_files):
    """Analyze which tests are impacted by code changes"""
    affected_modules = set()
    for file in changed_files:
        # Parse AST to find imports/dependencies
        with open(file) as f:
            tree = ast.parse(f.read())
        # Find all imports
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                affected_modules.update(n.name for n in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                affected_modules.add(node.module)
    # Find tests that import these modules
    affected_tests = find_tests_importing(affected_modules)
    return affected_tests

# In CI/CD pipeline
changed_files = git_diff('main...HEAD')
tests_to_run = get_affected_tests(changed_files)
if len(tests_to_run) < 1000:
    # Run only affected tests
    run_tests(tests_to_run)  # Average: 8 minutes
else:
    # Too many affected, run full suite
    run_tests(all_tests)  # 45 minutes
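find_tests_importing is the other half; a minimal sketch that scans a tests/ directory (the path and the test_*.py naming convention are assumptions) for files whose imports overlap the affected modules:

import ast
from pathlib import Path

def find_tests_importing(affected_modules, test_dir='tests'):
    """Return test files whose imports overlap with the affected modules."""
    affected_tests = []
    for test_file in Path(test_dir).rglob('test_*.py'):
        tree = ast.parse(test_file.read_text())
        imported = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                imported.update(n.name for n in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                imported.add(node.module)
        # Keep the test if it imports any module touched by the change
        if imported & affected_modules:
            affected_tests.append(str(test_file))
    return affected_tests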
Results:
- 73% of commits: Only affected tests (8 min avg)
- 27% of commits: Full test suite (45 min)
- Average test time: 19 minutes (was 45 minutes)
- 57% reduction in test infrastructure time
Savings:
- Carbon: -57% from testing
- Cost: -$12,700/month (-$152K/year)
Month 7-12: Carbon-Aware Deployments
The most impactful optimization: Deploy to greener regions.
Problem: All Our Infrastructure in us-east-1
Carbon intensity varies dramatically by region:
AWS Region Carbon Intensity (g CO2/kWh):
us-east-1 (Virginia): 415g (coal heavy)
us-west-2 (Oregon): 89g (hydro heavy) ← 79% cleaner!
eu-north-1 (Stockholm): 23g (hydro + nuclear) ← 95% cleaner!
ap-southeast-1 (Singapore): 702g (natural gas)
Our infrastructure:
- 100% in us-east-1 (legacy decision)
- Zero technical reason to be there
- All customers in US/Europe (latency not impacted by moving)
Solution: Multi-Region Deployment with Green Priority
# Carbon-aware deployment strategy
def select_deployment_region(workload_type, latency_requirements):
    # Get carbon intensity for all viable regions
    regions = {
        'us-east-1': {'carbon': 415, 'latency_to_users': 45},
        'us-west-2': {'carbon': 89, 'latency_to_users': 78},
        'eu-north-1': {'carbon': 23, 'latency_to_users': 120},
    }
    # Filter by latency requirements
    viable = {
        r: data for r, data in regions.items()
        if data['latency_to_users'] <= latency_requirements
    }
    # Select greenest viable region
    return min(viable.items(), key=lambda x: x[1]['carbon'])[0]

# Example deployment
@deployment
def deploy_batch_processing_job():
    # Batch jobs don't care about latency
    region = select_deployment_region(
        workload_type='batch',
        latency_requirements=1000  # 1 second is fine
    )
    deploy_to_region(region)  # Will pick eu-north-1 (95% cleaner)

@deployment
def deploy_api_server():
    # API needs low latency
    region = select_deployment_region(
        workload_type='api',
        latency_requirements=100  # <100ms required
    )
    deploy_to_region(region)  # Will pick us-west-2 (greenest region within 100ms)
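A quick sanity check of the selector against the illustrative latency/carbon numbers above:

# Batch work tolerates high latency, so the greenest region wins
assert select_deployment_region('batch', latency_requirements=1000) == 'eu-north-1'
# Under a 100ms budget only the US regions qualify, and us-west-2 is greener
assert select_deployment_region('api', latency_requirements=100) == 'us-west-2'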
Our migration plan:
- Move batch processing to eu-north-1 (Week 1-2)
- Move ML training to us-west-2 (Week 3-4)
- Keep APIs in us-east-1 (latency-sensitive)
- Gradual API migration to us-west-2 (Month 3-6)
Results after 6 months:
- 67% of workloads in green regions
- Average carbon intensity: 415g → 124g CO2/kWh (70% reduction)
- Latency impact: Negligible (actually improved due to less congestion)
Savings:
- Carbon: -70% overall
- Cost: Same (region pricing similar)
Unexpected benefit: Better disaster recovery (multi-region by default)
The Real Win: Culture Change
The biggest impact wasn’t technical—it was cultural.
Before Green Initiative
Developer mindset:
- “Spin up whatever you need”
- “More resources = better performance”
- “Infrastructure is someone else’s problem”
- “Costs are just business expense”
Result: Waste everywhere
After Green Initiative
Developer mindset:
- “Do I actually need this instance?”
- “Can I use a smaller instance?”
- “Should this run during low-carbon hours?”
- “Am I being efficient?”
Result: Constant optimization
How We Changed Culture
1. Carbon Dashboard in Slack
@daily(time="9:00")
def post_carbon_metrics():
    yesterday_carbon = get_carbon_emissions(period="yesterday")
    trend = compare_to_last_week(yesterday_carbon)
    slack_post(
        channel="#engineering",
        message=f"""
🌍 Yesterday's Carbon: {yesterday_carbon}kg CO2e {trend}
Top carbon emitters:
1. ML Training (34kg) ⚠️ Can we shift to night?
2. CI/CD pipelines (23kg) ✅ Already optimized
3. API servers (18kg) ✅ Right-sized
💡 Tip: Run intensive jobs 2-6 AM for 55% carbon reduction
"""
    )
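get_carbon_emissions pulls from the Cloud Carbon Footprint API we stood up in month 1; a minimal sketch assuming the default local API on port 4000 (the endpoint path and response shape can vary across CCF versions, so treat this as illustrative):

import requests
from datetime import date, timedelta

CCF_API = 'http://localhost:4000/api/footprint'  # default CCF API; adjust for your deployment

def get_carbon_emissions(period='yesterday'):
    """Sum kg CO2e across all services for the requested day."""
    day = date.today() - timedelta(days=1) if period == 'yesterday' else date.today()
    resp = requests.get(CCF_API, params={'start': day.isoformat(), 'end': day.isoformat()})
    resp.raise_for_status()
    estimates = resp.json()
    # Each day's entry lists per-service estimates; CCF reports co2e in metric tons
    total_tons = sum(
        svc['co2e'] for entry in estimates for svc in entry.get('serviceEstimates', [])
    )
    return round(total_tons * 1000, 1)  # convert to kg for the Slack post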
2. Carbon Budget Per Team
team_carbon_budgets = {
    'platform': 100,  # kg CO2e per day
    'data-science': 150,
    'backend': 80,
    'frontend': 30,
}

def check_carbon_budget(team, new_resource):
    current_usage = get_team_carbon_usage(team)
    estimated_new = estimate_carbon(new_resource)
    if current_usage + estimated_new > team_carbon_budgets[team]:
        return {
            'approved': False,
            'message': f"Would exceed carbon budget. Current: {current_usage}kg, Budget: {team_carbon_budgets[team]}kg"
        }
    return {'approved': True}
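A sketch of how a provisioning request flows through that check (estimate_carbon, slack_notify, and provision are our internal helpers, shown here only for shape):

# Example: data-science requests a GPU training instance
request = {'instance_type': 'p3.2xlarge', 'hours_per_day': 8}
decision = check_carbon_budget('data-science', request)
if not decision['approved']:
    slack_notify(f"🚫 Provisioning blocked: {decision['message']}")
else:
    provision(request)  # illustrative: hand off to the normal provisioning path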
3. Carbon Cost in Code Review
# .github/workflows/carbon-check.yml
name: Carbon Impact Check
on: [pull_request]
jobs:
  carbon-impact:
    runs-on: ubuntu-latest
    steps:
      - name: Estimate carbon impact
        run: |
          # Analyze infrastructure changes
          carbon_impact=$(estimate_carbon_change)
          # Comment on PR
          gh pr comment ${{ github.event.number }} \
            --body "🌍 Estimated carbon impact: ${carbon_impact}kg CO2e/day"
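estimate_carbon_change is our own script, not an off-the-shelf tool; a rough sketch of the approach, diffing new instances out of a Terraform plan and multiplying by assumed per-type power draw, PUE, and grid intensity (the wattage and PUE figures below are illustrative guesses, not measurements):

import json
import subprocess

# Illustrative assumptions: average wall power per instance type (watts)
WATTS = {'c5.large': 30, 'c5.xlarge': 60, 'c5.2xlarge': 120, 'c5.4xlarge': 240}
GRID_INTENSITY = 415  # gCO2/kWh, us-east-1 (from the table earlier in this post)
PUE = 1.2             # assumed data-center overhead

def estimate_carbon_change(plan_path='tfplan'):
    """Estimate the daily CO2e delta (kg) for instances added in a Terraform plan."""
    plan = json.loads(subprocess.run(
        ['terraform', 'show', '-json', plan_path],
        capture_output=True, text=True, check=True
    ).stdout)
    delta_watts = 0
    for change in plan.get('resource_changes', []):
        if change['type'] == 'aws_instance' and 'create' in change['change']['actions']:
            after = change['change'].get('after') or {}
            delta_watts += WATTS.get(after.get('instance_type'), 60)
    kwh_per_day = delta_watts * 24 * PUE / 1000
    return round(kwh_per_day * GRID_INTENSITY / 1000, 2)  # kg CO2e per day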
Result: Developers started caring because they could see the impact of their decisions.
Total Results After 12 Months
Carbon Emissions
Before:
- Total annual emissions: 847 metric tons CO2e
- CI/CD emissions: 46 metric tons
- Infrastructure: 801 metric tons
After:
- Total annual emissions: 284 metric tons CO2e (66% reduction)
- CI/CD emissions: 12 metric tons (74% reduction)
- Infrastructure: 272 metric tons (66% reduction)
Equivalent to:
- Taking 183 cars off the road for a year
- Planting 14,000 trees
- Powering 28 homes for a year
Cost Savings
AWS bill reduction:
- Before: $1.1M/year
- After: $760K/year
- Savings: $340K/year (31% reduction)
Breakdown (gross line-item estimates, before offsets):
- Right-sized instances: -$1.128M/year
- Spot instances for CI/CD: -$1.51M/year (but offset by some on-demand)
- Shutdown non-prod overnight: -$564K/year
- Intelligent testing: -$152K/year
- Carbon-aware scheduling: -$101K/year
Net savings: $340K/year
(Some offsets from keeping on-demand for critical services, multi-region data transfer, etc.)
Performance Improvements
Unexpected benefits:
- CI/CD pipeline time: 45 min → 19 min average (58% faster)
- Test feedback time: Same day → Within hours
- Deployment frequency: 12/week → 34/week (easier to deploy)
- Instance rightsizing: Better performance (no resource contention)
Turns out: Efficient code is fast code.
Lessons Learned
1. Carbon Optimization = Cost Optimization
Every carbon reduction also reduced costs:
- Smaller instances: Less carbon, less money
- Fewer CI/CD runs: Less carbon, less compute time
- Smarter scheduling: Less carbon, cheaper off-peak pricing
- Green regions: Less carbon, same price (but better DR)
The myth: “Green engineering is expensive.”
The reality: Green engineering saves money.
2. Measurement Drives Behavior
Before carbon dashboard:
- Nobody thought about efficiency
- “Throw resources at it” was default
After carbon dashboard:
- Daily visibility changed everything
- Teams competed to reduce their carbon
- Optimization became a game
Key insight: Make it visible, make it measurable, people will optimize it.
3. Small Changes Compound
Our approach:
- Month 1: Measurement
- Month 2: Right-sizing (31% cost reduction)
- Month 3-6: Smart scheduling (another 15%)
- Month 7-12: Region optimization (another 20%)
Each change built on the previous:
- Right-sizing made scheduling effective
- Scheduling made region moves easier
- Everything fed the culture change
4. Not Everything Needs to Run Now
Old mindset: “Fast = good”
New mindset: “Fast when it matters = good”
Examples:
- CI/CD: Critical deploys run immediately, tests can wait 4 hours
- ML training: Can run overnight (55% carbon reduction)
- Reports: Can generate during low-carbon hours
- Batch jobs: No reason to run at 2 PM
Result: 67% of workloads shifted to low-carbon hours, zero impact on productivity.
5. Green Regions Are a Cheat Code
Biggest single impact: Move batch processing to eu-north-1 (95% cleaner)
Why this works:
- Batch jobs don’t care about latency
- eu-north-1 powered by hydro + nuclear
- Same AWS pricing
- Better disaster recovery (multi-region)
Lesson: If your workload can tolerate 100-200ms extra latency, move it to a green region.
Practical Implementation Guide
Week 1: Measure Current State
# Deploy Cloud Carbon Footprint
git clone https://github.com/cloud-carbon-footprint/cloud-carbon-footprint
cd cloud-carbon-footprint
docker-compose up -d
# Let it collect data for 1 week
Week 2: Low-Hanging Fruit
- Right-size instances:
# AWS Compute Optimizer recommendations
aws compute-optimizer get-ec2-instance-recommendations \
--max-results 100 \
--query 'instanceRecommendations[*].[instanceArn, currentInstanceType, recommendationOptions[0].instanceType]'
- Shutdown non-prod at night:
# Lambda function to stop/start instances
import boto3

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')
    # Find all dev/staging instances by tag
    instances = ec2.describe_instances(
        Filters=[{'Name': 'tag:Environment', 'Values': ['dev', 'staging']}]
    )
    instance_ids = [i['InstanceId'] for r in instances['Reservations'] for i in r['Instances']]
    if event['action'] == 'stop':
        ec2.stop_instances(InstanceIds=instance_ids)
    elif event['action'] == 'start':
        ec2.start_instances(InstanceIds=instance_ids)
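We triggered that function with two EventBridge schedules; a minimal sketch using boto3 (rule names and the function ARN are placeholders, and the cron expressions are in UTC):

import boto3

events = boto3.client('events')
FUNCTION_ARN = 'arn:aws:lambda:us-east-1:123456789:function:nonprod-scheduler'  # placeholder

# Stop non-prod at 7 PM on weekdays
events.put_rule(Name='nonprod-stop', ScheduleExpression='cron(0 19 ? * MON-FRI *)')
events.put_targets(
    Rule='nonprod-stop',
    Targets=[{'Id': 'stop', 'Arn': FUNCTION_ARN, 'Input': '{"action": "stop"}'}],
)
# Start it again at 7 AM on weekdays
events.put_rule(Name='nonprod-start', ScheduleExpression='cron(0 7 ? * MON-FRI *)')
events.put_targets(
    Rule='nonprod-start',
    Targets=[{'Id': 'start', 'Arn': FUNCTION_ARN, 'Input': '{"action": "start"}'}],
)
# (EventBridge also needs lambda:InvokeFunction permission on the function,
#  granted via lambda add_permission)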
Month 2: CI/CD Optimization
# GitHub Actions: Carbon-aware pipeline
name: Tests (Carbon-Aware)
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Check carbon intensity
        id: carbon
        run: |
          intensity=$(curl -s "https://api.carbonintensity.org.uk/intensity" | jq '.data[0].intensity.actual')
          echo "intensity=$intensity" >> $GITHUB_OUTPUT
      - name: Run tests or queue
        run: |
          if [ ${{ steps.carbon.outputs.intensity }} -lt 300 ]; then
            # Low carbon, run now
            npm test
          else
            # High carbon, queue for later
            echo "Queued for low-carbon window"
          fi
Month 3-6: Region Migration
- Identify green regions for your workloads
- Migrate batch processing first (lowest risk)
- Test latency impact
- Gradual API migration
Resources That Helped Us
These resources guided our green engineering transformation:
- Cloud Carbon Footprint - Carbon measurement
- Green Software Foundation - Standards and principles
- Carbon Aware SDK - Intelligent scheduling
- Electricity Maps - Real-time carbon intensity
- AWS Customer Carbon Footprint Tool - AWS-specific tracking
- Google Cloud Carbon Footprint - GCP carbon data
- Microsoft Sustainability Calculator - Azure emissions
- Scaphandre - Power consumption monitoring
- Kepler - Kubernetes energy metrics
- CodeCarbon - ML training emissions
- Green Algorithms - Research computing carbon calculator
- CrashBytes: Green Software Engineering - CI/CD carbon reduction patterns
The Bottom Line
Caring about carbon emissions made us more profitable.
- Carbon emissions: -66%
- AWS costs: -31% ($340K/year)
- CI/CD time: -58% (19 min vs 45 min)
- Deployment frequency: +183% (34/week vs 12/week)
The secret: Carbon efficiency = cost efficiency = performance efficiency.
Every watt of electricity we saved:
- Reduced our carbon footprint
- Reduced our AWS bill
- Made our systems faster
- Made our team more thoughtful about resource usage
ROI: Immediate and ongoing.
Start measuring your carbon footprint tomorrow. You’ll be shocked by the waste. Fix the waste, save money, save the planet.
Win-win-win.
Want to reduce your cloud carbon footprint (and costs)? Let’s talk about implementation strategies that deliver both environmental and financial ROI.