WebAssembly in Production: Our Journey from Prototype to 10M Requests/Day

Real-world lessons from running WebAssembly at scale in a serverless environment—including performance wins, debugging nightmares, and the surprising edge cases that shaped our architecture.

The Ambitious Bet

After reading about WebAssembly transforming cloud-native applications, I was intrigued but skeptical. The promises were compelling:

  • Near-native performance in serverless environments
  • Sub-10ms cold starts vs. 200-500ms for containers
  • Language-agnostic runtime enabling polyglot architectures
  • Enhanced security through sandboxed execution

But would it work at scale? We decided to find out by migrating one of our most performance-sensitive services from Node.js to WebAssembly.

This is the story of that migration: the wins, the gotchas, and the lessons that only production traffic can teach.

The Use Case: Image Processing Service

Our image processing API handled:

  • 8M requests/day during normal traffic
  • 23M requests/day during peak events
  • p95 latency requirement: < 150ms
  • Cost target: < $0.0001 per request

The existing Node.js Lambda implementation was struggling:

  • 320ms average cold start
  • $8,400/month in Lambda costs
  • 14% of requests exceeded latency SLA during traffic spikes

We needed better performance at lower cost. WebAssembly seemed like the answer.

Phase 1: Prototype & Proof of Concept

Choosing the Source Language

We evaluated three languages that compile to Wasm:

Rust

  • Pros: Memory safety, excellent Wasm tooling, mature ecosystem
  • Cons: Steeper learning curve for team
  • Wasm binary size: 850KB (optimized)

AssemblyScript (TypeScript-like)

  • Pros: Familiar syntax, easy adoption for JS team
  • Cons: Smaller ecosystem, limited library support
  • Wasm binary size: 240KB (optimized)

Go (TinyGo)

  • Pros: Team familiarity, good stdlib coverage
  • Cons: Larger binary sizes, GC overhead
  • Wasm binary size: 1.2MB (optimized)

We chose Rust for three reasons:

  1. Best-in-class Wasm tooling (wasm-pack, wasm-bindgen)
  2. Strong image processing libraries (image-rs)
  3. Team commitment to learning Rust for systems programming

The First Implementation

Here’s our initial Rust implementation:

// src/lib.rs
use wasm_bindgen::prelude::*;
use image::{DynamicImage, ImageFormat, GenericImageView};
use image::codecs::jpeg::JpegEncoder;
use base64::{Engine as _, engine::general_purpose};
use std::io::Cursor;

#[wasm_bindgen]
pub struct ImageProcessor {
    quality: u8,
    max_width: u32,
    max_height: u32,
}

#[wasm_bindgen]
impl ImageProcessor {
    #[wasm_bindgen(constructor)]
    pub fn new(quality: u8, max_width: u32, max_height: u32) -> ImageProcessor {
        ImageProcessor {
            quality,
            max_width,
            max_height,
        }
    }
    
    #[wasm_bindgen]
    pub fn process_image(&self, base64_input: &str) -> Result<String, JsValue> {
        // Decode base64 input
        let image_data = general_purpose::STANDARD
            .decode(base64_input)
            .map_err(|e| JsValue::from_str(&format!("Base64 decode error: {}", e)))?;
        
        // Load image
        let img = image::load_from_memory(&image_data)
            .map_err(|e| JsValue::from_str(&format!("Image load error: {}", e)))?;
        
        // Resize if needed
        let resized = self.resize_if_needed(img);
        
        // Encode to JPEG, actually applying the configured quality.
        // (JPEG has no alpha channel, so convert to RGB8 first; Vec<u8>
        // doesn't implement Seek, so the encoder writes through a Cursor.)
        let mut output = Vec::new();
        let mut encoder = JpegEncoder::new_with_quality(Cursor::new(&mut output), self.quality);
        encoder
            .encode_image(&resized.to_rgb8())
            .map_err(|e| JsValue::from_str(&format!("Encode error: {}", e)))?;
        
        // Return as base64
        Ok(general_purpose::STANDARD.encode(&output))
    }
    
    fn resize_if_needed(&self, img: DynamicImage) -> DynamicImage {
        let (width, height) = img.dimensions();
        
        if width <= self.max_width && height <= self.max_height {
            return img;
        }
        
        let ratio = f32::min(
            self.max_width as f32 / width as f32,
            self.max_height as f32 / height as f32,
        );
        
        let new_width = (width as f32 * ratio) as u32;
        let new_height = (height as f32 * ratio) as u32;
        
        img.resize(new_width, new_height, image::imageops::FilterType::Lanczos3)
    }
}

The Lambda Handler (JavaScript)

// handler.js
// wasm-pack build --target nodejs emits a JS glue module that loads the
// .wasm binary and re-exports the #[wasm_bindgen] classes, so we never
// instantiate the raw bytes ourselves (raw exports wouldn't expose a
// JS-callable ImageProcessor class).
const { ImageProcessor } = require('./pkg/image_processor');

// Created once at module init, reused across warm invocations
const processor = new ImageProcessor(85, 1920, 1080);

exports.handler = async (event) => {
    try {
        // Process image (event.body is base64-encoded)
        const outputImage = processor.process_image(event.body);
        
        return {
            statusCode: 200,
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ image: outputImage })
        };
        
    } catch (error) {
        console.error('Processing error:', error);
        return {
            statusCode: 500,
            body: JSON.stringify({ error: error.message })
        };
    }
};

Initial Benchmark Results

We ran 10,000 requests comparing Node.js vs. Wasm:

Metric                    Node.js   Wasm    Improvement
Cold start                320ms     45ms    86%
Warm execution            180ms     32ms    82%
Memory usage              512MB     256MB   50%
Lambda cost/1M requests   $84       $24     71%
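
A comparison like this needs surprisingly little tooling. Here is a minimal sketch of such a harness (the endpoint URLs and sample.jpg are placeholders, and Node 18+ is assumed for the global fetch):

// bench.js: hedged sketch of a latency comparison harness
const fs = require('fs');

async function bench(url, runs = 10000) {
    const body = fs.readFileSync('sample.jpg').toString('base64');
    const latencies = [];
    for (let i = 0; i < runs; i++) {
        const t0 = performance.now();
        await fetch(url, { method: 'POST', body });
        latencies.push(performance.now() - t0);
    }
    latencies.sort((a, b) => a - b);
    return {
        avg: latencies.reduce((s, x) => s + x, 0) / runs,
        p95: latencies[Math.floor(runs * 0.95)],
    };
}

(async () => {
    console.log('node:', await bench('https://api.example.com/node/process'));
    console.log('wasm:', await bench('https://api.example.com/wasm/process'));
})();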

We were blown away. But production would reveal hidden challenges.

Phase 2: Production Deployment (The Reality Check)

Challenge 1: The Memory Leak Mystery

Week 2 in production: Lambda functions started hitting OOM (out-of-memory) errors after ~4,000 invocations.

The investigation:

// Memory profiling revealed the issue
#[wasm_bindgen]
pub fn process_image(&self, base64_input: &str) -> Result<String, JsValue> {
    let image_data = general_purpose::STANDARD
        .decode(base64_input)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;
    let img = image::load_from_memory(&image_data)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;
    
    // Problem: the input buffer, decoded image, resized image, and
    // encoded output were all live at once. The allocator grew Wasm
    // linear memory to fit that peak, and linear memory never shrinks.
    
    // Solution: drop large allocations as soon as they're no longer needed.
    // Free the raw input buffer once the image is decoded
    drop(image_data);
    
    let resized = self.resize_if_needed(img); // consumes img, allocates anew
    
    let mut output = Vec::new();
    resized
        .write_to(&mut Cursor::new(&mut output), ImageFormat::Jpeg)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;
    
    // Free the resized image before the (large) base64 encode
    drop(resized);
    
    Ok(general_purpose::STANDARD.encode(&output))
}

The fix: Aggressive memory management and Lambda concurrency limits.

// Updated handler with memory monitoring
exports.handler = async (event, context) => {
    const startMemory = process.memoryUsage().heapUsed;
    
    try {
        const result = await processImage(event);
        
        const endMemory = process.memoryUsage().heapUsed;
        const memoryGrowth = endMemory - startMemory;
        
        // Log memory metrics as structured logs (scraped into CloudWatch)
        console.log(JSON.stringify({
            metric: 'memory_growth',
            bytes: memoryGrowth,
            requestId: context.awsRequestId
        }));
        
        // Force GC if memory growth is excessive
        // (requires Node launched with --expose-gc, e.g. via NODE_OPTIONS)
        if (memoryGrowth > 50 * 1024 * 1024) { // 50MB
            if (global.gc) global.gc();
        }
        
        return result;
        
    } catch (error) {
        // Log error with memory context
        console.error('Error:', error, 'Memory:', process.memoryUsage());
        throw error;
    }
};
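
Monitoring only tells you when memory is growing. Because Wasm linear memory can grow but never shrink, the "aggressive" half of the fix is recycling the instance before it hits the ceiling. A sketch of that idea (MEMORY_BUDGET and createProcessor are our names, not part of the code above):

// Recycle the Wasm instance once its linear memory exceeds a budget;
// the next invocation pays one re-init instead of an eventual OOM.
const MEMORY_BUDGET = 128 * 1024 * 1024; // 128MB of linear memory

let processor = createProcessor(); // assumed helper wrapping module init

function recycleIfNeeded(memory) { // memory: the module's WebAssembly.Memory
    if (memory.buffer.byteLength > MEMORY_BUDGET) {
        // Drop every reference to the old instance so it can be GC'd,
        // then start fresh with a small linear memory
        processor = createProcessor();
    }
}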

Challenge 2: The Debugging Black Hole

When Wasm crashed, we got cryptic errors:

RuntimeError: unreachable executed
    at wasm://wasm/00123abc:wasm-function[142]:0x1f4d8

Not helpful.

Solution: Source maps and better error handling

// Add panic hook for better error messages
#[wasm_bindgen(start)]
pub fn main() {
    console_error_panic_hook::set_once();
}

// Wrap fallible operations with context
use anyhow::{Context, Result};

pub fn process_image(&self, base64_input: &str) -> Result<String, JsValue> {
    let image_data = general_purpose::STANDARD
        .decode(base64_input)
        .context("Failed to decode base64 input")
        // "{:#}" prints the whole context chain, not just the outer message
        .map_err(|e| JsValue::from_str(&format!("{:#}", e)))?;
    
    let img = image::load_from_memory(&image_data)
        .context("Failed to load image from memory")
        .map_err(|e| JsValue::from_str(&format!("{:#}", e)))?;
    
    // More helpful error: "Failed to load image from memory: <decoder error>"
    
    // ... rest of processing
}

We also built a Wasm debugging proxy:

// wasm-debug-wrapper.js
class WasmDebugger {
    constructor(wasmModule) {
        this.wasm = wasmModule;
        this.callCount = 0;
        this.errors = [];
    }
    
    process_image(input) {
        this.callCount++;
        const callId = this.callCount;
        
        console.log(`[WASM-${callId}] Starting process_image`);
        console.log(`[WASM-${callId}] Input size: ${input.length} bytes`);
        
        try {
            const startTime = Date.now();
            const result = this.wasm.process_image(input);
            const duration = Date.now() - startTime;
            
            console.log(`[WASM-${callId}] Success in ${duration}ms`);
            console.log(`[WASM-${callId}] Output size: ${result.length} bytes`);
            
            return result;
            
        } catch (error) {
            this.errors.push({
                callId,
                error: error.toString(),
                stack: error.stack,
                input: input.substring(0, 100) // First 100 chars
            });
            
            console.error(`[WASM-${callId}] Error:`, error);
            throw error;
        }
    }
    
    getStats() {
        return {
            totalCalls: this.callCount,
            errorCount: this.errors.length,
            errorRate: (this.errors.length / this.callCount * 100).toFixed(2) + '%',
            recentErrors: this.errors.slice(-5)
        };
    }
}
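
Wiring the proxy in is a one-line change at init time. A usage sketch (here wasm stands for the loaded module and inputBase64 for a request payload):

// Wrap the raw module once, then route every call through the proxy
const debugWasm = new WasmDebugger(wasm);

const output = debugWasm.process_image(inputBase64);
console.log(debugWasm.getStats());
// e.g. { totalCalls: 1, errorCount: 0, errorRate: '0.00%', recentErrors: [] }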

Challenge 3: The WASI Surprise

Our Wasm module needed filesystem access for temporary files. Enter WASI (WebAssembly System Interface).

Problem: Lambda’s Node.js runtime didn’t support WASI out of the box.

Solution: Used wasmtime (Wasm runtime with WASI support) via custom Lambda layer.

// handler-with-wasi.js
const fs = require('fs');
const { Wasmtime } = require('@bytecodealliance/wasmtime');

let cached; // { instance, wasi }, reused across warm invocations

async function initWasmWithWASI() {
    if (!cached) {
        const engine = new Wasmtime.Engine();
        
        const wasmBytes = fs.readFileSync('image_processor.wasm');
        const module = new Wasmtime.Module(engine, wasmBytes);
        
        // Configure WASI with temp-directory access
        const wasi = new Wasmtime.WASI({
            env: process.env,
            preopens: {
                '/tmp': '/tmp', // allow access to Lambda's writable /tmp
            },
        });
        
        const linker = new Wasmtime.Linker(engine);
        wasi.instantiate(linker);
        
        const instance = linker.instantiate(module);
        cached = { instance, wasi }; // cache the instance, not the Module
    }
    return cached;
}

exports.handler = async (event) => {
    const { instance } = await initWasmWithWASI();
    
    // Call the Wasm export (marshalling the request string into Wasm
    // memory happens in our binding layer and is elided here)
    const result = instance.exports.process_image_with_cache(event.body);
    
    return {
        statusCode: 200,
        body: result
    };
};

Phase 3: Optimization & Scale

Binary Size Optimization

Our initial Wasm binary was 850KB. We got it down to 320KB:

# Cargo.toml optimizations
[profile.release]
opt-level = 'z'   # Optimize for size
lto = true        # Link-time optimization
codegen-units = 1 # Single codegen unit: slower builds, smaller output
panic = 'abort'   # Drop the unwinding machinery
strip = true      # Strip symbols

# Build, then run wasm-opt (from the Binaryen toolkit) over the output
wasm-pack build --release --target nodejs
wasm-opt -Oz -o output_optimized.wasm pkg/image_processor_bg.wasm

# Result: 850KB → 320KB (62% reduction)
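
Size matters chiefly at cold start, when the binary must be read and compiled before the first request is served. A quick sketch to put a number on that cost (the path assumes the wasm-pack layout above):

// compile-cost.js: measure Wasm compile time at cold start
const fs = require('fs');

const buf = fs.readFileSync('pkg/image_processor_bg.wasm');
const t0 = performance.now();

WebAssembly.compile(buf).then(() => {
    const ms = (performance.now() - t0).toFixed(1);
    console.log(`compiled ${buf.length} bytes in ${ms}ms`);
});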

Streaming for Large Images

For images > 5MB, we implemented streaming processing:

use futures::stream::StreamExt;
use js_sys::Uint8Array;
use wasm_bindgen::JsCast;

#[wasm_bindgen]
pub async fn process_image_stream(
    input_stream: web_sys::ReadableStream
) -> Result<web_sys::ReadableStream, JsValue> {
    // Convert the ReadableStream into an async Rust stream of JS chunks
    let mut stream = wasm_streams::ReadableStream::from_raw(input_stream)
        .into_stream();
    
    let mut image_data: Vec<u8> = Vec::new();
    
    // Read chunks, copying each Uint8Array into linear memory as it
    // arrives so only one chunk is duplicated at a time
    while let Some(chunk) = stream.next().await {
        let chunk: Uint8Array = chunk?.dyn_into()?;
        image_data.extend(chunk.to_vec());
    }
    
    // Process image
    let processed = process_buffer(image_data)?;
    
    // Stream output (create_readable_stream wraps the bytes back into a
    // web_sys::ReadableStream; helper omitted here)
    Ok(create_readable_stream(processed))
}

fn process_buffer(data: Vec<u8>) -> Result<Vec<u8>, JsValue> {
    // Decoding still needs the full buffer: streaming bounds the
    // transport's peak memory, not the decoder's
    let img = image::load_from_memory(&data)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;
    drop(data);
    
    // Resize to fit within 1920x1080, preserving aspect ratio
    let resized = img.resize(1920, 1080, image::imageops::FilterType::Lanczos3);
    
    // Encode to JPEG
    let mut output = Vec::new();
    resized.write_to(&mut Cursor::new(&mut output), ImageFormat::Jpeg)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;
    
    Ok(output)
}

Multi-Format Support

We extended the service to support multiple output formats:

#[wasm_bindgen]
pub enum OutputFormat {
    JPEG,
    PNG,
    WEBP,
    AVIF,
}

#[wasm_bindgen]
impl ImageProcessor {
    pub fn process_with_format(
        &self,
        base64_input: &str,
        format: OutputFormat
    ) -> Result<String, JsValue> {
        let img = self.load_image(base64_input)?;
        let resized = self.resize_if_needed(img);
        
        let mut output = Vec::new();
        
        match format {
            OutputFormat::JPEG => {
                // JPEG has no alpha channel, so convert to RGB8 first
                let mut encoder = JpegEncoder::new_with_quality(Cursor::new(&mut output), self.quality);
                encoder.encode_image(&resized.to_rgb8())
                    .map_err(|e| JsValue::from_str(&format!("JPEG error: {}", e)))?;
            }
            OutputFormat::PNG => {
                resized.write_to(&mut Cursor::new(&mut output), ImageFormat::Png)
                    .map_err(|e| JsValue::from_str(&format!("PNG error: {}", e)))?;
            }
            OutputFormat::WEBP => {
                // webp crate: lossy encode at the configured quality
                let encoder = webp::Encoder::from_image(&resized)
                    .map_err(|e| JsValue::from_str(&format!("WebP error: {}", e)))?;
                output = encoder.encode(self.quality as f32).to_vec();
            }
            OutputFormat::AVIF => {
                // ravif crate expects an imgref-style RGBA pixel buffer
                use rgb::FromSlice;
                let rgba = resized.to_rgba8();
                let (w, h) = (rgba.width() as usize, rgba.height() as usize);
                let encoded = ravif::Encoder::new()
                    .with_quality(self.quality as f32)
                    .encode_rgba(ravif::Img::new(rgba.as_raw().as_rgba(), w, h))
                    .map_err(|e| JsValue::from_str(&format!("AVIF error: {}", e)))?;
                output = encoded.avif_file;
            }
        }
        
        Ok(general_purpose::STANDARD.encode(&output))
    }
}

Production Results After 6 Months

Metric             Before (Node.js)   After (Wasm)   Improvement
Average latency    180ms              28ms           84%
p95 latency        340ms              52ms           85%
p99 latency        780ms              98ms           87%
Cold start         320ms              38ms           88%
Memory usage       512MB              256MB          50%
Lambda cost        $8,400/mo          $2,100/mo      75%
Throughput         8M req/day         10M req/day    25%
Error rate         0.8%               0.2%           75%

Annual savings: ($8,400 - $2,100) × 12 = $75,600 in Lambda costs alone

The Hidden Costs

Not everything was rosy. We faced tradeoffs:

1. Developer Experience

  • Rust learning curve: 3 months for the team to become productive
  • Debugging difficulty: 2x longer to troubleshoot issues
  • Build times: 4x slower than Node.js (60s vs. 15s)

2. Tooling Maturity

Missing pieces we had to build ourselves:

  • Custom logging/tracing integration
  • Wasm-specific monitoring dashboards
  • Local development environment with hot-reload

3. Ecosystem Gaps

Some image formats (HEIC, JPEG-XL) had immature Rust libraries. We maintained fallback paths to Node.js for these.
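
The fallback is a cheap magic-byte sniff in front of the Wasm call. A sketch (the helper names are ours, and the checks cover only the common container signatures):

// Route formats the Rust toolchain couldn't handle to the legacy path
function isHeic(buf) {
    // ISO BMFF: 'ftyp' box at offset 4, followed by a HEIF brand
    return buf.length >= 12 &&
        buf.toString('ascii', 4, 8) === 'ftyp' &&
        ['heic', 'heix', 'mif1'].includes(buf.toString('ascii', 8, 12));
}

function isJpegXl(buf) {
    // A bare JPEG XL codestream starts with 0xFF 0x0A
    return buf.length >= 2 && buf[0] === 0xff && buf[1] === 0x0a;
}

async function processImage(event) {
    const buf = Buffer.from(event.body, 'base64');
    if (isHeic(buf) || isJpegXl(buf)) {
        return invokeNodeFallback(event); // legacy Node.js Lambda
    }
    return wasmProcessor.process_image(event.body);
}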

Lessons Learned

1. Start with a Clear Win

We chose image processing because it was:

  • CPU-bound: Wasm’s strength
  • Performance-critical: Clear success metrics
  • Isolated: Could fail without breaking other services

2. Invest in Observability Early

We built custom CloudWatch metrics for Wasm-specific concerns:

// metrics.js
const { CloudWatch } = require('aws-sdk');
const cloudwatch = new CloudWatch();

function publishWasmMetrics(metrics) {
    // Return the promise so callers can await delivery before the
    // Lambda sandbox is frozen
    return cloudwatch.putMetricData({
        Namespace: 'Wasm/ImageProcessor',
        MetricData: [
            {
                MetricName: 'WasmExecutionTime',
                Value: metrics.executionTime,
                Unit: 'Milliseconds',
                Timestamp: new Date()
            },
            {
                MetricName: 'WasmMemoryGrowth',
                Value: metrics.memoryGrowth,
                Unit: 'Bytes',
                Timestamp: new Date()
            },
            {
                MetricName: 'WasmBinarySize',
                Value: metrics.binarySize,
                Unit: 'Bytes',
                Timestamp: new Date()
            }
        ]
    }).promise();
}

3. Binary Size Matters

Every KB counts in serverless. We monitored binary size in CI:

# .github/workflows/size-check.yml
- name: Check Wasm binary size
  run: |
    SIZE=$(wc -c < pkg/image_processor_bg.wasm)
    MAX_SIZE=400000  # 400KB limit
    
    if [ $SIZE -gt $MAX_SIZE ]; then
      echo "Binary size $SIZE exceeds limit $MAX_SIZE"
      exit 1
    fi
    
    echo "Binary size: $SIZE bytes (under limit)"

4. Gradual Rollout is Essential

We used weighted routing in API Gateway:

{
  "routes": [
    {
      "path": "/process",
      "destinations": [
        {
          "target": "lambda-wasm-processor",
          "weight": 20
        },
        {
          "target": "lambda-nodejs-processor",
          "weight": 80
        }
      ]
    }
  ]
}

This allowed us to ramp Wasm traffic from 5% → 20% → 50% → 100% over 6 weeks (the config above shows the 20% stage).

5. Maintain Escape Hatches

We kept the Node.js version for 90 days after full Wasm rollout—as insurance.
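
Concretely, the escape hatch can be as small as one environment variable. A minimal sketch (the module paths are ours):

// One env var flips all traffic back to the legacy implementation
const useWasm = process.env.USE_WASM !== 'false';

const impl = useWasm
    ? require('./wasm-handler')
    : require('./legacy-node-handler'); // kept deployed for 90 days

exports.handler = (event, context) => impl.handler(event, context);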

The Future: Where We’re Headed

We’re now exploring:

  1. Wasm Component Model: Better interop between modules
  2. WASI Preview 2: More system interface capabilities
  3. Wasm-native frameworks: Leveraging tools like wasmCloud
  4. Edge deployment: Moving Wasm to CloudFront@Edge

Final Thoughts

WebAssembly in production is not a silver bullet, but for the right use cases, it’s transformative:

Choose Wasm when:

  • Performance is critical
  • Cold starts are a problem
  • You need language flexibility
  • CPU-bound workloads

Avoid Wasm when:

  • I/O-bound workloads (networking, database)
  • Rapid prototyping needed
  • Team lacks systems programming experience
  • Ecosystem maturity is critical

Our journey from skepticism to production taught us that WebAssembly is production-ready today—if you’re willing to invest in the ecosystem and tooling.

For more on WebAssembly’s role in cloud-native architectures, check out the CrashBytes overview of WebAssembly transforming cloud applications.

The future of serverless is multi-language, high-performance, and sandboxed. That future is Wasm.


Have questions about WebAssembly in production? Reach out on Twitter or connect on LinkedIn.