Migrating 200+ Ingress Resources to Gateway API: What Nobody Tells You

The real story of migrating from Ingress to Kubernetes Gateway API in production, including the breaking changes, the 2AM rollback, and why our p95 latency dropped nearly 40%.

The Ingress Problem: When Annotations Become Technical Debt

By late 2024, our Kubernetes Ingress situation had become absurd:

Our largest Ingress resource:

  • 847 lines of YAML
  • 93 custom annotations
  • Comments warning: “DO NOT TOUCH THIS WITHOUT APPROVAL”
  • Last successfully modified: 6 months ago
  • Engineers who understood it: 2 (one left the company)

Every new routing rule required:

  • 45-60 minutes of careful annotation editing
  • Three-person code review (because one person breaking it cost us $80K)
  • Prayer that nobody would fat-finger a regex
  • Crossing fingers during deployment

We needed a better way. The Kubernetes Gateway API promised exactly that.

But migration is never as simple as the tutorials make it look.

Decision: Gateway API or Stick with Ingress Hell?

The Case FOR Gateway API

Pros:

  • Role-oriented design (platform team vs. dev team concerns separated)
  • First-class support for advanced routing (headers, weights, mirrors)
  • Protocol extensibility (HTTP/2, gRPC, TCP, TLS)
  • Strong vendor support (everyone’s adopting it)

Cons:

  • Core resources only recently reached GA, and many features are still alpha/beta (we’d be early adopters)
  • Learning curve (new concepts, new patterns)
  • Migration complexity (200+ Ingress resources to convert)
  • Risk of bugs in early implementations

The Case AGAINST Gateway API

Our infrastructure team’s concerns:

  • “If it ain’t broke…” (Ingress works, mostly)
  • Unknown unknowns (what edge cases will we hit?)
  • Training overhead (60 engineers need to learn new APIs)
  • Tooling gaps (existing scripts/automation won’t work)

The deciding vote: Our CTO read a study showing 30% latency improvements from better routing algorithms in Gateway API implementations.

Decision: We migrate. But carefully.
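
The role-oriented design was the biggest draw: the platform team owns the Gateway (listeners, TLS, which namespaces may attach routes), application teams own their own HTTPRoutes, and the GatewayClass comes from the controller vendor. A minimal sketch of that split, with illustrative names:

# Owned by the platform team: the shared entry point
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: prod-gateway
  namespace: gateway-system
spec:
  gatewayClassName: nginx
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All   # platform decides which namespaces may attach routes
---
# Owned by an application team: routing for their own service
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blog-api
  namespace: production
spec:
  parentRefs:
  - name: prod-gateway
    namespace: gateway-system
  hostnames:
  - "api.example.com"
  rules:
  - backendRefs:
    - name: blog-api
      port: 8080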

The Migration Strategy: Crawl, Walk, Run, Sprint

We rejected the “big bang” approach immediately. Our plan:

Phase 1: Proof of Concept (2 Weeks)

  • Single service, non-critical
  • Learn the Gateway API patterns
  • Identify tooling gaps

Phase 2: Production Validation (4 Weeks)

  • 5 production services with moderate traffic
  • Shadow deployment (run both Ingress and Gateway side-by-side; see the sketch after this list)
  • Measure everything, trust nothing
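
For reference, Gateway API can also mirror a copy of live requests to a second backend via the RequestMirror filter, which is handy for exactly this kind of side-by-side validation. A minimal sketch, assuming your Gateway controller supports the filter (backend names are illustrative):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blog-api
  namespace: production
spec:
  parentRefs:
  - name: prod-gateway
    namespace: gateway-system
  hostnames:
  - "api.example.com"
  rules:
  - filters:
    - type: RequestMirror
      requestMirror:
        backendRef:
          name: blog-api-shadow   # gets a copy of each request; its responses are discarded
          port: 8080
    backendRefs:
    - name: blog-api              # live traffic is still served from here
      port: 8080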

Phase 3: Progressive Rollout (8 Weeks)

  • Batch migrations: 10 services per week
  • Automated conversion tooling
  • Rollback plan for every deployment

Phase 4: Decommission Ingress (2 Weeks)

  • Final cleanup
  • Documentation and training
  • Celebration (spoiler: we earned it)

Phase 1: The Proof of Concept That Almost Failed

We picked a simple service: blog-api with 3 routes, 2K req/sec traffic.

How hard could it be?

Attempt 1: Direct Translation (Failed)

Our first attempt was to directly translate the Ingress YAML:

# OLD: Ingress (worked fine)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blog-api
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /v1/posts(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: blog-api
            port:
              number: 8080

# NEW: Gateway API (first attempt - BROKEN)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blog-api
spec:
  parentRefs:
  - name: prod-gateway
    namespace: gateway-system
  hostnames:
  - "api.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/posts
    backendRefs:
    - name: blog-api
      port: 8080

Deployed it. Production broke immediately.

Problem: the original Ingress leaned on the regex rewrite-target annotation (the /$2 capture group). Gateway API has no annotation-driven rewrites, and its core URLRewrite filter doesn’t do regex captures, so path modification has to be declared explicitly. Regex rewrites via annotations are an Ingress-ism.

Solution: HTTPRoute Filters

Gateway API handles this through explicit filters:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blog-api
spec:
  parentRefs:
  - name: prod-gateway
    namespace: gateway-system
  hostnames:
  - "api.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/posts
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /
    backendRefs:
    - name: blog-api
      port: 8080

This time it worked.

Phase 2: Production Validation - The 2AM Rollback

We migrated 5 services to Gateway API. Everything looked great in staging.

Then we hit production traffic.

The Incident: Certificate Rotation Broke Everything

Timeline:

  • 2:14 AM: Automated cert rotation begins
  • 2:18 AM: All Gateway API routes return 503
  • 2:19 AM: Pages start firing
  • 2:23 AM: Emergency rollback to Ingress
  • 2:35 AM: Services restored

Root cause: Our cert-manager integration expected Ingress annotations. Gateway API uses different certificate reference mechanisms.

The fix: Updated cert-manager configuration:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: prod-gateway
  namespace: gateway-system
spec:
  gatewayClassName: nginx
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    hostname: "*.example.com"
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: wildcard-tls-cert
        namespace: cert-manager  # KEY CHANGE
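
One detail worth calling out: referencing a Secret in another namespace only works if that namespace explicitly allows it with a ReferenceGrant. A minimal grant, assuming the standard-channel v1beta1 API (the name is illustrative):

apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-prod-gateway-certs
  namespace: cert-manager        # must live in the namespace that holds the Secret
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: Gateway
    namespace: gateway-system
  to:
  - group: ""
    kind: Secret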

Lesson learned: Don’t assume certificate management “just works”. Test cert rotation explicitly.

Phase 3: The Automated Migration Tool

After manually migrating 5 services, we realized we needed automation. 200+ services at 45-60 minutes each works out to more than 150 hours of hand editing. No way.

We built ingress-to-gateway-api - a Go tool that:

1. Parses Ingress YAML

Extracts:

  • Host rules
  • Path patterns
  • Backend services
  • Custom annotations (the tricky part)

2. Translates to Gateway API

Maps Ingress patterns to Gateway API constructs:

// Assumed imports for this excerpt:
//   metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
//   networkingv1 "k8s.io/api/networking/v1"
//   gatewayv1 "sigs.k8s.io/gateway-api/apis/v1"
//   "k8s.io/utils/ptr"
func translateIngress(ingress *networkingv1.Ingress) (*gatewayv1.HTTPRoute, error) {
    httpRoute := &gatewayv1.HTTPRoute{
        ObjectMeta: metav1.ObjectMeta{
            Name:      ingress.Name,
            Namespace: ingress.Namespace,
        },
        Spec: gatewayv1.HTTPRouteSpec{
            ParentRefs: []gatewayv1.ParentReference{
                {
                    Name:      "prod-gateway",
                    Namespace: ptr.To(gatewayv1.Namespace("gateway-system")),
                },
            },
        },
    }
    
    // Translate host rules
    for _, rule := range ingress.Spec.Rules {
        if rule.Host != "" {
            httpRoute.Spec.Hostnames = append(
                httpRoute.Spec.Hostnames, 
                gatewayv1.Hostname(rule.Host),
            )
        }
        
        // Translate paths (guard against rules that only set a host, with no HTTP block)
        if rule.HTTP == nil {
            continue
        }
        for _, path := range rule.HTTP.Paths {
            match := gatewayv1.HTTPRouteMatch{
                Path: &gatewayv1.HTTPPathMatch{
                    Type:  ptr.To(translatePathType(path.PathType)),
                    Value: ptr.To(path.Path),
                },
            }
            
            backendRef := gatewayv1.HTTPBackendRef{
                BackendRef: gatewayv1.BackendRef{
                    BackendObjectReference: gatewayv1.BackendObjectReference{
                        Name: gatewayv1.ObjectName(path.Backend.Service.Name),
                        Port: ptr.To(gatewayv1.PortNumber(path.Backend.Service.Port.Number)),
                    },
                },
            }
            
            routeRule := gatewayv1.HTTPRouteRule{
                Matches:     []gatewayv1.HTTPRouteMatch{match},
                BackendRefs: []gatewayv1.HTTPBackendRef{backendRef},
            }
            
            // Handle annotations -> filters translation
            if filters, err := translateAnnotations(ingress.Annotations); err == nil {
                routeRule.Filters = filters
            }
            
            httpRoute.Spec.Rules = append(httpRoute.Spec.Rules, routeRule)
        }
    }
    
    return httpRoute, nil
}

// The annotation translation was the HARD part
func translateAnnotations(annotations map[string]string) ([]gatewayv1.HTTPRouteFilter, error) {
    filters := []gatewayv1.HTTPRouteFilter{}
    
    // Handle rewrite rules
    if rewriteTarget, exists := annotations["nginx.ingress.kubernetes.io/rewrite-target"]; exists {
        filters = append(filters, gatewayv1.HTTPRouteFilter{
            Type: gatewayv1.HTTPRouteFilterURLRewrite,
            URLRewrite: &gatewayv1.HTTPURLRewriteFilter{
                Path: &gatewayv1.HTTPPathModifier{
                    Type:               gatewayv1.PrefixMatchHTTPPathModifier,
                    ReplacePrefixMatch: ptr.To(rewriteTarget),
                },
            },
        })
    }
    
    // Handle rate limiting
    if _, exists := annotations["nginx.ingress.kubernetes.io/limit-rps"]; exists {
        // Note: Gateway API doesn't have native rate limiting (yet).
        // This requires implementation-specific policy attachment,
        // which we applied as separate policy CRDs (sketched below).
    }
    
    return filters, nil
}
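
For completeness, “policy attachment” generally means a separate CRD whose targetRef points at a Gateway or HTTPRoute; the actual group, kind, and fields depend entirely on your Gateway implementation. A purely hypothetical sketch of the shape:

# Hypothetical policy resource; real kinds and fields vary by controller
apiVersion: policy.example.com/v1alpha1
kind: RateLimitPolicy
metadata:
  name: blog-api-rate-limit
  namespace: production
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: blog-api
  limit:
    requests: 100
    period: 1s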

3. Validates Equivalence

Runs traffic through both Ingress and Gateway API routes, comparing:

  • Response codes
  • Response times
  • Response bodies

Only promotes to production if 99.9% of responses match.

4. Generates Rollback Plan

Every migration includes auto-generated rollback script:

#!/bin/bash
# Auto-generated rollback for blog-api
# Generated: 2025-04-15 14:23:17 UTC

echo "Rolling back blog-api to Ingress..."

# Delete Gateway API resources
kubectl delete httproute blog-api -n production
kubectl delete referencegrant blog-api-rg -n production

# Restore Ingress resource
kubectl apply -f backup/blog-api-ingress-2025-04-15.yaml

# Verify rollback
kubectl wait --for=condition=Ready ingress/blog-api -n production --timeout=60s

echo "Rollback complete. Verify traffic manually."

The Breaking Changes Nobody Warns You About

Breaking Change 1: Path Matching Semantics

Ingress: with the NGINX controller, a Prefix path of /api/v1 matched /api/v1, /api/v1/, /api/v1/users, and so on.

Gateway API: PathPrefix matches on whole path elements, so edge cases like trailing slashes and partial segments (think /api/v10) behave differently.

Impact: 12 services broke because trailing slashes behaved differently.

Fix: Explicit path match types:

rules:
- matches:
  - path:
      type: PathPrefix        # Matches /api/v1 and anything under /api/v1/
      value: /api/v1
  - path:
      type: Exact            # Only matches /api/v1 exactly
      value: /api/v1

Breaking Change 2: Weight-Based Routing

Ingress: No native support. We used custom annotations.

Gateway API: First-class support, but default behavior changed.

rules:
- backendRefs:
  - name: blog-api-v1
    port: 8080
    weight: 90
  - name: blog-api-v2
    port: 8080
    weight: 10  # 10% canary traffic

Problem: Weights are relative shares, not fixed guarantees.

If blog-api-v2 loses all of its healthy endpoints, ALL traffic shifts to v1, rather than v1 keeping its 90% while the canary’s 10% fails.

Solution: when the split has to be deterministic, route the canary explicitly with header matching instead of relying on weights alone:

rules:
- matches:
  - headers:
    - name: X-Canary
      value: "true"
  backendRefs:
  - name: blog-api-v2
    port: 8080
- backendRefs:
  - name: blog-api-v1
    port: 8080

Breaking Change 3: TLS Termination

Ingress: TLS terminates at Ingress Controller.

Gateway API: TLS termination happens at Gateway, not HTTPRoute.

Impact: Our per-service TLS configs broke.

Fix: Moved TLS configuration to Gateway level:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: prod-gateway
spec:
  listeners:
  - name: https-blog
    port: 443
    protocol: HTTPS
    hostname: blog.example.com
    tls:
      mode: Terminate
      certificateRefs:
      - name: blog-tls-cert

Performance Improvements: The Good Surprise

After full migration, we saw dramatic performance improvements:

Latency Reduction

  • p50: 12ms → 11ms (8% improvement)
  • p95: 45ms → 28ms (38% improvement)
  • p99: 180ms → 52ms (71% improvement!)

Why? Gateway API implementations use smarter routing algorithms.

Connection Pooling

Gateway API’s connection management is more efficient:

  • Before: 15,000 active connections per pod
  • After: 8,200 active connections per pod
  • Result: 45% reduction in connection overhead

HTTP/2 Optimizations

Gateway API enabled HTTP/2 optimizations we couldn’t do with Ingress:

  • CSS/JS preloading
  • Multiplexed connections
  • Header compression

Result: Initial page load time improved by 32%.

The Hidden Costs

Cost 1: Team Training (240 Hours)

Every engineer needed to learn Gateway API concepts:

  • Gateway vs. HTTPRoute vs. GatewayClass
  • ReferenceGrant for cross-namespace routing (example after this list)
  • Policy attachment mechanisms
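
ReferenceGrant here is the same mechanism as the certificate fix in Phase 2, just pointed at Services: the namespace that owns a backend has to allow routes from other namespaces to reference it. An illustrative grant:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-routes-to-shared-api
  namespace: shared-services       # namespace that owns the backend Service
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    namespace: production
  to:
  - group: ""
    kind: Service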

Solution: Weekly “Gateway API Office Hours” for 8 weeks.

Cost 2: CI/CD Pipeline Updates

All deployment scripts assumed Ingress YAML:

# OLD: deploy.sh (broken after migration)
kubectl apply -f ingress.yaml
kubectl wait --for=condition=Ready ingress/my-app

# NEW: deploy.sh (Gateway API)
kubectl apply -f httproute.yaml
kubectl wait --for=condition=Accepted httproute/my-app
kubectl wait --for=condition=ResolvedRefs httproute/my-app

Effort: 40+ repositories updated.

Cost 3: Monitoring Dashboards

Our Grafana dashboards tracked Ingress metrics. Gateway API exposes different metrics.

Solution: Built unified dashboards showing both during migration period.

Lessons for Teams Considering Migration

✅ Do This:

  1. Start with non-critical services - Learn on low-stakes deployments
  2. Run shadow traffic - Validate behavior before cutover
  3. Build automation early - Manual migration doesn’t scale
  4. Test certificate rotation - This will bite you at 2 AM
  5. Train teams incrementally - Don’t wait until migration day

❌ Don’t Do This:

  1. Big bang migration - Recipe for disaster
  2. Assume equivalence - Path matching semantics differ
  3. Skip rollback testing - You WILL need to rollback
  4. Forget about TLS - Gateway-level termination is different
  5. Neglect monitoring - Metrics change, dashboards break

The ROI: Was It Worth It?

Yes, but it was harder than expected.

Quantifiable benefits:

  • 71% p99 latency improvement
  • 45% connection overhead reduction
  • 32% faster page loads
  • 60% reduction in routing config errors

Intangible benefits:

  • Cleaner separation of concerns (platform vs. app teams)
  • Future-proof architecture (Gateway API is the future)
  • Better debugging (explicit filters vs. mysterious annotations)
  • Reduced cognitive load (role-oriented design makes sense)

Total migration effort:

  • Engineering time: 480 hours
  • Downtime: 42 minutes (spread across incidents)
  • Bugs discovered: 17 (all fixed)
  • Late-night pages: 8 (mostly cert-manager related)

What’s Next?

We’re now exploring advanced Gateway API features:

  1. TCPRoute for non-HTTP services such as database replicas (see the sketch after this list), and GRPCRoute for gRPC
  2. TrafficSplit for sophisticated canary deployments
  3. BackendTLSPolicy for end-to-end encryption
  4. Custom policy attachment for rate limiting and auth
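
TCPRoute is still in the experimental channel (v1alpha2) as we write this, but the shape is straightforward. A sketch for a Postgres read replica, with illustrative listener and service names:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: postgres-replica
  namespace: production
spec:
  parentRefs:
  - name: prod-gateway
    namespace: gateway-system
    sectionName: postgres        # a TCP listener defined on the Gateway
  rules:
  - backendRefs:
    - name: postgres-replica
      port: 5432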

Gateway API opened doors we didn’t even know existed with Ingress.

For more on Kubernetes traffic management evolution, see the comprehensive Gateway API guide that helped inform our migration strategy.


Migrating to Gateway API? Connect on LinkedIn for questions, or follow the journey on Twitter.