The Cloud Egress Cost Trap (And How to Escape It)
Cloud bill line items don't surprise you the way egress fees do. The patterns below turn a $50k/month surprise into a $5k/month line item.
How egress is charged
Egress is data leaving a cloud provider's network: to the public internet, to another provider, or sometimes between regions of the same provider. Providers make ingress (data into the cloud) free; egress is the line item that grows quietly until someone notices the bill.
The pricing tiers. AWS, GCP, and Azure all charge tiered rates that drop as volume rises (roughly $0.09/GB at low volume falling to $0.05/GB at petabyte scale). Inter-region traffic within the same provider is cheaper but not free (roughly $0.02/GB). Inter-AZ traffic within a region is the cheapest tier (roughly $0.01/GB). Cross-cloud traffic (AWS-to-GCP) is charged at the full public-internet rate; this is the trap that catches teams hosting "between" providers.
The math at scale. A team serving 100TB/month of public traffic at $0.085/GB pays $8,500/month, $102k/year. The same team serving 1PB/month pays on the order of $850k/year even after volume discounts. For SaaS with rich media (video, large APIs), egress is often 20-30% of the total cloud bill, a line item that rivals compute.
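A minimal sketch of the tiered math; the tier boundaries and rates below are illustrative assumptions chosen to roughly reproduce the figures above, not any provider's actual price sheet.

```python
# Illustrative egress tiers: (tier size in GB, $/GB). Assumptions, not a
# real price sheet -- substitute your provider's published rates.
TIERS = [
    (100_000, 0.085),      # first 100 TB/month
    (400_000, 0.07),       # next 400 TB
    (float("inf"), 0.065), # everything beyond
]

def egress_cost(gb_per_month: float) -> float:
    """Total monthly egress cost across the tiers."""
    cost, remaining = 0.0, gb_per_month
    for tier_size, rate in TIERS:
        billed = min(remaining, tier_size)
        cost += billed * rate
        remaining -= billed
        if remaining <= 0:
            break
    return cost

print(f"${egress_cost(100_000):,.0f}/month")    # $8,500 -- ~$102k/year
print(f"${egress_cost(1_000_000):,.0f}/month")  # $69,000 -- ~$830k/year
```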
The "Basic tier" myth. Each provider gives 1-100GB/month free egress; this is a marketing number, not a meaningful threshold for production workloads. Real workloads blow past the Basic tier in hours. Treat the Basic tier as zero when forecasting.
Three high-leverage cuts
Most egress cost is concentrated in three patterns. Find which one you have and the savings come fast.
Pattern 1: chatty cross-AZ traffic. Microservices in different AZs talking constantly. Each call crosses an AZ boundary at $0.01/GB. The fix is AZ-affinity routing: keep request flows within an AZ when possible. Mesh-aware load balancers and service mesh routing tables handle this; without them, k8s default scheduling spreads pods across AZs and every internal call pays the cross-AZ tax.
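A quick way to see the spread before fixing it, sketched with the official kubernetes Python client; the "prod" namespace and the app=checkout label are placeholders.

```python
from collections import Counter
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Map each node to its AZ via the standard topology label.
node_zone = {
    n.metadata.name: n.metadata.labels.get("topology.kubernetes.io/zone", "unknown")
    for n in v1.list_node().items
}

# Count one service's pods per zone ("prod" and "app=checkout" are placeholders).
pods = v1.list_namespaced_pod("prod", label_selector="app=checkout").items
print(Counter(node_zone.get(p.spec.node_name, "unscheduled") for p in pods))
# e.g. Counter({'us-east-1a': 4, 'us-east-1b': 3, 'us-east-1c': 3}) -- if the
# callers sit mostly in one zone, most internal calls are paying the cross-AZ tax.
```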
Pattern 2: backups and replication leaving the region. Snapshots copied to S3 in another region "for safety". Database read-replicas in a different region "for low-latency reads". These are sensible architectures with real reasons; they're also where 30-50% of egress lives. Audit each cross-region copy: do you actually use it for failover? If you've never done a regional failover and never will, the cross-region replication is paying for theatre.
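A sketch of that audit for S3 with boto3: it lists every bucket's replication rules so each cross-region copy gets the failover question asked of it.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        cfg = s3.get_bucket_replication(Bucket=name)["ReplicationConfiguration"]
    except ClientError:
        continue  # no replication configured on this bucket
    for rule in cfg["Rules"]:
        dest = rule["Destination"]["Bucket"]  # destination bucket ARN
        print(f"{name} -> {dest} ({rule['Status']})")
        # For each line printed: has this replica ever actually served a failover?
```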
Pattern 3: external API calls and webhooks. Every call to a third-party API counts as egress. For high-volume integrations (analytics, monitoring, payments), this can be tens of TB/month. Batch the calls; cache responses; deduplicate. Cutting the outbound volume 10x cuts this egress line 10x.
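A minimal sketch of the batch-and-cache pattern; the endpoint URL and batch size are placeholders, and it assumes the third-party API accepts batched payloads.

```python
import time
import requests

ANALYTICS_URL = "https://api.example-analytics.com/events/batch"  # placeholder
BATCH_SIZE = 500

_buffer: list[dict] = []
_cache: dict[str, tuple[float, dict]] = {}  # url -> (expiry, response body)

def track(event: dict) -> None:
    """Buffer events and ship them in one request instead of one call each."""
    _buffer.append(event)
    if len(_buffer) >= BATCH_SIZE:
        # 500 events per request instead of 500 requests; a real version
        # would also flush on a timer and at shutdown.
        requests.post(ANALYTICS_URL, json={"events": _buffer}, timeout=10)
        _buffer.clear()

def cached_get(url: str, ttl: float = 60.0) -> dict:
    """Cache identical outbound GETs for `ttl` seconds; repeats cost no egress."""
    now = time.time()
    if url in _cache and _cache[url][0] > now:
        return _cache[url][1]
    body = requests.get(url, timeout=10).json()
    _cache[url] = (now + ttl, body)
    return body
```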
Architectural choices that lock in egress
Some choices made early are nearly impossible to reverse later. Recognise them before you make them.
Choice 1: multi-cloud for "vendor lock-in" reasons. The argument: don't get trapped in one provider. The reality: traffic between AWS and GCP pays full public-internet egress in each direction (AWS charges for bytes leaving AWS; GCP for bytes leaving GCP), so a chatty integration pays on BOTH ends of every round trip. A team running services on AWS and storage on GCP can pay 5-10x more in egress than either single-cloud option. Multi-cloud has real benefits (compliance, redundancy) but the egress cost must be in the budget, not a surprise.
Choice 2: hybrid cloud (on-prem + cloud) without dedicated interconnect. Public-internet VPN to on-prem looks cheap; the egress fees over 12 months pay for AWS Direct Connect or GCP Interconnect twice over. If you're moving more than ~5TB/month between cloud and on-prem, dedicated interconnect is the right call.
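The break-even is one line of arithmetic; a sketch, where the port cost and per-GB rates are illustrative assumptions to replace with your provider's actual quote.

```python
# Illustrative figures (assumptions, not quotes) -- check current pricing.
INTERNET_EGRESS_PER_GB = 0.09   # public-internet egress over the VPN path
DEDICATED_PORT_PER_MONTH = 220  # e.g. a 1 Gbps dedicated port, per month
DEDICATED_EGRESS_PER_GB = 0.02  # reduced data-transfer rate over the interconnect

def monthly_cost(gb: float) -> tuple[float, float]:
    vpn = gb * INTERNET_EGRESS_PER_GB
    dedicated = DEDICATED_PORT_PER_MONTH + gb * DEDICATED_EGRESS_PER_GB
    return vpn, dedicated

for tb in (1, 5, 20):
    vpn, ded = monthly_cost(tb * 1_000)
    print(f"{tb:>2} TB/month: VPN ${vpn:,.0f} vs interconnect ${ded:,.0f}")
# 1 TB: $90 vs $240 -- VPN wins. 5 TB: $450 vs $320. 20 TB: $1,800 vs $620.
# With these figures the break-even lands in the low single-digit TB/month range.
```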
Choice 3: stateful workloads in different regions than the data. The classic "we'll put compute close to users, storage in the cheapest region" plan. Every read pulls cross-region; the egress eats the savings. Co-locate compute with the data it reads most; replicate the data to user-near regions if latency requires it.
CDN as an egress strategy
CDNs (Cloudflare, Fastly, CloudFront) cache static and semi-static content at the edge. Customers fetch from the CDN, not your origin; origin egress drops by 80-95%.
The pricing math. CDN bandwidth is roughly $0.04/GB (CloudFront) or as low as $0.01/GB at volume (Cloudflare Enterprise). That's half the cost of direct cloud egress at low volume, dropping to roughly one-tenth at high volume. For any workload serving more than ~500GB/day of cacheable content, CDN beats direct origin egress on price alone.
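A rough sketch of that comparison; cache misses still pull from the origin, so the hit rate drives the savings. The rates are the rough figures above, and the model assumes misses pay origin egress (some same-vendor pairs, such as S3 behind CloudFront, waive the origin-fill charge).

```python
ORIGIN_EGRESS = 0.085  # $/GB, direct cloud egress (rough figure from above)
CDN_EGRESS = 0.04      # $/GB, CDN edge delivery (rough figure from above)

def cdn_vs_direct(gb: float, hit_rate: float) -> tuple[float, float]:
    direct = gb * ORIGIN_EGRESS
    # With a CDN: all user-facing bytes bill at the CDN rate, and cache
    # misses additionally pull from the origin at the origin rate.
    cdn = gb * CDN_EGRESS + gb * (1 - hit_rate) * ORIGIN_EGRESS
    return direct, cdn

gb = 15_000  # ~500 GB/day of cacheable content
for hit in (0.80, 0.90, 0.95):
    direct, cdn = cdn_vs_direct(gb, hit)
    print(f"hit rate {hit:.0%}: direct ${direct:,.0f} vs CDN ${cdn:,.0f}/month")
# 80%: $1,275 vs $855. 90%: $1,275 vs $728. 95%: $1,275 vs $664.
```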
What's cacheable. Static assets (CSS, JS, images, videos) are obviously cacheable. Less obvious: API responses with reasonable TTLs (product catalogues, search results, public data). Even short TTLs (5-60 seconds) collapse the origin egress for hot endpoints. Cloudflare's "tiered cache" feature routes cache misses through upper-tier data centres instead of your origin, multiplying the effective hit rate.
The implementation gotcha. Origins must send correct cache-control headers; otherwise the CDN treats everything as uncacheable. A common mistake: developers set cache-control: no-cache by default "for safety" and never revisit. Audit your headers; cacheable content should have explicit max-age values, not vague no-cache.
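A 20-line version of that audit with requests; the endpoint list is a placeholder for your own top egress endpoints.

```python
import requests

# Placeholder endpoints: substitute your own top egress endpoints.
ENDPOINTS = [
    "https://example.com/api/catalogue",
    "https://example.com/static/app.js",
]

for url in ENDPOINTS:
    resp = requests.head(url, timeout=10, allow_redirects=True)
    cc = resp.headers.get("Cache-Control", "<missing>")
    flag = ""
    if cc == "<missing>" or "no-cache" in cc or "no-store" in cc:
        flag = "  <-- CDN treats this as uncacheable; is that intended?"
    print(f"{url}\n  Cache-Control: {cc}{flag}")
```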
Track egress per business unit
Aggregate egress is unactionable. "We spent $80k on egress last month" doesn't tell you which feature, team, or customer drove the cost. Per-business-unit attribution is the prerequisite for cuts.
The tagging strategy. Tag every resource (LB, NAT gateway, S3 bucket, instance) with team and product. Cloud cost-allocation reports then split egress by tag. The first attribution report typically reveals 60-70% of egress comes from 1-2 features; the rest is long-tail noise.
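On AWS the split then falls out of one Cost Explorer call, sketched here with boto3; the `team` tag key and the usage-type group name are assumptions to adapt to what your own bill exposes.

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2025-05-01", "End": "2025-06-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    # Split by the cost-allocation tag; "team" is a placeholder tag key.
    GroupBy=[{"Type": "TAG", "Key": "team"}],
    # Restrict to internet egress; verify this group name against the
    # USAGE_TYPE_GROUP values your account actually exposes.
    Filter={"Dimensions": {"Key": "USAGE_TYPE_GROUP",
                           "Values": ["EC2: Data Transfer - Internet (Out)"]}},
)

for group in resp["ResultsByTime"][0]["Groups"]:
    team = group["Keys"][0]  # e.g. "team$ml-platform"
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{team}: ${cost:,.2f}")
```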
The chargeback effect. Once teams see their own egress line item, behaviour changes fast. The ML team using cross-region S3 reads "because it was easier" suddenly cares; engineering effort to optimise becomes worth it because the savings hit their budget. Without attribution, egress is "the company's problem" and nobody owns it.
The dashboard. A weekly egress dashboard per team. Top 5 endpoints by GB. Trend versus last week. Anomaly alerts when a team jumps 2x. The visibility is what creates the optimisation pressure; the absence of visibility is why egress quietly grows.
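The anomaly rule at the end is a few lines; a sketch assuming the weekly GB totals per team already come out of the attribution report (the numbers are illustrative).

```python
# Weekly egress totals per team, in GB (illustrative numbers).
last_week = {"checkout": 1_200, "ml-platform": 8_000, "search": 450}
this_week = {"checkout": 1_150, "ml-platform": 19_500, "search": 470}

ALERT_RATIO = 2.0  # flag any team whose egress more than doubles week-over-week

for team, gb_now in this_week.items():
    gb_before = last_week.get(team, 0)
    if gb_before and gb_now / gb_before >= ALERT_RATIO:
        print(f"ALERT: {team} egress jumped {gb_now / gb_before:.1f}x "
              f"({gb_before:,} GB -> {gb_now:,} GB)")
# -> ALERT: ml-platform egress jumped 2.4x (8,000 GB -> 19,500 GB)
```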
Common antipatterns
Multi-cloud "for resilience" without measuring egress cost. Resilience is a real goal; the egress fee for cross-cloud data flow is a tax on the goal. Quantify it before committing.
Cross-region read replicas that nobody reads. Originally added for "low-latency reads from EU"; the EU customers were never onboarded; replica still runs and pays cross-region replication fees. Audit; delete.
NAT gateway data-processing fees. Surprise: AWS NAT gateway charges $0.045/GB processed in addition to egress. For chatty workloads, the NAT gateway costs more than the egress itself. Gateway VPC endpoints (free for S3 and DynamoDB) replace NAT for that AWS-to-AWS traffic; interface endpoints for other services are still far cheaper than NAT processing.
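Replacing NAT on the S3 path is a single API call; a sketch with boto3 where the VPC and route-table IDs are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Gateway endpoints for S3 and DynamoDB carry no hourly or per-GB charge;
# traffic routed through them stops paying the NAT data-processing fee.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",            # placeholder
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # placeholder
)
```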
Skipping the CDN for "dynamic" content. The team assumed dynamic = uncacheable. In practice, even authenticated API responses can cache at edge with short TTLs. The 80% origin-egress reduction is worth the 30 minutes of cache-control header tuning.
What to do this week
Three moves. (1) Pull last month's cloud bill and find the egress line. Note the absolute number and the percent of total spend. If egress is more than 10% of total, optimisation is worth a sprint. (2) Tag your top 10 highest-egress resources by team. The first cost-by-team report will tell you where the next two weeks of work goes. (3) For your top 1-2 high-egress endpoints, check the cache-control headers and CDN configuration. A 30-minute audit often surfaces "this is uncacheable for no reason"; fixing it cuts origin egress by 50%+ on that endpoint.