The Caching Strategy That Cut Our Client's AWS Bill by 60%
$28K/Month on AWS for a Mid-Stage Startup
Our client was processing 2M requests per day with a straightforward stack: Next.js frontend, Node.js API, PostgreSQL database, S3 for assets. Their AWS bill had crept from $4K to $28K over 18 months as traffic grew. The reflexive answer was "optimize the code." The actual answer was caching.
Where the Money Was Going
Monthly AWS breakdown (before):
RDS (PostgreSQL): $8,200 (29%) ← database was the bottleneck
EC2/ECS (compute): $7,400 (26%)
CloudFront + S3: $4,100 (15%)
ElastiCache: $0 (0%) ← no caching at all
Data transfer: $3,800 (14%)
Other (monitoring): $4,500 (16%)
Total: $28,000
The database was handling 12,000 queries per second. Most of them were identical reads being repeated thousands of times per hour.
The Caching Pyramid
Layer 1: Browser Cache (free, immediate)
├── Static assets: Cache for 1 year (immutable hashes)
├── API responses: Cache for 60-300 seconds (stale-while-revalidate)
└── Impact: 30% of requests never hit your servers
Layer 2: CDN Cache (CloudFront/Vercel Edge)
├── HTML pages: Cache for 60 seconds + stale-while-revalidate
├── API responses: Cache for 30-300 seconds by route
└── Impact: 50% of remaining requests never hit origin
Layer 3: Application Cache (Redis/ElastiCache)
├── Database query results: Cache for 60-3600 seconds
├── Computed values: Cache for hours/days
└── Impact: 80% of database queries eliminated
Layer 4: Database Query Optimization
├── Only queries that MUST hit the database reach it
└── Impact: Remaining queries are fast and efficient
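The layer percentages compound rather than add, because each layer only sees the traffic the previous one let through. A minimal sketch of that arithmetic, using the approximate figures from the pyramid (the function name `remainingFraction` is illustrative):

```typescript
// Each layer absorbs a fraction of the traffic that actually reaches it.
// The hit rates passed in below are the approximate figures from the pyramid.
function remainingFraction(hitRates: number[]): number {
  return hitRates.reduce((remaining, rate) => remaining * (1 - rate), 1);
}

// Browser cache absorbs about 30%; the CDN absorbs about 50% of what is left.
const reachingOrigin = remainingFraction([0.3, 0.5]);
console.log(`${(reachingOrigin * 100).toFixed(0)}% of requests reach origin`);
```

With these inputs roughly 35% of requests reach the origin. The 80% figure for Layer 3 is left out on purpose: it applies to database queries, not to incoming requests.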
Layer 1: Browser Cache Headers
// Set proper cache headers for different content types
// Static assets (JS, CSS, images with hashed filenames)
// Cache forever — the hash changes when content changes
res.setHeader("Cache-Control", "public, max-age=31536000, immutable");
// API responses that change infrequently (product catalog)
res.setHeader(
  "Cache-Control",
  "public, max-age=60, stale-while-revalidate=300"
);
// Fresh for 60s; for the next 300s a stale copy is served while the cache revalidates in the background
// The user gets a fast response either way
// User-specific data (cart, account)
res.setHeader("Cache-Control", "private, no-store");
// Never stored by any cache, so always fresh (no-cache alone would still allow storing and revalidating)
// HTML pages
res.setHeader(
  "Cache-Control",
  "public, max-age=0, s-maxage=60, stale-while-revalidate=300"
);
// Browser always checks, CDN caches for 60s
Impact: 30% of requests eliminated. Browser serves from local cache without any network request.
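One way to keep these headers consistent is to pick them in a single place instead of setting them handler by handler. The sketch below is an assumption about how that could be organized, not the client's actual code: `ContentClass`, `classify`, and `cacheControlFor` are hypothetical names, and the path prefixes are borrowed from the routes used elsewhere in this article. It uses `no-store` for user-specific data, since that is the directive that prevents storage altogether.

```typescript
type ContentClass = "static" | "publicApi" | "private" | "html";

// Header values follow the examples in this section.
const CACHE_CONTROL: Record<ContentClass, string> = {
  static: "public, max-age=31536000, immutable",
  publicApi: "public, max-age=60, stale-while-revalidate=300",
  private: "private, no-store",
  html: "public, max-age=0, s-maxage=60, stale-while-revalidate=300",
};

// Hypothetical classifier: user-specific routes are checked before the
// generic /api/ prefix so they can never fall into the public class.
function classify(path: string): ContentClass {
  if (path.startsWith("/_next/static/") || path.startsWith("/images/")) return "static";
  if (path.startsWith("/api/cart/") || path.startsWith("/api/user/")) return "private";
  if (path.startsWith("/api/")) return "publicApi";
  return "html";
}

function cacheControlFor(path: string): string {
  return CACHE_CONTROL[classify(path)];
}
```

In an Express-style handler this would be used as `res.setHeader("Cache-Control", cacheControlFor(req.path))`. Checking the private routes first is the important design choice: a misclassified cart response is a data leak, while a misclassified static asset is only a slower page.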
Layer 2: CDN Edge Caching
CloudFront behaviors configured per path:
/api/products/* → Cache 5 min, vary by query string
/api/categories/* → Cache 1 hour, vary by nothing
/api/search?* → Cache 2 min, vary by full query string
/api/cart/* → No cache (user-specific)
/api/user/* → No cache (user-specific)
/_next/static/* → Cache 1 year (immutable)
/images/* → Cache 1 year (immutable, transformed)
/*.html → Cache 60s, stale-while-revalidate 5 min
Impact: 50% of remaining requests served from CDN edge. Origin server load drops dramatically.
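The behavior table can also be written down as data, which makes it easy to review and unit-test before touching the CDN. This is a sketch under stated assumptions, not CloudFront configuration syntax: the `Behavior` shape, the prefix matching, and the default 60-second rule for unmatched paths are all simplifications, and sorting query parameters is an application-side normalization added here so that `?a=1&b=2` and `?b=2&a=1` map to the same entry.

```typescript
interface Behavior {
  prefix: string;       // path prefix standing in for the CDN path pattern
  ttlSeconds: number;   // 0 means "do not cache"
  varyByQuery: boolean; // whether the query string is part of the cache key
}

// TTLs copied from the behavior list above; first match wins.
const BEHAVIORS: Behavior[] = [
  { prefix: "/api/products/", ttlSeconds: 300, varyByQuery: true },
  { prefix: "/api/categories/", ttlSeconds: 3600, varyByQuery: false },
  { prefix: "/api/search", ttlSeconds: 120, varyByQuery: true },
  { prefix: "/api/cart/", ttlSeconds: 0, varyByQuery: false },
  { prefix: "/api/user/", ttlSeconds: 0, varyByQuery: false },
  { prefix: "/_next/static/", ttlSeconds: 31536000, varyByQuery: false },
  { prefix: "/images/", ttlSeconds: 31536000, varyByQuery: false },
];

// Assumption: anything unmatched is treated like an HTML page (60s).
const DEFAULT_BEHAVIOR: Behavior = { prefix: "/", ttlSeconds: 60, varyByQuery: false };

function behaviorFor(path: string): Behavior {
  return BEHAVIORS.find((b) => path.startsWith(b.prefix)) ?? DEFAULT_BEHAVIOR;
}

// Returns the cache key for a request, or null when the route is uncached.
function edgeCacheKey(path: string, query: string): string | null {
  const behavior = behaviorFor(path);
  if (behavior.ttlSeconds === 0) return null;
  if (!behavior.varyByQuery || query === "") return path;
  const sorted = query.split("&").sort().join("&");
  return `${path}?${sorted}`;
}
```

Routes that ignore the query string collapse every variant into a single cache entry, which is why `/api/categories/*` can sustain a one-hour TTL with a very high hit ratio.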
Layer 3: Application Cache (Redis)
This is where the biggest savings happen:
// Generic caching wrapper (cache-aside) with TTL-based expiry
// Assumes `redis` is an ioredis-style client and `db` a node-postgres-style pool; their setup is not shown here
async function cached<T>(
  key: string,
  ttlSeconds: number,
  fetcher: () => Promise<T>
): Promise<T> {
  // Check Redis first; a miss returns null
  const cachedValue = await redis.get(key);
  if (cachedValue !== null) return JSON.parse(cachedValue) as T;
  // Cache miss: fetch from the database
  const value = await fetcher();
  // Store in Redis with a TTL so the entry expires on its own
  await redis.setex(key, ttlSeconds, JSON.stringify(value));
  return value;
}
// Usage: Product catalog (changes rarely)
async function getProduct(id: string) {
  return cached(`product:${id}`, 3600, async () => {
    return db.query("SELECT * FROM products WHERE id = $1", [id]);
  });
}
// Usage: Category listing (changes daily)
async function getCategories() {
  return cached("categories:all", 1800, async () => {
    return db.query("SELECT * FROM categories ORDER BY sort_order");
  });
}
// Usage: Search results (changes frequently but can be stale for 60s)
async function searchProducts(query: string, page: number) {
  const cacheKey = `search:${query}:${page}`;
  return cached(cacheKey, 60, async () => {
    return db.query("SELECT * FROM products WHERE ...", [query]);
  });
}
Cache Invalidation
// When data changes, invalidate affected cache keys
async function updateProduct(id: string, data: ProductUpdate) {
  await db.query("UPDATE products SET ... WHERE id = $1", [id, ...]);
  // Invalidate specific product cache
  await redis.del(`product:${id}`);
  // Invalidate category listing (product might affect it)
  await redis.del("categories:all");
  // Invalidate search cache (pattern delete)
  // Caution: KEYS scans the entire keyspace and blocks Redis while it runs; prefer incremental SCAN on large datasets
  const searchKeys = await redis.keys("search:*");
  if (searchKeys.length > 0) await redis.del(...searchKeys);
}
Impact: Database queries dropped from 12,000/sec to 2,400/sec. 80% reduction.
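The pattern delete in `updateProduct` relies on `KEYS`, which walks the whole keyspace in one blocking call. A common alternative is to iterate with `SCAN` in small batches. The sketch below is written against a minimal, hypothetical `ScanClient` interface rather than a specific library; its `scan` signature is modeled on the Redis SCAN reply (next cursor plus a batch of keys), so check it against the client you actually use.

```typescript
// Minimal client surface this sketch needs. The interface name is hypothetical.
interface ScanClient {
  scan(
    cursor: string,
    matchToken: "MATCH",
    pattern: string,
    countToken: "COUNT",
    count: number
  ): Promise<[string, string[]]>;
  del(...keys: string[]): Promise<number>;
}

// Deletes every key matching `pattern` in batches and returns how many were removed.
async function deleteByPattern(
  client: ScanClient,
  pattern: string,
  batchSize = 100
): Promise<number> {
  let cursor = "0";
  let deleted = 0;
  do {
    const [nextCursor, keys] = await client.scan(cursor, "MATCH", pattern, "COUNT", batchSize);
    if (keys.length > 0) deleted += await client.del(...keys);
    cursor = nextCursor;
  } while (cursor !== "0"); // Redis returns cursor "0" once the iteration is complete
  return deleted;
}
```

Inside `updateProduct`, the two `search:*` lines would become `await deleteByPattern(redis, "search:*")`. Because search entries already expire after 60 seconds, it may also be reasonable to skip explicit invalidation for them and let the TTL handle it.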
The Results
Monthly AWS breakdown (after):
RDS (PostgreSQL): $3,200 (-61%) ← downsized instance
EC2/ECS (compute): $3,100 (-58%) ← fewer instances needed
CloudFront + S3: $2,400 (-41%) ← better cache hit ratio
ElastiCache (Redis): $1,200 (new) ← small Redis instance
Data transfer: $800 (-79%) ← CDN serves most traffic
Other (monitoring): $300 (-93%)
Total: $11,000 (-61%)
Monthly savings: $17,000
Annual savings: $204,000
Implementation cost: ~$15,000 (2 weeks of engineering)
ROI: 13.6x in year one
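The summary figures follow directly from the two monthly totals and the implementation cost. A short worked version of the arithmetic, using only numbers stated above (the payback period is derived from them, not reported by the client):

```typescript
const monthlyBefore = 28000;
const monthlyAfter = 11000;
const implementationCost = 15000;

const monthlySavings = monthlyBefore - monthlyAfter;        // 17,000
const annualSavings = monthlySavings * 12;                  // 204,000
const reduction = monthlySavings / monthlyBefore;           // about 0.607, i.e. 61%
const firstYearRoi = annualSavings / implementationCost;    // 13.6
const paybackMonths = implementationCost / monthlySavings;  // about 0.88 months

console.log({ monthlySavings, annualSavings, reduction, firstYearRoi, paybackMonths });
```

On these numbers the work pays for itself in under a month.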
Performance Improvement (Bonus)
Average API response time:
Before: 340ms (p50), 1,200ms (p99)
After: 12ms (p50, cache hit), 180ms (p99, cache miss)
Page load time:
Before: 2.8s
After: 0.9s
Database CPU utilization:
Before: 78% average (spikes to 95%)
After: 22% average (spikes to 45%)
Common Caching Mistakes
❌ Caching everything with the same TTL
→ Different data needs different freshness guarantees
❌ No cache invalidation strategy
→ Stale data is worse than slow data
❌ Caching user-specific data in shared cache
→ User A sees User B's cart (security incident)
❌ Not monitoring cache hit rates
→ You don't know if caching is actually working
❌ Cache stampede on expiration
→ 1,000 requests hit the DB simultaneously when cache expires
→ Fix: Use stale-while-revalidate or cache locking
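The stampede fix can be sketched as request coalescing: concurrent callers asking for the same key share one in-flight promise instead of each querying the database. This is an in-process version under stated assumptions; `singleFlight` and `inFlight` are illustrative names, and with several API instances a distributed lock in Redis would be needed to get the same guarantee across processes.

```typescript
// Promises currently being resolved, indexed by cache key.
const inFlight = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  // Start the fetch on the next microtask so a synchronous throw inside
  // `fetcher` becomes a rejection instead of escaping this function.
  const promise = Promise.resolve().then(fetcher);
  inFlight.set(key, promise);

  // Remove the entry once settled (success or failure) so later calls refetch.
  const cleanup = () => {
    if (inFlight.get(key) === promise) inFlight.delete(key);
  };
  promise.then(cleanup, cleanup);

  return promise;
}
```

Wrapping the fetcher passed to `cached` as `() => singleFlight(key, fetcher)` means that when a hot key expires, one request per process rebuilds it and the rest wait for that result.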
Implementation Priority
Week 1: Browser cache headers (free, immediate impact)
Week 2: CDN configuration (CloudFront/Vercel behaviors)
Week 3: Redis for top 10 most-hit database queries
Week 4: Cache invalidation and monitoring dashboard
Total effort: 4 weeks of focused engineering
Expected savings: 40-60% of current infrastructure costs
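For the Week 4 monitoring step, the number that matters most is the cache hit rate per kind of data. A minimal sketch of an in-memory counter follows; the `CacheStats` class and the idea of grouping by key namespace (`product`, `search`, and so on) are assumptions for illustration, and a real setup would export these counts to whatever metrics system is already in place.

```typescript
class CacheStats {
  private counts = new Map<string, { hits: number; misses: number }>();

  // Record one lookup for a namespace such as "product" or "search".
  record(namespace: string, hit: boolean): void {
    const entry = this.counts.get(namespace) ?? { hits: 0, misses: 0 };
    if (hit) entry.hits++;
    else entry.misses++;
    this.counts.set(namespace, entry);
  }

  // Fraction of lookups served from cache; 0 when nothing has been recorded.
  hitRate(namespace: string): number {
    const entry = this.counts.get(namespace);
    if (!entry) return 0;
    return entry.hits / (entry.hits + entry.misses);
  }
}

const stats = new CacheStats();
stats.record("product", true);
stats.record("product", true);
stats.record("product", false);
console.log(stats.hitRate("product").toFixed(2));
```

Calling `record` from the hit and miss branches of the caching wrapper is enough to tell whether a given TTL is earning its keep; a namespace with a low hit rate is a candidate for a longer TTL or for removal from the cache.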
Caching isn't glamorous. But it's the highest-ROI infrastructure investment most startups can make. Before you scale up your database, add more servers, or rewrite your application — add a caching layer. The math almost always works in your favor.