Why Cache?
Caching stores frequently accessed data in a faster storage layer so subsequent reads skip the slow path entirely. A database query that takes 50ms can return in under 1ms from cache. At scale, that gap is the difference between a responsive app and one that collapses under load.
Cache Layers
Caching isn't a single layer — it happens at every level of the stack. Each layer serves a different purpose:
- Client cache: Browser caches static assets via `Cache-Control` headers (see the sketch after this list). The fastest cache — the request never leaves the device.
- CDN cache: Edge servers cache content geographically close to users. Great for static assets and cacheable API responses.
- Reverse proxy cache: Nginx, Varnish, or similar. Caches full HTTP responses before they hit your application servers.
- Application cache: Redis or Memcached sitting between your app and database. You control exactly what gets cached, with what key, and for how long.
- Database cache: Most databases have built-in query caches and buffer pools. Useful but not always tunable.
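To make the client and CDN layers concrete, here is a minimal sketch, assuming a Flask app with a hypothetical `/api/products` endpoint, of setting a `Cache-Control` header so browsers and edge caches may reuse the response:

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/products")
def products():
    # A cacheable API response: the browser and any CDN or edge cache in
    # front of the app may reuse it for up to 60 seconds without
    # re-contacting the server.
    resp = jsonify([{"id": 1, "name": "widget"}])
    resp.headers["Cache-Control"] = "public, max-age=60"
    return resp
```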
Cache Strategies
Cache-Aside (Lazy Loading)
The application checks the cache first. On a miss, it reads from the database, writes the result to cache, and returns it. The cache only stores data that's actually been requested.
Pros: Only requested data gets cached. Cache failures don't break reads (you fall back to the DB).
Cons: First request is always a cache miss. Data can become stale if the DB is updated without invalidating the cache.
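Here is what cache-aside typically looks like in application code: a minimal sketch using the redis-py client, where `db.fetch_user` stands in for whatever database access layer you actually have:

```python
import json
import redis

r = redis.Redis()  # assumes a Redis instance on localhost

def get_user(user_id, db):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit: skip the database

    user = db.fetch_user(user_id)          # cache miss: read from the database
    r.set(key, json.dumps(user), ex=300)   # populate the cache with a 5-minute TTL
    return user
```

The `ex=300` TTL doubles as a safety net: even if an update elsewhere forgets to invalidate the entry, it can only be stale for five minutes.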
Write-Through
Every write goes to both cache and database synchronously. The cache always has current data.
Pros: Cache is never stale.
Cons: Write latency increases (two writes per operation). You also cache data that might never be read.
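A write-through sketch under the same assumptions (redis-py, a hypothetical `db.write_user`): the write path updates both stores before returning.

```python
import json
import redis

r = redis.Redis()

def save_user(user, db):
    # Both writes happen synchronously on the request path, so any
    # subsequent read from the cache sees current data.
    db.write_user(user)                             # hypothetical DB layer
    r.set(f"user:{user['id']}", json.dumps(user))   # cache mirrors the database
```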
Write-Behind (Write-Back)
Writes go to the cache first, and the cache asynchronously flushes to the database in batches. The application only waits for the cache write.
Pros: Very fast writes. Batching reduces database load.
Cons: Risk of data loss if the cache crashes before flushing. More complex to implement correctly.
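Write-behind is usually handled by the caching layer itself, but a simplified in-process sketch shows the shape of it: writes touch only Redis, and a background loop flushes dirty keys to a hypothetical `db.write_user` in batches.

```python
import json
import threading
import time
import redis

r = redis.Redis()
dirty_keys = set()           # keys written to cache but not yet flushed to the DB
lock = threading.Lock()

def save_user(user):
    key = f"user:{user['id']}"
    r.set(key, json.dumps(user))   # fast path: only the cache write is synchronous
    with lock:
        dirty_keys.add(key)

def flush_loop(db, interval=5.0):
    # Background writer: periodically push pending changes to the database in
    # one batch. If the process dies before a flush, those writes are lost.
    while True:
        time.sleep(interval)
        with lock:
            batch = list(dirty_keys)
            dirty_keys.clear()
        for key in batch:
            value = r.get(key)
            if value is not None:
                db.write_user(json.loads(value))   # hypothetical DB layer

# Run the flusher on a daemon thread, e.g.:
# threading.Thread(target=flush_loop, args=(db,), daemon=True).start()
```

The flush loop is exactly where the durability risk lives: anything still in `dirty_keys` is lost if the process dies before the next flush.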
Refresh-Ahead
The cache proactively refreshes entries before they expire, based on predicted access patterns. If an entry is accessed frequently and its TTL is approaching, the cache fetches a fresh copy in the background.
Pros: No cache miss latency for hot data.
Cons: Wasted refreshes if access patterns change. Requires good prediction logic.
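A simplified refresh-ahead sketch, again with redis-py and a hypothetical `db.fetch_user`; rather than predicting access patterns, this variant refreshes an entry in the background whenever it is read with little TTL remaining:

```python
import json
import threading
import redis

r = redis.Redis()
TTL = 300          # seconds each entry lives
REFRESH_AT = 60    # refresh in the background once this little TTL remains

def get_user(user_id, db):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is None:
        return _load(key, user_id, db)   # plain cache miss

    if r.ttl(key) < REFRESH_AT:
        # Entry is hot and close to expiry: refresh it ahead of time so the
        # next readers never see a miss.
        threading.Thread(target=_load, args=(key, user_id, db), daemon=True).start()
    return json.loads(cached)

def _load(key, user_id, db):
    user = db.fetch_user(user_id)        # hypothetical DB layer
    r.set(key, json.dumps(user), ex=TTL)
    return user
```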
| Strategy | Read latency | Write latency | Consistency |
|---|---|---|---|
| Cache-aside | Miss on first read | Normal | Can go stale |
| Write-through | Always fast | Slower (double write) | Always current |
| Write-behind | Always fast | Very fast | Risk of loss |
| Refresh-ahead | Always fast (if predicted) | Normal | Proactively fresh |
Cache Invalidation
"There are only two hard things in computer science: cache invalidation and naming things." — Phil Karlton
When the underlying data changes, the cache needs to know. There are three main approaches:
- TTL (Time to Live): Entries expire after a fixed duration. Simple and safe, but data can be stale for up to the TTL window.
- Event-driven invalidation: When data changes, explicitly delete or update the cached entry. Consistent, but requires plumbing between writes and cache.
- Version keys: Include a version number in the cache key (e.g., `user:42:v3`). Incrementing the version effectively invalidates old entries without explicit deletion (see the sketch after this list).
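A version-key sketch with redis-py: the current version lives in the cache alongside the data, and invalidation is just an increment. The key names here (`user:<id>:version`) are illustrative, not a convention of any particular library.

```python
import json
import redis

r = redis.Redis()

def user_key(user_id):
    # The current version number is itself cached; bumping it changes every
    # derived key, so stale entries simply stop being read.
    version = int(r.get(f"user:{user_id}:version") or 1)
    return f"user:{user_id}:v{version}"

def cache_user(user, ttl=300):
    r.set(user_key(user["id"]), json.dumps(user), ex=ttl)

def invalidate_user(user_id):
    r.incr(f"user:{user_id}:version")  # old entries age out via their TTL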
Eviction Policies
Caches have finite memory. When they're full, something has to go. The eviction policy decides what:
- LRU (Least Recently Used): Evicts the entry that hasn't been accessed for the longest time. The default choice for most caches; a toy implementation follows this list.
- LFU (Least Frequently Used): Evicts the entry with the fewest total accesses. Better for workloads with stable hot sets.
- FIFO (First In, First Out): Evicts the oldest entry regardless of access pattern. Simple but often suboptimal.
- Random: Evicts a random entry. Surprisingly effective and very cheap to implement.
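To see LRU mechanically, here is a toy in-process implementation using Python's OrderedDict; production caches implement the same idea (often approximated) natively.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: the least recently used entry is evicted
    when capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()       # order tracks recency of use

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)      # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
```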