Performance Tuning Guide
This guide helps you configure async-cache for optimal performance in production.
Choosing maxsize
Workload |
Recommended maxsize |
Reasoning |
|---|---|---|
API response cache |
1,000 – 10,000 |
Bounded memory, LRU evicts cold keys |
ML embeddings |
50,000 – 500,000 |
Large but finite corpus; disk backend for persistence |
Session store |
Active sessions × 1.2 |
Size to active user count + headroom |
Config / feature flags |
100 – 1,000 |
Small, rarely evicted |
Rule of thumb: Start with maxsize = expected_hot_keys * 1.5. Monitor
get_metrics()['size'] vs maxsize — if size is consistently at maxsize,
increase it.
Choosing TTL
Data type |
Recommended TTL |
Notes |
|---|---|---|
User profiles |
60 – 300s |
Balance freshness vs DB load |
Product catalog |
300 – 3600s |
Changes infrequently |
Config / flags |
60 – 120s |
Short TTL for fast rollout |
ML inference |
3600s – None |
Immutable inputs → long/infinite TTL |
Session data |
Match session timeout |
Prevent stale sessions |
Batch Loader Tuning
batch_window_ms (default 5): Increase for higher batching ratio at cost of latency. GraphQL resolvers: 5–10ms. Background jobs: 20–50ms.
max_batch_size (default 100): Match your database’s optimal batch size.
Disk Backend Tuning
Save frequency: Call
save_to_backend()on shutdown (atexit, signal handler). For critical data, save periodically (e.g., every 5 minutes).File location: Use fast local storage (SSD). Avoid network mounts.
Cache size: Pickle files scale linearly. 100K entries ≈ 10–100 MB depending on value size.
Monitoring
Expose get_metrics() to your monitoring system:
# Prometheus example
from prometheus_client import Gauge
hit_rate = Gauge('cache_hit_rate', 'Cache hit rate')
cache_size = Gauge('cache_size', 'Cache entries')
async def update_metrics():
m = cache.get_metrics()
hit_rate.set(m['hit_rate'])
cache_size.set(m['size'])
Key metrics to watch:
hit_rate < 0.5: maxsize too small or TTL too short
hit_rate > 0.99: TTL may be too long (stale data risk)
size == maxsize consistently: increase maxsize
misses growing faster than hits: review access patterns
Performance Characteristics
Operation |
Complexity |
Notes |
|---|---|---|
|
O(1) |
~0.8µs median |
|
O(1) + loader time |
~3µs overhead + loader |
|
O(1) |
Includes LRU eviction if needed |
|
O(1) |
|
Thundering herd |
O(1) loader calls |
Regardless of concurrent requests |
Batch loader |
O(1) per batch |
Amortized across batch window |
|
O(n) |
n = cache size; runs synchronously |
|
O(n) |
n = persisted entries |