Performance Tuning Guide
========================

This guide helps you configure async-cache for optimal performance in production.

Choosing maxsize
----------------

.. list-table::
   :header-rows: 1
   :widths: 25 25 50

   * - Workload
     - Recommended maxsize
     - Reasoning
   * - API response cache
     - 1,000 – 10,000
     - Bounded memory; LRU evicts cold keys
   * - ML embeddings
     - 50,000 – 500,000
     - Large but finite corpus; disk backend for persistence
   * - Session store
     - Active sessions × 1.2
     - Size to active user count + headroom
   * - Config / feature flags
     - 100 – 1,000
     - Small, rarely evicted

**Rule of thumb**: Start with ``maxsize = expected_hot_keys * 1.5``.
Monitor ``get_metrics()['size']`` against ``maxsize``. If size is consistently
at maxsize, increase it.

Choosing TTL
------------

.. list-table::
   :header-rows: 1
   :widths: 25 25 50

   * - Data type
     - Recommended TTL
     - Notes
   * - User profiles
     - 60 – 300s
     - Balance freshness vs DB load
   * - Product catalog
     - 300 – 3600s
     - Changes infrequently
   * - Config / flags
     - 60 – 120s
     - Short TTL for fast rollout
   * - ML inference
     - 3600s – None
     - Immutable inputs → long/infinite TTL
   * - Session data
     - Match session timeout
     - Prevent stale sessions

Batch Loader Tuning
-------------------

- **batch_window_ms** (default 5): Increase for a higher batching ratio at the
  cost of added latency. GraphQL resolvers: 5–10ms. Background jobs: 20–50ms.
- **max_batch_size** (default 100): Match your database's optimal batch size.

Disk Backend Tuning
-------------------

- **Save frequency**: Call ``save_to_backend()`` on shutdown (``atexit``,
  signal handler). For critical data, save periodically (e.g., every 5 minutes).
- **File location**: Use fast local storage (SSD). Avoid network mounts.
- **Cache size**: Pickle file size scales linearly with the number of entries.
  100K entries ≈ 10–100 MB depending on value size.

Monitoring
----------

Expose ``get_metrics()`` to your monitoring system:

.. code-block:: python

   # Prometheus example
   from prometheus_client import Gauge

   hit_rate = Gauge('cache_hit_rate', 'Cache hit rate')
   cache_size = Gauge('cache_size', 'Cache entries')

   async def update_metrics():
       # `cache` is your application's async-cache instance
       m = cache.get_metrics()
       hit_rate.set(m['hit_rate'])
       cache_size.set(m['size'])

**Key metrics to watch**:

- **hit_rate < 0.5**: maxsize too small or TTL too short
- **hit_rate > 0.99**: TTL may be too long (stale data risk)
- **size == maxsize consistently**: increase maxsize
- **misses growing faster than hits**: review access patterns

Performance Characteristics
---------------------------

.. list-table::
   :header-rows: 1

   * - Operation
     - Complexity
     - Notes
   * - ``get()`` (hit)
     - O(1)
     - ~0.8µs median
   * - ``get()`` (miss + loader)
     - O(1) + loader time
     - ~3µs overhead + loader
   * - ``set()``
     - O(1)
     - Includes LRU eviction if needed
   * - ``delete()``
     - O(1)
     -
   * - Thundering herd
     - O(1) loader calls
     - Regardless of the number of concurrent requests
   * - Batch loader
     - O(1) per batch
     - Amortized across batch window
   * - ``save_to_backend()``
     - O(n)
     - n = cache size; runs synchronously
   * - ``load_from_backend()``
     - O(n)
     - n = persisted entries
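
The sizing rule of thumb and the metric thresholds from the Monitoring section can be combined into a small health check. The following is a minimal sketch, not part of async-cache: ``diagnose_cache`` and ``suggested_maxsize`` are hypothetical helper names, and the input is assumed to be a dict with the ``hit_rate`` and ``size`` keys read from ``get_metrics()`` in the Prometheus example. It inspects a single snapshot, so the "consistently at maxsize" condition should be judged over several samples.

```python
import math
from typing import Dict, List


def suggested_maxsize(expected_hot_keys: int, headroom: float = 1.5) -> int:
    """Apply the rule of thumb: maxsize = expected_hot_keys * 1.5."""
    return math.ceil(expected_hot_keys * headroom)


def diagnose_cache(metrics: Dict[str, float], maxsize: int) -> List[str]:
    """Return tuning hints for one metrics snapshot.

    `metrics` is assumed to carry 'hit_rate' and 'size' keys, as read
    from get_metrics() in the Prometheus example above.
    """
    hints: List[str] = []
    hit_rate = metrics["hit_rate"]
    size = metrics["size"]

    # Thresholds taken from the "Key metrics to watch" list.
    if hit_rate < 0.5:
        hints.append("low hit rate: maxsize may be too small or TTL too short")
    elif hit_rate > 0.99:
        hints.append("very high hit rate: TTL may be too long (stale data risk)")

    # A full cache in a single snapshot is only a signal; act on it
    # when it persists across samples.
    if size >= maxsize:
        hints.append("cache is full: consider increasing maxsize")

    return hints


if __name__ == "__main__":
    snapshot = {"hit_rate": 0.42, "size": 1000}
    for hint in diagnose_cache(snapshot, maxsize=1000):
        print(hint)
    print("suggested maxsize for 2000 hot keys:", suggested_maxsize(2000))
```

Running a check like this on a schedule next to the Prometheus gauges gives a plain-text summary of which knob to adjust first.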