Runbook
Operator-facing checklists and triage steps for the daemon.
Day one
- [ ] Confirm
tokenops statusreturnshealth: ok,ready: ready. - [ ] Storage path under
storage.pathexists and is writable. - [ ] CA bundle minted at
tls.cert_dir(only iftls.enabled). - [ ] OTLP collector reachable at
otel.endpoint(only ifotel.enabled).
Common ops
- Restart cleanly — SIGTERM (Ctrl-C). Graceful shutdown drains the event bus within
shutdown.timeout(default 15s). - Inspect a slow request —
tokenops status --jsonshows the current bus published / dropped counters; any non-zerodroppedmeans storage couldn't keep up. Bumpevents.QueueCapacity. - Flush the cache — restart the daemon, or send a request with
X-Tokenops-Cache: refreshto force a re-populate.
Subsections
- Health and readiness — what the probes mean
- Cache — bypass / refresh / invalidation
- Performance — bench-gate, latency budgets, queue pressure