Security & Secret Management Playbook
Last updated: 2025-11-10 · Owner: Security/DevOps
This is the single source of truth for security guidance for the Python/FastAPI + React stack.
1. Threat Model Snapshot
2. Provider Secrets Lifecycle
- Storage: Provider API keys are encrypted with AES-GCM before hitting Postgres. ENCRYPTION_KEY must be a 32-byte hex string shared across backend, Celery worker, and Celery beat.
- Rotation cadence: Rotate at least monthly or immediately after an incident. Use PATCH /providers/{id} to replace the encrypted payload atomically, then invalidate the old key at the upstream provider.
- Backups: Back up ENCRYPTION_KEY separately from database dumps. Store one copy in a hardware password manager and another inside your cloud secret vault. Document rotations in specs/4_notes.md.
- Access boundaries: Only backend and celery-worker containers should receive provider secrets. Frontend builds never embed API keys; use the in-app Provider manager instead.
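The storage rule above can be enforced at startup with a small check. This is a minimal sketch, assuming "32-byte hex string" means 32 random bytes encoded as 64 hexadecimal characters; the function names are hypothetical and not part of the codebase.

```python
import secrets


def generate_encryption_key() -> str:
    """Return a fresh random key: 32 bytes, hex-encoded (64 characters)."""
    return secrets.token_hex(32)


def load_encryption_key(raw: str) -> bytes:
    """Decode ENCRYPTION_KEY and fail fast if it is not exactly 32 bytes."""
    try:
        key = bytes.fromhex(raw.strip())
    except ValueError as exc:
        raise ValueError("ENCRYPTION_KEY is not valid hex") from exc
    if len(key) != 32:
        raise ValueError(f"ENCRYPTION_KEY decodes to {len(key)} bytes, expected 32")
    return key
```

Running the same check in the backend, the Celery worker, and Celery beat would surface a mistyped or truncated key at boot rather than on the first decrypt.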
3. Application Credentials Matrix
Document all new variables in Configuration, update .env.example, and mention them in release notes.
4. Network & Platform Controls
- Segmentation: Expose FastAPI via a reverse proxy (Traefik/Caddy/Nginx). Only allow inbound traffic on ports 80/443 (frontend/proxy) and 22 (SSH) where required. Block direct access to Postgres/Redis from the public internet.
- TLS: Terminate HTTPS at the proxy or Kubernetes Ingress with managed certificates (ACM, cert-manager). Enforce HTTPS via 301 redirect and set HSTS (max-age ≥ 1 day during testing, ≥ 6 months in prod once stable).
- Headers: Ensure Strict-Transport-Security, Content-Security-Policy 'self' data: blob:, X-Frame-Options DENY, and Referrer-Policy strict-origin-when-cross-origin are enabled. Track header coverage in specs/4_notes.md.
- Containers: All Docker images run as user eval752. Keep bind-mounted directories owned by this UID or writable via group perms. Enable read-only root filesystems for frontend containers if your orchestrator supports it.
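If the headers listed above are not already set at the proxy, they could be added at the application layer. The following is a sketch of a plain ASGI middleware, written without framework imports so it stays self-contained. The class name is hypothetical, and the header values are assumptions: in particular the `default-src` directive and the 180-day `max-age` are illustrative choices, since the playbook names the headers but not their complete values.

```python
from typing import Dict, Optional

# Assumed values. The CSP directive name (default-src) and the exact
# max-age are illustrative; the playbook names the headers, not full values.
DEFAULT_SECURITY_HEADERS: Dict[str, str] = {
    "strict-transport-security": "max-age=15552000",  # 180 days
    "content-security-policy": "default-src 'self' data: blob:",
    "x-frame-options": "DENY",
    "referrer-policy": "strict-origin-when-cross-origin",
}


class SecurityHeadersMiddleware:
    """Plain ASGI middleware that adds missing security headers to HTTP responses."""

    def __init__(self, app, headers: Optional[Dict[str, str]] = None) -> None:
        self.app = app
        self.headers = dict(DEFAULT_SECURITY_HEADERS if headers is None else headers)

    async def __call__(self, scope, receive, send) -> None:
        if scope["type"] != "http":
            # Pass websocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message) -> None:
            if message["type"] == "http.response.start":
                current = list(message.get("headers", []))
                present = {name.lower() for name, _ in current}
                for name, value in self.headers.items():
                    encoded = name.lower().encode("latin-1")
                    if encoded not in present:  # do not override upstream values
                        current.append((encoded, value.encode("latin-1")))
                message = {**message, "headers": current}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```

With FastAPI this would typically be registered via `app.add_middleware(SecurityHeadersMiddleware)`. Setting the headers in one place only (proxy or application) avoids duplicate or conflicting values.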
5. Operational Guardrails
- Run uvx pre-commit run -a locally before pushing to ensure secret scanners and linters operate on the same tree CI will validate.
- CI pipelines should retrieve all secrets from repository/environment secrets (never store plaintext values in workflow YAML).
- Replace the placeholder .env values in any non-disposable environment before storing real provider keys.
- Prefer infrastructure secret stores (AWS Secrets Manager, Azure Key Vault, Doppler, 1Password Connect) even when using Docker Compose; mount .env files generated by automation rather than hand-editing servers.
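The placeholder rule above lends itself to an automated check before a deployment is allowed to store real provider keys. This is a rough sketch: the marker strings and function names are hypothetical, and the parser only handles simple KEY=VALUE lines, so it should be adapted to whatever your .env.example actually contains.

```python
from typing import Dict, List

# Hypothetical placeholder markers; align these with your own .env.example.
PLACEHOLDER_MARKERS = ("changeme", "change-me", "replace-me", "placeholder")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_placeholders(env: Dict[str, str]) -> List[str]:
    """Return names of variables that are empty or still look like placeholders."""
    flagged = []
    for key, value in env.items():
        lowered = value.lower()
        if not lowered or any(marker in lowered for marker in PLACEHOLDER_MARKERS):
            flagged.append(key)
    return sorted(flagged)
```

A deploy script could refuse to continue whenever `find_placeholders` returns a non-empty list for a non-disposable environment.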
6. Incident Response Checklist
- Rotate ENCRYPTION_KEY (requires a maintenance window to re-encrypt provider keys). Pause Celery beat during the rotation to avoid race conditions.
- Rotate affected provider credentials inside eval_752, then revoke upstream keys.
- Revoke HF/LiteLLM tokens and regenerate service accounts used in CI or automation.
- Rebuild Docker images, redeploy, and confirm no residual pods/containers reference the compromised secrets.
- Capture timelines and mitigations in specs/4_notes.md, then open/close follow-up tasks in specs/3_tasks.md.
- Run scripts/tests/run_docker_integration.sh --full-run to ensure baseline functionality post-incident.
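The re-encryption step in the first item can be staged so that nothing is written unless every record survives a round trip under the new key. The sketch below is an assumption about how such a pass might look, not the project's actual migration. The cipher is abstracted behind three caller-supplied callables (for example, thin wrappers around the AES-GCM routines bound to the old and new ENCRYPTION_KEY), and `rows` stands in for provider IDs mapped to encrypted payloads.

```python
from typing import Callable, Dict


def reencrypt_provider_keys(
    rows: Dict[int, bytes],
    decrypt_old: Callable[[bytes], bytes],
    encrypt_new: Callable[[bytes], bytes],
    decrypt_new: Callable[[bytes], bytes],
) -> Dict[int, bytes]:
    """Return re-encrypted payloads, or raise before anything is persisted."""
    staged: Dict[int, bytes] = {}
    for provider_id, ciphertext in rows.items():
        plaintext = decrypt_old(ciphertext)
        candidate = encrypt_new(plaintext)
        # Verify the new ciphertext decrypts back to the same secret.
        if decrypt_new(candidate) != plaintext:
            raise ValueError(f"round-trip check failed for provider {provider_id}")
        staged[provider_id] = candidate
    # The caller writes `staged` back in a single transaction.
    return staged
```

Because the function only returns a staged mapping, the database write can happen in one transaction inside the maintenance window, and a failure leaves the existing payloads untouched.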
7. Audit Checklist (Quarterly)
- Verify every environment uses unique Postgres + Redis credentials.
- Confirm vault/backups contain the current ENCRYPTION_KEY.
- Review CI configs for plaintext secrets.
- Spot check logs to ensure secrets are redacted.
- Run container and dependency scans (Grype/Trivy + pip-audit) and capture results.
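The log spot check can be partly automated by searching log output for secret values that are known to the auditor. This is a minimal sketch with a hypothetical function name. It assumes the auditor can load the current secrets from the vault, and it reports only a short prefix so the audit output does not leak the value a second time.

```python
from typing import Iterable, List, Tuple


def find_secret_leaks(
    lines: Iterable[str], secrets_to_check: Iterable[str]
) -> List[Tuple[int, str]]:
    """Return (line_number, masked_hint) for each log line containing a known secret."""
    # Skip very short values, which would match unrelated text too often.
    needles = [s for s in secrets_to_check if len(s) >= 8]
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        for secret in needles:
            if secret in line:
                hits.append((number, secret[:4] + "..."))
    return hits
```

An empty result does not prove the logs are clean, since encoded or truncated secrets would not match, so manual sampling remains part of the quarterly audit.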
