Odoo PostgreSQL pg_hba.conf Authentication Drift Incident Runbook
A production-safe runbook to detect, contain, and recover from Odoo outages caused by PostgreSQL client authentication drift after config or infrastructure changes.
Authentication drift incidents usually appear after failover, subnet migration, TLS rollout, PgBouncer placement changes, or manual edits to pg_hba.conf.
Symptoms are noisy (FATAL: no pg_hba.conf entry, password authentication failed, sudden Odoo reconnect loops), but recovery should be methodical: stabilize traffic, confirm the failing auth path, apply the smallest safe access fix, and verify before reopening all workloads.
Incident signals
Treat this as an active incident when one or more of the following are true:
- Odoo workers repeatedly fail DB connect at startup or during request bursts.
- Odoo logs show psycopg2.OperationalError with FATAL: no pg_hba.conf entry.
- Recent infra/config changes occurred (new app subnet, failover to new primary, PgBouncer move, TLS/auth method change).
- PostgreSQL is healthy for local/admin sessions but rejects app traffic.
Step 0 — Stabilize before changing auth rules
- Freeze non-critical Odoo jobs (bulk imports, low-priority cron, queue retries).
- Keep one operator on DB auth remediation and one on timeline/comms.
- Avoid broad emergency rules like host all all 0.0.0.0/0 trust.
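Why: a trust rule for every address lets any client that can reach the server connect as any role without a password, so it ends the outage by creating a larger security incident.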
Step 1 — Confirm exact failure mode from app and DB sides
From Odoo/app logs
odoocli logs tail --service odoo --since 20m --grep "OperationalError|pg_hba|authentication failed|FATAL"
From PostgreSQL logs (host or container)
# Example systemd-based host
journalctl -u postgresql --since "20 min ago" | grep -E "pg_hba|authentication failed|FATAL"
# Example containerized Postgres
docker logs --since 20m postgres | grep -E "pg_hba|authentication failed|FATAL"
Capture these fields for each failure:
- source IP / hostname
- database name
- username
- TLS/SSL requirement hints (if present)
- auth method error (no pg_hba.conf entry vs password failure)
That tuple tells you whether this is network CIDR drift, user/password drift, DB name mismatch, or TLS/auth-method mismatch.
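As a rough illustration of capturing that tuple, the sketch below pulls the fields out of a PostgreSQL log line. The two message shapes are assumed from typical server logs, the trailing encryption hint differs between versions, and the function name `parse_auth_failure` is hypothetical.

```python
import re

# Message shapes assumed from typical PostgreSQL server logs. The trailing
# encryption hint varies by version (for example "SSL off" or "no encryption"),
# so it is captured as free text rather than interpreted.
NO_HBA = re.compile(
    r'no pg_hba\.conf entry for host "(?P<host>[^"]+)", '
    r'user "(?P<user>[^"]+)", database "(?P<database>[^"]+)"'
    r'(?:, (?P<encryption>.+))?$'
)
PASSWORD_FAILED = re.compile(
    r'password authentication failed for user "(?P<user>[^"]+)"'
)


def parse_auth_failure(line):
    """Return a dict describing the failure, or None if the line is unrelated."""
    line = line.strip()
    match = NO_HBA.search(line)
    if match:
        # Missing rule: points at CIDR, database, user, or TLS drift.
        return {"kind": "no_hba_entry", **match.groupdict()}
    match = PASSWORD_FAILED.search(line)
    if match:
        # A rule matched but the credential check failed.
        return {"kind": "password_failed", "user": match.group("user")}
    return None
```

A `no_hba_entry` result suggests looking at rules and source addresses first, while `password_failed` suggests a rule matched and the credential or auth method is the problem.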
Step 2 — Inspect active auth configuration safely
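# Run from an admin session if the Odoo role is locked out. By default, hba_file, config_file, and pg_hba_file_rules are readable only by superusers (or roles granted access).
psql "$ODOO_DB_URI" -c "show hba_file;"
psql "$ODOO_DB_URI" -c "show config_file;"
psql "$ODOO_DB_URI" -c "show password_encryption;"
# Parsed, ordered view of pg_hba.conf rules (PostgreSQL 10+).
# It reflects the file on disk, not necessarily what the server last loaded; a non-null error column marks a line that will not parse.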
psql "$ODOO_DB_URI" -c "
select line_number, type, database, user_name, address, auth_method, options, error
from pg_hba_file_rules
order by line_number;
"
Triage checklist:
- Is the Odoo app subnet/CIDR present?
- Does rule order put a restrictive match before your intended allow rule? (PostgreSQL uses the first matching rule and does not fall through to later ones.)
- Does required TLS (hostssl) match how clients connect?
- Is auth method consistent with client credentials (scram-sha-256 vs md5 expectations)?
- Is Odoo connecting directly or via PgBouncer IPs that now differ?
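The rule-order and hostssl questions can be reasoned about with a small first-match simulation. This is a deliberately simplified sketch: it only handles host, hostssl, and hostnossl rules with CIDR addresses, and ignores hostnames, samehost/samenet, group roles, and replication. `HbaRule` and `first_match` are hypothetical names.

```python
import ipaddress
from typing import NamedTuple, Optional


class HbaRule(NamedTuple):
    line_number: int
    type: str          # "host", "hostssl", or "hostnossl"
    databases: tuple   # e.g. ("odoo_prod",) or ("all",)
    users: tuple       # e.g. ("odoo_app",) or ("all",)
    cidr: str          # e.g. "10.42.8.0/24"
    auth_method: str   # e.g. "scram-sha-256" or "reject"


def first_match(rules, database, user, client_ip, uses_ssl) -> Optional[HbaRule]:
    """Return the first rule that matches the connection, in file order."""
    ip = ipaddress.ip_address(client_ip)
    for rule in sorted(rules, key=lambda r: r.line_number):
        if rule.type == "hostssl" and not uses_ssl:
            continue  # TLS-only rule, plaintext client
        if rule.type == "hostnossl" and uses_ssl:
            continue  # plaintext-only rule, TLS client
        if "all" not in rule.databases and database not in rule.databases:
            continue
        if "all" not in rule.users and user not in rule.users:
            continue
        network = ipaddress.ip_network(rule.cidr)
        if ip.version != network.version or ip not in network:
            continue
        return rule  # first match wins; later rules are never consulted
    return None
```

If the returned rule is a reject entry, or `None` comes back for a non-TLS client against a hostssl rule, that mirrors the two most common triage findings from the checklist.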
Step 3 — Contain and remediate in smallest-safe change set
Apply least-privilege fixes in this order:
- Add/adjust the specific rule for Odoo DB/user/source CIDR.
- Keep strong auth (scram-sha-256) unless emergency rollback is formally approved.
- Preserve replication/admin rules to avoid compounding failure.
Example targeted rule pattern (adjust values):
# pg_hba.conf (example)
hostssl odoo_prod odoo_app 10.42.8.0/24 scram-sha-256
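One way to keep emergency edits inside the guardrails from Step 0 is to generate the rule line through a small validator. The sketch below is an assumption about how a team might do this, not an existing tool; `build_hba_rule` and the allowed-method set are hypothetical.

```python
import ipaddress

# Methods this sketch accepts; "trust" is deliberately absent.
ALLOWED_METHODS = {"scram-sha-256", "md5", "cert"}


def build_hba_rule(database, user, cidr, method="scram-sha-256", conn_type="hostssl"):
    """Format one pg_hba.conf line, refusing the broad patterns the runbook warns about."""
    if database == "all" or user == "all":
        raise ValueError("name the database and role explicitly")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"auth method {method!r} is not allowed here")
    # ip_network raises ValueError for malformed input or when host bits are set.
    network = ipaddress.ip_network(cidr)
    if network.prefixlen == 0:
        raise ValueError("refusing a rule that matches every address")
    return f"{conn_type} {database} {user} {network.with_prefixlen} {method}"
```

Called with the example values, it returns the same line shown in the rule pattern; a request for 0.0.0.0/0 or trust raises instead of producing a line.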
Reload config (no full DB restart; pg_reload_conf() needs a superuser or a role explicitly granted execute on it):
psql "$ODOO_DB_URI" -c "select pg_reload_conf();"
If you manage PostgreSQL by service files:
# Optional explicit reload path
sudo systemctl reload postgresql
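Because pg_hba_file_rules reads the file on disk, it can be checked for parse errors before the reload is requested. The sketch below assumes a DB-API style cursor (for example from psycopg2) opened by a role that may read the view and call pg_reload_conf(); the function name `reload_if_hba_clean` is hypothetical.

```python
def reload_if_hba_clean(cursor):
    """Reload only when pg_hba_file_rules reports no parse errors.

    `cursor` is any DB-API cursor with sufficient privileges (superuser by
    default). Returns (reloaded, problems), where problems is a list of
    (line_number, error) rows.
    """
    cursor.execute(
        "select line_number, error from pg_hba_file_rules "
        "where error is not null order by line_number"
    )
    problems = cursor.fetchall()
    if problems:
        # Leave the running configuration untouched and report the bad lines.
        return False, problems
    cursor.execute("select pg_reload_conf()")
    (reloaded,) = cursor.fetchone()
    return bool(reloaded), []
```

A reload only affects new connections, so the validation in the next step still has to come from a fresh client session.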
Step 4 — Validate from the real client path before reopening load
Test from an Odoo app host (or PgBouncer host, depending on architecture):
psql "host=<db-host> port=5432 dbname=odoo_prod user=odoo_app sslmode=require" -c "select now(), current_user, current_database();"
Then validate Odoo path:
odoocli logs tail --service odoo --since 10m --grep "database|OperationalError|pg_hba|authentication failed"
If stable, unpause workloads gradually:
- web/API traffic
- critical cron
- backlog/batch lanes
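The gradual reopening order above can be expressed as a loop that stops at the first unhealthy stage. This is only a sketch: `resume` and `is_healthy` are hypothetical callbacks standing in for whatever your orchestration and log checks provide.

```python
def reopen_in_stages(stages, resume, is_healthy):
    """Resume each stage in order and stop at the first failed health check.

    Returns the stages that were resumed and stayed healthy.
    """
    reopened = []
    for stage in stages:
        resume(stage)
        if not is_healthy():
            # Leave later stages paused; the operator decides what to do next.
            break
        reopened.append(stage)
    return reopened
```

For example, `reopen_in_stages(["web", "critical_cron", "batch"], resume, is_healthy)` would leave batch lanes paused if auth failures reappear after critical cron is resumed.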
Rollback and safety guardrails
If auth failures worsen after change:
- Revert to last known-good pg_hba.conf from versioned backup.
- Reload config again (select pg_reload_conf();).
- Re-test with psql from app path before reopening Odoo workers.
Never "rollback" by enabling permissive trust for all sources in production.
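If no version-controlled copy exists yet, a timestamped copy taken before each edit gives the rollback step something to revert to. The sketch below is a minimal stand-in for real version control; the function names and backup directory layout are assumptions.

```python
import shutil
import time
from pathlib import Path


def backup_hba(hba_path, backup_dir):
    """Copy pg_hba.conf to a timestamped file and return the backup path."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = backup_dir / f"pg_hba.conf.{stamp}"
    # copy2 keeps mode and timestamps; ownership is not preserved.
    shutil.copy2(hba_path, target)
    return target


def restore_hba(backup_path, hba_path):
    """Overwrite the live file with a known-good backup (a reload is still required)."""
    shutil.copy2(backup_path, hba_path)
```

After `restore_hba`, check that the file is still owned and readable by the PostgreSQL service account before reloading.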
Verification and incident exit criteria
Close only when all are true for at least 15–30 minutes:
- No new auth failures in Odoo or PostgreSQL logs.
- Odoo login + one critical business flow succeeds (e.g., confirm sale order, post invoice).
- Connection churn returns to baseline.
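The quiet-period criterion can be checked mechanically once failure timestamps have been extracted from the logs. The sketch below assumes those timestamps are already available as `datetime` values; `quiet_window_met` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def quiet_window_met(failure_times, now, window_minutes=15):
    """True when no auth failure was logged within the last `window_minutes`."""
    cutoff = now - timedelta(minutes=window_minutes)
    return all(ts < cutoff for ts in failure_times)
```

Passing `window_minutes=30` applies the stricter end of the 15–30 minute range.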
Useful verification query:
psql "$ODOO_DB_URI" -c "
select application_name, usename, client_addr, state, count(*)
from pg_stat_activity
where datname = current_database()
group by 1,2,3,4
order by count(*) desc;
"
Hardening and prevention checklist
- Version-control pg_hba.conf and require peer review for auth rule changes.
- Maintain an explicit inventory of Odoo app/PgBouncer source CIDRs per environment.
- Add alerts on PostgreSQL log patterns:
  - no pg_hba.conf entry
  - password authentication failed
- Standardize on TLS requirements (hostssl + cert policy) and document exceptions.
- Run failover game days that include auth-path verification from app hosts.
- Keep a break-glass, time-boxed recovery procedure with approval gates.
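The source-CIDR inventory from the checklist above becomes a drift detector when it is compared against the networks that appear in pg_hba.conf. The sketch below assumes both sides are available as lists of CIDR strings; `uncovered_sources` is a hypothetical name, and database, user, and rule order are ignored here.

```python
import ipaddress


def uncovered_sources(inventory_cidrs, hba_cidrs):
    """Return inventoried source networks not fully contained in any HBA rule network."""
    allowed = [ipaddress.ip_network(c) for c in hba_cidrs]
    missing = []
    for cidr in inventory_cidrs:
        source = ipaddress.ip_network(cidr)
        covered = any(
            # subnet_of raises TypeError across IP versions, so compare versions first.
            source.version == rule.version and source.subnet_of(rule)
            for rule in allowed
        )
        if not covered:
            missing.append(cidr)
    return missing
```

Run in CI or before a planned subnet migration, a non-empty result flags app or PgBouncer networks that would be rejected after the change.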
Practical reference points
- PostgreSQL client authentication (pg_hba.conf, rule order, auth methods): official PostgreSQL documentation.
- PostgreSQL pg_hba_file_rules view for parsing/validating HBA entries as currently written in the file (not necessarily what the server has loaded).
- Odoo deployment docs for DB connectivity and worker restart behavior.
The operating rule: fix authentication incidents with precise scope (right user, DB, source, method), then verify from real traffic paths before scaling load back up.