WORK IN PROGRESS

Sat Mar 21 2026

Odoo PostgreSQL pg_hba.conf Authentication Drift Incident Runbook

A production-safe runbook to detect, contain, and recover from Odoo outages caused by PostgreSQL client authentication drift after config or infrastructure changes.

Authentication drift incidents usually appear after failover, subnet migration, TLS rollout, PgBouncer placement changes, or manual edits to pg_hba.conf.

Symptoms are noisy (FATAL: no pg_hba.conf entry, password authentication failed, sudden Odoo reconnect loops), but recovery should be methodical: stabilize traffic, confirm the failing auth path, apply the smallest safe access fix, and verify before reopening all workloads.

Incident signals

Treat this as an active incident when one or more of the following are true:

  • Odoo workers repeatedly fail DB connect at startup or during request bursts.
  • Odoo logs show psycopg2.OperationalError with FATAL: no pg_hba.conf entry.
  • Recent infra/config changes occurred (new app subnet, failover to new primary, PgBouncer move, TLS/auth method change).
  • PostgreSQL is healthy for local/admin sessions but rejects app traffic.

Step 0 — Stabilize before changing auth rules

  1. Freeze non-critical Odoo jobs (bulk imports, low-priority cron, queue retries).
  2. Keep one operator on DB auth remediation and one on timeline/comms.
  3. Avoid broad emergency rules like host all all 0.0.0.0/0 trust.

Why: broad allow rules fix the incident by creating a larger security incident.

Step 1 — Confirm exact failure mode from app and DB sides

From Odoo/app logs

odoocli logs tail --service odoo --since 20m --grep "OperationalError|pg_hba|authentication failed|FATAL"

From PostgreSQL logs (host or container)

# Example systemd-based host
journalctl -u postgresql --since "20 min ago" | grep -E "pg_hba|authentication failed|FATAL"

# Example containerized Postgres
docker logs --since 20m postgres | grep -E "pg_hba|authentication failed|FATAL"

Capture these fields for each failure:

  • source IP / hostname
  • database name
  • username
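  • TLS/SSL state (the rejection message ends with the connection's encryption state, such as SSL on / SSL off or, on newer releases, SSL encryption / no encryption)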
  • auth method error (no pg_hba.conf entry vs password failure)

That tuple tells you whether this is network CIDR drift, user/password drift, DB name mismatch, or TLS/auth-method mismatch.
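When many rejections arrive at once, grouping them by tuple shows quickly whether one source or several are affected. The following is a minimal sketch, assuming the standard "no pg_hba.conf entry for host ..., user ..., database ..." message layout; summarize_hba_rejects is a hypothetical helper name, and password failures and replication connections use different messages and are not matched.

```shell
# Hypothetical helper: group rejected connections by (host, user, database, encryption).
# Reads PostgreSQL log lines on stdin.
# Prints one line per distinct tuple: "count host user database encryption-state".
summarize_hba_rejects() {
  sed -n 's/.*no pg_hba\.conf entry for host "\([^"]*\)", user "\([^"]*\)", database "\([^"]*\)", \(.*\)$/\1 \2 \3 \4/p' \
    | sort | uniq -c | sort -rn \
    | awk '{ $1 = $1; print }'   # collapse the padding that uniq -c adds
}
```

It can be fed from either log source shown above, for example: journalctl -u postgresql --since "20 min ago" | summarize_hba_rejects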

Step 2 — Inspect active auth configuration safely

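# Run these over a local/admin session: the app role may itself be rejected,
# and hba_file and pg_hba_file_rules are readable only by superusers by default.
psql "$ODOO_DB_URI" -c "show hba_file;"
psql "$ODOO_DB_URI" -c "show config_file;"
psql "$ODOO_DB_URI" -c "show password_encryption;"
# Parsed, ordered view of the pg_hba.conf file on disk (PostgreSQL 10+).
# It reports the file's current contents, not necessarily what the server last loaded.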
psql "$ODOO_DB_URI" -c "
select line_number, type, database, user_name, address, auth_method, options, error
from pg_hba_file_rules
order by line_number;
"

Triage checklist:

  • Is the Odoo app subnet/CIDR present?
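  • Does rule order put a restrictive match before your intended allow rule? PostgreSQL applies the first rule that matches and does not fall through to later ones.
  • Does required TLS (hostssl) match how clients connect? A hostssl rule matches only TLS connections, while host matches both.
  • Is the auth method consistent with the stored credentials? A role whose password is stored as an md5 hash cannot authenticate against a scram-sha-256 rule.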
  • Is Odoo connecting directly or via PgBouncer IPs that now differ?
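The CIDR question in the checklist above can be answered mechanically rather than by eye. This is a minimal sketch for IPv4 only, assuming addresses are written without leading zeros; ip_to_int and ip_in_cidr are hypothetical helper names.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed if IPv4 address $1 is inside CIDR block $2.
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask; a /0 prefix matches every address.
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  fi
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For example, ip_in_cidr 10.42.9.17 10.42.8.0/24 fails, which would explain a rejection from a host in a newly added neighbouring subnet.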

Step 3 — Contain and remediate in smallest-safe change set

Apply least-privilege fixes in this order:

  1. Add/adjust the specific rule for Odoo DB/user/source CIDR.
  2. Keep strong auth (scram-sha-256) unless emergency rollback is formally approved.
  3. Preserve replication/admin rules to avoid compounding failure.

Example targeted rule pattern (adjust values):

# pg_hba.conf (example)
hostssl  odoo_prod  odoo_app  10.42.8.0/24  scram-sha-256
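One way to apply such a rule without losing the previous file is to take a timestamped copy first and skip the edit if the line already exists. This is a minimal sketch; add_hba_rule and the .bak.<timestamp> naming are hypothetical conventions, not PostgreSQL features. Because it appends at the end of the file, it only helps when no earlier line already matches the same connection; otherwise place the rule by hand above the conflicting line.

```shell
# Hypothetical helper: append one rule to a pg_hba.conf file, keeping a backup.
# $1 = path to pg_hba.conf, $2 = the full rule line to add.
add_hba_rule() {
  hba_file=$1
  rule=$2
  # Skip if the exact line is already present, so reruns are harmless.
  if grep -qxF -- "$rule" "$hba_file"; then
    echo "rule already present"
    return 0
  fi
  # Keep a timestamped copy of the file as it was before the edit.
  backup="$hba_file.bak.$(date -u +%Y%m%dT%H%M%SZ)"
  cp -p -- "$hba_file" "$backup" || return 1
  printf '%s\n' "$rule" >> "$hba_file"
  echo "added rule; backup at $backup"
}
```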

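Reload config (no full DB restart). This needs a superuser session. If the edited file fails to parse, PostgreSQL logs the error and keeps the previously loaded rules, while pg_reload_conf() still returns true: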

psql "$ODOO_DB_URI" -c "select pg_reload_conf();"
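To confirm that the edited file actually parses, the error column of pg_hba_file_rules can be checked before or after the reload, since the view reads the file on disk. This is a minimal sketch, assuming a superuser connection string is passed as the first argument; hba_file_errors is a hypothetical helper name.

```shell
# Hypothetical helper: list pg_hba.conf lines that failed to parse.
# $1 is a connection string for a superuser session (assumption).
# Prints "line_number: error" per bad line, and nothing if the file parses cleanly.
hba_file_errors() {
  psql "$1" -At -c "
    select line_number || ': ' || error
    from pg_hba_file_rules
    where error is not null
    order by line_number;"
}
```

Empty output means every line parsed. It does not prove that the intended rule matches the Odoo connection, so the client-path test is still required.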

If you manage PostgreSQL by service files:

# Optional explicit reload path
sudo systemctl reload postgresql

Step 4 — Validate from the real client path before reopening load

Test from an Odoo app host (or PgBouncer host, depending on architecture):

psql "host=<db-host> port=5432 dbname=odoo_prod user=odoo_app sslmode=require" -c "select now(), current_user, current_database();"

Then validate Odoo path:

odoocli logs tail --service odoo --since 10m --grep "database|OperationalError|pg_hba|authentication failed"

If stable, unpause workloads gradually:

  1. web/API traffic
  2. critical cron
  3. backlog/batch lanes

Rollback and safety guardrails

If auth failures worsen after the change:

  1. Revert to last known-good pg_hba.conf from versioned backup.
  2. Reload config again (select pg_reload_conf();).
  3. Re-test with psql from app path before reopening Odoo workers.
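The first rollback step can be sketched as a small restore routine that keeps the failed file for the post-incident review. This is a minimal sketch; restore_hba and the .failed.<timestamp> naming are hypothetical, and the known-good copy is assumed to come from your versioned backup.

```shell
# Hypothetical helper: restore pg_hba.conf from a known-good copy.
# $1 = live pg_hba.conf path, $2 = known-good copy from the versioned backup.
restore_hba() {
  hba_file=$1
  good_copy=$2
  [ -r "$good_copy" ] || { echo "known-good copy not readable: $good_copy" >&2; return 1; }
  # Nothing to do if the live file already matches the known-good copy.
  if cmp -s "$hba_file" "$good_copy"; then
    echo "already at known-good state"
    return 0
  fi
  # Keep the failed version so the bad change can be examined later.
  failed="$hba_file.failed.$(date -u +%Y%m%dT%H%M%SZ)"
  cp -p -- "$hba_file" "$failed" || return 1
  # Write into the existing file so its owner and mode are preserved.
  cat -- "$good_copy" > "$hba_file" || return 1
  echo "restored; failed version saved at $failed"
}
```

The restore only changes the file on disk; the reload and the psql re-test in steps 2 and 3 still apply.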

Never "roll back" by enabling permissive trust for all sources in production.

Verification and incident exit criteria

Close the incident only when all of the following have held for a sustained 15–30 minute window:

  • No new auth failures in Odoo or PostgreSQL logs.
  • Odoo login + one critical business flow succeeds (e.g., confirm sale order, post invoice).
  • Connection churn returns to baseline.
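The first criterion can be checked with a small gate over the PostgreSQL log for the observation window. This is a minimal sketch, assuming the two message patterns named in this runbook; no_auth_failures is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if stdin contains no auth-failure lines.
# Prints the number of matching lines either way.
no_auth_failures() {
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  count=$(grep -cE 'no pg_hba\.conf entry|password authentication failed' || true)
  echo "auth failures: $count"
  [ "$count" -eq 0 ]
}
```

For example: journalctl -u postgresql --since "30 min ago" | no_auth_failures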

Useful verification query:

psql "$ODOO_DB_URI" -c "
select application_name, usename, client_addr, state, count(*)
from pg_stat_activity
where datname = current_database()
group by 1,2,3,4
order by count(*) desc;
"

Hardening and prevention checklist

  • Version-control pg_hba.conf and require peer review for auth rule changes.
  • Maintain an explicit inventory of Odoo app/PgBouncer source CIDRs per environment.
  • Add alerts on PostgreSQL log patterns:
    • no pg_hba.conf entry
    • password authentication failed
  • Standardize on TLS requirements (hostssl + cert policy) and document exceptions.
  • Run failover game days that include auth-path verification from app hosts.
  • Keep a break-glass, time-boxed recovery procedure with approval gates.

Practical reference points

  • PostgreSQL client authentication (pg_hba.conf, rule order, auth methods): official PostgreSQL documentation.
  • PostgreSQL pg_hba_file_rules view for parsing/validating the HBA entries in the file on disk (it reports the file's current contents, not what the server last loaded).
  • Odoo deployment docs for DB connectivity and worker restart behavior.

The operating rule: fix authentication incidents with precise scope (right user, DB, source, method), then verify from real traffic paths before scaling load back up.
