Postgres CDC — Quickstart & Runbook

Set the right DB knobs, register a connector, verify changes, and know how to roll back safely.

Prerequisites (DB)

  • wal_level=logical, slots allowed, publication permissions. This enables logical decoding so Debezium can stream row-level changes from the transaction log. It also keeps logical replication slots consistent for downstream consumers.
  • Tables have a stable primary key (or configure REPLICA IDENTITY FULL for keyless tables).
  • WAL retention window ≥ snapshot + catch-up time.
SHOW wal_level;
SHOW max_replication_slots;
SHOW max_wal_senders;
SELECT current_setting('wal_keep_size');

DB Setup (copy/paste)

-- minimal role for logical replication (adjust as needed)
CREATE ROLE cdc LOGIN PASSWORD 'cdc' REPLICATION;

-- ensure replica identity for keyless tables (example)
ALTER TABLE app.customer REPLICA IDENTITY FULL;

-- create publication if you prefer manual management
CREATE PUBLICATION cdc_pub FOR TABLE app.customer;

Debezium can auto-create a filtered publication; configure manually if you need tighter scope.

Connector Config (example)

{
  "name": "pg-cdc",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "pg",
    "database.port": "5432",
    "database.user": "cdc",
    "database.password": "cdc",
    "database.dbname": "postgres",
    "slot.name": "cdc_slot",
    "publication.autocreate.mode": "filtered",
    "schema.include.list": "app",
    "table.include.list": "app.customer",
    "tombstones.on.delete": "false",
    "heartbeat.interval.ms": "5000",
    "include.schema.changes": "false",
    "decimal.handling.mode": "string",
    "snapshot.mode": "initial",
    "errors.tolerance": "all",
    "errors.deadletterqueue.topic.name": "dlq.pg",
    "topic.prefix": "server1"
  }
}

Debezium 2.x requires topic.prefix; use database.server.name if you are running 1.x connectors.

Logical replication slots are persistent queue-like structures that reserve WAL segments until consumers confirm they have processed them, ensuring no data loss and keeping replay consistent for CDC tools.

Verify

-- slot & publication health
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots WHERE slot_name='cdc_slot';

SELECT * FROM pg_publication_tables WHERE pubname='cdc_pub';

-- generate a change
UPDATE app.customer SET email = email || '.x' WHERE id = (
  SELECT id FROM app.customer LIMIT 1
);
# consume from the topic (adjust name)
kafka-console-consumer --bootstrap-server localhost:9092 \
  --topic server1.public.app_customer --from-beginning --max-messages 5

Acceptance (target/sink)

  • No duplicate PKs after a connector restart.
  • Per key, the row reflects the greatest op_ts/version.
  • DLQ empty or only expected test errors.
-- duplicates (generic)
SELECT COUNT(*) AS rows, COUNT(DISTINCT id) AS distinct_keys
FROM target_customers;

Safe Rollback

-- stop connector first (to avoid slot churn)
-- then, if you must remove the slot:
SELECT pg_drop_replication_slot('cdc_slot');

Dropping a live slot can force WAL recycling and data loss for downstream readers. Pause, plan a snapshot, then drop.