Postgres CDC — Quickstart & Runbook
Set the right DB knobs, register a connector, verify changes, and know how to roll back safely.
Prerequisites (DB)
- wal_level=logical, slots allowed, publication permissions. This enables logical decoding so Debezium can stream row-level changes from the transaction log. It also keeps logical replication slots consistent for downstream consumers.
-
Tables have a stable primary key (or configure
REPLICA IDENTITY FULLfor keyless tables). - WAL retention window ≥ snapshot + catch-up time.
SHOW wal_level;
SHOW max_replication_slots;
SHOW max_wal_senders;
SELECT current_setting('wal_keep_size');
DB Setup (copy/paste)
-- minimal role for logical replication (adjust as needed)
CREATE ROLE cdc LOGIN PASSWORD 'cdc' REPLICATION;
-- ensure replica identity for keyless tables (example)
ALTER TABLE app.customer REPLICA IDENTITY FULL;
-- create publication if you prefer manual management
CREATE PUBLICATION cdc_pub FOR TABLE app.customer;
Debezium can auto-create a filtered publication; configure manually if you need tighter scope.
Connector Config (example)
{
"name": "pg-cdc",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "pg",
"database.port": "5432",
"database.user": "cdc",
"database.password": "cdc",
"database.dbname": "postgres",
"slot.name": "cdc_slot",
"publication.autocreate.mode": "filtered",
"schema.include.list": "app",
"table.include.list": "app.customer",
"tombstones.on.delete": "false",
"heartbeat.interval.ms": "5000",
"include.schema.changes": "false",
"decimal.handling.mode": "string",
"snapshot.mode": "initial",
"errors.tolerance": "all",
"errors.deadletterqueue.topic.name": "dlq.pg",
"topic.prefix": "server1"
}
}
Debezium 2.x requires topic.prefix; use database.server.name if you are running 1.x connectors.
Logical replication slots are persistent queue-like structures that reserve WAL segments until consumers confirm they have processed them, ensuring no data loss and keeping replay consistent for CDC tools.
Verify
-- slot & publication health
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots WHERE slot_name='cdc_slot';
SELECT * FROM pg_publication_tables WHERE pubname='cdc_pub';
-- generate a change
UPDATE app.customer SET email = email || '.x' WHERE id = (
SELECT id FROM app.customer LIMIT 1
);
# consume from the topic (adjust name)
kafka-console-consumer --bootstrap-server localhost:9092 \
--topic server1.public.app_customer --from-beginning --max-messages 5
Acceptance (target/sink)
- No duplicate PKs after a connector restart.
- Per key, the row reflects the greatest
op_ts/version. - DLQ empty or only expected test errors.
-- duplicates (generic)
SELECT COUNT(*) AS rows, COUNT(DISTINCT id) AS distinct_keys
FROM target_customers;
Safe Rollback
-- stop connector first (to avoid slot churn)
-- then, if you must remove the slot:
SELECT pg_drop_replication_slot('cdc_slot');
Dropping a live slot can force WAL recycling and data loss for downstream readers. Pause, plan a snapshot, then drop.