At-least-once
- Replays or connector restarts may emit duplicate events.
- Design consumers to be idempotent: upsert by key, avoid increment-only writes.
- Lag metrics and DLQ hygiene keep ALO predictable.
Learn how CDC tools package change events, why before/after images matter, and how delivery guarantees shape your downstream processing.
Roll up your readiness checklist progress before you dive into the deep-dive sections.
0 of 0 readiness checks complete (0%)
No readiness checklists available yet.
Change events are more than a JSON blob. Every field exists to keep data, ordering, and metadata aligned across services. Ground your consumers in the same vocabulary so investigations stay fast.
{
"op": "u",
"ts_ms": 1712857612456,
"source": {
"db": "app",
"table": "orders",
"lsn": 368592330
},
"transaction": {
"id": "4.25.901",
"total_order": 442,
"event_count": 2
},
"before": { "order_id": 42, "status": "processing" },
"after": { "order_id": 42, "status": "shipped" },
"key": { "order_id": 42 }
}
| Field | Purpose | Questions it answers |
|---|---|---|
op / action |
Signals whether the change was an insert, update, or delete. | Should I upsert or tombstone this record? |
ts_ms / source time |
The source commit timestamp or log position. | Was this change applied before or after another event? |
| Transaction identifiers | Log sequence numbers, SCNs, or offsets for recovery. | Where do I resume when reprocessing? |
| Primary key | The stable identifier for idempotent upserts. | Which entity does this change belong to? |
| Before/after payloads | Snapshots of column values at either side of the change. | How do I compute diffs or rollbacks? |
Many downstream jobs need both the new values and the previous state. Audit pipelines, SCD Type 2 tables, and cache invalidation logic all break without the before image.
Treat optional images as a contract. If you turn them off for throughput, publish the policy and add guards to your materialization layer.
CDC stacks normally promise at-least-once (ALO) delivery; exactly-once (EOS) is a design you layer on top with idempotency and transactional sinks.
Per-key ordering is preserved within a partition, but cross-partition sequencing is best-effort. Design your envelope so downstream jobs can stitch history back together.
before: null, not a missing field, so consumers can detect
configuration drift.
Build a contract test that fetches a production envelope, validates it against the schema, and asserts critical fields are non-null. Run it in CI before promoting connector changes.
| Concept | Debezium | Fivetran | Custom/Kafka Streams |
|---|---|---|---|
| Operation | op (`c`, `u`, `d`, `r`) |
op (`INSERT`, `UPDATE`, `DELETE`) |
Explicit `action` field or topic naming |
| Source position | source.lsn / source.ts_ms |
source_lsn / source_commit |
High-water mark stored beside offsets |
| Transaction envelope | transaction.id, total_order |
txn_id (when available) |
Custom headers (`x-tx-id`) |
| Before image | before |
before |
Previous state cached in state store |
| Metadata | source.db, schema, table |
source_table, source_schema |
Headers + payload wrapper object |
Align on vocabulary in your runbooks. Analysts should not have to learn a different event shape per connector.
Use these checklists when reviewing connector configs or consumer code.
These guardrails keep consumers from silently degrading when connector upgrades or config toggles change the envelope shape.
Share this checklist before launch so producers, consumers, and platform teams align on envelope expectations and operating guardrails.
| Capability | Ready when… | If not, do this |
|---|---|---|
| Versioned payload schema is published with required fields called out and nullability documented. | Draft a schema README, add it to source control, and require schema diff reviews for connector merges. | |
| Before/after images, primary keys, and tombstones are explicitly enabled in configuration. | Review connector settings with platform engineering and add automated config tests in CI. | |
| Downstream services have idempotent handlers and are storing envelope metadata they depend on. | Host a contract walkthrough, annotate payload fields, and run a replay drill with each critical consumer. | |
| Envelope validation job runs on a schedule with alerting for missing fields or unexpected types. | Add schema validation output to observability dashboards and alert on consecutive failures. |
All capabilities are ready. Toggle to see everything or reset to start over.
Tie monitors back to owners: every alert should have a responder rotation, a playbook link, and a communication channel to pause downstream consumers if contract drift is detected.
Test your understanding of CDC event structure and delivery guarantees.
The 'before' and 'after' fields capture the complete row state before and after the change. This allows consumers to see exactly what changed, implement custom diffing logic, or reconstruct the full state without querying the source database.
Review the correct answer and explanation.
At-least-once delivery means events may be redelivered due to retries, failures, or replays. Consumers must implement idempotency (using unique keys or timestamps) to handle duplicate processing gracefully.
Review the correct answer and explanation.
A tombstone is a special event where the value is null but the key remains. In Kafka compacted topics, tombstones signal that a record should be removed from the compacted view, allowing proper handling of deletes while maintaining log compaction semantics.
Review the correct answer and explanation.
Per-key ordering guarantees that all events for the same entity (identified by the key) are processed in order. This prevents race conditions where an update might be processed before an insert, or a delete before an update, which would corrupt the downstream state.
Review the correct answer and explanation.
The source metadata includes critical information about the origin of the change: which database, which table, and the log position (LSN for PostgreSQL, SCN for Oracle, etc.). This enables tracking, debugging, and offset management for replays or reconciliation.
Review the correct answer and explanation.
Track journey completion, monitor recent activity, and export your session logs.