End-to-end "exactly-once delivery" across heterogeneous systems
(DB → Kafka → warehouse/search) is not achievable in general. What we actually ship is
Exactly-Once Processing (EOP): combine At-Least-Once (ALO)
transport with idempotent sinks so replays don’t create duplicates.
Scope matters:
Source connectors (e.g., Debezium): ALO into Kafka. They can’t make the
external database “transactional” with Kafka.
Within Kafka→Kafka topologies: EOS is possible using Kafka transactions
(idempotent producers + transactional offsets), but the scope is limited to Kafka topics.
Kafka → external sinks: Effective EOS comes from idempotent upserts
(or staging+MERGE) at the destination, not from transport guarantees alone.
How a CDC pipeline behaves during failures is defined by its data
delivery guarantees:
At-Most-Once: The weakest guarantee. A message is
sent once, but if a failure occurs before it's processed, the
message is lost forever. Unsuitable for critical data.
At-Least-Once: The most common default. The system
ensures every message is delivered, but a message might be delivered
more than once during a retry. This prevents data loss but requires
consumers to handle duplicates.
Exactly-Once Processing: The effective goal. The
system delivers every message at least once, and the consumer is
designed to process each unique message precisely one time, even if
it receives duplicates.
The Problem: Duplicates in At-Least-Once Systems
Imagine a consumer reads an event from a Kafka topic, processes it
(writes to a database), and then crashes before it can commit the
topic offset. When the consumer restarts, it will re-read the same
event, leading to duplicate processing. The interactive demo below
walks through this failure scenario.
1. Consume Message
The consumer service pulls a message (`Order #123`) from the
message broker.
2. Process & Write to Sink
The service processes the order and successfully inserts a record
into the `orders` table in the downstream database.
3. CRASH! 💥
Before the service can acknowledge the message by committing its
offset to the broker, the service crashes. The broker assumes the
message was never processed.
4. Restart & Re-process
The service restarts. Since the offset was not committed, it
fetches the same message (`Order #123`) again and inserts a
duplicate record into the `orders` table.
🎬
Ready to start
The Solution: Idempotent Consumers
Idempotency is the property of an operation that
allows it to be applied multiple times without changing the result
beyond the initial application. An idempotent consumer can safely
process the same message multiple times with no side effects. This
requires two things:
A Unique, Deterministic Event ID: Every change
event must have a unique identifier. This can be a UUID from an
outbox table or a composite key of the source table's primary key
and the transaction's log position.
Deduplication Logic at the Sink: The consumer uses
this ID to check if the event has already been processed before
applying the change. Common patterns include:
Using `MERGE` or `UPSERT`: The sink database
handles the logic of creating a new record or updating an
existing one atomically. This is the preferred method.
Using a Deduplication Table: The consumer first
tries to `INSERT` the event ID into a `processed_events` table.
If it succeeds, it proceeds; if it fails (due to a primary key
violation), it knows the event is a duplicate and can be safely
ignored.
Staging + MERGE: Land events into a staging table keyed by
event_id, then MERGE into the target to make replays safe.
Commit order: write to sink → then commit offsets. If offsets commit first,
a crash can drop data; idempotent sinks tolerate replays after restart.
Naive Consumer
// Pseudocode: may create duplicates
for (event of stream) {
// Simple insert will create a new
// record for every retry.
insert_into_sink(event.payload);
commit_offset();
}
Idempotent Consumer
// Pseudocode: safe from duplicates
for (event of stream) {
// MERGE/UPSERT handles existing records
// based on a primary key.
upsert_into_sink(
event.key,
event.payload
);
commit_offset();
}
Warehouse MERGE (Snowflake-style)
MERGE INTO dw.orders AS t
USING (SELECT :event_id AS event_id, :order_id AS order_id, :payload::variant AS p) s
ON t.event_id = s.event_id
WHEN NOT MATCHED THEN
INSERT (event_id, order_id, payload) VALUES (s.event_id, s.order_id, s.p)
WHEN MATCHED THEN
UPDATE SET payload = s.p; -- idempotent overwrite
BigQuery MERGE (idempotent)
MERGE `dw.orders` t
USING (SELECT @event_id AS event_id, @order_id AS order_id, @payload AS payload) s
ON t.event_id = s.event_id
WHEN NOT MATCHED THEN
INSERT (event_id, order_id, payload) VALUES (s.event_id, s.order_id, s.payload)
WHEN MATCHED THEN
UPDATE SET payload = s.payload;
The Transactional Outbox Pattern (Solving the Dual-Write Problem)
A common anti-pattern is the "dual-write," where an application first
writes to its database and *then*, in a separate network call, tries
to publish a message. If the application crashes between these two
steps, the systems become inconsistent. The
Transactional Outbox Pattern solves this.
Outbox pattern: Write business data and event atomically, then publish via CDC
An application writes both its business data (an `orders` record)
and an event record to an "outbox" table within the same single,
atomic database transaction.
A log-based CDC process monitors the outbox table. When it detects
the committed transaction in the database log, it reliably reads the
event from the log and publishes it to a message bus like Kafka.
This guarantees that an event is published if, and only if, the
corresponding business transaction was successfully committed to the
database. Data consistency is preserved.
Important scope note: Outbox gives you EOS between the application DB and Kafka
(no lost/phantom messages). It does not make your downstream warehouses exactly-once by itself—
you still need idempotent consumers at the sinks.
Exactly-Once Semantics Knowledge Check
Test your understanding of EOS delivery guarantees and the transactional outbox pattern.
0
/
5answered
Q1
What does 'exactly-once semantics' (EOS) truly guarantee in distributed systems?
✓
Correct!
True EOS means the observable effect is as-if each message was processed exactly once, even if the underlying system has retries or duplicates. This is typically achieved through idempotent operations, transactions, or unique identifiers that prevent duplicate effects, not by preventing duplicate delivery.
✗
Not quite.
Review the correct answer and explanation.
Q2
What is the primary purpose of the Transactional Outbox pattern?
✓
Correct!
The Transactional Outbox pattern solves the dual-write problem by writing business data and events to an outbox table within the same database transaction. A separate process then reads and publishes from the outbox, ensuring that if the transaction commits, the event will eventually be published—guaranteeing atomicity.
✗
Not quite.
Review the correct answer and explanation.
Q3
Why is achieving true end-to-end exactly-once semantics challenging in CDC pipelines?
✓
Correct!
End-to-end EOS requires all components in the chain—source database, CDC capture, message broker, and sink—to participate in coordinated transactions or idempotent protocols. Since these are often independent systems with different guarantees, achieving true EOS across the entire pipeline is complex and sometimes impossible.
✗
Not quite.
Review the correct answer and explanation.
Q4
How does Kafka Streams achieve exactly-once semantics within Kafka?
✓
Correct!
Kafka Streams achieves EOS by using transactional producers that atomically commit processed records and consumer offsets together. Combined with idempotent producers (that eliminate duplicate sends), this ensures each input message affects state exactly once, even across failures and retries.
✗
Not quite.
Review the correct answer and explanation.
Q5
What is an alternative to true EOS when it's not achievable?
✓
Correct!
When true EOS isn't feasible, designing idempotent operations is the practical solution. Using natural keys, MERGE/UPSERT operations, or 'insert if not exists' logic ensures that reprocessing the same event multiple times produces the same final state, making at-least-once delivery safe and effective.
✗
Not quite.
Review the correct answer and explanation.
0/5correct
Progress0%No progress yet
Progress is stored locally in this browser.
Let’s Talk CDC Interactive Dashboard
Track journey completion, monitor recent activity, and export your session logs.