Chapter 19 of 25

Kafka Partitions and Consumer Groups: Where Parallelism Lives

Created May 28, 2026 Updated Jun 7, 2026

Two facts shape almost every Kafka design decision:

A partition is the unit of parallelism.
Ordering is a per-partition guarantee, not a per-topic one.

A topic is a named message channel. Internally it's split into partitions: separate, ordered, append-only logs on disk. Each message appended to a partition gets a monotonically increasing offset (its position in that log). A producer writing to a topic with 10 partitions spreads messages across them in one of two ways. Messages without a key are distributed across partitions (round-robin in older clients, sticky batching in the default partitioner since Kafka 2.4). Messages with a key are hashed, so the same key always lands in the same partition, as long as the partition count doesn't change.
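The keyed routing rule can be sketched in a few lines. This is a minimal illustration, not Kafka's actual partitioner: the function name partition_for is hypothetical, and CRC32 stands in for the murmur2 hash used by Kafka's Java client, so the partition numbers it produces won't match a real cluster.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index with a deterministic hash.

    Assumption: CRC32 is a stand-in for the murmur2 hash in Kafka's
    Java client; only the "hash, then modulo" shape is the same.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions

NUM_PARTITIONS = 10

first = partition_for("user_42", NUM_PARTITIONS)
second = partition_for("user_42", NUM_PARTITIONS)

# The hash is deterministic, so the same key maps to the same partition.
print(first == second)  # True
```

The important property is determinism: nothing is stored about where user_42 went last time, the mapping is recomputed on every send.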

This is how ordering works in practice. Kafka does not guarantee that the entire topic is one ordered stream. It guarantees that within a single partition, messages are read in the order they were appended. Across partitions there's no ordering at all. So if you need user_42's events processed in order, partition by user_id. All of user_42's events land in one partition and a single consumer sees them in order. Different users go to different partitions and get processed in parallel.
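A small in-memory simulation makes the per-key ordering concrete. Everything here is an assumption for illustration: partitions are plain Python lists, the event data is made up, and CRC32 again stands in for Kafka's real key hash.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4

def partition_for(key: str, num_partitions: int) -> int:
    # Assumption: CRC32 as a stand-in for Kafka's murmur2 key hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Interleaved events from two users, in the order the producer sends them.
events = [
    ("user_42", "login"),
    ("user_7", "login"),
    ("user_42", "add_to_cart"),
    ("user_7", "logout"),
    ("user_42", "checkout"),
]

# Each partition is modeled as an append-only list.
partitions = defaultdict(list)
for user_id, action in events:
    partitions[partition_for(user_id, NUM_PARTITIONS)].append((user_id, action))

# Reading user_42's partition front to back yields their events in send order.
user_42_log = partitions[partition_for("user_42", NUM_PARTITIONS)]
user_42_actions = [action for user, action in user_42_log if user == "user_42"]
print(user_42_actions)  # ['login', 'add_to_cart', 'checkout']
```

If user_7 happens to hash to the same partition, their events are interleaved in that log, but each user's own sequence is still intact.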

Consumer groups are how reads scale. A consumer group is a set of consumer processes that share the partitions of a topic. Kafka assigns each partition to exactly one consumer in the group:

10 partitions, 1 consumer in group → that consumer reads all 10.
10 partitions, 5 consumers → each gets 2 partitions.
10 partitions, 10 consumers → 1 partition each, max parallelism.
10 partitions, 12 consumers → 10 work, 2 sit idle.
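The four cases above can be reproduced with a simplified assignment function. This is a sketch under stated assumptions: assign_partitions is a hypothetical helper that deals partitions out round-robin, while real Kafka assignors (range, round-robin, sticky, cooperative sticky) use their own rules. The invariant it shows is the same, though: each partition has exactly one owner within the group.

```python
def assign_partitions(num_partitions: int, consumers: list) -> dict:
    """Give each partition exactly one owner, dealing them out round-robin."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment

for group_size in (1, 5, 10, 12):
    consumers = [f"consumer-{i}" for i in range(group_size)]
    assignment = assign_partitions(10, consumers)
    # A consumer with an empty partition list has nothing to read.
    idle = [c for c, parts in assignment.items() if not parts]
    print(f"{group_size} consumers -> {len(idle)} idle")
```

With 12 consumers, the last two end up with empty lists, which is exactly the "2 sit idle" case.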

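The number of partitions defines the upper bound on parallelism within a consumer group. Once every partition has its own consumer, adding consumer instances does nothing: the extras sit idle. Adding brokers or hardware doesn't raise that ceiling either. Past it, more throughput has to come from faster per-partition processing or from more partitions.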

These two facts are in tension — and that tension is the real design choice. One partition gives you perfect total ordering and no parallelism. A hundred partitions give you massive parallelism and ordering only within each key. No setting maximizes both: Kafka scales by giving up global ordering, keeping just enough of it — per key — to stay useful. Choosing a partition count and a partition key is really choosing how much ordering you're willing to trade for throughput.

When a consumer joins or leaves the group, Kafka triggers a rebalance — partitions get reshuffled. This is also how failure recovery works: if a consumer dies, its partitions get reassigned to surviving members.
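A rebalance after a consumer failure can be sketched as recomputing the assignment over the surviving members. The helper and the consumer names below are hypothetical, and the reassignment is deliberately naive: it reshuffles everything from scratch, whereas Kafka's sticky and cooperative assignors try to move as few partitions as possible.

```python
def assign_partitions(num_partitions: int, consumers: list) -> dict:
    """Simplified round-robin assignment: one owner per partition."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return assignment

members = ["c1", "c2", "c3"]
before = assign_partitions(6, members)
print(before)  # {'c1': [0, 3], 'c2': [1, 4], 'c3': [2, 5]}

# c2 dies; the group rebalances across the survivors.
survivors = [m for m in members if m != "c2"]
after = assign_partitions(6, survivors)
print(after)   # {'c1': [0, 2, 4], 'c3': [1, 3, 5]}
```

Every partition still has exactly one owner after the rebalance, so c2's partitions keep being consumed. Note that the naive version also moved partitions between c1 and c3, which is the churn the sticky strategies are designed to avoid.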

Two patterns that fall out of this:

Multiple consumer groups read the same topic independently. Each group tracks its own committed offsets. The analytics group might be at offset 1,000,000; the fraud-detection group at 950,000; the real-time dashboard at 999,950. Kafka doesn't delete messages when they are read: retention is governed by retention.ms (7 days by default) and retention.bytes (unlimited by default). Fan-out to multiple downstream systems is just multiple consumer groups on the same topic.
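The independence of consumer groups comes down to one position per group over a shared log. Here is a minimal in-memory model of that idea; the poll function, the group names as dictionary keys, and the single-partition log are all assumptions for illustration, not the Kafka client API.

```python
# A single partition modeled as an immutable-in-practice list of records.
log = [f"event-{i}" for i in range(10)]

# Each consumer group keeps its own committed offset into the same log.
committed = {"analytics": 0, "fraud-detection": 0}

def poll(group: str, max_records: int) -> list:
    """Return the next records for a group and advance only that group's offset."""
    start = committed[group]
    batch = log[start:start + max_records]
    committed[group] = start + len(batch)
    return batch

poll("analytics", 7)
poll("fraud-detection", 3)

print(committed)  # {'analytics': 7, 'fraud-detection': 3}
print(len(log))   # 10
```

Reading changed each group's position and nothing else. The log is still 10 records long, which is why a slow group or a newly added one can read from wherever it likes within the retention window.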

Partition count is a number to overestimate at design time. Increasing it later changes the key-to-partition mapping: existing messages stay in the partitions they were already written to (Kafka never physically moves them), but future messages for the same key may route to a different partition, which breaks the per-key ordering guarantee across the change. Most teams pick a partition count generously up front (often 10–50 for a moderately busy topic) and rarely change it.
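The remapping effect is easy to see by hashing the same keys against two partition counts. As before, this is a sketch under assumptions: CRC32 replaces Kafka's murmur2 hash, and the 1,000 user keys and the 10 to 12 change are made-up inputs.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Assumption: CRC32 as a stand-in for Kafka's murmur2 key hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"user_{i}" for i in range(1000)]

# Keys whose target partition differs once the topic grows from 10 to 12.
moved = [k for k in keys if partition_for(k, 10) != partition_for(k, 12)]

print(f"{len(moved)} of {len(keys)} keys route to a different partition")
```

With a uniform hash, a key keeps its partition only when its hash modulo 60 is below 10, so roughly five in six keys would be expected to move in this example. For each of those keys, older events sit in one partition and newer events in another, and no single consumer is guaranteed to see them in order.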

The whole "Kafka scales" story compresses to: keyed partitioning preserves per-entity ordering while parallelizing across entities, and consumer groups scale read throughput up to the partition ceiling. Many higher-level Kafka features build on top of those primitives: replay and multi-tenant fan-out fall straight out of them, while the stronger guarantees — transactions, exactly-once semantics — add their own machinery on top (idempotent producers, transactional offset commits, and coordination the partition model alone doesn't provide).

Full breakdown — broker / leader / replication, retention policies, exactly-once semantics, Protobuf integration, and where streaming patterns live in production: Data Streams: Kafka and Protobuf.