Effective Design Patterns for a Kafka Cluster

A guide to implementing effective Kafka cluster design strategies using partitioning and replication

Manoj Kukreja
3 min read · Nov 9, 2020


Image by Tumisu from Pixabay

At its simplest, a Kafka cluster may be created with a single Broker instance. Using a Kafka Producer, a data stream can be sent in the form of messages to the Kafka Broker. These messages stay on the Broker for a configurable retention period until a Kafka Consumer retrieves and processes them.

Image by Author
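To make that flow concrete, here is a minimal sketch using Kafka's Java client. The broker address, the topic name (events), and the consumer group id are assumptions for illustration; how long messages stay on the Broker is governed by the topic's retention.ms setting.

```java
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class SingleBrokerDemo {
    public static void main(String[] args) {
        // Producer: send one message to the single Broker
        Properties prodProps = new Properties();
        prodProps.put("bootstrap.servers", "localhost:9092"); // placeholder address
        prodProps.put("key.serializer", StringSerializer.class.getName());
        prodProps.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(prodProps)) {
            producer.send(new ProducerRecord<>("events", "key-1", "hello kafka"));
        } // close() flushes the pending send

        // Consumer: read the same topic; the message remains on the Broker
        // until the topic's retention period (retention.ms) expires
        Properties consProps = new Properties();
        consProps.put("bootstrap.servers", "localhost:9092");
        consProps.put("group.id", "demo-group"); // placeholder group id
        consProps.put("auto.offset.reset", "earliest");
        consProps.put("key.deserializer", StringDeserializer.class.getName());
        consProps.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consProps)) {
            consumer.subscribe(List.of("events"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            records.forEach(r -> System.out.printf("offset=%d value=%s%n", r.offset(), r.value()));
        }
    }
}
```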

Looks like a pretty simple design… but it has a few drawbacks.

What if the Kafka Broker is not able to keep up with the high traffic demands?

Image by Author

The solution is simple: break your Kafka Topic into multiple Partitions, configure multiple Brokers, and place each Partition on a separate Broker. Instead of one Broker taking the entire load, you now have many of them sharing it. The question is: how many Partitions are optimal? The Partition count should be driven by your throughput and consumer-parallelism needs; the odd-number rule (3, 5, 7) applies to the ZooKeeper ensemble, which needs a majority quorum for a successful Leader election, not to the Partition count itself.

Image by Author
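As an illustrative sketch with Kafka's Java AdminClient (the broker addresses and topic name are assumptions), a Topic can be created with three Partitions so the load is spread across three Brokers:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreatePartitionedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // three Brokers sharing the load (placeholder addresses)
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 Partitions spread across the Brokers;
            // replication factor 1 (replication is covered below)
            NewTopic topic = new NewTopic("events", 3, (short) 1);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```

Note that with a replication factor of 1, each Partition still exists on exactly one Broker, which is what the next section addresses.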

The above design surely takes care of the high traffic demands… but is it the best design for High Availability?

What if a Kafka Broker suffers a failure? It is hardware, after all… one day it will fail.

Image by Author

What are the implications? Do I hear DATA LOSS?

So how can you protect a Partition against failures? Create a Kafka High Availability Cluster. This can be done by replicating every Partition to other Brokers. This way you maintain copies of each Partition, eliminating the Single Point of Failure (SPOF).

Image by Author
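A sketch of this with the Java AdminClient, again using assumed broker addresses and topic name: each Partition is replicated to three Brokers, and min.insync.replicas refuses writes when too few copies are alive.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 Partitions, each copied to 3 Brokers, so losing
            // any single Broker loses no data
            NewTopic topic = new NewTopic("events-ha", 3, (short) 3)
                    // refuse writes unless at least 2 replicas are in sync
                    .configs(Map.of("min.insync.replicas", "2"));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```

Producers should pair this with acks=all, so a write is only acknowledged once all in-sync replicas have received it.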

I hope this article was helpful. The topic of Apache Kafka is covered in greater detail as part of the Big Data Hadoop, Spark & Kafka course offered by Datafence Cloud Academy.
