A simplified guide to Kafka in Python
Apache Kafka is a popular distributed message broker designed to efficiently handle large volumes of real-time data. A Kafka cluster is not only highly scalable and fault-tolerant, but it also has a much higher throughput compared to other message brokers such as ActiveMQ and RabbitMQ. Though it is generally used as a publish/subscribe messaging system, a lot of organizations also use it for log aggregation because it offers persistent storage for published messages.

Kafka is a horizontally scalable, fault-tolerant, and fast messaging system. It follows a pub-sub model in which multiple producers write messages and multiple consumers read them, which decouples source systems from target systems. Some of its key features are:
- Scales to hundreds of nodes.
- Handles millions of messages per second.
- Real-time processing (latency of about 10 ms).
Kafka Architecture
The diagram below shows the architecture of Kafka.

Getting Started
First, install Java. Once that’s done, download and install Kafka. Just download a release (2.2.0 in this guide) and untar it, like this:
$ wget http://apache.cs.utah.edu/kafka/2.2.0/kafka_2.12-2.2.0.tgz
$ tar -xzf kafka_2.12-2.2.0.tgz
$ cd kafka_2.12-2.2.0
Inside the extracted kafka_2.12-2.2.0 directory, you will find bin/zookeeper-server-start.sh (which is used to start the ZooKeeper server) and config/zookeeper.properties (which provides the default configuration for the ZooKeeper server).
Start the ZooKeeper server by running:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
# Now, in a new terminal, start the Kafka server:
$ bin/kafka-server-start.sh config/server.properties
Create a topic
Let’s create a topic named “coba” with a single partition and only one replica:
$ bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic coba
You can now see that topic by running the list-topics command:
$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092
coba
Installing Python client for Apache Kafka
Before we can start working with Apache Kafka in a Python program, we need to install the Python client for Apache Kafka:
$ pip install kafka-python
Let’s create a consumer. Create a file named consumer.py and set the broker list to localhost:9092, since we are running a local Kafka cluster with a single broker (node).
Here’s an example of a Python consumer:
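A minimal sketch of consumer.py using kafka-python's KafkaConsumer. It assumes the coba topic created above and a broker at localhost:9092. The helper names decode_value and format_message are illustrative rather than part of any library, and the printed format is chosen to match the output line shown later in this guide.

```python
# consumer.py: a minimal sketch under the assumptions stated above.

BOOTSTRAP_SERVERS = ["localhost:9092"]  # single local broker
TOPIC = "coba"                          # topic created earlier in this guide


def decode_value(raw: bytes) -> str:
    """Turn the raw bytes of a record value into text (UTF-8 assumed)."""
    return raw.decode("utf-8")


def format_message(value: str) -> str:
    """Build the line printed for each record received."""
    return f"Message: {value}"


def main() -> None:
    try:
        # Imported here so that a missing package produces a readable hint.
        from kafka import KafkaConsumer
        from kafka.errors import NoBrokersAvailable
    except ImportError:
        print("kafka-python is not available; try: pip install kafka-python")
        return

    try:
        consumer = KafkaConsumer(
            TOPIC,
            bootstrap_servers=BOOTSTRAP_SERVERS,
            value_deserializer=decode_value,  # bytes -> str for each record
        )
    except NoBrokersAvailable:
        print(f"No Kafka broker reachable at {BOOTSTRAP_SERVERS}")
        return

    # Iterating blocks and yields records as they arrive; stop with Ctrl+C.
    try:
        for record in consumer:
            print(format_message(record.value))
    except KeyboardInterrupt:
        pass
    finally:
        consumer.close()


if __name__ == "__main__":
    main()
```

With the default settings used here, the consumer only receives messages published after it has started, which is why it is started before the producer.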
To create a Kafka producer, we need to pass it a list of bootstrap servers (a list of Kafka brokers). We will also specify a client ID that uniquely identifies this producer client. In this example, we are going to send messages with a key. The message body is a string, so we need a value serializer to convert it to bytes, because the body is sent in the value field of the Kafka record. The message key is sent in the record's key field.
Here’s a Python example of a Kafka producer:
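A minimal sketch of producer.py using kafka-python's KafkaProducer. It assumes the coba topic and a broker at localhost:9092. The client ID simple-producer, the key greeting, and the helper names build_message_body and serialize_text are hypothetical choices made for illustration. The body is built as a JSON string so that it matches the output shown later in this guide.

```python
# producer.py: a minimal sketch under the assumptions stated above.
import json

BOOTSTRAP_SERVERS = ["localhost:9092"]  # single local broker
TOPIC = "coba"                          # topic created earlier in this guide
CLIENT_ID = "simple-producer"           # hypothetical client ID


def build_message_body(text: str) -> str:
    """Build the message body as a JSON string."""
    return json.dumps({"message": text})


def serialize_text(text: str) -> bytes:
    """Serializer used for both key and value: str -> UTF-8 bytes."""
    return text.encode("utf-8")


def main() -> None:
    try:
        # Imported here so that a missing package produces a readable hint.
        from kafka import KafkaProducer
        from kafka.errors import NoBrokersAvailable
    except ImportError:
        print("kafka-python is not available; try: pip install kafka-python")
        return

    try:
        producer = KafkaProducer(
            bootstrap_servers=BOOTSTRAP_SERVERS,
            client_id=CLIENT_ID,
            key_serializer=serialize_text,
            value_serializer=serialize_text,
        )
    except NoBrokersAvailable:
        print(f"No Kafka broker reachable at {BOOTSTRAP_SERVERS}")
        return

    # send() is asynchronous; flush() blocks until buffered records are sent.
    producer.send(TOPIC, key="greeting", value=build_message_body("Hello kafka"))
    producer.flush()
    producer.close()


if __name__ == "__main__":
    main()
```

Calling flush() before exiting matters in a short script like this one, since send() only queues the record and the process could otherwise end before it is delivered.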
To start the consumer, run the command:
$ python consumer.py
Open another terminal window and run the producer:
$ python producer.py
If there are no configuration issues, you should see Message: {"message": "Hello kafka"} in the consumer's terminal.
If you want to see my final code for this article, it is available here!
Conclusion
You can make use of Kafka in your projects by creating producers and consumers with Kafka clients, which are available for most programming languages. To learn more about Kafka, you can also consult its documentation.