Getting Started with Kafka and Python

In this article, we will learn what Kafka is, how to install the Python client, and how to use Kafka with Python.

What is Kafka?

Apache Kafka is a popular distributed streaming platform that lets you publish and subscribe to streams of records and store them in a fault-tolerant manner. In this tutorial, we’ll go over how to use Kafka with Python and provide some sample code to get you going.

At its core, Kafka is a messaging system that allows you to publish and subscribe to streams of data. It was designed to handle large amounts of data, allowing you to process data in real time, store it in a fault-tolerant way, and replay it as needed.

Prerequisites

  • A basic understanding of Python
  • Kafka installed on your machine

Step 1: Installing Kafka-Python

The first step is to install the kafka-python package, which provides a Python client for Apache Kafka. You can install it with pip, the Python package manager:

pip install kafka-python

Step 2: Setting up a Kafka Producer

A Kafka Producer is a component of the Kafka system that writes data to Kafka topics.

A topic in Kafka is a designated category or feed to which data is published. Producers write data to topics, and consumers read data from them.

The main job of a Kafka Producer is to write data to Kafka topics in a fault-tolerant and scalable way. A Producer writes data to a topic by sending messages to Kafka brokers. A message in Kafka is an immutable sequence of bytes, which can contain any kind of data, including text, JSON, Avro, or binary data.

from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers=['localhost:9092'])

producer.send('kafka-topic-name', b'Code Hubs')

producer.flush()

Here we create a Kafka producer that connects to the local Kafka broker running on port 9092.

We then send the message “Code Hubs” to a topic called “kafka-topic-name.” Because the producer sends messages asynchronously, we finish by flushing it so that any buffered messages are delivered to Kafka before the program exits.
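If you want confirmation that a single message arrived, rather than flushing everything at once, you can wait on the future that send() returns. The helper below is a minimal sketch: send_and_confirm is a hypothetical name, and the producer argument is assumed to behave like a kafka-python KafkaProducer.

```python
def send_and_confirm(producer, topic, value, timeout=10):
    """Send one message and block until the broker acknowledges it."""
    # send() is asynchronous and returns a future immediately
    future = producer.send(topic, value)
    # get() waits for the result and raises if the send failed or timed out
    metadata = future.get(timeout=timeout)
    # The returned metadata says where the message was stored
    return metadata.topic, metadata.partition, metadata.offset
```

With the producer from the example above, send_and_confirm(producer, 'kafka-topic-name', b'Code Hubs') should return the topic name together with the partition and offset assigned to the message.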

Step 3: Introduction to Kafka Topics

In the Kafka system, a topic is a named feed or category to which data is published. The cornerstone of Kafka’s data paradigm, topics offer a fault-tolerant and scalable way to organize and partition data.

Data in Kafka is organized into topics, and each topic is split into partitions that are distributed across the Kafka brokers. Partitions are Kafka’s unit of parallelism: each partition is a linearly ordered stream of messages. Messages are appended to a partition in strict order, and each message is assigned a sequential offset within that partition.
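To make partitions and offsets more concrete, here is a toy in-memory model. It is only an illustration and not how Kafka is implemented: ToyTopic is a made-up class, and CRC32 is an assumed stand-in for the hash function that real clients use to pick a partition.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a topic, for illustration only."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # partition number -> list of messages; the list index is the offset
        self.partitions = defaultdict(list)

    def partition_for(self, key):
        # The same key always maps to the same partition
        return zlib.crc32(key) % self.num_partitions

    def append(self, key, value):
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append(value)
        # The offset is the position of the message within its partition
        return partition, len(log) - 1


topic = ToyTopic(num_partitions=3)
p1, o1 = topic.append(b"user-1", b"login")
p2, o2 = topic.append(b"user-1", b"click")
print(p1 == p2, o1, o2)  # prints: True 0 1
```

Both messages share a key, so they land in the same partition and receive consecutive offsets. Ordering is only guaranteed within a partition, not across the whole topic.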

Step 4: Setting up a Kafka Consumer

A Kafka Consumer is a component of the Kafka system that reads data from Kafka topics. While producers write data to topics, consumers read data from them.

from kafka import KafkaConsumer

consumer = KafkaConsumer('kafka-topic-name', bootstrap_servers=['localhost:9092'])

for message in consumer:
    print(message)

Here we create a Kafka consumer that connects to the local Kafka broker running on port 9092. We then consume messages from the ‘kafka-topic-name’ topic and print them to the console.
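Printing the whole message shows every field of the record at once. If you only want the essentials, you can pick out individual fields. The sketch below assumes each record exposes topic, partition, offset, and value attributes, as kafka-python's consumer records do; describe and print_messages are hypothetical helper names.

```python
def describe(record):
    """Format the main fields of a consumed record for printing."""
    # value arrives as raw bytes unless a deserializer is configured
    text = record.value.decode("utf-8")
    return f"{record.topic}[{record.partition}] @ {record.offset}: {text}"


def print_messages(consumer):
    # consumer is any iterable of records, for example a KafkaConsumer
    for record in consumer:
        print(describe(record))
```

By default, a new consumer typically starts from the latest messages. If you want to read messages that were published before the consumer started, you can pass auto_offset_reset='earliest' when creating the KafkaConsumer. Passing consumer_timeout_ms makes the loop stop after that many milliseconds without a new message instead of waiting forever.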

Step 5: Sending JSON Messages to Kafka

import json

# Reuses the producer created in Step 2
producer.send('kafka-topic-name', json.dumps({'name': 'Code Hub', 'company': 'vision'}).encode('utf-8'))

Here we send a JSON message to the ‘kafka-topic-name’ topic. We first use the json.dumps() method to convert a Python dictionary to a JSON string, then encode it to bytes using the encode() method.
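Instead of encoding every message by hand, you can let the producer do it through the value_serializer option of KafkaProducer. This is a sketch rather than the only way to do it; to_json_bytes and make_json_producer are hypothetical helper names, and the broker address is the same local one assumed throughout this article.

```python
import json


def to_json_bytes(obj):
    """Serialize a Python object to UTF-8 encoded JSON bytes."""
    return json.dumps(obj).encode("utf-8")


def make_json_producer(servers=("localhost:9092",)):
    # Imported here so the helper above works without kafka-python installed
    from kafka import KafkaProducer

    # Every value passed to send() goes through to_json_bytes first
    return KafkaProducer(
        bootstrap_servers=list(servers),
        value_serializer=to_json_bytes,
    )
```

A producer created this way accepts dictionaries directly, for example make_json_producer().send('kafka-topic-name', {'name': 'Code Hub'}).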

Step 6: Consuming JSON Messages from Kafka

for message in consumer:
    json_message = json.loads(message.value.decode('utf-8'))
    print(json_message)

Here we consume messages from the ‘kafka-topic-name’ topic, decode each message value from bytes, parse the JSON string using the json.loads() method, and print the resulting Python dictionary to the console.
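One thing to watch for: the topic may also contain messages that are not JSON, such as the plain “Code Hubs” message from Step 2, and json.loads() raises an error on those. A possible way to handle this is a tolerant decoder passed to the consumer's value_deserializer option. Returning None for bad payloads is an assumption made for this sketch, and from_json_bytes and make_json_consumer are hypothetical helper names.

```python
import json


def from_json_bytes(raw):
    """Decode JSON bytes; return None for payloads that are not valid JSON."""
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


def make_json_consumer(topic, servers=("localhost:9092",)):
    # Imported here so the helper above works without kafka-python installed
    from kafka import KafkaConsumer

    # message.value is already decoded by the time the loop sees it
    return KafkaConsumer(
        topic,
        bootstrap_servers=list(servers),
        value_deserializer=from_json_bytes,
    )
```

When looping over such a consumer, message.value is either the parsed Python object or None, so non-JSON messages can be skipped with a simple check.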

I hope you find this article helpful.

Please share your feedback, and if you have any questions or issues with this article, let me know.
