Kafka

Kafka is an open-source distributed streaming platform used for storing, reading, and processing high-throughput streaming data in real time.

Hevo supports two Kafka variants as Sources.

  • Apache Kafka: a self-managed, open-source Kafka deployment hosted on your own infrastructure.

  • Kafka Confluent Cloud: a fully managed, cloud-native Kafka service built on Apache Kafka, with additional capabilities such as schema registry and built-in connectors.

Configuration steps for each variant are covered in their respective documentation pages.

Key concepts

In Kafka, a Topic is a category or a common name used to store and publish a particular stream of data. Topics in Apache Kafka are similar to tables in a database. Hevo reads your data from the topics created in your Apache Kafka instance.

A Broker is a Kafka server that hosts topics. A Cluster is a group of one or more Brokers that work together, with the topics distributed across them. A Kafka Cluster may therefore consist of many Kafka Brokers running on many servers.

Bootstrap servers are specified as a comma-separated list of host and port pairs that give the addresses of the brokers. In the context of your Hevo Pipeline, these are the servers in the cluster that Hevo connects to in order to establish the initial connection. If one is unavailable, the others are tried. After the initial connection, the bootstrap server returns the cluster metadata, and subsequent connections are directed to the appropriate servers in the cluster.
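The comma-separated host and port format described above can be illustrated with a minimal Python sketch. The function name parse_bootstrap_servers and the example host names are hypothetical and are not part of Hevo or Kafka; the sketch only shows how such a list breaks down into individual broker addresses.

```python
def parse_bootstrap_servers(value: str) -> list[tuple[str, int]]:
    """Split a comma-separated 'host:port' list into (host, port) pairs.

    Hypothetical helper for illustration only; not a Hevo or Kafka API.
    """
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas or trailing separators
        # Split on the last colon so the port is always the final component.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


# Example host names are placeholders, not real brokers.
servers = parse_bootstrap_servers(
    "broker1.example.com:9092, broker2.example.com:9092"
)
print(servers)
```

Listing more than one broker is what allows the initial connection to succeed even if one of the listed servers is down.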

How Hevo ingests Kafka data

When Hevo ingests a Kafka message, it parses the message payload and creates one Hevo Event per record in the payload. Hevo adds two metadata fields to each ingested Event:

  • __hevo_id: a unique identifier assigned by Hevo to each ingested Event
  • ref_id: a reference identifier linking the ingested Event back to its source Kafka message

Sample Source Event

A single Kafka message containing two records:

[ { "name": "John", "age": 25 }, { "name": "Jack", "occupation": "chef" } ]

Sample Ingested Events

Two Hevo Events created from the above message:

Event 1:

{ "__hevo_id": "abc1", "ref_id": "abcdef", "name": "John", "age": 25 }

Event 2:

{ "__hevo_id": "abc2", "ref_id": "abcdef", "name": "Jack", "occupation": "chef" }

Note: Both Events share the same ref_id because they originated from the same Kafka message.
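The splitting behavior shown in the samples above can be approximated with a short Python sketch. This is not Hevo's implementation: the function name to_hevo_events is hypothetical, the use of a random UUID for __hevo_id is an assumption (the actual ID scheme is not documented here), and the ref_id is simply passed in by the caller.

```python
import json
import uuid


def to_hevo_events(payload: str, ref_id: str) -> list[dict]:
    """Create one event per record in a Kafka message payload.

    Illustrative sketch only; names and ID generation are assumptions.
    """
    records = json.loads(payload)
    if isinstance(records, dict):
        # Assumption: a payload holding a single object is treated as one record.
        records = [records]

    events = []
    for record in records:
        event = {
            "__hevo_id": uuid.uuid4().hex,  # assumed: unique per Event
            "ref_id": ref_id,               # shared by all Events from this message
        }
        event.update(record)  # copy the record's own fields into the Event
        events.append(event)
    return events


message = '[{"name": "John", "age": 25}, {"name": "Jack", "occupation": "chef"}]'
for event in to_hevo_events(message, ref_id="abcdef"):
    print(event)
```

Each printed Event carries its own __hevo_id, while both carry the ref_id "abcdef", mirroring the relationship described in the note above.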

Last updated on May 25, 2026
