Introduction

Apache Kafka is an open-source distributed event streaming platform used to build real-time data pipelines and streaming applications. It collects, stores, and processes high-throughput data streams while maintaining low latency and fault tolerance.

Core Features

  • Kafka organizes data into topics, which are split into partitions distributed across multiple servers called brokers.
  • Producers write records to topics and consumers read them; consumer groups let multiple consumers process partitions in parallel.
  • Kafka persists records in a distributed, replicated commit log that provides durable, ordered storage.
  • This design enables horizontal scalability, strong resilience, and efficient data streaming.
  • Kafka works well for use cases such as real-time analytics, log aggregation, event sourcing, and microservices communication.
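The partitioning and consumer-group behavior described above can be sketched in a few lines. This is an illustrative model only, not Kafka's actual implementation: the function names (`assign_partition`, `assign_to_consumers`) are hypothetical, the hash is a stand-in for Kafka's murmur2 default partitioner, and the round-robin assignment is a simplification of Kafka's partition assignors.

```python
import hashlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition deterministically.

    Kafka's default partitioner hashes the key with murmur2; an MD5-based
    hash stands in for it here, but the property is the same: equal keys
    always land in the same partition, preserving per-key ordering.
    """
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

def assign_to_consumers(partitions: list, consumers: list) -> dict:
    """Spread partitions round-robin over the members of a consumer group.

    Each partition is consumed by exactly one member of the group, which is
    what allows records to be processed in parallel.
    """
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# Records with the same key map to the same partition.
assert assign_partition(b"user-42", 6) == assign_partition(b"user-42", 6)

# Three consumers in a group split six partitions, two each.
group = assign_to_consumers(list(range(6)), ["c0", "c1", "c2"])
assert all(len(parts) == 2 for parts in group.values())
```

A key takeaway from this sketch: parallelism in a consumer group is bounded by the partition count, so a topic with six partitions can be read by at most six active consumers in one group.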

LinkedIn originally developed Kafka and later open-sourced it through the Apache Software Foundation. Today, Kafka powers streaming and messaging in distributed systems and is widely adopted across industries that handle massive volumes of real-time data.

The following sections describe the use cases this integration supports.

Use Cases

Discovery Use Cases

  • Discovers Apache Kafka components and outlines the resource structure.
  • Publishes relationships between resources to enable topological views and simplify maintenance.

Monitoring Use Cases

  • Provides metrics related to job scheduling time, status, and performance.
  • Generates concern alerts for each metric to notify administrators about resource issues.

Hierarchy of Apache Kafka

  • Apache Kafka Cluster
    • Apache Kafka Broker
    • Apache Kafka Topic

Version History

Application Version    Bug fixes / Enhancements
1.0.0                  Initial discovery and monitoring implementations.