Apache Kafka
Apache Kafka originated at LinkedIn and became an open-source Apache project in 2011.
It became a top-level Apache project in 2012.
Kafka is written in Scala and Java
Apache Kafka is a fault-tolerant messaging system based on the publish-subscribe model.
Apache Kafka is fast, scalable and distributed by design.
What is a Messaging System?
A messaging system is responsible for transferring data from one application to another.
Distributed messaging is based on the concept of reliable message queuing.
Messages are queued asynchronously between client applications and the messaging system.
Two types of messaging patterns are available:
1) Point-to-point
Messages are persisted in a queue.
One or more consumers can consume the messages in the queue, but a particular message can be consumed by at most one consumer.
Once a consumer reads a message in the queue, it disappears from that queue.
2) Publish-subscribe (pub-sub)
Message producers are called publishers and message consumers are called subscribers.
Messages are persisted in a topic.
Consumers can subscribe to one or more topics and consume all the messages in those topics.
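The difference between the two patterns can be sketched with a small in-memory model. This is only an illustration, not Kafka code: the class names `PointToPointQueue` and `Topic`, the subscriber names, and the `poll` method are all hypothetical and chosen for this example.

```python
from collections import deque


class PointToPointQueue:
    """Hypothetical queue: each message goes to at most one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Removing the message on read is what makes it "disappear" from the queue.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Hypothetical topic: every subscriber sees every published message."""

    def __init__(self):
        self._messages = []    # messages stay in the topic after being read
        self._positions = {}   # subscriber name -> index of next unread message

    def subscribe(self, subscriber):
        self._positions[subscriber] = 0

    def publish(self, message):
        self._messages.append(message)

    def poll(self, subscriber):
        # Return everything this subscriber has not seen yet, then advance its position.
        start = self._positions[subscriber]
        self._positions[subscriber] = len(self._messages)
        return self._messages[start:]


# Point-to-point: two workers share the queue, each message is handed out once.
queue = PointToPointQueue()
queue.send("order-1")
queue.send("order-2")
first_worker = queue.receive()
second_worker = queue.receive()

# Pub-sub: both subscribers receive both messages.
topic = Topic()
topic.subscribe("billing")
topic.subscribe("audit")
topic.publish("order-1")
topic.publish("order-2")
billing_seen = topic.poll("billing")
audit_seen = topic.poll("audit")
```

In the queue, reading is destructive, so the workers split the load. In the topic, each subscriber only advances its own position, so one subscriber reading never hides a message from another.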
Apache Kafka is a distributed publish-subscribe messaging system and a robust queue that can handle a high volume of data and pass messages from one endpoint to another.
Kafka is suitable for both offline and online message consumption.
Kafka messages are persisted on disk and replicated within the cluster to prevent data loss.
Kafka has traditionally been built on top of the ZooKeeper synchronization service; newer releases can also run without ZooKeeper by using the built-in KRaft mode.
Kafka integrates well with Apache Storm and Apache Spark for real-time streaming data analysis.
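Persisting messages on disk is what lets the same data serve both online and offline consumers: each consumer simply reads from the position it cares about. The following is a minimal sketch of that idea, assuming a single local file in place of a Kafka partition. The `FileLog` class and its methods are hypothetical, and replication across brokers is not modeled here.

```python
import json
import os
import tempfile


class FileLog:
    """Hypothetical append-only log: one JSON record per line in a local file."""

    def __init__(self, path):
        self.path = path
        open(self.path, "a").close()  # create the file if it does not exist yet

    def append(self, message):
        # The offset of a message is its line number in the file, starting at 0.
        offset = self._count()
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message) + "\n")
        return offset

    def read_from(self, offset):
        # Reading never deletes anything, so any consumer can replay from any offset.
        with open(self.path, encoding="utf-8") as f:
            records = [json.loads(line) for line in f]
        return records[offset:]

    def _count(self):
        with open(self.path, encoding="utf-8") as f:
            return sum(1 for _ in f)


log_path = os.path.join(tempfile.mkdtemp(), "events.log")
log = FileLog(log_path)
offsets = [log.append({"event": name}) for name in ("login", "click", "logout")]

# An "online" consumer that already processed offsets 0 and 1 reads only what is new.
online_batch = log.read_from(2)

# An "offline" consumer, such as a nightly batch job, replays the log from the start.
offline_batch = log.read_from(0)

# A fresh object over the same file still sees every message that was written.
reopened = FileLog(log_path).read_from(0)
```

The sketch re-reads the whole file on every call, which is fine for illustration but not how a real log would be served.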
Benefits:
Reliability: Kafka is distributed, partitioned, replicated and fault tolerant.
Scalability: The Kafka messaging system scales easily without downtime.
Durability: Kafka uses a distributed commit log, meaning messages are persisted to disk as quickly as possible, which makes them durable.
Performance: Kafka has high throughput for both publishing and subscribing to messages. It maintains stable performance even when many TB of messages are stored.
Kafka is very fast and is designed for high availability and durability; in practice, zero downtime and zero data loss depend on how replication and acknowledgements are configured.
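To make "partitioned" and "replicated" more concrete, here is a rough sketch of how keys might map to partitions and how replicas might be spread over brokers. It rests on simplifying assumptions: it uses CRC32 rather than Kafka's own partitioner hash, it places replicas round-robin, and all function and broker names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    # Same key -> same partition. CRC32 is a stand-in hash, not Kafka's partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_replicas(num_partitions, brokers, replication_factor):
    # Simplified round-robin placement: partition p starts at broker p and wraps around.
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def surviving_replicas(assignment, failed_broker):
    # Drop the failed broker from every replica list to see what is still available.
    return {
        p: [b for b in replicas if b != failed_broker]
        for p, replicas in assignment.items()
    }


brokers = ["broker-0", "broker-1", "broker-2"]
assignment = assign_replicas(num_partitions=4, brokers=brokers, replication_factor=2)
after_failure = surviving_replicas(assignment, "broker-1")

# With two replicas per partition, losing one broker leaves every partition readable.
all_partitions_available = all(len(r) > 0 for r in after_failure.values())
```

With a replication factor of 2, any single broker can fail and each partition still has a copy elsewhere, which is the intuition behind the fault tolerance claim above.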
Use Cases
1. Metrics
2. Log Aggregation Solution
3. Stream Processing
Kafka is very fast: a LinkedIn engineering benchmark reported 2 million writes/sec on three commodity machines.