kafka-expert
Apache Kafka Expert
Expert guidance for Apache Kafka, event streaming, Kafka Streams, and building event-driven architectures.
Core Concepts
- Topics, partitions, and offsets
- Producers and consumers
- Consumer groups
- Kafka Streams
- Kafka Connect
- Exactly-once semantics
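Keys, partitions, and offsets work together: the default partitioner hashes a record's key onto one of the topic's partitions, so all records with the same key share a partition and keep their relative order. A minimal sketch of that mapping (Kafka itself uses murmur2; md5 is used here only for a deterministic, dependency-free illustration):

```python
import hashlib

def assign_partition(key: str, num_partitions: int) -> int:
    # Conceptual sketch of Kafka's key-based partitioning: hash the key,
    # then take it modulo the partition count. Same key -> same partition.
    digest = int.from_bytes(hashlib.md5(key.encode()).digest()[:4], 'big')
    return digest % num_partitions

# Messages keyed by the same user always land in the same partition,
# preserving per-user ordering.
p1 = assign_partition('user-123', 6)
p2 = assign_partition('user-123', 6)
assert p1 == p2
```

Because the mapping depends on the partition count, adding partitions later changes where new keys land, which is one reason partition counts are usually chosen up front.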
Producer
```python
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    acks='all',  # Wait for all replicas
    retries=3
)
```
Send message
```python
future = producer.send('user-events', {
    'user_id': '123',
    'event': 'login',
    'timestamp': '2024-01-01T00:00:00Z'
})
```
Wait for acknowledgment
```python
# Block until the broker acknowledges the write (raises on timeout or error)
record_metadata = future.get(timeout=10)
print(f"Topic: {record_metadata.topic}, Partition: {record_metadata.partition}")

producer.flush()
producer.close()
```
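Blocking on `future.get()` serializes sends. kafka-python also lets you attach callbacks to the future returned by `send()`, so delivery results are handled asynchronously without stalling the producer. A sketch, where the handler names are illustrative rather than part of the library:

```python
def on_send_success(record_metadata):
    # Invoked by kafka-python once the broker acknowledges the write.
    return f"{record_metadata.topic}[{record_metadata.partition}] @ {record_metadata.offset}"

def on_send_error(exc):
    # Invoked on failure: log it, then retry or route to a dead-letter topic.
    return f"delivery failed: {exc}"

# Attach to the future instead of blocking on get():
# producer.send('user-events', event).add_callback(on_send_success).add_errback(on_send_error)
```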
Consumer
```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='my-group',
    auto_offset_reset='earliest',
    enable_auto_commit=False,
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for message in consumer:
    print(f"Received: {message.value}")
    # Process the message, then commit manually so offsets advance
    # only after successful processing (at-least-once delivery)
    process_event(message.value)
    consumer.commit()
```
Kafka Streams
```java
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> source = builder.stream("input-topic");

// Transform and filter
KStream<String, String> transformed = source
    .filter((key, value) -> value.length() > 10)
    .mapValues(value -> value.toUpperCase());

transformed.to("output-topic");

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();
```
Best Practices
- Use appropriate partition keys
- Monitor consumer lag
- Implement idempotent producers
- Use consumer groups for scaling
- Set proper retention policies
- Handle rebalancing gracefully
- Monitor cluster metrics
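"Monitor consumer lag" means tracking, per partition, how far the group's committed offset trails the log end offset. In kafka-python the inputs come from `consumer.end_offsets()` and `consumer.committed()`; the arithmetic itself is a simple difference, sketched here with plain tuples standing in for `TopicPartition` objects:

```python
def consumer_lag(end_offsets, committed):
    # Lag per partition: messages appended to the log but not yet
    # committed by the consumer group. A missing commit counts as 0.
    return {tp: end - committed.get(tp, 0) for tp, end in end_offsets.items()}

lag = consumer_lag(
    {('user-events', 0): 120, ('user-events', 1): 45},
    {('user-events', 0): 100, ('user-events', 1): 45},
)
# Partition 0 is 20 messages behind; partition 1 is fully caught up.
```

Growing lag on one partition while others stay flat usually points at a hot partition key or a slow handler rather than an undersized consumer group.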
Anti-Patterns
❌ Single partition topics
❌ No error handling
❌ Ignoring consumer lag
❌ Producing to wrong partitions
❌ Not using consumer groups
❌ Synchronous processing
❌ No monitoring
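Several of these anti-patterns trace back to retries and rebalances redelivering messages. A common defence is idempotent handling on the consumer side: track the IDs of processed events and skip replays. A minimal in-memory sketch, where the `event_id` field and `deduplicate` helper are illustrative (real systems persist the seen set, e.g. in a database keyed by event ID):

```python
def deduplicate(events, seen_ids):
    # At-least-once delivery plus idempotent handling approximates
    # exactly-once processing: skip any event whose ID was already seen.
    fresh = []
    for event in events:
        if event['event_id'] not in seen_ids:
            seen_ids.add(event['event_id'])
            fresh.append(event)
    return fresh
```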
Resources
- Apache Kafka: https://kafka.apache.org/
- Confluent Platform: https://www.confluent.io/