Kafka - Issue with multi-node cluster when some nodes go down

Hello, I have created a 3-node cluster and started producing data, and I am able to consume it at all the nodes. But if I bring two systems down, my third system throws an exception. Can I know the reason behind this and how to get past this issue?

I want to create a cluster where, even if any two nodes go down, the third node takes over the work of the other two. How can I achieve this?

You would have to either not publish with acks=all, or you would need to ensure your min.insync.replicas=1.
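For example, the ack level can be overridden per run with the console producer; this is just a sketch (the broker address and topic name are placeholders, and older versions use --broker-list instead of --bootstrap-server):

```
# Sketch: publish with acks=1 so writes succeed as long as the
# partition leader is up, even when min.insync.replicas cannot be
# satisfied for acks=all.
kafka-console-producer \
  --bootstrap-server localhost:9092 \
  --topic foo \
  --producer-property acks=1
```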

While it is possible, it is not recommended. Typically, if your system is running with just 1 broker, there is no replication and you can run into data loss. If you have 4 brokers, you could use a replication factor of 4 with min.insync.replicas=2, which would give you durability along with the availability you want (handling 2 brokers being down).
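As a rough sketch of that 4-broker layout (topic name, partition count, and broker address below are just placeholders):

```
# Sketch: a topic replicated across all 4 brokers, but needing only
# 2 in-sync replicas to acknowledge acks=all writes, so it stays
# writable with up to 2 brokers down.
kafka-topics \
  --bootstrap-server localhost:9092 \
  --create \
  --topic foo \
  --partitions 3 \
  --replication-factor 4 \
  --config min.insync.replicas=2
```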

Now, if you are going down the path of wanting the system to function with 1 broker, you have to make sure your __consumer_offsets topic is also configured with a min.insync.replicas of 1 (as well as other internal topics such as the transaction state topic, if you are using exactly-once semantics).

The other thing is that you are now expecting 1 instance to handle the work of 3; if it ends up under-provisioned for that load, you will most likely run into additional issues quickly.

Thanks for your quick reply, but how do I configure the __consumer_offsets topic's min.insync.replicas to 1?

If I use 4 systems, then I would like to bring 3 systems down and check.

kafka-topics --topic foo --describe

(foo being the topic of interest)

Use kafka-configs to add a configuration.

kafka-topics still works (I believe) for setting topic configs and is a little easier to set up.
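If your version still allows it, that kafka-topics route would presumably look something like the following (older ZooKeeper-based form; recent releases reject --config with --alter and require kafka-configs instead):

```
# Sketch of the older syntax; treat it as deprecated and prefer
# kafka-configs on current versions.
kafka-topics \
  --zookeeper localhost:2181 \
  --alter \
  --topic __consumer_offsets \
  --config min.insync.replicas=1
```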

(kt is an alias of mine for kafka-topics)

```
Topic: __consumer_offsets	TopicId: ewO5P8mYRbSI6vdVzqvocg	PartitionCount: 50	ReplicationFactor: 3	Configs: compression.type=producer,min.insync.replicas=2,cleanup.policy=compact,segment.bytes=104857600
	Topic: __consumer_offsets	Partition: 0	Leader: 3	Replicas: 3,4,1	Isr: 3,4,1
	Topic: __consumer_offsets	Partition: 1	Leader: 4	Replicas: 4,1,2	Isr: 4,1,2
	Topic: __consumer_offsets	Partition: 2	Leader: 1	Replicas: 1,2,3	Isr: 1,2,3
	Topic: __consumer_offsets	Partition: 3	Leader: 2	Replicas: 2,3,4	Isr: 2,3,4
	Topic: __consumer_offsets	Partition: 4	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
	Topic: __consumer_offsets	Partition: 5	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3
```

kafka-configs --bootstrap-server localhost:19092 --alter --entity-type topics --entity-name __consumer_offsets --add-config min.insync.replicas=1

Completed updating config for topic __consumer_offsets.
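If you are using exactly-once semantics, the transactions internal topic mentioned earlier would presumably need the same change, along these lines:

```
# Sketch: same min.insync.replicas override for the transaction
# state topic (only relevant when producers use transactions).
kafka-configs \
  --bootstrap-server localhost:19092 \
  --alter \
  --entity-type topics \
  --entity-name __transaction_state \
  --add-config min.insync.replicas=1
```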

```
kt --topic __consumer_offsets --describe
Topic: __consumer_offsets TopicId: ewO5P8mYRbSI6vdVzqvocg PartitionCount: 50 ReplicationFactor: 3 Configs: compression.type=producer,min.insync.replicas=1,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3,4,1 Isr: 3,4,1
Topic: __consumer_offsets Partition: 1 Leader: 4 Replicas: 4,1,2 Isr: 4,1,2
Topic: __consumer_offsets Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 3 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4
```

OK, I will check this out. Thank you.

Hello, actually the min.insync.replicas of __consumer_offsets is already 1.

```
bin/kafka-topics --describe --zookeeper localhost:2181 --topic __consumer_offsets
Topic: __consumer_offsets TopicId: bMbQ9YY0SEeZQI7MjzJhEQ PartitionCount: 50 ReplicationFactor: 1 Configs: compression.type=producer,cleanup.policy=compact,min.insync.replicas=1,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 1 Leader: none Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 2 Leader: 1 Replicas: 1 Isr: 1
```

But even with this configuration, I am still not able to achieve what I want.

Look at partition 1: Replicas: 2 with Isr: 2, and Leader: none.

Something got misconfigured. This can happen if broker settings that need to match across the cluster do not, and a topic is then created. Or it could be due to a partition reassignment.
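If it is a placement problem, one way to repair it is a manual reassignment; a rough sketch (the target broker ID and file name are illustrative, and older versions take --zookeeper instead of --bootstrap-server):

```
# reassign.json: move partition 1 of __consumer_offsets onto broker 1
# (the broker that is still alive in this example).
cat > reassign.json <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "__consumer_offsets", "partition": 1, "replicas": [1] }
  ]
}
EOF

kafka-reassign-partitions \
  --bootstrap-server localhost:9092 \
  --reassignment-json-file reassign.json \
  --execute
```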

I have configured partitions = 1, replicas = 3, and ISR = 1.

Please do help me sort out this issue

Do you have min ISR as 1 across the cluster, not just at the topic level?
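For reference, a cluster-wide default can also be set dynamically with kafka-configs (or by putting min.insync.replicas=1 into each broker's server.properties); a sketch, assuming the default port:

```
# Sketch: set min.insync.replicas=1 as a cluster-wide broker default
# so topics without a per-topic override inherit it.
kafka-configs \
  --bootstrap-server localhost:9092 \
  --alter \
  --entity-type brokers \
  --entity-default \
  --add-config min.insync.replicas=1
```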

And here is what my server.properties file looks like:

```
broker.id=1

listeners=PLAINTEXT://master.mr.com:9092
advertised.listeners=PLAINTEXT://master.mr.com:9092
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600

log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
#log.flush.interval.messages=10000
#log.flush.interval.ms=1000

log.retention.hours=168
#log.retention.bytes=1073741824

log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

zookeeper.connect=master.mr.com:2181
zookeeper.connection.timeout.ms=18000

#confluent.metrics.reporter.bootstrap.servers=localhost:9092
#confluent.metrics.reporter.topic.replicas=1

group.initial.rebalance.delay.ms=0
```

I have tried many combinations of min.insync.replicas and replication.factor.