Hi, I am new to Kafka and running into something that I can’t make sense of. I have a Kafka topic containing many millions of messages. The partition count is 6, and I am reading it with 6 consumers in the same consumer group. However, read performance is 6x slower with 6 consumers than with a single consumer. Does anyone have any idea what might be causing this? Or maybe I am missing some understanding here.
Why do you think it’s six times slower?
Have you checked the logs to see if anything weird is happening with the consumer group protocol? Are they all joining sensibly and remaining joined?
(Just as one possible example: if you accidentally coded a loop incorrectly and were re-subscribing before each consumer.poll(), you would be constantly rebalancing your consumer group, and that would destroy performance. But something like that would show up in the logs as endless rebalance messages. You should see all your consumers join once, and then things should be quiet and stable most of the time.)
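To make that anti-pattern concrete, here is a minimal Python sketch. The `FakeConsumer` stub and the topic name are made up for illustration; with a real client (e.g. kafka-python or confluent-kafka), each `subscribe()` call before `poll()` is what would force the group to keep rebalancing.

```python
class FakeConsumer:
    """Stand-in for a Kafka consumer that just counts subscribe calls."""
    def __init__(self):
        self.subscribe_calls = 0

    def subscribe(self, topics):
        # With a real client, each of these calls can trigger a group rebalance.
        self.subscribe_calls += 1

    def poll(self, timeout_ms=1000):
        return {}  # pretend there are no records

def buggy_loop(consumer, iterations):
    for _ in range(iterations):
        consumer.subscribe(["my-topic"])  # BUG: re-subscribes on every pass
        consumer.poll()

def correct_loop(consumer, iterations):
    consumer.subscribe(["my-topic"])      # subscribe once, up front
    for _ in range(iterations):
        consumer.poll()

bad, good = FakeConsumer(), FakeConsumer()
buggy_loop(bad, 100)
correct_loop(good, 100)
print(bad.subscribe_calls, good.subscribe_calls)  # → 100 1
```

With 6 real consumers all doing the buggy version, the group never settles, so every instance spends its time rejoining instead of fetching.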
What they said. You’ll need to check all the broker logs, as any of the brokers could be the consumer group coordinator.
Are you measuring the throughput of a single consumer? I.e., do all 6 instances in aggregate have the same throughput as one?
I’m measuring the throughput of each consumer, and I can literally watch the throughput of all the consumers drop each time I add a new one.
And the production is steady state?
Yep, there are about 1 million messages in the queue at any given time. I am never catching up and emptying it.
I will check the logs and see if I can tell what is going on there.
Sure, the consumption rate of each consumer will drop, since partitions are revoked from it as each new instance starts, but I’d expect the total aggregate across all of them to remain about the same.