Understanding how Kafka load balances requests among multiple pods in a deployment

Hello everyone,
I have a question.
Can a consumer client read from partition followers? For example, if I have 6 pods (instances) of a deployment (Kubernetes env) that consume the same topics, how does Kafka load balance their requests?

Prior to Kafka 2.4, consumers always read from the partition leader. Since the introduction of KIP-392, fetching from a follower replica is also possible.


By default consumers still read from the leader, but this can be changed with the broker-side replica.selector.class setting.
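For reference, here is a rough sketch of the settings KIP-392 involves, written as plain Python dictionaries so the key names are easy to see. The keys replica.selector.class, broker.rack and client.rack and the RackAwareReplicaSelector class name come from KIP-392. The rack name, group id and bootstrap address are placeholders I made up, and to_properties is just a hypothetical helper for printing.

```python
# Broker side (these keys go in each broker's server.properties).
broker_settings = {
    # Rack / AZ this broker runs in (placeholder value).
    "broker.rack": "us-east-1a",
    # Selector shipped with Kafka that prefers a replica in the client's rack.
    "replica.selector.class":
        "org.apache.kafka.common.replica.RackAwareReplicaSelector",
}

# Consumer side (set per pod, e.g. from the node's zone label).
consumer_settings = {
    "bootstrap.servers": "kafka:9092",   # placeholder address
    "group.id": "my-consumer-group",     # placeholder group id
    # Must match the broker.rack of the brokers in the same AZ.
    "client.rack": "us-east-1a",
}


def to_properties(settings):
    """Render a settings dict in key=value form, one entry per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


print(to_properties(broker_settings))
print(to_properties(consumer_settings))
```

If client.rack is left unset on the consumer, it keeps fetching from the leader even when the selector is configured on the brokers.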


This selector can be used to make fetching rack-aware, so that the broker tries to find a replica closer to the client.
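To make the idea concrete, here is a simplified sketch of rack-aware selection. It is not the broker's actual code: the Replica class and select_read_replica function are hypothetical names, the input list is assumed to contain only replicas that are eligible to serve the fetch, and preferring the leader on a tie is my own assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    broker_id: int
    rack: Optional[str]
    log_end_offset: int
    is_leader: bool = False


def select_read_replica(replicas: List[Replica],
                        client_rack: Optional[str]) -> Replica:
    """Pick the replica a consumer in client_rack should fetch from."""
    leader = next(r for r in replicas if r.is_leader)
    if client_rack is None:
        # No rack information from the client: fall back to the leader.
        return leader
    same_rack = [r for r in replicas if r.rack == client_rack]
    if not same_rack:
        # No replica in the client's rack: fall back to the leader.
        return leader
    # Most caught-up replica in the client's rack; leader wins a tie.
    return max(same_rack, key=lambda r: (r.log_end_offset, r.is_leader))


replicas = [
    Replica(broker_id=1, rack="az-a", log_end_offset=100, is_leader=True),
    Replica(broker_id=2, rack="az-b", log_end_offset=100),
    Replica(broker_id=3, rack="az-c", log_end_offset=98),
]
print(select_read_replica(replicas, "az-b").broker_id)  # same-rack follower
print(select_read_replica(replicas, "az-d").broker_id)  # no match, leader
print(select_read_replica(replicas, None).broker_id)    # no rack, leader
```

Note that this only decides which replica a consumer fetches from. It does not change which partitions each pod is assigned.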


If you do decide to go down this path, I would recommend closely monitoring latency, because it can easily go up rather than down: followers only receive records after replicating them from the leader.

** my observation/assumption **
This approach is more about reducing network costs, by having clients read from a replica in their own AZ, than about reducing latency.

Thank you for your ongoing support