Kafka - topic not having replicas in sync post upgrade to 2.8

Hi all, I just upgraded our Apache Kafka brokers from 2.5 to 2.8. The upgrade was a success for the most part, except that a single partition of the `__consumer_offsets` topic does not have all of its replicas in sync. The broker that is supposed to be in sync, but is not, throws the following exception:

```
kafka.common.OffsetsOutOfOrderException: Out of order offsets found in append to __consumer_offsets-12: List(14240393, 14240394, 14240395, 14240396, 14240397, 14240398, 14240399, 14240400, .... <a lot of offsets> )
```
I tried the following:
1. Restarted the out-of-sync broker (i.e., the Kafka service).
2. Restarted the leader (the in-sync broker) of this partition.
3. Deleted the `__consumer_offsets-12` partition's directory from the data directory on the out-of-sync broker and restarted the broker.
4. Generated and executed a reassignment with `kafka-reassign-partitions.sh`, and eventually cancelled it to revert the state.
Any help in fixing this under-replicated partition is appreciated :slightly_smiling_face:
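
For anyone reproducing this, here is a minimal sketch of how the lagging replica could be confirmed from the output of `kafka-topics.sh --describe` (the CLI also has an `--under-replicated-partitions` filter). It assumes the tab-separated `Partition: / Replicas: / Isr:` line layout printed by Kafka 2.x. The function name and the sample broker ids are hypothetical and not taken from the cluster above.

```python
import re

# Matches one partition line of `kafka-topics.sh --describe` output.
# Assumption: the "Topic: / Partition: / Replicas: / Isr:" layout of Kafka 2.x.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)"
    r".*?Replicas:\s*(?P<replicas>[\d,]+)\s+Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Turn a comma-separated broker list such as '1,2,3' into a set of ints."""
    return {int(x) for x in csv.split(",") if x}


def out_of_sync_replicas(describe_output):
    """Return {(topic, partition): sorted broker ids that are missing from the ISR}."""
    lagging = {}
    for line in describe_output.splitlines():
        m = PARTITION_LINE.search(line)
        if not m:
            continue  # header lines and blanks carry no Replicas/Isr pair
        missing = _ids(m["replicas"]) - _ids(m["isr"])
        if missing:
            lagging[(m["topic"], int(m["partition"]))] = sorted(missing)
    return lagging


# Illustrative input only: broker ids 1, 2, 3 are made up.
sample = (
    "\tTopic: __consumer_offsets\tPartition: 11\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,2,3\n"
    "\tTopic: __consumer_offsets\tPartition: 12\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,3\n"
)
print(out_of_sync_replicas(sample))
```

Only partitions whose ISR is smaller than the replica list are reported, so a healthy topic yields an empty dict.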

Did you do a rolling upgrade, or just shut down and upgrade all brokers in one go?
One way I can think of to resolve the issue is to set the retention on the `__consumer_offsets` topic so that the bad entries expire, as long as they are older than the most recent entries for your consumers.

It was a rolling restart.
Yeah, I read about that approach of temporarily changing the retention period, but it can adversely affect consumers that deal with high volumes, since their offsets move quickly.

But you aren't setting retention to now, just to a point shortly after the incident.

Didn't get you… set retention to what? The incident already occurred.

If it happened 24 hours ago, try setting the retention to expire all data older than 23 hours.
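
A rough sketch of that arithmetic, assuming the override is applied through the topic-level `retention.ms` setting with `kafka-configs.sh`. The helper names, the one-hour margin, and the `localhost:9092` address are placeholders for illustration. One caveat to verify first: `__consumer_offsets` uses `cleanup.policy=compact` by default, and `retention.ms` only drives deletion when the cleanup policy includes `delete`, so this may not be enough on its own.

```python
from datetime import timedelta


def retention_ms(hours_since_incident, safety_margin_hours=1):
    """Retention window, in milliseconds, that ends just after the incident."""
    keep_hours = hours_since_incident - safety_margin_hours
    if keep_hours <= 0:
        raise ValueError("incident is too recent to expire via retention")
    return int(timedelta(hours=keep_hours).total_seconds() * 1000)


def alter_retention_command(bootstrap_server, topic, ms):
    """Build (but do not run) the kafka-configs.sh argument list."""
    return [
        "kafka-configs.sh",
        "--bootstrap-server", bootstrap_server,
        "--entity-type", "topics",
        "--entity-name", topic,
        "--alter",
        "--add-config", f"retention.ms={ms}",
    ]


# Incident 24 hours ago: keep only the last 23 hours of data.
ms = retention_ms(24)
print(ms)  # 82800000
# "localhost:9092" is a placeholder bootstrap address.
print(" ".join(alter_retention_command("localhost:9092", "__consumer_offsets", ms)))
```

Once the replica has caught up, the override would need to be removed again (for example with `--delete-config retention.ms`) so the topic returns to its previous settings.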

Ah I get it now. Thanks for the suggestion :slightly_smiling_face: