Hi
I need to create a Kafka producer that has to store its state somewhere. I know that Kafka can store such state in a compacted topic. Compacted topics have known drawbacks like no actual guarantees about getting the latest value for the key, which is crucial for my application (otherwise exactly-once delivery isn't guaranteed). Has this changed or not?
I've also read that a RocksDB key-value store was added to Kafka at some point. Maybe it could be used to store producer state? And can the state update and the message send happen atomically, within a transaction? From what I've read it seems possible, but I'm not sure.
Compacted topics have known drawbacks like no actual guarantees about getting the latest value for the key
Yeah, log compaction never runs on the active segment and only happens asynchronously on closed segments, so you have to read to the end of the topic to be sure you have the latest value.
You can produce to multiple topics in a transaction, yep.
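Something along these lines with the plain Java producer (the topic names, broker address and transactional.id are just placeholders for this sketch):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;
import org.apache.kafka.common.serialization.StringSerializer;

public class TwoTopicTxProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");      // assumed broker address
        props.put("transactional.id", "my-app-producer-1");    // must be stable and unique per producer instance
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();

        producer.beginTransaction();
        try {
            // both records become visible to read_committed consumers only if the TX commits
            producer.send(new ProducerRecord<>("output-topic", "key-1", "payload"));       // assumed topic
            producer.send(new ProducerRecord<>("state-changelog", "key-1", "new-state"));  // assumed topic
            producer.commitTransaction();
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
            // fatal: e.g. another instance took over our transactional.id, can't recover here
            producer.close();
        } catch (KafkaException e) {
            // transient: abort so neither record becomes visible, then retry if you want
            producer.abortTransaction();
        }
        producer.close();
    }
}
```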
As for storing state locally, you'll have to implement that yourself; RocksDB is one option, as mentioned above.
Okay, so yeah, to be sure you've got the absolute latest message for a given key, you have to consume to the end of the topic partition, I'm afraid.
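A rough sketch of that "read to the end" restore with a plain consumer, assuming string keys/values and a single known partition (broker address and topic layout are placeholders):

```java
import java.time.Duration;
import java.util.*;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StateRestore {
    /** Reads the changelog partition from the beginning up to its current end offset
     *  and keeps only the last value seen per key. */
    static Map<String, String> restore(String topic, int partition) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                 // assumed broker address
        props.put("group.id", "state-restore-" + UUID.randomUUID());      // throwaway group, nothing is committed
        props.put("enable.auto.commit", "false");
        props.put("isolation.level", "read_committed");                   // only see committed transactions
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        Map<String, String> state = new HashMap<>();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition(topic, partition);
            consumer.assign(Collections.singletonList(tp));
            consumer.seekToBeginning(Collections.singletonList(tp));
            long endOffset = consumer.endOffsets(Collections.singletonList(tp)).get(tp);

            // poll until we've caught up with the end offset captured above
            while (consumer.position(tp) < endOffset) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofMillis(200))) {
                    if (rec.value() == null) {
                        state.remove(rec.key());           // tombstone: key was deleted
                    } else {
                        state.put(rec.key(), rec.value()); // later records overwrite earlier ones
                    }
                }
            }
        }
        return state;
    }
}
```

After that loop, `state` holds the latest value per key (tombstones removed), which is what you'd load into RocksDB or just keep in the hash map.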
Whether you need RocksDB or just an in-memory hash map depends on other app requirements/properties, like state size etc.
But if you write all state updates to a compacted changelog topic and read it to the end on startup (as pointed out earlier), you can always rebuild the latest state on startup. And during processing you use a single TX to write to the actual producer topic plus the changelog topic, keeping both in sync, and update the local state "on the side" (see the sketch below).
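Per processed message it could look roughly like this, continuing the placeholder topic names from above and using the map returned by the restore sketch as the local store:

```java
import java.util.Map;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.ProducerFencedException;

public class Processing {
    /** Writes the business message and the matching state update in one transaction,
     *  then mirrors the update into the local store "on the side". */
    static void process(KafkaProducer<String, String> producer,
                        Map<String, String> localState,
                        String key, String payload, String newState) {
        producer.beginTransaction();
        try {
            producer.send(new ProducerRecord<>("output-topic", key, payload));       // assumed topic
            producer.send(new ProducerRecord<>("state-changelog", key, newState));   // assumed changelog topic
            producer.commitTransaction();
            // update local state only after the commit, so the in-memory view
            // never gets ahead of what's durable in the changelog
            localState.put(key, newState);
        } catch (ProducerFencedException e) {
            producer.close();               // fenced by another instance: fatal, re-create the producer
            throw e;
        } catch (KafkaException e) {
            producer.abortTransaction();    // neither record becomes visible; safe to retry
            throw e;
        }
    }
}
```

On restart you replay the changelog to the end again and the map is back in sync with whatever was actually committed.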