How does Kafka implement KTables?

Hello, how does Kafka implement KTables? Are they implemented behind the scenes on a RocksDB instance, like how KSQL implements tables? Is there any article that explains how KTables are implemented, and something that contrasts them with KSQL tables?

ksqlDB is a Kafka Streams application. Kafka Streams has the concepts of KTables and KStreams.

To make a KTable work efficiently, Kafka Streams keeps a key/value store for lookups. This can be backed by an in-memory map or by RocksDB, and the API is designed so that other implementations could be built as well.
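A minimal sketch of the idea: a pluggable key/value store interface with an in-memory implementation backed by a `TreeMap`. The names (`SimpleKeyValueStore`, `InMemoryStore`) are hypothetical and much simpler than Kafka Streams' real `KeyValueStore` API; this only illustrates why different backends (in-memory, RocksDB) can sit behind the same lookup interface.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch, NOT the real Kafka Streams interface: the point is
// that lookups go through a small key/value abstraction, so the backend
// (in-memory map, RocksDB, something else) is swappable.
interface SimpleKeyValueStore<K, V> {
    V get(K key);
    void put(K key, V value);
}

// In-memory variant backed by a TreeMap (a red-black tree), similar in
// spirit to Kafka Streams' in-memory store.
class InMemoryStore<K extends Comparable<K>, V> implements SimpleKeyValueStore<K, V> {
    private final NavigableMap<K, V> map = new TreeMap<>();
    public V get(K key) { return map.get(key); }
    public void put(K key, V value) { map.put(key, value); }
}
```

A RocksDB-backed variant would implement the same interface but keep the data on local disk instead of on the heap.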

The durability of this KTable state store comes from a compacted topic within Kafka. In a compacted topic, any message whose key appears again later can be removed. Note that compaction only makes the older record eligible for removal; it doesn't mean the record has been removed yet.

Now, if you use the kafka-clients library for just a producer/consumer, you do not have a state store at your disposal. You could still create a compacted topic, which allows replaced messages to be removed - but the table materialization has to be implemented in your client, just like Kafka Streams did for its state management.
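For illustration, here is roughly what that client-side materialization looks like, as a hedged sketch with stdlib types only. The class and method names are made up, and the `apply` calls stand in for records a real `KafkaConsumer.poll()` loop would hand you; a null value models a tombstone, which deletes the key under compacted-topic semantics.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the table a plain producer/consumer app would
// have to maintain itself: fold each record from the compacted topic into
// a local map, newest value per key winning.
class ClientSideTable {
    private final Map<String, String> table = new HashMap<>();

    // Called once per record, in topic order (as a poll loop would do).
    void apply(String key, String value) {
        if (value == null) {
            table.remove(key);     // tombstone: the key was deleted
        } else {
            table.put(key, value); // newer record replaces any older one
        }
    }

    String lookup(String key) { return table.get(key); }
}
```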

So Kafka Streams implements KTables with two key components: a key/value store for fast access and a compacted Kafka topic for durability.

Thanks. While I got the durability part, what I'm still puzzled about is how Kafka Streams makes the lookup efficient. You mentioned it can be done using an in-memory cache or a RocksDB instance. Say I sign up for Confluent Cloud or use AWS MSK - what will these installations do for efficient streaming? How will this be implemented internally by Kafka? If there are any links that explain how KTables are implemented, please share them. Again, I appreciate your feedback, many thanks.

On the Kafka side of things, a KTable is just a compacted topic. In your app, depending on whether your KTable is backed by an InMemoryKeyValueStore or a RocksDbStore, your app will either maintain an in-memory TreeMap (a red-black tree-based Map) or a RocksDB database, which stores your data on disk, locally on each instance. Note that the broker (whether Confluent Cloud, MSK, or self-hosted) only stores the compacted topic; the lookup store lives inside your application instances.
Lookups are efficient because querying either type of store is fast: O(log n) for the TreeMap, and fast disk-backed point lookups for RocksDB. The RocksDB stores are recommended for data sets that are larger than available memory, or where restoring the table from the compacted topic would be prohibitively expensive (e.g. when you have a low cardinality of keys but a very high update rate).

Thanks. So it's the responsibility of the app / app developer to stand up something within the app to answer questions/queries, and to use the compacted Kafka topic only for durability (app restarts and such).