AWS lambda to confluent cloud latency issues :
I am currently using basic version of cluster on Confluent cloud and I only have one topic with 9 partitions. I have a REST Api that’s setup using AWS lambda service which publishes messages to Kafka.
Currently i am stress testing pipeline with 10k requests per second, I found that Latency is shooting up to 20-30 seconds to publish a record of size 1kb. Which is generally 300 ms for a single request.
I added producer configurations like linger.ms - 500 ms and batch.size to 100kb. I see some improvement (15-20 seconds per request) but I feel it’s still too high.
Is there anything that I am missing or is it something with the basic cluster on confluent cloud? All of the configurations on the cluster were default.
I would 1st start with evaluating your workload is properly sized. There is a great resource Confluent provides that will indicate how much resources you need https://eventsizer.io/granular
Also I would incrementally test to see if there is a linear degradation at 1k, 5k, 10k. My sense is though that the cluster is undersized and you need to go through a sizing effort to make sure you have enough resources (CKUs)