Comparing KafkaJS, node-rdkafka, or a Java microservice for high throughput?

Hello Confluent Team,

I'd like to have a quick talk with someone about which Kafka client to use if my throughput needs to be very high, let's say 5 events per user per second. (This can scale up to 100K users, so that's 500K events per second, and even to 1 million users, so I need to understand these limitations well.)
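To make the numbers above concrete, here is a rough back-of-envelope sketch of the load at each user count. The `estimateLoad` helper and the 200-byte average event size are assumptions for illustration only; the thread does not say how large an event is.

```typescript
// Back-of-envelope load estimate for the figures quoted above.
// ASSUMPTION: the 200-byte average event size is hypothetical, not from the thread.
interface LoadEstimate {
  eventsPerSecond: number;
  megabytesPerSecond: number;
}

function estimateLoad(
  users: number,
  eventsPerUserPerSecond: number,
  avgEventBytes: number
): LoadEstimate {
  const eventsPerSecond = users * eventsPerUserPerSecond;
  // Decimal megabytes (1 MB = 1,000,000 bytes), payload only, no protocol overhead.
  const megabytesPerSecond = (eventsPerSecond * avgEventBytes) / 1000000;
  return { eventsPerSecond, megabytesPerSecond };
}

const at100kUsers = estimateLoad(100000, 5, 200);
const at1mUsers = estimateLoad(1000000, 5, 200);

console.log(at100kUsers); // 500,000 events/s
console.log(at1mUsers);   // 5,000,000 events/s
```

Swapping in a measured event size gives a first estimate of the raw payload bandwidth the topic would need to absorb.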

I'm basically comparing KafkaJS and node-rdkafka, or should I deploy a Java microservice to interact with Kafka? Am I thinking about this the right way?

Any suggestions would be much appreciated. Thank you!

That’s not super high.

The number of users will be an issue, but that’s a TCP stack issue as much as a Kafka issue.

• Okay, so first, on whether this is high: at scale these numbers are okay, and I can scale even to millions with no problem, right? (We can also definitely decrease the events per user per second.)
• As for the number of users, initially I will be using NodeJS, so would it handle the high throughput better because it is asynchronous (non-blocking I/O)? If not, what are some common approaches for improving performance?
Thanks in advance! Also, for more context, the scenario is as follows:

  1. You are playing Fortnite or some other game and using our Overwolf app; we record events in real time, like player location, player kills, placement, rank, etc.
  2. We send these to our backend via a socket, let's say, or even via a Kafka client on the frontend if that's faster.
  3. The goal is to publish to a Kafka topic, which will be ingested into Apache Pinot automatically.
  4. Simultaneously, users are querying Pinot in real time to get their current leaderboard position, tournament (or whatever gamified event you're playing) stats, and more.
    The system is fully event-sourced with CQRS and event-driven, so it will also use Kafka to send messages across services, as well as to ingest into Pinot.

Go with the user-to-socket option. Kafka is limited in the number of clients it can handle, and that will be a roadblock for you.
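One way to picture the user-to-socket option: many socket connections feed a small in-memory batcher on the backend, and only the backend holds a Kafka producer. The sketch below is a minimal illustration under that assumption, not a definitive implementation. `GameEvent`, `BatchSink`, `EventBatcher`, and `demo` are hypothetical names, and the sink is a stand-in for whatever actually publishes (in a real service it could wrap a Kafka client's producer send call).

```typescript
// Minimal in-memory batcher: many socket handlers push events in,
// one shared sink receives them in batches.
// ASSUMPTION: all names here are hypothetical and only for illustration.
interface GameEvent {
  userId: string;
  type: string; // e.g. "kill", "location", "placement"
  payload: unknown;
  timestamp: number;
}

// Whatever actually publishes a batch, e.g. a wrapper around a Kafka producer.
type BatchSink = (batch: GameEvent[]) => Promise<void>;

class EventBatcher {
  private buffer: GameEvent[] = [];
  private readonly sink: BatchSink;
  private readonly maxBatchSize: number;

  constructor(sink: BatchSink, maxBatchSize: number) {
    this.sink = sink;
    this.maxBatchSize = maxBatchSize;
  }

  // Called from each socket handler; flushes once the buffer is full.
  async add(event: GameEvent): Promise<void> {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxBatchSize) {
      await this.flush();
    }
  }

  // Hands the buffered events to the sink and starts a fresh buffer.
  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    await this.sink(batch);
  }
}

// Demo with a fake sink that only records the size of each batch.
async function demo(): Promise<number[]> {
  const batchSizes: number[] = [];
  const batcher = new EventBatcher(async (batch) => {
    batchSizes.push(batch.length);
  }, 3);

  for (let i = 0; i < 7; i++) {
    await batcher.add({ userId: `user-${i}`, type: "kill", payload: {}, timestamp: i });
  }
  await batcher.flush(); // send the remainder
  return batchSizes;
}
```

Calling `demo()` with seven events and a batch size of three yields batches of 3, 3, and 1. The sketch leaves out retries and a time-based flush, both of which a real gateway would likely need.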

As for which language, it really doesn’t matter. All the major clients are pretty much equal in functionality and speed.

Okay, makes sense. I basically feel like NodeJS fits well here, since we have a NestJS backend and NodeJS is lightweight and can process requests asynchronously. So in terms of the TCP stack rather than Kafka itself, as you put it, NodeJS probably has a good advantage here, right?

Not really. All the clients are async by default.

Yes, sorry, I meant NodeJS and how it handles incoming requests, unrelated to Kafka. For example, PHP is synchronous in the way it processes external requests, so this is basically before you even reach Kafka.

Alright then, amazing! Thanks a lot for the info!