Are there any specific guidelines for machine sizing and storage for controllers in a Kafka cluster?

Hello everyone! We are planning to deploy a new cluster on v3.5 in KRaft mode. From my understanding, for production workflows, a machine is either a controller or a broker, depending on its process.roles setting. Does that mean that controllers don’t participate in the event streaming process (replication, data storage, etc.)? If so, are there any guidelines for machine sizing, storage, etc. that apply only to controllers? Let’s say I’m planning a 20-broker cluster: should my cluster be 20 brokers + 3 controllers instead?

• Hi - If process.roles is set to broker, the server acts as a broker.
• If process.roles is set to controller, the server acts as a controller.
• If process.roles is set to broker,controller, the server acts as both a broker and a controller.
• If process.roles is not set at all, it is assumed to be in ZooKeeper mode.
Kafka servers that act as both brokers and controllers are referred to as “combined” servers. Combined servers are simpler to operate for small use cases like a development environment. The key disadvantage is that the controller will be less isolated from the rest of the system. For example, it is not possible to roll or scale the controllers separately from the brokers in combined mode. Combined mode is not recommended in critical deployment environments.
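
To make the four cases above concrete, here is a minimal Python sketch that classifies a server from its process.roles value. The function name kraft_mode is made up for illustration and is not part of Kafka; it only mirrors the rules listed above.

```python
def kraft_mode(process_roles=None):
    """Classify a server from its process.roles value (hypothetical helper)."""
    # Split the comma-separated value and ignore blanks and surrounding spaces.
    roles = {r.strip() for r in (process_roles or "").split(",") if r.strip()}
    if not roles:
        # process.roles not set at all: assumed to be ZooKeeper mode.
        return "zookeeper"
    unknown = roles - {"broker", "controller"}
    if unknown:
        raise ValueError(f"unknown role(s): {sorted(unknown)}")
    if roles == {"broker", "controller"}:
        # Both roles on one server: a "combined" server.
        return "combined"
    # Exactly one role left: "broker" or "controller" (isolated mode).
    return next(iter(roles))


print(kraft_mode("broker"))             # broker
print(kraft_mode("broker,controller"))  # combined
print(kraft_mode(None))                 # zookeeper
```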

I would like to add to Bharath’s point that we currently recommend isolated mode (where process.roles is strictly controller or broker) for production deployments.

You can see the latest information pertaining to KRaft here: https://docs.confluent.io/platform/current/kafka-metadata/kraft.html
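
As a rough illustration of what isolated mode means for the 20 brokers + 3 controllers example from the question, the Python sketch below generates one server.properties fragment per machine. The hostnames, ports, node id numbering, and the isolated_cluster function are all assumptions made for this example. The fragments are deliberately incomplete: they use plaintext listeners and omit log.dirs, security, and sizing-related settings, so treat them as a starting point rather than a production configuration.

```python
def isolated_cluster(num_brokers=20, num_controllers=3):
    """Return {hostname: server.properties text} for an isolated-mode sketch."""
    # Hypothetical hostnames; 9093 for the controller listener is an assumption.
    controllers = {
        i: f"controller-{i}.example.internal"
        for i in range(1, num_controllers + 1)
    }
    # Every node, broker or controller, is given the same quorum voter list.
    voters = ",".join(f"{nid}@{host}:9093" for nid, host in controllers.items())
    common = [
        f"controller.quorum.voters={voters}",
        "controller.listener.names=CONTROLLER",
        # Plaintext only to keep the sketch short.
        "listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT",
    ]

    configs = {}
    for nid, host in controllers.items():
        configs[host] = "\n".join(
            ["process.roles=controller", f"node.id={nid}"]
            + common
            + [f"listeners=CONTROLLER://{host}:9093"]
        )
    for i in range(1, num_brokers + 1):
        host = f"broker-{i}.example.internal"
        nid = 100 + i  # keep broker ids distinct from controller ids
        configs[host] = "\n".join(
            ["process.roles=broker", f"node.id={nid}"]
            + common
            + [
                f"listeners=PLAINTEXT://{host}:9092",
                f"advertised.listeners=PLAINTEXT://{host}:9092",
                "inter.broker.listener.name=PLAINTEXT",
            ]
        )
    return configs


configs = isolated_cluster()
print(len(configs))  # 23 machines: 20 brokers + 3 controllers
print(configs["controller-1.example.internal"])
```

In this layout only the three controller machines carry process.roles=controller, and the brokers only point at them through controller.quorum.voters.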

Thanks for the clarification. Are there any plans to support combined mode (broker,controller) in production?

It is under consideration, and we are open to learning more about use cases and getting feedback. Would you be able to describe some of the things you are looking for in a combined mode deployment in production?

Like fewer machines to manage.

Hello, I totally understand that desire, and thank you for that feedback.

Are you currently using KRaft in isolated mode at all?

No, I am not using KRaft at all currently (because we have a ZooKeeper-based deployment), but I am considering switching to it in the future.

Same use case here. We have Terraform recipes that deploy our clusters across different environments. We haven’t switched to KRaft yet, as it would require quite a few changes to the Terraform code to support it. In combined mode we could probably just use the same deployment code.

Thank you for that background. Are these completely custom recipes that deploy Confluent Platform?