Ensuring consistent Kafka environments

oWoods · August 25, 2021, 11:15am

Question for you folks. For those that have an automated environment, how do you ensure you have consistent environments? On release, do you tear down the Kafka cluster and recreate it entirely from code stored in version control, or do you ‘upgrade’ the Kafka cluster in place using a script?

oWoods · August 25, 2021, 1:08pm

So what about the application-specific configurations: topics with a variety of configurations, ACLs/RBAC, etc.?
Say you release a new version of your application that needs 2 new topics and has a ksql job that requires various permissions.

dJones · August 25, 2021, 2:22pm

That stuff I've seen people script, often with Ansible. It's the stateful side of Kafka, and definitely the harder piece, which IaC (at least in AWS) usually doesn't extend to.

johnC · August 25, 2021, 3:15pm

I’d suggest taking a look at: https://github.com/kafka-ops/julie

johnC · August 25, 2021, 3:54pm

As well as: https://github.com/simplesteph/kafka-security-manager

johnC · August 25, 2021, 5:23pm

We manage topics, ACLs, and connectors in Terraform, so pretty much everything is in code.
Data is persisted across restarts, so there's no need to re-create the topics after a broker has been restarted, if that's what you were getting at.
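
To illustrate the "everything is in code" idea for topics: tools in this space generally compare a declared state against what the cluster reports and plan the difference. Here is a minimal sketch of that comparison in plain Python. It is not how Terraform or any of the linked projects is implemented, and `TopicSpec` and `plan_changes` are hypothetical names made up for this example. In practice the `actual` list would come from the cluster via an admin client.

```python
from dataclasses import dataclass, field


@dataclass
class TopicSpec:
    """Hypothetical description of a topic as it would be kept in version control."""
    name: str
    partitions: int
    config: dict = field(default_factory=dict)


def plan_changes(desired, actual):
    """Compare declared topics with existing ones.

    Returns three sorted lists of topic names:
    (to_create, to_update, unmanaged).
    """
    desired_by_name = {t.name: t for t in desired}
    actual_by_name = {t.name: t for t in actual}

    # Declared in code but missing from the cluster.
    to_create = sorted(desired_by_name.keys() - actual_by_name.keys())
    # Present on the cluster but not declared anywhere in code.
    unmanaged = sorted(actual_by_name.keys() - desired_by_name.keys())
    # Present in both, but partitions or config differ.
    to_update = sorted(
        name
        for name in desired_by_name.keys() & actual_by_name.keys()
        if desired_by_name[name] != actual_by_name[name]
    )
    return to_create, to_update, unmanaged


if __name__ == "__main__":
    desired = [
        TopicSpec("orders", 6, {"retention.ms": "604800000"}),
        TopicSpec("payments", 3),
    ]
    actual = [
        TopicSpec("orders", 3, {"retention.ms": "604800000"}),
        TopicSpec("legacy", 1),
    ]
    print(plan_changes(desired, actual))
```

Reporting unmanaged topics instead of deleting them is a deliberate choice in this sketch, since topics hold data and an automatic delete is rarely what you want.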

davidH · August 25, 2021, 6:27pm

For the env, we run in k8s and tear the brokers down one by one. We wait until each broker stabilizes (as well as the URPs) and then move on to the next broker until the whole cluster stabilizes. For the ACLs, we use kafka-security-manager. Our topics and consumer groups are not in code yet.
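
The one-by-one restart described above could be sketched roughly like this. It is a simplified illustration under assumptions, not the actual tooling: `restart_broker` and `count_urps` are hypothetical callbacks you would supply yourself (for example, deleting a pod and reading the under-replicated partition count from your metrics), and the function only waits on URPs, not on any other health signal.

```python
import time


def rolling_restart(brokers, restart_broker, count_urps,
                    poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Restart brokers one at a time, waiting for URPs to reach zero in between.

    restart_broker(broker) and count_urps() are caller-supplied callbacks
    (hypothetical here). Returns the list of brokers restarted, in order.
    """
    restarted = []
    for broker in brokers:
        restart_broker(broker)
        restarted.append(broker)

        # Do not touch the next broker until replication has caught up.
        waited = 0.0
        while count_urps() > 0:
            if waited >= timeout_seconds:
                raise TimeoutError(
                    f"URPs did not clear within {timeout_seconds}s after restarting {broker}"
                )
            sleep(poll_seconds)
            waited += poll_seconds
    return restarted
```

Raising on timeout stops the rollout with the remaining brokers untouched, which is usually safer than continuing while partitions are still under-replicated.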

davidH · August 25, 2021, 8:07pm

I’ve been keeping an eye on this project for a bit: https://github.com/devshawn/kafka-gitops

But I am also interested in this recent Confluent article on the topic: https://www.confluent.io/blog/devops-for-apache-kafka-with-kubernetes-and-gitops/

oWoods · August 25, 2021, 9:02pm

The streaming ops project is written by a colleague of mine, as is Julie. Both are pretty cool.
I’ve been wondering whether there is a way to help users externalize configuration, particularly from Confluent Cloud. Julie helps with this to some degree, and we had some internal discussions, but they didn’t land anywhere concrete. From your perspective, what’s on the devops wish list for Kafka/Confluent?

dJones · August 25, 2021, 10:00pm

A big pain point I've seen is copying data between environments for testing. It's always tons of effort, with varying implementations, from copying the data directory to a MirrorMaker-type migration. It would be cool if there was just a CLI command for some set of topics that wasn’t too complex to configure (no Unix pipe hell).

oWoods · August 25, 2021, 11:09pm

Yeah, that’s a good idea. There is some ongoing work in Confluent Cloud to provide that using cluster links (a nice feature, as there is no extra overhead like you’d get with MirrorMaker or Replicator). Not sure if there is a CLI for that or not.