Hello guys
I’m new to Kafka, so sorry for any misunderstanding.
My use case: a 9-server cluster (3 nodes are quorum-replicated, the other 6 are just parallel nodes). Every node writes some data to its own local Postgres (separate Postgres clusters). I’d like to aggregate the data on the primary node by collecting a summary of each node’s data.
Is this a reasonable use case for Kafka/CDC/Debezium/whatever, or are they overkill here? If they are, why? And what would be better tools to use instead?
The quorum is 1 primary node plus 2 replicas of it.
The other 6 nodes are basically just on the same network and doing the same work as the ‘main’ 3, but are not technically part of the quorum cluster.
Once you create a topic, the replicas of its partitions are placed on other brokers in that cluster, assuming you are using a replication factor of 3 and have 3 or more brokers in the cluster.
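To make the replication-factor point concrete, here is a rough Python sketch. It is not Kafka's actual placement algorithm, just a simplified round-robin illustration; the function name `assign_replicas` and the broker IDs are made up for the example. The one thing it does reflect faithfully is that the replication factor cannot exceed the number of brokers.

```python
def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Simplified, hypothetical round-robin replica placement (illustration only)."""
    if replication_factor > len(broker_ids):
        # A replica set needs distinct brokers, so this cannot be satisfied.
        raise ValueError("replication factor exceeds the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker for each partition, then wrap around.
        assignment[partition] = [
            broker_ids[(partition + offset) % len(broker_ids)]
            for offset in range(replication_factor)
        ]
    return assignment


if __name__ == "__main__":
    # 3 brokers, 3 partitions, replication factor 3: every broker holds every partition.
    for partition, replicas in assign_replicas([1, 2, 3], 3, 3).items():
        print(partition, replicas)
```

With only 2 brokers and a replication factor of 3 the sketch raises an error, which is the situation to avoid when sizing the cluster.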
I see that for it to work well, it’s best to have several nodes for Kafka, several for the target store (data warehouse etc.), and maybe several for the source.
I have a storage system that is pretty much a rack of 9 nodes. 3 of them are management nodes, and 1 of those is primary at any given time.
It has usage statistics on each node
I’d like to aggregate those stats in real time on the primary (and then from there send them somewhere else or show them in a UI or …)
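Whatever ends up carrying the stats to the primary, the aggregation step itself could look roughly like this. It's a minimal sketch, assuming each node reports a small dict of stats; the field names (`node`, `used_bytes`, `capacity_bytes`) and the function `summarize` are hypothetical, not from any real tool.

```python
def summarize(events):
    """Keep the latest report per node, then total them for the primary's view."""
    latest = {}
    for event in events:  # assumed to arrive in order, newest last
        latest[event["node"]] = event
    return {
        "nodes_reporting": len(latest),
        "used_bytes": sum(e["used_bytes"] for e in latest.values()),
        "capacity_bytes": sum(e["capacity_bytes"] for e in latest.values()),
    }


if __name__ == "__main__":
    # Hypothetical sample reports; node-1 reports twice and only the newer one counts.
    events = [
        {"node": "node-1", "used_bytes": 100, "capacity_bytes": 1000},
        {"node": "node-2", "used_bytes": 250, "capacity_bytes": 1000},
        {"node": "node-1", "used_bytes": 150, "capacity_bytes": 1000},
    ]
    print(summarize(events))
```

Keeping only the latest report per node means a node that reports more often doesn't get counted twice in the totals.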