Leader driven vs Leaderless replication

Hi, can anyone list me the pros vs cons of leader driven vs leaderless replication in distributed systems?

When would you use which approach and why?

I am new to distributed systems and want to learn concepts!

Thanks in advance!

Leaderless means you have no leader. So to agree on anything the participants have to go through the whole consensus process from scratch every time which is relatively expensive until there is a global order/topology on the nodes and they can benefit from it

With leader you have other kind of problems - fallback costs of leader reelection, membership change, fast path tradeoffs, brain split convergence etc

> “So to agree on anything the participants have to go through the whole consensus process from scratch every time”
Can you describe a scenario in the form of an example for this? I want to understand this better.

The real-life example you can find out on the streets is the implementation of the LWT in Cassandra, or read original papers about the implementation of Google Data Store (MegaStore)

Thanks a lot! I’ll read them both.

There are setups that have consensus to determine the leader and then writes are handled automatically by the leader which seems to be sort of in the middle between the two. Is that a distinct concept or does it fall into one of the above buckets

Or is that a weaker version of leaderless, while the stronger version is all writes go through consensus

> So to agree on anything the participants have to go through the whole consensus process from scratch every time which is relatively expensive until there is a global order/topology on the nodes and they can benefit from it
so in case of leaderless, if nodes are involved in consensus everytime, I am assuming that they might be using some consensus protocol like Raft/Paxos. Don’t the consensus protocols also themselves elect leaders to perform the consensus? So is it true leaderless?

> there are setups that have consensus to determine the leader and then writes are handled automatically by the leader which seems to be sort of in the middle between the two. Is that a distinct concept or does it fall into one of the above buckets
I would say it falls into the pre-determined topology class - check out aerospike design iirc

>> So to agree on anything the participants have to go through the whole consensus process from scratch every time which is relatively expensive until there is a global order/topology on the nodes and they can benefit from it
>> > so in case of leaderless, if nodes are involved in consensus everytime, I am assuming that they might be using some consensus protocol like Raft/Paxos. Don’t the consensus protocols also themselves elect leaders to perform the consensus? So is it true leaderless?

Different consensus protocols do different things. Paxos exists in many variants. Both leader and leaderless. The simplest is leaderless one-shot variant - single decree paxos.

I see. Thanks.

So whenever we say leaderless, the most reasonable thing to assume is that the nodes are most probably using some consensus protocol everytime to perform reads & writes. Now, how the consensus protocol performs consensus is the worry of the protocol design.

Well there’s also a quorum based approach like what was in the dynamo paper

That’s referred to as leaderless and doesn’t use consensus

> are most probably using some consensus protocol everytime to perform reads & writes
they may not use it
> a quorum based approach like what was in the dynamo paper
it tries to achieve eventual consistency but its not a consensus protocol by itself. I would speculate that the “Quorum Approach” does not Agree on the same value 100% of the time as it is not its purpose. even tho those systems would do pessimistic synchronous read repair in case of the divergence but even then the detection is opportunistic.

in consensus you can’t afford participants to decide on different values. for more info see https://en.wikipedia.org/wiki/Consensus_(computer_science)

Good thread. A few comments of my own for what it’s worth.

Perhaps it is useful to distinguish between a single-leader and a multi-leader system since the topic of quorum was mentioned. In a single-leader system, the leader is responsible for keeping its replicas in sync. Usually, the leader employs consensus protocols (paxos/raft/etc.) for this synchronization, but you could use a 2PC too although it does not make practical sense. In a multi-leader system, like the name says, the client may pick one node from among multiple leaders to write to (or read from). The leaders coordinate amongst themselves for getting the writes across. I have not studied many implementations here, but my understanding is that the leaders employ conflict-resolution (using logical and/or time-of-day clocks) to keep other leaders in sync. In other words, instead of employing a consensus protocol that avoids a conflict like in a single-leader system, the leaders in a multi-leader system employ resolution techniques whenever they notice a conflict.

A leaderless system is somewhat similar to a multi-leader system in that the clients are not required to identify a single leader. The difference between a multi-leader and leaderless is primarily about who is responsible for replicating the writes to other nodes. In a multi-leader system, the leaders are responsible. In a leaderless system, the client is responsible for writing to all nodes. Typically, a write and a read quorum is established to ensure the writes have made to enough nodes so you have high confidence that you will read the latest write if you read from enough number of nodes simultaneously. But the situation is fraught because the nodes you have written to are susceptible to crash-stops and omission failures omit and quorums become meaningless; so leaderless is not used in situations where linearization is a must. (Most applications may not require linearization; so leaderless may just work fine). Read/write-repairs and other techniques are used to deal with such issues. Even in leaderless systems, conflict resolution techniques are used to ensure earlier writes do not overwrite later ones instead of consensus protocols or distributed transactions.

Going back to the first question that asked, I guess you could say a single-leader can achieve the more demanding consistency models (like linearization or sequential consistency) at the cost of being slow by making a single node be responsible for achieving total order. Whereas the other systems achieve swiftness and progress by only being able to guarantee the less demanding consistency models (like causal consistency).

that seems fair to me. I guess I was using a more expanded definition like “consensus and things that try to achieve something close to consensus if you squint and are tolerant of strange issues”