Randson Software Engineer

CAP Theorem

If you search for the CAP Theorem on the internet, you're going to see three circles labeled Consistency, Availability and Partition Tolerance.

But what are these points? Why do we need to know them?

Before diving into the CAP Theorem, let's understand that a distributed system is a network that stores data on more than one node. A node could be a physical machine or a virtual machine.

And the same data is stored on several of these nodes at the same time.

When we're dealing with cloud applications, we're dealing with distributed systems. Since the nodes in these systems have to work together, it's essential to understand the CAP Theorem when designing a cloud app.

That way you can choose the data management system that best suits your needs.

Let’s take a detailed look at the three distributed system characteristics to which the CAP theorem refers.

Consistency is the first one. What is it about?

In general, it means all clients see the same data at the same time, no matter which node they connect to. Whenever data is written to one node, it must be forwarded to the other nodes before the write counts as successful. In other words, all data is replicated to the other nodes right away.
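To make this concrete, here is a minimal sketch of a write that is copied to every node before it returns. The names ReplicaNode and ConsistentCluster are made up for this example, and the nodes are plain in-memory objects rather than real machines.

```python
class ReplicaNode:
    """A hypothetical in-memory node holding key/value data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ConsistentCluster:
    """Applies every write to all nodes before acknowledging it."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, key, value):
        # Replicate synchronously: the write returns only after
        # every node has stored the new value.
        for node in self.nodes:
            node.data[key] = value

    def read(self, node_name, key):
        # A client may read from any node and gets the same value.
        for node in self.nodes:
            if node.name == node_name:
                return node.data.get(key)
        raise KeyError(node_name)


cluster = ConsistentCluster([ReplicaNode("a"), ReplicaNode("b"), ReplicaNode("c")])
cluster.write("balance", 100)
values = [cluster.read(name, "balance") for name in ("a", "b", "c")]
print(values)
```

Whichever node the client asks, the answer is the same, which is the whole point of consistency.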

And Availability?

This, at least for me, is the coolest one. It means whenever a client sends a request, it'll get a response, even if one or more nodes are down. To put it another way, every working node in the network must return a valid response, without exception.
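A rough way to picture availability is a client that tries another node when the first one is down. This is only a sketch: ServingNode and send are hypothetical names, and a real system would usually put a load balancer or client library in this role.

```python
class ServingNode:
    """A hypothetical node that answers requests while it is up."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def send(nodes, request):
    """Try each node in turn and return the first response."""
    for node in nodes:
        try:
            return node.handle(request)
        except ConnectionError:
            continue  # this node is down, try the next one
    raise ConnectionError("no node is reachable")


# Node "a" is down, but the request still gets a response.
nodes = [ServingNode("a", up=False), ServingNode("b"), ServingNode("c")]
response = send(nodes, "GET /balance")
print(response)
```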

And the last one, Partition Tolerance?

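This one is easiest to understand through a network error. A partition is a communication breakdown between nodes, such as a lost or delayed connection. Partition tolerance means that despite any number of communication breakdowns, the nodes in the system must continue to work. It's simply saying that despite network errors, the system must continue to work without crashing.

So now we know the difference between them. How do they work together? The CAP Theorem says a distributed system can deliver only two of these three characteristics at the same time.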

First, let's mix CP (which means Consistency and Partition Tolerance).

A CP system delivers consistency and partition tolerance at the expense of availability. So when a partition occurs, the system shuts down the non-consistent node, and that node stays unavailable until the partition is resolved.
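Here is a small sketch of that behavior. It assumes a majority rule (a node only answers while it can reach more than half of the cluster), which is one common way to decide which node is "non-consistent" but is not something the theorem itself prescribes. CPNode and Unavailable are hypothetical names.

```python
class Unavailable(Exception):
    """Raised when a node refuses to answer during a partition."""


class CPNode:
    """A hypothetical node that prefers consistency over availability."""

    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        self.reachable_peers = cluster_size - 1  # no partition at start
        self.data = {}

    def has_majority(self):
        # Assumption: this node plus the peers it can reach must form
        # a strict majority of the cluster.
        return (self.reachable_peers + 1) > self.cluster_size // 2

    def read(self, key):
        if not self.has_majority():
            raise Unavailable(f"{self.name} is cut off from the cluster")
        return self.data.get(key)


node = CPNode("a", cluster_size=3)
node.data["balance"] = 100
before = node.read("balance")  # works while the network is healthy

node.reachable_peers = 0  # simulate a partition that isolates this node
try:
    node.read("balance")
    refused = False
except Unavailable:
    refused = True
print(before, refused)
```

The isolated node could still return its local copy, but it refuses, because that copy might already be outdated.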

But what about AP (which means Availability and Partition Tolerance)?

This kind delivers availability and partition tolerance at the expense of consistency. When a partition occurs, all nodes remain available, but some nodes might return an outdated version of the data. When the partition is resolved, the system resyncs the nodes and repairs the inconsistencies.
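The sketch below shows a stale read during a partition and the repair afterwards. It assumes a simple version counter per key and a "highest version wins" rule for the resync. Real AP databases use different conflict resolution strategies, so treat APNode and resync as illustrative names only.

```python
class APNode:
    """A hypothetical node that always answers, even with old data."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value):
        version, _ = self.data.get(key, (0, None))
        self.data[key] = (version + 1, value)

    def read(self, key):
        return self.data[key][1]


def resync(nodes):
    """Copy the highest-versioned entry of each key to every node."""
    keys = {key for node in nodes for key in node.data}
    for key in keys:
        entries = [node.data[key] for node in nodes if key in node.data]
        newest = max(entries, key=lambda entry: entry[0])
        for node in nodes:
            node.data[key] = newest


a, b = APNode("a"), APNode("b")
a.write("balance", 100)
b.data = dict(a.data)  # both nodes are in sync before the partition

a.write("balance", 80)  # partition: only node "a" receives this write
stale = b.read("balance")  # node "b" still answers, with the old value

resync([a, b])  # partition resolved
repaired = b.read("balance")
print(stale, repaired)
```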

And CA (which means Consistency and Availability)?

The last kind is CA, which delivers consistency and availability across all nodes. However, it can't do this if there is a partition between them and, therefore, it can't deliver fault tolerance.