Skip to content

Replication

In previous topics, we already learned how to set up MySQL backup so that we can restore the system and all data in case of failures. Of course, even if we can recover the system after failures, they are still undesirable. That is why, to reduce the probability of failures, we use replication.

Consider an example. Imagine that you have one database server, and the probability of its failure is, say, 10% (0.1). If you have two such database servers, the probability that they will both fail at the same time is 10% * 10% (0.1 * 0.1 = 0.01). So, the more database servers we have, the lower the probability that they will all fail simultaneously, and the system will stop working.

Replication is a process during which data from one database server is copied to one or more other servers. Different database management systems implement replication differently and support different types of replication. The main goal of replication is the ability to use more than one database server.

Asynchronous and Synchronous Replication

Replication can be divided into asynchronous and synchronous.

With asynchronous replication, the data write operation completes as soon as the record appears on the server where the query was executed. At the same time, copying this record to other servers happens in the background. The disadvantage of this approach is that if the server where the data is being written fails before the new data has time to be replicated, that new data will be lost.

In contrast to asynchronous replication, with synchronous replication, the write operation continues until the record appears on all database servers participating in the replication. This method of replication is more reliable, but also slower, since the write operation will not complete until the new data appears on all servers.

Without installing additional plugins that are not part of the project, MySQL supports only asynchronous replication. That is, the write operation completes when the data is written to the source server, and it reaches the replicas later, in the background.

Active-Active and Active-Passive Replication

Replication also comes in active-active and active-passive variants.

With active-active replication, all database servers are equal. Read and write queries can be executed on any database server participating in the replication. Such replication is difficult to implement, and data consolidation between replicas requires additional computing resources.

That is why a simpler option is often used - active-passive, in which data can only be written to one main server called the source. Other servers participating in this type of replication are called replicas - data from the source server is copied to them, and they can be used for reading that data. This method of replication is easier to implement, but it requires additional actions in case the main server fails. If that happens, one of the replicas needs to become the main server. This procedure is called failover.

Formally, MySQL supports both active-passive and active-active configurations, but the active-active implementation is considered unreliable. If the same row is updated on both replicas in an active-active configuration, a conflict can arise between the replicas: one replica will have one version of the data in that row, and the other will have a different one.

As of the time of writing, MySQL has no mechanism for resolving such conflicts. That is why, when talking about replication in MySQL, they most often mean specifically active-passive replication.

There are many applications for database replication:

  • Increasing system reliability. The more database servers we have, the lower the probability that the system will fail.
  • Distributing computations to optimize resources. For example, thanks to replication, you can dedicate one server to writing data and another only to reading.
  • Migration, or transferring a database from one server to another without significant system downtime. If the database is large, for example, 1 terabyte, then transferring such a database to another server can take days or even weeks. Thanks to replication, it is possible to transfer data from one server to another without stopping the system: data will be copied in the background, and when the source and replica are fully synchronized, it will be enough to simply perform a failover to finally switch to the new server.

Setting up replication is a separate complex task that usually does not need to be done manually when your application is in the cloud or in Kubernetes. However, it is very useful to know how it works and what types of replication exist, so that you can properly build a system that uses it.