Database Backup
One of the important requirements for computer systems and applications is the ability to recover after failures. To ensure this capability, you need to be able to restore data from a backup copy - a backup.
Recall our Lab Setup - there, before making changes to the configuration file, we made a copy of it so that in case of an error, we could easily restore the operation of our server. It is possible to make mistakes not only when working with the configuration file, but also with data, for example, accidentally deleting needed records.
But human error is not the only possible cause of data loss. Anything can happen to the database server: a hacker attack, a system error, a hardware failure of the computer running the database, or even a natural disaster. Backup is needed to protect against all of this. It is thanks to backups that you can restore data and system operation.
Types of Backups
In general, database backups can be classified by a number of criteria. The first one:
- Physical - this is a complete copy of the database as it is stored on disk. In the topic about configuration, we already mentioned that MySQL stores databases on the server's disk, and that you can set the directory where database files are stored using the
data_dirconfiguration option. So, a physical backup is simply a copy of the files in that directory. The disadvantage of a physical backup: when creating a backup copy, the data on disk needs to be completely correct. For example, if a user is executing a largeINSERTquery during the backup creation, the data from this record may not have been fully written to disk. If you try to restore the database from such a backup copy, it may turn out to be corrupted due to partially written data. - Logical - this is a set of SQL commands that reproduce the database structure and the data in it. It is simply a file with an SQL script consisting of
CREATE TABLEandINSERTcommands, from which the database can be fully restored. Logical backups are usually slower to create, and restoring from such a backup also takes more time. However, since such a backup is simply a set of SQL commands, even if it is corrupted, errors can be traced and fixed.
Backups can also be:
- Online - this is a backup copy that is created without stopping the database. That is, when an online backup is being created, users can work with the database and execute queries. Because users can execute queries while the backup copy is being created, it may be difficult to maintain data consistency and also to determine exactly which data made it into the backup copy and which did not.
- Offline - this is a backup that is performed when users are disconnected from the database. Such a backup guarantees data consistency; you can definitively say which data is included in the backup copy and which is not. If we were performing such a backup, for example, for an online store, we would need to completely stop the website during the backup creation.
The final backup classification:
- Full - this is a backup copy of all data in the database. That is, using a full backup, you can restore all data that was stored in the database at the time the backup copy was created. The disadvantage of a full backup: in the case of large databases, it takes a long time to create. If the backup is created for a long time and is online, it can be hard to predict exactly which data made it into the backup and which did not. If such a backup is created offline, then due to the long creation time, the system downtime will also be long.
- Incremental - this is a backup copy of data that was created in the database over a certain period of time after the full backup was created, for example, during a day.
Full and incremental backups are often used together: for example, a full backup is created once a week and an incremental backup is created daily. Then, if the full backup was created, for example, on Sunday, and the database was corrupted on Thursday, to restore such a database you need to restore the last full backup of the database, and then sequentially restore all incremental backups that were created from Monday to Thursday. Due to this complex recovery procedure, it is not recommended to create too many incremental backups.
Backup
The backup procedure depends on the system requirements and the team's capabilities. Here are a few tips:
-
The system that uses the database should have resiliency requirements established: RTO and RPO. RTO (recovery time objective) is the time within which the system must restore its operation after a failure. If the RTO of your system = 60 minutes, and the backup tool restores your database in more than an hour, then that tool or backup method is not suitable for you. RPO (recovery point objective) is the acceptable data loss. Usually, this value is specified as a time interval during which data loss is considered acceptable. For example, if RPO = 1 day, then your backup solution should create backups at least daily.
-
Backup should be performed at a certain frequency and automatically. The tool you use to set up backup should allow you to define a schedule on which the procedure runs. Generating a backup copy usually creates additional load on the database server. Therefore, to decide exactly when to run the backup, you need to analyze the server metrics (usually CPU load) and determine the time when it is least loaded.
mysqldump
There are many tools that allow you to create backup copies of MySQL databases in different modes: logical and physical, online and offline, full and incremental. Most of these tools are commercial products. The most popular free tool for creating backup copies is mysqldump. It creates full logical backups and can do so both online and offline.
Creating a Backup Copy
mysqldump can create backups both on the local computer and on a remote one. To create a database backup, you need to execute the following command:
Here:
mysqluseris the MySQL user;P@ssw0rdis the password.
The --databases parameter specifies the databases whose backup copy needs to be created. Instead of this parameter, you can specify --all-databases, in which case mysqldump will create a backup copy of all databases on the server.
--result-file defines where the result of the database dump will be saved.
Restoring a Database from a Backup Copy
To restore a database from a backup copy, you simply need to execute the queries from the file we created. Let's delete the database whose backup copy we created and test its restoration:
The company database has been deleted. Now let's restore it from the backup copy:
If you check, our database, its tables, and data will be on the database server:
mysqldump Parameters
When creating backups, you will find it useful to know about several mysqldump parameters.
--no-create-db
With the --no-create-db parameter, the backup file will not contain the CREATE DATABASE SQL command. This can be useful if you want to restore the backup copy into a database with a different name. In practice, such a transfer looks like this:
Note that the --databases parameter is omitted in this command. The reason is that when it is specified, a USE command with the original database name is added to the resulting file. To restore our backup into a database with a different name, we do not need this parameter.
Now let's restore this backup into a database with a different name:
After that, execute the SQL query specifying the database into which you want to restore the backup copy:
mysql -u mysqluser -pP@ssw0rd companyRestored < backup-no-create-db.sql
mysql -u mysqluser -pP@ssw0rd
use companyRestored;
show tables;
--no-create-info
This parameter disables adding database structure creation instructions to the backup file. This can be useful if, when restoring the database, you want to restore the structure in some other way, for example, using Liquibase.
--no-data
The --no-data parameter allows you to create a backup file that will contain only instructions for creating the database structure, but will not contain the actual data.
mysqldump is a simple and therefore popular tool for database backup, but it is not a universal solution. mysqldump works well for small and not very heavily loaded databases. If its functionality is insufficient for a procedure that would allow meeting the RTO and RPO requirements, then you should consider tools that support physical (rather than logical) and incremental (rather than only full) backup copies.