Skip to content

Database Schema Migrations

In practical tasks, we have already encountered the fact that the tables that make up a database can change over time. In this topic, we will get acquainted with the process that developers use to manage such changes. This process is called database migrations.

Consider an example. Recall any database we examined earlier, for example, company. The database is hosted on a database server and consists of tables, relationships between them, indexes, and the data itself. All of this, except the data, is the structure of the database, i.e., its schema. During migrations, we mainly work with changes to the database schema, which is why this process is sometimes also called schema migration.

Schema migration should not be confused with database migration - the process of transferring data from one database server to another. The names of both processes sound similar, but they are completely different.

Why Use Migrations?

First of all, it is very convenient to store the database structure as code (in Git repositories) that can be used to create that structure. If more than one person is on the team developing the database, everyone can review the entire structure, easily track changes, and code review can be set up. At the same time, each team member can deploy their own database on their server to make it easier to test its operation.

If the database structure is stored as code and changes over time, these versions need to be assigned a number so that they can be distinguished. Thanks to versioning, the team will know exactly which code corresponds to each database and what structure it has.

Updating the Schema Version

Consider another example. In our development team, all developers test their changes on their own database servers, and there is one separate database server that serves our website for end users - production. The production server already contains some version of the database structure. One of the developers made a change to the database structure and tested the changes on their database server. After that, they created a pull request with their changes, passed code review, and merged them into the main branch in the Git repository. This created a new version of the database structure, and the production database server now needs to be brought to this new version. That is, for development processes, we need to be able to apply new versions of the database structure to databases that already have an existing structure version and some data.

Rollback

Imagine that our developer who created a new version of the database structure introduced a defect that only manifests itself with large amounts of data. This defect, after bringing the production database to their new structure version, significantly slowed down the system. In such a case, we need to be able to "undo" their changes, i.e., bring the production database version back to the previous structure version that was working stably. Such a reversion to a previous version is called a rollback.

Migration Tools

So, to set up the database structure development process, we need a tool that will:

  • support storing the database structure as code that can be deployed on a database server;
  • allow creating versions of this code;
  • allow bringing the database to a new schema version;
  • allow rolling back to previous versions of the database structure.

Tools that support all these functions are database migration tools, and the structural changes made using these tools are called database migrations. There are many database migration tools. The choice of a specific tool depends on factors such as:

  • the programming language and framework used to develop the application;
  • the complexity of the database structure and the planned frequency of changes to this structure;
  • the expertise of the development team and the team's capabilities for adopting new tools.

For example, if the application is developed in .NET or .NET Core using the Entity Framework database library, then, since Entity Framework itself supports migrations, it becomes the default choice.

If the application is developed in Node.js or golang and the database structure is not complex (up to 5 tables), then often a database migration tool is not used at all. Instead, code is simply written that checks whether the required tables are created, and if not, creates them.

If the database structure is complex and will change frequently, and the framework on which the application is written does not support migrations, then it makes sense to use a separate database migration tool, such as Liquibase.

Setting up the migration process is closely related to DevOps. With migrations, automatic deployment and testing of new application versions is greatly simplified. This allows you to release new versions of programs that use databases faster, without being afraid of defects. Thanks to migrations, it is much easier to deliver changes for testing, and we can be confident that the exact version of the database structure that was tested makes it to production. Additionally, even if a defect is discovered already in production, thanks to migrations we can revert to the previous version of the database structure while we work on fixing the defect.

Running Liquibase

Liquibase is a paid database schema migration tool, but it has a free version that will be more than sufficient for our learning purposes.

We will be running Liquibase in Docker. If you do not have the ability to use Docker, here is a link to the documentation for installing Liquibase on different operating systems.

To run an interactive Docker container with Liquibase, you need to execute a command that will start a container from the Liquibase image, as well as bash inside the container. This way we will be able to execute commands in this container:

docker run -it liquibase/liquibase:latest /bin/bash

Since we will be working with MySQL, a plugin for working with it needs to be installed for Liquibase. It is not distributed by default in the Liquibase image due to MySQL licensing specifics. To install this plugin, our Docker container needs the following environment variable added at startup:

docker run -it -e INSTALL_MYSQL=true liquibase/liquibase /bin/bash

To be able to edit files on the computer from the container, let's add a mount directory:

docker run -it -v $(pwd):/repos -e INSTALL_MYSQL=true liquibase/liquibase /bin/bash

The pwd part will only work on Linux or macOS. If you are working on Windows, to use the mount directory, specify the full path to the directory where you plan to store Liquibase files instead of this construct.

One more important thing: if you are working on macOS or Linux, you need to set permissions for the directory you plan to mount that allow all users to read and write files:

chmod 777 .
docker run -it -v $(pwd):/repos -e INSTALL_MYSQL=true liquibase/liquibase /bin/bash

Now you can work with Liquibase running in a Docker container, while editing files for it from a text editor, for example, VS Code:

liquibase --help

Liquibase Project

First, you need to create a project - a set of files required for working with migrations. To do this, execute the following command in the terminal:

cd /repos
ls
liquibase init project
ls

This command will create several files. Next, reconfigure the permissions on these files in the container to allow reading and writing for all users. This needs to be done so that the user under which you run commands in the container can edit files created from the container in a text editor on the computer:

ls -la
chmod 777 *
ls -la

In VS Code, let's see what files were created:

The liquibase.properties file is the most important file that stores the project configuration:

####     _     _             _ _
##      | |   (_)           (_) |
##      | |    _  __ _ _   _ _| |__   __ _ ___  ___
##      | |   | |/ _` | | | | | '_ \ / _` / __|/ _ \
##      | |___| | (_| | |_| | | |_) | (_| \__ \  __/
##      \_____/_|\__, |\__,_|_|_.__/ \__,_|___/\___|
##                  | |
##                  |_|
##
##      The liquibase.properties file stores properties which do not change often,
##      such as database connection information. Properties stored here save time
##      and reduce risk of mistyped command line arguments.
##      Learn more: https://docs.liquibase.com/concepts/connections/creating-config-properties.html
####
####
##   Note about relative and absolute paths:
##      The liquibase.properties file requires paths for some properties.
##      The classpath is the path/to/resources (ex. src/main/resources).
##      The changeLogFile path is relative to the classpath.
##      The url H2 example below is relative to 'pwd' resource.
####
# Enter the path for your changelog file.
changeLogFile=example-changelog.sql


#### Enter the Target database 'url' information  ####
liquibase.command.url=jdbc:mysql://10.20.30.40:3306/liquibase


# Enter the username for your Target database.
liquibase.command.username: mysqluser


# Enter the password for your Target database.
liquibase.command.password: P@ssw0rd

In this file:

  • changeLogFile specifies where the changelog-file is stored, which describes the database schema versions;
  • url specifies the database connection settings: DB type, address, TCP port, and the database name we will work with in this project;
  • username and password specify the username and password for the database server we will connect to.

The changelog-file describes the database schema versions as changesets (a set of changes that need to be made to bring the database schema from the previous version to the next one). Each changeset has a name, a version, and commands for transitioning from the previous version to the current one.

--liquibase formatted sql


--changeset your.name:1 labels:example-label context:example-context
--comment: example comment
create table person (
   id int primary key auto_increment not null,
   name varchar(50) not null,
   address1 varchar(50),
   address2 varchar(50),
   city varchar(30)
)
--rollback DROP TABLE person;


--changeset your.name:2 labels:example-label context:example-context
--comment: example comment
create table company (
   id int primary key auto_increment not null,
   name varchar(50) not null,
   address1 varchar(50),
   address2 varchar(50),
   city varchar(30)
)
--rollback DROP TABLE company;


--changeset other.dev:3 labels:example-label context:example-context
--comment: example comment
alter table person add column country varchar(2)
--rollback ALTER TABLE person DROP COLUMN country;

Liquibase can work with changesets in different formats: xml, yaml, json, and sql. We will work with sql. A changeset is essentially SQL code for creating a database with specially formatted Liquibase comments. If we look at the changeset that was created using the liquibase init project command, we can notice that before the beginning of a changeset, you need to specify the --changeset comment. You can also add comments and a rollback command to undo the changes of that specific changeset.

The rest of the files are not needed, so they can be deleted.

Running the Migration

Now let's try to deploy the database using Liquibase. For the demonstration, we will use the changelog-file that was created when we initialized the project.

First, we need to create an empty database on our database server and set up permissions for the user under which Liquibase will operate. To do this, connect to our database server and execute the following query:

create database liquibase;

Now let's prepare the properties file, namely fill in the connection data: database type, database server IP address, network port, database name, username, and password.

In a real project, the username and password for the database server should be specified as environment variables and should never be committed to the repository. You can find more about how to use environment variables in the Liquibase Environment Variables documentation.

Let's return to the Liquibase container. To bring the database to the latest version described in the changelog-file, you need to execute the liquibase update command. Note that the command should be executed in the directory where the properties file is stored - Liquibase will read the settings from it:

liquibase update

Liquibase created the company and person tables that were described in the changelog-file. In addition to them, two more tables were created that Liquibase uses internally for tracking schema changes between versions.

Let's try to create a new changeset and add a products table. To do this, first add the SQL query for creating the table:

create table products ( id int primary key auto_increment not null, name varchar(50) not null );

To make this code a changeset, you need to add the appropriate comment before the code:

--changeset user:4

Each changeset must have an author and a version. Changesets can also be marked with labels and contexts - tags that allow grouping changesets as needed. For our changeset, we will not specify any tags.

For changesets, you also need to write a rollback step - code that needs to be executed to undo the change of that changeset. In our case, we are adding a new table, and the rollback step for such an operation would be deleting the table. With all these changes, our changeset will look like this:

--changeset user:4
create table products ( id int primary key auto_increment not null, name varchar(50) not null );
--rollback DROP TABLE products;

The new changeset has been added to the changelog-file.

Let's add tags in Liquibase that allow fixing the state of the database after applying a group of changesets:

liquibase tag other.dev:3

Let's bring the database schema to the latest version:

liquibase update

This time, Liquibase connected to the database, learned the last applied changeset from the service tables, and applied the one we added that had not yet been applied.

Now imagine that the changes we made to the database schema caused an error in the system, and we need to roll them back. To do this, use the liquibase rollback command. It requires a tag parameter specifying the state to roll back to:

liquibase rollback --tag=other.dev:3

This command found the list of all changesets that were applied after the tag in the service table in the database, read those changesets from the changelog-file, and executed their rollbacks.

If after each database schema change we simply executed the SQL script that creates tables, its execution would fail with an error, since at the beginning of the script there are commands to create tables that already exist in the database. To solve this, we would have to create separate files for each change and manually track which files have been executed. Additionally, rolling back changes in this way is much harder.

How to Test Yourself

Just in case you want to test your script on your database before submitting a pull request, you can do it by performing the following actions:

  1. Drop the ShopDB database using the DROP DATABASE ShopDB; statement if you already have it on your database server.
  2. Create an empty ShopDB database on your database server.
  3. Run the initial schema migration for the ShopDB database with Liquibase using Docker on your computer:
docker run -v $(pwd):/repos --workdir /repos/ -e INSTALL_MYSQL=true \
-e LIQUIBASE_COMMAND_USERNAME=<db username> \
-e LIQUIBASE_COMMAND_PASSWORD=<db password> \
-e LIQUIBASE_COMMAND_URL=jdbc:mysql://<db host>:3306/ShopDB \
liquibase/liquibase liquibase update --labels="0.0.1"

Make sure to run the command in the folder where the repository is cloned. If you are running the script on Windows, replace $(pwd) with the full path of the cloned repository.

Replace <db username>, <db password>, and <db host> with values for your database server before running the command.

  1. Tag the database with the initial version, so you will be able to rollback any new changesets:
docker run -v $(pwd):/repos --workdir /repos/ -e INSTALL_MYSQL=true \
-e LIQUIBASE_COMMAND_USERNAME=<db username> \
-e LIQUIBASE_COMMAND_PASSWORD=<db password> \
-e LIQUIBASE_COMMAND_URL=jdbc:mysql://<db host>:3306/ShopDB \
liquibase/liquibase liquibase tag 0.0.1
  1. Use commands described in steps 3 and 4 to update the database to versions 0.0.2 and 0.0.3 you will create while working on this task.
  2. Test rollback of the changeset by reverting the state of the database to a tag:
docker run -v $(pwd):/repos --workdir /repos/ -e INSTALL_MYSQL=true \
-e LIQUIBASE_COMMAND_USERNAME=<db username> \
-e LIQUIBASE_COMMAND_PASSWORD=<db password> \
-e LIQUIBASE_COMMAND_URL=jdbc:mysql://<db host>:3306/ShopDB \
liquibase/liquibase liquibase rollback <tag name>

Replace <tag name> with the tag version you want to rollback to, for example, 0.0.2 or 0.0.1.