Skip to content

MySQL database engine is a software responsible for storing data in the database. MySQL has pluggable storage engines that allow you to set a storage engine when creating a table in the database:

CREATE TABLE Countries (
    ID INT,
    Name VARCHAR(50),
    PRIMARY KEY (ID)
) ENGINE=InnoDB;

Database Internals

In this lesson, we will look "under the hood" of MySQL and understand how its architecture is organized.

MySQL Architecture

MySQL operates on a client-server model. This is a model in which the client (that is you, the user) interacts with the server (where the database is stored) using network services. This model separates the user interface from data storage, which improves security and efficiency. The MySQL architecture consists of three main layers.

Client Layer (Application Layer)

At this layer, the following takes place:

  • sending a request to the server using various connectors (client libraries or standalone programs);
  • authentication when the client connects to the server, which is performed using a username and password.

Server Layer

This is the "brain" of the MySQL system, which manages all logical functions. The main components of this layer have the following functions:

  • thread handling - the server layer manages client connections and executes queries;
  • parser - the server layer analyzes and breaks SQL queries into understandable segments (tokens);
  • optimizer - the server layer optimizes queries for efficient execution;
  • query cache - the server layer stores query results for faster access if the same query is made again.

Storage Layer

This is the "warehouse" where all data is stored. MySQL is special in that you can choose different database engines (storage layer implementations) depending on what you need - speed, reliability, transaction support, etc.

Example

Let's examine the MySQL architecture using the query SELECT * FROM users WHERE age > 30 as an example. When you send this query through a client interface, for example, through MySQL Workbench or the command line, it reaches the client layer. The client establishes a connection with the server, authenticates, and if everything is in order, passes the query to the server layer.

The server layer accepts the query and analyzes it. The server also checks whether the result of this query is already in the cache. If so, it can immediately return the result, bypassing other stages. If the result is not in the cache, the parser breaks the SQL query into logical parts (tokens), checks the syntax, and makes sure the query is correct. After parsing, the optimizer analyzes different ways to execute the query and chooses the most efficient one. For example, it may decide which indexes to use for fast data access.

Finally, the query reaches the storage layer, where the actual access to the users table takes place and records where the user's age is greater than 30 are selected. Depending on the storage engine used, for example, InnoDB (discussed below), data is read and processed in a way that is optimal for that particular engine. Then the query results are sent back through the server layer to the client, where they are displayed in the user interface.

SQL Engine

Creating, reading, updating, and deleting (CRUD operations) information from the database in MySQL is the task of a component called the database engine. A database engine is a program responsible for storing and processing data in the database server. Sometimes the database engine is also called a storage engine.

Database engines can be divided into two types.

Transactional

Transactional engines support transaction execution, and also ensure the reliability and stability of data operations. An example is InnoDB, which is capable of either executing the entire group of operations completely, or none of them if an error occurs. This ensures reliability when working with data. Through ACID support, it also ensures data integrity, i.e., it reduces the risk of data loss in case of failures. However, this type of operation requires more resources, which can lead to reduced data processing speed compared to non-transactional engines.

Transactional engines are well suited for projects where data integrity and consistency are important.

Non-transactional

Non-transactional engines do not support transaction execution but are simpler to use and faster in operation. Examples include:

  • MyISAM - a fast and easy-to-use engine that does not support transactions, so write operations are performed without guarantees of data integrity in case of failures. It is suitable for websites, blogs, or systems where data is mostly read rather than updated.
  • Memory - uses RAM for data storage, which provides high-speed data access. However, in case of a system restart, data will be lost, so it is not suitable for long-term data storage.
  • CSV - stores data in CSV text file format, which facilitates data exchange with other programs. It also lacks indexing capability, which can slow down data search and processing.
  • Blackhole - accepts data but does not store it. It is good for testing systems without the risk of losing real data. However, it is not suitable for data storage.
  • Federated - connects multiple MySQL servers into a single logical database but does not store data locally. It is useful for creating distributed databases with increased availability. However, when choosing it, you need to consider the complexity of management and configuration, which can complicate the further maintenance process.

How to Choose an Engine?

When choosing an engine, you should ask yourself the following questions:

  • What is the goal of the project? For example, is it a web application, an analytical system, a financial system, etc.?
  • What are the main operations that will be performed? Lots of reads, writes, mixed operations?
  • Is data integrity and reliability important? What are the data security requirements?
  • Is transaction support needed?
  • What is the expected load on the database?
  • Is read speed more important than write speed?
  • Is support for specific features needed, such as full-text search or replication?
  • Are special capabilities needed, for example, for distributed systems or temporary data storage?