Understanding Database Transactions: ACID Principles in a Relational DBMS

Understanding Database Transactions: ACID Principles in a Relational DBMS

When working with relational database management systems (RDBMS), you will most likely execute a transaction.

Transactions are a set of operations that encapsulate multiple query statements into a logical unit of work for the database to process. In the context of an RDBMS, it is a sequence of one or more SQL statements that are treated as a single unit of work.

Each logical unit in a transaction must be completed to avoid data integrity problems. Transactions provide a powerful mechanism for managing complex operations consistently and reliably.

One common similarity that exists among relational databases, e.g MySQL, PostgreSQL, Microsoft SQL Server, Oracle Database - is that they are all transactional databases which imply that they rely on database transaction to ensure data integrity and consistency.

Transactions play a key role in ensuring data integrity and consistency. In this post, you will learn about what database transactions are, and the ACID properties of transactions, also you will see an example of how a transaction is implemented in PostgreSQL.

Let's get into it!

What is a Database Transaction?

A database transaction is any action that writes to or reads from a database. In this context, it is a logical unit of work that must either be completed wholly or outright aborted; no intermediate state is allowed.

Oftentimes, a transaction will consist of a series of operations (it can also consist of a single operation). Every operation within a transaction must be completed successfully. In the event that any operation fails, the entire transaction is aborted, and the database state is reverted to its previous state. On the other hand, if all operations within a transaction are successful, the database moves from one consistent state to another consistent state.

A consistent database state is one in which all the data conforms to the defined rules and constraints.

It is important to also note that not all transactions update the database.

In order to illustrate just how transactions work, we will use a sample database. Below is the relational entity diagram of the database:

Fig. 1 illustrates a sample database with three tables - CUSTOMER, BOOK, and ORDER:

  • The CUSTOMER table holds information about the customer. The customer table is associated with the order table in a one-to-many relationship with the foreign key as CUST_ID

  • The BOOK table contains information about the book which includes the title, author, and quantity of books available, and is associated with the order table in a one-to-many relationship with the foreign key as BOOK_ID.

  • The ORDER table records a customer's book order details.

To understand the concept of a transaction, suppose that a customer orders a book. Given that scenario, the transaction will consist of at least the following parts:

  • Checking for the availability of the requested book in the BOOK table

  • Updating the QTY_AVAILABLE after deducting the order quantity.

  • Updating the QTY_ORDERED in the ORDER table

  • And if the book is not available, the transaction is rolled back.

This transaction is an example of a multi-component transaction and in the case of a relational database management system (RDBMS), will consist of a series of SQL statements such as:

  • SELECT statement to query the tables.

  • INSERT statement to add rows to one or more tables.

  • UPDATE statement to update the necessary tables.

In real-world scenarios, database transactions oftentimes consist of two or more database requests. For RDBMS, a single SQL statement equals a database request.

Regardless of the number of database requests, each individual transaction (especially in RDBMS) must adhere to these four principles - atomicity, consistency, isolation, and durability, commonly referred to as ACID. Let's briefly look at what each property means.

ACID Properties of a Transaction

Relational database management systems (RDBMS) typically adhere to ACID principles to ensure data integrity in the case of errors or failures. The ACID acronym stands for:

  • Atomicity - This implies that a transaction is treated as a single, indivisible, logical unit of work. If a transaction (T1) has four database requests (SQL requests), atomicity requires that all four requests must be completed or entirely aborted.

  • Consistency - This ensures the permanence of a database's consistent state. Upon completion of a transaction, the database is required to be in a consistent state. If any part of the transaction violates an integrity constraint, the entire operation is aborted. Every transaction must begin with a known database consistent state to ensure consistency.

  • Isolation - This property ensures that the data accessed during the execution of a transaction remains isolated from other transactions until the first transaction is completed. For example, if transaction T1, is being executed and is using data item X, other transactions, say T2...Tn, cannot access data item X until T1's transaction is complete. In other words, each transaction is executed in isolation from other transactions.

  • Durability - This property guarantees that once the changes made in a transaction are executed and committed, they become permanent and remain unaffected even in the event of a system failure or crash.

Bear in mind that adhering to acid principles comes at a performance cost and as such not all database management systems stick to these principles, especially NoSQL databases. Also, some RDBMS, e.g. MySQL allows you to disable ACID to ensure speed.

In addition to ACID properties, there is yet another property that applies when executing multiple concurrent transactions. This property is known as serializability.

Serializability is a property that ensures the concurrent execution of transactions yields consistent results adhering to a defined schedule. For example, assume a database has three transactions, T1, T2, and T3, executing at the same time. To properly carry out this transaction, the database must figure out the order in which to execute each transaction. This is known as scheduling the transactions. The database schedules the transactions in a way that even with multiple transactions happening at the same time, the outcome is as if the transactions happened one after the other in a serial manner.

Let's take a look at an example implementation of a transaction now that we understand what it means.

How To Implement Transactions (using PostgreSQL as an example)

PostgreSQL is an ACID-compliant, open-source relational database. It is widely known for its robustness, extensibility, and strict adherence to SQL standards.

In PostgreSQL, we use the following commands to control transactions:

  • BEGIN TRANSACTION - used to initiate a transaction

  • COMMIT - used to permanently apply changes made in the transaction to the database.

  • ROLLBACK - used to undo changes made during a transaction, returning the database to its original state before the transaction.

To illustrate how transactions work in PostgreSQL, let's consider a scenario where a customer places an order for a book in an online bookstore (based on the entity relationship diagram shown in Fig. 1). Here's a list of transactions that most likely will occur:

-- Assumming a customer with ID 001 places an order for a book with ID 101, ordering 5 units of the book -->

-- Initializing the transaction
BEGIN TRANSACTION;

-- Checking if the book with ID 101 is available
SELECT QTY_AVAILABLE
FROM BOOK
WHERE BOOK_ID = 101;

-- Deducting 5 book units from the stock of Book ID 101
UPDATE BOOK
SET QTY_AVAILABLE = QTY_AVAILABLE - 5
WHERE BOOK_ID = 101;

-- Recording the order details in the "ORDER" table
INSERT INTO ORDER (BOOK_ID, CUST_ID, QTY_ORDERED, ORDER_DATE)
VALUES (101, 001, 5, CURRENT_TIMESTAMP);

-- Committing the changes
COMMIT;

In this example, the BEGIN TRANSACTION command marks the start of the transaction. The subsequent SELECT statement checks for the book's availability, while the UPDATE statement reduces the available quantity of the book in the "BOOK" table as the customer places the order. The INSERT INTO statement, records the order details in the "ORDER" table, including the customer ID, book ID, ordered quantity, and the current timestamp for the order date.

By using a transaction, all these SQL statements are executed as a single, consistent unit of work. If any part of the transaction encounters an error, the entire transaction can be rolled back with the ROLLBACK command, ensuring that no partial or inconsistent changes are applied to the database.

Conclusion

In this post, you have learned about database transactions, ACID principles, and how transactions can be implemented in PostgreSQL.

Transactions are the foundation of secure and reliable data management, so understanding the concept will enable you to handle complex operations, as well as build robust and reliable applications that maintain data integrity.