Distributed Database Management Systems


Introduction

In today's business computing, data frequently resides at multiple organizational sites. Thus, the information required to execute transactions and answer queries might not reside at a single site. Several Database Management Systems might manage this data for reasons such as scalability, performance, access, and management.

What is a Distributed Database Management System?

A Distributed Database Management System treats a distributed database as a single logical database, so the principles and techniques of Database Management Systems still apply. However, a distributed database has special characteristics of its own, and I'm going to illustrate the principles of Distributed Database Management Systems in this article.

Advantages of Distributed Database Management Systems

Distributed data offers several advantages, such as:

Data is located close to its users, fostering better communication.
Users work with a subset of whole data, increasing the solution's performance due to faster data delivery.
Data can be processed in several locations, increasing the solution's performance.
The workload is balanced across the underlying systems.
Availability increases because a distributed environment has no single point of failure.

Disadvantages of Distributed Database Management Systems

Managing a distributed environment is a complex task because several features of database systems, such as transaction management, concurrency control, security, backup and recovery, query optimization, and access path selection, must all be synchronized across sites.
Storage requirements increase because multiple copies of the data can exist in different locations.

Types of Distributed Database Management Systems

Distributed processing accesses one or more entities spread across several database systems connected by a network, using distributed transactions.
A distributed database uses replication techniques to keep copies of the data at several sites connected by a network.

Transparency in Distributed Database Management Systems

Transparency is an essential feature of Distributed Database Management Systems. By hiding all the complexities of a distributed database, the system lets end users feel as if they are working with a centralized Database Management System.

In a distributed environment, the end user does not need to know where the data is located or whether it is replicated or fragmented. Distributed transactions enable updating data that resides in several locations while ensuring that the transaction is either completed or aborted by all the involved systems, thus maintaining the integrity of the data.

Other transparency characteristics in distributed database systems concern performance and the integration of heterogeneous DBMSs under a common global schema.

How to Access Entities that are Distributed Across Multiple Databases?

This section explains how to access entities distributed across multiple databases. The main approach is to let users see the database and external data sources as a single logical database. This approach is implemented using database link mechanisms, most often created by the database administrators responsible for managing access to distributed data sources.

Views, synonyms, and procedures are helpful in distributed database programs because they can create the appearance of location transparency by hiding global naming.
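As a minimal sketch of how a view can hide global naming, the following example uses Python's built-in sqlite3 module. It is an illustration under stated assumptions, not how any particular distributed DBMS implements database links: two attached in-memory databases merely stand in for remote sites, and the names site_a, site_b, customers, all_customers, and list_customers are hypothetical.

```python
import sqlite3

# One connection plays the role of the "global" database. The two attached
# in-memory databases stand in for remote sites reached over database links.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS site_a")
conn.execute("ATTACH DATABASE ':memory:' AS site_b")

# Each site stores its own fragment of a hypothetical customers table.
conn.execute("CREATE TABLE site_a.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE site_b.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO site_a.customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO site_b.customers VALUES (2, 'Bob')")

# The view hides the site-qualified names. SQLite only allows temporary
# views to reference objects in more than one attached database.
conn.execute("""
    CREATE TEMP VIEW all_customers AS
    SELECT id, name FROM site_a.customers
    UNION ALL
    SELECT id, name FROM site_b.customers
""")


def list_customers(connection):
    """Query the view without knowing which site holds which row."""
    query = "SELECT id, name FROM all_customers ORDER BY id"
    return connection.execute(query).fetchall()


rows = list_customers(conn)
print(rows)  # [(1, 'Alice'), (2, 'Bob')]
```

The caller of list_customers never mentions a site name, so the fragments could be moved or renamed by redefining the view alone.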

Updating data in the distributed database system is a bit more complicated, because the failure of one part of a distributed transaction (which takes place across multiple database systems) could trigger a global integrity constraint violation. The underlying transaction must therefore be either committed or rolled back as a single logical unit of work.

When changes occur across distributed databases, data integrity is ensured using the two-phase commit protocol. The first phase of the two-phase commit protocol is the prepare phase, in which an initiating system, called the global coordinator or main database system, polls each of the distributed transaction participants to determine whether they are ready to commit or must roll back.

Preparation also involves recording information in the transaction logs and placing distributed locks on modified tables to prevent reads of uncommitted data. In the second phase, the changes are committed if all the participants confirm that they received the changes and are ready to commit. If any node involved in the transaction cannot confirm receipt of the changes (because of a system or network failure), the transaction is rolled back to its original state.
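The two phases described above can be sketched in a few lines of Python. This is a simplified, in-memory illustration only: it assumes no network, no durable logs, and no locking, and the names Participant, two_phase_commit, and can_commit are hypothetical rather than part of any real product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """One database system taking part in the distributed transaction."""
    name: str
    can_commit: bool = True  # set to False to simulate a failed site
    data: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def prepare(self, changes):
        """Phase 1: stage the changes, record the vote, and return it."""
        if not self.can_commit:
            self.log.append("vote-abort")
            return False
        self.pending = dict(changes)
        self.log.append("prepared")
        return True

    def commit(self):
        """Phase 2 (success): make the staged changes permanent."""
        self.data.update(self.pending)
        self.pending = {}
        self.log.append("committed")

    def rollback(self):
        """Phase 2 (failure): discard the staged changes."""
        self.pending = {}
        self.log.append("rolled-back")


def two_phase_commit(participants, changes):
    """Act as the global coordinator for one distributed transaction."""
    # Phase 1: poll every participant for its vote.
    votes = [p.prepare(changes) for p in participants]
    # Phase 2: commit only if every participant voted yes.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled back"


site_a = Participant("site_a")
site_b = Participant("site_b")
site_c = Participant("site_c", can_commit=False)

first = two_phase_commit([site_a, site_b], {"balance": 100})
second = two_phase_commit([site_a, site_c], {"balance": 50})
print(first)        # committed
print(second)       # rolled back
print(site_a.data)  # {'balance': 100}
```

In the second call, site_a votes yes but site_c does not, so site_a discards its staged change and keeps the earlier committed value. A production coordinator would also have to handle timeouts and recovery after a crash, which this sketch leaves out.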

In 1991, X/Open defined XA, an open standard interface through which transaction processing (TP) monitors can communicate with XA-compliant resource managers (the role played in the X/Open model by Oracle database systems, Microsoft SQL Server, and other XA-compliant databases). Several TP monitors supporting XA then appeared, such as BEA Tuxedo, IBM's CICS, Encina, and Microsoft Transaction Server (MTS).

Conclusion

In this article, I illustrated the principles of Distributed Database Management Systems to help system architects and developers create distributed solutions.