Databases and Distributed Systems: Tips to Build a Fault-tolerant Database Application

Applications should be written on the assumption that errors will eventually happen, and database application developers in particular usually keep this in mind while writing their code.

Although the concepts required to write such applications are commonly taught in database courses and to some extent are widely spread, building a reliable and fault-tolerant database application is still not an easy task and hides some pitfalls that we intend to highlight in this post with a set of suggestions or tips.

In what follows, we consider that the execution flow in a database application is characterized by two distinct phases: connection and business logic. In the connection phase, the application connects to a database, sets up the environment and passes control to the business logic phase. In the business logic phase, it gets inputs from a source, which may be an operator, another application or a component within the same application, and issues statements to the database. Statements may or may not be executed within the context of a transaction.

First Tip: Errors may happen so plan your database application taking this into account

So we should catch errors, i.e. exceptions, in both the connection and the business logic phases. This idea can be translated into code using Python as follows:

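from mysql.connector import MySQLConnection
from mysql.connector.errors import DatabaseError, InterfaceError

class Application(object):
    def __init__(self, **db_params):
        self.__cnx = None
        self.__cursor = None
        self.__db_params = db_params

    def connect(self):
        # Connection phase: open the connection and create a cursor.
        try:
            self.__cnx = MySQLConnection(**self.__db_params)
            self.__cursor = self.__cnx.cursor()
        except InterfaceError:
            print("Error trying to get a new database connection")

    def business(self, operation):
        # Business logic phase: _do_operation() issues the statements.
        try:
            self._do_operation(operation)
        except DatabaseError:
            print("Error executing operation")

if __name__ == "__main__":
    # get_params() and get_operation() are application-specific helpers.
    app = Application(**get_params())
    app.connect()

    while True:
        app.business(get_operation())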

The InterfaceError class identifies errors in the interface, i.e. connection, between the application and the MySQL Server. The DatabaseError class identifies errors associated with database operations.

In this simple example though, the application may abort after any connection error. For instance, a MySQL Server will automatically close a connection after a period of inactivity thus causing an application error if one tries to use the invalid connection.

Second Tip: Set up the appropriate timeout properties

There are two properties which fall under this suggestion:

wait_timeout - A MySQL option that defines how long a connection may stay idle, i.e. without any communication between the application and the MySQL Server, before the MySQL Server closes it.
connection_timeout - Sets the socket_timeout property in Connector/Python, which defines the maximum amount of time that a socket created to connect to a database will wait for an operation to complete before raising an exception.

The wait_timeout must be set according to the application's characteristics. On the other hand, the connection_timeout property is usually set to zero, which means that there will be no socket timeout period. Only in rare cases, such as when applications must execute operations within a fixed interval, should we set it to a non-zero value.
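As a minimal sketch of where these two properties end up, the helpers below build the connection parameters and the statement that adjusts wait_timeout for the current session. The helper names and the sample values are hypothetical and chosen only for illustration. We assume that connection_timeout is passed as a connection argument to Connector/Python and that wait_timeout is changed with a SET SESSION statement issued right after connecting.

```python
def build_connection_params(user, password, host, database,
                            connection_timeout=None):
    """Return connection arguments (hypothetical helper).

    connection_timeout is only added when the application really needs
    operations to fail after a fixed interval.
    """
    params = {
        "user": user,
        "password": password,
        "host": host,
        "database": database,
    }
    if connection_timeout is not None:
        params["connection_timeout"] = connection_timeout
    return params


def wait_timeout_statement(seconds):
    """Return the statement that sets wait_timeout for this session."""
    return "SET SESSION wait_timeout = %d" % int(seconds)


# Sample values, for illustration only.
params = build_connection_params("app", "secret", "127.0.0.1", "test",
                                 connection_timeout=5)
statement = wait_timeout_statement(600)
print(statement)
```

The dictionary can then be expanded into the connection call, and the statement executed through the cursor obtained in the connection phase.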

Third Tip: Connection errors may happen at any time so handle them properly

The previous measures will not circumvent problems related to transient network issues or server failures though. To handle this type of problem, one needs to consider that a connection may fail at any time. This requires catching connection errors while executing the business logic as well, and getting a fresh connection to proceed with the execution. In other words, it requires combining the two aforementioned phases. This idea can be translated into code as follows:

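from mysql.connector import MySQLConnection
from mysql.connector.errors import DatabaseError, InterfaceError

class Application(object):
    def __init__(self, **db_params):
        self.__cnx = None
        self.__cursor = None
        self.__db_params = db_params.copy()

    def connect(self):
        # Connection phase: open the connection and create a cursor.
        try:
            self.__cnx = MySQLConnection(**self.__db_params)
            self.__cursor = self.__cnx.cursor()
        except InterfaceError:
            print("Error trying to get a new database connection")

    def business(self, operation):
        try:
            self._do_operation(operation)
        except (AttributeError, InterfaceError):
            # AttributeError covers a connection or cursor that is still
            # None because a previous connect() attempt failed.
            print("Database connection error")
            self.connect()
        except DatabaseError:
            print("Error executing operation")

if __name__ == "__main__":
    # get_params() and get_operation() are application-specific helpers.
    app = Application(**get_params())
    app.connect()

    while True:
        app.business(get_operation())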
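Note that connect() in the listing above makes a single attempt and only prints a message if it fails. One possible refinement, which is our assumption and not part of the original example, is to retry the connection a bounded number of times with a growing delay. The connect_with_retry helper and its parameters are hypothetical names used for illustration.

```python
import time


def connect_with_retry(connect, attempts=3, delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are exhausted.

    connect  -- zero-argument callable that returns a new connection
    retry_on -- exception classes that are worth retrying
    sleep    -- injectable so that tests do not have to wait
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                sleep(delay * attempt)
    # Every attempt failed: report the failure to the caller.
    raise last_error
```

With Connector/Python, connect could be something like lambda: MySQLConnection(**db_params) and retry_on could be (InterfaceError,). The last error is raised again so that the application, and not the helper, decides what to do when the server stays unreachable.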

In general, connectors cannot hide connection failures from the application because this may lead to data inconsistency. Only the application has enough knowledge to decide what is safe to do and as such any failure, including connection failures, must be reported back to the application. The following scenario, in which the connection fails in the middle of a transaction that has already executed a first insert statement, depicts a problem that may happen when a connector tries to hide some failures from the application:

When the connection fails, the server rolls back the on-going transaction, thus undoing any change made by the first insert statement. However, the connector gets the error, automatically tries to reconnect and succeeds. With a valid connection to the server, it executes the failed statement again and succeeds. Unfortunately, the application does not find out about the connection issue and continues the execution as if nothing had happened. As a consequence, a partial transaction is committed, thus leaving the database in an inconsistent state.
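The scenario just described can be reproduced with a small in-memory simulation. All the classes below are hypothetical stand-ins written for illustration; they do not use or mimic any real connector API.

```python
class ConnectionLost(Exception):
    """Raised when a statement is sent through a dead connection."""


class FakeServer(object):
    """In-memory stand-in for a database server with one session."""

    def __init__(self):
        self.committed = []
        self.pending = []  # rows of the open, uncommitted transaction

    def drop_session(self):
        # Losing the session rolls back its on-going transaction.
        self.pending = []


class FakeConnection(object):
    def __init__(self, server):
        self.server = server
        self.alive = True

    def insert(self, row):
        if not self.alive:
            raise ConnectionLost()
        self.server.pending.append(row)

    def commit(self):
        if not self.alive:
            raise ConnectionLost()
        self.server.committed.extend(self.server.pending)
        self.server.pending = []


class HidingConnector(object):
    """Connector that reconnects and retries without telling anyone."""

    def __init__(self, server):
        self.server = server
        self.cnx = FakeConnection(server)

    def insert(self, row):
        try:
            self.cnx.insert(row)
        except ConnectionLost:
            self.cnx = FakeConnection(self.server)  # silent reconnect
            self.cnx.insert(row)                    # silent retry

    def commit(self):
        self.cnx.commit()


server = FakeServer()
connector = HidingConnector(server)

connector.insert("row-1")    # first statement of the transaction
connector.cnx.alive = False  # the connection fails ...
server.drop_session()        # ... and the server rolls back row-1
connector.insert("row-2")    # hidden reconnect and retry
connector.commit()

print(server.committed)      # ['row-2']: a partial transaction
```

The application issued two inserts and one commit and saw no error, yet only the second row was committed.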

It is worth noting that even if statements are executed in “autocommit” mode, it is still unsafe to hide failures from the application. In this case, an attempt to automatically reconnect and execute the statement again may lead to the statement being executed twice. This may happen because the connection may have failed after the statement was successfully executed but before the server had a chance to reply to the connector.
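The double execution can also be shown with a few lines of simulation. Again, the classes and names below are hypothetical and only model the behaviour described in the paragraph above.

```python
class ReplyLost(Exception):
    """Raised when the connection drops before the reply arrives."""


class AutocommitServer(object):
    """In-memory stand-in for a server running in autocommit mode."""

    def __init__(self):
        self.balance = 0
        self.lose_next_reply = False

    def execute_increment(self, amount):
        # The statement is applied and committed immediately.
        self.balance += amount
        if self.lose_next_reply:
            # The work is done, but the client never hears about it.
            self.lose_next_reply = False
            raise ReplyLost()
        return "OK"


def execute_with_blind_retry(server, amount):
    """What a failure-hiding connector would do: retry on any error."""
    try:
        return server.execute_increment(amount)
    except ReplyLost:
        return server.execute_increment(amount)


autocommit_server = AutocommitServer()
autocommit_server.lose_next_reply = True
execute_with_blind_retry(autocommit_server, 10)

print(autocommit_server.balance)  # 20, although the application asked for 10
```

Only the application knows whether the statement is safe to repeat, which is why the failure has to be reported to it.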

Fourth Tip: Guarantee that session information is properly set after getting a connection

From a fault-tolerance perspective the application looks better now. However, we are still missing one key point.

We should use the "my.cnf" configuration file to set up the necessary MySQL properties (e.g. autocommit, transaction isolation level). However, if several applications share the same database server and require different configuration values, these values should be defined in the routine that gets a connection. If you set them in a different place, you risk forgetting to set them again when getting a new connection after a failure. Our code snippet already follows this rule and you are safe in that sense.

This suggestion is especially important when the applications (i.e. components) share the same address space and use a connection pool.
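A minimal sketch of this rule is shown below: the session settings are applied inside the same routine that opens the connection, so they are applied again after every reconnection. The open_session name, the SESSION_SETUP tuple and the particular settings are assumptions made for illustration; the connect argument is any callable that accepts the connection parameters as keyword arguments and returns an object with a cursor() method.

```python
# Session-level settings this (hypothetical) application relies on.
SESSION_SETUP = (
    "SET autocommit = 0",
    "SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED",
)


def open_session(connect, db_params, setup_statements=SESSION_SETUP):
    """Open a connection and configure the session in a single step.

    Because both actions live in one routine, a connection obtained
    after a failure can never be used without its session settings.
    """
    cnx = connect(**db_params)
    cursor = cnx.cursor()
    for statement in setup_statements:
        cursor.execute(statement)
    return cnx, cursor
```

In the Application class, connect() would be the natural place to call such a routine, both for the first connection and for every reconnection.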

We should also avoid using temporary tables and/or user-defined variables to transfer data between transactions. Although this is a common technique among developers, it will fail miserably after a reconnection because the session information will be lost, and rebuilding the necessary context may require an expensive routine. So starting every transaction with a “clean slate” is probably the safest and most solid approach.

Fifth Tip: Design all application components taking failures into account

Finally, it is worth noting that if the database fails, the system as a whole will be unavailable. So to build a truly resilient solution, we still need to deploy some type of redundancy at the database level. We will discuss possible high availability solutions for MySQL in a different post.

See http://alfranio-distributed.blogspot.com/2013/09/writing-fault-tolerant-database.html

Saturday, September 21, 2013
