Optimistic and Pessimistic Concurrency Control
Most web applications allow a user to query a database, retrieve a local copy of the queried data, make changes to that local copy, and then finally send the updates and the unchanged values back to the database.
These are often described as CRUD (Create, Read, Update, Delete) applications. An example would be a simple web registration form which would request from the user their first name, last name, email address, password, and mailing list preferences in order to create a profile on the site. In this scenario, we’ll have a one to one mapping of the user table to the form on a web page. An example of each CRUD operation would be:
- Create – registering on the site
- Read – logging in and viewing your profile
- Update – changing your mailing list preferences
- Delete – requesting your account be closed. Many times data is not actually deleted from the database, but instead an update is executed to set the status to “inactive”.
In this user profile example, only one person is updating the data (since only one person knows the correct password needed to access the record) so the data in the database is the same between the Read and Update steps. However, there are some web applications where multiple users can modify the same data set concurrently. These applications can suffer from the “lost update” problem.
Lost Update Problem
To illustrate the lost update problem, consider a web application which allows salespeople to maintain a list of clients. This web application allows any salesperson to view a list of clients and then click on any one client to update that client’s information. The user must click the “Save” button to persist their changes to the database.
Using this web application, Joe (Salesperson) wants update Bob’s (Client) record by adding Bob’s new cell phone number. Joe performs a search, the application displays a matching list of client records, and Joe clicks on Bob’s record. At this time, the data currently stored in the database for Bob is retrieved and displayed on a page in Joe’s web browser. Joe decides to get a cup of coffee from the break room before making his change to Bob’s record.
While Joe gets his coffee, Fred (Salesperson) wants to update the address on Bob’s record. Like Joe, Fred accesses Bob’s record via a search and is shown the data currently stored in the database for Bob. At this point, both Fred and Joe have the same data displayed in their web browser for Bob. Fred changes Bob’s address in the form, clicks the “Save” button, and goes to Lunch.
Joe comes back with his coffee. At this point, Joe’s view of Bob’s data is not the current data for Bob in the database. Fred changed Bob’s address and that change is not shown in Joe’s web browser. Most web applications operate in this manner – they only send data to the user (Joe) when the user requests it. Now Joe adds Bob’s cell phone number and clicks the “Save” button. This causes Bob’s new phone number as well as the outdated address that was shown in his form to be saved in the database. Fred’s update is now lost.
Fred gets back from lunch, searches for and views Bob’s record again, and sees Bob’s old address along with a new phone number. Fred is angered and fires off an email to his boss telling him the cool new sales web application is broken and he wants to use the old application.
As you can imagine, this is not good for the application development team or the salespeople. The diagram below illustrates the sequence of events leading up to the lost update.
As is often the case, each time the user clicks “Save” on the page in their web browser, all the data in their web form gets sent to the application server and written to the database. The flaw in this case is the application assumes the data submitted by the user is the current state of the data in the database. As you can see with the above example with Joe and Fred, this is not necessarily true.
To handle the lost update problem, you can use either optimistic concurrency control or pessimistic concurrency control.
Optimistic Concurrency Control
Using optimistic concurrency control, an additional value is sent along with the user’s web form data when the user clicks the “Save” button. This additional value is then used to determine if someone else changed the data in the database after this user last read it. If the data in the database was not changed, the user’s change succeeds otherwise it fails and the user gets a friendly error message along the lines of “Another user has changed the data since your last request. Please try again”.
At this point, some implementations might try to help the user merge their changes with the current state in the database. Others implementations send the user the latest state of the data in the database and force them to re-enter their changes. As you can imagine, optimistic concurrency control can get very frustrating if you have a large web form and get many “please try again” messages.
A common implementation of optimistic concurrency control is to have a timestamp column in your table that is automatically updated by the database to match the time the row was last modified. When the user reads the record from the database to populate their web form, the timestamp is included and becomes a hidden form field that is sent back to the database with all the other form data when the users clicks the “Save” button. The “where” clause of the update SQL statement is modified to now include the timestamp:
update … where id = ? and timestamp = ?
Going back to our example, if it used optimistic concurrency control both Fred and Joe would get the same timestamp value from the database in their initial read. Fred’s phone number change would succeed because the timestamp matches the current timestamp in the row for Bob. Besides changing the phone number, Fred’s change causes the database to update Bob’s timestamp column to match the time Fred submitted his change.
Now Joe sends his address change with the old timestamp and the application would reject it because the where clause, which now includes the timestamp, does not succeed. The application would give Joe an error message along with the current state of Bob’s record, including the new timestamp. Joe now has the option to redo his address change.
Pessimistic Concurrency Control
Using pessimistic concurrency control, the concept of locking data to prevent others from attempting to modify it is introduced. A user can only change data if the user has its lock. The lock on the data is obtained when the data is initially read and is released when the user sends their changes to the server. If a user attempts to obtain the lock when another user already has it, it will fail and the user would get an error message.
Locks can create a number of issues. The classic example is one user needing to change a record locked by another user who is on vacation. With pessimistic concurrency, some things to consider are how locks are acquired and released, if locks automatically expire after a period of time, and if a user can forcefully unlock data locked by another user.
A common implementation of pessimistic concurrency control is to use a “select … for update nowait” SQL statement when reading data with the intent to modify it. Once that statement executes, that user has the lock. Another user who later attempts to execute the same statement before the user with the lock either does a commit or rollback would get an error. For Oracle, the error is:
ORA-00054: resource busy and acquire with NOWAIT specified
If the example sales application used pessimistic concurrency control, Joe’s request to read Bob’s data from the database would give Joe the lock on Bob’s record. Later, when Fred would read Bob’s record with the intent to modify it, Fred would be given a “This record is currently locked. Please try again later” application error since Joe already has the lock. Once Joe clicks the “Save” or “Cancel” button, the lock is released.
Of the two strategies, optimistic concurrency control is widely used in web applications. Pessimistic concurrency control is not often seen with web applications, however it is seen with client/server (2-tier) applications.
To use pessimistic concurrency control, the client needs to maintain the same database connection for their entire interaction with the database (think SQL*Plus). This is often described as a stateful connection. Web applications commonly use a database connection pool in the application server for performance and scalability reasons (a relatively small number of db connections can be recycled and serve a large number of web clients). As a result state is not retained because a web user gets a random db connection from the pool with each request.
Going back to the sequence diagram at the beginning, each time Joe or Fred submits their request to the application server, a random db connection from the database connection pool is extracted, executes the SQL statements on their behalf, and is returned to the pool for the next user request.
Another valid option to handling the lost update problem is to design your application so multiple users do not have write access to the same set of records. In my example, if each client is assigned a salesperson, then only the assigned salesperson can change the record. Joe is assigned to Bob and only Joe can change Bob’s data. Fred would never see Bob in the list of users he can edit.
It is also worth noting that loosing an update may not be a problem for some applications as they may have a history table or an audit table that allows users to view past changes inside the application. These applications follow what is a called a “Last update wins” or “Last commit wins” strategy. The sales web application as described at the beginning of this post uses the “Last update wins” strategy. This strategy is very easy to implement because it is what you get if you do nothing.
The “lost update” problem is not a very difficult problem to solve once you are made aware of it.