Sunday, August 7, 2016

ACID (Part 4)

In the previous post, we discussed the Read Committed Isolation level. That level solved 1 type of read phenomena, Dirty Read, which we used to get in the Read Uncommited isolation level, but we still got Repeatable and phantom read with that level. In this post we talk about repeatable read isolation level. A slightly more expensive level of implementation but can kill the non-repeatable read phenomena.

The final state of our Like table look like this from the previous post.



Here is the reference to all ACID posts that I authored:

ACID (Part 1) - Atomicity and Consistency
ACID (Part 2) - Isolation (Read Uncommitted)
ACID (Part 3) - Isolation (Read Committed)
ACID (Part 4) - Isolation (Repeatable Read)


Repeatable Read

With a repeatable read isolation level, we not only read entries from committed transactions but we also read it from a previous timestamp. Usually it is the moment from we began our transaction, any other transactions that update entries after that moment will be overlooked by our transaction and it will attempt to retrieve an older "version" of the truth to make sure results are consistent. Lets jump to examples.


As we see, Eddard has already liked picture 2, he is attempting to fire another like to the same picture, we have atomicity and consistency that prevent him from doing so but lets see what happens. Eddard fires up a like on picture 2, and fraction of a second later Sansa loads up picture 2, this will retrieve the number of likes and the list of users who liked it. 

Eddard sends the like, first query executes successfully, incrementing the likes count.



Before Eddard second query executes, Sansa's Select kicks in to read picture 2 row, she is going to get 3 likes instead of 4. This is because we are operating under the repeatable read isolation level, which only reads committed transactions, and since Eddard still did not commit (or rollback) his transaction, Sansa is getting the current commited value. So we have avoided a dirty read phenomena.




Sansa issues another read to the Like table to get all the users who likes picture 2, she gets three rows, Jon, Reek and Eddard. Consistent with the number of likes she got.




Eddard transaction moves on and executes the second query which fails because of the constraint we have in place. Rolling back the entry for likes back to 3.







Eddard gives up, his transaction is finished and he failed to ruin our system consistent state, (we also chopped off his head). Meanwhile a new user comes in, Cersei and burns the database to the ground with wild fire. Not really, she likes picture 2, she is a brand new user who never liked picture 2 before so his transaction commits fine. 








Sansa's transaction is still running she is querying other tables, doing some stuff, updating the view count perhaps, and then finally, she comes back for a final read of picture 2 getting the likes count. Although the final committed value is 4, Sansa is going to get the original committed value when her transaction began which was 3. So we avoided getting the non-repeatable read phenamona with Repeatable read isolation level. It is slightly expensive since we have to keep history of versions of each committed values and go back searching for a previous value with a timestamp.





She issues a final read to the Like table to find out the list of users who liked picture 2, and surprise surprise, she got an extra record, hence Phantom read is still reproducible with repeatable read isolation level.






So we fix one problem with this isolation level but we introduce a cost of keeping history versions of previous committed values which we didn't have to with Read Committed level.

Next up, serializable isolation level.

-Hussein