When PostgreSQL Doesn’t Prioritize Consistency!
How do concurrently running queries affect each other’s execution in databases? While MVCC (Multi-Version Concurrency Control) allows queries to operate on consistent snapshots, the impact varies between READ and WRITE transactions.
All the cases described below assume the Read Committed isolation level.
Read Consistency
Let’s illustrate this with an example. Suppose transaction T1 begins a full table scan at 10:00 AM and is expected to finish by 11:00 AM. However, at 10:10 AM, another transaction T2 starts, updates a few rows (specifically, some of the last rows not yet read by T1, located via an index scan), and commits those changes at 10:11 AM. Now, the question is: what happens to the results of T1? Will it include the older versions of the rows or the newer versions modified by T2? In this scenario, T1 will return the older versions of the rows. This is because T1 operates on a consistent snapshot of the table taken when it began reading.
How is this read consistency achieved? It’s all thanks to Multi-Version Concurrency Control (MVCC). Essentially, when T2 updates the rows, it creates new versions of those data entries (tuples). MVCC allows T1 to still “see” and access the original versions using associated metadata, such as xmin (the transaction ID that created the tuple) and xmax (the transaction ID that deleted or updated the tuple). This ensures that T1 remains unaffected by T2's changes and maintains a consistent view of the data.
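The xmin/xmax bookkeeping can be sketched with a toy visibility check. This is a simplified Python model added purely for illustration; real PostgreSQL visibility rules are considerably more involved (they also track in-progress and aborted transactions, hint bits, and per-statement snapshots under Read Committed):

```python
# Toy model of MVCC tuple visibility. Each tuple version carries xmin (the
# txid that created it) and xmax (the txid that deleted/updated it, if any).
# A reader sees a version only if it was created at or before its snapshot
# and not yet deleted as of that snapshot.

from dataclasses import dataclass
from typing import Optional

@dataclass
class TupleVersion:
    value: str
    xmin: int                   # txid that created this version
    xmax: Optional[int] = None  # txid that deleted/updated it, or None

def visible(t: TupleVersion, snapshot_xid: int) -> bool:
    """Visible if created at-or-before the snapshot and not deleted
    by a transaction at-or-before the snapshot."""
    created_before = t.xmin <= snapshot_xid
    deleted_before = t.xmax is not None and t.xmax <= snapshot_xid
    return created_before and not deleted_before

# T1's snapshot is at txid 100. T2 (txid 200) later updates the row:
# the old version gets xmax = 200, and a new version with xmin = 200 appears.
old_version = TupleVersion(value="a-v1", xmin=50, xmax=200)
new_version = TupleVersion(value="a-v2", xmin=200)

print(visible(old_version, snapshot_xid=100))  # True:  T1 still sees the old row
print(visible(new_version, snapshot_xid=100))  # False: created after T1's snapshot
print(visible(old_version, snapshot_xid=300))  # False: a later reader sees only
print(visible(new_version, snapshot_xid=300))  # True:  the new version
```

Note how the same two physical versions yield different answers for different snapshots; that is exactly why T1's scan is undisturbed by T2's commit.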
Write Consistency
That covers read consistency. But how is consistency maintained when we have two concurrent write transactions? This scenario gets a bit more interesting. Let’s revisit our earlier example, but this time, T1 executes an UPDATE query instead of a SELECT. What happens then?
We have two possibilities:
- T1 updates the original version of the row: This would mean the changes made by T2 are lost, effectively overwriting them.
- T1 updates the newer version of the row: This preserves T2’s changes and applies T1’s modifications on top of them.
PostgreSQL chooses the second option. In essence, T1 “sees” the modifications made by a “future” transaction T2 and works on the updated version of the row. However, it’s important to note that T1 only operates on rows that already existed when it started: those rows may have been modified by T2, but if T2 inserts new rows, T1’s UPDATE won’t touch them.
Let’s understand this with a real example given in the official PostgreSQL docs.
e.g., assume website is a two-row table with website.hits equaling 9 and 10:
BEGIN;
UPDATE website SET hits = hits + 1; /* T1 */
-- run from another session: DELETE FROM website WHERE hits = 10; /* T2 */
COMMIT;
The DELETE will have no effect even though there is a website.hits = 10 row both before and after the UPDATE. This happens because the pre-update row value website.hits = 9 is skipped (the DELETE checks only the pre-existing versions, then blocks waiting on the second row, website.hits = 10). When the UPDATE completes and the DELETE obtains a lock on the second row, it works on the updated row, but the new value is 11, not 10, which no longer matches the criteria.
So there is a problem with this PostgreSQL behavior: the DELETE isn’t working on a consistent view of the data. Here’s why:
- Before the DELETE starts: Tuple1: website.hits = 9, Tuple2: website.hits = 10
- After the UPDATE ends: Tuple1: website.hits = 10, Tuple2: website.hits = 11
T2 effectively worked on an in-between state (Tuple1: website.hits = 9, Tuple2: website.hits = 11), but we know our data was never actually in that state. Indeed, the result of concurrent transactions depends on which one reaches and processes the rows first.
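The timeline above can be simulated step by step. This is a Python sketch of the re-check behavior, not real database code; it models the DELETE as two phases: first it selects candidate rows using its statement snapshot, then it re-evaluates the predicate against the latest committed version of each candidate once the blocking UPDATE commits:

```python
# Simulate the website.hits example under Read Committed.
snapshot = [9, 10]   # values the DELETE's statement snapshot sees
current = [9, 10]    # latest committed values (mutated by the UPDATE)

pred = lambda v: v == 10  # DELETE FROM website WHERE hits = 10

# Phase 1: the DELETE scans using its snapshot and picks candidates.
# Row 0 (hits = 9) fails the predicate and is skipped for good.
candidates = [i for i, v in enumerate(snapshot) if pred(v)]  # -> [1]

# T1's UPDATE (hits = hits + 1) commits while the DELETE is blocked.
current = [v + 1 for v in current]  # table is now [10, 11]

# Phase 2: the predicate is re-checked against the *updated* version of
# each candidate. Skipped rows (like row 0, now 10) are never revisited.
deleted = [i for i in candidates if pred(current[i])]

print(deleted)  # []: the candidate's value is now 11, nothing is deleted
print(current)  # [10, 11]: a hits = 10 row exists, yet the DELETE missed it
```

The mismatch in the final print is the inconsistency described above: the DELETE's decision combined the pre-update value of one row with the post-update value of the other.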
Oracle takes a different approach to handle conflicts between two write transactions, beautifully explained by Tanel Poder. When it detects such a conflict, it rolls back the affected statement and reruns it (a statement-level restart). This means the statement effectively starts over, working on a consistent snapshot of the data taken after T1 has completed its changes. While this approach is more expensive due to the rollback and rerun, it ensures stronger consistency by preventing T1 from inadvertently interfering with T2’s modifications.
Can we really expect consistent and predictable results in a highly concurrent environment without any safeguards in place? To avoid unexpected behavior and ensure data integrity, we should consider techniques like explicit row locking with SELECT FOR UPDATE, or opting for a stricter transaction isolation level such as REPEATABLE READ or SERIALIZABLE.
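To see why holding a lock across the read and the write matters, here is a toy Python analogy (my own illustration, not PostgreSQL internals), using a threading.Lock as a stand-in for the row lock that SELECT ... FOR UPDATE acquires. Two concurrent read-modify-write "transactions" increment the same counter; because each one locks the row before reading and releases only after writing, no increment is lost:

```python
# Lock held across read-modify-write = no lost updates.
import threading

row = {"hits": 0}
row_lock = threading.Lock()  # plays the role of the row-level lock

def increment(n: int) -> None:
    for _ in range(n):
        with row_lock:             # SELECT ... FOR UPDATE: lock, then read
            v = row["hits"]
            row["hits"] = v + 1    # UPDATE based on the locked read

t1 = threading.Thread(target=increment, args=(10_000,))
t2 = threading.Thread(target=increment, args=(10_000,))
t1.start(); t2.start()
t1.join(); t2.join()

print(row["hits"])  # 20000: every increment survives
```

Without the lock, the two readers could both read the same value and one of the writes would be silently overwritten, which is the same lost-update hazard the row lock in SELECT FOR UPDATE is there to prevent.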