Postgres Faster Vacuum-2

Virender Singla
6 min readDec 8, 2021

(TOAST impact on VACUUM)

In part-1, I talked about few database parameters, table level configurations and possible future improvements to run a vacuum job more efficiently. Let’s continue on the same topic and see how table storage impacts vacuum job.

This post is mostly focused on Postgres TOAST . If you are unaware of what it is, please take a look on the official documentation. In a summary how I understand this feature is “divide a single large row into multiple rows and store them in multiple tables (main table and a TOAST table) rather than storing in a single table. Then also have a Foreign Key kind of relationship to join two tables data together to produce a combine tuple.

So let’s quickly create test datasets and observe how storage parameters can make a difference on vacuum job.

###Table1 with extended storage (out of line storage)create table table_extended (id int, age int, name text);ALTER TABLE table_extended SET (
autovacuum_enabled = false, toast.autovacuum_enabled = false);
###Table2 with main storage (in line storage)create table table_main (id int, age int, name text);alter table table_main alter column name set storage main;ALTER TABLE table_main SET (
autovacuum_enabled = false, toast.autovacuum_enabled = false);

I have loaded both the table with a wider “name” column data so TOAST feature will get triggered to decide compression/out of line storage.

The TOAST management code is triggered only when a row value to be stored in a table is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB).

###Loaded same amount of data in both the tables and check the sizeSELECT
c1.relname,
pg_size_pretty(pg_relation_size(c1.relname::regclass)) AS size,
c2.relname AS toast_relname,
pg_size_pretty(pg_relation_size(('pg_toast.' || c2.relname)::regclass)) AS toast_size
FROM
pg_class c1
JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE
c1.relname LIKE 'table_%'
AND c1.relkind = 'r';
relname | size | toast_relname | toast_size
----------------+---------+------------------+------------
table_extended | 29 MB | pg_toast_2116928 | 1302 MB
table_main | 1302 MB | pg_toast_2116934 | 0 bytes

So looking at table size above, we see tables are of similar size but the difference here is out of line storage for table_extended and in line storage for table_main.

Now let’s update these tables and make everyone 5 years older.

###cleaned up DB and OS cache beforepostgres=# update table_extended set age=age+5;
UPDATE 500000
Time: 2061.278 ms (00:02.061)
postgres=# update table_main set age=age+5;
UPDATE 500000
Time: 80259.084 ms (01:20.259)
relname | size | toast_relname | toast_size
----------------+---------+------------------+------------
table_extended | 57 MB | pg_toast_2616940 | 1302 MB
table_main | 2604 MB | pg_toast_2616946 | 0 bytes

As we are updating age column only and no change in TOASTed column “name” hence in table_extended all the UPDATE processing happening in the main table only and it did not touch TOAST table, basically it created a new version of tuple for columns id and age only. Whereas table_main has become double in size as it creates a new version of entire tuple including all columns id, age and name. Run time difference is also there which is obvious looking at the resultant table size.

During an UPDATE operation, values of unchanged fields are normally preserved as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if none of the out-of-line values change.

Now I have run vacuum on both the tables to cleanup dead tuples and see a clear difference in the run time which also make sense because of difference in main table size.

###cleaned up DB and OS cachepostgres=# vacuum table_extended;
VACUUM
Time: 14500.174 ms (00:14.500)
postgres=# vacuum table_main;
VACUUM
Time: 42836.927 ms (00:42.837)

One important thing to note here is regular vacuum job runs on main table or TOAST table based on dead tuples in them (either we had updated main table columns or TOAST table columns or both). But both main table and TOAST table have their separate visibility information (xmin/xmax/visibility map) and hence anti wraparound vacuum basically has to freeze the rows in both storage areas of a physical table.

Here I have run vacuum freeze and run times are comparable.

postgres=# vacuum freeze table_extended;
VACUUM
Time: 32770.891 ms (00:32.771)
postgres=# vacuum freeze table_main;
VACUUM
Time: 31090.077 ms (00:31.090)

I initiated a thread in pgsql-hacker to understand the need for separate visibility info for a TOAST table and there were nice reasons put up by experts.

Summary: In the above example i compared best case and worst case scenarios and shown the difference but most of the time Postgres is smart enough and it compress and optionally choose out of line storage for a wider column. However for some edge cases, we can have control in our hands.

There are two ways to control the TOAST table:

1.) alter the columns to set appropriate storage
alter table table_extended alter column name set storage extended/main/plain/external;
2.) set parameter toast_tuple_target
create table table_main (id int, age int, name text) WITH (toast_tuple_target=128);

So is it that simple to have a TOAST table every time (out of line storage)? Apart from VACUUM job considerations there are other considerations having a TOAST table storage along with the main table.

Suppose we want to SELECT only few columns that can fit into the main table then definitely with TOAST table approach, we can scan those few columns faster as it has to scan less data in the main table and all the wider rows in TOAST table will be untouched.

ID, AGE --> Main Table, NAME --> TOAST TABLE
SELECT ID,AGE from <TABLE>;

In opposite case, if we want to SELECT all the columns in a table, so there will be some performance penalty in joining main table and TOAST table and then return combined tuples ( i have not done any bench marking over here).

ID, AGE --> Main Table, NAME --> TOAST TABLE
SELECT ID,AGE,NAME from <TABLE>;

Parallel Vacuum Job:

In Postgres13 Index vacuum/cleanup happens in parallel manner and that is with one worker per Index.

While learning TOAST table concept, I came to know that Postgres allows vacuuming of main and toast table separately and I guess autovacuum triggers vacuum job based on individual table stats( dead tuples threshold, age). But if we run manual VACUUM on a table, it runs on main table and then on TOAST table in sequential manner.

So why not run parallel vacuum on main and TOAST table in case of manual vacuuming.

relname     |  size   |  toast_relname   | toast_size
----------------+---------+------------------+------------
table_extended | 1235 MB | pg_toast_2616940 | 3986 MB
relname | n_live_tup | n_dead_tup
----------------+------------+------------
table_extended | 503443 | 21005650
relname | n_live_tup | n_dead_tup
------------------+------------+------------
pg_toast_2616940 | 1000000 | 2061080

Note that TOAST table row count can be different compared to main table as a wider row is split into multiple rows in TOAST table.

Session1: first run vacuum on TOAST table.vacuum (verbose,freeze) pg_toast.pg_toast_2616940;Session2: Then run vacuum on main table.vacuum (verbose,freeze) table_extended;

Once main table cleanup is done and it goes for TOAST table, it gets blocked by session1 if it is still running. That is expected as we can not run two VACUUM jobs on a single table.

pid  | usename  | blocked_by |              blocked_query
------+----------+------------+-------------------------------------
1147 | postgres | {1148} | vacuum (verbose,freeze) table_extended;

Also Session2 could not free up empty blocks at the end of table because of conflicting Session1. Once Session1 resumes, all it does is free up those empty blocks.

Sessoion2:
INFO: "pg_toast_2616940": found 1267454 removable, 1000000 nonremovable row versions in 510180 out of 510180 pages
INFO: "pg_toast_2616940": stopping truncate due to conflicting lock request
Sessions1:
INFO: "pg_toast_2616940": found 0 removable, 0 nonremovable row versions in 1 out of 510180 pages
INFO: "pg_toast_2616940": truncated 510180 to 500000 pages

Do share your experiences and feedback.

--

--