# PostgreSQL Unlogged Tables - Look Ma, No WAL!

Greg Sabino Mullane · Feb 17, 2023 · 11 min read

While supporting customers at Crunchy Data, I sometimes see users add unlogged tables. In this post, I'm going to review some of the specifics about them.

Unlogged tables in PostgreSQL are a way of preventing specific tables from generating WAL (Write Ahead Log) information, by adding the `UNLOGGED` keyword to a `CREATE TABLE` command:

```sql
CREATE UNLOGGED TABLE foobar (id int);
```

You can also change existing tables to unlogged, and switch them back again:

```sql
ALTER TABLE mytable SET UNLOGGED; -- cheap!
ALTER TABLE mytable SET LOGGED;   -- expensive!
```

While there are strong advantages to using unlogged tables, you must use them carefully, as their disadvantages make the number of valid use cases small.

Postgres unlogged table advantages:

* Massive improvements to write performance (as seen below).
* Less vacuum impact (because vacuum changes are writes that also end up in the WAL stream).
* Less total WAL (leading to less traffic, smaller backups, etc.).

Postgres unlogged table disadvantages:

* Tables are truncated on crash recovery. No durability!
* Unlogged tables can only be accessed on the primary, not on the replicas.
* Unlogged tables can NOT be used in logical replication or physical backups.

*Unlogged tables in Postgres generate no WAL and are faster to update.*

Let's verify that an unlogged table prevents generating WAL. First, we will create a test table using the pgbench utility, and then update it a bit. Not only will we see how long the updates take, but we will peek inside the pg_wal directory and see how much WAL was created. For this test, it's important that autovacuum remains off and no other clients are connecting, to keep things honest.
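As an aside, counting files on disk is not the only option: Postgres can report WAL usage directly via its LSN functions. Here is a minimal sketch, run inside a single psql session against the pgbench_accounts table we are about to create (any table works):

```sql
-- Save the current WAL insert position into a psql variable
-- (pg_current_wal_insert_lsn() exists in PostgreSQL 10+):
SELECT pg_current_wal_insert_lsn() AS before_lsn \gset

-- Do some work:
UPDATE pgbench_accounts SET aid = aid;

-- How many bytes of WAL did that generate?
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_insert_lsn(), :'before_lsn')
       ) AS wal_generated;
```

For the tests below, we'll take the cruder route of counting the 16MB segment files that appear in pg_wal, since it also shows the segment churn.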
```
# Ensure that autovacuum is (temporarily!) disabled:
$ psql -c 'alter system set autovacuum=off' -c 'select pg_reload_conf()'

# Generate a table with one million rows:
$ pgbench -i -s 10
dropping old tables...
creating tables...
generating data (client-side)...
1000000 of 1000000 tuples (100%) done (elapsed 1.25 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 3.60 s (drop tables 0.04 s, create tables 0.01 s, client-side generate 1.28 s, vacuum 1.81 s, primary keys 0.48 s).

# Update one million rows:
$ touch gregtest; time psql -c 'update pgbench_accounts set aid=aid'
UPDATE 1000000

real    0m5.592s
user    0m0.002s
sys     0m0.000s

# Determine how much WAL was generated by the above command:
$ DD=$(psql -Atc 'show data_directory')
$ find $DD/pg_wal -newer gregtest -size 16M | wc -l | xargs echo 16* | bc
320
```

To update a million rows in a normal, logged Postgres table took us 5.6 seconds, and generated roughly 320MB of WAL. How different will an unlogged table be?

```
# pgbench can directly create its tables as unlogged:
$ pgbench -i -s 10 --unlogged
dropping old tables...
creating tables...
generating data (client-side)...
1000000 of 1000000 tuples (100%) done (elapsed 0.73 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 1.56 s (drop tables 0.03 s, create tables 0.03 s, client-side generate 0.80 s, vacuum 0.15 s, primary keys 0.20 s).

$ touch gregtest; time psql -c 'update pgbench_accounts set aid=aid'
UPDATE 1000000

real    0m3.173s
user    0m0.002s
sys     0m0.000s

$ find $DD/pg_wal -newer gregtest -size 16M | wc -l | xargs echo 16* | bc
16
```

Not only did we perform the update nearly twice as fast as before, but the table creation time and the amount of WAL generated shrank as well! Since pgbench is a benchmarking utility, let's have it run for 10 seconds and see the difference in speed for the default test pgbench uses:

```
$ pgbench -i -s 10
$ pgbench --time 10
pgbench (15.1)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 1680
number of failed transactions: 0 (0.000%)
latency average = 5.949 ms
initial connection time = 7.761 ms
tps = 168.087557 (without initial connection time)
```

Now unlogged:

```
$ pgbench -i -s 10 --unlogged
$ pgbench -T 10
pgbench (15.1)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 4857
number of failed transactions: 0 (0.000%)
latency average = 2.059 ms
initial connection time = 2.522 ms
tps = 485.706168 (without initial connection time)
```

That's a big difference! 168 transactions per second versus 485 transactions per second for the unlogged version. So while the performance gains are real, let's look closer at the drawbacks.

## Unlogged tables in Postgres are not persistent

The data in unlogged tables is ephemeral and lives outside of the strict data protection that Postgres usually provides. Should someone literally pull the plug on your Postgres database server, your data is 100% safe, except for unlogged tables, in which your data is 100% gone, by design! Because changes are not logged, there is no way to guarantee data integrity after a crash, so the table is simply truncated.
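Given that risk, it's worth knowing exactly which tables in a database are unlogged. A quick catalog query turns them up; `relpersistence` in `pg_class` is 'u' for unlogged relations:

```sql
-- List all unlogged tables in the current database.
-- relpersistence: 'p' = permanent, 'u' = unlogged, 't' = temporary
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relpersistence = 'u'
  AND c.relkind = 'r';
```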
Note that a simple restart will NOT truncate the table:

```
$ pgbench -i -s 10 --unlogged
$ psql -Atc 'select count(*) from pgbench_accounts'
1000000
$ pg_ctl -D data restart
waiting for server to shut down.... done
server stopped
...
server started
$ psql -Atc 'select count(*) from pgbench_accounts'
1000000
```

However, a crash will truncate it. We can force Postgres to crash by forcibly killing one of its subsystems. In this example, we will pick on the walwriter backend. Do not do this on a production system!

```
$ pgbench -i -s 10 --unlogged
$ kill -9 $(psql -Atc "select pid from pg_stat_activity where backend_type = 'walwriter'")
$ psql -Atc 'select count(*) from pgbench_accounts'
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed:
FATAL:  the database system is in recovery mode
$ psql -Atc 'select count(*) from pgbench_accounts'
0
```

That data is lost forever. Lucky for us, it was only random sample data generated by the pgbench program, so we can recreate it easily enough using pgbench -i. If you are using unlogged tables, you must always be ready to either lose the data forever, or have a way to quickly repopulate the table.

## Unlogged tables in Postgres are not replicated

Because streaming replication (and, for that matter, log shipping of WAL files) uses information from the WAL files to update replicas, unlogged tables cannot be replicated. They do exist on the replica, but are empty and not accessible.

```
# Spin up a replica
$ pg_basebackup -R -c fast -D data_replica

# Set a new port:
$ echo port=5555 >> data_replica/postgresql.conf

# Cluster names are handy:
$ echo "cluster_name='myreplica'" >> data_replica/postgresql.conf

# We do not want this replica sending WAL files to pgbackrest
$ echo "archive_mode=off" >> data_replica/postgresql.conf

# Start up the replica on port 5555
$ pg_ctl -D data_replica start

# Generate some WAL traffic on the primary (default port)
$ pgbench -i --unlogged; pgbench -T 3

# Confirm replication is happening
$ psql -c 'select application_name, replay_lag from pg_stat_replication'
 application_name |   replay_lag
------------------+-----------------
 myreplica        | 00:00:00.001221
```

On the replica, the table is not simply empty: it is completely inaccessible, and throws an error when we try to view it:

```
$ psql -p 5432 -Atc 'select count(*) from pgbench_accounts'
100000
$ psql -p 5555 -Atc 'select count(*) from pgbench_accounts'
ERROR:  cannot access temporary or unlogged relations during recovery
```

Here, "recovery" means that the replica is constantly recovering, or playing through the WAL stream as it comes in. Because the WAL stream does not contain information about any changes in the table, we cannot look at its contents.
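If an application shares connection strings between the primary and its replicas, it may want to guard against hitting that error. A minimal PL/pgSQL sketch, assuming the pgbench tables; `pg_is_in_recovery()` returns true on a replica that is replaying WAL:

```sql
DO $$
BEGIN
    IF pg_is_in_recovery() THEN
        -- We are on a replica: reading an unlogged table would raise
        -- an error here, so skip (or redirect) the query instead.
        RAISE NOTICE 'on a replica, skipping unlogged table';
    ELSE
        PERFORM count(*) FROM pgbench_accounts;
    END IF;
END $$;
```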
Indeed, the underlying data files do not even exist on the replica. While the replica is almost a byte-for-byte copy of the primary, unlogged tables are absolutely part of that "almost".

Postgres disk files for a regular table:

```
$ pgbench -i
$ PRIMARYDIR=$( psql -Atc 'show data_directory' )
$ REPLICADIR=$( psql -p 5555 -Atc 'show data_directory' )
$ FILENODE=$( psql -Atc "select pg_relation_filenode('pgbench_accounts')" )

$ find $PRIMARYDIR -ls | grep $FILENODE | sort -k9
363139 13120 -rw------- 1 greg greg 13434880 Dec 25 8:03 data/base/5/94858
363184    24 -rw------- 1 greg greg    24576 Dec 25 8:03 data/base/5/94858_fsm
363183     8 -rw------- 1 greg greg     8192 Dec 25 8:03 data/base/5/94858_vm

$ find $REPLICADIR -ls | grep $FILENODE | sort -k9
363174 13120 -rw------- 1 greg greg 13434880 Dec 25 8:03 data_replica/base/5/94858
363180    24 -rw------- 1 greg greg    24576 Dec 25 8:03 data_replica/base/5/94858_fsm
364135     8 -rw------- 1 greg greg     8192 Dec 25 8:03 data_replica/base/5/94858_vm
```

The fsm and vm files stand for "free space map" and "visibility map" respectively, and contain metadata about the actual data, which as we can see is 13,434,880 bytes on both systems. Unlogged tables look different.

Postgres disk files for an unlogged table:

```
$ pgbench -i --unlogged
$ FILENODE=$( psql -Atc "select pg_relation_filenode('pgbench_accounts')" )

$ find $PRIMARYDIR -ls | grep $FILENODE | sort -k9
363180 13120 -rw------- 1 greg greg 13434880 Dec 25 8:03 data/base/5/94880
363174    24 -rw------- 1 greg greg    24576 Dec 25 8:03 data/base/5/94880_fsm
363181     0 -rw------- 1 greg greg        0 Dec 25 8:03 data/base/5/94880_init
363133     8 -rw------- 1 greg greg     8192 Dec 25 8:03 data/base/5/94880_vm

$ find data_replica -ls | grep 94880 | sort -k9
363173     0 -rw------- 1 greg greg        0 Dec 25 8:03 data_replica/base/5/94880_init
```

The init file is an empty version of the "real" table, called an initialization fork, whose job is to replace the real file if the database crashes and starts up again, thus performing the "truncation" part of the unlogged table lifecycle.

While we cannot access the contents of the table on the replica, all of its metadata is still in place, so things like \d still work fine:

```
$ psql -p 5555 -c '\d pgbench_accounts'
        Unlogged table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default
----------+---------------+-----------+----------+---------
 aid      | integer       |           | not null |
 bid      | integer       |           |          |
 abalance | integer       |           |          |
 filler   | character(84) |           |          |
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
```

## Unlogged tables in Postgres are not backed up

Tools like pgBackRest exclude temporary and unlogged tables from backups. Again, this means that you must treat all your data in unlogged tables as very transient. Let's prove that backups do not copy the files:

```
$ psql -c 'create table aaa as select 867 as id from generate_series(5,309)'
$ psql -c 'create unlogged table bbb as select 867 as id from generate_series(5,309)'
$ psql -c 'vacuum aaa, bbb'
$ AAA=$( psql -Atc "select pg_relation_filenode('aaa')" ) ## gives 980808
$ BBB=$( psql -Atc "select pg_relation_filenode('bbb')" ) ## gives 980811

$ pgbackrest --stanza=mydemo backup

$ grep -E "$AAA|$BBB" /var/log/pgbackrest/mydemo-backup.log | cut -d\( -f1 | sort -k7
2023-12-25 13:09:37.482 P02 DETAIL: backup file /home/greg/pg/15/data_replica/base/5/980808
2023-12-25 13:09:34.492 P02 DETAIL: backup file /home/greg/pg/15/data_replica/base/5/980808_fsm
2023-12-25 13:09:37.479 P02 DETAIL: backup file /home/greg/pg/15/data_replica/base/5/980808_vm
2023-12-25 13:09:40.658 P02 DETAIL: backup file /home/greg/pg/15/data_replica/base/5/980811_init
```

If you peer closely at the output of our pgBackRest logs above, you will notice that table 'aaa' had three files backed up: the main file holding the rows, a visibility map file, and a free space map file. In contrast, the only thing that got copied for our unlogged table 'bbb' was the init file, which upon a restore will become a new, empty version of the 'bbb' table. Also observe that our backup pulled the files from the replica, thanks to the awesome backup-standby pgBackRest feature.

Be careful, because tools that do logical backups, such as pg_dump, will still include the data rows, although this is not necessarily a good thing. :)

```
$ pg_dump -t bbb | wc -l
350
```
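If you would rather have pg_dump treat unlogged tables the way physical backups do, keeping the table definition but skipping its rows, it has a flag for exactly that:

```
# Dump the schema of unlogged tables but none of their data:
$ pg_dump --no-unlogged-table-data -t bbb
```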
## Changing a Postgres logged table to unlogged is cheap

If you take a table that is logged and change it to unlogged, the process is relatively quick. While a new file on disk is written, no WAL is generated:

```
$ pgbench -i -s 20
$ time psql -c 'alter table pgbench_accounts set unlogged'
ALTER TABLE

real    0m1.754s
user    0m0.000s
sys     0m0.004s
```

## Changing a Postgres unlogged table to logged is expensive

On the other hand, changing a table from unlogged to logged means writing information about the entire table to the WAL stream, which can be expensive:

```
$ pgbench -i -s 10 --unlogged
$ time psql -c 'alter table pgbench_accounts set logged'
ALTER TABLE

real    0m16.601s
user    0m0.004s
sys     0m0.000s
```

## PostgreSQL unlogged tables - final judgement

As we have seen, the use of unlogged tables in Postgres can give you a very large performance improvement, but with some very large drawbacks. If the data is temporary and/or easily replaced, you may want to give unlogged tables a try. Just treat your unlogged tables as living in a Schrödinger's box, in which at any moment the table could contain all of the rows, or none of the rows! :)
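As a parting illustration of "temporary and/or easily replaced" data, here is a hypothetical rebuildable cache table; the names are made up for the example:

```sql
-- Hypothetical example: a cache whose contents can always be
-- rebuilt from the real (logged) tables, so a crash-truncation
-- only costs a repopulation, never actual data.
CREATE UNLOGGED TABLE report_cache (
    report_id int PRIMARY KEY,
    payload   jsonb NOT NULL,
    built_at  timestamptz NOT NULL DEFAULT now()
);

-- After any crash, simply rebuild (source query is illustrative):
-- INSERT INTO report_cache (report_id, payload)
--   SELECT id, to_jsonb(r) FROM reports r;
```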