PostgreSQL 16: part 3 or CommitFest 2022-11
We continue to follow the news of the upcoming PostgreSQL 16. The third CommitFest concluded in early December. Let's look at the results.
If you missed the previous CommitFests, check out our reviews: 2022-07, 2022-09.
Here are the patches I want to talk about:
- meson: a new source code build system
- Documentation: a new chapter on transaction processing
- psql: \d+ indicates foreign partitions in a partitioned table
- psql: extended query protocol support
- Predicate locks on materialized views
- Tracking last scan time of indexes and tables
- pg_buffercache: a new function pg_buffercache_summary
- walsender displays the database name in the process status
- Reducing the WAL overhead of freezing tuples
- Reduced power consumption when idle
- postgres_fdw: batch mode for COPY
- Modernizing the GUC infrastructure
- Hash index build optimization
- MAINTAIN ― a new privilege for table maintenance
- SET ROLE: better role change management
- Support for file inclusion directives in pg_hba.conf and pg_ident.conf
- Regular expressions support in pg_hba.conf
meson: a new source code build system
Meson support was first introduced in a commit back in September, and several dozen more commits followed. In early December, a documentation section describing the new build process was committed. There is also a wiki page that explains, among other things, the reasoning behind introducing the new build tool. One of the goals is to simplify building the server on Windows and to be able to eventually drop PostgreSQL's own build tools in src/tools/msvc entirely.
The new build system will be available alongside autoconf and make; there are no plans to replace the good old tools.
However, I used Meson specifically to build the fresh PostgreSQL 16 testing environment for this article. On Ubuntu, the build went as follows:
$ git clone git://git.postgresql.org/git/postgresql.git
$ cd postgresql
$ meson setup build --prefix=/home/pal/pg16 -Dpgport=5416
$ cd build
$ ninja
$ ninja install
I left everything at the defaults, except for the installation location and the server port.
All that's left is to initialize the cluster and start the server, which is done the same way regardless of the build tool used.
Documentation: a new chapter on transaction processing
commit: 66bc9d2d
A new chapter has been added to the documentation: 74. Transaction Processing. It lives in the Internals section and gives a brief overview of what virtual and real transaction identifiers are, how they are shown in pg_locks, and how subtransactions and two-phase transactions work.
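The difference between virtual and real identifiers is easy to observe from psql. A quick sketch: before a transaction writes anything, it holds only a virtual ID; calling pg_current_xact_id() forces the assignment of a real (permanent) one.

```sql
BEGIN;
-- before the first write, only a virtualxid lock is held
SELECT locktype, virtualxid, transactionid
FROM pg_locks
WHERE pid = pg_backend_pid()
  AND locktype IN ('virtualxid', 'transactionid');

-- force assignment of a real (permanent) transaction ID
SELECT pg_current_xact_id();

-- now a second row with locktype = 'transactionid' appears
SELECT locktype, virtualxid, transactionid
FROM pg_locks
WHERE pid = pg_backend_pid()
  AND locktype IN ('virtualxid', 'transactionid');

COMMIT;
```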
psql: \d+ indicates foreign partitions in a partitioned table
commit: bd95816f
For a partitioned table, the \d+ command now marks partitions that are foreign tables with the word FOREIGN.
\d+ data
Partitioned table "public.data"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+------+-----------+----------+---------+----------+-------------+--------------+-------------
data_year | text | | | | extended | | |
Partition key: LIST (data_year)
Partitions: data_2020 FOR VALUES IN ('2020'), FOREIGN,
data_2021 FOR VALUES IN ('2021'), FOREIGN,
data_2022 FOR VALUES IN ('2022')
psql: extended query protocol support
commit: 5b66de34
Until now, psql did not directly support the extended query protocol, so parameterized queries could only be executed indirectly, for example with the PREPARE statement.
Now the \bind command has been added to psql to bind values to the parameters of the next query. Here's how it works:
SET log_statement='all';
\bind 42 'Answer: '
SELECT $2||$1;
?column?
------------
Answer: 42
(1 row)
\! tail -2 logfile
2022-12-05 12:58:05.924 MSK [16566] LOG: execute : SELECT $2||$1;
2022-12-05 12:58:05.924 MSK [16566] DETAIL: parameters: $1 = '42', $2 = 'Answer: '
Only a fraction of the capabilities of the extended query protocol has been implemented, but the initiative is very welcome.
Predicate locks on materialized views
commit: 43351557
Predicate locks are used to ensure consistency for transactions at the SERIALIZABLE isolation level. They were never taken when refreshing materialized views with the REFRESH MATERIALIZED VIEW command, because it locks the view exclusively, which by itself prevents any concurrent access anomalies. However, when refreshing with the CONCURRENTLY option and without predicate locks, transactions at the SERIALIZABLE isolation level could experience write skew. Now the problem is fixed.
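A rough sketch of the scenario where this matters (the schema and names are made up; note that REFRESH ... CONCURRENTLY requires a unique index on the view):

```sql
CREATE MATERIALIZED VIEW mv AS SELECT id, amount FROM t;
CREATE UNIQUE INDEX ON mv (id);  -- required by REFRESH ... CONCURRENTLY

-- session 1: a SERIALIZABLE transaction reads the view
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT amount FROM mv WHERE id = 1;
-- ... makes a write decision based on what it has read ...

-- session 2, meanwhile:
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;

-- session 1: with predicate locks now taken on the view, a conflicting
-- refresh can cause a serialization failure here instead of silent write skew
COMMIT;
```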
Tracking last scan time of indexes and tables
commit: c0374718
In addition to counting table and index scans, the cumulative statistics system now records the date and time of the last scan. This data is stored in the last_seq_scan and last_idx_scan columns of pg_stat_all_tables and in the last_idx_scan column of pg_stat_all_indexes.
SELECT seq_scan, last_seq_scan, idx_scan, last_idx_scan
FROM pg_stat_all_tables
WHERE relid = 'tickets'::regclass;
seq_scan | last_seq_scan | idx_scan | last_idx_scan
----------+-------------------------------+----------+-------------------------------
23 | 2022-12-07 15:17:57.261575+03 | 2 | 2022-12-05 14:59:53.564968+03
(1 row)
SELECT indexrelname, idx_scan, last_idx_scan
FROM pg_stat_all_indexes
WHERE relid = 'tickets'::regclass;
indexrelname | idx_scan | last_idx_scan
----------------------+----------+-------------------------------
tickets_pkey | 2 | 2022-12-05 14:59:53.564968+03
tickets_book_ref_idx | 0 |
(2 rows)
This can be helpful for analyzing table and index usage when no dedicated monitoring system is in place.
pg_buffercache: a new function pg_buffercache_summary
commit: 2589434a
The pg_buffercache extension can now display aggregated buffer cache data:
CREATE EXTENSION pg_buffercache;
SELECT * FROM pg_buffercache_summary()\gx
-[ RECORD 1 ]--+------------------
buffers_used | 2544
buffers_unused | 13840
buffers_dirty | 38
buffers_pinned | 0
usagecount_avg | 2.183176100628931
We used to obtain this data by aggregating the pg_buffercache view, but the new function does not lock the buffers, so it is cheaper and can be easily integrated into monitoring systems.
walsender displays the database name in the process status
commit: af205152
walsender is the process that handles replication protocol connections. For physical replication or backups, walsender connects to the instance, but not to a specific database. When a logical replication subscription is created, however, the publisher creates a logical replication slot, and walsender connects to the database where the publication is defined.
The patch adds the name of that database to the walsender process description in the case of logical replication, which makes monitoring these processes more convenient.
16(sub)=# CREATE SUBSCRIPTION sub CONNECTION 'port=5416 user=postgres dbname=demo' PUBLICATION pub;
NOTICE: created replication slot "sub" on publisher
CREATE SUBSCRIPTION
16(pub)=# \! ps -o pid,command --ppid `head -n 1 /home/pal/pg16/data/postmaster.pid`
PID COMMAND
38514 postgres: checkpointer
38515 postgres: background writer
38517 postgres: walwriter
38518 postgres: autovacuum launcher
38519 postgres: logical replication launcher
38522 postgres: postgres demo [local] idle
38662 postgres: walsender postgres demo [local] START_REPLICATION
Here, the walsender process status displays the demo database name.
Reducing the WAL overhead of freezing tuples
commit: 9e540599
Peter Geoghegan is working on proactive tuple freezing. The purpose is clear: to avoid transaction counter wraparound, you need to start freezing in advance. However, an earlier freeze leads to a larger WAL volume and increases the associated overhead (WAL replication and archiving).
In this patch, Peter proposes forming the freeze WAL records more compactly, making them roughly five times smaller.
Reduced power consumption when idle
When the primary server is idle, the startup process on physical replicas used to wake up every 5 seconds to check whether the file specified in promote_trigger_file had appeared. But there are other mechanisms for promoting a replica: pg_ctl promote and the pg_promote function. To stop wasting resources on these wake-ups, the promote_trigger_file parameter was removed.
Similar work was done on the walreceiver process, which used to wake up 10 times a second to check whether it had anything to do. Now the process calculates the time of its next wake-up in advance.
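For reference, promotion via the remaining mechanisms looks like this (a sketch; run on a replica):

```sql
-- promote the replica and wait for the promotion to complete
-- (waits up to 60 seconds by default)
SELECT pg_promote(wait => true);
```

The same can be done from the command line with pg_ctl promote.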
postgres_fdw: batch mode for COPY
commit: 97da4824
Batch insertion of records into foreign tables first became possible in PostgreSQL 14. The batch size is set by the batch_size parameter at the foreign server or foreign table level. However, it was implemented only for the INSERT command. This patch adds batch mode support for the COPY FROM command.
Let's compare the insertion speed with different commands. First, I create an empty bookings table in the postgres database, into which we will insert the data.
postgres=# CREATE TABLE public.bookings (
book_ref char(6),
book_date timestamptz,
total_amount numeric(10,2)
);
In the demo database in the same cluster, let's create a foreign table for the bookings table.
CREATE EXTENSION postgres_fdw;
CREATE SERVER srv
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname 'postgres', batch_size '1');
CREATE USER MAPPING FOR postgres
SERVER srv
OPTIONS (user 'postgres');
CREATE FOREIGN TABLE bookings_remote (
book_ref char(6),
book_date timestamptz,
total_amount numeric(10,2)
) SERVER srv
OPTIONS (schema_name 'public', table_name 'bookings');
The batch size is set at the server level to 1, meaning that batch mode is disabled and records are sent one at a time.
Let's dump the bookings table contents from the demo database into a file:
\COPY bookings TO 'bookings.txt'
Now, upload the file contents into the foreign table and measure the time:
\timing on
\COPY bookings_remote FROM 'bookings.txt'
COPY 2111110
Time: 57327,152 ms (00:57,327)
It took almost a minute to insert a little more than two million rows. Set the batch size to 100 and repeat the test:
TRUNCATE bookings_remote;
ALTER SERVER srv OPTIONS (SET batch_size '100');
\COPY bookings_remote FROM 'bookings.txt'
COPY 2111110
Time: 8780,000 ms (00:08,780)
The insertion speed increased more than sixfold!
Curiously, the INSERT command works even faster in batch mode:
TRUNCATE bookings_remote;
INSERT INTO bookings_remote SELECT * FROM bookings;
INSERT 0 2111110
Time: 6669,504 ms (00:06,670)
So when it comes to inserting with the COPY command, there is still room for optimization! It would also be great to teach postgres_fdw to use batch mode for updating and deleting rows, not just for reading and inserting.
Modernizing the GUC infrastructure
commit: f13b2088, 3057465a, 407b50f2, 9c911ec0
This series of commits optimizes storage and speeds up access to configuration parameters, including user-generated ones.
The optimization is most evident in the example that Tom Lane gave in the letter that started the discussion:
do $$
begin
for i in 1..10000 loop
perform set_config('foo.bar' || i::text, i::text, false);
end loop;
end $$;
On my computer, creating ten thousand user parameters in PostgreSQL 15 took about six seconds, and after applying the patches, the time was reduced to ~ 30 milliseconds.
Hash index build optimization
When building a hash index, values used to be sorted only by bucket number. Sorting them additionally by hash value within each bucket makes inserting the subsequent values faster.
The first commit was made in July, and the second in November. Together they speed up hash index builds by 5-15%.
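The speedup applies to ordinary index builds, for example (using the bookings table from the demo database referenced elsewhere in this article):

```sql
\timing on
CREATE INDEX bookings_book_ref_hash_idx ON bookings USING hash (book_ref);
```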
MAINTAIN ― a new privilege for table maintenance
commit: 60684dd8
A number of maintenance actions for tables and materialized views are not regulated by privileges and are usually available only to the object owner and the superuser.
PostgreSQL 16 introduces the MAINTAIN privilege, which can be granted on tables and materialized views, including system ones:
GRANT MAINTAIN ON pg_class, pg_attribute, pg_type TO alice;
In the \dp output, this privilege is denoted by the letter m:
\dp pg_catalog.pg_class
Access privileges
Schema | Name | Type | Access privileges | Column privileges | Policies
------------+----------+-------+----------------------------+-------------------+----------
pg_catalog | pg_class | table | postgres=arwdDxtm/postgres+| |
| | | =r/postgres +| |
| | | alice=m/postgres | |
What actions does this privilege grant? These include executing the commands ANALYZE, VACUUM (including VACUUM FULL), CLUSTER, REINDEX, REFRESH MATERIALIZED VIEW, and LOCK TABLE.
In addition, a predefined role pg_maintain has been added. Membership in this role grants the MAINTAIN privilege on all relations in the database. This makes it possible to give maintenance rights to a regular role that otherwise has no DDL or DML privileges in the database.
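For example (the role and table names are illustrative):

```sql
GRANT pg_maintain TO alice;

-- alice still cannot read or modify the bookings table,
-- but can now maintain it:
VACUUM (ANALYZE) bookings;
REINDEX TABLE bookings;
```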
The initial patch proposed adding two separate privileges for VACUUM and ANALYZE, but the community eventually settled on the MAINTAIN privilege with broader maintenance capabilities.
SET ROLE: better role change management
commit: 3d14e171
This patch continues the effort to build a more robust roles and privileges management system. The previous article talked about granting membership in roles and inheriting privileges.
The new patch refines the management of switching to another role with the SET ROLE command. Suppose we decide to include the alice role in pg_read_all_data with privilege inheritance enabled:
postgres=# GRANT pg_read_all_data TO alice WITH INHERIT TRUE;
The alice role can now use the privileges of pg_read_all_data. But nothing stops alice from switching to the pg_read_all_data role and creating objects on its behalf:
postgres=# \c - alice
You are now connected to database "postgres" as user "alice".
alice=> CREATE TABLE t (id int);
CREATE TABLE
alice=> \dt t
List of relations
Schema | Name | Type | Owner
--------+------+-------+------------------
public | t | table | pg_read_all_data
Actually, there is. To create objects, the pg_read_all_data role must have the CREATE privilege in some schema, such as public. And starting with PostgreSQL 15, the pseudo-role public no longer has this privilege. For the example above, I had to run the following command in advance:
postgres=# GRANT CREATE ON SCHEMA public TO public;
Nevertheless, switching to another role may be undesirable in certain situations. If we granted membership in a role with INHERIT TRUE, then why switch at all? In the new version, this can be prohibited with the same GRANT command:
postgres=# GRANT pg_read_all_data TO alice WITH INHERIT TRUE, SET FALSE;
Alice can still use the pg_read_all_data privileges, but will no longer be able to switch to it.
alice=> SET ROLE pg_read_all_data;
ERROR: permission denied to set role "pg_read_all_data"
The ability to switch to a role is stored in the set_option column of the pg_auth_members table, next to inherit_option and admin_option.
SELECT roleid::regrole, member::regrole, grantor::regrole,
admin_option, inherit_option, set_option
FROM pg_auth_members
WHERE member = 'alice'::regrole\gx
-[ RECORD 1 ]--+-----------------
roleid | pg_read_all_data
member | alice
grantor | postgres
admin_option | f
inherit_option | t
set_option | f
Support for file inclusion directives in pg_hba.conf and pg_ident.conf
commit: a54b658c
The configuration files pg_hba.conf and pg_ident.conf, like postgresql.conf, now support directives for including other files: include, include_if_exists, and include_dir.
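For example, pg_hba.conf can now be split up like this (file and directory names are illustrative):

```
# pg_hba.conf
include            common_rules.conf
include_if_exists  optional_rules.conf
include_dir        hba.d
```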
Columns with the file name and the rule/map number have been added to the pg_hba_file_rules and pg_ident_file_mappings views:
SELECT * FROM pg_hba_file_rules
WHERE file_name LIKE '%hba_ident_test.conf' \gx
-[ RECORD 1 ]---------------------------------
rule_number | 1
file_name | /home/pal/pg/hba_ident_test.conf
line_number | 1
type | local
database | {all}
user_name | {alice}
address |
netmask |
auth_method | peer
options | {map=m1}
error |
SELECT * FROM pg_ident_file_mappings
WHERE file_name LIKE '%ident_test.conf' \gx
-[ RECORD 1 ]-----------------------------
map_number | 1
file_name | /home/pal/pg/ident_test.conf
line_number | 1
map_name | m1
sys_name | student
pg_username | alice
error |
Regular expressions support in pg_hba.conf
commit: 8fea8683, a9039713, fc579e11
pg_hba.conf now supports regular expressions for user names and database names. If either of these fields starts with a slash (/), its value is treated as a regular expression.
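For example, a single line can now match a whole family of databases and users (the patterns here are illustrative):

```
# a field starting with a slash is treated as a regular expression
local   /^db\d+$   /^user\d+$   peer
```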
This is all for now. Looking forward to the results of the January CommitFest!