<p>Luca Ferrari, <a href="https://fluca1978.github.io">https://fluca1978.github.io</a></p>
<p>2025-09-21, <a href="https://fluca1978.github.io/2025/09/21/pgenv1_4_3">https://fluca1978.github.io/2025/09/21/pgenv1_4_3</a></p>
<p>A new minor release for the beloved tool to build and manage multiple PostgreSQL instances.</p>
<h1 id="pgenv-143-is-out">pgenv 1.4.3 is out!</h1>
<p><a href="https://github.com/theory/pgenv/releases/tag/v1.4.3" target="_blank">pgenv 1.4.3</a> is out!
This minor release fixes a problem in building <em>release candidate</em> versions (e.g., <code class="language-plaintext highlighter-rouge">18rc1</code>) by stripping the textual part out of the version number with a Bash regular expression.</p>
<p>2025-04-14, <a href="https://fluca1978.github.io/2025/04/14/PgTrainingOpenDay">https://fluca1978.github.io/2025/04/14/PgTrainingOpenDay</a></p>
<p>We are proud of what we have done in Bolzano.</p>
<h1 id="pgtraining-openday-is-over">PgTraining OpenDay is over!</h1>
<p>Last Friday was <strong><a href="https://pgtraining.gitlab.io/openday/" target="_blank">PgTraining OpenDay in Bolzano</a></strong>, a free-of-charge day entirely dedicated to PostgreSQL that we at <a href="https://pgtraining.com" target="_blank">PgTraining</a> organized.</p>
<center>
<br />
<br />
<img src="https://cdn.fosstodon.org/media_attachments/files/114/324/903/125/564/469/original/54fe523ffbac2d9e.jpg" alt="PgTraining Staff" />
<br />
<br />
</center>
<p>We held the event in the spectacular <em>NOI TechPark</em> in Bozen (Bolzano), northern Italy, and the room we had was simply amazing: everything was arranged in a very professional and clean way.</p>
<p>Chris, our <em>host</em>, Enrico and yours truly gave several talks on cool topics such as (but not limited to):</p>
<ul>
<li>vector support in PostgreSQL via PgVector, and what you can do with such a tool to create RAG applications;</li>
<li>connection pooling (with regard to <code class="language-plaintext highlighter-rouge">pgagroal</code>);</li>
<li>logical replication and hot upgrade.</li>
</ul>
<p>The afternoon was more practical: we showed a few live demos to the audience.</p>
<p>All the material, slides and code samples, including a few Docker images, is available on the <a href="https://gitlab.com/pgtraining/slides/-/tree/master/OpenDay-20250411?ref_type=heads" target="_blank">PgTraining GitLab repository</a>; more material will be added in the coming days.</p>
<p>As a joke, during the afternoon, Chris embedded the whole <a href="https://docs.raku.org/" target="_blank">Raku documentation</a> in his RAG application, and we enjoyed asking the application how to connect Raku to a PostgreSQL database, getting a couple of very detailed and accurate answers.</p>
<p>The audience was very interested in all the topics, and we are glad to have had such a good day.</p>
<p>We hope to host another event like this soon, and we would welcome more feedback, even “bad”, so that we can arrange an even better event!</p>
<p>2025-04-03, <a href="https://fluca1978.github.io/2025/04/03/pgagroalDockerImages">https://fluca1978.github.io/2025/04/03/pgagroalDockerImages</a></p>
<p>An important contribution to <code class="language-plaintext highlighter-rouge">pgagroal</code>.</p>
<h1 id="pgagroal-now-has-docker-files">pgagroal now has docker files!</h1>
<p>Thanks to the <a href="https://github.com/agroal/pgagroal/commit/c27c172caf836b93ec76c9c820ea997b6a19d3f0" target="_blank">contribution of Arshdeep</a>, the <code class="language-plaintext highlighter-rouge">pgagroal</code> connection pooler now also has docker files available in the repository.</p>
<p>There are two docker files: one based on Alpine Linux and one based on Rocky Linux 9.</p>
<p>Thanks to these docker files it should be simpler to test and play with the connection pooler.</p>
<p>2025-03-10, <a href="https://fluca1978.github.io/2025/03/10/pgenvConfigurationOverrides">https://fluca1978.github.io/2025/03/10/pgenvConfigurationOverrides</a></p>
<p>A new version with an interesting improvement in the configuration management.</p>
<h1 id="pgenv-140-is-out">pgenv 1.4.0 is out!</h1>
<p><a href="https://github.com/theory/pgenv/releases/tag/v1.4.0">pgenv 1.4.0</a> is out with an interesting improvement in configuration management.</p>
<p>When you install, and then use, a specific PostgreSQL version, <code class="language-plaintext highlighter-rouge">pgenv</code> loads the configuration used to start the instance from a file named after the specific PostgreSQL version. For instance, if you are running version <code class="language-plaintext highlighter-rouge">17.1</code>, then <code class="language-plaintext highlighter-rouge">pgenv</code> will load the configuration from a file named <code class="language-plaintext highlighter-rouge">17.1.conf</code>. If that file does not exist, the <code class="language-plaintext highlighter-rouge">pgenv</code> script will try to load the default configuration file <code class="language-plaintext highlighter-rouge">default.conf</code>.</p>
<p>Now, thanks to recent work on <code class="language-plaintext highlighter-rouge">pgenv</code>, it is possible to use multiple configuration files with overrides.
In particular, <code class="language-plaintext highlighter-rouge">pgenv</code> will load more than one configuration file with <strong>narrowing context</strong> related to the PostgreSQL version.
Therefore, using a <code class="language-plaintext highlighter-rouge">17.1</code> PostgreSQL version will trigger the loading of the following files:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">default.conf</code></li>
<li><code class="language-plaintext highlighter-rouge">17.conf</code></li>
<li><code class="language-plaintext highlighter-rouge">17.1.conf</code></li>
</ul>
<p>Note the addition of the <strong>major version specific configuration file</strong> (<code class="language-plaintext highlighter-rouge">17.conf</code> in the list above).</p>
<p>This new loading chain makes <code class="language-plaintext highlighter-rouge">pgenv</code> load the configuration from the default context down to the most specific one, and it also makes sharing configuration quicker when you are interested only in major-version settings.</p>
<p>2025-03-10, <a href="https://fluca1978.github.io/2025/03/10/OpenDay2025">https://fluca1978.github.io/2025/03/10/OpenDay2025</a></p>
<p>There are still seats available for this entire day dedicated to PostgreSQL!</p>
<h1 id="openday-2025-by-pgtraining">OpenDay 2025 by PgTraining</h1>
<center>
<br />
<img src="/images/posts/postgresql/openday25.jpg" alt="" />
<br />
</center>
<p>PgTraining is organizing an entire day dedicated to PostgreSQL, <strong>free for everyone</strong>, where people are going to meet face to face.</p>
<p>The event, which will be held in the great <strong>NOI Techpark</strong> in Bolzano (Italy), will be organized in two parts:</p>
<ul>
<li>a talk session in the morning</li>
<li>a laboratory session in the afternoon.</li>
</ul>
<p>Please note that <strong>this event will be held in Italian only</strong>.</p>
<p><strong><a href="https://pgtraining.gitlab.io/openday/" target="_blank">The schedule of the day</a></strong> is available, and there are still a few seats left (but you need to register in order to participate).</p>
<p>2025-01-20, <a href="https://fluca1978.github.io/2025/01/20/Openday">https://fluca1978.github.io/2025/01/20/Openday</a></p>
<p>The schedule of the free event is available!</p>
<h1 id="open-day-2025-in-bolzano-italy-schedule-available">Open Day 2025 in Bolzano (Italy): schedule available</h1>
<p><strong><a href="https://pgtraining.gitlab.io/openday/" target="_blank">The schedule of the next free event</a></strong> organized by PgTraining is available.</p>
<p>The event, which will be held in the great <strong>NOI Techpark</strong> in Bolzano (Italy), will be organized in two parts:</p>
<ul>
<li>a talk session in the morning</li>
<li>a laboratory session in the afternoon.</li>
</ul>
<p>Due to the nature of the afternoon session, it is recommended to bring your own laptop in order to test everything the laboratory will introduce.</p>
<p><strong>There are still seats available</strong>, but please note that you have to reserve your own seat to participate.
<a href="https://www.eventbrite.it/e/biglietti-postgresql-openday-2025-1104461316529?aff=oddtdtcreator&_gl=1*1xswsr*_up*MQ..*_ga*MTg3NzI4OTE2NS4xNzMzMjM0NzI1*_ga_TQVES5V6SH*MTczMzIzNDcyNC4xLjAuMTczMzIzNDcyNC4wLjAuMA/" target="_blank">Follow this link to reserve your seat</a>!</p>
<p>2025-01-16, <a href="https://fluca1978.github.io/2025/01/16/pgagroalPortLengthBug">https://fluca1978.github.io/2025/01/16/pgagroalPortLengthBug</a></p>
<p>How we discovered a trivial bug in pgagroal.</p>
<h1 id="the-importance-of-testing-with-not-so-usual-setups">The importance of testing with not-so-usual setups</h1>
<p>This week we found a trivial and silly bug in <a href="https://github.com/agroal/pgagroal" target="_blank"><code class="language-plaintext highlighter-rouge">pgagroal</code></a>.</p>
<p>This post is a brief description of that bug, not because it is important in itself, but because the way we discovered it emphasizes how important it is to <em>randomize</em> the configuration of a system.
It is a well-known concept, yet we all still tend to fail at it, partly because of the lack of time to configure and test all the possibilities (thank God there is automation!).</p>
<p>As it often happens, the bug was caused by a memory allocation problem.</p>
<p>As often happens in these cases, the fix is a very short trophy patch.</p>
<p>Again, the aim of this post is not to discuss a <em>one line patch</em>, rather the importance of running and testing with different tools and setups.</p>
<h2 id="the-memory-bug">The memory bug</h2>
<p>The bug <a href="https://github.com/agroal/pgagroal/issues/491" target="_blank">is described in a dedicated issue</a>. What is interesting, as often happens when dealing with bugs, is how long it went unnoticed.</p>
<p>While working on and testing other <em>work in progress</em> features of <code class="language-plaintext highlighter-rouge">pgagroal</code>, I was encouraged to compile the project using <code class="language-plaintext highlighter-rouge">clang</code> instead of my usual <code class="language-plaintext highlighter-rouge">gcc</code>. The result was discouraging, since I was no longer able to start the program:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal
pgagroal: Unknown key <ev_backend> with value <io_uring> <span class="k">in </span>section <span class="o">[</span>pgagroal] <span class="o">(</span>line 46 of file </etc/pgagroal/pgagroal.conf><span class="o">)</span>
2025-01-13 12:38:09 WARN configuration.c:482 pgagroal: max_connections <span class="o">(</span>20<span class="o">)</span> is greater than allowed <span class="o">(</span>8<span class="o">)</span>
2025-01-13 12:38:09 DEBUG configuration.c:3074 PID file automatically <span class="nb">set </span>to: <span class="o">[</span>/tmp/pgagroal.54322.pid]
<span class="o">=================================================================</span>
<span class="o">==</span><span class="nv">17659</span><span class="o">==</span>ERROR: AddressSanitizer: heap-buffer-overflow on address 0x502000009495 at pc 0x559726597f10 bp 0x7ffc9c1e2360 sp 0x7ffc9c1e1b00
WRITE of size 6 at 0x502000009495 thread T0
<span class="c">#0 0x559726597f0f in vsprintf (/usr/local/bin/pgagroal+0x58f0f) (BuildId: 16ffc1dab018cfa8eed6b5cc7e6981bc6e861325)</span>
<span class="c">#1 0x55972659900e in sprintf (/usr/local/bin/pgagroal+0x5a00e) (BuildId: 16ffc1dab018cfa8eed6b5cc7e6981bc6e861325)</span>
<span class="c">#2 0x7f1cd144820b in bind_host /home/luca/pgagroal/src/libpgagroal/network.c:613:4</span>
<span class="c">#3 0x7f1cd1447bfc in pgagroal_bind /home/luca/pgagroal/src/libpgagroal/network.c:104:17</span>
<span class="c">#4 0x55972664e91c in main /home/luca/pgagroal/src/main.c:961:11</span>
<span class="c">#5 0x7f1cd10295cf in __libc_start_call_main (/lib64/libc.so.6+0x295cf) (BuildId: d78a44ae94f1d320342e0ff6c2315b2b589063f8)</span>
<span class="c">#6 0x7f1cd102967f in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2967f) (BuildId: d78a44ae94f1d320342e0ff6c2315b2b589063f8)</span>
<span class="c">#7 0x559726571a94 in _start (/usr/local/bin/pgagroal+0x32a94) (BuildId: 16ffc1dab018cfa8eed6b5cc7e6981bc6e861325)</span>
0x502000009495 is located 0 bytes after 5-byte region <span class="o">[</span>0x502000009490,0x502000009495<span class="o">)</span>
allocated by thread T0 here:
<span class="c">#0 0x55972660d04d in calloc (/usr/local/bin/pgagroal+0xce04d) (BuildId: 16ffc1dab018cfa8eed6b5cc7e6981bc6e861325)</span>
<span class="c">#1 0x7f1cd14481a3 in bind_host /home/luca/pgagroal/src/libpgagroal/network.c:607:12</span>
<span class="c">#2 0x7f1cd1447bfc in pgagroal_bind /home/luca/pgagroal/src/libpgagroal/network.c:104:17</span>
<span class="c">#3 0x55972664e91c in main /home/luca/pgagroal/src/main.c:961:11</span>
<span class="c">#4 0x7f1cd10295cf in __libc_start_call_main (/lib64/libc.so.6+0x295cf) (BuildId: d78a44ae94f1d320342e0ff6c2315b2b589063f8)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>If I had not been too lazy to test with another compiler, even occasionally, I would have discovered the problem sooner.</p>
<p><strong>Lesson learned #1</strong>: using a different toolchain can speed up the discovery of issues.</p>
<p>However, <em>we were testing and building <code class="language-plaintext highlighter-rouge">pgagroal</code> in different environments and with different toolchains</em>, so how did this go unnoticed?</p>
<p>Simple answer: <strong>because developers tend to be lazy</strong>. We tend to use the same setup over and over, and to follow the guides and howtos.</p>
<h2 id="understanding-the-bug">Understanding the bug</h2>
<p>The stack trace reports a problem involving a <code class="language-plaintext highlighter-rouge">calloc</code> call and mentions a “5-byte region”:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
0x502000009495 is located 0 bytes after 5-byte region <span class="o">[</span>0x502000009490,0x502000009495<span class="o">)</span>
allocated by thread T0 here:
<span class="c">#0 0x55972660d04d in calloc (/usr/local/bin/pgagroal+0xce04d) (BuildId: 16ffc1dab018cfa8eed6b5cc7e6981bc6e861325)</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>and luckily enough, we also get a file and a line number to look at:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
<span class="c">#2 0x7f1cd144820b in bind_host /home/luca/pgagroal/src/libpgagroal/network.c:613:4</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>Let’s start from there. The code within <code class="language-plaintext highlighter-rouge">network.c</code>, around that line, was doing the following:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">char</span><span class="o">*</span> <span class="n">sport</span><span class="p">;</span>
<span class="n">sport</span> <span class="o">=</span> <span class="n">calloc</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In essence, the code allocates a string to hold the text of a TCP/IP port number. Using <code class="language-plaintext highlighter-rouge">calloc</code> replaces the well-known pattern of <code class="language-plaintext highlighter-rouge">malloc</code> followed by <code class="language-plaintext highlighter-rouge">memset</code> to zero-fill the memory.</p>
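<p>The equivalence mentioned above can be sketched as follows. This is only a minimal illustration, not code taken from <code class="language-plaintext highlighter-rouge">pgagroal</code>: the helper names <code class="language-plaintext highlighter-rouge">zeroed_with_malloc</code>, <code class="language-plaintext highlighter-rouge">zeroed_with_calloc</code> and <code class="language-plaintext highlighter-rouge">same_zeroed_content</code> are hypothetical and exist only for this example.</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zero-filled buffer in two steps, allocate then clear. */
char* zeroed_with_malloc(size_t size)
{
   char* buffer = malloc(size);
   if (buffer != NULL)
   {
      memset(buffer, 0, size);
   }
   return buffer;
}

/* Hypothetical helper: same result in one call, calloc returns zeroed memory. */
char* zeroed_with_calloc(size_t size)
{
   return calloc(1, size);
}

/* Returns 1 when both helpers produce buffers with identical content. */
int same_zeroed_content(size_t size)
{
   char* a = zeroed_with_malloc(size);
   char* b = zeroed_with_calloc(size);
   int same = a != NULL && b != NULL && memcmp(a, b, size) == 0;

   free(a);
   free(b);
   return same;
}
```

<p>Both helpers hand back a buffer of exactly <code class="language-plaintext highlighter-rouge">size</code> bytes, so neither protects against choosing a size that is too small for the data written later.</p>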
<p>It is quite simple now to spot the bug: the upper bound of a TCP/IP port is <code class="language-plaintext highlighter-rouge">65535</code>, five digits, but the string also needs the <code class="language-plaintext highlighter-rouge">\0</code> terminator, for a total of six bytes.
And this is why it went unnoticed before: the guide for <code class="language-plaintext highlighter-rouge">pgagroal</code> suggests using the TCP/IP port <code class="language-plaintext highlighter-rouge">2345</code> (the reverse of the PostgreSQL default port) to listen for connections. Since <code class="language-plaintext highlighter-rouge">2345</code> has four digits, there is room for the string terminator.</p>
<p>However, on my setup, I use the port <code class="language-plaintext highlighter-rouge">54322</code>, which has five digits, so the string terminator overflows the calloc-allocated <code class="language-plaintext highlighter-rouge">sport</code> buffer.</p>
<p>The trophy patch was embarrassing (<a href="https://github.com/fluca1978/pgagroal/commit/aca795b0bd8aaa137f9137f4b3ccf8f79c6bc00c" target="_blank">see this commit</a>):</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- sport = calloc(1, 5);
+ sport = calloc(1, 6);
</code></pre></div></div>
<p><br />
<br /></p>
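<p>One way to avoid counting digits by hand is to let the C library compute the required size. The following is a minimal sketch under that assumption, not the fix applied to <code class="language-plaintext highlighter-rouge">pgagroal</code>, which simply enlarges the buffer to six bytes; <code class="language-plaintext highlighter-rouge">port_string_size</code> and <code class="language-plaintext highlighter-rouge">port_to_string</code> are hypothetical names introduced only for illustration.</p>

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: bytes needed for the decimal text of a port,
   including the '\0' terminator. */
size_t port_string_size(int port)
{
   /* With a NULL buffer and size 0, snprintf only reports the text length. */
   int digits = snprintf(NULL, 0, "%d", port);
   return (size_t)digits + 1;
}

/* Hypothetical helper: formats a port into a freshly allocated string whose
   size is derived from the value itself. Returns NULL if the allocation
   fails; the caller must free the result. */
char* port_to_string(int port)
{
   size_t size = port_string_size(port);
   char* sport = calloc(1, size);

   if (sport != NULL)
   {
      /* snprintf never writes more than size bytes, terminator included. */
      snprintf(sport, size, "%d", port);
   }
   return sport;
}
```

<p>With this approach the buffer for <code class="language-plaintext highlighter-rouge">2345</code> takes five bytes and the one for <code class="language-plaintext highlighter-rouge">54322</code> takes six, so the terminator fits whatever port is configured.</p>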
<p><strong>Lesson learned #2</strong>: use a non-standard setup in order to look for problems.</p>
<h1 id="conclusions">Conclusions</h1>
<p>This short story emphasizes, once again, how important it is to vary your development environment and toolchain, as well as your setup, in order to ease and speed up the identification of problems.
If I had not changed the toolchain, I wouldn’t have seen the problem. And if I had not been using a setup different from the “default” one, I wouldn’t have seen the problem either.</p>
<p><strong>Note that both conditions had to hold for us to discover the problem.</strong></p>
<p>And this is the important remark about the whole story.</p>
<p>2024-12-18, <a href="https://fluca1978.github.io/2024/12/18/OpenDay2025">https://fluca1978.github.io/2024/12/18/OpenDay2025</a></p>
<p>Prepare for the next great event by PgTraining!</p>
<h1 id="openday-2025-in-bolzano-italy">OpenDay 2025 in Bolzano (Italy)</h1>
<p><a href="https://pgtraining.com" target="_blank">PgTraining</a> is organizing next year’s event, namely <strong>OpenDay 2025</strong>, which will be held on <strong>April 11th</strong> in Bolzano, Italy.</p>
<p>The event will be <strong>totally free</strong> but registration is required because the room assigned has a fixed number of seats.</p>
<p>Please note that all the talks will be in <em>Italian</em>.</p>
<p>The event will be held at the NOI Techpark.</p>
<p>We are working on the schedule, but the day will be organized in a talks-session and a laboratory/practical session, the former in the morning, the latter in the afternoon.</p>
<p><a href="https://pgtraining.gitlab.io/openday/" target="_blank"><strong>Please see the official event page for more details</strong></a> and stay tuned for updates!</p>
<p>2024-11-21, <a href="https://fluca1978.github.io/2024/11/21/PLPerlTieENV">https://fluca1978.github.io/2024/11/21/PLPerlTieENV</a></p>
<p>A small but great improvement in the security of PL/Perl.</p>
<h1 id="plperl-now-ties-env">PL/Perl now ties %ENV</h1>
<p>PL/Perl is a great language, since it <em>ties</em> together two of my favourite pieces of technology: PostgreSQL and Perl.</p>
<p>While I do usually refer to <strong>PL/Perl</strong> as the capability to run Perl code within PostgreSQL, the correct naming is either <strong><code class="language-plaintext highlighter-rouge">PL/Perl</code></strong> or <strong><code class="language-plaintext highlighter-rouge">PL/Perlu</code></strong>, where the former is the <strong>trusted</strong> language and the latter is the untrusted one.</p>
<p>A trusted language means that the code will run in a PostgreSQL sandbox, with lower permissions than a normal application.</p>
<p><a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3ebcfa54d" target="_blank">This commit introduces a new protection level</a> in the trusted language <code class="language-plaintext highlighter-rouge">PL/Perl</code>: it prevents the modification of the <code class="language-plaintext highlighter-rouge">%ENV</code> hash, which represents the environment settings for the running code.
<a href="https://nvd.nist.gov/vuln/detail/CVE-2024-10979" target="_blank">An official CVE</a> for the problem has been issued.</p>
<p>The trick to prevent modifications is really elegant, as Perl often is: using a <em>tied hash</em> to wrap up <code class="language-plaintext highlighter-rouge">%ENV</code>.</p>
<p>It works as follows:</p>
<ul>
<li>create a new class that implements the <em>hash protector</em></li>
<li><code class="language-plaintext highlighter-rouge">tie</code> the <code class="language-plaintext highlighter-rouge">%ENV</code> to this new class</li>
<li>provide warnings when something tries to modify the <code class="language-plaintext highlighter-rouge">%ENV</code> tied hash.</li>
</ul>
<p>Let’s explain it a little better.
First of all, a new class to wrap the hash is created:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">package</span> <span class="nn">PostgreSQL::InServer::</span><span class="nv">WarnEnv</span><span class="p">;</span>
<span class="k">use</span> <span class="nv">strict</span><span class="p">;</span>
<span class="k">use</span> <span class="nv">warnings</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Tie::</span><span class="nv">Hash</span><span class="p">;</span>
<span class="k">our</span> <span class="nv">@ISA</span> <span class="o">=</span> <span class="sx">qw(Tie::StdHash)</span><span class="p">;</span>
<span class="k">sub </span><span class="nf">STORE</span> <span class="p">{</span> <span class="nb">warn</span> <span class="p">"</span><span class="s2">attempted alteration of </span><span class="se">\$</span><span class="s2">ENV{</span><span class="si">$_</span><span class="s2">[1]}</span><span class="p">";</span> <span class="p">}</span>
<span class="k">sub </span><span class="nf">DELETE</span> <span class="p">{</span> <span class="nb">warn</span> <span class="p">"</span><span class="s2">attempted deletion of </span><span class="se">\$</span><span class="s2">ENV{</span><span class="si">$_</span><span class="s2">[1]}</span><span class="p">";</span> <span class="p">}</span>
<span class="k">sub </span><span class="nf">CLEAR</span> <span class="p">{</span> <span class="nb">warn</span> <span class="p">"</span><span class="s2">attempted clearance of ENV hash</span><span class="p">";</span> <span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">PostgreSQL::InServer::WarnEnv</code> class inherits from <code class="language-plaintext highlighter-rouge">Tie::StdHash</code>, a tie-able hash that already defines all the required methods and that requires you to only override those that are in your scope of interest.
In particular, <code class="language-plaintext highlighter-rouge">WarnEnv</code> overrides <code class="language-plaintext highlighter-rouge">STORE</code>, <code class="language-plaintext highlighter-rouge">DELETE</code> and <code class="language-plaintext highlighter-rouge">CLEAR</code>, which are the methods invoked when storing a value in the hash, deleting one, or clearing the whole hash.</p>
<p>Then, it suffices to <code class="language-plaintext highlighter-rouge">tie</code> <code class="language-plaintext highlighter-rouge">%ENV</code> to this class, and in fact the <code class="language-plaintext highlighter-rouge">PL/Perl</code> implementation does:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">tie</span> <span class="nv">%</span><span class="nn">main::</span><span class="nv">ENV</span><span class="p">,</span> <span class="p">'</span><span class="s1">PostgreSQL::InServer::WarnEnv</span><span class="p">',</span> <span class="nv">%ENV</span> <span class="ow">or</span> <span class="nb">die</span> <span class="vg">$!</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>that applies <code class="language-plaintext highlighter-rouge">WarnEnv</code> as the class behind the behaviour of <code class="language-plaintext highlighter-rouge">main::ENV</code> keeping all values of <code class="language-plaintext highlighter-rouge">%ENV</code> (that has been changed to a normal hash).
From now on, trying to modify <code class="language-plaintext highlighter-rouge">%ENV</code> will result in a warning according to the method used.</p>
<p>This patch has been backported to older PostgreSQL versions, down to 12.</p>
<p>2024-11-18, <a href="https://fluca1978.github.io/2024/11/18/dbicdumpPostgreSQLSchema">https://fluca1978.github.io/2024/11/18/dbicdumpPostgreSQLSchema</a></p>
<p>A way to instrument <code class="language-plaintext highlighter-rouge">dbicdump</code> to use PostgreSQL schemas as package separators.</p>
<h1 id="dbicdump-using-postgresql-schemas-as-package-separator-in-produced-perl-classes">dbicdump: using PostgreSQL schemas as package separator in produced Perl classes</h1>
<p>Perl <code class="language-plaintext highlighter-rouge">DBIx::Class</code> is a great Object Relational Mapper (ORM), and I use it regularly with <code class="language-plaintext highlighter-rouge">dbicdump</code>, which is a tool to <strong>synchronize</strong> your existing database structure with the classes your program is going to use.</p>
<p>PostgreSQL being PostgreSQL, a great rock solid database we all love, allows us to organize tables into <em>schemas</em>, a flat namespace that is usually transparent to the user because the default schema, <code class="language-plaintext highlighter-rouge">public</code>, is in the default <code class="language-plaintext highlighter-rouge">search_path</code> of every user.</p>
<p>But how to take advantage of PostgreSQL schemas and <code class="language-plaintext highlighter-rouge">DBIx::Class</code> packages?</p>
<p>Well, it turns out that this is possible, with a little customization of the way you synchronize your own data structure.</p>
<h2 id="example-database">Example Database</h2>
<p>Assume we have an example database with a couple of tables, namely <code class="language-plaintext highlighter-rouge">product</code> and <code class="language-plaintext highlighter-rouge">orders</code>, each one replicated into two different schemas named <code class="language-plaintext highlighter-rouge">italy</code> and <code class="language-plaintext highlighter-rouge">japan</code> respectively. Note that this is probably not the best design for your database; it serves only as an example to give a quick and easy idea of how to achieve things.</p>
<p>The database is created as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dbic</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">SCHEMA</span> <span class="n">italy</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">SCHEMA</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">SCHEMA</span> <span class="n">japan</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">SCHEMA</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">italy</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span><span class="p">,</span>
<span class="n">code</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">description</span> <span class="nb">text</span><span class="p">,</span>
<span class="k">primary</span> <span class="k">key</span><span class="p">(</span> <span class="n">pk</span> <span class="p">),</span>
<span class="k">unique</span><span class="p">(</span> <span class="n">code</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">japan</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span><span class="p">,</span>
<span class="n">code</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">description</span> <span class="nb">text</span><span class="p">,</span>
<span class="k">primary</span> <span class="k">key</span><span class="p">(</span> <span class="n">pk</span> <span class="p">),</span> <span class="k">unique</span><span class="p">(</span> <span class="n">code</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">italy</span><span class="p">.</span><span class="n">orders</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span><span class="p">,</span>
<span class="n">product</span> <span class="nb">int</span> <span class="k">not</span> <span class="k">null</span><span class="p">,</span>
<span class="n">qty</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">0</span>
<span class="p">,</span> <span class="k">primary</span> <span class="k">key</span> <span class="p">(</span> <span class="n">pk</span> <span class="p">)</span>
<span class="p">,</span> <span class="k">foreign</span> <span class="k">key</span><span class="p">(</span> <span class="n">product</span> <span class="p">)</span> <span class="k">references</span> <span class="n">italy</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">pk</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">japan</span><span class="p">.</span><span class="n">orders</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span><span class="p">,</span>
<span class="n">product</span> <span class="nb">int</span> <span class="k">not</span> <span class="k">null</span><span class="p">,</span>
<span class="n">qty</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">0</span>
<span class="p">,</span> <span class="k">primary</span> <span class="k">key</span> <span class="p">(</span> <span class="n">pk</span> <span class="p">)</span>
<span class="p">,</span> <span class="k">foreign</span> <span class="k">key</span><span class="p">(</span> <span class="n">product</span> <span class="p">)</span> <span class="k">references</span> <span class="n">japan</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">pk</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Let’s populate the tables with a few rows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dbic</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">italy</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">code</span><span class="p">,</span> <span class="n">description</span> <span class="p">)</span>
<span class="k">values</span><span class="p">(</span> <span class="s1">'it01'</span><span class="p">,</span> <span class="s1">'An italian product'</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">japan</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">code</span><span class="p">,</span> <span class="n">description</span> <span class="p">)</span>
<span class="k">values</span><span class="p">(</span> <span class="s1">'jp01'</span><span class="p">,</span> <span class="s1">'A japanese product'</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">japan</span><span class="p">.</span><span class="n">product</span><span class="p">(</span> <span class="n">code</span><span class="p">,</span> <span class="n">description</span> <span class="p">)</span>
<span class="k">values</span><span class="p">(</span> <span class="s1">'jp02'</span><span class="p">,</span> <span class="s1">'A japanese product'</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">italy</span><span class="p">.</span><span class="n">orders</span><span class="p">(</span> <span class="n">product</span><span class="p">,</span> <span class="n">qty</span> <span class="p">)</span>
<span class="k">select</span> <span class="n">p</span><span class="p">.</span><span class="n">pk</span><span class="p">,</span> <span class="p">(</span> <span class="n">random</span><span class="p">()</span> <span class="o">*</span> <span class="mi">100</span> <span class="p">)::</span><span class="nb">int</span>
<span class="k">from</span> <span class="n">italy</span><span class="p">.</span><span class="n">product</span> <span class="n">p</span><span class="p">,</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">5</span>
<span class="n">dbic</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">japan</span><span class="p">.</span><span class="n">orders</span><span class="p">(</span> <span class="n">product</span><span class="p">,</span> <span class="n">qty</span> <span class="p">)</span>
<span class="k">select</span> <span class="n">p</span><span class="p">.</span><span class="n">pk</span><span class="p">,</span> <span class="p">(</span> <span class="n">random</span><span class="p">()</span> <span class="o">*</span> <span class="mi">100</span> <span class="p">)::</span><span class="nb">int</span>
<span class="k">from</span> <span class="n">japan</span><span class="p">.</span><span class="n">product</span> <span class="n">p</span><span class="p">,</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">10</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="dumping-the-schema-via-dbicdump">Dumping the schema via <code class="language-plaintext highlighter-rouge">dbicdump</code></h2>
<p>In order to dump the schema via <code class="language-plaintext highlighter-rouge">dbicdump</code>, you need to pass several additional options:</p>
<ul>
<li>the schema names to dump, in our example <code class="language-plaintext highlighter-rouge">italy</code> and <code class="language-plaintext highlighter-rouge">japan</code>;</li>
<li>the <strong>moniker parts to use</strong>, that is how the class name will be built. By default <code class="language-plaintext highlighter-rouge">moniker_parts</code> is set to <code class="language-plaintext highlighter-rouge">name</code>, which means it will call the <code class="language-plaintext highlighter-rouge">name</code> method (i.e., the table name). In our example, we need to use both <code class="language-plaintext highlighter-rouge">schema</code> and <code class="language-plaintext highlighter-rouge">name</code>, in this order;</li>
<li>the <strong>moniker parts separator</strong>, that is the string used to separate the parts of the name. Since we want to produce modules within their own namespace, we will use the Perl namespace separator, that is <code class="language-plaintext highlighter-rouge">::</code>.</li>
</ul>
<p>This translates to a command line like the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% dbicdump <span class="nt">-o</span> <span class="nv">dump_directory</span><span class="o">=</span>/home/luca/tmp <span class="se">\</span>
<span class="nt">-o</span> <span class="nv">components</span><span class="o">=</span><span class="s1">'["InflateColumn::DateTime"]'</span> <span class="se">\</span>
<span class="nt">-o</span> <span class="nv">moniker_parts</span><span class="o">=</span><span class="s1">'["schema", "name"]'</span> <span class="se">\</span>
<span class="nt">-o</span> <span class="nv">moniker_part_separator</span><span class="o">=</span><span class="s1">'::'</span> <span class="se">\</span>
<span class="nt">-o</span> <span class="nv">db_schema</span><span class="o">=</span><span class="s1">'["public", "italy", "japan"]'</span> <span class="se">\</span>
Example::Schema <span class="se">\</span>
<span class="s1">'dbi:Pg:dbname=dbic;host=rachel;port=5432'</span> <span class="se">\</span>
luca superSecretPassword
Dumping manual schema <span class="k">for </span>Example::Schema to directory /home/luca/tmp ...
Schema dump completed.
</code></pre></div></div>
<p><br />
<br /></p>
<p>The parameters passed to <code class="language-plaintext highlighter-rouge">dbicdump</code> are the following:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">dump_directory</code> where to store the Perl code produced;</li>
<li><code class="language-plaintext highlighter-rouge">components='["InflateColumn::DateTime"]'</code> this is not mandatory for this post example, but is a good habit to get automatic date/time data type conversions;</li>
<li><strong><code class="language-plaintext highlighter-rouge">moniker_parts='["schema", "name"]'</code></strong> this tells <code class="language-plaintext highlighter-rouge">dbicdump</code> to compose the name of a class mapped onto a table as the schema name plus the table name, which is what we want;</li>
<li><strong><code class="language-plaintext highlighter-rouge">moniker_part_separator='::'</code></strong> this tells <code class="language-plaintext highlighter-rouge">dbicdump</code> to use the Perl name separator (i.e., package separator <code class="language-plaintext highlighter-rouge">::</code>) between the schema name and the table name;</li>
<li><strong><code class="language-plaintext highlighter-rouge">db_schema='["public", "italy", "japan"]'</code></strong> this tells <code class="language-plaintext highlighter-rouge">dbicdump</code> to dump the <code class="language-plaintext highlighter-rouge">public</code>, <code class="language-plaintext highlighter-rouge">italy</code> and <code class="language-plaintext highlighter-rouge">japan</code> schemas, i.e., where to look for tables.</li>
</ul>
<p>The resulting tree is as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% tree Example
Example
├── Schema
│ └── Result
│ ├── Italy
│ │ ├── Order.pm
│ │ └── Product.pm
│ └── Japan
│ ├── Order.pm
│ └── Product.pm
└── Schema.pm
</code></pre></div></div>
<p><br />
<br /></p>
<p>That is, the table <code class="language-plaintext highlighter-rouge">italy.product</code> has been translated to <code class="language-plaintext highlighter-rouge">Italy::Product</code>, and the others similarly.</p>
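<p>If you prefer to drive the dump from Perl rather than from the command line, <code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader</code> exports <code class="language-plaintext highlighter-rouge">make_schema_at</code>, which accepts the same options as a hash. The following is a minimal sketch, assuming the same connection parameters and dump directory used above; it is expected to produce an equivalent tree, but take it as an illustration rather than a replacement for the <code class="language-plaintext highlighter-rouge">dbicdump</code> run:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!perl
use v5.40;
use DBIx::Class::Schema::Loader qw( make_schema_at );

# the same options passed to dbicdump via -o, expressed as a Perl hash
make_schema_at(
    'Example::Schema',
    {
        dump_directory         => '/home/luca/tmp',
        components             => [ 'InflateColumn::DateTime' ],
        moniker_parts          => [ 'schema', 'name' ],
        moniker_part_separator => '::',
        db_schema              => [ 'public', 'italy', 'japan' ],
    },
    # connection information: DSN, username, password (same as above)
    [ 'dbi:Pg:dbname=dbic;host=rachel;port=5432', 'luca', 'superSecretPassword' ],
);
</code></pre></div></div>
<p><br />
<br /></p>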
<h2 id="using-the-table-structure">Using the table structure</h2>
<p>In order to use the Perl classes, and most notably to query the tables, you need to pass the class names to the <code class="language-plaintext highlighter-rouge">resultset</code> method.</p>
<p>As an example:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!perl</span>
<span class="k">use</span> <span class="nv">v5</span><span class="mf">.40</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Example::</span><span class="nv">Schema</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Example::Schema::Result::Italy::</span><span class="nv">Product</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Example::Schema::Result::Japan::</span><span class="nv">Product</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$db</span> <span class="o">=</span> <span class="nn">Example::</span><span class="nv">Schema</span><span class="o">-></span><span class="nb">connect</span><span class="p">(</span> <span class="p">'</span><span class="s1">dbi:Pg:dbname=dbic;host=rachel;port=5432</span><span class="p">'</span> <span class="p">,</span>
<span class="p">'</span><span class="s1">luca</span><span class="p">',</span>
<span class="p">'</span><span class="s1">superSecretPassword</span><span class="p">'</span> <span class="p">);</span>
<span class="k">my</span> <span class="nv">@italian_products</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">resultset</span><span class="p">(</span> <span class="p">'</span><span class="s1">Italy::Product</span><span class="p">'</span> <span class="p">)</span><span class="o">-></span><span class="nv">all</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">@japanese_products</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">resultset</span><span class="p">(</span> <span class="p">'</span><span class="s1">Japan::Product</span><span class="p">'</span> <span class="p">)</span><span class="o">-></span><span class="nv">all</span><span class="p">;</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">There are </span><span class="p">"</span> <span class="o">.</span> <span class="nb">scalar</span><span class="p">(</span> <span class="nv">@italian_products</span> <span class="p">)</span> <span class="o">.</span> <span class="p">"</span><span class="s2"> italian products</span><span class="p">";</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">There are </span><span class="p">"</span> <span class="o">.</span> <span class="nb">scalar</span><span class="p">(</span> <span class="nv">@japanese_products</span> <span class="p">)</span> <span class="o">.</span> <span class="p">"</span><span class="s2"> japanese products</span><span class="p">";</span>
<span class="k">for</span> <span class="k">my</span> <span class="nv">$product</span> <span class="p">(</span> <span class="nv">@italian_products</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">[ITALY] </span><span class="p">"</span> <span class="o">.</span> <span class="nb">join</span><span class="p">(</span> <span class="p">"</span><span class="s2"> | </span><span class="p">",</span> <span class="nv">$product</span><span class="o">-></span><span class="nv">code</span><span class="p">,</span> <span class="nv">$product</span><span class="o">-></span><span class="nv">description</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">for</span> <span class="k">my</span> <span class="nv">$product</span> <span class="p">(</span> <span class="nv">@japanese_products</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">[JAPAN] </span><span class="p">"</span> <span class="o">.</span> <span class="nb">join</span><span class="p">(</span> <span class="p">"</span><span class="s2"> | </span><span class="p">",</span> <span class="nv">$product</span><span class="o">-></span><span class="nv">code</span><span class="p">,</span> <span class="nv">$product</span><span class="o">-></span><span class="nv">description</span> <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note how the <code class="language-plaintext highlighter-rouge">resultset</code> method does not accept the table name, but rather the Perl module name, relative to the <code class="language-plaintext highlighter-rouge">Result</code> namespace.
In other words, <code class="language-plaintext highlighter-rouge">italy.product</code> does not work, while <code class="language-plaintext highlighter-rouge">Italy::Product</code> works.</p>
<p>In fact, enabling <code class="language-plaintext highlighter-rouge">DBIC_TRACE</code> and running the sample program produces the following output:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">export </span><span class="nv">DBIC_TRACE</span><span class="o">=</span>1
% perl test.pl
SELECT me.pk, me.code, me.description FROM italy.product me:
SELECT me.pk, me.code, me.description FROM japan.product me:
There are 1 italian products
There are 2 japanese products
<span class="o">[</span>ITALY] it01 | An italian product
<span class="o">[</span>JAPAN] jp01 | A japanese product
<span class="o">[</span>JAPAN] jp02 | A japanese product
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the queries are correctly translated into <code class="language-plaintext highlighter-rouge"><schema>.<tablename></code>. This is thanks to the fact that the <code class="language-plaintext highlighter-rouge">table</code> method in every class has been invoked with the fully qualified name. As an example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% less Example/Schema/Result/Italy/Product.pm
...
__PACKAGE__->table<span class="o">(</span><span class="s2">"italy.product"</span><span class="o">)</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
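<p>To double check which names can be passed to <code class="language-plaintext highlighter-rouge">resultset</code>, you can ask the schema object for its registered sources. The following is a minimal sketch, assuming the classes have been dumped into <code class="language-plaintext highlighter-rouge">/home/luca/tmp</code> as above; it prints every source name next to the table it is mapped to:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!perl
use v5.40;
use lib '/home/luca/tmp';   # assumption: the directory where dbicdump wrote the classes
use Example::Schema;

my $db = Example::Schema->connect( 'dbi:Pg:dbname=dbic;host=rachel;port=5432',
                                   'luca',
                                   'superSecretPassword' );

# sources() returns the names accepted by resultset(),
# source( $name )->name returns the table each one is mapped to
for my $moniker ( sort { $a cmp $b } $db->sources ) {
    say sprintf '%-16s => %s', $moniker, $db->source( $moniker )->name;
}
</code></pre></div></div>
<p><br />
<br /></p>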
<h2 id="using-relationships">Using Relationships</h2>
<p>Once it is clear how the tables are named, it is quite simple to query relationships.
Let’s do it programmatically first:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!perl</span>
<span class="k">use</span> <span class="nv">v5</span><span class="mf">.40</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Example::</span><span class="nv">Schema</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Example::Schema::Result::Italy::</span><span class="nv">Product</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">Example::Schema::Result::Japan::</span><span class="nv">Product</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$db</span> <span class="o">=</span> <span class="nn">Example::</span><span class="nv">Schema</span><span class="o">-></span><span class="nb">connect</span><span class="p">(</span> <span class="p">'</span><span class="s1">dbi:Pg:dbname=dbic;host=rachel;port=5432</span><span class="p">'</span> <span class="p">,</span>
<span class="p">'</span><span class="s1">luca</span><span class="p">',</span>
<span class="p">'</span><span class="s1">superSecretPassword</span><span class="p">'</span> <span class="p">);</span>
<span class="k">my</span> <span class="nv">@italian_orders</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">resultset</span><span class="p">(</span> <span class="p">'</span><span class="s1">Italy::Order</span><span class="p">'</span> <span class="p">)</span><span class="o">-></span><span class="nv">all</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">@japanese_orders</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">resultset</span><span class="p">(</span> <span class="p">'</span><span class="s1">Japan::Order</span><span class="p">'</span> <span class="p">)</span><span class="o">-></span><span class="nv">all</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="nv">@italian_orders</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">say</span> <span class="nb">sprintf</span> <span class="p">"</span><span class="s2">[ITALY] qty = %d for product %s</span><span class="p">"</span> <span class="p">,</span>
<span class="vg">$_</span><span class="o">-></span><span class="nv">qty</span><span class="p">,</span>
<span class="nb">join</span><span class="p">(</span> <span class="p">"</span><span class="s2">|</span><span class="p">",</span> <span class="vg">$_</span><span class="o">-></span><span class="nv">product</span><span class="o">-></span><span class="nv">code</span><span class="p">,</span> <span class="vg">$_</span><span class="o">-></span><span class="nv">product</span><span class="o">-></span><span class="nv">description</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">for</span> <span class="p">(</span> <span class="nv">@japanese_orders</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">say</span> <span class="nb">sprintf</span> <span class="p">"</span><span class="s2">[JAPAN] qty = %d for product %s</span><span class="p">"</span> <span class="p">,</span>
<span class="vg">$_</span><span class="o">-></span><span class="nv">qty</span><span class="p">,</span>
<span class="nb">join</span><span class="p">(</span> <span class="p">"</span><span class="s2">|</span><span class="p">",</span> <span class="vg">$_</span><span class="o">-></span><span class="nv">product</span><span class="o">-></span><span class="nv">code</span><span class="p">,</span> <span class="vg">$_</span><span class="o">-></span><span class="nv">product</span><span class="o">-></span><span class="nv">description</span> <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>But let’s assume we want to query all the products that have at least one order of a given quantity (again, this is an example).
This can be done as follows:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">my</span> <span class="nv">@italian_products</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">resultset</span><span class="p">(</span> <span class="p">'</span><span class="s1">Italy::Order</span><span class="p">'</span> <span class="p">)</span>
<span class="o">-></span><span class="nv">search_related</span><span class="p">(</span> <span class="p">'</span><span class="s1">product</span><span class="p">'</span> <span class="p">)</span>
<span class="o">-></span><span class="nv">search</span><span class="p">(</span> <span class="p">{</span> <span class="s">qty</span> <span class="o">=></span> <span class="mi">36</span> <span class="p">}</span> <span class="p">)</span>
<span class="o">-></span><span class="nv">all</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Let’s dissect this:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">resultset( 'Italy::Order' )</code> is what we search first;</li>
<li><code class="language-plaintext highlighter-rouge">search_related( 'product' )</code> is the relationship we join, and whose rows we want to extract;</li>
<li><code class="language-plaintext highlighter-rouge">search( { qty => 36 } )</code> is the search condition (i.e., the <code class="language-plaintext highlighter-rouge">WHERE</code> clause);</li>
<li><code class="language-plaintext highlighter-rouge">all</code> is the materialization of the result set.</li>
</ul>
<p>The above translates to the following query (again, obtained by enabling <code class="language-plaintext highlighter-rouge">DBIC_TRACE</code>):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">product</span><span class="p">.</span><span class="n">pk</span><span class="p">,</span> <span class="n">product</span><span class="p">.</span><span class="n">code</span><span class="p">,</span> <span class="n">product</span><span class="p">.</span><span class="n">description</span>
<span class="k">FROM</span> <span class="n">italy</span><span class="p">.</span><span class="n">orders</span> <span class="n">me</span>
<span class="k">JOIN</span> <span class="n">italy</span><span class="p">.</span><span class="n">product</span> <span class="n">product</span>
<span class="k">ON</span> <span class="n">product</span><span class="p">.</span><span class="n">pk</span> <span class="o">=</span> <span class="n">me</span><span class="p">.</span><span class="n">product</span> <span class="k">WHERE</span> <span class="p">(</span> <span class="n">qty</span> <span class="o">=</span> <span class="o">?</span> <span class="p">):</span> <span class="s1">'36'</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Wait a minute! What is that <code class="language-plaintext highlighter-rouge">product</code> name that appears in the <code class="language-plaintext highlighter-rouge">search_related</code> method? Why is it not <code class="language-plaintext highlighter-rouge">Italy::Product</code> as before?
This is due to how <code class="language-plaintext highlighter-rouge">DBIx::Class</code> handles relationships: every relationship gets a name that is used to tell <code class="language-plaintext highlighter-rouge">DBIx::Class</code> what to join.
Inspecting <code class="language-plaintext highlighter-rouge">Italy::Order</code> you can find something like the following:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">__PACKAGE__</span><span class="o">-></span><span class="nv">belongs_to</span><span class="p">(</span>
<span class="p">"</span><span class="s2">product</span><span class="p">",</span>
<span class="p">"</span><span class="s2">Example::Schema::Result::Italy::Product</span><span class="p">",</span>
<span class="p">{</span> <span class="s">pk</span> <span class="o">=></span> <span class="p">"</span><span class="s2">product</span><span class="p">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="s">is_deferrable</span> <span class="o">=></span> <span class="mi">0</span><span class="p">,</span> <span class="s">on_delete</span> <span class="o">=></span> <span class="p">"</span><span class="s2">NO ACTION</span><span class="p">",</span> <span class="s">on_update</span> <span class="o">=></span> <span class="p">"</span><span class="s2">NO ACTION</span><span class="p">"</span> <span class="p">},</span>
<span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The string <code class="language-plaintext highlighter-rouge">"product"</code> is the name of this join relationship, and it has to be used when telling <code class="language-plaintext highlighter-rouge">DBIx::Class</code> to join another table from <code class="language-plaintext highlighter-rouge">Italy::Order</code>.</p>
<p>This is a kind of trick used by <code class="language-plaintext highlighter-rouge">DBIx::Class</code>, so that having an <code class="language-plaintext highlighter-rouge">Order</code> you can simply write <code class="language-plaintext highlighter-rouge">$order->product->code</code> and it will work fine. You can rename such an association as you like (taking care not to break the code generated by <code class="language-plaintext highlighter-rouge">dbicdump</code>) and use the name you like the most when joining, but I strongly recommend you avoid this. Rather, design your tables better.</p>
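<p>The relationship name is also what the <code class="language-plaintext highlighter-rouge">join</code> and <code class="language-plaintext highlighter-rouge">prefetch</code> attributes of <code class="language-plaintext highlighter-rouge">search</code> expect. As a minimal sketch (the threshold <code class="language-plaintext highlighter-rouge">10</code> is an arbitrary value chosen for illustration, and <code class="language-plaintext highlighter-rouge">$db</code> is the connected schema of the previous listings), the following fetches the orders together with their products in a single query, so that <code class="language-plaintext highlighter-rouge">$order->product</code> should not need to hit the database again:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 'me' is the alias of the table the resultset starts from (italy.orders),
# 'product' is the relationship name declared via belongs_to
my @orders = $db->resultset( 'Italy::Order' )
                ->search( { 'me.qty' => { '>=' => 10 } },
                          { prefetch => 'product' } )
                ->all;

for my $order ( @orders ) {
    say sprintf '[ITALY] qty = %d for product %s',
        $order->qty,
        $order->product->code;
}
</code></pre></div></div>
<p><br />
<br /></p>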
<h1 id="conclusion">Conclusion</h1>
<p><code class="language-plaintext highlighter-rouge">DBIx::Class</code> is a very powerful and elegant ORM, and <code class="language-plaintext highlighter-rouge">dbicdump</code> allows you to organize your code in packages following the same clean order you can achieve with PostgreSQL schemas.</p>
psql watch now has a row limit2024-11-06T00:00:00+00:00https://fluca1978.github.io/2024/11/06/psqlWatchRowsLimit<p>A new feature introduced with PostgreSQL 17.</p>
<h1 id="psql-watch-now-has-a-row-limit">psql \watch now has a row limit</h1>
<p>I often use <code class="language-plaintext highlighter-rouge">\watch</code>, an internal command of the great text client <code class="language-plaintext highlighter-rouge">psql</code> that allows you to re-execute the last query at regular intervals.
Essentially, it works like <code class="language-plaintext highlighter-rouge">watch(1)</code> on a Unix machine.</p>
<p>In PostgreSQL 17, the <code class="language-plaintext highlighter-rouge">\watch</code> command has gained a new interesting feature: the <strong>min_rows</strong> limit. The idea is to make <code class="language-plaintext highlighter-rouge">\watch</code> stop automatically as soon as the executed query returns fewer than the specified number of rows.
This is great, in my opinion, since I often launch <code class="language-plaintext highlighter-rouge">\watch</code> only to come back to the terminal and find a lot of <em>empty executions</em>, and then need to scroll up the terminal or the log to find the last point when something actually happened. With this feature, <code class="language-plaintext highlighter-rouge">\watch</code> will stop for me!</p>
<p>As a very trivial example, imagine launching a well-known <code class="language-plaintext highlighter-rouge">pgbench</code> test (with short time parameters for the sake of this article):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-T</span> 120 <span class="nt">-c</span> 4 <span class="nt">-n</span> <span class="nt">-U</span> pgbench pgbench
</code></pre></div></div>
<p><br />
<br /></p>
<p>and, on a <code class="language-plaintext highlighter-rouge">psql</code> terminal, using <code class="language-plaintext highlighter-rouge">\watch</code> while waiting for <code class="language-plaintext highlighter-rouge">pgbench</code> to finish:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">select</span> <span class="n">query</span><span class="p">,</span> <span class="n">wait_event</span>
<span class="k">from</span> <span class="n">pg_stat_activity</span>
<span class="k">where</span> <span class="n">datname</span> <span class="o">=</span> <span class="n">current_database</span><span class="p">()</span> <span class="k">and</span> <span class="n">usename</span> <span class="o">=</span> <span class="s1">'pgbench'</span><span class="p">;</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="err">\</span><span class="n">watch</span> <span class="n">m</span><span class="o">=</span><span class="mi">1</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As soon as the query returns fewer than one row, which means no more <code class="language-plaintext highlighter-rouge">pgbench</code> processes are connected to the database, <code class="language-plaintext highlighter-rouge">\watch</code> will stop:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="err">\</span><span class="n">watch</span> <span class="n">m</span><span class="o">=</span><span class="mi">1</span>
<span class="n">Wed</span> <span class="n">Nov</span> <span class="mi">6</span> <span class="mi">07</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">20</span> <span class="mi">2024</span> <span class="p">(</span><span class="k">every</span> <span class="mi">2</span><span class="n">s</span><span class="p">)</span>
<span class="n">query</span> <span class="o">|</span> <span class="n">wait_event</span>
<span class="c1">------------------------------------------------------------------------------+---------------</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="mi">4197</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">9520572</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileWrite</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="mi">1973</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">98188924</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileRead</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="mi">4479</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">2905019</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileWrite</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="mi">2554</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">17213075</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileRead</span>
<span class="p">(</span><span class="mi">4</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">Wed</span> <span class="n">Nov</span> <span class="mi">6</span> <span class="mi">07</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">22</span> <span class="mi">2024</span> <span class="p">(</span><span class="k">every</span> <span class="mi">2</span><span class="n">s</span><span class="p">)</span>
<span class="n">query</span> <span class="o">|</span> <span class="n">wait_event</span>
<span class="c1">-------------------------------------------------------------------------------+---------------</span>
<span class="k">BEGIN</span><span class="p">;</span> <span class="o">|</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="o">-</span><span class="mi">3617</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">4375996</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileWrite</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="mi">1388</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">75850626</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileWrite</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">abalance</span> <span class="o">=</span> <span class="n">abalance</span> <span class="o">+</span> <span class="o">-</span><span class="mi">3372</span> <span class="k">WHERE</span> <span class="n">aid</span> <span class="o">=</span> <span class="mi">15435869</span><span class="p">;</span> <span class="o">|</span> <span class="n">DataFileRead</span>
<span class="p">(</span><span class="mi">4</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">Wed</span> <span class="n">Nov</span> <span class="mi">6</span> <span class="mi">07</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">24</span> <span class="mi">2024</span> <span class="p">(</span><span class="k">every</span> <span class="mi">2</span><span class="n">s</span><span class="p">)</span>
<span class="n">query</span> <span class="o">|</span> <span class="n">wait_event</span>
<span class="c1">-------+------------</span>
<span class="p">(</span><span class="mi">0</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Clearly, this has its own drawbacks: imagine a long-running job that pauses, releasing the connection, only to resume its activity later. If <code class="language-plaintext highlighter-rouge">\watch</code> executes the query during the pause, the command will stop and you will not get any update when the activity resumes.
An example is monitoring the <code class="language-plaintext highlighter-rouge">pg_stat_progress_xxx</code> views, for instance <code class="language-plaintext highlighter-rouge">pg_stat_progress_vacuum</code> to see when a table is being cleaned.</p>
<p>However, keeping in mind a good condition (number of rows) to get <code class="language-plaintext highlighter-rouge">\watch</code> on track, and being able to make it stop automatically is a feature that really helps me in my daily activity.</p>
PostgreSQL is super solid in enforcing (well established) constraints!2024-11-06T00:00:00+00:00https://fluca1978.github.io/2024/11/06/Sqlite3ToPostgreSQL<p>A note about migrating from other databases…</p>
<h1 id="postgresql-is-super-solid-in-enforcing-well-established-constraints">PostgreSQL is super solid in enforcing (well established) constraints!</h1>
<p>Well, let’s turn that around: <em>SQLite3 is somehow too flexible in allowing you to store data!</em></p>
<p>We all know that.</p>
<p>And we have all been fighting situations where we have a well defined structure in SQLite3 and, once we try to migrate to PostgreSQL, a bad surprise arrives!
As an example, today I was trying to migrate a Django project with the built-in <code class="language-plaintext highlighter-rouge">loaddata</code> from a <code class="language-plaintext highlighter-rouge">dumpdata</code>, and sadly:</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>django.db.utils.DataError:
Problem installing fixture '/home/luca/git/respi/respiato/sql/data.respi.json':
Could not load respi.PersonPhoto(pk=30647):
value too long for type character varying(20)
</code></pre></div></div>
<p><br />
<br /></p>
<p>So in my SQLite3 tables some fields (at least one) hold values longer than the declared <code class="language-plaintext highlighter-rouge">varchar(20)</code>, and while PostgreSQL correctly refuses to store such value(s), SQLite3 happily accepts them into the database without warning you!</p>
<p>The fix, in this particular case, is quite simple: issuing an <code class="language-plaintext highlighter-rouge">ALTER TABLE personphoto ALTER COLUMN file_path TYPE VARCHAR(50)</code> is enough. I could have used <code class="language-plaintext highlighter-rouge">text</code> too, but I would like to keep crazy values coming from my application under control.</p>
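<p>Before picking the new size, it can help to ask SQLite3 how long the longest stored value actually is. The following is a small sketch using the <code class="language-plaintext highlighter-rouge">sqlite3</code> command line shell; the function name <code class="language-plaintext highlighter-rouge">longest_value</code> and the database file name in the usage line are placeholders of mine, while table and column are those of the example above.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: print the length of the longest value stored in a column.
# $1 = SQLite3 database file, $2 = table name, $3 = column name
longest_value() {
    sqlite3 "$1" "SELECT max(length($3)) FROM $2;"
}
</code></pre></div></div>
<p><br />
<br /></p>
<p>Something like <code class="language-plaintext highlighter-rouge">longest_value db.sqlite3 personphoto file_path</code> then tells whether <code class="language-plaintext highlighter-rouge">varchar(50)</code> is really large enough for the data already in place.</p>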
<p>The point is: sooner or later, you will be stuck against a constraint your stack is not honoring, so be prepared for some troubles.</p>
<p>Using PostgreSQL in the first place would have made the long-term maintenance easier, in my opinion.</p>
PostgreSQL 17 WAL Summarization2024-10-21T00:00:00+00:00https://fluca1978.github.io/2024/10/21/PostgreSQLWalSummarization<p>A new interesting feature in the management of WALs.</p>
<h1 id="postgresql-17-wal-summarization">PostgreSQL 17 WAL Summarization</h1>
<p>PostgreSQL 17 adds a new cool feature in the management of the Write Ahead Logs (WALs): the <strong>WAL summarization</strong>.</p>
<p>Two settings control the WAL Summarization:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">summarize_wal</code> (by default set to <code class="language-plaintext highlighter-rouge">off</code>) indicates if the summaries have to be produced;</li>
<li><code class="language-plaintext highlighter-rouge">wal_summary_keep_time</code> indicates the amount of time (usually days) to keep the summaries before proceeding to an automatic cleanup.</li>
</ul>
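<p>As a minimal sketch, assuming a superuser connection configured through the usual <code class="language-plaintext highlighter-rouge">PG*</code> environment variables and a <code class="language-plaintext highlighter-rouge">wal_level</code> other than <code class="language-plaintext highlighter-rouge">minimal</code>, both settings can be changed with <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> followed by a configuration reload. The function name <code class="language-plaintext highlighter-rouge">enable_wal_summaries</code> is made up for this example, and <code class="language-plaintext highlighter-rouge">10d</code> is simply the documented default retention.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: turn WAL summarization on and reload the configuration.
# Assumes psql can connect as a superuser via the PG* environment variables.
enable_wal_summaries() {
    # start producing summaries
    psql -c "ALTER SYSTEM SET summarize_wal TO on" || return 1
    # keep summaries for ten days before the automatic cleanup
    psql -c "ALTER SYSTEM SET wal_summary_keep_time TO '10d'" || return 1
    # both settings only need a reload, not a restart
    psql -c "SELECT pg_reload_conf()"
}
</code></pre></div></div>
<p><br />
<br /></p>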
<p>Documentation for these two settings can be found <a href="https://www.postgresql.org/docs/17/runtime-config-wal.html" target="_blank">in the official documentation</a>.
Turning on <code class="language-plaintext highlighter-rouge">summarize_wal</code> makes another process appear in the list of PostgreSQL processes: the <strong>walsummarizer</strong>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ps <span class="nt">-auxw</span> | <span class="nb">grep </span>postgres
postgres 1 0.0 0.1 221044 29824 ? Ss 13:45 0:00 postgres
postgres 27 11.2 0.0 74668 6768 ? Ss 13:45 4:09 postgres: logger
postgres 28 0.0 0.3 221312 52088 ? Ss 13:45 0:00 postgres: checkpointer
postgres 29 0.0 0.0 221188 9080 ? Ss 13:45 0:00 postgres: background writer
postgres 31 0.0 0.0 221164 11768 ? Ss 13:45 0:00 postgres: walwriter
postgres 32 0.0 0.0 222608 9720 ? Ss 13:45 0:00 postgres: autovacuum launcher
postgres 33 0.0 0.0 222616 9208 ? Ss 13:45 0:00 postgres: logical replication launcher
postgres 289 0.0 0.0 221652 7696 ? Ss 13:57 0:01 postgres: walsummarizer
</code></pre></div></div>
<p><br />
<br /></p>
<p>This process reads the WAL and keeps track of which blocks have changed, so as to produce the summaries.</p>
<p>WAL summaries are kept in the <code class="language-plaintext highlighter-rouge">pg_wal</code> directory, under the <code class="language-plaintext highlighter-rouge">summaries</code> subdirectory, hence in a very risky zone to walk into!</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">ls</span> <span class="nt">-1</span> <span class="nv">$PGDATA</span>/pg_wal/summaries
000000010000000023001320000000002305F8D8.summary
00000001000000002305F8D80000000026000028.summary
0000000100000000260000280000000029E52AB8.summary
000000010000000029E52AB8000000002BC82690.summary
00000001000000002BC82690000000002E015A98.summary
</code></pre></div></div>
<p><br />
<br /></p>
<p>The summaries are used to enable the very cool new feature of <strong>incremental backups</strong>: since version <code class="language-plaintext highlighter-rouge">17</code>, <code class="language-plaintext highlighter-rouge">pg_basebackup</code> is able to take <a href="https://www.postgresql.org/docs/17/continuous-archiving.html#BACKUP-INCREMENTAL-BACKUP" target="_blank">incremental backups</a>.
The idea is as follows: you run a first <code class="language-plaintext highlighter-rouge">pg_basebackup</code> as usual, so as to take a so-called <em>full backup</em>. Then you take further backups passing <code class="language-plaintext highlighter-rouge">pg_basebackup</code> the <code class="language-plaintext highlighter-rouge">--incremental</code> option, with the manifest of the previous backup as its argument. The command works out what has changed on disk since the previous backup and copies over only the blocks that have been modified.</p>
<p>Before version <code class="language-plaintext highlighter-rouge">17</code> the only way to take a good incremental backup was to use tools like the excellent <code class="language-plaintext highlighter-rouge">pgbackrest</code>, which is able to do exactly that.</p>
<p>Summaries are used to know which blocks on disk have changed since the last backup, so as to tell <code class="language-plaintext highlighter-rouge">pg_basebackup</code> what needs to be copied over. WAL summaries are much smaller than the WALs themselves, and can therefore be kept for a much longer period than the WALs. In particular, in order to be able to perform an incremental backup, all the summaries covering the timeframe from the previous backup to the current moment must be available, otherwise the incremental backup cannot be taken. Hence the need for a <code class="language-plaintext highlighter-rouge">wal_summary_keep_time</code> tunable, which reminds me of the old days of <code class="language-plaintext highlighter-rouge">wal_keep_segments</code>, with all the related problems and workarounds.</p>
<p>Incremental backups then need to be <em>re-assembled</em> into a single backup by means of a new tool called <code class="language-plaintext highlighter-rouge">pg_combinebackup</code>, not discussed here.</p>
<p>One thing that scares me a lot is that there is no way to automatically delete summaries once summarization is turned off after having been enabled. In other words, the user is required to manually remove the <em>no longer useful summaries</em> if the summarizer process is turned off. Since the summaries live in a subdirectory of <code class="language-plaintext highlighter-rouge">pg_wal</code>, and since the latter is such a risky place to be in, I believe a distracted user could do great damage to the system.</p>
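<p>To make the workflow a little more concrete, here is a hedged sketch of the three steps as shell functions: a full backup, an incremental backup against the manifest of the previous one, and the final recombination with <code class="language-plaintext highlighter-rouge">pg_combinebackup</code>, the tool shipped with version <code class="language-plaintext highlighter-rouge">17</code>. The function names and the directories are placeholders of mine, and connection settings are assumed to come from the usual <code class="language-plaintext highlighter-rouge">PG*</code> environment variables.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helpers: names and directory layout are illustrative only.

# Take a full (base) backup into the directory $1.
full_backup() {
    pg_basebackup --pgdata="$1" --checkpoint=fast
}

# Take an incremental backup into $2, relative to the backup stored in $1.
# The manifest of the previous backup tells pg_basebackup where to start from.
incremental_backup() {
    pg_basebackup --pgdata="$2" --incremental="$1/backup_manifest"
}

# Rebuild a complete data directory in $3 from the full backup $1
# and the incremental backup $2 (oldest backup first).
combine_backups() {
    pg_combinebackup "$1" "$2" --output="$3"
}
</code></pre></div></div>
<p><br />
<br /></p>
<p>A possible session would be <code class="language-plaintext highlighter-rouge">full_backup /backups/full</code>, later <code class="language-plaintext highlighter-rouge">incremental_backup /backups/full /backups/incr1</code>, and finally <code class="language-plaintext highlighter-rouge">combine_backups /backups/full /backups/incr1 /backups/restored</code> when a restore is needed.</p>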
pgenv 1.3.8 is out!2024-10-17T00:00:00+00:00https://fluca1978.github.io/2024/10/17/pgenvPG17<p>A new release of pgenv that simplifies the management of PostgreSQL 17.</p>
<h1 id="pgenv-138-is-out">pgenv 1.3.8 is out!</h1>
<p>Yesterday, David Wheeler <a href="https://github.com/theory/pgenv/releases/tag/v1.3.8" target="_blank">released version 1.3.8 of pgenv</a>, which solves a few problems in dealing with the latest PostgreSQL release, version <code class="language-plaintext highlighter-rouge">17</code>.</p>
<p>The build workflow of PostgreSQL <code class="language-plaintext highlighter-rouge">17</code> has slightly changed, so that new dependencies are required to produce the documentation. Thanks to the <a href="https://github.com/theory/pgenv/commit/d97b0505fb067ee79c402800b72261317f715ae8" target="_blank">work by Brian Salehi</a> now the <code class="language-plaintext highlighter-rouge">pgenv build</code> command performs a <code class="language-plaintext highlighter-rouge">make world-bin</code> (essentially <code class="language-plaintext highlighter-rouge">world-bin</code> is the target to build and install PostgreSQL without documentation).
The documentation package is downloaded separately, since the pre-built documentation has been removed from the source tree and is now available as a separate tarball.</p>
<p>Moreover, this release includes <a href="https://github.com/theory/pgenv/commit/f2a7486e7bf1dabd65c430f861160d9429d7c2ac" target="_blank">another small contribution by Brian</a> that improves the descriptive messages about dependencies.</p>
<p>Enjoy!</p>
PostgreSQL adds the login type for event triggers2024-10-03T00:00:00+00:00https://fluca1978.github.io/2024/10/03/PostgreSQLLoginTrigger<p>It is now possible to catch a login event.</p>
<h1 id="postgresql-adds-the-login-type-for-event-triggers">PostgreSQL adds the login type for event triggers</h1>
<p>PostgreSQL 17 adds a new firing event for event triggers: <a href="https://www.postgresql.org/docs/current/event-trigger-definition.html" target="_blank">login</a>. Therefore it is now possible to catch a login attempt on a database.</p>
<p>Caution: <em>this is not the same as Oracle logon triggers</em>, even if it resembles the same functionality to me.</p>
<p>However, thanks to this, it is now possible to get some more information when a login attempt succeeds.</p>
<p>In order to implement a poor-man auditing (<strong>don’t do this at home!</strong>) to <em>briefly</em> demonstrate this feature, you can:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">postgres</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">wrong_audit</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span>
<span class="p">,</span> <span class="n">who</span> <span class="nb">text</span>
<span class="p">,</span> <span class="n">ts</span> <span class="nb">timestamp</span> <span class="k">default</span> <span class="k">current_timestamp</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">postgres</span><span class="o">=#</span> <span class="k">grant</span> <span class="k">INSERT</span> <span class="k">on</span> <span class="k">table</span> <span class="n">wrong_audit</span> <span class="k">to</span> <span class="k">public</span><span class="p">;</span>
<span class="k">GRANT</span>
<span class="n">postgres</span><span class="o">=#</span> <span class="k">create</span> <span class="k">or</span> <span class="k">replace</span>
<span class="k">function</span> <span class="n">f_etr_audit</span><span class="p">()</span> <span class="k">returns</span> <span class="n">event_trigger</span>
<span class="k">as</span> <span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">begin</span>
<span class="k">insert</span> <span class="k">into</span> <span class="n">wrong_audit</span><span class="p">(</span> <span class="n">who</span> <span class="p">)</span> <span class="k">select</span> <span class="k">current_role</span><span class="p">;</span>
<span class="k">end</span>
<span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">language</span> <span class="n">plpgsql</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">FUNCTION</span>
<span class="n">postgres</span><span class="o">=#</span> <span class="k">create</span> <span class="n">event</span> <span class="k">trigger</span>
<span class="n">poor_auditing</span> <span class="k">on</span> <span class="n">login</span>
<span class="k">execute</span> <span class="k">function</span> <span class="n">f_etr_audit</span><span class="p">();</span>
<span class="k">CREATE</span> <span class="n">EVENT</span> <span class="k">TRIGGER</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>And now, when you connect to the database, you will see the table getting populated.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">postgres</span><span class="o">=#</span> <span class="k">table</span> <span class="n">wrong_audit</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">who</span> <span class="o">|</span> <span class="n">ts</span>
<span class="c1">----+----------+----------------------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">postgres</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">03</span> <span class="mi">11</span><span class="p">:</span><span class="mi">39</span><span class="p">:</span><span class="mi">27</span><span class="p">.</span><span class="mi">659018</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">postgres</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">03</span> <span class="mi">11</span><span class="p">:</span><span class="mi">40</span><span class="p">:</span><span class="mi">01</span><span class="p">.</span><span class="mi">057011</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">postgres</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">03</span> <span class="mi">11</span><span class="p">:</span><span class="mi">46</span><span class="p">:</span><span class="mi">06</span><span class="p">.</span><span class="mi">38925</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">luca</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">03</span> <span class="mi">11</span><span class="p">:</span><span class="mi">46</span><span class="p">:</span><span class="mi">44</span><span class="p">.</span><span class="mi">621835</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">luca</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">03</span> <span class="mi">11</span><span class="p">:</span><span class="mi">46</span><span class="p">:</span><span class="mi">46</span><span class="p">.</span><span class="mi">389537</span>
<span class="mi">6</span> <span class="o">|</span> <span class="n">postgres</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">03</span> <span class="mi">11</span><span class="p">:</span><span class="mi">46</span><span class="p">:</span><span class="mi">53</span><span class="p">.</span><span class="mi">789339</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>There are a few things to note.</p>
<p>First of all, there is the need to grant the <code class="language-plaintext highlighter-rouge">INSERT</code> permission to the users that are going to fire the event, i.e., the users that are going to connect, or the trigger will not be able to execute. Obviously, there are other ways to do this, like setting permissions on the function itself.</p>
<p>Most important: if the trigger fails (due to an exception), the login attempt is aborted. For example, imagine that I remove the permissions on the table:</p>
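<p>As a sketch of the “permissions on the function” route, the trigger function can be recreated as <code class="language-plaintext highlighter-rouge">SECURITY DEFINER</code>, so that the <code class="language-plaintext highlighter-rouge">INSERT</code> runs with the privileges of the function owner and the grant to <code class="language-plaintext highlighter-rouge">public</code> is no longer needed. The helper name <code class="language-plaintext highlighter-rouge">audit_as_definer</code> is made up, and I am assuming that the table lives in the <code class="language-plaintext highlighter-rouge">public</code> schema and that <code class="language-plaintext highlighter-rouge">psql</code> connects as the owner of both table and function. Note the switch to <code class="language-plaintext highlighter-rouge">session_user</code>: inside a security definer function <code class="language-plaintext highlighter-rouge">current_role</code> would report the owner, not the user logging in.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: run the audit INSERT with the function owner's privileges.
# Assumptions: wrong_audit is in the public schema, psql connects as its owner.
audit_as_definer() {
    # single quotes keep the $code$ dollar-quoting away from the shell
    psql -c 'CREATE OR REPLACE FUNCTION f_etr_audit()
             RETURNS event_trigger
             AS $code$
             BEGIN
                 -- session_user is the user that is actually logging in
                 INSERT INTO public.wrong_audit( who ) SELECT session_user;
             END
             $code$
             LANGUAGE plpgsql
             SECURITY DEFINER
             SET search_path = pg_catalog, pg_temp' || return 1
    # the blanket grant from the first example is no longer required
    psql -c 'REVOKE INSERT ON TABLE public.wrong_audit FROM public'
}
</code></pre></div></div>
<p><br />
<br /></p>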
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">%</span> <span class="n">psql</span> <span class="o">-</span><span class="n">h</span> <span class="n">localhost</span> <span class="o">-</span><span class="n">U</span> <span class="n">luca</span> <span class="n">postgres</span>
<span class="n">psql</span><span class="p">:</span> <span class="n">error</span><span class="p">:</span> <span class="k">connection</span> <span class="k">to</span> <span class="n">server</span> <span class="k">at</span> <span class="nv">"localhost"</span> <span class="p">(</span><span class="mi">127</span><span class="p">.</span><span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">.</span><span class="mi">1</span><span class="p">),</span> <span class="n">port</span> <span class="mi">5432</span> <span class="n">failed</span><span class="p">:</span> <span class="n">FATAL</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">wrong_audit</span>
<span class="n">CONTEXT</span><span class="p">:</span> <span class="k">SQL</span> <span class="k">statement</span> <span class="nv">"insert into wrong_audit( who ) select current_role"</span>
<span class="n">PL</span><span class="o">/</span><span class="n">pgSQL</span> <span class="k">function</span> <span class="n">f_etr_audit</span><span class="p">()</span> <span class="n">line</span> <span class="mi">2</span> <span class="k">at</span> <span class="k">SQL</span> <span class="k">statement</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The connection is aborted due to the problem in completing the function.</p>
<p>Last but not least, the trigger function should not be a long running one, or the user will be locked waiting for the trigger to complete.</p>
<p>Now, to mimic what I remember of Oracle logon triggers, let’s complicate the above example a little (<strong>don’t try this at home</strong>):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">postgres</span><span class="o">=#</span> <span class="k">alter</span> <span class="k">table</span> <span class="n">wrong_audit</span> <span class="k">add</span> <span class="k">column</span> <span class="n">db</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">postgres</span><span class="o">=#</span> <span class="k">create</span> <span class="k">or</span> <span class="k">replace</span> <span class="k">function</span> <span class="n">f_etr_audit</span><span class="p">()</span>
<span class="k">returns</span> <span class="n">event_trigger</span>
<span class="k">as</span> <span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">declare</span>
<span class="n">me</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">db</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">begin</span>
<span class="k">SELECT</span> <span class="k">current_role</span><span class="p">,</span> <span class="n">current_database</span><span class="p">()</span>
<span class="k">INTO</span> <span class="n">me</span><span class="p">,</span> <span class="n">db</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">me</span> <span class="o">=</span> <span class="s1">'luca'</span> <span class="k">AND</span> <span class="n">db</span> <span class="o">=</span> <span class="s1">'postgres'</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="s1">'Get out of here!'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">insert</span> <span class="k">into</span> <span class="n">wrong_audit</span><span class="p">(</span> <span class="n">who</span><span class="p">,</span> <span class="n">db</span> <span class="p">)</span> <span class="k">VALUES</span><span class="p">(</span> <span class="n">me</span><span class="p">,</span> <span class="n">db</span> <span class="p">);</span>
<span class="k">end</span>
<span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">language</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>And now the poor bastard me when trying to connect to <code class="language-plaintext highlighter-rouge">postgres</code> gets:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-h</span> localhost <span class="nt">-U</span> luca postgres
psql: error: connection to server at <span class="s2">"localhost"</span> <span class="o">(</span>127.0.0.1<span class="o">)</span>, port 5432 failed: FATAL: Get out of here!
CONTEXT: PL/pgSQL <span class="k">function </span>f_etr_audit<span class="o">()</span> line 10 at RAISE
</code></pre></div></div>
<p><br />
<br /></p>
<p>while other users can still connect, and the table gets populated more and more.</p>
PostgreSQL 17 allow_alter_system tunable2024-10-03T00:00:00+00:00https://fluca1978.github.io/2024/10/03/PostgreSQLAllowAlterSystem<p>PostgreSQL 17 includes a new (among others) tunable to control the <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> command.</p>
<h1 id="postgresql-17-allow_alter_system-tunable">PostgreSQL 17 allow_alter_system tunable</h1>
<p>Among the new excellent features of PostgreSQL 17, one captured my attention: <a href="https://www.postgresql.org/docs/current/runtime-config-compatible.html#GUC-ALLOW-ALTER-SYSTEM" target="_blank">the capability to disable the ALTER SYSTEM command</a> via the tunable <a href="https://www.postgresql.org/docs/current/runtime-config-compatible.html#GUC-ALLOW-ALTER-SYSTEM" target="_blank"><code class="language-plaintext highlighter-rouge">allow_alter_system</code></a>.</p>
<p>The <code class="language-plaintext highlighter-rouge">allow_alter_system</code> is a boolean setting that is turned <code class="language-plaintext highlighter-rouge">on</code> by default, meaning that it is always possible to execute <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> on the environment (as in previous versions). When turned <code class="language-plaintext highlighter-rouge">off</code>, the system will report an error, refusing to execute the command:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">postgres</span><span class="o">=#</span> <span class="k">alter</span> <span class="k">system</span> <span class="k">set</span> <span class="n">work_mem</span> <span class="k">to</span> <span class="s1">'512MB'</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">ALTER</span> <span class="k">SYSTEM</span> <span class="k">is</span> <span class="k">not</span> <span class="n">allowed</span> <span class="k">in</span> <span class="n">this</span> <span class="n">environment</span>
<span class="n">postgres</span><span class="o">=#</span> <span class="k">show</span> <span class="n">allow_alter_system</span> <span class="p">;</span>
<span class="n">allow_alter_system</span>
<span class="c1">--------------------</span>
<span class="k">off</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The idea, as explained in the documentation, is to prevent mistakes when PostgreSQL is managed externally, or with an external tool, so that it is not possible to accidentally overwrite a configuration managed outside the database itself (i.e., via traditional files).</p>
<p>The annotation for the tunable explains it:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">postgres</span><span class="o">=#</span> <span class="k">select</span> <span class="n">name</span><span class="p">,</span> <span class="n">context</span><span class="p">,</span> <span class="n">category</span><span class="p">,</span> <span class="n">short_desc</span><span class="p">,</span> <span class="n">extra_desc</span> <span class="k">from</span> <span class="n">pg_settings</span> <span class="k">where</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'allow_alter_system'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">--------------------------------------------------------------------------------------------------------------</span>
<span class="n">name</span> <span class="o">|</span> <span class="n">allow_alter_system</span>
<span class="n">context</span> <span class="o">|</span> <span class="n">sighup</span>
<span class="n">category</span> <span class="o">|</span> <span class="k">Version</span> <span class="k">and</span> <span class="n">Platform</span> <span class="n">Compatibility</span> <span class="o">/</span> <span class="n">Other</span> <span class="n">Platforms</span> <span class="k">and</span> <span class="n">Clients</span>
<span class="n">short_desc</span> <span class="o">|</span> <span class="n">Allows</span> <span class="n">running</span> <span class="n">the</span> <span class="k">ALTER</span> <span class="k">SYSTEM</span> <span class="n">command</span><span class="p">.</span>
<span class="n">extra_desc</span> <span class="o">|</span> <span class="n">Can</span> <span class="n">be</span> <span class="k">set</span> <span class="k">to</span> <span class="k">off</span> <span class="k">for</span> <span class="n">environments</span> <span class="k">where</span> <span class="k">global</span> <span class="n">configuration</span> <span class="n">changes</span> <span class="n">should</span> <span class="n">be</span> <span class="n">made</span> <span class="k">using</span> <span class="n">a</span> <span class="n">different</span> <span class="k">method</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>There are two important things to keep in mind when using this new feature:</p>
<ul>
<li><strong>this is not a security feature</strong>, it does not add any extra security layer;</li>
<li><strong><code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code> will always be loaded as the last included file</strong>, therefore setting the tunable to <code class="language-plaintext highlighter-rouge">off</code> will not change the configuration machinery of PostgreSQL, nor will it make it impossible for <em>external tools</em> to operate on <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code> directly (simulating, therefore, <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code>).</li>
</ul>
<p>Last but not least, keep in mind that the system raises an error, thus aborting your existing scripts when this feature is set to <code class="language-plaintext highlighter-rouge">off</code>. In my opinion, being able to choose whether to raise an error or to silently ignore an <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> would have been better, so that even automated scripts could run without any side effect and without interruptions due to errors.</p>
SQLite3 Vacuum and Autovacuum2024-09-23T00:00:00+00:00https://fluca1978.github.io/2024/09/23/SQLite3Vacuum<p>Similarly to PostgreSQL, also SQLite3 needs some care…</p>
<h1 id="sqlite3-vacuum-and-autovacuum">SQLite3 Vacuum and Autovacuum</h1>
<p>Today I discovered, by accident I need to confess, that PostgreSQL is not the only database requiring <code class="language-plaintext highlighter-rouge">VACUUM</code>: also SQLite3 does.</p>
<p>And there’s more: SQLite3 includes an <em>auto-vacuum</em> too!
They behave similarly, at least in theory, to their PostgreSQL counterparts, but clearly there is no autovacuum daemon or process. Moreover, the configuration is simpler and I’ve not found any threshold like those we have in PostgreSQL. In the following, I explain how <code class="language-plaintext highlighter-rouge">VACUUM</code> works in SQLite3, at least at a glance.</p>
<p>SQLite3 does not have a fully enterprise-level MVCC machinery as PostgreSQL has, but when tuples or tables are updated or deleted from a database, fragmentation and unreclaimed space mean that the database file never shrinks.
Similarly to what PostgreSQL does, the <em>now</em> empty space (no longer occupied by old tuples) is kept for future usage, so the effect is that the database grows without ever shrinking, even after large data removals.</p>
<p><code class="language-plaintext highlighter-rouge">VACUUM</code> is the solution that also SQLite3 uses to reclaim space.</p>
<p><a href="https://sqlite.org/lang_vacuum.html" target="_blank">VACUUM</a> is a command available to the SQLite3 prompt to start a manual space reclaiming. It works by copying the database file content into another (temporary) file and restructuring it, so nothing really fancy and new here!</p>
<p>Then comes <a href="https://sqlite.org/pragma.html#pragma_auto_vacuum" target="_blank">auto-vacuum</a>, which is turned off by default. Auto-vacuum works either in <em>full</em> mode or in <em>incremental</em> mode. The former is the most aggressive, and runs at every <code class="language-plaintext highlighter-rouge">COMMIT</code>. The latter is less intrusive: it “prepares” what the vacuum process has to do, without performing it. It is only when <a href="https://sqlite.org/pragma.html#pragma_incremental_vacuum" target="_blank"><code class="language-plaintext highlighter-rouge">PRAGMA incremental_vacuum</code></a> is launched that the space is freed. Therefore, auto-vacuum in SQLite3 either executes at each <code class="language-plaintext highlighter-rouge">COMMIT</code> or is postponed until it is considered safe to execute.</p>
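<p>A minimal sketch of the incremental mode, using the <code class="language-plaintext highlighter-rouge">sqlite3</code> command line shell, could look like the following. The function names and the file name in the usage line are placeholders of mine; the pragmas are the ones documented by SQLite3. On an already populated database the new mode takes effect only after a full <code class="language-plaintext highlighter-rouge">VACUUM</code>, hence the first helper runs one.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helpers around the sqlite3 command line shell.

# One-off: switch the database file $1 to incremental auto-vacuum.
# The VACUUM rebuilds the file so that the new mode is applied.
enable_incremental_vacuum() {
    sqlite3 "$1" "PRAGMA auto_vacuum = INCREMENTAL; VACUUM;"
}

# Later, when convenient: give back up to $2 free pages of database $1
# to the filesystem.
reclaim_pages() {
    sqlite3 "$1" "PRAGMA incremental_vacuum($2);"
}
</code></pre></div></div>
<p><br />
<br /></p>
<p>For example, <code class="language-plaintext highlighter-rouge">enable_incremental_vacuum app.db</code> once, and then <code class="language-plaintext highlighter-rouge">reclaim_pages app.db 100</code> whenever the application is idle.</p>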
pg_dump and --if-exists little gem2024-06-26T00:00:00+00:00https://fluca1978.github.io/2024/06/26/pgdumpIfExsits<p>An option I was not aware of…</p>
<h1 id="pg_dump-and-if-exists-little-gem">pg_dump and --if-exists little gem</h1>
<p><code class="language-plaintext highlighter-rouge">pg_dump</code> is a very useful tool to dump (and hence prepare to restore) a single PostgreSQL database.</p>
<p>When I use it, I usually add the options:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">--clean</code> to <code class="language-plaintext highlighter-rouge">DROP</code> the database I’m dumping;</li>
<li><code class="language-plaintext highlighter-rouge">--create</code> to issue a <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code> and reconnect to it.</li>
</ul>
<p>Thanks to the above options, I’m pretty sure that I’m going to start over from a clean situation when restoring the dump.
This is particularly useful, in my opinion, when I am developing a new application and need to start over from scratch.</p>
<p>However, the result of the <code class="language-plaintext highlighter-rouge">--clean</code> option is that the SQL file begins, after the usual preamble, with something like:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">DROP</span> <span class="k">DATABASE</span> <span class="n">miniondb</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>While this is what I want, if I need to restore the backup on a fresh machine, where the target database was not already in place, the restore will cause a warning saying that the database cannot be dropped because it does not exist (yet).</p>
<p>And this could be annoying from time to time!</p>
<p>But since PostgreSQL is such a great and advanced piece of software, <code class="language-plaintext highlighter-rouge">pg_dump</code> provides an option to add the very useful <code class="language-plaintext highlighter-rouge">IF EXISTS</code> to <code class="language-plaintext highlighter-rouge">DROP DATABASE</code>: <strong><code class="language-plaintext highlighter-rouge">--if-exists</code></strong> comes to the rescue!</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">--clean</span> <span class="nt">--create</span> <span class="nt">--if-exists</span> ...
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above will result in a <code class="language-plaintext highlighter-rouge">DROP DATABASE IF EXISTS miniondb;</code> statement, which in turn will stop annoying me when the database is not already in place.</p>
<p>After all, the documentation for the <code class="language-plaintext highlighter-rouge">--clean</code> option already mentions it clearly:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>If any of the objects <span class="k">do </span>not exist <span class="k">in </span>the destination database,
ignorable error messages will be reported during restore,
unless <span class="nt">--if-exists</span> is also specified.
</code></pre></div></div>
<p><br />
<br /></p>
<p>and even more in the documentation of the option itself:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">--if-exists</span>
Use DROP ... IF EXISTS commands to drop objects <span class="k">in</span> <span class="nt">--clean</span> mode. This suppresses “does
not exist” errors that might otherwise be reported. This option is not valid unless
<span class="nt">--clean</span> is also specified.
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note that <code class="language-plaintext highlighter-rouge">--if-exists</code> refers to <em>objects</em>, not only the whole database!</p>
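<p>The whole dump and restore cycle can be wrapped in a couple of shell functions. This is only a sketch: the function names and the file argument are illustrative, and the restore assumes that a maintenance database named <code class="language-plaintext highlighter-rouge">postgres</code> exists on the target machine and that the default connection settings are good enough.</p>

```shell
#!/bin/sh
# Sketch only: the function names and the file argument are illustrative.

# Dump one database so that the script can be replayed anywhere,
# whether or not the database already exists on the target.
dump_for_replay() {
    # $1 = database to dump, $2 = output SQL file
    pg_dump --clean --create --if-exists --file="$2" "$1"
}

# Replay the script: because of --create, connect to a maintenance
# database (assumed to be "postgres"); the script reconnects on its own.
replay_dump() {
    # $1 = SQL file produced by dump_for_replay
    psql -X -d postgres -f "$1"
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">dump_for_replay miniondb miniondb.sql</code> on the source machine, followed by <code class="language-plaintext highlighter-rouge">replay_dump miniondb.sql</code> on a fresh one, should not complain about the missing database.</p>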
PgTraining Free Online Event: Material Available2024-04-23T00:00:00+00:00https://fluca1978.github.io/2024/04/23/PgTrainingOnLineEventMaterial<p>The material and the videos are now online!</p>
<h1 id="pgtraining-free-online-event-material-available">PgTraining Free Online Event: Material Available</h1>
<p>Last Friday, April 19th, we held the fourth edition of our webinar entirely dedicated to PostgreSQL, provided by <a href="http://pgtraining.com" target="_blank">PgTraining</a>.</p>
<p><br />
As in the previous editions, we had three talks and an open discussion at the end.
The talks (<strong>all in Italian</strong>) were:</p>
<ul>
<li><em>Introduzione al linguaggio PL/Java</em> (“An introduction to the PL/Java language”), from yours truly;</li>
<li><em>PgVector - in R768 nessuno può sentirti urlare</em> (“PgVector - in R768 nobody can hear you screaming”), by Chris Mair;</li>
<li><em>Repliche logiche e migrazione di versione a caldo da PostgreSQL 12 a PostgreSQL 16</em> (“Logical replication and hot upgrade from PostgreSQL 12 to PostgreSQL 16”), by Enrico Pirozzi</li>
</ul>
<p><br /></p>
<p>The material is available <a href="https://gitlab.com/pgtraining/slides/-/tree/master/webinar-20240419?ref_type=heads" target="_blank">on our Gitlab repository</a>, which also contains links and material from the previous editions!</p>
<p>Some material is still being updated, so if it is not already there, it will appear soon.</p>
Using PL/Java: need for clarifications2024-04-22T00:00:00+00:00https://fluca1978.github.io/2024/04/22/PLJavaInsightsAndFixes<p>Sometimes it happens: I write something in a rush, and present it in a suboptimal way. And then I get advice!</p>
<h1 id="using-pljava-need-for-clarifications">Using PL/Java: need for clarifications</h1>
<p>In January, I wrote an article about <a href="https://fluca1978.github.io/2024/01/17/pljavaPostgreSQL16.html" target="_blank">installing PL/Java on Rocky Linux</a>, and about some of the difficulties I had in achieving a fully operational installation, even though I did not dig deep enough into the problems that I encountered.</p>
<p><br />
<strong><a href="https://github.com/jcflack" target="_blank">Chapman Flack</a></strong>, the most active developer in the project at the moment, took the time to write me a very detailed email with a lot of suggestions for improvements, correcting some of the misconceptions I presented in that article.</p>
<p><br />
I’m really glad to have received all those insights, and in order to spread the word, I’m writing here another article that, hopefully, fixes my mistakes.
I’m not following the same order in which Chapman presented them to me, since in my opinion some issues are much more important than others, so I present them from the most to the least important.</p>
<h2 id="editing-the-javapolicy-file">Editing the <code class="language-plaintext highlighter-rouge">java.policy</code> file</h2>
<p>In my previous article, I advised readers to edit <code class="language-plaintext highlighter-rouge">java.policy</code> in the case there was a problem with Java permissions when executing PL/Java code.
Despite the fact that I clearly stated that relaxing the permissions to <em>all permissions</em> was not a good idea, Chapman emphasized two main problems in my example:
1) I was editing the <strong>main</strong> policy file, therefore changing the policy rules for <strong>all</strong> Java code, not only for PL/Java;
2) adding <code class="language-plaintext highlighter-rouge">java.security.AllPermission</code> made no distinction between trusted and untrusted languages.</p>
<p>Chapman pointed out that PL/Java uses a customized policy file, that can be found in the PostgreSQL configuration directory, hence in <code class="language-plaintext highlighter-rouge">$(pg_config --sysconfdir)</code>. This customizable configuration is available since PL/Java version 1.6, and is documented <a href="https://tada.github.io/pljava/use/policy.html" target="_blank">here</a> in the section <em>“Permissions available in sandboxed/unsandboxed PL/Java”</em>.
This file defines two main principals:</p>
<p><br />
<br /></p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">grant</span> <span class="n">principal</span> <span class="n">org</span><span class="o">.</span><span class="na">postgresql</span><span class="o">.</span><span class="na">pljava</span><span class="o">.</span><span class="na">PLPrincipal</span><span class="n">$Sandboxed</span> <span class="o">*</span> <span class="o">{</span>
<span class="o">};</span>
<span class="n">grant</span> <span class="n">principal</span> <span class="n">org</span><span class="o">.</span><span class="na">postgresql</span><span class="o">.</span><span class="na">pljava</span><span class="o">.</span><span class="na">PLPrincipal</span><span class="n">$Unsandboxed</span> <span class="o">*</span> <span class="o">{</span>
<span class="n">permission</span> <span class="n">java</span><span class="o">.</span><span class="na">io</span><span class="o">.</span><span class="na">FilePermission</span>
<span class="s">"<<ALL FILES>>"</span><span class="o">,</span> <span class="s">"read,readlink,write,delete"</span><span class="o">;</span>
<span class="o">};</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The first principal, <code class="language-plaintext highlighter-rouge">PLPrincipal$Sandboxed</code>, does not add any particular permission, while <code class="language-plaintext highlighter-rouge">PLPrincipal$Unsandboxed</code> adds the permission to interact with the filesystem.</p>
<p>It is interesting to note that the <code class="language-plaintext highlighter-rouge">pljava.policy</code> file masks the <code class="language-plaintext highlighter-rouge">~/.java.policy</code> one (if it exists), meaning that the latter is not used by PL/Java at all. However, the special property <code class="language-plaintext highlighter-rouge">pljava.policy_urls</code> can be set to point to and include additional (cumulative) policy files.</p>
<p>Conclusion: configuring the <code class="language-plaintext highlighter-rouge">pljava.policy</code> file is the right way to make permissions available to the PL/Java code in a fine-grained manner, without having to deal with the system-wide set of permissions.</p>
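<p>As an illustration of what a fine-grained grant could look like, the following sketch prints a block for the sandboxed principal that can be reviewed before being added to <code class="language-plaintext highlighter-rouge">pljava.policy</code>. The function name is made up, and the <code class="language-plaintext highlighter-rouge">PropertyPermission</code> on <code class="language-plaintext highlighter-rouge">user.home</code> is just a hypothetical example of a narrowly scoped permission, not something PL/Java requires.</p>

```shell
#!/bin/sh
# Sketch only: print a grant block that could be added to pljava.policy.
# The PropertyPermission on "user.home" is a made-up example of a
# narrowly scoped permission; pick whatever your code really needs.
print_sandboxed_grant() {
    cat <<'POLICY'
grant principal org.postgresql.pljava.PLPrincipal$Sandboxed * {
    permission java.util.PropertyPermission "user.home", "read";
};
POLICY
}
```

<p>Once reviewed, the output could be appended to the file found in <code class="language-plaintext highlighter-rouge">$(pg_config --sysconfdir)</code>, leaving the system-wide <code class="language-plaintext highlighter-rouge">java.policy</code> untouched.</p>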
<h2 id="hopefully-there-is-no-need-to-set-pljavalibjvm_location">Hopefully, there is no need to <code class="language-plaintext highlighter-rouge">SET pljava.libjvm_location</code></h2>
<p>Chapman provided me a link to the <a href="https://tada.github.io/pljava/build/package.html" target="_blank">PL/Java packaging documentation</a> which contains a section named <em>“What is the default pljava.libjvm_location?”</em> that explains how package maintainers know where the default JVM installation is on the target system.
With such information, PL/Java pre-built packages could come pre-configured with the JVM location of the default installation on the system. So far, this seems to be the case for the Ubuntu package, while on my Rocky Linux it does not seem to be the case (or I messed up the JVM installation).</p>
<p>Therefore, it is possible that there is no need to set <code class="language-plaintext highlighter-rouge">pljava.libjvm_location</code> if the package you installed already knows where the default JVM installation is on your operating system. However, knowing the purpose of this variable and checking/configuring it allows database administrators to make PL/Java use a different (and specific) JVM.</p>
<h2 id="using-the-pljava-api-locally">Using the <code class="language-plaintext highlighter-rouge">pljava-api</code> (locally)</h2>
<p>In my previous post, I wrote that in order to compile Java code against PL/Java there is the need for the API jar installed, namely <code class="language-plaintext highlighter-rouge">pljava-api-x.y.z.jar</code>.
In order to get the API jar on the development machine, I wrote that you need to download the source code and compile it (using Apache <code class="language-plaintext highlighter-rouge">mvn</code>) and that this step is not simple at all, since it could require extra dependencies for the native code bindings.</p>
<p>Chapman pointed out that when you install PL/Java from the PGDG distribution, you also get the above API jar installed in the PostgreSQL share directory:</p>
<p><br /><br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">ls</span> <span class="si">$(</span>pg_config <span class="nt">--sharedir</span><span class="si">)</span>/pljava/<span class="k">*</span>.jar
/usr/share/postgresql/16/pljava/pljava-1.6.7.jar
/usr/share/postgresql/16/pljava/pljava-api-1.6.7.jar
/usr/share/postgresql/16/pljava/pljava-examples-1.6.7.jar
</code></pre></div></div>
<p><br /><br /></p>
<p>Therefore, <strong>there is no need to compile the API jar yourself</strong>, since you can use the one already installed into the PostgreSQL directory.</p>
<p>However, in order to make Apache Maven <code class="language-plaintext highlighter-rouge">mvn</code> aware of where the API jar is, you need to <a href="https://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html" target="_blank">install the JAR into the local Maven repository</a>, so for example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>mvn <span class="nb">install</span>:install-file <span class="se">\</span>
<span class="nt">-Dfile</span><span class="o">=</span><span class="si">$(</span>pg_config <span class="nt">--sharedir</span><span class="si">)</span>/pljava/pljava-api-1.6.7.jar <span class="se">\</span>
<span class="nt">-DgroupId</span><span class="o">=</span>org.postgresql <span class="se">\</span>
<span class="nt">-DartifactId</span><span class="o">=</span>pljava-api <span class="se">\</span>
<span class="nt">-Dversion</span><span class="o">=</span>1.6.7 <span class="se">\</span>
<span class="nt">-Dpackaging</span><span class="o">=</span>jar
</code></pre></div></div>
<p><br />
<br /></p>
<p>After the above, it is possible to compile Java code against the PL/Java API!</p>
<h2 id="information-in-the-sqljjar_repository-table">Information in the <code class="language-plaintext highlighter-rouge">sqlj.jar_repository</code> table</h2>
<p>The <code class="language-plaintext highlighter-rouge">sqlj.jar_repository</code> table contains the unique (short) name given to every installed JAR, as well as the location the JAR was loaded from (<code class="language-plaintext highlighter-rouge">jarorigin</code>):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">jarname</span><span class="p">,</span> <span class="n">jarorigin</span> <span class="k">from</span> <span class="n">sqlj</span><span class="p">.</span><span class="n">jar_repository</span><span class="p">;</span>
<span class="n">jarname</span> <span class="o">|</span> <span class="n">jarorigin</span>
<span class="c1">---------+--------------------------</span>
<span class="n">PWC258</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC258</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC260</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC260</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC257</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC257</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC263</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC263</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">pwc266</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC266</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC264</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC264</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC259</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC259</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC262</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC262</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="n">PWC65</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">PWC265</span><span class="o">-</span><span class="mi">1</span><span class="p">.</span><span class="n">jar</span>
<span class="p">(</span><span class="mi">9</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In my previous article, I poorly explained this concept: when the <code class="language-plaintext highlighter-rouge">install_jar</code> function is executed, it accepts as its first argument the URI from which the JAR is going to be loaded, and this value is stored in the <code class="language-plaintext highlighter-rouge">jarorigin</code> field. <strong>Once the JAR is deployed, this field has no purpose other than recording the original location of the JAR</strong>, and does not provide information about <em>where</em> the JAR currently is. For example, if the JAR was on local storage, the file could even be removed, since <code class="language-plaintext highlighter-rouge">sqlj.install_jar</code> will <em>copy</em> the JAR content into the database (I guess into the <code class="language-plaintext highlighter-rouge">sqlj.jar_entry</code> table).</p>
<h2 id="is-there-a-round-trip-of-data-between-postgresql-and-pljava">Is there a round-trip of data between PostgreSQL and PL/Java?</h2>
<p>Again, I poorly explained this concept in my previous article, stating that <em>“[…] using an external language like PL/Java means that PostgreSQL has to manage the round-trip of data between the database and the virtual machine, with the latter being fired at first execution.”</em></p>
<p>PL/Java exploits JNI to communicate with the PostgreSQL backend process, and the communication happens within the same process. Therefore there is no round-trip, at least not one involving a different process (i.e., inter-process communication). However, there is still the need to properly convert complex data structures from Java types to PostgreSQL ones and vice versa, and that was what I meant with the wrong term “round-trip”.</p>
<h1 id="conclusions">Conclusions</h1>
<p>The above is a set of details towards a better understanding of how PL/Java works. I have to admit that I’m really sorry about what is probably the worst mistake I made, which was to grant all the permissions to all the Java code running on the machine. It is embarrassing, since in the past I did a lot of work on the Java policy mechanism, but it has been so long since I last developed in Java that I forgot all the good practices!</p>
<p>Besides, I hope this better explains how to use PL/Java, and quite frankly I’m really happy to see that pretty much all my problems have a very straightforward solution that the PL/Java developers have already addressed. This, again, emphasizes the maturity of the project!</p>
pgenv: run once scripts2024-04-15T00:00:00+00:00https://fluca1978.github.io/2024/04/15/pgenvRunOnce<p>A new feature to run a single script at the very beginning of the cluster lifecycle.</p>
<h1 id="pgenv-run-once-scripts">pgenv: run once scripts</h1>
<p>Today <a href="https://github.com/theory/pgenv" target="_blank">pgenv</a> got a new release that provides a simple, but quite useful, feature: <strong>the capability to run a custom script the first time the instance is started</strong>.</p>
<p>The idea is simple: after the <code class="language-plaintext highlighter-rouge">initdb</code> phase, if the user has configured a <code class="language-plaintext highlighter-rouge">PGENV_SCRIPT_FIRSTSTART</code> executable, the system will run such script against the (just) started instance. This is different from the <code class="language-plaintext highlighter-rouge">PGENV_SCRIPT_POSTSTART</code> script, since the latter is executed <em>every time the cluster has started</em>, while <code class="language-plaintext highlighter-rouge">PGENV_SCRIPT_FIRSTSTART</code> is run only the first time the database cluster is started.</p>
<p>The aim of this script is, hence, to install users and databases, or populate some initial data.</p>
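<p>A possible body for such a script is sketched below. Everything in it is an assumption made for the sake of the example: the function name, the <code class="language-plaintext highlighter-rouge">app_user</code> role and the <code class="language-plaintext highlighter-rouge">app_db</code> database are made up, and the connection options assume that the freshly started instance accepts local connections from the <code class="language-plaintext highlighter-rouge">postgres</code> superuser.</p>

```shell
#!/bin/sh
# Sketch only: a possible body for a first-start script.
# app_user and app_db are made-up names, and the connection options assume
# the freshly started instance accepts local connections as "postgres".
first_start() {
    psql -X -U postgres -d postgres -v ON_ERROR_STOP=1 <<'SQL'
CREATE ROLE app_user LOGIN;
CREATE DATABASE app_db OWNER app_user;
SQL
}
```

<p>Saved in an executable file that ends with a call to <code class="language-plaintext highlighter-rouge">first_start</code>, and referenced by <code class="language-plaintext highlighter-rouge">PGENV_SCRIPT_FIRSTSTART</code>, this would create the role and the database only once, and not at every restart.</p>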
PostgreSQL 16 Coin2024-04-14T00:00:00+00:00https://fluca1978.github.io/2024/04/14/PostgreSQL16Coin<p>I just got the coin in the mail!</p>
<h1 id="postgresql-16-coin">PostgreSQL 16 Coin</h1>
<p>I just received in the mail the <strong>PostgreSQL 16 Coin</strong> with a great artwork!</p>
<p><br /></p>
<p>I’m really happy to be part of this great community!</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/postgresql/pg16_coin_1.png" />
<br />
<img src="/images/posts/postgresql/pg16_coin_2.png" />
</center>
pgagroal-cli minor bug fixes2024-03-21T00:00:00+00:00https://fluca1978.github.io/2024/03/21/pgagroalSmallImprovements<p>A few changes to a part of <code class="language-plaintext highlighter-rouge">pgagroal</code>.</p>
<h1 id="pgagroal-cli-minor-bug-fixes">pgagroal-cli minor bug fixes</h1>
<p>In the past days I pushed a few trophy patches to <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>, the command line tool to administer a <code class="language-plaintext highlighter-rouge">pgagroal</code> connection pooler, in order to fix minor issues that produced unexpected results.</p>
<p><em>The bugs were all harmless</em>, since they only affected what <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> was producing as output to the user, but could have been confusing for some use cases, hence the need to fix them.</p>
<p>There is quite a momentum around the <code class="language-plaintext highlighter-rouge">pgagroal</code> project, and the activity around the issues has increased with the arrival of new contributors!</p>
pgagroal command refactoring (again!) and a new contributor!2024-03-15T00:00:00+00:00https://fluca1978.github.io/2024/03/15/pgagroalCommandLineRefactoring<p>Changes in <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> and <code class="language-plaintext highlighter-rouge">pgagroal-admin</code>.</p>
<h1 id="pgagroal-command-refactoring-again-and-a-new-contributor">pgagroal command refactoring (again!) and a new contributor!</h1>
<p>Last year I introduced a way in <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> and <code class="language-plaintext highlighter-rouge">pgagroal-admin</code> to arrange commands in a more consistent and manageable way, deprecating some commands too.</p>
<p>Today, a new contributor to the project, <strong>Henrique de Carvalho</strong>, <a href="https://github.com/agroal/pgagroal/commit/11a1f475631534797aab63d6ad34d3990f0f9e70" target="_blank">committed a patch</a> that greatly improves the way commands are handled internally.</p>
<p>The users will not notice any particular difference, except that also <a href="https://github.com/agroal/pgagroal/commit/d9f9253504605194b6bac1dd18491f132059b145" target="_blank">a bug has been fixed in handling deprecated commands</a>, but the changes in the code are very important: now all the commands are organized in a list of <code class="language-plaintext highlighter-rouge">struct</code>s that provide a more accurate way of handling errors, missing arguments or command parts, and logging.</p>
<p>I began thinking about this refactoring months ago, but never found the time to dig into the changes.
However, it all began with an annoying problem with some misspelled commands, which reported a wrong error message to the user.</p>
<p>And now, thanks to the contributions of Henrique, <code class="language-plaintext highlighter-rouge">pgagroal</code> has taken another step towards a more complete and robust system.</p>
PgTraining Online Event 2024 (italian)2024-03-15T00:00:00+00:00https://fluca1978.github.io/2024/03/15/PgTrainingOnlineEvent<p>We are back with another event!</p>
<h1 id="pgtraining-online-event-2024-italian">PgTraining Online Event 2024 (italian)</h1>
<p><a href="http://pgtraining.com" target="_blank">PgTraining</a>, the amazing Italian professionals that spread the word about PostgreSQL and that I joined in the last years, is organizing another online event (<em>webinar</em>) on <strong>19th April 2024</strong>.
<br />
Following the success of the previous edition(s), we decided to provide another afternoon full of <em>PostgreSQL talks</em>, in the hope of improving the adoption of this great database.</p>
<p><br />
The event will consist of three hours of talks about <strong>PL/Java</strong>, <strong>PgVector</strong> and <strong>hot upgrade via logical replication</strong>.
<br />
As for the previous editions, the <strong>webinar will be presented in Italian</strong>.
Attendees will be free to actively participate and ask questions both during the talks and at the end of the whole event.
<br />
<br />
In the pure spirit of <a href="http://pgtraining.com" target="_blank">PgTraining</a>, the event <strong>will be free of charge</strong>, but registration is required to participate and the number of available seats is limited, so <a href="https://www.eventbrite.it/e/biglietti-pgtraining-on-line-session-2024-04-863936706947?aff=oddtdtcreator" target="_blank"><strong>hurry up and get your free ticket</strong></a> as soon as possible!
<br />
The material will be available for free after the event has completed, but no live recording will be available.</p>
pgagroal 1.6.0 has been released2024-02-24T00:00:00+00:00https://fluca1978.github.io/2024/02/24/pgagroal1.6.0<p>pgagroal, the fast connection pooler for PostgreSQL, has reached a new stable release!</p>
<h1 id="pgagroal-160-has-been-released">pgagroal 1.6.0 has been released</h1>
<p>A couple of days ago, <a href="https://github.com/agroal/pgagroal/releases/tag/1.6.0" target="_blank">pgagroal</a> version <code class="language-plaintext highlighter-rouge">1.6.0</code> was released.</p>
<p>This new version includes a lot of new features and small improvements that make <code class="language-plaintext highlighter-rouge">pgagroal</code> much more user-friendly and easy to adopt as a connection pooler. The main contribution, from yours truly, has been <strong>command line refactoring</strong> and <strong>JSON support</strong>.
Now the command line supports <em>commands and subcommands</em>, like for example <code class="language-plaintext highlighter-rouge">conf get</code> and <code class="language-plaintext highlighter-rouge">conf set</code>, and a more consistent set of commands.
The JSON command output allows for easy automation and a stable command output, so as to ease adoption in different scenarios.</p>
<p>But there’s more: a lot of other tickets have been solved during this release, and there is now support for Mac OSX. Moreover, it is now possible to retrieve and set configuration values at run-time, thus without the need to manually edit the configuration file and reload the daemon.</p>
<p>There is an initial experimental support for client certificates, and now it is possible to determine how long a connection must live.</p>
<p>A better handling of the configuration files, hence a better detection and reporting of misconfiguration, as well as a better error messaging system, complete the release.</p>
<p>The list of contributors is also expanding, and this is good and exciting!</p>
<p>Give <code class="language-plaintext highlighter-rouge">pgagroal</code> a try, you will be amazed by the capabilities of this connection pooler!</p>
Using PL/Java to Return SETOF RECORD2024-02-13T00:00:00+00:00https://fluca1978.github.io/2024/02/13/PostgreSQLPLJavaReturnSet<p>A simple way to return multiple records from PL/Java</p>
<h1 id="using-pljava-to-return-setof-record">Using PL/Java to Return SETOF RECORD</h1>
<p>PL/Java allows a quite easy implementation of result set providers, objects that will produce rows that can be used as tables in queries.
In order to produce a result set, the main steps are:
1) implement the <code class="language-plaintext highlighter-rouge">ResultSetProvider</code> interface and its method to effectively produce the data;
2) build a PL/Java function that instantiates the above <code class="language-plaintext highlighter-rouge">ResultSetProvider</code>, so that PL/Java will <em>wrap</em> such function into a <code class="language-plaintext highlighter-rouge">RETURNS SETOF RECORD</code> SQL function.</p>
<p>What follows is a quite simple demonstration of producing records from PL/Java.</p>
<h2 id="implementing-the-resultsetprovider">Implementing the <code class="language-plaintext highlighter-rouge">ResultSetProvider</code></h2>
<p>PL/Java has the <code class="language-plaintext highlighter-rouge">ResultSetProvider</code> interface that requires the implementation of two methods:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">assignRowValues</code> that is called for every row in the result set, and must return <code class="language-plaintext highlighter-rouge">true</code> to indicate that a new row has been added to the result set, or <code class="language-plaintext highlighter-rouge">false</code> to indicate that the result set is complete and no more rows will be added;</li>
<li><code class="language-plaintext highlighter-rouge">close</code> that is called when no more rows are needed, for example after <code class="language-plaintext highlighter-rouge">assignRowValues</code> has returned <code class="language-plaintext highlighter-rouge">false</code>.</li>
</ul>
<p>The <code class="language-plaintext highlighter-rouge">assignRowValues</code> function accepts two arguments:</p>
<ul>
<li>a <code class="language-plaintext highlighter-rouge">ResultSet</code> object that is the container for all the rows;</li>
<li>a <code class="language-plaintext highlighter-rouge">long</code> value indicating the current row for which the method has been called. This counter <em>starts at zero</em>, as in normal Java list/array manipulations, not as in SQL.</li>
</ul>
<p>Therefore, it is possible to implement the following methods as:</p>
<p><br />
<br /></p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">class</span> <span class="nc">Task1</span> <span class="kd">implements</span> <span class="nc">ResultSetProvider</span> <span class="o">{</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="kd">static</span> <span class="nc">Logger</span> <span class="n">logger</span> <span class="o">=</span> <span class="nc">Logger</span><span class="o">.</span><span class="na">getAnonymousLogger</span><span class="o">();</span>
<span class="kd">public</span> <span class="nf">Task1</span><span class="o">(</span> <span class="kt">int</span> <span class="n">maxRows</span> <span class="o">)</span> <span class="o">{</span>
<span class="kd">super</span><span class="o">();</span>
<span class="k">this</span><span class="o">.</span><span class="na">maxRows</span> <span class="o">=</span> <span class="n">maxRows</span><span class="o">;</span>
<span class="o">}</span>
<span class="kd">private</span> <span class="kt">int</span> <span class="n">maxRows</span> <span class="o">=</span> <span class="mi">10</span><span class="o">;</span>
<span class="nd">@Override</span>
<span class="kd">public</span> <span class="kt">boolean</span> <span class="nf">assignRowValues</span><span class="o">(</span> <span class="nc">ResultSet</span> <span class="n">rs</span><span class="o">,</span> <span class="kt">int</span> <span class="n">row</span> <span class="o">)</span>
<span class="kd">throws</span> <span class="nc">SQLException</span> <span class="o">{</span>
<span class="k">if</span> <span class="o">(</span> <span class="n">row</span> <span class="o">></span> <span class="n">maxRows</span> <span class="o">)</span>
<span class="k">return</span> <span class="kc">false</span><span class="o">;</span>
<span class="n">logger</span><span class="o">.</span><span class="na">info</span><span class="o">(</span> <span class="nc">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span> <span class="s">"Producing row %d/%d"</span><span class="o">,</span> <span class="n">row</span><span class="o">,</span> <span class="n">maxRows</span> <span class="o">)</span> <span class="o">);</span>
<span class="n">rs</span><span class="o">.</span><span class="na">updateString</span><span class="o">(</span> <span class="mi">1</span><span class="o">,</span> <span class="nc">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span> <span class="s">"Row %d out of %d from %s"</span><span class="o">,</span>
<span class="n">row</span><span class="o">,</span>
<span class="n">maxRows</span><span class="o">,</span>
<span class="k">this</span><span class="o">.</span><span class="na">getClass</span><span class="o">().</span><span class="na">getName</span><span class="o">()</span> <span class="o">)</span> <span class="o">);</span>
<span class="n">rs</span><span class="o">.</span><span class="na">updateInt</span><span class="o">(</span> <span class="mi">2</span><span class="o">,</span> <span class="n">row</span> <span class="o">);</span>
<span class="n">rs</span><span class="o">.</span><span class="na">updateInt</span><span class="o">(</span> <span class="mi">3</span><span class="o">,</span> <span class="n">maxRows</span> <span class="o">);</span>
<span class="n">rs</span><span class="o">.</span><span class="na">updateDate</span><span class="o">(</span> <span class="mi">4</span><span class="o">,</span> <span class="k">new</span> <span class="n">java</span><span class="o">.</span><span class="na">sql</span><span class="o">.</span><span class="na">Date</span><span class="o">(</span> <span class="nc">Calendar</span><span class="o">.</span><span class="na">getInstance</span><span class="o">().</span><span class="na">getTimeInMillis</span><span class="o">()</span> <span class="o">)</span> <span class="o">);</span>
<span class="k">return</span> <span class="kc">true</span><span class="o">;</span>
<span class="o">}</span>
<span class="nd">@Override</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">close</span><span class="o">()</span> <span class="o">{</span>
<span class="n">logger</span><span class="o">.</span><span class="na">info</span><span class="o">(</span> <span class="s">"Closing resultset"</span> <span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">assignRowValues</code> function simply adds to the <code class="language-plaintext highlighter-rouge">ResultSet</code> a string field, two integer fields and one date field.
The production of the result set ends as soon as the row counter exceeds the <code class="language-plaintext highlighter-rouge">maxRows</code> value, which is decided when the class is instantiated.</p>
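<p>The calling contract described above can be tried out without a database. The following is only a minimal sketch: <code class="language-plaintext highlighter-rouge">RowProvider</code>, <code class="language-plaintext highlighter-rouge">upTo</code> and <code class="language-plaintext highlighter-rouge">drain</code> are hypothetical names, the interface is a simplified stand-in and <em>not</em> the PL/Java one, and the calling order (zero-based counter, loop until <code class="language-plaintext highlighter-rouge">false</code>, then a single <code class="language-plaintext highlighter-rouge">close</code>) is an assumption derived from the description above.</p>

```java
import java.util.ArrayList;
import java.util.List;

class RowProtocolSketch {

    /** Hypothetical stand-in for the contract described above; NOT the PL/Java interface. */
    interface RowProvider {
        boolean assignRowValues(List<String> receiver, int currentRow);
        void close();
    }

    /** Provider using the same stop condition as Task1: stop once currentRow > maxRows. */
    static RowProvider upTo(int maxRows) {
        return new RowProvider() {
            @Override
            public boolean assignRowValues(List<String> receiver, int currentRow) {
                if (currentRow > maxRows)
                    return false;               // result set is complete
                receiver.add("Row " + currentRow + " out of " + maxRows);
                return true;                    // one more row has been added
            }

            @Override
            public void close() {
                // nothing to release in this sketch
            }
        };
    }

    /** Simulated caller: zero-based counter, loop until false, then close once. */
    static List<String> drain(RowProvider provider) {
        List<String> rows = new ArrayList<>();
        int currentRow = 0;
        while (provider.assignRowValues(rows, currentRow))
            currentRow++;
        provider.close();
        return rows;
    }

    public static void main(String[] args) {
        drain(upTo(3)).forEach(System.out::println);
    }
}
```

<p>In this simulation a limit of 3 yields four rows (0 through 3), because the counter is zero-based and the test is a strict <code class="language-plaintext highlighter-rouge">&gt;</code>; a <code class="language-plaintext highlighter-rouge">&gt;=</code> test would cap the output at exactly <code class="language-plaintext highlighter-rouge">maxRows</code> rows.</p>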
<h2 id="creating-a-function-to-call-the-producer">Creating a function to call the producer</h2>
<p>It is possible to create a PL/Java function that instantiates the above class and returns it.
In order for PL/Java to understand that the function will produce a result set, the function must return a <code class="language-plaintext highlighter-rouge">ResultSetProvider</code>.</p>
<p><br />
<br /></p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nd">@Function</span><span class="o">(</span> <span class="n">onNullInput</span> <span class="o">=</span> <span class="no">RETURNS_NULL</span><span class="o">,</span> <span class="n">effects</span> <span class="o">=</span> <span class="no">IMMUTABLE</span> <span class="o">)</span>
<span class="kd">public</span> <span class="kd">static</span> <span class="kd">final</span> <span class="nc">ResultSetProvider</span> <span class="nf">rs_producer_pljava</span><span class="o">()</span> <span class="kd">throws</span> <span class="nc">SQLException</span> <span class="o">{</span>
<span class="n">logger</span><span class="o">.</span><span class="na">log</span><span class="o">(</span> <span class="nc">Level</span><span class="o">.</span><span class="na">INFO</span><span class="o">,</span> <span class="s">"Entering rs_producer_pljava"</span> <span class="o">);</span>
<span class="nc">Task1</span> <span class="n">producer</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Task1</span><span class="o">(</span> <span class="mi">20</span> <span class="o">);</span>
<span class="k">return</span> <span class="n">producer</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Once the function has been compiled, and the JAR installed, there will be a function defined as:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">sf</span> <span class="n">rs_producer_pljava</span>
<span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="k">public</span><span class="p">.</span><span class="n">rs_producer_pljava</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="n">record</span>
<span class="k">LANGUAGE</span> <span class="n">java</span>
<span class="k">IMMUTABLE</span> <span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="k">function</span><span class="err">$</span><span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span><span class="p">.</span><span class="n">rs_producer_pljava</span><span class="p">()</span><span class="err">$</span><span class="k">function</span><span class="err">$</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note how the function has been created as <code class="language-plaintext highlighter-rouge">RETURNS SETOF record</code>: it will call the PL/Java function, which in turn will instantiate the <code class="language-plaintext highlighter-rouge">ResultSetProvider</code>.</p>
<h2 id="using-the-function">Using the function</h2>
<p>It is now possible to query the function from SQL:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">j</span><span class="p">.</span><span class="o">*</span> <span class="k">from</span> <span class="n">rs_producer_pljava</span><span class="p">()</span> <span class="k">as</span> <span class="n">j</span><span class="p">(</span><span class="n">t</span> <span class="nb">text</span><span class="p">,</span> <span class="n">r</span> <span class="nb">int</span><span class="p">,</span> <span class="n">m</span> <span class="nb">int</span><span class="p">,</span> <span class="n">d</span> <span class="nb">date</span><span class="p">);</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Entering</span> <span class="n">rs_producer_pljava</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">0</span><span class="o">/</span><span class="mi">20</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">1</span><span class="o">/</span><span class="mi">20</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">2</span><span class="o">/</span><span class="mi">20</span>
<span class="p">...</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">18</span><span class="o">/</span><span class="mi">20</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">19</span><span class="o">/</span><span class="mi">20</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">20</span><span class="o">/</span><span class="mi">20</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Closing</span> <span class="n">resultset</span>
<span class="n">t</span> <span class="o">|</span> <span class="n">r</span> <span class="o">|</span> <span class="n">m</span> <span class="o">|</span> <span class="n">d</span>
<span class="c1">-----------------------------------+---+----+------------</span>
<span class="k">Row</span> <span class="mi">0</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">20</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">20</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="k">Row</span> <span class="mi">1</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">20</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">20</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="k">Row</span> <span class="mi">2</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">20</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">20</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="k">Row</span> <span class="mi">3</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">20</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">20</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="k">Row</span> <span class="mi">4</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">20</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">20</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="p">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>From the log messages it is possible to see that the result set provider is used to produce the records, and that it is closed at the end.</p>
<h2 id="passing-dynamically-the-number-of-rows-to-produce">Passing dynamically the number of rows to produce</h2>
<p>What if there is the need to decide dynamically how many rows the <code class="language-plaintext highlighter-rouge">ResultSetProvider</code> has to produce?
It is enough to change the PL/Java function so that it accepts an integer argument:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">@</span><span class="k">Function</span><span class="p">(</span> <span class="n">onNullInput</span> <span class="o">=</span> <span class="n">RETURNS_NULL</span><span class="p">,</span> <span class="n">effects</span> <span class="o">=</span> <span class="k">IMMUTABLE</span> <span class="p">)</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">final</span> <span class="n">ResultSetProvider</span> <span class="n">rs_producer_pljava</span><span class="p">(</span> <span class="nb">int</span> <span class="n">howManyRows</span> <span class="p">)</span> <span class="n">throws</span> <span class="k">SQLException</span> <span class="p">{</span>
<span class="n">logger</span><span class="p">.</span><span class="n">log</span><span class="p">(</span> <span class="k">Level</span><span class="p">.</span><span class="n">INFO</span><span class="p">,</span> <span class="nv">"Entering rs_producer_pljava"</span> <span class="p">);</span>
<span class="n">if</span> <span class="p">(</span> <span class="n">howManyRows</span> <span class="o"><=</span> <span class="mi">0</span> <span class="p">)</span>
<span class="n">howManyRows</span> <span class="o">=</span> <span class="mi">5</span><span class="p">;</span>
<span class="n">Task1</span> <span class="n">producer</span> <span class="o">=</span> <span class="k">new</span> <span class="n">Task1</span><span class="p">(</span> <span class="n">howManyRows</span> <span class="p">);</span>
<span class="k">return</span> <span class="n">producer</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>And it is then possible to query the function with the following query:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">j</span><span class="p">.</span><span class="o">*</span> <span class="k">from</span> <span class="n">rs_producer_pljava</span><span class="p">(</span> <span class="mi">3</span> <span class="p">)</span> <span class="k">as</span> <span class="n">j</span><span class="p">(</span><span class="n">t</span> <span class="nb">text</span><span class="p">,</span> <span class="n">r</span> <span class="nb">int</span><span class="p">,</span> <span class="n">m</span> <span class="nb">int</span><span class="p">,</span> <span class="n">d</span> <span class="nb">date</span><span class="p">);</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Entering</span> <span class="n">rs_producer_pljava</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">0</span><span class="o">/</span><span class="mi">3</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">1</span><span class="o">/</span><span class="mi">3</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">2</span><span class="o">/</span><span class="mi">3</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Producing</span> <span class="k">row</span> <span class="mi">3</span><span class="o">/</span><span class="mi">3</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="n">Closing</span> <span class="n">resultset</span>
<span class="n">t</span> <span class="o">|</span> <span class="n">r</span> <span class="o">|</span> <span class="n">m</span> <span class="o">|</span> <span class="n">d</span>
<span class="c1">----------------------------------+---+---+------------</span>
<span class="k">Row</span> <span class="mi">0</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">3</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="k">Row</span> <span class="mi">1</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">3</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="k">Row</span> <span class="mi">2</span> <span class="k">out</span> <span class="k">of</span> <span class="mi">3</span> <span class="k">from</span> <span class="n">PWC256</span><span class="p">.</span><span class="n">Task1</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">2024</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">07</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>It is quite simple to use PL/Java to implement a row producer, even based on already existing code.</p>
pgagroal-cli gains JSON output2024-02-10T00:00:00+00:00https://fluca1978.github.io/2024/02/10/pgagroalJSONOutput<p>A new feature of <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> that now makes another step towards the full automation.</p>
<h1 id="pgagroal-cli-gains-json-output">pgagroal-cli gains JSON output</h1>
<p>At last, I made it: <a href="https://github.com/agroal/pgagroal/commit/8b13185c4ea47bf7e09547b8813511db7dd014fd" target="_blank">a commit in pgagroal to support JSON output</a>. It has been a long and hard road, not because of the technological challenge, but rather because of all the little details, like continuous integration, needed to get this work completed.
Roughly speaking, I started this work last November (of course, working on it on and off).</p>
<p>What is all of this about?</p>
<p>The idea is to provide JSON based output to <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>, the command line interface and main management tool for the <code class="language-plaintext highlighter-rouge">pgagroal</code> connection pool.</p>
<p>I have to admit that <strong>I hate JSON with a passion</strong> and I rank it, together with XML, at the top of my list of worst formats. So, why did I spend so much time on this patch?
If you are not living under a rock, you have probably noticed how many tools nowadays provide a JSON output format, and the main reason is that this format, while still being human readable (ehm, to some extent!), allows for easy automation. There are tons of JSON parsers out there, and even our beloved database PostgreSQL has very rich JSON support.
Therefore, <strong>having a consistent and automatically parsable command output</strong> will ease the automation, <strong>and hence the adoption</strong>, of <code class="language-plaintext highlighter-rouge">pgagroal</code>.</p>
<p>To some extent, this work is the natural continuation of the work I started almost one year ago to make the <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> command line more consistent and understandable. For example, I added commands to handle the configuration directly from the command line (see for example <a href="https://github.com/agroal/pgagroal/commit/ade40240317bad155dbf1e40866c96257b688b90" target="_blank">this commit</a>) and reworked the commands into a more compact and consistent set (see for example <a href="https://github.com/agroal/pgagroal/commit/ade40240317bad155dbf1e40866c96257b688b90" target="_blank">this commit</a> and the following ones I made).</p>
<h2 id="how-to-use-the-json-output">How to use the JSON output</h2>
<p>The <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> command now supports an optional command line flag <code class="language-plaintext highlighter-rouge">--format</code> that allows switching from the default text based output to the new JSON format. As the <a href="https://github.com/agroal/pgagroal/commit/ade40240317bad155dbf1e40866c96257b688b90" target="_blank">documentation states</a>, the default output is the text format, so not specifying any <code class="language-plaintext highlighter-rouge">--format</code> option is equivalent to specifying <code class="language-plaintext highlighter-rouge">--format text</code>.</p>
<p>On the other hand, to turn on the JSON output format, it is required to pass <code class="language-plaintext highlighter-rouge">--format json</code> on the command line.
As an example, the output of the <code class="language-plaintext highlighter-rouge">ping</code> command looks like this:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal-cli ping <span class="nt">--format</span> json
<span class="o">{</span>
<span class="s2">"command"</span>: <span class="o">{</span>
<span class="s2">"name"</span>: <span class="s2">"ping"</span>,
<span class="s2">"status"</span>: <span class="s2">"OK"</span>,
<span class="s2">"error"</span>: 0,
<span class="s2">"exit-status"</span>: 0,
<span class="s2">"output"</span>: <span class="o">{</span>
<span class="s2">"status"</span>: 1,
<span class="s2">"message"</span>: <span class="s2">"running"</span>
<span class="o">}</span>
<span class="o">}</span>,
<span class="s2">"application"</span>: <span class="o">{</span>
<span class="s2">"name"</span>: <span class="s2">"pgagroal-cli"</span>,
<span class="s2">"major"</span>: 1,
<span class="s2">"minor"</span>: 6,
<span class="s2">"patch"</span>: 0,
<span class="s2">"version"</span>: <span class="s2">"1.6.0"</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="format-of-json-output">Format of JSON output</h2>
<p>The JSON output has a <em>fixed</em> structure made of a few pre-defined top-level objects:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">command</code> an object that contains the command the server <code class="language-plaintext highlighter-rouge">pgagroal</code> has executed (or has been requested to execute);</li>
<li><code class="language-plaintext highlighter-rouge">application</code> reports the name and version of the application that requested the command (so far, always <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>).</li>
</ul>
<p>The <code class="language-plaintext highlighter-rouge">command</code> object, in turn, contains other information, like the status of the command and the notification of errors, as well as <strong><code class="language-plaintext highlighter-rouge">output</code>, an object that contains the command output (if any) and the command status</strong>.</p>
<p>As an example, consider a more verbose command like <code class="language-plaintext highlighter-rouge">status</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal-cli status <span class="nt">--format</span> json
<span class="o">{</span>
<span class="s2">"command"</span>: <span class="o">{</span>
<span class="s2">"name"</span>: <span class="s2">"status"</span>,
<span class="s2">"status"</span>: <span class="s2">"OK"</span>,
<span class="s2">"error"</span>: 0,
<span class="s2">"exit-status"</span>: 0,
<span class="s2">"output"</span>: <span class="o">{</span>
<span class="s2">"status"</span>: <span class="o">{</span>
<span class="s2">"message"</span>: <span class="s2">"Running"</span>,
<span class="s2">"status"</span>: 1
<span class="o">}</span>,
<span class="s2">"connections"</span>: <span class="o">{</span>
<span class="s2">"active"</span>: 0,
<span class="s2">"total"</span>: 2,
<span class="s2">"max"</span>: 15
<span class="o">}</span>,
<span class="s2">"databases"</span>: <span class="o">{</span>
<span class="s2">"disabled"</span>: <span class="o">{</span>
<span class="s2">"count"</span>: 0,
<span class="s2">"state"</span>: <span class="s2">"disabled"</span>,
<span class="s2">"list"</span>: <span class="o">[]</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="o">}</span>,
<span class="s2">"application"</span>: <span class="o">{</span>
<span class="s2">"name"</span>: <span class="s2">"pgagroal-cli"</span>,
<span class="s2">"major"</span>: 1,
<span class="s2">"minor"</span>: 6,
<span class="s2">"patch"</span>: 0,
<span class="s2">"version"</span>: <span class="s2">"1.6.0"</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the <code class="language-plaintext highlighter-rouge">command</code> has a more extended <code class="language-plaintext highlighter-rouge">output</code> section that includes much more information and reports, in a different form, what the normal text command would have printed.</p>
<p><strong>Every command has a different <code class="language-plaintext highlighter-rouge">output</code> format</strong>, which means that interpreting the output of a command requires reading the documentation for that command.</p>
<p>Moreover, it is interesting to note that, due to refactoring of the code, <strong>the text command output has slightly changed</strong>, so chances are that if you based your automation on such format you are going to break your scripts. <strong>This is an excellent motivation to switch to the new JSON output format!</strong></p>
<p>Under the hood, all the complex commands like the above <code class="language-plaintext highlighter-rouge">status</code> have been refactored to <em>talk only in JSON</em>, therefore the text output format is nowadays a simplified rendering extracted from the JSON sent over the communication protocol.</p>
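<p>As a rough idea of how automation could consume this output from Java, here is a minimal sketch. The class and method names are hypothetical, the only <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> arguments used are the ones shown above, and the regular expression is a shortcut that only works for a flat integer field like <code class="language-plaintext highlighter-rouge">exit-status</code>; a real JSON parser is the better choice for the nested <code class="language-plaintext highlighter-rouge">output</code> object.</p>

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PgagroalJsonSketch {

    // Integer value of the "exit-status" key, as it appears in the replies shown above.
    private static final Pattern EXIT_STATUS =
            Pattern.compile("\"exit-status\"\\s*:\\s*(-?\\d+)");

    /** Builds the argument list for a pgagroal-cli command with JSON output requested. */
    static List<String> commandLine(String command) {
        return List.of("pgagroal-cli", command, "--format", "json");
    }

    /** Returns the "exit-status" value, or an empty result when the key is missing. */
    static OptionalInt exitStatus(String json) {
        Matcher m = EXIT_STATUS.matcher(json);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1)))
                        : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Abbreviated copy of the "ping" reply shown above, used instead of a live call.
        String reply = "{ \"command\": { \"name\": \"ping\", \"status\": \"OK\","
                     + " \"error\": 0, \"exit-status\": 0 } }";
        System.out.println(String.join(" ", commandLine("ping")));
        System.out.println("exit-status = " + exitStatus(reply).orElse(-1));
    }
}
```

<p>The list returned by <code class="language-plaintext highlighter-rouge">commandLine</code> can be handed to a <code class="language-plaintext highlighter-rouge">ProcessBuilder</code> to run the real command and capture its reply.</p>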
<h2 id="what-about-pgagroal-cli-friends">What about <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> friends?</h2>
<p>The other main command, <code class="language-plaintext highlighter-rouge">pgagroal-admin</code>, has deliberately not been migrated to JSON: I don’t believe that we need a lot of automation on this command, hence we don’t need to provide JSON output. Moreover, the command is not very verbose and does not produce much output; on the other hand, it requires an interactive session with the user.</p>
<p>Therefore, I don’t see the need to port JSON output to this command.</p>
<h2 id="a-brief-history">A Brief History</h2>
<p>This patch is, as often happens, the result of much trial and error, both in the implementation and in the design.</p>
<p>In the beginning, I thought of adding an explicit <code class="language-plaintext highlighter-rouge">--json</code> command line flag to request JSON output, but I later changed my mind in favor of the more general and extensible <code class="language-plaintext highlighter-rouge">--format</code>, which allows for the future addition of other output formats, should the need arise.</p>
<p>I implemented a first prototype using the <a href="https://github.com/json-c/json-c" target="_blank">json-c</a> library, then switched to <a href="https://github.com/DaveGamble/cJSON" target="_blank">cJSON</a>. I have to say that, even if both the libraries deal with JSON, they have a quite different approach in how to build a JSON object. I tend to prefer <code class="language-plaintext highlighter-rouge">cJSON</code> because it has a less structured approach to add scalar values.</p>
<p>Towards the end of the patch, we had to deal with a lot of issues with the port to OSX, and it took us a few days to discover that we had not fully updated the section of the <code class="language-plaintext highlighter-rouge">CMakeLists.txt</code> file related to the OSX linkage.</p>
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">pgagroal</code> is growing more and more, and I believe that this new JSON feature will open the road for new exciting developments and integrations with other systems, thus promoting the adoption of this tool in the PostgreSQL ecosystem!</p>
Installing PostgreSQL 16 (development) on Rocky Linux 9: the Perl::IPC::Run problem2024-02-08T00:00:00+00:00https://fluca1978.github.io/2024/02/08/PostgreSQL16DevPerlIPCRun<p>A possible solution to a common problem</p>
<h1 id="installing-postgresql-16-on-rocky-linux-9-the-perlipcrun-problem">Installing PostgreSQL 16 on Rocky Linux 9: the Perl::IPC::Run problem</h1>
<p>Today I was preparing a new machine, based on Rocky Linux 9, for some development activity.
I was installing PostgreSQL 16 and the development stuff I need so, after having imported the PGDG repository, I ran the usual:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>dnf <span class="nb">install </span>postgresql16.x86_64 <span class="se">\</span>
postgresql16-contrib.x86_64 <span class="se">\</span>
postgresql16-devel.x86_64 <span class="se">\</span>
postgresql16-libs.x86_64 <span class="se">\</span>
postgresql16-plperl.x86_64 <span class="se">\</span>
postgresql16-server.x86_64
...
Error:
Problem: cannot <span class="nb">install </span>the best candidate <span class="k">for </span>the job
- nothing provides perl<span class="o">(</span>IPC::Run<span class="o">)</span> needed by postgresql16-devel-16.1-2PGDG.rhel9.x86_64 from pgdg16
<span class="o">(</span>try to add <span class="s1">'--skip-broken'</span> to skip uninstallable packages or <span class="s1">'--nobest'</span> to use not only best candidate packages<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Apparently, I was not able to find a <code class="language-plaintext highlighter-rouge">Perl-IPC-Run</code> module in the Rocky Linux repositories, nor in the <code class="language-plaintext highlighter-rouge">epel_release</code> ones.</p>
<p>The correct way is to enable the <code class="language-plaintext highlighter-rouge">crb</code> repository:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>dnf config-manager <span class="nt">--set-enabled</span> crb
</code></pre></div></div>
<p><br />
<br /></p>
<p>And that’s it:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% dnf search Perl-IPC-Run
<span class="o">====================================</span> Name Matched: Perl-IPC-Run <span class="o">====================================</span>
perl-IPC-Run.noarch : Perl module <span class="k">for </span>interacting with child processes
perl-IPC-Run3.noarch : Run a subprocess <span class="k">in </span>batch mode
</code></pre></div></div>
<p><br />
<br /></p>
<p>Another approach is to install it the <em>Perl way!</em></p>
<p>I prefer to use <code class="language-plaintext highlighter-rouge">cpanm</code> as my Perl package manager nowadays, but <code class="language-plaintext highlighter-rouge">cpan</code> and others work equally well:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>dnf <span class="nb">install </span>perl-App-cpanminus.noarch
% <span class="nb">sudo </span>cpanm IPC::Run
<span class="nt">--</span><span class="o">></span> Working on IPC::Run
Fetching http://www.cpan.org/authors/id/T/TO/TODDR/IPC-Run-20231003.0.tar.gz ... OK
Configuring IPC-Run-20231003.0 ... OK
Building and testing IPC-Run-20231003.0 ... OK
Successfully installed IPC-Run-20231003.0 <span class="o">(</span>upgraded from 20200505.0<span class="o">)</span>
1 distribution installed
</code></pre></div></div>
<p><br />
<br /></p>
pgenv gains a new command (and contributor!)2024-02-06T00:00:00+00:00https://fluca1978.github.io/2024/02/06/pgenvStatusCommand<p>A new command in the pgenv script.</p>
<h1 id="pgenv-gains-a-new-command-and-contributor">pgenv gains a new command (and contributor!)</h1>
<p><a href="https://github.com/theory/pgenv" target="_blank">pgenv</a>, the PostgreSQL binary manager written as a Bourne Again Shell script, has gained a new command: <code class="language-plaintext highlighter-rouge">status</code>.</p>
<p>The idea of this command is to report the status of a selected PostgreSQL instance, mainly whether it is running or not.
<a href="https://github.com/theory/pgenv/commit/70af4d4e1de28b41e39c89927c338f23e89b4378" target="_blank">Behind the scenes</a> the implementation exploits the <code class="language-plaintext highlighter-rouge">pg_ctl</code> command for the selected instance, stopping the execution immediately if the user has not selected any instance.</p>
<p>The output of <code class="language-plaintext highlighter-rouge">pg_ctl</code> has been mangled to appear a little less verbose, in particular the <code class="language-plaintext highlighter-rouge">pg_ctl:</code> prefix has been removed.</p>
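<p>A minimal sketch of that kind of clean-up, assuming the prefix is stripped with <code class="language-plaintext highlighter-rouge">sed</code>; the <code class="language-plaintext highlighter-rouge">strip_pg_ctl_prefix</code> function is hypothetical and is not the actual <code class="language-plaintext highlighter-rouge">pgenv</code> implementation:</p>

```shell
#!/bin/sh
# Hypothetical helper (not taken from pgenv): remove a leading "pg_ctl: "
# from every line read on standard input, leaving other lines untouched.
strip_pg_ctl_prefix() {
    sed 's/^pg_ctl: //'
}

# Sample line in the format printed by pg_ctl (the PID is illustrative).
printf '%s\n' 'pg_ctl: server is running (PID: 51503)' | strip_pg_ctl_prefix
```

<p>In real use the function would sit at the end of a pipeline fed by <code class="language-plaintext highlighter-rouge">pg_ctl status</code> for the selected instance.</p>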
<p><a href="https://github.com/briansalehi" target="_blank">Brian Salehi</a> is the author of this patch, and hopefully a new contributor who will help improve <code class="language-plaintext highlighter-rouge">pgenv</code> again and again.</p>
<p>As an example, when using the new <code class="language-plaintext highlighter-rouge">status</code> command you will get something like the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv status
server is running <span class="o">(</span>PID: 51503<span class="o">)</span>
/usr/pgsql-16/bin/postgres <span class="s2">"-D"</span> <span class="s2">"/postgres/16/data"</span>
</code></pre></div></div>
<p><br />
<br /></p>
Changing a Column from Integer to Boolean in One Transaction2024-02-05T00:00:00+00:00https://fluca1978.github.io/2024/02/05/PostgreSQLChangeIntegerToBooleanColumn<p>A way to fix some oddity that comes from other databases.</p>
<h1 id="changing-a-column-from-integer-to-boolean-in-one-transaction">Changing a Column from Integer to Boolean in One Transaction</h1>
<p>I was migrating a database from SQLite3 to PostgreSQL, not because the former isn’t good, rather because the latter shines!</p>
<p>SQLite3 does not have booleans, so the usual trick to simulate booleans is to use integer columns (or characters, or whatever works for you), and I was in this situation with a table <code class="language-plaintext highlighter-rouge">classification</code> having a <code class="language-plaintext highlighter-rouge">miscellaneous</code> column with only two values: <code class="language-plaintext highlighter-rouge">1</code> to indicate <code class="language-plaintext highlighter-rouge">true</code> and <code class="language-plaintext highlighter-rouge">0</code> to indicate <code class="language-plaintext highlighter-rouge">false</code>. Moreover, the column had a default value set to <code class="language-plaintext highlighter-rouge">0</code> (i.e., <code class="language-plaintext highlighter-rouge">false</code>).</p>
<p>While this is not a problem, it is really annoying when doing queries and data manipulation.
Luckily, PostgreSQL allows for a quick fix of the column, migrating its data type to another. Unluckily, there is no implicit or assignment cast from integer to boolean, so PostgreSQL is not able to work out on its own how to migrate the values, but it is quite simple to instruct it to follow the right path.</p>
<p>First of all, there is the need to check the original column values to identify if, by accident, some not-boolean-ish values have been stored. This is really simple, since you can do something like:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">),</span> <span class="n">miscellaneous</span>
<span class="k">FROM</span> <span class="n">classification</span>
<span class="k">WHERE</span> <span class="n">miscellaneous</span> <span class="k">NOT</span> <span class="k">IN</span> <span class="p">(</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">miscellaneous</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now it is time to migrate the column.</p>
<p><strong>PostgreSQL allows for transactional DDL statements</strong>, which means you can run multiple DDL statements within a transaction.
Therefore, within a single transaction, it is possible to:</p>
<ul>
<li>drop the column default value;</li>
<li>change the column data type, telling PostgreSQL about how to migrate the data;</li>
<li>assign a new default value to the new column.</li>
</ul>
<p>Moreover, PostgreSQL is able to execute a single <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> with multiple <code class="language-plaintext highlighter-rouge">ALTER COLUMN</code> clauses, something that reminds me of Oracle’s <code class="language-plaintext highlighter-rouge">ALTER TABLE MODIFY ( )</code> expression:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">classification</span>
<span class="k">alter</span> <span class="k">column</span> <span class="n">miscellaneous</span> <span class="k">drop</span> <span class="k">default</span><span class="p">,</span>
<span class="k">alter</span> <span class="k">column</span> <span class="n">miscellaneous</span> <span class="k">set</span> <span class="k">data</span> <span class="k">type</span> <span class="nb">boolean</span>
<span class="k">using</span>
<span class="k">case</span> <span class="n">miscellaneous</span> <span class="k">when</span> <span class="mi">1</span> <span class="k">then</span> <span class="k">true</span> <span class="k">else</span> <span class="k">false</span> <span class="k">end</span><span class="p">,</span>
<span class="k">alter</span> <span class="k">column</span> <span class="n">miscellaneous</span> <span class="k">set</span> <span class="k">default</span> <span class="k">false</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Done!</p>
<p>The first <code class="language-plaintext highlighter-rouge">alter column</code> statement removes the default value, the second one uses a <code class="language-plaintext highlighter-rouge">case</code> to convert an integer into a boolean, and the last one adds a default value.</p>
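<p>When several integer flags inherited from SQLite3 need the same treatment, the statement can be generated instead of typed by hand. The following is only a sketch: the <code class="language-plaintext highlighter-rouge">int_to_bool_ddl</code> helper is hypothetical, it assumes the same <code class="language-plaintext highlighter-rouge">1</code>/<code class="language-plaintext highlighter-rouge">0</code> convention and a <code class="language-plaintext highlighter-rouge">false</code> default, and it does no identifier quoting at all.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the ALTER TABLE statement that turns an integer
# flag column (1 means true, anything else means false) into a boolean column
# with a false default.
# No identifier quoting is done: pass only trusted, plain table and column names.
int_to_bool_ddl() {
    table=$1
    column=$2
    printf 'ALTER TABLE %s\n' "$table"
    printf '  ALTER COLUMN %s DROP DEFAULT,\n' "$column"
    printf '  ALTER COLUMN %s SET DATA TYPE boolean\n' "$column"
    printf '    USING CASE %s WHEN 1 THEN true ELSE false END,\n' "$column"
    printf '  ALTER COLUMN %s SET DEFAULT false;\n' "$column"
}

# Print the statement for the table and column used in this article.
int_to_bool_ddl classification miscellaneous
```

<p>The output is plain SQL, so it can be reviewed first and then piped into <code class="language-plaintext highlighter-rouge">psql testdb</code>.</p>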
<p>A more verbose way of doing the same thing is:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">BEGIN</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">classification</span> <span class="k">alter</span> <span class="k">column</span> <span class="n">miscellaneous</span> <span class="k">drop</span> <span class="k">default</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">classification</span> <span class="k">alter</span> <span class="k">column</span> <span class="n">miscellaneous</span> <span class="k">set</span> <span class="k">data</span> <span class="k">type</span> <span class="nb">boolean</span>
<span class="k">using</span>
<span class="k">case</span> <span class="n">miscellaneous</span> <span class="k">when</span> <span class="mi">1</span> <span class="k">then</span> <span class="k">true</span> <span class="k">else</span> <span class="k">false</span> <span class="k">end</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">classification</span> <span class="k">alter</span> <span class="k">column</span> <span class="n">miscellaneous</span> <span class="k">set</span> <span class="k">default</span> <span class="k">false</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">COMMIT</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <a href="https://www.postgresql.org/docs/16/sql-altertable.html" target="_blank">documentation and examples for ALTER TABLE</a> provide more details about how to change the data type in similar situations.</p>
'generated always as identity' columns do not have default values (or do they?)2024-01-29T00:00:00+00:00https://fluca1978.github.io/2024/01/29/GenerateAlwaysVSSerialPostgreSQLColumnsDBIxClass<p>Something strange I discovered while using <code class="language-plaintext highlighter-rouge">DBIx::Class</code> and <code class="language-plaintext highlighter-rouge">DBI</code>.</p>
<h1 id="generated-always-as-identity-columns-do-not-have-default-values-or-do-they">‘generated always as identity’ columns do not have default values (or do they?)</h1>
<p>PostgreSQL has two ways of defining what other databases call <em>an auto-increment column</em>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">serial</code></li>
<li><code class="language-plaintext highlighter-rouge">generated always as identity</code></li>
</ul>
<p>The former, <code class="language-plaintext highlighter-rouge">serial</code>, is the oldest way of declaring an auto-increment column: it creates a sequence and attaches the default value of the column to the <code class="language-plaintext highlighter-rouge">nextval()</code> of the sequence.
The latter, <code class="language-plaintext highlighter-rouge">generated always as identity</code>, is the <em>newest</em> (even if not so new!) <strong>declarative</strong> way of doing the same stuff as <code class="language-plaintext highlighter-rouge">serial</code> does: it creates a sequence and ties the sequence to the table column.</p>
<p>So what is the difference?</p>
<p>In short, with <code class="language-plaintext highlighter-rouge">serial</code> you get two independent objects (a column and a sequence) that behave separately, even if the value of the column is tied to the next value of the sequence. If there is the need to restart the counter, the table does not know anything about it, so you need to explicitly work against the sequence.
On the other hand, with <code class="language-plaintext highlighter-rouge">generated always as identity</code>, the table column <em>knows</em> the sequence used to populate itself, and therefore it is possible to reset the sequence by working against the table. So, for instance, you can do an <code class="language-plaintext highlighter-rouge">ALTER TABLE foo ALTER COLUMN pk RESTART;</code> without having to know the sequence name behind the <code class="language-plaintext highlighter-rouge">pk</code> column.</p>
<p>I usually explain that <code class="language-plaintext highlighter-rouge">generated always as identity</code> is the best way to proceed, because it gives all the advantages of <code class="language-plaintext highlighter-rouge">serial</code> with a more declarative way of handling special cases.</p>
<p><strong>So, you should use <code class="language-plaintext highlighter-rouge">generated always as identity</code>, except when you should not!</strong></p>
<h2 id="the-begin-of-the-problems-dbicdump-and-dbixschemaloader">The beginning of the problems: <code class="language-plaintext highlighter-rouge">dbicdump</code> and <code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader</code></h2>
<p>I was migrating a schema from SQLite3 to our beloved database, so I converted every SQLite3 <code class="language-plaintext highlighter-rouge">autoincrement</code> column to <code class="language-plaintext highlighter-rouge">int generated always as identity</code>.
So far, so good, simple enough.</p>
<p>Then I used Perl <code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader</code> and <code class="language-plaintext highlighter-rouge">dbicdump</code> to dump the database structure into a so-called <em>schema</em>, with classes ready to use.</p>
<p>And last, I used the generated objects. And this is where the problems began…</p>
<p><code class="language-plaintext highlighter-rouge">DBIx::Class</code> was complaining about my <em>auto-increment</em> columns not being defined as such. What was wrong?</p>
<p>I took a quick look at one of the generated classes, and I found the following:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">__PACKAGE__</span><span class="o">-></span><span class="nv">add_columns</span><span class="p">(</span>
<span class="p">"</span><span class="s2">pk</span><span class="p">",</span>
<span class="p">{</span>
<span class="s">data_type</span> <span class="o">=></span> <span class="p">"</span><span class="s2">integer</span><span class="p">",</span>
<span class="s">is_nullable</span> <span class="o">=></span> <span class="mi">0</span><span class="p">,</span>
<span class="p">},</span>
<span class="o">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>A comparison with the SQLite3 definition</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">__PACKAGE__</span><span class="o">-></span><span class="nv">add_columns</span><span class="p">(</span>
<span class="p">"</span><span class="s2">pk</span><span class="p">",</span>
<span class="p">{</span>
<span class="s">data_type</span> <span class="o">=></span> <span class="p">"</span><span class="s2">integer</span><span class="p">",</span>
<span class="s">is_auto_increment</span> <span class="o">=></span> <span class="mi">1</span><span class="p">,</span>
<span class="s">is_nullable</span> <span class="o">=></span> <span class="mi">0</span><span class="p">,</span>
<span class="p">},</span>
<span class="o">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>immediately revealed that the <code class="language-plaintext highlighter-rouge">is_auto_increment</code> attribute was missing for the same column.</p>
<p><strong><code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader</code> was not understanding the column definition, at least not as I was expecting.</strong></p>
<h2 id="investigating-the-problem">Investigating the problem</h2>
<p>I decided to create a simple table with the two possible column types, and dump the schema to see what happens.</p>
<p>With a PostgreSQL table defined as:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">d</span> <span class="n">foo</span>
<span class="k">Table</span> <span class="nv">"public.foo"</span>
<span class="k">Column</span> <span class="o">|</span> <span class="k">Type</span> <span class="o">|</span> <span class="k">Collation</span> <span class="o">|</span> <span class="k">Nullable</span> <span class="o">|</span> <span class="k">Default</span>
<span class="c1">--------+---------+-----------+----------+---------------------------------</span>
<span class="n">pk</span> <span class="o">|</span> <span class="nb">integer</span> <span class="o">|</span> <span class="o">|</span> <span class="k">not</span> <span class="k">null</span> <span class="o">|</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span>
<span class="n">kp</span> <span class="o">|</span> <span class="nb">integer</span> <span class="o">|</span> <span class="o">|</span> <span class="k">not</span> <span class="k">null</span> <span class="o">|</span> <span class="n">nextval</span><span class="p">(</span><span class="s1">'foo_kp_seq'</span><span class="p">::</span><span class="n">regclass</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>the resulting output from <code class="language-plaintext highlighter-rouge">dbicdump</code> is:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">__PACKAGE__</span><span class="o">-></span><span class="nv">add_columns</span><span class="p">(</span>
<span class="p">"</span><span class="s2">pk</span><span class="p">",</span>
<span class="p">{</span>
<span class="s">data_type</span> <span class="o">=></span> <span class="p">"</span><span class="s2">integer</span><span class="p">",</span>
<span class="s">is_nullable</span> <span class="o">=></span> <span class="mi">0</span><span class="p">,</span>
<span class="p">},</span>
<span class="p">"</span><span class="s2">kp</span><span class="p">",</span>
<span class="p">{</span>
<span class="s">data_type</span> <span class="o">=></span> <span class="p">"</span><span class="s2">integer</span><span class="p">",</span>
<span class="s">is_auto_increment</span> <span class="o">=></span> <span class="mi">1</span><span class="p">,</span>
<span class="s">is_nullable</span> <span class="o">=></span> <span class="mi">0</span><span class="p">,</span>
<span class="s">sequence</span> <span class="o">=></span> <span class="p">"</span><span class="s2">foo_kp_seq</span><span class="p">",</span>
<span class="p">},</span>
<span class="o">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is clear that <strong><code class="language-plaintext highlighter-rouge">dbicdump</code> is able to understand <code class="language-plaintext highlighter-rouge">serial</code> columns, while it is not able to understand the default value of <code class="language-plaintext highlighter-rouge">generated always as identity</code></strong>!</p>
<h2 id="more-investigation-dbixclassschemaloaderdbipg">More investigation: <code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader::DBI::Pg</code></h2>
<p>I was puzzled by the problem, so I decided to dig into how <code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader</code> understands the definition of PostgreSQL table columns.
It turned out that <code class="language-plaintext highlighter-rouge">DBIx::Class::Schema::Loader::DBI::Pg</code> is the specific driver behind how the loader interacts with PostgreSQL meta information.</p>
<p>In particular, the <code class="language-plaintext highlighter-rouge">_columns_info_for</code> method collects the metadata for every column of the table, and it contains a piece of code like the following:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1"># process SERIAL columns</span>
<span class="k">if</span> <span class="p">(</span> <span class="nv">$</span><span class="p">{</span> <span class="nv">$info</span><span class="o">-></span><span class="p">{</span><span class="nv">default_value</span><span class="p">}</span> <span class="p">}</span> <span class="o">=~</span> <span class="sr">/\bnextval\('([^:]+)'/i</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">$info</span><span class="o">-></span><span class="p">{</span><span class="nv">is_auto_increment</span><span class="p">}</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="nv">$info</span><span class="o">-></span><span class="p">{</span><span class="nv">sequence</span><span class="p">}</span> <span class="o">=</span> <span class="err">$</span><span class="mi">1</span><span class="p">;</span>
<span class="nb">delete</span> <span class="nv">$info</span><span class="o">-></span><span class="p">{</span><span class="nv">default_value</span><span class="p">};</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Despite the comment, the branch does not look for <code class="language-plaintext highlighter-rouge">serial</code> columns as such: it simply checks whether the <code class="language-plaintext highlighter-rouge">default_value</code> looks like a <code class="language-plaintext highlighter-rouge">nextval</code> call.
<em>There are no other places that handle the <code class="language-plaintext highlighter-rouge">autoincrement</code> and sequence</em> in the method.</p>
<p>But what is that <code class="language-plaintext highlighter-rouge">$info</code> hash? It is coming from <code class="language-plaintext highlighter-rouge">DBI::column_info</code>.</p>
<h2 id="more-and-more-investigation-dbicolumn_info">More and more investigation: <code class="language-plaintext highlighter-rouge">DBI::column_info</code></h2>
<p>The <code class="language-plaintext highlighter-rouge">DBI</code> driver interface provides a <code class="language-plaintext highlighter-rouge">column_info</code> method that returns a statement handle, from which each column definition can be fetched as a hash with a lot of useful information.</p>
<p>It is really simple to write a dummy Perl program to dump the structure of the <code class="language-plaintext highlighter-rouge">foo</code> table presented before:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nv">v5</span><span class="mf">.38</span><span class="p">;</span>
<span class="k">use</span> <span class="nv">DBI</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$db</span> <span class="o">=</span> <span class="nv">DBI</span><span class="o">-></span><span class="nb">connect</span><span class="p">(</span> <span class="p">'</span><span class="s1">dbi:Pg:dbname=testdb;host=venkman;port=5432</span><span class="p">',</span>
<span class="sx">q/luca/</span><span class="p">,</span>
<span class="sx">q/XXXXXXXXXXX/</span> <span class="p">);</span>
<span class="c1"># testdb=> \d foo</span>
<span class="c1"># Table "public.foo"</span>
<span class="c1"># Column | Type | Collation | Nullable | Default</span>
<span class="c1"># --------+---------+-----------+----------+---------------------------------</span>
<span class="c1"># pk | integer | | not null | generated always as identity</span>
<span class="c1"># kp | integer | | not null | nextval('foo_kp_seq'::regclass)</span>
<span class="k">my</span> <span class="nv">$statement</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">column_info</span><span class="p">(</span> <span class="nb">undef</span><span class="p">,</span> <span class="sx">q/public/</span><span class="p">,</span> <span class="sx">q/foo/</span><span class="p">,</span> <span class="sx">q/pk/</span> <span class="p">);</span>
<span class="k">while</span> <span class="p">(</span> <span class="k">my</span> <span class="nv">$row</span> <span class="o">=</span> <span class="nv">$statement</span><span class="o">-></span><span class="nv">fetchrow_hashref</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">use</span> <span class="nn">Data::</span><span class="nv">Dumper</span><span class="p">;</span>
<span class="nv">say</span> <span class="nv">Dumper</span><span class="p">(</span> <span class="nv">$row</span> <span class="p">);</span>
<span class="p">}</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">==================================</span><span class="p">";</span>
<span class="nv">$statement</span> <span class="o">=</span> <span class="nv">$db</span><span class="o">-></span><span class="nv">column_info</span><span class="p">(</span> <span class="nb">undef</span><span class="p">,</span> <span class="sx">q/public/</span><span class="p">,</span> <span class="sx">q/foo/</span><span class="p">,</span> <span class="sx">q/kp/</span> <span class="p">);</span>
<span class="k">while</span> <span class="p">(</span> <span class="k">my</span> <span class="nv">$row</span> <span class="o">=</span> <span class="nv">$statement</span><span class="o">-></span><span class="nv">fetchrow_hashref</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">use</span> <span class="nn">Data::</span><span class="nv">Dumper</span><span class="p">;</span>
<span class="nv">say</span> <span class="nv">Dumper</span><span class="p">(</span> <span class="nv">$row</span> <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The program produces the following (trimmed) output:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% perl ~/tmp/test.pl
<span class="nv">$VAR1</span> <span class="o">=</span> <span class="o">{</span>
<span class="s1">'TYPE_NAME'</span> <span class="o">=></span> <span class="s1">'integer'</span>,
<span class="s1">'pg_schema'</span> <span class="o">=></span> <span class="s1">'public'</span>,
<span class="s1">'pg_type'</span> <span class="o">=></span> <span class="s1">'integer'</span>,
<span class="s1">'NULLABLE'</span> <span class="o">=></span> 0,
<span class="s1">'COLUMN_DEF'</span> <span class="o">=></span> undef,
<span class="s1">'IS_NULLABLE'</span> <span class="o">=></span> <span class="s1">'NO'</span>,
<span class="s1">'pg_column'</span> <span class="o">=></span> <span class="s1">'pk'</span>,
<span class="s1">'COLUMN_NAME'</span> <span class="o">=></span> <span class="s1">'pk'</span>,
<span class="s1">'pg_table'</span> <span class="o">=></span> <span class="s1">'foo'</span>,
<span class="s1">'TABLE_NAME'</span> <span class="o">=></span> <span class="s1">'foo'</span>,
<span class="s1">'TABLE_SCHEM'</span> <span class="o">=></span> <span class="s1">'public'</span>,
<span class="s1">'pg_constraint'</span> <span class="o">=></span> undef
...
<span class="o">}</span><span class="p">;</span>
<span class="o">==================================</span>
<span class="nv">$VAR1</span> <span class="o">=</span> <span class="o">{</span>
<span class="s1">'pg_column'</span> <span class="o">=></span> <span class="s1">'kp'</span>,
<span class="s1">'COLUMN_NAME'</span> <span class="o">=></span> <span class="s1">'kp'</span>,
<span class="s1">'DECIMAL_DIGITS'</span> <span class="o">=></span> undef,
<span class="s1">'pg_table'</span> <span class="o">=></span> <span class="s1">'foo'</span>,
<span class="s1">'TABLE_NAME'</span> <span class="o">=></span> <span class="s1">'foo'</span>,
<span class="s1">'TABLE_SCHEM'</span> <span class="o">=></span> <span class="s1">'public'</span>,
<span class="s1">'TYPE_NAME'</span> <span class="o">=></span> <span class="s1">'integer'</span>,
<span class="s1">'pg_type'</span> <span class="o">=></span> <span class="s1">'integer'</span>,
<span class="s1">'NULLABLE'</span> <span class="o">=></span> 0,
<span class="s1">'COLUMN_DEF'</span> <span class="o">=></span> <span class="s1">'nextval(\'</span>foo_kp_seq<span class="se">\'</span>::regclass<span class="o">)</span><span class="s1">',
...
};
</span></code></pre></div></div>
<p><br />
<br /></p>
<p>The interesting part is <strong><code class="language-plaintext highlighter-rouge">COLUMN_DEF</code></strong> that defines the default value of the column: note how in the case of <code class="language-plaintext highlighter-rouge">serial</code> there is the <code class="language-plaintext highlighter-rouge">nextval()</code> call, while in the case of <code class="language-plaintext highlighter-rouge">generated always as identity</code> there is nothing. This is the problem.</p>
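<p>The effect of the loader’s check on these two values can be reproduced outside Perl. The following is a rough shell re-creation of the regular expression shown earlier, not the loader’s code: the <code class="language-plaintext highlighter-rouge">sequence_from_default</code> function is hypothetical, it is case-sensitive (the original is not), and the <code class="language-plaintext highlighter-rouge">undef</code> default is modelled as an empty string.</p>

```shell
#!/bin/sh
# Hypothetical re-creation of the loader's test: print the sequence name when
# the column default looks like nextval('...'), print nothing otherwise.
sequence_from_default() {
    printf '%s\n' "$1" | sed -n "s/.*nextval('\([^:']*\)'.*/\1/p"
}

# COLUMN_DEF of the serial column kp: the sequence name is extracted.
sequence_from_default "nextval('foo_kp_seq'::regclass)"

# COLUMN_DEF of the identity column pk is undef (an empty string here):
# nothing matches, so the column is never flagged as auto-increment.
sequence_from_default ""
```

<p>Only the first call prints a sequence name, which mirrors what <code class="language-plaintext highlighter-rouge">dbicdump</code> produced for the two columns.</p>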
<h2 id="dont-blame-dbi">Don’t blame <code class="language-plaintext highlighter-rouge">DBI</code>!</h2>
<p>Strictly speaking, this is not a bug in <code class="language-plaintext highlighter-rouge">DBI</code>, so don’t blame the Perl Database Interface!</p>
<p>In fact, PostgreSQL is a little tricky about giving back information about <code class="language-plaintext highlighter-rouge">generated always as identity</code> columns:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">a</span><span class="p">.</span><span class="n">attname</span><span class="p">,</span> <span class="n">a</span><span class="p">.</span><span class="n">attidentity</span><span class="p">,</span>
<span class="p">(</span><span class="k">select</span> <span class="n">pg_get_expr</span><span class="p">(</span> <span class="n">d</span><span class="p">.</span><span class="n">adbin</span><span class="p">,</span> <span class="n">d</span><span class="p">.</span><span class="n">adrelid</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">pg_attrdef</span> <span class="n">d</span>
<span class="k">where</span> <span class="n">d</span><span class="p">.</span><span class="n">adrelid</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">attrelid</span>
<span class="k">and</span> <span class="n">d</span><span class="p">.</span><span class="n">adnum</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">attnum</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">pg_attribute</span> <span class="n">a</span>
<span class="k">where</span> <span class="n">a</span><span class="p">.</span><span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span>
<span class="k">and</span> <span class="n">a</span><span class="p">.</span><span class="n">attname</span> <span class="k">in</span> <span class="p">(</span> <span class="s1">'pk'</span><span class="p">,</span> <span class="s1">'kp'</span> <span class="p">);</span>
<span class="n">attname</span> <span class="o">|</span> <span class="n">attidentity</span> <span class="o">|</span> <span class="n">pg_get_expr</span>
<span class="c1">---------+-------------+---------------------------------</span>
<span class="n">kp</span> <span class="o">|</span> <span class="o">|</span> <span class="n">nextval</span><span class="p">(</span><span class="s1">'foo_kp_seq'</span><span class="p">::</span><span class="n">regclass</span><span class="p">)</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">a</span> <span class="o">|</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the only thing that can be extracted is the fact that the column has been defined as an identity one (<code class="language-plaintext highlighter-rouge">attidentity = a</code>).</p>
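<p>Even though no default expression is stored for an identity column, the backing sequence can still be looked up. The following is only a minimal sketch, not what <code class="language-plaintext highlighter-rouge">DBD::Pg</code> does: it assumes the same <code class="language-plaintext highlighter-rouge">foo</code> table used above and relies on <code class="language-plaintext highlighter-rouge">pg_get_serial_sequence()</code>, which resolves the sequence owned by either a <code class="language-plaintext highlighter-rouge">serial</code> or an identity column. The <code class="language-plaintext highlighter-rouge">sequence_name</code> alias is just an illustrative name.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- attidentity is 'a' for GENERATED ALWAYS, 'd' for GENERATED BY DEFAULT
-- and empty for ordinary columns (serial ones included)
select a.attname,
       a.attidentity,
       pg_get_serial_sequence( a.attrelid::regclass::text,
                               a.attname::text ) as sequence_name
  from pg_attribute a
 where a.attrelid = 'foo'::regclass
   and a.attname in ( 'pk', 'kp' );
</code></pre></div></div>
<p><br />
<br /></p>
<p>An introspection layer could, in principle, use the returned sequence name to describe identity columns in the same way it already describes <code class="language-plaintext highlighter-rouge">serial</code> ones.</p>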
<h2 id="how-does-dbdpg-finds-out-the-information-about-a-column">How does <code class="language-plaintext highlighter-rouge">DBD::Pg</code> find out the information about a column?</h2>
<p>It turns out that <code class="language-plaintext highlighter-rouge">DBD::Pg</code> uses <a href="https://github.com/bucardo/dbdpg/blob/master/Pg.pm#L501" target="_blank">a very long query to get the information about a column</a>. I’ve opened a <a href="https://github.com/bucardo/dbdpg/issues/123" target="_blank">ticket on DBD::Pg</a>.</p>
<h1 id="conclusions">Conclusions</h1>
<p>I had never thought about how difficult it could be to introspect a column generated as identity. While I’m still convinced that it is better to use such columns instead of <code class="language-plaintext highlighter-rouge">serial</code>, introspection frameworks can currently get more information out of <code class="language-plaintext highlighter-rouge">serial</code> column definitions.</p>
Learn PostgreSQL - second edition - Tech Bits2024-01-27T00:00:00+00:00https://fluca1978.github.io/2024/01/27/LearnPostgreSQLTechBits<p>Enrico and I talk about our latest book on Doug’s Tech Bits show!</p>
<h1 id="learn-postgresql---second-edition---tech-bits">Learn PostgreSQL - second edition - Tech Bits</h1>
<p>I’m really glad that Enrico and I were hosted on Doug’s great <strong>Tech Bits</strong> show. You can watch the episode on <a href="https://www.youtube.com/watch?v=fA2MhUpyM44" target="_blank">YouTube</a>.</p>
<p>I would like to thank Doug Ortiz for his excellent work.</p>
Installing PL/Java on PostgreSQL 16 and Rocky Linux2024-01-17T00:00:00+00:00https://fluca1978.github.io/2024/01/17/pljavaPostgreSQL16<p>A short recap on some issues when dealing with PL/Java and Rocky Linux.</p>
<h1 id="installing-pljava-on-postgresql-16-and-rocky-linux">Installing PL/Java on PostgreSQL 16 and Rocky Linux</h1>
<p>It has been a while since I last used <a href="https://github.com/tada/pljava" target="_blank">PL/Java</a>, and that’s mostly due to the fact that I (luckily) use much more Perl (and hence, PL/Perl) in my everyday activity than Java.</p>
<p>I decided to implement a few features in Java, and so here comes another installation of PL/Java. Installing it on Rocky Linux has been a little tricky, so here is a short recap of what to do.</p>
<p>I wrote about PL/Java in my book <strong><a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide" target="_blank">PostgreSQL 11 Server Side Programming Quick Start Guide</a></strong>.</p>
<p>As usual within the PostgreSQL ecosystem, PL/Java has <a href="https://github.com/tada/pljava/wiki/User-guide" target="_blank">very rich documentation</a>.</p>
<p><strong>IMPORTANT (2024-04-22): this article contains a few mistakes that have been addressed in <a href="https://fluca1978.github.io/2024/04/22/PLJavaInsightsAndFixes.html">my other article</a>, so please make sure to read that one too!</strong></p>
<h2 id="is-it-worth">Is it worth it?</h2>
<p>I had to answer this question over and over: <strong>is it worth using PL/Java for PostgreSQL triggers, functions, procedures and so on?</strong></p>
<p>As usual in these cases, there is no single answer. First of all, using an <em>external</em> language like PL/Java means that PostgreSQL has to manage the round-trip of data between the database and the virtual machine, with the latter being started at first execution. In short: performance is good, but never as good as with native languages.
Second, PL/Java brings all the complexity of a formally compiled language, therefore making changes to the code is not as simple as in scripting languages.
Last, in my opinion, it does make sense if you need to bring some Java stuff into your scenario, either because it is a language you are absolutely proficient in, or because you already have libraries and utilities that you don’t want to rewrite in a form the database can use directly.</p>
<h2 id="installing-pljava-on-rocky-linux">Installing PL/Java on Rocky Linux</h2>
<p>Thanks to the PGDG, the official PostgreSQL repositories include an already available PL/Java package.
Therefore, installing PL/Java is as simple as:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>dnf <span class="nb">install </span>pljava_16.x86_64
</code></pre></div></div>
<p><br />
<br /></p>
<p>This makes the executable available, that is, PostgreSQL will be able to run Java code within the database.</p>
<h2 id="problem-during-compilation-of-pljava">Problem during compilation of PL/Java</h2>
<p>If you need to develop against the PL/Java API, you need not only the executable, but also the whole library, which is compiled via <em>Apache Maven</em>. During the compilation, I ran into a few problems, most notably a <code class="language-plaintext highlighter-rouge">gssapi</code> related one.</p>
<p>I dug a little deeper using the <code class="language-plaintext highlighter-rouge">-X</code> flag:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% mvn <span class="nt">-X</span> clean <span class="nb">install</span>
...
In file included from /home/luca/pljava-1_6_6/pljava-so/src/main/c/InstallHelper.c:21:
/usr/pgsql-16/include/server/libpq/libpq-be.h:32:10: fatal error: gssapi/gssapi.h: No such file or directory
32 | <span class="c">#include <gssapi/gssapi.h></span>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
</code></pre></div></div>
<p><br />
<br /></p>
<p>In order to solve the problem, I had to install the Kerberos development package:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>dnf <span class="nb">install </span>krb5-devel.x86_64
</code></pre></div></div>
<p><br />
<br /></p>
<p>and relaunching <code class="language-plaintext highlighter-rouge">mvn</code> worked as expected.</p>
<h2 id="using-pljava">Using PL/Java</h2>
<p>In order to use PL/Java there could be the need to relax the JVM security constraints. I don’t recommend granting <em>all permissions</em>, but it is the quickest way to get PL/Java running. Edit the file <code class="language-plaintext highlighter-rouge">/usr/lib/jvm/java/lib/security/default.policy</code> and make sure the very last section appears as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// permissions needed by applications using java.desktop module
grant <span class="o">{</span>
permission java.security.AllPermission<span class="p">;</span>
...
<span class="o">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h3 id="inform-postgresql-and-pljava-about-where-the-jvm-is-located">Inform PostgreSQL and PL/Java about where the JVM is located</h3>
<p>Before being able to use PL/Java there is the need to inform PostgreSQL about where the JVM is located (and hence, which one to use).
This is achieved by means of an <code class="language-plaintext highlighter-rouge">ALTER DATABASE ... SET</code> command:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">alter</span> <span class="k">database</span> <span class="n">testdb</span>
<span class="k">set</span>
<span class="n">pljava</span><span class="p">.</span><span class="n">libjvm_location</span> <span class="o">=</span> <span class="s1">'/usr/lib/jvm/java-11-openjdk-11.0.21.0.9-2.el9.x86_64/lib/server/libjvm.so'</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and after this, it is possible to <em>install</em> PL/Java:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">create</span> <span class="n">extension</span> <span class="n">pljava</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
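<p>To double check the two steps above, the system catalogs can be queried. This is a minimal sketch that assumes the database is named <code class="language-plaintext highlighter-rouge">testdb</code> as in the examples; keep in mind that a setting stored with <code class="language-plaintext highlighter-rouge">ALTER DATABASE ... SET</code> applies to sessions opened after the command.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- per-database settings stored by ALTER DATABASE ... SET
select d.datname, s.setconfig
  from pg_db_role_setting s
  join pg_database d on d.oid = s.setdatabase
 where d.datname = 'testdb';

-- the installed extension and its version
select extname, extversion
  from pg_extension
 where extname = 'pljava';
</code></pre></div></div>
<p><br />
<br /></p>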
<h3 id="install-a-jar">Install a JAR</h3>
<p>PL/Java, being Java, works on the concept of <em>jar</em> archives.
The jar needs to be <em>installed</em> into PostgreSQL in order for PL/Java to be able to run its code. Installing a jar means that you need to inform PL/Java and PostgreSQL about the jar location.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">sqlj</span><span class="p">.</span><span class="n">install_jar</span><span class="p">(</span> <span class="s1">'file:///tmp/proj-0.0.1-SNAPSHOT.jar'</span><span class="p">,</span>
<span class="s1">'fluca'</span><span class="p">,</span>
<span class="k">true</span> <span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The first parameter to <code class="language-plaintext highlighter-rouge">install_jar</code> is the URI of the jar, the second is a short name assigned to the jar and the last indicates whether the deployment descriptor contained in the jar must be executed.</p>
<h3 id="set-the-classpath">Set the classpath</h3>
<p>Java has the notion of <code class="language-plaintext highlighter-rouge">classpath</code> and so does PL/Java. In order to use a function within an installed jar, there is the need to <em>map</em> the PostgreSQL schema to the Java classpath, in particular to the jar.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">sqlj</span><span class="p">.</span><span class="n">set_classpath</span><span class="p">(</span><span class="s1">'public'</span><span class="p">,</span> <span class="s1">'fluca'</span><span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The jar named <code class="language-plaintext highlighter-rouge">fluca</code> will be added to the classpath of the <code class="language-plaintext highlighter-rouge">public</code> PostgreSQL schema, so that when you refer to a method in the <code class="language-plaintext highlighter-rouge">public</code> schema PL/Java will search within the <code class="language-plaintext highlighter-rouge">fluca</code> jar.</p>
<p>Assuming the jar contains the classic <em>Hello World</em> function, the final result is something like:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">sf</span> <span class="n">hello</span>
<span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="k">public</span><span class="p">.</span><span class="n">hello</span><span class="p">(</span><span class="n">towhom</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">character</span> <span class="nb">varying</span>
<span class="k">LANGUAGE</span> <span class="n">java</span>
<span class="k">AS</span> <span class="err">$</span><span class="k">function</span><span class="err">$</span><span class="n">java</span><span class="p">.</span><span class="n">lang</span><span class="p">.</span><span class="n">String</span><span class="o">=</span><span class="n">com</span><span class="p">.</span><span class="n">example</span><span class="p">.</span><span class="n">proj</span><span class="p">.</span><span class="n">Hello</span><span class="p">.</span><span class="n">hello</span><span class="p">(</span><span class="n">java</span><span class="p">.</span><span class="n">lang</span><span class="p">.</span><span class="n">String</span><span class="p">)</span><span class="err">$</span><span class="k">function</span><span class="err">$</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>which makes very clear that <code class="language-plaintext highlighter-rouge">public.hello</code> is mapped to <code class="language-plaintext highlighter-rouge">Hello.hello</code> in the Java space.</p>
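<p>As a sketch of how such a mapping can be written by hand and then exercised, the declaration below simply restates the <code class="language-plaintext highlighter-rouge">\sf</code> output above as a regular <code class="language-plaintext highlighter-rouge">CREATE FUNCTION</code>. It assumes the jar is already installed and on the classpath of the <code class="language-plaintext highlighter-rouge">public</code> schema; the argument passed in the call is arbitrary, and the returned text depends on the Java implementation, which is not shown here.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- map the SQL function to the static Java method,
-- using the form returntype=class.method(argument types)
create or replace function public.hello( towhom character varying )
returns character varying
language java
as 'java.lang.String=com.example.proj.Hello.hello(java.lang.String)';

-- invoke the Java method through its SQL wrapper
select public.hello( 'PostgreSQL' );
</code></pre></div></div>
<p><br />
<br /></p>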
<h1 id="where-is-my-java-stuff">Where is my Java stuff?</h1>
<p>PL/Java creates a schema <code class="language-plaintext highlighter-rouge">sqlj</code> that is used to handle both functions and tables that <em>route</em> stuff from PostgreSQL to Java and back.</p>
<p>In particular, <code class="language-plaintext highlighter-rouge">sqlj.jar_repository</code> contains an entry for every installed jar, so that you can for instance know where a jar is located:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">jarid</span><span class="p">,</span> <span class="n">jarname</span><span class="p">,</span> <span class="n">jarorigin</span> <span class="k">from</span> <span class="n">sqlj</span><span class="p">.</span><span class="n">jar_repository</span><span class="p">;</span>
<span class="n">jarid</span> <span class="o">|</span> <span class="n">jarname</span> <span class="o">|</span> <span class="n">jarorigin</span>
<span class="c1">-------+---------+-------------------------------------</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">fluca</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">proj</span><span class="o">-</span><span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">.</span><span class="mi">1</span><span class="o">-</span><span class="n">SNAPSHOT</span><span class="p">.</span><span class="n">jar</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The table <code class="language-plaintext highlighter-rouge">sqlj.classpath_entry</code> shows how jars are <em>mapped</em> into PostgreSQL schemas:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">r</span><span class="p">.</span><span class="n">jarname</span><span class="p">,</span> <span class="n">r</span><span class="p">.</span><span class="n">jarorigin</span><span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">schemaname</span>
<span class="k">from</span> <span class="n">sqlj</span><span class="p">.</span><span class="n">jar_repository</span> <span class="n">r</span> <span class="k">join</span> <span class="n">sqlj</span><span class="p">.</span><span class="n">classpath_entry</span> <span class="k">c</span> <span class="k">on</span> <span class="k">c</span><span class="p">.</span><span class="n">jarid</span> <span class="o">=</span> <span class="n">r</span><span class="p">.</span><span class="n">jarid</span><span class="p">;</span>
<span class="n">jarname</span> <span class="o">|</span> <span class="n">jarorigin</span> <span class="o">|</span> <span class="n">schemaname</span>
<span class="c1">---------+-------------------------------------+------------</span>
<span class="n">fluca</span> <span class="o">|</span> <span class="n">file</span><span class="p">:</span><span class="o">///</span><span class="n">tmp</span><span class="o">/</span><span class="n">proj</span><span class="o">-</span><span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">.</span><span class="mi">1</span><span class="o">-</span><span class="n">SNAPSHOT</span><span class="p">.</span><span class="n">jar</span> <span class="o">|</span> <span class="k">public</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>From the above it is possible to get the jar short name, the location the jar was loaded from and the PostgreSQL schema the jar has been mapped to.</p>
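<p>The other half of the picture, namely which SQL functions are routed to Java, lives in the regular PostgreSQL catalogs. The following is a minimal sketch that assumes the language names <code class="language-plaintext highlighter-rouge">java</code> and <code class="language-plaintext highlighter-rouge">javau</code> registered by PL/Java; the result may also include functions that PL/Java itself defines in the <code class="language-plaintext highlighter-rouge">sqlj</code> schema.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- prosrc holds the text given in the AS clause,
-- that is the Java method the function is mapped to
select n.nspname, p.proname, l.lanname, p.prosrc
  from pg_proc p
  join pg_language l on l.oid = p.prolang
  join pg_namespace n on n.oid = p.pronamespace
 where l.lanname in ( 'java', 'javau' );
</code></pre></div></div>
<p><br />
<br /></p>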
<p>There are other interesting functions, like <code class="language-plaintext highlighter-rouge">get_classpath</code>, <code class="language-plaintext highlighter-rouge">set_classpath</code> and obviously <code class="language-plaintext highlighter-rouge">remove_jar</code> and <code class="language-plaintext highlighter-rouge">replace_jar</code>.</p>
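<p>A short sketch of how these functions could be used follows. It reuses the <code class="language-plaintext highlighter-rouge">fluca</code> jar name from the examples above, while the <code class="language-plaintext highlighter-rouge">0.0.2</code> jar path is purely hypothetical and stands for a newer build of the same project.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- show the classpath currently mapped to the public schema
select sqlj.get_classpath( 'public' );

-- reload the jar from a newer build (hypothetical path),
-- running its deployment descriptor again
select sqlj.replace_jar( 'file:///tmp/proj-0.0.2-SNAPSHOT.jar', 'fluca', true );

-- remove the jar, running the undeploy part of its descriptor
select sqlj.remove_jar( 'fluca', true );
</code></pre></div></div>
<p><br />
<br /></p>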
<h1 id="conclusions">Conclusions</h1>
<p>PL/Java is a very powerful tool and an interesting language to extend the already rich set of features that PostgreSQL provides.</p>
Learn PostgreSQL (second edition): screencasts available!2023-12-07T00:00:00+00:00https://fluca1978.github.io/2023/12/07/LearnPostgreSQLSecondEditionScreencasts<p>An example of how to run the Docker images provided by the GitHub repository.</p>
<h1 id="learn-postgresql-second-edition-screencasts-available">Learn PostgreSQL (second edition): screencasts available!</h1>
<p>One of the improvements we made while rewriting and updating the book
was to introduce <em>Docker images</em> that the readers can launch as a <strong>safe environment to test the concepts expressed in the book</strong>.
This has several advantages, most notably the fact that the user does not need to install a separate PostgreSQL instance on her own and fill it with data that could slightly change from chapter to chapter according to the different examples. Another advantage is that, in case the user damages the data and wants to restore it, the container can be erased and a new one can be built from scratch.</p>
<p>In order to help users quickly access the PostgreSQL containers, sparing them from writing long and boring Docker commands, we built a simple shell script named <strong><code class="language-plaintext highlighter-rouge">run-pg-docker.sh</code></strong> that optionally accepts the name of the chapter image and drops the user into the container, logged in as the <code class="language-plaintext highlighter-rouge">postgres</code> user. Therefore, with just a simple command, the reader can <em>jump</em> into the PostgreSQL container and start running all the commands and examples detailed in the book!</p>
<p>And, in order to better demonstrate how to quickly jump in, there are a couple of <em>asciinema</em> screencasts to let the readers see how the container process is launched.</p>
<p>The first screencast shows how to run the so called <code class="language-plaintext highlighter-rouge">standalone</code> container, the <em>catch-all</em> container used whenever there is no need for a per-chapter specific container:</p>
<p><br />
<br /></p>
<center>
<a href="https://asciinema.org/a/625735" target="_blank"><img src="https://asciinema.org/a/625735.svg" /></a>
</center>
<p><br />
<br /></p>
<p>The second screencast, on the other hand, shows how to run a specific per-chapter container, in particular the Chapter 10 container (related to users, roles and permissions):</p>
<p><br />
<br /></p>
<center>
<a href="https://asciinema.org/a/625738" target="_blank"><img src="https://asciinema.org/a/625738.svg" /></a>
</center>
<p><br />
<br /></p>
<p>Please consider that the time required for the container to fire up depends on the speed of the Internet connection, of the host machine and on the already downloaded artifacts (i.e., re-launching a container for the second time will require less time).</p>
<h2 id="resources">Resources</h2>
<p>The GitHub repository for downloading examples, Docker images (via <code class="language-plaintext highlighter-rouge">docker-compose</code>) and in general source files is available <a href="https://github.com/PacktPublishing/Learn-PostgreSQL-Second-Edition" target="_blank">at this URL</a>.</p>
<p>The <strong><a href="https://www.packtpub.com/product/learn-postgresql-second-edition/9781837635641" target="_blank">Learn PostgreSQL second edition book can be found at this link</a></strong>.</p>
<p>Please consider that some output of the screencasts could be different from what you get on your system, and that over time the configuration files for the Docker images could slightly change depending on readers’ suggestions and comments.</p>
Learn PostgreSQL - second edition2023-11-21T00:00:00+00:00https://fluca1978.github.io/2023/11/21/LearnPostgreSQLSecondEdition<p>Another edition of our complete book is out there!</p>
<h1 id="learn-postgresql---second-edition">Learn PostgreSQL - second edition</h1>
<p>Last <em>Halloween</em>, the second edition of our book <strong><a href="https://www.amazon.com/Learn-PostgreSQL-manage-scalable-databases/dp/1837635641/ref=sr_1_4?crid=1TCXJ1I4KX16O&keywords=learn+postgresql&qid=1700582950&sprefix=learn+postg%2Caps%2C215&sr=8-4#customerReviews" target="_blank">Learn PostgreSQL</a></strong> was released!</p>
<p><br /></p>
<center>
<a href="https://www.packtpub.com/product/learn-postgresql-second-edition/9781837635641">
<img src="https://content.packt.com/B19640/cover_image_small.jpg" alt="Learn PostgreSQL - second edition" />
</a>
</center>
<p><br /></p>
<p>I’m very proud of all the work my friend and co-author Enrico and I have done, not merely to update this edition of the book, which now covers <strong>PostgreSQL 16</strong>, but also to provide new content, new examples and, most notably, a new approach to help readers understand the concepts expressed in the book.</p>
<p>In fact, with this new edition, readers will have access to a set of Docker containers that can be used to quickly fire up a PostgreSQL instance and get hands on the examples and exercises!</p>
<p>Moreover, every chapter now has a <em>Verify your knowledge</em> ending section, made of questions and short answers to point the reader to the most important concepts of the chapter itself.</p>
<p>While the overall structure of the chapters has remained the same, we took the chance to improve almost all the content in order to better explain concepts and terminology.</p>
<p>I strongly believe <strong>this is not a simple <em>update</em> of the book, rather it is a full <em>upgrade</em>!</strong></p>
<p>And after almost a month in the wild, the <a href="https://www.amazon.com/Learn-PostgreSQL-manage-scalable-databases/dp/1837635641/ref=sr_1_4?crid=1TCXJ1I4KX16O&keywords=learn+postgresql&qid=1700582950&sprefix=learn+postg%2Caps%2C215&sr=8-4#customerReviews" target="_blank">reviews for the book confirm my feelings</a>!</p>
<p>Similarly to the previous edition, a <a href="https://github.com/PacktPublishing/Learn-PostgreSQL-Second-Edition" target="_blank">GitHub repository with all the main examples, the Docker containers, and other gadgets is available</a>.</p>
<p>As always, Enrico and I welcome any feedback and errata that can help us improve, and help other readers get a better experience.</p>
<p>And this post cannot conclude without giving a very warm and special thanks to our technical reviewers <em>Chris Mair</em> and <em>Silvio Trancanella</em>, who helped us a lot in improving the quality and readability of the book.
I also thank all the people at Packt who helped and assisted us during this work.</p>
pgagroal: where is my configuration?2023-11-13T00:00:00+00:00https://fluca1978.github.io/2023/11/13/pgagroalConfLs<p>A new command to display where the configuration files are located.</p>
<h1 id="pgagroal-where-is-my-configuration">pgagroal: where is my configuration?</h1>
<p>I <a href="https://github.com/agroal/pgagroal/commit/5988613a32332c13bc6df8b71290e2989bd711e0" target="_blank">implemented a new command in pgagroal</a>: <code class="language-plaintext highlighter-rouge">conf ls</code>. The aim of the command is very simple: display where the configuration files are located.
In fact, <code class="language-plaintext highlighter-rouge">pgagroal</code> configuration is split into several configuration files, and sometimes it could be useful to ask the running system where a configuration file is.</p>
<p>The command works as follows:</p>
<p><br />
<br /></p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal conf <span class="nb">ls
</span>Main Configuration file: /etc/pgagroal/pgagroal.conf
HBA file: /etc/pgagroal/pgagroal_hba.conf
Limit file: /etc/pgagroal/pgagroal_databases.conf
Frontend <span class="nb">users </span>file: /etc/pgagroal/pgagroal_frontend_users.conf
Admins file: /etc/pgagroal/pgagroal_admins.conf
Superuser file:
Users file: /etc/pgagroal/pgagroal_users.conf
</code></pre></div></div>
<p><br />
<br /></p>
<p>If a configuration file has not been specified, the corresponding value will be left empty, otherwise, the full path to the configuration file will be displayed.</p>
<p>This is another small addition towards a more consistent and useful command line interface.</p>
pgagroal new commands: 'ping' and 'status details'2023-10-30T00:00:00+00:00https://fluca1978.github.io/2023/10/30/pgagroalPingStatus<p>Another little improvement to the interface for <code class="language-plaintext highlighter-rouge">pgagroal</code></p>
<h1 id="pgagroal-new-commands-ping-and-ìstatus-details">pgagroal new commands: ‘ping’ and ‘status details’</h1>
<p>When I <a href="https://github.com/agroal/pgagroal/commit/ade40240317bad155dbf1e40866c96257b688b90" target="_blank">committed the major command refactoring</a> in <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>, I also introduced a simple way to <em>deprecate a command</em>, so that the user running the old version of a command is warned about switching to the new interface.</p>
<p>This led me to think that I could not only refactor <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> commands in a more coherent way, grouping similar commands together, but also change existing commands by deprecating them.</p>
<p>That is what I did <a href="https://github.com/agroal/pgagroal/commit/5f15164b8eff7063445ac454baefa4b4242c962f" target="_blank">in this commit</a> where I replaced the <code class="language-plaintext highlighter-rouge">is-alive</code> command with <code class="language-plaintext highlighter-rouge">ping</code> and <code class="language-plaintext highlighter-rouge">details</code> with <code class="language-plaintext highlighter-rouge">status details</code>.</p>
<h2 id="the-ping-command">The <code class="language-plaintext highlighter-rouge">ping</code> command</h2>
<p>I have to confess: the name has been inspired by the MySQL Admin tool, that has a similar command.</p>
<p>The idea of <code class="language-plaintext highlighter-rouge">ping</code> is to test if the connection pooler is alive, and it replaces the old command <code class="language-plaintext highlighter-rouge">is-alive</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgagroal-cli ping <span class="nt">--verbose</span>
pgagroal-cli: Success <span class="o">(</span>0<span class="o">)</span>
<span class="nv">$ </span>pgagroal-cli is-alive <span class="nt">--verbose</span>
pgagroal-cli: <span class="nb">command</span> <is-alive> has been deprecated by <ping> since version 1.6
pgagroal-cli: Success <span class="o">(</span>0<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
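<p>Please note that, as documented, the <code class="language-plaintext highlighter-rouge">ping</code> command does not print anything if the pooler is running; the output shown above was produced with the <code class="language-plaintext highlighter-rouge">--verbose</code> flag.</p>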
<h2 id="the-status-details-command">The <code class="language-plaintext highlighter-rouge">status details</code> command</h2>
<p>The <code class="language-plaintext highlighter-rouge">status</code> command prints summary information about the pooler, while the <code class="language-plaintext highlighter-rouge">details</code> command prints the same summary plus more verbose and detailed information about every connection.</p>
<p>Why not group these two commands? This is the aim of having <code class="language-plaintext highlighter-rouge">status details</code>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">status</code> will work as before;</li>
<li><code class="language-plaintext highlighter-rouge">status</code> enhanced with <code class="language-plaintext highlighter-rouge">details</code> will provide more verbose output.</li>
</ul>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgagroal-cli status
Status: Running
Active connections: 0
Total connections: 0
Max connections: 15
<span class="nv">$ </span>pgagroal-cli status details
Status: Running
Active connections: 0
Total connections: 0
Max connections: 15
<span class="nt">---------------------</span>
Server: venkman
Host: venkman
Port: 5432
State: Not init
<span class="nt">---------------------</span>
<span class="nt">---------------------</span>
Server: a
Host: spengler
Port: 5432
State: Not init
<span class="nt">---------------------</span>
<span class="nt">---------------------</span>
Server: b
Host: spengler
Port: 6432
State: Not init
<span class="nt">---------------------</span>
<span class="nt">---------------------</span>
Database: testdb
Username: luca
Active connections: 0
Max connections: 2
Initial connections: 1
Min connections: 1
<span class="nt">---------------------</span>
<span class="nt">---------------------</span>
Database: all
Username: luca
Active connections: 0
Max connections: 10
Initial connections: 2
Min connections: 1
<span class="nt">---------------------</span>
<span class="nt">---------------------</span>
Database: pgbench
Username: pgbench
Active connections: 0
Max connections: 2
Initial connections: 1
Min connections: 1
<span class="nt">---------------------</span>
Connection 0: Not init
Connection 1: Not init
Connection 2: Not init
Connection 3: Not init
Connection 4: Not init
Connection 5: Not init
Connection 6: Not init
Connection 7: Not init
Connection 8: Not init
Connection 9: Not init
Connection 10: Not init
Connection 11: Not init
Connection 12: Not init
Connection 13: Not init
Connection 14: Not init
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore, the total number of main commands of <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> has now shrunk, since a few of them have been grouped.</p>
<h1 id="conclusions">Conclusions</h1>
<p>While it may sound trivial, having a coherent and easy-to-understand command line interface is key to making the project approachable by mere mortals. That’s why I strongly believe the refactoring of the commands in <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> is going to play a very important role in the adoption of the connection pooler.</p>
Installing pgBackRest on Amazon Linux (by sources)2023-10-23T00:00:00+00:00https://fluca1978.github.io/2023/10/23/pgbackrestAmazonLinux<p>A recap on how to compile pgBackRest on Amazon Linux.</p>
<h1 id="installing-pgbackrest-on-amazon-linux-by-sources">Installing pgBackRest on Amazon Linux (by sources)</h1>
<p>I had the need to install <a href="https://pgbackrest.org/" target="_blank">pgBackRest</a> on Amazon Linux machines.</p>
<p>Unluckily, even though <a href="https://aws.amazon.com/linux/amazon-linux-2023" target="_blank">Amazon Linux 2023</a> is a <em>Red-Hat like</em> operating system, I could not get any version of the official <em>PGDG</em> repository to install on it. Therefore, I decided to install from sources, compiling version <code class="language-plaintext highlighter-rouge">2.48</code>, the latest at the time.</p>
<p>In order to achieve the final result, I had to install the following packages:</p>
<p><br />
<br /></p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">sudo </span>dnf <span class="nb">install </span>postgresql15-server-devel.x86_64
<span class="nv">$ </span><span class="nb">sudo </span>dnf <span class="nb">install </span>libxml2-static.x86_64
<span class="nv">$ </span><span class="nb">sudo </span>dnf <span class="nb">install</span> <span class="nt">-y</span> libxml2-devel.x86_64
<span class="nv">$ </span><span class="nb">sudo </span>dnf <span class="nb">install</span> <span class="nt">-y</span> libyaml-devel.x86_64
<span class="nv">$ </span><span class="nb">sudo </span>dnf <span class="nb">install</span> <span class="nt">-y</span> bzip2-devel.x86_64
</code></pre></div></div>
<p><br />
<br /></p>
<p>After this, I was able to download and compile <code class="language-plaintext highlighter-rouge">pgBackRest</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>wget https://github.com/pgbackrest/pgbackrest/archive/refs/tags/release/2.48.tar.gz
<span class="nv">$ </span><span class="nb">tar </span>xzf 2.48.tar.gz
<span class="nv">$ </span><span class="nb">cd </span>pgbackrest-2.48/src
<span class="nv">$ </span>./configure <span class="o">&&</span> make <span class="o">&&</span> <span class="nb">sudo </span>make <span class="nb">install</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I tested it, and it works as solidly as only <code class="language-plaintext highlighter-rouge">pgBackRest</code> can!</p>
Using psql Variables to Introspect Your Script2023-10-23T00:00:00+00:00https://fluca1978.github.io/2023/10/23/PostgreSQLPSQLVariablesToMonitorTransactions<p>A little trick to monitor your own running transaction in terms of time and data size.</p>
<h1 id="using-psql-variables-to-introspect-your-script">Using psql Variables to Introspect Your Script</h1>
<p><code class="language-plaintext highlighter-rouge">psql</code> is by far my favourite SQL text client: it has features that even the most expensive database tools do not provide.
One very interesting property of <code class="language-plaintext highlighter-rouge">psql</code> is to support internal variables, pretty much like the variables you can find in a shell.</p>
<p>Since I often find myself running a few queries to get information about a transaction, in terms of time and quantity of data manipulated, and then doing the math manually, I decided that <code class="language-plaintext highlighter-rouge">psql</code> can do this for me by means of variables.</p>
<h2 id="the-use-case-quantitative-data-about-a-transaction">The Use Case: Quantitative Data About a Transaction</h2>
<p>I want to run a <em>long</em> transaction that does some data manipulation and transformation, and I want to get an idea of how much such a transaction is going to <code class="language-plaintext highlighter-rouge">cost</code> me, so that I can estimate how to apply the same transformation in production.</p>
<p>Usually, I begin the transaction having a look at the current time and WAL position, and I do the same at the end of the transaction.
The difference between the two sets of values gives me a <em>hint</em> about the wall clock time and the amount of data (assuming no other activity is going on in the database).
As an example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">testdb</span><span class="o">=></span> <span class="k">BEGIN</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SELECT</span> <span class="n">clock_timestamp</span><span class="p">()</span> <span class="k">AS</span> <span class="n">begin_clock</span>
<span class="n">testdb</span><span class="o">-*></span> <span class="p">,</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="k">AS</span> <span class="n">begin_lsn</span><span class="p">;</span>
<span class="n">begin_clock</span> <span class="o">|</span> <span class="n">begin_lsn</span>
<span class="c1">------------------------------+------------</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">29</span> <span class="mi">10</span><span class="p">:</span><span class="mi">32</span><span class="p">:</span><span class="mi">05</span><span class="p">.</span><span class="mi">51654</span><span class="o">+</span><span class="mi">02</span> <span class="o">|</span> <span class="mi">2</span><span class="o">/</span><span class="n">A39CC3C0</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">t</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="n">testdb</span><span class="o">-*></span> <span class="k">SELECT</span> <span class="s1">'Dummy '</span> <span class="o">||</span> <span class="n">v</span>
<span class="n">testdb</span><span class="o">-*></span> <span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SELECT</span> <span class="n">clock_timestamp</span><span class="p">()</span> <span class="k">AS</span> <span class="n">end_clock</span>
<span class="p">,</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="k">AS</span> <span class="n">end_lsn</span><span class="p">;</span>
<span class="n">end_clock</span> <span class="o">|</span> <span class="n">end_lsn</span>
<span class="c1">-------------------------------+------------</span>
<span class="mi">2023</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">29</span> <span class="mi">10</span><span class="p">:</span><span class="mi">32</span><span class="p">:</span><span class="mi">48</span><span class="p">.</span><span class="mi">511892</span><span class="o">+</span><span class="mi">02</span> <span class="o">|</span> <span class="mi">2</span><span class="o">/</span><span class="n">A81AC000</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">COMMIT</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now that I have the times and WAL lsn positions, I can <em>manually</em> compute the <em>cost</em> of this transaction by copying and pasting the results:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="s1">'2023-09-29 10:32:48.511892+02'</span><span class="p">::</span><span class="nb">timestamp</span>
<span class="o">-</span> <span class="s1">'2023-09-29 10:32:05.51654+02'</span><span class="p">::</span><span class="nb">timestamp</span> <span class="k">AS</span> <span class="n">wall_clock</span>
<span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_wal_lsn_diff</span><span class="p">(</span> <span class="s1">'2/A81AC000'</span><span class="p">,</span> <span class="s1">'2/A39CC3C0'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">as</span> <span class="k">size</span><span class="p">;</span>
<span class="n">wall_clock</span> <span class="o">|</span> <span class="k">size</span>
<span class="c1">-----------------+-------</span>
<span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">42</span><span class="p">.</span><span class="mi">995352</span> <span class="o">|</span> <span class="mi">72</span> <span class="n">MB</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>So the transaction took <code class="language-plaintext highlighter-rouge">42</code> seconds and produced around <code class="language-plaintext highlighter-rouge">72 MB</code> of data (in the WALs).
Note that I had to manually copy and paste every single value in order for the query to compute the difference I want.</p>
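<p>To make the size figure less magical, here is a minimal Bash sketch of the arithmetic behind <code class="language-plaintext highlighter-rouge">pg_wal_lsn_diff</code>. It assumes that an LSN printed as <code class="language-plaintext highlighter-rouge">X/Y</code> is a 64-bit byte position whose high and low 32 bits are the two hexadecimal numbers; the helper names <code class="language-plaintext highlighter-rouge">lsn_to_bytes</code> and <code class="language-plaintext highlighter-rouge">lsn_diff</code> are made up for this example and are not part of PostgreSQL.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an LSN such as 2/A81AC000 into a byte offset.
lsn_to_bytes() {
    local hi="${1%%/*}"   # hexadecimal digits before the slash (high 32 bits)
    local lo="${1#*/}"    # hexadecimal digits after the slash (low 32 bits)
    echo $(( 16#$hi * 4294967296 + 16#$lo ))
}

# Hypothetical helper: byte distance between two LSNs, end position first,
# in the same argument order used with pg_wal_lsn_diff above.
lsn_diff() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lsn_diff 2/A81AC000 2/A39CC3C0   # prints 75365440
```

<p>The result, 75365440 bytes, is roughly 71.9 MiB, which matches the <code class="language-plaintext highlighter-rouge">72 MB</code> that <code class="language-plaintext highlighter-rouge">pg_size_pretty</code> reported above.</p>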
<h2 id="using-psql-variables-to-obtain-the-computation-automatically">Using <code class="language-plaintext highlighter-rouge">psql</code> variables to obtain the computation automatically</h2>
<p>If I store the <em>begin</em> and <em>end</em> values into <code class="language-plaintext highlighter-rouge">psql</code> variables, I can use an immutable query to compute the same results, without having to copy and paste the single values.</p>
<p>This trick is made possible by the special command <code class="language-plaintext highlighter-rouge">\gset</code>, that allows for the declaration and definition of variables out of a query result.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">BEGIN</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SELECT</span> <span class="n">clock_timestamp</span><span class="p">()</span> <span class="k">AS</span> <span class="n">clock</span>
<span class="p">,</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="k">AS</span> <span class="n">lsn</span> <span class="err">\</span><span class="n">gset</span> <span class="n">begin_</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">t</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'Dummy '</span> <span class="o">||</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SELECT</span> <span class="n">clock_timestamp</span><span class="p">()</span> <span class="k">AS</span> <span class="n">clock</span>
<span class="p">,</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="k">AS</span> <span class="n">lsn</span> <span class="err">\</span><span class="n">gset</span> <span class="n">end_</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SELECT</span> <span class="p">:</span><span class="s1">'end_clock'</span><span class="p">::</span><span class="nb">timestamp</span> <span class="o">-</span> <span class="p">:</span><span class="s1">'begin_clock'</span><span class="p">::</span><span class="nb">timestamp</span> <span class="k">as</span> <span class="n">wall_clock</span>
<span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_wal_lsn_diff</span><span class="p">(</span> <span class="p">:</span><span class="s1">'end_lsn'</span><span class="p">,</span> <span class="p">:</span><span class="s1">'begin_lsn'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">as</span> <span class="k">size</span><span class="p">;</span>
<span class="n">wall_clock</span> <span class="o">|</span> <span class="k">size</span>
<span class="c1">-----------------+-------</span>
<span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">11</span><span class="p">.</span><span class="mi">400421</span> <span class="o">|</span> <span class="mi">72</span> <span class="n">MB</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">COMMIT</span><span class="p">;</span>
<span class="k">COMMIT</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The two queries that get the timing and WAL LSN information are similar, and exploit a <code class="language-plaintext highlighter-rouge">\gset begin_</code> and a <code class="language-plaintext highlighter-rouge">\gset end_</code> command respectively. The first command takes the output of the query and, for each column, creates a variable named after the given prefix (<code class="language-plaintext highlighter-rouge">begin_</code>) followed by the column name, therefore <code class="language-plaintext highlighter-rouge">begin_clock</code> and <code class="language-plaintext highlighter-rouge">begin_lsn</code>. The second query does the very same with the prefix <code class="language-plaintext highlighter-rouge">end_</code>, therefore creating the <code class="language-plaintext highlighter-rouge">end_clock</code> and <code class="language-plaintext highlighter-rouge">end_lsn</code> variables.</p>
<p>The interesting part is the last query, which is now totally automated and computes the differences between the <code class="language-plaintext highlighter-rouge">end_</code> and <code class="language-plaintext highlighter-rouge">begin_</code> values (please note the quoting and casting). Thanks to this little trick, I can now place such queries at the boundaries of my scripts and get as output the result I want or need to monitor the transaction.</p>
<p>Clearly, this approach can be extended, so you can have variables to track the number of tuples, the number of tables created or deleted, and so on. The key idea is to have a kind of <em>catch-all</em> set of queries that depend on variables you will define systematically in your scripts.</p>
<h3 id="why-is-the-second-transacction-faster-than-the-first-one">Why is the second transaction faster than the first one?</h3>
<p>In the above example I showed two identical transactions, but the first one is slower, in terms of execution time, than the second one.
The answer is simple: in the first transaction I was literally typing in the SQL statements, while in the second I was recalling them from the <code class="language-plaintext highlighter-rouge">psql</code> history. It is only a matter of typing the statements!</p>
<h1 id="conclusions">Conclusions</h1>
<p>When I do professional training and present the <code class="language-plaintext highlighter-rouge">psql</code> command line client, I see disappointment on my trainees’ faces. However, the more I go on explaining how flexible and powerful <code class="language-plaintext highlighter-rouge">psql</code> is, the more the classroom likes it.
Thanks to the capability of automagically setting variables from a query output, <code class="language-plaintext highlighter-rouge">psql</code> allows you to automate some tasks, including your own script introspection.</p>
pgagroal command refactoring2023-10-16T00:00:00+00:00https://fluca1978.github.io/2023/10/16/pgagroalCommandRefactoring<p>A new, cleaner, set of commands for <code class="language-plaintext highlighter-rouge">pgagroal</code>.</p>
<h1 id="pgagroal-command-refactoring">pgagroal command refactoring</h1>
<p>It took me more than one year to get <a href="https://github.com/agroal/pgagroal/commit/ade40240317bad155dbf1e40866c96257b688b90" target="_blank">this patch in</a>! The reason was not that this piece of code is particularly complex, but rather that it touches pretty much the whole human interface that <code class="language-plaintext highlighter-rouge">pgagroal</code> exposes to the user.</p>
<p>When I started using <a href="https://github.com/agroal/pgagroal" target="_blank">pgagroal</a> I felt uncomfortable with its command line interface.
Commands had been added as the project improved, but there was no clear grouping of related commands, and most of them had a weird meaning, at least to me.</p>
<p>As an example, the <code class="language-plaintext highlighter-rouge">reset</code> command was dealing with the Prometheus reset, while <code class="language-plaintext highlighter-rouge">reset-server</code> was truly dealing with <code class="language-plaintext highlighter-rouge">pgagroal</code> reset. That sounded weird to me, since I believe <code class="language-plaintext highlighter-rouge">pgagroal</code> should deal first with itself, and then with other components, so the order of the commands appeared wrong to me.</p>
<p>Another example was the <code class="language-plaintext highlighter-rouge">flush-xxx</code> set of commands: <code class="language-plaintext highlighter-rouge">flush-gracefully</code>, <code class="language-plaintext highlighter-rouge">flush-all</code> and <code class="language-plaintext highlighter-rouge">flush-idle</code>. Why not group those commands into a big <code class="language-plaintext highlighter-rouge">flush</code> group and then add as a parameter what to effectively flush?</p>
<p>You probably get the point behind my rants. That’s why I started to develop <a href="https://github.com/agroal/pgagroal/commit/ade40240317bad155dbf1e40866c96257b688b90" target="_blank">this patch</a> to refactor the command line of <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> and <code class="language-plaintext highlighter-rouge">pgagroal-admin</code> to:</p>
<ul>
<li>have command groups</li>
<li>provide more concise and sane defaults</li>
<li>handle commands and subcommands, like <code class="language-plaintext highlighter-rouge">git</code> and other command line oriented tools do.</li>
</ul>
<p>Since this change would have broken the existing command line, I also decided to keep accepting the <em>old</em> commands and to emit warnings. Therefore I developed this patch so that it parses the new set of commands as well as the old ones, printing out a warning if the user is still using the old ones. This makes backward compatibility easy, and pushes users towards the new set of commands, so that the future removal of the old commands will not break external tools or scripts.</p>
<p>I’m not going to discuss the whole set of new commands, since the <a href="https://github.com/agroal/pgagroal/blob/master/doc/CLI.md" target="_blank">documentation</a> already does this task. However, just to give you an idea of what the commands now look like, consider the <code class="language-plaintext highlighter-rouge">flush</code> commands:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># old commands</span>
<span class="nv">$ </span>pgagroal-cli flush-all
<span class="nv">$ </span>pgagroal-cli flush-idle
<span class="nv">$ </span>pgagroal-cli flush-gracefully
<span class="c"># new commands</span>
<span class="nv">$ </span>pgagroal-cli flush all
<span class="nv">$ </span>pgagroal-cli flush idle
<span class="nv">$ </span>pgagroal-cli flush gracefully
<span class="nv">$ </span>pgagroal-cli flush <span class="c"># same as flush gracefully</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the new command is just <code class="language-plaintext highlighter-rouge">flush</code> and it accepts a <em>subcommand</em> that can be one of <code class="language-plaintext highlighter-rouge">idle</code>, <code class="language-plaintext highlighter-rouge">all</code> or <code class="language-plaintext highlighter-rouge">gracefully</code>, and that specifies the <em>mode</em> in which to execute the <code class="language-plaintext highlighter-rouge">flush</code> command. This also introduces a new default behavior: if the user does not specify how to flush, the graceful mode is automatically selected.</p>
<p>If the user types in the old command, <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> will emit a warning like the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgagroal-cli flush-idle
WARN: <span class="nb">command</span> <flush-idle> has been deprecated by <flush idle> since version 1.6.0
</code></pre></div></div>
<p><br />
<br /></p>
<p>In this way, we keep compatibility with previous versions while trying to teach the users the new way to execute a command.</p>
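<p>The idea of accepting both spellings, with a default subcommand, can be sketched in a few lines of Bash. This is only an illustration of the approach: <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> is written in C, and the helper names <code class="language-plaintext highlighter-rouge">normalize_command</code> and <code class="language-plaintext highlighter-rouge">is_deprecated</code> are hypothetical.</p>

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not taken from pgagroal.

# Print the new "group subcommand" form for the command given as arguments.
normalize_command() {
    case "$1" in
        flush-all)        echo "flush all" ;;
        flush-idle)       echo "flush idle" ;;
        flush-gracefully) echo "flush gracefully" ;;
        flush)            echo "flush ${2:-gracefully}" ;;  # default mode
        *)                echo "$*" ;;                      # left untouched
    esac
}

# Succeed only for the old, dash-separated spellings.
is_deprecated() {
    case "$1" in
        flush-all|flush-idle|flush-gracefully) return 0 ;;
        *)                                     return 1 ;;
    esac
}
```

<p>A caller would first check <code class="language-plaintext highlighter-rouge">is_deprecated</code> to decide whether to print the warning, and then always execute the normalized form, so both the old and the new spelling end up on the same code path.</p>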
<p><strong>Sooner or later, old commands will be removed!</strong> Therefore users should start updating their tools and scripts to the new interface.</p>
<p>The <code class="language-plaintext highlighter-rouge">conf</code> set of commands is probably the one that groups the most subcommands. In fact, the <code class="language-plaintext highlighter-rouge">conf</code> command includes the <code class="language-plaintext highlighter-rouge">set</code>, the <code class="language-plaintext highlighter-rouge">get</code> and <code class="language-plaintext highlighter-rouge">reload</code> commands that were respectively <code class="language-plaintext highlighter-rouge">config-set</code>, <code class="language-plaintext highlighter-rouge">config-get</code> and <code class="language-plaintext highlighter-rouge">reload</code>. Note how the reload subcommand now makes it clearer what is going to be reloaded (i.e., the <code class="language-plaintext highlighter-rouge">conf</code>).
In other words, in my opinion, <code class="language-plaintext highlighter-rouge">pgagroal-cli conf reload</code> is much clearer and less error-prone than <code class="language-plaintext highlighter-rouge">pgagroal-cli reload</code>.</p>
<p>Some commands changed their name, and this was due to a clash in the default actions. For example, the <code class="language-plaintext highlighter-rouge">reset-server</code> and <code class="language-plaintext highlighter-rouge">reset</code> commands have been moved into the <code class="language-plaintext highlighter-rouge">clear</code> group, as <code class="language-plaintext highlighter-rouge">clear server</code> (the default) and <code class="language-plaintext highlighter-rouge">clear prometheus</code> respectively.</p>
<p>Similarly, also <code class="language-plaintext highlighter-rouge">pgagroal-admin</code> has been updated, so that for instance <code class="language-plaintext highlighter-rouge">pgagroal-admin add-user</code> is now <code class="language-plaintext highlighter-rouge">pgagroal-admin user add</code>.</p>
<p>Thanks to the very well structured <code class="language-plaintext highlighter-rouge">pgagroal</code> source code, and to the introduction of a few new utility functions to handle command line arguments, these changes will allow the introduction of new commands and groups with a more consistent command line experience!</p>
pgenv version 1.3.3 released2023-09-28T00:00:00+00:00https://fluca1978.github.io/2023/09/28/pgenv133<p>A new release for the PostgreSQL binary manager.</p>
<h1 id="pgenv-version-133-released">pgenv version 1.3.3 released</h1>
<p><a href="https://github.com/theory/pgenv/releases/tag/v1.3.3" target="_blank">pgenv release 1.3.3</a> is now available.</p>
<p>This release introduces two main environment variables to instruct the application about configuration files.</p>
<p>The first variable is <code class="language-plaintext highlighter-rouge">PGENV_CONFIGURATION_FILE</code>: this variable can be set to <em>force</em> <code class="language-plaintext highlighter-rouge">pgenv</code> to use a custom configuration file without having to <em>guess</em> which file to use depending on the specific PostgreSQL version in use.
By default, <code class="language-plaintext highlighter-rouge">pgenv</code> looks for a configuration file named after the PostgreSQL version, or if not found, a <code class="language-plaintext highlighter-rouge">default.conf</code> configuration file. Using the above variable, it is now possible to pass information to <code class="language-plaintext highlighter-rouge">pgenv</code> about where a configuration is, and <strong>this allows for the same configuration file to be used over and over without any regard to the PostgreSQL version</strong>.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">export </span><span class="nv">PGENV_CONFIGURATION_FILE</span><span class="o">=</span>~/git/dot-files/pgenv/luca.conf
% pgenv rebuild 16.0
Using PGENV_ROOT /home/luca/git/pgenv
<span class="o">[</span>DEBUG] Configuration file forced by environment variable PGENV_CONFIGURATION_FILE <span class="o">=</span> /home/luca/git/dot-files/pgenv/luca.conf
<span class="o">[</span>DEBUG] Configuration file forced by environment variable PGENV_CONFIGURATION_FILE <span class="o">=</span> /home/luca/git/dot-files/pgenv/luca.conf
<span class="o">[</span>DEBUG] Looking <span class="k">for </span>configuration <span class="k">in</span> /home/luca/git/dot-files/pgenv/luca.conf
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, <code class="language-plaintext highlighter-rouge">pgenv</code> will now use the specified configuration file.</p>
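<p>The lookup order described above can be sketched as a small Bash function. This is not the code that <code class="language-plaintext highlighter-rouge">pgenv</code> actually runs: the function name <code class="language-plaintext highlighter-rouge">find_config_file</code>, the <code class="language-plaintext highlighter-rouge">VERSION.conf</code> naming of per-version files, and the fallback root directory are assumptions made for the sake of the example.</p>

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the lookup order; not taken from the pgenv sources.
# Assumption: per-version files are named VERSION.conf under PGENV_ROOT/config.
find_config_file() {
    local version="$1"
    local dir="${PGENV_ROOT:-$HOME/.pgenv}/config"   # fallback root is assumed

    if [ -n "${PGENV_CONFIGURATION_FILE:-}" ]; then
        echo "$PGENV_CONFIGURATION_FILE"   # forced by the environment variable
    elif [ -f "$dir/$version.conf" ]; then
        echo "$dir/$version.conf"          # file named after the version
    else
        echo "$dir/default.conf"           # last resort: the default file
    fi
}
```

<p>With the variable exported as in the example above, such a function would answer with the forced file for every version, which is exactly what makes a single configuration file reusable across builds.</p>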
<p>The other variable added is <code class="language-plaintext highlighter-rouge">PGENV_WRITE_CONFIGURATION_FILE_AUTOMATICALLY</code>: if set to a <em>false</em> value (e.g., <code class="language-plaintext highlighter-rouge">0</code>, <code class="language-plaintext highlighter-rouge">no</code>), it prevents <code class="language-plaintext highlighter-rouge">pgenv</code> from writing or overwriting a configuration file once a <code class="language-plaintext highlighter-rouge">build</code> or <code class="language-plaintext highlighter-rouge">rebuild</code> is completed. If the variable is not set at all, or is set to a true value (e.g., <code class="language-plaintext highlighter-rouge">1</code>, <code class="language-plaintext highlighter-rouge">yes</code>), <code class="language-plaintext highlighter-rouge">pgenv</code> writes or overwrites the configuration file at the end of the build phase, which is the behavior of previous releases.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">export </span><span class="nv">PGENV_WRITE_CONFIGURATION_FILE_AUTOMATICALLY</span><span class="o">=</span>no
% pgenv rebuild 16.0
...
<span class="o">[</span>DEBUG] Not writing config file automatically: <span class="nb">set</span> <span class="sb">`</span>PGENV_WRITE_CONFIGURATION_FILE_AUTOMATICALLY<span class="sb">`</span> to a <span class="nb">true </span>value to <span class="nb">enable </span>the automatic file writing
PostgreSQL 16.0 built
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see from the above, <code class="language-plaintext highlighter-rouge">pgenv</code> reports that it is not writing the configuration file for this build.
Thanks to this option, you can be sure your carefully crafted configuration file will never be overwritten accidentally (please note that <code class="language-plaintext highlighter-rouge">pgenv</code> always makes a backup copy before overwriting an existing file).</p>
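<p>The true/false rule for this variable can be illustrated with a tiny Bash check. It is a sketch under assumptions: the function name <code class="language-plaintext highlighter-rouge">should_write_config</code> is hypothetical, and the exact list of values that <code class="language-plaintext highlighter-rouge">pgenv</code> treats as false may differ from the ones used here.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the configuration file may be written.
# Assumption: an unset or empty variable counts as true, and only the listed
# values count as false; the real pgenv may recognize a different set.
should_write_config() {
    case "${PGENV_WRITE_CONFIGURATION_FILE_AUTOMATICALLY:-yes}" in
        0|no|NO|false|FALSE) return 1 ;;
        *)                   return 0 ;;
    esac
}
```

<p>A build script would call it right before saving the file, for example <code class="language-plaintext highlighter-rouge">if should_write_config; then ... fi</code>, and skip the write otherwise.</p>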
<p>Last, the new subcommand <code class="language-plaintext highlighter-rouge">config path</code> has been added: the idea is to show the user where <code class="language-plaintext highlighter-rouge">pgenv</code> expects to find a configuration file, that is, which file it is going to use without any custom changes.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv config path
Using PGENV_ROOT /home/luca/git/pgenv
/home/luca/git/pgenv/config/default.conf
</code></pre></div></div>
<p><br />
<br /></p>
Functions to Validate User's Input2023-09-25T00:00:00+00:00https://fluca1978.github.io/2023/09/25/PostgreSQLInputValidation<p>PostgreSQL 16 introduces a couple of functions to validate user’s input.</p>
<h1 id="functions-to-validate-users-input">Functions to Validate User’s Input</h1>
<p>PostgreSQL 16 introduces a couple of new built-in functions: <a href="https://www.postgresql.org/docs/16/functions-info.html#FUNCTIONS-INFO-VALIDITY-TABLE" target="_blank"><code class="language-plaintext highlighter-rouge">pg_input_is_valid</code></a> and <a href="https://www.postgresql.org/docs/16/functions-info.html#FUNCTIONS-INFO-VALIDITY-TABLE" target="_blank"><code class="language-plaintext highlighter-rouge">pg_input_error_info</code></a>.</p>
<p>Both functions accept a couple of strings, the first one being the value to be validated, and the second one being the type to which <em>you want to cast the value</em>. This can be useful because you can check ahead of time if a given value (expressed as a string) can be converted into a specific data type without raising an exception.</p>
<p>The first use case that comes into my mind is the conversion of some stringified date into an effective date, for example when importing data from an external source like a text file. Let’s see this in action:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">pg_input_is_valid</span><span class="p">(</span> <span class="s1">'1978-07-19'</span><span class="p">,</span> <span class="s1">'timestamp'</span> <span class="p">);</span>
<span class="n">pg_input_is_valid</span>
<span class="c1">-------------------</span>
<span class="n">t</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">pg_input_error_info</span><span class="p">(</span> <span class="s1">'1978-07-19'</span><span class="p">,</span> <span class="s1">'timestamp'</span> <span class="p">);</span>
<span class="n">message</span> <span class="o">|</span> <span class="n">detail</span> <span class="o">|</span> <span class="n">hint</span> <span class="o">|</span> <span class="n">sql_error_code</span>
<span class="c1">---------+--------+------+----------------</span>
<span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>With a valid date, the <code class="language-plaintext highlighter-rouge">pg_input_is_valid</code> function returns true and <code class="language-plaintext highlighter-rouge">pg_input_error_info</code> returns a single row with all of its fields empty (NULL).
But what happens if the date is in a wrong format?</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">x</span>
<span class="n">Expanded</span> <span class="n">display</span> <span class="k">is</span> <span class="k">on</span><span class="p">.</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">pg_input_is_valid</span><span class="p">(</span> <span class="s1">'1978-19-07'</span><span class="p">,</span> <span class="s1">'timestamp'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-----+--</span>
<span class="n">pg_input_is_valid</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">pg_input_error_info</span><span class="p">(</span> <span class="s1">'1978-19-07'</span><span class="p">,</span> <span class="s1">'timestamp'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">--+--------------------------------------------------</span>
<span class="n">message</span> <span class="o">|</span> <span class="nb">date</span><span class="o">/</span><span class="nb">time</span> <span class="n">field</span> <span class="n">value</span> <span class="k">out</span> <span class="k">of</span> <span class="k">range</span><span class="p">:</span> <span class="nv">"1978-19-07"</span>
<span class="n">detail</span> <span class="o">|</span>
<span class="n">hint</span> <span class="o">|</span> <span class="n">Perhaps</span> <span class="n">you</span> <span class="n">need</span> <span class="n">a</span> <span class="n">different</span> <span class="nv">"datestyle"</span> <span class="n">setting</span><span class="p">.</span>
<span class="n">sql_error_code</span> <span class="o">|</span> <span class="mi">22008</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see from the above example, passing a wrong date/time format does not raise an error: the functions report the problem instead, so we are now able to discover what would go wrong before the value is actually used.</p>
<p>Another example, just to clarify more:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_input_error_info</span><span class="p">(</span> <span class="s1">'4 months'</span><span class="p">,</span> <span class="s1">'interval'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------+------</span>
<span class="n">pg_input_error_info</span> <span class="o">|</span> <span class="p">(,,,)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_input_error_info</span><span class="p">(</span> <span class="s1">'4 mesi'</span><span class="p">,</span> <span class="s1">'interval'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------+---------------------------------------------------------------</span>
<span class="n">pg_input_error_info</span> <span class="o">|</span> <span class="p">(</span><span class="nv">"invalid input syntax for type interval: </span><span class="se">""</span><span class="nv">4 mesi</span><span class="se">""</span><span class="nv">"</span><span class="p">,,,</span><span class="mi">22007</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is therefore quite easy to use such checks in your own functions:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">input_check</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span><span class="p">[]</span> <span class="p">)</span><span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="k">current</span> <span class="nb">text</span><span class="p">;</span> <span class="n">ok</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">e</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">FOREACH</span> <span class="k">current</span> <span class="k">IN</span> <span class="n">ARRAY</span> <span class="n">t</span> <span class="n">LOOP</span>
<span class="n">IF</span> <span class="n">pg_input_is_valid</span><span class="p">(</span> <span class="k">current</span><span class="p">,</span> <span class="s1">'date'</span> <span class="p">)</span> <span class="k">THEN</span>
<span class="n">ok</span> <span class="p">:</span><span class="o">=</span> <span class="n">ok</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="k">SELECT</span> <span class="n">message</span>
<span class="k">INTO</span> <span class="n">e</span>
<span class="k">FROM</span> <span class="n">pg_input_error_info</span><span class="p">(</span> <span class="k">current</span><span class="p">,</span> <span class="s1">'date'</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Skipping [%] because is not valid: %'</span><span class="p">,</span> <span class="k">current</span><span class="p">,</span> <span class="n">e</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">ok</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">FUNCTION</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>that, once invoked with the following input, provides the result as shown below:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">input_check</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span> <span class="s1">'2023-09-25'</span><span class="p">,</span> <span class="s1">'luca'</span><span class="p">,</span> <span class="s1">'0001-01-01'</span><span class="p">,</span> <span class="s1">'Sat 23 Sep 2023'</span><span class="p">,</span> <span class="s1">'Feb 30 2023'</span> <span class="p">]</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Skipping</span> <span class="p">[</span><span class="n">luca</span><span class="p">]</span> <span class="n">because</span> <span class="k">is</span> <span class="k">not</span> <span class="k">valid</span><span class="p">:</span> <span class="n">invalid</span> <span class="k">input</span> <span class="n">syntax</span> <span class="k">for</span> <span class="k">type</span> <span class="nb">date</span><span class="p">:</span> <span class="nv">"luca"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Skipping</span> <span class="p">[</span><span class="n">Feb</span> <span class="mi">30</span> <span class="mi">2023</span><span class="p">]</span> <span class="n">because</span> <span class="k">is</span> <span class="k">not</span> <span class="k">valid</span><span class="p">:</span> <span class="nb">date</span><span class="o">/</span><span class="nb">time</span> <span class="n">field</span> <span class="n">value</span> <span class="k">out</span> <span class="k">of</span> <span class="k">range</span><span class="p">:</span> <span class="nv">"Feb 30 2023"</span>
<span class="n">input_check</span>
<span class="c1">-------------</span>
<span class="mi">3</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
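<p>As a further sketch of the text-file import use case mentioned at the beginning, the two functions can split staged rows into those that convert cleanly and those that do not. The <code class="language-plaintext highlighter-rouge">staging</code> table and its <code class="language-plaintext highlighter-rouge">raw_date</code> column are hypothetical names introduced only for this illustration:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Hypothetical staging table holding raw text read from an external file.
CREATE TEMP TABLE staging( raw_date text );
INSERT INTO staging VALUES ( '2023-09-25' ), ( 'luca' ), ( 'Feb 30 2023' );

-- Rows that can be cast to a date: the check keeps invalid strings out.
SELECT raw_date::date AS good_date
FROM   staging
WHERE  pg_input_is_valid( raw_date, 'date' );

-- Rows that would fail the cast, along with the reason reported by PostgreSQL.
SELECT s.raw_date, e.message, e.sql_error_code
FROM   staging s, pg_input_error_info( s.raw_date, 'date' ) e
WHERE  NOT pg_input_is_valid( s.raw_date, 'date' );
</code></pre></div></div>
<p><br />
<br /></p>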
psql \watch improvements2023-09-22T00:00:00+00:00https://fluca1978.github.io/2023/09/22/PostgreSQL16psqlwatch<p>A nice addition to the command \watch in the PostgreSQL command line client.</p>
<h1 id="psql-watch-improvements">psql \watch improvements</h1>
<p><code class="language-plaintext highlighter-rouge">psql</code> is the best command line SQL client ever, and it gets improved constantly. With the new <strong>release of PostgreSQL 16</strong>, <code class="language-plaintext highlighter-rouge">psql</code> also gets a nice new addition: <em>the capability to stop a <code class="language-plaintext highlighter-rouge">\watch</code> command loop after a specific number of iterations</em>.</p>
<p><br /></p>
<p>In this article I briefly show how the new feature works.</p>
<h2 id="what-is-watch">What is <code class="language-plaintext highlighter-rouge">\watch</code>?</h2>
<p>The special command <code class="language-plaintext highlighter-rouge">\watch</code> is similar to the Unix command line utility <code class="language-plaintext highlighter-rouge">watch(1)</code>: it repeats a specific command (in this case, an SQL statement) at regular time intervals.
<br />
I tend to use this, for example, when I want to monitor some progress or some catalogs: I write the query that produces the result I want to observe, and then use <code class="language-plaintext highlighter-rouge">\watch</code> to schedule regular repetitions of the query. For instance:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_stat_progress_cluster</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="err">\</span><span class="n">watch</span> <span class="mi">5</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above example will show me the progress of any running <code class="language-plaintext highlighter-rouge">CLUSTER</code> or <code class="language-plaintext highlighter-rouge">VACUUM FULL</code>, with a refresh rate of 5 seconds.</p>
<p><br /></p>
<p>One problem with the <code class="language-plaintext highlighter-rouge">\watch</code> command is that it loops forever, meaning you need to manually stop it (e.g., <code class="language-plaintext highlighter-rouge">CTRL-c</code>).
Another approach is to raise an exception when the query has to stop. As a nasty example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">WITH</span> <span class="n">exit</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">AS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">FROM</span> <span class="n">pg_stat_progress_cluster</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">p</span><span class="p">.</span><span class="o">*</span>
<span class="k">FROM</span> <span class="n">pg_stat_progress_cluster</span> <span class="n">p</span><span class="p">,</span> <span class="n">exit</span> <span class="n">e</span>
<span class="k">WHERE</span> <span class="mi">1</span> <span class="o">/</span> <span class="n">e</span><span class="p">.</span><span class="n">x</span> <span class="o">></span> <span class="mi">0</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above query will raise a <code class="language-plaintext highlighter-rouge">division by zero</code> as soon as there are no more entries in the <code class="language-plaintext highlighter-rouge">pg_stat_progress_cluster</code> view, and this will in turn stop the <code class="language-plaintext highlighter-rouge">\watch</code> command:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="err">\</span><span class="n">watch</span> <span class="mi">1</span>
<span class="n">Wed</span> <span class="mi">20</span> <span class="n">Sep</span> <span class="mi">2023</span> <span class="mi">09</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">18</span> <span class="n">PM</span> <span class="n">CEST</span> <span class="p">(</span><span class="k">every</span> <span class="mi">1</span><span class="n">s</span><span class="p">)</span>
<span class="n">pid</span> <span class="o">|</span> <span class="n">datid</span> <span class="o">|</span> <span class="n">datname</span> <span class="o">|</span> <span class="n">relid</span> <span class="o">|</span> <span class="n">command</span> <span class="o">|</span> <span class="n">phase</span> <span class="o">|</span> <span class="n">cluster_index_relid</span> <span class="o">|</span> <span class="n">heap_tuples_scanned</span> <span class="o">|</span> <span class="n">heap_tuples_written</span> <span class="o">|</span> <span class="n">heap_blks_total</span> <span class="o">|</span> <span class="n">heap_blks_scanned</span> <span class="o">|</span> <span class="n">index_rebuild_count</span>
<span class="c1">-------+-------+---------+-------+-------------+------------------+---------------------+---------------------+---------------------+-----------------+-------------------+---------------------</span>
<span class="mi">78406</span> <span class="o">|</span> <span class="mi">16385</span> <span class="o">|</span> <span class="n">testdb</span> <span class="o">|</span> <span class="mi">2612</span> <span class="o">|</span> <span class="k">VACUUM</span> <span class="k">FULL</span> <span class="o">|</span> <span class="n">rebuilding</span> <span class="k">index</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">6</span> <span class="o">|</span> <span class="mi">6</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">0</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">Wed</span> <span class="mi">20</span> <span class="n">Sep</span> <span class="mi">2023</span> <span class="mi">09</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">19</span> <span class="n">PM</span> <span class="n">CEST</span> <span class="p">(</span><span class="k">every</span> <span class="mi">1</span><span class="n">s</span><span class="p">)</span>
<span class="n">pid</span> <span class="o">|</span> <span class="n">datid</span> <span class="o">|</span> <span class="n">datname</span> <span class="o">|</span> <span class="n">relid</span> <span class="o">|</span> <span class="n">command</span> <span class="o">|</span> <span class="n">phase</span> <span class="o">|</span> <span class="n">cluster_index_relid</span> <span class="o">|</span> <span class="n">heap_tuples_scanned</span> <span class="o">|</span> <span class="n">heap_tuples_written</span> <span class="o">|</span> <span class="n">heap_blks_total</span> <span class="o">|</span> <span class="n">heap_blks_scanned</span> <span class="o">|</span> <span class="n">index_rebuild_count</span>
<span class="c1">-------+-------+---------+-------+-------------+--------------------------+---------------------+---------------------+---------------------+-----------------+-------------------+---------------------</span>
<span class="mi">78406</span> <span class="o">|</span> <span class="mi">16385</span> <span class="o">|</span> <span class="n">testdb</span> <span class="o">|</span> <span class="mi">3603</span> <span class="o">|</span> <span class="k">VACUUM</span> <span class="k">FULL</span> <span class="o">|</span> <span class="n">performing</span> <span class="k">final</span> <span class="n">cleanup</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">551</span> <span class="o">|</span> <span class="mi">551</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">1</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">division</span> <span class="k">by</span> <span class="n">zero</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>While the above approach is, in my opinion, ugly, it serves to stop <code class="language-plaintext highlighter-rouge">\watch</code> <em>sometime in the future</em>, so it could be useful to collect historical information without having a bunch of empty executions.</p>
<h2 id="the-new-watch-count-option">The new <code class="language-plaintext highlighter-rouge">\watch</code> count option</h2>
<p>Starting from <code class="language-plaintext highlighter-rouge">psql</code> version 16, the <code class="language-plaintext highlighter-rouge">\watch</code> command has an option to indicate after how many iterations it has to stop on its own. The online help of the command is now as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="o">?</span>
<span class="k">General</span>
<span class="p">...</span>
<span class="err">\</span><span class="n">watch</span> <span class="p">[[</span><span class="n">i</span><span class="o">=</span><span class="p">]</span><span class="n">SEC</span><span class="p">]</span> <span class="p">[</span><span class="k">c</span><span class="o">=</span><span class="n">N</span><span class="p">]</span> <span class="k">execute</span> <span class="n">query</span> <span class="k">every</span> <span class="n">SEC</span> <span class="n">seconds</span><span class="p">,</span> <span class="n">up</span> <span class="k">to</span> <span class="n">N</span> <span class="n">times</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>So it is now possible to ask <code class="language-plaintext highlighter-rouge">\watch</code> to repeat the statement, for instance, 7 times at 2-second intervals:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">watch</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span> <span class="k">c</span><span class="o">=</span><span class="mi">7</span>
<span class="p">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note that, if you want to specify the iteration count, you have to use the named parameter <code class="language-plaintext highlighter-rouge">c</code>, or the command will not understand your intention. The interval parameter, on the other hand, does not need to be named. To summarize:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">\watch 2</code> is valid and will repeat the command every 2 seconds without stopping;</li>
<li><code class="language-plaintext highlighter-rouge">\watch 2 c=7</code> is valid and will repeat the command every 2 seconds, stopping after 7 iterations;</li>
<li><code class="language-plaintext highlighter-rouge">\watch i=2 c=7</code> ditto;</li>
<li><code class="language-plaintext highlighter-rouge">\watch c=7 i=2</code> ditto;</li>
<li><code class="language-plaintext highlighter-rouge">\watch 2 7</code> is <em>not valid</em> because both numbers will be considered as an interval and <code class="language-plaintext highlighter-rouge">psql</code> will abort with <code class="language-plaintext highlighter-rouge">\watch: interval value is specified more than once</code>.</li>
</ul>
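<p>As a rough sketch, the count option can bound the monitoring session from the first section without resorting to the division-by-zero trick. The values used here (one second, sixty iterations) are arbitrary assumptions, and unlike the exception-based approach the loop ends after the given number of iterations whether or not the monitored operation has finished:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- The query to observe: progress of a running CLUSTER or VACUUM FULL.
SELECT * FROM pg_stat_progress_cluster;

-- Repeat the query every second, for at most 60 iterations (psql 16 or later).
\watch i=1 c=60
</code></pre></div></div>
<p><br />
<br /></p>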
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">psql</code> is a command line client with a lot of features that help you interact with PostgreSQL, and it gets improved release after release. In my experience, there is no other command line client with as many features as <code class="language-plaintext highlighter-rouge">psql</code>, and even a small addition to <code class="language-plaintext highlighter-rouge">\watch</code> makes it more valuable.</p>
FOR loops automatically declared variables in PL/PgSQL2023-09-19T00:00:00+00:00https://fluca1978.github.io/2023/09/19/PostgreSQLFORAutodeclaredVariables<p>In PL/PgSQL FOR loops the iterator is automatically declared, and this could bring some problems.</p>
<h1 id="for-loops-automatically-declared-variables-in-plpgsql">FOR loops automatically declared variables in PL/PgSQL</h1>
<p>Consider the following simple function that returns a table made of three columns:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="k">public</span><span class="p">.</span><span class="n">a_table</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span><span class="p">,</span> <span class="n">j</span> <span class="nb">int</span><span class="p">,</span> <span class="n">k</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="k">FOR</span> <span class="n">j</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="k">FOR</span> <span class="n">k</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'i=%, j=%, k=%'</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span>
<span class="k">VOLATILE</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>What is the result of invoking the above function?
<br />
Depending on how well you know the <code class="language-plaintext highlighter-rouge">FOR</code> loop in PL/PgSQL, it could be surprising:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">a_table</span><span class="p">();</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">i</span> <span class="o">|</span> <span class="n">j</span> <span class="o">|</span> <span class="n">k</span>
<span class="c1">---+---+---</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Why do the returned rows contain only NULL values even if the loop variables have values?
<br />
Because the <code class="language-plaintext highlighter-rouge">FOR</code> iterator is automatically declared and scoped to the loop itself. The <a href="https://www.postgresql.org/docs/16/plpgsql-control-structures.html" target="_blank">PostgreSQL Documentation</a> explains it:</p>
<blockquote>
<p>The variable name is automatically defined as type integer and exists only inside the loop (any existing definition of the variable name is ignored within the loop).</p>
</blockquote>
<p>It should be clear that I’m referring to the <em>integer <code class="language-plaintext highlighter-rouge">FOR</code> loop variant</em> here. However, the problem is that while <code class="language-plaintext highlighter-rouge">i</code>, <code class="language-plaintext highlighter-rouge">j</code> and <code class="language-plaintext highlighter-rouge">k</code> are defined as variables for the function (the returned columns), the <code class="language-plaintext highlighter-rouge">FOR</code> loops create variables with the same names but an inner scope, so that within the loops it is not possible to refer to the returned columns by their bare names, and <code class="language-plaintext highlighter-rouge">RETURN NEXT</code> emits them still unset.</p>
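<p>One possible way around the name clash, sketched here under the assumption that renaming the loop iterators is acceptable, is to give the iterators names that differ from the output columns and assign the columns explicitly before each <code class="language-plaintext highlighter-rouge">RETURN NEXT</code>. The function name <code class="language-plaintext highlighter-rouge">a_table_fixed</code> and the iterators <code class="language-plaintext highlighter-rouge">ii</code>, <code class="language-plaintext highlighter-rouge">jj</code> and <code class="language-plaintext highlighter-rouge">kk</code> are made up for this illustration:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE OR REPLACE FUNCTION
public.a_table_fixed()
RETURNS TABLE( i int, j int, k int )
AS $CODE$
BEGIN
  -- The iterators use names that do not clash with the output columns.
  FOR ii IN 1 .. 2 LOOP
    FOR jj IN 1 .. 2 LOOP
      FOR kk IN 1 .. 2 LOOP
        -- Copy the loop values into the output columns,
        -- so that RETURN NEXT emits a populated row.
        i := ii;
        j := jj;
        k := kk;
        RETURN NEXT;
      END LOOP;
    END LOOP;
  END LOOP;
END
$CODE$
LANGUAGE plpgsql
VOLATILE
;
</code></pre></div></div>
<p><br />
<br /></p>
<p>With this variant, <code class="language-plaintext highlighter-rouge">select * from a_table_fixed();</code> should return eight rows, one for each combination of the three loop values.</p>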
<p><br /></p>
<p>Please note that the problem is not caught even with <a href="https://www.postgresql.org/docs/16/plpgsql-development-tips.html" target="_blank">the warnings about shadowed variables</a>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SET</span> <span class="n">plpgsql</span><span class="p">.</span><span class="n">extra_warnings</span> <span class="k">TO</span> <span class="s1">'shadowed_variables'</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">a_table</span><span class="p">();</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">i</span> <span class="o">|</span> <span class="n">j</span> <span class="o">|</span> <span class="n">k</span>
<span class="c1">---+---+---</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="o">|</span> <span class="o">|</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore, so far, the most straightforward solution is to choose iterator names that do not clash with the returning columns, and of course to set the returning variables accordingly:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="k">public</span><span class="p">.</span><span class="n">a_table</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span><span class="p">,</span> <span class="n">j</span> <span class="nb">int</span><span class="p">,</span> <span class="n">k</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">ii</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="k">FOR</span> <span class="n">jj</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="k">FOR</span> <span class="n">kk</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="n">i</span> <span class="p">:</span><span class="o">=</span> <span class="n">ii</span><span class="p">;</span>
<span class="n">j</span> <span class="p">:</span><span class="o">=</span> <span class="n">jj</span><span class="p">;</span>
<span class="n">k</span> <span class="p">:</span><span class="o">=</span> <span class="n">kk</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'i=%, j=%, k=%'</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span>
<span class="k">VOLATILE</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above does in fact produce what you are probably expecting:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">a_table</span><span class="p">();</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">i</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">j</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span>
<span class="n">i</span> <span class="o">|</span> <span class="n">j</span> <span class="o">|</span> <span class="n">k</span>
<span class="c1">---+---+---</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span>
</code></pre></div></div>
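<p><br />
<br /></p>
<p>As a side note, here is a sketch that was not part of the original experiment. The PL/pgSQL documentation describes a hidden outer block, labeled with the function name, that holds the function parameters. Assuming this also applies to the columns of <code class="language-plaintext highlighter-rouge">RETURNS TABLE</code>, which behave like <code class="language-plaintext highlighter-rouge">OUT</code> parameters, it should be possible to keep the original iterator names and qualify the returning columns with the function name. The function name <code class="language-plaintext highlighter-rouge">a_table_qualified</code> is made up for this example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE OR REPLACE FUNCTION
public.a_table_qualified()
RETURNS TABLE( i int, j int, k int )
AS $CODE$
BEGIN
  FOR i IN 1 .. 2 LOOP
    FOR j IN 1 .. 2 LOOP
      FOR k IN 1 .. 2 LOOP
        -- left side: the returning columns, qualified by the function name
        -- right side: the loop variables, visible in the inner scope
        a_table_qualified.i := i;
        a_table_qualified.j := j;
        a_table_qualified.k := k;
        RETURN NEXT;
      END LOOP;
    END LOOP;
  END LOOP;
END
$CODE$
LANGUAGE plpgsql
VOLATILE
;
</code></pre></div></div>
<p>Whether this reads better than renaming the iterators is a matter of taste: the qualified names make the intent explicit, but they also tie the function body to the function name.</p>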
<p><br />
<br /></p>
<h2 id="using-plpgsql_check-as-a-possible-help">Using <code class="language-plaintext highlighter-rouge">plpgsql_check</code> as a possible help</h2>
<p><em>This is a post update thanks to the comment of <strong>Pavel Stěhule</strong> on 2023-09-20</em>.</p>
<p>The <a href="https://github.com/okbob/plpgsql_check" target="_blank">plpgsql_check</a> extension can help in spotting the problem described above.
Covering <code class="language-plaintext highlighter-rouge">plpgsql_check</code> in detail is out of the scope of this post, but this is how the extension can provide some help:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">plpgsql_check</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="n">EXTENSION</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">message</span><span class="p">,</span> <span class="k">level</span> <span class="k">FROM</span> <span class="n">plpgsql_check_function_tb</span><span class="p">(</span> <span class="s1">'a_table()'</span> <span class="p">);</span>
<span class="n">message</span> <span class="o">|</span> <span class="k">level</span>
<span class="c1">-----------------------------+---------------</span>
<span class="n">unmodified</span> <span class="k">OUT</span> <span class="k">variable</span> <span class="nv">"i"</span> <span class="o">|</span> <span class="n">warning</span> <span class="n">extra</span>
<span class="n">unmodified</span> <span class="k">OUT</span> <span class="k">variable</span> <span class="nv">"j"</span> <span class="o">|</span> <span class="n">warning</span> <span class="n">extra</span>
<span class="n">unmodified</span> <span class="k">OUT</span> <span class="k">variable</span> <span class="nv">"k"</span> <span class="o">|</span> <span class="n">warning</span> <span class="n">extra</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the check does not <em>understand</em> the actual problem, that is, that the returning variables are all masked by the scope defined by the <code class="language-plaintext highlighter-rouge">FOR</code> loops, but at least it reveals that the output variables are never modified by the function code. Knowing that such variables are never modified suggests that the function is probably not achieving what it is <em>expected</em> to, and that should trigger some extra checks by the developers.</p>
Using Emacs and YASnippet to quickly write PostgreSQL functions
2023-09-08T00:00:00+00:00
https://fluca1978.github.io/2023/09/08/EmacsPostgreSQLFunctionTemplate
<p>How a simple snippet can allow you to save time and improve your PostgreSQL code quality.</p>
<h1 id="using-emacs-and-yasnippet-to-quickly-write-postgresql-functions">Using Emacs and YASnippet to quickly write PostgreSQL functions</h1>
<p>I love Emacs, and I also love PostgreSQL.
<br />
Whenever I have to write PostgreSQL code, I use Emacs.
<br />
Emacs can help me improve code quality, for example when writing PostgreSQL functions. I use the <em>YASnippet</em> package to provide the basic template for a PostgreSQL function.</p>
<h1 id="a-postgresql-function-template-in-action">A PostgreSQL Function Template (in action)</h1>
<p>Before explaining the concept, let’s see a couple of short videos that demonstrate my snippet in action:</p>
<p><br />
<br /></p>
<center>
<iframe width="560" height="315" src="https://www.youtube.com/embed/9o5ahcmZK90?si=dRiQIsMn88ijuk5j" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>
<br />
<iframe width="560" height="315" src="https://www.youtube.com/embed/y8YWMZGz5cY?si=ySvUEFgIOIYy0wP3" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>
</center>
<p><br />
<br /></p>
<h1 id="a-postgresql-function-template-the-code">A PostgreSQL Function Template (the code)</h1>
<p>The code for the template is the following one (I may change some bits here and there as time goes by):</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># -*- mode: snippet -*-
# name: PostgreSQL Function
# key: function
# --
--
-- Function ${1:function_name}
-- Schema ${2:public}
--
-- Description:
-- $3
--
-- Return Type: ${4:VOID}
--
CREATE ${5:OR REPLACE} FUNCTION
$2.$1($6)
RETURNS $4
AS $CODE$
$0
$CODE$
LANGUAGE ${8:plpgsql}
VOLATILE
;
</code></pre></div></div>
<p><br />
<br /></p>
<p>The preamble is used by Emacs to identify the template name and the key that triggers it.
What follows is SQL code that works as a template for a function. Every <code class="language-plaintext highlighter-rouge">$n</code> placeholder is a tab stop where the cursor can be placed within the text. For example, <code class="language-plaintext highlighter-rouge">${1:function_name}</code> is the first (<code class="language-plaintext highlighter-rouge">1</code>) tab stop, which presents the default text <code class="language-plaintext highlighter-rouge">function_name</code> that is overwritten as soon as I type something. The name of the function is then automatically mirrored into the other <code class="language-plaintext highlighter-rouge">$1</code> placeholder.
<br />
Note how I begin with the documentation, and then jump to the function code. This is a very important <strong>added value: by writing the documentation first I ensure every piece of code will have at least some documentation, and thanks to the placeholders, what I write in the documentation is used to name the function and its return type</strong>.</p>
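<p>To make the placeholder mechanics more concrete, this is roughly what the buffer could look like once all the tab stops have been filled in. The function <code class="language-plaintext highlighter-rouge">add_one</code>, its argument and its body are made up for this illustration, and I am assuming that the language tab stop has been changed to <code class="language-plaintext highlighter-rouge">sql</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--
-- Function add_one
-- Schema public
--
-- Description:
-- Returns the given integer incremented by one (hypothetical example).
--
-- Return Type: int
--
CREATE OR REPLACE FUNCTION
public.add_one(n int)
RETURNS int
AS $CODE$
-- this is where the cursor lands at the end of the expansion ($0)
SELECT n + 1;
$CODE$
LANGUAGE sql
VOLATILE
;
</code></pre></div></div>
<p>The name, the schema and the return type were typed only once, in the comment header, and appear again in the <code class="language-plaintext highlighter-rouge">CREATE FUNCTION</code> statement.</p>
<p><br />
<br /></p>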
<h1 id="conclusions">Conclusions</h1>
<p>Emacs and YASnippet can be a great help when writing PostgreSQL code. While this post focuses on functions, it is also possible to provide templates for other kinds of code, like procedures, triggers, and so on.</p>
Using custom variables as per-session global variables
2023-08-24T00:00:00+00:00
https://fluca1978.github.io/2023/08/24/PostgreSQLSETCustomVariables
<p>A possible trick to emulate per-session global variables.</p>
<h1 id="using-custom-variables-as-per-session-global-variables">Using custom variables as per-session global variables</h1>
<p>In a thread <a href="https://www.freelists.org/post/postgresql-it/Alternativa-a-variabil-globali-di-sessione,8" target="_blank">in the Italian mailing list</a> we were discussing <em>session global variables</em>, something I believe is a bad idea no matter what problem you are trying to solve; a more <em>database-oriented</em> approach (e.g., temporary tables) can probably solve it.</p>
<p><br /></p>
<p>One thing I did not know, and that I discovered thanks to the above discussion (credits to <em>Andrea Adami</em>), is that PostgreSQL allows the definition of custom variables by means of <code class="language-plaintext highlighter-rouge">SET</code>. Well, <code class="language-plaintext highlighter-rouge">SET</code> is of course the way to configure a GUC, that is, a configuration parameter of the cluster.
As you probably know, all GUCs that have a name without a namespace are <em>cluster-wide</em>, while those with a prefix belong to an extension.</p>
<p><br /></p>
<p>Since PostgreSQL does not know in advance whether an extension has been loaded or not, and since extensions can be loaded at run-time, the cluster allows the user to set parameters that contain a prefix in the name. Documentation <a href="https://www.postgresql.org/docs/15/runtime-config-custom.html" target="_blank">can be found here</a>. Therefore, it is possible to use <code class="language-plaintext highlighter-rouge">SET</code> to define a <em>fake</em> GUC variable to be used in queries and functions.</p>
<p><br /></p>
<p>As an example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SET</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">favourite_database</span> <span class="k">TO</span> <span class="s1">'PostgreSQL'</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SHOW</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">favourite_database</span><span class="p">;</span>
<span class="n">fluca1978</span><span class="p">.</span><span class="n">favourite_database</span>
<span class="c1">------------------------------</span>
<span class="n">PostgreSQL</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="s1">'Luca loves '</span> <span class="o">||</span> <span class="n">current_setting</span><span class="p">(</span> <span class="s1">'fluca1978.favourite_database'</span> <span class="p">);</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span>
<span class="c1">-----------------------</span>
<span class="n">Luca</span> <span class="n">loves</span> <span class="n">PostgreSQL</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The variable behaves as a <code class="language-plaintext highlighter-rouge">user</code> context parameter, and it also honors transaction boundaries:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="s1">'Luca loves '</span> <span class="o">||</span> <span class="n">current_setting</span><span class="p">(</span> <span class="s1">'fluca1978.favourite_database'</span> <span class="p">);</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span>
<span class="c1">-----------------------</span>
<span class="n">Luca</span> <span class="n">loves</span> <span class="n">PostgreSQL</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">BEGIN</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SET</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">favourite_database</span> <span class="k">TO</span> <span class="s1">'Oracle'</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">SELECT</span> <span class="s1">'Luca loves '</span> <span class="o">||</span> <span class="n">current_setting</span><span class="p">(</span> <span class="s1">'fluca1978.favourite_database'</span> <span class="p">);</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span>
<span class="c1">-------------------</span>
<span class="n">Luca</span> <span class="n">loves</span> <span class="n">Oracle</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="c1">-- argh!</span>
<span class="c1">-- rollback!</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">ROLLBACK</span><span class="p">;</span>
<span class="k">ROLLBACK</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="s1">'Luca loves '</span> <span class="o">||</span> <span class="n">current_setting</span><span class="p">(</span> <span class="s1">'fluca1978.favourite_database'</span> <span class="p">);</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span>
<span class="c1">-----------------------</span>
<span class="n">Luca</span> <span class="n">loves</span> <span class="n">PostgreSQL</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
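<p><br />
<br /></p>
<p>A related sketch, not taken from the session above: <code class="language-plaintext highlighter-rouge">SET LOCAL</code> restricts the change to the current transaction, so the previous value comes back even when the transaction commits. The value <code class="language-plaintext highlighter-rouge">SQLite</code> is just a placeholder, and the example sets the session-level value first so that it can be run on its own:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- session-level value, as in the first example
SET fluca1978.favourite_database TO 'PostgreSQL';

BEGIN;
-- SET LOCAL lasts only until the end of the current transaction
SET LOCAL fluca1978.favourite_database TO 'SQLite';
-- here the setting is 'SQLite'
SELECT current_setting( 'fluca1978.favourite_database' );
COMMIT;

-- after the COMMIT the session-level value is visible again: 'PostgreSQL'
SELECT current_setting( 'fluca1978.favourite_database' );
</code></pre></div></div>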
<p><br />
<br /></p>
<p>Clearly, this kind of variable is <strong>session-scoped</strong> and cannot be shared among different sessions:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_backend_pid</span><span class="p">(),</span> <span class="n">current_setting</span><span class="p">(</span> <span class="s1">'fluca1978.favourite_database'</span> <span class="p">);</span>
<span class="n">pg_backend_pid</span> <span class="o">|</span> <span class="n">current_setting</span>
<span class="c1">----------------+-----------------</span>
<span class="mi">857</span> <span class="o">|</span> <span class="n">PostgreSQL</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="c1">-- in another session</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_backend_pid</span><span class="p">(),</span> <span class="n">current_setting</span><span class="p">(</span> <span class="s1">'fluca1978.favourite_database'</span> <span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">unrecognized</span> <span class="n">configuration</span> <span class="k">parameter</span> <span class="nv">"fluca1978.favourite_database"</span>
</code></pre></div></div>
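<p><br />
<br /></p>
<p>If the error in the other session is a problem, one possible way around it is the optional second argument of <code class="language-plaintext highlighter-rouge">current_setting()</code>, named <code class="language-plaintext highlighter-rouge">missing_ok</code> in the documentation. The following is only a sketch, and the fallback value <code class="language-plaintext highlighter-rouge">unknown</code> is an arbitrary choice of mine:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- with missing_ok set to true, current_setting() returns NULL
-- instead of raising an error when the parameter is not defined;
-- nullif() additionally treats an empty value as a missing one
SELECT coalesce( nullif( current_setting( 'fluca1978.favourite_database', true ), '' ),
                 'unknown' ) AS favourite_database;
</code></pre></div></div>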
<p><br />
<br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>I don’t recommend this usage of <em>dynamic session-scoped variables</em>, since a temporary table is usually a better idea and provides pretty much the same solution. It is however interesting to know that PostgreSQL has this behavior. Clearly, it is important to avoid name clashes (something you don’t risk with temporary tables) with GUCs that are actually defined by an extension.</p>
A Possible Way to Implement a Shift Function in PL/PgSql (part 2)
2023-08-03T00:00:00+00:00
https://fluca1978.github.io/2023/08/03/PostgreSQLShiftArray2
<p>Creating a shift-like function for manipulating arrays in PL/PgSQL.</p>
<h1 id="a-possible-way-to-implement-a-shift-function-in-plpgsql-part-2">A Possible Way to Implement a Shift Function in PL/PgSql (part 2)</h1>
<p>After my post about <a href="https://fluca1978.github.io/2023/08/01/PostgreSQLShiftArray.html" target="_blank">how to implement a <code class="language-plaintext highlighter-rouge">shift</code> like operation in PostgreSQL</a>
I got some comments and suggestions, most notably a <em>pure SQL implementation</em> provided by <strong>Stefan Stefanov</strong>, to whom the credit for the solution belongs, and which I explain in this (second) article on the subject.
<br />
In the following you will find <strong>Stefan Stefanov</strong>’s solution, a PL/Perl implementation I made in the meantime, and a little benchmarking to see how all the approaches compare to each other.</p>
<h2 id="a-pure-sql-implementation-credits-to-stefan-stefanov">A pure SQL Implementation (credits to Stefan Stefanov)</h2>
<p>The following is the function proposed by <strong>Stefan Stefanov</strong>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">array_shift</span><span class="p">(</span><span class="n">arr</span> <span class="n">anyarray</span><span class="p">,</span> <span class="n">loops</span> <span class="nb">integer</span> <span class="k">DEFAULT</span> <span class="mi">1</span><span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span><span class="n">head</span> <span class="n">anyelement</span><span class="p">,</span> <span class="n">tail</span> <span class="n">anyarray</span><span class="p">)</span>
<span class="k">LANGUAGE</span> <span class="k">sql</span>
<span class="k">AS</span> <span class="err">$</span><span class="k">function</span><span class="err">$</span>
<span class="k">with</span> <span class="n">arr_tbl</span><span class="p">(</span><span class="n">el</span><span class="p">,</span> <span class="n">arr_index</span><span class="p">)</span> <span class="k">as</span> <span class="p">(</span>
<span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="k">unnest</span><span class="p">(</span><span class="n">arr</span><span class="p">)</span> <span class="k">with</span> <span class="k">ordinality</span>
<span class="p">)</span>
<span class="k">select</span> <span class="p">(</span><span class="k">select</span> <span class="n">el</span> <span class="k">from</span> <span class="n">arr_tbl</span> <span class="k">where</span> <span class="n">arr_index</span> <span class="o">=</span> <span class="n">loops</span><span class="p">),</span>
<span class="p">(</span><span class="k">select</span> <span class="n">array_agg</span><span class="p">(</span><span class="n">el</span> <span class="k">order</span> <span class="k">by</span> <span class="n">arr_index</span><span class="p">)</span>
<span class="k">from</span> <span class="n">arr_tbl</span> <span class="k">where</span> <span class="n">arr_index</span> <span class="o">></span> <span class="n">loops</span><span class="p">);</span>
<span class="err">$</span><span class="k">function</span><span class="err">$</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, this is a very clever approach that exploits only <code class="language-plaintext highlighter-rouge">SELECT</code> statements to get the final result.
The <code class="language-plaintext highlighter-rouge">arr_tbl</code> CTE <em>explodes</em> the array by means of the PostgreSQL built-in <code class="language-plaintext highlighter-rouge">unnest</code> function, and returns the array as a table with the <code class="language-plaintext highlighter-rouge">ordinality</code>, that is, an automatically added column that works as a row number.
The output of the CTE is similar to the following one:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="k">unnest</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span><span class="s1">'alfa'</span><span class="p">,</span><span class="s1">'beta'</span><span class="p">,</span> <span class="s1">'gamma'</span> <span class="p">]</span> <span class="p">)</span> <span class="k">with</span> <span class="k">ordinality</span><span class="p">;</span>
<span class="k">unnest</span> <span class="o">|</span> <span class="k">ordinality</span>
<span class="c1">--------+------------</span>
<span class="n">alfa</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">beta</span> <span class="o">|</span> <span class="mi">2</span>
<span class="n">gamma</span> <span class="o">|</span> <span class="mi">3</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The main <code class="language-plaintext highlighter-rouge">SELECT</code> selects two different columns, each extracted by a subquery.
The first subquery extracts the <em>last</em> element removed by the <code class="language-plaintext highlighter-rouge">shift</code> operation, that is, the one with the ordinality (i.e., row number) equal to the number of loops. Assuming <code class="language-plaintext highlighter-rouge">loops = 2</code>, it extracts the <code class="language-plaintext highlighter-rouge">beta</code> value from the above table. This is what I called the <code class="language-plaintext highlighter-rouge">head</code> in my functions.
<br />
The other subquery extracts the elements with an ordinality greater than the number of shifts, that is, all the remaining elements, and then re-aggregates them into an array by means of the PostgreSQL built-in <code class="language-plaintext highlighter-rouge">array_agg</code> function.</p>
<p><br />
The beauty of this idea is that everything is built on top of queries, that is the array is transformed into a table and then back into an array, but all the computation is done as <em>cascading <code class="language-plaintext highlighter-rouge">SELECT</code></em>.</p>
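<p>As a quick check of the reasoning above, the function can be invoked on the same sample array used to show the CTE output. This is a usage sketch that assumes <code class="language-plaintext highlighter-rouge">array_shift</code> has been created exactly as listed above; the results in the comments are the ones I would expect from the description:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- two shifts: the head is the second element, the tail is what remains
SELECT * FROM array_shift( array['alfa', 'beta', 'gamma'], 2 );
-- head | tail
-- beta | {gamma}

-- default of one shift, this time on an integer array
SELECT * FROM array_shift( array[1, 2, 3] );
-- head | tail
-- 1    | {2,3}
</code></pre></div></div>
<p><br />
<br /></p>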
<h2 id="a-plperl-implementation">A PL/Perl implementation</h2>
<p>Since Perl comes with a <em>natural</em> <code class="language-plaintext highlighter-rouge">shift</code> operator, why not use it as a wrapper to shift a PostgreSQL array?
<br />
The only drawback of this approach is that PL/Perl does not allow passing an <code class="language-plaintext highlighter-rouge">anyarray</code> argument to a function, so there is the need to make a type-specific implementation (here for <code class="language-plaintext highlighter-rouge">text[]</code>):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">shift_plperl</span><span class="p">(</span> <span class="nb">text</span><span class="p">[],</span>
<span class="nb">int</span> <span class="k">default</span> <span class="mi">1</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">head</span> <span class="nb">text</span><span class="p">,</span> <span class="n">tail</span> <span class="nb">text</span><span class="p">[]</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">array</span><span class="p">,</span> <span class="err">$</span><span class="n">loops</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">head</span> <span class="p">);</span>
<span class="err">$</span><span class="n">head</span> <span class="o">=</span> <span class="n">shift</span> <span class="err">$</span><span class="n">array</span><span class="o">->@*</span> <span class="k">for</span> <span class="p">(</span> <span class="mi">1</span> <span class="p">..</span> <span class="err">$</span><span class="n">loops</span> <span class="p">);</span>
<span class="n">return_next</span><span class="p">(</span> <span class="p">{</span> <span class="n">head</span> <span class="o">=></span> <span class="err">$</span><span class="n">head</span><span class="p">,</span> <span class="n">tail</span> <span class="o">=></span> <span class="err">$</span><span class="n">array</span> <span class="p">}</span> <span class="p">);</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
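<p>Assuming the function above has been created in a database where the <code class="language-plaintext highlighter-rouge">plperl</code> language is installed, it can be invoked like any other set-returning function. The array and the number of shifts below are only illustrative values:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- shift two elements out of a text array
SELECT head, tail
FROM shift_plperl( ARRAY[ 'cat', 'dog', 'parrot' ]::text[], 2 );

-- the second argument defaults to 1, so this shifts a single element
SELECT head, tail
FROM shift_plperl( ARRAY[ 'cat', 'dog', 'parrot' ]::text[] );
</code></pre></div></div>
<p>Arrays of other types, e.g., <code class="language-plaintext highlighter-rouge">int[]</code>, need either an explicit cast to <code class="language-plaintext highlighter-rouge">text[]</code> or a dedicated copy of the function.</p>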
<h2 id="the-tests">The tests</h2>
<p>The tests have been run, as in the previous post, with a code block similar to the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">DO</span> <span class="k">LANGUAGE</span> <span class="n">plpgsql</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">a</span> <span class="nb">text</span><span class="p">[];</span>
<span class="n">ts_begin</span> <span class="nb">timestamp</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="nb">timestamp</span><span class="p">;</span>
<span class="n">iter</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">i</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">iter</span> <span class="p">:</span><span class="o">=</span> <span class="mi">7000</span><span class="p">;</span>
<span class="c1">-- initialize the array</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">SELECT</span> <span class="s1">'{'</span> <span class="o">||</span> <span class="n">string_agg</span><span class="p">(</span> <span class="n">v</span><span class="p">::</span><span class="nb">text</span><span class="p">,</span> <span class="s1">','</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">'}'</span>
<span class="k">INTO</span> <span class="n">a</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="o">+</span> <span class="mi">5</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Array allocation = %'</span><span class="p">,</span> <span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">iter</span> <span class="n">LOOP</span>
<span class="n">PERFORM</span> <span class="n">shift</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Using shift for % iterations over % elements = %'</span><span class="p">,</span>
<span class="n">iter</span><span class="p">,</span>
<span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">),</span>
<span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">iter</span> <span class="n">LOOP</span>
<span class="n">PERFORM</span> <span class="n">shiftx</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Using shiftx for % iterations over % elements = %'</span><span class="p">,</span>
<span class="n">iter</span><span class="p">,</span>
<span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">),</span>
<span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">iter</span> <span class="n">loop</span>
<span class="n">perform</span> <span class="n">array_shift</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">end</span> <span class="n">loop</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Using array_shift for % iterations over % elements = %'</span><span class="p">,</span>
<span class="n">iter</span><span class="p">,</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">),</span>
<span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">iter</span> <span class="n">loop</span>
<span class="n">perform</span> <span class="n">shift_plperl</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">end</span> <span class="n">loop</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Using shift_plperl for % iterations over % elements = %'</span><span class="p">,</span>
<span class="n">iter</span><span class="p">,</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">),</span>
<span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>where changing the <code class="language-plaintext highlighter-rouge">iter</code> variable makes the code run more shifts against the same array.</p>
<p>In the following table, I show some results obtained on the same tiny crappy virtual machine. Please consider that the functions used are:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">shift</code> a PL/PgSQL iteration-based approach where, at each iteration, the leftmost element of the array is removed;</li>
<li><code class="language-plaintext highlighter-rouge">shiftx</code> a PL/PgSQL approach that slices the array;</li>
<li><code class="language-plaintext highlighter-rouge">array_shift</code> the PL/PgSQL function that executes the single query proposed by Stefan Stefanov;</li>
<li><code class="language-plaintext highlighter-rouge">shift_plperl</code> a PL/Perl function that exploits the <code class="language-plaintext highlighter-rouge">shift</code> Perl operator.</li>
</ul>
<p><br />
<br /></p>
<table class="table table-bordered">
<thead>
<tr>
<th style="text-align: center">Iterations</th>
<th style="text-align: center">shifts</th>
<th style="text-align: right"><code class="language-plaintext highlighter-rouge">shift</code></th>
<th style="text-align: right"><code class="language-plaintext highlighter-rouge">shiftx</code></th>
<th style="text-align: right"><code class="language-plaintext highlighter-rouge">array_shift</code></th>
<th style="text-align: right"><code class="language-plaintext highlighter-rouge">shift_plperl</code></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">2000</td>
<td style="text-align: center">1000</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">23.03905</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.007952</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.518305</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.604843</code> secs</td>
</tr>
<tr>
<td style="text-align: center">5000</td>
<td style="text-align: center">2500</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">05:44.236687</code> mins</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.020937</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">03.039885</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">03.672881</code> secs</td>
</tr>
<tr>
<td style="text-align: center">7000</td>
<td style="text-align: center">3500</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">15:23.3999</code> mins</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.033396</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">05.97662</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">07.132485</code> secs</td>
</tr>
<tr>
<td style="text-align: center">8000</td>
<td style="text-align: center">4000</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">23:02.73513</code> mins</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.044517</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">07.445447</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">10.236968</code> secs</td>
</tr>
<tr>
<td style="text-align: center">10000</td>
<td style="text-align: center">5000</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">44:46.496029</code> mins</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.048962</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">12.211704</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">15.091066</code> secs</td>
</tr>
<tr>
<td style="text-align: center">12000</td>
<td style="text-align: center">6000</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">01:16:53.758594</code> hours</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">00.060169</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">17.198828</code> secs</td>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">20.911864</code> secs</td>
</tr>
</tbody>
</table>
<p><br />
<br /></p>
<p>Long story short: the PL/PgSQL iteration-based approach (<code class="language-plaintext highlighter-rouge">shift</code>) is by far the slowest approach, while the array-slice approach (<code class="language-plaintext highlighter-rouge">shiftx</code>) is the fastest one. The PL/Perl and query-only approaches are comparable, with the latter being a little faster than the former, probably because PL/Perl has to marshal the arguments in and out of the function.</p>
<p><br />
Clearly, the above is not a complete benchmark, and it has not been executed multiple times to get average results. However, it does suffice to give an idea of how the different approaches relate to each other.</p>
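<p>For anyone wanting averaged figures, the test block could be wrapped in an outer loop. The following is only a sketch: the <code class="language-plaintext highlighter-rouge">rounds</code> variable and its value are hypothetical, and <code class="language-plaintext highlighter-rouge">shiftx</code> is assumed to exist with the same signature used in the test block above.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DO LANGUAGE plpgsql
$CODE$
DECLARE
   a        text[];
   iter     int := 2000;
   rounds   int := 5;                    -- hypothetical number of repetitions
   elapsed  interval := '0 seconds';     -- total time across all rounds
   ts_begin timestamptz;
BEGIN
   -- build the same kind of array used in the test block
   SELECT array_agg( v::text )
   INTO a
   FROM generate_series( 1, iter / 2 + 5 ) v;

   FOR r IN 1 .. rounds LOOP
      ts_begin := clock_timestamp();
      FOR i IN 1 .. iter LOOP
         PERFORM shiftx( a, iter / 2 );
      END LOOP;
      -- accumulate the time spent in this round
      elapsed := elapsed + ( clock_timestamp() - ts_begin );
   END LOOP;

   RAISE INFO 'shiftx average over % rounds = %', rounds, elapsed / rounds;
END
$CODE$;
</code></pre></div></div>
<p>The same outer loop can be repeated for each of the other functions to compare averages instead of single runs.</p>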
<h2 id="where-is-the-sql-based-solution-spending-its-time">Where is the SQL based solution spending its time?</h2>
<p>It’s interesting to try to understand where <strong>Stefan Stefanov</strong>’s solution is spending most of its execution time, and <code class="language-plaintext highlighter-rouge">EXPLAIN</code> comes to the rescue here.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">explain</span> <span class="k">analyze</span> <span class="k">with</span>
<span class="n">arr</span> <span class="k">as</span> <span class="p">(</span> <span class="k">select</span> <span class="n">array_agg</span><span class="p">(</span> <span class="n">v</span> <span class="p">)</span> <span class="n">v</span> <span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100000</span> <span class="p">)</span> <span class="n">v</span> <span class="p">)</span>
<span class="p">,</span><span class="n">arr_tbl</span><span class="p">(</span><span class="n">el</span><span class="p">,</span> <span class="n">arr_index</span><span class="p">)</span> <span class="k">as</span> <span class="p">(</span>
<span class="k">select</span> <span class="n">u</span><span class="p">.</span><span class="o">*</span> <span class="k">from</span> <span class="k">unnest</span><span class="p">(</span> <span class="p">(</span><span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">arr</span><span class="p">)</span> <span class="p">)</span> <span class="k">with</span> <span class="k">ordinality</span> <span class="n">u</span>
<span class="p">)</span>
<span class="k">select</span> <span class="p">(</span><span class="k">select</span> <span class="n">el</span> <span class="k">from</span> <span class="n">arr_tbl</span> <span class="k">where</span> <span class="n">arr_index</span> <span class="o">=</span> <span class="mi">5000</span><span class="p">),</span>
<span class="p">(</span><span class="k">select</span> <span class="n">array_agg</span><span class="p">(</span><span class="n">el</span> <span class="k">order</span> <span class="k">by</span> <span class="n">arr_index</span><span class="p">)</span>
<span class="k">from</span> <span class="n">arr_tbl</span> <span class="k">where</span> <span class="n">arr_index</span> <span class="o">></span> <span class="mi">5000</span><span class="p">);</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-----------------------------------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Result</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1250</span><span class="p">.</span><span class="mi">60</span><span class="p">..</span><span class="mi">1250</span><span class="p">.</span><span class="mi">61</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">width</span><span class="o">=</span><span class="mi">36</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">105</span><span class="p">.</span><span class="mi">725</span><span class="p">..</span><span class="mi">105</span><span class="p">.</span><span class="mi">727</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">CTE</span> <span class="n">arr_tbl</span>
<span class="o">-></span> <span class="k">Function</span> <span class="n">Scan</span> <span class="k">on</span> <span class="k">unnest</span> <span class="n">u</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1250</span><span class="p">.</span><span class="mi">03</span><span class="p">..</span><span class="mi">1250</span><span class="p">.</span><span class="mi">13</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">59</span><span class="p">.</span><span class="mi">127</span><span class="p">..</span><span class="mi">66</span><span class="p">.</span><span class="mi">255</span> <span class="k">rows</span><span class="o">=</span><span class="mi">100000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">InitPlan</span> <span class="mi">1</span> <span class="p">(</span><span class="k">returns</span> <span class="err">$</span><span class="mi">0</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Aggregate</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1250</span><span class="p">.</span><span class="mi">01</span><span class="p">..</span><span class="mi">1250</span><span class="p">.</span><span class="mi">02</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">51</span><span class="p">.</span><span class="mi">840</span><span class="p">..</span><span class="mi">51</span><span class="p">.</span><span class="mi">841</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Function</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">generate_series</span> <span class="n">v</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">1000</span><span class="p">.</span><span class="mi">00</span> <span class="k">rows</span><span class="o">=</span><span class="mi">100000</span> <span class="n">width</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">30</span><span class="p">.</span><span class="mi">879</span><span class="p">..</span><span class="mi">40</span><span class="p">.</span><span class="mi">827</span> <span class="k">rows</span><span class="o">=</span><span class="mi">100000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">InitPlan</span> <span class="mi">3</span> <span class="p">(</span><span class="k">returns</span> <span class="err">$</span><span class="mi">2</span><span class="p">)</span>
<span class="o">-></span> <span class="n">CTE</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">arr_tbl</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">22</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">width</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">60</span><span class="p">.</span><span class="mi">106</span><span class="p">..</span><span class="mi">79</span><span class="p">.</span><span class="mi">291</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">arr_index</span> <span class="o">=</span> <span class="mi">5000</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">99999</span>
<span class="n">InitPlan</span> <span class="mi">4</span> <span class="p">(</span><span class="k">returns</span> <span class="err">$</span><span class="mi">3</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Aggregate</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">23</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">24</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">26</span><span class="p">.</span><span class="mi">330</span><span class="p">..</span><span class="mi">26</span><span class="p">.</span><span class="mi">331</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">CTE</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">arr_tbl</span> <span class="n">arr_tbl_1</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">22</span> <span class="k">rows</span><span class="o">=</span><span class="mi">3</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">270</span><span class="p">..</span><span class="mi">7</span><span class="p">.</span><span class="mi">439</span> <span class="k">rows</span><span class="o">=</span><span class="mi">95000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">arr_index</span> <span class="o">></span> <span class="mi">5000</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">5000</span>
<span class="n">Planning</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">323</span> <span class="n">ms</span>
<span class="n">Execution</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">106</span><span class="p">.</span><span class="mi">301</span> <span class="n">ms</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The main node where time is consumed is the <code class="language-plaintext highlighter-rouge">CTE Scan</code> that finds the <code class="language-plaintext highlighter-rouge">head</code>: it takes about <code class="language-plaintext highlighter-rouge">20</code> milliseconds (its actual time goes from <code class="language-plaintext highlighter-rouge">60.106</code> to <code class="language-plaintext highlighter-rouge">79.291</code>). That node is produced by the first main subquery, and it requires a scan of the materialized CTE.
Clearly I’m not considering the time consumed to produce the array, i.e., the <code class="language-plaintext highlighter-rouge">InitPlan 1</code>, because it is used only to feed the query.</p>
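<p>For comparison, and not as part of Stefan Stefanov’s query, the same head and tail can be extracted from the generated array with a plain subscript and a slice, so that no CTE scan is needed. This is only a sketch to feed to <code class="language-plaintext highlighter-rouge">EXPLAIN ANALYZE</code> on your own machine; no timing is claimed here:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>EXPLAIN ANALYZE
WITH arr AS (
   SELECT array_agg( v ) AS v
   FROM generate_series( 1, 100000 ) v
)
SELECT v[ 5000 ]   AS head,   -- the last shifted element
       v[ 5001 : ] AS tail    -- everything after it
FROM arr;
</code></pre></div></div>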
<h1 id="conclusions">Conclusions</h1>
<p>While it is easy enough to implement a shift-like operation for PostgreSQL arrays, either in PL/PgSQL or with a nested query, performance will never match that of PostgreSQL array slicing.</p>
A Possible Way to Implement a Shift Function in PL/PgSql2023-08-01T00:00:00+00:00https://fluca1978.github.io/2023/08/01/PostgreSQLShiftArray<p>Creating a shift-like function for manipulating arrays in PL/PgSQL.</p>
<h1 id="a-possible-way-to-implement-a-shift-function-in-plpgsql">A Possible Way to Implement a Shift Function in PL/PgSql</h1>
<p>PostgreSQL supports arrays very well, but it does not provide a <code class="language-plaintext highlighter-rouge">shift</code>-like function.
A <code class="language-plaintext highlighter-rouge">shift</code> function takes an array as input and removes the first (left-most) element from the array.
This is quite simple to do in PostgreSQL, since <em>array slices</em> are easy to implement. However, a slice returns the modified (shifted) array, not the shifted element.</p>
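<p>As a quick illustration of the point above, using a made-up three-element array, the slice gives back the shifted array while the removed element has to be fetched separately with a subscript:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- the subscript returns the removed element,
-- the slice returns the remaining array
SELECT ( ARRAY[ 'cat', 'dog', 'parrot' ] )[ 1 ]   AS head,
       ( ARRAY[ 'cat', 'dog', 'parrot' ] )[ 2 : ] AS tail;
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">head</code> is <code class="language-plaintext highlighter-rouge">cat</code> and <code class="language-plaintext highlighter-rouge">tail</code> is <code class="language-plaintext highlighter-rouge">{dog,parrot}</code>; the open-ended slice syntax requires PostgreSQL 9.6 or later.</p>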
<p><br />
<br /></p>
<p>It is possible to implement a very simple function in PL/PgSQL that accepts an array of any type and returns a table-like result set, with the element removed and the resulting array. The following is a straightforward implementation:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">shift</span><span class="p">(</span> <span class="n">a</span> <span class="n">anyarray</span><span class="p">,</span>
<span class="n">loops</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">1</span><span class="p">,</span>
<span class="n">emit_intermediate</span> <span class="nb">boolean</span> <span class="k">default</span> <span class="k">false</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">head</span> <span class="nb">text</span><span class="p">,</span> <span class="n">tail</span> <span class="n">anyarray</span><span class="p">,</span> <span class="n">step</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="c1">-- check that the array is good and has</span>
<span class="c1">-- at least one element</span>
<span class="n">IF</span> <span class="n">a</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">OR</span> <span class="n">cardinality</span><span class="p">(</span> <span class="n">a</span> <span class="p">)</span> <span class="o"><</span> <span class="mi">1</span> <span class="k">THEN</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- if the array has fewer elements than those</span>
<span class="c1">-- to shift, do only the max available shifting</span>
<span class="n">IF</span> <span class="n">loops</span> <span class="o">></span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="k">THEN</span>
<span class="n">loops</span> <span class="p">:</span><span class="o">=</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- initialize the returning array and the</span>
<span class="c1">-- number of steps</span>
<span class="n">tail</span> <span class="p">:</span><span class="o">=</span> <span class="n">a</span><span class="p">;</span>
<span class="n">step</span> <span class="p">:</span><span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">WHILE</span> <span class="n">loops</span> <span class="o">></span> <span class="mi">0</span> <span class="n">LOOP</span>
<span class="n">head</span> <span class="p">:</span><span class="o">=</span> <span class="n">tail</span><span class="p">[</span> <span class="mi">1</span> <span class="p">];</span>
<span class="n">tail</span> <span class="p">:</span><span class="o">=</span> <span class="n">tail</span><span class="p">[</span> <span class="mi">2</span> <span class="p">:</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">tail</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">];</span>
<span class="n">IF</span> <span class="n">emit_intermediate</span> <span class="k">OR</span> <span class="n">loops</span> <span class="o">=</span> <span class="mi">1</span> <span class="k">THEN</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">loops</span> <span class="p">:</span><span class="o">=</span> <span class="n">loops</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">step</span> <span class="p">:</span><span class="o">=</span> <span class="n">step</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The idea is quite simple: the function accepts the array to shift and, optionally, the number of times the <code class="language-plaintext highlighter-rouge">shift</code> operation has to be performed, as well as a flag to indicate if the intermediate steps have to be emitted. The function iterates over the number of shifts to be performed, and removes the head element from the array.
<br />
As an example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">shift</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span> <span class="s1">'cat'</span><span class="p">,</span> <span class="s1">'dog'</span><span class="p">,</span> <span class="s1">'parrot'</span> <span class="p">]</span> <span class="p">);</span>
<span class="n">head</span> <span class="o">|</span> <span class="n">tail</span> <span class="o">|</span> <span class="n">step</span>
<span class="c1">------+--------------+------</span>
<span class="n">cat</span> <span class="o">|</span> <span class="p">{</span><span class="n">dog</span><span class="p">,</span><span class="n">parrot</span><span class="p">}</span> <span class="o">|</span> <span class="mi">1</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the first element of the array, <code class="language-plaintext highlighter-rouge">cat</code>, is removed from the array, which is left as <code class="language-plaintext highlighter-rouge">{dog,parrot}</code>; the removed (shifted) element is returned as <code class="language-plaintext highlighter-rouge">head</code> and the remaining array as <code class="language-plaintext highlighter-rouge">tail</code>. The <code class="language-plaintext highlighter-rouge">step</code> column indicates which iteration the result refers to.</p>
<p><br />
As another example, let’s do a shift twice, emitting intermediate results in the meantime:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">shift</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span> <span class="s1">'cat'</span><span class="p">,</span> <span class="s1">'dog'</span><span class="p">,</span> <span class="s1">'parrot'</span> <span class="p">],</span> <span class="mi">2</span><span class="p">,</span> <span class="k">true</span> <span class="p">);</span>
<span class="n">head</span> <span class="o">|</span> <span class="n">tail</span> <span class="o">|</span> <span class="n">step</span>
<span class="c1">------+--------------+------</span>
<span class="n">cat</span> <span class="o">|</span> <span class="p">{</span><span class="n">dog</span><span class="p">,</span><span class="n">parrot</span><span class="p">}</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">dog</span> <span class="o">|</span> <span class="p">{</span><span class="n">parrot</span><span class="p">}</span> <span class="o">|</span> <span class="mi">2</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, this time two tuples are emitted. In the first one (<code class="language-plaintext highlighter-rouge">step = 1</code>), the <code class="language-plaintext highlighter-rouge">cat</code> element is removed and the <code class="language-plaintext highlighter-rouge">{dog,parrot}</code> array is returned. At the second iteration (<code class="language-plaintext highlighter-rouge">step = 2</code>), the <code class="language-plaintext highlighter-rouge">dog</code> element is removed from the array and the remaining <code class="language-plaintext highlighter-rouge">{parrot}</code> is returned. Without emitting the intermediate results, the function always returns a single tuple, the one of the last iteration:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">shift</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span> <span class="s1">'cat'</span><span class="p">,</span> <span class="s1">'dog'</span><span class="p">,</span> <span class="s1">'parrot'</span> <span class="p">],</span> <span class="mi">2</span> <span class="p">);</span>
<span class="n">head</span> <span class="o">|</span> <span class="n">tail</span> <span class="o">|</span> <span class="n">step</span>
<span class="c1">------+----------+------</span>
<span class="n">dog</span> <span class="o">|</span> <span class="p">{</span><span class="n">parrot</span><span class="p">}</span> <span class="o">|</span> <span class="mi">2</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
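<p>As a further minimal sketch, assuming the <code class="language-plaintext highlighter-rouge">shift</code> function shown above is installed in the current database, asking for more shifts than the array has elements should not raise an error, because the function caps the number of loops to the array length:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- ask for ten shifts on a three-element array:
-- the function caps loops to array_length( a, 1 ), that is 3
select * from shift( array[ 'cat', 'dog', 'parrot' ], 10, true );
</code></pre></div></div>
<p>With the flag set to <code class="language-plaintext highlighter-rouge">true</code>, one row per performed shift is expected, hence three rows here rather than ten.</p>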
<h2 id="usage-example">Usage Example</h2>
<p>As an example, it is possible to use the <code class="language-plaintext highlighter-rouge">shift</code> function within another piece of PL/PgSQL code by means of an assignment, for example a <code class="language-plaintext highlighter-rouge">SELECT INTO</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">DO</span> <span class="k">LANGUAGE</span> <span class="n">plpgsql</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">a</span> <span class="nb">text</span><span class="p">[];</span>
<span class="n">h</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">I</span> <span class="nb">INT</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">a</span> <span class="p">:</span><span class="o">=</span> <span class="n">array</span><span class="p">[</span> <span class="s1">'alfa'</span><span class="p">,</span> <span class="s1">'beta'</span><span class="p">,</span> <span class="s1">'gamma'</span><span class="p">,</span> <span class="s1">'delta'</span> <span class="p">];</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="mi">2</span> <span class="n">LOOP</span>
<span class="k">SELECT</span> <span class="n">head</span><span class="p">,</span> <span class="n">tail</span>
<span class="k">INTO</span> <span class="n">h</span><span class="p">,</span> <span class="n">a</span>
<span class="k">FROM</span> <span class="n">shift</span><span class="p">(</span> <span class="n">a</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Removed <%> = %'</span><span class="p">,</span> <span class="n">h</span><span class="p">,</span> <span class="n">a</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>that produces the following dummy output:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">INFO</span><span class="p">:</span> <span class="n">Removed</span> <span class="o"><</span><span class="n">alfa</span><span class="o">></span> <span class="o">=</span> <span class="p">{</span><span class="n">beta</span><span class="p">,</span><span class="n">gamma</span><span class="p">,</span><span class="n">delta</span><span class="p">}</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Removed</span> <span class="o"><</span><span class="n">beta</span><span class="o">></span> <span class="o">=</span> <span class="p">{</span><span class="n">gamma</span><span class="p">,</span><span class="n">delta</span><span class="p">}</span>
<span class="k">DO</span>
</code></pre></div></div>
<p><br />
<br /></p>
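<p>The same pattern can be used to consume an array as if it were a queue. The following is only a sketch: the <code class="language-plaintext highlighter-rouge">queue</code> variable and the message text are made up for illustration, and the loop relies on the assumption that <code class="language-plaintext highlighter-rouge">shift</code> returns an empty array as <code class="language-plaintext highlighter-rouge">tail</code> once the last element has been removed:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DO LANGUAGE plpgsql $CODE$
DECLARE
  queue text[] := array[ 'alfa', 'beta', 'gamma' ];  -- hypothetical work queue
  h     text;
BEGIN
  -- cardinality() returns 0 for an empty array, so the loop
  -- stops as soon as every element has been shifted out
  WHILE cardinality( queue ) > 0 LOOP
    SELECT head, tail
    INTO   h, queue
    FROM   shift( queue );
    RAISE INFO 'Processing % (% left)', h, cardinality( queue );
  END LOOP;
END
$CODE$;
</code></pre></div></div>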
<h2 id="efficiency-considerations">Efficiency Considerations</h2>
<p>The function <code class="language-plaintext highlighter-rouge">shift</code>, as it is implemented, is not really efficient because it loops once per requested shift, building a new array slice at every iteration. When there is no need to emit the intermediate stages, the function can be reduced to a couple of operations, mainly a single array slice.
The following is a possible, more efficient implementation:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">shiftx</span><span class="p">(</span> <span class="n">a</span> <span class="n">anyarray</span><span class="p">,</span>
<span class="n">loops</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">1</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">head</span> <span class="nb">text</span><span class="p">,</span> <span class="n">tail</span> <span class="n">anyarray</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="c1">-- check that the array is good and has</span>
<span class="c1">-- at least one element</span>
<span class="n">IF</span> <span class="n">a</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">OR</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="o"><</span> <span class="mi">1</span> <span class="k">THEN</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- if the array has fewer elements than those</span>
<span class="c1">-- to shift, do only the max available shifting</span>
<span class="n">IF</span> <span class="n">loops</span> <span class="o">></span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="k">THEN</span>
<span class="n">loops</span> <span class="p">:</span><span class="o">=</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- compute the last removed element (head)</span>
<span class="c1">-- and the remaining array slice (tail)</span>
<span class="n">head</span> <span class="p">:</span><span class="o">=</span> <span class="n">a</span><span class="p">[</span> <span class="n">loops</span> <span class="p">];</span>
<span class="n">tail</span> <span class="p">:</span><span class="o">=</span> <span class="n">a</span><span class="p">[</span> <span class="mi">1</span> <span class="o">+</span> <span class="n">loops</span> <span class="p">:</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">];</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above implementation returns a single tuple, where the <code class="language-plaintext highlighter-rouge">head</code> is the last removed element, that is the one at the <code class="language-plaintext highlighter-rouge">loops</code> offset, while the <code class="language-plaintext highlighter-rouge">tail</code> is the array slice that remains after the removal. Since this implementation does not perform any iteration, it can be faster when several elements are shifted at once.</p>
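<p>As a quick check, assuming the <code class="language-plaintext highlighter-rouge">shiftx</code> function above has been created, the following call should produce the same final tuple as the two-step <code class="language-plaintext highlighter-rouge">shift</code> example seen earlier, without the <code class="language-plaintext highlighter-rouge">step</code> column:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- head := a[ 2 ]   is 'dog'
-- tail := a[ 3:3 ] is {parrot}
select * from shiftx( array[ 'cat', 'dog', 'parrot' ], 2 );
</code></pre></div></div>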
<p><br />
It is possible to run a quick and dirty performance test with a <code class="language-plaintext highlighter-rouge">DO</code> block that calls both functions a few thousand times and compares the elapsed times:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">DO</span> <span class="k">LANGUAGE</span> <span class="n">plpgsql</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">a</span> <span class="nb">text</span><span class="p">[];</span>
<span class="n">ts_begin</span> <span class="nb">timestamp</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="nb">timestamp</span><span class="p">;</span>
<span class="n">iter</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">i</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">iter</span> <span class="p">:</span><span class="o">=</span> <span class="mi">2000</span><span class="p">;</span>
<span class="c1">-- initialize the array</span>
<span class="k">SELECT</span> <span class="s1">'{'</span> <span class="o">||</span> <span class="n">string_agg</span><span class="p">(</span> <span class="n">v</span><span class="p">::</span><span class="nb">text</span><span class="p">,</span> <span class="s1">','</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">'}'</span>
<span class="k">INTO</span> <span class="n">a</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="n">iter</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">iter</span> <span class="n">LOOP</span>
<span class="n">PERFORM</span> <span class="n">shift</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Using shift for % iteration over % elements = %'</span><span class="p">,</span>
<span class="n">iter</span><span class="p">,</span>
<span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">),</span>
<span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="n">ts_begin</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">iter</span> <span class="n">LOOP</span>
<span class="n">PERFORM</span> <span class="n">shiftx</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="n">iter</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">ts_end</span> <span class="p">:</span><span class="o">=</span> <span class="n">clock_timestamp</span><span class="p">();</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Using shiftx for % iteration over % elements = %'</span><span class="p">,</span>
<span class="n">iter</span><span class="p">,</span>
<span class="n">array_length</span><span class="p">(</span> <span class="n">a</span><span class="p">,</span> <span class="mi">1</span> <span class="p">),</span>
<span class="p">(</span> <span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above builds an array of <code class="language-plaintext highlighter-rouge">2000</code> elements and calls each function <code class="language-plaintext highlighter-rouge">2000</code> times; every call performs <code class="language-plaintext highlighter-rouge">iter / 2</code>, that is <code class="language-plaintext highlighter-rouge">1000</code>, shifts, hence removing the first 1000 elements from the array.
<br />
The results are the following, clearly depending on the machine the test is run on:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">INFO</span><span class="p">:</span> <span class="k">Using</span> <span class="n">shift</span> <span class="k">for</span> <span class="mi">2000</span> <span class="n">iteration</span> <span class="n">over</span> <span class="mi">2000</span> <span class="n">elements</span> <span class="o">=</span> <span class="mi">00</span><span class="p">:</span><span class="mi">01</span><span class="p">:</span><span class="mi">04</span><span class="p">.</span><span class="mi">686821</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Using</span> <span class="n">shiftx</span> <span class="k">for</span> <span class="mi">2000</span> <span class="n">iteration</span> <span class="n">over</span> <span class="mi">2000</span> <span class="n">elements</span> <span class="o">=</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">055818</span>
<span class="k">DO</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">64746</span><span class="p">,</span><span class="mi">850</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">01</span><span class="p">:</span><span class="mi">04</span><span class="p">,</span><span class="mi">747</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, <code class="language-plaintext highlighter-rouge">shiftx</code> is clearly faster than the iterating <code class="language-plaintext highlighter-rouge">shift</code> version. The gap is even more evident if we raise the number of iterations and shifts to <code class="language-plaintext highlighter-rouge">5000</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">INFO</span><span class="p">:</span> <span class="k">Using</span> <span class="n">shift</span> <span class="k">for</span> <span class="mi">5000</span> <span class="n">iteration</span> <span class="n">over</span> <span class="mi">2505</span> <span class="n">elements</span> <span class="o">=</span> <span class="mi">00</span><span class="p">:</span><span class="mi">05</span><span class="p">:</span><span class="mi">40</span><span class="p">.</span><span class="mi">513576</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Using</span> <span class="n">shiftx</span> <span class="k">for</span> <span class="mi">5000</span> <span class="n">iteration</span> <span class="n">over</span> <span class="mi">2505</span> <span class="n">elements</span> <span class="o">=</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">078997</span>
<span class="k">DO</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">340597</span><span class="p">,</span><span class="mi">743</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">05</span><span class="p">:</span><span class="mi">40</span><span class="p">,</span><span class="mi">598</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>Thanks to array slices it is simple enough to implement a <code class="language-plaintext highlighter-rouge">shift</code>-like functionality in PL/PgSQL. The problem with the approach described here is that there is no easy way to modify the array in place, for example through a reference, so the functions need to return a compound result like a table.</p>
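<p>When a function is not needed at all, the same idea can be sketched directly in SQL. The following is only an illustration: the inline <code class="language-plaintext highlighter-rouge">VALUES</code> list and the <code class="language-plaintext highlighter-rouge">v( a )</code> alias are made up for the example, and the open-ended slice syntax assumes PostgreSQL 9.6 or later:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- one-step shift with plain SQL:
-- a[ 1 ] is the removed element, a[ 2: ] is everything after it
SELECT a[ 1 ]  AS head,
       a[ 2: ] AS tail
FROM ( VALUES ( array[ 'cat', 'dog', 'parrot' ] ) ) AS v( a );
</code></pre></div></div>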
PostgreSQL 16 introduces a few new statistic fields for tables and indexes2023-07-31T00:00:00+00:00https://fluca1978.github.io/2023/07/31/PostgreSQL16Statistics<p>An addition to pg_stat_xxx_tables and pg_stat_xxx_indexes that can help a lot in finding seldom-used tables and indexes.</p>
<h1 id="postgresql-16-introduces-a-few-new-statistic-fields-for-tables-and-indexes">PostgreSQL 16 introduces a few new statistic fields for tables and indexes</h1>
<p>PostgreSQL 16 adds two important timestamp fields to the statistics about tables and indexes, most notably <code class="language-plaintext highlighter-rouge">pg_stat_all_tables</code> and <code class="language-plaintext highlighter-rouge">pg_stat_all_indexes</code>. Clearly, such fields are also <em>inherited</em> in user and system catalogs, like for instance <code class="language-plaintext highlighter-rouge">pg_stat_user_tables</code> and <code class="language-plaintext highlighter-rouge">pg_stat_user_indexes</code>.
These two fields contain the last time a sequential scan against a table, or a scan of an index (i.e., the index was used to extract data, and hence read), happened. As with all statistics in PostgreSQL, the information is not available in real time; rather, it is updated at transaction boundaries.
<br />
<br />
Before these two fields were added, the statistics catalog provided only quantitative information, which was less precise because it required the database administrator to <em>guess</em> how the system was behaving, for example to understand whether an index was unused. A table with a huge number of sequential scans is a good candidate for a few indexes to be created, while on the other hand an index with a very low usage counter is a good candidate for removal. With PostgreSQL 16 it is now possible to better decide what to do in the above cases, knowing <strong>how far in the past</strong> a particular event happened, and hence to better understand how acting on such a table or index will affect the database.</p>
<p><br /></p>
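<p>To give an idea of the kind of question the new fields help answer, the following is a minimal sketch of a query that lists indexes not scanned recently. The thirty-day threshold is an arbitrary assumption, and a <code class="language-plaintext highlighter-rouge">NULL</code> value in <code class="language-plaintext highlighter-rouge">last_idx_scan</code> is taken to mean that no scan has been recorded since the statistics were last reset:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- requires PostgreSQL 16 or later
SELECT schemaname,
       relname,
       indexrelname,
       idx_scan,
       last_idx_scan
FROM   pg_stat_user_indexes
WHERE  last_idx_scan IS NULL                          -- never scanned
   OR  last_idx_scan &lt; now() - interval '30 days'  -- arbitrary threshold
ORDER BY last_idx_scan NULLS FIRST;
</code></pre></div></div>
<p>An index reported by such a query is only a candidate for removal: indexes backing primary keys or other constraints may be needed even if they are rarely scanned.</p>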
<p>In this article, I give a very short presentation of what it is like to query the new fields in the statistics catalogs. The tests have been done on PostgreSQL 16 beta-2.</p>
<h2 id="creating-a-simple-workbench">Creating a simple workbench</h2>
<p>Let’s create a very simple table and populate it:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">t</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span>
<span class="p">,</span> <span class="n">t</span> <span class="nb">text</span>
<span class="p">,</span> <span class="k">primary</span> <span class="k">key</span> <span class="p">(</span> <span class="n">pk</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">t</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'Test tuple #'</span> <span class="o">||</span> <span class="n">v</span>
<span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">10000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">10000000</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and add a partial index to the table:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">idx_t_even</span> <span class="k">ON</span> <span class="n">t</span><span class="p">(</span> <span class="n">pk</span> <span class="p">)</span> <span class="k">WHERE</span> <span class="n">pk</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">INDEX</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above produces a table and two indexes with the following sizes:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">relname</span><span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="n">oid</span> <span class="p">)</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relname</span> <span class="k">IN</span> <span class="p">(</span> <span class="s1">'t'</span><span class="p">,</span> <span class="s1">'idx_t_even'</span><span class="p">,</span> <span class="s1">'t_pkey'</span> <span class="p">);</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">------------+----------------</span>
<span class="n">t</span> <span class="o">|</span> <span class="mi">498</span> <span class="n">MB</span>
<span class="n">t_pkey</span> <span class="o">|</span> <span class="mi">214</span> <span class="n">MB</span>
<span class="n">idx_t_even</span> <span class="o">|</span> <span class="mi">107</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="lets-have-a-look-at-the-statistics">Let’s have a look at the statistics</h2>
<p>In the <code class="language-plaintext highlighter-rouge">pg_stat_xxx_tables</code> and <code class="language-plaintext highlighter-rouge">pg_stat_xxx_indexes</code> there are now two new fields named <code class="language-plaintext highlighter-rouge">last_seq_scan</code> and <code class="language-plaintext highlighter-rouge">last_idx_scan</code> respectively. These fields are timestamps and contain the timestamp of the last time a sequential scan or an index scan has been performed.</p>
<p>For example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">relname</span><span class="p">,</span> <span class="n">seq_scan</span><span class="p">,</span> <span class="n">now</span><span class="p">()</span> <span class="o">-</span> <span class="n">last_seq_scan</span> <span class="k">as</span> <span class="n">seq_scan_age</span><span class="p">,</span> <span class="n">idx_scan</span>
<span class="k">FROM</span> <span class="n">pg_stat_user_tables</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'t'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">----------------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">seq_scan</span> <span class="o">|</span> <span class="mi">3</span>
<span class="n">seq_scan_age</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">57</span><span class="p">.</span><span class="mi">627383</span>
<span class="n">idx_scan</span> <span class="o">|</span> <span class="mi">0</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>that tells us the table has been read sequentially three times, the last of which nearly five minutes ago.
And in fact, if we perform another query on the table, the statistics get updated:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">FROM</span> <span class="n">t</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">---</span>
<span class="k">count</span> <span class="o">|</span> <span class="mi">10000000</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">relname</span><span class="p">,</span> <span class="n">seq_scan</span><span class="p">,</span> <span class="n">now</span><span class="p">()</span> <span class="o">-</span> <span class="n">last_seq_scan</span> <span class="k">as</span> <span class="n">seq_scan_age</span><span class="p">,</span> <span class="n">idx_scan</span> <span class="k">FROM</span> <span class="n">pg_stat_user_tables</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'t'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">----------------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">seq_scan</span> <span class="o">|</span> <span class="mi">6</span>
<span class="n">seq_scan_age</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">01</span><span class="p">.</span><span class="mi">508468</span>
<span class="n">idx_scan</span> <span class="o">|</span> <span class="mi">0</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>What about the indexes? Well, the <code class="language-plaintext highlighter-rouge">pg_stat_user_indexes</code> view shows information about the indexes and, in this case, <code class="language-plaintext highlighter-rouge">last_idx_scan</code> is the newly added column. Since no index has been used yet, the computed age is empty:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">relname</span><span class="p">,</span> <span class="n">indexrelname</span><span class="p">,</span> <span class="n">idx_scan</span><span class="p">,</span> <span class="n">now</span><span class="p">()</span> <span class="o">-</span> <span class="n">last_idx_scan</span> <span class="k">FROM</span> <span class="n">pg_stat_user_indexes</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'t'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">indexrelname</span> <span class="o">|</span> <span class="n">t_pkey</span>
<span class="n">idx_scan</span> <span class="o">|</span> <span class="mi">0</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span> <span class="o">|</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">indexrelname</span> <span class="o">|</span> <span class="n">idx_t_even</span>
<span class="n">idx_scan</span> <span class="o">|</span> <span class="mi">0</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span> <span class="o">|</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Here too, as soon as an index is used, its <code class="language-plaintext highlighter-rouge">last_idx_scan</code> column is updated accordingly:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">relname</span><span class="p">,</span> <span class="n">indexrelname</span><span class="p">,</span> <span class="n">idx_scan</span><span class="p">,</span> <span class="n">now</span><span class="p">()</span> <span class="o">-</span> <span class="n">last_idx_scan</span> <span class="k">FROM</span> <span class="n">pg_stat_user_indexes</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'t'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">----------------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">indexrelname</span> <span class="o">|</span> <span class="n">t_pkey</span>
<span class="n">idx_scan</span> <span class="o">|</span> <span class="mi">0</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span> <span class="o">|</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="o">+</span><span class="c1">----------------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">indexrelname</span> <span class="o">|</span> <span class="n">idx_t_even</span>
<span class="n">idx_scan</span> <span class="o">|</span> <span class="mi">1</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">01</span><span class="p">.</span><span class="mi">885197</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>Before PostgreSQL 16, the <code class="language-plaintext highlighter-rouge">pg_stat_xxx_tables</code> and <code class="language-plaintext highlighter-rouge">pg_stat_xxx_indexes</code> views provided only quantitative information about the number of sequential scans and index scans. Now it is also possible to know when such an event last happened. This is important because it quickly reveals whether your indexes are still being used, without requiring you to reset the statistics and start monitoring them from scratch.</p>
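<p>As a rough sketch of how the new column can be put to work, the following query lists the indexes that have never been scanned, or that have not been scanned recently. The 30-day threshold is an arbitrary assumption, not something taken from the example above, and the result is only meaningful with respect to the last statistics reset:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Requires PostgreSQL 16 or later (last_idx_scan does not exist before).
-- The 30 days interval is an illustrative threshold: tune it to your workload.
SELECT schemaname, relname, indexrelname, idx_scan, last_idx_scan
FROM pg_stat_user_indexes
WHERE last_idx_scan IS NULL                        -- never scanned since the last reset
   OR last_idx_scan &lt; now() - interval '30 days'   -- not scanned recently
ORDER BY last_idx_scan NULLS FIRST;
</code></pre></div></div>
<p><br />
<br /></p>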
PostgreSQL Cluster Connection Limits2023-07-27T00:00:00+00:00https://fluca1978.github.io/2023/07/27/PostgreSQLConnections<p>A brief look to understand how the main cluster connection limits work.</p>
<h1 id="postgresql-cluster-connection-limits">PostgreSQL Cluster Connection Limits</h1>
<p>PostgreSQL has two main connection limit tunables that allow the system administrator to decide the maximum number of connections the cluster will accept and, in case an emergency activity has to be performed, how many of those connections are reserved for superusers.</p>
<p>PostgreSQL 16 is going to introduce a new parameter named <code class="language-plaintext highlighter-rouge">reserved_connections</code> alongside the other two, <code class="language-plaintext highlighter-rouge">max_connections</code> and <code class="language-plaintext highlighter-rouge">superuser_reserved_connections</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> postgres <span class="nt">-h</span> localhost <span class="nt">-c</span> <span class="s1">'SHOW SERVER_VERSION;'</span>
server_version
<span class="nt">----------------</span>
16beta2
<span class="o">(</span>1 row<span class="o">)</span>
% psql <span class="nt">-U</span> postgres <span class="nt">-h</span> localhost <span class="nt">-c</span> <span class="s2">"SELECT name, setting FROM pg_settings WHERE name like '%connections' and name not like 'log%'; "</span>
name | setting
<span class="nt">--------------------------------</span>+---------
max_connections | 100
reserved_connections | 0
superuser_reserved_connections | 3
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above are the default settings, which have not changed for several PostgreSQL releases.</p>
<p>The idea is to allow fine-grained tuning of how connections are limited depending on the role asking for them.
In this article, I try to briefly explain the difference between the two main settings (<code class="language-plaintext highlighter-rouge">max_connections</code> and <code class="language-plaintext highlighter-rouge">superuser_reserved_connections</code>) and the freshly introduced one (<code class="language-plaintext highlighter-rouge">reserved_connections</code>).</p>
<h2 id="the-connection-limits-settings-max_connections-and-superuser_reserved_connections">The Connection Limits Settings: <code class="language-plaintext highlighter-rouge">max_connections</code> and <code class="language-plaintext highlighter-rouge">superuser_reserved_connections</code></h2>
<p>First of all, the main idea is that the cluster <strong>is going to accept no more connections than <code class="language-plaintext highlighter-rouge">max_connections</code></strong>, that is <code class="language-plaintext highlighter-rouge">100</code> in the above output. Among the <code class="language-plaintext highlighter-rouge">max_connections</code> available slots, <code class="language-plaintext highlighter-rouge">superuser_reserved_connections</code> will be kept free for incoming connections from superuser roles.
<br />
In other words, clients and applications will be able to establish at most <code class="language-plaintext highlighter-rouge">max_connections - superuser_reserved_connections</code> connections.
<br />
<br />
It is simple enough to demonstrate this by means of <code class="language-plaintext highlighter-rouge">pgbench</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-U</span> pgbench <span class="nt">-T</span> 60 <span class="nt">-P</span> 5 <span class="nt">-n</span> <span class="nt">-c</span> 100 <span class="nt">-h</span> localhost pgbench
pgbench <span class="o">(</span>15.3, server 16beta2<span class="o">)</span>
pgbench: error: connection to server at <span class="s2">"localhost"</span> <span class="o">(</span>::1<span class="o">)</span>, port 5432 failed: FATAL: remaining connection slots are reserved <span class="k">for </span>roles with SUPERUSER
pgbench: error: could not create connection <span class="k">for </span>client 97
</code></pre></div></div>
<p><br />
<br /></p>
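<p>In the above, I asked <code class="language-plaintext highlighter-rouge">pgbench</code> to create <code class="language-plaintext highlighter-rouge">100</code> concurrent connections, which is exactly the <code class="language-plaintext highlighter-rouge">max_connections</code> value. The run fails because three connection slots are reserved for superusers, so only <code class="language-plaintext highlighter-rouge">97</code> are available to the <code class="language-plaintext highlighter-rouge">pgbench</code> user.</p>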
<p><br />
The reservation can also be observed by running <code class="language-plaintext highlighter-rouge">pgbench</code> and opening other connections at the same time. In a terminal, launch the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-U</span> pgbench <span class="nt">-T</span> 120 <span class="nt">-P</span> 5 <span class="nt">-n</span> <span class="nt">-c</span> 97 <span class="nt">-h</span> localhost pgbench
</code></pre></div></div>
<p><br />
<br /></p>
<p>that will consume all the available user-level connections and will last for two minutes. Meanwhile, in another terminal, if you try to log in as a non-superuser you get an error, while a superuser can still connect:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> pgbench <span class="nt">-h</span> localhost pgbench
psql: error: connection to server at <span class="s2">"localhost"</span> <span class="o">(</span>::1<span class="o">)</span>, port 5432 failed: FATAL: remaining connection slots are reserved <span class="k">for </span>roles with SUPERUSER
% psql <span class="nt">-U</span> postgres <span class="nt">-h</span> localhost pgbench
psql <span class="o">(</span>15.3, server 16beta2<span class="o">)</span>
WARNING: psql major version 15, server major version 16.
Some psql features might not work.
Type <span class="s2">"help"</span> <span class="k">for </span>help.
<span class="nv">pgbench</span><span class="o">=</span><span class="c">#</span>
</code></pre></div></div>
<p><br />
<br /></p>
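<p>While the <code class="language-plaintext highlighter-rouge">pgbench</code> run is in progress, the superuser session opened above could be used to see which roles are holding the connection slots. This is a minimal sketch based on <code class="language-plaintext highlighter-rouge">pg_stat_activity</code>, not part of the original test session:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Count the client connections per role, ignoring background processes.
-- Run it as a superuser, so that the sessions of every role are visible.
SELECT usename, count(*) AS connections
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY usename
ORDER BY connections DESC;
</code></pre></div></div>
<p><br />
<br /></p>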
<h2 id="the-new-connection-limit-reserved_connections">The New Connection Limit <code class="language-plaintext highlighter-rouge">reserved_connections</code></h2>
<p>As already written, this is a new parameter introduced by PostgreSQL 16.
This parameter reserves connection slots for roles that have been granted the predefined role <code class="language-plaintext highlighter-rouge">pg_use_reserved_connections</code>, and it is a way to give some non-superuser roles a privilege that ordinary roles do not have.</p>
<p><br />
First of all, let’s set the parameter to <code class="language-plaintext highlighter-rouge">10</code> connections; please note that, being a connection-related parameter, changing it requires a restart of the cluster.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-h</span> localhost <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s1">'ALTER SYSTEM SET reserved_connections TO 10;'</span>
ALTER SYSTEM
% pgenv stop
% pgenv start
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above I use <code class="language-plaintext highlighter-rouge">pgenv</code> as my PostgreSQL manager, but that is not the important part.
After that, the <code class="language-plaintext highlighter-rouge">pg_use_reserved_connections</code> role must be granted to the user(s) allowed to use the reserved slots:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> postgres <span class="nt">-h</span> localhost <span class="nt">-c</span> <span class="s1">'GRANT pg_use_reserved_connections TO luca;'</span>
GRANT ROLE
</code></pre></div></div>
<p><br />
<br /></p>
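<p>Before testing, it may be worth double checking that both steps took effect. The following is only a sketch of such a check, using the role <code class="language-plaintext highlighter-rouge">luca</code> from the example above; the column alias is made up for illustration:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- The setting should report 10 once the cluster has been restarted;
-- a context of 'postmaster' means the value is read only at server start.
SELECT name, setting, context
FROM pg_settings
WHERE name = 'reserved_connections';

-- True if luca is a member (direct or indirect) of the predefined role.
SELECT pg_has_role( 'luca', 'pg_use_reserved_connections', 'MEMBER' ) AS can_use_reserved;
</code></pre></div></div>
<p><br />
<br /></p>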
<p>It is now time to try:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-U</span> pgbench <span class="nt">-T</span> 120 <span class="nt">-P</span> 5 <span class="nt">-n</span> <span class="nt">-c</span> 97 <span class="nt">-h</span> localhost pgbench
pgbench <span class="o">(</span>15.3, server 16beta2<span class="o">)</span>
pgbench: error: connection to server at <span class="s2">"localhost"</span> <span class="o">(</span>::1<span class="o">)</span>, port 5432 failed: FATAL: remaining connection slots are reserved <span class="k">for </span>roles with privileges of the <span class="s2">"pg_use_reserved_connections"</span> role
pgbench: error: could not create connection <span class="k">for </span>client 87
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, <code class="language-plaintext highlighter-rouge">pgbench</code> is no longer able to obtain <code class="language-plaintext highlighter-rouge">97</code> connections, because now <code class="language-plaintext highlighter-rouge">10</code> slots are reserved for non-superuser roles that have been granted <code class="language-plaintext highlighter-rouge">pg_use_reserved_connections</code>. Therefore, the only way to make it work is to lower the number of concurrent connections to <code class="language-plaintext highlighter-rouge">max_connections - reserved_connections - superuser_reserved_connections</code>, that is <code class="language-plaintext highlighter-rouge">100 - 10 - 3 = 87</code>.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-U</span> pgbench <span class="nt">-T</span> 120 <span class="nt">-P</span> 5 <span class="nt">-n</span> <span class="nt">-c</span> 87 <span class="nt">-h</span> localhost pgbench
</code></pre></div></div>
<p><br />
<br /></p>
<p>and while the above is running, you can try to connect from another concurrent session:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> pgbench <span class="nt">-h</span> localhost pgbench
psql: error: connection to server at <span class="s2">"localhost"</span> <span class="o">(</span>::1<span class="o">)</span>, port 5432 failed: FATAL: remaining connection slots are reserved <span class="k">for </span>roles with privileges of the <span class="s2">"pg_use_reserved_connections"</span> role
% psql <span class="nt">-U</span> luca <span class="nt">-h</span> localhost pgbench
psql <span class="o">(</span>15.3, server 16beta2<span class="o">)</span>
WARNING: psql major version 15, server major version 16.
Some psql features might not work.
Type <span class="s2">"help"</span> <span class="k">for </span>help.
<span class="nv">pgbench</span><span class="o">=></span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the first attempt, the connection fails because the <code class="language-plaintext highlighter-rouge">pgbench</code> user does not have any more connection slots to use or, more precisely, there are no unreserved connection slots left in the cluster.
<br />
However, the user <code class="language-plaintext highlighter-rouge">luca</code> succeeds in connecting because he has been granted the special <code class="language-plaintext highlighter-rouge">pg_use_reserved_connections</code> role and there are still reserved slots available.</p>
<p><br />
It is important to note that <strong>even if your cluster does not have any role granted <code class="language-plaintext highlighter-rouge">pg_use_reserved_connections</code>, as soon as the setting <code class="language-plaintext highlighter-rouge">reserved_connections</code> is not zero the cluster will keep such connection slots reserved</strong>!
In other words, set <code class="language-plaintext highlighter-rouge">reserved_connections</code> only when you are sure you are going to grant the role to a few users, otherwise those slots are simply lost to ordinary connections.</p>
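<p>The arithmetic used throughout this section can be computed directly from the running cluster. The following is a small sketch that assumes PostgreSQL 16 or later, since <code class="language-plaintext highlighter-rouge">reserved_connections</code> does not exist in earlier versions; the column alias <code class="language-plaintext highlighter-rouge">ordinary_slots</code> is just an illustrative name:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- How many slots are left for roles that are neither superusers
-- nor members of pg_use_reserved_connections.
SELECT current_setting( 'max_connections' )::int                AS max_connections,
       current_setting( 'reserved_connections' )::int           AS reserved_connections,
       current_setting( 'superuser_reserved_connections' )::int AS superuser_reserved_connections,
       current_setting( 'max_connections' )::int
         - current_setting( 'reserved_connections' )::int
         - current_setting( 'superuser_reserved_connections' )::int AS ordinary_slots;
</code></pre></div></div>
<p><br />
<br /></p>
<p>With the values used in this article (<code class="language-plaintext highlighter-rouge">100</code>, <code class="language-plaintext highlighter-rouge">10</code> and <code class="language-plaintext highlighter-rouge">3</code>) the last column is <code class="language-plaintext highlighter-rouge">87</code>.</p>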
<h1 id="conclusions">Conclusions</h1>
<p>PostgreSQL is able to prevent the system administrator from being locked out of the cluster, even when the number of connections is approaching the maximum allowance.
Thanks to the new parameter <code class="language-plaintext highlighter-rouge">reserved_connections</code> added in the upcoming PostgreSQL 16, it will be possible to fine-tune the connection allowance even better!</p>
A PL/PgSQL Simple Roman Number Translator2023-07-24T00:00:00+00:00https://fluca1978.github.io/2023/07/24/PostgreSQLRomanCalculator<p>A way to decode a Roman number into an Arabic one and vice-versa using PL/PgSQL.</p>
<h1 id="a-plpgsql-simple-roman-number-translator">A PL/PgSQL Simple Roman Number Translator</h1>
<p>In the last <a href="https://theweeklychallenge.org/blog/perl-weekly-challenge-227/" target="_blank">Weekly Challenge 227</a> the second task was about building a simple <em>Roman numbers calculator</em>. Since I usually try to implement those tasks also in PL/PgSQL (as well as in PL/Perl), I tried to implement such a calculator and, along the way, I implemented a couple of simple functions to translate a number from and to Roman notation.
<br />
In this short post I explain how the two functions work.
<br />
My approach is based on a lookup table that stores the Arabic and Roman correspondences for <em>special cases</em> and <em>base units</em>.</p>
<h2 id="the-lookup-table">The lookup table</h2>
<p>I defined a lookup table, which can live in whatever schema you want, even a temporary one, and which is populated with the base units and some special cases:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">SCHEMA</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">fluca1978</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span><span class="p">(</span> <span class="n">r</span> <span class="nb">text</span><span class="p">,</span> <span class="n">n</span> <span class="nb">int</span><span class="p">,</span> <span class="k">repeatable</span> <span class="nb">boolean</span> <span class="p">);</span>
<span class="k">TRUNCATE</span> <span class="k">TABLE</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span>
<span class="k">VALUES</span>
<span class="p">(</span><span class="s1">'I'</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'IV'</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'V'</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'IX'</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'X'</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'XL'</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'L'</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'XC'</span><span class="p">,</span> <span class="mi">90</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'C'</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'CD'</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'D'</span><span class="p">,</span> <span class="mi">500</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'CM'</span><span class="p">,</span> <span class="mi">900</span><span class="p">,</span> <span class="k">false</span> <span class="p">)</span>
<span class="p">,(</span> <span class="s1">'M'</span><span class="p">,</span> <span class="mi">1000</span><span class="p">,</span> <span class="k">true</span> <span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">r</code> field holds the Roman value for a number <code class="language-plaintext highlighter-rouge">n</code>, while the <code class="language-plaintext highlighter-rouge">repeatable</code> flag indicates whether the symbol can be repeated consecutively in the same string. For example, <code class="language-plaintext highlighter-rouge">I</code> can be repeated to form <code class="language-plaintext highlighter-rouge">III</code>, while <code class="language-plaintext highlighter-rouge">IV</code> cannot be repeated into <code class="language-plaintext highlighter-rouge">IVIV</code>. This will be useful during validation.</p>
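<p>To double check how the flag splits the symbols, a quick aggregate over the lookup table can help. This is just a sketch, and the column aliases are made up for illustration:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Group the symbols by their repeatable flag, smallest value first.
SELECT string_agg( r, ', ' ORDER BY n ) FILTER ( WHERE repeatable )     AS repeatable_symbols
     , string_agg( r, ', ' ORDER BY n ) FILTER ( WHERE NOT repeatable ) AS single_use_symbols
FROM fluca1978.roman;
</code></pre></div></div>
<p><br />
<br /></p>
<p>With the rows inserted above, the first column contains <code class="language-plaintext highlighter-rouge">I, X, C, M</code> and the second one all the remaining symbols.</p>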
<h2 id="validating-a-roman-string">Validating a Roman String</h2>
<p>The following function performs a minimal validation of a given input string that is supposed to be a Roman number:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca1978</span><span class="p">.</span><span class="n">validate_roman</span><span class="p">(</span> <span class="n">r</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">boolean</span>
<span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">current_record</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span><span class="o">%</span><span class="n">rowtype</span><span class="p">;</span>
<span class="n">rx</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">matches</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">current_record</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">n</span> <span class="k">DESC</span> <span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Iterating over Roman value % = %'</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">n</span><span class="p">;</span>
<span class="n">matches</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">rx</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'^%s'</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span> <span class="p">);</span>
<span class="n">WHILE</span> <span class="n">r</span> <span class="o">~</span> <span class="n">rx</span> <span class="n">LOOP</span>
<span class="n">matches</span> <span class="p">:</span><span class="o">=</span> <span class="n">matches</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Input string % -> % matches the Roman value %'</span><span class="p">,</span> <span class="n">r</span><span class="p">,</span> <span class="n">matches</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="n">current_record</span><span class="p">.</span><span class="k">repeatable</span> <span class="k">AND</span> <span class="n">matches</span> <span class="o">></span> <span class="mi">1</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Roman symbol % cannot be repeated!'</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">r</span> <span class="p">:</span><span class="o">=</span> <span class="n">regexp_replace</span><span class="p">(</span> <span class="n">r</span><span class="p">,</span> <span class="n">rx</span><span class="p">,</span> <span class="s1">''</span> <span class="p">);</span>
<span class="n">EXIT</span> <span class="k">WHEN</span> <span class="k">length</span><span class="p">(</span> <span class="n">r</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">EXIT</span> <span class="k">WHEN</span> <span class="k">length</span><span class="p">(</span> <span class="n">r</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">length</span><span class="p">(</span> <span class="n">r</span> <span class="p">)</span> <span class="o">></span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="k">RETURN</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">true</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The idea is simple: I scan the lookup table in descending order, so from the biggest value to the smallest one.
At each iteration, I check if the current Roman string starts with the Roman letter (or couple of letters). If that is the case, I keep track of how many <code class="language-plaintext highlighter-rouge">matches</code> I’ve found, then remove the Roman symbol from the beginning of the string.
Then I check whether the same letter/symbol can be found again at the beginning of the string and, if so, I ensure it is a repeatable value, otherwise there is an error.
If everything goes well, the resulting string <code class="language-plaintext highlighter-rouge">r</code> will be empty due to the substitutions, otherwise if some characters remain then the string is wrong.
That happens, for example, when the Roman values on the right are bigger than those on the left.</p>
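<p>A few calls help to see the rules described above in action. The input strings are my own examples, chosen to hit each branch of the function, and the results in the comments follow from tracing the code against the lookup table defined earlier:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- X is consumed first, then IV: the string becomes empty.
SELECT fluca1978.validate_roman( 'XIV' );   -- true

-- IV matches twice in a row, but it is not repeatable.
SELECT fluca1978.validate_roman( 'IVIV' );  -- false

-- I is consumed only at the very end of the scan, so M is left over.
SELECT fluca1978.validate_roman( 'IM' );    -- false
</code></pre></div></div>
<p><br />
<br /></p>
<p>Being a minimal validation, the function does not enforce every rule of the Roman notation: for instance, it places no upper bound on how many times a repeatable symbol appears.</p>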
<h2 id="converting-from-roman-to-arabic">Converting from Roman to Arabic</h2>
<p>The following function does the conversion starting from a Roman string:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca1978</span><span class="p">.</span><span class="n">from_roman</span><span class="p">(</span> <span class="n">r</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">v</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">current_record</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span><span class="o">%</span><span class="n">rowtype</span><span class="p">;</span>
<span class="n">rx</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">IF</span> <span class="n">r</span> <span class="o">=</span> <span class="s1">''</span> <span class="k">THEN</span>
<span class="k">RETURN</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">validate_roman</span><span class="p">(</span> <span class="n">r</span> <span class="p">)</span> <span class="k">THEN</span>
<span class="k">RETURN</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">FOR</span> <span class="n">current_record</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">n</span> <span class="k">DESC</span> <span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Iterating over Roman value % = %'</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">n</span><span class="p">;</span>
<span class="n">rx</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'^%s'</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span> <span class="p">);</span>
<span class="n">WHILE</span> <span class="n">r</span> <span class="o">~</span> <span class="n">rx</span> <span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Input string % matches the Roman value %'</span><span class="p">,</span> <span class="n">r</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">;</span>
<span class="n">v</span> <span class="p">:</span><span class="o">=</span> <span class="n">v</span> <span class="o">+</span> <span class="n">current_record</span><span class="p">.</span><span class="n">n</span><span class="p">;</span>
<span class="n">r</span> <span class="p">:</span><span class="o">=</span> <span class="n">regexp_replace</span><span class="p">(</span> <span class="n">r</span><span class="p">,</span> <span class="n">rx</span><span class="p">,</span> <span class="s1">''</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Converted value is %'</span><span class="p">,</span> <span class="n">v</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">v</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is very similar to the validating function: it walks the string from left to right, decoding each part as the biggest possible value found in the Roman lookup table.
<br />
<br />
It is possible to see the workflow of the function by means of using the <code class="language-plaintext highlighter-rouge">DEBUG</code> log level:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">set</span> <span class="n">client_min_messages</span> <span class="k">to</span> <span class="n">debug</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">from_roman</span><span class="p">(</span> <span class="s1">'MCMLXXVIII'</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">M</span> <span class="o">=</span> <span class="mi">1000</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">MCMLXXVIII</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">M</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">CM</span> <span class="o">=</span> <span class="mi">900</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">CMLXXVIII</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">CM</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">D</span> <span class="o">=</span> <span class="mi">500</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">CD</span> <span class="o">=</span> <span class="mi">400</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="k">C</span> <span class="o">=</span> <span class="mi">100</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">XC</span> <span class="o">=</span> <span class="mi">90</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">L</span> <span class="o">=</span> <span class="mi">50</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">LXXVIII</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">L</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">XL</span> <span class="o">=</span> <span class="mi">40</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">X</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">XXVIII</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">X</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">XVIII</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">X</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">IX</span> <span class="o">=</span> <span class="mi">9</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">V</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">VIII</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">V</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">IV</span> <span class="o">=</span> <span class="mi">4</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Iterating</span> <span class="n">over</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">I</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">III</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">I</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">II</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">I</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Input</span> <span class="n">string</span> <span class="n">I</span> <span class="n">matches</span> <span class="n">the</span> <span class="n">Roman</span> <span class="n">value</span> <span class="n">I</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Converted</span> <span class="n">value</span> <span class="k">is</span> <span class="mi">1978</span>
<span class="n">from_roman</span>
<span class="c1">------------</span>
<span class="mi">1978</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, at every iteration the function removes the leftmost matching symbol (one or two letters) from the string and continues to see what it can find next.
<br />
The matching is performed by building a regular expression used as the condition of the <code class="language-plaintext highlighter-rouge">WHILE</code> loop: the expression has the <em>beginning of string</em> anchor <code class="language-plaintext highlighter-rouge">^</code> followed by whatever Roman symbol is in the current record of the lookup table. The special <code class="language-plaintext highlighter-rouge">EXIT</code> part ensures that there cannot be repetitions of two-letter symbols. For example, you cannot express <code class="language-plaintext highlighter-rouge">IVIV</code> as <code class="language-plaintext highlighter-rouge">8</code>, so once <code class="language-plaintext highlighter-rouge">IV</code> is encountered, the <code class="language-plaintext highlighter-rouge">WHILE</code> knows it can safely exit the loop.</p>
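<p>The two building blocks of the loop can also be tried in isolation. The following is a minimal sketch, not part of the converter itself, that applies the anchored match and the removal of the matched prefix to one of the intermediate strings from the debug session above:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- '^CM' matches only when the string begins with CM
SELECT 'CMLXXVIII' ~ '^CM'                      AS matches,
       -- without the 'g' flag only the first match is removed
       regexp_replace( 'CMLXXVIII', '^CM', '' ) AS remainder;

-- matches | remainder
-- --------+-----------
-- t       | LXXVIII
</code></pre></div></div>
<p><br />
<br /></p>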
<h2 id="converting-from-arabic-to-roman">Converting From Arabic to Roman</h2>
<p>The following function does the opposite: converts an integer into a roman string.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca1978</span><span class="p">.</span><span class="n">to_roman</span><span class="p">(</span> <span class="n">n</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">text</span>
<span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">roman_value</span> <span class="nb">text</span> <span class="p">:</span><span class="o">=</span> <span class="s1">''</span><span class="p">;</span>
<span class="n">current_record</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span><span class="o">%</span><span class="n">rowtype</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">IF</span> <span class="n">n</span> <span class="o"><=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Cannot convert zero!'</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NULL</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">FOR</span> <span class="n">current_record</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">n</span> <span class="k">DESC</span> <span class="n">LOOP</span>
<span class="n">WHILE</span> <span class="n">n</span> <span class="o">>=</span> <span class="n">current_record</span><span class="p">.</span><span class="n">n</span> <span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'The value % is greater than % so appending a %'</span><span class="p">,</span> <span class="n">n</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">n</span><span class="p">,</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">;</span>
<span class="n">roman_value</span> <span class="p">:</span><span class="o">=</span> <span class="n">roman_value</span> <span class="o">||</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span><span class="p">;</span>
<span class="n">n</span> <span class="p">:</span><span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="n">current_record</span><span class="p">.</span><span class="n">n</span><span class="p">;</span>
<span class="n">EXIT</span> <span class="k">WHEN</span> <span class="k">length</span><span class="p">(</span> <span class="n">current_record</span><span class="p">.</span><span class="n">r</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Computed value is %'</span><span class="p">,</span> <span class="n">roman_value</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">roman_value</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Again, thanks to the debug output it is easy to understand the workflow of the converter:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">to_roman</span><span class="p">(</span> <span class="mi">1978</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">1978</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">1000</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">M</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">978</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">900</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">CM</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">78</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">50</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">L</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">28</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">10</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">X</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">18</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">10</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">X</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">8</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">5</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">V</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">3</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">1</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">I</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">2</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">1</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">I</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">The</span> <span class="n">value</span> <span class="mi">1</span> <span class="k">is</span> <span class="n">greater</span> <span class="k">than</span> <span class="mi">1</span> <span class="n">so</span> <span class="n">appending</span> <span class="n">a</span> <span class="n">I</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Computed</span> <span class="n">value</span> <span class="k">is</span> <span class="n">MCMLXXVIII</span>
<span class="n">to_roman</span>
<span class="c1">------------</span>
<span class="n">MCMLXXVIII</span>
</code></pre></div></div>
<p><br />
<br /></p>
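<p>PostgreSQL also ships a built-in Roman numeral formatter: <code class="language-plaintext highlighter-rouge">to_char</code> with the <code class="language-plaintext highlighter-rouge">RN</code> template pattern, which covers values from 1 to 3999. As a sanity check, and assuming the <code class="language-plaintext highlighter-rouge">to_roman</code> function and the lookup table are installed as shown above, the following sketch lists every value on which the two disagree; an empty result would mean they agree over the whole range:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- avoid flooding the terminal with the DEBUG messages of to_roman
SET client_min_messages TO notice;

-- FM suppresses the blank padding that RN adds by default
SELECT n,
       fluca1978.to_roman( n ) AS converted,
       to_char( n, 'FMRN' )    AS builtin
FROM generate_series( 1, 3999 ) AS n
WHERE fluca1978.to_roman( n ) IS DISTINCT FROM to_char( n, 'FMRN' );
</code></pre></div></div>
<p><br />
<br /></p>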
<h2 id="caching-results">Caching Results</h2>
<p>It is, clearly, very simple to define a materialized view or a cache table that holds all the values for a faster lookup.
As an example, imagine creating a table that serves as a cache:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache_table</span><span class="p">(</span> <span class="n">n</span> <span class="nb">int</span><span class="p">,</span> <span class="n">r</span> <span class="nb">text</span> <span class="p">);</span>
<span class="k">TRUNCATE</span> <span class="k">TABLE</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache_table</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache_table</span><span class="p">(</span> <span class="n">n</span><span class="p">,</span> <span class="n">r</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">n</span><span class="p">,</span> <span class="n">r</span>
<span class="k">FROM</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">n</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and then a function that, given a number, checks whether the caching table already covers it; if not, it populates the table from the highest cached value up to the given one:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache</span><span class="p">(</span> <span class="n">x</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">text</span>
<span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">max_cached_value</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">i</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">v</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">SELECT</span> <span class="k">max</span><span class="p">(</span> <span class="n">n</span> <span class="p">)</span>
<span class="k">INTO</span> <span class="n">max_cached_value</span>
<span class="k">FROM</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache_table</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Max cached value % and looking for %'</span><span class="p">,</span> <span class="n">max_cached_value</span><span class="p">,</span> <span class="n">x</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">max_cached_value</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">OR</span> <span class="n">x</span> <span class="o">></span> <span class="n">max_cached_value</span> <span class="k">THEN</span>
<span class="n">IF</span> <span class="n">max_cached_value</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="n">max_cached_value</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Repopulating the cache from % to %'</span><span class="p">,</span> <span class="n">max_cached_value</span><span class="p">,</span> <span class="n">x</span><span class="p">;</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="n">max_cached_value</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">x</span> <span class="n">LOOP</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache_table</span><span class="p">(</span> <span class="n">n</span><span class="p">,</span> <span class="n">r</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">i</span><span class="p">,</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">to_roman</span><span class="p">(</span> <span class="n">i</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">SELECT</span> <span class="n">r</span>
<span class="k">INTO</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">fluca1978</span><span class="p">.</span><span class="n">roman_cache_table</span>
<span class="k">WHERE</span> <span class="n">n</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">v</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>When you query the above function, it inspects the <code class="language-plaintext highlighter-rouge">roman_cache_table</code> for the requested Arabic number and, if the number is there, returns the cached Roman string. If the number is greater than the max value within the caching table, the function first populates the table up to the given number.</p>
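<p>A different caching strategy is to look up a single value and compute it only on a cache miss, instead of filling the table up to the requested number. The following is a minimal sketch of that idea, not the approach used above: the names <code class="language-plaintext highlighter-rouge">roman_cache_lazy</code> and <code class="language-plaintext highlighter-rouge">roman_cache_lazy_table</code> are hypothetical, and the <code class="language-plaintext highlighter-rouge">to_roman</code> function defined earlier is assumed to be available.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical cache table: one row per number that was already requested
CREATE TABLE IF NOT EXISTS fluca1978.roman_cache_lazy_table(
    n int PRIMARY KEY,
    r text NOT NULL
);

CREATE OR REPLACE FUNCTION
fluca1978.roman_cache_lazy( x int )
RETURNS text
STRICT
AS $CODE$
DECLARE
    v text;
BEGIN
    -- try the cache first
    SELECT r
    INTO v
    FROM fluca1978.roman_cache_lazy_table
    WHERE n = x;

    IF FOUND THEN
        RETURN v;
    END IF;

    -- cache miss: compute the value and remember it
    v := fluca1978.to_roman( x );
    IF v IS NOT NULL THEN
        INSERT INTO fluca1978.roman_cache_lazy_table( n, r )
        VALUES ( x, v )
        ON CONFLICT ( n ) DO NOTHING;
    END IF;

    RETURN v;
END
$CODE$
LANGUAGE plpgsql;
</code></pre></div></div>
<p><br />
<br /></p>
<p>In this variant the primary key on <code class="language-plaintext highlighter-rouge">n</code> lets the lookup use an index, and the <code class="language-plaintext highlighter-rouge">ON CONFLICT</code> clause makes it harmless for two sessions to cache the same value at the same time.</p>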
<h1 id="conclusions">Conclusions</h1>
<p>With some patience and a few iterations, it is possible to create a fully functional Roman Number Converter, and hence also a calculator.
Clearly, this kind of task is much simpler with Perl (and PL/Perl), but PL/PgSQL can handle it too with a little more verbosity.
Code from the above examples can be found on my <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/roman-converter/roman-converter.sql" target="_blank">GitHub repository</a>.</p>
Multi-Dimensional Arrays in PostgreSQL2023-05-18T00:00:00+00:00https://fluca1978.github.io/2023/05/18/PostgreSQLMultiDimensionalArraysInFunctions<p>A look at how PostgreSQL handles multi-dimensional arrays.</p>
<h1 id="multi-dimensional-arrays-in-postgresql">Multi-Dimensional Arrays in PostgreSQL</h1>
<p>PostgreSQL supports arrays of various types, and also handles multi-dimensional arrays.
<em>Except that it does not support multi-dimensional arrays</em>!</p>
<p><br /></p>
<p>Allow me to better explain.
A multi-dimensional array is <em>just</em> an array that contains other arrays. In this sense, PostgreSQL does not provide a pure native multi-dimensional array type, even though you can declare one.</p>
<p>Let’s see this in action by means of <code class="language-plaintext highlighter-rouge">pg_typeof</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_typeof</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span> <span class="n">array</span><span class="p">[</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span> <span class="p">],</span>
<span class="n">array</span><span class="p">[</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span> <span class="p">]</span> <span class="p">]::</span><span class="nb">int</span><span class="p">[][]</span> <span class="p">);</span>
<span class="n">pg_typeof</span>
<span class="c1">-----------</span>
<span class="nb">integer</span><span class="p">[]</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the above <em>matrix</em> is reported to be a single flat array.</p>
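<p>The number of dimensions is a property of each array value, not of its type. A quick way to verify this, as a small sketch using the standard array functions, is to ask the value itself:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT array_ndims( m )     AS dimensions,
       array_dims( m )      AS bounds,
       array_length( m, 1 ) AS n_rows,
       array_length( m, 2 ) AS n_columns
FROM ( SELECT array[ array[ 1, 2 ],
                     array[ 3, 4 ] ] AS m ) AS t;

-- dimensions |   bounds   | n_rows | n_columns
-- -----------+------------+--------+-----------
--          2 | [1:2][1:2] |      2 |         2
</code></pre></div></div>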
<p><br /></p>
<p>Consider now the following function, that accepts a multi-dimensional array and returns a table:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_matrix</span><span class="p">(</span> <span class="nb">int</span><span class="p">[][]</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">a</span> <span class="nb">int</span><span class="p">,</span> <span class="n">b</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">matrix</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="k">for</span> <span class="n">my</span> <span class="err">$</span><span class="k">row</span> <span class="p">(</span> <span class="mi">0</span> <span class="p">..</span> <span class="err">$</span><span class="n">matrix</span><span class="o">->@*</span> <span class="o">-</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">my</span> <span class="err">$</span><span class="k">column</span> <span class="p">(</span> <span class="mi">0</span> <span class="p">..</span> <span class="err">$</span><span class="n">matrix</span><span class="o">-></span><span class="p">[</span> <span class="err">$</span><span class="k">row</span> <span class="p">]</span><span class="o">->@*</span> <span class="o">-</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">return_next</span><span class="p">(</span> <span class="p">{</span> <span class="n">a</span> <span class="o">=></span> <span class="err">$</span><span class="k">row</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span>
<span class="n">b</span> <span class="o">=></span> <span class="err">$</span><span class="n">matrix</span><span class="o">-></span><span class="p">[</span> <span class="err">$</span><span class="k">row</span> <span class="p">]</span><span class="o">-></span><span class="p">[</span> <span class="err">$</span><span class="k">column</span> <span class="p">]</span>
<span class="p">}</span> <span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above function, when invoked with a multi-dimensional array, works as expected:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span>
<span class="k">from</span> <span class="n">f_matrix</span><span class="p">(</span> <span class="n">array</span><span class="p">[</span> <span class="n">array</span><span class="p">[</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span> <span class="p">],</span>
<span class="n">array</span><span class="p">[</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span> <span class="p">]</span> <span class="p">]::</span><span class="nb">int</span><span class="p">[][]</span> <span class="p">);</span>
<span class="n">a</span> <span class="o">|</span> <span class="n">b</span>
<span class="c1">---+---</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">4</span>
<span class="p">(</span><span class="mi">4</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>However, if you inspect the function, <em>its signature clearly shows that the input parameter is a flat array</em>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">df</span> <span class="n">f_matrix</span>
<span class="n">List</span> <span class="k">of</span> <span class="n">functions</span>
<span class="k">Schema</span> <span class="o">|</span> <span class="n">Name</span> <span class="o">|</span> <span class="k">Result</span> <span class="k">data</span> <span class="k">type</span> <span class="o">|</span> <span class="n">Argument</span> <span class="k">data</span> <span class="n">types</span> <span class="o">|</span> <span class="k">Type</span>
<span class="c1">--------+----------+-----------------------------+---------------------+------</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">f_matrix</span> <span class="o">|</span> <span class="k">TABLE</span><span class="p">(</span><span class="n">a</span> <span class="nb">integer</span><span class="p">,</span> <span class="n">b</span> <span class="nb">integer</span><span class="p">)</span> <span class="o">|</span> <span class="nb">integer</span><span class="p">[]</span> <span class="o">|</span> <span class="n">func</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The same result, clearly, can be achieved by <code class="language-plaintext highlighter-rouge">plpgsql</code>, for example implementing the following function:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_matrix</span><span class="p">(</span> <span class="n">m</span> <span class="nb">int</span><span class="p">[][]</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">a</span> <span class="nb">int</span><span class="p">,</span> <span class="n">b</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">r</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">c</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">r</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">m</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="n">LOOP</span>
<span class="k">FOR</span> <span class="k">c</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">m</span><span class="p">,</span> <span class="mi">2</span> <span class="p">)</span> <span class="n">LOOP</span>
<span class="n">a</span> <span class="p">:</span><span class="o">=</span> <span class="n">r</span><span class="p">;</span>
<span class="n">b</span> <span class="p">:</span><span class="o">=</span> <span class="n">m</span><span class="p">[</span> <span class="n">r</span> <span class="p">][</span> <span class="k">c</span> <span class="p">];</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In conclusion, PostgreSQL manages multi-dimensional arrays as flat lists, much as you would do in the C programming language.
This does not mean that you cannot use multi-dimensional arrays in a comfortable and efficient way, but rather that you need to take into account how they are <em>handled</em> by the database engine, especially when passing them to a function.</p>
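<p>As a quick check of the point above, the following sketch (run against any database, with no table needed) shows that the number of dimensions belongs to the <em>value</em> and not to the <em>type</em>; it only uses the built-in functions <code class="language-plaintext highlighter-rouge">pg_typeof</code>, <code class="language-plaintext highlighter-rouge">array_ndims</code>, <code class="language-plaintext highlighter-rouge">array_dims</code> and <code class="language-plaintext highlighter-rouge">unnest</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- the type of a matrix is the same as the type of a flat array
SELECT pg_typeof( array[ array[ 1, 2 ], array[ 3, 4 ] ] );   -- integer[]

-- the dimensions are a property of the value
SELECT array_ndims( m )      AS ndims   -- 2
     , array_dims( m )       AS dims    -- [1:2][1:2]
     , array_length( m, 1 )  AS n_rows  -- 2
     , array_length( m, 2 )  AS n_cols  -- 2
FROM ( SELECT array[ array[ 1, 2 ], array[ 3, 4 ] ] AS m ) AS t;

-- unnest() walks the elements in storage order: 1, 2, 3, 4
SELECT unnest( array[ array[ 1, 2 ], array[ 3, 4 ] ] );
</code></pre></div></div>
<p>This is why a function declared with an <code class="language-plaintext highlighter-rouge">int[][]</code> parameter shows up as accepting <code class="language-plaintext highlighter-rouge">integer[]</code>: the function body has to ask the value itself for its dimensions.</p>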
Table name as function arguments: a few checks2023-05-11T00:00:00+00:00https://fluca1978.github.io/2023/05/11/PostgreSQLCheckTableNamesInFunctions<p>How to check if a given table name exists and where to find it.</p>
<h1 id="table-name-as-function-arguments-a-few-checks">Table name as function arguments: a few checks</h1>
<p>Often I write some piece of code, usually a function or a procedure, that must operate dynamically on a table. To achieve this, I often pass the table name as an argument to the function.
<br />
The function should always check that the table exists and, moreover, the function should always use the fully qualified name of the table to avoid schema conflicts and <code class="language-plaintext highlighter-rouge">search_path</code> pollution problems.
Lastly, sometimes I pass a relative name as the argument, and sometimes I want to pass a fully qualified name to the function.
<br />
<br /></p>
<p>I have a template for doing these minimal checks; clearly, it is just an idea on how to improve your own functions when dealing with table names.
Imagine a simple function that accepts a table name, as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_do_stuff_on_table</span><span class="p">(</span> <span class="n">t_name</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">bool</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">s_name</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">info</span> <span class="nb">text</span><span class="p">[];</span>
<span class="n">pg_version</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">qualified_name</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="c1">-- parse the schema name</span>
<span class="n">info</span> <span class="p">:</span><span class="o">=</span> <span class="n">parse_ident</span><span class="p">(</span> <span class="n">t_name</span> <span class="p">);</span>
<span class="n">IF</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">info</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">2</span> <span class="k">THEN</span>
<span class="n">s_name</span> <span class="p">:</span><span class="o">=</span> <span class="n">info</span><span class="p">[</span> <span class="mi">1</span> <span class="p">];</span>
<span class="n">t_name</span> <span class="p">:</span><span class="o">=</span> <span class="n">info</span><span class="p">[</span> <span class="mi">2</span> <span class="p">];</span>
<span class="k">ELSE</span>
<span class="c1">-- try to understand if PostgreSQL 15 or higher</span>
<span class="k">SELECT</span> <span class="n">setting</span><span class="p">::</span><span class="nb">int</span>
<span class="k">INTO</span> <span class="n">pg_version</span>
<span class="k">FROM</span> <span class="n">pg_settings</span>
<span class="k">WHERE</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'server_version_num'</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">pg_version</span> <span class="o">>=</span> <span class="mi">150000</span> <span class="k">THEN</span>
<span class="k">SELECT</span> <span class="k">current_role</span>
<span class="k">INTO</span> <span class="n">s_name</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">s_name</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'public'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- check if the table exists</span>
<span class="n">PERFORM</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span>
<span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">c</span>
<span class="k">JOIN</span> <span class="n">pg_namespace</span> <span class="n">n</span>
<span class="k">ON</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">relnamespace</span>
<span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">n</span><span class="p">.</span><span class="n">nspname</span> <span class="o">=</span> <span class="n">s_name</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="o">=</span> <span class="n">t_name</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="k">FOUND</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="s1">'Table %.% does not exist, cannot proceed!'</span><span class="p">,</span> <span class="n">s_name</span><span class="p">,</span> <span class="n">t_name</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">qualified_name</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'%I.%I'</span><span class="p">,</span> <span class="n">s_name</span><span class="p">,</span> <span class="n">t_name</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Table %'</span><span class="p">,</span> <span class="n">qualified_name</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">true</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The function accepts <code class="language-plaintext highlighter-rouge">t_name</code> that can be a relative name (e.g., <em>foo</em>) or an absolute name like <em>public.foo</em>.
<br />
Initially the function uses the <code class="language-plaintext highlighter-rouge">parse_ident</code> built-in PostgreSQL function to split the name into an array of elements: for a qualified name, the first one is the schema name and the second one is the table name. By checking whether the returned array has a size of 2, I can tell which kind of name the function has received as an argument.
<br />
If the function received a fully qualified table name, I store the schema into <code class="language-plaintext highlighter-rouge">s_name</code> and rewrite <code class="language-plaintext highlighter-rouge">t_name</code> with only its relative name, and nothing more has to be done on the naming part. On the other hand, if the function received a relative name, I must use a <em>default</em> schema: generally speaking that is <code class="language-plaintext highlighter-rouge">public</code>, but since PostgreSQL 15 ordinary users can no longer create objects in <code class="language-plaintext highlighter-rouge">public</code> by default, so on version 15 or higher I assume a schema named after the current role. Therefore, I get the PostgreSQL version number and decide what value <code class="language-plaintext highlighter-rouge">s_name</code> will assume, either <code class="language-plaintext highlighter-rouge">public</code> or the value of <code class="language-plaintext highlighter-rouge">current_role</code>.
<br />
Once I have both the schema and the relative table name, I can look for the table in <code class="language-plaintext highlighter-rouge">pg_class</code>, joining <code class="language-plaintext highlighter-rouge">pg_namespace</code> to make sure that the table lives in that schema.
If the table is not there, I can <code class="language-plaintext highlighter-rouge">RAISE</code> an exception and stop the function right there, otherwise I can build a qualified name and go on with the rest of the work.</p>
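<p>To make the name-splitting step concrete, here is a small sketch of what <code class="language-plaintext highlighter-rouge">parse_ident</code> returns for a few inputs. The last query is an assumption of mine and not what the function above does: it uses the built-in <code class="language-plaintext highlighter-rouge">to_regclass</code> as a possible alternative existence check.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- relative name: a single element
SELECT parse_ident( 'foo' );                      -- {foo}

-- qualified name: schema first, table second
SELECT parse_ident( 'public.foo' );               -- {public,foo}

-- quoted parts keep their case, unquoted parts are folded to lowercase
SELECT parse_ident( '"SomeSchema".someTable' );   -- {SomeSchema,sometable}

-- alternative check: to_regclass() returns NULL when no relation is found
SELECT to_regclass( 'public.foo' ) IS NOT NULL AS relation_exists;
</code></pre></div></div>
<p>Keep in mind that <code class="language-plaintext highlighter-rouge">to_regclass</code> resolves unqualified names through the <code class="language-plaintext highlighter-rouge">search_path</code> and matches any kind of relation (views, indexes and sequences included), so it is not a drop-in replacement for the <code class="language-plaintext highlighter-rouge">relkind = 'r'</code> check shown above.</p>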
Extracting the list of columns from the catalogs2023-05-08T00:00:00+00:00https://fluca1978.github.io/2023/05/08/PostgreSQLTableColumns<p>A simple look at the PostgreSQL catalogs to get the list of a table’s column.</p>
<h1 id="extracting-the-list-of-columns-from-the-catalogs">Extracting the list of columns from the catalogs</h1>
<p>The special catalog <code class="language-plaintext highlighter-rouge">pg_attribute</code> keeps track of every column that your tabular structure holds.
However, before using such a catalog, you need to keep in mind some basic rules. In particular, every attribute in the catalog has an ordinal number named <code class="language-plaintext highlighter-rouge">attnum</code>: when the number is positive the attribute refers to a user defined column, while when it is negative it represents a PostgreSQL system column.
Moreover, the special column <code class="language-plaintext highlighter-rouge">attisdropped</code> indicates if the attribute has been dropped. When an attribute is dropped, it also takes a <em>special</em> name in the catalog, like <code class="language-plaintext highlighter-rouge">........pg.dropped.5........</code>.</p>
<p>Imagine we create a dummy table as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">foo</span><span class="p">(</span>
<span class="n">a</span> <span class="nb">int</span>
<span class="p">,</span> <span class="n">b</span> <span class="nb">char</span>
<span class="p">,</span> <span class="k">c</span> <span class="nb">int</span>
<span class="p">,</span> <span class="n">d</span> <span class="nb">int</span>
<span class="p">,</span> <span class="n">e</span> <span class="nb">int</span>
<span class="p">,</span> <span class="n">f</span> <span class="nb">int</span>
<span class="p">,</span> <span class="k">g</span> <span class="nb">char</span>
<span class="p">,</span> <span class="n">z</span> <span class="nb">text</span>
<span class="p">,</span> <span class="n">y</span> <span class="nb">char</span>
<span class="p">,</span> <span class="n">k</span> <span class="nb">int</span>
<span class="p">,</span> <span class="n">j</span> <span class="nb">int</span>
<span class="p">);</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="k">DROP</span> <span class="k">COLUMN</span> <span class="n">e</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Getting the list of columns is easy via the catalog <strong><code class="language-plaintext highlighter-rouge">pg_attribute</code></strong>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">attname</span><span class="p">,</span> <span class="n">attnum</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="n">attisdropped</span><span class="p">;</span>
<span class="n">attname</span> <span class="o">|</span> <span class="n">attnum</span>
<span class="c1">----------+--------</span>
<span class="n">tableoid</span> <span class="o">|</span> <span class="o">-</span><span class="mi">6</span>
<span class="n">cmax</span> <span class="o">|</span> <span class="o">-</span><span class="mi">5</span>
<span class="n">xmax</span> <span class="o">|</span> <span class="o">-</span><span class="mi">4</span>
<span class="n">cmin</span> <span class="o">|</span> <span class="o">-</span><span class="mi">3</span>
<span class="n">xmin</span> <span class="o">|</span> <span class="o">-</span><span class="mi">2</span>
<span class="n">ctid</span> <span class="o">|</span> <span class="o">-</span><span class="mi">1</span>
<span class="n">a</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">b</span> <span class="o">|</span> <span class="mi">2</span>
<span class="k">c</span> <span class="o">|</span> <span class="mi">3</span>
<span class="n">d</span> <span class="o">|</span> <span class="mi">4</span>
<span class="n">f</span> <span class="o">|</span> <span class="mi">6</span>
<span class="k">g</span> <span class="o">|</span> <span class="mi">7</span>
<span class="n">z</span> <span class="o">|</span> <span class="mi">8</span>
<span class="n">y</span> <span class="o">|</span> <span class="mi">9</span>
<span class="n">k</span> <span class="o">|</span> <span class="mi">10</span>
<span class="n">j</span> <span class="o">|</span> <span class="mi">11</span>
<span class="p">(</span><span class="mi">16</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>There are a few things to note in the above output:
<br />
1) the <code class="language-plaintext highlighter-rouge">attnum</code> is the order in which the column has been added to the table, in fact it respects the original table definition;
<br />
2) if <code class="language-plaintext highlighter-rouge">attnum</code> is <strong>positive</strong> then the attribute is a <em>user defined one</em>, that means it is a column you added to the table;
<br />
3) if <code class="language-plaintext highlighter-rouge">attnum</code> is <strong>negative</strong> then the attribute has been added by the system (i.e., PostgreSQL) for its internal usage;
<br />
4) <strong>all attributes listed in <code class="language-plaintext highlighter-rouge">pg_attribute</code></strong> can be queried by the user;
<br />
5) the dropped column <code class="language-plaintext highlighter-rouge">e</code> is missing, note how the <code class="language-plaintext highlighter-rouge">attnum</code> skips the value 5.</p>
<p><br /></p>
<p>It is now simple enough to get a list of columns and paste it into a query:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">string_agg</span><span class="p">(</span> <span class="n">attname</span><span class="p">,</span> <span class="s1">','</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="n">attisdropped</span><span class="p">;</span>
<span class="n">string_agg</span>
<span class="c1">---------------------------------------------------------</span>
<span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">,</span><span class="k">c</span><span class="p">,</span><span class="n">cmax</span><span class="p">,</span><span class="n">cmin</span><span class="p">,</span><span class="n">ctid</span><span class="p">,</span><span class="n">d</span><span class="p">,</span><span class="n">f</span><span class="p">,</span><span class="k">g</span><span class="p">,</span><span class="n">j</span><span class="p">,</span><span class="n">k</span><span class="p">,</span><span class="n">tableoid</span><span class="p">,</span><span class="n">xmax</span><span class="p">,</span><span class="n">xmin</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">z</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">,</span><span class="k">c</span><span class="p">,</span><span class="n">cmax</span><span class="p">,</span><span class="n">cmin</span><span class="p">,</span><span class="n">ctid</span><span class="p">,</span><span class="n">d</span><span class="p">,</span><span class="n">f</span><span class="p">,</span><span class="k">g</span><span class="p">,</span><span class="n">j</span><span class="p">,</span><span class="n">k</span><span class="p">,</span><span class="n">tableoid</span><span class="p">,</span><span class="n">xmax</span><span class="p">,</span><span class="n">xmin</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">z</span> <span class="k">FROM</span> <span class="n">foo</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">--------</span>
<span class="n">a</span> <span class="o">|</span>
<span class="n">b</span> <span class="o">|</span>
<span class="k">c</span> <span class="o">|</span>
<span class="n">cmax</span> <span class="o">|</span> <span class="mi">0</span>
<span class="n">cmin</span> <span class="o">|</span> <span class="mi">0</span>
<span class="n">ctid</span> <span class="o">|</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">)</span>
<span class="n">d</span> <span class="o">|</span>
<span class="n">f</span> <span class="o">|</span>
<span class="k">g</span> <span class="o">|</span>
<span class="n">j</span> <span class="o">|</span>
<span class="n">k</span> <span class="o">|</span>
<span class="n">tableoid</span> <span class="o">|</span> <span class="mi">44309</span>
<span class="n">xmax</span> <span class="o">|</span> <span class="mi">0</span>
<span class="n">xmin</span> <span class="o">|</span> <span class="mi">2075116</span>
<span class="n">y</span> <span class="o">|</span>
<span class="n">z</span> <span class="o">|</span> <span class="n">test</span> <span class="n">tuple</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Clearly, you can manipulate the query to build something that allows you to choose between user columns and system columns, for example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">WITH</span> <span class="n">user_columns</span> <span class="k">AS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">attname</span> <span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span> <span class="k">AND</span> <span class="n">attnum</span> <span class="o">></span> <span class="mi">0</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="n">attisdropped</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="mi">1</span> <span class="p">)</span>
<span class="p">,</span> <span class="n">system_columns</span> <span class="k">AS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">attname</span> <span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span> <span class="k">AND</span> <span class="n">attnum</span> <span class="o"><</span> <span class="mi">0</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="n">attisdropped</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="mi">1</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">string_agg</span><span class="p">(</span> <span class="k">c</span><span class="p">.</span><span class="n">attname</span><span class="p">,</span> <span class="s1">', '</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">user_columns</span> <span class="k">c</span><span class="p">;</span>
<span class="n">string_agg</span>
<span class="c1">---------------------------------</span>
<span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="k">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">f</span><span class="p">,</span> <span class="k">g</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
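<p>The list above comes out in alphabetical order because of the <code class="language-plaintext highlighter-rouge">ORDER BY 1</code> inside the CTE. If you prefer the columns in table-definition order, a possible variation (a sketch, not the only way to do it) is to sort inside the aggregate by <code class="language-plaintext highlighter-rouge">attnum</code>, and to pass every name through <code class="language-plaintext highlighter-rouge">quote_ident</code> so that mixed-case or otherwise unusual column names remain usable when pasted into a query:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- user columns only, in the order they were defined
SELECT string_agg( quote_ident( attname ), ', ' ORDER BY attnum )
FROM pg_attribute
WHERE attrelid = 'foo'::regclass
AND attnum > 0
AND NOT attisdropped;
</code></pre></div></div>
<p>Ordering inside <code class="language-plaintext highlighter-rouge">string_agg</code> makes the result independent of whatever order the rows happen to reach the aggregate in.</p>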
<p>You can even build something a little more complex, in order to get for instance the definition of a trigger (or something like that):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">WITH</span> <span class="n">user_columns</span> <span class="k">AS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">attname</span> <span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span> <span class="k">AND</span> <span class="n">attnum</span> <span class="o">></span> <span class="mi">0</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="n">attisdropped</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="mi">1</span> <span class="p">)</span>
<span class="p">,</span> <span class="n">system_columns</span> <span class="k">AS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">attname</span> <span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'foo'</span><span class="p">::</span><span class="n">regclass</span> <span class="k">AND</span> <span class="n">attnum</span> <span class="o"><</span> <span class="mi">0</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="n">attisdropped</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="mi">1</span> <span class="p">)</span>
<span class="p">,</span> <span class="n">user_columns_list</span> <span class="k">AS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">string_agg</span><span class="p">(</span> <span class="k">c</span><span class="p">.</span><span class="n">attname</span> <span class="p">,</span> <span class="s1">','</span> <span class="p">)</span> <span class="k">as</span> <span class="n">l</span>
<span class="k">FROM</span> <span class="n">user_columns</span> <span class="k">c</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'CREATE TRIGGER tr_foo_ins '</span>
<span class="o">||</span> <span class="s1">' BEFORE UPDATE OF '</span>
<span class="o">||</span> <span class="n">ucl</span><span class="p">.</span><span class="n">l</span>
<span class="o">||</span> <span class="s1">' ON foo FOR EACH ROW EXECUTE PROCEDURE f_tr_foo_ins() '</span>
<span class="k">FROM</span> <span class="n">user_columns_LIST</span> <span class="n">ucl</span><span class="p">;</span>
<span class="o">?</span><span class="k">column</span><span class="o">?</span>
<span class="c1">--------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">CREATE</span> <span class="k">TRIGGER</span> <span class="n">tr_foo_ins</span> <span class="k">BEFORE</span> <span class="k">UPDATE</span> <span class="k">OF</span> <span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">,</span><span class="k">c</span><span class="p">,</span><span class="n">d</span><span class="p">,</span><span class="n">f</span><span class="p">,</span><span class="k">g</span><span class="p">,</span><span class="n">j</span><span class="p">,</span><span class="n">k</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">z</span> <span class="k">ON</span> <span class="n">foo</span> <span class="k">FOR</span> <span class="k">EACH</span> <span class="k">ROW</span> <span class="k">EXECUTE</span> <span class="k">PROCEDURE</span> <span class="n">f_tr_foo_ins</span><span class="p">()</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>This can be pushed into a function or an <code class="language-plaintext highlighter-rouge">EXECUTE</code> dynamic query to provide a dynamically generated statement.</p>
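<p>As a rough idea of what such a function could look like, here is a minimal sketch. The function name <code class="language-plaintext highlighter-rouge">f_build_update_trigger</code> and its parameters are hypothetical, invented only for this example, and the trigger function name is assumed to be unqualified. The statement is only built and returned as text, so you can inspect it before running it:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical helper: builds (but does not run) the CREATE TRIGGER statement
CREATE OR REPLACE FUNCTION
f_build_update_trigger( t regclass, tr_name text, tr_func text )
RETURNS text
AS $CODE$
DECLARE
  cols text;
BEGIN
  -- user defined, not dropped columns, in definition order
  SELECT string_agg( quote_ident( attname ), ', ' ORDER BY attnum )
  INTO cols
  FROM pg_attribute
  WHERE attrelid = t
  AND attnum > 0
  AND NOT attisdropped;

  -- %I quotes an identifier; a regclass value printed with %s
  -- is already quoted (and schema qualified) when needed
  RETURN format( 'CREATE TRIGGER %I BEFORE UPDATE OF %s ON %s FOR EACH ROW EXECUTE PROCEDURE %I()',
                 tr_name, cols, t, tr_func );
END
$CODE$
LANGUAGE plpgsql;

-- inspect the generated statement
SELECT f_build_update_trigger( 'foo', 'tr_foo_upd', 'f_tr_foo_ins' );
</code></pre></div></div>
<p>Within another <code class="language-plaintext highlighter-rouge">plpgsql</code> block the returned text could then be passed to <code class="language-plaintext highlighter-rouge">EXECUTE</code>. Accepting the table as <code class="language-plaintext highlighter-rouge">regclass</code> also means that a non-existing table is rejected as soon as the function is called.</p>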
PgTraining Online Event 2023: available material2023-04-28T00:00:00+00:00https://fluca1978.github.io/2023/04/28/PgTrainingOnlineEventMaterial<p>Links to the material of our last free online event related to PostgreSQL!</p>
<h1 id="pgtraining-online-event-2023-available-material">PgTraining Online Event 2023: available material</h1>
<p>On April 14th, 2023, we had another free online event related to PostgreSQL, namely <strong>PgTraining Online Event 2023</strong>.
The recording material and the slides are now available:</p>
<ul>
<li><strong>Il linguaggio PL/Perl</strong>, by yours truly, <a href="https://gitlab.com/pgtraining/slides/-/blob/master/webinar-20230414/2023_PGTRAINING_PLPERL.pdf" target="_blank">slides</a> and the <a href="https://vimeo.com/820564361" target="_blank">video</a></li>
<li><strong>Il linguaggio PL/Python</strong>, by Chris Mair, <a href="https://gitlab.com/pgtraining/slides/-/blob/master/webinar-20230414/2023_PGTRAINING_PLPYTHON_V1.md" target="_blank">slides</a> and <a href="https://vimeo.com/817679597" target="_blank">video</a></li>
<li><strong>Le novità di PostgreSQL 15</strong>, by Enrico Pirozzo, <a href="https://gitlab.com/pgtraining/slides/-/blob/master/webinar-20230414/pg15_new_features.pdf" target="_blank">slides</a></li>
</ul>
<p>Talks are listed in the same order they appeared in the afternoon on the live stream.</p>
<p>I would like to thank my friends Enrico and Chris for putting together with me, once again, a very nice and interesting mini-conference.</p>
<p>We hope to be able to deliver much more PostgreSQL related content, and in case you are interested and have a particular subject you would like us to talk about, please feel free to contact us!</p>
How much does it take to compile PostgreSQL (on my machines)?2023-04-11T00:00:00+00:00https://fluca1978.github.io/2023/04/11/PostgreSQLCompilationTimes<p>A few considerations on how fast (or slow) it can be to compile PostgreSQL</p>
<h1 id="how-much-does-it-take-to-compile-postgresql-on-my-machines">How much does it take to compile PostgreSQL (on my machines)?</h1>
<p>A couple of months ago I bought a new laptop, an <em>Acer Aspire 5 A515-45-R9EC</em>, with an <em>AMD Ryzen 5 5500U</em> CPU.
<strong>It is the first Ryzen processor I have ever had</strong>, so I was curious to see how it performs, and what better test for me than compiling PostgreSQL from scratch?
<br />
I fired up <a href="https://github.com/theory/pgenv" target="_blank">pgenv</a> and did a simple <code class="language-plaintext highlighter-rouge">pgenv build 15.2</code> to compile the whole PostgreSQL 15.2 distribution. I was quite happy with my results, and I <a href="https://www.linkedin.com/posts/fluca1978_emacs-postgresql-activity-7042159450253602816-cYAC?utm_source=share&utm_medium=member_desktop" target="_blank">posted the following</a></p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>My new "mule" laptop with AMD Ryzen 5 5500U
compiles #emacs 28.2 in 86 secs and #PostgreSQL 15.2 in 226 secs. There's no time for a coffee anymore!
</code></pre></div></div>
<p><br />
<br /></p>
<p>Then a friend of mine pointed out that they were not great times, sob!
<br />
This triggered some curiosity, so I decided to compare my three machines to see how they perform. I report a single timing for each, but the times are quite stable and vary only by a couple of seconds between runs (in other words, <em>this is not a real benchmark!</em>):</p>
<p><br /><br /></p>
<table>
<thead>
<tr>
<th style="text-align: center">CPU</th>
<th style="text-align: center">cores</th>
<th style="text-align: center">thread</th>
<th style="text-align: center">compilation type</th>
<th style="text-align: center">seconds</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">Intel i5-5257U</td>
<td style="text-align: center">2</td>
<td style="text-align: center">4</td>
<td style="text-align: center">make -j</td>
<td style="text-align: center">166</td>
</tr>
<tr>
<td style="text-align: center">Intel i5-10500T</td>
<td style="text-align: center">6</td>
<td style="text-align: center">12</td>
<td style="text-align: center">make -j</td>
<td style="text-align: center">57</td>
</tr>
<tr>
<td style="text-align: center">AMD Ryzen 5 5500 U</td>
<td style="text-align: center">6</td>
<td style="text-align: center">12</td>
<td style="text-align: center">make -j</td>
<td style="text-align: center">61</td>
</tr>
</tbody>
</table>
<p><br /><br /></p>
<p>The purpose of <code class="language-plaintext highlighter-rouge">make -j</code> is to exploit all the parallelism available on the machine; during the builds the computers were not doing anything else.
The slowest is, obviously, the machine with two cores. It is interesting to note that all the machines have <em>low energy consumption</em> CPUs, therefore they do not perform as well as a desktop or server environment.</p>
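<p>For reference, a timing like the ones above can be collected with a tiny shell helper. This is only a sketch of how one might measure it, not necessarily how the numbers in the table were taken: <code class="language-plaintext highlighter-rouge">time_cmd</code> is a hypothetical name, <code class="language-plaintext highlighter-rouge">date +%s</code> is assumed to be supported, and <code class="language-plaintext highlighter-rouge">nproc</code> (GNU coreutils) is assumed to be available to bound the number of parallel jobs.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># time_cmd (hypothetical helper): run the given command and print
# the elapsed wall-clock time, in whole seconds, as the last line.
time_cmd() {
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo "$(( end - start ))"
}

# assumed usage, from within an unpacked source tree:
#   time_cmd make -j "$(nproc)"    # parallel build
#   time_cmd make                  # sequential build
</code></pre></div></div>
<p><br />
<br /></p>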
<p><br />
I then decided to compile another <em>beast</em>: my favourite editor <em>Emacs 28.2</em>!</p>
<p><br /><br /></p>
<table>
<thead>
<tr>
<th style="text-align: center">CPU</th>
<th style="text-align: center">cores</th>
<th style="text-align: center">thread</th>
<th style="text-align: center">compilation type</th>
<th style="text-align: center">seconds</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">Intel i5-5257U</td>
<td style="text-align: center">2</td>
<td style="text-align: center">4</td>
<td style="text-align: center">make -j</td>
<td style="text-align: center">76</td>
</tr>
<tr>
<td style="text-align: center">Intel i5-10500T</td>
<td style="text-align: center">6</td>
<td style="text-align: center">12</td>
<td style="text-align: center">make -j</td>
<td style="text-align: center">31</td>
</tr>
<tr>
<td style="text-align: center">AMD Ryzen 5 5500 U</td>
<td style="text-align: center">6</td>
<td style="text-align: center">12</td>
<td style="text-align: center">make -j</td>
<td style="text-align: center">33</td>
</tr>
</tbody>
</table>
<p><br /><br /></p>
<p>Again, times are not so bad after all. But the above results are with the maximum parallelism: what happens in normal conditions, that is with a plain <code class="language-plaintext highlighter-rouge">make</code>?</p>
<p><br /><br /></p>
<table>
<thead>
<tr>
<th style="text-align: center">CPU</th>
<th style="text-align: center">cores</th>
<th style="text-align: center">thread</th>
<th style="text-align: center">compilation type</th>
<th style="text-align: center">seconds to compile PostgreSQL 15.2</th>
<th>seconds to compile Emacs 28.2</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">Intel i5-10500T</td>
<td style="text-align: center">6</td>
<td style="text-align: center">12</td>
<td style="text-align: center">make</td>
<td style="text-align: center">211</td>
<td>106</td>
</tr>
<tr>
<td style="text-align: center">AMD Ryzen 5 5500 U</td>
<td style="text-align: center">6</td>
<td style="text-align: center">12</td>
<td style="text-align: center">make</td>
<td style="text-align: center">193</td>
<td>92</td>
</tr>
</tbody>
</table>
<p><br /><br /></p>
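<p>Putting the sequential and parallel PostgreSQL timings side by side gives the speedup obtained by the parallel build. The arithmetic can be sketched in shell; <code class="language-plaintext highlighter-rouge">speedup</code> is a hypothetical helper and <code class="language-plaintext highlighter-rouge">awk</code> is assumed to be available.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># speedup (hypothetical helper): sequential seconds divided by
# parallel seconds, printed with two decimal digits
speedup() {
    awk -v seq="$1" -v par="$2" 'BEGIN { printf "%.2f\n", seq / par }'
}

speedup 211 57    # Intel i5-10500T, PostgreSQL 15.2
speedup 193 61    # AMD Ryzen 5 5500U, PostgreSQL 15.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>With the numbers from the tables, this prints <code class="language-plaintext highlighter-rouge">3.70</code> for the Intel CPU and <code class="language-plaintext highlighter-rouge">3.16</code> for the AMD one.</p>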
<p>As a final note, if I virtualize the <code class="language-plaintext highlighter-rouge">Intel i5-10500T</code> as a single CPU with a single core, the time to compile PostgreSQL grows to 250 seconds. Clearly, such a time is comparable with the sequential <code class="language-plaintext highlighter-rouge">make</code> on the same CPU.</p>
pgagroal: setting configuration at run-time2023-04-06T00:00:00+00:00https://fluca1978.github.io/2023/04/06/pgagroal_config_set<p>A new feature added to <code class="language-plaintext highlighter-rouge">pgagroal</code> that allows users to dynamically change the configuration.</p>
<h1 id="pgagroal-setting-configuration-at-run-time">pgagroal: setting configuration at run-time</h1>
<p>I’m happy since today my contribution to <a href="https://github.com/agroal/pgagroal/" target="_blank">pgagroal</a> has been merged.
<a href="https://github.com/agroal/pgagroal/commit/07b79ccd95c2fd709594ea8002c2ea89715adb20" target="_blank">Last year</a> I added the <code class="language-plaintext highlighter-rouge">config-get</code> command to <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>: this command allows users to get information about how <code class="language-plaintext highlighter-rouge">pgagroal</code> is configured.</p>
<p>The natural improvement over the above work was the <strong><code class="language-plaintext highlighter-rouge">config-set</code></strong> command, and now <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> has one (see <a href="https://github.com/agroal/pgagroal/commit/29902144832662598ebcc324f627627c1595a319" target="_blank">this commit</a>)!
It took me a few months to complete the work, since I was very busy with my day job: I had a working prototype before Christmas, but then I left it there for <em>the future me</em> to find some time to complete the effort. And in the last month I had some spare time, so I completed it!</p>
<h2 id="pgagroal-cli-config-set">pgagroal-cli config-set</h2>
<p>The new command allows users to dynamically change some configuration values. Clearly, not everything can be changed at run-time without a daemon restart. I wanted the command to be useful also in automation scripts, so I made it report back the actual value of the configuration parameter. Therefore, comparing the <em>desired</em> value with the <em>obtained</em> value confirms whether the change has been applied or not.</p>
<p>Similarly to the <code class="language-plaintext highlighter-rouge">config-get</code> command, also <code class="language-plaintext highlighter-rouge">config-set</code> accepts <em>contexts</em>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pgagroal</code> (or nothing) means that the specified configuration parameter is within the <code class="language-plaintext highlighter-rouge">[pgagroal]</code> configuration section;</li>
<li><code class="language-plaintext highlighter-rouge">limit</code> means that the user requested to change a limit entry;</li>
<li><code class="language-plaintext highlighter-rouge">server</code> the user wants to change something about a server section;</li>
<li><code class="language-plaintext highlighter-rouge">hba</code> the user wants to change an HBA entry.</li>
</ul>
<p>In the case of <code class="language-plaintext highlighter-rouge">limit</code>, <code class="language-plaintext highlighter-rouge">hba</code> and <code class="language-plaintext highlighter-rouge">server</code> contexts, the entry to modify must be identified by a name. While in a server configuration the name must be unique, limit and HBA entries might not have unique names, so the first match wins.</p>
<p>As an example, imagine the user wants to change the <code class="language-plaintext highlighter-rouge">max_connections</code> setting. This setting is within the <code class="language-plaintext highlighter-rouge">[pgagroal]</code> section, so the following two commands are equivalent:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgagroal-cli config-set max_connections 100
40
<span class="nv">$ </span>pgagroal-cli config-set pgagroal.max_connections 100
40
</code></pre></div></div>
<p><br />
<br /></p>
<p>In both the above cases, the system returns <code class="language-plaintext highlighter-rouge">40</code> instead of the desired value <code class="language-plaintext highlighter-rouge">100</code>. That’s normal, since changing <code class="language-plaintext highlighter-rouge">max_connections</code> requires a restart of the daemon, as can be better understood using the <code class="language-plaintext highlighter-rouge">--verbose</code> flag:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgagroal-cli config-set max_connections 100 <span class="nt">--verbose</span>
max_connections <span class="o">=</span> 40
pgagroal-cli: Error <span class="o">(</span>2<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Clearly the system reports there was an error, and provides the information that <code class="language-plaintext highlighter-rouge">max_connections</code> is set at <code class="language-plaintext highlighter-rouge">40</code>.</p>
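<p>Because the command prints the value actually in effect, a script can compare it with the requested one. The following is a minimal sketch, assuming the non-verbose output is just the value, as in the examples above; <code class="language-plaintext highlighter-rouge">config_applied</code> is a hypothetical helper, not part of <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># config_applied (hypothetical helper)
#   $1        the desired value
#   $2 ...    the command to run, expected to print the obtained value
# Returns 0 when the obtained value matches the desired one, 1 otherwise.
config_applied() {
    desired=$1
    shift
    obtained=$("$@")
    if [ "$obtained" = "$desired" ]; then
        echo "applied: $obtained"
        return 0
    fi
    echo "not applied: wanted $desired, got $obtained"
    return 1
}

# assumed usage:
#   config_applied 100 pgagroal-cli config-set max_connections 100
</code></pre></div></div>
<p><br />
<br /></p>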
<p>Another example is when the user wants to change a server or limit value, in which case the context must be specified:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgagroal-cli config-set server.venkman.port 6432
6432
<span class="nv">$ </span>pgagroal-cli config-set limit.pgbench.max_size 2
2
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above, the user changes the <code class="language-plaintext highlighter-rouge">venkman</code> server port, and the <code class="language-plaintext highlighter-rouge">pgbench</code> user limit entry.</p>
<p>If the system, that is the <code class="language-plaintext highlighter-rouge">pgagroal</code> daemon, cannot apply the requested change on the fly, the logs will be populated accordingly:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DEBUG Trying to change main configuration setting <max_connections> to <100>
INFO Restart required <span class="k">for </span>max_connections - Existing 40 New 100
WARN 1 settings cannot be applied
DEBUG pgagroal_management_write_config_set: unable to apply changes to <max_connections> -> <100>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Clearly the messages may vary depending on the log level configuration.</p>
<p>For more information, please see <a href="https://github.com/agroal/pgagroal/blob/master/doc/CLI.md" target="_blank">the official documentation</a> .</p>
PgTraining online webinar on 2023-04-14 (Italian): schedule available!2023-04-02T00:00:00+00:00https://fluca1978.github.io/2023/04/02/PgTrainingOnlineEvent<p>Yet another online event organized by PgTraining!</p>
<h1 id="pgtraining-online-webinar-on-2023-04-14-italian-schedule-available">PgTraining online webinar on 2023-04-14 (Italian): schedule available</h1>
<p><a href="http://pgtraining.com" target="_blank">PgTraining</a>, the amazing Italian group of people who spread the word about PostgreSQL and that I joined in recent years, is organizing another online event (<em>webinar</em>) on <strong>14th April 2023</strong>.
<br />
The schedule of the event will be as follows:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">3 pm</code>: gathering and welcoming the participants;</li>
<li><code class="language-plaintext highlighter-rouge">3:15 pm</code>: <strong>Il linguaggio PL/Perl</strong>, by yours truly;</li>
<li><code class="language-plaintext highlighter-rouge">4 pm</code>: <strong>Il linguaggio PL/Python</strong>, by Chris Mair;</li>
<li><code class="language-plaintext highlighter-rouge">4:45 pm</code>: <strong>Nuove funzionalità di PostgreSQL 15</strong>, by Enrico Pirozzi;</li>
<li><code class="language-plaintext highlighter-rouge">5:30 pm</code>: closing.
<br /></li>
</ul>
<center>
<img src="https://img.evbuc.com/https%3A%2F%2Fcdn.evbuc.com%2Fimages%2F447573619%2F158990937159%2F1%2Foriginal.20210213-142502?w=940&auto=format%2Ccompress&q=75&sharp=10&rect=1%2C141%2C2364%2C1182&s=35ffcbb442035ca41ef3919cf91d0722" alt="pgtraining online event" />
</center>
<p>The afternoon is <em>free of charge</em>, but registration is required so <a href="https://www.eventbrite.it/e/biglietti-pgtraining-on-line-session-2023-04-550667801217" target="_blank"><strong>hurry up and get your free ticket</strong></a> (seats are limited).
<br /></p>
<p><a href="https://packt.com" target="_blank">Packt</a> will offer a <strong>discount for buying a copy of the book <em>Learn PostgreSQL</em> by Luca Ferrari and Enrico Pirozzi</strong>.</p>
<p><br />
So, what are you waiting for? There are no reasons to skip this event!</p>
PgTraining online webinar on 2023-04-14 (Italian)2023-02-21T00:00:00+00:00https://fluca1978.github.io/2023/02/21/PgTrainingOnlineEvent<p>Yet another online event organized by PgTraining!</p>
<h1 id="pgtraining-online-webinar-on-2023-04-14-italian">PgTraining online webinar on 2023-04-14 (Italian)</h1>
<p><a href="http://pgtraining.com" target="_blank">PgTraining</a>, the amazing Italian group of people who spread the word about PostgreSQL and that I joined in recent years, is organizing another online event (<em>webinar</em>) on <strong>14th April 2023</strong>.
<br />
Following the success of the previous edition(s), we decided to provide another afternoon full of <em>PostgreSQL talks</em>, in the hope of improving the adoption of this great database.</p>
<p><br />
The event will consist of three hours of talks about <strong>PL/Perl</strong>, <strong>PL/Python</strong> and <strong>everything new in PostgreSQL 15</strong>.
<br />
As for the previous editions, the webinar will be presented in Italian. Attendees will be free to actively participate and ask questions both during the talks and at the end of the whole event.
<br />
<br />
In the pure spirit of <a href="http://pgtraining.com" target="_blank">PgTraining</a>, the event <strong>will be free of charge</strong>, but registration is required to participate and the number of available seats is limited, so <a href="https://www.eventbrite.it/e/biglietti-pgtraining-on-line-session-2023-04-550667801217" target="_blank"><strong>hurry up and get your free ticket</strong></a> as soon as possible!
<br />
The material will be available for free after the event has completed, but no live recording will be available.</p>
Invoking (your own) Perl from PL/Perl2023-02-06T00:00:00+00:00https://fluca1978.github.io/2023/02/06/PostgreSQLInvokingPerlFromPLPerl<p>A glance at how to invoke Perl code within PL/Perl code.</p>
<h1 id="invoking-your-own-perl-from-plperl">Invoking (your own) Perl from PL/Perl</h1>
<p>Invoking your own Perl code from PL/Perl, how hard can it be?
<br />
Well, it turns out that it can be harder than you think.
PL/Perl is designed to let Perl interact with the SQL world. Assume a function named <code class="language-plaintext highlighter-rouge">get_primes</code> needs to invoke another Perl function, <code class="language-plaintext highlighter-rouge">is_prime</code>, to test if a number is prime or not.
<br />
How is it possible to chain the function invocation?</p>
<h2 id="invoking-plperl-from-plperl-via-a-query">Invoking PL/Perl from PL/Perl via a query</h2>
<p>One obvious possibility is to wrap <code class="language-plaintext highlighter-rouge">is_prime</code> into a PL/Perl function.
Since a PL/Perl function is, well, an ordinary function, it is always possible to call it <em>the SQL way</em>, as another ordinary function.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">is_prime</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">bool</span> <span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">n</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">2</span> <span class="p">..</span> <span class="err">$</span><span class="n">n</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">last</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span> <span class="o">>=</span> <span class="p">(</span> <span class="err">$</span><span class="n">n</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span> <span class="n">if</span> <span class="err">$</span><span class="n">n</span> <span class="o">%</span> <span class="err">$</span><span class="n">_</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span> <span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">get_primes</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="nb">int</span> <span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">2</span> <span class="p">..</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">my</span> <span class="err">$</span><span class="n">result_set</span> <span class="o">=</span> <span class="n">spi_exec_query</span><span class="p">(</span> <span class="nv">"SELECT is_prime( $_ )"</span> <span class="p">);</span>
<span class="n">return_next</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span> <span class="p">)</span> <span class="n">if</span> <span class="p">(</span> <span class="err">$</span><span class="n">result_set</span><span class="o">-></span><span class="p">{</span> <span class="k">rows</span> <span class="p">}[</span><span class="mi">0</span><span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="n">is_prime</span> <span class="p">}</span> <span class="n">eq</span> <span class="s1">'t'</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span> <span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The function <code class="language-plaintext highlighter-rouge">get_primes</code> builds a query (<code class="language-plaintext highlighter-rouge">SELECT is_prime( $_ );</code>) that is executed several times in order to get the result.
<br />
Advantages of this approach are that this is the natural way to query PostgreSQL functions, and therefore it would be possible to mix and match PL/Perl functions with other PL-functions. The main drawback is that this approach is tedious and error prone, since there is the need to build SQL queries. Moreover, handling invocation and argument passing will slow down the execution of the main function.</p>
<h2 id="using-anonymous-subroutines">Using anonymous subroutines</h2>
<p>Luckily, Perl allows the definition of a subroutine within another subroutine, and to call it when required.
One way to achieve this is by <em>code references</em>.</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">get_prime</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">SETOF</span> <span class="nb">int</span> <span class="nv">AS</span> <span class="nv">$CODE</span><span class="err">$</span>
<span class="k">my</span> <span class="nv">$is_prime</span> <span class="o">=</span> <span class="k">sub </span><span class="p">{</span>
<span class="k">my</span> <span class="p">(</span> <span class="nv">$n</span> <span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="nv">$n</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">last</span> <span class="k">if</span> <span class="vg">$_</span> <span class="o">>=</span> <span class="p">(</span> <span class="nv">$n</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span> <span class="k">if</span> <span class="nv">$n</span> <span class="o">%</span> <span class="vg">$_</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">};</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="vg">$_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">return_next</span><span class="p">(</span> <span class="vg">$_</span> <span class="p">)</span> <span class="k">if</span> <span class="nv">$is_prime</span><span class="o">-></span><span class="p">(</span> <span class="vg">$_</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">undef</span><span class="p">;</span>
<span class="nv">$CODE</span><span class="err">$</span> <span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>This makes it <em>very easy and natural</em> to invoke Perl code (in this example, the <code class="language-plaintext highlighter-rouge">$is_prime</code> function) within PL/Perl.
The main advantage of this approach is that it is all written in pure Perl. The main drawback is that the <code class="language-plaintext highlighter-rouge">$is_prime</code> function is now <em>private</em> to the scope of the <code class="language-plaintext highlighter-rouge">get_prime</code> function, and therefore cannot be reused by other functions.</p>
<h2 id="injecting-a-line-of-code-at-plperl-boot">Injecting a line of code at PL/Perl boot</h2>
<p>PostgreSQL provides a set of <code class="language-plaintext highlighter-rouge">plperl</code> GUCs that allow you to set different properties of the Perl environment.
One possibility is to pre-declare a function so that, once the PL/Perl engine starts, the function is already there.
Unluckily, GUCs do not allow a setting to be split across different lines. Luckily, Perl is not Python, so you can write your own code in a single line.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># in postgresql.conf</span>
plperl.on_plperl_init <span class="o">=</span> <span class="s1">'sub is_prime { ... }'</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore, when the PL/Perl engine starts, it gets the <code class="language-plaintext highlighter-rouge">is_prime</code> sub defined for free. This means that <code class="language-plaintext highlighter-rouge">get_primes</code> can be simply written as:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">get_primes</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">SETOF</span> <span class="nb">int</span> <span class="nv">AS</span> <span class="nv">$CODE</span><span class="err">$</span>
<span class="k">return</span> <span class="p">[</span> <span class="nb">grep</span> <span class="p">{</span> <span class="nv">is_prime</span><span class="p">(</span> <span class="vg">$_</span> <span class="p">)</span> <span class="p">}</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="vg">$_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="p">)</span> <span class="p">];</span>
<span class="nv">$CODE</span><span class="err">$</span> <span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The main advantage of this approach is that it is simple. The main drawback is that it is complex, too.
Writing code within a single line is not a good habit. Most notably, this makes the code declared in <code class="language-plaintext highlighter-rouge">plperl.on_plperl_init</code> available to every instance of the PL/Perl engine, and this is a security risk!</p>
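<p>Since the whole sub has to fit on one line, it can be handy to try the one-liner with a plain <code class="language-plaintext highlighter-rouge">perl</code> interpreter before placing it in the configuration. The following is only a sketch: it assumes <code class="language-plaintext highlighter-rouge">perl</code> is in the <code class="language-plaintext highlighter-rouge">PATH</code>, and the one-liner is simply the <code class="language-plaintext highlighter-rouge">is_prime</code> body shown earlier collapsed onto a single line, not necessarily the exact text used in the setting above.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># the is_prime sub from the previous sections, collapsed onto one line
one_liner='sub is_prime { my ( $n ) = @_; for ( 2 .. $n ) { last if $_ >= ( $n / 2 ) + 1; return 0 if $n % $_ == 0; } return 1; }'

# define the sub, then list the primes between 2 and 20
primes_up_to_20=$(perl -e "$one_liner"' print join( ",", grep { is_prime( $_ ) } 2 .. 20 );')
echo "$primes_up_to_20"
</code></pre></div></div>
<p><br />
<br /></p>
<p>If the one-liner is correct, this prints <code class="language-plaintext highlighter-rouge">2,3,5,7,11,13,17,19</code>.</p>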
<h2 id="injecting-a-module-at-plperl-boot">Injecting a module at PL/Perl boot</h2>
<p>Following a similar approach, it is possible to place your custom code into a module and make PL/Perl <code class="language-plaintext highlighter-rouge">use</code> such module before the execution starts.
The first step is to provide a Perl module and place it on the filesystem where PostgreSQL can find it.</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># $PGDATA/conf.d/fluca1978.pm</span>
<span class="k">sub </span><span class="nf">is_prime</span> <span class="p">{</span>
<span class="k">my</span> <span class="p">(</span> <span class="nv">$n</span> <span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="nv">$n</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">last</span> <span class="k">if</span> <span class="vg">$_</span> <span class="o">>=</span> <span class="p">(</span> <span class="nv">$n</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span> <span class="k">if</span> <span class="nv">$n</span> <span class="o">%</span> <span class="vg">$_</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now, it is possible to load the module either in <code class="language-plaintext highlighter-rouge">plperl.on_init</code> or in <code class="language-plaintext highlighter-rouge">plperl.on_plperlu_init</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>plperl.on_init <span class="o">=</span> <span class="s1">'use lib q{/postgres/15/data/conf.d}; use fluca1978;'</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Last, since the module has been loaded for every PL/Perl engine, the <code class="language-plaintext highlighter-rouge">get_primes</code> function remains as simple as:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">get_primes</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">SETOF</span> <span class="nb">int</span> <span class="nv">AS</span> <span class="nv">$CODE</span><span class="err">$</span>
<span class="k">return</span> <span class="p">[</span> <span class="nb">grep</span> <span class="p">{</span> <span class="nv">is_prime</span><span class="p">(</span> <span class="vg">$_</span> <span class="p">)</span> <span class="p">}</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="vg">$_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="p">)</span> <span class="p">];</span>
<span class="nv">$CODE</span><span class="err">$</span> <span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>If <code class="language-plaintext highlighter-rouge">plperl.on_init</code> does not include the <code class="language-plaintext highlighter-rouge">use</code> statement, the function <code class="language-plaintext highlighter-rouge">get_primes</code> has to be defined as <code class="language-plaintext highlighter-rouge">plperlu</code> and load the module itself.</p>
<p>The main advantage of this approach is that it provides modularity of available code.
The main drawback, as for the previous approach, is that it injects code into every PL/Perl engine that is going to be started.</p>
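<p>Before restarting PostgreSQL, it may be worth checking from the shell that Perl can actually load the module. Note that <code class="language-plaintext highlighter-rouge">use lib</code>, like the <code class="language-plaintext highlighter-rouge">-I</code> switch, expects the <em>directory</em> containing the module, not the module file. The following is a sketch; <code class="language-plaintext highlighter-rouge">module_loads</code> is a hypothetical helper, and the path in the usage comment is assumed to be the directory where the module was placed.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># module_loads (hypothetical helper): succeed only if perl can load
# the module named $2 once the directory $1 is added to its search path
module_loads() {
    perl -I"$1" -M"$2" -e 1
}

# assumed usage:
#   module_loads /postgres/15/data/conf.d fluca1978
</code></pre></div></div>
<p><br />
<br /></p>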
<h2 id="using-shared-code">Using shared code</h2>
<p>PL/Perl provides the <code class="language-plaintext highlighter-rouge">%_SHARED</code> hash that is shared among functions running within the same connection and with the same user.
This allows for storing an anonymous subroutine into the <code class="language-plaintext highlighter-rouge">%_SHARED</code> hash and using it later.</p>
<p>The first step is to <em>initialize</em> the hash with the anonymous subroutine:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">my_plperl_init</span><span class="p">()</span>
<span class="nv">RETURNS</span> <span class="nv">VOID</span> <span class="nv">AS</span> <span class="nv">$CODE</span><span class="err">$</span>
<span class="nv">$_SHARED</span><span class="p">{</span> <span class="nv">is_prime</span> <span class="p">}</span> <span class="o">=</span> <span class="k">sub </span><span class="p">{</span>
<span class="k">my</span> <span class="p">(</span> <span class="nv">$n</span> <span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="nv">$n</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">last</span> <span class="k">if</span> <span class="vg">$_</span> <span class="o">>=</span> <span class="p">(</span> <span class="nv">$n</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span> <span class="k">if</span> <span class="nv">$n</span> <span class="o">%</span> <span class="vg">$_</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">};</span>
<span class="nv">$CODE</span><span class="err">$</span> <span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Then, it is possible to make <code class="language-plaintext highlighter-rouge">get_primes</code> use the shared reference:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">get_primes</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">SETOF</span> <span class="nb">int</span> <span class="nv">AS</span> <span class="nv">$CODE</span><span class="err">$</span>
<span class="k">return</span> <span class="p">[</span> <span class="nb">grep</span> <span class="p">{</span> <span class="nv">$_SHARED</span><span class="p">{</span> <span class="nv">is_prime</span> <span class="p">}</span><span class="o">-></span><span class="p">(</span> <span class="vg">$_</span> <span class="p">)</span> <span class="p">}</span> <span class="p">(</span> <span class="mi">2</span> <span class="o">..</span> <span class="vg">$_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="p">)</span> <span class="p">];</span>
<span class="nv">$CODE</span><span class="err">$</span> <span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The main advantage of this approach is that it does not require a full code injection: only sessions that require the code to be shared will use it. The main drawback is that there is no <em>real sharing</em> of code: it is just temporary code living in the session space. Moreover, this approach requires an initialization phase, which can be error-prone.</p>
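<p>The initialization phase can be sketched as follows. This is a minimal example, assuming both functions above have already been created and that the session runs them as the same user; the argument <code class="language-plaintext highlighter-rouge">20</code> is an arbitrary choice. If <code class="language-plaintext highlighter-rouge">my_plperl_init</code> has not been called in the current session, <code class="language-plaintext highlighter-rouge">$_SHARED{ is_prime }</code> is undefined and <code class="language-plaintext highlighter-rouge">get_primes</code> fails.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- once per session: store the anonymous subroutine into %_SHARED
SELECT my_plperl_init();
-- from now on, within this session, get_primes can find the shared code
SELECT * FROM get_primes( 20 );
</code></pre></div></div>
<p><br />
<br /></p>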
<h1 id="conclusions">Conclusions</h1>
<p>Being PL/Perl, well, Perl, <em>there’s more than one way to do it!</em>
<br />
Depending on the aims and constraints, there are different ways to invoke Perl code from other PL/Perl code. The main considerations, when choosing one approach or another, are related to <strong>code reusability</strong> and <strong>performance</strong>. Wrapping Perl into PL/Perl provides better code reusability, but requires more resources and bloats the code. Using a pure Perl approach provides the best performance and code readability, but can open the door to some security risks.</p>
PostgreSQL command line colors!2023-01-23T00:00:00+00:00https://fluca1978.github.io/2023/01/23/PostgreSQLColors<p>A simple way to make the PostgreSQL command line interface more attractive!</p>
<h1 id="postgresql-command-line-colors">PostgreSQL command line colors!</h1>
<p>Did you know that PostgreSQL tools can, under specific circumstances, display colors?
<br />
Well, I didn’t know until I came across <a href="https://www.postgresql.org/docs/current/color.html" target="_blank">this section in the documentation</a> that explains it.
<br />
There are <strong>two different environment variables</strong> named <code class="language-plaintext highlighter-rouge">PG_COLOR</code> and <code class="language-plaintext highlighter-rouge">PG_COLORS</code> respectively. The first (note the singular) decides if the colors have to be activated or not, while the second contains the sequence of colors.
<br />
Clearly, colors are related to errors and other messages regarding <em>a tool</em> and not SQL errors!
<br /></p>
<p>Let’s see this in action:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/postgresql/pg_colors.png" />
</center>
<p><br />
<br /></p>
<p>As you can see, after setting <code class="language-plaintext highlighter-rouge">PG_COLOR</code> to <code class="language-plaintext highlighter-rouge">always</code>, both <code class="language-plaintext highlighter-rouge">psql</code> and <code class="language-plaintext highlighter-rouge">pg_dump</code> show the error with a red color and the message tag with a bold face.
You can change the default color behaviour by setting the values in the <code class="language-plaintext highlighter-rouge">PG_COLORS</code> environment variable, so for example you can turn the errors to purple:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/postgresql/pg_colors2.png" />
</center>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">PG_COLORS</code> variable is a string that contains the log level (e.g., <code class="language-plaintext highlighter-rouge">error</code>) followed by the color code (e.g., <code class="language-plaintext highlighter-rouge">01;31</code> means bold red).
The same color codes that you use in shell and <code class="language-plaintext highlighter-rouge">printf(1)</code>-style escape sequences can be applied to the <code class="language-plaintext highlighter-rouge">PG_COLORS</code> variable.
You can even make the text blink:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/postgresql/pg_colors_blinking.gif" />
</center>
<p><br />
<br /></p>
<p>As far as I understand, the command line tools apply the colors through the logging subsystem.</p>
Handling NULLs and Empty values in PL/Perl2023-01-17T00:00:00+00:00https://fluca1978.github.io/2023/01/17/PLPerlNullHandling<p>How to correctly detect an SQL NULL value in PL/Perl.</p>
<h1 id="handling-nulls-and-empty-values-in-plperl">Handling NULLs and Empty values in PL/Perl</h1>
<p>Perl has a very simple <strong>concept of truth</strong>: <em>everything that is a non-empty, non-zero value is true!</em>
<br />
It’s that simple!
<br />
<br />
The problem with PL/Perl, the PostgreSQL internal language, is that SQL provides <code class="language-plaintext highlighter-rouge">NULL</code> values, that somehow are equivalent to Perl <code class="language-plaintext highlighter-rouge">undef</code> values. But unlike Perl, in SQL an empty string or a <code class="language-plaintext highlighter-rouge">0</code> value <em>is not <code class="language-plaintext highlighter-rouge">NULL</code></em>!
<br />
This implies that if you pass a <code class="language-plaintext highlighter-rouge">NULL</code> value to a PL/Perl function (or piece of code), Perl will see it as <code class="language-plaintext highlighter-rouge">undef</code>, and this is good. The problem is that the SQL values <code class="language-plaintext highlighter-rouge">0</code> and <code class="language-plaintext highlighter-rouge">''</code> (empty string) are treated by Perl as <em>false</em> values too, even though they are not <code class="language-plaintext highlighter-rouge">NULL</code>.
<br />
Luckily, the rule is simple: use Perl’s <code class="language-plaintext highlighter-rouge">defined</code> operator to see if a value is <code class="language-plaintext highlighter-rouge">NULL</code> in the SQL sense.
<br /></p>
<p>Let’s see this with a <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/plperl/catch_nulls.plperl" target="_blank">very trivial code example</a>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">plperl_catch_nulls</span><span class="p">(</span> <span class="nb">int</span><span class="p">,</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="err">$</span><span class="n">arg</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="o">@</span><span class="n">_</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">elog</span><span class="p">(</span><span class="n">INFO</span><span class="p">,</span> <span class="nv">"Input argument number $arg [$_] is false (as Perl)"</span> <span class="p">)</span> <span class="n">if</span> <span class="o">!</span> <span class="err">$</span><span class="n">_</span><span class="p">;</span>
<span class="n">elog</span><span class="p">(</span><span class="n">INFO</span><span class="p">,</span> <span class="nv">"Input argument number $arg [$_] is NULL (as SQL)"</span> <span class="p">)</span> <span class="n">if</span> <span class="o">!</span> <span class="k">defined</span> <span class="err">$</span><span class="n">_</span><span class="p">;</span>
<span class="n">elog</span><span class="p">(</span><span class="n">INFO</span><span class="p">,</span> <span class="nv">"Input argument number $arg is valid [$_]"</span> <span class="p">)</span> <span class="n">if</span> <span class="k">defined</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span> <span class="p">)</span> <span class="o">&&</span> <span class="err">$</span><span class="n">_</span><span class="p">;</span>
<span class="err">$</span><span class="n">arg</span><span class="o">++</span><span class="p">;</span>
<span class="p">}</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Let’s try the function with a few different sets of arguments:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">plperl_catch_nulls</span><span class="p">(</span> <span class="mi">19</span><span class="p">,</span> <span class="s1">'Hello World!'</span> <span class="p">);</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">1</span> <span class="k">is</span> <span class="k">valid</span> <span class="p">[</span><span class="mi">19</span><span class="p">]</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">2</span> <span class="k">is</span> <span class="k">valid</span> <span class="p">[</span><span class="n">Hello</span> <span class="n">World</span><span class="o">!</span><span class="p">]</span>
<span class="n">plperl_catch_nulls</span>
<span class="c1">--------------------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">plperl_catch_nulls</span><span class="p">(</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">''</span> <span class="p">);</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">1</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">is</span> <span class="k">false</span> <span class="p">(</span><span class="k">as</span> <span class="n">Perl</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">2</span> <span class="p">[]</span> <span class="k">is</span> <span class="k">false</span> <span class="p">(</span><span class="k">as</span> <span class="n">Perl</span><span class="p">)</span>
<span class="n">plperl_catch_nulls</span>
<span class="c1">--------------------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">plperl_catch_nulls</span><span class="p">(</span> <span class="k">NULL</span><span class="p">,</span> <span class="k">NULL</span> <span class="p">);</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">1</span> <span class="p">[]</span> <span class="k">is</span> <span class="k">false</span> <span class="p">(</span><span class="k">as</span> <span class="n">Perl</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">1</span> <span class="p">[]</span> <span class="k">is</span> <span class="k">NULL</span> <span class="p">(</span><span class="k">as</span> <span class="k">SQL</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">2</span> <span class="p">[]</span> <span class="k">is</span> <span class="k">false</span> <span class="p">(</span><span class="k">as</span> <span class="n">Perl</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="k">Input</span> <span class="n">argument</span> <span class="n">number</span> <span class="mi">2</span> <span class="p">[]</span> <span class="k">is</span> <span class="k">NULL</span> <span class="p">(</span><span class="k">as</span> <span class="k">SQL</span><span class="p">)</span>
<span class="n">plperl_catch_nulls</span>
<span class="c1">--------------------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
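<p>As you can see, when the argument is <code class="language-plaintext highlighter-rouge">NULL</code> the branches <em>false (as Perl)</em> and <em>NULL (as SQL)</em> are both triggered, while non-<code class="language-plaintext highlighter-rouge">NULL</code> values never trigger the <em>NULL (as SQL)</em> branch: <code class="language-plaintext highlighter-rouge">0</code> and the empty string hit only the <em>false (as Perl)</em> branch, and any other value hits only the <em>valid</em> branch guarded by <code class="language-plaintext highlighter-rouge">defined</code>.</p>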
<p><br />
If you like Perl’s simple notion of truth and do not want to go deep into the <code class="language-plaintext highlighter-rouge">NULL</code>/<code class="language-plaintext highlighter-rouge">defined</code> stuff, you can define your function as <code class="language-plaintext highlighter-rouge">STRICT</code>, which makes PostgreSQL skip the function invocation, and return <code class="language-plaintext highlighter-rouge">NULL</code>, when at least one argument is <code class="language-plaintext highlighter-rouge">NULL</code>.</p>
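<p>The effect of <code class="language-plaintext highlighter-rouge">STRICT</code> can be sketched with a tiny function. This is only an illustration: the name <code class="language-plaintext highlighter-rouge">plperl_strict_example</code> and its body are hypothetical and not part of the example repository.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical function, used only to illustrate STRICT
CREATE OR REPLACE FUNCTION
plperl_strict_example( int, text )
RETURNS text
STRICT
AS $CODE$
   # this body is never executed when at least one argument is NULL:
   # PostgreSQL returns NULL without invoking the function
   my ( $number, $label ) = @_;
   return "$label: $number";
$CODE$
LANGUAGE plperl;

-- one NULL argument: the call is skipped and the result is NULL
SELECT plperl_strict_example( NULL, 'answer' ) IS NULL AS skipped;

-- 0 and '' are not NULL, so the function is invoked as usual
SELECT plperl_strict_example( 0, '' );
</code></pre></div></div>
<p><br />
<br /></p>
<p>Within a <code class="language-plaintext highlighter-rouge">STRICT</code> function every argument is guaranteed to be defined, but <code class="language-plaintext highlighter-rouge">0</code> and the empty string are still false in the Perl sense.</p>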
From Numbers to Words using Perl (and Lingua::)!2023-01-12T00:00:00+00:00https://fluca1978.github.io/2023/01/12/PostgreSQLNumbersToWords<p>How to convert a digit into a sentence with the power of Perl.</p>
<h1 id="from-numbers-to-words-using-perl-and-lingua">From Numbers to Words using Perl (and Lingua::)!</h1>
<p>A few days ago I came across a question on Facebook regarding the conversion of a number into its English representation. First of all, I hate Facebook with a passion and <strong>I strongly encourage people that have questions related to PostgreSQL to use mailing lists and IRC channels</strong>.
<br />
Despite that, how can we convert a number, let’s say <code class="language-plaintext highlighter-rouge">19</code> to <code class="language-plaintext highlighter-rouge">nineteen</code>?
<br />
The first thing that came into my mind was the excellent <strong><code class="language-plaintext highlighter-rouge">Lingua::</code></strong> set of Perl modules. And since Perl is a very well supported language in every vanilla PostgreSQL instance, why not create a simple wrapper in PL/Perl?
<br />
So, as trivial as it is, <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/num2words.pl" target="_blank">here you can find a simple implementation</a> to translate a number into an English sentence.</p>
<h2 id="installing-linguaennumbers">Installing <code class="language-plaintext highlighter-rouge">Lingua::EN::Numbers</code></h2>
<p>Before you can use my wrapper, you need to install the Perl module <code class="language-plaintext highlighter-rouge">Lingua::EN::Numbers</code> in your system, so that PostgreSQL can find it. One easy way to achieve this is by means of <code class="language-plaintext highlighter-rouge">cpanm</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% cpanm Lingua::EN::Numbers
</code></pre></div></div>
<p><br />
<br /></p>
<p>Once the module is installed, you can create the function.</p>
<h2 id="the-num2words-plperl-function-a-first-simple-implementation">The <code class="language-plaintext highlighter-rouge">num2words</code> PL/Perl function: a first simple implementation</h2>
<p>The function is a simple wrapper around the super powerful <code class="language-plaintext highlighter-rouge">num2en</code> function loadable via <code class="language-plaintext highlighter-rouge">Lingua::EN::Numbers</code>.
Since the PL/Perl function requires an external module, the function has to be defined as <code class="language-plaintext highlighter-rouge">plperlu</code>, therefore potentially unsafe.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">num2words</span><span class="p">(</span> <span class="nb">numeric</span> <span class="k">default</span> <span class="mi">0</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">text</span>
<span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">use</span> <span class="n">Lingua</span><span class="p">::</span><span class="n">EN</span><span class="p">::</span><span class="n">Numbers</span> <span class="n">qw</span><span class="o">/</span> <span class="n">num2en</span> <span class="o">/</span><span class="p">;</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">number</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="n">num2en</span><span class="p">(</span> <span class="err">$</span><span class="n">number</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperlu</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The function is defined as <code class="language-plaintext highlighter-rouge">STRICT</code>, and therefore it does return <code class="language-plaintext highlighter-rouge">NULL</code> on <code class="language-plaintext highlighter-rouge">NULL</code> input.
The function accepts a <code class="language-plaintext highlighter-rouge">numeric</code> input, therefore even a very large number, and passes it to the <code class="language-plaintext highlighter-rouge">num2en</code> function, returning the result.
<br />
As an example of invocations:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">num2words</span><span class="p">();</span>
<span class="n">num2words</span>
<span class="c1">-----------</span>
<span class="n">zero</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">num2words</span><span class="p">(</span> <span class="mi">19071978</span> <span class="p">);</span>
<span class="n">num2words</span>
<span class="c1">------------------------------------------------------------------------</span>
<span class="n">nineteen</span> <span class="n">million</span><span class="p">,</span> <span class="n">seventy</span><span class="o">-</span><span class="n">one</span> <span class="n">thousand</span><span class="p">,</span> <span class="n">nine</span> <span class="n">hundred</span> <span class="k">and</span> <span class="n">seventy</span><span class="o">-</span><span class="n">eight</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">num2words</span><span class="p">(</span> <span class="mi">1</span><span class="p">.</span><span class="mi">23</span> <span class="p">);</span>
<span class="n">num2words</span>
<span class="c1">---------------------</span>
<span class="n">one</span> <span class="n">point</span> <span class="n">two</span> <span class="n">three</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="the-num2words-function-a-multi-language-approach">The <code class="language-plaintext highlighter-rouge">num2words</code> function: a multi-language approach</h2>
<p>It is possible to improve the function to support multiple languages, for example specifying the language as an argument.
One problem is that not all the <code class="language-plaintext highlighter-rouge">Lingua::*::Numbers</code> modules behave the same, so it is not simple to dynamically load a module and a function depending on the input argument. For example, in the English module the function to use is <code class="language-plaintext highlighter-rouge">num2en</code> while in the Italian one it is <code class="language-plaintext highlighter-rouge">number_to_it</code>, so there is no well established naming pattern.
<br />
Here is a multi-language implementation:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">num2words</span><span class="p">(</span> <span class="nb">numeric</span> <span class="k">default</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">text</span> <span class="k">default</span> <span class="s1">'en'</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">text</span>
<span class="k">STRICT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">number</span><span class="p">,</span> <span class="err">$</span><span class="k">language</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="err">$</span><span class="k">language</span> <span class="o">=</span> <span class="s1">'en'</span> <span class="n">unless</span><span class="p">(</span> <span class="err">$</span><span class="k">language</span> <span class="p">);</span>
<span class="n">if</span> <span class="p">(</span> <span class="err">$</span><span class="k">language</span> <span class="o">=~</span> <span class="o">/^</span><span class="n">en</span><span class="p">(</span><span class="n">glish</span><span class="p">)</span><span class="o">?</span><span class="err">$</span><span class="o">/</span><span class="n">i</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">use</span> <span class="n">Lingua</span><span class="p">::</span><span class="n">EN</span><span class="p">::</span><span class="n">Numbers</span> <span class="n">qw</span><span class="o">/</span> <span class="n">num2en</span> <span class="o">/</span><span class="p">;</span>
<span class="n">num2en</span><span class="p">(</span> <span class="err">$</span><span class="n">number</span> <span class="p">);</span>
<span class="p">}</span>
<span class="n">elsif</span><span class="p">(</span> <span class="err">$</span><span class="k">language</span> <span class="o">=~</span> <span class="o">/^</span><span class="n">it</span><span class="p">(</span><span class="n">alian</span><span class="p">)</span><span class="o">?</span><span class="err">$</span><span class="o">/</span><span class="n">i</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">use</span> <span class="n">Lingua</span><span class="p">::</span><span class="n">IT</span><span class="p">::</span><span class="n">Numbers</span> <span class="n">qw</span><span class="o">/</span> <span class="n">number_to_it</span> <span class="o">/</span><span class="p">;</span>
<span class="n">number_to_it</span><span class="p">(</span> <span class="err">$</span><span class="n">number</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">NOTICE</span><span class="p">,</span> <span class="nv">"Unsupported language $language"</span> <span class="p">);</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="p">}</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperlu</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The function accepts the two-letter language code or the full language name (e.g., <code class="language-plaintext highlighter-rouge">italian</code>) in a case insensitive manner.
Every branch of the <code class="language-plaintext highlighter-rouge">if-elsif-else</code> names the appropriate module and calls the appropriate function; note that Perl processes <code class="language-plaintext highlighter-rouge">use</code> at compile time, so both modules are loaded regardless of which branch is taken.
<br />
Clearly, the function needs to access every language specific module, so before you can use the above PL/Perl function you need to install the Perl modules on the machine:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>cpanm Lingua::EN::Numbers
% <span class="nb">sudo </span>cpanm Lingua::IT::Numbers
</code></pre></div></div>
<p><br />
<br /></p>
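<p>Since <code class="language-plaintext highlighter-rouge">use</code> takes effect at compile time, a variant that loads each module only when its branch actually runs could rely on <code class="language-plaintext highlighter-rouge">require</code>. The following is a minimal sketch and not the implementation from the repository: the name <code class="language-plaintext highlighter-rouge">num2words_lazy</code> is hypothetical, and the code assumes that both functions can be called through their fully qualified package names.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical variant: modules are loaded at run time, one per branch
CREATE OR REPLACE FUNCTION
num2words_lazy( numeric default 0, text default 'en' )
RETURNS text
STRICT
AS $CODE$
   my ( $number, $language ) = @_;

   if ( $language =~ /^en(glish)?$/i ) {
      # require is executed only when this branch is reached
      require Lingua::EN::Numbers;
      return Lingua::EN::Numbers::num2en( $number );
   }
   elsif ( $language =~ /^it(alian)?$/i ) {
      require Lingua::IT::Numbers;
      return Lingua::IT::Numbers::number_to_it( $number );
   }

   # any other language is reported and yields an SQL NULL
   elog( NOTICE, "Unsupported language $language" );
   return undef;
$CODE$
LANGUAGE plperlu;

-- English is the default language
SELECT num2words_lazy( 19 );
-- full language name, matched case insensitively
SELECT num2words_lazy( 19, 'Italian' );
</code></pre></div></div>
<p><br />
<br /></p>
<p>Because nothing is imported into the function namespace, each call is fully qualified; the trade-off is slightly more verbose code in exchange for not loading modules that are never used in a session.</p>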
<h1 id="conclusions">Conclusions</h1>
<p>I love the capability to blend Perl into PostgreSQL, giving the best of two worlds in a very quick and smart way!</p>
Using table aliases in UPSERTs2022-12-07T00:00:00+00:00https://fluca1978.github.io/2022/12/07/PostgreSQLUpsertWithAlias<p>A simple way to achieve a “counter-like” behaviour using UPSERTs.</p>
<h1 id="using-table-aliases-in-upserts">Using table aliases in UPSERTs</h1>
<p>PostgreSQL has had the <em>UPSERT</em> feature for a while, now somehow overtaken by the <a href="https://www.postgresql.org/docs/15/sql-merge.html" target="_blank"><code class="language-plaintext highlighter-rouge">MERGE</code></a> command.
<br />
One interesting feature of <code class="language-plaintext highlighter-rouge">UPSERT</code> is that it can quickly help you to implement a <em>counter-like</em> approach, but often you need to use table aliasing in order for the feature to be able to accumulate results.
<br />
Allow me to explain with a very simple example: we want to count the occurrences of every letter in a bunch of text, and we want the count to accumulate across different calls.
One possible solution is to use a table, even a <code class="language-plaintext highlighter-rouge">TEMPORARY</code> one, to store the letter and a counter, and then populate such a table.</p>
<p>With an <em>UPSERT</em> the thing reduces to:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">temp</span> <span class="k">table</span> <span class="n">letters</span><span class="p">(</span> <span class="n">l</span> <span class="nb">char</span> <span class="k">primary</span> <span class="k">key</span><span class="p">,</span> <span class="k">c</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">0</span> <span class="p">)</span> <span class="k">on</span> <span class="k">commit</span> <span class="k">delete</span> <span class="k">rows</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">begin</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">with</span> <span class="n">flow</span> <span class="k">as</span> <span class="p">(</span> <span class="k">select</span> <span class="n">l</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="n">n</span>
<span class="k">from</span> <span class="n">regexp_split_to_table</span><span class="p">(</span> <span class="s1">'fhcgfeettrrzz'</span><span class="p">,</span> <span class="s1">''</span> <span class="p">)</span> <span class="n">l</span>
<span class="k">group</span> <span class="k">by</span> <span class="n">l</span> <span class="p">)</span>
<span class="k">insert</span> <span class="k">into</span> <span class="n">letters</span> <span class="k">as</span> <span class="n">mem</span> <span class="c1">-- table alias</span>
<span class="k">select</span> <span class="n">l</span><span class="p">,</span> <span class="n">n</span>
<span class="k">from</span> <span class="n">flow</span>
<span class="k">on</span> <span class="n">conflict</span><span class="p">(</span> <span class="n">l</span> <span class="p">)</span>
<span class="k">do</span> <span class="k">update</span> <span class="k">set</span> <span class="k">c</span> <span class="o">=</span> <span class="n">mem</span><span class="p">.</span><span class="k">c</span> <span class="o">+</span> <span class="n">excluded</span><span class="p">.</span><span class="k">c</span><span class="p">;</span> <span class="c1">-- accumulate !</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">8</span>
<span class="c1">-- repeat with different sources of text</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">table</span> <span class="n">letters</span><span class="p">;</span>
<span class="n">l</span> <span class="o">|</span> <span class="k">c</span>
<span class="c1">---+----</span>
<span class="n">a</span> <span class="o">|</span> <span class="mi">12</span>
<span class="n">b</span> <span class="o">|</span> <span class="mi">20</span>
<span class="n">r</span> <span class="o">|</span> <span class="mi">2</span>
<span class="n">z</span> <span class="o">|</span> <span class="mi">2</span>
<span class="k">g</span> <span class="o">|</span> <span class="mi">1</span>
<span class="k">c</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">t</span> <span class="o">|</span> <span class="mi">2</span>
<span class="n">h</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">f</span> <span class="o">|</span> <span class="mi">18</span>
<span class="n">e</span> <span class="o">|</span> <span class="mi">18</span>
<span class="p">(</span><span class="mi">10</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">flow</code> CTE simulates our flow of letters and provides an instant count of the occurrences. We want such counts to accumulate into the <code class="language-plaintext highlighter-rouge">letters</code> table, and we want to use the <em>UPSERT</em> feature to do so.
When a conflict is found on the primary key <code class="language-plaintext highlighter-rouge">l</code>, the <code class="language-plaintext highlighter-rouge">INSERT</code> degenerates into an <code class="language-plaintext highlighter-rouge">UPDATE</code>, and the <code class="language-plaintext highlighter-rouge">c</code> counter must be increased by the value of the counter in the <code class="language-plaintext highlighter-rouge">excluded</code> tuple.
Here the problem arises: how can we refer to the current value of the counter? <strong>We need a table alias</strong>, in our case <code class="language-plaintext highlighter-rouge">mem</code>, to refer to the tuple already stored in the table (qualifying the column with the table name itself would work too).
<br />
If we omit the <code class="language-plaintext highlighter-rouge">mem</code> qualifier on the right-hand side of the assignment, the query is rejected, because it is ambiguous which tuple the column belongs to:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=*></span> <span class="k">with</span> <span class="n">flow</span> <span class="k">as</span> <span class="p">(</span> <span class="k">select</span> <span class="n">l</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="n">n</span>
<span class="k">from</span> <span class="n">regexp_split_to_table</span><span class="p">(</span> <span class="s1">'fhcgfeettrrzz'</span><span class="p">,</span> <span class="s1">''</span> <span class="p">)</span> <span class="n">l</span>
<span class="k">group</span> <span class="k">by</span> <span class="n">l</span> <span class="p">)</span>
<span class="k">insert</span> <span class="k">into</span> <span class="n">letters</span> <span class="k">as</span> <span class="n">mem</span>
<span class="k">select</span> <span class="n">l</span><span class="p">,</span> <span class="n">n</span>
<span class="k">from</span> <span class="n">flow</span>
<span class="k">on</span> <span class="n">conflict</span><span class="p">(</span> <span class="n">l</span> <span class="p">)</span>
<span class="k">do</span> <span class="k">update</span> <span class="k">set</span> <span class="k">c</span> <span class="o">=</span> <span class="k">c</span> <span class="o">+</span> <span class="n">excluded</span><span class="p">.</span><span class="k">c</span>
<span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">column</span> <span class="n">reference</span> <span class="nv">"c"</span> <span class="k">is</span> <span class="n">ambiguous</span>
<span class="n">LINE</span> <span class="mi">8</span><span class="p">:</span> <span class="k">do</span> <span class="k">update</span> <span class="k">set</span> <span class="k">c</span> <span class="o">=</span> <span class="k">c</span> <span class="o">+</span> <span class="n">excluded</span><span class="p">.</span><span class="k">c</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note that the column name on the left of the assignment does not need to be disambiguated, since the left-hand side can only refer to the row already stored in the table, that is, the one being updated.</p>
pgagroal: getting run-time configuration2022-11-30T00:00:00+00:00https://fluca1978.github.io/2022/11/30/pgagroal_config_get<p>A new command to interactively get the <code class="language-plaintext highlighter-rouge">pgagroal</code> runtime configuration.</p>
<h1 id="pgagroal-getting-run-time-configuration">pgagroal: getting run-time configuration</h1>
<p><a href="https://github.com/agroal/pgagroal/" target="_blank">pgagroal</a>, the fast connection pooler for PostgreSQL, is gaining new features!
In <a href="https://github.com/agroal/pgagroal/commit/07b79ccd95c2fd709594ea8002c2ea89715adb20" target="_blank">this commit</a> I introduced a new <em>command</em> for <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>: <strong><code class="language-plaintext highlighter-rouge">config-get</code></strong>. This command allows the user to specify the name of a configuration parameter and get back the value the pooler is using.
As an example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal-cli config-get max_connections
300
% pgagroal-cli config-get max_connections <span class="nt">--verbose</span>
max_connections <span class="o">=</span> 300
</code></pre></div></div>
<p><br />
<br /></p>
<p>When the command is invoked with the <code class="language-plaintext highlighter-rouge">--verbose</code> flag, the application responds with a full configuration line that can then be copied and pasted into a new configuration file.</p>
<h2 id="ini-sections">INI sections</h2>
<p>The <code class="language-plaintext highlighter-rouge">config-get</code> command also allows for the specification of <em>sections</em>; for example, if your <code class="language-plaintext highlighter-rouge">pgagroal.conf</code> configuration file looks like the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">cat </span>pgagroal.conf
<span class="o">[</span>pgagroal]
...
<span class="o">[</span>venkman]
host <span class="o">=</span> 192.168.2.2
port <span class="o">=</span> 5432
primary <span class="o">=</span> off
</code></pre></div></div>
<p><br />
<br /></p>
<p>it is possible to query the command with the section name, and the application will dig into the INI file section:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal-cli config-get venkman.host
192.168.2.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>Thanks to this, it is possible to get all the main configuration out of <code class="language-plaintext highlighter-rouge">pgagroal-cli</code>.</p>
<h2 id="limit-and-hba-entries">Limit and HBA entries</h2>
<p>Similarly to the sections, the <code class="language-plaintext highlighter-rouge">config-get</code> command allows the specification of parameters to search for a limit entry or an HBA entry. In such a case, the <em>key</em> to search for is the username to match for an HBA entry and the database to match for a limit entry. The special key prefix <code class="language-plaintext highlighter-rouge">limit</code> or <code class="language-plaintext highlighter-rouge">hba</code> allows the command to understand where to dig.
As an example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgagroal-cli config-get limit.testdb.max_size
10
% pgagroal-cli config-get hba.luca.method
md5
</code></pre></div></div>
<p><br />
<br /></p>
<p>See the <a href="https://github.com/agroal/pgagroal/blob/master/doc/CLI.md" target="_blank"><code class="language-plaintext highlighter-rouge">pgagroal-cli</code></a> documentation for more examples and details.</p>
PostgreSQL scary settings: data_sync_retry2022-11-24T00:00:00+00:00https://fluca1978.github.io/2022/11/24/PostgreSQLDataSyncRetry<p>A look at how this setting works.</p>
<h1 id="postgresql-scary-settings-data_sync_retry">PostgreSQL scary settings: data_sync_retry</h1>
<p>Is this a scary setting? Well, of course not!
<br />
However, it is a setting that you should not touch unless you are really, really aware of what you are doing.
<br /></p>
<p><a href="https://www.postgresql.org/docs/current/runtime-config-error-handling.html" target="_blank"><code class="language-plaintext highlighter-rouge">data_sync_retry</code></a> is a setting that instructs the cluster to <strong>retry after an <code class="language-plaintext highlighter-rouge">fsync</code> failure</strong> related to data pages. What does that mean?
As we all know, PostgreSQL has to flush data, sooner or later, from memory to the data files, and this happens with <code class="language-plaintext highlighter-rouge">fsync(2)</code>, an operating system call that forces the data to be flushed from memory to the filesystem layer and, hopefully, to the disk into the data files.
<br />
Clearly, PostgreSQL cannot allow any data loss, even of one bit, and therefore the system takes great care about what happens when flushing data.
In normal circumstances, if <code class="language-plaintext highlighter-rouge">fsync(2)</code> fails, data has not been written to disk and therefore there is nothing PostgreSQL can do about it.
Since this is a <em>huge problem</em>, <strong>PostgreSQL issues a <code class="language-plaintext highlighter-rouge">PANIC</code></strong> and crashes.
That’s not so good, but it is safe after all, since it means we are going to recover from the WALs and therefore we are not going to lose any data.
<br />
In the case we force a <em>retry</em>, by means of setting <code class="language-plaintext highlighter-rouge">data_sync_retry</code> to <code class="language-plaintext highlighter-rouge">on</code>, PostgreSQL will not crash, so that later on the flush-to-disk could be retried (hence the name of this setting).
<br />
<br />
So, why should you not enable such a great setting, which promises to avoid a crash even when <code class="language-plaintext highlighter-rouge">fsync(2)</code> fails?
<br />
The problem is that, after an <code class="language-plaintext highlighter-rouge">fsync(2)</code> failure, the status of the kernel page cache could be unknown. This means that the page, which was moved to the page cache (i.e., filesystem layer) to be flushed, could have been removed from the kernel cache even if it did not hit the disk (because <code class="language-plaintext highlighter-rouge">fsync(2)</code> was responsible for this, and it failed). In such a situation, PostgreSQL could have discarded the dirty page from the shared buffers, the kernel could have thrown away the page, and the page did not hit the disk. At this point, a retry happens, and the operating system reports a success, making things even worse! <strong>And here’s where the data loss happens!</strong></p>
<h2 id="how-does-postgresql-handles-the-above">How does PostgreSQL handle the above?</h2>
<p>If you take a look at the code, in particular at the function <a href="https://github.com/postgres/postgres/blob/master/src/backend/storage/file/fd.c#L3736" target="_blank"><code class="language-plaintext highlighter-rouge">data_sync_elevel</code></a>:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span>
<span class="nf">data_sync_elevel</span><span class="p">(</span><span class="kt">int</span> <span class="n">elevel</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="n">data_sync_retry</span> <span class="o">?</span> <span class="n">elevel</span> <span class="o">:</span> <span class="n">PANIC</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>you will see that, unless the <code class="language-plaintext highlighter-rouge">data_sync_retry</code> option is enabled, the function returns <code class="language-plaintext highlighter-rouge">PANIC</code> as the error level.
This function is used whenever an <code class="language-plaintext highlighter-rouge">fsync(2)</code>-related call has to be checked for errors, <a href="https://github.com/postgres/postgres/blob/master/src/backend/storage/file/fd.c#L507" target="_blank">for example</a>:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rc</span> <span class="o">=</span> <span class="n">sync_file_range</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">offset</span><span class="p">,</span> <span class="n">nbytes</span><span class="p">,</span>
<span class="n">SYNC_FILE_RANGE_WRITE</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">rc</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">elevel</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">elevel</span> <span class="o">=</span> <span class="n">data_sync_elevel</span><span class="p">(</span><span class="n">WARNING</span><span class="p">);</span>
<span class="n">ereport</span><span class="p">(</span><span class="n">elevel</span><span class="p">,</span>
<span class="p">(</span><span class="n">errcode_for_file_access</span><span class="p">(),</span>
<span class="n">errmsg</span><span class="p">(</span><span class="s">"could not flush dirty data: %m"</span><span class="p">)));</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In short: when there is an error in data syncing, PostgreSQL has to decide at which level the problem must be reported. Here the system chooses, through <code class="language-plaintext highlighter-rouge">data_sync_elevel</code>, between <code class="language-plaintext highlighter-rouge">WARNING</code> (in the case there will be another try) and <code class="language-plaintext highlighter-rouge">PANIC</code> (the default).</p>
<h1 id="conclusions">Conclusions</h1>
<p>Similarly to what happens for <code class="language-plaintext highlighter-rouge">fsync</code>, which you should never change and always keep set to <code class="language-plaintext highlighter-rouge">on</code>, the <code class="language-plaintext highlighter-rouge">data_sync_retry</code> setting must never be touched and should always retain its default value <code class="language-plaintext highlighter-rouge">off</code>. Turning <code class="language-plaintext highlighter-rouge">on</code> this parameter could result in data loss, and surely does not provide you any performance benefit.
<br />
There is an interesting thread to <a href="https://www.postgresql.org/message-id/957805.1668461398%40sss.pgh.pa.us" target="_blank">read about</a> this topic, which provides useful concepts and explanations.</p>
Emacs(client) as editor in psql2022-11-16T00:00:00+00:00https://fluca1978.github.io/2022/11/16/psqlEmacsClient<p>At last I found a way to use Emacs (as a client) within psql!</p>
<h1 id="emacsclient-as-editor-in-psql">Emacs(client) as editor in psql</h1>
<p><strong>psql</strong> is an amazing interactive SQL swiss-army knife terminal thingy that can really turn your day around! Quite frankly, in my professional training activity, I always tell participants to learn to use <code class="language-plaintext highlighter-rouge">psql</code> for several reasons, and I also ask them if any of their interactive SQL-terminals has the same set of features, without getting an answer back!</p>
<p>One nice thing that <code class="language-plaintext highlighter-rouge">psql</code> provides, is the capability to edit a complex query directly within your editor of choice.</p>
<p>My <em>editor of choice</em> is <strong>Emacs</strong>!</p>
<p>Starting Emacs every time I have to edit a query buffer (via <code class="language-plaintext highlighter-rouge">\e</code> command) is awkward: even on recent hardware, Emacs startup is slow.
Thankfully, there is a trick: <em>Emacs embeds a client-server approach</em>. And it is a <strong>real</strong> client-server approach.</p>
<p>The idea is to have Emacs started as a <em>daemon</em>, and every time you need a new editor frame (ehm, every time you need to edit something), you can invoke the special command <code class="language-plaintext highlighter-rouge">emacsclient</code> asking to attach to the daemon running. As a result, while the daemon startup is as slow as a normal <code class="language-plaintext highlighter-rouge">emacs</code> instance is, the <code class="language-plaintext highlighter-rouge">emacsclient</code> editing session is shiningly fast!</p>
<p>So far so good! Ehm, no, well, not for me. I had a lot of trouble trying to configure <code class="language-plaintext highlighter-rouge">psql</code> to use <code class="language-plaintext highlighter-rouge">emacsclient</code> as an editor. And, <em>shame on me</em>, I had trouble because I was using the wrong set of flags to launch <code class="language-plaintext highlighter-rouge">emacsclient</code>! To make things worse: I’m using <em>ZSH</em> as my default shell, and for some strange reason I still need to investigate, the shell does not work really well with <em>commands and flags within the same environment variable</em>.</p>
<h2 id="tldr---how-to-do-that">TL;DR - How to do that?</h2>
<p>There are two ways to change the default editor that <code class="language-plaintext highlighter-rouge">psql</code> is going to use to edit a query buffer:</p>
<ul>
<li>set the well known <code class="language-plaintext highlighter-rouge">EDITOR</code> environment variable;</li>
<li>set the <code class="language-plaintext highlighter-rouge">psql</code> specific <code class="language-plaintext highlighter-rouge">PSQL_EDITOR</code> environment variable.</li>
</ul>
<p>Since I’m a whole-Emacs kind of guy, I decided for the first, so that whenever I need to use <code class="language-plaintext highlighter-rouge">$EDITOR</code>, I will be dropped into my comfortable Emacs environment.</p>
<p>Therefore, in order to achieve this, after some research on the flags, simply put the following in your <code class="language-plaintext highlighter-rouge">.zshrc</code> configuration file:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">EDITOR</span><span class="o">=</span><span class="s2">"emacsclient -t"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>or do the same with <code class="language-plaintext highlighter-rouge">PSQL_EDITOR</code>.</p>
<p>Then, when you are in the <code class="language-plaintext highlighter-rouge">psql</code> application, hit <code class="language-plaintext highlighter-rouge">\e</code> and watch Emacs quickly appear. The very first time, when you come back to your <code class="language-plaintext highlighter-rouge">psql</code> session, you will notice a few lines of warnings coming out from Emacs itself:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">testdb</span><span class="o">=></span> <span class="se">\e</span>
emacsclient: can<span class="s1">'t find socket; have you started the server?
emacsclient: To start the server in Emacs, type "M-x server-start".
Starting Emacs daemon.
Emacs daemon should have started, trying to connect again
testdb=>
</span></code></pre></div></div>
<p><br />
<br /></p>
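<p>That’s because, the very first time, <code class="language-plaintext highlighter-rouge">emacsclient</code> does not find any <code class="language-plaintext highlighter-rouge">emacs</code> instance running as a daemon, so it starts one by itself, waits a moment, and connects to the server. This is clearly explained in the messages, which are therefore just warnings. Note that <code class="language-plaintext highlighter-rouge">emacsclient</code> starts the daemon on its own only when its alternate editor is set to an empty string (e.g., via the <code class="language-plaintext highlighter-rouge">ALTERNATE_EDITOR</code> environment variable); without that, it just reports that it cannot find the server.</p>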
<h2 id="what-about-zsh-and-bash">What about ZSH (and Bash)?</h2>
<p>One of the reasons it took me so long to understand how to configure <code class="language-plaintext highlighter-rouge">emacsclient</code> was that, for some reason I don’t know (yet), ZSH behaves nastily when you try to launch a command stored within an environment variable, assuming such a variable contains spaces. Why? Because in order to set the variable with spaces, you have to quote your command line, and at that point ZSH assumes the whole string is a single command:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">echo</span> <span class="nv">$EDITOR</span>
emacsclient <span class="nt">-t</span>
% <span class="nv">$EDITOR</span>
zsh: <span class="nb">command </span>not found: emacsclient <span class="nt">-t</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The same does not happen in Bash, which behaves as expected:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% bash
<span class="nv">$ </span><span class="nb">echo</span> <span class="nv">$EDITOR</span>
emacsclient <span class="nt">-t</span>
<span class="nv">$ $EDITOR</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>So, I’ve also found a place where Bash seems to behave more intuitively than ZSH does (and <em>no, I’m not going to switch back to Bash for this reason!</em>).</p>
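<p>A likely explanation, offered here as an assumption rather than a verified diagnosis of the setup above: by default ZSH does not split the result of an unquoted variable expansion on whitespace (its <code class="language-plaintext highlighter-rouge">SH_WORD_SPLIT</code> option is off), while Bash and the other POSIX-like shells do. The following sketch is meant to be run in Bash; <code class="language-plaintext highlighter-rouge">count_args</code> is a hypothetical helper that only prints how many arguments it receives:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: prints the number of arguments it receives
count_args() { echo "$#"; }

cmd="emacsclient -t"

# unquoted expansion: Bash splits the value on whitespace, so the helper
# sees two arguments (the command name and its flag)
count_args $cmd

# quoted expansion: the value stays a single word; with its default
# options ZSH treats the unquoted form the same way
count_args "$cmd"
</code></pre></div></div>
<p><br />
<br /></p>
<p>In Bash this prints 2 and then 1. In ZSH, writing <code class="language-plaintext highlighter-rouge">${=EDITOR}</code> instead of <code class="language-plaintext highlighter-rouge">$EDITOR</code> asks for the splitting explicitly, for that single expansion only.</p>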
<h2 id="colors-and-themes">Colors and themes!</h2>
<p>I tend to use a dark color scheme in all my activities, including the terminals I use to SSH into machines. In these circumstances, Emacs has a very poor default color choice.
Assuming you don’t have a specific Emacs startup configuration (and you should, believe me!), you can add a line like the following to one of your startup files (e.g., <code class="language-plaintext highlighter-rouge">~/.emacs</code> or <code class="language-plaintext highlighter-rouge">~/.emacs.d/init.el</code>):</p>
<p><br />
<br /></p>
<div class="language-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">load-theme</span> <span class="ss">'tango-dark</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>This will make your Emacs experience on dark terminals a lot better!</p>
<p>Please consider that the changes will not be reflected in any running daemon, so you have to restart Emacs (or re-evaluate the startup files) once you have changed them!</p>
<h1 id="conclusions">Conclusions</h1>
<p>Emacs, thanks to its client-server component, can really improve your already awesome experience within <code class="language-plaintext highlighter-rouge">psql</code>!
For instance, you can use external tools to reformat your <a href="https://fluca1978.github.io/2022/04/13/EmacsPgFormatter.html" target="_blank">queries while you are writing them in the editor</a>.</p>
<p>For more documentation on how to customize <code class="language-plaintext highlighter-rouge">emacsclient</code> <a href="https://www.emacswiki.org/emacs/EmacsClient" target="_blank">see the official documentation</a>.</p>
PostgreSQL 15: logging in JSON2022-10-21T00:00:00+00:00https://fluca1978.github.io/2022/10/21/PostgreSQL15JsonLogs<p>PostgreSQL 15 has now the capability to output logs in JSON format!</p>
<h1 id="postgresql-15-logging-in-json">PostgreSQL 15: logging in JSON</h1>
<p>The <a href="https://www.postgresql.org/docs/15/release-15.html" target="_blank">freshly released PostgreSQL 15</a> introduces a lot of new features and improvements, but one, in my opinion, is going to change the way our favourite database is monitored: <a href="https://www.postgresql.org/docs/15/runtime-config-logging.html#GUC-LOG-DESTINATION" target="_blank">the capability to log daemon status in JSON</a>.</p>
<p><br />
<br />
Essentially, the <code class="language-plaintext highlighter-rouge">log_destination</code> configuration parameter now has another enumerated value: <strong><code class="language-plaintext highlighter-rouge">jsonlog</code></strong>. When this value is added to <code class="language-plaintext highlighter-rouge">log_destination</code>, PostgreSQL will start to emit JSON structured logs.
Here is a simple configuration example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">grep </span>log_destination /postgres/15/data/postgresql.conf
log_destination <span class="o">=</span> <span class="s1">'stderr,jsonlog'</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and this is how the <code class="language-plaintext highlighter-rouge">log</code> directory appears right after the configuration has been reloaded:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres <span class="nb">ls</span> /postgres/15/data/log <span class="nt">-1</span>
postgresql-Fri.json
postgresql-Fri.log
</code></pre></div></div>
<p><br />
<br /></p>
<p>Clearly the logs will contain the same values, but in different formats.</p>
<p><br />
<br /></p>
<p>There is more: when there’s more than one value set in <code class="language-plaintext highlighter-rouge">log_destination</code>, PostgreSQL will store a file named <strong><code class="language-plaintext highlighter-rouge">current_logfiles</code></strong>, where each line will represent the format and the current logfile where PostgreSQL has to store the data:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres <span class="nb">cat</span> /postgres/15/data/current_logfiles
stderr log/postgresql-Fri.log
jsonlog log/postgresql-Fri.json
</code></pre></div></div>
<p><br />
<br /></p>
<p>In this way, not only PostgreSQL, but even the sysadmin can keep track of where the system is going to log right now, and this is useful especially when there’s a log rotation in place.</p>
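<p>As a minimal sketch of how a script could exploit this file, the hypothetical helper <code class="language-plaintext highlighter-rouge">current_log_for</code> below (the name is made up for this example) extracts the log file associated with a given format, assuming the two-column layout shown above and file names without spaces:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: given a log format and the path to a current_logfiles
# file, print the log file currently in use for that format
current_log_for() {
    awk -v fmt="$1" '$1 == fmt { print $2 }' "$2"
}
</code></pre></div></div>
<p><br />
<br /></p>
<p>For instance, <code class="language-plaintext highlighter-rouge">current_log_for jsonlog /postgres/15/data/current_logfiles</code>, run as a user allowed to read the data directory, should print <code class="language-plaintext highlighter-rouge">log/postgresql-Fri.json</code> with the configuration used in this post.</p>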
<p><br />
<br />
On the SQL side, the function <code class="language-plaintext highlighter-rouge">pg_current_logfile()</code> can optionally accept the log format (the same specified in <code class="language-plaintext highlighter-rouge">log_destination</code>) and provide the current log file depending on the chosen format:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_current_logfile</span><span class="p">();</span>
<span class="n">pg_current_logfile</span>
<span class="c1">------------------------</span>
<span class="n">log</span><span class="o">/</span><span class="n">postgresql</span><span class="o">-</span><span class="n">Fri</span><span class="p">.</span><span class="n">log</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_current_logfile</span><span class="p">(</span> <span class="s1">'jsonlog'</span> <span class="p">);</span>
<span class="n">pg_current_logfile</span>
<span class="c1">-------------------------</span>
<span class="n">log</span><span class="o">/</span><span class="n">postgresql</span><span class="o">-</span><span class="n">Fri</span><span class="p">.</span><span class="n">json</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_current_logfile</span><span class="p">(</span> <span class="s1">'stderr'</span> <span class="p">);</span>
<span class="n">pg_current_logfile</span>
<span class="c1">------------------------</span>
<span class="n">log</span><span class="o">/</span><span class="n">postgresql</span><span class="o">-</span><span class="n">Fri</span><span class="p">.</span><span class="n">log</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I suspect we will see more and more <em>log crunching</em> applications switch over to the new JSON log format!</p>
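<p>As a small taste of such log crunching, here is a sketch that assumes the <code class="language-plaintext highlighter-rouge">jq</code> command-line tool is installed and relies on the <code class="language-plaintext highlighter-rouge">timestamp</code>, <code class="language-plaintext highlighter-rouge">error_severity</code> and <code class="language-plaintext highlighter-rouge">message</code> keys of the JSON log entries; <code class="language-plaintext highlighter-rouge">errors_only</code> is a hypothetical helper name, and the two input lines are made up just to show the shape of the data:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: reads JSON log lines from standard input and prints
# timestamp and message of the entries logged with ERROR severity
errors_only() {
    jq -r 'select(.error_severity == "ERROR") | "\(.timestamp) \(.message)"'
}

# made-up sample entries (real ones carry many more keys)
printf '%s\n' \
    '{"timestamp":"2022-10-21 10:00:00.000 CEST","error_severity":"LOG","message":"checkpoint starting: time"}' \
    '{"timestamp":"2022-10-21 10:00:05.000 CEST","error_severity":"ERROR","message":"division by zero"}' \
    | errors_only
</code></pre></div></div>
<p><br />
<br /></p>
<p>Only the second sample entry passes the filter. Against a real cluster, the input would be the file reported by <code class="language-plaintext highlighter-rouge">pg_current_logfile( 'jsonlog' )</code>, read by a user with access to the data directory.</p>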
PostgreSQL ASCII numeric operators2022-10-10T00:00:00+00:00https://fluca1978.github.io/2022/10/10/PostgreSQLNumericOperators<p>PostgreSQL has some special ways to provide numeric operators by means of ASCII chars.</p>
<h1 id="postgresql-ascii-numeric-operators">PostgreSQL ASCII numeric operators</h1>
<p>PostgreSQL has some <em>ASCII numeric representations</em> of commonly used numeric operators.
They may not be well known, since I suspect pretty much everyone uses the equivalent functions, and moreover it is not so simple to <a href="https://www.postgresql.org/docs/14/functions-math.html">find them in the documentation by means of a search</a>.</p>
<p><br /></p>
<p>In any case, here they are:</p>
<ul>
<li><strong><code class="language-plaintext highlighter-rouge">|/</code></strong> is the same as <strong><code class="language-plaintext highlighter-rouge">sqrt</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">||/</code></strong> is the same as <strong><code class="language-plaintext highlighter-rouge">cbrt</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">@</code></strong> is the same as <strong><code class="language-plaintext highlighter-rouge">abs</code></strong>.</li>
</ul>
<p><br />
<br /></p>
<p>And of course, it is quite easy to test such operators in action:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span>
<span class="n">sqrt</span><span class="p">(</span> <span class="mi">81</span> <span class="p">)</span> <span class="k">as</span> <span class="n">sqrt</span><span class="p">,</span>
<span class="o">|/</span> <span class="mi">81</span> <span class="k">as</span> <span class="n">root</span><span class="p">,</span>
<span class="n">cbrt</span><span class="p">(</span> <span class="mi">1000</span> <span class="p">)</span> <span class="k">as</span> <span class="n">cube_root</span><span class="p">,</span>
<span class="o">||/</span> <span class="mi">1000</span> <span class="k">as</span> <span class="n">root3</span><span class="p">,</span>
<span class="k">abs</span><span class="p">(</span> <span class="o">-</span><span class="mi">19</span> <span class="p">)</span> <span class="k">as</span> <span class="k">abs</span><span class="p">,</span>
<span class="o">@</span> <span class="o">-</span><span class="mi">19</span> <span class="k">as</span> <span class="k">absolute</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">-</span>
<span class="n">sqrt</span> <span class="o">|</span> <span class="mi">9</span>
<span class="n">root</span> <span class="o">|</span> <span class="mi">9</span>
<span class="n">cube_root</span> <span class="o">|</span> <span class="mi">10</span>
<span class="n">root3</span> <span class="o">|</span> <span class="mi">10</span>
<span class="k">abs</span> <span class="o">|</span> <span class="mi">19</span>
<span class="k">absolute</span> <span class="o">|</span> <span class="mi">19</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Quite frankly, I believe the function operators are more readable, in particular since I’ve never seen (yet) such operators in other programming languages.</p>
pgenv 1.3.2 is out!2022-09-20T00:00:00+00:00https://fluca1978.github.io/2022/09/20/pgenv1.3.2<p>A new release of the PostgreSQL virtual environment manager.</p>
<h1 id="pgenv-132-is-out"><code class="language-plaintext highlighter-rouge">pgenv</code> 1.3.2 is out!</h1>
<p>Today we <a href="https://github.com/theory/pgenv/releases/tag/v1.3.2" target="_blank">released version 1.3.2 of pgenv</a>, the <a href="https://github.com/theory/pgenv" target="_blank">binary manager for PostgreSQL</a>.
<br />
This release fixes a quite subtle bug in the handling of the configuration that prevented custom settings from being correctly loaded back into the running system.
<em>Users are encouraged to upgrade</em> as soon as possible.</p>
<h2 id="a-description-of-the-problem">A description of the problem</h2>
<p><code class="language-plaintext highlighter-rouge">17bob17</code> <a href="https://github.com/theory/pgenv/issues/56" target="_blank">noticed the issue</a>: when you edited your configuration file, either the default or a per-version one, and changed settings in a (Bash) array, the configuration was not correctly loaded.
<br />
It took a lot of time to figure out that the problem was not directly in the way the configuration was loaded, but rather in the way the configuration was stored.
<br />
When <code class="language-plaintext highlighter-rouge">pgenv</code> acquired the configuration settings as arrays, it started using <code class="language-plaintext highlighter-rouge">declare -p</code> as a way to print out a Bash compatible representation of the array, and such representation was stored in the configuration file.
The problem was that <code class="language-plaintext highlighter-rouge">declare -p</code> assumes you want to use <code class="language-plaintext highlighter-rouge">declare</code> again when you re-evaluate the variable (array), and so it places a <code class="language-plaintext highlighter-rouge">declare -a</code> in its output.
<br />
The configuration is then loaded within the <code class="language-plaintext highlighter-rouge">pgenv_configuration_load</code> function, and <code class="language-plaintext highlighter-rouge">declare</code> run inside a function has the same effect as <code class="language-plaintext highlighter-rouge">local</code>, that is, it makes the variables local to that function. Therefore, as soon as <code class="language-plaintext highlighter-rouge">pgenv_configuration_load</code> ends its job, the local variables are gone and the old (previous) ones are kept with their default values. It is a <em>boring masquerading problem due to inner contexts</em>.
<br />
One possible solution could have been to use <code class="language-plaintext highlighter-rouge">-g</code> as a flag to <code class="language-plaintext highlighter-rouge">declare</code>, so as to force the variable to be global and therefore not local to the function, but such a flag is not available in every Bash version (it appeared in Bash 4.2).
<br />
The <code class="language-plaintext highlighter-rouge">-x</code> flag to <code class="language-plaintext highlighter-rouge">declare</code>, to export the variable, did not have any effect either.
<br />
<br />
Therefore, the current release removes the use of <code class="language-plaintext highlighter-rouge">declare</code> altogether when the configuration is sourced back (loaded).</p>
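<p>The scoping effect described above can be reproduced in a few lines of Bash. What follows is only a minimal sketch, not the actual <code class="language-plaintext highlighter-rouge">pgenv</code> code: the names <code class="language-plaintext highlighter-rouge">DEMO_OPTS</code>, <code class="language-plaintext highlighter-rouge">load_with_declare</code> and <code class="language-plaintext highlighter-rouge">load_plain</code> are made up for illustration, and <code class="language-plaintext highlighter-rouge">eval</code> on a string stands in for sourcing the configuration file.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# Minimal sketch: DEMO_OPTS and both loader functions are hypothetical names.

# Default value, set at the top level (global scope).
DEMO_OPTS=( '--default' )

# A stored setting in the style printed by `declare -p`: it starts with `declare -a`.
stored_with_declare="declare -a DEMO_OPTS=('--custom')"

# The same setting stored as a plain assignment, without `declare`.
stored_plain="DEMO_OPTS=('--custom')"

load_with_declare() {
    # `declare` inside a function behaves like `local`:
    # the assignment vanishes when the function returns.
    eval "$stored_with_declare"
}

load_plain() {
    # A plain assignment inside a function updates the global variable.
    eval "$stored_plain"
}

load_with_declare
after_declare="${DEMO_OPTS[0]}"

load_plain
after_plain="${DEMO_OPTS[0]}"

echo "after declare: $after_declare"          # prints: after declare: --default
echo "after plain assignment: $after_plain"   # prints: after plain assignment: --custom
</code></pre></div></div>
<p>The first loader leaves the global array untouched, which is exactly the symptom reported in the issue, while the second one really changes it.</p>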
pgagroal 1.5.0 released!2022-09-15T00:00:00+00:00https://fluca1978.github.io/2022/09/15/pgagroal_1_5<p>A new release of the <code class="language-plaintext highlighter-rouge">pgagroal</code> connection pooler.</p>
<h1 id="pgagroal-150-released"><code class="language-plaintext highlighter-rouge">pgagroal</code> 1.5.0 released!</h1>
<p><a href="https://agroal.github.io/pgagroal/" target="_blank"><code class="language-plaintext highlighter-rouge">pgagroal</code></a> is a fast connection pooler for PostgreSQL, written in the C language.
<br />
A couple of weeks ago a new release, <strong>1.5.0</strong>, came out. I’m writing about it only now because I was on holiday!
<br />
The new release brings a new set of features, in particular a lot of small checks within the configuration file setup (e.g., avoiding duplicated servers or wrong parameters) and a lot of new logging capabilities, including <em>log rotation</em> and <em>log line prefix</em>.
<br />
Other areas of improvements include code clean-up, shell completion for command line tools, and portability towards FreeBSD and OpenBSD systems.
<br />
Last but not least, a new set of <a href="https://github.com/agroal/pgagroal/tree/master/doc/tutorial" target="_blank">tutorials</a> will help the newcomers to correctly start using <code class="language-plaintext highlighter-rouge">pgagroal</code>!</p>
Shell completions for pgagroal2022-08-19T00:00:00+00:00https://fluca1978.github.io/2022/08/19/pgagroalCompletions<p>A small patch to ease the use of <code class="language-plaintext highlighter-rouge">pgagroal</code> tools.</p>
<h1 id="shell-completions-for-pgagroal">Shell completions for <code class="language-plaintext highlighter-rouge">pgagroal</code></h1>
<p>At the beginning of this month I pushed a commit that introduces shell completions for <code class="language-plaintext highlighter-rouge">pgagroal</code> related commands, in particular <code class="language-plaintext highlighter-rouge">pgagroal-cli</code> (used to manage the pooler) and <code class="language-plaintext highlighter-rouge">pgagroal-admin</code> (used to manage authentication and users).
<br />
The <a href="https://github.com/agroal/pgagroal/commit/1296cc4216c73119a1ff4c3a3ffd0c610ca04f69" target="_blank">shell completions</a> work only for Bash and Zsh, and allow you to hit <code class="language-plaintext highlighter-rouge"><TAB></code> after a command and get it automatically completed with the appropriate options.
<br />
While importing the completions in Bash is as simple as <code class="language-plaintext highlighter-rouge">source</code>ing the file, in Zsh you need to enable the completion framework. <a href="https://github.com/agroal/pgagroal/blob/master/doc/tutorial/01_install.md#shell-completion">Detailed instructions about how to enable the completions</a> have been placed in the tutorials.</p>
PostgreSQL 15: changes in the public schema permissions2022-07-15T00:00:00+00:00https://fluca1978.github.io/2022/07/15/PostgreSQL15PublicSchema<p>The upcoming new release of PostgreSQL makes some changes to the <code class="language-plaintext highlighter-rouge">public</code> schema permissions.</p>
<h1 id="postgresql-15-changes-in-the-public-schema-permissions">PostgreSQL 15: changes in the <code class="language-plaintext highlighter-rouge">public</code> schema permissions</h1>
<p>In PostgreSQL 15 the default <code class="language-plaintext highlighter-rouge">public</code> schema that every database has will have a different set of permissions. In fact, before PostgreSQL 15, every user could manipulate the <code class="language-plaintext highlighter-rouge">public</code> schema of a database they do not own.
Starting with the upcoming version, only the database owner will be granted full access to the <code class="language-plaintext highlighter-rouge">public</code> schema, while other users will need an explicit <code class="language-plaintext highlighter-rouge">GRANT</code>.</p>
<p><br />
Imagine the user <code class="language-plaintext highlighter-rouge">luca</code> is the owner of the database <code class="language-plaintext highlighter-rouge">testdb</code>: it means he can do whatever he wants on the database.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SHOW</span> <span class="n">server_version</span><span class="p">;</span>
<span class="n">server_version</span>
<span class="c1">----------------</span>
<span class="mi">15</span><span class="n">beta2</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">current_role</span><span class="p">,</span> <span class="k">current_user</span><span class="p">;</span>
<span class="k">current_role</span> <span class="o">|</span> <span class="k">current_user</span>
<span class="c1">--------------+--------------</span>
<span class="n">luca</span> <span class="o">|</span> <span class="n">luca</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">mytable</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>On the other hand, another user, let’s say <code class="language-plaintext highlighter-rouge">pgbench</code>, cannot:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">current_role</span><span class="p">,</span> <span class="k">current_user</span><span class="p">;</span>
<span class="k">current_role</span> <span class="o">|</span> <span class="k">current_user</span>
<span class="c1">--------------+--------------</span>
<span class="n">pgbench</span> <span class="o">|</span> <span class="n">pgbench</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">mytable2</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">schema</span> <span class="k">public</span>
<span class="n">LINE</span> <span class="mi">1</span><span class="p">:</span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">mytable2</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">mytable</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">mytable</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>That means that <code class="language-plaintext highlighter-rouge">public</code> is now managed like a user-defined schema, and therefore, in order to allow other users to operate on it, an explicit <code class="language-plaintext highlighter-rouge">GRANT</code> must be executed.
<br />
What has changed is that <strong>the <code class="language-plaintext highlighter-rouge">CREATE</code> permission on the <code class="language-plaintext highlighter-rouge">public</code> schema is no longer granted by default</strong>, while <code class="language-plaintext highlighter-rouge">USAGE</code> is as before. Therefore, in order to allow non-owners to create objects, an explicit <code class="language-plaintext highlighter-rouge">GRANT CREATE ON SCHEMA public TO pgbench</code> statement must be executed.
<br />
This affects newly created databases, not those restored from previous backups.
<br />
But there is a trick that could help in restoring the previous behavior: if you set the permissions on <code class="language-plaintext highlighter-rouge">template1</code> (or on another template database), you get them for free in new databases:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template1</span><span class="o">=#</span> <span class="k">GRANT</span> <span class="k">CREATE</span> <span class="k">ON</span> <span class="k">SCHEMA</span> <span class="k">public</span> <span class="k">TO</span> <span class="k">PUBLIC</span><span class="p">;</span>
<span class="k">GRANT</span>
<span class="n">template1</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">newdb</span> <span class="k">WITH</span> <span class="k">OWNER</span> <span class="n">luca</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>And now, connecting as a non-owner user:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">%</span> <span class="n">psql</span> <span class="o">-</span><span class="n">U</span> <span class="n">pgbench</span> <span class="o">-</span><span class="n">h</span> <span class="n">localhost</span> <span class="n">newdb</span>
<span class="n">newdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">table</span> <span class="n">foo</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br />
the permissions are as in previous PostgreSQL versions.
<br />
It is not clear if the above trick will remain in place once the PostgreSQL version exits the beta status; in any case I discourage you from adopting it. The choice of revoking by default the privileges on the <code class="language-plaintext highlighter-rouge">public</code> schema could be annoying, but it is a good choice in terms of security and forces you to decide how to deal with permissions.
</p>
PostgreSQL 15: changes in the low level backup functions2022-07-13T00:00:00+00:00https://fluca1978.github.io/2022/07/13/PostgreSQL15BackupFunctions<p>The upcoming new release of PostgreSQL makes some changes to the low level backup functions.</p>
<h1 id="postgresql-15-changes-in-the-low-level-backup-functions">PostgreSQL 15: changes in the low level backup functions</h1>
<p>The upcoming PostgreSQL 15 release implements a few changes to the <em>low level</em> backup functions.
<br />
Nowadays I suspect nobody, except backup solution developers, knows or uses such functions, but I clearly remember when we developed our own scripts to do a continuous backup using functions like <code class="language-plaintext highlighter-rouge">pg_start_backup()</code> and <code class="language-plaintext highlighter-rouge">pg_stop_backup()</code>.
<br />
You should use other backup solutions today, like the great <a href="https://pgbackrest.org/" target="_blank">pgBackRest</a>.
<br />
In any case, what are the changes?
<br />
As you can read from <a href="https://www.postgresql.org/docs/15/release-15.html" target="_blank">the release notes</a>, there are mainly two:</p>
<ol>
<li><em>functions have been renamed</em> to a more consistent naming scheme.</li>
<li>a few functions and modes have been removed.</li>
</ol>
<p>The <a href="https://www.postgresql.org/docs/15/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP" target="_blank">functions are now named as <code class="language-plaintext highlighter-rouge">pg_backup_</code></a>, so <strong><code class="language-plaintext highlighter-rouge">pg_start_backup()</code> becomes <code class="language-plaintext highlighter-rouge">pg_backup_start()</code></strong>, and similarly, <strong><code class="language-plaintext highlighter-rouge">pg_stop_backup()</code> becomes <code class="language-plaintext highlighter-rouge">pg_backup_stop()</code></strong>. Quite frankly I like this decision, it makes the naming simpler to search for and to remember.
<br />
Moreover, the <em>exclusive backup mode</em>, deprecated since version 9.6 (if I remember correctly), is gone. This was the only way to perform a low level backup back in the day, but it has been deprecated for a long time. One of the problems with exclusive backups is that the system creates a label file that prevents the primary from restarting after a crash, and in turn this led people to delete the label file on standby servers as well.
This is no longer a problem, and the <code class="language-plaintext highlighter-rouge">pg_backup_start()</code> and <code class="language-plaintext highlighter-rouge">pg_backup_stop()</code> functions no longer accept the exclusive backup parameter.
<br />
As a consequence of this choice, the functions <code class="language-plaintext highlighter-rouge">pg_is_in_backup()</code> and <code class="language-plaintext highlighter-rouge">pg_backup_start_time()</code> have been removed because <em>they were focused only on exclusive backups</em>, which no longer exist.</p>
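<p>If you still maintain old backup scripts, a quick scan for the old names tells you what needs attention before upgrading. The following is only a sketch: the sample script text is invented for illustration, and the list of names is taken from the functions mentioned above. Note that a plain rename may not be enough, because old calls can still carry the exclusive parameter.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# Old function names that are renamed or removed in PostgreSQL 15.
old_names='pg_start_backup|pg_stop_backup|pg_is_in_backup|pg_backup_start_time'

# Hypothetical sample of a legacy backup script, for illustration only.
legacy_script="SELECT pg_start_backup('nightly', false, false);
-- copy the data directory here
SELECT * FROM pg_stop_backup(false);"

# Collect every line (prefixed by its line number) that still uses an old name.
matches="$(printf '%s\n' "$legacy_script" | grep -nE "$old_names")"

printf '%s\n' "$matches"
# prints:
# 1:SELECT pg_start_backup('nightly', false, false);
# 3:SELECT * FROM pg_stop_backup(false);
</code></pre></div></div>
<p>Against real files, the same pattern can be passed to <code class="language-plaintext highlighter-rouge">grep -rnE</code> together with the directory that holds your scripts.</p>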
A new pgenv release2022-06-27T00:00:00+00:00https://fluca1978.github.io/2022/06/27/PgenvNewRelease<p>A new release of the PostgreSQL virtual environment manager.</p>
<h1 id="a-new-pgenv-release">A new <code class="language-plaintext highlighter-rouge">pgenv</code> release</h1>
<p>Today we <a href="https://github.com/theory/pgenv/releases/tag/v1.3.1" target="_blank">released version 1.3.1 of pgenv</a>, the <a href="https://github.com/theory/pgenv" target="_blank">binary manager for PostgreSQL</a>.
<br />
This release fixes an annoying bug introduced on Mac OSX (Bash) that was preventing <code class="language-plaintext highlighter-rouge">pgenv</code> from working properly on that platform. The bug was introduced with <a href="https://github.com/theory/pgenv/commit/a547e3eaa4d21da5838fc6aeb0326f2c66a7b604" target="_blank">a commit of mine</a> that was fixing another bug about wrong configuration reload.
<br />
Unluckily, the bug went unnoticed because I don’t have access to a Mac OSX machine, and I was quite sure Bash was much more portable than what we discovered!
<br />
<br />
Anyway, sorry for the bug, and please update your version of <code class="language-plaintext highlighter-rouge">pgenv</code> and enjoy it!</p>
Ordinality in function queries2022-06-23T00:00:00+00:00https://fluca1978.github.io/2022/06/23/PostgreSQLOrdinality<p>A trick about queries that involve functions.</p>
<h1 id="ordinality-in-fuction-queries">Ordinality in function queries</h1>
<p>The PostgreSQL <code class="language-plaintext highlighter-rouge">SELECT</code> statement allows you to query functions that return a <em>result set</em> (either a <code class="language-plaintext highlighter-rouge">SETOF</code> or a <code class="language-plaintext highlighter-rouge">TABLE</code>), which is used as the source of tuples for the query itself.
<br />
There is nothing surprising about that!
<br />
However, the <code class="language-plaintext highlighter-rouge">SELECT</code> statement, when invoked against a function that provides a result set, allows an extra clause to appear: <a href="https://www.postgresql.org/docs/14/sql-select.html" target="_blank"><code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code></a>. This clause adds a column to the result set with a counter (of type <code class="language-plaintext highlighter-rouge">bigint</code>) representing the number of the tuple as returned by the function.
<br />
<br />
Why is this important? Because you don’t need your function to provide a kind of <em>tuple counter</em> by itself.</p>
<h2 id="with-ordinality-in-action"><code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> in action</h2>
<p>Let’s take a simple example to understand how it works. Let’s create a function that returns a table:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">animals</span><span class="p">(</span> <span class="n">l</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="mi">5</span><span class="p">,</span>
<span class="n">animal</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'cat'</span><span class="p">,</span>
<span class="k">owner</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'nobody'</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span><span class="p">,</span> <span class="n">description</span> <span class="nb">text</span><span class="p">,</span> <span class="n">mood</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">i</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">j</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">l</span> <span class="n">LOOP</span>
<span class="n">pk</span> <span class="p">:</span><span class="o">=</span> <span class="n">i</span><span class="p">;</span>
<span class="n">description</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'%s #%s owned by %s'</span><span class="p">,</span> <span class="n">animal</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="k">owner</span> <span class="p">);</span>
<span class="n">j</span> <span class="p">:</span><span class="o">=</span> <span class="n">random</span><span class="p">()</span> <span class="o">*</span> <span class="mi">100</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">j</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">mood</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'good'</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">mood</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'bad'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Generating % # % with mood %'</span><span class="p">,</span> <span class="n">animal</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">mood</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above function <code class="language-plaintext highlighter-rouge">animals()</code> produces an output with a simple name of the animal (numbered), the index of the generated tuple (i.e., a counter) and a randomly selected mood.
<br />
It is easy to test it out:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">();</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">description</span> <span class="o">|</span> <span class="n">mood</span>
<span class="c1">----+------------------------+------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">pk</code> column contains the counter of the generated tuple, so we know that the <code class="language-plaintext highlighter-rouge">cat #1</code> tuple has been generated first, the <code class="language-plaintext highlighter-rouge">cat #2</code> second, and so on.
<br />
Let’s kick <code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> in:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">description</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="k">ordinality</span>
<span class="c1">----+------------------------+------+------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">5</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> clause must follow the function it applies to. Such a clause appends a new column to the result set, by default named <code class="language-plaintext highlighter-rouge">ordinality</code>, with a progressive counter. Note how <code class="language-plaintext highlighter-rouge">pk</code> and <code class="language-plaintext highlighter-rouge">ordinality</code> contain the very same value: <strong><code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> is keeping track for you of the tuples produced by the result set stream (the function)</strong>, so you don’t need to compute it yourself.
<br />
Clearly, this works also with a reordering of the tuples, because the clause does not number the tuples as they appear in the output, but rather the <em>instant</em> (or better, the <em>sequence</em>) in which a tuple has been added to the result set:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">random</span><span class="p">();</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">description</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="k">ordinality</span>
<span class="c1">----+------------------------+------+------------</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">5</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">1</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is also possible to rename the <code class="language-plaintext highlighter-rouge">ordinality</code> column with an alias, like the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span>
<span class="k">AS</span> <span class="n">cat</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">mood</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">random</span><span class="p">();</span>
<span class="n">i</span> <span class="o">|</span> <span class="n">name</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="n">n</span>
<span class="c1">---+------------------------+------+---</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">5</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">3</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p><em>Clearly, you have to alias the whole result set, not a single column!</em></p>
<h2 id="with-ordinality-as-a-filtering-condition"><code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> as a filtering condition</h2>
<p>With either the automatically named <code class="language-plaintext highlighter-rouge">ordinality</code> column or a custom-named one available, it is possible to use that column in the <code class="language-plaintext highlighter-rouge">WHERE</code> clause of a query:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span> <span class="k">AS</span> <span class="n">cat</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">mood</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="k">WHERE</span> <span class="n">n</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">random</span><span class="p">();</span>
<span class="n">i</span> <span class="o">|</span> <span class="n">name</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="n">n</span>
<span class="c1">---+------------------------+------+---</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">2</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the above query filters on the <code class="language-plaintext highlighter-rouge">n</code> column to keep only the even-numbered tuples.</p>
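<p>As a self-contained sketch of the same idea that does not depend on <code class="language-plaintext highlighter-rouge">animals()</code>, the built-in <code class="language-plaintext highlighter-rouge">unnest()</code> function can stand in for it. The array content and the aliases <code class="language-plaintext highlighter-rouge">t</code>, <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">n</code> are made up for illustration only:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- unnest() is used here only as a stand-in for any set-returning function.
-- The last alias, n, names the column added by WITH ORDINALITY.
SELECT t.name, t.n
  FROM unnest(ARRAY['a', 'b', 'c', 'd', 'e']) WITH ORDINALITY AS t(name, n)
 WHERE t.n BETWEEN 2 AND 4   -- keep only a slice of the produced rows
 ORDER BY t.n;
</code></pre></div></div>
<p>The filter works on the position of each row as it came out of the function, so this should keep the second, third and fourth array elements regardless of their values.</p>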
<h2 id="with-ordinality-vs-row_number"><code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> vs <code class="language-plaintext highlighter-rouge">row_number()</code></h2>
<p>You may think that the window function <a href="https://www.postgresql.org/docs/14/functions-window.html" target="_blank"><code class="language-plaintext highlighter-rouge">row_number()</code></a> does the same job as <code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code>, at least in the function call scenario.
However, the <code class="language-plaintext highlighter-rouge">row_number()</code> window function is a different beast: it numbers the rows within a window that you define over the result set, independently of the order in which the function produced them. <em>In short, window functions cover a different set of problems!</em>
<br />
Therefore, even if the following seems to produce the very same result:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span><span class="p">,</span> <span class="n">row_number</span><span class="p">()</span> <span class="n">OVER</span> <span class="p">()</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">description</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="k">ordinality</span> <span class="o">|</span> <span class="n">row_number</span>
<span class="c1">----+------------------------+------+------------+------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">5</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>as soon as you define the window in a more specific way, for example with an explicit ordering, you see different results:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span><span class="p">,</span> <span class="n">row_number</span><span class="p">()</span> <span class="n">OVER</span> <span class="p">(</span> <span class="k">order</span> <span class="k">by</span> <span class="n">pk</span> <span class="k">desc</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">description</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="k">ordinality</span> <span class="o">|</span> <span class="n">row_number</span>
<span class="c1">----+------------------------+------+------------+------------</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">5</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above, you can see that the last row produced by the function (the one with <code class="language-plaintext highlighter-rouge">ordinality</code> set to <code class="language-plaintext highlighter-rouge">5</code>) is the first row numbered by <code class="language-plaintext highlighter-rouge">row_number()</code>.
<br />
Another example of different results can be quickly obtained when joining:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span><span class="p">,</span> <span class="n">row_number</span><span class="p">()</span> <span class="n">OVER</span> <span class="p">()</span>
<span class="k">FROM</span> <span class="n">animals</span><span class="p">()</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span><span class="p">,</span>
<span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="k">WITH</span> <span class="k">ORDINALITY</span> <span class="k">as</span> <span class="n">x</span><span class="p">(</span><span class="n">gs</span><span class="p">,</span> <span class="n">counter</span><span class="p">);</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">description</span> <span class="o">|</span> <span class="n">mood</span> <span class="o">|</span> <span class="k">ordinality</span> <span class="o">|</span> <span class="n">gs</span> <span class="o">|</span> <span class="n">counter</span> <span class="o">|</span> <span class="n">row_number</span>
<span class="c1">----+------------------------+------+------------+----+---------+------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">5</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">6</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">7</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">8</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">9</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">10</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">1</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">bad</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">11</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">2</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">12</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">3</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">13</span>
<span class="mi">4</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">4</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">14</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">cat</span> <span class="o">#</span><span class="mi">5</span> <span class="n">owned</span> <span class="k">by</span> <span class="n">nobody</span> <span class="o">|</span> <span class="n">good</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">15</span>
<span class="p">(</span><span class="mi">15</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>For every <code class="language-plaintext highlighter-rouge">generate_series()</code> tuple (column <code class="language-plaintext highlighter-rouge">counter</code>) there are five <code class="language-plaintext highlighter-rouge">animals()</code> tuples (column <code class="language-plaintext highlighter-rouge">ordinality</code>), each one progressively tracked by <code class="language-plaintext highlighter-rouge">row_number()</code>.</p>
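<p>The difference becomes even more evident with a partitioned window. The following is a minimal sketch that relies only on <code class="language-plaintext highlighter-rouge">generate_series()</code>; the aliases <code class="language-plaintext highlighter-rouge">x</code>, <code class="language-plaintext highlighter-rouge">gs</code>, <code class="language-plaintext highlighter-rouge">counter</code> and <code class="language-plaintext highlighter-rouge">rn</code>, as well as the odd/even partitioning, are arbitrary choices made for illustration:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- counter comes from WITH ORDINALITY and follows the order of production;
-- rn comes from row_number() and restarts within every partition.
SELECT x.gs,
       x.counter,
       row_number() OVER (PARTITION BY x.counter % 2 ORDER BY x.counter) AS rn
  FROM generate_series(10, 50, 10) WITH ORDINALITY AS x(gs, counter)
 ORDER BY x.counter;
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">counter</code> should run from 1 to 5 across the whole result set, while <code class="language-plaintext highlighter-rouge">rn</code> counts separately within the odd and the even group, which is something <code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code> alone cannot express.</p>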
<h2 id="conclusions">Conclusions</h2>
<p>Why is this ordinality thing important?
<br />
You may be tempted to include in your function result sets some extra information that eases the post-processing of the result set itself. This practice should be avoided when the “external world” (i.e., the query using the function) is able to add such extra information by itself. Not only will you avoid wasting resources, you will also keep your code cleaner and more readable.</p>
An introduction to pgagroal (Italian)2022-06-10T00:00:00+00:00https://fluca1978.github.io/2022/06/10/PgTrainingVideoTalkpgagroal<p>The video recording about my <code class="language-plaintext highlighter-rouge">pgagroal</code> talk.</p>
<h1 id="an-introduction-to-pgagroal-italian">An introduction to pgagroal (Italian)</h1>
<p>At the past <a href="http://pgtraining.com" target="_blank">PgTraining</a> online event we had a set of amazing talks.
I gave an introduction about <a href="https://agroal.github.io/pgagroal/configuration.html"><code class="language-plaintext highlighter-rouge">pgagroal</code></a>, a very interesting connection pooler for PostgreSQL.</p>
<p><br />
Here is the recording of my talk (<em>in Italian</em>):</p>
<center>
<iframe id="odysee-iframe" width="560" height="315" src="https://odysee.com/$/embed/PGAGROAL_pgtraining_20220429/f2d6eb7ec81be473307e47eaaccfdc174b346ed6?r=EQ7cnUa3BF83nt8C3T9tMALPDUn2vBpD" allowfullscreen=""></iframe>
</center>
<p><br /></p>
<p>and here are the <a href="https://gitlab.com/pgtraining/slides/-/blob/master/webinar-20220429/2022_PGTRAINING_PGAGROAL.pdf" target="_blank">slides (in Italian)</a>.</p>
<p><br />
And don’t forget to glance at all the other <a href="https://gitlab.com/pgtraining/slides/-/tree/master/webinar-20220429" target="_blank">online material</a> shared about the online event!</p>
pgenv `switch`2022-05-11T00:00:00+00:00https://fluca1978.github.io/2022/05/11/pgenv_switch<p><code class="language-plaintext highlighter-rouge">pgenv</code> 1.3.0 adds a new command: <code class="language-plaintext highlighter-rouge">switch</code></p>
<h1 id="pgenv-switch"><code class="language-plaintext highlighter-rouge">pgenv switch</code></h1>
<p><a href="https://github.com/theory/pgenv" target="_blank"><code class="language-plaintext highlighter-rouge">pgenv</code></a>, a simple but great shell script that helps manage several PostgreSQL instances on your machine, has been improved in the last few days.
<br />
<br />
Thanks to the contribution of <strong>Nils Dijk</strong> <a href="https://github.com/thanodnl" target="_blank"><code class="language-plaintext highlighter-rouge">@thanodnl</code> on GitHub</a>, there is now a new command named <code class="language-plaintext highlighter-rouge">switch</code> that allows you to quickly prepare the whole environment for a different PostgreSQL version without having to start it.</p>
<p><br />
The problem, as described in <a href="https://github.com/theory/pgenv/pull/53" target="_blank">this pull request</a>, was that the <code class="language-plaintext highlighter-rouge">use</code> command, trying to be <em>smart</em>, starts a PostgreSQL instance once it has been chosen. On the other hand, <code class="language-plaintext highlighter-rouge">switch</code> allows you to pre-select the PostgreSQL instance to use without starting it. This is handy, for example, when you want to compile some code against a particular version of PostgreSQL (managed by <code class="language-plaintext highlighter-rouge">pgenv</code>) but don’t want to waste your computer resources starting up PostgreSQL.
<br />
To some extent, <code class="language-plaintext highlighter-rouge">switch</code> can be thought of as an efficient equivalent of:</p>
<p><br />
<br /></p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv use 14.2
% pgenv stop
</code></pre></div></div>
<p><br />
<br /></p>
<p>The command has been implemented as a <em>subcase</em> of <code class="language-plaintext highlighter-rouge">use</code>, but while <code class="language-plaintext highlighter-rouge">use</code> does fire up an instance, <code class="language-plaintext highlighter-rouge">switch</code> does not.
<br />
However, if an instance is already running, <strong>switching to a new instance will stop the previously running one</strong>!</p>
<h2 id="other-minor-contributions">Other minor contributions</h2>
<p>If you have <code class="language-plaintext highlighter-rouge">pgenv</code> on your radar, you have probably seen another release in the last few days, which fixed a bug, spotted by Nils Dijk, in the management of the configuration.</p>
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">pgenv</code> keeps growing and adding new features, and is becoming a more complex beast than it was in the beginning.
Hopefully, it can help your workflow too!</p>
Don't forget the PgTraining online webinar on 2022-04-29 (Italian)2022-04-20T00:00:00+00:00https://fluca1978.github.io/2022/04/20/pgtrainingOnlineEventReminder<p>Yet another online event organized by PgTraining!</p>
<h1 id="dont-forget-the-pgtraining-online-webinar-on-2022-04-29-italian">Don’t forget the PgTraining online webinar on 2022-04-29 (Italian)</h1>
<p><strong>There are still some seats available for another great <em>online event</em> brought to you by <a href="http://pgtraining.com" target="_blank">PgTraining</a></strong>!</p>
<p><br />
<br />
Don’t forget to get your <strong>free of charge</strong> access to the online event, which will be held in Italian on <strong>next April 29th</strong>: <a href="https://www.eventbrite.it/e/biglietti-pgtraining-on-line-session-2022-04-262565138397" target="_blank"><strong>hurry up and get your free ticket</strong></a>.</p>
Formatting SQL code with pgFormatter within Emacs2022-04-13T00:00:00+00:00https://fluca1978.github.io/2022/04/13/EmacsPgFormatter<p>Editing SQL and PostgreSQL related code within Emacs, in a beautiful way!</p>
<h1 id="formatting-sql-code-with-pgformatter-within-emacs">Formatting SQL code with pgFormatter within Emacs</h1>
<p><a href="https://github.com/darold/pgFormatter" target="_blank">pgFormatter</a> is a great Perl 5 tool that parses SQL input and reformats it in a beautiful way.
<br />
Despite the name, it works with any piece of SQL code, since it supports the standards from <em>SQL-92</em> to <em>SQL-2011</em>, plus all the little keywords and details that are specific to PostgreSQL.
<br />
<br />
Being an <em>Emacs addict</em> myself, I reasoned about how to “plug” <code class="language-plaintext highlighter-rouge">pgFormatter</code> into Emacs, and I came up with a short and ugly snippet of code that does the trick.
<br />
But, being Emacs what it is, there is no particular need to plug in such code, as I will show you in a moment.</p>
<h2 id="use-pgformatter-from-emacs-the-portable-way">Use <code class="language-plaintext highlighter-rouge">pgFormatter</code> from Emacs, the <em>portable</em> way</h2>
<p>Emacs allows users to run a <em>shell command</em> over a region or a buffer content. The <code class="language-plaintext highlighter-rouge">M-|</code> key binding (mnemonic: <em>pipe</em>) does that. With the universal prefix (<code class="language-plaintext highlighter-rouge">C-u</code>) it can also replace the region or buffer you are running the command against.
<br />
This means that, given your own Emacs instance, you can format the code within the region by simply doing</p>
<p><br />
<br />
<code class="language-plaintext highlighter-rouge">C-u M-| pg_format</code>
<br />
<br />
where <code class="language-plaintext highlighter-rouge">pg_format</code> is the name of the executable of pgFormatter (e.g., it is called like that in Rocky Linux).</p>
<h2 id="a-more-lispy-approach">A more lispy approach</h2>
<p>I <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/emacs/pgformatter.el" target="_blank">developed a simple and ugly snippet of Lisp</a> that can be loaded into Emacs to make the pgFormatter usage quicker.</p>
<p><br />
<br /></p>
<div class="language-common-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">pgformatter-on-region</span> <span class="p">()</span>
<span class="s">"A function to invoke pgFormatter as an external program."</span>
<span class="p">(</span><span class="nv">interactive</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">b</span> <span class="p">(</span><span class="k">if</span> <span class="nv">mark-active</span> <span class="p">(</span><span class="nb">min</span> <span class="p">(</span><span class="nv">point</span><span class="p">)</span> <span class="p">(</span><span class="nv">mark</span><span class="p">))</span> <span class="p">(</span><span class="nv">point-min</span><span class="p">)))</span>
<span class="p">(</span><span class="nv">e</span> <span class="p">(</span><span class="k">if</span> <span class="nv">mark-active</span> <span class="p">(</span><span class="nb">max</span> <span class="p">(</span><span class="nv">point</span><span class="p">)</span> <span class="p">(</span><span class="nv">mark</span><span class="p">))</span> <span class="p">(</span><span class="nv">point-max</span><span class="p">)))</span>
<span class="p">(</span><span class="nv">pgfrm</span> <span class="s">"/usr/bin/pg_format"</span> <span class="p">)</span> <span class="p">)</span>
<span class="p">(</span><span class="nv">shell-command-on-region</span> <span class="nv">b</span> <span class="nv">e</span> <span class="nv">pgfrm</span> <span class="p">(</span><span class="nv">current-buffer</span><span class="p">)</span> <span class="mi">1</span><span class="p">))</span> <span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above piece of code defines an interactive function (i.e., a function that can be invoked with <code class="language-plaintext highlighter-rouge">M-x</code>) named <code class="language-plaintext highlighter-rouge">pgformatter-on-region</code>. The function defines three variables:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">b</code> is the beginning of the region to format;</li>
<li><code class="language-plaintext highlighter-rouge">e</code> is the end of the region to format;</li>
<li><code class="language-plaintext highlighter-rouge">pgfrm</code> is the path to the executable to invoke.</li>
</ul>
<p>If there is a region active (i.e., <code class="language-plaintext highlighter-rouge">mark-active</code>) the function operates over the region, otherwise, if no region is applied, it operates on the whole buffer.
<br />
In the end, the function invokes the command stored in <code class="language-plaintext highlighter-rouge">pgfrm</code> using the interactive function <code class="language-plaintext highlighter-rouge">shell-command-on-region</code> over the current buffer. The last argument, <code class="language-plaintext highlighter-rouge">1</code>, indicates that I want to replace the content of the current region (or buffer) with the command output.</p>
<p><br />
<br />
In order to execute the formatting, I then just need to <code class="language-plaintext highlighter-rouge">M-x pgformatter-on-region</code> with either an active region or not. It is also possible to bind the function to a keyboard sequence with something like:</p>
<p><br />
<br /></p>
<div class="language-common-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">"C-i"</span><span class="p">)</span> <span class="ss">'pgformatter-on-region</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>or a local key map entry.</p>
<p><br />
<br />
The end result is something like the following:</p>
<p><br /></p>
<center>
<img src="/images/posts/emacs/pgformatter.gif" width="50%" />
</center>
pgbadger incremental mode via SSH2022-04-06T00:00:00+00:00https://fluca1978.github.io/2022/04/06/pgbadger_ssh_download<p>How great is <code class="language-plaintext highlighter-rouge">pgbadger</code>?</p>
<h1 id="pgbadger-incremental-mode-via-ssh">pgbadger incremental mode via SSH</h1>
<p><a href="https://github.com/darold/pgbadger" target="_blank">pgbadger</a> is a great tool, and quite frankly I suggest that everyone I talk to about PostgreSQL install it!
<br />
Why?
<br />
It is <em>cheap</em> and does its job in analyzing logs and providing you insights about what happened in your cluster.
<br />
<br />
A few days ago I caught a strange, to me, behavior. <code class="language-plaintext highlighter-rouge">pgbadger</code> has a very handy <em>incremental mode</em> that allows you to keep it running processing new logs every day (or whatever period you choose) and get historical and up-to-date insights.
Well, when downloading a file over an SSH connection, this incremental behavior was not working.
<br />
Uhm, I was sure it was working, since I use it quite often, but I was unable to understand what I was missing in the configuration of <code class="language-plaintext highlighter-rouge">pgbadger</code>. After a few experiments and comparisons with other working systems of mine, I found that the <code class="language-plaintext highlighter-rouge">-r</code> (<em>remote</em>) flag was able to work over SSH, while a “simple” URI like <code class="language-plaintext highlighter-rouge">ssh://me@you//var/postgresql/logs</code> was not.
<br />
<br />
<a href="https://github.com/darold/pgbadger/issues/723" target="_blank">I reported the issue</a>, and in less than a week <a href="https://github.com/darold/pgbadger/commit/6a4750b35a49ed2f9153315a3642aea3c27db556" target="_blank">the problem was fixed</a>!
<br />
Well, this is unfair: it is true that the commit came a week after the initial issue, but <strong><a href="https://github.com/darold/pgbadger/issues/723#issuecomment-1078753300" target="_blank">after only 48 hours there was a commit aimed at fixing the problem</a></strong>; the rest of the time was spent going back and forth, across time zones, exchanging tests and their results.
<br />
<br />
<em>This is something you simply don’t get in your commercial ecosystem!</em>
<br />
<br />
Thanks for the great work and keep this useful project going!</p>
pgagroal log rotation and formatting2022-04-04T00:00:00+00:00https://fluca1978.github.io/2022/04/04/pgagroal_log_rotation<p>My small contributions to <code class="language-plaintext highlighter-rouge">pgagroal</code>.</p>
<h1 id="pgagroal-log-rotation-and-formatting"><code class="language-plaintext highlighter-rouge">pgagroal</code> log rotation and formatting</h1>
<p>A few weeks ago I made a small contribution to <a href="https://agroal.github.io/pgagroal/" target="_blank"><code class="language-plaintext highlighter-rouge">pgagroal</code></a>, the <em>high-performance</em> PostgreSQL connection pooler, in order to implement <em>log rotation</em> and <em>log formatting</em>.
<br />
At last, my contribution was <a href="https://github.com/agroal/pgagroal/pull/216" target="_blank">accepted and merged</a>, but I did not find the time to write about it until now.</p>
<h2 id="the-issues">The issues</h2>
<p>As you can read in the issues about <a href="https://github.com/agroal/pgagroal/issues/45">log rotation</a> and <a href="https://github.com/agroal/pgagroal/issues/44" target="_blank">log formatting</a>, <code class="language-plaintext highlighter-rouge">pgagroal</code> was born with very minimal support for logging.
By “minimal” I mean that the log file name was not <code class="language-plaintext highlighter-rouge">strftime(3)</code> compatible, therefore no placeholders were available in the log filename, and at the same time there was no rotation of logs at all.
<br />
Therefore I decided to try to implement both features, and here is a short description of what I did.</p>
<h3 id="strftime"><code class="language-plaintext highlighter-rouge">strftime</code></h3>
<p>This has been the first place to start: allow the support of <code class="language-plaintext highlighter-rouge">strftime(3)</code> compatible strings in the <code class="language-plaintext highlighter-rouge">log_filename</code> parameter.
This has been quite easy, since the only need was to use <code class="language-plaintext highlighter-rouge">strftime</code> when <a href="https://github.com/agroal/pgagroal/commit/ec08c52cc3a1589fa413e6af2141d01b5f9fa32a#diff-832ef7b42a93f7c071fa992df0be71e172ab68c7c85d8e87c9297c0225ddfbadR243" target="_blank">opening the log file</a>.</p>
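<p>As a minimal sketch of the idea (not the actual <code class="language-plaintext highlighter-rouge">pgagroal</code> code), expanding a <code class="language-plaintext highlighter-rouge">strftime(3)</code> pattern into a concrete file name could look like the following. The function name <code class="language-plaintext highlighter-rouge">expand_log_filename</code>, its signature and the use of UTC are assumptions made only for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not the actual pgagroal function): expands a
 * strftime(3) pattern such as "pgagroal-%Y-%m-%d.log" into a file name.
 * UTC is used here only to keep the example reproducible.
 * Returns 0 on success, -1 on failure.
 */
int
expand_log_filename(const char* pattern, time_t when, char* out, size_t out_size)
{
   struct tm* tm;

   assert(pattern != NULL && out != NULL);

   if (pattern[0] == '\0' || out_size == 0)
   {
      return -1;
   }

   tm = gmtime(&when);
   if (tm == NULL)
   {
      return -1;
   }

   /* strftime(3) returns 0 when the expanded name does not fit into out */
   if (strftime(out, out_size, pattern, tm) == 0)
   {
      return -1;
   }

   return 0;
}
```

<p>The resulting string is what would then be handed to <code class="language-plaintext highlighter-rouge">fopen(3)</code>; a pattern without placeholders simply expands to itself.</p>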
<h3 id="log-rotation">Log rotation</h3>
<p>This was a much bigger task. First of all, I had to ensure that every time a new rotation was required, the system was able to rotate the logs.
<br />
I decided to perform the <em>check</em> every time a new log entry is written. This is clearly an inefficient approach, since the system checks for rotation much more often than required, which can also slow down the logging system. However, the idea is that <code class="language-plaintext highlighter-rouge">pgagroal</code> is not going to log so much data as to be impacted by the continuous check for rotation. Moreover, this was a kind of forced choice, since logging is not done by a separate process.
<br />
The next step was to ensure that, once a log rotation is required, the system can rotate the log file. In order to do that, two changes were required:</p>
<ul>
<li>the <code class="language-plaintext highlighter-rouge">log_filename</code> should be able to support different names, in particular by becoming a <code class="language-plaintext highlighter-rouge">strftime(3)</code> compatible string;</li>
<li>the log file can no longer be opened only at application startup, so I needed a dedicated function to re-open the log file with a new name when needed.
<br />
In order to speed things up a little, I also provided a utility function to test whether log rotation is enabled. Therefore, when rotation is disabled, none of the above will ever happen.
<br />
Of course, the user must have some parameters to control log rotation, so I introduced <code class="language-plaintext highlighter-rouge">log_rotation_age</code> and <code class="language-plaintext highlighter-rouge">log_rotation_size</code>, both accepting strings that have to be converted into seconds and bytes respectively. I provided the parsing functions as well.
<br />
Rotating the log on a size basis was quite simple: I had to test whether the file size has exceeded <code class="language-plaintext highlighter-rouge">log_rotation_size</code>. It was much more difficult to implement the rotation by age.
<br />
The rotation by age was implemented like this: a global variable keeps track of the last time the log file was opened or re-opened. Then, when I need to check for log rotation, I take the current time and compute the difference between it and the last time the log file was opened. If that number of seconds is greater than <code class="language-plaintext highlighter-rouge">log_rotation_age</code>, the log must be closed and re-opened.
<br />
<br />
There is an important consequence of the above logic: the rotation does not happen <strong>exactly</strong> when it is supposed to happen, but always with a size/time delay. In other words, a log file is allowed to exceed <code class="language-plaintext highlighter-rouge">log_rotation_size</code> by a single log entry (therefore, a few bytes), and no time-based rotation will happen before a new log entry is flushed. This means that on a server with little activity, you could see the rotation happen much later than what you configured.</li>
</ul>
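<p>The check described above can be sketched as follows. This is only an illustration of the logic, not the code merged into <code class="language-plaintext highlighter-rouge">pgagroal</code>: the <code class="language-plaintext highlighter-rouge">rotation_state</code> structure, its field names and the <code class="language-plaintext highlighter-rouge">rotation_required</code> signature are hypothetical, and a value of zero is assumed to mean that the corresponding kind of rotation is disabled.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical state, for illustration only. */
struct rotation_state
{
   size_t max_size;   /* log_rotation_size in bytes, 0 disables the check  */
   double max_age;    /* log_rotation_age in seconds, 0 disables the check */
   time_t opened_at;  /* last time the log file was opened or re-opened    */
};

/*
 * Meant to be called right after a log entry has been written:
 * current_size is the size of the log file at that moment.
 */
bool
rotation_required(const struct rotation_state* st, size_t current_size, time_t now)
{
   assert(st != NULL);

   /* rotation by size: the file has already exceeded the limit */
   if (st->max_size > 0 && current_size > st->max_size)
   {
      return true;
   }

   /* rotation by age: seconds elapsed since the file was (re)opened */
   if (st->max_age > 0 && difftime(now, st->opened_at) > st->max_age)
   {
      return true;
   }

   return false;
}
```

<p>Since the function only runs after an entry has been written, the file can exceed the size limit by one entry, and an idle server never reaches the age test: exactly the delay discussed above.</p>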
<p><br />
Last, some kind of <em>truncation</em> had to be implemented for when a log rotates. Luckily, there was already a parameter, named <code class="language-plaintext highlighter-rouge">log_mode</code>, for opening a log file in append or truncate mode. I reused that logic in the function that re-opens the log file.</p>
<h3 id="log-formatting">Log Formatting</h3>
<p>The last piece to add to the picture was the log formatting option. This was, after all, quite easy: every log entry was flushed with a fixed <code class="language-plaintext highlighter-rouge">strftime(3)</code> preamble; it was sufficient to provide an option, <code class="language-plaintext highlighter-rouge">log_line_prefix</code>, to pass to <code class="language-plaintext highlighter-rouge">strftime(3)</code> as a configurable preamble.</p>
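<p>To give an idea of how a configurable prefix can be applied, here is a small sketch. It is not the real <code class="language-plaintext highlighter-rouge">pgagroal</code> implementation: <code class="language-plaintext highlighter-rouge">format_log_line</code>, its arguments, the single space between prefix and message and the use of UTC are all assumptions chosen to keep the example short and reproducible.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: builds "<prefix> <message>\n", where the prefix is
 * obtained by passing prefix_fmt (the role played by log_line_prefix) to
 * strftime(3). Returns the number of characters written, or -1 on failure.
 */
int
format_log_line(const char* prefix_fmt, time_t when, const char* msg, char* out, size_t out_size)
{
   char prefix[128];
   struct tm* tm;
   int written;

   assert(prefix_fmt != NULL && msg != NULL && out != NULL);

   tm = gmtime(&when);
   if (tm == NULL)
   {
      return -1;
   }

   /* an empty or too long prefix is treated as an error in this sketch */
   if (strftime(prefix, sizeof(prefix), prefix_fmt, tm) == 0)
   {
      return -1;
   }

   written = snprintf(out, out_size, "%s %s\n", prefix, msg);
   if (written < 0 || (size_t) written >= out_size)
   {
      return -1; /* the line did not fit into out */
   }

   return written;
}
```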
<h3 id="glance-at-contributed-code">Glance at contributed code</h3>
<p>Here you can find a glance at the <a href="https://github.com/agroal/pgagroal/commit/ec08c52cc3a1589fa413e6af2141d01b5f9fa32a#diff-832ef7b42a93f7c071fa992df0be71e172ab68c7c85d8e87c9297c0225ddfbadR243" target="_blank">contributed code</a>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">log_rotation_enabled()</code> returns <em>true</em> if the log rotation is active. The log rotation is automatically disabled if some configuration parameter is not set properly;</li>
<li><code class="language-plaintext highlighter-rouge">log_rotation_disable()</code> turns off the log rotation. This is done as a last resort when the configuration is wrong;</li>
<li><code class="language-plaintext highlighter-rouge">log_rotation_required()</code> checks whether a log rotation is required <strong>now</strong>, either by age or size;</li>
<li><code class="language-plaintext highlighter-rouge">log_rotation_set_next_rotation()</code> computes the next <em>age</em> at which a rotation by time should be triggered;</li>
<li><code class="language-plaintext highlighter-rouge">log_file_open()</code> opens or re-opens the log file using <code class="language-plaintext highlighter-rouge">strftime(3)</code> against the <code class="language-plaintext highlighter-rouge">log_filename</code> configuration parameter. This function is invoked every time a rotation must happen;</li>
<li><code class="language-plaintext highlighter-rouge">log_file_rotate()</code> performs the log rotation.</li>
</ul>
<p><br />
As an example, the <code class="language-plaintext highlighter-rouge">log_file_rotate()</code> function is really simple:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span>
<span class="nf">log_file_rotate</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">log_rotation_enabled</span><span class="p">())</span>
<span class="p">{</span>
<span class="n">fflush</span><span class="p">(</span><span class="n">log_file</span><span class="p">);</span>
<span class="n">fclose</span><span class="p">(</span><span class="n">log_file</span><span class="p">);</span>
<span class="n">log_file_open</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and as you can see, it flushes and closes the current log, then re-opens it.
And all the magic happens when the log entry is written:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">vfprintf</span><span class="p">(</span><span class="n">log_file</span><span class="p">,</span> <span class="n">fmt</span><span class="p">,</span> <span class="n">vl</span><span class="p">);</span>
<span class="n">fprintf</span><span class="p">(</span><span class="n">log_file</span><span class="p">,</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
<span class="n">fflush</span><span class="p">(</span><span class="n">log_file</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">log_rotation_required</span><span class="p">())</span>
<span class="p">{</span>
<span class="n">log_file_rotate</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Briefly, after the <code class="language-plaintext highlighter-rouge">fprintf</code> and the <code class="language-plaintext highlighter-rouge">fflush</code>, the system asks itself whether a log rotation is required and, if so, rotates the log file.</p>
<h1 id="conclusions">Conclusions</h1>
<p>While <code class="language-plaintext highlighter-rouge">pgagroal</code> has a basic logging mechanism, this contribution provides the log rotation features in a semi-precise way.
Contributing to this was fun, even if hard in some aspects, in particular because it has been far too long since I last developed something in C.
<br />
<code class="language-plaintext highlighter-rouge">pgagroal</code> is a promising project, and I’m sure it is going to quickly show all its potential!</p>
Perl code reuse in Pl/Perl thru pg_proc and anonymous code blocks2022-03-09T00:00:00+00:00https://fluca1978.github.io/2022/03/09/PlPerlAnonymousCodeBlock<p>Perl is great. PostgreSQL is great. And great plus great means super-powers!</p>
<h1 id="perl-code-reuse-in-plperl-thru-pg_proc-and-anonymous-code-blocks">Perl code reuse in Pl/Perl thru pg_proc and anonymous code blocks</h1>
<p>PostgreSQL allows you to write executable code, e.g., <code class="language-plaintext highlighter-rouge">FUNCTION</code>s and <code class="language-plaintext highlighter-rouge">PROCEDURE</code>s in Perl thru its extension language Pl/Perl (<code class="language-plaintext highlighter-rouge">plperl</code> and <code class="language-plaintext highlighter-rouge">plperlu</code>).
But sometimes there is the need to use the same Perl block over and over again across different code blocks and functions.
<br />
There are different approaches, most notably the <em>module</em> one: abstract your behavior into a module and load it whenever you need. Yeah, this means using <code class="language-plaintext highlighter-rouge">plperlu</code>, but it is a fair tradeoff.
<br />
<br />
However, keeping in mind how PostgreSQL stores procedures and their code, it is possible to use a fancier approach. In this post I show you a couple of simple examples as proofs of concept; clearly, pushing this into production requires a more sophisticated approach.</p>
<h2 id="an-easy-function-in-plperl">An easy function in Pl/Perl</h2>
<p>Let’s start simple: a Pl/Perl function to tell whether a number is prime.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca</span><span class="p">.</span><span class="n">is_prime</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">bool</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">return</span> <span class="mi">1</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">2</span><span class="p">;</span>
<span class="k">for</span> <span class="n">my</span> <span class="err">$</span><span class="n">i</span> <span class="p">(</span> <span class="mi">2</span> <span class="p">..</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">%</span> <span class="err">$</span><span class="n">i</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Quite simple, uh? Now imagine that we need to prepare another function that needs to generate prime numbers, and thus needs to know if a given number is prime or not.
<br />
One approach could be to call the above <code class="language-plaintext highlighter-rouge">fluca.is_prime()</code> function, but this will slow down the whole process. Then again, using functions as building blocks is exactly what they are for!
<br />
Another approach could be to take apart the above bunch of Perl code, create a module, and use it wherever needed. But it is not the approach followed here.
<br />
Again, another approach could be to store a block reference into the <code class="language-plaintext highlighter-rouge">%_SHARED</code> global hash.
<br />
Last, why not query the catalog <code class="language-plaintext highlighter-rouge">pg_proc</code> to extract the <em>source code</em> of the above function and wrap it into another Perl anonymous code block? It goes like this:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca</span><span class="p">.</span><span class="n">generate_primes_up_to</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="err">$</span><span class="n">query</span> <span class="o">=</span> <span class="nv">"select prosrc from pg_proc where proname = 'is_prime' and pronamespace = ( select oid from pg_namespace where nspname = 'fluca' );"</span><span class="p">;</span>
<span class="n">my</span> <span class="err">$</span><span class="n">code</span> <span class="o">=</span> <span class="n">spi_exec_query</span><span class="p">(</span> <span class="err">$</span><span class="n">query</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="k">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="n">prosrc</span> <span class="p">};</span>
<span class="n">my</span> <span class="err">$</span><span class="n">is_prime</span> <span class="o">=</span> <span class="n">eval</span><span class="p">(</span> <span class="nv">"sub { $code }; "</span> <span class="p">);</span>
<span class="k">for</span> <span class="n">my</span> <span class="err">$</span><span class="n">n</span> <span class="p">(</span> <span class="mi">1</span> <span class="p">..</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">return_next</span><span class="p">(</span> <span class="err">$</span><span class="n">n</span> <span class="p">)</span> <span class="n">if</span> <span class="err">$</span><span class="n">is_prime</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">n</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">$query</code> statement selects the <code class="language-plaintext highlighter-rouge">pg_proc.prosrc</code> text field that contains the source code, that is, whatever you have written between the <code class="language-plaintext highlighter-rouge">$CODE$</code> separators. That’s because <code class="language-plaintext highlighter-rouge">plperl</code> is a <em>pl</em> language, therefore its source code is stored in the system catalog.
<br />
Having stated that, the <code class="language-plaintext highlighter-rouge">$code</code> string contains the block of code, so it is as if the variable had been declared as follows:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">my</span> <span class="nv">$code</span> <span class="o">=</span> <span class="p">"</span><span class="s2"> return 1 if </span><span class="si">$_</span><span class="s2">[0] <= 2;
for my </span><span class="si">$i</span><span class="s2"> ( 2 .. </span><span class="si">$_</span><span class="s2">[0] - 1 ) {
return 0 if </span><span class="si">$_</span><span class="s2">[0] % </span><span class="si">$i</span><span class="s2"> == 0;
}
return 1;</span><span class="p">";</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It looks like a Perl sub, but it is not (yet). The bunch of code needs to be wrapped into a <code class="language-plaintext highlighter-rouge">sub</code> declaration, and this is the easy part, and then it needs to be compiled. That’s the task of <code class="language-plaintext highlighter-rouge">eval( "sub { $code };" )</code>, which creates an anonymous subroutine from the source code extracted from the other function.
<br />
The resulting code reference is stored in the scalar <code class="language-plaintext highlighter-rouge">$is_prime</code>, which is then used as a standard anonymous subroutine via <code class="language-plaintext highlighter-rouge">-></code>.
<br />
And that’s all!</p>
<h3 id="advantages">Advantages</h3>
<p>The main advantage of the above approach is that whenever a change is made to <code class="language-plaintext highlighter-rouge">fluca.is_prime()</code>, the same change is immediately reflected in <code class="language-plaintext highlighter-rouge">fluca.generate_primes_up_to()</code>, because the source code of the former is always queried at the time the latter starts its execution.</p>
<h3 id="drawbacks">Drawbacks</h3>
<p>Time!
<br />
Extracting the code and compiling it every time requires time and resources, so it can be a pitfall for big Perl code blocks. There are different modules that can help in this scenario, e.g., <a href="https://metacpan.org/pod/Parse::Perl" target="_blank"><code class="language-plaintext highlighter-rouge">Parse::Perl</code></a>.
<br />
A <em>hidden</em> drawback is that the two functions are not explicitly coupled, so if <code class="language-plaintext highlighter-rouge">fluca.is_prime</code> is accidentally deleted, the other function will no longer be able to run at all!</p>
<h2 id="the-setof-problem">The <code class="language-plaintext highlighter-rouge">SETOF</code> problem</h2>
<p>Reusing a piece of code that returns a scalar is simple, but what about functions that return sets?
<br />
Assume there is the need for a function that returns all the <em>even</em> numbers up to a limit, and does that efficiently, that is returning one value at a time.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca</span><span class="p">.</span><span class="n">generate_evens</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">for</span> <span class="p">(</span> <span class="mi">1</span> <span class="p">..</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">return_next</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span> <span class="p">)</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>While the function is really simple, the problem of sets arises immediately: Pl/Perl provides particular ways of interacting with PostgreSQL, and <code class="language-plaintext highlighter-rouge">return_next</code> is one of them. Long story short: <code class="language-plaintext highlighter-rouge">return_next</code> yields from the function, appending a new element to the current result set.
<br />
Since (regular) Perl has neither a <code class="language-plaintext highlighter-rouge">return_next</code> operator nor such a function, how to translate this code? A very <em>inefficient</em> approach is to put the result set into an array and return the whole array. It is not the same as <code class="language-plaintext highlighter-rouge">return_next</code>, because there is no <em>yielding</em>, but it can work. Therefore, a function that wants to use the previous code could inject an array into the function prologue and replace <code class="language-plaintext highlighter-rouge">return_next</code> with a regular array return.
<br />
Imagine we want to build a function that computes odd numbers on top of the even ones; the code looks like the following snippet.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fluca</span><span class="p">.</span><span class="n">generate_odds</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="err">$</span><span class="n">query</span> <span class="o">=</span> <span class="nv">"select prosrc from pg_proc where proname = 'generate_evens' and pronamespace = ( select oid from pg_namespace where nspname = 'fluca' );"</span><span class="p">;</span>
<span class="n">my</span> <span class="err">$</span><span class="n">code</span> <span class="o">=</span> <span class="n">spi_exec_query</span><span class="p">(</span> <span class="err">$</span><span class="n">query</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="k">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="n">prosrc</span> <span class="p">};</span>
<span class="err">$</span><span class="n">code</span> <span class="o">=</span> <span class="nv">"my </span><span class="se">\@</span><span class="nv">return_values;</span><span class="se">\n</span><span class="nv">"</span> <span class="p">.</span> <span class="err">$</span><span class="n">code</span> <span class="p">;</span>
<span class="err">$</span><span class="n">code</span> <span class="o">=~</span> <span class="n">s</span><span class="o">/</span><span class="n">return_next</span><span class="err">\</span><span class="n">s</span><span class="o">*</span><span class="err">\</span><span class="p">(</span><span class="o">/</span><span class="n">push</span><span class="p">(</span> <span class="err">\</span><span class="o">@</span><span class="n">return_values</span><span class="p">,</span><span class="o">/</span><span class="k">g</span><span class="p">;</span>
<span class="err">$</span><span class="n">code</span> <span class="o">=~</span> <span class="n">s</span><span class="o">/</span><span class="k">return</span><span class="err">\</span><span class="n">s</span><span class="o">+</span><span class="n">undef</span><span class="err">\</span><span class="p">.</span><span class="o">*</span><span class="p">;</span><span class="o">/</span><span class="k">return</span> <span class="err">\</span><span class="o">@</span><span class="n">return_values</span><span class="p">;</span><span class="o">/</span><span class="k">g</span><span class="p">;</span>
<span class="n">my</span> <span class="err">$</span><span class="n">generate_evens</span> <span class="o">=</span> <span class="n">eval</span><span class="p">(</span> <span class="nv">"sub { $code }; "</span> <span class="p">);</span>
    <span class="n">my</span> <span class="o">@</span><span class="n">odds</span> <span class="o">=</span> <span class="k">map</span><span class="p">(</span> <span class="p">{</span> <span class="err">$</span><span class="n">_</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">}</span> <span class="err">$</span><span class="n">generate_evens</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">for</span> <span class="p">(</span> <span class="o">@</span><span class="n">odds</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">return_next</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span> <span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">undef</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The base idea is the same as in the previous case: query <code class="language-plaintext highlighter-rouge">pg_proc</code> to get the source of the function and store it into <code class="language-plaintext highlighter-rouge">$code</code>.
<br />
Then, add the declaration for an array, named <code class="language-plaintext highlighter-rouge">@return_values</code> (a better and unique name should be chosen), and use a regular expression to substitute every <code class="language-plaintext highlighter-rouge">return_next</code> with a <code class="language-plaintext highlighter-rouge">push</code> into that array, also replacing any <code class="language-plaintext highlighter-rouge">return undef</code> (which in Pl/Perl is the way to end the result set) with a <code class="language-plaintext highlighter-rouge">return</code> of a reference to the array.
<br />
Yeah, I hear you screaming! This is surely not something to do in production, but it is a quick and dirty way to make Perl do what you want.
<br />
As in the previous case, store the result of <code class="language-plaintext highlighter-rouge">eval</code>uating the rewritten <code class="language-plaintext highlighter-rouge">$code</code> in a scalar named <code class="language-plaintext highlighter-rouge">$generate_evens</code> and use it as you prefer.
<br />
<br />
<strong>Danger Will Robinson!</strong>
<br />
The regex substitution is awful because it reaches beyond the <em>scope</em> where <code class="language-plaintext highlighter-rouge">return_next</code> applies. Imagine applying the same technique recursively to <code class="language-plaintext highlighter-rouge">fluca.generate_odds()</code>: there is one <code class="language-plaintext highlighter-rouge">return_next</code> inside the code extracted from <code class="language-plaintext highlighter-rouge">pg_proc</code> and another one, in the outer scope, used by the function itself. The regular expression knows nothing about scopes, so both <code class="language-plaintext highlighter-rouge">return_next</code> look the same and get substituted in the very same manner. <strong>And that’s why you should not use such an approach in production</strong>! Again, there are Perl modules to take care of these details and get things done in the right way.</p>
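<p>The scope problem can be reproduced outside of PostgreSQL. The following is a minimal sketch in plain Perl: the text stored in <code class="language-plaintext highlighter-rouge">$code</code> is a made-up example (not the real content of <code class="language-plaintext highlighter-rouge">pg_proc</code>) that contains one inner and one outer <code class="language-plaintext highlighter-rouge">return_next</code>, and the substitution is the same one used above.</p>

```perl
use strict;
use warnings;

# Hypothetical source text: an inner function body (as if it came from
# pg_proc) placed inside an outer function that has its own return_next.
my $code = join "\n",
    'my $inner = sub {',
    '    return_next( $_ * 2 ) for 1 .. 3;   # inner scope',
    '    return undef;',
    '};',
    'return_next( $_ + 1 ) for 1 .. 3;       # outer scope';

# The same substitution used in the function above.
my $rewritten = $code;
my $count = ( $rewritten =~ s/return_next\s*\(/push( \@return_values,/g );

print "$rewritten\n";
print "substitutions: $count\n";
```

<p>Both calls are rewritten, including the outer one that should have kept sending rows to the result set.</p>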
<h1 id="conclusions">Conclusions</h1>
<p>Perl is great. It allows you to build dynamic code in a very dynamic way.
<br />
PostgreSQL is great. It allows you to inspect every single part of the system, including executable code.
<br />
Perl’s ability to pull some code out of a PostgreSQL catalog (<code class="language-plaintext highlighter-rouge">pg_proc</code>) into a scalar, and to use it later on, allows for code sharing among Pl/Perl functions and routines.
<br />
It is up to you to decide to shoot yourself in the foot or hit something valuable!</p>
Pl/Perl Recursion2022-03-02T00:00:00+00:00https://fluca1978.github.io/2022/03/02/PlPerlRecursion<p>Some thoughts and experiments in Pl/Perl recursion.</p>
<h1 id="plperl-recursion">Pl/Perl Recursion</h1>
<p>While solving the <a href="https://perlweeklychallenge.org/blog/perl-weekly-challenge-0154/" target="_blank">Perl Weekly Challenge 154</a>, I provided a couple of possible solutions in Pl/Perl, the widely available Perl integration within PostgreSQL.
<br />
One task to solve, <em>Padovan numbers</em>, called for <em>recursion</em>, which is not as simple to implement in Pl/Perl as it might seem.
<br />
Why?
<br />
Because Pl/Perl does not expose <em>Perl objects</em>; rather, it is a way to execute Perl within SQL objects (e.g., functions). This means that SQL objects are (clearly) the <em>first class objects</em> available, so you always have to use SQL functions to recurse.
<br />
Except when you don’t want to!
<br />
<br />
But let’s start simple and see how to solve the problem.</p>
<h2 id="padovan-numbers">Padovan numbers</h2>
<p>A <em>Padovan number</em> is defined as the sum of two preceding numbers in the sequence, namely those found two and three positions before it. In particular:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">P(0) = P(1) = P(2) = 1</code>, the first three elements of the sequence are equal;</li>
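<li><code class="language-plaintext highlighter-rouge">P(n) = P(n - 3) + P(n - 2)</code> for every <code class="language-plaintext highlighter-rouge">n</code> greater than <code class="language-plaintext highlighter-rouge">2</code>.</li>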
</ul>
<p>This is great for recursion, because you can define a function in Pl/pgSQL as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">pwc154</span><span class="p">.</span><span class="n">padovan</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="n">IF</span> <span class="n">i</span> <span class="o"><=</span> <span class="mi">2</span> <span class="k">THEN</span>
<span class="k">RETURN</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">pwc154</span><span class="p">.</span><span class="n">padovan</span><span class="p">(</span> <span class="n">i</span> <span class="o">-</span> <span class="mi">3</span> <span class="p">)</span> <span class="o">+</span> <span class="n">pwc154</span><span class="p">.</span><span class="n">padovan</span><span class="p">(</span> <span class="n">i</span> <span class="o">-</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
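<p>As a sanity check for the recursive definition, the first values of the sequence can also be computed iteratively. This is only a sketch in plain Perl, run outside of PostgreSQL; the name <code class="language-plaintext highlighter-rouge">padovan_list</code> is made up for the example.</p>

```perl
use strict;
use warnings;

# Iterative computation of the sequence as defined above:
# P(0) = P(1) = P(2) = 1 and P(n) = P(n - 3) + P(n - 2).
sub padovan_list {
    my ( $last ) = @_;
    my @p = ( 1, 1, 1 );
    push @p, $p[ $_ - 3 ] + $p[ $_ - 2 ] for 3 .. $last;
    return @p[ 0 .. $last ];
}

# prints: 1, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12
print join( ', ', padovan_list( 10 ) ), "\n";
```

<p>Any of the functions shown in this article should return <code class="language-plaintext highlighter-rouge">12</code> when asked for the <code class="language-plaintext highlighter-rouge">10</code>-th number.</p>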
<h2 id="translating-to-plperl-and-the-problem-of-recursion">Translating to Pl/Perl and the problem of recursion</h2>
<p>The above Pl/pgSQL function cannot be translated line by line to Pl/Perl; the following will not work:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">padovan_not_working</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">return</span> <span class="mi">1</span> <span class="n">if</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">2</span> <span class="p">);</span>
<span class="k">return</span> <span class="n">padovan_not_working</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">3</span> <span class="p">)</span>
<span class="o">+</span> <span class="n">padovan_not_working</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">2</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In fact, <code class="language-plaintext highlighter-rouge">padovan_not_working</code> is a function on the SQL side, and thus cannot be called from Pl/Perl as if it were a Perl function.
<br />
One easy solution is to accept the fact that the resulting function is an SQL object and interact with it accordingly:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">pwc154</span><span class="p">.</span><span class="n">padovan_plperl</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">return</span> <span class="mi">1</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">2</span><span class="p">;</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">a</span><span class="p">,</span> <span class="err">$</span><span class="n">b</span> <span class="p">)</span> <span class="o">=</span> <span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="o">-</span> <span class="mi">3</span><span class="p">,</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="o">-</span> <span class="mi">2</span> <span class="p">);</span>
<span class="n">my</span> <span class="err">$</span><span class="n">rs</span> <span class="o">=</span> <span class="n">spi_exec_query</span><span class="p">(</span> <span class="nv">"SELECT pwc154.padovan_plperl( $a ) + pwc154.padovan_plperl( $b ) AS p"</span> <span class="p">);</span>
<span class="k">return</span> <span class="err">$</span><span class="n">rs</span><span class="o">-></span><span class="p">{</span> <span class="k">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="n">p</span> <span class="p">};</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the function invokes itself by means of an SQL query.</p>
<h2 id="using-a-closure">Using a closure</h2>
<p>It is possible to use a closure to hold the reference to an anonymous code block, so that it is possible to implement the recursion as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">plperl_padovan_recursive</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="err">$</span><span class="n">padovan</span><span class="p">;</span>
<span class="err">$</span><span class="n">padovan</span> <span class="o">=</span> <span class="n">sub</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">2</span><span class="p">;</span>
<span class="k">return</span> <span class="err">$</span><span class="n">padovan</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">3</span> <span class="p">)</span> <span class="o">+</span> <span class="err">$</span><span class="n">padovan</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">2</span> <span class="p">);</span>
<span class="p">};</span>
<span class="k">return</span> <span class="err">$</span><span class="n">padovan</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
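<p>No need for queries, no need for external modules, <strong>but</strong> there is a memory leak: the code block references <code class="language-plaintext highlighter-rouge">$padovan</code>, which in turn references the code block, and Perl’s reference counting never frees such a circular reference.</p>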
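<p>One way to avoid the reference cycle without installing any module is the <code class="language-plaintext highlighter-rouge">__SUB__</code> token, provided by the <code class="language-plaintext highlighter-rouge">current_sub</code> feature since Perl 5.16. The following is a sketch in plain Perl; I am assuming, without having verified it on every installation, that the same body can be used within a trusted Pl/Perl function.</p>

```perl
use strict;
use warnings;
use feature 'current_sub';   # provides __SUB__, available since Perl 5.16

# The anonymous sub calls itself via __SUB__, so it does not need to
# capture the variable that holds it: no circular reference is created.
my $padovan = sub {
    my ( $n ) = @_;
    return 1 unless $n > 2;
    return __SUB__->( $n - 3 ) + __SUB__->( $n - 2 );
};

# prints: 12
print $padovan->( 10 ), "\n";
```

<p>The code reference is released as soon as <code class="language-plaintext highlighter-rouge">$padovan</code> goes out of scope.</p>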
<h2 id="using-subrecursive">Using <code class="language-plaintext highlighter-rouge">Sub::Recursive</code></h2>
<p>There is a module, named <a href="https://metacpan.org/pod/Sub::Recursive" target="_blank"><code class="language-plaintext highlighter-rouge">Sub::Recursive</code></a>, that does exactly what I would like to do: it allows you to define an anonymous code block that can recursively invoke itself without any leak.
<br />
The only drawback is that the function must be written in <em>untrusted Pl/Perl</em> (<code class="language-plaintext highlighter-rouge">plperlu</code>), because it needs to load a module that lives outside of the PostgreSQL server (and of course, the module must be on the system, <code class="language-plaintext highlighter-rouge">cpanm</code> is your friend!):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">plperl_padovan</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">use</span> <span class="n">Sub</span><span class="p">::</span><span class="k">Recursive</span><span class="p">;</span>
<span class="n">my</span> <span class="err">$</span><span class="n">padovan</span> <span class="o">=</span> <span class="k">recursive</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">2</span><span class="p">;</span>
<span class="k">return</span> <span class="err">$</span><span class="n">REC</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">3</span> <span class="p">)</span> <span class="o">+</span> <span class="err">$</span><span class="n">REC</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">2</span> <span class="p">);</span>
<span class="p">};</span>
<span class="k">return</span> <span class="o">&</span><span class="err">$</span><span class="n">padovan</span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperlu</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>That’s it! No need for queries, no need for <code class="language-plaintext highlighter-rouge">%_SHARED</code>, no need for closures (apparently), just Perl!
<br />
<strong>But, there is the need for <code class="language-plaintext highlighter-rouge">plperlu</code>!</strong>
<br />
The module provides the special <em>keyword</em> <code class="language-plaintext highlighter-rouge">recursive</code>, which accepts a code block; within the block, the variable <code class="language-plaintext highlighter-rouge">$REC</code> holds a reference to the code block itself.</p>
<h2 id="using-_shared-">Using <code class="language-plaintext highlighter-rouge">%_SHARED</code> ?</h2>
<p>Another way to use recursion is by means of the Pl/Perl global hash <code class="language-plaintext highlighter-rouge">%_SHARED</code>, which is used to share whatever object you want across different functions. The idea is to share a code reference, so that it is possible to invoke it directly later on.
<br />
The implementation could be as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">plperl_padovan_init</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="err">$</span><span class="n">padovan</span><span class="p">;</span>
<span class="err">$</span><span class="n">padovan</span> <span class="o">=</span> <span class="n">sub</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span> <span class="n">if</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">2</span><span class="p">;</span>
<span class="k">return</span> <span class="err">$</span><span class="n">padovan</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">3</span> <span class="p">)</span> <span class="o">+</span> <span class="err">$</span><span class="n">padovan</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="mi">2</span> <span class="p">);</span>
<span class="p">};</span>
<span class="err">$</span><span class="n">_SHARED</span><span class="p">{</span> <span class="n">padovan</span> <span class="p">}</span> <span class="o">=</span> <span class="err">$</span><span class="n">padovan</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
<span class="k">SELECT</span> <span class="n">plperl_padovan_init</span><span class="p">();</span>
<span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">plperl_padovan_shared</span><span class="p">(</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="n">my</span> <span class="err">$</span><span class="n">padovan</span> <span class="o">=</span> <span class="err">$</span><span class="n">_SHARED</span><span class="p">{</span> <span class="n">padovan</span> <span class="p">};</span>
<span class="k">return</span> <span class="err">$</span><span class="n">padovan</span><span class="o">-></span><span class="p">(</span> <span class="err">$</span><span class="n">_</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The first function, <code class="language-plaintext highlighter-rouge">plperl_padovan_init</code>, installs a code reference <code class="language-plaintext highlighter-rouge">$padovan</code> into the global Pl/Perl hash <code class="language-plaintext highlighter-rouge">%_SHARED</code>, so that other functions can obtain such a code reference. The code is the same as in the closure example.
<br />
Then the function is explicitly invoked, so that the code reference is installed.
<br />
Later on, the <code class="language-plaintext highlighter-rouge">plperl_padovan_shared</code> function gets the code reference and uses it as a normal function.</p>
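<p>Since <code class="language-plaintext highlighter-rouge">%_SHARED</code> starts empty in every new session, the initialization function has to be invoked again before the shared code reference can be used. A possible variation, sketched here in plain Perl, installs the code reference lazily on first use. Note that <code class="language-plaintext highlighter-rouge">%shared</code> is only a stand-in for <code class="language-plaintext highlighter-rouge">%_SHARED</code>, so that the example runs outside of PostgreSQL, and <code class="language-plaintext highlighter-rouge">shared_padovan</code> is a made-up helper name.</p>

```perl
use strict;
use warnings;

# Stand-in for the Pl/Perl %_SHARED hash, so the sketch runs as plain Perl.
my %shared;

# Hypothetical helper: returns the shared code reference,
# creating and storing it the first time it is requested.
sub shared_padovan {
    $shared{ padovan } //= do {
        my $padovan;
        $padovan = sub {
            return 1 unless $_[0] > 2;
            return $padovan->( $_[0] - 3 ) + $padovan->( $_[0] - 2 );
        };
        $padovan;
    };
    return $shared{ padovan };
}

# prints: 12
print shared_padovan()->( 10 ), "\n";
```

<p>Within Pl/Perl, the same check could live at the top of the function that needs the code reference, at the price of repeating it in every such function.</p>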
<h2 id="quick-and-dirt-comparison">Quick and dirty comparison</h2>
<p>I’ve done a very short, single-run comparison among the approaches, excluding the one based on <code class="language-plaintext highlighter-rouge">%_SHARED</code> because it is very similar to the <em>pure</em> recursion via code reference. Just for the record, asking the <code class="language-plaintext highlighter-rouge">%_SHARED</code> based approach to compute the <code class="language-plaintext highlighter-rouge">50</code>-th Padovan number requires around <code class="language-plaintext highlighter-rouge">0.5 secs</code> which is, as expected, in line with the other similar approaches.
<br />
As the Padovan number to compute increases, the two Perl approaches, the pure closure and <code class="language-plaintext highlighter-rouge">Sub::Recursive</code>, remain really similar in terms of execution time. The approach that performs a query to recurse is, as you can imagine, the slowest one, and its performance degrades very quickly as the numbers grow.</p>
<p>The following table summarizes times depending on the generated number:</p>
<p><br /><br /><br /></p>
<table class="table table-bordered">
<thead>
<tr>
<th>Padovan number</th>
<th><code class="language-plaintext highlighter-rouge">Sub::Recursive</code></th>
<th>closure</th>
<th>query</th>
</tr>
</thead>
<tbody>
<tr>
<td>10</td>
<td>12 ms</td>
<td>20.39 ms</td>
<td>17.61 ms</td>
</tr>
<tr>
<td>20</td>
<td>0.51 ms</td>
<td>18.18 ms</td>
<td>12 ms</td>
</tr>
<tr>
<td>30</td>
<td>1.2 ms</td>
<td>17.9 ms</td>
<td>27.99 ms</td>
</tr>
<tr>
<td>40</td>
<td>31.9 ms</td>
<td>30.01 ms</td>
<td>323.4 ms</td>
</tr>
<tr>
<td>50</td>
<td>0.50 secs</td>
<td>0.52 secs</td>
<td>5.9 secs</td>
</tr>
<tr>
<td>60</td>
<td>12.76 secs</td>
<td>10.35 secs</td>
<td>99.77 secs</td>
</tr>
<tr>
<td>65</td>
<td>35.09 secs</td>
<td>35.78 secs</td>
<td>385 secs</td>
</tr>
<tr>
<td>70</td>
<td>142 secs</td>
<td>145 secs</td>
<td>1580 secs</td>
</tr>
<tr>
<td>71</td>
<td>187.89 secs</td>
<td>196.24 secs</td>
<td>2122.8 secs</td>
</tr>
</tbody>
</table>
<p><br /><br /></p>
<p>It is not possible to keep increasing the Padovan number because of integer overflow, therefore I would have to adjust the functions to return <code class="language-plaintext highlighter-rouge">bigint</code>, but in any case I do not expect the trends to change much.</p>
<h1 id="conclusions">Conclusions</h1>
<p>Recursion in Pl/Perl can be hard to implement and can require fancy approaches like the closure based ones.
First of all, you need to decide if you can deal with <em>untrusted</em> languages: if so, installing a module is probably the easiest and most appropriate approach.
If you don’t want to deal with untrusted code, you need to decide whether you prefer a <em>pure</em> Perl approach, in which case a code reference is the choice, or you want something that can be invoked by other languages. The latter means using a more SQL-oriented approach, while the former means sticking with a code reference, either used immediately or by means of some sort of shared storage.</p>
Contributing to pgagroal (and pgmoneta?)2022-02-23T00:00:00+00:00https://fluca1978.github.io/2022/02/23/pgagroal_patches<p>My small contributions to two interesting projects.</p>
<h1 id="contributing-to-pgagroal-and-pgmoneta">Contributing to <code class="language-plaintext highlighter-rouge">pgagroal</code> (and <code class="language-plaintext highlighter-rouge">pgmoneta</code>?)</h1>
<p><a href="https://agroal.github.io/pgagroal/" target="_blank"><code class="language-plaintext highlighter-rouge">pgagroal</code></a> is an interesting <em>high-performance</em> PostgreSQL connection pooler. I started using and studying it at the end of last year and, following some discussions on the <a href="https://github.com/agroal/pgagroal/discussions" target="_blank">project discussions page</a>, I decided to have a look at the source code.
<br />
The above resulted in a few small contributions that have been merged:</p>
<ul>
<li><a href="https://github.com/agroal/pgagroal/pull/202" target="_blank">small changes to configuration error messages</a>. During some testing I noticed a misleading (at least to me) <code class="language-plaintext highlighter-rouge">FATAL</code> log message when the configuration of <em>limits</em> was inconsistent. The problem was not strictly related to the log message, but rather to the way the configuration inconsistency was propagated through the application stack, and this patch tries to improve the situation. The patch has been squashed and merged.</li>
<li><a href="https://github.com/agroal/pgagroal/pull/207" target="_blank">master key error messages</a>. I was frustrated one day while trying to insert a <em>master-key</em> password for the <code class="language-plaintext highlighter-rouge">pgagroal</code> <em>vault</em>: since the password I was typing was too short, the system kept asking me for the password without any error message. This patch, squashed and merged, improves the verbosity of the application in such a condition.</li>
<li><a href="https://github.com/agroal/pgagroal/pull/201" target="_blank">generate a PID file depending on the listening socket</a> was required because sometimes I was not able to stop <code class="language-plaintext highlighter-rouge">pgagroal</code>. The problem was that launching two different instances on the same machine could cause trouble when the configuration was duplicated. The solution was to use a <em>guard</em> PID file based on the listening socket, so that a second daemon cannot be started on the same host in such conditions.</li>
<li><a href="https://github.com/agroal/pgagroal/pull/209" target="_blank">verbosity about master-key vault</a> provides an informational string to the user about <em>where</em>, on disk, the vault has been stored.</li>
</ul>
<p><br />
<br />
The above are very small contributions, also due to the fact that I don’t know <code class="language-plaintext highlighter-rouge">pgagroal</code> well enough (yet) to be confident in making more complex contributions. Also, as you can see from the pull requests, my Emacs decided to wipe out the code style several times, which left the merges pending for a while. Moreover, it has been years since I developed something in C!</p>
<p><br />
<br />
Getting in touch with <code class="language-plaintext highlighter-rouge">pgagroal</code> led me to discover another interesting project: <a href="https://github.com/pgmoneta/pgmoneta" target="_blank"><code class="language-plaintext highlighter-rouge">pgmoneta</code></a>, a backup solution for PostgreSQL.
I did not have very much time to inspect and study this project, but at least I was able to re-propose a similar patch about <a href="https://github.com/pgmoneta/pgmoneta/pull/43" target="_blank">the guard PID file</a>.</p>
<p><br />
<br />
What’s next? Well, I’m trying to implement <em>log rotation</em> in <code class="language-plaintext highlighter-rouge">pgagroal</code>, but I’m not yet ready to publish a proposal. So far, I’m testing my work, so stay tuned!</p>
Perl Weekly Challenge 153: recursive CTEs2022-02-22T00:00:00+00:00https://fluca1978.github.io/2022/02/22/PerlWeeklyChallenge153PostgreSQL<p>My personal solutions to the Perl Weekly Challenge.</p>
<h1 id="perl-weekly-challenge-153-recursive-ctes">Perl Weekly Challenge 153: recursive CTEs</h1>
<p>This is a short tour about my solutions to the <a href="https://perlweeklychallenge.org/blog/perl-weekly-challenge-0153/" target="_blank">Challenge 153</a> done in PostgreSQL.</p>
<p><br /></p>
<ul>
<li><a href="#task1">Task 1</a></li>
<li><a href="#task2">Task 2</a></li>
</ul>
<p><br /></p>
<p><a name="task1"></a></p>
<h2 id="pwc-153---task-1">PWC 153 - Task 1</h2>
<p>This task was about producing <em>left factorial numbers</em>, where each value is computed by summing all the previously computed factorials.
<br />
I decided to implement it on top of a <em>recursive CTE</em> named <code class="language-plaintext highlighter-rouge">factorials</code> that, well, computes factorials. That was the easy part, then I needed to compute the <code class="language-plaintext highlighter-rouge">sum</code> of all the values less than the current one. Let’s use a <code class="language-plaintext highlighter-rouge">LATERAL JOIN</code> for the task:
<br /></p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">with</span> <span class="k">recursive</span> <span class="n">factorials</span> <span class="k">as</span>
<span class="p">(</span>
<span class="k">SELECT</span> <span class="mi">0</span><span class="p">::</span><span class="nb">numeric</span> <span class="k">as</span> <span class="n">num</span>
<span class="p">,</span><span class="mi">1</span><span class="p">::</span><span class="nb">numeric</span> <span class="k">as</span> <span class="n">fac</span>
<span class="k">UNION</span>
<span class="k">SELECT</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o">+</span> <span class="mi">1</span>
<span class="p">,</span> <span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span> <span class="o">*</span> <span class="n">f</span><span class="p">.</span><span class="n">fac</span>
<span class="k">FROM</span> <span class="n">factorials</span> <span class="n">f</span>
<span class="k">WHERE</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o"><</span> <span class="mi">1000</span>
<span class="p">)</span>
<span class="k">SELECT</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span><span class="p">,</span> <span class="k">sum</span><span class="p">(</span> <span class="n">w</span><span class="p">.</span><span class="n">fac</span> <span class="p">)</span> <span class="k">as</span> <span class="n">left_factorial</span>
<span class="k">FROM</span> <span class="n">factorials</span> <span class="n">f</span><span class="p">,</span> <span class="k">LATERAL</span>
<span class="p">(</span> <span class="k">SELECT</span> <span class="n">ff</span><span class="p">.</span><span class="n">fac</span> <span class="k">FROM</span> <span class="n">factorials</span> <span class="n">ff</span> <span class="k">WHERE</span> <span class="n">ff</span><span class="p">.</span><span class="n">num</span> <span class="o"><</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">ff</span><span class="p">.</span><span class="n">num</span> <span class="p">)</span> <span class="n">w</span>
<span class="k">WHERE</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o"><=</span> <span class="mi">10</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span><span class="p">,</span> <span class="n">f</span><span class="p">.</span><span class="n">fac</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I limit the amount of work to be done to <code class="language-plaintext highlighter-rouge">10</code> entries, as requested by the task. The <code class="language-plaintext highlighter-rouge">factorials</code> CTE computes all the factorials, and then I join <code class="language-plaintext highlighter-rouge">LATERAL</code> with a subquery <code class="language-plaintext highlighter-rouge">w</code> that selects all the factorial values for entries less than the current one. Therefore, using the built-in <code class="language-plaintext highlighter-rouge">sum</code> function in the outer query solves the problem.
<br />
Clearly, this is not a particularly efficient solution, but it is a good example of what recursive CTEs and <code class="language-plaintext highlighter-rouge">LATERAL</code> can do when combined together.</p>
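<p>As a point of comparison, the same running total can be sketched with a window function instead of a <code class="language-plaintext highlighter-rouge">LATERAL</code> subquery. This is only a minimal sketch, not the solution described above: it assumes the built-in <code class="language-plaintext highlighter-rouge">factorial()</code> and <code class="language-plaintext highlighter-rouge">generate_series()</code> functions in place of the recursive CTE, and the alias <code class="language-plaintext highlighter-rouge">g( n )</code> is made up for the example.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sketch only: left factorials computed as a running sum (window function).
-- generate_series() and factorial() replace the recursive CTE here;
-- the alias g( n ) is hypothetical.
SELECT g.n + 1 AS num
     , sum( factorial( g.n ) ) OVER ( ORDER BY g.n ) AS left_factorial
FROM generate_series( 0, 9 ) AS g( n )
ORDER BY num
;
</code></pre></div></div>
<p>Each row holds the sum of the factorials from <code class="language-plaintext highlighter-rouge">0!</code> up to <code class="language-plaintext highlighter-rouge">(num - 1)!</code>, so the first rows should read <code class="language-plaintext highlighter-rouge">1</code>, <code class="language-plaintext highlighter-rouge">2</code>, <code class="language-plaintext highlighter-rouge">4</code> and <code class="language-plaintext highlighter-rouge">10</code>.</p>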
<p><a name="task2"></a></p>
<h2 id="pwc-153---task-2">PWC 153 - Task 2</h2>
<p>Similar to the previous task, but simpler: check whether a given number equals the sum of the factorials of its digits. As an example, <code class="language-plaintext highlighter-rouge">145</code> can be expressed as <code class="language-plaintext highlighter-rouge">1! + 4! + 5!</code>.
<br />
Having a recursive CTE to compute factorials from the previous task, I decided to use the same starting point. However, this time, I used a <code class="language-plaintext highlighter-rouge">psql</code> variable named <code class="language-plaintext highlighter-rouge">needle</code> to which I assign the value I want to test:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">\</span><span class="k">set</span> <span class="n">needle</span> <span class="mi">145</span>
<span class="k">with</span> <span class="k">recursive</span> <span class="n">factorials</span> <span class="k">as</span>
<span class="p">(</span>
<span class="k">SELECT</span> <span class="mi">0</span><span class="p">::</span><span class="nb">numeric</span> <span class="k">as</span> <span class="n">num</span>
<span class="p">,</span><span class="mi">1</span><span class="p">::</span><span class="nb">numeric</span> <span class="k">as</span> <span class="n">fac</span>
<span class="k">UNION</span>
<span class="k">SELECT</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o">+</span> <span class="mi">1</span>
<span class="p">,</span> <span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span> <span class="o">*</span> <span class="n">f</span><span class="p">.</span><span class="n">fac</span>
<span class="k">FROM</span> <span class="n">factorials</span> <span class="n">f</span>
<span class="k">WHERE</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span> <span class="o"><</span> <span class="mi">1000</span>
<span class="p">)</span>
<span class="k">SELECT</span> <span class="k">CASE</span> <span class="k">sum</span><span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">fac</span> <span class="p">)</span> <span class="k">WHEN</span> <span class="p">:</span><span class="n">needle</span> <span class="k">THEN</span> <span class="p">:</span><span class="n">needle</span> <span class="o">||</span> <span class="s1">' OK'</span> <span class="k">ELSE</span> <span class="p">:</span><span class="n">needle</span> <span class="o">||</span> <span class="s1">' KO'</span> <span class="k">END</span> <span class="k">AS</span> <span class="n">factorions</span>
<span class="k">FROM</span> <span class="n">factorials</span> <span class="n">f</span> <span class="k">JOIN</span> <span class="n">regexp_split_to_table</span><span class="p">(</span> <span class="p">:</span><span class="n">needle</span><span class="p">::</span><span class="nb">text</span><span class="p">,</span> <span class="s1">''</span> <span class="p">)</span> <span class="n">w</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="k">ON</span> <span class="n">w</span><span class="p">.</span><span class="n">n</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">num</span><span class="p">::</span><span class="nb">text</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The trick here is that I join <code class="language-plaintext highlighter-rouge">factorials</code> with <code class="language-plaintext highlighter-rouge">regexp_split_to_table</code> that returns all the digits as a set of tuples. Then, in the outer query, I do <code class="language-plaintext highlighter-rouge">sum</code> the factorials of every digit and see if the result is still the <code class="language-plaintext highlighter-rouge">needle</code>, producing an <code class="language-plaintext highlighter-rouge">OK</code> string or a <code class="language-plaintext highlighter-rouge">KO</code> one.</p>
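<p>The same digit-splitting idea can be turned around to list every factorion in a range, rather than testing a single <code class="language-plaintext highlighter-rouge">needle</code>. The following is a hedged sketch, not part of the original solution: it assumes the built-in <code class="language-plaintext highlighter-rouge">factorial()</code> function instead of the recursive CTE, and the upper bound <code class="language-plaintext highlighter-rouge">50000</code> and the aliases <code class="language-plaintext highlighter-rouge">c</code> and <code class="language-plaintext highlighter-rouge">d</code> are arbitrary choices for the example.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sketch only: scan a range of candidates and keep those that equal
-- the sum of the factorials of their own digits.
-- The bound 50000 and the aliases c, d are hypothetical.
SELECT c.candidate AS factorion
FROM generate_series( 1, 50000 ) AS c( candidate )
WHERE c.candidate = (
        SELECT sum( factorial( d.digit::bigint ) )
        FROM regexp_split_to_table( c.candidate::text, '' ) AS d( digit )
      )
ORDER BY c.candidate
;
</code></pre></div></div>
<p>Within this range the query should return <code class="language-plaintext highlighter-rouge">1</code>, <code class="language-plaintext highlighter-rouge">2</code>, <code class="language-plaintext highlighter-rouge">145</code> and <code class="language-plaintext highlighter-rouge">40585</code>, which are the known base-10 factorions.</p>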
PgTraining online webinar on 2022-04-29 (Italian)2022-02-05T00:00:00+00:00https://fluca1978.github.io/2022/02/05/PgTrainingOnlineEvent<p>Yet another online event organized by PgTraining!</p>
<h1 id="pgtraining-online-webinar-on-2022-04-29-italian">PgTraining online webinar on 2022-04-29 (Italian)</h1>
<p><a href="http://pgtraining.com" target="_blank">PgTraining</a>, the amazing Italian group of people that spreads the word about PostgreSQL and that I joined in recent years, is organizing another online event (<em>webinar</em>) on 29th April 2022.
<br />
Following the success of the previous edition(s), we decided to provide another afternoon full of <em>PostgreSQL talks</em>, in the hope of improving the adoption of this great database.</p>
<p><br />
The event will consist of three hours of talks about <strong>connection pooling</strong>, <strong>timeseries extensions</strong> and <strong>column storage internals</strong>.
<br />
As with the previous editions, the webinar will be presented in Italian. Attendees will be free to actively participate and ask questions both during the talks and at the end of the whole event.
<br />
<br />
In the pure spirit of <a href="http://pgtraining.com" target="_blank">PgTraining</a>, the event <strong>will be free of charge</strong>, but registration is required to participate and the number of available seats is limited, so <a href="https://www.eventbrite.it/e/biglietti-pgtraining-on-line-session-2022-04-262565138397" target="_blank"><strong>hurry up and get your free ticket</strong></a> as soon as possible!
<br />
The material will be available for free after the event has completed, but no live recording will be available.</p>
Pentagon numbers2022-01-11T00:00:00+00:00https://fluca1978.github.io/2022/01/11/PostgreSQLPentagonNumbers<p>A couple of different solutions to an interesting problem.</p>
<h1 id="pentagon-numbers">Pentagon numbers</h1>
<p>For a few weeks now, I have been implementing some of the tasks for the <a href="https://perlweeklychallenge.org/" target="_blank">Perl Weekly Challenge</a> in PostgreSQL-specific code. One interesting problem has been this week's task 2 of the <a href="https://perlweeklychallenge.org/blog/perl-weekly-challenge-0147/" target="_blank">Challenge 147</a>: finding a pair of <strong>pentagon numbers</strong> whose sum and difference are both pentagon numbers.
<br />
In this post, I discuss two possible solutions to the task.
<br /></p>
<h2 id="what-is-a-pentagon-number">What is a Pentagon Number?</h2>
<p>A <strong>pentagon number</strong> is defined as the value of the expression <code class="language-plaintext highlighter-rouge">n * ( 3 * n - 1 ) / 2</code>, therefore the pentagon number corresponding to <code class="language-plaintext highlighter-rouge">3</code> is <code class="language-plaintext highlighter-rouge">12</code>.
<br />
The task requires finding a pair of pentagon numbers such that:</p>
<p><br />
<br /></p>
<div class="language-mathematica highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">P</span><span class="p">(</span><span class="nv">n1</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nv">P</span><span class="p">(</span><span class="nv">n2</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">P</span><span class="p">(</span><span class="nv">x</span><span class="p">)</span><span class="w">
</span><span class="nv">P</span><span class="p">(</span><span class="nv">n1</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="nv">P</span><span class="p">(</span><span class="nv">n2</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">P</span><span class="p">(</span><span class="nv">y</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p><br />
<br /></p>
<p>It does not matter what <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> are, but the pentagon numbers built from <code class="language-plaintext highlighter-rouge">n1</code> and <code class="language-plaintext highlighter-rouge">n2</code> must have a sum and a difference that are pentagon numbers too.</p>
<h2 id="the-first-approach-a-record-based-function">The first approach: a record based function</h2>
<p>The first solution I came up with was inspired by the solution I provided in Raku, and is quite frankly a kind of <em>record-based</em> approach.
<br />
First of all, I define an <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> function named <code class="language-plaintext highlighter-rouge">f_pentagon</code> that computes the pentagon number value starting from a given number, so that <code class="language-plaintext highlighter-rouge">f_pentagon( 3 )</code> returns <code class="language-plaintext highlighter-rouge">12</code>. Why do I need a function? Because I want to implement a table with a stored generated column to keep track of numbers and their pentagon values.
<br />
For that reason, I created a <code class="language-plaintext highlighter-rouge">pentagons</code> table with a generic <code class="language-plaintext highlighter-rouge">n</code> column that represents the starting value and the <code class="language-plaintext highlighter-rouge">p</code> column that represents the computed pentagon value.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_pentagon</span><span class="p">(</span> <span class="n">n</span> <span class="nb">bigint</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">bigint</span>
<span class="k">AS</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">SELECT</span> <span class="p">(</span> <span class="n">n</span> <span class="o">*</span> <span class="p">(</span> <span class="mi">3</span> <span class="o">*</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span> <span class="p">)</span> <span class="o">/</span> <span class="mi">2</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="k">sql</span>
<span class="k">IMMUTABLE</span><span class="p">;</span>
<span class="k">DROP</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">EXISTS</span> <span class="n">pentagons</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">pentagons</span>
<span class="p">(</span>
<span class="n">n</span> <span class="nb">bigint</span>
<span class="p">,</span> <span class="n">p</span> <span class="nb">bigint</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="p">(</span> <span class="n">f_pentagon</span><span class="p">(</span> <span class="n">n</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span>
<span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">pentagons</span><span class="p">(</span> <span class="n">n</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5000</span> <span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I inserted into the table <code class="language-plaintext highlighter-rouge">5000</code> records because I know, from the Raku solution, that what I’m looking for is within such range of values. It is, of course, possible to increase that limit to find out other values.
<br />
The table content looks therefore like the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">pentagons</span> <span class="k">limit</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">n</span> <span class="o">|</span> <span class="n">p</span>
<span class="c1">----+-----</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">5</span>
<span class="mi">3</span> <span class="o">|</span> <span class="mi">12</span>
<span class="mi">4</span> <span class="o">|</span> <span class="mi">22</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">35</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">51</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">70</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">92</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">117</span>
<span class="mi">10</span> <span class="o">|</span> <span class="mi">145</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now it is possible to implement a function, named <code class="language-plaintext highlighter-rouge">f_pentagons_pairs</code>, that scans the above table searching for the required values. The function returns a <code class="language-plaintext highlighter-rouge">TABLE</code>, even if only one row will be returned: since I want to output multiple values, I decided to implement it as a set-returning function. In particular, the returned information is:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">n1</code> is the first number;</li>
<li><code class="language-plaintext highlighter-rouge">n2</code> is the second number;</li>
<li><code class="language-plaintext highlighter-rouge">s</code> is the sum of the pentagons, that is <code class="language-plaintext highlighter-rouge">P(n1) + P(n2)</code>;</li>
<li><code class="language-plaintext highlighter-rouge">d</code> is the difference of the pentagons, that is <code class="language-plaintext highlighter-rouge">abs( P(n1) - P(n2) )</code>;</li>
<li><code class="language-plaintext highlighter-rouge">ps</code> is the number whose pentagon corresponds to the sum of the two pentagons, that is <code class="language-plaintext highlighter-rouge">P(ps) = P(n1) + P(n2)</code>;</li>
<li><code class="language-plaintext highlighter-rouge">pd</code> is the number whose pentagon corresponds to the difference of the two pentagons, that is <code class="language-plaintext highlighter-rouge">P(pd) = abs( P(n1) - P(n2) )</code>.</li>
</ul>
<p><br />
The function is the following one:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_pentagons_pairs</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span> <span class="p">(</span> <span class="n">n1</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">n2</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">s</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">d</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">ps</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">pd</span> <span class="nb">bigint</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">current_tuple</span> <span class="n">pentagons</span><span class="o">%</span><span class="n">rowtype</span><span class="p">;</span>
<span class="n">other_tuple</span> <span class="n">pentagons</span><span class="o">%</span><span class="n">rowtype</span><span class="p">;</span>
<span class="n">fnd</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">current_tuple</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pentagons</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">n</span> <span class="n">LOOP</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">INTO</span> <span class="n">other_tuple</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">pp</span>
<span class="k">WHERE</span> <span class="k">EXISTS</span><span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">pp</span><span class="p">.</span><span class="n">p</span>
<span class="p">)</span>
<span class="k">AND</span> <span class="k">EXISTS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="k">abs</span><span class="p">(</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">pp</span><span class="p">.</span><span class="n">p</span> <span class="p">)</span>
<span class="p">);</span>
<span class="n">IF</span> <span class="k">FOUND</span> <span class="k">THEN</span>
<span class="k">SELECT</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">n</span>
<span class="p">,</span> <span class="n">other_tuple</span><span class="p">.</span><span class="n">n</span>
<span class="p">,</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">other_tuple</span><span class="p">.</span><span class="n">p</span>
<span class="p">,</span> <span class="k">abs</span><span class="p">(</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">other_tuple</span><span class="p">.</span><span class="n">p</span> <span class="p">)</span>
<span class="p">,</span> <span class="n">p1</span><span class="p">.</span><span class="n">n</span>
<span class="p">,</span> <span class="n">p2</span><span class="p">.</span><span class="n">n</span>
<span class="k">INTO</span> <span class="n">n1</span><span class="p">,</span> <span class="n">n2</span><span class="p">,</span> <span class="n">s</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">ps</span><span class="p">,</span> <span class="n">pd</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">p1</span><span class="p">,</span> <span class="n">pentagons</span> <span class="n">p2</span>
<span class="k">WHERE</span> <span class="n">p1</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">other_tuple</span><span class="p">.</span><span class="n">p</span>
<span class="k">AND</span> <span class="n">p2</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="k">abs</span><span class="p">(</span> <span class="n">current_tuple</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">other_tuple</span><span class="p">.</span><span class="n">p</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'P(%) + P(%) = P(%) = %'</span><span class="p">,</span>
<span class="n">n1</span><span class="p">,</span> <span class="n">n2</span><span class="p">,</span> <span class="n">ps</span><span class="p">,</span> <span class="n">s</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'P(%) - P(%) = P(%) = %'</span><span class="p">,</span>
<span class="n">n1</span><span class="p">,</span> <span class="n">n2</span><span class="p">,</span> <span class="n">pd</span><span class="p">,</span> <span class="n">d</span><span class="p">;</span>
<span class="n">fnd</span> <span class="p">:</span><span class="o">=</span> <span class="n">fnd</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEXT</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is quite simple to understand:</p>
<ul>
<li>it performs a <em>one-record-at-time</em> loop placing every row of <code class="language-plaintext highlighter-rouge">pentagons</code> into <code class="language-plaintext highlighter-rouge">current_tuple</code>;</li>
<li>it searches for an <code class="language-plaintext highlighter-rouge">other_tuple</code> in <code class="language-plaintext highlighter-rouge">pentagons</code> such that both the sum and the difference <code class="language-plaintext highlighter-rouge">EXISTS</code> in <code class="language-plaintext highlighter-rouge">pentagons</code> at the very same time. This means that <code class="language-plaintext highlighter-rouge">other_tuple</code> and <code class="language-plaintext highlighter-rouge">current_tuple</code> lead to a sum and a difference that are both pentagon numbers;</li>
<li>when such tuple is <code class="language-plaintext highlighter-rouge">FOUND</code>, the output tuple is built.</li>
</ul>
<p><br />
In order to get the <em>reverse</em> values that lead to the sum and difference, I do another double join with <code class="language-plaintext highlighter-rouge">pentagons</code> to get out the result.
<br />
The <code class="language-plaintext highlighter-rouge">RAISE</code> instructions are placed only to provide a textual representation of the expressions.
<br />
Launching the function on a very small virtual machine, busy doing other stuff, results in:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">f_pentagons_pairs</span><span class="p">();</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">P</span><span class="p">(</span><span class="mi">1020</span><span class="p">)</span> <span class="o">+</span> <span class="n">P</span><span class="p">(</span><span class="mi">2167</span><span class="p">)</span> <span class="o">=</span> <span class="n">P</span><span class="p">(</span><span class="mi">2395</span><span class="p">)</span> <span class="o">=</span> <span class="mi">8602840</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">P</span><span class="p">(</span><span class="mi">1020</span><span class="p">)</span> <span class="o">-</span> <span class="n">P</span><span class="p">(</span><span class="mi">2167</span><span class="p">)</span> <span class="o">=</span> <span class="n">P</span><span class="p">(</span><span class="mi">1912</span><span class="p">)</span> <span class="o">=</span> <span class="mi">5482660</span>
<span class="n">n1</span> <span class="o">|</span> <span class="n">n2</span> <span class="o">|</span> <span class="n">s</span> <span class="o">|</span> <span class="n">d</span> <span class="o">|</span> <span class="n">ps</span> <span class="o">|</span> <span class="n">pd</span>
<span class="c1">------+------+---------+---------+------+------</span>
<span class="mi">1020</span> <span class="o">|</span> <span class="mi">2167</span> <span class="o">|</span> <span class="mi">8602840</span> <span class="o">|</span> <span class="mi">5482660</span> <span class="o">|</span> <span class="mi">2395</span> <span class="o">|</span> <span class="mi">1912</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">3346</span><span class="p">,</span><span class="mi">886</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">03</span><span class="p">,</span><span class="mi">347</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
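<p>The reported row can be double-checked directly against <code class="language-plaintext highlighter-rouge">f_pentagon</code>. The following is just a sanity-check sketch, with made-up column aliases <code class="language-plaintext highlighter-rouge">sum_ok</code> and <code class="language-plaintext highlighter-rouge">diff_ok</code>. Note that the function reports the absolute value of the difference, so the check subtracts the smaller pentagon from the larger one.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sanity check only: verify the values printed by f_pentagons_pairs().
-- The aliases sum_ok and diff_ok are hypothetical.
SELECT f_pentagon( 1020 ) + f_pentagon( 2167 ) = f_pentagon( 2395 ) AS sum_ok
     , f_pentagon( 2167 ) - f_pentagon( 1020 ) = f_pentagon( 1912 ) AS diff_ok
;
</code></pre></div></div>
<p>Both columns should come back as true, since <code class="language-plaintext highlighter-rouge">1560090 + 7042750 = 8602840</code> and <code class="language-plaintext highlighter-rouge">7042750 - 1560090 = 5482660</code>.</p>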
<h2 id="a-cte-approach">A CTE Approach</h2>
<p>Is there a need for the <code class="language-plaintext highlighter-rouge">pentagons</code> table? Uhm…it is possible to materialize the same set of data with a recursive CTE.
And, therefore, it is possible to move the query at the outer level so that there is no need to perform a record-by-record scan.
After all, SQL is a set based language!</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">WITH</span> <span class="k">RECURSIVE</span> <span class="n">pentagons</span><span class="p">(</span> <span class="n">n</span><span class="p">,</span> <span class="n">p</span> <span class="p">)</span>
<span class="k">AS</span>
<span class="p">(</span>
<span class="k">SELECT</span> <span class="mi">1</span> <span class="k">AS</span> <span class="n">n</span>
<span class="p">,</span> <span class="n">f_pentagon</span><span class="p">(</span> <span class="mi">1</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">p</span>
<span class="k">UNION</span>
<span class="k">SELECT</span> <span class="n">p</span><span class="p">.</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span>
<span class="p">,</span> <span class="n">f_pentagon</span><span class="p">(</span> <span class="n">p</span><span class="p">.</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">p</span>
<span class="k">WHERE</span> <span class="n">p</span><span class="p">.</span><span class="n">n</span> <span class="o"><</span> <span class="mi">5000</span>
<span class="p">)</span>
<span class="k">SELECT</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'%s, %s'</span><span class="p">,</span> <span class="n">l</span><span class="p">.</span><span class="n">n</span><span class="p">,</span> <span class="n">r</span><span class="p">.</span><span class="n">n</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">pentagon_pairs</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">l</span><span class="p">,</span> <span class="n">pentagons</span> <span class="n">r</span>
<span class="k">WHERE</span> <span class="k">EXISTS</span><span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span>
<span class="p">)</span>
<span class="k">AND</span> <span class="k">EXISTS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="k">abs</span><span class="p">(</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span> <span class="p">)</span>
<span class="p">)</span>
<span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The query takes a little more time than the approach using the table and the record-based function:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">pentagon_pairs</span>
<span class="c1">----------------</span>
<span class="mi">1020</span><span class="p">,</span> <span class="mi">2167</span>
<span class="mi">2167</span><span class="p">,</span> <span class="mi">1020</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">5820</span><span class="p">,</span><span class="mi">066</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">05</span><span class="p">,</span><span class="mi">820</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>There are some details that are harder to tune with the CTE approach, most notably the reverse lookup of the resulting base numbers and the exclusion of the duplicated row. However, it is possible to tune it to your needs.
<br />
Why does the CTE approach require more time than the function approach? Even though the times are similar, the function <em>terminates</em> as soon as it finds a solution, while the CTE does not, and therefore <em>scans</em> the whole dataset.</p>
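<p>As a minimal sketch of that tuning, the query below joins the pentagonal numbers instead of only testing for their existence, so that the base numbers of the sum and of the difference are returned too, and it keeps a single ordering of each pair. It assumes that <code class="language-plaintext highlighter-rouge">f_pentagon(n)</code> computes the usual <code class="language-plaintext highlighter-rouge">n * (3 * n - 1) / 2</code>, and it inlines that formula over <code class="language-plaintext highlighter-rouge">generate_series()</code> only to be self-contained; the column aliases are illustrative.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WITH pentagons AS (
    -- assumption: the same values f_pentagon() would produce
    SELECT n, n * (3 * n - 1) / 2 AS p
    FROM generate_series( 1, 5000 ) AS n
)
SELECT l.n AS left_n
     , r.n AS right_n
     , s.n AS sum_n    -- base number of P(left_n) + P(right_n)
     , d.n AS diff_n   -- base number of P(right_n) - P(left_n)
FROM pentagons l
JOIN pentagons r ON r.n > l.n          -- keep one ordering, drop the mirrored row
JOIN pentagons s ON s.p = l.p + r.p    -- the sum must be pentagonal
JOIN pentagons d ON d.p = r.p - l.p    -- and so must the difference
;
</code></pre></div></div>
<p>Since every pentagonal number appears only once, the joins select the same pairs as the two <code class="language-plaintext highlighter-rouge">EXISTS</code> subqueries, so this should return a single row for the pair <code class="language-plaintext highlighter-rouge">1020, 2167</code>.</p>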
<h2 id="plans">Plans!</h2>
<p>The timings are not as different as they might seem at a glance, and the two approaches are effectively comparable with regard to performance.
The execution plans, of course, differ a lot more, since the function approach works as a black box:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">f_pentagons_pairs</span><span class="p">();</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">P</span><span class="p">(</span><span class="mi">1020</span><span class="p">)</span> <span class="o">+</span> <span class="n">P</span><span class="p">(</span><span class="mi">2167</span><span class="p">)</span> <span class="o">=</span> <span class="n">P</span><span class="p">(</span><span class="mi">2395</span><span class="p">)</span> <span class="o">=</span> <span class="mi">8602840</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">P</span><span class="p">(</span><span class="mi">1020</span><span class="p">)</span> <span class="o">-</span> <span class="n">P</span><span class="p">(</span><span class="mi">2167</span><span class="p">)</span> <span class="o">=</span> <span class="n">P</span><span class="p">(</span><span class="mi">1912</span><span class="p">)</span> <span class="o">=</span> <span class="mi">5482660</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">---------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Function</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">f_pentagons_pairs</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">25</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">25</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1000</span> <span class="n">width</span><span class="o">=</span><span class="mi">48</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">4754</span><span class="p">.</span><span class="mi">081</span><span class="p">..</span><span class="mi">4754</span><span class="p">.</span><span class="mi">082</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Planning</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">047</span> <span class="n">ms</span>
<span class="n">Execution</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">4856</span><span class="p">.</span><span class="mi">988</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">4909</span><span class="p">,</span><span class="mi">165</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">,</span><span class="mi">909</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">WITH</span> <span class="k">RECURSIVE</span> <span class="n">pentagons</span><span class="p">(</span> <span class="n">n</span><span class="p">,</span> <span class="n">p</span> <span class="p">)</span>
<span class="k">AS</span>
<span class="p">(</span>
<span class="k">SELECT</span> <span class="mi">1</span> <span class="k">AS</span> <span class="n">n</span>
<span class="p">,</span> <span class="n">f_pentagon</span><span class="p">(</span> <span class="mi">1</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">p</span>
<span class="k">UNION</span>
<span class="k">SELECT</span> <span class="n">p</span><span class="p">.</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span>
<span class="p">,</span> <span class="n">f_pentagon</span><span class="p">(</span> <span class="n">p</span><span class="p">.</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">p</span>
<span class="k">WHERE</span> <span class="n">p</span><span class="p">.</span><span class="n">n</span> <span class="o"><</span> <span class="mi">5000</span>
<span class="p">)</span>
<span class="k">SELECT</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'%s, %s'</span><span class="p">,</span> <span class="n">l</span><span class="p">.</span><span class="n">n</span><span class="p">,</span> <span class="n">r</span><span class="p">.</span><span class="n">n</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">pentagon_pairs</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">l</span><span class="p">,</span> <span class="n">pentagons</span> <span class="n">r</span>
<span class="k">WHERE</span> <span class="k">EXISTS</span><span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span>
<span class="p">)</span>
<span class="k">AND</span> <span class="k">EXISTS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="k">abs</span><span class="p">(</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span> <span class="p">)</span>
<span class="p">)</span>
<span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-------------------------------------------------------------------------------------------------------------------------------</span>
<span class="n">Hash</span> <span class="n">Semi</span> <span class="k">Join</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">5</span><span class="p">.</span><span class="mi">57</span><span class="p">..</span><span class="mi">30</span><span class="p">.</span><span class="mi">98</span> <span class="k">rows</span><span class="o">=</span><span class="mi">23</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">2659</span><span class="p">.</span><span class="mi">801</span><span class="p">..</span><span class="mi">10415</span><span class="p">.</span><span class="mi">225</span> <span class="k">rows</span><span class="o">=</span><span class="mi">2</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Hash</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">(</span><span class="k">abs</span><span class="p">((</span><span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span><span class="p">))</span> <span class="o">=</span> <span class="n">ps_1</span><span class="p">.</span><span class="n">p</span><span class="p">)</span>
<span class="n">CTE</span> <span class="n">pentagons</span>
<span class="o">-></span> <span class="k">Recursive</span> <span class="k">Union</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">3</span><span class="p">.</span><span class="mi">56</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">002</span><span class="p">..</span><span class="mi">21</span><span class="p">.</span><span class="mi">821</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Result</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">01</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">001</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">001</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">WorkTable</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">pentagons</span> <span class="n">p</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">29</span> <span class="k">rows</span><span class="o">=</span><span class="mi">3</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">000</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">000</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1</span> <span class="n">loops</span><span class="o">=</span><span class="mi">5000</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">n</span> <span class="o"><</span> <span class="mi">5000</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">0</span>
<span class="o">-></span> <span class="n">Hash</span> <span class="n">Semi</span> <span class="k">Join</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1</span><span class="p">.</span><span class="mi">01</span><span class="p">..</span><span class="mi">25</span><span class="p">.</span><span class="mi">63</span> <span class="k">rows</span><span class="o">=</span><span class="mi">149</span> <span class="n">width</span><span class="o">=</span><span class="mi">24</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">361</span><span class="p">.</span><span class="mi">447</span><span class="p">..</span><span class="mi">10156</span><span class="p">.</span><span class="mi">895</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5341</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Hash</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">((</span><span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span><span class="p">)</span> <span class="o">=</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Nested</span> <span class="n">Loop</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">20</span><span class="p">.</span><span class="mi">15</span> <span class="k">rows</span><span class="o">=</span><span class="mi">961</span> <span class="n">width</span><span class="o">=</span><span class="mi">24</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">005</span><span class="p">..</span><span class="mi">6323</span><span class="p">.</span><span class="mi">391</span> <span class="k">rows</span><span class="o">=</span><span class="mi">25000000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">CTE</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">pentagons</span> <span class="n">l</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">002</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">671</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">CTE</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">pentagons</span> <span class="n">r</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">000</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">546</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">5000</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Hash</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">354</span><span class="p">.</span><span class="mi">101</span><span class="p">..</span><span class="mi">354</span><span class="p">.</span><span class="mi">101</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Buckets</span><span class="p">:</span> <span class="mi">8192</span> <span class="p">(</span><span class="n">originally</span> <span class="mi">1024</span><span class="p">)</span> <span class="n">Batches</span><span class="p">:</span> <span class="mi">1</span> <span class="p">(</span><span class="n">originally</span> <span class="mi">1</span><span class="p">)</span> <span class="n">Memory</span> <span class="k">Usage</span><span class="p">:</span> <span class="mi">260</span><span class="n">kB</span>
<span class="o">-></span> <span class="n">CTE</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">pentagons</span> <span class="n">ps</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">001</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">405</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Hash</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">255</span><span class="p">.</span><span class="mi">920</span><span class="p">..</span><span class="mi">255</span><span class="p">.</span><span class="mi">920</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Buckets</span><span class="p">:</span> <span class="mi">8192</span> <span class="p">(</span><span class="n">originally</span> <span class="mi">1024</span><span class="p">)</span> <span class="n">Batches</span><span class="p">:</span> <span class="mi">1</span> <span class="p">(</span><span class="n">originally</span> <span class="mi">1</span><span class="p">)</span> <span class="n">Memory</span> <span class="k">Usage</span><span class="p">:</span> <span class="mi">260</span><span class="n">kB</span>
<span class="o">-></span> <span class="n">CTE</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">pentagons</span> <span class="n">ps_1</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">62</span> <span class="k">rows</span><span class="o">=</span><span class="mi">31</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">004</span><span class="p">..</span><span class="mi">49</span><span class="p">.</span><span class="mi">709</span> <span class="k">rows</span><span class="o">=</span><span class="mi">5000</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Planning</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">195</span> <span class="n">ms</span>
<span class="n">Execution</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">10431</span><span class="p">.</span><span class="mi">113</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">21</span> <span class="k">rows</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">10509</span><span class="p">,</span><span class="mi">619</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">,</span><span class="mi">510</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Changing the CTE between <code class="language-plaintext highlighter-rouge">MATERIALIZED</code> and <code class="language-plaintext highlighter-rouge">NOT MATERIALIZED</code> does not produce any appreciable change, of course.
<br />
Creating an index on <code class="language-plaintext highlighter-rouge">pentagons(p)</code> makes the function approach a little faster, but not by much, since the index is used only in the final part of the function.</p>
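<p>For reference, the materialization hint sits between <code class="language-plaintext highlighter-rouge">AS</code> and the opening parenthesis of the CTE. The sketch below only shows the placement: the trailing <code class="language-plaintext highlighter-rouge">count(*)</code> is an illustrative stand-in for the full pair query. PostgreSQL documents <code class="language-plaintext highlighter-rouge">NOT MATERIALIZED</code> as ignored for a recursive CTE, which is consistent with seeing no change here.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WITH RECURSIVE pentagons( n, p )
AS NOT MATERIALIZED   -- or AS MATERIALIZED; the hint is ignored for a recursive CTE
(
    SELECT 1 AS n
         , f_pentagon( 1 ) AS p
    UNION
    SELECT p.n + 1
         , f_pentagon( p.n + 1 )
    FROM pentagons p
    WHERE 5000 > p.n
)
-- illustrative stand-in for the pair-searching query
SELECT count(*) FROM pentagons;
</code></pre></div></div>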
<h2 id="a-query-only-approach">A query only approach</h2>
<p>Having the <code class="language-plaintext highlighter-rouge">pentagons</code> table in place, it is possible to use it as the materialization of the CTE, thus running the query on its own, outside the function and without the CTE:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'%s, %s'</span><span class="p">,</span> <span class="n">l</span><span class="p">.</span><span class="n">n</span><span class="p">,</span> <span class="n">r</span><span class="p">.</span><span class="n">n</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">pentagon_pairs</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">l</span><span class="p">,</span> <span class="n">pentagons</span> <span class="n">r</span>
<span class="k">WHERE</span> <span class="k">EXISTS</span><span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span>
<span class="p">)</span>
<span class="k">AND</span> <span class="k">EXISTS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="k">abs</span><span class="p">(</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span> <span class="p">)</span>
<span class="p">)</span>
<span class="p">;</span>
<span class="n">pentagon_pairs</span>
<span class="c1">----------------</span>
<span class="mi">1020</span><span class="p">,</span> <span class="mi">2167</span>
<span class="mi">2167</span><span class="p">,</span> <span class="mi">1020</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">5024</span><span class="p">,</span><span class="mi">468</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">05</span><span class="p">,</span><span class="mi">024</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is possible to push a <code class="language-plaintext highlighter-rouge">LIMIT 1</code> into each subquery, so as to force it to terminate as soon as a match is found, and this slightly improves the speed of the whole query:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'%s, %s'</span><span class="p">,</span> <span class="n">l</span><span class="p">.</span><span class="n">n</span><span class="p">,</span> <span class="n">r</span><span class="p">.</span><span class="n">n</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">pentagon_pairs</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">l</span><span class="p">,</span> <span class="n">pentagons</span> <span class="n">r</span>
<span class="k">WHERE</span> <span class="k">EXISTS</span><span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">+</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span> <span class="k">LIMIT</span> <span class="mi">1</span>
<span class="p">)</span>
<span class="k">AND</span> <span class="k">EXISTS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pentagons</span> <span class="n">ps</span>
<span class="k">WHERE</span> <span class="n">ps</span><span class="p">.</span><span class="n">p</span> <span class="o">=</span> <span class="k">abs</span><span class="p">(</span> <span class="n">l</span><span class="p">.</span><span class="n">p</span> <span class="o">-</span> <span class="n">r</span><span class="p">.</span><span class="n">p</span> <span class="p">)</span> <span class="k">LIMIT</span> <span class="mi">1</span>
<span class="p">)</span>
<span class="p">;</span>
<span class="n">pentagon_pairs</span>
<span class="c1">----------------</span>
<span class="mi">1020</span><span class="p">,</span> <span class="mi">2167</span>
<span class="mi">2167</span><span class="p">,</span> <span class="mi">1020</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">4328</span><span class="p">,</span><span class="mi">603</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">,</span><span class="mi">329</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The number of rows in the table is, however, too small to trigger the use of the index, even when forcing an <code class="language-plaintext highlighter-rouge">ORDER BY</code>.
Even an index with included columns, which could cover all the columns, does not help in this case.</p>
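<p>A minimal sketch of how such an index could be created and checked follows. The index name is hypothetical, and <code class="language-plaintext highlighter-rouge">enable_seqscan</code> is used here only as a diagnostic switch to see whether the planner can use the index at all on a simple lookup; it is not meant as a tuning recommendation.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical index name; INCLUDE stores n next to the indexed p
CREATE INDEX idx_pentagons_p_incl_n ON pentagons( p ) INCLUDE ( n );
ANALYZE pentagons;

-- diagnostic only: discourage sequential scans in this session
SET enable_seqscan TO off;

EXPLAIN
SELECT ps.n
FROM pentagons ps
WHERE ps.p = 8602840;

-- back to the default planner behaviour
RESET enable_seqscan;
</code></pre></div></div>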
<h1 id="conclusiops">Conclusions</h1>
<p><em>There is more than one way to do it!</em>
<br />
No, sorry, this is not Perl, but PostgreSQL! However, given a specific problem, PostgreSQL provides a lot of fun and tools to solve a task.</p>
kill that backend!2021-12-06T00:00:00+00:00https://fluca1978.github.io/2021/12/06/pgterminatebackend<p>How to kill a backend process, the right way!</p>
<h1 id="kill-that-backend"><code class="language-plaintext highlighter-rouge">kill</code> that backend!</h1>
<p>Sometimes it happens: you need, as a DBA, to be harsh and terminate a backend, that is a user connection.
<br />
There are two main ways to do that:</p>
<ul>
<li>use the operating system <code class="language-plaintext highlighter-rouge">kill(1)</code> command to, well, <em>kill</em> such a process;</li>
<li>use PostgreSQL administrative functions like <code class="language-plaintext highlighter-rouge">pg_terminate_backend()</code> or the more polite <code class="language-plaintext highlighter-rouge">pg_cancel_backend()</code>.</li>
</ul>
<h2 id="postgresql-pg_cancel_backend-and-pg_terminate_backend">PostgreSQL <code class="language-plaintext highlighter-rouge">pg_cancel_backend()</code> and <code class="language-plaintext highlighter-rouge">pg_terminate_backend()</code></h2>
<p>What is the difference between the two functions?
<br />
Quite easy to understand: <code class="language-plaintext highlighter-rouge">pg_cancel_backend()</code> sends a <code class="language-plaintext highlighter-rouge">SIGINT</code> to the backend process, that is it <em>asks politely to exit</em>. It is the equivalent of a standard <code class="language-plaintext highlighter-rouge">kill -INT</code> against the process.
<br />
But what does it mean to <em>ask politely to exit</em>? It means <strong>to cancel the current query</strong>, that is, it does not terminate the user session, but rather the current user interaction. That is why it is mapped to <code class="language-plaintext highlighter-rouge">SIGINT</code>, the equivalent of <code class="language-plaintext highlighter-rouge">CTRL-c</code> (interrupt by keyboard).
<br />
On the other hand, <code class="language-plaintext highlighter-rouge">pg_terminate_backend()</code> sends a <code class="language-plaintext highlighter-rouge">SIGTERM</code> to the process, which is equivalent to <code class="language-plaintext highlighter-rouge">kill -TERM</code> and <em>brutally forces the process to exit</em>.
<br /></p>
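<p>As a minimal sketch of how the two functions are typically used, the statements below first look up the candidate backends in <code class="language-plaintext highlighter-rouge">pg_stat_activity</code> and then signal one of them. The pid <code class="language-plaintext highlighter-rouge">12345</code> is just a placeholder for a value taken from the first query.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- list the client connections, excluding the current session
SELECT pid, usename, datname, state, query
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND pid != pg_backend_pid();

-- cancel only the running query of that backend (SIGINT)
SELECT pg_cancel_backend( 12345 );

-- terminate the whole session of that backend (SIGTERM)
SELECT pg_terminate_backend( 12345 );
</code></pre></div></div>
<p>Both functions return a boolean, with <code class="language-plaintext highlighter-rouge">t</code> meaning that the signal has been sent successfully.</p>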
<h2 id="now-kill-it">Now, Kill it!</h2>
<p>Which method should you use?
<br />
<strong>If you are absolutely sure about what you are doing, you can use whatever method you want!</strong>
<br />
But when the caffeine level in your body is too low to do it right, <strong>you should use the PostgreSQL way</strong>!
There are at least two good reasons to use the <a href="https://www.postgresql.org/docs/14/functions-admin.html" target="_blank">PostgreSQL administrative functions</a>:</p>
<ul>
<li>you don’t need access to the server, i.e., you don’t need an operating system shell;</li>
<li>you will not accidentally kill another process.</li>
</ul>
<p><br />
The first reason is really simple to understand, and improves the security of the machine hosting PostgreSQL, at least in my opinion.
<br />
The second reason is a little less obvious, and relies on the fact that <code class="language-plaintext highlighter-rouge">pg_cancel_backend()</code> and <code class="language-plaintext highlighter-rouge">pg_terminate_backend()</code> <strong>act only against processes within the PostgreSQL space</strong>, that is, only processes spawned by the <code class="language-plaintext highlighter-rouge">postmaster</code>.
<br />
Let’s see this in action: imagine we select the wrong process to kill, like <code class="language-plaintext highlighter-rouge">174601</code> that is running Emacs on the server.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% ssh luca@miguel <span class="s1">'ps -aux | grep emacs'</span>
luca 174601 1.6 4.6 320068 46584 pts/0 S+ 08:40 0:04 emacs
% psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"SELECT pg_cancel_backend( 174601 );"</span> testdb
WARNING: PID 174601 is not a PostgreSQL server process
pg_cancel_backend
<span class="nt">-------------------</span>
f
<span class="o">(</span>1 row<span class="o">)</span>
% psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"SELECT pg_terminate_backend( 174601 );"</span> testdb
WARNING: PID 174601 is not a PostgreSQL server process
pg_terminate_backend
<span class="nt">----------------------</span>
f
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, there is no way to misbehave against a non-PostgreSQL process! The logs provide, of course, the very same warning message:</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WARNING: PID 174601 is not a PostgreSQL server process
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now, imagine what would happen if the administrator ran something like:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% ssh luca@miguel <span class="s1">'sudo kill 174601'</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The process, in this case Emacs, would have been killed.</p>
<h1 id="conclusions">Conclusions</h1>
<p>While you can always use the well known Unix tools to interact with PostgreSQL processes, it is strongly suggested to use the PostgreSQL tools. This improves safety checks and requires less effort in keeping track of what is happening on the cluster.</p>
pgdump, text and xz2021-12-06T00:00:00+00:00https://fluca1978.github.io/2021/12/06/pgdumpxz<p>A not-scientific look at how to compress a set of SQL dumps.</p>
<h1 id="pgdump-text-and-xz">pgdump, text and xz</h1>
<p>I have a database that contains around <code class="language-plaintext highlighter-rouge">50 GB</code> of data. I do continuous backups through <a href="https://pgbackrest.org/" target="_blank">pgBackRest</a>, and I also run regular <code class="language-plaintext highlighter-rouge">pg_dump</code> backups in <em>directory format via multiple jobs</em>, so I’m fine with backups.
<br />
However, why not have a look at SQL backups?
<br />
First of all: the content of the database is mostly numeric, being a quite large container of sensor data. This means that the data should compress very well.
<br />
Moreover, tables are <em>partitioned</em> on a per-year and per-month basis, therefore I have a regular structure with one table per year and twelve monthly children. For instance, in the current year there is a table named <code class="language-plaintext highlighter-rouge">y2021</code> with partitions named <code class="language-plaintext highlighter-rouge">y2021m01</code> through <code class="language-plaintext highlighter-rouge">y2021m12</code>.</p>
<h2 id="pg_dump-in-text-mode"><code class="language-plaintext highlighter-rouge">pg_dump</code> in text mode</h2>
<p>I did a simple <code class="language-plaintext highlighter-rouge">for</code> loop in my shell to produce a few backup files, separating every single file by its year:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="k">for </span>y <span class="k">in</span> <span class="si">$(</span><span class="nb">echo </span>2018 2019 2020 2021 2022 <span class="si">)</span><span class="p">;</span> <span class="k">do
</span><span class="nb">echo</span> <span class="s2">"Backup year </span><span class="nv">$y</span><span class="s2">"</span>
<span class="nb">time </span>pg_dump <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="nt">-f</span> sensorsdb.<span class="nv">$y</span>.sql <span class="nt">-t</span> <span class="s2">"respi.y</span><span class="k">${</span><span class="nv">y</span><span class="k">}</span><span class="s2">*"</span> sensorsdb
<span class="k">done</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>This produces the following amount of data:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">ls</span> <span class="nt">-sh1</span> <span class="k">*</span>.sql
3,5G sensorsdb.2018.sql
13G sensorsdb.2019.sql
12G sensorsdb.2020.sql
10G sensorsdb.2021.sql
20K sensorsdb.2022.sql
</code></pre></div></div>
<p>The following is a table that summarizes the file size and the time required to create it:</p>
<p><br /></p>
<table class="table table-bordered">
<thead>
<tr>
<th>year</th>
<th>SQL size</th>
<th>time</th>
</tr>
</thead>
<tbody>
<tr>
<td>2018</td>
<td>3.5 GB</td>
<td>7 minutes</td>
</tr>
<tr>
<td>2019</td>
<td>13 GB</td>
<td>20 minutes</td>
</tr>
<tr>
<td>2020</td>
<td>12 GB</td>
<td>20 minutes</td>
</tr>
<tr>
<td>2021</td>
<td>10 GB</td>
<td>17 minutes</td>
</tr>
</tbody>
</table>
<p><br /></p>
<h2 id="compress-them">Compress them!</h2>
<p>I used <code class="language-plaintext highlighter-rouge">xz</code> with the default settings, which on my installation means compression level <em>6</em>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="k">for </span>y <span class="k">in</span> <span class="si">$(</span><span class="nb">echo </span>2018 2019 2020 2021 2022 <span class="si">)</span><span class="p">;</span> <span class="k">do
</span><span class="nb">echo</span> <span class="s2">"Compress year </span><span class="nv">$y</span><span class="s2">"</span>
<span class="nb">time </span>xz sensorsdb.<span class="nv">$y</span>.sql
<span class="k">done
</span>Compress year 2018
xz sensorsdb.<span class="nv">$y</span>.sql 2911,75s user 12,62s system 98% cpu 49:22,22 total
Compress year 2019
xz sensorsdb.<span class="nv">$y</span>.sql 7411,57s user 41,22s system 98% cpu 2:06:24,38 total
Compress year 2020
xz sensorsdb.<span class="nv">$y</span>.sql 6599,22s user 19,08s system 98% cpu 1:52:07,38 total
Compress year 2021
xz sensorsdb.<span class="nv">$y</span>.sql 5487,37s user 15,25s system 98% cpu 1:33:08,32 total
Compress year 2022
xz sensorsdb.<span class="nv">$y</span>.sql 0,01s user 0,01s system 36% cpu 0,069 total
</code></pre></div></div>
<p><br />
<br /></p>
<p>It takes from roughly 50 minutes to two hours to compress each file, as summarized in the following table:</p>
<p><br />
<br /></p>
<table class="table table-bordered">
<thead>
<tr>
<th>File size</th>
<th>Time</th>
<th>Compressed size</th>
<th>Compression ratio</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.5 GB</td>
<td>50 minutes</td>
<td>227 MB</td>
<td>92 %</td>
</tr>
<tr>
<td>13 GB</td>
<td>2 hours</td>
<td>766 MB</td>
<td>94 %</td>
</tr>
<tr>
<td>12 GB</td>
<td>2 hours</td>
<td>658 MB</td>
<td>94 %</td>
</tr>
<tr>
<td>10 GB</td>
<td>1 and a half hours</td>
<td>566 MB</td>
<td>94 %</td>
</tr>
</tbody>
</table>
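<p>The last column is the share of space saved by the compression. As a rough sketch, it can be recomputed from the shell; the helper name <code class="language-plaintext highlighter-rouge">saved_pct</code> is hypothetical, and the computation assumes 1 GB = 1024 MB and the rounded sizes shown in the table, so the result can differ slightly from a value computed on exact byte counts.</p>

```shell
# Hypothetical helper: percentage of space saved by compression.
# Usage: saved_pct ORIGINAL_SIZE_IN_GB COMPRESSED_SIZE_IN_MB
# Assumes 1 GB = 1024 MB; LC_ALL=C forces a dot as decimal separator.
saved_pct() {
    LC_ALL=C awk -v gb="$1" -v mb="$2" \
        'BEGIN { printf "%.1f\n", (1 - mb / (gb * 1024)) * 100 }'
}

# the 2019 dump: 13 GB of SQL compressed to 766 MB
saved_pct 13 766
```

<p>For the 2019 dump this prints <code class="language-plaintext highlighter-rouge">94.2</code>, in line with the 94 % reported in the table.</p>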
<p><br />
<br /></p>
<p>Therefore, <code class="language-plaintext highlighter-rouge">xz</code> is a great tool to compress dump data, especially if that data is textual and mostly in numeric form. Unluckily, <code class="language-plaintext highlighter-rouge">xz</code> turns out to be a little slow when applied with the default compression.
<br />
How long does it take to decompress the data? Well, it takes around 4 minutes for every file, which is much faster than the compression.</p>
<p><br />
Just as a comparison, compressing with <code class="language-plaintext highlighter-rouge">-2</code> instead of <code class="language-plaintext highlighter-rouge">-6</code> requires around <em>one quarter of the time at the cost of roughly one third less compression</em>, e.g., <code class="language-plaintext highlighter-rouge">13 GB</code> required <code class="language-plaintext highlighter-rouge">35 minutes</code> instead of <code class="language-plaintext highlighter-rouge">120 minutes</code>, and <code class="language-plaintext highlighter-rouge">1.1 GB</code> of disk space instead of <code class="language-plaintext highlighter-rouge">0.77 GB</code>.
Let’s see the results using <code class="language-plaintext highlighter-rouge">-2</code> as compression level:</p>
<p><br />
<br /></p>
<table class="table table-bordered">
<thead>
<tr>
<th>File size</th>
<th>Time</th>
<th>Compressed size</th>
<th>Compression ratio</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.5 GB</td>
<td>10 minutes</td>
<td>338 MB</td>
<td>90 %</td>
</tr>
<tr>
<td>13 GB</td>
<td>35 minutes</td>
<td>1.1 GB</td>
<td>91 %</td>
</tr>
<tr>
<td>12 GB</td>
<td>37 minutes</td>
<td>918 MB</td>
<td>92 %</td>
</tr>
<tr>
<td>10 GB</td>
<td>30 minutes</td>
<td>786 MB</td>
<td>92 %</td>
</tr>
</tbody>
</table>
<p><br />
<br /></p>
<p>As you can see, using compression <code class="language-plaintext highlighter-rouge">-2</code> can greatly improve the speed of compression with a minimal extra disk space requirement.
<br />
What about dumping in directory format? Well, the same backup with <code class="language-plaintext highlighter-rouge">pg_dump -Fd</code>, which defaults to creating compressed objects, required <code class="language-plaintext highlighter-rouge">4.7 GB</code> of disk space. The <code class="language-plaintext highlighter-rouge">xz</code> version requires from <code class="language-plaintext highlighter-rouge">3.1 GB</code> (compression <code class="language-plaintext highlighter-rouge">-2</code>) to <code class="language-plaintext highlighter-rouge">2.2 GB</code> (compression <code class="language-plaintext highlighter-rouge">-6</code>).</p>
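<p>The dump and the compression can also be chained in a single pipeline, so that the uncompressed SQL file never hits the disk. This variant was not part of the measurements above, so take it as a sketch under assumptions: the function name <code class="language-plaintext highlighter-rouge">dump_year_xz</code> is hypothetical, while host, user, schema and database are the ones used in the loop shown earlier.</p>

```shell
# Hypothetical helper: dump one year of partitions and compress it on the fly.
# Usage: dump_year_xz YEAR [XZ_LEVEL]   (the level defaults to 2)
dump_year_xz() {
    year=$1
    level=${2:-2}
    # without -f, pg_dump writes the SQL script to standard output;
    # xz without file arguments compresses standard input to standard output
    pg_dump -h miguel -U postgres -t "respi.y${year}*" sensorsdb \
        | xz -"$level" > "sensorsdb.${year}.sql.xz"
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">dump_year_xz 2021 2</code> would produce <code class="language-plaintext highlighter-rouge">sensorsdb.2021.sql.xz</code> in the current directory. Note that a plain pipeline reports only the exit status of <code class="language-plaintext highlighter-rouge">xz</code>, so a failure of <code class="language-plaintext highlighter-rouge">pg_dump</code> has to be checked separately.</p>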
<h2 id="conclusions">Conclusions</h2>
<p><code class="language-plaintext highlighter-rouge">xz</code> can help you save a lot of disk storage for textual (SQL) backups, but the default compression level could require a huge amount of time, especially on not-so-powerful machines. However, a lower compression level can make <code class="language-plaintext highlighter-rouge">pg_dump</code> plus <code class="language-plaintext highlighter-rouge">xz</code> roughly as fast as <code class="language-plaintext highlighter-rouge">pg_dump -Fd</code>, with some extra space saving.</p>
Monitoring Schema Changes via Last Commit Timestamp2021-11-26T00:00:00+00:00https://fluca1978.github.io/2021/11/26/UsingLastCommitTimestampToInspectDatabase<p>An ugly way to introspect database changes.</p>
<h1 id="monitoring-schema-changes-via-last-commit-timestamp">Monitoring Schema Changes via Last Commit Timestamp</h1>
<p>A few days ago, a colleague of mine showed me that a <a href="https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_2005.htm#i1583352" target="_blank">commercial database keeps track of <em>last DDL change timestamp</em> against database objects</a>.
<br />
I began to mumble… is that possible in PostgreSQL? Of course it is, but what is the smartest way to achieve it?
<br />
I asked on the <a href="https://www.postgresql.org/message-id/CAKoxK%2B5uz67DxYcF%3D6ese6imi7ovHoqwjB%2BYE4X46sNv_9KN0w%40mail.gmail.com" target="_blank">mailing list</a>, because the first idea that came into my mind was to use <em>commit timestamps</em>.
<br />
Clearly, it is possible to implement something that can do the job using <em>event triggers</em>, which, in short, are triggers attached not to table tuples but to database events like DDL commands. Great! And in fact, a very good explanation <a href="https://www.depesz.com/2012/07/29/waiting-for-9-3-event-triggers/" target="_blank">can be found here</a>.
<br />
In this article, I present my first idea about using commit timestamps.
<br />
The system used for the test is PostgreSQL 13.4 running on Fedora Linux, with only myself connected to it (this simplifies following transactions). The idea is, in any case, general and easy enough to be used on busy systems.</p>
<h1 id="introduction-to-pg_last_committed_xact">Introduction to <code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code></h1>
<p>The special function <a href="https://www.postgresql.org/docs/current/functions-info.html" target="_blank"><code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code></a> allows the database administrator (or a user) to get information about which transaction has committed last.
<br />
Let’s see this in action:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> luca <span class="nt">-h</span> miguel <span class="nt">-c</span> <span class="s1">'select pg_last_committed_xact();'</span> testdb
ERROR: could not get commit timestamp data
HINT: Make sure the configuration parameter <span class="s2">"track_commit_timestamp"</span> is set.
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>First of all</strong>, in order to get information about the commit timestamps of transactions, the option <code class="language-plaintext highlighter-rouge">track_commit_timestamp</code> must be enabled.
<br />
Turning the parameter on and off will not preserve historic data, that is, even if you <em>had</em> the parameter on and then turned it off, you will not be able to access the collected data.
<br />
Let’s turn on the parameter and see how it works. <strong>The <code class="language-plaintext highlighter-rouge">track_commit_timestamp</code> is a parameter with the <code class="language-plaintext highlighter-rouge">postmaster</code> context, and therefore requires a server restart!</strong></p>
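<p>Before changing anything, it can be handy to check whether the parameter is already enabled. The following is a minimal sketch, assuming the same host, user and database used throughout this article; the function name <code class="language-plaintext highlighter-rouge">commit_ts_enabled</code> is hypothetical.</p>

```shell
# Hypothetical helper: succeed only if track_commit_timestamp is "on".
# -A (unaligned) and -t (tuples only) make psql print just the bare value.
commit_ts_enabled() {
    value=$(psql -At -h miguel -U postgres \
                 -c 'SHOW track_commit_timestamp;' testdb) || return 2
    [ "$value" = "on" ]
}
```

<p>It can then be used as a guard, for instance <code class="language-plaintext highlighter-rouge">commit_ts_enabled || echo "commit timestamps are not tracked"</code>. A return value of 2 means that <code class="language-plaintext highlighter-rouge">psql</code> itself failed, e.g., because the server was not reachable.</p>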
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> postgres <span class="nt">-h</span> miguel <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'ALTER SYSTEM SET track_commit_timestamp to "on"; '</span> <span class="se">\</span>
testdb
ALTER SYSTEM
% ssh luca@miguel <span class="s1">'sudo systemctl restart postgresql-13'</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above I restarted a remote system via <code class="language-plaintext highlighter-rouge">ssh</code>; of course, you are free to configure the parameter and restart the cluster with your preferred (or available) method.
<br />
It is now time to see which information we can get with <code class="language-plaintext highlighter-rouge">track_commit_timestamp</code> turned on.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">txid_current</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">-------------</span>
<span class="n">txid_current</span> <span class="o">|</span> <span class="mi">380316302458</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">txid_status</span><span class="p">(</span> <span class="mi">380316302457</span> <span class="p">),</span>
<span class="n">pg_last_committed_xact</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">------------------------------</span>
<span class="n">txid_status</span> <span class="o">|</span> <span class="k">committed</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180410</span>
<span class="nb">timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">04</span><span class="p">:</span><span class="mi">28</span><span class="p">:</span><span class="mi">50</span><span class="p">.</span><span class="mi">223275</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Let’s dissect the above example:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">txid_current()</code> simulates a new transaction in one row, because the function gets a new <code class="language-plaintext highlighter-rouge">xid</code> (transaction identifier) even if not used for effective work;</li>
<li><code class="language-plaintext highlighter-rouge">txid_status()</code> accepts a xid identifier and returns a string with the status of the transaction, and as shown, the <em>fake</em> transaction <code class="language-plaintext highlighter-rouge">380316302458</code> results in status <code class="language-plaintext highlighter-rouge">committed</code>;</li>
<li><code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code> now is able to report both the <code class="language-plaintext highlighter-rouge">xid</code> and the <code class="language-plaintext highlighter-rouge">timestamp</code> at which the last transaction has committed, that is the transaction <code class="language-plaintext highlighter-rouge">380316302458</code> committed at <code class="language-plaintext highlighter-rouge">2021-11-20 04:28:50.223275-05</code>.</li>
</ul>
<p><br />
Wait a minute: <code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code> states that the last committed transaction is <code class="language-plaintext highlighter-rouge">2359180410</code>, not <code class="language-plaintext highlighter-rouge">380316302458</code>. What is happening?
<br />
<strong>Wrap-around is on its way!</strong>
<br />
The above system has gone through a so-called <em>xid wraparound</em>, which is a normal situation in a long-running PostgreSQL instance. What this means is that <code class="language-plaintext highlighter-rouge">txid_current()</code> is returning a <em>bumped</em> value that is, somehow, an absolute value. However, PostgreSQL “reasons” in terms of values modulo 2^32, therefore we must take into account this possible difference.
<br />
The above example therefore becomes:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">txid_current</span><span class="p">()</span> <span class="k">as</span> <span class="n">xid_absolute</span><span class="p">,</span>
<span class="k">mod</span><span class="p">(</span> <span class="n">txid_current</span><span class="p">(),</span> <span class="n">pow</span><span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">32</span> <span class="p">)::</span><span class="nb">bigint</span> <span class="p">)</span> <span class="k">as</span> <span class="n">xid</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">-------------</span>
<span class="n">xid_absolute</span> <span class="o">|</span> <span class="mi">380316302460</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180412</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span>
<span class="n">txid_status</span><span class="p">(</span> <span class="mi">380316302460</span> <span class="p">)</span> <span class="k">as</span> <span class="n">xid_abs_status</span><span class="p">,</span>
<span class="n">txid_status</span><span class="p">(</span> <span class="mi">2359180412</span> <span class="p">)</span> <span class="k">as</span> <span class="n">xid_status</span><span class="p">,</span>
<span class="n">pg_last_committed_xact</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">--+------------------------------</span>
<span class="n">xid_abs_status</span> <span class="o">|</span> <span class="k">committed</span>
<span class="n">xid_status</span> <span class="o">|</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180412</span>
<span class="nb">timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">04</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">54</span><span class="p">.</span><span class="mi">531106</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above demonstrates that transactions <code class="language-plaintext highlighter-rouge">380316302460</code> and <code class="language-plaintext highlighter-rouge">2359180412</code> are the same, according to PostgreSQL. However, <code class="language-plaintext highlighter-rouge">txid_status()</code> requires an “absolute” xid number (note how the short transaction number does not report any status), while <code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code> reasons in terms of “running” numbers, i.e., the modulo ones.
<br />
There is another interesting function to keep in mind: <code class="language-plaintext highlighter-rouge">pg_xact_commit_timestamp()</code> that, given a transaction identifier, returns the known commit timestamp:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span>
<span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="mi">2359180412</span><span class="p">::</span><span class="nb">text</span><span class="p">::</span><span class="n">xid</span> <span class="p">),</span>
<span class="n">pg_last_committed_xact</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">------------+------------------------------</span>
<span class="n">pg_xact_commit_timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">04</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">54</span><span class="p">.</span><span class="mi">531106</span><span class="o">-</span><span class="mi">05</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180412</span>
<span class="nb">timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">04</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">54</span><span class="p">.</span><span class="mi">531106</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the timestamp for the same transaction is always the same. <strong>Note that a <code class="language-plaintext highlighter-rouge">bigint</code> requires a conversion to <code class="language-plaintext highlighter-rouge">text</code> before being translated into a <code class="language-plaintext highlighter-rouge">xid</code></strong>.</p>
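<p>The relation between the “absolute” value and the modulo one can also be double-checked outside the database. The following sketch uses plain shell arithmetic and assumes a shell with 64 bit integers (e.g., Bash or Zsh on a 64 bit machine); the function names <code class="language-plaintext highlighter-rouge">xid_short</code> and <code class="language-plaintext highlighter-rouge">xid_epoch</code> are hypothetical.</p>

```shell
# Split the 64 bit value returned by txid_current() into its two parts.
# 4294967296 is 2^32.
xid_short() { echo $(( $1 % 4294967296 )); }   # the 32 bit "running" xid
xid_epoch() { echo $(( $1 / 4294967296 )); }   # how many times the counter wrapped

xid_short 380316302460   # prints 2359180412
xid_epoch 380316302460   # prints 88
```

<p>The first value matches the <code class="language-plaintext highlighter-rouge">xid</code> reported by <code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code> in the examples above, while the second one tells that the transaction counter of this cluster has wrapped 88 times.</p>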
<h1 id="tracking-ddl-commands">Tracking DDL Commands</h1>
<p>Every table in PostgreSQL has two hidden fields that track the <em>transaction ranges</em>: <code class="language-plaintext highlighter-rouge">xmin</code> indicates the transaction that created a tuple, while <code class="language-plaintext highlighter-rouge">xmax</code> indicates the transaction that invalidated the tuple. This is used in the MVCC (Multi Version Concurrency Control) machinery that I’m not going to discuss here, so trust that everything works just fine.
<br />
The key point here is: <strong>every table has fields that track the transaction that generated the tuple</strong>. This applies also to system catalogs, and in particular (with regard to this article) to <code class="language-plaintext highlighter-rouge">pg_class</code>.
<br />
Having stated that, and knowing that every time a DDL command applies, something is changed in the system catalogs, it is therefore possible to track <strong>when changes did happen</strong> on a particular database object or table.
<br />
Let’s see this in action:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">BEGIN</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">txid_current</span><span class="p">()</span> <span class="k">as</span> <span class="n">xid_absolute</span><span class="p">,</span>
<span class="k">mod</span><span class="p">(</span> <span class="n">txid_current</span><span class="p">(),</span> <span class="n">pow</span><span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">32</span> <span class="p">)::</span><span class="nb">bigint</span> <span class="p">)</span> <span class="k">as</span> <span class="n">xid</span><span class="p">,</span>
<span class="k">current_timestamp</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-----+------------------------------</span>
<span class="n">xid_absolute</span> <span class="o">|</span> <span class="mi">380316302463</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180415</span>
<span class="k">current_timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">11</span><span class="p">:</span><span class="mi">56</span><span class="p">.</span><span class="mi">343542</span><span class="o">-</span><span class="mi">05</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">ddl_test</span><span class="p">(</span>
<span class="n">pk</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span><span class="p">,</span>
<span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">COMMIT</span><span class="p">;</span>
<span class="k">COMMIT</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>At timestamp <code class="language-plaintext highlighter-rouge">2021-11-20 05:11:56</code> the table <code class="language-plaintext highlighter-rouge">ddl_test</code> was created. Since every DDL command in PostgreSQL is transactional, it is possible to track the transaction that committed such DDL (in the above example, <code class="language-plaintext highlighter-rouge">380316302463</code> alias <code class="language-plaintext highlighter-rouge">2359180415</code>).
<br />
Let’s query <code class="language-plaintext highlighter-rouge">pg_class</code> to get information about last DDL commands on <code class="language-plaintext highlighter-rouge">ddl_test</code> table:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_at</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+------------------------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180415</span>
<span class="n">modified_at</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above query tells us that <code class="language-plaintext highlighter-rouge">1</code> transaction ago the transaction number <code class="language-plaintext highlighter-rouge">2359180415</code> modified the structure of <code class="language-plaintext highlighter-rouge">ddl_test</code> at timestamp <code class="language-plaintext highlighter-rouge">2021-11-20 05:12:21.359126-05</code>.
<br />
<strong>Everything seems fine except for the timestamp</strong>: the timestamp printed inside the transaction is not the same as the one reported by <code class="language-plaintext highlighter-rouge">pg_xact_commit_timestamp()</code>. The reason for this is that <code class="language-plaintext highlighter-rouge">current_timestamp</code> reports the moment the transaction started, while the commit timestamp is recorded when the transaction actually commits, therefore there can be some offset between the two. However, checking deeper we can see that the data is coherent:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_last_committed_xact</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">----------------------------</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180415</span>
<span class="nb">timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>So this is a first <em>ugly</em> but pretty inexpensive way to track changes to the table.</p>
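<p>A prerequisite worth double-checking: both <code class="language-plaintext highlighter-rouge">pg_xact_commit_timestamp()</code> and <code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code> rely on the <code class="language-plaintext highlighter-rouge">track_commit_timestamp</code> setting being enabled. The following is only a minimal sketch of how the current value could be checked before trusting the timestamps; it assumes nothing else about the configuration of the cluster:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Quick check of the current value.
SHOW track_commit_timestamp;

-- Same information from pg_settings, including the "context" column,
-- which tells at which level the parameter can be changed.
SELECT name, setting, context
FROM pg_settings
WHERE name = 'track_commit_timestamp';
</code></pre></div></div>
<p>If the setting is off, the commit timestamp functions raise an error instead of returning a timestamp, so this whole approach cannot be used.</p>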
<p><br />
<br />
Let’s now add a column to the table, to see whether this machinery works:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">BEGIN</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">txid_current</span><span class="p">()</span> <span class="k">as</span> <span class="n">xid_absolute</span><span class="p">,</span> <span class="k">mod</span><span class="p">(</span> <span class="n">txid_current</span><span class="p">(),</span> <span class="n">pow</span><span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">32</span> <span class="p">)::</span><span class="nb">bigint</span> <span class="p">)</span> <span class="k">as</span> <span class="n">xid</span><span class="p">,</span> <span class="k">current_timestamp</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-----+------------------------------</span>
<span class="n">xid_absolute</span> <span class="o">|</span> <span class="mi">380316302464</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180416</span>
<span class="k">current_timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">21</span><span class="p">:</span><span class="mi">03</span><span class="p">.</span><span class="mi">089031</span><span class="o">-</span><span class="mi">05</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">ddl_test</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">tt</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">COMMIT</span><span class="p">;</span>
<span class="k">COMMIT</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_last_committed_xact</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">----------------------------</span>
<span class="n">xid</span> <span class="o">|</span> <span class="mi">2359180416</span>
<span class="nb">timestamp</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">21</span><span class="p">:</span><span class="mi">32</span><span class="p">.</span><span class="mi">376468</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
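<p>A short note on the <code class="language-plaintext highlighter-rouge">mod( txid_current(), pow( 2, 32 )::bigint )</code> expression used in the session above: <code class="language-plaintext highlighter-rouge">txid_current()</code> returns a 64 bit value that also includes the wraparound <em>epoch</em>, while <code class="language-plaintext highlighter-rouge">xmin</code> and <code class="language-plaintext highlighter-rouge">pg_last_committed_xact()</code> work with the 32 bit transaction identifier. As a worked example, the two parts can be split again using the literal value reported above (the column aliases are just illustrative names):</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- 380316302464 is the xid_absolute value shown in the session above.
-- Integer division gives the epoch, the remainder gives the 32 bit xid.
SELECT 380316302464 / pow( 2, 32 )::bigint        AS epoch,
       mod( 380316302464, pow( 2, 32 )::bigint )  AS xid;
-- epoch = 88, xid = 2359180416
-- because 88 * 4294967296 + 2359180416 = 380316302464
</code></pre></div></div>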
<p><br />
<br /></p>
<p>Transaction <code class="language-plaintext highlighter-rouge">2359180416</code> at timestamp <code class="language-plaintext highlighter-rouge">2021-11-20 05:21:32.376468-05</code> committed the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code>. Let’s run our query against <code class="language-plaintext highlighter-rouge">pg_class</code> again:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_at</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+------------------------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180416</span>
<span class="n">modified_at</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">21</span><span class="p">:</span><span class="mi">32</span><span class="p">.</span><span class="mi">376468</span><span class="o">-</span><span class="mi">05</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore we now know when the table was last <em>touched</em> by a DDL command.</p>
<h2 id="going-deeper-introspection-against-columns">Going Deeper: Introspection Against Columns</h2>
<p>From the above we now know <em>when</em> a change happened to our table, but we don’t know which attribute has been changed. It is possible to <em>push</em> the same logic against other parts of the system catalog, for example <code class="language-plaintext highlighter-rouge">pg_attribute</code> that handles information about single table columns.
<br />
Here is the example applied to our demo table:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">xmin</span><span class="p">,</span> <span class="n">attname</span><span class="p">,</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">),</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="n">xmin</span> <span class="o">|</span> <span class="n">attname</span> <span class="o">|</span> <span class="n">age</span> <span class="o">|</span> <span class="n">pg_xact_commit_timestamp</span>
<span class="c1">------------+----------+-----+-------------------------------</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">tableoid</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">cmax</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">xmax</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">cmin</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">xmin</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">ctid</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">pk</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180415</span> <span class="o">|</span> <span class="n">t</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="mi">2359180416</span> <span class="o">|</span> <span class="n">tt</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">21</span><span class="p">:</span><span class="mi">32</span><span class="p">.</span><span class="mi">376468</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>All the columns except <code class="language-plaintext highlighter-rouge">tt</code> have been created by the very same transaction at the very same timestamp, while <code class="language-plaintext highlighter-rouge">tt</code> has been touched by another transaction about 9 minutes later.
<br />
The above is not very readable, so the query can be slightly improved as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">array_agg</span><span class="p">(</span> <span class="n">attname</span> <span class="p">)</span> <span class="k">as</span> <span class="n">columns</span><span class="p">,</span>
<span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="k">when</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">::</span><span class="n">regclass</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">);</span>
<span class="n">columns</span> <span class="o">|</span> <span class="k">when</span>
<span class="c1">------------------------------------------+-----------------</span>
<span class="p">{</span><span class="n">tableoid</span><span class="p">,</span><span class="n">cmax</span><span class="p">,</span><span class="n">xmax</span><span class="p">,</span><span class="n">cmin</span><span class="p">,</span><span class="n">xmin</span><span class="p">,</span><span class="n">ctid</span><span class="p">,</span><span class="n">pk</span><span class="p">,</span><span class="n">t</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">19</span><span class="p">:</span><span class="mi">38</span><span class="p">.</span><span class="mi">202794</span>
<span class="p">{</span><span class="n">tt</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">27</span><span class="p">.</span><span class="mi">185452</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>This reports all the columns “touched” at the very same time and how much time has elapsed since the last change. For example, the column <code class="language-plaintext highlighter-rouge">tt</code> was changed 10 minutes ago, while the other columns were changed 19 minutes ago.
<br />
Let’s make more changes to our table and see what happens; please note that everything is executed in autocommit mode:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">ddl_test</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">ttt</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">ddl_test</span>
<span class="k">ALTER</span> <span class="k">COLUMN</span> <span class="n">tt</span> <span class="k">SET</span> <span class="k">DEFAULT</span> <span class="s1">'FizzBuzz'</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">ddl_test</span> <span class="k">DROP</span> <span class="k">COLUMN</span> <span class="n">t</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_last_committed_xact</span><span class="p">();</span>
<span class="n">xid</span> <span class="o">|</span> <span class="nb">timestamp</span>
<span class="c1">------------+------------------------------</span>
<span class="mi">2359180419</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">36</span><span class="p">:</span><span class="mi">48</span><span class="p">.</span><span class="mi">54285</span><span class="o">-</span><span class="mi">05</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>If we inspect <code class="language-plaintext highlighter-rouge">pg_attribute</code> again, we get:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">array_agg</span><span class="p">(</span> <span class="n">attname</span> <span class="p">)</span> <span class="k">as</span> <span class="n">columns</span><span class="p">,</span>
<span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">time_ago</span><span class="p">,</span>
<span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="k">when</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">::</span><span class="n">regclass</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">);</span>
<span class="n">columns</span> <span class="o">|</span> <span class="n">time_ago</span> <span class="o">|</span> <span class="k">when</span>
<span class="c1">----------------------------------------+-----------------+-------------------------------</span>
<span class="p">{</span><span class="n">tableoid</span><span class="p">,</span><span class="n">cmax</span><span class="p">,</span><span class="n">xmax</span><span class="p">,</span><span class="n">cmin</span><span class="p">,</span><span class="n">xmin</span><span class="p">,</span><span class="n">ctid</span><span class="p">,</span><span class="n">pk</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">26</span><span class="p">:</span><span class="mi">25</span><span class="p">.</span><span class="mi">685984</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">12</span><span class="p">:</span><span class="mi">21</span><span class="p">.</span><span class="mi">359126</span><span class="o">-</span><span class="mi">05</span>
<span class="p">{</span><span class="n">ttt</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">08</span><span class="p">.</span><span class="mi">244367</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">38</span><span class="p">.</span><span class="mi">800743</span><span class="o">-</span><span class="mi">05</span>
<span class="p">{</span><span class="n">tt</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">31</span><span class="p">.</span><span class="mi">791574</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">36</span><span class="p">:</span><span class="mi">15</span><span class="p">.</span><span class="mi">253536</span><span class="o">-</span><span class="mi">05</span>
<span class="p">{........</span><span class="n">pg</span><span class="p">.</span><span class="n">dropped</span><span class="p">.</span><span class="mi">2</span><span class="p">........}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">01</span><span class="p">:</span><span class="mi">58</span><span class="p">.</span><span class="mi">50226</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">36</span><span class="p">:</span><span class="mi">48</span><span class="p">.</span><span class="mi">54285</span><span class="o">-</span><span class="mi">05</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_at</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+------------------------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">3</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180417</span>
<span class="n">modified_at</span> <span class="o">|</span> <span class="mi">2021</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">20</span> <span class="mi">05</span><span class="p">:</span><span class="mi">34</span><span class="p">:</span><span class="mi">38</span><span class="p">.</span><span class="mi">800743</span><span class="o">-</span><span class="mi">05</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>There are some interesting things in the above output.
First of all, <code class="language-plaintext highlighter-rouge">pg_class</code> reports only the changes related to <em>new</em> attributes, not the dropped ones or the internally changed ones. On the other hand, <code class="language-plaintext highlighter-rouge">pg_attribute</code> reports information about every single attribute, including those changed in a “minor” way (the <code class="language-plaintext highlighter-rouge">SET DEFAULT</code> for instance).
<br />
Please note how the dropped column (namely <code class="language-plaintext highlighter-rouge">t</code>) is no longer visible, even though there is a <code class="language-plaintext highlighter-rouge">pg.dropped.2</code> entry that clearly refers to that column. In the above example it is easy enough, since only one column has been dropped in a single user instance; in a more concurrent system, however, it is hard to keep track of the information related to dropped attributes. For more information about dropped columns, please see <a href="https://fluca1978.github.io/2020/02/09/PostgreSQLDROPCOlumn.html">my previous article about why PostgreSQL does not reclaim disk space on column drop</a>.</p>
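<p>To make the dropped column stand out, the same introspection can be narrowed down to user columns only. This is just a sketch built on two other <code class="language-plaintext highlighter-rouge">pg_attribute</code> columns: <code class="language-plaintext highlighter-rouge">attnum</code>, which is negative for system columns such as <code class="language-plaintext highlighter-rouge">ctid</code> or <code class="language-plaintext highlighter-rouge">xmin</code>, and <code class="language-plaintext highlighter-rouge">attisdropped</code>, which flags dropped columns; the <code class="language-plaintext highlighter-rouge">touched_at</code> alias is only an illustrative name:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- List only user-defined columns (attnum &gt; 0), showing which ones
-- have been dropped and when each catalog row was last touched.
SELECT attnum
     , attname
     , attisdropped
     , xmin
     , pg_xact_commit_timestamp( xmin ) AS touched_at
FROM pg_attribute
WHERE attrelid = 'ddl_test'::regclass
AND attnum &gt; 0
ORDER BY attnum;
</code></pre></div></div>
<p>The number in the placeholder name of a dropped column is its <code class="language-plaintext highlighter-rouge">attnum</code>, so <code class="language-plaintext highlighter-rouge">pg.dropped.2</code> is the second column of the table, which matches <code class="language-plaintext highlighter-rouge">t</code> in our example.</p>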
<h2 id="what-about-vacuum">What about <code class="language-plaintext highlighter-rouge">VACUUM</code>?</h2>
<p>The <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> command totally rewrites a table, which means that all the information about transactions that have “touched” the system catalogs is updated by a newer transaction. <strong>This does not mean that <code class="language-plaintext highlighter-rouge">VACUUM</code> is a transactional command</strong>, rather it happens to do a <code class="language-plaintext highlighter-rouge">CREATE TABLE</code> pretty much as we did manually.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">VACUUM</span> <span class="k">FULL</span> <span class="n">ddl_test</span><span class="p">;</span>
<span class="k">VACUUM</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_since</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="n">modified_since</span> <span class="o">|</span> <span class="k">table</span>
<span class="c1">--------------------+---------------------------+-----------------+----------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">2359180423</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">.</span><span class="mi">615678</span> <span class="o">|</span> <span class="n">ddl_test</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">array_agg</span><span class="p">(</span> <span class="n">attname</span> <span class="p">)</span> <span class="k">as</span> <span class="n">columns</span><span class="p">,</span>
<span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="k">when</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span>
<span class="k">WHERE</span> <span class="n">attrelid</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">::</span><span class="n">regclass</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">);</span>
<span class="n">columns</span> <span class="o">|</span> <span class="k">when</span>
<span class="c1">----------------------------------------+-----------------</span>
<span class="p">{</span><span class="n">tableoid</span><span class="p">,</span><span class="n">cmax</span><span class="p">,</span><span class="n">xmax</span><span class="p">,</span><span class="n">cmin</span><span class="p">,</span><span class="n">xmin</span><span class="p">,</span><span class="n">ctid</span><span class="p">,</span><span class="n">pk</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">50</span><span class="p">:</span><span class="mi">03</span><span class="p">.</span><span class="mi">972343</span>
<span class="p">{</span><span class="n">ttt</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">27</span><span class="p">:</span><span class="mi">46</span><span class="p">.</span><span class="mi">530726</span>
<span class="p">{</span><span class="n">tt</span><span class="p">}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">26</span><span class="p">:</span><span class="mi">10</span><span class="p">.</span><span class="mi">077933</span>
<span class="p">{........</span><span class="n">pg</span><span class="p">.</span><span class="n">dropped</span><span class="p">.</span><span class="mi">2</span><span class="p">........}</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">25</span><span class="p">:</span><span class="mi">36</span><span class="p">.</span><span class="mi">788619</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is interesting to note an apparent inconsistency: the table was modified 2 seconds ago, while its columns were touched between 25 and 50 minutes ago. How is that possible? Well, <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> has rewritten the table, but the metadata about its columns did not change.
<br />
In short, this is an indicator of a <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> execution: if a table has been changed more recently than any of its columns, <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> has probably run. The usual place to look for vacuum activity is a statistics view like <code class="language-plaintext highlighter-rouge">pg_stat_user_tables</code>, but keep in mind that its <code class="language-plaintext highlighter-rouge">last_vacuum</code> column does not count <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> runs. In any case, combining these pieces of information helps in understanding what happened in the system.</p>
<p><br />
<br />
Let’s see something about <code class="language-plaintext highlighter-rouge">VACUUM</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_since</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+----------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">11</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180423</span>
<span class="n">modified_since</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">20</span><span class="p">:</span><span class="mi">41</span><span class="p">.</span><span class="mi">953205</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">VACUUM</span> <span class="n">ddl_test</span><span class="p">;</span>
<span class="k">VACUUM</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_since</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+----------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">11</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180423</span>
<span class="n">modified_since</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">20</span><span class="p">:</span><span class="mi">51</span><span class="p">.</span><span class="mi">272209</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The result is that <code class="language-plaintext highlighter-rouge">pg_class</code> is unchanged, with regard to the transaction that generated the tuple.
<br />
Why?
<br />
Since <code class="language-plaintext highlighter-rouge">VACUUM</code> is a command that cannot be run within a transaction block, it does not fit the workflow described so far; therefore, it behaves like <em>an invisible command (with regard to transactions)</em>.</p>
<h2 id="what-about-analyze">What about <code class="language-plaintext highlighter-rouge">ANALYZE</code>?</h2>
<p><strong>Unlike <code class="language-plaintext highlighter-rouge">VACUUM</code>, the command <code class="language-plaintext highlighter-rouge">ANALYZE</code> can be run in a transaction</strong>, and this is clearly shown by the <code class="language-plaintext highlighter-rouge">age</code> result increasing by one:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_since</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+----------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">14</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180423</span>
<span class="n">modified_since</span> <span class="o">|</span> <span class="mi">01</span><span class="p">:</span><span class="mi">37</span><span class="p">:</span><span class="mi">54</span><span class="p">.</span><span class="mi">483495</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ANALYZE</span> <span class="n">ddl_test</span><span class="p">;</span>
<span class="k">ANALYZE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">transaction_before</span>
<span class="p">,</span> <span class="n">xmin</span> <span class="k">as</span> <span class="n">it_was_transaction_number</span>
<span class="p">,</span> <span class="k">current_timestamp</span> <span class="o">-</span> <span class="n">pg_xact_commit_timestamp</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">)</span> <span class="k">as</span> <span class="n">modified_since</span>
<span class="p">,</span> <span class="n">relname</span> <span class="k">as</span> <span class="k">table</span>
<span class="k">FROM</span> <span class="n">pg_class</span>
<span class="k">WHERE</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------+----------------</span>
<span class="n">transaction_before</span> <span class="o">|</span> <span class="mi">15</span>
<span class="n">it_was_transaction_number</span> <span class="o">|</span> <span class="mi">2359180423</span>
<span class="n">modified_since</span> <span class="o">|</span> <span class="mi">01</span><span class="p">:</span><span class="mi">38</span><span class="p">:</span><span class="mi">05</span><span class="p">.</span><span class="mi">267443</span>
<span class="k">table</span> <span class="o">|</span> <span class="n">ddl_test</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>What is not changing in the above example is the transaction that generated the tuple in <code class="language-plaintext highlighter-rouge">pg_class</code>: it is always <code class="language-plaintext highlighter-rouge">2359180423</code>, before and after the <code class="language-plaintext highlighter-rouge">ANALYZE</code> command (that did run in a transaction).
<br />
Why?
<br />
Well, <code class="language-plaintext highlighter-rouge">ANALYZE</code> <em>hits</em> another table: <code class="language-plaintext highlighter-rouge">pg_statistic</code>. This catalog stores the per-column statistics used by the planner (the ones exposed through the <code class="language-plaintext highlighter-rouge">pg_stats</code> view), and it is the one updated by <code class="language-plaintext highlighter-rouge">ANALYZE</code>. This can be clearly inspected with a similar query:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">xmin</span><span class="p">,</span> <span class="n">age</span><span class="p">(</span> <span class="n">xmin</span> <span class="p">),</span> <span class="n">staattnum</span>
<span class="k">FROM</span> <span class="n">pg_statistic</span>
<span class="k">WHERE</span> <span class="n">starelid</span> <span class="o">=</span> <span class="s1">'ddl_test'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="n">xmin</span> <span class="o">|</span> <span class="n">age</span> <span class="o">|</span> <span class="n">staattnum</span>
<span class="c1">------------+-----+-----------</span>
<span class="mi">2359180437</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2359180437</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">2359180437</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">4</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Please note that the query has been run as a superuser, because reading <code class="language-plaintext highlighter-rouge">pg_statistic</code> requires elevated privileges. The result set is made of three rows because there are three “active” (i.e., not dropped) columns within the table, and all of them have been modified (from a statistics point of view) by <code class="language-plaintext highlighter-rouge">ANALYZE</code>, which ran in transaction <code class="language-plaintext highlighter-rouge">2359180437</code>, now one transaction away (i.e., it was the previous transaction).</p>
<h1 id="conclusions">Conclusions</h1>
<p>Keeping track of commit timestamps could be useful for database introspection, at least to get a glance at <strong>when</strong> things changed.
<br />
The same trick can also be used against regular table tuples, to get an idea of when a tuple appeared in that form in the table.
<br />
However, this is not a very robust approach, and something more complete can be built, for example by using the already mentioned event triggers.
<br />
<em>But hey, this is PostgreSQL: you can extend it in pretty much any direction!</em></p>
pgenv `config migrate`2021-11-24T00:00:00+00:00https://fluca1978.github.io/2021/11/24/pgenvconfig<p><code class="language-plaintext highlighter-rouge">pgenv</code> 1.2.1 introduces a different configuration setup.</p>
<h1 id="pgenv-config-migrate"><code class="language-plaintext highlighter-rouge">pgenv config migrate</code></h1>
<p>Just a few hours after I blogged about some new cool features in <a href="https://github.com/theory/pgenv" target="_blank"><code class="language-plaintext highlighter-rouge">pgenv</code></a>, I completed the work about <em>configuration in one place</em>.
<br />
Now <code class="language-plaintext highlighter-rouge">pgenv</code> will keep all configuration files in a single directory, named <code class="language-plaintext highlighter-rouge">config</code>. This is useful because it allows you to easily back up and/or migrate all the configuration from one machine to another.
<br />
But that’s not all: since the configuration is now under a single directory, the name of each configuration file has changed. Before this release, a configuration file was named like <code class="language-plaintext highlighter-rouge">.pgenv.PGVERSION.conf</code>, with the <code class="language-plaintext highlighter-rouge">.pgenv</code> prefix that both made the file hidden and stated which application the file belongs to. Since the configuration files now live in a subdirectory, the prefix has been dropped, so that every configuration file is now simply named <code class="language-plaintext highlighter-rouge">PGVERSION.conf</code>, for example <code class="language-plaintext highlighter-rouge">10.4.conf</code>.
<br />
And since we like to make things easy, there is a <code class="language-plaintext highlighter-rouge">config migrate</code> command that helps you move your existing configuration from the <em>old</em> naming scheme to the <em>new</em> one:</p>
<p><br />
<br /></p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv config migrate
Migrated 3 configuration file<span class="o">(</span>s<span class="o">)</span> from previous versions <span class="o">(</span>0 not migrated<span class="o">)</span>
Your configuration file<span class="o">(</span>s<span class="o">)</span> are now into <span class="o">[</span>~/git/misc/PostgreSQL/pgenv/config]
</code></pre></div></div>
<p><br />
<br /></p>
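<p>As a rough illustration of the renaming described above, the mapping from an old file name to a new one can be expressed with a Bash parameter expansion. This is only a sketch and not the code that <code class="language-plaintext highlighter-rouge">pgenv</code> actually runs: the function name <code class="language-plaintext highlighter-rouge">new_config_name</code> is made up for this example.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper, not part of pgenv: map an old-style
# configuration file name to the new naming scheme.
new_config_name() {
    local old_name="$1"
    # strip the leading ".pgenv." prefix and place the file under config/
    echo "config/${old_name#.pgenv.}"
}

new_config_name ".pgenv.10.4.conf"    # prints config/10.4.conf
</code></pre></div></div>
<p><br />
<br /></p>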
<p>Let’s have fun with <code class="language-plaintext highlighter-rouge">pgenv</code>!</p>
New features in pgenv2021-11-18T00:00:00+00:00https://fluca1978.github.io/2021/11/18/pgenv12<p><code class="language-plaintext highlighter-rouge">pgenv</code> 1.2 introduces a few nice features.</p>
<h1 id="new-features-in-pgenv">New features in <code class="language-plaintext highlighter-rouge">pgenv</code></h1>
<p><a href="https://github.com/theory/pgenv" target="_blank"><code class="language-plaintext highlighter-rouge">pgenv</code></a> is a great tool to simply manage different binary installations of PostgreSQL.
<br />
It is a shell script, specifically designed for the Bash shell, that provides a single command named <code class="language-plaintext highlighter-rouge">pgenv</code> that accepts sub-commands to fetch, configure, install, start and stop different PostgreSQL versions on the same machine.
<br />
It is not designed to be used in production or in an enterprise environment, even if it could, but rather it is designed to be used as a compact and simple way to switch between different versions in order to test applications and libraries.
<br />
<br />
In the last few weeks, there has been quite some work around <code class="language-plaintext highlighter-rouge">pgenv</code>, most notably:</p>
<ul>
<li>support for multiple configuration flags;</li>
<li>consistent behavior about configuration files.</li>
</ul>
<p><br />
In the following, I briefly describe each of the above.</p>
<h2 id="support-for-multiple-configuration-flags">Support for multiple configuration flags</h2>
<p><code class="language-plaintext highlighter-rouge">pgenv</code> supports configuration files, where you can store shell variables that drive the PostgreSQL build and configuration. One problem <code class="language-plaintext highlighter-rouge">pgenv</code> had was due to a limitation of plain shell variables: since each one holds a single string that gets split on spaces when expanded, passing an option that itself <em>contains spaces</em> was not possible. This made build flags, e.g., <code class="language-plaintext highlighter-rouge">CFLAGS</code>, hard if not impossible to write.
<br />
With <a href="https://github.com/theory/pgenv/commit/e7e289cea8c3a232d51e06af93fc798d01c8a36b" target="_blank">this commit</a>, <em>David</em> (the original author) introduced the capability to <strong>configure options containing spaces</strong>. The trick was to switch from simple environment variables to Bash arrays, so that the configuration can be written as</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">PGENV_CONFIGURE_OPTIONS</span><span class="o">=(</span>
<span class="nt">--with-perl</span>
<span class="nt">--with-openssl</span>
<span class="s1">'CFLAGS=-I/opt/local/opt/openssl/include -I/opt/local/opt/libxml2/include'</span>
<span class="s1">'LDFLAGS=-L/opt/local/opt/openssl/lib -L/opt/local/opt/libxml2/lib'</span>
<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br />
where the <code class="language-plaintext highlighter-rouge">CFLAGS</code> and <code class="language-plaintext highlighter-rouge">LDFLAGS</code> both contain spaces.
<br />
For consistency, the change also renamed a lot of <code class="language-plaintext highlighter-rouge">_OPT_</code> parameters to <code class="language-plaintext highlighter-rouge">_OPTIONS_</code>, to reflect the fact that they can now contain multiple values.</p>
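<p>To see why an array helps, here is a minimal Bash sketch that is independent of <code class="language-plaintext highlighter-rouge">pgenv</code>. The function <code class="language-plaintext highlighter-rouge">count_args</code>, the variable names and the include paths are all made up for the example; the point is only to compare how many arguments reach a command in the two cases.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Print how many arguments were received.
count_args() {
    echo "$#"
}

# A plain string: when expanded unquoted, the shell splits it on spaces,
# so the single CFLAGS option is broken into two separate arguments.
opts_string='--with-perl CFLAGS=-I/opt/a/include -I/opt/b/include'
count_args $opts_string          # prints 3

# A Bash array: each element is kept as one argument, spaces included.
opts_array=(
    --with-perl
    'CFLAGS=-I/opt/a/include -I/opt/b/include'
)
count_args "${opts_array[@]}"    # prints 2
</code></pre></div></div>
<p><br />
<br /></p>
<p>Quoting the plain string would not help either, because the whole value would then be passed as one single argument.</p>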
<h2 id="consistent-behavior-about-configuration-files">Consistent behavior about configuration files</h2>
<p><code class="language-plaintext highlighter-rouge">pgenv</code> exploits a <em>default configuration file</em> when no specific PostgreSQL configuration is found.
The idea is that, if you launch PostgreSQL version <em>x</em>, a <code class="language-plaintext highlighter-rouge">.pgenv.x.conf</code> file is searched for and, if it is not found, the command tries to load the configuration from a default file named <code class="language-plaintext highlighter-rouge">.pgenv.default.conf</code>.
<br />
However, when you deleted a configuration, the system also removed the default configuration.
<br />
Therefore, <a href="https://github.com/theory/pgenv/commit/e46d35f5b09a998fddf8bec43a2ac8d6f9fd0402" target="_blank">since this commit</a>, there is more consistency in the usage of the <code class="language-plaintext highlighter-rouge">config</code> subcommand.
<br />
In particular, in order to delete the default configuration you have to specify <code class="language-plaintext highlighter-rouge">config delete default</code> explicitly, since <code class="language-plaintext highlighter-rouge">config delete</code> will no longer nuke your default configuration.
Moreover, the <code class="language-plaintext highlighter-rouge">config init</code> command has been added, so that you can <em>initialize</em> the configuration and then modify it by means of the <code class="language-plaintext highlighter-rouge">config write</code> command. Why these two commands? Well, <code class="language-plaintext highlighter-rouge">config init</code> will create a “default” configuration file from scratch with current default settings, while <code class="language-plaintext highlighter-rouge">config write</code> will modify the specified configuration.</p>
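<p>The lookup rule described above (version-specific file first, default file as a fallback) can be sketched in a few lines of shell. This is an illustration under assumptions, not the real <code class="language-plaintext highlighter-rouge">pgenv</code> code: the function <code class="language-plaintext highlighter-rouge">pick_config_file</code> and its arguments are hypothetical.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical sketch, not the actual pgenv code: print the configuration
# file to load for a given version, falling back to the default one.
pick_config_file() {
    local dir="$1" version="$2"
    if [ -f "$dir/.pgenv.$version.conf" ]; then
        echo "$dir/.pgenv.$version.conf"
    elif [ -f "$dir/.pgenv.default.conf" ]; then
        echo "$dir/.pgenv.default.conf"
    else
        return 1    # no configuration available at all
    fi
}

# Usage, in a throw-away directory:
demo_dir=$(mktemp -d)
touch "$demo_dir/.pgenv.default.conf"
pick_config_file "$demo_dir" 13.4    # prints the default file
touch "$demo_dir/.pgenv.13.4.conf"
pick_config_file "$demo_dir" 13.4    # prints the version-specific file
rm -r "$demo_dir"
</code></pre></div></div>
<p><br />
<br /></p>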
<h2 id="there-is-more">There is more…</h2>
<p>I’m currently working on another change in the configuration subsystem, so that you can keep all the configuration files in a single directory. The idea is to ease the migration of <code class="language-plaintext highlighter-rouge">pgenv</code> to a different machine (e.g., a new one), keeping your own configuration.</p>
My Perl Weekly Challenge Solutions in PostgreSQL2021-11-09T00:00:00+00:00https://fluca1978.github.io/2021/11/09/PWCPostgreSQL<p>Pushing PostgreSQL solutions to my own repositories.</p>
<h1 id="my-perl-weekly-challenge-solutions-in-postgresql">My Perl Weekly Challenge Solutions in PostgreSQL</h1>
<p>Starting back at Perl Weekly Challenge 136, I decided to try to implement, whenever possible (to me), the challenges not only in Raku (i.e., Perl 6), but also in PostgreSQL (either pure SQL or <code class="language-plaintext highlighter-rouge">plpgsql</code>).
<br />
<br />
Recently, I modified my <em>sync script</em> that drags solutions from the <a href="https://github.com/manwar/perlweeklychallenge-club" target="_blank">official Perl Weekly Challenge repository</a> to my own repositories, and of course, I added a way to synchronize PostgreSQL solutions.
<br />
<br />
The solutions are now available on <a href="https://github.com/fluca1978/fluca1978-pg-utils/tree/master/PWC" target="_blank">GitHub under the <code class="language-plaintext highlighter-rouge">PWC</code> directory of my <em>PostgreSQL examples</em> repository</a>.</p>
PostgreSQL USB Sticks in the Attic!2021-11-08T00:00:00+00:00https://fluca1978.github.io/2021/11/08/ITPUG_USB_Sticks<p>USB sticks I found in the attic…</p>
<h1 id="postgresql-usb-sticks-in-the-attic">PostgreSQL USB Sticks in the Attic!</h1>
<p><strong>TLDR: this is not a technical post!</strong>
<br />
<br />
Cleaning the attic, I found a couple of old <em>PostgreSQL USB Sticks</em>.
<br />
It happened that, back at the Italian PostgreSQL Day (PGDay.IT) 2012, we (at the time I was a happy member of ITPUG) created PostgreSQL-branded USB sticks to give away as gadgets to participants.
<br />
The USB stick was cool, with a soft rubber envelope, a clear white and blue elephant logo on its sides, a size of <code class="language-plaintext highlighter-rouge">4 GB</code> (which was quite common back then) and a necklace.
<br />
However, it had something that I didn’t like.
<br />
So, when I was the ITPUG president back in 2013, I decided to change the design of the USB stick (as well as doubling its size).
<br />
Let’s inspect the differences, and please forgive me if the printing on the sticks is no longer clear, but well, some years have gone by:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/ITPUG/itpug_usb_1.png" />
<img src="/images/posts/ITPUG/itpug_usb_2.png" />
</center>
<p><br />
<br /></p>
<p>The upper stick is the 2012 edition, the lower one is the 2013 edition.
<br />
Do you spot the difference?
<br />
Yes, <strong>the 2013 edition USB stick did have the PostgreSQL logo on one side and the ITPUG logo on the other side</strong>, while the 2012 edition did not have any reference to the organizing and local user group ITPUG!
<br />
<br />
When I decided to give a new spark to ITPUG, I also decided to improve its visibility via such gadgets, which were too generic and, for this reason, reusable at other events as plain PostgreSQL-related gadgets.
<br />
<br />
Therefore, the new gadget presented both PostgreSQL and the Italian users’ group, no shame at all!</p>
pg_upgrade and OpenBSD2021-11-05T00:00:00+00:00https://fluca1978.github.io/2021/11/05/PostgreSQL_pg_upgrade_OpenBSD<p>OpenBSD ships <code class="language-plaintext highlighter-rouge">pg_upgrade</code> as a separate package.</p>
<h1 id="pg_upgrade-and-openbsd">pg_upgrade and OpenBSD</h1>
<p>I had never noticed that, on OpenBSD, the <code class="language-plaintext highlighter-rouge">pg_upgrade</code> command is not shipped with the <em>default</em> PostgreSQL server installation. I usually install PostgreSQL from sources, so I never dug into OpenBSD packages. The choice of OpenBSD is to keep <code class="language-plaintext highlighter-rouge">pg_upgrade</code> separate from the rest of the binaries and executables of PostgreSQL.
<br />
Allow me to explain, and let’s start from the installed binaries on an OpenBSD 7.0 machine:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">ls</span> <span class="nt">-1</span> /usr/local/bin/pg<span class="k">*</span>
/usr/local/bin/pg_archivecleanup
/usr/local/bin/pg_basebackup
/usr/local/bin/pg_checksums
/usr/local/bin/pg_config
/usr/local/bin/pg_controldata
/usr/local/bin/pg_ctl
/usr/local/bin/pg_dump
/usr/local/bin/pg_dumpall
/usr/local/bin/pg_isready
/usr/local/bin/pg_receivewal
/usr/local/bin/pg_recvlogical
/usr/local/bin/pg_resetwal
/usr/local/bin/pg_restore
/usr/local/bin/pg_rewind
/usr/local/bin/pg_standby
/usr/local/bin/pg_test_fsync
/usr/local/bin/pg_test_timing
/usr/local/bin/pg_verifybackup
/usr/local/bin/pg_waldump
/usr/local/bin/pgbench
</code></pre></div></div>
<p><br />
<br /></p>
<p>The server is a PostgreSQL 13.4, installed via <code class="language-plaintext highlighter-rouge">pkg_add</code>. The PostgreSQL <em>contrib</em> module is installed, but as you can see, there is no <code class="language-plaintext highlighter-rouge">pg_upgrade</code> binary in the above listing.
<br />
Let’s inspect the packages:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pkg_info <span class="nt">-Q</span> postgresql
postgresql-client-13.4p0 <span class="o">(</span>installed<span class="o">)</span>
postgresql-contrib-13.4p0 <span class="o">(</span>installed<span class="o">)</span>
postgresql-docs-13.4p0 <span class="o">(</span>installed<span class="o">)</span>
postgresql-odbc-10.02.0000p0
postgresql-pg_upgrade-13.4p0
postgresql-pllua-2.0.7
postgresql-plpython-13.4p0
postgresql-plr-8.4.1
postgresql-server-13.4p0 <span class="o">(</span>installed<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Please note the <code class="language-plaintext highlighter-rouge">postgresql-pg_upgrade-13.4p0</code> package, which is the one that contains the <code class="language-plaintext highlighter-rouge">pg_upgrade</code> command:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pkg_info postgresql-pg_upgrade-13.4p0
Information <span class="k">for </span>https://cdn.openbsd.org/pub/OpenBSD/7.0/packages/amd64/postgresql-pg_upgrade-13.4p0.tgz
Comment:
Support <span class="k">for </span>upgrading PostgreSQL data from previous version
Description:
Contains pg_upgrade, used <span class="k">for </span>upgrading PostgreSQL database
directories to newer major versions without requiring a dump and
reload.
Maintainer: Pierre-Emmanuel Andre <[email protected]>
WWW: https://www.postgresql.org/
</code></pre></div></div>
<p><br />
<br /></p>
<p>This choice of packaging is somewhat strange.
<br />
Let’s install <code class="language-plaintext highlighter-rouge">pg_upgrade</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% doas pkg_add postgresql-pg_upgrade
quirks-4.53 signed on 2021-10-30T11:32:24Z
postgresql-pg_upgrade-13.4p0:postgresql-previous-12.8: ok
postgresql-pg_upgrade-13.4p0: ok
% <span class="nb">ls</span> <span class="nt">-lh</span> <span class="si">$(</span>which pg_upgrade<span class="si">)</span>
<span class="nt">-rwxr-xr-x</span> 1 root bin 185K Sep 26 21:25 /usr/local/bin/pg_upgrade
</code></pre></div></div>
<p><br />
<br /></p>
<p>So, the binary itself is tiny, weighing in at <code class="language-plaintext highlighter-rouge">185 kB</code>, therefore placing it in its own package makes little sense as far as disk space is concerned. However, please note that installing <code class="language-plaintext highlighter-rouge">pg_upgrade</code> also triggered the installation of <code class="language-plaintext highlighter-rouge">postgresql-previous-12.8</code>, which means the system has also installed PostgreSQL 12.8.
<br />
This is clearly shown by querying that package:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pkg_info postgresql-previous-12.8
Information <span class="k">for </span>inst:postgresql-previous-12.8
Comment:
PostgreSQL RDBMS <span class="o">(</span>previous version, <span class="k">for </span>pg_upgrade<span class="o">)</span>
Required by:
postgresql-pg_upgrade-13.4p0
Description:
PostgreSQL RDBMS server, the previous version
This is the previous version of PostgreSQL, necessary to allow <span class="k">for
</span>pg_upgrade to work <span class="k">in </span>the currently supported PostgreSQL version.
</code></pre></div></div>
<p><br />
<br /></p>
<p>And in fact, the package installs the <em>whole</em> previous version of PostgreSQL, including libraries and executables:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pkg_info <span class="nt">-L</span> postgresql-previous-12.8 | <span class="nb">grep </span>bin
/usr/local/bin/postgresql-12/clusterdb
/usr/local/bin/postgresql-12/createdb
/usr/local/bin/postgresql-12/createuser
/usr/local/bin/postgresql-12/dropdb
/usr/local/bin/postgresql-12/dropuser
/usr/local/bin/postgresql-12/ecpg
/usr/local/bin/postgresql-12/initdb
/usr/local/bin/postgresql-12/oid2name
/usr/local/bin/postgresql-12/pg_archivecleanup
/usr/local/bin/postgresql-12/pg_basebackup
/usr/local/bin/postgresql-12/pg_checksums
/usr/local/bin/postgresql-12/pg_config
...
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>Therefore, installing <code class="language-plaintext highlighter-rouge">pg_upgrade</code> will also install the <em>whole</em> previous major version of PostgreSQL.</strong></p>
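<p>Having the previous binaries under <code class="language-plaintext highlighter-rouge">/usr/local/bin/postgresql-12</code> is what allows <code class="language-plaintext highlighter-rouge">pg_upgrade</code> to be pointed at them by means of its <code class="language-plaintext highlighter-rouge">-b</code> (old binaries) option. The following sketch only <em>prints</em> a possible command line, so that it can be reviewed before running anything. The helper <code class="language-plaintext highlighter-rouge">print_pg_upgrade_cmd</code> is hypothetical, and the two data directory paths are assumptions that must be adapted to the real system.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: print (without running it) a pg_upgrade command line.
# --check asks pg_upgrade to only verify the clusters, changing no data.
print_pg_upgrade_cmd() {
    local old_bin="$1" new_bin="$2" old_data="$3" new_data="$4"
    echo "pg_upgrade --check -b $old_bin -B $new_bin -d $old_data -D $new_data"
}

# The binary directories come from the listings above;
# the data directories are assumptions made for this example.
print_pg_upgrade_cmd /usr/local/bin/postgresql-12 /usr/local/bin \
    /var/postgresql/data-12 /var/postgresql/data
</code></pre></div></div>
<p><br />
<br /></p>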
<h2 id="it-was-a-separated-packages-since-a-while">It has been a separate package for a while…</h2>
<p>Inspecting the CVS of the ports tree, it is possible to note that the <a href="https://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/databases/postgresql/pkg/DESCR-pg_upgrade?rev=1.1&content-type=text/x-cvsweb-markup" target="_blank"><code class="language-plaintext highlighter-rouge">pg_upgrade</code> command has been separated into a <em>subpackage</em> since 2016</a>:</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>This moves pg_upgrade to a subpackage, and has that
subpackage depend on postgresql-previous.
</code></pre></div></div>
<p><br />
<br /></p>
<p>In fact, <a href="https://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/databases/postgresql/Makefile.diff?r1=1.218&r2=1.219&f=h" target="_blank">this is the commit</a> that made <code class="language-plaintext highlighter-rouge">pg_upgrade</code> a distinct package in the build system.
<br />
The rationale for this can be found <a href="https://undeadly.org/cgi?action=article;sid=20161112112023" target="_blank">in the b2k16 hackathon article</a>, where Jeremy Evans explains that, in order to get <code class="language-plaintext highlighter-rouge">pg_upgrade</code> to work, there was the need to have the previous binaries for PostgreSQL. Therefore, the application has been moved to a different package, so that it can also install the previous binaries on the system.</p>
<h1 id="conclusions">Conclusions</h1>
<p>Keeping <code class="language-plaintext highlighter-rouge">pg_upgrade</code> as a separate package is a choice. I don’t think it is right or wrong, it is just a choice that ensures that, if you decide to upgrade to a newer PostgreSQL, the previous version to upgrade from is available on the system.
<br />
Quite frankly, I don’t see the need for it, because I could have a different database version on the system that I want to upgrade from, even one that I did not install from ports.
<br />
Moreover, <code class="language-plaintext highlighter-rouge">pg_upgrade</code> can upgrade PostgreSQL across non-sequential versions, even if I personally don’t recommend this, especially if the “hole” in versioning is big. This means that installing the immediately previous version of PostgreSQL may not be the right choice in every scenario. Again, <em>this is neither a good nor a bad choice, it is just a choice</em>, and it must be noted that, unlike other operating systems, OpenBSD does not offer old versions of PostgreSQL as packages (if we exclude the <code class="language-plaintext highlighter-rouge">-previous</code> package), which means the choice is coherent with the philosophy of the operating system.</p>
Perl Weekly Challenge 136: PostgreSQL Solutions2021-10-29T00:00:00+00:00https://fluca1978.github.io/2021/10/29/PerlWeeklyChallenge136PostgreSQL<p>My personal solutions to the Perl Weekly Challenge, this time in PostgreSQL!</p>
<h1 id="perl-weekly-challenge-136-postgresql-solutions">Perl Weekly Challenge 136: PostgreSQL Solutions</h1>
<p>Wait a minute, what the hell is going on? A Perl challenge and PostgreSQL?
<br />
Well, it is almost two years now since I started participating regularly in the <a href="https://perlweeklychallenge.org/" target="_blank">Perl Weekly Challenge</a>, and I always solve the tasks in Raku (aka Perl 6).
<br />
Today I decided to spend a few minutes in order to try to solve the assigned tasks in PostgreSQL. And I tried to solve them in an SQL way: declaratively.
<br />
<br />
So here are my solutions in PostgreSQL for the <a href="https://perlweeklychallenge.org/blog/perl-weekly-challenge-0136/" target="_blank">Challenge 136</a>.</p>
<p><br /></p>
<ul>
<li><a href="#task1">Task 1</a></li>
<li><a href="#task2">Task 2</a></li>
</ul>
<p><a name="task1"></a></p>
<h2 id="pwc-136---task-1">PWC 136 - Task 1</h2>
<p>The first task asked to find out if two numbers are friends, meaning that their <em>greatest common divisor</em> should be a positive power of <code class="language-plaintext highlighter-rouge">2</code>. This is quite easy to implement in pure SQL:</p>
<p><br />
<br /></p>
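<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">friendly</span><span class="p">(</span> <span class="n">m</span> <span class="nb">int</span><span class="p">,</span> <span class="n">n</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="c1">-- a power of two greater than 1 has exactly one bit set</span>
<span class="k">SELECT</span>
<span class="k">CASE</span>
<span class="k">WHEN</span> <span class="n">gcd</span><span class="p">(</span> <span class="n">m</span><span class="p">,</span> <span class="n">n</span> <span class="p">)</span> <span class="o">></span> <span class="mi">1</span> <span class="k">AND</span> <span class="p">(</span> <span class="n">gcd</span><span class="p">(</span> <span class="n">m</span><span class="p">,</span> <span class="n">n</span> <span class="p">)</span> <span class="o">&</span> <span class="p">(</span> <span class="n">gcd</span><span class="p">(</span> <span class="n">m</span><span class="p">,</span> <span class="n">n</span> <span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span> <span class="mi">1</span>
<span class="k">ELSE</span> <span class="mi">0</span>
<span class="k">END</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="k">SQL</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br />
The <code class="language-plaintext highlighter-rouge">gcd</code> function finds the greatest common divisor. A positive power of <code class="language-plaintext highlighter-rouge">2</code> is a value greater than <code class="language-plaintext highlighter-rouge">1</code> with exactly one bit set, so the bitwise AND between the gcd and the gcd minus one must be <code class="language-plaintext highlighter-rouge">0</code>. A plain parity test with the modulo <code class="language-plaintext highlighter-rouge">%</code> operator would not be enough, since an even gcd such as <code class="language-plaintext highlighter-rouge">6</code> is not a power of <code class="language-plaintext highlighter-rouge">2</code>.</p>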
<p><a name="task2"></a></p>
<h2 id="pwc-136---task-2">PWC 136 - Task 2</h2>
<p>The second task was much more complicated to solve, and required, at least for me, a little <em>try-and-modify</em> approach. Given a specific value, we need to find all the unique combinations of numbers within the Fibonacci sequence whose sum is that value.
<br />
I decided to solve it via a <code class="language-plaintext highlighter-rouge">RECURSIVE</code> Common Table Expression (CTE), since I need to produce a Fibonacci series:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">CREATE OR REPLACE FUNCTION fibonacci_sum( l int DEFAULT 16 )
RETURNS bigint
AS $CODE$
WITH RECURSIVE
fibonacci( n, p ) AS
(
SELECT 1, 1
UNION
SELECT p + n, n
FROM fibonacci
WHERE n < l
)
, permutations AS
(
SELECT n::text AS current_value, n as total_sum
FROM fibonacci
UNION
SELECT current_value || ',' || n, total_sum + n
FROM permutations, fibonacci
WHERE
position( n::text in current_value ) = 0
AND n > ALL( string_to_array( current_value, ',' )::int[] )
)
SELECT count(*)
FROM permutations
WHERE total_sum = l
;
$CODE$
LANGUAGE SQL;
</code></pre>
<p><br />
<br /></p>
<p>The searched value is the argument to the function, that is <code class="language-plaintext highlighter-rouge">l</code>.
<br />
The first part of the CTE computes the Fibonacci sequence only until it reaches <code class="language-plaintext highlighter-rouge">l</code>: all the larger values can be thrown away, since any sum including them would be greater than <code class="language-plaintext highlighter-rouge">l</code>.
<br />
The <code class="language-plaintext highlighter-rouge">permutations</code> CTE produces two columns: the comma separated list of Fibonacci values picked so far and their running sum; at every step one more value from the Fibonacci sequence is appended to the list and added to the sum. Note the <code class="language-plaintext highlighter-rouge">WHERE</code> clause:</p>
<ul>
<li>the <code class="language-plaintext highlighter-rouge">position</code> function checks that the number has not already been inserted in the list;</li>
<li>the <code class="language-plaintext highlighter-rouge">n > ALL</code> condition keeps only ascending lists, that is <code class="language-plaintext highlighter-rouge">3,5</code> is a good list, but <code class="language-plaintext highlighter-rouge">5,3</code> is not because the appended <code class="language-plaintext highlighter-rouge">3</code> is not greater than <code class="language-plaintext highlighter-rouge">5</code>.</li>
</ul>
<p>Thanks to the trick of considering only ordered sequences, I can trim out all the sequences that produce the same sum, with the same numbers, in a different order. For example <code class="language-plaintext highlighter-rouge">3, 13</code> and <code class="language-plaintext highlighter-rouge">13,3</code> produce the same value, but only the first one is kept.
<br />
At this point, it suffices to count the tuples in <code class="language-plaintext highlighter-rouge">permutations</code> whose sum is <code class="language-plaintext highlighter-rouge">l</code> to get the final answer of the task: how many combinations of numbers in the Fibonacci series add up to <code class="language-plaintext highlighter-rouge">l</code>.</p>
<h1 id="conclusions">Conclusions</h1>
<p>Clearly PostgreSQL provides all the features to implement <em>program-like</em> behaviors in a declarative way. Of course, the above solutions are neither the best nor the most efficient that can be implemented, but they demonstrate how powerful PostgreSQL (and, more in general, SQL) can be at solving tasks where a few nested loops seem the simplest approach!</p>
pspg lands in OpenBSD2021-10-28T00:00:00+00:00https://fluca1978.github.io/2021/10/28/pspgOpenBSD<p>A great pager in a great operating system.</p>
<h1 id="pspg-lands-in-openbsd">pspg lands in OpenBSD</h1>
<p><a href="https://github.com/okbob/pspg" target="_blank"><code class="language-plaintext highlighter-rouge">pspg</code></a> is a great pager specifically designed for PostgreSQL, or better, for <code class="language-plaintext highlighter-rouge">psql</code>, the default and powerful text client for PostgreSQL databases.
<br />
But <code class="language-plaintext highlighter-rouge">pspg</code> is more than simply a <em>pager for PostgreSQL</em>: it is a general purpose pager for tabular data.
<br />
<br />
It happened that a few weeks ago I was using an OpenBSD system, and since I had to do some work with PostgreSQL, I decided to install <code class="language-plaintext highlighter-rouge">pspg</code> to get some advantages. Unfortunately, there was no package for OpenBSD, and most notably, no port in the ports tree.
<br />
Therefore, the only chance to install <code class="language-plaintext highlighter-rouge">pspg</code> was to compile it from sources, but I failed. I <a href="https://github.com/okbob/pspg/issues/189" target="_blank">opened an issue</a> to get some help, and after some assistance, I decided to dig deeper. So <a href="https://marc.info/?l=openbsd-misc&m=163402630004343&w=2" target="_blank">I asked for help on the <code class="language-plaintext highlighter-rouge">misc</code> OpenBSD mailing list</a> and got much more than I was expecting: not only did I solve the problem of how to install <code class="language-plaintext highlighter-rouge">pspg</code>, but the application was noticed and a new port was proposed.
<br />
In fact, another Italian guy, Omar, <a href="https://marc.info/?l=openbsd-ports&m=163404042013173&w=2" target="_blank">prepared and proposed a <code class="language-plaintext highlighter-rouge">pspg</code> port</a>, and after a few days the <a href="https://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/databases/pspg/" target="_blank">port got included into the ports tree</a>!
<br />
<br />
What does that mean? That, at least at the time of writing, you can get <code class="language-plaintext highlighter-rouge">pspg</code> installed on OpenBSD via the ports:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">cd</span> /usr/ports/databases/pspg
% doas make <span class="nb">install</span>
<span class="o">===></span> pspg-5.4.0 depends on: postgresql-client-<span class="k">*</span> -> postgresql-client-13.4p0
<span class="o">===></span> pspg-5.4.0 depends on: readline-<span class="k">*</span> -> readline-7.0p0
<span class="o">===></span> pspg-5.4.0 depends on: metaauto-<span class="k">*</span> -> metaauto-1.0p4
<span class="o">===></span> pspg-5.4.0 depends on: autoconf-2.69 -> autoconf-2.69p3
<span class="o">===></span> pspg-5.4.0 depends on: gmake-<span class="k">*</span> -> gmake-4.3
<span class="o">===></span> Verifying specs: c curses ereadline m panel pq
<span class="o">===></span> found c.96.1 curses.14.0 ereadline.2.0 m.10.1 panel.6.0 pq.6.12
<span class="o">===></span> Installing pspg-5.4.0 from /usr/ports/packages/amd64/all/
pspg-5.4.0: ok
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is important to note that the ports tree that includes <code class="language-plaintext highlighter-rouge">pspg</code>, at the time of writing, is <code class="language-plaintext highlighter-rouge">-CURRENT</code> (see <a href="https://www.openbsd.org/faq/ports/ports.html" target="_blank">here</a>), and therefore there is still some time to wait before <code class="language-plaintext highlighter-rouge">pspg</code> is available as a package and a port in the <code class="language-plaintext highlighter-rouge">-RELEASE</code> ports tree.</p>
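<p>Once installed, <code class="language-plaintext highlighter-rouge">pspg</code> still has to be selected as the pager for <code class="language-plaintext highlighter-rouge">psql</code>. A minimal sketch, assuming a Bourne-like shell and that you want it only for <code class="language-plaintext highlighter-rouge">psql</code>, is to set the <code class="language-plaintext highlighter-rouge">PSQL_PAGER</code> environment variable, which <code class="language-plaintext highlighter-rouge">psql</code> checks before the generic <code class="language-plaintext highlighter-rouge">PAGER</code>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Use pspg as the pager for psql only; other programs keep using $PAGER.
# Put this line in your shell profile to make it permanent.
export PSQL_PAGER="pspg"
</code></pre></div></div>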
<h2 id="great-openbsd-job">Great OpenBSD Job!</h2>
<p>I must say that I was astonished by the great work done by <a href="https://www.omarpolo.com/" target="_blank">Omar</a> and the other OpenBSD volunteers to get <code class="language-plaintext highlighter-rouge">pspg</code> into the ports tree.</p>
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">pspg</code> is a very useful and interesting pager for tabular-like data, and of course this includes the output of PostgreSQL’s <code class="language-plaintext highlighter-rouge">psql</code> command line client.
<br />
With a bit of luck, patience, and the effort of the OpenBSD community, this program will soon be available on OpenBSD as a package too!</p>
Installing PostgreSQL on OpenBSD2021-10-09T00:00:00+00:00https://fluca1978.github.io/2021/10/09/PostgreSQLOnOpenBSD<p>A quick look at how to get PostgreSQL up and running on OpenBSD.</p>
<h1 id="installing-postgresql-on-openbsd">Installing PostgreSQL on OpenBSD</h1>
<p>OpenBSD is a rock solid, super secure, real Unix operating system.
<br />
PostgreSQL is a rock solid, enterprise level, fully featured relational database.
<br />
Is it possible to merge the two for a great database on such operating system? <em>Yes</em>, of course!
<br />
<br />
OpenBSD has a packaging system that is somewhat different from many other operating systems; in particular the packages are deeply inspected before they are installed, so that the installation process proceeds only if it is really sure the installation can succeed. Moreover, the operating system provides a simple and flexible way to manage services like PostgreSQL.
<br />
In this short article, I will show how you can start working with PostgreSQL on OpenBSD.</p>
<h2 id="packages">Packages</h2>
<p>OpenBSD uses the <code class="language-plaintext highlighter-rouge">pkg_xxx</code> tools, a set of intelligent Perl applications that handle all the packaging mechanics. While it is true that you can install an application from <em>ports</em>, as on other BSDs, OpenBSD recommends installing via packages, because the system can easily track what you have installed so far and, consequently, handle updates.
<br />
<br />
The first thing to do is therefore to search for some PostgreSQL related package, and this is done by means of <code class="language-plaintext highlighter-rouge">pkg_info</code> command, with the particular <code class="language-plaintext highlighter-rouge">-Q</code> flag (for “query”):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# pkg_info <span class="nt">-Q</span> postgresql
debug-dovecot-postgresql-2.3.15v0
debug-dovecot-postgresql-2.3.16v0
dovecot-postgresql-2.3.15v0
dovecot-postgresql-2.3.16v0
postgresql-client-13.4
postgresql-contrib-13.4
postgresql-docs-13.4
postgresql-pg_upgrade-13.4
postgresql-plpython-13.4
postgresql-previous-12.8
postgresql-server-13.4
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the currently supported version of PostgreSQL is 13.4, while version 14 has already been out for a few days (at the time of writing).
<br />
Installing packages is done by means of <code class="language-plaintext highlighter-rouge">pkg_add</code>, and in this case there is no particular <em>flavour</em> (i.e., configuration or stack) required, so it is as simple as:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# pkg_add postgresql-server-13.4 postgresql-client-13.4 postgresql-contrib-13.4 postgresql-docs-13.4
quirks-3.633 signed on 2021-10-05T18:48:49Z
postgresql-server-13.4:libexecinfo-0.3p2v0: ok
postgresql-server-13.4:xz-5.2.5: ok
postgresql-server-13.4:libiconv-1.16p0: ok
postgresql-server-13.4:libxml-2.9.10p3: ok
postgresql-server-13.4:postgresql-client-13.4: ok
useradd: Warning: home directory <span class="sb">`</span>/var/postgresql<span class="s1">' doesn'</span>t exist, and <span class="nt">-m</span> was not specified
postgresql-server-13.4: ok
postgresql-contrib-13.4: ok
postgresql-docs-13.4: ok
Running tags: ok
The following new rcscripts were installed: /etc/rc.d/postgresql
See rcctl<span class="o">(</span>8<span class="o">)</span> <span class="k">for </span>details.
New and changed readme<span class="o">(</span>s<span class="o">)</span>:
/usr/local/share/doc/pkg-readmes/postgresql-server
</code></pre></div></div>
<p><br />
<br /></p>
<p>It takes less than a minute to have all the components installed (and if you are curious, it takes much more time to install Emacs 27.2 without X Window support!).
<br />
It is important to note that a new <em>rc-script</em> has been installed. <em>rc-scripts</em> are a set of well defined Korn Shell based scripts that are used to manage daemons; they act similarly to other init systems like <em>systemd</em> without being, well, so bloated.
<br />
Since the installed script is named <code class="language-plaintext highlighter-rouge">/etc/rc.d/postgresql</code>, the service is named after the file name of the script, that is <code class="language-plaintext highlighter-rouge">postgresql</code>.</p>
<h2 id="start-the-postgresql-server-and-failing">Starting the PostgreSQL Server (and Failing)</h2>
<p>You will not be able to start PostgreSQL just after the installation:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl start postgresql
postgresql<span class="o">(</span>failed<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>To understand why PostgreSQL is not starting, we need to dig a little more into the <em>rc-scripts</em>. First of all, ask the OpenBSD system what it knows about PostgreSQL, which is done through the <code class="language-plaintext highlighter-rouge">rcctl</code> command and its <code class="language-plaintext highlighter-rouge">get</code> subcommand:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl get postgresql
<span class="nv">postgresql_class</span><span class="o">=</span>daemon
<span class="nv">postgresql_flags</span><span class="o">=</span>NO
<span class="nv">postgresql_logger</span><span class="o">=</span>
<span class="nv">postgresql_rtable</span><span class="o">=</span>0
<span class="nv">postgresql_timeout</span><span class="o">=</span>30
<span class="nv">postgresql_user</span><span class="o">=</span>_postgresql
</code></pre></div></div>
<p><br />
<br /></p>
<p>There is not much output from the above command, but essentially it shows that PostgreSQL is disabled system-wide. However, this is not why the start is failing, and in order to discover what is causing the fault, we need to <em>debug</em> the <code class="language-plaintext highlighter-rouge">rcctl</code> execution:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl <span class="nt">-df</span> start postgresql
doing _rc_parse_conf
doing _rc_quirks
postgresql_flags empty, using default <span class="o">></span><span class="nt">-D</span> /var/postgresql/data <span class="nt">-w</span> <span class="nt">-l</span> /var/postgresql/logfile<
doing rc_check
pg_ctl: directory <span class="s2">"/var/postgresql/data"</span> does not exist
postgresql
doing rc_start
doing _rc_wait start
doing rc_check
pg_ctl: directory <span class="s2">"/var/postgresql/data"</span> does not exist
pg_ctl: directory <span class="s2">"/var/postgresql/data"</span> does not exist
doing _rc_rm_runfile
<span class="o">(</span>failed<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Essentially, the system is failing because <strong>the packages did not create the PGDATA directory</strong>, and therefore this must be done manually:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# <span class="nb">mkdir</span> /var/postgresql/data
puffy# <span class="nb">chown </span>_postgresql:_postgresql /var/postgresql/data
puffy# su - _postgresql
puffy<span class="nv">$ </span>initdb <span class="nt">-D</span> /var/postgresql/data
The files belonging to this database system will be owned by user <span class="s2">"_postgresql"</span><span class="nb">.</span>
This user must also own the server process.
The database cluster will be initialized with locale <span class="s2">"C"</span><span class="nb">.</span>
The default database encoding has accordingly been <span class="nb">set </span>to <span class="s2">"SQL_ASCII"</span><span class="nb">.</span>
The default text search configuration will be <span class="nb">set </span>to <span class="s2">"english"</span><span class="nb">.</span>
Data page checksums are disabled.
fixing permissions on existing directory /var/postgresql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 20
selecting default shared_buffers ... 128MB
selecting default <span class="nb">time </span>zone ... Europe/Rome
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling <span class="s2">"trust"</span> authentication <span class="k">for </span><span class="nb">local </span>connections
You can change this by editing pg_hba.conf or using the option <span class="nt">-A</span>, or
<span class="nt">--auth-local</span> and <span class="nt">--auth-host</span>, the next <span class="nb">time </span>you run initdb.
Success. You can now start the database server using:
pg_ctl <span class="nt">-D</span> /var/postgresql/data <span class="nt">-l</span> logfile start
</code></pre></div></div>
<p><br />
<br /></p>
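<p>The failure above boils down to a missing or uninitialized data directory. As a minimal sketch (the function name <code class="language-plaintext highlighter-rouge">pgdata_state</code> is made up for this example), the condition can be checked before asking <code class="language-plaintext highlighter-rouge">rcctl</code> to start the service, relying on the fact that <code class="language-plaintext highlighter-rouge">initdb</code> writes a <code class="language-plaintext highlighter-rouge">PG_VERSION</code> file into every cluster:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: classify a candidate PGDATA directory.
# Prints "missing", "not initialized" or "initialized".
pgdata_state() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"            # what rcctl was complaining about
    elif [ -f "$dir/PG_VERSION" ]; then
        echo "initialized"        # initdb has already been run here
    else
        echo "not initialized"    # directory exists, initdb still to be run
    fi
}

# Example: pgdata_state /var/postgresql/data
</code></pre></div></div>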
<p>Now that the data directory is set up, the system can be started:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl start postgresql
postgresql<span class="o">(</span>ok<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Of course, you can now play around with the freshly installed and fired up PostgreSQL:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# psql <span class="nt">-U</span> _postgresql template1
psql <span class="o">(</span>13.4<span class="o">)</span>
Type <span class="s2">"help"</span> <span class="k">for </span>help.
<span class="nv">template1</span><span class="o">=</span><span class="c"># SHOW SERVER_VERSION;</span>
server_version
<span class="nt">----------------</span>
13.4
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Did you spot the little trick up there? <strong>Since the <code class="language-plaintext highlighter-rouge">initdb</code> has been executed by the <code class="language-plaintext highlighter-rouge">_postgresql</code> user, the database administrator is the <code class="language-plaintext highlighter-rouge">_postgresql</code> user too!</strong></p>
<h2 id="configuring-your-postgresql-the-openbsd-way">Configuring your PostgreSQL, the OpenBSD way!</h2>
<p>The <code class="language-plaintext highlighter-rouge">rcctl</code> machinery is based on a set of variables, which build the flags passed to the daemon in order to configure it. For example, in the default installation, the <code class="language-plaintext highlighter-rouge">PGDATA</code> directory is set to <code class="language-plaintext highlighter-rouge">/var/postgresql/data</code>, but where is this set? Let’s inspect again what the rc-scripts know about the service:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl get postgresql
<span class="nv">postgresql_class</span><span class="o">=</span>daemon
<span class="nv">postgresql_flags</span><span class="o">=</span>NO
<span class="nv">postgresql_logger</span><span class="o">=</span>
<span class="nv">postgresql_rtable</span><span class="o">=</span>0
<span class="nv">postgresql_timeout</span><span class="o">=</span>30
<span class="nv">postgresql_user</span><span class="o">=</span>_postgresql
puffy# rcctl getdef postgresql
<span class="nv">postgresql_class</span><span class="o">=</span>daemon
<span class="nv">postgresql_flags</span><span class="o">=</span><span class="nt">-D</span> /var/postgresql/data <span class="nt">-w</span> <span class="nt">-l</span> /var/postgresql/logfile
<span class="nv">postgresql_logger</span><span class="o">=</span>
<span class="nv">postgresql_rtable</span><span class="o">=</span>0
<span class="nv">postgresql_timeout</span><span class="o">=</span>30
<span class="nv">postgresql_user</span><span class="o">=</span>_postgresql
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">get</code> subcommand reports no flags, but <code class="language-plaintext highlighter-rouge">getdef</code> (for <em>get defaults</em>) reports the default settings of the daemon, which are the ones in effect when nothing has been customized. Clearly, what we are interested in is <code class="language-plaintext highlighter-rouge">postgresql_flags</code>. There are two ways to change the value of an <code class="language-plaintext highlighter-rouge">rcctl</code> variable:</p>
<ul>
<li>editing the rc script by hand;</li>
<li>using <code class="language-plaintext highlighter-rouge">rcctl</code> to set the value.</li>
</ul>
<p>The latter is the preferred way, but, hey, this is Unix, so you can also fire up your favourite editor and change the <code class="language-plaintext highlighter-rouge">/etc/rc.d/postgresql</code> file, which in fact appears as:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# less /etc/rc.d/postgresql
<span class="c">#!/bin/ksh</span>
<span class="c">#</span>
<span class="c"># $OpenBSD: postgresql.rc,v 1.13 2019/08/27 19:49:46 awolk Exp $</span>
<span class="nv">daemon</span><span class="o">=</span><span class="s2">"/usr/local/bin/pg_ctl"</span>
<span class="nv">daemon_flags</span><span class="o">=</span><span class="s2">"-D /var/postgresql/data -w -l /var/postgresql/logfile"</span>
<span class="nv">daemon_user</span><span class="o">=</span><span class="s2">"_postgresql"</span>
<span class="nb">.</span> /etc/rc.d/rc.subr
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>Clearly, editing this file by hand is error prone and must be done with the cluster (i.e., PostgreSQL) not running, or you may end up unable to stop the instance (e.g., after changing the <code class="language-plaintext highlighter-rouge">PGDATA</code>).
<br /></p>
<p>Using <code class="language-plaintext highlighter-rouge">rcctl</code> is better than manually editing the script file, but it has some constraints:</p>
<ul>
<li>the daemon must be system wide enabled;</li>
<li>only <em>rc variables</em> can be edited (i.e., you cannot define your own variables);</li>
<li>a variable is named without the daemon prefix.</li>
</ul>
<p><br />
Therefore, in order to change both the <code class="language-plaintext highlighter-rouge">PGDATA</code> and the logging directory and file, we can edit the <code class="language-plaintext highlighter-rouge">flags</code> variable as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl <span class="nb">enable </span>postgresql
puffy# rcctl <span class="nb">set </span>postgresql flags <span class="s2">"-D /var/postgresql/13/data -l /var/postgresql/13/data/log/postgresql.log"</span>
puffy# rcctl get postgresql
<span class="nv">postgresql_class</span><span class="o">=</span>daemon
<span class="nv">postgresql_flags</span><span class="o">=</span><span class="nt">-D</span> /var/postgresql/13/data <span class="nt">-l</span> /var/postgresql/13/data/log/postgresql.log
<span class="nv">postgresql_logger</span><span class="o">=</span>
<span class="nv">postgresql_rtable</span><span class="o">=</span>0
<span class="nv">postgresql_timeout</span><span class="o">=</span>30
<span class="nv">postgresql_user</span><span class="o">=</span>_postgresql
</code></pre></div></div>
<p><br />
<br /></p>
<p>Of course, you have to create the new <code class="language-plaintext highlighter-rouge">PGDATA</code> and the logging directory by hand, and assign the right ownership to <code class="language-plaintext highlighter-rouge">_postgresql</code>, before you can start the service.</p>
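<p>That manual step can be sketched as a tiny helper. This is only an illustration: the function name <code class="language-plaintext highlighter-rouge">make_pg_dir</code> is made up, and the default owner is the <code class="language-plaintext highlighter-rouge">_postgresql</code> user reported by <code class="language-plaintext highlighter-rouge">rcctl get</code> above.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: create a directory for PostgreSQL with private permissions.
# Usage: make_pg_dir DIRECTORY [OWNER [GROUP]]
make_pg_dir() {
    dir="$1"
    owner="${2:-_postgresql}"     # user that runs the server
    group="${3:-$owner}"          # defaults to a group named like the user
    mkdir -p "$dir" || return 1
    chown "$owner:$group" "$dir" || return 1
    chmod 700 "$dir"              # private to the owner, as initdb does by default
}

# As root, before running initdb on the new location:
# make_pg_dir /var/postgresql/13/data
</code></pre></div></div>
<p>Since in this example the log directory lives inside <code class="language-plaintext highlighter-rouge">PGDATA</code>, and <code class="language-plaintext highlighter-rouge">initdb</code> refuses to work on a directory that is not empty, the log directory has to be created only after the cluster has been initialized.</p>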
<p><br />
Please also note that if you disable the daemon system-wide, the customized configuration will be lost:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl get postgresql
<span class="nv">postgresql_class</span><span class="o">=</span>daemon
<span class="nv">postgresql_flags</span><span class="o">=</span><span class="nt">-D</span> /var/postgresql/13/data <span class="nt">-l</span> /var/postgresql/13/data/log/postgresql.log
<span class="nv">postgresql_logger</span><span class="o">=</span>
<span class="nv">postgresql_rtable</span><span class="o">=</span>0
<span class="nv">postgresql_timeout</span><span class="o">=</span>30
<span class="nv">postgresql_user</span><span class="o">=</span>_postgresql
puffy# rcctl disable postgresql
puffy# rcctl get postgresql
<span class="nv">postgresql_class</span><span class="o">=</span>daemon
<span class="nv">postgresql_flags</span><span class="o">=</span>NO
<span class="nv">postgresql_logger</span><span class="o">=</span>
<span class="nv">postgresql_rtable</span><span class="o">=</span>0
<span class="nv">postgresql_timeout</span><span class="o">=</span>30
<span class="nv">postgresql_user</span><span class="o">=</span>_postgresql
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="make-postgresql-start-at-boot">Make PostgreSQL start at boot</h2>
<p>In order to let OpenBSD start PostgreSQL at boot, you have to enable the service system-wide. This can be achieved, as already shown, by means of the <code class="language-plaintext highlighter-rouge">enable</code> command, or by setting the <code class="language-plaintext highlighter-rouge">status</code> variable to <code class="language-plaintext highlighter-rouge">on</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl <span class="nb">enable </span>postgresql
<span class="c"># the same</span>
puffy# rcctl <span class="nb">set </span>postgresql status on
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="the-at-boot-configuration">The <em>At-Boot</em> Configuration</h2>
<p>Once the service is enabled system-wide, it can be customized by means of the <code class="language-plaintext highlighter-rouge">rcctl set</code> command, as already shown. The reason is that, once a daemon is <em>enabled</em> at boot, its name is appended to the list of services in the file <code class="language-plaintext highlighter-rouge">/etc/rc.conf.local</code>, which is in turn used to determine what to start at boot.
<br />
The custom configuration goes in that file too, and once the daemon is disabled, the configuration is scrubbed out of the file, so that only the default values (in the rc-script) survive:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl <span class="nb">enable </span>postgresql
puffy# rcctl <span class="nb">set </span>postgresql flags <span class="s2">"-D /var/postgresql/13/data -l /var/postgresql/13/data/log/postgresql.log"</span>
puffy# <span class="nb">cat</span> /etc/rc.conf.local
<span class="nv">amd_flags</span><span class="o">=</span>
<span class="nv">pkg_scripts</span><span class="o">=</span>postgresql transmission_daemon
<span class="nv">postgresql_flags</span><span class="o">=</span><span class="nt">-D</span> /var/postgresql/13/data <span class="nt">-l</span> /var/postgresql/13/data/log/postgresql.log
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above, there are two services that have been installed on the system: PostgreSQL and Transmission. The system is going to start PostgreSQL first (because it is leftmost), and then Transmission. When starting PostgreSQL, it is going to use the specified flags.
<br />
If PostgreSQL is now disabled, the settings are lost as well:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl disable postgresql
puffy# <span class="nb">cat</span> /etc/rc.conf.local
<span class="nv">amd_flags</span><span class="o">=</span>
<span class="nv">pkg_scripts</span><span class="o">=</span>transmission_daemon
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="decide-when-to-start-at-boot">Decide When to Start at Boot</h2>
<p>It is also possible to let PostgreSQL start after (or before) specific other daemons. If we re-enable PostgreSQL, it will be appended to the <code class="language-plaintext highlighter-rouge">rc.conf.local</code> file, and therefore it will be started after the Transmission daemon; this can also be verified with the <code class="language-plaintext highlighter-rouge">rcctl order</code> command:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl <span class="nb">enable </span>postgresql
puffy# rcctl order
transmission_daemon postgresql
</code></pre></div></div>
<p><br />
<br /></p>
<p>Let’s say we want PostgreSQL to be started as soon as possible. The start order can be changed by means of the <code class="language-plaintext highlighter-rouge">rcctl order</code> command: you need to specify the leftmost (absolute first) daemon to start, or the list of daemons you want to start first:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl order postgresql
puffy# rcctl order
postgresql transmission_daemon
</code></pre></div></div>
<p><br />
<br /></p>
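<p>Since the start order is simply the left-to-right order of the names in <code class="language-plaintext highlighter-rouge">pkg_scripts</code>, it can also be read straight from the file. The following is only a minimal sketch, not something <code class="language-plaintext highlighter-rouge">rcctl</code> provides: the function name <code class="language-plaintext highlighter-rouge">pkg_start_order</code> is made up for illustration, and the sample file just mirrors the <code class="language-plaintext highlighter-rouge">rc.conf.local</code> content shown earlier:</p>

```shell
#!/bin/sh
# Hypothetical helper: print the daemons listed in pkg_scripts,
# one per line, in start order (leftmost first).
pkg_start_order() {
    # $1: path to an rc.conf.local-style file
    sed -n 's/^pkg_scripts=//p' "$1" | tr ' ' '\n' | sed '/^$/d'
}

# Sample file (an assumption), mirroring the content shown earlier.
conf=$(mktemp)
cat > "$conf" <<'EOF'
amd_flags=
pkg_scripts=postgresql transmission_daemon
postgresql_flags=-D /var/postgresql/13/data -l /var/postgresql/13/data/log/postgresql.log
EOF

pkg_start_order "$conf"
rm -f "$conf"
```

<p>On a real system the function would be pointed at <code class="language-plaintext highlighter-rouge">/etc/rc.conf.local</code>, but <code class="language-plaintext highlighter-rouge">rcctl order</code> remains the proper tool for the job.</p>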
<h2 id="two-is-better-than-one">Two is Better Than One</h2>
<p>What if you want another PostgreSQL instance controlled by <code class="language-plaintext highlighter-rouge">rcctl</code>?
<br />
You can copy the rc-script, giving it another name, and change the set of flags to let it start:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# <span class="nb">cp</span> /etc/rc.d/postgresql /etc/rc.d/postgresql_replica
puffy# rcctl <span class="nb">enable </span>postgresql_replica
puffy# rcctl <span class="nb">set </span>postgresql_replica flags <span class="s2">"-D /var/postgresql/replica/data -l /var/postgresql/replica/data/log/postgresql.log -o '-p 5433'"</span>
puffy# <span class="nb">mkdir</span> <span class="nt">-p</span> /var/postgresql/replica/data
puffy# <span class="nb">chown</span> <span class="nt">-R</span> _postgresql:_postgresql /var/postgresql/replica/data
puffy# su - _postgresql
puffy<span class="nv">$ </span>initdb /var/postgresql/replica/data
...
puffy<span class="nv">$ </span><span class="nb">mkdir</span> /var/postgresql/replica/data/log
puffy<span class="nv">$ </span><span class="nb">exit</span>
puffy# rcctl start postgresql_replica
postgresql_replica<span class="o">(</span>ok<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>There is some work to perform, but it is quite simple after all.</p>
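<p>If you plan to add more than one extra instance, the flags string is the part most likely to be mistyped. As a small sketch, assuming every instance keeps its log under <code class="language-plaintext highlighter-rouge">log/postgresql.log</code> inside the data directory as in the example above, a helper (the name <code class="language-plaintext highlighter-rouge">instance_flags</code> is hypothetical) could build the string for you:</p>

```shell
#!/bin/sh
# Hypothetical helper: build the flags string for an extra instance.
instance_flags() {
    # $1: data directory, $2: TCP port
    printf '%s\n' "-D $1 -l $1/log/postgresql.log -o '-p $2'"
}

# Prints the same flags used for postgresql_replica above.
instance_flags /var/postgresql/replica/data 5433

# On OpenBSD, the result could then be passed to rcctl:
#   rcctl set postgresql_replica flags "$(instance_flags /var/postgresql/replica/data 5433)"
```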
<h2 id="so-is-postgresql-running">So, is PostgreSQL Running?</h2>
<p>Besides checking whether connections are accepted, you can use <code class="language-plaintext highlighter-rouge">rcctl</code> to see if the daemon is running: the <code class="language-plaintext highlighter-rouge">ls</code> command accepts the <em>status</em> you are looking for (<code class="language-plaintext highlighter-rouge">started</code> for running daemons) and lists the matching services:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>puffy# rcctl <span class="nb">ls </span>started
...
postgresql
postgresql_replica
...
</code></pre></div></div>
<p><br />
<br /></p>
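<p>In a script you usually want a yes/no answer rather than a list. A minimal sketch, assuming the list arrives one service name per line as in the output above (the function name <code class="language-plaintext highlighter-rouge">is_started</code> and the sample list are made up for illustration):</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given service name appears,
# as a whole line, in the list read from standard input.
is_started() {
    # $1: service name
    grep -qxF -- "$1"
}

# Sample list standing in for the output of `rcctl ls started`.
if printf '%s\n' postgresql postgresql_replica | is_started postgresql_replica; then
    echo "postgresql_replica is running"
fi

# On OpenBSD: rcctl ls started | is_started postgresql_replica
```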
<h1 id="conclusions">Conclusions</h1>
<p>PostgreSQL can, of course, run well on OpenBSD systems. It can also be managed via the integrated service handler, <code class="language-plaintext highlighter-rouge">rcctl</code>, and its <em>rc-scripts</em>, as well as manually by means of PostgreSQL utilities (e.g., <code class="language-plaintext highlighter-rouge">pg_ctl</code>).</p>
GNU Guix and PostgreSQL2021-09-30T00:00:00+00:00https://fluca1978.github.io/2021/09/30/GNU_GUIX_PostgreSQL<p>Installing PostgreSQL via GNU Guix.</p>
<h1 id="gnu-guix-and-postgresql">GNU Guix and PostgreSQL</h1>
<p><a href="https://guix.gnu.org/en/download/" target="_blank">GNU Guix</a> is <strong>an advanced transactional package manager</strong> for the GNU operating system. It is both a complete Linux distribution and a package manager that can be installed on an existing operating system.
<br />
The idea behind GNU Guix is to provide a package manager that works in a way similar to that of <em>binary environment managers</em>: Guix uses <em>user profiles</em> and a set of self-contained directory trees to make libraries and executables available.
<br />
GNU Guix provides a <code class="language-plaintext highlighter-rouge">guix</code> command-line tool that can be used to manage packages and all the GNU Guix dependencies and configuration.
<br />
<br />
In this article, I show how to use GNU Guix on a CentOS Linux operating system to install and manage PostgreSQL. Please note that I’m not going to show how to install <code class="language-plaintext highlighter-rouge">guix</code>; for that, please refer to the <a href="https://guix.gnu.org/manual/en/html_node/Binary-Installation.html" target="_blank">official GNU Guix installation guide</a>.</p>
<p><br />
<br />
Please note that managing PostgreSQL versions via GNU Guix has nothing to do with PostgreSQL point in time recovery (PITR) or backup strategies.</p>
<h1 id="using-gnu-guix">Using GNU Guix</h1>
<p>The main command to interact with GNU Guix is, guess what, <code class="language-plaintext highlighter-rouge">guix</code>. The command allows for subcommands, in particular the <code class="language-plaintext highlighter-rouge">help</code> one that can provide you interactive help about other subcommands.
<br />
In this article I’m going to use the following subcommands:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">search</code> to search for packages to install;</li>
<li><code class="language-plaintext highlighter-rouge">pull</code> to get fresh <code class="language-plaintext highlighter-rouge">guix</code> package lists and update the program itself;</li>
<li><code class="language-plaintext highlighter-rouge">install</code> and <code class="language-plaintext highlighter-rouge">remove</code> to install a package and delete it;</li>
<li><code class="language-plaintext highlighter-rouge">package</code>, the <strong>main <code class="language-plaintext highlighter-rouge">guix</code> command</strong>: many other subcommands are aliases to the <code class="language-plaintext highlighter-rouge">package</code> one. The <code class="language-plaintext highlighter-rouge">package</code> command allows for various operations on packages and their history.</li>
</ul>
<h2 id="searching-for-postgresql">Searching for PostgreSQL</h2>
<p>The subcommand <code class="language-plaintext highlighter-rouge">search</code> can be used to search for a package, in our case the beloved PostgreSQL database. The <code class="language-plaintext highlighter-rouge">search</code> command allows the specification of what to search as a regular expression, and in the following example I’ll search for only packages that start with <code class="language-plaintext highlighter-rouge">postgresql</code> to avoid getting information about drivers and extensions:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix search <span class="s1">'^postgresql.*$'</span>
name: postgresql
version: 9.6.21
outputs: out
systems: x86_64-linux i686-linux
dependencies: [email protected][email protected][email protected][email protected]
location: <span class="se">\g</span>nu/packages/databases.scm:1124:2<span class="se">\</span>
homepage: https://www.postgresql.org/
license: X11-style
synopsis: Powerful object-relational database system
description: PostgreSQL is a powerful object-relational database system. It is fully ACID compliant, has full support <span class="k">for </span>foreign keys, joins,
+ views, triggers, and stored procedures <span class="o">(</span><span class="k">in </span>multiple languages<span class="o">)</span><span class="nb">.</span> It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR,
+ VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video.
relevance: 30
name: postgresql
version: 13.2
outputs: out
systems: x86_64-linux i686-linux
dependencies: [email protected][email protected][email protected][email protected]
location: <span class="se">\g</span>nu/packages/databases.scm:1085:2<span class="se">\</span>
homepage: https://www.postgresql.org/
license: X11-style
synopsis: Powerful object-relational database system
description: PostgreSQL is a powerful object-relational database system. It is fully ACID compliant, has full support <span class="k">for </span>foreign keys, joins,
+ views, triggers, and stored procedures <span class="o">(</span><span class="k">in </span>multiple languages<span class="o">)</span><span class="nb">.</span> It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR,
+ VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video.
relevance: 30
...
</code></pre></div></div>
<p><br />
<br />
There are other PostgreSQL versions in the command output, which I trimmed out for the sake of readability. In short, the <code class="language-plaintext highlighter-rouge">search</code> subcommand allows you to search for a package and all its available versions.
<br />
Like many <code class="language-plaintext highlighter-rouge">guix</code> subcommands, the <code class="language-plaintext highlighter-rouge">search</code> command is just a shortcut for the invocation of the <code class="language-plaintext highlighter-rouge">package</code> subcommand with the appropriate options. In other words, <code class="language-plaintext highlighter-rouge">guix search foo</code> is the same as calling <code class="language-plaintext highlighter-rouge">guix package -s foo</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix package <span class="nt">-s</span> <span class="s1">'^postgresql.*$'</span>
name: postgresql
version: 9.6.21
outputs: out
systems: x86_64-linux i686-linux
dependencies: [email protected][email protected][email protected][email protected]
location: <span class="se">\g</span>nu/packages/databases.scm:1124:2<span class="se">\</span>
homepage: https://www.postgresql.org/
license: X11-style
synopsis: Powerful object-relational database system
description: PostgreSQL is a powerful object-relational database system. It is fully ACID compliant, has full support <span class="k">for </span>foreign keys, joins,
+ views, triggers, and stored procedures <span class="o">(</span><span class="k">in </span>multiple languages<span class="o">)</span><span class="nb">.</span> It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR,
+ VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video.
relevance: 30
...
</code></pre></div></div>
<p><br />
<br /></p>
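<p>The search output is quite verbose when all you want is the list of available versions. Since every record is a set of <code class="language-plaintext highlighter-rouge">key: value</code> lines, a few lines of <code class="language-plaintext highlighter-rouge">awk</code> can condense it. This is just a sketch: the function name <code class="language-plaintext highlighter-rouge">list_versions</code> is hypothetical, and the sample input is a shortened copy of the output shown above:</p>

```shell
#!/bin/sh
# Hypothetical helper: condense "name:"/"version:" records
# (read from standard input) to one "name version" line each.
list_versions() {
    awk -F': ' '
        $1 == "name"    { name = $2 }
        $1 == "version" { print name, $2 }
    '
}

# Shortened sample of the search output shown above.
list_versions <<'EOF'
name: postgresql
version: 9.6.21
outputs: out
homepage: https://www.postgresql.org/

name: postgresql
version: 13.2
outputs: out
homepage: https://www.postgresql.org/
EOF

# With guix installed: guix search '^postgresql.*$' | list_versions
```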
<h2 id="installing-postgresql">Installing PostgreSQL</h2>
<p>If we don’t specify any particular version, <code class="language-plaintext highlighter-rouge">guix</code> will install the latest available in its repositories, which, as I write, is PostgreSQL 13.2.
<br />
There are two ways to install stuff in <code class="language-plaintext highlighter-rouge">guix</code>:</p>
<ul>
<li>compiling all software on the local machine (the default behaviour);</li>
<li>using binary packages where and when available.</li>
</ul>
<p><br />
Of course, compiling all the software on the local machine can require a lot of time and resources, depending on the power of the machine <code class="language-plaintext highlighter-rouge">guix</code> is running on.
<br />
Binary packages are called <strong>substitutes</strong> in <code class="language-plaintext highlighter-rouge">guix</code>, because they <em>substitute source compiled software</em>.
<br />
<br />
In both installation scenarios, <code class="language-plaintext highlighter-rouge">guix</code> will install every dependency required by the specific software you are going to install.
<br />
A source based installation will look like the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix <span class="nb">install </span>postgresql
The following package will be installed:
postgresql 13.2
...
building /gnu/store/jkzin3sk1kk8ah9j066k3a03q4d99hc4-tcc-boot0-0.9.26-1103-g6e62e0e.drv...
| <span class="s1">'build'</span> phase
building /gnu/store/35lsvpkqwgzmcs3gnhqkmxhivwfisidm-gzip-mesboot-1.2.4.drv...
building /gnu/store/gzlrw46slsi423qh5vcq91ki0rw4xzm4-make-mesboot0-3.80.drv...
building /gnu/store/2nvaxgs0rdxfkrwklh622ggaxg0wap6n-bash-mesboot0-2.05b.drv...
- <span class="s1">'unpack'</span> phase
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the case of binary packages, i.e., <em>substitutes</em>, the installation will look like:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix <span class="nb">install </span>postgresql
...
perl-5.30.2 13.6MiB 1.6MiB/s 00:09 <span class="o">[</span><span class="c">##################] 100.0%</span>
pkg-config-0.29.2 201KiB 721KiB/s 00:00 <span class="o">[</span><span class="c">##################] 100.0%</span>
postgresql-13.2 5.4MiB 543KiB/s 00:10 <span class="o">[</span><span class="c">##################] 100.0%</span>
guile-3.0.2 6.9MiB 457KiB/s 00:15 <span class="o">[</span><span class="c">##################] 100.0%</span>
texinfo-6.7 1.2MiB 3.5MiB/s 00:00 <span class="o">[</span><span class="c">##################] 100.0%</span>
building CA certificate bundle...
building fonts directory...
building directory of Info manuals...
building database <span class="k">for </span>manual pages...
building profile with 1 package...
hint: Consider setting the necessary environment variables by running:
<span class="nv">GUIX_PROFILE</span><span class="o">=</span><span class="s2">"/home/luca/.guix-profile"</span>
<span class="nb">.</span> <span class="s2">"</span><span class="nv">$GUIX_PROFILE</span><span class="s2">/etc/profile"</span>
Alternately, see <span class="sb">`</span>guix package <span class="nt">--search-paths</span> <span class="nt">-p</span> <span class="s2">"/home/luca/.guix-profile"</span><span class="s1">'.
</span></code></pre></div></div>
<p><br />
<br /></p>
<p>The first time, <code class="language-plaintext highlighter-rouge">guix</code> has to <em>bootstrap</em> a lot of dependencies, so it will download, (build) and install libraries and tools even if they are already available on your operating system.
<br />
At the end of the installation, <code class="language-plaintext highlighter-rouge">guix</code> will give you a hint about setting environment variables to give you access to the installed PostgreSQL (and other installed software).</p>
<p><br />
<br />
Inspecting the content of the directory pointed to by the above variable, you can see that it contains the PostgreSQL executables:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">export </span><span class="nv">GUIX_PROFILE</span><span class="o">=</span><span class="s2">"/home/luca/.guix-profile"</span>
% <span class="nb">source</span> <span class="s2">"</span><span class="nv">$GUIX_PROFILE</span><span class="s2">/etc/profile"</span>
% <span class="nb">ls</span> <span class="nv">$GUIX_PROFILE</span>
bin etc include lib manifest share
% <span class="nb">ls</span> /home/luca/.guix-profile/bin
clusterdb dropuser pg_archivecleanup pg_config pg_dumpall pg_resetwal pg_test_fsync pg_waldump reindexdb
createdb ecpg pg_basebackup pg_controldata pg_isready pg_restore pg_test_timing postgres vacuumdb
createuser initdb pgbench pg_ctl pg_receivewal pg_rewind pg_upgrade postmaster vacuumlo
dropdb oid2name pg_checksums pg_dump pg_recvlogical pg_standby pg_verifybackup psql
</code></pre></div></div>
<p><br />
<br /></p>
<h3 id="locale-and-language-problems"><code class="language-plaintext highlighter-rouge">locale</code> and Language problems</h3>
<p>It is suggested to install the locales, because within the <code class="language-plaintext highlighter-rouge">guix</code> ecosystem the ones already installed system-wide will not be available. Without them, the executables, PostgreSQL’s included, could misbehave or refuse to work at all.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix <span class="nb">install </span>glibc-locales
The following package will be installed:
glibc-locales 2.31
...
glibc-locales-2.31 10.8MiB 222KiB/s 00:50 <span class="o">[</span><span class="c">##################] 100.0%</span>
linux-libre-headers-5.4.20 1.0MiB 629KiB/s 00:02 <span class="o">[</span><span class="c">##################] 100.0%</span>
building CA certificate bundle...
building fonts directory...
building directory of Info manuals...
building database <span class="k">for </span>manual pages...
building profile with 3 packages...
</code></pre></div></div>
<p><br />
<br /></p>
<p>After installing locales, you need to export Guix related environment variables to make the former available:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">export </span><span class="nv">GUIX_LOCPATH</span><span class="o">=</span><span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/.guix-profile/lib/locale"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="using-the-freshly-installed-postgresql">Using the freshly installed PostgreSQL</h2>
<p>Sourcing the <code class="language-plaintext highlighter-rouge">profile</code> file as suggested by <code class="language-plaintext highlighter-rouge">guix</code> makes the PostgreSQL executables available to your shell:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% which pg_ctl
~/.guix-profile/bin/pg_ctl
</code></pre></div></div>
<p><br />
<br /></p>
<p>The trick is simple: the <code class="language-plaintext highlighter-rouge">profile</code> file manipulates the <code class="language-plaintext highlighter-rouge">PATH</code> environment variable to place the installed software executables in front of the already available ones:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">cat</span> <span class="s2">"</span><span class="nv">$GUIX_PROFILE</span><span class="s2">/etc/profile"</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">GUIX_PROFILE</span><span class="k">:-</span><span class="p">/gnu/store/xh9k8z9x5aspfqfcp1gycqlwksgl1m3g-profile</span><span class="k">}</span><span class="s2">/bin</span><span class="k">${</span><span class="nv">PATH</span>:+:<span class="k">}</span><span class="nv">$PATH</span><span class="s2">"</span>
</code></pre></div></div>
<p><br />
<br /></p>
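<p>The <code class="language-plaintext highlighter-rouge">${PATH:+:}</code> part deserves a word: it expands to a colon only when <code class="language-plaintext highlighter-rouge">PATH</code> is set and not empty, so no stray separator is left behind. A tiny sketch of the same idiom, with a made-up function name (<code class="language-plaintext highlighter-rouge">prepend_path</code>) and sample values:</p>

```shell
#!/bin/sh
# Hypothetical helper showing the ${VAR:+:} idiom used by the profile file.
prepend_path() {
    # $1: directory to put first, $2: existing PATH-like value (may be empty)
    printf '%s\n' "$1${2:+:}$2"
}

# Non-empty value: a colon separates the new directory from the rest.
prepend_path /home/luca/.guix-profile/bin /usr/bin:/bin

# Empty value: no trailing colon is added.
prepend_path /home/luca/.guix-profile/bin ""
```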
<p>It is now straightforward to use PostgreSQL as “usual”:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">mkdir</span> <span class="nt">-p</span> pgdata/13
% initdb <span class="nt">-k</span> <span class="nt">-D</span> pgdata/13
...
Success. You can now start the database server using:
pg_ctl <span class="nt">-D</span> pgdata/13 <span class="nt">-l</span> logfile start
</code></pre></div></div>
<p><br />
<br /></p>
<p>There is an important thing to note here: <strong>PostgreSQL has been installed as a normal user</strong>. This is very similar to what virtual binary environment managers do, for instance my favourite in the PostgreSQL scenario, <a href="https://github.com/theory/pgenv/"><code class="language-plaintext highlighter-rouge">pgenv</code></a>.</p>
<p><br />
It is now possible to start PostgreSQL and, since I already have a system-wide PostgreSQL running, I need to specify a different port to listen on:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_ctl <span class="nt">-D</span> pgdata/13 <span class="nt">-o</span> <span class="s1">'-p 5433'</span> start
waiting <span class="k">for </span>server to start....
LOG: starting PostgreSQL 13.2 on x86_64-unknown-linux-gnu, compiled by gcc <span class="o">(</span>GCC<span class="o">)</span> 7.5.0, 64-bit
LOG: listening on IPv6 address <span class="s2">"::1"</span>, port 5433
LOG: listening on IPv4 address <span class="s2">"127.0.0.1"</span>, port 5433
LOG: listening on Unix socket <span class="s2">"/tmp/.s.PGSQL.5433"</span>
LOG: database system was shut down at 2021-09-30 08:41:44 EDT
LOG: database system is ready to accept connections
<span class="k">done
</span>server started
</code></pre></div></div>
<p><br />
<br /></p>
<p>And it is now possible to see that two instances are running on the machine:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-c</span> <span class="s1">'SHOW SERVER_VERSION;'</span> template1
server_version
<span class="nt">----------------</span>
13.4
<span class="o">(</span>1 row<span class="o">)</span>
% psql <span class="nt">-c</span> <span class="s1">'SHOW SERVER_VERSION;'</span> <span class="nt">-p</span> 5433 template1
server_version
<span class="nt">----------------</span>
13.2
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>PostgreSQL version 13.4 is the system-wide one, while version 13.2 is the one installed via <code class="language-plaintext highlighter-rouge">guix</code>.</p>
<h2 id="getting-newer-postgresql-versions">Getting Newer PostgreSQL Versions</h2>
<p>In order to get newer PostgreSQL versions, you need to “ask” <code class="language-plaintext highlighter-rouge">guix</code> to search for updates. This is done via the <code class="language-plaintext highlighter-rouge">pull</code> command, which tells <code class="language-plaintext highlighter-rouge">guix</code> to update the list of available software:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix pull
Migrating profile generations to <span class="s1">'/var/guix/profiles/per-user/luca'</span>...
Updating channel <span class="s1">'guix'</span> from Git repository at <span class="s1">'https://git.savannah.gnu.org/git/guix.git'</span>...
Authenticating channel <span class="s1">'guix'</span>, commits 9edb3f6 to 7b59508 <span class="o">(</span>6.374 new commits<span class="o">)</span>...
Building from this channel:
guix https://git.savannah.gnu.org/git/guix.git 7b59508
...
guix-7b59508ca-modules 1.5MiB/s 00:20 | 29.2MiB transferred
guix-module-union 8.9MiB/s 00:00 | 3KiB transferred
guix-command 635B 18KiB/s 00:00 <span class="o">[</span><span class="c">##################] 100.0%</span>
guix-daemon 391B 1.0MiB/s 00:00 <span class="o">[</span><span class="c">##################] 100.0%</span>
guix-7b59508ca 44.4MiB/s 00:00 | 16KiB transferred
building CA certificate bundle...
building fonts directory...
building directory of Info manuals...
building database <span class="k">for </span>manual pages...
building profile with 1 package...
hint: Consider setting the necessary environment variables by running:
<span class="nv">GUIX_PROFILE</span><span class="o">=</span><span class="s2">"/home/luca/.config/guix/current"</span>
<span class="nb">.</span> <span class="s2">"</span><span class="nv">$GUIX_PROFILE</span><span class="s2">/etc/profile"</span>
% <span class="nb">source</span> <span class="s2">"</span><span class="nv">$GUIX_PROFILE</span><span class="s2">/etc/profile"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>It is really important to <code class="language-plaintext highlighter-rouge">source</code> again the <code class="language-plaintext highlighter-rouge">profile</code> file</strong> since it has changed due to the update process.</p>
<p><br />
After the <code class="language-plaintext highlighter-rouge">pull</code> update, we can search for PostgreSQL again, and the available version has been bumped to 13.3:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix search <span class="s1">'postgresql.*'</span>
...
name: postgresql
version: 13.3
outputs: out
systems: x86_64-linux i686-linux
dependencies: [email protected][email protected][email protected][email protected]
location: <span class="se">\g</span>nu/packages/databases.scm:1127:2<span class="se">\</span>
homepage: https://www.postgresql.org/
license: X11-style
synopsis: Powerful object-relational database system
description: PostgreSQL is a powerful object-relational database system. It is fully ACID compliant, has full support <span class="k">for </span>foreign keys, joins,
+ views, triggers, and stored procedures <span class="o">(</span><span class="k">in </span>multiple languages<span class="o">)</span><span class="nb">.</span> It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR,
+ VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video.
relevance: 30
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is now time to upgrade the currently running PostgreSQL (it is suggested to stop the running instance first):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_ctl <span class="nt">-D</span> pgdata/13 stop
% guix upgrade postgresql
The following package will be upgraded:
postgresql 13.2 → 13.3
...
postgresql-13.3 5.4MiB 1.7MiB/s 00:03 <span class="o">[</span><span class="c">##################] 100.0%</span>
substitute: updating substitutes from <span class="s1">'https://ci.guix.gnu.org'</span>... 100.0%
The following derivation will be built:
/gnu/store/ghlc1angdx9q7gx4hm4yagam6m0gmxzw-profile.drv
0,2 MB will be downloaded
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>Is the new PostgreSQL version installed? Let’s check:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_ctl <span class="nt">-D</span> pgdata/13 <span class="nt">-o</span> <span class="s1">'-p 5433'</span> start
...
server started
% psql <span class="nt">-p</span> 5433 <span class="nt">-c</span> <span class="s1">'SHOW SERVER_VERSION;'</span> template1
server_version
<span class="nt">----------------</span>
13.3
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br />
Success!</p>
<h2 id="generations-or-how-do-i-go-back-in-time">Generations (or, “How do I go back in time?”)</h2>
<p><code class="language-plaintext highlighter-rouge">guix</code> stores the so-called <strong>generations</strong>, that is, <em>points in time</em> that record the history of the installed and removed packages.
The <code class="language-plaintext highlighter-rouge">package</code> subcommand can show you the generations available in your system, for example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix package <span class="nt">--list-generations</span>
<span class="se">\G</span>eneration 1 <span class="nb">set </span>30 2021 08:16:02<span class="se">\</span>
postgresql 13.2 out /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
<span class="se">\G</span>eneration 2 <span class="nb">set </span>30 2021 08:37:18<span class="se">\</span>
+ glibc-utf8-locales 2.31 out /gnu/store/rgydar9dfvflqqz2irgh7njj34amaxc6-glibc-utf8-locales-2.31
<span class="se">\G</span>eneration 3 <span class="nb">set </span>30 2021 08:40:43<span class="se">\</span>
+ glibc-locales 2.31 out /gnu/store/wnw0nwlyg92vv33f5f65jj1rd3p4fi3c-glibc-locales-2.31
<span class="se">\G</span>eneration 4 <span class="nb">set </span>30 2021 10:04:21<span class="se">\ </span> <span class="o">(</span>current<span class="o">)</span>
+ postgresql 13.3 out /gnu/store/1nlzmg4hw4gga56g58dsqf9nx90z9kkn-postgresql-13.3
- postgresql 13.2 out /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above example, we installed PostgreSQL 13.2 first (<em>generation 1</em>), while the upgrade of PostgreSQL to version 13.3 happened in the fourth generation. Note that the output is similar to a <code class="language-plaintext highlighter-rouge">diff</code> report, where <code class="language-plaintext highlighter-rouge">+</code> lines are additions and <code class="language-plaintext highlighter-rouge">-</code> lines are removals.
<br />
Imagine we need to go back to version 13.2 of PostgreSQL. How can we achieve this?
There are two ways:</p>
<ul>
<li>do a so-called <em>rollback</em>, which makes the previous generation active (that is, it goes back to generation number 3);</li>
<li>jump to a specific generation, in this case number 1.</li>
</ul>
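<p>Both approaches map to an option of <code class="language-plaintext highlighter-rouge">guix package</code>: <code class="language-plaintext highlighter-rouge">--roll-back</code> and <code class="language-plaintext highlighter-rouge">--switch-generation</code>. The following dry-run sketch only prints the command it would pick, so nothing on your system is changed; the function name <code class="language-plaintext highlighter-rouge">rollback_command</code> is made up for illustration:</p>

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not run) the guix command
# that goes back to a previous generation.
rollback_command() {
    # $1: optional generation number
    if [ -n "${1:-}" ]; then
        echo "guix package --switch-generation=$1"
    else
        echo "guix package --roll-back"
    fi
}

rollback_command      # no argument: go back by one generation
rollback_command 1    # jump to a specific generation
```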
<p><br />
Depending on the history of your system, you can choose the correct approach.
<br />
Let’s jump to generation one (again, ensure your PostgreSQL server is turned off):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_ctl <span class="nt">-D</span> pgdata/13 stop
% guix package <span class="nt">--switch-generation</span><span class="o">=</span>1
switched from generation 4 to 1
% pg_ctl <span class="nt">--version</span>
pg_ctl <span class="o">(</span>PostgreSQL<span class="o">)</span> 13.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>Unlike installing new software, switching to a previous generation is a very fast, almost immediate, operation, since the only thing to do is to adjust the binary environment. As you can see, the PostgreSQL executables are back at version 13.2.</p>
<p><br />
<br />
What if we want to upgrade PostgreSQL again? One solution is to <code class="language-plaintext highlighter-rouge">switch-generation</code> again, but it is also possible to run <code class="language-plaintext highlighter-rouge">upgrade</code> again, which is an almost immediate operation since everything is already on the system:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix upgrade postgresql
building CA certificate bundle...
listing Emacs sub-directories...
building fonts directory...
building directory of Info manuals...
building database <span class="k">for </span>manual pages...
building profile with 1 package...
% pg_ctl <span class="nt">--version</span>
pg_ctl <span class="o">(</span>PostgreSQL<span class="o">)</span> 13.3
</code></pre></div></div>
<p><br />
<br /></p>
<p>What has changed in the <em>generations</em>? Since we moved back to history placeholder <em>one</em>, and then upgraded PostgreSQL, the upgrade has been <em>squashed</em> from there:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix package <span class="nt">--list-generations</span>
<span class="se">\G</span>eneration 1 Sep 30 2021 08:16:02<span class="se">\</span>
postgresql 13.2 out /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
<span class="se">\G</span>eneration 2 Sep 30 2021 10:15:33<span class="se">\ </span> <span class="o">(</span>current<span class="o">)</span>
+ postgresql 13.3 out /gnu/store/1nlzmg4hw4gga56g58dsqf9nx90z9kkn-postgresql-13.3
- postgresql 13.2 out /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
<span class="se">\G</span>eneration 3 Sep 30 2021 08:40:43<span class="se">\</span>
+ glibc-locales 2.31 out /gnu/store/wnw0nwlyg92vv33f5f65jj1rd3p4fi3c-glibc-locales-2.31
+ glibc-utf8-locales 2.31 out /gnu/store/rgydar9dfvflqqz2irgh7njj34amaxc6-glibc-utf8-locales-2.31
+ postgresql 13.2 out /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
- postgresql 13.3 out /gnu/store/1nlzmg4hw4gga56g58dsqf9nx90z9kkn-postgresql-13.3
<span class="se">\G</span>eneration 4 Sep 30 2021 10:04:21<span class="se">\</span>
+ postgresql 13.3 out /gnu/store/1nlzmg4hw4gga56g58dsqf9nx90z9kkn-postgresql-13.3
- postgresql 13.2 out /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>The PostgreSQL set of changes is propagated from history number one to all the other entries.</p>
<h2 id="removing-postgresql">Removing PostgreSQL</h2>
<p>Imagine we want to remove the 13.3 PostgreSQL version, keeping the older one available. The <code class="language-plaintext highlighter-rouge">remove</code> command does pretty much what you would expect:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix remove postgresql@13.3
The following package will be removed:
postgresql 13.3
The following derivation will be built:
/gnu/store/ilxkw0i597n0qvirb11mksbyad8qmnvd-profile.drv
building profile with 0 packages...
</code></pre></div></div>
<p><br />
<br /></p>
<p>Again, this is a very fast operation, and this should hint that nothing has been removed from the storage. Note that I specified the version to remove with the <code class="language-plaintext highlighter-rouge">@<version></code> syntax after the package name.
<br />
Is the older PostgreSQL version immediately available? NO!
<br />
If you test the binaries, you will find out that they are the system-wide ones (if any) because, as <code class="language-plaintext highlighter-rouge">guix</code> has told you in the above command output, it has removed the package from the current profile. This means PostgreSQL is no longer available via <code class="language-plaintext highlighter-rouge">guix</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% which pg_ctl
/usr/pgsql-13/bin/pg_ctl
% pg_ctl <span class="nt">--version</span>
pg_ctl <span class="o">(</span>PostgreSQL<span class="o">)</span> 13.4
</code></pre></div></div>
<p><br />
<br /></p>
<p>In order to enable the PostgreSQL 13.2 version again, use <code class="language-plaintext highlighter-rouge">guix package --switch-generation</code> to jump back to the generation that has the required PostgreSQL package:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix package <span class="nt">--switch-generation</span><span class="o">=</span>1
switched from generation 3 to 1
% pg_ctl <span class="nt">--version</span>
pg_ctl <span class="o">(</span>PostgreSQL<span class="o">)</span> 13.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>But how to free disk space from unused PostgreSQL versions?
<br />
The <code class="language-plaintext highlighter-rouge">gc</code> subcommand will <em>garbage collect</em> packages that are not in use. <strong>Here, <em>not in use</em> could be something different from what you think: <code class="language-plaintext highlighter-rouge">guix</code> is of course smarter than you (at least, smarter than me) in finding out references between generations and packages</strong>. This means that the mere fact that a package is not currently in use does not make it eligible for hard deletion. It is therefore recommended to delete all the generations that refer to a specific package in order to get it deleted by the garbage collector:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% guix package <span class="nt">--delete-generations</span><span class="o">=</span>2
% guix package <span class="nt">--delete-generations</span><span class="o">=</span>3
% guix package <span class="nt">--delete-generations</span><span class="o">=</span>4
% guix gc
...
note: currently hard linking saves 913.85 MiB
guix gc: freed 3,631.25278 MiBs
% <span class="nb">du</span> <span class="nt">-hs</span> /gnu/store/<span class="k">*</span>postgresql-13.?
30M /gnu/store/ivmkwkjsvbkv3g0jq9gcgwlhrhwx91gw-postgresql-13.2
</code></pre></div></div>
<p><br />
<br /></p>
<p>Of course, this approach brings your <em>whole</em> system back, since upgrading again will require a fresh installation.</p>
<h1 id="conclusions">Conclusions</h1>
<p>GNU Guix is a very interesting package manager that can be used to set up a binary environment useful for testing and deploying software stacks, including our beloved database and its dependencies (e.g., tools and libraries).
<br />
You are probably not going to use <code class="language-plaintext highlighter-rouge">guix</code> in a PostgreSQL production environment, because you will have other package revision tools to automate your packages and keep them <em>stable</em>. However, <code class="language-plaintext highlighter-rouge">guix</code> can be very handy in testing and upgrading your own environment.</p>
Restarting a sequence: how hard could it be? (PostgreSQL and Oracle)2021-09-23T00:00:00+00:00https://fluca1978.github.io/2021/09/23/OracleAlterSequence<p>How hard could it be to reset a sequence?</p>
<h1 id="restarting-a-sequence-how-hard-could-it-be-postgresql-and-oracle">Restarting a sequence: how hard could it be? (PostgreSQL and Oracle)</h1>
<p>One reason I like PostgreSQL so much is that it makes me feel at home: it has a very consistent and coherent interface to its objects.
An example of this is the management of sequences: <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code> allows you to modify pretty much every detail about a sequence, in particular to <em>restart</em> it from its initial value.
<br />
Let’s see this in action:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="n">sequence</span> <span class="n">batch_seq</span>
<span class="k">increment</span> <span class="k">by</span> <span class="mi">1</span> <span class="k">start</span> <span class="k">with</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="n">SEQUENCE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">do</span> <span class="err">$$</span>
<span class="k">declare</span>
<span class="n">i</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">begin</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="p">..</span><span class="mi">100</span> <span class="n">loop</span>
<span class="n">perform</span> <span class="n">nextval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="k">end</span> <span class="n">loop</span><span class="p">;</span>
<span class="k">end</span>
<span class="err">$$</span>
<span class="p">;</span>
<span class="k">DO</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">currval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">currval</span>
<span class="c1">---------</span>
<span class="mi">100</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>In the above piece of code, I’ve created a <code class="language-plaintext highlighter-rouge">batch_seq</code> and queried it one hundred times, so that the current value of the sequence is holding <code class="language-plaintext highlighter-rouge">100</code>.
<br />
<br />
How is it possible to make the sequence start over again?
<br />
A first possibility is to use the <code class="language-plaintext highlighter-rouge">setval</code> function:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">setval</span><span class="p">(</span> <span class="s1">'batch_seq'</span><span class="p">,</span> <span class="mi">1</span> <span class="p">);</span>
<span class="n">setval</span>
<span class="c1">--------</span>
<span class="mi">1</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">currval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">currval</span>
<span class="c1">---------</span>
<span class="mi">1</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Another option is to use <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code>, which is a command aimed at this purpose (and others):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="n">sequence</span> <span class="n">batch_seq</span> <span class="k">restart</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="n">SEQUENCE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">nextval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">nextval</span>
<span class="c1">---------</span>
<span class="mi">1</span>
</code></pre></div></div>
<p><br />
<br />
An important thing to note here, is that the only option specified has been <code class="language-plaintext highlighter-rouge">RESTART</code>, that is the sequence already knows what <em>restarting</em> means: it means <em>reset to its original starting value</em>.
<br />
It is also possible to specify a specific value for the restarting:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="n">sequence</span> <span class="n">batch_seq</span> <span class="k">restart</span> <span class="k">with</span> <span class="mi">666</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="n">SEQUENCE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">nextval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">nextval</span>
<span class="c1">---------</span>
<span class="mi">666</span>
</code></pre></div></div>
<p><br />
<br />
<em>That’s so simple!</em>
<br />
The above behaviour is guaranteed back to the <code class="language-plaintext highlighter-rouge">8.1</code> PostgreSQL version (and probably even before): <a href="https://www.postgresql.org/docs/8.1/sql-altersequence.html" target="_blank">see the old documentation here</a>.</p>
<h2 id="wait-what-about-currval">Wait, what about <code class="language-plaintext highlighter-rouge">currval()</code>?</h2>
<p>The careful reader has probably noted that I used <code class="language-plaintext highlighter-rouge">nextval()</code> to see if the reset of a sequence worked, instead of <code class="language-plaintext highlighter-rouge">currval()</code>. The reason can be found in the <a href="https://www.postgresql.org/docs/13/functions-sequence.html" target="_blank">official documentation</a>: <em>Returns the value most recently obtained by nextval for this sequence <strong>in the current session</strong>.</em>
<br />
It is easy to test this:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">nextval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">nextval</span>
<span class="c1">---------</span>
<span class="mi">667</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="n">sequence</span> <span class="n">batch_seq</span> <span class="k">restart</span> <span class="k">with</span> <span class="mi">999</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="n">SEQUENCE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">currval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">currval</span>
<span class="c1">---------</span>
<span class="mi">667</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">nextval</span><span class="p">(</span> <span class="s1">'batch_seq'</span> <span class="p">);</span>
<span class="n">nextval</span>
<span class="c1">---------</span>
<span class="mi">999</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, after an <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE RESTART</code> the <code class="language-plaintext highlighter-rouge">currval()</code> result remains unchanged (it is the last polled value within the current session), while <code class="language-plaintext highlighter-rouge">nextval()</code> (that actually queries the sequence) provides the right and expected value.</p>
<h1 id="what-about-oracle-sequences">What about Oracle sequences?</h1>
<p>Oracle provides a powerful <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code> command only in recent versions. For older versions, the <a href="" target="_blank">official documentation for the command <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code></a> clearly states that <strong>To restart the sequence at a different number, you must drop and re-create it</strong>!</p>
<p><br />
<br />
Err… what?
<br />
<br /></p>
<p>Before version 18, <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code> cannot <em>restart</em> the sequence. What is the solution, then? You need to trigger a sequence update:</p>
<ul>
<li>change the increment of the sequence to effectively subtract values;</li>
<li>ask the sequence a new value, so that it applies the subtraction;</li>
<li>set the increment to its correct value.</li>
</ul>
<p><br />
This means you have to do something like the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SQL</span><span class="o">></span> <span class="k">select</span> <span class="n">batch_seq</span><span class="p">.</span><span class="n">nextval</span> <span class="k">from</span> <span class="n">dual</span><span class="p">;</span>
<span class="k">SQL</span><span class="o">></span> <span class="k">alter</span> <span class="n">sequence</span> <span class="n">batch_seq</span> <span class="k">increment</span> <span class="k">by</span> <span class="o">-</span><span class="mi">666</span><span class="p">;</span>
<span class="k">SQL</span><span class="o">></span> <span class="k">select</span> <span class="n">batch_seq</span><span class="p">.</span><span class="n">nextval</span> <span class="k">from</span> <span class="n">dual</span><span class="p">;</span>
<span class="k">SQL</span><span class="o">></span> <span class="k">alter</span> <span class="n">sequence</span> <span class="n">batch_seq</span> <span class="k">increment</span> <span class="k">by</span> <span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I don’t like this approach very much, because it is error prone and requires you to do some computation ensuring you are not going to go outside the sequence boundaries.</p>
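<p>The computation mentioned above is a single subtraction, and it can be sketched as a tiny shell helper that prints the statements to run. The function name is hypothetical, and the sketch assumes you already know the value returned by the latest <code class="language-plaintext highlighter-rouge">NEXTVAL</code>; it does not check the sequence boundaries (<code class="language-plaintext highlighter-rouge">MINVALUE</code> and <code class="language-plaintext highlighter-rouge">MAXVALUE</code>), which remains your job.</p>

```shell
#!/bin/sh
# oracle_restart_sql SEQUENCE CURRENT TARGET [STEP]
# Print the statements that move SEQUENCE from CURRENT (the value returned by
# the latest NEXTVAL) to TARGET, then restore the regular increment STEP.
# Hypothetical helper: it only prints SQL, it never connects to a database.
oracle_restart_sql() {
    seq_name=$1
    current=$2
    target=$3
    step=${4:-1}

    # The temporary increment is the distance between the two values
    delta=$(( target - current ))
    if [ "$delta" -eq 0 ]; then
        echo "-- $seq_name is already at $target"
        return 0
    fi

    echo "ALTER SEQUENCE $seq_name INCREMENT BY $delta;"
    echo "SELECT $seq_name.NEXTVAL FROM dual;"
    echo "ALTER SEQUENCE $seq_name INCREMENT BY $step;"
}

# The sequence is at 667 and should go back to 1
oracle_restart_sql batch_seq 667 1
```

<p>With the values above the temporary increment is <code class="language-plaintext highlighter-rouge">-666</code>, the same number used in the previous example.</p>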
<p><br />
<br />
In recent versions of Oracle Database (e.g., <code class="language-plaintext highlighter-rouge">21</code>), the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/ALTER-SEQUENCE.html#GUID-A6468B63-E7C9-4EF0-B048-82FE2449B26D" target="_blank"><code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code> command works as in PostgreSQL, i.e., as in the standard SQL</a>, and this is good, of course.
<br />
According to a quick search within the <a href="https://docs.oracle.com/search/?q=alter+sequence&category=database&product=en%2Fdatabase%2Foracle%2Foracle-database" target="_blank">Oracle documentation about <code class="language-plaintext highlighter-rouge">ALTER SEQUENCE</code></a>, the <em>right</em> behaviour was introduced in Oracle <code class="language-plaintext highlighter-rouge">18</code> and later. Therefore, if you are facing a previous Oracle version, you need to run the above set of commands to manually adjust the sequences.</p>
<h1 id="conclusions">Conclusions</h1>
<p>PostgreSQL has a very strict approach to the SQL standard, rooted even in old versions. Unluckily, Oracle is not the same, and older versions require some tricks to simulate the PostgreSQL behavior.
<br />
This is not meant to be a flame or a comparison; it simply indicates how counter-intuitive it can be to handle Oracle once you are used to PostgreSQL!</p>
Using jq to get information out of pgbackrest2021-09-10T00:00:00+00:00https://fluca1978.github.io/2021/09/10/pgbackrest-jq<p>pgbackrest supports the JSON output format, and this can be useful to automate some information analysis.</p>
<h1 id="using-jq-to-get-information-out-of-pgbackrest">Using <code class="language-plaintext highlighter-rouge">jq</code> to get information out of <code class="language-plaintext highlighter-rouge">pgbackrest</code></h1>
<p><a href="https://pgbackrest.org/" target="_blank"><code class="language-plaintext highlighter-rouge">pgbackrest</code></a> offers the output of its commands in the JSON format. I’m not a great fan of JSON, but having such an output offers a few advantages, most notably <strong>it is a stable text output format</strong> that can be inspected easily with other tools.
<br />
In other words, there is no need for regular expressions to parse the textual output, and moreover, the output is guaranteed to be stable, which means no changes will happen (or better, no fields will be removed), while a simple rephrasing in the text output could crash your crafty regular expression!
<br />
<br />
Among the available tools, <code class="language-plaintext highlighter-rouge">jq</code> is a good shell program that allows you to <em>parse and navigate</em> JSON content.
<br />
Let’s see how it is possible to get some output combining <code class="language-plaintext highlighter-rouge">jq</code> and <code class="language-plaintext highlighter-rouge">pgbackrest</code>.</p>
<h2 id="get-the-last-backup-information">Get the <em>last backup</em> information</h2>
<p>When your stanza has a lot of backups, you probably don’t want to monitor all of them in depth, but would rather get a quick hint on when the last backup took place.
<br />
The <code class="language-plaintext highlighter-rouge">pgbackrest info</code> command reports all the backups available for a given stanza, and it can then be piped into <code class="language-plaintext highlighter-rouge">jq</code> to get more human-readable information.
<br />
Quick! Show me the snippet:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgbackrest info <span class="nt">--output</span> json | jq <span class="s1">'"Stanza: " + .[].name + " (" + .[].status.message + ") " + "Last backup completed at " + (.[].backup[-1].timestamp.stop | strftime("%Y-%m-%d %H:%M") )'</span>
<span class="s2">"Stanza: miguel (ok) Last backup completed at 2021-07-27 09:23"</span>
</code></pre></div></div>
<p><br />
<br />
This is what I would like to see when I’m in a rush and need to see which machines are in trouble with backups: it shows me the name of the stanza, the status of the backup (<code class="language-plaintext highlighter-rouge">ok</code>) and the time and date the backup ended.
<br />
Let’s analyze the command in more detail:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pgbackrest info --output json</code> enables the output of the <code class="language-plaintext highlighter-rouge">info</code> command as JSON;</li>
<li><code class="language-plaintext highlighter-rouge">jq</code> is used to parse the JSON output concatenating strings, delimited by <code class="language-plaintext highlighter-rouge">"</code> with <code class="language-plaintext highlighter-rouge">+</code>
<ul>
<li><code class="language-plaintext highlighter-rouge">.[].name</code> provides the name of the stanza, that is it reads the <code class="language-plaintext highlighter-rouge">name</code> property of the JSON output;</li>
<li><code class="language-plaintext highlighter-rouge">.[].status.message</code> provides the backup status message, that is, the <code class="language-plaintext highlighter-rouge">ok</code> shown in the output;</li>
<li><code class="language-plaintext highlighter-rouge">(.[].backup[-1].timestamp.stop | strftime("%Y-%m-%d %H:%M") )</code> is clearly the trickiest part, and it gets the last backup (i.e., the backup <code class="language-plaintext highlighter-rouge">-1</code> from the end), extracts its stop timestamp (there are <code class="language-plaintext highlighter-rouge">start</code> and <code class="language-plaintext highlighter-rouge">stop</code> timestamp properties) and filters it (i.e., pipes within <code class="language-plaintext highlighter-rouge">jq</code>) to <code class="language-plaintext highlighter-rouge">strftime</code> to display the timestamp in a more human friendly way.</li>
</ul>
</li>
</ul>
<h2 id="get-all-the-backups-for-a-stanza">Get all the backups for a stanza</h2>
<p>It is possible to iterate over all the backup information and therefore get an overall status of all the backups:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgbackrest info <span class="nt">--stanza</span> miguel <span class="nt">--output</span> json | jq <span class="nt">-r</span> <span class="s1">'"Stanza: " + .[].name + " (" + .[].status.message + ") " + " backup completed at " + (.[].backup[].timestamp.stop | strftime("%Y-%m-%d") ) + " of size " + (.[].backup[].info.size/1024|tostring ) + " KiB"'</span>
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-01-27 of size 3578696.4814453125 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-02-27 of size 3578696.4814453125 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-03-27 of size 3578696.4814453125 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-04-27 of size 3578696.4814453125 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-05-27 of size 3582783.4150390625 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-06-27 of size 3582783.4150390625 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-07-27 of size 3582783.4150390625 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-07-27 of size 3582783.4150390625 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-09-27 of size 3585732.208984375 KiB
Stanza: miguel <span class="o">(</span>ok<span class="o">)</span> backup completed at 2021-07-27 of size 3585732.208984375 KiB
</code></pre></div></div>
<p><br />
<br />
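The trick here is to use <code class="language-plaintext highlighter-rouge">.backup[]</code> without an index, so that <code class="language-plaintext highlighter-rouge">jq</code> iterates over every backup, while <code class="language-plaintext highlighter-rouge">-r</code> prints raw strings instead of JSON-quoted ones. Also note that it is possible to add the size of the backup, as well as other information tailored to your needs.</p>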
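<p>When several <code class="language-plaintext highlighter-rouge">.[]</code> iterations appear in the same string concatenation, <code class="language-plaintext highlighter-rouge">jq</code> combines them with each other, so with many backups it is safer to iterate only once. What follows is a minimal sketch of that idea: the function name and the sample JSON are hypothetical (the sample only mimics the fields used above, it is not real <code class="language-plaintext highlighter-rouge">pgbackrest</code> output), and it assumes <code class="language-plaintext highlighter-rouge">info.size</code> is expressed in bytes.</p>

```shell
#!/bin/sh
# Hypothetical sample with only the fields used by the filter below
sample='[{"name":"miguel","status":{"message":"ok"},"backup":[
  {"timestamp":{"stop":1627377780},"info":{"size":3145728}},
  {"timestamp":{"stop":1627464180},"info":{"size":4194304}}]}]'

# backup_report: read the JSON on stdin and print one line per backup.
# The stanza name and status are bound to variables once, then the
# backup array is iterated a single time.
backup_report() {
    jq -r '.[]
           | .name as $stanza
           | .status.message as $status
           | .backup[]
           | "Stanza: \($stanza) (\($status)) backup completed at "
             + (.timestamp.stop | strftime("%Y-%m-%d"))
             + " of size " + (.info.size / 1048576 | floor | tostring) + " MiB"'
}

printf '%s\n' "$sample" | backup_report
```

<p>Against a real repository the same function would be fed with <code class="language-plaintext highlighter-rouge">pgbackrest info --stanza miguel --output json | backup_report</code>.</p>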
<h2 id="get-the-last-backup-within-a-set-of-servers">Get the <em>last backup</em> within a set of servers</h2>
<p>It is possible to elaborate a little more on the <code class="language-plaintext highlighter-rouge">jq</code> extract string and loop it within a simple shell iteration to get information about all your servers. Of course, this is simpler if your servers all have a predictable name, like <code class="language-plaintext highlighter-rouge">server-01</code>, <code class="language-plaintext highlighter-rouge">server-02</code> and so on.
<br /></p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="k">for </span>server <span class="k">in</span> <span class="o">{</span>1..10<span class="o">}</span><span class="p">;</span> <span class="k">do </span><span class="nb">printf</span> <span class="s2">"Stanza server-%02d with last backup at %s</span><span class="se">\n</span><span class="s2">"</span> <span class="nv">$server</span> <span class="s2">"</span><span class="si">$(</span> pgbackrest info <span class="nt">--stanza</span> <span class="si">$(</span><span class="nb">printf</span> <span class="s1">'server-%02d'</span> <span class="nv">$server</span><span class="si">)</span> <span class="nt">--output</span> json | jq <span class="s1">' (.[0].backup[-1].timestamp.stop | strftime("%Y-%m-%d %H:%M") )'</span> <span class="si">)</span><span class="s2">"</span> <span class="p">;</span> <span class="k">done
</span>Stanza server-01 with last backup at <span class="s2">"2021-07-27 09:23"</span>
Stanza server-02 with last backup at <span class="s2">"2021-07-27 01:23"</span>
Stanza server-03 with last backup at <span class="s2">"2021-07-27 02:23"</span>
Stanza server-04 with last backup at <span class="s2">"2021-07-27 03:23"</span>
Stanza server-05 with last backup at <span class="s2">"2021-07-27 05:23"</span>
Stanza server-06 with last backup at <span class="s2">"2021-07-27 06:23"</span>
Stanza server-07 with last backup at <span class="s2">"2021-07-27 07:23"</span>
Stanza server-08 with last backup at <span class="s2">"2021-07-27 08:23"</span>
Stanza server-09 with last backup at <span class="s2">"2021-07-27 10:23"</span>
Stanza server-10 with last backup at <span class="s2">"2021-07-27 11:23"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Please note the usage of <code class="language-plaintext highlighter-rouge">printf(1)</code> to cope with numbers like <code class="language-plaintext highlighter-rouge">01</code>, as well as the <code class="language-plaintext highlighter-rouge">for</code> to invoke <code class="language-plaintext highlighter-rouge">pgbackrest info</code> against every single stanza. Similar results can be obtained with <code class="language-plaintext highlighter-rouge">jq</code> iterations.</p>
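<p>The zero-padded naming used in the loop above can be isolated in a small helper, which makes it easier to reuse. This is just a sketch: the function name is hypothetical, and it assumes the <code class="language-plaintext highlighter-rouge">server-NN</code> naming scheme mentioned earlier. The loop is written in plain POSIX shell, so it does not depend on brace expansion.</p>

```shell
#!/bin/sh
# stanza_name N: print the stanza name for server number N,
# zero padded to two digits (for example 7 becomes server-07).
# Hypothetical helper, based on the server-NN naming scheme.
stanza_name() {
    printf 'server-%02d' "$1"
}

# Print the stanza names for servers 1 to 10
i=1
while [ "$i" -le 10 ]; do
    echo "$(stanza_name "$i")"
    i=$(( i + 1 ))
done
```

<p>Each generated name can then be passed to <code class="language-plaintext highlighter-rouge">pgbackrest info --stanza</code> as in the loop shown before.</p>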
<h1 id="conclusions">Conclusions</h1>
<p>The capability to output information in JSON can greatly simplify the monitoring of the backup status. There is no need to deploy a complex monitoring stack though, and it suffices to use <code class="language-plaintext highlighter-rouge">jq</code> to get a report about servers and backups. Of course, being able to navigate the JSON output and play with shell scripting can allow you to get even better results.</p>
A simple example of LATERAL use2021-08-07T00:00:00+00:00https://fluca1978.github.io/2021/08/07/PostgreSQLLateralJoin<p>How <code class="language-plaintext highlighter-rouge">LATERAL</code> can help to solve problems…</p>
<h1 id="a-simple-example-of-lateral-use">A simple example of LATERAL use</h1>
<p>A few days ago I found a question by a user on Facebook: <em>how to select events from a table where they are no more than 10 minutes one from another?</em>
<br />
My first answer was related to <code class="language-plaintext highlighter-rouge">LATERAL</code>, and in this post I try to show with an example how I understood and could solve the above question.</p>
<p><br />
<br />
First of all, let’s build an <em>events</em> table, where each row has a timestamp.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">events</span><span class="p">(</span>
<span class="n">pk</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span>
<span class="p">,</span> <span class="n">event</span> <span class="nb">text</span>
<span class="p">,</span> <span class="n">ts</span> <span class="nb">timestamp</span> <span class="k">default</span> <span class="k">CURRENT_TIMESTAMP</span>
<span class="p">,</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">(</span> <span class="n">pk</span> <span class="p">)</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br />
Now, let’s populate the table with “random” data so that there are events spread in a range of <code class="language-plaintext highlighter-rouge">2 minutes</code> each:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">events</span><span class="p">(</span> <span class="n">event</span><span class="p">,</span> <span class="n">ts</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'event #'</span> <span class="o">||</span> <span class="n">v</span><span class="p">,</span> <span class="k">current_timestamp</span> <span class="o">-</span> <span class="p">(</span> <span class="p">(</span> <span class="n">v</span> <span class="o">*</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">' minutes'</span> <span class="p">)::</span><span class="n">interval</span>
<span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">100</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Having set up the table and the data, how can we relate every tuple with other events that are not outside a ten-minute window? <code class="language-plaintext highlighter-rouge">LATERAL</code> comes to the rescue.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span><span class="p">,</span> <span class="n">e1</span><span class="p">.</span><span class="n">event</span><span class="p">,</span> <span class="n">e2</span><span class="p">.</span><span class="o">*</span>
<span class="k">FROM</span> <span class="n">events</span> <span class="n">e1</span><span class="p">,</span>
<span class="k">LATERAL</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">pk</span><span class="p">,</span> <span class="n">event</span><span class="p">,</span> <span class="n">ts</span> <span class="o">-</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="k">as</span> <span class="n">time_elapsed</span>
<span class="k">FROM</span> <span class="n">events</span>
<span class="k">WHERE</span> <span class="n">pk</span> <span class="o"><></span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span> <span class="k">AND</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="o">-</span> <span class="n">ts</span> <span class="o"><=</span> <span class="s1">'10 minutes'</span><span class="p">::</span><span class="n">interval</span> <span class="p">)</span> <span class="n">e2</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span> <span class="k">LIMIT</span> <span class="mi">20</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">event</span> <span class="o">|</span> <span class="n">pk</span> <span class="o">|</span> <span class="n">event</span> <span class="o">|</span> <span class="n">time_elapsed</span>
<span class="c1">-----+----------+-----+----------+--------------</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">505</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">5</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">506</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">6</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">505</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">5</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">506</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">6</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">507</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">7</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">505</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">5</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">506</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">6</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">507</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">7</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">508</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">8</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="p">(</span><span class="mi">20</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br />
Let’s disassemble the query and see how it works.
The subquery selects all the tuples whose timestamp is no more than <code class="language-plaintext highlighter-rouge">10 minutes</code> older than that of the row the outer query is currently evaluating, excluding that very row (i.e., <code class="language-plaintext highlighter-rouge">e1.pk</code>). Usually a subquery in the <code class="language-plaintext highlighter-rouge">FROM</code> clause is evaluated on its own, without any knowledge of the other items; here, however, the subquery is prefixed with <code class="language-plaintext highlighter-rouge">LATERAL</code> which, in simple words, means <em>evaluate the subquery for every row of the outer result set</em>. This means that the <code class="language-plaintext highlighter-rouge">LATERAL</code> subquery can access the outer query row and can “reason” about it while building its own result set.
<br />
An important thing to keep in mind while dealing with <code class="language-plaintext highlighter-rouge">LATERAL</code> is that the subquery must be given an alias, in my case <code class="language-plaintext highlighter-rouge">e2</code>. Please note that within the <code class="language-plaintext highlighter-rouge">LATERAL</code> subquery I compute the difference between the timestamp of the inner tuple and that of the outer one: as you can see from the output column <code class="language-plaintext highlighter-rouge">time_elapsed</code>, consecutive rows differ by 2 minutes, which is exactly how we generated the rows.
<br />
<br />
What happens if you don’t use <code class="language-plaintext highlighter-rouge">LATERAL</code>? Well, you cannot reference the <code class="language-plaintext highlighter-rouge">e1</code> outer tuple; that is, there is no way for a plain subquery in the <code class="language-plaintext highlighter-rouge">FROM</code> clause to cross-reference something outside of its scope:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span><span class="p">,</span> <span class="n">e1</span><span class="p">.</span><span class="n">event</span><span class="p">,</span> <span class="n">e2</span><span class="p">.</span><span class="o">*</span>
<span class="k">FROM</span> <span class="n">events</span> <span class="n">e1</span><span class="p">,</span>
<span class="p">(</span> <span class="k">SELECT</span> <span class="n">pk</span><span class="p">,</span> <span class="n">event</span><span class="p">,</span> <span class="n">ts</span> <span class="o">-</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="k">as</span> <span class="n">time_elapsed</span>
<span class="k">FROM</span> <span class="n">events</span> <span class="k">WHERE</span> <span class="n">pk</span> <span class="o"><></span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span> <span class="k">AND</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="o">-</span> <span class="n">ts</span> <span class="o"><=</span> <span class="s1">'10 minutes'</span><span class="p">::</span><span class="n">interval</span> <span class="p">)</span> <span class="n">e2</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span> <span class="k">LIMIT</span> <span class="mi">20</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">invalid</span> <span class="n">reference</span> <span class="k">to</span> <span class="k">FROM</span><span class="o">-</span><span class="n">clause</span> <span class="n">entry</span> <span class="k">for</span> <span class="k">table</span> <span class="nv">"e1"</span>
<span class="n">LINE</span> <span class="mi">1</span><span class="p">:</span> <span class="p">...,</span> <span class="n">e2</span><span class="p">.</span><span class="o">*</span> <span class="k">FROM</span> <span class="n">events</span> <span class="n">e1</span><span class="p">,</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">pk</span><span class="p">,</span> <span class="n">event</span><span class="p">,</span> <span class="n">ts</span> <span class="o">-</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="k">as</span> <span class="n">t</span><span class="p">...</span>
<span class="o">^</span>
<span class="n">HINT</span><span class="p">:</span> <span class="n">There</span> <span class="k">is</span> <span class="n">an</span> <span class="n">entry</span> <span class="k">for</span> <span class="k">table</span> <span class="nv">"e1"</span><span class="p">,</span> <span class="n">but</span> <span class="n">it</span> <span class="n">cannot</span> <span class="n">be</span> <span class="n">referenced</span> <span class="k">from</span> <span class="n">this</span> <span class="n">part</span> <span class="k">of</span> <span class="n">the</span> <span class="n">query</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, PostgreSQL clearly states that you cannot refer to <code class="language-plaintext highlighter-rouge">e1</code> (the outer tuple) from within the scope of the subquery.</p>
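<p>One detail is worth noting: the predicate <code class="language-plaintext highlighter-rouge">e1.ts - ts &lt;= '10 minutes'</code> only bounds the events that are older than the outer one, because for newer events the difference is negative and therefore always satisfies the condition. If a window limited on both sides is what you need, a possible variant (a sketch, not the query that produced the output shown above) could be:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- same LATERAL subquery, but the inner timestamp must fall
-- between ten minutes before and ten minutes after the outer one
SELECT e1.pk, e1.event, e2.*
FROM events e1,
LATERAL ( SELECT pk, event, ts - e1.ts AS time_elapsed
          FROM events
          WHERE pk &lt;&gt; e1.pk
          AND   ts BETWEEN e1.ts - '10 minutes'::interval
                       AND e1.ts + '10 minutes'::interval ) e2
ORDER BY e1.pk, e2.pk;
</code></pre></div></div>
<p>With this predicate, no row can report a <code class="language-plaintext highlighter-rouge">time_elapsed</code> beyond ten minutes in either direction.</p>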
<h2 id="lateral-joins"><code class="language-plaintext highlighter-rouge">LATERAL</code> Joins</h2>
<p>It is, of course, possible to use <code class="language-plaintext highlighter-rouge">LATERAL</code> in a join, and in this case the above query can be rewritten as:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span><span class="p">,</span> <span class="n">e1</span><span class="p">.</span><span class="n">event</span><span class="p">,</span> <span class="n">e2</span><span class="p">.</span><span class="o">*</span>
<span class="k">FROM</span> <span class="n">events</span> <span class="n">e1</span> <span class="k">JOIN</span> <span class="k">LATERAL</span>
<span class="p">(</span> <span class="k">SELECT</span> <span class="n">pk</span><span class="p">,</span> <span class="n">event</span><span class="p">,</span> <span class="n">ts</span> <span class="o">-</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="k">as</span> <span class="n">time_elapsed</span>
<span class="k">FROM</span> <span class="n">events</span> <span class="k">WHERE</span> <span class="n">pk</span> <span class="o"><></span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span> <span class="k">AND</span> <span class="n">e1</span><span class="p">.</span><span class="n">ts</span> <span class="o">-</span> <span class="n">ts</span> <span class="o"><=</span> <span class="s1">'10 minutes'</span><span class="p">::</span><span class="n">interval</span> <span class="p">)</span> <span class="n">e2</span>
<span class="k">ON</span> <span class="k">true</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">e1</span><span class="p">.</span><span class="n">pk</span> <span class="k">LIMIT</span> <span class="mi">20</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">event</span> <span class="o">|</span> <span class="n">pk</span> <span class="o">|</span> <span class="n">event</span> <span class="o">|</span> <span class="n">time_elapsed</span>
<span class="c1">-----+----------+-----+----------+--------------</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">505</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">5</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">506</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">6</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">505</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">5</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">506</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">6</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">507</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">7</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">505</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">5</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">506</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">6</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">507</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">7</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">08</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">503</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">3</span> <span class="o">|</span> <span class="mi">508</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">8</span> <span class="o">|</span> <span class="o">-</span><span class="mi">00</span><span class="p">:</span><span class="mi">10</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="mi">501</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">1</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">504</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">4</span> <span class="o">|</span> <span class="mi">502</span> <span class="o">|</span> <span class="n">event</span> <span class="o">#</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">:</span><span class="mi">00</span>
<span class="p">(</span><span class="mi">20</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
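<p>The <code class="language-plaintext highlighter-rouge">JOIN</code> form also allows for outer joins. As a sketch (the column aliases <code class="language-plaintext highlighter-rouge">previous_pk</code> and <code class="language-plaintext highlighter-rouge">previous_event</code> are made up for this example), the following query relates every event to the single event that immediately precedes it in time:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- for every outer row, the subquery picks the closest older event (if any)
SELECT e1.pk, e1.event, e2.pk AS previous_pk, e2.event AS previous_event
FROM events e1 LEFT JOIN LATERAL
     ( SELECT pk, event
       FROM events
       WHERE ts &lt; e1.ts
       ORDER BY ts DESC
       LIMIT 1 ) e2
ON true
ORDER BY e1.pk;
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">ON true</code> condition is there because the actual filtering already happens within the subquery. Thanks to <code class="language-plaintext highlighter-rouge">LEFT JOIN</code>, an event without any predecessor is kept with <code class="language-plaintext highlighter-rouge">NULL</code> values, while a plain <code class="language-plaintext highlighter-rouge">JOIN</code> would discard it.</p>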
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">LATERAL</code> is a very powerful SQL construct in PostgreSQL, and it can help solve problems you would otherwise solve by means of cursors and iterations.</p>
Select Distinct Bytea (or Blobs)2021-08-04T00:00:00+00:00https://fluca1978.github.io/2021/08/04/SelectDistinctBlob<p>A strange behaviour I found in Oracle.</p>
<h1 id="select-distinct-bytea-or-blobs">Select Distinct Bytea (or Blobs)</h1>
<p><em>TLDR: it seems to me that PostgreSQL has a more comfortable behaviour than Oracle when dealing with <code class="language-plaintext highlighter-rouge">distinct</code> and <code class="language-plaintext highlighter-rouge">BLOB</code>-like fields.</em></p>
<p><br />
<br /></p>
<p>I’m not an avid Oracle user, at least not as much as I am a PostgreSQL one.
<br />
In the last few days I spotted a problem with an application of mine: after having added a <code class="language-plaintext highlighter-rouge">BLOB</code> column to an Oracle table, a few automated queries began to fail. It was not so simple, in the beginning, to find out what the problem was, but essentially the <em>ORM</em> I am using was generating a query with a <code class="language-plaintext highlighter-rouge">distinct</code> clause, and it seems that Oracle does not accept such a query when it involves a <code class="language-plaintext highlighter-rouge">BLOB</code> or a <code class="language-plaintext highlighter-rouge">CLOB</code> field.
<br />
Let’s see an example: the <code class="language-plaintext highlighter-rouge">blobby</code> table is made of a <code class="language-plaintext highlighter-rouge">varchar2</code> <code class="language-plaintext highlighter-rouge">description</code> field and a <code class="language-plaintext highlighter-rouge">bdata</code> field of type <code class="language-plaintext highlighter-rouge">BLOB</code>.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SQL</span><span class="o">></span> <span class="k">select</span> <span class="k">distinct</span> <span class="n">bdata</span><span class="p">,</span> <span class="n">description</span> <span class="k">from</span> <span class="n">blobby</span><span class="p">;</span>
<span class="k">select</span> <span class="k">distinct</span> <span class="n">bdata</span><span class="p">,</span> <span class="n">description</span> <span class="k">from</span> <span class="n">blobby</span>
<span class="o">*</span>
<span class="n">ERROR</span> <span class="k">at</span> <span class="n">line</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">ORA</span><span class="o">-</span><span class="mi">00932</span><span class="p">:</span> <span class="n">inconsistent</span> <span class="n">datatypes</span><span class="p">:</span> <span class="n">expected</span> <span class="o">-</span> <span class="n">got</span> <span class="nb">BLOB</span>
</code></pre></div></div>
<p><br />
<br />
The reported error is somewhat obscure to me: <strong>ORA-00932: inconsistent datatypes: expected - got BLOB</strong> does not give me enough information to understand what type the system was expecting. Still, the trailing <code class="language-plaintext highlighter-rouge">BLOB</code> pointed me towards the problem.
<br />
At the beginning, however, I was not even able to reproduce the problem, because if you don’t specify an explicit column list, the same query works:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SQL</span><span class="o">></span> <span class="k">select</span> <span class="k">distinct</span> <span class="o">*</span> <span class="k">from</span> <span class="n">blobby</span><span class="p">;</span>
<span class="p">...</span>
<span class="mi">6</span> <span class="k">rows</span> <span class="n">selected</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I was unable to make the query work even using a cast to different types, so I guess Oracle cannot handle the query when the columns are explicitly listed. And that was the problem: many ORMs, including the one I’m using, produce queries where all the columns are explicitly requested as output fields, and so Oracle was refusing to run the query.</p>
<h1 id="what-about-postgresql">What About PostgreSQL?</h1>
<p>I was curious to see how PostgreSQL handles the same situation, assuming <code class="language-plaintext highlighter-rouge">BLOB</code> can be translated into a <code class="language-plaintext highlighter-rouge">bytea</code> field.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">table</span> <span class="n">blobby</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span><span class="p">,</span>
<span class="n">description</span> <span class="nb">text</span><span class="p">,</span> <span class="n">bdata</span> <span class="n">bytea</span><span class="p">,</span> <span class="k">primary</span> <span class="k">key</span><span class="p">(</span> <span class="n">pk</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">blobby</span><span class="p">(</span> <span class="n">description</span> <span class="p">)</span> <span class="k">select</span> <span class="s1">'Record '</span> <span class="o">||</span> <span class="n">v</span> <span class="k">from</span>
<span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">5</span>
<span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">lo_import</span> <span class="n">myfile</span><span class="p">.</span><span class="n">pdf</span>
<span class="n">lo_import</span> <span class="mi">50626</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">update</span> <span class="n">blobby</span> <span class="k">set</span> <span class="n">bdata</span> <span class="o">=</span> <span class="n">lo_get</span><span class="p">(</span> <span class="mi">50626</span> <span class="p">);</span>
<span class="k">UPDATE</span> <span class="mi">5</span>
<span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">o</span> <span class="n">test</span><span class="p">.</span><span class="n">csv</span>
<span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">a</span>
<span class="k">Output</span> <span class="n">format</span> <span class="k">is</span> <span class="n">unaligned</span><span class="p">.</span>
<span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">f</span> <span class="s1">';'</span>
<span class="n">Field</span> <span class="n">separator</span> <span class="k">is</span> <span class="nv">";"</span><span class="p">.</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="k">distinct</span> <span class="n">bdata</span><span class="p">,</span> <span class="n">description</span> <span class="k">from</span> <span class="n">blobby</span><span class="p">;</span>
<span class="c1">-- same as select distinct * from blobby</span>
<span class="o">%</span> <span class="n">ls</span> <span class="o">-</span><span class="mi">1</span><span class="n">hs</span> <span class="n">test</span><span class="p">.</span><span class="n">csv</span>
<span class="mi">23</span><span class="n">M</span> <span class="n">test</span><span class="p">.</span><span class="n">csv</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Apart from the initial part that creates and populates the table, as you can see the <code class="language-plaintext highlighter-rouge">SELECT</code> works with both an explicit column list and a wildcard.</p>
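<p>Since the full result drags the whole binary content to the client, a lighter way to check the same behavior is to count the distinct rows on the server side. This is only a sketch, assuming the same <code class="language-plaintext highlighter-rouge">testdb</code> database and <code class="language-plaintext highlighter-rouge">blobby</code> table created above:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql -d testdb -c "select count(*) from ( select distinct bdata, description from blobby ) d;"
</code></pre></div></div>
<p>The subquery is the very same <code class="language-plaintext highlighter-rouge">SELECT DISTINCT</code> with an explicit column list, so if PostgreSQL could not compare <code class="language-plaintext highlighter-rouge">bytea</code> values the statement would fail instead of returning a count.</p>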
<h1 id="conclusions">Conclusions</h1>
<p>I don’t have any conclusions, and I don’t blame one product or the other. They just behave differently in pretty much the same context.
<br />
I like the PostgreSQL approach the most, because it seems more natural. Moreover, Oracle error messages seem very obscure to me!</p>
pgbackrest async behavior2021-07-27T00:00:00+00:00https://fluca1978.github.io/2021/07/27/pgbackrestAsync<p>pgbackrest can work in an asynchronous way in order to improve resource usage.</p>
<h1 id="pgbackrest-async-behavior">pgbackrest async behavior</h1>
<p><a href="https://pgbackrest.org/configuration.html" target="_blank">pgbackrest</a> is an <strong>amazing backup tool</strong>: it is rock-solid (as PostgreSQL is) and designed to work under heavy database load.
<br />
One feature it provides to improve the efficiency of WAL archiving is the <em>async</em> mode.
<br />
<br />
In “standard” mode, <code class="language-plaintext highlighter-rouge">pgbackrest</code> will <em>push</em> WAL segments to the backup machine, using the <em>classical</em> <code class="language-plaintext highlighter-rouge">archive_command</code> provided by PostgreSQL. As you probably already know, PostgreSQL will wait for <code class="language-plaintext highlighter-rouge">archive_command</code> to complete and acknowledge the WAL transfer. It could happen that:</p>
<ul>
<li>the <code class="language-plaintext highlighter-rouge">archive_command</code> could take a very long time, and while PostgreSQL will continue to work, not yet transferred WALs will make <code class="language-plaintext highlighter-rouge">pg_wal</code> grow;</li>
<li>the <code class="language-plaintext highlighter-rouge">archive_command</code> could fail, and PostgreSQL will warn you (in the logs) about this event and will try again to archive the failed WALs (forever, or better, until it succeeds).</li>
</ul>
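<p>Both situations can be spotted from the PostgreSQL side. The following is a minimal sketch, assuming the server <code class="language-plaintext highlighter-rouge">miguel</code> and the data directory <code class="language-plaintext highlighter-rouge">/postgres/13/data</code> used later in this post: the first command reads the archiver statistics, the second one counts the WAL segments still waiting to be archived (PostgreSQL marks each of them with a <code class="language-plaintext highlighter-rouge">.ready</code> file):</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql -h miguel -U postgres -c 'select archived_count, failed_count, last_archived_wal, last_failed_wal from pg_stat_archiver;' postgres
% sudo -u postgres find /postgres/13/data/pg_wal/archive_status -name '*.ready' | wc -l
</code></pre></div></div>
<p>A growing <code class="language-plaintext highlighter-rouge">failed_count</code>, or a number of <code class="language-plaintext highlighter-rouge">.ready</code> files that keeps increasing, suggests that archiving is not keeping up.</p>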
<p><br />
On the other hand, when doing a restore, PostgreSQL executes the <code class="language-plaintext highlighter-rouge">restore_command</code> to get a new WAL segment, and this in turn results in running <code class="language-plaintext highlighter-rouge">pgbackrest</code> for a single WAL request.
<br />
The key concept here is probably <strong>single WAL request</strong>, both for push and get.
<br />
<br />
<code class="language-plaintext highlighter-rouge">pgbackrest</code> allows for an improvement on this situation by means of <em>asynchronous archive management</em>, both <code class="language-plaintext highlighter-rouge">push</code> and <code class="language-plaintext highlighter-rouge">get</code>. The idea is to give more control to <code class="language-plaintext highlighter-rouge">pgbackrest</code> so that it can optimize I/O operations.
<br />
When PostgreSQL archives a WAL segment, it executes the <code class="language-plaintext highlighter-rouge">archive_command</code> within a loop (allow me to simplify things): when a WAL is ready, <code class="language-plaintext highlighter-rouge">archive_command</code> is invoked and until it has finished, there is no chance to archive an already available WAL segment. On the other hand, when PostgreSQL needs to get a WAL in order to do a restore/recovery, it executes <code class="language-plaintext highlighter-rouge">restore_command</code> on every WAL segment it is expecting to replay. Therefore, if the server has to replay many WALs, it has to execute <code class="language-plaintext highlighter-rouge">restore_command</code> and “download” every WAL one after the other.
<br />
How does the asynchronous mode improve on the above?
<br />
When archiving, that means <em>pushing</em>, <code class="language-plaintext highlighter-rouge">pgbackrest</code> can decide to group several WALs in a single transfer, which for instance reduces the setup/tear-down operations needed to establish a network connection with the backup machine.
<br />
When restoring, that means <em>getting</em>, <code class="language-plaintext highlighter-rouge">pgbackrest</code> could perform a pre-fetch, downloading a few WALs on the local machine and making them immediately available to the PostgreSQL server when needed.</p>
<h2 id="the-test-environment">The test environment</h2>
<p>In this post, I will demonstrate the usage of <code class="language-plaintext highlighter-rouge">pgbackrest</code> asynchronously using my usual two-machine setup:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">miguel</code> is the PostgreSQL server, running Fedora Linux with PostgreSQL 13.3;</li>
<li><code class="language-plaintext highlighter-rouge">carmensita</code> is the backup machine, running Fedora Linux.</li>
</ul>
<p><br />
<code class="language-plaintext highlighter-rouge">pgbackrest</code> is at version <code class="language-plaintext highlighter-rouge">2.34</code>.</p>
<h2 id="asynchronous-configuration-parameters">Asynchronous Configuration Parameters</h2>
<p>There are a bunch of configuration parameters that can be configured within the <code class="language-plaintext highlighter-rouge">pgbackrest.conf</code> file or specified on the command line, as usual.
<br />
The settings mainly regard the spool directory, the queues and the enabling of the asynchronous mode.</p>
<h3 id="enabling-or-disabling-the-asynchronous-mode">Enabling or disabling the asynchronous mode</h3>
<p>There is a single configuration parameter to enable the asynchronous mode: <code class="language-plaintext highlighter-rouge">archive-async</code>.
By default this is false, meaning <code class="language-plaintext highlighter-rouge">pgbackrest</code> will work “normally” as you expect. Turning it on makes every <code class="language-plaintext highlighter-rouge">archive-get</code> and <code class="language-plaintext highlighter-rouge">archive-push</code> run in asynchronous mode.</p>
<h3 id="the-spool-directory">The spool directory</h3>
<p>In order to manage the async operations, <code class="language-plaintext highlighter-rouge">pgbackrest</code> creates on the PostgreSQL machine a <em>spool directory</em>, usually <code class="language-plaintext highlighter-rouge">/var/spool/pgbackrest</code>, where it places an <code class="language-plaintext highlighter-rouge">archive</code> directory and a directory named after the server, or better, the <em>stanza</em>. Such a directory is then split into <code class="language-plaintext highlighter-rouge">in</code> and <code class="language-plaintext highlighter-rouge">out</code>, for <code class="language-plaintext highlighter-rouge">archive-get</code> and <code class="language-plaintext highlighter-rouge">archive-push</code> respectively.
<br />
The spool directory root can be defined with the <code class="language-plaintext highlighter-rouge">spool-path</code> configuration parameter.
<br />
For example, given the stanza named <code class="language-plaintext highlighter-rouge">miguel</code>, the spool directory will either be <code class="language-plaintext highlighter-rouge">/var/spool/pgbackrest/archive/miguel/out</code> or <code class="language-plaintext highlighter-rouge">/var/spool/pgbackrest/archive/miguel/in</code>.
<br />
<br />
In the <code class="language-plaintext highlighter-rouge">out</code> directory the system will write book-keeping stuff, mainly small text files that will be used to identify at which point the archiving has arrived.
<br />
In the <code class="language-plaintext highlighter-rouge">in</code> directory, the system will store incoming WALs ready to be restored by the PostgreSQL server.</p>
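<p>To get a feeling of the layout described above, the directory tree can be listed directly. This is only a sketch, assuming the default spool path and the stanza <code class="language-plaintext highlighter-rouge">miguel</code>; the <code class="language-plaintext highlighter-rouge">in</code> and <code class="language-plaintext highlighter-rouge">out</code> sub-directories may not exist until the corresponding asynchronous command has run at least once:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% sudo find /var/spool/pgbackrest -maxdepth 3 -type d
</code></pre></div></div>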
<h3 id="queues">Queues</h3>
<p>There are two different settings to manage the queues of <code class="language-plaintext highlighter-rouge">pgbackrest</code>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">archive-push-queue-max</code>;</li>
<li><code class="language-plaintext highlighter-rouge">archive-get-queue-max</code>.</li>
</ul>
<p><br />
They configure the max size of the data enqueued for the push and get operations. When the queue is full, <code class="language-plaintext highlighter-rouge">pgbackrest</code> will behave differently depending on the operation that is ongoing, as explained below.</p>
<h1 id="configuration">Configuration</h1>
<p>The backup machine, named <code class="language-plaintext highlighter-rouge">carmensita</code>, has an <code class="language-plaintext highlighter-rouge">/etc/pgbackrest.conf</code> file configured as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cat</span> /etc/pgbackrest.conf
<span class="o">[</span>global]
start-fast <span class="o">=</span> y
stop-auto <span class="o">=</span> y
repo1-path <span class="o">=</span> /backup/pgbackrest
repo1-retention-full<span class="o">=</span>2
repo1-host-user <span class="o">=</span> backup
log-level-console <span class="o">=</span> info
<span class="o">[</span>miguel]
pg1-host <span class="o">=</span> miguel
pg1-path <span class="o">=</span> /postgres/13/data
</code></pre></div></div>
<p><br />
<br />
while on the PostgreSQL server machine, named <code class="language-plaintext highlighter-rouge">miguel</code>, the <code class="language-plaintext highlighter-rouge">/etc/pgbackrest.conf</code> file is:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span>global]
repo1-path <span class="o">=</span> /backup/pgbackrest
repo1-host-user <span class="o">=</span> backup
log-level-console <span class="o">=</span> info
repo1-host <span class="o">=</span> carmensita
archive-async <span class="o">=</span> y
archive-push-queue-max <span class="o">=</span> 500MB
spool-path <span class="o">=</span> /var/spool/pgbackrest
archive-get-queue-max <span class="o">=</span> 32MB
</code></pre></div></div>
<p><br />
<br /></p>
<p>Last, the <code class="language-plaintext highlighter-rouge">archive_command</code> on the PostgreSQL machine is configured as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>archive_command <span class="o">=</span> <span class="s1">'/usr/bin/pgbackrest \
--pg1-path=/postgres/13/data \
--config=/etc/pgbackrest.conf \
--stanza=miguel \
archive-push %p'</span>
archive_mode <span class="o">=</span> on
</code></pre></div></div>
<p><br />
<br />
Please note that the <code class="language-plaintext highlighter-rouge">archive-async</code> parameter is specified in the configuration file, instead of being passed on the command line of <code class="language-plaintext highlighter-rouge">archive-push</code> or <code class="language-plaintext highlighter-rouge">archive-get</code>. This simplifies, in my opinion, the usage of <code class="language-plaintext highlighter-rouge">pgbackrest</code>.</p>
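<p>For comparison, the same settings could be passed as command line options. This is just a sketch of what the <code class="language-plaintext highlighter-rouge">archive_command</code> might look like in that case, not the configuration used in this post:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical alternative: async settings on the command line
archive_command = '/usr/bin/pgbackrest --stanza=miguel --archive-async --spool-path=/var/spool/pgbackrest archive-push %p'
</code></pre></div></div>
<p>Keeping the options in <code class="language-plaintext highlighter-rouge">pgbackrest.conf</code> means that <code class="language-plaintext highlighter-rouge">archive-push</code> and <code class="language-plaintext highlighter-rouge">archive-get</code> read the same values without having to repeat them in both commands.</p>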
<p><br />
<br />
With all the above up and running, it is possible to see how the asynchronous mode works.</p>
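<p>Before generating any traffic, it could be worth verifying that the archiving chain is working. A minimal sketch, run on <code class="language-plaintext highlighter-rouge">miguel</code> and assuming the stanza has already been created (that step is not shown in this post):</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% sudo -u postgres pgbackrest --stanza=miguel check
% sudo -u postgres pgbackrest --stanza=miguel info
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">check</code> command verifies that a WAL segment can reach the repository, while <code class="language-plaintext highlighter-rouge">info</code> summarizes the backups and the archived WAL range.</p>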
<h1 id="archiving-archive-push">Archiving (<code class="language-plaintext highlighter-rouge">archive-push</code>)</h1>
<p>Let’s start with the backup scenario, that is <code class="language-plaintext highlighter-rouge">archive-push</code>.</p>
<h2 id="when-things-go-right">When things go right</h2>
<p>Let’s see what happens when everything is fine: I launched a <code class="language-plaintext highlighter-rouge">pgbench</code> session in order to generate some traffic and, therefore, some WAL segment generation and archiving.
On one hand, <code class="language-plaintext highlighter-rouge">pgbench</code> was running as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-c</span> 8 <span class="nt">-T</span> 120 <span class="nt">-h</span> miguel <span class="nt">-U</span> pgbench <span class="nt">-n</span> <span class="nt">-P</span> 5 pgbench
</code></pre></div></div>
<p><br />
<br /></p>
<p>While <code class="language-plaintext highlighter-rouge">pgbench</code> is running, let’s inspect what is happening on the PostgreSQL machine, with particular regard to the spooling folder:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ls -1s /var/spool/pgbackrest/archive/miguel/out \</span>
<span class="o">&&</span> psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'select last_archived_wal from pg_stat_archiver;'</span> postgres
0 000000070000014E00000015.ok
last_archived_wal
<span class="nt">--------------------------</span>
000000070000014E00000015
<span class="c"># # after a while</span>
<span class="c"># ls -1s /var/spool/pgbackrest/archive/miguel/out \</span>
<span class="o">&&</span> psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'select last_archived_wal from pg_stat_archiver;'</span> postgres
0 000000070000014E00000016.ok
last_archived_wal
<span class="nt">--------------------------</span>
000000070000014E00000016
</code></pre></div></div>
<p><br />
<br />
As you can see, in the spool directory there will be an <strong>empty file</strong> named after the last archived WAL segment, that is the last segment sent to the backup machine, and the suffix <code class="language-plaintext highlighter-rouge">.ok</code>.
<br />
In the PostgreSQL logs, there will be a notice when <code class="language-plaintext highlighter-rouge">pgbackrest</code> completes the push (depending on the log level you configured):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INFO: pushed WAL file <span class="s1">'000000070000014E000000AD'</span> to the archive asynchronously
</code></pre></div></div>
<h2 id="when-things-go-wrong">When things go wrong</h2>
<h3 id="first-case-shutting-down-the-backup-machine">First case: shutting down the backup machine</h3>
<p>Assume the backup machine, <code class="language-plaintext highlighter-rouge">carmensita</code>, is turned off. The archiving cannot work, of course, and if you generate again some traffic on the PostgreSQL server (e.g., by using <code class="language-plaintext highlighter-rouge">pgbench</code> as shown above), the situation on the spool directory is:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ls -1s /var/spool/pgbackrest/archive/miguel/out \</span>
<span class="o">&&</span> psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'select last_archived_wal, last_failed_wal from pg_stat_archiver;'</span> postgres
4 global.error
last_archived_wal | last_failed_wal
<span class="nt">--------------------------</span>|--------------------------
000000070000014E0000001A | 000000070000014E0000001B
</code></pre></div></div>
<p><br />
<br />
The file <code class="language-plaintext highlighter-rouge">global.error</code> contains a textual description of what is happening:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># cat /var/spool/pgbackrest/archive/miguel/out/global.error </span>
103
unable to find a valid repository:
repo1: <span class="o">[</span>UnknownError] remote-0 process on <span class="s1">'carmensita'</span> terminated unexpectedly <span class="o">[</span>255]: ssh: connect to host carmensita port 22: No route to hos
</code></pre></div></div>
<p><br />
<br /></p>
<p>If you then restart the backup machine, so that the archiving starts working again, the situation on the spool directory is as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ls -1s /var/spool/pgbackrest/archive/miguel/out \</span>
<span class="o">&&</span> psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'select last_archived_wal, last_failed_wal from pg_stat_archiver;'</span> postgres
0 000000070000014E0000001B.ok
0 000000070000014E0000001C.ok
0 000000070000014E0000001D.ok
0 000000070000014E0000001E.ok
0 000000070000014E0000001F.ok
last_archived_wal | last_failed_wal
<span class="nt">--------------------------</span>|--------------------------
000000070000014E0000001F | 000000070000014E0000001B
</code></pre></div></div>
<p><br />
<br />
As you can see, the <code class="language-plaintext highlighter-rouge">.ok</code> files are there and the archiving is working again.
<br />
Over time, there could be one or more <code class="language-plaintext highlighter-rouge">.ok</code> files. The idea is that the <em>last</em> <code class="language-plaintext highlighter-rouge">.ok</code> file indicates the last asynchronously archived WAL segment (in the above, the one ending with <code class="language-plaintext highlighter-rouge">1F</code>).</p>
<h3 id="second-case-generating-more-wals">Second case: generating more WALs</h3>
<p>Shut down the backup machine again, so that the PostgreSQL server is not able to archive WAL segments; then generate quite an amount of traffic to increase the WAL directory size (<code class="language-plaintext highlighter-rouge">pg_wal</code>).
<br />
Let’s inspect the situation:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ls -1s /var/spool/pgbackrest/archive/miguel/out \</span>
<span class="o">&&</span> psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'select last_archived_wal, last_failed_wal from pg_stat_archiver;'</span> postgres
4 global.error
last_archived_wal | last_failed_wal
<span class="nt">--------------------------</span>|--------------------------
000000070000014E000000ED | 000000070000014E000000EE
<span class="c"># cat /var/spool/pgbackrest/archive/miguel/out/global.error </span>
103
unable to find a valid repository:
repo1: <span class="o">[</span>UnknownError] remote-0 process on <span class="s1">'carmensita'</span> terminated unexpectedly <span class="o">[</span>255]: ssh: connect to host carmensita port 22: No route to host
</code></pre></div></div>
<p><br />
<br />
Therefore, <code class="language-plaintext highlighter-rouge">000000070000014E000000ED</code> is the last WAL archived on the backup machine.
<br />
Suppose now a larger amount of data is modified on PostgreSQL, so that it generates many more WAL segments. Clearly PostgreSQL cannot archive segments anymore, and will start accumulating them into <code class="language-plaintext highlighter-rouge">pg_wal</code> to keep them available for when the <code class="language-plaintext highlighter-rouge">archive_command</code> starts working again.
<br />
Or does it?
<br />
Inspect the situation on disk again:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ls -1s /var/spool/pgbackrest/archive/miguel/out \</span>
<span class="o">&&</span> psql <span class="nt">-h</span> miguel <span class="nt">-U</span> postgres <span class="se">\</span>
<span class="nt">-c</span> <span class="s1">'select last_archived_wal, last_failed_wal from pg_stat_archiver;'</span> postgres
4 000000070000014F00000053.ok
4 000000070000014F00000054.ok
4 000000070000014F00000055.ok
4 000000070000014F00000056.ok
4 000000070000014F00000057.ok
4 000000070000014F00000058.ok
4 000000070000014F00000059.ok
4 000000070000014F0000005A.ok.pgbackrest.tmp
last_archived_wal | last_failed_wal
<span class="nt">--------------------------</span>|--------------------------
000000070000014F00000059 | 000000070000014F00000053
<span class="c"># cat /var/spool/pgbackrest/archive/miguel/out/000000070000014F00000059.ok</span>
0
dropped WAL file <span class="s1">'000000070000014F00000059'</span> because archive queue exceeded 500MB
</code></pre></div></div>
<p><br />
<br />
First of all: <strong><code class="language-plaintext highlighter-rouge">last_archived_wal</code> advanced even if the <code class="language-plaintext highlighter-rouge">archive_command</code> is failing (remember that the backup machine is down)</strong>! How is that possible?
<br />
The answer is in how the <code class="language-plaintext highlighter-rouge">pgbackrest</code> asynchronous mode works: <strong>if the amount of WAL waiting to be archived exceeds a configured size, <code class="language-plaintext highlighter-rouge">pgbackrest</code> decides to acknowledge the archiving to the PostgreSQL server, which in turn moves on even if <em>the archived WAL did not hit the backup machine!</em></strong>
<br />
The idea is that <code class="language-plaintext highlighter-rouge">pgbackrest</code> prevents <code class="language-plaintext highlighter-rouge">pg_wal</code> from growing indefinitely, which would risk stopping PostgreSQL altogether. However, <strong>acknowledging a <em>fake archiving</em> means that the WAL stream is broken, so Point In Time Recovery will not be possible anymore around this “hole” and a new backup is <em>strongly recommended</em>!</strong>
<br />
<code class="language-plaintext highlighter-rouge">pgbackrest</code> writes a message into its <code class="language-plaintext highlighter-rouge">.ok</code> files, which are now non-empty and inform the administrator that the WAL segment has been dropped explicitly.
<br />
You can find the same information in the PostgreSQL logs, where <code class="language-plaintext highlighter-rouge">pgbackrest</code> prints a message to make it clear:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo grep </span>000000070000014F00000059 <span class="nv">$PGLOG</span>
INFO: archive-push <span class="nb">command </span>begin 2.34: <span class="o">[</span>pg_wal/000000070000014F00000059] <span class="nt">--archive-async</span> <span class="nt">--archive-push-queue-max</span><span class="o">=</span>500MB <span class="nt">--config</span><span class="o">=</span>/etc/pgbackrest.conf <span class="nt">--exec-id</span><span class="o">=</span>40124-af251b1c <span class="nt">--log-level-console</span><span class="o">=</span>info <span class="nt">--pg1-path</span><span class="o">=</span>/postgres/13/data <span class="nt">--repo1-host</span><span class="o">=</span>carmensita <span class="nt">--repo1-host-user</span><span class="o">=</span>backup <span class="nt">--repo1-path</span><span class="o">=</span>/backup/pgbackrest <span class="nt">--spool-path</span><span class="o">=</span>/var/spool/pgbackrest <span class="nt">--stanza</span><span class="o">=</span>miguel
WARN: dropped WAL file <span class="s1">'000000070000014F00000059'</span> because archive queue exceeded 500MB
INFO: pushed WAL file <span class="s1">'000000070000014F00000059'</span> to the archive asynchronously
</code></pre></div></div>
<p><br />
<br />
There are three pieces of information:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pgbackrest</code> tried to archive the WAL segment, failing;</li>
<li>there is a <code class="language-plaintext highlighter-rouge">WARN</code> that informs you that <code class="language-plaintext highlighter-rouge">pgbackrest</code> instructed PostgreSQL to drop the WAL file <em>as if it had been archived correctly</em>;</li>
<li><code class="language-plaintext highlighter-rouge">pgbackrest</code> states that it has archived the file, so PostgreSQL can proceed to delete or recycle it.</li>
</ul>
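<p>Given the hole in the WAL stream, it could be useful to find out which segments have been dropped and then take a fresh backup. The following is only a sketch under the assumptions of this post (stanza <code class="language-plaintext highlighter-rouge">miguel</code>, repository user <code class="language-plaintext highlighter-rouge">backup</code>); keep in mind that the <code class="language-plaintext highlighter-rouge">.ok</code> files are book-keeping and may be removed by <code class="language-plaintext highlighter-rouge">pgbackrest</code> later on, so the PostgreSQL logs remain the more durable source:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># on miguel: list the spool files that record a dropped segment
% sudo grep -rl 'dropped WAL file' /var/spool/pgbackrest/archive/miguel/out
# on carmensita, once it is reachable again: take a new full backup
% sudo -u backup pgbackrest --stanza=miguel --type=full backup
</code></pre></div></div>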
<p><br />
When does <code class="language-plaintext highlighter-rouge">pgbackrest</code> decide to give up and start faking the archiving to PostgreSQL? The <code class="language-plaintext highlighter-rouge">archive-push-queue-max</code> configuration parameter establishes how much data <code class="language-plaintext highlighter-rouge">pgbackrest</code> can fall behind the normal WAL operations before making PostgreSQL delete segments.
<br />
In my configuration, there is <code class="language-plaintext highlighter-rouge">archive-push-queue-max=500MB</code>, which means that after <code class="language-plaintext highlighter-rouge">500MB</code> of failed WALs, <code class="language-plaintext highlighter-rouge">pgbackrest</code> will start faking and there will be a hole in the WAL stream. Roughly, this corresponds to <code class="language-plaintext highlighter-rouge">32</code> failed WALs in a row.</p>
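<p>The rough figure comes from dividing the queue size by the WAL segment size. As a sketch, assuming the PostgreSQL default of 16MB per segment (the first command shows the actual value on the server), rounding up gives:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql -h miguel -U postgres -c 'show wal_segment_size;' postgres
# 500MB of queue, 16MB per segment (assumed default), rounded up
% echo $(( (500 + 16 - 1) / 16 ))
32
</code></pre></div></div>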
<h2 id="parallel-processes">Parallel Processes</h2>
<p>The configuration parameter <code class="language-plaintext highlighter-rouge">process-max</code> can be used to control how many <em>push workers</em> can be launched to serve the asynchronous system. Suppose that in the configuration there is <code class="language-plaintext highlighter-rouge">process-max = 4</code>; then during WAL archiving you could see something like the following in the process list:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># pstree -c -A</span>
systemd-|-NetworkManager-|-<span class="o">{</span>NetworkManager<span class="o">}</span>
...
|-pgbackrest-|-pgbackrest---ssh
| |-pgbackrest---ssh
| |-pgbackrest---ssh
| |-pgbackrest---ssh
| <span class="sb">`</span><span class="nt">-ssh</span>
...
|-postmaster-|-postmaster
| |-postmaster
| |-postmaster
| |-postmaster
| |-postmaster
| |-postmaster---pgbackrest
| |-postmaster
| <span class="sb">`</span><span class="nt">-postmaster</span>
...
</code></pre></div></div>
<p><br />
<br />
As you can see, PostgreSQL has launched <code class="language-plaintext highlighter-rouge">pgbackrest</code> (that is, it is executing the <code class="language-plaintext highlighter-rouge">archive_command</code>), and there are four <code class="language-plaintext highlighter-rouge">pgbackrest</code> worker processes.
<br />
If the system is pushing archives in synchronous mode, <code class="language-plaintext highlighter-rouge">process-max</code> is ignored.
<br />
Every concurrent process will share an <code class="language-plaintext highlighter-rouge">exec-id</code> that identifies the batch to which the process belongs:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># pstree -A -c -a -l | grep pgbackrest</span>
...
|-pgbackrest <span class="nt">--config</span><span class="o">=</span>/etc/pgbackrest.conf <span class="nt">--exec-id</span><span class="o">=</span>46475-10e060a1
| |-pgbackrest <span class="nt">--config</span><span class="o">=</span>/etc/pgbackrest.conf <span class="nt">--exec-id</span><span class="o">=</span>46475-10e060a1
| |-pgbackrest <span class="nt">--config</span><span class="o">=</span>/etc/pgbackrest.conf <span class="nt">--exec-id</span><span class="o">=</span>46475-10e060a1
| |-pgbackrest <span class="nt">--config</span><span class="o">=</span>/etc/pgbackrest.conf <span class="nt">--exec-id</span><span class="o">=</span>46475-10e060a1
...
</code></pre></div></div>
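<p>On the PostgreSQL side, the outcome of each <code class="language-plaintext highlighter-rouge">archive_command</code> invocation can be followed through the standard <code class="language-plaintext highlighter-rouge">pg_stat_archiver</code> view. The query below is only a minimal sketch and was not part of the session shown above; keep in mind that the view reports what the command returned to PostgreSQL, not what the backup machine has actually stored:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- How many WAL segments were handed over to archive_command so far,
-- and did any attempt fail?
SELECT archived_count,
       last_archived_wal,
       last_archived_time,
       failed_count,
       last_failed_wal
FROM pg_stat_archiver;
</code></pre></div></div>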
<p><br />
<br /></p>
<h1 id="restoring-archive-get">Restoring (<code class="language-plaintext highlighter-rouge">archive-get</code>)</h1>
<p>Let’s do a restore from a recent backup:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>systemctl stop postgresql-13.service
% <span class="nb">sudo</span> <span class="nt">-u</span> postgres pgbackrest <span class="nt">--stanza</span> miguel <span class="se">\</span>
<span class="nt">--pg1-path</span> /postgres/13/data <span class="nt">--delta</span> restore
...
INFO: restore <span class="nb">command </span>end: completed successfully <span class="o">(</span>69861ms<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br />
During the restore, the <code class="language-plaintext highlighter-rouge">archive</code> directory within the spool directory of <code class="language-plaintext highlighter-rouge">pgbackrest</code> is cleaned, in particular the specific server directory <code class="language-plaintext highlighter-rouge">miguel</code> is removed, since no WAL archiving is in progress.
<br />
The <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code> file contains the <code class="language-plaintext highlighter-rouge">archive-get</code> command ready to fetch the WAL segments:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo cat</span> /postgres/13/data/postgresql.auto.conf
<span class="c"># Recovery settings generated by pgBackRest restore on 2021-07-27 05:26:26</span>
restore_command <span class="o">=</span> <span class="s1">'pgbackrest --pg1-path=/postgres/13/data --stanza=miguel archive-get %f "%p"'</span>
</code></pre></div></div>
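<p>Once the server accepts connections, a quick way to double check which <code class="language-plaintext highlighter-rouge">restore_command</code> is in effect, and which file it comes from, is to ask <code class="language-plaintext highlighter-rouge">pg_settings</code>. This is just a sketch, assuming PostgreSQL 12 or later (where <code class="language-plaintext highlighter-rouge">restore_command</code> is a regular configuration parameter) and a role allowed to see the <code class="language-plaintext highlighter-rouge">sourcefile</code> column:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Where does the running server take restore_command from?
SELECT name, setting, sourcefile, sourceline
FROM pg_settings
WHERE name = 'restore_command';
</code></pre></div></div>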
<p><br />
<br /></p>
<p>During the system startup, <code class="language-plaintext highlighter-rouge">pgbackrest</code> will get (as usual) WAL segments from the backup machine, but this time in an asynchronous way:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INFO: archive-get <span class="nb">command </span>begin 2.34: <span class="o">[</span>000000070000014F000000A4, pg_wal/RECOVERYXLOG] <span class="nt">--archive-async</span> <span class="nt">--archive-get-queue-max</span><span class="o">=</span>32MB <span class="nt">--exec-id</span><span class="o">=</span>42831-f4ada646 <span class="nt">--log-level-console</span><span class="o">=</span>info <span class="nt">--pg1-path</span><span class="o">=</span>/postgres/13/data <span class="nt">--repo1-host</span><span class="o">=</span>carmensita <span class="nt">--repo1-host-user</span><span class="o">=</span>backup <span class="nt">--repo1-path</span><span class="o">=</span>/backup/pgbackrest <span class="nt">--spool-path</span><span class="o">=</span>/var/spool/pgbackrest <span class="nt">--stanza</span><span class="o">=</span>miguel
INFO: found 000000070000014F000000A4 <span class="k">in </span>the archive asynchronously
INFO: archive-get <span class="nb">command </span>end: completed successfully <span class="o">(</span>713ms<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br />
The above is an excerpt of the PostgreSQL log.
In the meantime, the spool directory was populated with an <code class="language-plaintext highlighter-rouge">in</code> subdirectory for the server, and in that directory the incoming WALs were stored, waiting to be replayed by the PostgreSQL server:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo ls</span> <span class="nt">-1s</span> /var/spool/pgbackrest/archive/miguel/in
16384 000000070000014F000000A4.pgbackrest.tmp
</code></pre></div></div>
<p><br />
<br />
In this scenario, the <code class="language-plaintext highlighter-rouge">archive-get-queue-max</code> parameter can specify the size of <em>pre-fetched</em> WALs: <code class="language-plaintext highlighter-rouge">pgbackrest</code> will fetch and store in the spooling directory no more WAL segments than the specified amount. Unlike the push configuration, setting this parameter does not imply the system will throw away WALs.</p>
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">pgbackrest</code> is an amazing backup tool, rock solid and with a lot of configuration parameters that can help improve resource usage, so that backup and restore work fast and reliably even under heavy load.
<br />
The asynchronous mode can help improve performance by means of <em>batches</em> and <em>pre-fetching</em> of WAL segments. However, you need to be aware that, by design, asynchronous pushing of WALs could produce holes in the WAL stream if the archiving accumulates too much data.
<br />
This is, in my opinion, an excellent feature, because in my experience I’ve seen many times a PostgreSQL server accumulating too many WAL segments (up to consuming all the storage) due to a faulty backup machine (or networking). After all, <code class="language-plaintext highlighter-rouge">pgbackrest</code> ensures that a backup exists, and at least that your PostgreSQL server will not go read-only due to <code class="language-plaintext highlighter-rouge">archive_command</code> failing.
<br />
<br />
Love it or hate it!</p>
PostgreSQL Extension Catalogs2021-07-20T00:00:00+00:00https://fluca1978.github.io/2021/07/20/PostgreSQLExtensions<p>How to see the available and/or installed extensions?</p>
<h1 id="postgresql-extension-catalogs">PostgreSQL Extension Catalogs</h1>
<p>There are three main catalogs that can be useful when dealing with extensions:</p>
<ul>
<li><a href="https://www.postgresql.org/docs/current/catalog-pg-extension.html" target="_blank"><code class="language-plaintext highlighter-rouge">pg_extension</code></a>;</li>
<li><a href="https://www.postgresql.org/docs/13/view-pg-available-extension-versions.html" target="_blank"><code class="language-plaintext highlighter-rouge">pg_available_extension_versions</code></a>;</li>
<li><a href="https://www.postgresql.org/docs/13/view-pg-available-extensions.html" target="_blank"><code class="language-plaintext highlighter-rouge">pg_available_extensions</code></a>.</li>
</ul>
<p><br />
The first one, <code class="language-plaintext highlighter-rouge">pg_extension</code>, provides information about <em>which extensions are installed in the current database</em>, while the last one, <code class="language-plaintext highlighter-rouge">pg_available_extensions</code>, provides information about <em>which extensions are available to the cluster</em>.
<br />
The difference is simple: to be used, an extension must first appear in <code class="language-plaintext highlighter-rouge">pg_available_extensions</code>, which means it has been installed on the cluster (e.g., via <code class="language-plaintext highlighter-rouge">pgxnclient</code>). From this point on, the extension can be <em>installed</em> into the database by means of a <code class="language-plaintext highlighter-rouge">CREATE EXTENSION</code> statement; as a result the extension will appear in the <code class="language-plaintext highlighter-rouge">pg_extension</code> catalog.</p>
<p><br />
As an example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">name</span><span class="p">,</span> <span class="n">default_version</span> <span class="k">from</span> <span class="n">pg_available_extensions</span><span class="p">;</span>
<span class="n">name</span> <span class="o">|</span> <span class="n">default_version</span>
<span class="c1">--------------------|-----------------</span>
<span class="n">intagg</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">plpgsql</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">dict_int</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">dict_xsyn</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">adminpack</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">1</span>
<span class="n">intarray</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span>
<span class="n">amcheck</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">autoinc</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">isn</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">bloom</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">fuzzystrmatch</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">jsonb_plperl</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">btree_gin</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span>
<span class="n">jsonb_plperlu</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">btree_gist</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">5</span>
<span class="n">hstore</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">7</span>
<span class="n">hstore_plperl</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">hstore_plperlu</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">citext</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">6</span>
<span class="n">lo</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">ltree</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="k">cube</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">4</span>
<span class="n">insert_username</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">moddatetime</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">dblink</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">earthdistance</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">file_fdw</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">pageinspect</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">8</span>
<span class="n">pg_buffercache</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span>
<span class="n">pg_freespacemap</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">pg_prewarm</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">8</span>
<span class="n">pg_trgm</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">5</span>
<span class="n">pg_visibility</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">pgcrypto</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span>
<span class="n">pgrowlocks</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">pgstattuple</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">5</span>
<span class="n">postgres_fdw</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">refint</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">seg</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span>
<span class="n">bool_plperl</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">plperlu</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">sslinfo</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">anon</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">9</span><span class="p">.</span><span class="mi">0</span>
<span class="n">tablefunc</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">tcn</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">tsm_system_rows</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">bool_plperlu</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">tsm_system_time</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">pgaudit</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">5</span>
<span class="n">pg_qualstats</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">0</span><span class="p">.</span><span class="mi">2</span>
<span class="n">unaccent</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">plperl</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">orafce</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">13</span>
<span class="n">uuid</span><span class="o">-</span><span class="n">ossp</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">xml2</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">1</span>
<span class="n">pg_background</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above list represents all the available extensions installed on the cluster, thus those I can execute a <code class="language-plaintext highlighter-rouge">CREATE EXTENSION</code> against.
<br />
<br />
The <code class="language-plaintext highlighter-rouge">pg_available_extensions</code> view has an <code class="language-plaintext highlighter-rouge">installed_version</code> field that provides the version number of the extension installed in the current database, or <code class="language-plaintext highlighter-rouge">NULL</code> if the extension is not installed in the current database. Therefore, in order to know if an extension is installed or not in a database, you can run a query like the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">name</span><span class="p">,</span> <span class="n">default_version</span><span class="p">,</span> <span class="n">installed_version</span>
<span class="k">from</span> <span class="n">pg_available_extensions</span>
<span class="k">where</span> <span class="n">installed_version</span> <span class="k">is</span> <span class="k">not</span> <span class="k">null</span><span class="p">;</span>
<span class="n">name</span> <span class="o">|</span> <span class="n">default_version</span> <span class="o">|</span> <span class="n">installed_version</span>
<span class="c1">---------------|-----------------|-------------------</span>
<span class="n">plpgsql</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">dblink</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">orafce</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">13</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">13</span>
<span class="n">pg_background</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
</code></pre></div></div>
<p><br />
<br />
This is a little too much effort and, since an extension could have been installed with different <em>flags</em> in different databases, the <code class="language-plaintext highlighter-rouge">pg_extension</code> catalog provides more detailed and focused information: it lists all the extensions that have been installed in the current database.</p>
<p><br />
Therefore, to see what a database can use, that is, which extensions it has access to, I need to query the <code class="language-plaintext highlighter-rouge">pg_extension</code> catalog:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">extname</span><span class="p">,</span> <span class="n">extversion</span> <span class="k">from</span> <span class="n">pg_extension</span> <span class="p">;</span>
<span class="n">extname</span> <span class="o">|</span> <span class="n">extversion</span>
<span class="c1">---------------|------------</span>
<span class="n">plpgsql</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">orafce</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">13</span>
<span class="n">dblink</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span>
<span class="n">pg_background</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The current database has a much smaller list of installed extensions.</p>
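<p>The per-database details mentioned above, such as the schema an extension lives in and whether it can be relocated, are also stored in <code class="language-plaintext highlighter-rouge">pg_extension</code>. The following is a minimal sketch that joins the catalog with <code class="language-plaintext highlighter-rouge">pg_namespace</code>; the <code class="language-plaintext highlighter-rouge">extschema</code> alias is only an illustrative name:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Installed extensions with their schema and relocatable flag
SELECT e.extname,
       e.extversion,
       n.nspname AS extschema,
       e.extrelocatable
FROM pg_extension e
JOIN pg_namespace n ON n.oid = e.extnamespace
ORDER BY e.extname;
</code></pre></div></div>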
<h2 id="extension-version-numbers">Extension Version Numbers</h2>
<p>As you know, an extension can come with different version numbers, and the beauty of this mechanism is that it is easy to upgrade an extension from one version to another.
<br />
The <code class="language-plaintext highlighter-rouge">pg_available_extensions</code> catalog provides only the default version of an available extension (usually the newest one). Let’s try with a very popular extension, <code class="language-plaintext highlighter-rouge">pg_stat_statements</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">name</span><span class="p">,</span> <span class="n">default_version</span><span class="p">,</span> <span class="n">installed_version</span>
<span class="k">from</span> <span class="n">pg_available_extensions</span>
<span class="k">where</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'pg_stat_statements'</span><span class="p">;</span>
<span class="n">name</span> <span class="o">|</span> <span class="n">default_version</span> <span class="o">|</span> <span class="n">installed_version</span>
<span class="c1">--------------------|-----------------|-------------------</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">8</span> <span class="o">|</span>
</code></pre></div></div>
<p><br />
<br />
The extension could be installed at version <code class="language-plaintext highlighter-rouge">1.8</code> and is currently not installed in the current database.
<br />
But what about other version numbers?
<br />
The catalog <code class="language-plaintext highlighter-rouge">pg_available_extension_versions</code> lists all the versions in which an extension is currently available:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">name</span><span class="p">,</span> <span class="k">version</span><span class="p">,</span> <span class="n">installed</span><span class="p">,</span> <span class="n">relocatable</span>
<span class="k">from</span> <span class="n">pg_available_extension_versions</span>
<span class="k">where</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'pg_stat_statements'</span>
<span class="k">order</span> <span class="k">by</span> <span class="k">version</span><span class="p">;</span>
<span class="n">name</span> <span class="o">|</span> <span class="k">version</span> <span class="o">|</span> <span class="n">installed</span> <span class="o">|</span> <span class="n">relocatable</span>
<span class="c1">--------------------|---------|-----------|-------------</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">4</span> <span class="o">|</span> <span class="n">f</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">5</span> <span class="o">|</span> <span class="n">f</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">6</span> <span class="o">|</span> <span class="n">f</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">7</span> <span class="o">|</span> <span class="n">f</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">pg_stat_statements</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">8</span> <span class="o">|</span> <span class="n">f</span> <span class="o">|</span> <span class="n">t</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, the extension is available in five different versions, and I can choose the version that best fits my requirements.
<br />
This catalog provides additional information: in particular, it can tell you whether the extension can be installed only by superusers (field <code class="language-plaintext highlighter-rouge">superuser</code>) or by a user with appropriate privileges (field <code class="language-plaintext highlighter-rouge">trusted</code>), as well as which other extensions are required (field <code class="language-plaintext highlighter-rouge">requires</code>), and whether it is relocatable.</p>
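<p>Picking a version and moving between versions could look like the following sketch. It assumes the versions listed above are available on your cluster and that you have the privileges to create the extension; the version numbers are just those of this example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Install a specific, non-default version
CREATE EXTENSION pg_stat_statements VERSION '1.6';

-- List the upgrade paths the server knows about, starting from that version
SELECT source, target, path
FROM pg_extension_update_paths( 'pg_stat_statements' )
WHERE source = '1.6'
  AND path IS NOT NULL;

-- Later on, move to the newest version
ALTER EXTENSION pg_stat_statements UPDATE TO '1.8';

-- Who can install each version, and what does it depend on?
SELECT version, superuser, trusted, requires
FROM pg_available_extension_versions
WHERE name = 'pg_stat_statements';
</code></pre></div></div>
<p>After the <code class="language-plaintext highlighter-rouge">CREATE EXTENSION</code>, the <code class="language-plaintext highlighter-rouge">installed</code> column of <code class="language-plaintext highlighter-rouge">pg_available_extension_versions</code> turns to true for the chosen version only.</p>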
How much data goes into the WALs? (part 2)2021-07-15T00:00:00+00:00https://fluca1978.github.io/2021/07/15/PostgreSQLWalTraffic2<p>I did some more experiments with WALs.</p>
<h1 id="how-much-data-goes-into-the-wals-part-2">How much data goes into the WALs? (part 2)</h1>
<p>In order to get a better idea about how WAL settings can change the situation within the WAL management, I decided to run a kind of automated test and store the results into a table, so that I can query them back later.
<br />
The idea is the same as in <a href="https://fluca1978.github.io/2021/07/13/PostgreSQLWalTraffic.html" target="_blank">my previous article</a>: produce some workload, measure the differences in the Log Sequence Numbers, and see how the size of WALs changes depending on some settings. This is not accurate research, it’s just a quick and dirty experiment.</p>
<p><br /></p>
<p>At the end, I decided to share my numbers so that you can have a look at them and elaborate a bit more. For example, I’m no good at all at doing graphs (I know only the very minimum about <code class="language-plaintext highlighter-rouge">gnuplot</code>!).</p>
<h3 id="-warning-">!!! WARNING !!!</h3>
<p><strong>WARNING: this is not a guide on how to tune WAL settings!</strong>
<em>This is not even a real and comprehensive set of experiments</em>, it is just what I’ve played with to see how much traffic can be generated for a certain amount of workload.
<br />
Your case and situation could be, and probably is, different from the very simple test I’ve done, and I do not pretend to be right about the small and obvious conclusions I come up with at the end. In case you see or know something that can help make clearer what I write in the following, please comment or contact me!</p>
<h2 id="set-up">Set up</h2>
<p>First of all, I decided to run an <code class="language-plaintext highlighter-rouge">INSERT</code> only workload, so that the size of the resulting table does not include any bloat and is therefore <em>comparable</em> to the amount of WAL records.
<br />
No other database activity was ongoing, so that the only generated WAL traffic was about my own workload.
<br />
Each time the configuration was changed, the system was restarted, so that every workload started with the same (empty) clean situation and without any need to reason about ongoing checkpoints. Of course, checkpoints were happening as usual, but not at the beginning of the workload.
<br />
<br />
I used two tables to run the test:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">wal_traffic</code> stores the results of each run;</li>
<li><code class="language-plaintext highlighter-rouge">wal_traffic_data</code> is used to store the data about every workload, that is, the tuples inserted in the database.</li>
</ul>
<p>The <code class="language-plaintext highlighter-rouge">wal_traffic_data</code> table was dropped and re-created every time a new run was started, so as to avoid data bloat.
<br />
It is interesting to note that any workload setup activity is performed before the server is restarted, so that the WAL traffic measured is as close as possible to that of the workload only.
<br />
The <code class="language-plaintext highlighter-rouge">wal_traffic</code> table is defined as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">wal_traffic</span>
<span class="p">(</span>
<span class="n">pk</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span>
<span class="p">,</span> <span class="n">workload</span> <span class="nb">text</span>
<span class="p">,</span> <span class="n">lsn_start</span> <span class="n">pg_lsn</span>
<span class="p">,</span> <span class="n">lsn_end</span> <span class="n">pg_lsn</span>
<span class="p">,</span> <span class="n">lsn_insert_start</span> <span class="n">pg_lsn</span>
<span class="p">,</span> <span class="n">lsn_insert_end</span> <span class="n">pg_lsn</span>
<span class="p">,</span> <span class="n">run</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">0</span>
<span class="p">,</span> <span class="n">data_size</span> <span class="nb">bigint</span> <span class="k">default</span> <span class="mi">0</span>
<span class="p">,</span> <span class="n">wal_size</span> <span class="nb">bigint</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="p">(</span> <span class="n">lsn_end</span> <span class="o">-</span> <span class="n">lsn_start</span> <span class="p">)</span> <span class="n">stored</span>
<span class="p">,</span> <span class="n">wal_data_ratio</span> <span class="nb">numeric</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="p">(</span> <span class="p">(</span> <span class="n">lsn_end</span> <span class="o">-</span> <span class="n">lsn_start</span> <span class="p">)::</span><span class="nb">real</span> <span class="o">/</span> <span class="n">data_size</span> <span class="o">*</span> <span class="mi">100</span> <span class="p">)</span> <span class="n">stored</span>
<span class="p">,</span> <span class="n">wal_insert_data_ratio</span> <span class="nb">numeric</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="p">(</span> <span class="p">(</span> <span class="n">lsn_insert_end</span> <span class="o">-</span> <span class="n">lsn_insert_start</span> <span class="p">)::</span><span class="nb">real</span> <span class="o">/</span> <span class="n">data_size</span> <span class="o">*</span> <span class="mi">100</span> <span class="p">)</span> <span class="n">stored</span>
<span class="p">,</span> <span class="n">settings</span> <span class="n">jsonb</span>
<span class="p">,</span> <span class="n">workload_repetitions</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">1</span>
<span class="p">,</span> <span class="n">ts_start</span> <span class="nb">timestamp</span> <span class="k">default</span> <span class="k">current_timestamp</span>
<span class="p">,</span> <span class="n">ts_end</span> <span class="nb">timestamp</span> <span class="k">default</span> <span class="k">current_timestamp</span>
<span class="p">,</span> <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">(</span> <span class="n">pk</span> <span class="p">)</span>
<span class="p">);</span>
</code></pre></div></div>
<p><br />
<br />
The <code class="language-plaintext highlighter-rouge">workload</code> field stores the text string about the executed query.
<br />
The <code class="language-plaintext highlighter-rouge">lsn_xxx</code> fields store the location within the WAL, in particular:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">lsn_start</code> and <code class="language-plaintext highlighter-rouge">lsn_end</code> store the result of the <code class="language-plaintext highlighter-rouge">pg_current_wal_lsn()</code> function invoked at the beginning and at the end of the workload;</li>
<li><code class="language-plaintext highlighter-rouge">lsn_insert_start</code> and <code class="language-plaintext highlighter-rouge">lsn_insert_end</code> store the result of the <code class="language-plaintext highlighter-rouge">pg_current_wal_insert_lsn()</code> function invoked at the beginning and at the end of the workload.
<br /></li>
</ul>
<p>I decided to store both values to be able to examine the differences more accurately; however, for this kind of experiment the differences between the two are of little use.
<br />
The <code class="language-plaintext highlighter-rouge">data_size</code> column contains the result of <code class="language-plaintext highlighter-rouge">pg_relation_size()</code>, which is a rough estimate of the volume of data produced during the workload.
<br />
The columns <code class="language-plaintext highlighter-rouge">wal_size</code>, <code class="language-plaintext highlighter-rouge">wal_data_ratio</code>, and <code class="language-plaintext highlighter-rouge">wal_insert_data_ratio</code> are generated, and contain respectively the amount of WAL generated and the ratios, expressed as percentages, between the size of the WAL records and that of the <em>actual</em> data.
<br />
Last, the <code class="language-plaintext highlighter-rouge">settings</code> column contains a <code class="language-plaintext highlighter-rouge">jsonb</code> representation of the settings used to run the test, such as the values of <code class="language-plaintext highlighter-rouge">wal_level</code>, <code class="language-plaintext highlighter-rouge">wal_compression</code> and so on.</p>
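<p>To make the generated columns more concrete, here is a minimal sketch of the same arithmetic done outside the database. An LSN is printed as two hexadecimal numbers separated by a slash, the high and the low 32 bits of a byte position, so subtracting two LSNs gives the number of WAL bytes written in between. The helper names (<code class="language-plaintext highlighter-rouge">lsn_to_bytes</code>, <code class="language-plaintext highlighter-rouge">wal_bytes</code>, <code class="language-plaintext highlighter-rouge">wal_ratio</code>) and the sample LSNs are made up for illustration and do not come from the experiment.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# Hypothetical helpers: they mirror what the generated columns compute in SQL.

# Convert a textual LSN (e.g. 0/1004000) into an absolute byte position.
# The part before the slash is the high 32 bits, the part after is the low 32 bits.
lsn_to_bytes() {
    local hi lo
    hi=${1%%/*}
    lo=${1##*/}
    echo $(( 16#$hi * 4294967296 + 16#$lo ))
}

# WAL bytes written between a start and an end LSN.
wal_bytes() {
    echo $(( $(lsn_to_bytes "$2") - $(lsn_to_bytes "$1") ))
}

# WAL size as a percentage of the data size, with two decimals.
wal_ratio() {
    awk -v w="$1" -v d="$2" 'BEGIN { printf "%.2f\n", w / d * 100 }'
}

# Made-up sample: 16384 bytes of WAL for a single 8 kB page of data.
wal=$(wal_bytes "0/1000000" "0/1004000")
wal_ratio "$wal" 8192   # prints 200.00
</code></pre></div></div>
<p>Within PostgreSQL none of this is needed, since subtracting two <code class="language-plaintext highlighter-rouge">pg_lsn</code> values already yields the number of bytes; the sketch only shows what the numbers in the table mean.</p>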
<p><br />
<br />
There is also a view to quickly get results about the workload size:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">VIEW</span> <span class="n">vw_wal_traffic</span>
<span class="k">AS</span>
<span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">data_size</span> <span class="p">)</span> <span class="k">as</span> <span class="n">data_size</span><span class="p">,</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">wal_size</span> <span class="p">)</span> <span class="k">as</span> <span class="n">wal_size</span><span class="p">,</span> <span class="n">wal_data_ratio</span><span class="p">::</span><span class="nb">numeric</span><span class="p">(</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">' %'</span> <span class="k">as</span> <span class="n">ratio</span><span class="p">,</span>
<span class="n">wal_insert_data_ratio</span><span class="p">::</span><span class="nb">numeric</span><span class="p">(</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">2</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">'%'</span> <span class="k">as</span> <span class="n">ins_ratio</span><span class="p">,</span>
<span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_start</span> <span class="k">as</span> <span class="n">elapsed_time</span><span class="p">,</span>
<span class="n">settings</span> <span class="k">from</span> <span class="n">wal_traffic</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="details-about-the-workloads">Details about the workloads</h2>
<p>I’ve prepared two different workloads, both based on <code class="language-plaintext highlighter-rouge">INSERT</code>s.
<br />
The first workload runs two transactions: the first one inserts a certain number of tuples, while the second inserts a smaller number of tuples. In particular, the first transaction inserts a number of tuples specified by <code class="language-plaintext highlighter-rouge">$workload_scale</code>, while the second transaction inserts <code class="language-plaintext highlighter-rouge">1/5</code> of that number.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">BEGIN</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="err">$</span><span class="n">WORKLOAD_TABLE</span> <span class="k">SELECT</span> <span class="n">v</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span> <span class="n">v</span><span class="p">::</span><span class="nb">text</span> <span class="p">)::</span><span class="nb">text</span> <span class="o">||</span> <span class="n">random</span><span class="p">()::</span><span class="nb">text</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="err">$</span><span class="n">workload_scale</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">COMMIT</span><span class="p">;</span>
<span class="k">BEGIN</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="err">$</span><span class="n">WORKLOAD_TABLE</span>
<span class="k">SELECT</span> <span class="n">v</span> <span class="o">+</span> <span class="n">v</span><span class="p">,</span> <span class="n">t</span> <span class="o">||</span> <span class="s1">' - '</span> <span class="o">||</span> <span class="n">t</span> <span class="o">||</span> <span class="n">random</span><span class="p">()::</span><span class="nb">text</span>
<span class="k">FROM</span> <span class="err">$</span><span class="n">WORKLOAD_TABLE</span>
<span class="k">WHERE</span> <span class="n">v</span> <span class="o">%</span> <span class="mi">5</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">COMMIT</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br />
The <code class="language-plaintext highlighter-rouge">$workload_scale</code> variable takes values ranging from <code class="language-plaintext highlighter-rouge">100</code> to <code class="language-plaintext highlighter-rouge">10 million</code>, growing by a factor of ten at each step (e.g., <code class="language-plaintext highlighter-rouge">100</code>, <code class="language-plaintext highlighter-rouge">1000</code>, <code class="language-plaintext highlighter-rouge">10000</code> and so on).
<br />
The second workload type is shorter, and does the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">DO</span> <span class="err">$$</span>
<span class="k">DECLARE</span>
<span class="n">i</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="err">$</span><span class="n">workload_scale</span> <span class="n">LOOP</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="err">$</span><span class="n">WORKLOAD_TABLE</span> <span class="k">SELECT</span> <span class="mi">1</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span> <span class="n">random</span><span class="p">()::</span><span class="nb">text</span> <span class="p">)::</span><span class="nb">text</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$$</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The second workload therefore performs the same number of tuple insertions as the first transaction of the previous workload, but it does so by looping.
The final effect is that the first workload executes a single <code class="language-plaintext highlighter-rouge">INSERT</code> statement per transaction, while the second workload executes one <code class="language-plaintext highlighter-rouge">INSERT</code> statement per tuple.
<br />
<br />
The use of <code class="language-plaintext highlighter-rouge">random()</code> within the <code class="language-plaintext highlighter-rouge">INSERT</code> statements is meant to generate some more traffic for logical decoding.</p>
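<p>As a quick sanity check on the volumes involved, the following sketch prints how many tuples the first workload inserts at each scale. It assumes the workload table is empty when the workload starts, so that the second transaction finds exactly one row out of five with <code class="language-plaintext highlighter-rouge">v % 5 = 0</code>; the function name <code class="language-plaintext highlighter-rouge">tuples_for_scale</code> is made up for illustration.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# Tuples inserted by the first workload for a given scale, assuming an empty table:
# the first transaction inserts `scale` rows, the second one adds a row for
# every existing row whose v is a multiple of 5, that is scale / 5 rows.
tuples_for_scale() {
    local scale=$1
    echo $(( scale + scale / 5 ))
}

# Scales from 100 to 10 million, growing by a factor of ten.
scale=100
while [ "$scale" -le 10000000 ]; do
    printf '%s %s\n' "$scale" "$(tuples_for_scale "$scale")"
    scale=$(( scale * 10 ))
done
</code></pre></div></div>
<p>The loop covers six scale values, from <code class="language-plaintext highlighter-rouge">100</code> (120 tuples) up to 10 million (12 million tuples).</p>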
<h2 id="the-workload-workflow">The Workload Workflow</h2>
<p>In order to do the tests, I wrote an ugly shell script with the following workflow:</p>
<ul>
<li>truncate the <code class="language-plaintext highlighter-rouge">wal_traffic_data</code> table, so that its size on disk does not include previous experiments;</li>
<li>execute a few <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> to set some configuration on WAL related parameters (<code class="language-plaintext highlighter-rouge">wal_level</code>, <code class="language-plaintext highlighter-rouge">full_page_writes</code>, <code class="language-plaintext highlighter-rouge">wal_compression</code> and so on);</li>
<li>restart the PostgreSQL system, so as to ensure every test starts from a clean state;</li>
<li>get the current WAL position (<code class="language-plaintext highlighter-rouge">pg_current_wal_lsn()</code> and <code class="language-plaintext highlighter-rouge">pg_current_wal_insert_lsn()</code>);</li>
<li>execute the workload with the right <em>scale</em>;</li>
<li>get the current WAL position (<code class="language-plaintext highlighter-rouge">pg_current_wal_lsn()</code> and <code class="language-plaintext highlighter-rouge">pg_current_wal_insert_lsn()</code>);</li>
<li>insert the result tuple with WAL differences into <code class="language-plaintext highlighter-rouge">wal_traffic</code>;</li>
<li>loop with a different scaling factor.</li>
</ul>
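<p>The steps above can be sketched roughly as follows. This is not the actual script, only a simplified skeleton under a few assumptions: the function names, the <code class="language-plaintext highlighter-rouge">DRY_RUN</code> and <code class="language-plaintext highlighter-rouge">DB</code> variables are made up, only <code class="language-plaintext highlighter-rouge">wal_level</code> is changed, the workload is reduced to a single simplified <code class="language-plaintext highlighter-rouge">INSERT</code>, and the workload table is assumed to be <code class="language-plaintext highlighter-rouge">wal_traffic_data</code>. By default the commands are only printed, not executed.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# Dry-run sketch: with DRY_RUN=1 (the default) every command is only printed.
DRY_RUN=${DRY_RUN:-1}
DB=${DB:-testdb}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run a single SQL statement through psql (unaligned, tuples only).
sql() {
    run psql -At -c "$1" "$DB"
}

# One iteration of the workflow for a given scale and wal_level.
one_run() {
    local scale=$1 wal_level=$2
    sql "TRUNCATE wal_traffic_data"
    sql "ALTER SYSTEM SET wal_level TO '$wal_level'"
    run pg_ctl restart -D "${PGDATA:-}"
    sql "SELECT pg_current_wal_lsn(), pg_current_wal_insert_lsn()"
    # Simplified stand-in for the real workload.
    sql "INSERT INTO wal_traffic_data SELECT v, md5( v::text ) FROM generate_series( 1, $scale ) v"
    sql "SELECT pg_current_wal_lsn(), pg_current_wal_insert_lsn()"
    # The result tuple would be inserted into wal_traffic here.
}

one_run 100 minimal
</code></pre></div></div>
<p>Restarting after <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> is needed anyway for <code class="language-plaintext highlighter-rouge">wal_level</code>, since that parameter can only be changed at server start.</p>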
<h1 id="the-results">The Results</h1>
<p>It is now time to have a look at the test results.</p>
<p>Let’s consider a few results:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">vw_wal_traffic</span> <span class="k">where</span> <span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="o">=</span> <span class="s1">'minimal'</span> <span class="k">and</span> <span class="n">settings</span><span class="o">->></span><span class="s1">'wal_compression'</span> <span class="o">=</span> <span class="s1">'on'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">1205</span> <span class="n">MB</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">2148</span> <span class="n">MB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">178</span><span class="p">.</span><span class="mi">27</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">178</span><span class="p">.</span><span class="mi">27</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">33</span><span class="p">.</span><span class="mi">282366</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"minimal"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">1205</span> <span class="n">MB</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">2148</span> <span class="n">MB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">178</span><span class="p">.</span><span class="mi">27</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">178</span><span class="p">.</span><span class="mi">27</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">34</span><span class="p">.</span><span class="mi">882126</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"minimal"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, for <code class="language-plaintext highlighter-rouge">1.2 GB</code> of data the system has produced roughly <code class="language-plaintext highlighter-rouge">2.1 GB</code> of WAL records. The situation is even worse when <code class="language-plaintext highlighter-rouge">wal_compression</code> is disabled (as you could expect):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">8</span> <span class="p">]</span><span class="o">+</span><span class="c1">------------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">1205</span> <span class="n">MB</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">2402</span> <span class="n">MB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">199</span><span class="p">.</span><span class="mi">34</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">199</span><span class="p">.</span><span class="mi">34</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">:</span><span class="mi">30</span><span class="p">.</span><span class="mi">725138</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"minimal"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
</code></pre></div></div>
<p><br />
<br />
This time, for the same amount of data, the WAL size is almost double that of the real data.
<br />
Changing the setting of <code class="language-plaintext highlighter-rouge">wal_level</code> to <code class="language-plaintext highlighter-rouge">logical</code> or <code class="language-plaintext highlighter-rouge">replica</code> does not change the situation very much.</p>
<p><br />
It is possible to get, for each <code class="language-plaintext highlighter-rouge">wal_level</code>, the best ratio between the WAL produced and the data stored:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">vw_wal_traffic</span> <span class="n">v</span> <span class="k">where</span> <span class="n">ratio</span> <span class="o">=</span> <span class="p">(</span> <span class="k">select</span> <span class="k">min</span><span class="p">(</span> <span class="n">ratio</span> <span class="p">)</span> <span class="k">from</span> <span class="n">vw_wal_traffic</span> <span class="k">where</span> <span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="o">=</span> <span class="n">v</span><span class="p">.</span><span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="p">)</span> <span class="k">and</span> <span class="n">v</span><span class="p">.</span><span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="k">IN</span> <span class="p">(</span> <span class="s1">'minimal'</span><span class="p">,</span> <span class="s1">'replica'</span><span class="p">,</span> <span class="s1">'logical'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">101</span><span class="p">.</span><span class="mi">95</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">101</span><span class="p">.</span><span class="mi">95</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">133674</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"logical"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">101</span><span class="p">.</span><span class="mi">56</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">101</span><span class="p">.</span><span class="mi">56</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">120578</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"replica"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">3</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">18</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">111</span><span class="p">.</span><span class="mi">13</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">100</span><span class="p">.</span><span class="mi">34</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">427126</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"minimal"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">}</span>
</code></pre></div></div>
<p><br />
<br />
and, on the other hand, the worst ratio:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">vw_wal_traffic</span> <span class="n">v</span> <span class="k">where</span> <span class="n">ratio</span> <span class="o">=</span> <span class="p">(</span> <span class="k">select</span> <span class="k">max</span><span class="p">(</span> <span class="n">ratio</span> <span class="p">)</span> <span class="k">from</span> <span class="n">vw_wal_traffic</span> <span class="k">where</span> <span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="o">=</span> <span class="n">v</span><span class="p">.</span><span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="p">)</span> <span class="k">and</span> <span class="n">v</span><span class="p">.</span><span class="n">settings</span><span class="o">->></span><span class="s1">'wal_level'</span> <span class="k">IN</span> <span class="p">(</span> <span class="s1">'minimal'</span><span class="p">,</span> <span class="s1">'replica'</span><span class="p">,</span> <span class="s1">'logical'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">23</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">16</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">190</span><span class="p">.</span><span class="mi">72</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">266881</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"minimal"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">23</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">16</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">190</span><span class="p">.</span><span class="mi">72</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">112946</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"minimal"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">3</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">23</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">284</span><span class="p">.</span><span class="mi">47</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">190</span><span class="p">.</span><span class="mi">63</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">076021</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"logical"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">4</span> <span class="p">]</span><span class="o">+</span><span class="c1">-----------------------------------------------------------------------------------------------------</span>
<span class="n">data_size</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="n">wal_size</span> <span class="o">|</span> <span class="mi">23</span> <span class="n">kB</span>
<span class="n">ratio</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">65</span> <span class="o">%</span>
<span class="n">ins_ratio</span> <span class="o">|</span> <span class="mi">190</span><span class="p">.</span><span class="mi">53</span><span class="o">%</span>
<span class="n">elapsed_time</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">113793</span>
<span class="n">settings</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"wal_level"</span><span class="p">:</span> <span class="nv">"replica"</span><span class="p">,</span> <span class="nv">"wal_log_hints"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"wal_compression"</span><span class="p">:</span> <span class="nv">"off"</span><span class="p">,</span> <span class="nv">"full_page_writes"</span><span class="p">:</span> <span class="nv">"on"</span><span class="p">}</span>
</code></pre></div></div>
<p><br />
<br />
From the above, it is clear that the worst cases are those with <code class="language-plaintext highlighter-rouge">wal_compression</code> disabled, while the best cases are those with compression enabled.</p>
<p><br /></p>
<h2 id="download-the-results">Download the Results</h2>
<p>The <a href="https://gitlab.com/fluca1978/fluca1978-pg-utils/-/blob/master/examples/wal_traffic/wal_traffic.csv" target="_blank">results are available as a CSV file</a>, so you can load and inspect them yourself.
To load the file, create a table <code class="language-plaintext highlighter-rouge">wal_traffic_results</code> as follows:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">table</span> <span class="n">wal_traffic_results</span> <span class="p">(</span>
<span class="n">run</span> <span class="nb">int</span><span class="p">,</span> <span class="n">workload</span> <span class="nb">text</span><span class="p">,</span> <span class="n">wal_size</span> <span class="nb">bigint</span><span class="p">,</span>
<span class="n">data_size</span> <span class="nb">bigint</span><span class="p">,</span>
<span class="n">wal_data_ratio</span> <span class="nb">numeric</span><span class="p">(</span> <span class="mi">5</span><span class="p">,</span><span class="mi">2</span><span class="p">),</span>
<span class="n">wall_clock</span> <span class="nb">time</span><span class="p">,</span> <span class="n">wal_level</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">wal_log_hints</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">wal_compression</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">full_page_writes</span> <span class="nb">text</span> <span class="p">);</span>
</code></pre></div></div>
<p><br />
<br />
and then load the CSV file with a command like the following one:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="k">copy</span> <span class="n">wal_traffic_results</span> <span class="k">from</span> <span class="n">wal_traffic</span><span class="p">.</span><span class="n">csv</span> <span class="k">with</span> <span class="p">(</span> <span class="n">format</span> <span class="n">csv</span><span class="p">,</span> <span class="n">header</span> <span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
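<p>As a quick sanity check after the import, a query like the following one can be used. It is only a minimal sketch, and the column aliases are arbitrary: it counts the loaded rows and the distinct workloads, so that the numbers can be compared against the CSV file.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- count the imported rows and how many different workloads they cover
testdb=> select count(*) as loaded_rows,
                count( distinct workload ) as workloads
         from wal_traffic_results;
</code></pre></div></div>
<p><br />
<br /></p>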
<p>Please note that I’ve split the <code class="language-plaintext highlighter-rouge">jsonb</code> field into a set of columns with a query like the following one, which produced the CSV file:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">%</span> <span class="n">psql</span> <span class="o">-</span><span class="n">A</span> <span class="c1">--csv -h miguel \</span>
<span class="o">-</span><span class="k">c</span> <span class="s1">'select run, workload, wal_size, data_size, wal_data_ratio, ts_end - ts_start as wall_clock, x.* from wal_traffic
cross join lateral jsonb_to_record( settings ) as x( wal_level text, wal_log_hints text, wal_compression text, full_page_writes text );'</span> <span class="err">\</span>
<span class="n">testdb</span> <span class="o">>!</span> <span class="n">wal_traffic</span><span class="p">.</span><span class="n">csv</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="more-results">More Results</h2>
<p>With the results no longer wrapped in <code class="language-plaintext highlighter-rouge">jsonb</code>, it is easier to get a glance at the WAL ratio by workload type:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">workload</span><span class="p">,</span> <span class="k">min</span><span class="p">(</span> <span class="n">wal_data_ratio</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">wal_data_ratio</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">wal_data_ratio</span> <span class="p">)</span> <span class="o">-</span> <span class="k">min</span><span class="p">(</span> <span class="n">wal_data_ratio</span> <span class="p">)</span> <span class="k">as</span> <span class="n">diff</span>
<span class="k">from</span> <span class="n">wal_traffic_results</span>
<span class="k">group</span> <span class="k">by</span> <span class="n">workload</span> <span class="k">order</span> <span class="k">by</span> <span class="mi">4</span> <span class="k">asc</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'BEGIN; +
| INSERT INTO wal_traffic_workload SELECT v, md5( v::text )::text || random()::text+
| FROM generate_series( 1, 1000000 ) v; +
| COMMIT; +
| +
| BEGIN; +
| INSERT INTO wal_traffic_workload +
| SELECT v + v, t || </span><span class="se">''</span><span class="s1"> - </span><span class="se">''</span><span class="s1"> || t || random()::text +
| FROM wal_traffic_workload +
| WHERE v % 5 = 0; +
| COMMIT;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">55</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">60</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">05</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'DO $wl$ DECLARE +
| i int; +
| BEGIN +
| FOR i IN 1 .. 1000000 LOOP +
| INSERT INTO wal_traffic_workload SELECT 1, md5( random()::text )::text; +
| END LOOP; +
| END $wl$;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">76</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">82</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">06</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">3</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'DO $wl$ DECLARE +
| i int; +
| BEGIN +
| FOR i IN 1 .. 10000000 LOOP +
| INSERT INTO wal_traffic_workload SELECT 1, md5( random()::text )::text; +
| END LOOP; +
| END $wl$;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">76</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">92</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">16</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">4</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'BEGIN; +
| INSERT INTO wal_traffic_workload SELECT v, md5( v::text )::text || random()::text+
| FROM generate_series( 1, 100000 ) v; +
| COMMIT; +
| +
| BEGIN; +
| INSERT INTO wal_traffic_workload +
| SELECT v + v, t || </span><span class="se">''</span><span class="s1"> - </span><span class="se">''</span><span class="s1"> || t || random()::text +
| FROM wal_traffic_workload +
| WHERE v % 5 = 0; +
| COMMIT;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">49</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">73</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">24</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">5</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'DO $wl$ DECLARE +
| i int; +
| BEGIN +
| FOR i IN 1 .. 100000 LOOP +
| INSERT INTO wal_traffic_workload SELECT 1, md5( random()::text )::text; +
| END LOOP; +
| END $wl$;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">72</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">97</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">25</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">6</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'BEGIN; +
| INSERT INTO wal_traffic_workload SELECT v, md5( v::text )::text || random()::text+
| FROM generate_series( 1, 10000 ) v; +
| COMMIT; +
| +
| BEGIN; +
| INSERT INTO wal_traffic_workload +
| SELECT v + v, t || </span><span class="se">''</span><span class="s1"> - </span><span class="se">''</span><span class="s1"> || t || random()::text +
| FROM wal_traffic_workload +
| WHERE v % 5 = 0; +
| COMMIT;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">124</span><span class="p">.</span><span class="mi">99</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">126</span><span class="p">.</span><span class="mi">55</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">56</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">7</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'DO $wl$ DECLARE +
| i int; +
| BEGIN +
| FOR i IN 1 .. 10000 LOOP +
| INSERT INTO wal_traffic_workload SELECT 1, md5( random()::text )::text; +
| END LOOP; +
| END $wl$;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">125</span><span class="p">.</span><span class="mi">14</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">127</span><span class="p">.</span><span class="mi">47</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">33</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">8</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'BEGIN; +
| INSERT INTO wal_traffic_workload SELECT v, md5( v::text )::text || random()::text+
| FROM generate_series( 1, 10000000 ) v; +
| COMMIT; +
| +
| BEGIN; +
| INSERT INTO wal_traffic_workload +
| SELECT v + v, t || </span><span class="se">''</span><span class="s1"> - </span><span class="se">''</span><span class="s1"> || t || random()::text +
| FROM wal_traffic_workload +
| WHERE v % 5 = 0; +
| COMMIT;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">178</span><span class="p">.</span><span class="mi">27</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">199</span><span class="p">.</span><span class="mi">46</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">21</span><span class="p">.</span><span class="mi">19</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">9</span> <span class="p">]</span><span class="c1">-------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'BEGIN; +
| INSERT INTO wal_traffic_workload SELECT v, md5( v::text )::text || random()::text+
| FROM generate_series( 1, 1000 ) v; +
| COMMIT; +
| +
| BEGIN; +
| INSERT INTO wal_traffic_workload +
| SELECT v + v, t || </span><span class="se">''</span><span class="s1"> - </span><span class="se">''</span><span class="s1"> || t || random()::text +
| FROM wal_traffic_workload +
| WHERE v % 5 = 0; +
| COMMIT;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">121</span><span class="p">.</span><span class="mi">58</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">152</span><span class="p">.</span><span class="mi">01</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">30</span><span class="p">.</span><span class="mi">43</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">10</span> <span class="p">]</span><span class="c1">------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'DO $wl$ DECLARE +
| i int; +
| BEGIN +
| FOR i IN 1 .. 1000 LOOP +
| INSERT INTO wal_traffic_workload SELECT 1, md5( random()::text )::text; +
| END LOOP; +
| END $wl$;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">118</span><span class="p">.</span><span class="mi">45</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">167</span><span class="p">.</span><span class="mi">37</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">48</span><span class="p">.</span><span class="mi">92</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">11</span> <span class="p">]</span><span class="c1">------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'BEGIN; +
| INSERT INTO wal_traffic_workload SELECT v, md5( v::text )::text || random()::text+
| FROM generate_series( 1, 100 ) v; +
| COMMIT; +
| +
| BEGIN; +
| INSERT INTO wal_traffic_workload +
| SELECT v + v, t || </span><span class="se">''</span><span class="s1"> - </span><span class="se">''</span><span class="s1"> || t || random()::text +
| FROM wal_traffic_workload +
| WHERE v % 5 = 0; +
| COMMIT;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">101</span><span class="p">.</span><span class="mi">56</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">247</span><span class="p">.</span><span class="mi">46</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">145</span><span class="p">.</span><span class="mi">90</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">12</span> <span class="p">]</span><span class="c1">------------------------------------------------------------------------------</span>
<span class="n">workload</span> <span class="o">|</span> <span class="s1">'DO $wl$ DECLARE +
| i int; +
| BEGIN +
| FOR i IN 1 .. 100 LOOP +
| INSERT INTO wal_traffic_workload SELECT 1, md5( random()::text )::text; +
| END LOOP; +
| END $wl$;'</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">124</span><span class="p">.</span><span class="mi">02</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">65</span>
<span class="n">diff</span> <span class="o">|</span> <span class="mi">165</span><span class="p">.</span><span class="mi">63</span>
</code></pre></div></div>
<p><br />
<br />
Certain workloads (by type and size) do not produce any appreciable variation in the amount of WAL generated, while, for example, the last workload, which inserts only a small number of tuples, produces a very wide range of WAL ratios.
<br />
We could also query to search for a <em>trend</em> in the ratio:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">wal_data_ratio</span><span class="p">,</span> <span class="n">wal_level</span><span class="p">,</span> <span class="n">wal_log_hints</span><span class="p">,</span> <span class="n">wal_compression</span><span class="p">,</span> <span class="n">full_page_writes</span> <span class="k">from</span> <span class="n">wal_traffic_results</span> <span class="k">where</span> <span class="n">workload</span> <span class="k">like</span> <span class="s1">'%FOR i IN 1 .. 100 LOOP%'</span> <span class="k">order</span> <span class="k">by</span> <span class="mi">1</span> <span class="k">desc</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">----|--------</span>
<span class="n">wal_data_ratio</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">65</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">replica</span>
<span class="n">wal_log_hints</span> <span class="o">|</span> <span class="k">off</span>
<span class="n">wal_compression</span> <span class="o">|</span> <span class="k">off</span>
<span class="n">full_page_writes</span> <span class="o">|</span> <span class="k">on</span>
<span class="p">...</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">6</span> <span class="p">]</span><span class="c1">----|--------</span>
<span class="n">wal_data_ratio</span> <span class="o">|</span> <span class="mi">256</span><span class="p">.</span><span class="mi">35</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">logical</span>
<span class="n">wal_log_hints</span> <span class="o">|</span> <span class="k">on</span>
<span class="n">wal_compression</span> <span class="o">|</span> <span class="k">off</span>
<span class="n">full_page_writes</span> <span class="o">|</span> <span class="k">on</span>
<span class="p">...</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">19</span> <span class="p">]</span><span class="c1">---|--------</span>
<span class="n">wal_data_ratio</span> <span class="o">|</span> <span class="mi">150</span><span class="p">.</span><span class="mi">78</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">replica</span>
<span class="n">wal_log_hints</span> <span class="o">|</span> <span class="k">on</span>
<span class="n">wal_compression</span> <span class="o">|</span> <span class="k">on</span>
<span class="n">full_page_writes</span> <span class="o">|</span> <span class="k">off</span>
<span class="p">...</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">36</span> <span class="p">]</span><span class="c1">---|--------</span>
<span class="n">wal_data_ratio</span> <span class="o">|</span> <span class="mi">124</span><span class="p">.</span><span class="mi">02</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">replica</span>
<span class="n">wal_log_hints</span> <span class="o">|</span> <span class="k">on</span>
<span class="n">wal_compression</span> <span class="o">|</span> <span class="k">on</span>
<span class="n">full_page_writes</span> <span class="o">|</span> <span class="k">on</span>
</code></pre></div></div>
<p><br />
<br />
The above confirms how much <code class="language-plaintext highlighter-rouge">wal_compression</code> is going to reduce the WAL traffic.
<br />
And again, the <code class="language-plaintext highlighter-rouge">wal_level</code> is not going to influence the WAL size too much:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="k">min</span><span class="p">(</span> <span class="n">wal_data_ratio</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">wal_data_ratio</span> <span class="p">),</span> <span class="n">wal_level</span>
<span class="k">from</span> <span class="n">wal_traffic_results</span>
<span class="k">where</span> <span class="n">workload</span> <span class="k">like</span> <span class="s1">'%FOR i IN 1 .. 100 LOOP%'</span>
<span class="k">group</span> <span class="k">by</span> <span class="n">wal_level</span> <span class="k">order</span> <span class="k">by</span> <span class="mi">1</span> <span class="k">desc</span><span class="p">,</span> <span class="mi">2</span> <span class="k">desc</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">------</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">146</span><span class="p">.</span><span class="mi">00</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">16</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">minimal</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="c1">------</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">124</span><span class="p">.</span><span class="mi">12</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">284</span><span class="p">.</span><span class="mi">47</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">logical</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">3</span> <span class="p">]</span><span class="c1">------</span>
<span class="k">min</span> <span class="o">|</span> <span class="mi">124</span><span class="p">.</span><span class="mi">02</span>
<span class="k">max</span> <span class="o">|</span> <span class="mi">289</span><span class="p">.</span><span class="mi">65</span>
<span class="n">wal_level</span> <span class="o">|</span> <span class="n">replica</span>
</code></pre></div></div>
<p><br />
<br /></p>
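<p>The same kind of aggregation can be sketched for the other settings. The following query is only an illustration and assumes the <code class="language-plaintext highlighter-rouge">wal_traffic_results</code> table created earlier: it groups the same workload by <code class="language-plaintext highlighter-rouge">wal_compression</code> and <code class="language-plaintext highlighter-rouge">full_page_writes</code> instead of <code class="language-plaintext highlighter-rouge">wal_level</code>, so that the effect of each combination can be read side by side. The alias <code class="language-plaintext highlighter-rouge">avg_ratio</code> is arbitrary.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- one row per combination of compression and full page writes,
-- with the highest average WAL ratio first
testdb=> select wal_compression, full_page_writes,
                min( wal_data_ratio ), max( wal_data_ratio ),
                round( avg( wal_data_ratio ), 2 ) as avg_ratio
         from wal_traffic_results
         where workload like '%FOR i IN 1 .. 100 LOOP%'
         group by wal_compression, full_page_writes
         order by avg_ratio desc;
</code></pre></div></div>
<p><br />
<br /></p>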
<h1 id="conclusions">Conclusions</h1>
<p>Even a small amount of <em>real data</em> can produce quite a large amount of WAL records, and this is good because those records contain all the information PostgreSQL needs to keep our data safe, which after all is our final goal.
<br />
WAL-related settings can, of course, influence the amount of generated data. The idea behind this article is not to provide an exhaustive guide to tuning WALs, but rather to show how you can measure your WAL traffic depending on the workload you are facing.
<br />
This should then help you to decide the right way to tune your WALs.
<br />
In case you find something wrong in the approach described above, or want to add to it or share your experience, please comment or contact me.</p>
How much data goes into the WALs?2021-07-13T00:00:00+00:00https://fluca1978.github.io/2021/07/13/PostgreSQLWalTraffic<p>What is the amount of traffic generated in the Write Ahead Logs?</p>
<h1 id="how-much-data-goes-into-the-wals">How much data goes into the WALs?</h1>
<p>PostgreSQL exploits the <em>Write Ahead Log</em>s (WALs) to make data changes persistent: whenever you <code class="language-plaintext highlighter-rouge">COMMIT</code> (implicitly or explicitly) a transaction, the data is stored in the WALs before it physically hits the table it belongs to.
<br />
There are several advantages to this approach, most notably performance and the ability to survive a crash.
<br />
And one beautiful thing about PostgreSQL is that it provides you with all the tools to follow, study and understand what is happening under the hood. With regard to the WALs, there are a few <code class="language-plaintext highlighter-rouge">pg_wal_xxx</code> functions that can be exploited to get a clue about what is happening in the WALs.
<br />
In this post I’m going to use mainly:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pg_current_wal_lsn()</code> that provides the current offset within the WAL stream where the <em>next thing</em> will happen. Such an offset in the WAL stream is called a <strong>Log Sequence Number</strong>, or <em>LSN</em> for short;</li>
<li><code class="language-plaintext highlighter-rouge">pg_walfile_name()</code> that, given a Log Sequence Number (<em>LSN</em>), provides the name of the WAL file, in the <code class="language-plaintext highlighter-rouge">pg_wal</code> directory, that contains that WAL location.</li>
</ul>
<p><br />
<br />
It is worth spending a little time to explain what LSNs are.
<br />
PostgreSQL organizes the WALs into files of <code class="language-plaintext highlighter-rouge">16 MB</code> each (you can change this setting, but assume you will not). Every time a WAL file is full, that is, it contains <code class="language-plaintext highlighter-rouge">16 MB</code> of <em>valid WAL data</em>, PostgreSQL produces a new file (or recycles one that is no longer needed).
<br />
The database must know exactly when things happened during the <em>history of transactions</em>, and this means it must be able to point to a location in the WAL files to clearly identify a transaction, a statement, or something else. This location is expressed as a <em>Log Sequence Number</em>, something that points the server to an offset within the WAL stream.
<br />
Therefore, when you execute an SQL statement, the database stores the result of the statement into the WALs at the position indicated by the current log sequence number, and the next statement will happen at a different log sequence number.
<br />
Log sequence numbers have the form of <code class="language-plaintext highlighter-rouge">AA/BBxxxxxx</code> where <code class="language-plaintext highlighter-rouge">AA</code> and <code class="language-plaintext highlighter-rouge">BB</code> can be used to identify the WAL file on disk (knowing the current timeline). In fact, usually the WAL file that contains a log sequence number is named as <code class="language-plaintext highlighter-rouge">000000&lt;timeline&gt;000000AA000000BB</code>. As an example, if the LSN is <code class="language-plaintext highlighter-rouge">16/70D22618</code> the corresponding file on disk is <code class="language-plaintext highlighter-rouge">000000070000001600000070</code> (given the timeline number <code class="language-plaintext highlighter-rouge">7</code>). This rule of thumb is not always true, since the LSN could be near the end of the WAL file, or even at the beginning of the next one, but you get the idea.
The remaining part, represented by <code class="language-plaintext highlighter-rouge">xxxxxx</code>, is the offset within the WAL file at which the LSN is located.
<br />
PostgreSQL has a dedicated data type, <code class="language-plaintext highlighter-rouge">pg_lsn</code>, to store information about a Log Sequence Number. You can apply operators to <code class="language-plaintext highlighter-rouge">pg_lsn</code>, for example to compute the difference between two values, and PostgreSQL will return the result, that is the number of bytes between the two locations, as a <code class="language-plaintext highlighter-rouge">numeric</code> value.
<br />
Now that it is clear what an LSN is and how it relates to the WAL files on disk, let’s see how it is possible to get the amount of data written in the WALs with regard to the amount of data written to a table. In the following examples I’m using a PostgreSQL 13.3 server with only me running queries, so the numbers are related only to my experiments.</p>
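<p>The mapping between an LSN and its WAL file can also be asked directly of the server. As a minimal sketch, using the example LSN from the text: <code class="language-plaintext highlighter-rouge">pg_walfile_name_offset()</code> returns both the file name and the byte offset within that file. The file name depends on the current timeline of the server you run it on, so it will likely differ from the one shown in this article. With the default 16 MB segment size, the offset should correspond to the trailing <code class="language-plaintext highlighter-rouge">xxxxxx</code> part of the LSN, which the second query converts from hexadecimal to decimal for comparison.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- file name and offset (in bytes) that contain the given LSN
testdb=> select * from pg_walfile_name_offset( '16/70D22618'::pg_lsn );

-- the trailing part of the LSN, D22618, as a decimal number of bytes
testdb=> select x'D22618'::int as offset_in_file;
</code></pre></div></div>
<p><br />
<br /></p>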
<h2 id="an-example-with-a-normal-table">An example with a normal table</h2>
<p>Let’s create a very simple table:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">table</span> <span class="n">logged_table</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br />
Now let’s do a bulk insert, and check the current <em>Log Sequence Number</em> before and after the insertion of one million tuples:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span> <span class="n">pg_current_wal_lsn</span><span class="p">(),</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000070</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">70</span><span class="n">D22618</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">logged_table</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'logged '</span> <span class="o">||</span> <span class="n">v</span> <span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000075</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">752611</span><span class="n">C8</span> <span class="o">|</span> <span class="mi">42</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/752611C8'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/70D22618'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">69</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, generating <code class="language-plaintext highlighter-rouge">42 MB</code> of <strong>real table data</strong> produced <code class="language-plaintext highlighter-rouge">69 MB</code> of WAL data. Why is there more data in the WALs than in the actual table? Because each WAL record carries, besides the tuple data, a link to the previous record, a checksum, and other metadata that PostgreSQL uses for replication and crash recovery.</p>
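<p>The subtraction between two <code class="language-plaintext highlighter-rouge">pg_lsn</code> values can be reproduced by hand, which helps to see that an LSN is just a 64-bit byte position written as two hexadecimal halves. The following is a minimal Python sketch, not part of the original session: the helper names <code class="language-plaintext highlighter-rouge">parse_lsn</code> and <code class="language-plaintext highlighter-rouge">wal_bytes</code> are made up for illustration, while the two LSNs are the ones shown above.</p>

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/752611C8' into a byte position.

    The part before the slash is the high 32 bits, the part after it
    is the low 32 bits, both in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes(start: str, end: str) -> int:
    """Bytes of WAL between two LSNs, like end::pg_lsn - start::pg_lsn."""
    return parse_lsn(end) - parse_lsn(start)


# LSNs taken from the psql session: before and after the insert
written = wal_bytes("16/70D22618", "16/752611C8")

print(written)                        # raw number of WAL bytes
print(round(written / 1024 ** 2))     # the same amount expressed in MB
print(round(written / 1_000_000, 1))  # average WAL bytes per inserted tuple
```

<p>Dividing by the one million inserted tuples gives a rough idea of how many WAL bytes each tuple costs on average in this specific test.</p>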
<h2 id="using-an-unlogged-table">Using an unlogged table</h2>
<p>Let’s now start over, converting the table to <code class="language-plaintext highlighter-rouge">UNLOGGED</code>, so that changes to it no longer hit the WALs.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">truncate</span> <span class="k">table</span> <span class="n">logged_table</span> <span class="p">;</span>
<span class="k">TRUNCATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">logged_table</span> <span class="k">set</span> <span class="n">unlogged</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">logged_table</span> <span class="k">rename</span> <span class="k">to</span> <span class="n">unlogged_table</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Repeat the same insertion of one million tuples as above and see what happens to the WALs:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span> <span class="n">pg_current_wal_lsn</span><span class="p">(),</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'unlogged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000075</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">75285</span><span class="n">AD0</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">unlogged_table</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'logged '</span> <span class="o">||</span> <span class="n">v</span> <span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'unlogged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000075</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">75285</span><span class="n">B30</span> <span class="o">|</span> <span class="mi">42</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/75285B30'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/75285AD0'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">96</span> <span class="n">bytes</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the table has grown by the same amount as in the previous example, that is <code class="language-plaintext highlighter-rouge">42 MB</code> of real data. This time, however, the WAL has barely grown: only <code class="language-plaintext highlighter-rouge">96 bytes</code> of housekeeping data.</p>
<h2 id="going-back-to-a-logged-table">Going back to a logged table</h2>
<p>What happens if the table is switched back to <code class="language-plaintext highlighter-rouge">LOGGED</code>?</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">unlogged_table</span> <span class="k">rename</span> <span class="k">to</span> <span class="n">logged_table</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000075</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">75295120</span> <span class="o">|</span> <span class="mi">42</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">logged_table</span> <span class="k">set</span> <span class="n">logged</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000079</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">7978</span><span class="n">EFF0</span> <span class="o">|</span> <span class="mi">42</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/7978EFF0'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/75295120'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">69</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, switching the table from <code class="language-plaintext highlighter-rouge">UNLOGGED</code> to <code class="language-plaintext highlighter-rouge">LOGGED</code> generated pretty much the same amount of WAL traffic (i.e., <code class="language-plaintext highlighter-rouge">69 MB</code>) as the original insert transaction, because the data already in the table has to be written to the WAL in order to become crash-safe.</p>
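<p>The output above also shows the WAL file name moving from segment <code class="language-plaintext highlighter-rouge">...75</code> to segment <code class="language-plaintext highlighter-rouge">...79</code>. The following Python sketch shows how such a file name relates to the LSN. It assumes the default segment size of 16 MB and takes the timeline <code class="language-plaintext highlighter-rouge">7</code> from the first eight digits of the file names in the output; the function names are hypothetical and only serve this illustration.</p>

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB WAL segments


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN ('high/low' in hex) into a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def walfile_name(timeline: int, lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the WAL segment file that contains the given LSN.

    Simplification: an LSN lying exactly on a segment boundary is reported
    by pg_walfile_name() as part of the preceding file; this sketch does
    not handle that corner case.
    """
    segments_per_log = 0x1_0000_0000 // segment_size
    segment_no = parse_lsn(lsn) // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_no // segments_per_log,  # middle part of the file name
        segment_no % segments_per_log,   # last part of the file name
    )


# LSNs taken from the session: before and after ALTER TABLE ... SET LOGGED
before = walfile_name(7, "16/75295120")
after = walfile_name(7, "16/7978EFF0")
print(before, after)

# Number of segment boundaries crossed by the statement
crossed = (parse_lsn("16/7978EFF0") // WAL_SEGMENT_SIZE
           - parse_lsn("16/75295120") // WAL_SEGMENT_SIZE)
print(crossed)
```

<p>With 16 MB segments, about <code class="language-plaintext highlighter-rouge">69 MB</code> of WAL necessarily spans several segment files, which is consistent with the jump in the file name.</p>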
<h2 id="add-some-fields">Add some fields</h2>
<p>Let’s add a couple more fields to the table, to see how the WAL traffic changes:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">logged_table</span> <span class="k">add</span> <span class="k">column</span> <span class="n">pk</span> <span class="nb">serial</span> <span class="k">primary</span> <span class="k">key</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">logged_table</span> <span class="k">add</span> <span class="k">column</span> <span class="n">price</span> <span class="nb">numeric</span><span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">2</span> <span class="p">)</span> <span class="k">default</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">truncate</span> <span class="n">logged_table</span> <span class="p">;</span>
<span class="k">TRUNCATE</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now re-run our little benchmark (note that the added fields have default values):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">00000007000000160000007</span><span class="n">D</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">7</span><span class="n">DD16CA8</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">logged_table</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'logged '</span> <span class="o">||</span> <span class="n">v</span> <span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table_pkey'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------|----------------</span>
<span class="mi">000000070000001600000086</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">86</span><span class="n">BD7110</span> <span class="o">|</span> <span class="mi">50</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">43</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/86BD7110'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/7DD16CA8'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">143</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br />
This time, as you can see, the table is about <code class="language-plaintext highlighter-rouge">20%</code> larger than before, that is <code class="language-plaintext highlighter-rouge">50 MB</code> of real data, but there is also the index on the primary key column to consider, which adds <code class="language-plaintext highlighter-rouge">43 MB</code>, for an overall total of <code class="language-plaintext highlighter-rouge">93 MB</code> of real data.
The WAL traffic, however, roughly doubled (<code class="language-plaintext highlighter-rouge">143 MB</code> against the previous <code class="language-plaintext highlighter-rouge">69 MB</code>), and it is still larger than the real data due to the structure of the records.</p>
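<p>To compare the two runs, it can help to express the WAL traffic as a ratio over the real data written. The following Python sketch simply redoes the arithmetic on the sizes reported by <code class="language-plaintext highlighter-rouge">pg_size_pretty</code> above; since those values are rounded to the megabyte, the ratios are approximate, and the names used are only illustrative.</p>

```python
# Sizes in MB, as reported (and rounded) by pg_size_pretty in the sessions above
first_run = {"data": 42, "wal": 69}          # single text column, no index
second_run = {"data": 50 + 43, "wal": 143}   # wider table plus primary key index


def amplification(run: dict) -> float:
    """WAL megabytes generated per megabyte of table and index data."""
    return run["wal"] / run["data"]


# WAL-to-data ratio of each run
print(f"{amplification(first_run):.2f}")
print(f"{amplification(second_run):.2f}")

# How much the WAL traffic grew between the two runs
wal_growth = second_run["wal"] / first_run["wal"]
print(f"{wal_growth:.2f}")
```

<p>In both runs the WAL is roughly one and a half times the size of the data it protects, while the absolute WAL traffic about doubles once the index comes into play.</p>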
<h2 id="doing-a-rollback">Doing a rollback</h2>
<p>What happens if the transaction is rolled back?
<br />
WALs are managed as <em>append-only</em> storage, so the records written by the transaction stay in place and the WAL traffic is generated anyway. It is quite easy to verify this:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">00000007000000160000009</span><span class="n">E</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">9</span><span class="n">EC3A680</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">begin</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">00000007000000160000009</span><span class="n">E</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">9</span><span class="n">EC3A680</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">logged_table</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'logged '</span> <span class="o">||</span> <span class="n">v</span> <span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">0000000700000016000000</span><span class="n">A7</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="n">A7AFA000</span> <span class="o">|</span> <span class="mi">50</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">rollback</span><span class="p">;</span>
<span class="k">ROLLBACK</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">0000000700000016000000</span><span class="n">A7</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="n">A7AFAB50</span> <span class="o">|</span> <span class="mi">50</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000092</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">92</span><span class="n">D11AD8</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">begin</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">000000070000001600000092</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">92</span><span class="n">D11AD8</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">logged_table</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">select</span> <span class="s1">'logged '</span> <span class="o">||</span> <span class="n">v</span> <span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">00000007000000160000009</span><span class="n">B</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">9</span><span class="n">BBD0000</span> <span class="o">|</span> <span class="mi">50</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/9BBD0000'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/92D11AD8'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">143</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">rollback</span><span class="p">;</span>
<span class="k">ROLLBACK</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------</span>
<span class="mi">00000007000000160000009</span><span class="n">B</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="mi">9</span><span class="n">BBD1F40</span> <span class="o">|</span> <span class="mi">50</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/9BBD1F40'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/92D11AD8'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">143</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Before the transaction starts, the current LSN is <code class="language-plaintext highlighter-rouge">16/92D11AD8</code> and it remains unchanged until the transaction actually does some work. Right before the <code class="language-plaintext highlighter-rouge">ROLLBACK</code> the LSN is <code class="language-plaintext highlighter-rouge">16/9BBD0000</code>, and immediately after the <code class="language-plaintext highlighter-rouge">ROLLBACK</code> it has moved forward to <code class="language-plaintext highlighter-rouge">16/9BBD1F40</code>. Therefore, <em>simply</em> issuing a <code class="language-plaintext highlighter-rouge">ROLLBACK</code> caused the WAL to grow by about <code class="language-plaintext highlighter-rouge">8 kB</code>, on top of the <code class="language-plaintext highlighter-rouge">143 MB</code> already written by the insert, which are not reclaimed.</p>
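<p>The three LSNs quoted above split the WAL traffic of the transaction into two phases: the work done by the insert and the tail observed across the <code class="language-plaintext highlighter-rouge">ROLLBACK</code>. The following Python sketch redoes that split; the variable and function names are hypothetical. Keep in mind that <code class="language-plaintext highlighter-rouge">pg_current_wal_lsn()</code> reports the WAL write location, so attributing the whole second delta to the <code class="language-plaintext highlighter-rouge">ROLLBACK</code> is an approximation.</p>

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN ('high/low' in hex) into a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# LSNs taken from the second rollback session above
lsn_begin = "16/92D11AD8"            # right after BEGIN, nothing done yet
lsn_before_rollback = "16/9BBD0000"  # after the one million tuple insert
lsn_after_rollback = "16/9BBD1F40"   # right after ROLLBACK

# WAL written while the transaction was doing its work
insert_bytes = parse_lsn(lsn_before_rollback) - parse_lsn(lsn_begin)

# WAL growth observed between the last check and the ROLLBACK
# (approximate: the function used in the session reports the write location)
rollback_bytes = parse_lsn(lsn_after_rollback) - parse_lsn(lsn_before_rollback)

print(insert_bytes, round(insert_bytes / 1024 ** 2))
print(rollback_bytes)
```

<p>The first figure is what the rolled back insert left behind in the WAL, the second one is the small extra amount recorded around the <code class="language-plaintext highlighter-rouge">ROLLBACK</code> itself.</p>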
<h2 id="pg_waldump"><code class="language-plaintext highlighter-rouge">pg_waldump</code></h2>
<p>The <code class="language-plaintext highlighter-rouge">pg_waldump</code> command-line tool displays the contents of the WAL in a human-readable form.
<br />
<strong>You need to have the WAL segments</strong> you want to inspect: as trivial as it may sound, you will not be able to <em>observe</em> your transaction if the database has executed a <code class="language-plaintext highlighter-rouge">CHECKPOINT</code> and has recycled the WAL segments (but you can archive them if you want to inspect <em>old</em> transactions).</p>
<p><br />
Let’s replay our rollback transaction to get fresh LSN values:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">truncate</span> <span class="n">logged_table</span> <span class="p">;</span>
<span class="k">TRUNCATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">begin</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table_pkey'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------|----------------</span>
<span class="mi">0000000700000016000000</span><span class="n">C3</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="n">C3E18BB0</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">logged_table</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span> <span class="k">values</span><span class="p">(</span> <span class="s1">'a single record'</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table_pkey'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------|----------------</span>
<span class="mi">0000000700000016000000</span><span class="n">C3</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="n">C3E18BB0</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=*></span> <span class="k">rollback</span><span class="p">;</span>
<span class="k">ROLLBACK</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">),</span>
<span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table'</span> <span class="p">)</span> <span class="p">),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'logged_table_pkey'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="n">pg_size_pretty</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------------------|--------------------|----------------|----------------</span>
<span class="mi">0000000700000016000000</span><span class="n">C3</span> <span class="o">|</span> <span class="mi">16</span><span class="o">/</span><span class="n">C3E18D48</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span> <span class="o">|</span> <span class="mi">16</span> <span class="n">kB</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="s1">'16/C3E18D48'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="o">-</span> <span class="s1">'16/C3E18BB0'</span><span class="p">::</span><span class="n">pg_lsn</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">408</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
</code></pre></div></div>
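<p>The subtraction between the two <code class="language-plaintext highlighter-rouge">pg_lsn</code> values can also be cross-checked outside the database. What follows is only a minimal sketch: the function names are made up for this example, and the WAL file name computation assumes the default 16 MB segment size and takes the timeline as an explicit argument.</p>

```shell
#!/bin/sh
# Hypothetical helpers, not part of PostgreSQL: plain shell arithmetic on LSNs.

# Convert a textual LSN such as 16/C3E18D48 into an absolute byte position.
lsn_to_bytes() {
    hi=${1%%/*}   # hexadecimal digits before the slash: high 32 bits
    lo=${1##*/}   # hexadecimal digits after the slash: low 32 bits
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Number of WAL bytes between two LSNs (start, end).
lsn_diff() {
    echo $(( $(lsn_to_bytes "$2") - $(lsn_to_bytes "$1") ))
}

# WAL segment file name for a timeline and an LSN.
# ASSUMPTION: default 16 MB segments, hence the shift by 24 bits.
walfile_name() {
    tli=$1
    hi=${2%%/*}
    lo=${2##*/}
    printf '%08X%08X%08X\n' "$tli" "0x$hi" $(( 0x$lo >> 24 ))
}

lsn_diff 16/C3E18BB0 16/C3E18D48     # prints 408
walfile_name 7 16/C3E18D48           # prints 0000000700000016000000C3
```

<p>Both results match what <code class="language-plaintext highlighter-rouge">pg_size_pretty</code> and <code class="language-plaintext highlighter-rouge">pg_walfile_name</code> reported in the session above.</p>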
<p><br />
<br />
Why insert a single tuple this time? Because <code class="language-plaintext highlighter-rouge">pg_waldump</code> produces very verbose output, and I don’t want to mess with a ton of <code class="language-plaintext highlighter-rouge">INSERT</code>s.
<br />
The above generated a very small amount of WAL traffic, <code class="language-plaintext highlighter-rouge">408 bytes</code> exactly. Let’s inspect what is in the WALs by means of <code class="language-plaintext highlighter-rouge">pg_waldump</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres /usr/pgsql-13/bin/pg_waldump <span class="nt">-p</span> <span class="nv">$PGDATA</span>/pg_wal <span class="nt">-s</span> 16/C3E18BB0 <span class="nt">-e</span> 16/C3E18D48 <span class="nt">-t</span> 7
rmgr: Heap tx: 3562600, lsn: 16/C3E18BB0, prev 16/C3E18B50, desc: INSERT+INIT off 1 flags 0x00, blkref <span class="c">#0: rel 1663/89735/89935 blk 0</span>
rmgr: Btree tx: 3562600, lsn: 16/C3E18C00, prev 16/C3E18BB0, desc: NEWROOT lev 0, blkref <span class="c">#0: rel 1663/89735/89937 blk 1, blkref #2: rel 1663/89735/89937 blk 0</span>
rmgr: Btree tx: 3562600, lsn: 16/C3E18C68, prev 16/C3E18C00, desc: INSERT_LEAF off 1, blkref <span class="c">#0: rel 1663/89735/89937 blk 1</span>
rmgr: Transaction tx: 3562600, lsn: 16/C3E18CA8, prev 16/C3E18C68, desc: ABORT 2021-07-13 04:59:03.235599 EDT
</code></pre></div></div>
<p><br />
<br />
I’ve removed part of the information to better fit the screen size.
<br />
The first entry on the top is the execution of the <code class="language-plaintext highlighter-rouge">INSERT</code> statement, followed by two entries that create the values in the index, and last there is the <code class="language-plaintext highlighter-rouge">ABORT</code>, that is the <code class="language-plaintext highlighter-rouge">ROLLBACK</code> statement.
<br />
As you can see, every record clearly indicates its own LSN by means of the <code class="language-plaintext highlighter-rouge">lsn</code> field, as well as a pointer to the previous record (i.e., the <code class="language-plaintext highlighter-rouge">prev</code> LSN). This allows PostgreSQL to read the WAL stream from the end and go back in history to find the exact boundaries of a piece of work.</p>
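<p>The backward chain can be verified on the records shown above. This is only a sketch: <code class="language-plaintext highlighter-rouge">check_prev_chain</code> is a hypothetical helper, and it assumes that the lines are contiguous records in the text format printed by <code class="language-plaintext highlighter-rouge">pg_waldump</code> (descriptions shortened here).</p>

```shell
#!/bin/sh
# Hypothetical helper: read pg_waldump-style lines on stdin and verify that
# every record's "prev" value equals the "lsn" of the record printed before it.
check_prev_chain() {
    awk '
    {
        lsn = ""; prev = ""
        for (i = 1; i < NF; i++) {
            if ($i == "lsn:") lsn  = $(i + 1)
            if ($i == "prev") prev = $(i + 1)
        }
        gsub(/,/, "", lsn); gsub(/,/, "", prev)   # drop the trailing commas
        if (NR > 1 && prev != last) {
            printf "broken chain at %s (prev %s, expected %s)\n", lsn, prev, last
            bad = 1
        }
        last = lsn
    }
    END {
        if (bad) exit 1
        printf "chain ok: %d records\n", NR
    }'
}

check_prev_chain <<'EOF'
rmgr: Heap        tx: 3562600, lsn: 16/C3E18BB0, prev 16/C3E18B50, desc: INSERT+INIT
rmgr: Btree       tx: 3562600, lsn: 16/C3E18C00, prev 16/C3E18BB0, desc: NEWROOT
rmgr: Btree       tx: 3562600, lsn: 16/C3E18C68, prev 16/C3E18C00, desc: INSERT_LEAF
rmgr: Transaction tx: 3562600, lsn: 16/C3E18CA8, prev 16/C3E18C68, desc: ABORT
EOF
# prints: chain ok: 4 records
```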
Using ora2pg to do a kind of backup2021-06-11T00:00:00+00:00https://fluca1978.github.io/2021/06/11/ora2pgBackup<p>How I implemented a kind of Oracle-to-PostgreSQL backup.</p>
<h1 id="using-ora2pg-to-do-a-kind-of-backup">Using ora2pg to do a kind of backup</h1>
<p><strong>Disclaimer: <code class="language-plaintext highlighter-rouge">ora2pg</code> is an amazing tool, but it is not supposed to be used as a backup tool!</strong>
<br />
In this article I’m going to show you how I decided to implement a kind of <em>Oracle-to-PostgreSQL</em> backup by means of <a href="https://ora2pg.darold.net/" target="_blank"><code class="language-plaintext highlighter-rouge">ora2pg</code></a>.
<br />
<br />
It all started as a simple need: <em>migrate an Oracle database to PostgreSQL to do some experiments</em>.
<br />
Therefore I fired up an <code class="language-plaintext highlighter-rouge">ora2pg</code> project, and started from there in order to do the migration.
<br />
End of the story.
<br />
But then I was asked to migrate the same database <em>again</em>, because in the meantime something had changed.
<br />
And then again, and again.
<br />
I’m not saying I was asked to keep the databases synchronized, but rather to occasionally load an updated set of data (and structures) from Oracle into PostgreSQL.
<br />
Lazy as I am, after a couple of requests I wrote a simple shell script to automate the job, at least the part about running <code class="language-plaintext highlighter-rouge">ora2pg</code>. Yes, this could be less trivial than you think, since <code class="language-plaintext highlighter-rouge">ora2pg</code> relies on the Oracle Instant Client being installed (with all the environment set), and on Perl being ready with <code class="language-plaintext highlighter-rouge">DBD::Oracle</code>, <code class="language-plaintext highlighter-rouge">DBI</code> and other modules in the right place. This is a little complicated on my machines because I tend to experiment, so I have a lot of different stuff installed, and I have to fire up the right Perl, with the right modules and the right environment (I use <a href="https://perlbrew.pl" target="_blank"><code class="language-plaintext highlighter-rouge">perlbrew</code></a>, in case you are wondering). In other words, some setup work was necessary before I could run <code class="language-plaintext highlighter-rouge">ora2pg</code>, and that was a perfect candidate for a real shell script.
<br />
Then, the number of the databases to do this work on became two, and this was a call for a parametric script…you get the point!
<br />
Last but not least, I was not sure when the migration would happen or when I would be asked to load a new bunch of stuff into PostgreSQL, and since my memory is lazier than I am, I do not always remember all the steps required to load what <code class="language-plaintext highlighter-rouge">ora2pg</code> extracted into our beloved database.
<br />
And therefore I decided to write a simple shell script to allow me to:</p>
<ul>
<li>extract data from a customizable Oracle database, assuming the <code class="language-plaintext highlighter-rouge">ora2pg</code> project is configured;</li>
<li>place the data and structures in a well defined space on my storage;</li>
<li>create a compact and clear <code class="language-plaintext highlighter-rouge">psql</code> script to load the extraction into PostgreSQL (yes, I know <code class="language-plaintext highlighter-rouge">ora2pg</code> can do this automagically with a PostgreSQL connection, but I have to do it offline).</li>
</ul>
<p><br />
<br />
Let’s start with how I add new databases to my script:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">ORACLE_HOME</span><span class="o">=</span>/opt/oracle/instantclient_18_3
<span class="nb">source</span> ~/perl5/perlbrew/etc/bashrc
<span class="nv">DATE_DIR</span><span class="o">=</span><span class="sb">`</span><span class="nb">date</span> <span class="s1">'+%Y-%m-%d'</span><span class="sb">`</span>
<span class="nv">BACKUP_ROOT</span><span class="o">=</span>/backup
<span class="nv">ORACLE_PG_TEMPLATE</span><span class="o">=</span><span class="s2">"my_oracle_template"</span>
do_ora2pg ora-srv ORADB1 /backup/ora2pg/ORADB1
do_ora2pg ora-srv ORADB2 /backup/ora2pg/ORADB2
</code></pre></div></div>
<p><br />
<br />
The initial part is used to set up Perl and Oracle Instant Client.
<br />
Each <code class="language-plaintext highlighter-rouge">do_ora2pg</code> line defines a single extraction; the arguments to the <code class="language-plaintext highlighter-rouge">do_ora2pg</code> shell function are:</p>
<ul>
<li>Oracle host name;</li>
<li>Oracle schema to which I need to connect;</li>
<li>path to the <code class="language-plaintext highlighter-rouge">ora2pg</code> project.</li>
</ul>
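<p>Since the function takes three positional arguments, a small guard can catch a wrong invocation before anything is written to disk. This is a hypothetical addition and not part of my script: the name <code class="language-plaintext highlighter-rouge">check_ora2pg_args</code> and the return codes are assumptions, while the <code class="language-plaintext highlighter-rouge">config/ora2pg.conf</code> path is the one the function reads from the project folder.</p>

```shell
#!/bin/sh
# Hypothetical guard (not in the original script): validate the three
# arguments that do_ora2pg expects before starting an extraction.
check_ora2pg_args() {
    if [ "$#" -ne 3 ]; then
        echo "usage: do_ora2pg ORACLE_HOST ORACLE_SCHEMA ORA2PG_PROJECT_DIR" >&2
        return 2
    fi
    # The extraction reads its configuration from the project folder.
    if [ ! -f "$3/config/ora2pg.conf" ]; then
        echo "missing ora2pg configuration: $3/config/ora2pg.conf" >&2
        return 1
    fi
    return 0
}

# Possible usage at the top of do_ora2pg:
#   check_ora2pg_args "$@" || return 1
```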
<p><br />
<br />
What does the <code class="language-plaintext highlighter-rouge">do_ora2pg</code> shell function do?
<br />
Here it is:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>do_ora2pg<span class="o">()</span>
<span class="o">{</span>
<span class="nb">local </span><span class="nv">SERVER_NAME</span><span class="o">=</span><span class="nv">$1</span>
<span class="nb">local </span><span class="nv">ORACLE_SCHEMA</span><span class="o">=</span><span class="nv">$2</span>
<span class="nb">local </span><span class="nv">ORACLE_PROJECT_FOLDER</span><span class="o">=</span><span class="nv">$3</span>
<span class="nb">local </span><span class="nv">BACKUP_DIR</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">BACKUP_ROOT</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">SERVER_NAME</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">DATE_DIR</span><span class="k">}</span><span class="s2">/ora2pg/</span><span class="k">${</span><span class="nv">ORACLE_SCHEMA</span><span class="k">}</span><span class="s2">"</span>
<span class="nb">local </span><span class="nv">PG_DATABASE</span><span class="o">=</span><span class="si">$(</span> <span class="nb">echo</span> <span class="nv">$ORACLE_SCHEMA</span> | <span class="nb">tr</span> <span class="s1">'[:upper:]'</span> <span class="s1">'[:lower:]'</span> <span class="si">)</span>
<span class="k">if</span> <span class="o">[</span> <span class="o">!</span> <span class="nt">-d</span> <span class="s2">"</span><span class="nv">$BACKUP_DIR</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
</span><span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$BACKUP_DIR</span><span class="s2">"</span>
<span class="k">else
</span><span class="nb">rm</span> <span class="nv">$BACKUP_DIR</span>/<span class="k">*</span>.sql <span class="o">></span> /dev/null 2>&1
<span class="k">fi
</span><span class="nb">echo</span> <span class="nt">-e</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">{ </span><span class="nv">$ORACLE_SCHEMA</span><span class="s2"> }</span><span class="se">\n\t</span><span class="s2">=> PostgreSQL Dump in [</span><span class="nv">$BACKUP_DIR</span><span class="s2">]</span><span class="se">\n</span><span class="s2">"</span>
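<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$BACKUP_DIR</span><span class="s2">"</span> <span class="o">||</span> <span class="k">return </span>1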
<span class="nv">TYPES</span><span class="o">=</span><span class="s2">"TABLE VIEW MVIEW INSERT SEQUENCE FUNCTION PROCEDURE TRIGGER"</span>
<span class="nv">counter</span><span class="o">=</span>1
<span class="nb">cat</span> <span class="o"><<</span><span class="no">EOF</span><span class="sh"> > all.sql
-- Automatic PostgreSQL reload from Oracle
-- </span><span class="nv">$ORACLE_SCHEMA</span><span class="sh">
-- </span><span class="nv">$BACKUP_DIR</span><span class="sh">
</span><span class="se">\s</span><span class="sh">et ON_ERROR_STOP 1
</span><span class="se">\s</span><span class="sh">et QUIET 1
</span><span class="se">\e</span><span class="sh">cho Reload of Oracle schema </span><span class="nv">$PG_DATABASE</span><span class="sh">
DROP DATABASE IF EXISTS </span><span class="nv">$PG_DATABASE</span><span class="sh">;
CREATE DATABASE </span><span class="nv">$PG_DATABASE</span><span class="sh"> WITH TEMPLATE </span><span class="nv">$ORACLE_PG_TEMPLATE</span><span class="sh">;
</span><span class="se">\c</span><span class="sh"> </span><span class="nv">$PG_DATABASE</span><span class="sh">
CREATE EXTENSION IF NOT EXISTS orafce;
</span><span class="se">\e</span><span class="sh">cho Connected to </span><span class="nv">$PG_DATABASE</span><span class="sh">
</span><span class="se">\e</span><span class="sh">cho Starting the loading batch
</span><span class="se">\e</span><span class="sh">cho
</span><span class="no">
EOF
</span> <span class="k">for </span>t <span class="k">in</span> <span class="nv">$TYPES</span>
<span class="k">do
</span><span class="nv">output</span><span class="o">=</span><span class="si">$(</span><span class="nb">printf</span> <span class="s2">"%02d"</span> <span class="nv">$counter</span><span class="si">)</span>-<span class="nv">$t</span>.sql
<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$t</span><span class="s2"> => </span><span class="nv">$output</span><span class="s2"> "</span>
<span class="nb">echo</span> <span class="nt">-e</span> <span class="s2">"</span><span class="se">\n\\\e</span><span class="s2">cho Batch to load : </span><span class="nv">$output</span><span class="s2"> ..."</span> <span class="o">>></span> all.sql
ora2pg <span class="nt">-c</span> <span class="k">${</span><span class="nv">ORACLE_PROJECT_FOLDER</span><span class="k">}</span>/config/ora2pg.conf <span class="nt">-t</span> <span class="nv">$t</span> <span class="nt">-o</span> <span class="nv">$output</span> <span class="o">>></span> ora2pg.log 2>&1
<span class="k">if</span> <span class="o">[</span> <span class="nv">$?</span> <span class="nt">-eq</span> 0 <span class="o">]</span><span class="p">;</span> <span class="k">then
</span><span class="nb">echo</span> <span class="s2">"OK"</span>
<span class="nb">echo</span> <span class="nt">-e</span> <span class="s2">"</span><span class="se">\\\i</span><span class="s2"> </span><span class="nv">$output</span><span class="s2">"</span> <span class="o">>></span> all.sql
<span class="k">else
</span><span class="nb">echo</span> <span class="s2">"KO"</span>
<span class="nb">echo</span> <span class="nt">-e</span> <span class="s2">"</span><span class="se">\\\e</span><span class="s2">cho NOT LOADING!"</span> <span class="o">>></span> all.sql
<span class="k">fi
</span><span class="nv">counter</span><span class="o">=</span><span class="k">$((</span> counter <span class="o">+</span> <span class="m">1</span> <span class="k">))</span>
<span class="k">done</span>
<span class="o">}</span>
</code></pre></div></div>
<p><br />
<br />
The function initially creates a <code class="language-plaintext highlighter-rouge">BACKUP_DIR</code> that is named after a well defined root and after the date the backup is taken on (I assume no more than one backup per day). The idea is that the backup directory will result in something like <code class="language-plaintext highlighter-rouge">/backup/ora-srv/2021-06-11/ora2pg/ORADB1</code>.
After a quick check about the existence of the backup directory, the script creates a file named <code class="language-plaintext highlighter-rouge">all.sql</code> in that directory, placing some <code class="language-plaintext highlighter-rouge">psql</code> directives into it.
<br />
Then the script executes <code class="language-plaintext highlighter-rouge">ora2pg</code> for the object types I care about, producing a differently numbered file name for every invocation, for example <code class="language-plaintext highlighter-rouge">01-TABLE.sql</code> for table structures (schema).
If the dump of an object type succeeds, the <code class="language-plaintext highlighter-rouge">\i</code> inclusion of that file is placed into <code class="language-plaintext highlighter-rouge">all.sql</code>, otherwise an alert is inserted.
<br />
The resulting <code class="language-plaintext highlighter-rouge">all.sql</code> is a file like the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Automatic PostgreSQL reload from Oracle</span>
<span class="c1">-- ORADB1</span>
<span class="c1">-- /backup/DATI/ora-srv/2021-06-10/ora2pg/ORADB1</span>
<span class="err">\</span><span class="k">set</span> <span class="n">ON_ERROR_STOP</span> <span class="mi">1</span>
<span class="err">\</span><span class="k">set</span> <span class="n">QUIET</span> <span class="mi">1</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Reload</span> <span class="k">of</span> <span class="n">Oracle</span> <span class="k">schema</span> <span class="n">db1</span>
<span class="k">DROP</span> <span class="k">DATABASE</span> <span class="n">IF</span> <span class="k">EXISTS</span> <span class="n">db1</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">db1</span> <span class="k">WITH</span> <span class="k">TEMPLATE</span> <span class="n">my_oracle_template</span><span class="p">;</span>
<span class="err">\</span><span class="k">c</span> <span class="n">db1</span>
<span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">orafce</span><span class="p">;</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Connected</span> <span class="k">to</span> <span class="n">db1</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Starting</span> <span class="n">the</span> <span class="n">loading</span> <span class="n">batch</span>
<span class="err">\</span><span class="n">echo</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">01</span><span class="o">-</span><span class="k">TABLE</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">01</span><span class="o">-</span><span class="k">TABLE</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">02</span><span class="o">-</span><span class="k">VIEW</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">02</span><span class="o">-</span><span class="k">VIEW</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">03</span><span class="o">-</span><span class="n">MVIEW</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">03</span><span class="o">-</span><span class="n">MVIEW</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">04</span><span class="o">-</span><span class="k">INSERT</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">04</span><span class="o">-</span><span class="k">INSERT</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">05</span><span class="o">-</span><span class="n">SEQUENCE</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">05</span><span class="o">-</span><span class="n">SEQUENCE</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">06</span><span class="o">-</span><span class="k">FUNCTION</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">06</span><span class="o">-</span><span class="k">FUNCTION</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">07</span><span class="o">-</span><span class="k">PROCEDURE</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">07</span><span class="o">-</span><span class="k">PROCEDURE</span><span class="p">.</span><span class="k">sql</span>
<span class="err">\</span><span class="n">echo</span> <span class="n">Batch</span> <span class="k">to</span> <span class="k">load</span> <span class="p">:</span> <span class="mi">08</span><span class="o">-</span><span class="k">TRIGGER</span><span class="p">.</span><span class="k">sql</span> <span class="p">...</span>
<span class="err">\</span><span class="n">i</span> <span class="mi">08</span><span class="o">-</span><span class="k">TRIGGER</span><span class="p">.</span><span class="k">sql</span>
</code></pre></div></div>
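<p>The numbering scheme behind these file names can be tried in isolation, without Oracle or <code class="language-plaintext highlighter-rouge">ora2pg</code> at hand. This is a minimal sketch: <code class="language-plaintext highlighter-rouge">batch_file_names</code> is a hypothetical helper that only reproduces the counter and the <code class="language-plaintext highlighter-rouge">printf</code> format used in the loop of <code class="language-plaintext highlighter-rouge">do_ora2pg</code>.</p>

```shell
#!/bin/sh
# Hypothetical dry run: print the file name that each export type would get,
# in the same order in which the types are passed on the command line.
batch_file_names() {
    counter=1
    for t in "$@"; do
        # Two-digit, zero-padded prefix so that the files sort in load order.
        printf '%02d-%s.sql\n' "$counter" "$t"
        counter=$(( counter + 1 ))
    done
}

batch_file_names TABLE VIEW MVIEW
# prints:
# 01-TABLE.sql
# 02-VIEW.sql
# 03-MVIEW.sql
```

<p>The order of the types matters, because it is also the order in which the files are loaded: tables have to exist before the data, views and triggers that depend on them.</p>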
<p><br />
<br />
Therefore, the only thing I have to do when I want to <em>migrate</em> the Oracle content into PostgreSQL, is to launch a command like:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> luca template1 < all.sql
</code></pre></div></div>
<p><br />
<br />
and wait. This is something easy enough for me to remember even if I have not slept well!
<br />
I’ve experimented with this for a few weeks now, and it is really useful for my use case.
<br />
Please note that I create the extension <code class="language-plaintext highlighter-rouge">orafce</code> in the reloaded database, because we use some functions that can be dumped and reloaded cleanly thanks to this extension. For that reason, the database on the PostgreSQL side is created by means of a specific template that has the extension already installed.</p>
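<p>The reload command can be wrapped in a tiny helper too. Treat this as a sketch under assumptions: <code class="language-plaintext highlighter-rouge">load_dump</code> and the <code class="language-plaintext highlighter-rouge">PSQL</code> override variable are made up for this example, while the <code class="language-plaintext highlighter-rouge">psql</code> options used are the standard ones. The helper changes into the dump directory first, because the <code class="language-plaintext highlighter-rouge">\i</code> directives in <code class="language-plaintext highlighter-rouge">all.sql</code> use relative file names.</p>

```shell
#!/bin/sh
# Hypothetical wrapper (not in the original script): reload a dump directory.
# Usage: load_dump DUMP_DIRECTORY [POSTGRESQL_USER]
load_dump() {
    dir=$1
    user=${2:-$USER}
    if [ ! -f "$dir/all.sql" ]; then
        echo "no all.sql found in $dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left untouched.
    # ON_ERROR_STOP is already set inside all.sql; repeating it is harmless.
    # PSQL can be overridden (e.g., PSQL=echo) to see the command only.
    ( cd "$dir" && ${PSQL:-psql} -U "$user" -d template1 -v ON_ERROR_STOP=1 -f all.sql )
}

# Example: load_dump /backup/ora-srv/2021-06-11/ora2pg/ORADB1 luca
```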
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">ora2pg</code> is an amazing tool, that can be used and abused in different ways including doing backups!
<br />
I’m sure there are smarter ways to achieve the same goal, and I will report back if I learn about them, so please let me know if you have suggestions!</p>
Template Databases2021-06-08T00:00:00+00:00https://fluca1978.github.io/2021/06/08/PostgreSQLTemplateDatabase<p>PostgreSQL relies on the concept of template databases to create a new one.</p>
<h1 id="template-databases">Template Databases</h1>
<p>PostgreSQL relies on the concept of <em>template</em> as a way to create a new database.
The idea is similar to that of <code class="language-plaintext highlighter-rouge">/etc/skel</code> on Unix operating systems: whenever you create a new user, their home directory is cloned from <code class="language-plaintext highlighter-rouge">/etc/skel</code>. In PostgreSQL the idea is similar: whenever you create a new database, it is <strong>cloned</strong> from a template database.
<br />
<br />
PostgreSQL ships with two template databases, namely <code class="language-plaintext highlighter-rouge">template1</code> and <code class="language-plaintext highlighter-rouge">template0</code>.
<br />
<code class="language-plaintext highlighter-rouge">template1</code> is the default template, meaning that when you execute a plain <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code> the system will clone that database as the new one. In other words:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">foo</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br />
is the same as
<br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">foo</span> <span class="k">WITH</span> <span class="k">TEMPLATE</span> <span class="n">template1</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>One advantage of this technique is that whatever object you put into <code class="language-plaintext highlighter-rouge">template1</code>, you will find in the new database(s). This can be handy when you have to deal with multiple databases with similar or identical objects, but can be a nightmare if you screw up your template database.
<br />
Then there is <code class="language-plaintext highlighter-rouge">template0</code>, that is used as a <em>backup</em> for <code class="language-plaintext highlighter-rouge">template1</code> (in case you screw up) or as a special template database for handling particular situations like a different encoding.</p>
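<p>As an illustration of the <em>different encoding</em> case, the following sketch builds the statement to create a database from <code class="language-plaintext highlighter-rouge">template0</code>. The helper name, the database name <code class="language-plaintext highlighter-rouge">latin_db</code> and the chosen encoding and locale are assumptions made only for this example; the names are not quoted or validated, so use it with trusted input only.</p>

```shell
#!/bin/sh
# Hypothetical helper: print a CREATE DATABASE statement that clones template0
# with an explicit encoding and locale.
# Arguments: database name, encoding, locale (used for both collate and ctype).
create_db_from_template0_sql() {
    printf "CREATE DATABASE %s WITH TEMPLATE template0 ENCODING '%s' LC_COLLATE '%s' LC_CTYPE '%s';\n" \
        "$1" "$2" "$3" "$3"
}

create_db_from_template0_sql latin_db LATIN1 C
# prints: CREATE DATABASE latin_db WITH TEMPLATE template0 ENCODING 'LATIN1' LC_COLLATE 'C' LC_CTYPE 'C';
```

<p>The output can be piped into <code class="language-plaintext highlighter-rouge">psql</code>. Cloning <code class="language-plaintext highlighter-rouge">template0</code> is what makes the encoding change possible, since <code class="language-plaintext highlighter-rouge">template1</code> could already contain data tied to its own encoding.</p>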
<h2 id="working-with-different-templates">Working with different templates</h2>
<p>You can create your own template database, that you can then use as a <em>base</em> to create other databases:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">emplate1</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">my_template</span> <span class="k">WITH</span>
<span class="n">IS_TEMPLATE</span> <span class="o">=</span> <span class="k">true</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span>
<span class="n">template1</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">a_new_database</span>
<span class="k">WITH</span> <span class="k">TEMPLATE</span> <span class="n">my_template</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Having templates is handy; however, a database does not have to be marked as a template in order to be used to build a new one. Change the previous template so that it is no longer a template database, and then build another database from it:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template1</span><span class="o">=#</span> <span class="k">ALTER</span> <span class="k">DATABASE</span> <span class="n">my_template</span>
<span class="k">WITH</span> <span class="n">IS_TEMPLATE</span> <span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">DATABASE</span>
<span class="n">template1</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">a_new_database_from_no_template</span>
<span class="k">WITH</span> <span class="k">TEMPLATE</span> <span class="n">my_template</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, you can use a normal (i.e., not template) database to build a new database too!
<br />
This is possible only if done by a superuser (or by the owner of the source database)!</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template1</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">db_from_user</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span>
<span class="n">template1</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">db_from_user_and_template</span>
<span class="k">WITH</span> <span class="k">TEMPLATE</span> <span class="n">my_template</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">to</span> <span class="k">copy</span> <span class="k">database</span> <span class="nv">"my_template"</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, a normal user can create a new database from a template database, but not from a non-template one.
<br />
<strong>Templates can be used by both normal users and superusers, but only superusers (or the owner of the source database) can create a new database from a database that is not marked as a template</strong>.</p>
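<p>The rule above can be summarized as a tiny decision function. The following Python snippet is only a sketch of the permission check as described here and in the PostgreSQL documentation (which also lets the owner of a non-template database copy it); it is not the server's actual code, and the function and parameter names are made up for illustration:</p>

```python
def can_clone(source_is_template: bool,
              is_superuser: bool,
              has_createdb: bool,
              owns_source: bool) -> bool:
    """Toy model of who may use a database as the source of CREATE DATABASE."""
    if is_superuser:
        # superusers can copy any database
        return True
    if not has_createdb:
        # without CREATEDB a normal user cannot create databases at all
        return False
    # a normal user needs either a template or a database they own
    return source_is_template or owns_source


# normal user with CREATEDB, source marked as a template
print(can_clone(True, False, True, False))
# same user, source no longer marked as a template (the failing case above)
print(can_clone(False, False, True, False))
```

<p>The second call models the <code class="language-plaintext highlighter-rouge">permission denied to copy database</code> error shown above.</p>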
<h2 id="connections-while-creating-a-database">Connections while creating a database</h2>
<p>While <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code> is running, there must be no other connections to the source database, no matter whether it is a template or a normal database. The reason is that, in order to clone the database safely, there must be no activity on it.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template1</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">db_from_user_while_template1_in_use</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">source</span> <span class="k">database</span> <span class="nv">"template1"</span> <span class="k">is</span> <span class="n">being</span> <span class="n">accessed</span> <span class="k">by</span> <span class="n">other</span> <span class="n">users</span>
<span class="n">DETAIL</span><span class="p">:</span> <span class="n">There</span> <span class="k">is</span> <span class="mi">1</span> <span class="n">other</span> <span class="k">session</span> <span class="k">using</span> <span class="n">the</span> <span class="k">database</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
<br />
Here it is: the message clearly states that there is some kind of activity on <code class="language-plaintext highlighter-rouge">template1</code>, and therefore it is not safe to clone such a database.
<br />
The same happens with a non-template database:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template1</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">db_from_user_while_my_template_in_use</span>
<span class="k">WITH</span> <span class="k">TEMPLATE</span> <span class="n">my_template</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">source</span> <span class="k">database</span> <span class="nv">"my_template"</span> <span class="k">is</span> <span class="n">being</span> <span class="n">accessed</span> <span class="k">by</span> <span class="n">other</span> <span class="n">users</span>
<span class="n">DETAIL</span><span class="p">:</span> <span class="n">There</span> <span class="k">is</span> <span class="mi">1</span> <span class="n">other</span> <span class="k">session</span> <span class="k">using</span> <span class="n">the</span> <span class="k">database</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
<br />
It is interesting to note that it does not matter <em>what kind of activity</em> is ongoing in the database used as a template: a single connection (even an idle one) is enough to prevent <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code> from proceeding.
<br />
On the other hand, the system prevents any incoming connection from being established against the template database until <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code> has finished and hence <em>releases</em> the database.</p>
<h1 id="conclusions">Conclusions</h1>
<p>Template databases are used as a <em>skeleton</em> to be cloned when a new database is going to be created.
<br />
The cluster can survive even without template databases, but not having the default one(s) makes <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code> less comfortable to use. You can build your own templates, and this is recommended to avoid tainting the default one(s), but you will need to specify your template name within every <code class="language-plaintext highlighter-rouge">CREATE DATABASE</code>.
<br />
Last, the system will not allow you to use a database as a template if there are active connections to it (other than your own), because cloning would become unsafe.</p>
pgbackrest lands on FreeBSD!2021-06-03T00:00:00+00:00https://fluca1978.github.io/2021/06/03/pgbackrestFreeBSD<p>pgbackrest has been inserted into the FreeBSD ports!</p>
<h1 id="pgbackrest-lands-on-freebsd">pgbackrest lands on FreeBSD!</h1>
<p>At last it happened: <a href="https://pgbackrest.org/" target="_blank">pgbackrest, my <strong>favourite backup solution for PostgreSQL</strong></a> is now <a href="https://cgit.freebsd.org/ports/commit/?id=c49969ef4f68de362c260f7822e212c8045f7e6a" target="_blank">available in the FreeBSD ports tree</a>, my favourite operating system!
<br />
<br />
Thanks to the efforts of people involved in <a href="https://github.com/pgbackrest/pgbackrest/issues/1293" target="_blank">this issue</a> it is now possible to get pgbackrest installed <em>easily</em> (or in a simpler way) on FreeBSD!</p>
PostgreSQL Builtin Trigger Function to Speed Up Updates2021-06-03T00:00:00+00:00https://fluca1978.github.io/2021/06/03/PostgreSQLUpdateTrigger<p>Did you know PostgreSQL ships with a pre-built trigger function that can speed up UPDATES?</p>
<h1 id="postgresql-builtin-trigger-function-to-speed-up-updates">PostgreSQL Builtin Trigger Function to Speed Up Updates</h1>
<p>PostgreSQL ships with an <em>internal</em> trigger function, named <code class="language-plaintext highlighter-rouge">suppress_redundant_updates_trigger</code> that can be used to avoid idempotent updates on a table.
<br />
The <a href="https://www.postgresql.org/docs/12/functions-trigger.html" target="_blank">online documentation</a> explains very well how to use it, including the fact that the trigger should be fired last in the trigger chain, and so its name should be the last one in alphabetical order.
<br />
But is it worth using such a function?
<br />
Let’s find out with a very trivial example on the well-known <code class="language-plaintext highlighter-rouge">pgbench</code> database. First of all, let’s consider the initial setup:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">),</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'pgbench_accounts'</span> <span class="p">)</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pgbench_accounts</span><span class="p">;</span>
<span class="k">count</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">----------|----------------</span>
<span class="mi">10000000</span> <span class="o">|</span> <span class="mi">1281</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now, let’s execute an idempotent <code class="language-plaintext highlighter-rouge">UPDATE</code>, that is, one that does not change anything, and monitor the timing:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="err">\</span><span class="n">timing</span>
<span class="n">Timing</span> <span class="k">is</span> <span class="k">on</span><span class="p">.</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="k">UPDATE</span> <span class="mi">10000000</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">307939</span><span class="p">,</span><span class="mi">763</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">05</span><span class="p">:</span><span class="mi">07</span><span class="p">,</span><span class="mi">940</span><span class="p">)</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'pgbench_accounts'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">2561</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">180</span><span class="p">,</span><span class="mi">732</span> <span class="n">ms</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note how the table has doubled in size: this is bloat caused by every row being replaced by an exact copy of itself.
<br />
Now, let’s create the trigger using the <code class="language-plaintext highlighter-rouge">suppress_redundant_updates_trigger</code> function, and let’s run the same update again, but after a server restart so that the memory is cleaned up too.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TRIGGER</span> <span class="n">tr_avoid_idempotent_updates</span>
<span class="k">BEFORE</span> <span class="k">UPDATE</span> <span class="k">ON</span> <span class="n">pgbench_accounts</span>
<span class="k">FOR</span> <span class="k">EACH</span> <span class="k">ROW</span>
<span class="k">EXECUTE</span> <span class="k">FUNCTION</span> <span class="n">suppress_redundant_updates_trigger</span><span class="p">();</span>
<span class="c1">-- restart the server</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="err">\</span><span class="n">timing</span>
<span class="n">Timing</span> <span class="k">is</span> <span class="k">on</span><span class="p">.</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="k">UPDATE</span> <span class="mi">0</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">287588</span><span class="p">,</span><span class="mi">607</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">04</span><span class="p">:</span><span class="mi">47</span><span class="p">,</span><span class="mi">589</span><span class="p">)</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'pgbench_accounts'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">2561</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The total gain was about <code class="language-plaintext highlighter-rouge">20 secs</code>, that is a speed-up of roughly <code class="language-plaintext highlighter-rouge">7%</code>, which is not much at all.
<br />
However, note how the <code class="language-plaintext highlighter-rouge">UPDATE</code> reports that <strong>zero tuples have been touched</strong>: while the speed-up is not really exciting, the table has not bloated any further, and its size remains the same as before the <code class="language-plaintext highlighter-rouge">UPDATE</code> itself.</p>
<p><br />
After a full vacuum, the speed-up is much larger, but this could be a side effect of some pages already being in memory:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">VACUUM</span> <span class="k">FULL</span> <span class="n">pgbench_accounts</span> <span class="p">;</span>
<span class="k">VACUUM</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">222455</span><span class="p">,</span><span class="mi">150</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">03</span><span class="p">:</span><span class="mi">42</span><span class="p">,</span><span class="mi">455</span><span class="p">)</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="k">UPDATE</span> <span class="mi">0</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">198104</span><span class="p">,</span><span class="mi">981</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">03</span><span class="p">:</span><span class="mi">18</span><span class="p">,</span><span class="mi">105</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>However, even after a reboot of the server, the time remains lower:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="k">UPDATE</span> <span class="mi">0</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">184217</span><span class="p">,</span><span class="mi">260</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">03</span><span class="p">:</span><span class="mi">04</span><span class="p">,</span><span class="mi">217</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
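<p>So on a <em>not bloated</em> table the same update takes about <code class="language-plaintext highlighter-rouge">40%</code> less time than the initial run, that is, it is roughly <code class="language-plaintext highlighter-rouge">1.67</code> times faster, which is much more interesting!</p>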
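<p>The timings reported above can be double-checked with a few lines of Python. This is just a sketch of the arithmetic: the numbers are the <code class="language-plaintext highlighter-rouge">\timing</code> values shown in the sessions above, while the helper names are mine. Note that a ratio of about 1.67 between the first and the last run corresponds to roughly 40% less elapsed time.</p>

```python
def time_saved_pct(baseline_ms: float, new_ms: float) -> float:
    """Percentage of elapsed time saved with respect to the baseline."""
    return (baseline_ms - new_ms) / baseline_ms * 100


def speedup_factor(baseline_ms: float, new_ms: float) -> float:
    """How many times faster the new run is compared to the baseline."""
    return baseline_ms / new_ms


# \timing values (milliseconds) taken from the psql sessions above
NO_TRIGGER_MS = 307939.763            # plain idempotent UPDATE
TRIGGER_BLOATED_MS = 287588.607       # trigger in place, bloated table
TRIGGER_AFTER_VACUUM_MS = 184217.260  # trigger in place, after VACUUM FULL and restart

print(f"bloated table: {time_saved_pct(NO_TRIGGER_MS, TRIGGER_BLOATED_MS):.1f}% less time")
print(f"after vacuum:  {time_saved_pct(NO_TRIGGER_MS, TRIGGER_AFTER_VACUUM_MS):.1f}% less time")
print(f"after vacuum:  {speedup_factor(NO_TRIGGER_MS, TRIGGER_AFTER_VACUUM_MS):.2f}x faster")
```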
<h2 id="timing-the-trigger-execution">Timing the trigger execution</h2>
<p>How long does it take to execute the trigger function against every row? It is possible to get this information with <code class="language-plaintext highlighter-rouge">EXPLAIN ANALYZE</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span><span class="n">FORMAT</span> <span class="n">yaml</span><span class="p">,</span> <span class="k">ANALYZE</span><span class="p">,</span> <span class="k">VERBOSE</span><span class="p">,</span> <span class="n">TIMING</span> <span class="p">)</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">---------------------------------------------------</span>
<span class="o">-</span> <span class="n">Plan</span><span class="p">:</span> <span class="o">+</span>
<span class="n">Node</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"ModifyTable"</span> <span class="o">+</span>
<span class="k">Operation</span><span class="p">:</span> <span class="nv">"Update"</span> <span class="o">+</span>
<span class="n">Parallel</span> <span class="n">Aware</span><span class="p">:</span> <span class="k">false</span> <span class="o">+</span>
<span class="n">Relation</span> <span class="n">Name</span><span class="p">:</span> <span class="nv">"pgbench_accounts"</span> <span class="o">+</span>
<span class="k">Schema</span><span class="p">:</span> <span class="nv">"public"</span> <span class="o">+</span>
<span class="k">Alias</span><span class="p">:</span> <span class="nv">"pgbench_accounts"</span> <span class="o">+</span>
<span class="n">Startup</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">+</span>
<span class="n">Total</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">263935</span><span class="p">.</span><span class="mi">00</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">10000000</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="n">Width</span><span class="p">:</span> <span class="mi">103</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Startup</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">153053</span><span class="p">.</span><span class="mi">980</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Total</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">153377</span><span class="p">.</span><span class="mi">845</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">0</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Loops</span><span class="p">:</span> <span class="mi">1</span> <span class="o">+</span>
<span class="n">Plans</span><span class="p">:</span> <span class="o">+</span>
<span class="o">-</span> <span class="n">Node</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"Seq Scan"</span> <span class="o">+</span>
<span class="n">Parent</span> <span class="n">Relationship</span><span class="p">:</span> <span class="nv">"Member"</span> <span class="o">+</span>
<span class="n">Parallel</span> <span class="n">Aware</span><span class="p">:</span> <span class="k">false</span> <span class="o">+</span>
<span class="n">Relation</span> <span class="n">Name</span><span class="p">:</span> <span class="nv">"pgbench_accounts"</span> <span class="o">+</span>
<span class="k">Schema</span><span class="p">:</span> <span class="nv">"public"</span> <span class="o">+</span>
<span class="k">Alias</span><span class="p">:</span> <span class="nv">"pgbench_accounts"</span> <span class="o">+</span>
<span class="n">Startup</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">+</span>
<span class="n">Total</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">263935</span><span class="p">.</span><span class="mi">00</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">10000000</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="n">Width</span><span class="p">:</span> <span class="mi">103</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Startup</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">8</span><span class="p">.</span><span class="mi">968</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Total</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">44542</span><span class="p">.</span><span class="mi">939</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">10000000</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Loops</span><span class="p">:</span> <span class="mi">1</span> <span class="o">+</span>
<span class="k">Output</span><span class="p">:</span> <span class="o">+</span>
<span class="o">-</span> <span class="nv">"aid"</span> <span class="o">+</span>
<span class="o">-</span> <span class="nv">"bid"</span> <span class="o">+</span>
<span class="o">-</span> <span class="nv">"abalance"</span> <span class="o">+</span>
<span class="o">-</span> <span class="nv">"filler"</span> <span class="o">+</span>
<span class="o">-</span> <span class="nv">"ctid"</span> <span class="o">+</span>
<span class="n">Planning</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">24</span><span class="p">.</span><span class="mi">475</span> <span class="o">+</span>
<span class="n">Triggers</span><span class="p">:</span> <span class="o">+</span>
<span class="o">-</span> <span class="k">Trigger</span> <span class="n">Name</span><span class="p">:</span> <span class="nv">"tr_avoid_idempotent_updates"</span><span class="o">+</span>
<span class="n">Relation</span><span class="p">:</span> <span class="nv">"pgbench_accounts"</span> <span class="o">+</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">1510</span><span class="p">.</span><span class="mi">272</span> <span class="o">+</span>
<span class="n">Calls</span><span class="p">:</span> <span class="mi">10000000</span> <span class="o">+</span>
<span class="n">Execution</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">159552</span><span class="p">.</span><span class="mi">624</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, running the trigger requires roughly <code class="language-plaintext highlighter-rouge">1.5 secs</code> for <code class="language-plaintext highlighter-rouge">10 million</code> tuples.
<br />
Assuming the timing is accurate and stable enough, it means <code class="language-plaintext highlighter-rouge">0.00015 msecs</code> for every tuple, which is not much overhead after all.</p>
<p><br />
<br />
It is possible to experiment against another table, in order to see whether the timing of the trigger execution depends on the data types and their content:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">create</span> <span class="k">table</span> <span class="n">stuff</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span><span class="p">,</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">stuff</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span> <span class="k">SELECT</span> <span class="n">repeat</span><span class="p">(</span> <span class="s1">'abc'</span><span class="p">,</span> <span class="mi">1000</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2000000</span> <span class="p">);</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TRIGGER</span> <span class="n">tr_avoid_idempotent_updates</span>
<span class="k">BEFORE</span> <span class="k">UPDATE</span> <span class="k">ON</span> <span class="n">stuff</span>
<span class="k">FOR</span> <span class="k">EACH</span> <span class="k">ROW</span>
<span class="k">EXECUTE</span> <span class="k">FUNCTION</span> <span class="n">suppress_redundant_updates_trigger</span><span class="p">();</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span> <span class="n">FORMAT</span> <span class="n">yaml</span><span class="p">,</span> <span class="k">ANALYZE</span><span class="p">,</span> <span class="k">VERBOSE</span><span class="p">,</span> <span class="n">TIMING</span> <span class="p">)</span>
<span class="k">UPDATE</span> <span class="n">stuff</span> <span class="k">SET</span> <span class="n">t</span> <span class="o">=</span> <span class="n">t</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">Triggers</span><span class="p">:</span> <span class="o">+</span>
<span class="o">-</span> <span class="k">Trigger</span> <span class="n">Name</span><span class="p">:</span> <span class="nv">"tr_avoid_idempotent_updates"</span><span class="o">+</span>
<span class="n">Relation</span><span class="p">:</span> <span class="nv">"stuff"</span> <span class="o">+</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">223</span><span class="p">.</span><span class="mi">227</span> <span class="o">+</span>
<span class="n">Calls</span><span class="p">:</span> <span class="mi">2000000</span> <span class="o">+</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Again, the mean execution time of the trigger is <code class="language-plaintext highlighter-rouge">0.00011 msecs</code> per tuple, and very similar (if not equal) results can be obtained with the <code class="language-plaintext highlighter-rouge">pk</code> column, so I would say that <em>the execution time of the trigger does not depend on the specific type of the column(s) being updated</em>.</p>
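<p>The per-tuple figures come from dividing the trigger <code class="language-plaintext highlighter-rouge">Time</code> by the number of <code class="language-plaintext highlighter-rouge">Calls</code> reported by <code class="language-plaintext highlighter-rouge">EXPLAIN ANALYZE</code>. Here is a minimal Python sketch of that computation, using the values shown above; the helper name is only illustrative:</p>

```python
def per_call_ms(trigger_time_ms: float, calls: int) -> float:
    """Mean time spent in the trigger for a single row, in milliseconds."""
    return trigger_time_ms / calls


# values taken from the "Triggers" section of the EXPLAIN ANALYZE outputs above
accounts_ms = per_call_ms(1510.272, 10_000_000)  # pgbench_accounts
stuff_ms = per_call_ms(223.227, 2_000_000)       # stuff

print(f"pgbench_accounts: {accounts_ms:.5f} ms per tuple")
print(f"stuff:            {stuff_ms:.5f} ms per tuple")

# share of the whole statement spent in the trigger (pgbench_accounts run)
trigger_share_pct = 1510.272 / 159552.624 * 100
print(f"trigger share of execution time: {trigger_share_pct:.2f}%")
```

<p>In other words, in the <code class="language-plaintext highlighter-rouge">pgbench_accounts</code> run the trigger accounts for less than one percent of the total execution time.</p>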
<h1 id="the-black-behing-the-triger-funtion">The Black Magic Behind the Trigger Function</h1>
<p>The <code class="language-plaintext highlighter-rouge">suppress_redundant_updates_trigger</code> is defined in the file <code class="language-plaintext highlighter-rouge">utils/adt/trigfuncs.c</code>, and the magic happens in the following piece of code:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="cm">/* if the tuple payload is the same ... */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">newtuple</span><span class="o">-></span><span class="n">t_len</span> <span class="o">==</span> <span class="n">oldtuple</span><span class="o">-></span><span class="n">t_len</span> <span class="o">&&</span>
<span class="n">newheader</span><span class="o">-></span><span class="n">t_hoff</span> <span class="o">==</span> <span class="n">oldheader</span><span class="o">-></span><span class="n">t_hoff</span> <span class="o">&&</span>
<span class="p">(</span><span class="n">HeapTupleHeaderGetNatts</span><span class="p">(</span><span class="n">newheader</span><span class="p">)</span> <span class="o">==</span>
<span class="n">HeapTupleHeaderGetNatts</span><span class="p">(</span><span class="n">oldheader</span><span class="p">))</span> <span class="o">&&</span>
<span class="p">((</span><span class="n">newheader</span><span class="o">-></span><span class="n">t_infomask</span> <span class="o">&</span> <span class="o">~</span><span class="n">HEAP_XACT_MASK</span><span class="p">)</span> <span class="o">==</span>
<span class="p">(</span><span class="n">oldheader</span><span class="o">-></span><span class="n">t_infomask</span> <span class="o">&</span> <span class="o">~</span><span class="n">HEAP_XACT_MASK</span><span class="p">))</span> <span class="o">&&</span>
<span class="n">memcmp</span><span class="p">(((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span> <span class="n">newheader</span><span class="p">)</span> <span class="o">+</span> <span class="n">SizeofHeapTupleHeader</span><span class="p">,</span>
<span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span> <span class="n">oldheader</span><span class="p">)</span> <span class="o">+</span> <span class="n">SizeofHeapTupleHeader</span><span class="p">,</span>
<span class="n">newtuple</span><span class="o">-></span><span class="n">t_len</span> <span class="o">-</span> <span class="n">SizeofHeapTupleHeader</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="cm">/* ... then suppress the update */</span>
<span class="n">rettuple</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>This code essentially compares the old and the new tuples to see whether they have the same length, the same header offset, the same number of attributes, the same information flags (ignoring the transaction-related bits) and, of course, the same in-memory content (by means of <code class="language-plaintext highlighter-rouge">memcmp(3)</code>).</p>
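<p>To make the logic easier to follow, here is a toy model of the same checks in Python. It is only a sketch and not how PostgreSQL represents tuples: the <code class="language-plaintext highlighter-rouge">ToyTuple</code> class and its fields are hypothetical, the payload stands for the bytes that follow the fixed header, and the mask value is assumed to match the <code class="language-plaintext highlighter-rouge">HEAP_XACT_MASK</code> definition in the PostgreSQL headers.</p>

```python
from dataclasses import dataclass

# assumed value of the visibility-related bits mask (illustrative)
HEAP_XACT_MASK = 0xFFF0


@dataclass
class ToyTuple:
    hoff: int       # offset of the user data within the tuple
    natts: int      # number of attributes
    infomask: int   # flag bits, including the transaction-related ones
    payload: bytes  # raw bytes following the fixed-size header


def is_redundant_update(old: ToyTuple, new: ToyTuple) -> bool:
    """Return True when the new tuple is byte-for-byte the same as the old one."""
    return (
        len(new.payload) == len(old.payload)
        and new.hoff == old.hoff
        and new.natts == old.natts
        # ignore the transaction-related bits when comparing the flags
        and (new.infomask & ~HEAP_XACT_MASK) == (old.infomask & ~HEAP_XACT_MASK)
        # stands in for the memcmp() call
        and new.payload == old.payload
    )


old_row = ToyTuple(hoff=24, natts=4, infomask=0x0102, payload=b"filler-data")
same_row = ToyTuple(hoff=24, natts=4, infomask=0x0902, payload=b"filler-data")
changed_row = ToyTuple(hoff=24, natts=4, infomask=0x0102, payload=b"filler-DATA")

print(is_redundant_update(old_row, same_row))     # differs only in transaction bits
print(is_redundant_update(old_row, changed_row))  # payload differs
```

<p>When the check succeeds, the real trigger returns <code class="language-plaintext highlighter-rouge">NULL</code>, which is what makes the <code class="language-plaintext highlighter-rouge">UPDATE</code> skip the row.</p>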
<h2 id="doing-in-plpgsql">Doing it in <code class="language-plaintext highlighter-rouge">plpgsql</code></h2>
<p>It is possible to implement a basic function in <code class="language-plaintext highlighter-rouge">plpgsql</code> by means of the <code class="language-plaintext highlighter-rouge">IS DISTINCT FROM</code> operator:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_avoid_idempotent_updates</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">TRIGGER</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="n">IF</span> <span class="k">NEW</span><span class="p">.</span><span class="o">*</span> <span class="k">IS</span> <span class="k">DISTINCT</span> <span class="k">FROM</span> <span class="k">OLD</span><span class="p">.</span><span class="o">*</span> <span class="k">THEN</span>
<span class="k">RETURN</span> <span class="k">NEW</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="k">RETURN</span> <span class="k">NULL</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and the execution with this trigger in place results in:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">drop</span> <span class="k">trigger</span> <span class="n">tr_avoid_idempotent_updates</span> <span class="k">on</span> <span class="n">pgbench_accounts</span><span class="p">;</span>
<span class="k">DROP</span> <span class="k">TRIGGER</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">create</span> <span class="k">trigger</span> <span class="n">tr_avoid_idempotent_updates</span>
<span class="k">before</span> <span class="k">update</span> <span class="k">on</span> <span class="n">pgbench_accounts</span>
<span class="k">for</span> <span class="k">each</span> <span class="k">row</span>
<span class="k">execute</span> <span class="k">function</span> <span class="n">f_avoid_idempotent_updates</span><span class="p">();</span>
<span class="k">CREATE</span> <span class="k">TRIGGER</span>
<span class="n">pgbench</span><span class="o">=></span> <span class="k">update</span> <span class="n">pgbench_accounts</span> <span class="k">set</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="k">UPDATE</span> <span class="mi">0</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">167400</span><span class="p">,</span><span class="mi">098</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">02</span><span class="p">:</span><span class="mi">47</span><span class="p">,</span><span class="mi">400</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and if you track function executions:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">pg_stat_user_functions</span> <span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">--------------------------</span>
<span class="n">funcid</span> <span class="o">|</span> <span class="mi">36672</span>
<span class="n">schemaname</span> <span class="o">|</span> <span class="k">public</span>
<span class="n">funcname</span> <span class="o">|</span> <span class="n">f_avoid_idempotent_updates</span>
<span class="n">calls</span> <span class="o">|</span> <span class="mi">10000000</span>
<span class="n">total_time</span> <span class="o">|</span> <span class="mi">21276</span><span class="p">.</span><span class="mi">741</span>
<span class="n">self_time</span> <span class="o">|</span> <span class="mi">21276</span><span class="p">.</span><span class="mi">741</span>
</code></pre></div></div>
<p><br />
<br />
that indicates that <code class="language-plaintext highlighter-rouge">21 secs</code> are spent running the trigger function, that is roughly <code class="language-plaintext highlighter-rouge">0.0021 msecs</code> for each tuple. This is far more expensive than the built-in C function (which took roughly <code class="language-plaintext highlighter-rouge">0.00015 msecs</code> per tuple).
<br />
The <code class="language-plaintext highlighter-rouge">EXPLAIN ANALYZE</code> output confirms similar figures:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span><span class="n">FORMAT</span> <span class="n">yaml</span><span class="p">,</span> <span class="k">ANALYZE</span><span class="p">,</span> <span class="n">TIMING</span> <span class="p">)</span>
<span class="k">UPDATE</span> <span class="n">pgbench_accounts</span> <span class="k">SET</span> <span class="n">filler</span> <span class="o">=</span> <span class="n">filler</span><span class="p">;</span>
<span class="p">...</span>
<span class="o">|</span> <span class="n">Triggers</span><span class="p">:</span> <span class="o">+</span>
<span class="o">|</span> <span class="o">-</span> <span class="k">Trigger</span> <span class="n">Name</span><span class="p">:</span> <span class="nv">"tr_avoid_idempotent_updates"</span><span class="o">+</span>
<span class="o">|</span> <span class="n">Relation</span><span class="p">:</span> <span class="nv">"pgbench_accounts"</span> <span class="o">+</span>
<span class="o">|</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">23002</span><span class="p">.</span><span class="mi">383</span> <span class="o">+</span>
<span class="o">|</span> <span class="n">Calls</span><span class="p">:</span> <span class="mi">10000000</span> <span class="o">+</span>
<span class="o">|</span> <span class="n">Execution</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">163343</span><span class="p">.</span><span class="mi">183</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Here the <code class="language-plaintext highlighter-rouge">Time</code> is around <code class="language-plaintext highlighter-rouge">23000 msecs</code>, whereas with the native C function it was about <code class="language-plaintext highlighter-rouge">1500 msecs</code>.</p>
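<p>The per-tuple figure quoted above can be derived directly from the statistics view. The following is a minimal sketch, assuming that <code class="language-plaintext highlighter-rouge">track_functions</code> is set to <code class="language-plaintext highlighter-rouge">pl</code> or <code class="language-plaintext highlighter-rouge">all</code> so that the trigger function is tracked; the <code class="language-plaintext highlighter-rouge">msecs_per_call</code> alias is only illustrative:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- average time spent in each invocation, in milliseconds;
-- NULLIF() protects against a division by zero
SELECT funcname,
       calls,
       total_time,
       total_time / NULLIF( calls, 0 ) AS msecs_per_call
FROM pg_stat_user_functions
WHERE funcname = 'f_avoid_idempotent_updates';
</code></pre></div></div>
<p>With the numbers shown above, <code class="language-plaintext highlighter-rouge">21276.741 / 10000000</code> gives roughly <code class="language-plaintext highlighter-rouge">0.0021 msecs</code> per tuple.</p>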
<h1 id="conclusions">Conclusions</h1>
<p>The internal <code class="language-plaintext highlighter-rouge">suppress_redundant_updates_trigger</code> function can be useful for reducing <strong>both execution time and bloat</strong> when running large batches of <code class="language-plaintext highlighter-rouge">UPDATE</code>s.
<br />
The function is implemented in C and checks whether the in-memory content of the old and new tuples is the same. This makes the approach both fast and less error prone than a custom, user-defined trigger function.</p>
Memory inspection thru pg_buffercache2021-05-28T00:00:00+00:00https://fluca1978.github.io/2021/05/28/PostgreSQLMemoryFuntions<p>A tiny set of functions to glance at the memory usage in the PostgreSQL system.</p>
<h1 id="memory-inspection-thru-pg_buffercache">Memory inspection thru pg_buffercache</h1>
<p><code class="language-plaintext highlighter-rouge">pg_buffercache</code> is a very useful extension that allows for the inspection of the memory as used by a live PostgreSQL instance. The <a href="https://www.postgresql.org/docs/current/pgbuffercache.html" target="_blank">extension</a> is shipped as a contrib module and shows what is stored in memory, in other words how the <code class="language-plaintext highlighter-rouge">shared_buffers</code> are being used.
<br />
Thanks to this module it is possible to clearly understand the memory consumption and, therefore, to tune the <code class="language-plaintext highlighter-rouge">shared_buffers</code> parameter correctly.
<br />
A few years ago <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/pg_buffercache.sql" target="_blank">I wrote a set of example queries</a> to interact with the module and get a glance at the memory usage. While those queries were a starting point, they had some issues, especially when a table was not consuming any memory (division by zero, and so on).
<br />
<br />
I finally found the time to produce a cleaner approach to those queries, so I <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/memory.sql" target="_blank">re-implemented all the queries by means of functions</a>. The script is a <code class="language-plaintext highlighter-rouge">psql</code> script and uses some special backslash commands, but you can extract the pure SQL part and execute it by means of another client.
<br />
The script creates a <code class="language-plaintext highlighter-rouge">memory</code> schema and places all the functions into that schema; the functions have names that start with <code class="language-plaintext highlighter-rouge">f_memory</code>, so that they should not clash with existing functions.
<br />
In the following I describe every function.
<br />
Please note that the idea here is to provide some background about memory inspection: there is still room for improvements and fixes!</p>
<h2 id="installing-the-functions">Installing the functions</h2>
<p>It suffices to execute the <code class="language-plaintext highlighter-rouge">memory.sql</code> psql script to create the <code class="language-plaintext highlighter-rouge">memory</code> schema and all the functions within it. The script provides some information about the objects created:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="err">\</span><span class="n">i</span> <span class="n">memory</span><span class="p">.</span><span class="k">sql</span>
<span class="n">Creating</span> <span class="n">a</span> <span class="k">schema</span> <span class="n">named</span> <span class="n">memory</span><span class="p">...</span>
<span class="k">All</span> <span class="n">objects</span> <span class="n">created</span><span class="o">!</span>
<span class="n">Try</span> <span class="n">one</span> <span class="k">of</span> <span class="n">the</span> <span class="k">following</span> <span class="n">functions</span><span class="p">:</span>
<span class="o">-</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory</span><span class="p">()</span> <span class="k">to</span> <span class="k">get</span> <span class="n">very</span> <span class="n">basic</span> <span class="n">information</span>
<span class="o">-</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage</span><span class="p">()</span> <span class="k">to</span> <span class="k">get</span> <span class="n">information</span> <span class="n">about</span> <span class="n">the</span> <span class="n">whole</span> <span class="n">memory</span>
<span class="o">-</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_database</span><span class="p">()</span> <span class="k">to</span> <span class="k">get</span> <span class="n">information</span> <span class="n">about</span> <span class="n">single</span> <span class="n">databases</span>
<span class="o">-</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_table</span><span class="p">()</span> <span class="k">to</span> <span class="k">get</span> <span class="n">information</span> <span class="n">about</span> <span class="n">tables</span> <span class="k">in</span> <span class="n">the</span> <span class="k">current</span> <span class="k">database</span>
<span class="o">-</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_table_cumulative</span><span class="p">()</span> <span class="k">to</span> <span class="k">get</span> <span class="n">cumulative</span> <span class="n">information</span> <span class="k">for</span> <span class="n">tables</span>
<span class="n">You</span> <span class="n">can</span> <span class="k">add</span> <span class="n">the</span> <span class="n">memory</span> <span class="k">schema</span> <span class="k">to</span> <span class="n">the</span> <span class="k">search</span> <span class="n">path</span><span class="p">.</span>
<span class="n">Try</span> <span class="n">running</span> <span class="n">the</span> <span class="k">following</span> <span class="n">query</span> <span class="n">while</span> <span class="n">testing</span> <span class="n">the</span> <span class="k">database</span> <span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="k">g</span><span class="p">.,</span> <span class="n">via</span> <span class="n">pgbench</span><span class="p">):</span>
<span class="k">select</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage</span><span class="p">();</span>
<span class="err">\</span><span class="n">watch</span> <span class="mi">5</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="the-output-of-the-functions">The output of the functions</h2>
<p>All the functions accept a boolean <code class="language-plaintext highlighter-rouge">human</code> flag, which is set to <code class="language-plaintext highlighter-rouge">true</code> by default. If the flag is set, the memory sizes in the output are formatted using <code class="language-plaintext highlighter-rouge">pg_size_pretty()</code>, and are therefore in a <em>human readable format</em>. Otherwise the sizes are printed as a plain number of bytes.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory</span><span class="p">();</span>
<span class="n">total</span> <span class="o">|</span> <span class="n">used</span> <span class="o">|</span> <span class="k">free</span>
<span class="c1">--------|--------|--------</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">101</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">699</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory</span><span class="p">(</span> <span class="k">false</span> <span class="p">);</span>
<span class="n">total</span> <span class="o">|</span> <span class="n">used</span> <span class="o">|</span> <span class="k">free</span>
<span class="c1">-----------|-----------|-----------</span>
<span class="mi">838860800</span> <span class="o">|</span> <span class="mi">106168320</span> <span class="o">|</span> <span class="mi">732692480</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
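<p>As a minimal sketch of how such a flag can be honored, consider the following temporary function. The name <code class="language-plaintext highlighter-rouge">print_bytes_sketch</code> and its signature are hypothetical, and this is not necessarily how the functions in <code class="language-plaintext highlighter-rouge">memory.sql</code> are implemented:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical helper, created in the temporary schema
-- so that it disappears at the end of the session
CREATE OR REPLACE FUNCTION
pg_temp.print_bytes_sketch( bytes bigint, human boolean DEFAULT true )
RETURNS text
AS $CODE$
  SELECT CASE WHEN human THEN pg_size_pretty( bytes )  -- e.g., 800 MB
              ELSE bytes::text                         -- plain number of bytes
         END;
$CODE$
LANGUAGE sql;

-- functions in pg_temp must be invoked with the schema qualified name
SELECT pg_temp.print_bytes_sketch( 838860800 )        AS pretty,
       pg_temp.print_bytes_sketch( 838860800, false ) AS plain;
</code></pre></div></div>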
<h2 id="utility-functions">Utility functions</h2>
<p>There are a few utility functions that are used as a backbone to build the others. In particular:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">memory.f_check_pg_buffercache()</code> checks that the extension <code class="language-plaintext highlighter-rouge">pg_buffercache</code> is installed into the database;</li>
<li><code class="language-plaintext highlighter-rouge">memory.f_check_user()</code> checks that the user is either an administrator or has the privileges to run <code class="language-plaintext highlighter-rouge">pg_buffercache</code> functions;</li>
<li><code class="language-plaintext highlighter-rouge">memory.f_check()</code> calls the previous two functions and raises an exception if the check fails. This function is invoked by all the other <em>memory related</em> functions, so that before the function is run the user can get an alert about missing pieces;</li>
<li><code class="language-plaintext highlighter-rouge">memory.f_usagecounter_to_string()</code> provides a textual description of the <code class="language-plaintext highlighter-rouge">pg_buffercache.usagecount</code> value;</li>
<li><code class="language-plaintext highlighter-rouge">memory.f_tablename()</code> provides the name of a table, index, view, or anything else that will appear in the output of the other functions;</li>
<li><code class="language-plaintext highlighter-rouge">memory.f_print_bytes()</code> prints the amount of bytes as text, using either <code class="language-plaintext highlighter-rouge">pg_size_pretty()</code> or plain text conversion. This is used in every function to support the above mentioned <code class="language-plaintext highlighter-rouge">human</code> flag.</li>
</ul>
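<p>The first two checks can be sketched with plain catalog queries. This is only an illustration of the idea, not the code used in <code class="language-plaintext highlighter-rouge">memory.sql</code>; it assumes that access to <code class="language-plaintext highlighter-rouge">pg_buffercache</code> has been left at its defaults (superusers and members of <code class="language-plaintext highlighter-rouge">pg_monitor</code>), and the column aliases are made up:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- is the extension installed in the current database?
SELECT EXISTS ( SELECT 1
                FROM pg_extension
                WHERE extname = 'pg_buffercache' ) AS has_pg_buffercache;

-- is the current user a superuser or a member of pg_monitor?
-- (privileges granted explicitly by an administrator are not considered here)
SELECT ( SELECT rolsuper
         FROM pg_roles
         WHERE rolname = current_user )
       OR pg_has_role( current_user, 'pg_monitor', 'MEMBER' ) AS can_inspect;
</code></pre></div></div>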
<h2 id="available-functions">Available functions</h2>
<p>The available functions to inspect the memory usage are described in the following.</p>
<h3 id="f_memory">f_memory()</h3>
<p>The function <code class="language-plaintext highlighter-rouge">memory.f_memory()</code> provides a glance at free and used memory in the cluster.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory</span><span class="p">();</span>
<span class="n">total</span> <span class="o">|</span> <span class="n">used</span> <span class="o">|</span> <span class="k">free</span>
<span class="c1">--------|--------|--------</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">163</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">637</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
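<p>A similar result can be obtained by querying <code class="language-plaintext highlighter-rouge">pg_buffercache</code> directly. The following is a minimal sketch, not the body of <code class="language-plaintext highlighter-rouge">memory.f_memory()</code>; it relies on the fact that the view has one row per shared buffer and that unused buffers have a null <code class="language-plaintext highlighter-rouge">relfilenode</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- one row per buffer: count(*) is the total number of buffers,
-- count( relfilenode ) skips the nulls, that is the unused buffers
SELECT pg_size_pretty( count(*)
                       * current_setting( 'block_size' )::bigint ) AS total,
       pg_size_pretty( count( relfilenode )
                       * current_setting( 'block_size' )::bigint ) AS used,
       pg_size_pretty( ( count(*) - count( relfilenode ) )
                       * current_setting( 'block_size' )::bigint ) AS free
FROM pg_buffercache;
</code></pre></div></div>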
<h3 id="f_memory_usage">f_memory_usage()</h3>
<p>The function <code class="language-plaintext highlighter-rouge">memory.f_memory_usage()</code> provides a more detailed view about the usage of the memory. In particular it provides the amount of memory used by level of <code class="language-plaintext highlighter-rouge">usagecount</code>.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage</span><span class="p">();</span>
<span class="n">total_memory</span> <span class="o">|</span> <span class="n">memory</span> <span class="o">|</span> <span class="n">percent</span> <span class="o">|</span> <span class="n">cumulative</span> <span class="o">|</span> <span class="n">description</span>
<span class="c1">--------------|---------|---------|------------|----------------</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">22</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">71</span> <span class="o">%</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">71</span><span class="o">%</span> <span class="o">|</span> <span class="n">VERY</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">2536</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">31</span> <span class="o">%</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">02</span><span class="o">%</span> <span class="o">|</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">4</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">1936</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">24</span> <span class="o">%</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">26</span><span class="o">%</span> <span class="o">|</span> <span class="n">MID</span> <span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">1888</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">23</span> <span class="o">%</span> <span class="o">|</span> <span class="mi">3</span><span class="p">.</span><span class="mi">49</span><span class="o">%</span> <span class="o">|</span> <span class="n">LOW</span> <span class="p">(</span><span class="mi">2</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">135</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">16</span><span class="p">.</span><span class="mi">85</span> <span class="o">%</span> <span class="o">|</span> <span class="mi">20</span><span class="p">.</span><span class="mi">34</span><span class="o">%</span> <span class="o">|</span> <span class="n">VERY</span> <span class="n">LOW</span> <span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">637</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">79</span><span class="p">.</span><span class="mi">66</span> <span class="o">%</span> <span class="o">|</span> <span class="mi">100</span><span class="p">.</span><span class="mi">00</span><span class="o">%</span> <span class="o">|</span> <span class="o">==</span> <span class="k">FREE</span> <span class="o">==</span> <span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="p">(</span><span class="mi">6</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">memory</code> column provides the amount of memory used at a specific usage level, and the <code class="language-plaintext highlighter-rouge">percent</code> column provides the ratio of that amount to the total memory. The <code class="language-plaintext highlighter-rouge">cumulative</code> column provides the running total of the percentages, summing the current usage level and all the higher ones.
<br />
As an example, in the above there are <code class="language-plaintext highlighter-rouge">135 MB</code> of rarely used buffers (very low usage), and <code class="language-plaintext highlighter-rouge">20.34 %</code> of the memory is in use when all the levels from very high to very low are summed.</p>
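<p>The percentages and the running total can be sketched with a window function over <code class="language-plaintext highlighter-rouge">pg_buffercache</code>. This is an illustrative query, not the implementation of <code class="language-plaintext highlighter-rouge">memory.f_memory_usage()</code>; the <code class="language-plaintext highlighter-rouge">per_level</code> name is made up, and unused buffers (null <code class="language-plaintext highlighter-rouge">usagecount</code>) are simply folded into level zero:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WITH per_level AS (
  -- bytes held by the buffers at each usage level;
  -- unused buffers have a null usagecount and are counted as level 0
  SELECT coalesce( usagecount, 0 ) AS usagecount,
         count(*) * current_setting( 'block_size' )::bigint AS bytes
  FROM pg_buffercache
  GROUP BY 1
)
SELECT usagecount,
       pg_size_pretty( bytes ) AS memory,
       round( 100.0 * bytes / sum( bytes ) OVER (), 2 ) AS percent,
       -- running total from the highest usage level down to the current one
       round( 100.0 * sum( bytes ) OVER ( ORDER BY usagecount DESC )
              / sum( bytes ) OVER (), 2 ) AS cumulative
FROM per_level
ORDER BY usagecount DESC;
</code></pre></div></div>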
<h3 id="f_memory_usage_by_database">f_memory_usage_by_database()</h3>
<p>The function <code class="language-plaintext highlighter-rouge">memory.f_memory_usage_by_database()</code> provides information about the memory used by each database in the cluster, and also reports how much of every database is <em>cached</em>, that is the ratio between its size in memory and its size on disk.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pgbench</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_database</span><span class="p">();</span>
<span class="n">total_memory</span> <span class="o">|</span> <span class="k">database</span> <span class="o">|</span> <span class="n">size_in_memory</span> <span class="o">|</span> <span class="n">size_on_disk</span> <span class="o">|</span> <span class="n">percent_cached</span> <span class="o">|</span> <span class="n">percent_of_memory</span>
<span class="c1">--------------|-------------|----------------|--------------|----------------|-------------------</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">pgbench</span> <span class="o">|</span> <span class="mi">182</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">1505</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">12</span><span class="p">.</span><span class="mi">11</span><span class="o">%</span> <span class="o">|</span> <span class="mi">71</span><span class="p">.</span><span class="mi">15</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">ltdb</span> <span class="o">|</span> <span class="mi">608</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">171</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">35</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">23</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">postgres</span> <span class="o">|</span> <span class="mi">544</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">104</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">51</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">21</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">restore</span> <span class="o">|</span> <span class="mi">544</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">104</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">51</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">21</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">restore2</span> <span class="o">|</span> <span class="mi">544</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">104</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">51</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">21</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">restore3</span> <span class="o">|</span> <span class="mi">544</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">104</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">51</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">21</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">restore4</span> <span class="o">|</span> <span class="mi">544</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">8269</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">6</span><span class="p">.</span><span class="mi">58</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">21</span><span class="o">%</span>
<span class="mi">256</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">template1</span> <span class="o">|</span> <span class="mi">544</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">8245</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">6</span><span class="p">.</span><span class="mi">60</span><span class="o">%</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">21</span><span class="o">%</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
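<p>The same kind of report can be approximated with a join against <code class="language-plaintext highlighter-rouge">pg_database</code>. This is a minimal sketch, not the code of the function; buffers of shared catalogs (those with <code class="language-plaintext highlighter-rouge">reldatabase = 0</code>) are left out by the join, and <code class="language-plaintext highlighter-rouge">NULLIF()</code> avoids the division by zero mentioned at the beginning:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT d.datname AS database,
       pg_size_pretty( count(*)
                       * current_setting( 'block_size' )::bigint ) AS size_in_memory,
       pg_size_pretty( pg_database_size( d.oid ) ) AS size_on_disk,
       -- how much of the database is cached
       round( 100.0 * count(*) * current_setting( 'block_size' )::bigint
              / NULLIF( pg_database_size( d.oid ), 0 ), 2 ) AS percent_cached,
       -- how much of the shared buffers the database is using
       round( 100.0 * count(*)
              / ( SELECT count(*) FROM pg_buffercache ), 2 ) AS percent_of_memory
FROM pg_buffercache b
JOIN pg_database d ON d.oid = b.reldatabase
GROUP BY d.oid, d.datname
ORDER BY count(*) DESC;
</code></pre></div></div>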
<h3 id="f_memory_usage_by_table">f_memory_usage_by_table()</h3>
<p>The function <code class="language-plaintext highlighter-rouge">memory.f_memory_usage_by_table()</code> provides information about the memory usage of all <em>tabular like</em> objects in the current database, in other words about <em>relations</em> (tables, indexes, and so on), split by usage level.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_table</span><span class="p">();</span>
<span class="p">...</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">tfdb</span> <span class="o">|</span> <span class="p">(</span><span class="k">table</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m12</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">%</span> <span class="o">|</span> <span class="n">VERY</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">tfdb</span> <span class="o">|</span> <span class="p">(</span><span class="k">table</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m12</span> <span class="o">|</span> <span class="mi">22</span> <span class="n">MB</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">70</span> <span class="o">%</span> <span class="o">|</span> <span class="n">VERY</span> <span class="n">VERY</span> <span class="n">LOW</span> <span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">tfdb</span> <span class="o">|</span> <span class="p">(</span><span class="k">index</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m12_ts_idx</span> <span class="o">|</span> <span class="mi">32</span> <span class="n">kB</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">%</span> <span class="o">|</span> <span class="n">VERY</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="mi">800</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">tfdb</span> <span class="o">|</span> <span class="p">(</span><span class="k">index</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m12_ts_idx1</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">%</span> <span class="o">|</span> <span class="n">VERY</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
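<p>A comparable breakdown can be sketched by joining <code class="language-plaintext highlighter-rouge">pg_buffercache</code> with <code class="language-plaintext highlighter-rouge">pg_class</code>, along the lines of the example in the extension documentation. Again, this is only an illustration and not the body of the function; the <code class="language-plaintext highlighter-rouge">LIMIT</code> is arbitrary:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT n.nspname || '.' || c.relname AS relation,
       c.relkind,            -- r = table, i = index, and so on
       b.usagecount,
       pg_size_pretty( count(*)
                       * current_setting( 'block_size' )::bigint ) AS memory
FROM pg_buffercache b
JOIN pg_class c     ON b.relfilenode = pg_relation_filenode( c.oid )
JOIN pg_namespace n ON n.oid = c.relnamespace
-- only buffers of the current database, plus the shared catalogs
WHERE b.reldatabase IN ( 0, ( SELECT oid
                              FROM pg_database
                              WHERE datname = current_database() ) )
GROUP BY n.nspname, c.relname, c.relkind, b.usagecount
ORDER BY count(*) DESC
LIMIT 10;
</code></pre></div></div>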
<h3 id="f_memory_usage_by_table_cumulative">f_memory_usage_by_table_cumulative()</h3>
<p>The function <code class="language-plaintext highlighter-rouge">f_memory_usage_by_table_cumulative()</code> provides an overview of how much memory a single table is “consuming”, without any regard to the usage level counter.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_table_cumulative</span><span class="p">();</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-----|-----------------------------------------------</span>
<span class="n">total_memory</span> <span class="o">|</span> <span class="mi">800</span> <span class="n">MB</span>
<span class="k">database</span> <span class="o">|</span> <span class="n">tfdb</span>
<span class="n">relation</span> <span class="o">|</span> <span class="p">(</span><span class="k">table</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m07</span>
<span class="n">memory</span> <span class="o">|</span> <span class="mi">10</span> <span class="n">MB</span>
<span class="n">on_disk</span> <span class="o">|</span> <span class="mi">1159</span> <span class="n">MB</span>
<span class="n">percent_of_memory</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">27</span> <span class="o">%</span>
<span class="n">percent_of_disk</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">88</span><span class="o">%</span>
<span class="n">usagedescription</span> <span class="o">|</span> <span class="k">any</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="c1">-----|-----------------------------------------------</span>
<span class="n">total_memory</span> <span class="o">|</span> <span class="mi">800</span> <span class="n">MB</span>
<span class="k">database</span> <span class="o">|</span> <span class="n">tfdb</span>
<span class="n">relation</span> <span class="o">|</span> <span class="p">(</span><span class="k">table</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m06</span>
<span class="n">memory</span> <span class="o">|</span> <span class="mi">10</span> <span class="n">MB</span>
<span class="n">on_disk</span> <span class="o">|</span> <span class="mi">1156</span> <span class="n">MB</span>
<span class="n">percent_of_memory</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">26</span> <span class="o">%</span>
<span class="n">percent_of_disk</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">87</span><span class="o">%</span>
<span class="n">usagedescription</span> <span class="o">|</span> <span class="k">any</span>
<span class="p">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The function accepts the usual <code class="language-plaintext highlighter-rouge">human</code> argument, plus an optional integer argument that represents the usage counter you are interested in. When specified, the function shows only the amount of memory whose usage counter is greater than or equal to the given value.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">tfdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">memory</span><span class="p">.</span><span class="n">f_memory_usage_by_table_cumulative</span><span class="p">(</span> <span class="mi">5</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-----|-----------------------------------------------</span>
<span class="n">total_memory</span> <span class="o">|</span> <span class="mi">800</span> <span class="n">MB</span>
<span class="k">database</span> <span class="o">|</span> <span class="n">tfdb</span>
<span class="n">relation</span> <span class="o">|</span> <span class="p">(</span><span class="k">table</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m07</span>
<span class="n">memory</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="n">on_disk</span> <span class="o">|</span> <span class="mi">1159</span> <span class="n">MB</span>
<span class="n">percent_of_memory</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">%</span>
<span class="n">percent_of_disk</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="o">%</span>
<span class="n">usagedescription</span> <span class="o">|</span> <span class="o">>=</span> <span class="n">VERY</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="c1">-----|-----------------------------------------------</span>
<span class="n">total_memory</span> <span class="o">|</span> <span class="mi">800</span> <span class="n">MB</span>
<span class="k">database</span> <span class="o">|</span> <span class="n">tfdb</span>
<span class="n">relation</span> <span class="o">|</span> <span class="p">(</span><span class="k">table</span><span class="p">)</span> <span class="n">respi</span><span class="p">.</span><span class="n">y2019m06</span>
<span class="n">memory</span> <span class="o">|</span> <span class="mi">8192</span> <span class="n">bytes</span>
<span class="n">on_disk</span> <span class="o">|</span> <span class="mi">1156</span> <span class="n">MB</span>
<span class="n">percent_of_memory</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">%</span>
<span class="n">percent_of_disk</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="o">%</span>
<span class="n">usagedescription</span> <span class="o">|</span> <span class="o">>=</span> <span class="n">VERY</span> <span class="n">HIGH</span> <span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="p">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>The above set of functions can be used as a starting point to build your own set of queries to inspect the memory usage of a live PostgreSQL cluster. There is still room for improvement, including reducing the code duplication, so stay tuned for other versions!</p>
A glance at doas & pg_ctl2021-05-10T00:00:00+00:00https://fluca1978.github.io/2021/05/10/PostgreSQLdoAs<p>A possible system that differs from <code class="language-plaintext highlighter-rouge">sudo</code>.</p>
<h1 id="a-glance-at-doas--pg_ctl">A glance at doas & pg_ctl</h1>
<p><code class="language-plaintext highlighter-rouge">doas(1)</code> is a replacement for <code class="language-plaintext highlighter-rouge">sudo(1)</code>, a program that allows you to execute commands as a different user.
The main advantage of using <code class="language-plaintext highlighter-rouge">sudo(1)</code> and hence <code class="language-plaintext highlighter-rouge">doas(1)</code> is that you can gain different privileges without the need to know the authentication tokens (e.g., a password) to do that.
<br />
I use <code class="language-plaintext highlighter-rouge">sudo(1)</code> on pretty much every machine I use, both Linux and FreeBSD.
<br />
In this post I glance at <code class="language-plaintext highlighter-rouge">doas(1)</code> and how it can be quickly configured to run PostgreSQL commands, mainly <code class="language-plaintext highlighter-rouge">pg_ctl</code>.</p>
<h2 id="doas-introduction"><code class="language-plaintext highlighter-rouge">doas</code> introduction</h2>
<p><code class="language-plaintext highlighter-rouge">doas(1)</code> is a program that was born in the <a href="http://openbsd.org" target="_blank">OpenBSD</a> ecosystem as a replacement for <code class="language-plaintext highlighter-rouge">sudo(1)</code> because, in short, the latter is too big and cannot be easily integrated into the base system.
<br />
<code class="language-plaintext highlighter-rouge">doas</code> is now available on FreeBSD and Linux too, so it is worth spending some time to learn how it works.
<br />
<code class="language-plaintext highlighter-rouge">doas(1)</code> is based on a configuration file, namely <code class="language-plaintext highlighter-rouge">doas.conf</code> (in FreeBSD <code class="language-plaintext highlighter-rouge">/usr/local/etc/doas.conf</code>), that has a syntax a lot clearer than that of <code class="language-plaintext highlighter-rouge">sudo</code>, at least in my opinion.
<br /></p>
<p>Rules are pretty simple:</p>
<ul>
<li>every line in the configuration file is a rule, and rules are read from top to bottom;</li>
<li>a rule can be either <code class="language-plaintext highlighter-rouge">permit</code> or <code class="language-plaintext highlighter-rouge">deny</code>, allowing a user to run a command or not;</li>
<li>a command is prefixed by the special keyword <code class="language-plaintext highlighter-rouge">cmd</code>;</li>
<li>a target user, that is the user you want to run the command as, is prefixed by the keyword <code class="language-plaintext highlighter-rouge">as</code>;</li>
<li>the special keyword <code class="language-plaintext highlighter-rouge">nopass</code> skips the password prompt (same as the <code class="language-plaintext highlighter-rouge">NOPASSWD</code> option for <code class="language-plaintext highlighter-rouge">sudo</code>);</li>
<li>it is possible to keep the current environment, or to set or change specific variables in it.</li>
</ul>
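<p>To make the rules above concrete, here is a minimal sketch that writes an illustrative configuration to a temporary file and, when <code class="language-plaintext highlighter-rouge">doas</code> is installed, asks it to check the syntax. The group <code class="language-plaintext highlighter-rouge">wheel</code>, the <code class="language-plaintext highlighter-rouge">dropdb</code> rule and the temporary file are assumptions made only for the example, and I am assuming your <code class="language-plaintext highlighter-rouge">doas</code> build supports the <code class="language-plaintext highlighter-rouge">-C</code> option.</p>

```shell
# Write an illustrative doas.conf to a temporary file
# (never experiment on the live configuration).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# members of the (assumed) group wheel may run anything as root, with a password
permit :wheel
# luca may run pg_ctl as postgres without a password, keeping his environment
permit nopass keepenv luca as postgres cmd /usr/local/bin/pg_ctl
# luca may never run dropdb as postgres (hypothetical rule)
deny luca as postgres cmd /usr/local/bin/dropdb
EOF

# Syntax-check the file when doas is available; -C only parses, it runs nothing.
if command -v doas >/dev/null 2>&1; then
    doas -C "$conf" && echo "syntax ok"
fi

# Count the rules, ignoring comment lines.
rule_count=$(grep -c -E '^(permit|deny) ' "$conf")
echo "$rule_count rules written to $conf"
```

<p>Working on a scratch file first is a cheap way to avoid locking yourself out with a broken configuration.</p>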
<p><br />
The usage of <code class="language-plaintext highlighter-rouge">doas(1)</code> is pretty much the same as that of <code class="language-plaintext highlighter-rouge">sudo(1)</code>; mainly:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">doas</code> is the entry command;</li>
<li><code class="language-plaintext highlighter-rouge">-u</code> specifies the user to run the command as;</li>
<li>the command is the remaining part of the command line.</li>
</ul>
<p><br />
<br />
<code class="language-plaintext highlighter-rouge">doas</code> has far fewer features (and thus less syntax clutter) than <code class="language-plaintext highlighter-rouge">sudo</code>, so it is a lot faster and easier to set up and, in my opinion, a lot less prone to errors.</p>
<h2 id="using-doas-to-control-a-postgresql-cluster">Using <code class="language-plaintext highlighter-rouge">doas</code> to control a PostgreSQL cluster</h2>
<p>Assuming you want to control a cluster, that is, to be able to run <code class="language-plaintext highlighter-rouge">pg_ctl</code> against it, a possible configuration of <code class="language-plaintext highlighter-rouge">doas.conf</code> is as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>permit nopass setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_ctl
permit nopass setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd pg_ctl
</code></pre></div></div>
<p><br />
<br /></p>
<p>The two lines are pretty much identical, with the exception that the second allows for a relative path <code class="language-plaintext highlighter-rouge">pg_ctl</code> command to run. Let’s examine the rules:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">permit nopass</code> means that the rule allows the command to run without asking for the current user’s password;</li>
<li><code class="language-plaintext highlighter-rouge">luca as postgres</code> means that the user <code class="language-plaintext highlighter-rouge">luca</code> can become the user <code class="language-plaintext highlighter-rouge">postgres</code>, that is, the current user <code class="language-plaintext highlighter-rouge">luca</code> is allowed to execute a command with the privileges of the local user <code class="language-plaintext highlighter-rouge">postgres</code>;</li>
<li><code class="language-plaintext highlighter-rouge">cmd /usr/local/bin/pg_ctl</code> specifies which command (with either an absolute or a relative path) can be executed;</li>
<li><code class="language-plaintext highlighter-rouge">setenv { PGDATA=$PGDATA }</code> means that the target user <code class="language-plaintext highlighter-rouge">postgres</code> will inherit the <code class="language-plaintext highlighter-rouge">PGDATA</code> variable from the current user <code class="language-plaintext highlighter-rouge">luca</code>.</li>
</ul>
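<p>Before trusting a rule, it can be handy to ask <code class="language-plaintext highlighter-rouge">doas</code> how it would treat a given command line without actually running it. The following is only a sketch: <code class="language-plaintext highlighter-rouge">would_permit</code> is a hypothetical helper name, and I am assuming a <code class="language-plaintext highlighter-rouge">doas</code> build that supports <code class="language-plaintext highlighter-rouge">-C</code> together with <code class="language-plaintext highlighter-rouge">-u</code> (as far as I know the OpenBSD-derived ones do, printing a verdict such as <code class="language-plaintext highlighter-rouge">permit nopass</code> or <code class="language-plaintext highlighter-rouge">deny</code>).</p>

```shell
# Hypothetical helper: ask doas for its verdict on a command, without running it.
# Usage: would_permit CONFIG TARGET_USER COMMAND [ARGS...]
would_permit() {
    conf=$1
    target=$2
    shift 2
    if ! command -v doas >/dev/null 2>&1; then
        echo "doas not installed"
        return 0
    fi
    # -C parses CONFIG and matches the command for the calling user;
    # no command is executed.
    doas -C "$conf" -u "$target" "$@"
}

would_permit /usr/local/etc/doas.conf postgres pg_ctl stop
would_permit /usr/local/etc/doas.conf postgres initdb /postgres/13
```

<p>With the two rules shown earlier in place, I would expect the first check to be permitted and the second one to be denied, since no rule mentions <code class="language-plaintext highlighter-rouge">initdb</code>.</p>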
<p><br />
<br />
Therefore, it is now possible to issue the following command to stop the cluster:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% doas <span class="nt">-u</span> postgres pg_ctl stop
waiting <span class="k">for </span>server to shut down.... <span class="k">done
</span>server stopped
</code></pre></div></div>
<p><br />
<br /></p>
<p>That is equivalent to <code class="language-plaintext highlighter-rouge">sudo -u postgres pg_ctl stop</code> (assuming you have configured <code class="language-plaintext highlighter-rouge">sudo</code> to keep the environment).</p>
<p><br />
<br />
<strong>Please note that using <code class="language-plaintext highlighter-rouge">nopass</code> and <em>relative paths</em> is, in general, a very bad idea. Do not use it in production!</strong>
<br />
<br /></p>
<p>Let’s execute some other commands:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% doas <span class="nt">-u</span> postgres initdb /postgres/13
doas: Operation not permitted
</code></pre></div></div>
<p><br />
<br /></p>
<p>Since <code class="language-plaintext highlighter-rouge">doas</code> does not have any entry for the command <code class="language-plaintext highlighter-rouge">initdb</code>, it does not allow the user to execute such a command. In order to allow <code class="language-plaintext highlighter-rouge">initdb</code>, it is possible to add the following lines to <code class="language-plaintext highlighter-rouge">doas.conf</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/initdb
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd initdb
</code></pre></div></div>
<p><br />
<br /></p>
<p>and now it is possible to run it:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% doas <span class="nt">-u</span> postgres initdb /postgres/13
Password:
The files belonging to this database system will be owned by user <span class="s2">"postgres"</span><span class="nb">.</span>
This user must also own the server process.
...
Success. You can now start the database server using:
pg_ctl <span class="nt">-D</span> /postgres/13 <span class="nt">-l</span> logfile start
</code></pre></div></div>
<p><br />
<br /></p>
<p>Note how the program asked for a password; this is due to the <code class="language-plaintext highlighter-rouge">persist</code> authentication mode instead of <code class="language-plaintext highlighter-rouge">nopass</code>. With <code class="language-plaintext highlighter-rouge">persist</code>, <code class="language-plaintext highlighter-rouge">doas(1)</code> asks for an authentication password and then lets the user execute other commands without typing it again for a short period of time. Essentially this is the same as the <em>default</em> behaviour of <code class="language-plaintext highlighter-rouge">sudo</code> in most default installations.</p>
<p><br />
What if the user wants to be able to execute <em>every</em> command related to PostgreSQL?
We can configure the user to be able to execute any command as the <code class="language-plaintext highlighter-rouge">postgres</code> user with a configuration like the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above allows <code class="language-plaintext highlighter-rouge">luca</code> to become <code class="language-plaintext highlighter-rouge">postgres</code> and execute any command as the latter user.
<br />
It is quite simple to write a shell script that automatically adds configuration lines so that all the PostgreSQL related commands can be executed:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># <span class="k">for </span>cmd <span class="k">in</span> /usr/local/bin/pg*<span class="p">;</span> <span class="k">do</span>
  <span class="nb">echo</span> <span class="s2">"permit persist setenv { PGDATA=</span><span class="se">\$</span><span class="s2">PGDATA } luca as postgres cmd </span><span class="nv">$cmd</span><span class="s2">"</span> <span class="o">>></span> /usr/local/etc/doas.conf
<span class="k">done</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and the above is going to produce something really verbose, such as:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_archivecleanup
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_basebackup
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_checksums
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_config
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_controldata
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_ctl
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_dump
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_dumpall
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_isready
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_receivewal
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_recvlogical
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_repack
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_resetwal
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_restore
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_rewind
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_standby
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_test_fsync
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_test_timing
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_upgrade
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_waldump
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgbackrest
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgbadger
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgbench
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgbench_helper.sh
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgxn
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgxn-3.7
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgxnclient
permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pgxnclient-3.7
</code></pre></div></div>
<p><br />
<br /></p>
<p>Of course, you can tune such a <em>generator</em> as much as you like.</p>
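<p>As an example of such tuning, here is a minimal sketch of a parametrized generator. The function name <code class="language-plaintext highlighter-rouge">gen_doas_rules</code> is my own invention for this example; it prints the rules to standard output instead of appending to <code class="language-plaintext highlighter-rouge">doas.conf</code>, so they can be reviewed first, it skips anything that is not an executable file, and it places the <code class="language-plaintext highlighter-rouge">cmd</code> keyword before each path.</p>

```shell
# gen_doas_rules DIR USER TARGET   (hypothetical helper)
# Print one doas.conf rule per executable file in DIR whose name starts
# with "pg". Nothing is written to doas.conf: the output is for review.
gen_doas_rules() {
    dir=$1
    user=$2
    target=$3
    for path in "$dir"/pg*; do
        # skip directories, non-executables and an unmatched glob
        [ -f "$path" ] && [ -x "$path" ] || continue
        # single quotes keep $PGDATA literal in the generated rule
        printf 'permit persist setenv { PGDATA=$PGDATA } %s as %s cmd %s\n' \
            "$user" "$target" "$path"
    done
}

gen_doas_rules /usr/local/bin luca postgres
```

<p>Once the output looks right, it can be redirected to a scratch file and merged into the real configuration by hand.</p>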
<h2 id="using-commands-against-a-single-cluster-dont-try-this-at-home">Using commands against a single cluster (don’t try this at home!)</h2>
<p>In the previous examples, <code class="language-plaintext highlighter-rouge">doas</code> has been configured to allow only PostgreSQL related commands with a <em>default</em> <code class="language-plaintext highlighter-rouge">PGDATA</code> environment variable, but the user is still able to execute a command using a different directory:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% doas <span class="nt">-u</span> postgres pg_ctl <span class="nt">-D</span> /postgres/13/ start
waiting <span class="k">for </span>server to start....
<span class="k">done
</span>server started
</code></pre></div></div>
<p><br />
<br /></p>
<p>As with <code class="language-plaintext highlighter-rouge">sudo</code>, you can also tune <code class="language-plaintext highlighter-rouge">doas</code> to accept only specific arguments, such as a given data directory, for the commands. This is, however, quite complex and prone to errors: you have to specify the environment and every allowed combination of arguments, such as:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>permit nopass setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_ctl args start
permit nopass setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_ctl args stop
permit nopass setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres cmd /usr/local/bin/pg_ctl args restart
</code></pre></div></div>
<p><br />
<br /></p>
<p>The situation becomes:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% doas <span class="nt">-u</span> postgres /usr/local/bin/pg_ctl start
waiting <span class="k">for </span>server to start....
...
<span class="k">done
</span>server started
% doas <span class="nt">-u</span> postgres /usr/local/bin/pg_ctl reload
doas: Operation not permitted
</code></pre></div></div>
<p><br />
<br /></p>
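<p>If you really want to pin the commands to one data directory, the directory has to become part of the <code class="language-plaintext highlighter-rouge">args</code> list, because the arguments typed by the user must match those in the rule. The following is only a sketch that prints such rules; the data directory and the list of allowed actions are assumptions for the example.</p>

```shell
# Hypothetical values: adjust the data directory and the allowed actions.
pgdata=/postgres/13
pg_ctl=/usr/local/bin/pg_ctl

# One rule per action: the user's arguments must match the rule's args,
# so "-D <directory>" is spelled out and no setenv is needed.
rules=$(for action in start stop restart; do
    echo "permit nopass luca as postgres cmd $pg_ctl args -D $pgdata $action"
done)

printf '%s\n' "$rules"
```

<p>With rules like these, <code class="language-plaintext highlighter-rouge">doas -u postgres /usr/local/bin/pg_ctl -D /postgres/13 start</code> should match, while the same command against another directory, or with the options in a different order, should not. This is exactly why the approach is so fragile.</p>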
<p>Please be aware, however, that this is not a good solution: as you keep updating the <code class="language-plaintext highlighter-rouge">doas.conf</code> file, its content changes and the rules could end up being applied in a way you did not foresee.
<br />
A better approach is, of course, to allow the user to become <code class="language-plaintext highlighter-rouge">postgres</code> and have the latter perform only its own tasks.</p>
<h2 id="being-able-to-run-as-user-postgres">Being able to run as user <code class="language-plaintext highlighter-rouge">postgres</code></h2>
<p>This is much simpler than you may think, and it boils down to a single rule:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>permit persist setenv <span class="o">{</span> <span class="nv">PGDATA</span><span class="o">=</span><span class="nv">$PGDATA</span> <span class="o">}</span> luca as postgres
</code></pre></div></div>
<p><br />
<br /></p>
<p>Without specifying any command with the special keyword <code class="language-plaintext highlighter-rouge">cmd</code>, the user <code class="language-plaintext highlighter-rouge">luca</code> will be able to run any command as <code class="language-plaintext highlighter-rouge">postgres</code>, and therefore to execute every PostgreSQL related command.</p>
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">doas(1)</code> is a nice piece of code that allows for a more readable, if less tunable, configuration than <code class="language-plaintext highlighter-rouge">sudo</code>, and this can be exploited to allow users to execute operations against PostgreSQL, among other programs.</p>
To WAL or not to WAL? When unlogged becomes logged...2021-05-10T00:00:00+00:00https://fluca1978.github.io/2021/05/10/PostgreSQLUnloggedTablesReplication<p>What happens to tables that are not logged into WALs when a physical replication is in place?</p>
<h1 id="to-wal-or-not-to-wal-when-unlogged-becomes-logged">To WAL or not to WAL? When unlogged becomes logged…</h1>
<p>Like many other databases, PostgreSQL allows for a table to be <em>unlogged</em>, which in short means “exclude me from the WALs!”. Such tables are not crash safe, nor are they replicated, because PostgreSQL replication relies on the WALs.
<br />
But what happens when you deal with such tables in a replication scenario? This post tries to explain what is possible and what happens.</p>
<h2 id="creating-and-populating-a-database-to-test">Creating and populating a database to test</h2>
<p>First of all, let’s create a clean database just to keep the test environment separated from other databases:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="n">rep_test</span> <span class="k">WITH</span> <span class="k">OWNER</span> <span class="n">luca</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">DATABASE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now let’s create and populate three tables (one temporary, one unlogged and one <em>normal</em>):</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">t_norm</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span><span class="p">,</span>
<span class="n">t</span> <span class="nb">text</span><span class="p">,</span>
<span class="k">primary</span> <span class="k">key</span><span class="p">(</span> <span class="n">pk</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">CREATE</span> <span class="n">UNLOGGED</span> <span class="k">TABLE</span>
<span class="n">t_unlogged</span><span class="p">(</span> <span class="k">like</span> <span class="n">t_norm</span> <span class="k">including</span> <span class="k">all</span> <span class="p">);</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TEMPORARY</span> <span class="k">TABLE</span>
<span class="n">t_temp</span><span class="p">(</span> <span class="k">like</span> <span class="n">t_norm</span> <span class="k">including</span> <span class="k">all</span> <span class="p">);</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">t_norm</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'Row #'</span> <span class="o">||</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">4712</span><span class="p">.</span><span class="mi">185</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">04</span><span class="p">.</span><span class="mi">712</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">t_temp</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'Row #'</span> <span class="o">||</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">1789</span><span class="p">.</span><span class="mi">473</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">01</span><span class="p">.</span><span class="mi">789</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">t_unlogged</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'Unlogged #'</span> <span class="o">||</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1000000</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">1746</span><span class="p">.</span><span class="mi">729</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">01</span><span class="p">.</span><span class="mi">747</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The situation now is as follows:
<br /></p>
<table>
<thead>
<tr>
<th style="text-align: right">Table</th>
<th style="text-align: center">Status</th>
<th style="text-align: center">Insertion time</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">t_norm</code></td>
<td style="text-align: center">Ordinary table</td>
<td style="text-align: center">4.7 secs</td>
</tr>
<tr>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">t_temp</code></td>
<td style="text-align: center">Temporary table</td>
<td style="text-align: center">1.8 secs</td>
</tr>
<tr>
<td style="text-align: right"><code class="language-plaintext highlighter-rouge">t_unlogged</code></td>
<td style="text-align: center">Unlogged table</td>
<td style="text-align: center">1.7 secs</td>
</tr>
</tbody>
</table>
<p><br /></p>
<p>As you can see, the timing for the temporary and the unlogged table is pretty much the same: neither of them is written to the WAL, and therefore there is no <em>crash-recovery</em> machinery involved. This also means that write transactions against temporary and unlogged tables are much faster than against ordinary ones. <strong>Of course, the above is not an absolute measurement of <code class="language-plaintext highlighter-rouge">INSERT</code> times, but is reported here just to give you an idea of the differences</strong>.</p>
<p><br />
Since there is a temporary table, <em>you need to keep the session with the master node open, or you are going to lose all the data in that table!</em></p>
<h2 id="doing-the-physical-replication">Doing the physical replication</h2>
<p>It is now time to start a physical replication. This is not a tutorial on how to set up physical replication, so I will only report the commands I ran on a separate machine to get the replica cluster on its way:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_basebackup <span class="nt">-X</span> stream <span class="nt">--create-slot</span> <span class="nt">--slot</span> <span class="s1">'carmensita_physical_replication_slot'</span> <span class="nt">-R</span> <span class="nt">-r</span> 100M <span class="nt">-D</span> /postgres/12/replica <span class="nt">-l</span> <span class="s2">"Test unlogged tables"</span> <span class="nt">-P</span> <span class="nt">-d</span> <span class="s2">"dbname=backup user=backup host=miguel"</span> <span class="nt">-T</span> /wal<span class="o">=</span>/postgres/12
</code></pre></div></div>
<p><br />
<br /></p>
<p>The original cluster is on a machine named <code class="language-plaintext highlighter-rouge">miguel</code>, while the replica is placed on a machine named <code class="language-plaintext highlighter-rouge">carmensita</code>. These are the two machines I always use for experimental work.
<br />
Please note also that I use a <code class="language-plaintext highlighter-rouge">backup</code> database and role to stream the data; as you can imagine, you need to enable the replication connection in <code class="language-plaintext highlighter-rouge">pg_hba.conf</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">tail</span> <span class="nv">$PGDATA</span>/pg_hba.conf
host replication backup carmensita trust
</code></pre></div></div>
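<p>The role used for streaming also needs the <code class="language-plaintext highlighter-rouge">REPLICATION</code> attribute. The following is only a sketch of how such a role could be prepared on the master: the <code class="language-plaintext highlighter-rouge">postgres</code> superuser name and the helper function name are assumptions of mine, not something taken from the setup described here.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper, to be run on the master (miguel).
# Assumption: "postgres" is a superuser that is allowed to create roles.
prepare_backup_role() {
  # the role used by pg_basebackup must be able to log in and to replicate
  psql -U postgres -c "CREATE ROLE backup WITH LOGIN REPLICATION"
  # ask the server to re-read pg_hba.conf after adding the replication line
  psql -U postgres -c "SELECT pg_reload_conf()"
}
</code></pre></div></div>
<p>Keep in mind that <code class="language-plaintext highlighter-rouge">trust</code> authentication is fine for a test bed like this one, but it is not something you want on a production system.</p>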
<p><br />
<br />
Once the base backup has completed, you can fire up the standby node:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% /usr/pgsql-12/bin/pg_ctl <span class="nt">-D</span> /postgres/12/replica start
waiting for server to start....
LOG: starting PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc <span class="o">(</span>GCC<span class="o">)</span> 10.2.1 20201125 <span class="o">(</span>Red Hat 10.2.1-9<span class="o">)</span>, 64-bit
LOG: listening on IPv4 address <span class="s2">"0.0.0.0"</span>, port 5432
LOG: listening on IPv6 address <span class="s2">"::"</span>, port 5432
LOG: listening on Unix socket <span class="s2">"/var/run/postgresql/.s.PGSQL.5432"</span>
LOG: listening on Unix socket <span class="s2">"/tmp/.s.PGSQL.5432"</span>
LOG: redirecting log output to logging collector process
HINT: Future log output will appear <span class="k">in </span>directory <span class="s2">"log"</span><span class="nb">.</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="check-the-tables-on-the-replication-side">Check the tables on the replication side</h2>
<p>It is now time to check the replicated database on the replication host:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">%</span> <span class="n">psql</span> <span class="o">-</span><span class="n">h</span> <span class="n">carmensita</span> <span class="o">-</span><span class="n">U</span> <span class="n">luca</span> <span class="n">rep_test</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="err">\</span><span class="n">d</span>
<span class="n">List</span> <span class="n">of</span> <span class="n">relations</span>
<span class="k">Schema</span> <span class="o">|</span> <span class="n">Name</span> <span class="o">|</span> <span class="n">Type</span> <span class="o">|</span> <span class="n">Owner</span>
<span class="c1">--------|-------------------|----------|--------------</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_norm</span> <span class="o">|</span> <span class="n">table</span> <span class="o">|</span> <span class="n">luca</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_norm_pk_seq</span> <span class="o">|</span> <span class="n">sequence</span> <span class="o">|</span> <span class="n">luca</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_unlogged</span> <span class="o">|</span> <span class="n">table</span> <span class="o">|</span> <span class="n">luca</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_unlogged_pk_seq</span> <span class="o">|</span> <span class="n">sequence</span> <span class="o">|</span> <span class="n">luca</span>
<span class="p">(</span><span class="mi">4</span> <span class="n">rows</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">t_norm</span><span class="p">;</span>
<span class="k">count</span>
<span class="c1">---------</span>
<span class="mi">1000000</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">t_unlogged</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">cannot</span> <span class="k">access</span> <span class="k">temporary</span> <span class="k">or</span> <span class="n">unlogged</span> <span class="n">relations</span> <span class="n">during</span> <span class="n">recovery</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, <strong>the temporary table is missing</strong>, even though it is still available in the session opened on the master.
There is no surprise here: a temporary table is <em>usable</em> only on a per-connection basis, and therefore it is not replicated.
<br />
It is more interesting to see that the unlogged table <code class="language-plaintext highlighter-rouge">t_unlogged</code> and the related sequence have been replicated, but <strong>they are there only as placeholders</strong>: in fact, it is not possible to read the unlogged table.
<br />
<strong>Therefore unlogged tables are replicated in their structure but not in their data!</strong></p>
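<p>The error message mentions <em>recovery</em> because a streaming standby is permanently in recovery mode. A quick way to tell the two nodes apart is the <code class="language-plaintext highlighter-rouge">pg_is_in_recovery()</code> function; the helper below is just a sketch, and its name, the user and the database are assumptions that mirror the connection used above.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: succeeds if the given host is a standby (i.e., in recovery).
# Assumption: user luca can connect to the rep_test database on that host.
is_standby() {
  # -A -t makes psql print only the bare value: t (true) or f (false)
  [ "$(psql -h "$1" -U luca -d rep_test -A -t -c 'SELECT pg_is_in_recovery()')" = "t" ]
}
</code></pre></div></div>
<p>With the machines of this article, <code class="language-plaintext highlighter-rouge">is_standby carmensita</code> is expected to succeed, while <code class="language-plaintext highlighter-rouge">is_standby miguel</code> is expected to fail.</p>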
<h2 id="switching-from-unlogged-to-logged">Switching from unlogged to logged</h2>
<p>On the master node, it is now time to change the status of <code class="language-plaintext highlighter-rouge">t_unlogged</code> from unlogged to <em>logged</em>, which can be done with a single <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> command.
In <code class="language-plaintext highlighter-rouge">pg_class</code>, the <code class="language-plaintext highlighter-rouge">relpersistence</code> flag of the table changes accordingly from <code class="language-plaintext highlighter-rouge">u</code> (<em>u</em>nlogged) to <code class="language-plaintext highlighter-rouge">p</code> (<em>p</em>ermanent). Before the switch, the standby still shows the table only as a placeholder:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=></span> <span class="err">\</span><span class="n">d</span>
<span class="n">List</span> <span class="n">of</span> <span class="n">relations</span>
<span class="k">Schema</span> <span class="o">|</span> <span class="n">Name</span> <span class="o">|</span> <span class="n">Type</span> <span class="o">|</span> <span class="n">Owner</span>
<span class="c1">--------|-------------------|----------|--------------</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_norm</span> <span class="o">|</span> <span class="n">table</span> <span class="o">|</span> <span class="n">luca</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_norm_pk_seq</span> <span class="o">|</span> <span class="n">sequence</span> <span class="o">|</span> <span class="n">luca</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_unlogged</span> <span class="o">|</span> <span class="n">table</span> <span class="o">|</span> <span class="n">luca</span>
<span class="k">public</span> <span class="o">|</span> <span class="n">t_unlogged_pk_seq</span> <span class="o">|</span> <span class="n">sequence</span> <span class="o">|</span> <span class="n">luca</span>
<span class="p">(</span><span class="mi">4</span> <span class="n">rows</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">t_norm</span><span class="p">;</span>
<span class="k">count</span>
<span class="c1">---------</span>
<span class="mi">1000000</span>
<span class="p">(</span><span class="mi">1</span> <span class="n">row</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">t_unlogged</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">cannot</span> <span class="k">access</span> <span class="k">temporary</span> <span class="k">or</span> <span class="n">unlogged</span> <span class="n">relations</span> <span class="n">during</span> <span class="n">recovery</span>
</code></pre></div></div>
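<p>The listing above does not show the persistence flag itself. A query against <code class="language-plaintext highlighter-rouge">pg_class</code> along the following lines should display it; this is only a sketch, where the helper name is made up and the connection parameters are assumed to match the ones used in this article.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: print name and persistence flag of the test tables
# on the given host (p = permanent, u = unlogged, t = temporary).
# Assumption: user luca can connect to the rep_test database on that host.
show_persistence() {
  psql -h "$1" -U luca -d rep_test -c "SELECT relname, relpersistence FROM pg_class WHERE relname IN ('t_norm', 't_temp', 't_unlogged') ORDER BY relname"
}
</code></pre></div></div>
<p>Running <code class="language-plaintext highlighter-rouge">show_persistence miguel</code> before and after the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> should show the flag of <code class="language-plaintext highlighter-rouge">t_unlogged</code> moving from <code class="language-plaintext highlighter-rouge">u</code> to <code class="language-plaintext highlighter-rouge">p</code>.</p>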
<p><br />
<br /></p>
<p>The interesting part to note here is that switching the table from unlogged to logged took about <code class="language-plaintext highlighter-rouge">11 secs</code>, which is more than twice the insertion time on an ordinary table. The reason is that PostgreSQL has to <em>write all the records of the table into the WAL</em>, as if the <code class="language-plaintext highlighter-rouge">INSERT</code> of each row had just happened.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=#</span> <span class="k">alter</span> <span class="k">table</span> <span class="n">t_unlogged</span> <span class="k">set</span> <span class="n">logged</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">11485</span><span class="p">.</span><span class="mi">505</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">11</span><span class="p">.</span><span class="mi">486</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>After that, the table becomes <em>ordinary</em> on the standby too:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">t_unlogged</span><span class="p">;</span>
<span class="k">count</span>
<span class="c1">---------</span>
<span class="mi">1000000</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="switching-from-logged-to-unlogged">Switching from logged to unlogged</h2>
<p>Let’s see what happens when <code class="language-plaintext highlighter-rouge">t_unlogged</code> is switched back to unlogged:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=#</span> <span class="k">alter</span> <span class="k">table</span> <span class="n">t_unlogged</span> <span class="k">set</span> <span class="n">unlogged</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">5236</span><span class="p">.</span><span class="mi">165</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">05</span><span class="p">.</span><span class="mi">236</span><span class="p">)</span>
<span class="n">rep_test</span><span class="o">=#</span> <span class="k">truncate</span> <span class="n">t_unlogged</span><span class="p">;</span>
<span class="k">TRUNCATE</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">21</span><span class="p">.</span><span class="mi">498</span> <span class="n">ms</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The interesting part to note here is that, again, a lot of time is spent in the storage change.
<br />
On the standby, the table becomes unusable again:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">t_unlogged</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">cannot</span> <span class="k">access</span> <span class="k">temporary</span> <span class="k">or</span> <span class="n">unlogged</span> <span class="n">relations</span> <span class="n">during</span> <span class="n">recovery</span>
<span class="n">rep_test</span><span class="o">=></span> <span class="k">select</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">reltuples</span> <span class="k">from</span> <span class="n">pg_class</span> <span class="k">where</span> <span class="n">oid</span> <span class="o">=</span> <span class="s1">'t_unlogged'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="n">relpages</span> <span class="o">|</span> <span class="n">reltuples</span>
<span class="c1">----------|-----------</span>
<span class="mi">0</span> <span class="o">|</span> <span class="mi">0</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="does-the-replica-knows-about-the-unlogged-tables">Does the replica know about the unlogged tables?</h2>
<p>Of course it does, and in fact <code class="language-plaintext highlighter-rouge">pg_class</code> knows how many tuples and pages the table is using.
<br />
However, the table <strong>is not consuming storage space on the replication host</strong>. In other words, the database on the replication side knows how much space the table occupies on the master node, because <code class="language-plaintext highlighter-rouge">pg_class</code> (like the other catalogs) is replicated too, but the table data is missing on disk.</p>
<p><br />
Let’s see this on the master side:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=#</span> <span class="k">select</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">reltuples</span><span class="p">,</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'t_unlogged'</span><span class="p">)</span> <span class="p">),</span>
<span class="n">pg_relation_filepath</span><span class="p">(</span> <span class="n">oid</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">pg_class</span> <span class="k">where</span> <span class="n">oid</span> <span class="o">=</span> <span class="s1">'t_unlogged'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="n">relpages</span> <span class="o">|</span> <span class="n">reltuples</span> <span class="o">|</span> <span class="n">pg_size_pretty</span> <span class="o">|</span> <span class="n">pg_relation_filepath</span>
<span class="c1">----------|-----------|----------------|----------------------</span>
<span class="mi">12738</span> <span class="o">|</span> <span class="mi">2</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">100</span> <span class="n">MB</span> <span class="o">|</span> <span class="n">base</span><span class="o">/</span><span class="mi">41441</span><span class="o">/</span><span class="mi">41555</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and on disk the size of the file is</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo du</span> <span class="nt">-h</span> <span class="nv">$PGDATA</span>/base/41441/41555
100M /postgres/12/data/base/41441/41555
</code></pre></div></div>
<p><br />
<br /></p>
<p>What about the replicating host? The catalog information is the same, but there is nothing on disk:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rep_test</span><span class="o">=#</span> <span class="k">select</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">reltuples</span><span class="p">,</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'t_unlogged'</span><span class="p">)</span> <span class="p">),</span>
<span class="n">pg_relation_filepath</span><span class="p">(</span> <span class="n">oid</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">pg_class</span> <span class="k">where</span> <span class="n">oid</span> <span class="o">=</span> <span class="s1">'t_unlogged'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="n">relpages</span> <span class="o">|</span> <span class="n">reltuples</span> <span class="o">|</span> <span class="n">pg_size_pretty</span> <span class="o">|</span> <span class="n">pg_relation_filepath</span>
<span class="c1">----------|-----------|----------------|----------------------</span>
<span class="mi">12738</span> <span class="o">|</span> <span class="mi">2</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">0</span> <span class="n">bytes</span> <span class="o">|</span> <span class="n">base</span><span class="o">/</span><span class="mi">41441</span><span class="o">/</span><span class="mi">41555</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>and on disk, in fact, there is no room occupied by the table:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo du</span> <span class="nt">-h</span> /postgres/12/replica/base/41441/4155
0 /postgres/12/replica/base/41441/4155
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="unlogged-but-replicated--ordinary">Unlogged but replicated ~ ordinary</h2>
<p>An unlogged table that is replicated loses the speed advantages of being unlogged.
<br />
Why?
Because the system has to provide all the machinery to synchronize the table once it becomes logged.
If you “stop” the replication, removing the slots and other related stuff, the table gains speed.</p>
<h1 id="conclusions">Conclusions</h1>
<p>As expected, PostgreSQL replicates only <em>logged</em> tables and not <em>temporary</em> or <em>unlogged</em> ones. The latter are however present on the replicating side as <em>placeholders</em>, and once you switch them to logged they are fully shipped to the replica.</p>
pg_dump and inserts2021-04-30T00:00:00+00:00https://fluca1978.github.io/2021/04/30/pgdumpInserts<p>pg_dump supports a few useful options to export data as a list of INSERTs</p>
<h1 id="pg_dump-and-inserts">pg_dump and inserts</h1>
<p><code class="language-plaintext highlighter-rouge">pg_dump(1)</code> is the default tool for doing backups of a PostgreSQL database.
<br />
I often get questions about how to produce a <strong>more portable</strong> output of the database dump, with <em>portable</em> truly meaning <em>“loadable into another PostgreSQL version or even a different database”</em>.
<br />
In fact, <code class="language-plaintext highlighter-rouge">pg_dump</code> defaults to using <code class="language-plaintext highlighter-rouge">COPY</code> for bulk loading data:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-a</span> <span class="nt">-t</span> wa <span class="nt">-U</span> luca testdb
...
COPY luca.wa <span class="o">(</span>pk, t<span class="o">)</span> FROM stdin<span class="p">;</span>
9200673 Record <span class="c">#1</span>
9200674 Record <span class="c">#2</span>
9200675 Record <span class="c">#3</span>
9200676 Record <span class="c">#4</span>
9200677 Record <span class="c">#5</span>
9200678 Record <span class="c">#6</span>
9200679 Record <span class="c">#7</span>
9200680 Record <span class="c">#8</span>
9200681 Record <span class="c">#9</span>
...
</code></pre></div></div>
<p><br />
<br />
As you can guess, <code class="language-plaintext highlighter-rouge">COPY</code> is usable only in PostgreSQL and not in other databases. So, how can you get a text dump that can be loaded into other databases?
<br />
No need to worry: <code class="language-plaintext highlighter-rouge">pg_dump</code> has a few features to handle such a need.
<br />
In particular, the following options can be useful:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">--inserts</code> removes the <code class="language-plaintext highlighter-rouge">COPY</code> and substitutes it with <code class="language-plaintext highlighter-rouge">INSERT</code> statements, one per tuple;</li>
<li><code class="language-plaintext highlighter-rouge">--column-inserts</code> is similar to the previous one, but each <code class="language-plaintext highlighter-rouge">INSERT</code> has the list of named columns;</li>
<li><code class="language-plaintext highlighter-rouge">--rows-per-insert</code> sets the maximum number of tuples a single <code class="language-plaintext highlighter-rouge">INSERT</code> statement can carry, useful for a better bulk loading (but could be less portable).</li>
</ul>
<p><br />
There are also some other useful options:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">--quote-all-identifiers</code> forces the quoting of all identifiers, which is useful when preparing data for a different database;</li>
<li><code class="language-plaintext highlighter-rouge">--use-set-session-authorization</code> uses SQL standard commands when dealing with the ownership of objects;</li>
<li><code class="language-plaintext highlighter-rouge">--no-comments</code> is not a very “technical” option, but when you are going to load your dump into another database you probably do not want to import comments, since they could be handled differently. Similarly, there are other <code class="language-plaintext highlighter-rouge">--no-*</code> options that are specific to PostgreSQL, like <code class="language-plaintext highlighter-rouge">--no-publications</code> to avoid dumping publications, and so on.</li>
</ul>
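<p>Putting some of the options above together, a data-only dump meant for a different database could be produced along these lines. This is only a sketch: the helper name and the output file are arbitrary choices of mine, while the user and the database are the ones used in the examples of this article.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: dump the data of a single table as portable INSERTs.
# $1 = table name, $2 = output file (both chosen by the caller).
portable_dump() {
  pg_dump -a -t "$1" --column-inserts --quote-all-identifiers --no-comments -U luca -f "$2" testdb
}
</code></pre></div></div>
<p>For instance, <code class="language-plaintext highlighter-rouge">portable_dump wa wa_inserts.sql</code> would write the statements to a file instead of the standard output.</p>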
<p><br />
In the following I will use the same example table <code class="language-plaintext highlighter-rouge">wa</code>, with just two columns and a bunch of records, so that you can easily compare the output differences.</p>
<h2 id="defaulting-to-insert">Defaulting to <code class="language-plaintext highlighter-rouge">INSERT</code></h2>
<p>In order to better understand the difference between every single option, let’s see a few examples:
<br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-a</span> <span class="nt">-t</span> wa <span class="nt">--inserts</span> <span class="nt">-U</span> luca testdb
...
INSERT INTO luca.wa VALUES <span class="o">(</span>9200673, <span class="s1">'Record #1'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO luca.wa VALUES <span class="o">(</span>9200674, <span class="s1">'Record #2'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO luca.wa VALUES <span class="o">(</span>9200675, <span class="s1">'Record #3'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO luca.wa VALUES <span class="o">(</span>9200676, <span class="s1">'Record #4'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO luca.wa VALUES <span class="o">(</span>9200677, <span class="s1">'Record #5'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO luca.wa VALUES <span class="o">(</span>9200678, <span class="s1">'Record #6'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO luca.wa VALUES <span class="o">(</span>9200679, <span class="s1">'Record #7'</span><span class="o">)</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see from the above, the <code class="language-plaintext highlighter-rouge">COPY</code> has been translated into a set of <code class="language-plaintext highlighter-rouge">INSERT</code>s. This of course has the drawback of a slower bulk loading.
<br />
Just to make another example, let’s see how the output changes with identifier quoting:
<br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-a</span> <span class="nt">-t</span> wa <span class="nt">--inserts</span> <span class="nt">--quote-all-identifiers</span> <span class="nt">-U</span> luca testdb
...
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES <span class="o">(</span>9200673, <span class="s1">'Record #1'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES <span class="o">(</span>9200674, <span class="s1">'Record #2'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES <span class="o">(</span>9200675, <span class="s1">'Record #3'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES <span class="o">(</span>9200676, <span class="s1">'Record #4'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES <span class="o">(</span>9200677, <span class="s1">'Record #5'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES <span class="o">(</span>9200678, <span class="s1">'Record #6'</span><span class="o">)</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br />
The table and schema names have been quoted.
<br />
What if you also want the column list on every <code class="language-plaintext highlighter-rouge">INSERT</code>? The option <code class="language-plaintext highlighter-rouge">--column-inserts</code> is there to explode the list of columns:
<br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-a</span> <span class="nt">-t</span> wa <span class="nt">--column-inserts</span> <span class="nt">--quote-all-identifiers</span> <span class="nt">-U</span> luca testdb
...
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200673, <span class="s1">'Record #1'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200674, <span class="s1">'Record #2'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200675, <span class="s1">'Record #3'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200676, <span class="s1">'Record #4'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200677, <span class="s1">'Record #5'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200678, <span class="s1">'Record #6'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200679, <span class="s1">'Record #7'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200680, <span class="s1">'Record #8'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> <span class="o">(</span><span class="s2">"pk"</span>, <span class="s2">"t"</span><span class="o">)</span> VALUES <span class="o">(</span>9200681, <span class="s1">'Record #9'</span><span class="o">)</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>Regardless of whether <code class="language-plaintext highlighter-rouge">--quote-all-identifiers</code> is used, each <code class="language-plaintext highlighter-rouge">INSERT</code> carries the list of the columns its values refer to.
<br />
The last case, a middle path between <code class="language-plaintext highlighter-rouge">COPY</code> and a single <code class="language-plaintext highlighter-rouge">INSERT</code> per tuple, is the <code class="language-plaintext highlighter-rouge">--rows-per-insert</code> option, which lets you specify the maximum number of rows every <code class="language-plaintext highlighter-rouge">INSERT</code> will handle:
<br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-a</span> <span class="nt">-t</span> wa <span class="nt">--rows-per-insert</span><span class="o">=</span>3 <span class="nt">--quote-all-identifiers</span> <span class="nt">-U</span> luca testdb
...
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES
<span class="o">(</span>9200688, <span class="s1">'Record #16'</span><span class="o">)</span>,
<span class="o">(</span>9200689, <span class="s1">'Record #17'</span><span class="o">)</span>,
<span class="o">(</span>9200690, <span class="s1">'Record #18'</span><span class="o">)</span><span class="p">;</span>
INSERT INTO <span class="s2">"luca"</span>.<span class="s2">"wa"</span> VALUES
<span class="o">(</span>9200691, <span class="s1">'Record #19'</span><span class="o">)</span>,
<span class="o">(</span>9200692, <span class="s1">'Record #20'</span><span class="o">)</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br />
Note how the last <code class="language-plaintext highlighter-rouge">INSERT</code> has only two tuples instead of the specified <code class="language-plaintext highlighter-rouge">3</code>: <code class="language-plaintext highlighter-rouge">pg_dump</code> is smart enough not to lose a single row, so if there is not enough data left to fill a batch, the final <code class="language-plaintext highlighter-rouge">INSERT</code> simply carries fewer rows.</p>
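<p>The batching arithmetic can be sketched in a few lines of shell. This is only an illustration: the <code class="language-plaintext highlighter-rouge">insert_batches</code> helper is hypothetical (it is not part of <code class="language-plaintext highlighter-rouge">pg_dump</code>), and the table size of 20 rows is an assumption made for the example.</p>

```shell
#!/bin/sh
# Hypothetical helper: given the number of rows in a table ($1) and the
# value passed to --rows-per-insert ($2), report how the rows are batched.
insert_batches() {
    rows=$1
    per_insert=$2
    full=$(( rows / per_insert ))   # completely filled INSERT statements
    rest=$(( rows % per_insert ))   # rows left over for a final, shorter INSERT
    if [ "$rest" -gt 0 ]; then
        echo "$(( full + 1 )) statements, last one with $rest rows"
    else
        echo "$full statements, all with $per_insert rows"
    fi
}

# Assumed example: 20 rows dumped with --rows-per-insert=3
insert_batches 20 3
```

<p>With these assumed numbers the last statement holds two rows, which matches the shape of the output shown above.</p>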
<h2 id="avoid-alter-tbale-to-set-ownership">Avoid <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> to set ownership</h2>
<p>If the dump includes the table data structure, <code class="language-plaintext highlighter-rouge">pg_dump</code> will issue appropriate commands to change the ownership. For example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-C</span> <span class="nt">-t</span> wa <span class="nt">-U</span> luca testdb
...
CREATE TABLE luca.wa <span class="o">(</span>
pk integer NOT NULL,
t text
<span class="o">)</span><span class="p">;</span>
ALTER TABLE <span class="s2">"luca"</span>.<span class="s2">"wraparaound_pk_seq"</span> OWNER TO <span class="s2">"luca"</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br />
The option <code class="language-plaintext highlighter-rouge">--use-set-session-authorization</code> produces a more portable SQL output:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_dump <span class="nt">-C</span> <span class="nt">-t</span> wa <span class="nt">--use-set-session-authorization</span> <span class="nt">-U</span> luca testdb
...
SET SESSION AUTHORIZATION <span class="s1">'luca'</span><span class="p">;</span>
CREATE TABLE luca.wa <span class="o">(</span>
pk integer NOT NULL,
t text
<span class="o">)</span><span class="p">;</span>
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the user is <em>set</em> at the beginning, so that all the objects created afterwards automatically belong to that user.</p>
pgBackRest 2.33: multiple repositories (and more)2021-04-28T00:00:00+00:00https://fluca1978.github.io/2021/04/28/pgbackrestMultiRepository<p>pgBackRest now supports multiple repositories!</p>
<h1 id="pgbackrest-233-multiple-repositories-and-more">pgBackRest 2.33: multiple repositories (and more)</h1>
<p>A few weeks ago <a href="https://github.com/pgbackrest/pgbackrest/releases/tag/release%2F2.33" target="_blank">a new release of pgbackrest, 2.33</a>, came out. This release improves a lot of things; two of them in particular caught my attention:</p>
<ul>
<li>multi repository support;</li>
<li>custom configuration path.</li>
</ul>
<p><br />
The former allows <code class="language-plaintext highlighter-rouge">pgbackrest</code> to spread backups over different repositories; in other words, it allows the backup to be <em>mirrored</em> across different storages.
<br />
The second improvement fixes a few annoyances with non-Linux operating systems, such as FreeBSD.
<br />
In the following I give a glance at both these improvements, in no specific order.</p>
<h1 id="custom-configuration-path">Custom configuration path</h1>
<p>FreeBSD and, more in general, non-Linux machines use different default configuration paths. For example, what is commonly placed in <code class="language-plaintext highlighter-rouge">/etc</code> on Linux usually lives in <code class="language-plaintext highlighter-rouge">/usr/local/etc</code> on FreeBSD. In previous releases, there was room for using the <code class="language-plaintext highlighter-rouge">--prefix</code> option during the <code class="language-plaintext highlighter-rouge">configure</code> phase, but this was tedious because you still had to specify the path to non-standard files manually at every invocation of the command.
<br />
In other words:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>archive_command <span class="o">=</span> <span class="s1">'/usr/local/bin/pgbackrest --pg1-path=/postgres/12/data \
--config=/usr/local/etc/pgbackrest.conf \
--stanza=miguel archive-push %p'</span>
archive_mode <span class="o">=</span> on
</code></pre></div></div>
<p><br />
<br /></p>
<p>The important part to note in the above snippet is that, on FreeBSD, if you wanted to use the standard (from an operating system point of view) path for the configuration, <code class="language-plaintext highlighter-rouge">pgbackrest</code> had no clue about it and would try to look up the configuration file as <code class="language-plaintext highlighter-rouge">/etc/pgbackrest.conf</code>. The solution was, of course, to specify the <code class="language-plaintext highlighter-rouge">--config</code> option with the appropriate file.
<br />
<br />
Things have changed in version 2.33, since the <code class="language-plaintext highlighter-rouge">configure</code> command can now instruct the <code class="language-plaintext highlighter-rouge">pgbackrest</code> binary where to find the correct configuration file:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% ./configure <span class="nt">--help</span>
...
<span class="nt">--with-configdir</span><span class="o">=</span>DIR default configuration path
...
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>The default configuration path remains <code class="language-plaintext highlighter-rouge">/etc/pgbackrest.conf</code></strong>, but it is now possible to specify a default configuration file path at compile time, so that you don’t have to repeat yourself with <code class="language-plaintext highlighter-rouge">--config</code> at every invocation.</p>
<h1 id="multi-repository-support">Multi Repository Support</h1>
<p>This is a much more important improvement, at least in my opinion. <code class="language-plaintext highlighter-rouge">pgbackrest</code> has been designed with this feature in mind, but until now there was no support for multiple repositories.
<br />
Thanks to multiple repositories you can now <em>scatter or even mirror</em> your backups across different storage systems, so for example you can have a local repository and a remote one (e.g., in one of the supported cloud storages), or you can <em>mount</em> different storages and have the backup mirrored across all of them.
<br />
The advantage of this solution is that it <strong>provides better redundancy</strong> in the case your <em>single-point-of-failure</em> backup storage dies.
<br />
One thing to take into account when working with multiple repositories is that a few <code class="language-plaintext highlighter-rouge">pgbackrest</code> commands now require a repository specification in addition to the stanza. The rule of thumb is that whenever <code class="language-plaintext highlighter-rouge">pgbackrest</code> is able to find out which repository to use, it will do so, and this applies to the case when a single repository is configured. In other words, backward compatibility is safe!
<br />
<br />
In the following, there will be two configured repositories on the same backup machine. While this is <em>a very bad idea</em>, because it emphasizes a single point of failure, it allows for a quick test of a multiple repository setup. The <code class="language-plaintext highlighter-rouge">carmensita</code> machine will handle two different local repositories:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">/backup/pgbackrest</code> is the main repository;</li>
<li><code class="language-plaintext highlighter-rouge">/backup/pgbackrest-mirror</code> is the secondary repository, attached to a different storage.</li>
</ul>
<h2 id="in-the-beginning-there-was-only-repo1">In the beginning there was only <code class="language-plaintext highlighter-rouge">repo1</code></h2>
<p>With <code class="language-plaintext highlighter-rouge">pgbackrest</code> prior to version 2.33, you could not configure multiple repositories: the configuration did accept a <code class="language-plaintext highlighter-rouge">repo1</code> set of variables but it was unable to <em>handle</em> repositories with a specification different from 1. As an example, consider the following configuration:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span>global]
start-fast <span class="o">=</span> y
stop-auto <span class="o">=</span> y
repo1-path <span class="o">=</span> /backup/pgbackrest
repo1-retention-full<span class="o">=</span>2
repo1-retention-archive<span class="o">=</span>5
repo2-path <span class="o">=</span> /backup/pgbackrest-mirror
repo2-retention-full <span class="o">=</span> 1
</code></pre></div></div>
<p><br />
<br /></p>
<p>Such a configuration produces an error even in version 2.32:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgbackrest <span class="nt">--stanza</span> miguel stanza-create
ERROR: <span class="o">[</span>032]: only repo1 may be configured
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="multiple-repositories">Multiple Repositories</h2>
<p>I have to confess that setting up <code class="language-plaintext highlighter-rouge">pgbackrest</code> for different repositories on the same machine was not as simple as I initially thought, but once again <a href="https://github.com/pgbackrest/pgbackrest/issues/1361" target="_blank">thanks to the very professional community behind this great product</a> I was able to fix my setup:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span>global]
start-fast <span class="o">=</span> y
stop-auto <span class="o">=</span> y
repo1-path <span class="o">=</span> /backup/pgbackrest
repo1-retention-full<span class="o">=</span>2
repo1-retention-archive<span class="o">=</span>5
repo2-path <span class="o">=</span> /backup/pgbackrest-mirror
repo2-retention-full <span class="o">=</span> 1
log-level-console <span class="o">=</span> info
<span class="o">[</span>miguel]
pg1-host <span class="o">=</span> miguel
pg1-path <span class="o">=</span> /postgres/12/data
pg1-host-user <span class="o">=</span> postgres
</code></pre></div></div>
<p><br />
<br /></p>
<p>while on the target machine the main configuration parameters are (<code class="language-plaintext highlighter-rouge">/usr/local/etc/pgbackrest.conf</code>):</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[global]
repo1-path = /backup/pgbackrest
repo1-host-user = backup
repo1-host = carmensita
repo2-host = sheriff
repo2-host-user = backup
repo2-path = /backup/pgbackrest-mirror
</code></pre></div></div>
<p><br />
<br /></p>
<h3 id="creating-a-stanza">Creating a stanza</h3>
<p>As you can imagine, the <code class="language-plaintext highlighter-rouge">stanza-create</code> command creates the stanza in all the repositories automatically:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgbackrest <span class="nt">--stanza</span> miguel stanza-create
P00 INFO: stanza-create <span class="k">for </span>stanza <span class="s1">'miguel'</span> on repo1
P00 INFO: stanza-create <span class="k">for </span>stanza <span class="s1">'miguel'</span> on repo2
P00 INFO: stanza-create <span class="nb">command </span>end: completed successfully <span class="o">(</span>1017ms<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="executing-a-backup">Executing a backup</h2>
<p>It is now time to execute a backup and see what happens:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbackrest <span class="nt">--stanza</span> miguel backup
...
INFO: repo option not specified, defaulting to repo1
...
INFO: new backup label <span class="o">=</span> 20210413-105939F
INFO: backup <span class="nb">command </span>end: completed successfully <span class="o">(</span>254377ms<span class="o">)</span>
INFO: expire <span class="nb">command </span>begin 2.33: <span class="nt">--exec-id</span><span class="o">=</span>1606-12c0320b <span class="nt">--log-level-console</span><span class="o">=</span>info <span class="nt">--repo1-path</span><span class="o">=</span>/backup/pgbackrest <span class="nt">--repo2-path</span><span class="o">=</span>/backup/pgbackrest-mirror <span class="nt">--repo1-retention-archive</span><span class="o">=</span>5 <span class="nt">--repo1-retention-full</span><span class="o">=</span>2 <span class="nt">--repo2-retention-full</span><span class="o">=</span>1 <span class="nt">--stanza</span><span class="o">=</span>miguel
INFO: expire <span class="nb">command </span>end: completed successfully <span class="o">(</span>59ms<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, since I did not specify any particular repository, the <strong>program automatically selects the first repository</strong>.</p>
<h2 id="mixed-backups">Mixed backups</h2>
<p>Having a single repository active in the backup list means the backup status is <em>mixed</em>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pgbackrest <span class="nt">--stanza</span> miguel info
stanza: miguel
status: mixed
repo1: ok
repo2: error <span class="o">(</span>no valid backups<span class="o">)</span>
cipher: none
db <span class="o">(</span>current<span class="o">)</span>
wal archive min/max <span class="o">(</span>12<span class="o">)</span>: 0000000100000005000000F2/000000010000000600000004
full backup: 20210413-105939F
timestamp start/stop: 2021-04-13 10:59:39 / 2021-04-13 11:03:51
wal start/stop: 000000010000000600000004 / 000000010000000600000004
database size: 2.5GB, database backup size: 2.5GB
repo1: backup <span class="nb">set </span>size: 142.8MB, backup size: 142.8MB
</code></pre></div></div>
<p><br />
<br /></p>
<p>To some extent, the above is a <em>degraded</em> state, which means that not all repositories hold valid backups.
<br />
Note that the single backup info now has a final line that indicates the repository where the backup can be found.</p>
<h2 id="specifying-the-repository-for-a-backup">Specifying the repository for a backup</h2>
<p>You can specify the <code class="language-plaintext highlighter-rouge">--repo</code> option to tell <code class="language-plaintext highlighter-rouge">pgbackrest</code> which repository to store the backup in:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbackrest <span class="nt">--stanza</span> miguel backup <span class="nt">--repo</span> 2
...
INFO: backup <span class="nb">command </span>end: completed successfully <span class="o">(</span>4846ms<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="the-situation-on-the-repositories">The situation on the repositories</h2>
<p>The <code class="language-plaintext highlighter-rouge">info</code> command can, as always, display information about repositories and their content:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbackrest <span class="nt">--stanza</span> miguel info
stanza: miguel
status: ok
cipher: none
db <span class="o">(</span>current<span class="o">)</span>
wal archive min/max <span class="o">(</span>12<span class="o">)</span>: 0000000100000005000000F2/000000010000000600000016
full backup: 20210413-105939F
timestamp start/stop: 2021-04-13 10:59:39 / 2021-04-13 11:03:51
wal start/stop: 000000010000000600000004 / 000000010000000600000004
database size: 2.5GB, database backup size: 2.5GB
repo1: backup <span class="nb">set </span>size: 142.8MB, backup size: 142.8MB
full backup: 20210413-111525F
timestamp start/stop: 2021-04-13 11:15:25 / 2021-04-13 11:19:37
wal start/stop: 00000001000000060000000F / 00000001000000060000000F
database size: 2.5GB, database backup size: 2.5GB
repo2: backup <span class="nb">set </span>size: 142.8MB, backup size: 142.8MB
...
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="one-backup-at-a-time">One backup at a time</h2>
<p>It is not possible, as far as I know, to instruct <code class="language-plaintext highlighter-rouge">pgbackrest</code> to back up to all the repositories simultaneously. This means that <strong>you are in charge of scheduling backups on all the repositories manually</strong>!</p>
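<p>One way to handle this is to wrap the invocations in a small script that runs one backup per repository, one after the other. The following is a minimal sketch, not an official recipe: the <code class="language-plaintext highlighter-rouge">backup_all_repos</code> function and the <code class="language-plaintext highlighter-rouge">RUN</code> variable are names made up for this example, and the script only prints the commands unless <code class="language-plaintext highlighter-rouge">RUN</code> is set to an empty value.</p>

```shell
#!/bin/sh
# Dry run by default: commands are printed instead of executed.
# Export RUN="" (empty) to really invoke pgbackrest.
RUN=${RUN-echo}

# Hypothetical helper: back up the stanza ($1) on every repository
# number listed after it, sequentially, stopping at the first failure.
backup_all_repos() {
    stanza=$1
    shift
    for repo in "$@"; do
        # --repo selects the target repository, as in the manual example above
        $RUN pgbackrest --stanza "$stanza" backup --repo "$repo" || return 1
    done
}

# The two repositories configured in this article
backup_all_repos miguel 1 2
```

<p>A script like this can then be scheduled, for instance via <code class="language-plaintext highlighter-rouge">cron(8)</code>, so that every repository receives its own backup on a regular basis.</p>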
<h2 id="archiving-on-all-the-repositories">Archiving on all the repositories</h2>
<p>The archiving, on the other hand, is done on all repositories at the same time.
However, as <a href="https://github.com/pgbackrest/pgbackrest/issues/1361#issuecomment-817630417" target="_blank">explained here</a>, <code class="language-plaintext highlighter-rouge">archive-push</code> iterates over every repository to push the same WAL segment. What this means is that, from a PostgreSQL perspective, if one repository fails to get the WAL (while the others succeed), PostgreSQL will think the archiving has failed and will retry later.
<br />
One way to solve the problem is to use the <code class="language-plaintext highlighter-rouge">archive-push</code> asynchronous mode.</p>
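<p>As a rough sketch, asynchronous archiving is enabled in the configuration file of the PostgreSQL host by means of the <code class="language-plaintext highlighter-rouge">archive-async</code> and <code class="language-plaintext highlighter-rouge">spool-path</code> options. The spool directory used here is only an assumption: pick any local directory writable by the user running <code class="language-plaintext highlighter-rouge">pgbackrest</code>.</p>

```ini
[global]
# push WAL segments in the background instead of inline with archive_command
archive-async = y
# local queue for asynchronous archiving (path assumed for this example)
spool-path = /var/spool/pgbackrest
```

<p>I have not tested this fragment against the two-repository setup described above, so take it as a starting point rather than a verified configuration.</p>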
<h1 id="conclusions">Conclusions</h1>
<p>I am very enthusiastic about how <code class="language-plaintext highlighter-rouge">pgbackrest</code> is progressing and how it is enabling new features at every release.</p>
Preventing FreeBSD to kill PostgreSQL (aka OOM Killer prevention)2021-04-02T00:00:00+00:00https://fluca1978.github.io/2021/04/02/OOMKillerFreeBSD<p>Something that can be useful when running PostgreSQL on FreeBSD.</p>
<h1 id="preventing-freebsd-to-kill-postgresql-aka-oom-killer-prevention">Preventing FreeBSD to kill PostgreSQL (aka OOM Killer prevention)</h1>
<p>There are a lot of interesting articles on how to prevent the <em>Out of Memory Killer</em> (<em>OOM killer</em> in short) on Linux from ruining your day, or better your night. One particularly well done explanation of how the OOM Killer works, and how to help PostgreSQL survive, is, in my humble opinion, the one from <a href="https://www.percona.com/blog/2019/08/02/out-of-memory-killer-or-savior/" target="_blank">Percona Blog</a>.</p>
<p><br />
<br />
I tend to run PostgreSQL on FreeBSD machines, at least whenever it is possible, and quite frankly I still have a lot of things to learn. One of those <em>little</em> details is about the <em>FreeBSD OOM Killer</em>.
<br />
It turned out FreeBSD <strong>has its own <em>OOM Killer implementation</em></strong>, see <a href="https://klarasystems.com/articles/exploring-swap-on-freebsd/" target="_blank">this excellent article</a>; I discovered it recently via the <a href="https://forums.freebsd.org/threads/oom-killer-like-configuration.79514/" target="_blank">excellent FreeBSD forum</a> and, as usual, the kindness and professionalism of the community behind this great operating system.
<br />
<br />
A difference between Linux and FreeBSD is that the former makes heavy use of the <code class="language-plaintext highlighter-rouge">/proc</code> filesystem to let the administrator interact with process configuration and information, while the latter does not.
And thanks to <a href="https://klarasystems.com/articles/exploring-swap-on-freebsd/" target="_blank">the above article</a> I discovered the <code class="language-plaintext highlighter-rouge">protect(1)</code> command, which is aimed at instructing the OOM Killer.
<br />
<br />
In the following I describe what I learnt so far and how to protect PostgreSQL from the OOM Killer.</p>
<h2 id="protect1-and-freebsd-oom-killer"><code class="language-plaintext highlighter-rouge">protect(1)</code> and FreeBSD OOM Killer</h2>
<p>Processes in FreeBSD have a particular flag named <strong><code class="language-plaintext highlighter-rouge">PROC_SPROTECT</code></strong> that, as the man page for the <code class="language-plaintext highlighter-rouge">procctl(2)</code> system call states, is used to instruct the OOM Killer to skip this process when selecting a candidate to kill:</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PROC_SPROTECT Set process protection state. This is used to mark
a process as protected from being killed if the
system exhausts the available memory and swap.
</code></pre></div></div>
<p><br />
<br /></p>
<p>The idea is that when the OOM Killer <em>scans</em> the processes to find one (or more) candidates to kill in order to immediately free memory, the <em>protected</em> processes must be skipped.
<br />
An important thing to note is that <strong>protection is not inherited by <code class="language-plaintext highlighter-rouge">fork(2)</code>-ed processes</strong>. Luckily, it is possible to mark a protected process so that its children inherit the protection status. In fact, setting <code class="language-plaintext highlighter-rouge">PROC_SPROTECT</code> to:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">PPROT_SET</code> protects the current process but not its children;</li>
<li><code class="language-plaintext highlighter-rouge">PPROT_SET | PPROT_INHERIT</code> protects the current process and any children forked from then on.</li>
</ul>
<p><br />
Why is this detail important? Because as we all know, PostgreSQL starts with a <em>main</em> process (the <em>postmaster</em>) that forks a new process for every connection. Therefore, you are free to control the OOM Killer protection either at the postmaster level or at the single connection level.
<br />
<br />
<em>WARNING: marking all processes as protected can prevent the OOM Killer from working at all, with the presumable result of <strong>panicking</strong> the whole machine</em>.
<br />
<br /></p>
<h2 id="protecting-postgresql-from-oom-killer">Protecting PostgreSQL from OOM Killer</h2>
<p>There are two main ways to protect PostgreSQL from the OOM Killer:</p>
<ul>
<li>manually use <code class="language-plaintext highlighter-rouge">protect(1)</code> against one or more PostgreSQL processes;</li>
<li>automatically use <code class="language-plaintext highlighter-rouge">protect(1)</code> at service startup.</li>
</ul>
<p><br />
Manually using <code class="language-plaintext highlighter-rouge">protect(1)</code> means that you are going to protect the process by means of its PID. As an example, suppose that on a machine there are the following processes:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>pstree <span class="nt">-s</span> postgres
<span class="se">\-</span>+<span class="o">=</span> 00776 postgres /usr/local/bin/postgres <span class="nt">-D</span> /postgres/12/data
|--<span class="o">=</span> 00777 postgres postgres: logger <span class="o">(</span>postgres<span class="o">)</span>
|--<span class="o">=</span> 00779 postgres postgres: checkpointer <span class="o">(</span>postgres<span class="o">)</span>
|--<span class="o">=</span> 00780 postgres postgres: background writer <span class="o">(</span>postgres<span class="o">)</span>
|--<span class="o">=</span> 00781 postgres postgres: walwriter <span class="o">(</span>postgres<span class="o">)</span>
|--<span class="o">=</span> 00782 postgres postgres: stats collector <span class="o">(</span>postgres<span class="o">)</span>
<span class="se">\-</span>-<span class="o">=</span> 00783 postgres postgres: logical replication launcher <span class="o">(</span>postgres<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>where the process with PID <code class="language-plaintext highlighter-rouge">776</code> is clearly the <em>postmaster</em>. Now, assume you want to protect the postmaster itself: you can call <code class="language-plaintext highlighter-rouge">protect(1)</code> specifying the PID of the process.</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>protect <span class="nt">-p</span> 776
</code></pre></div></div>
<p><br />
<br /></p>
<p>The main flags for <code class="language-plaintext highlighter-rouge">protect(1)</code> are:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">-p</code> specifies the PID of the process to protect;</li>
<li><code class="language-plaintext highlighter-rouge">-d</code> or <code class="language-plaintext highlighter-rouge">-i</code> to apply the protection to all the current children or to the future children;</li>
<li><code class="language-plaintext highlighter-rouge">-c</code> to remove the protection.</li>
</ul>
<p>Therefore, in order to protect <strong>all new connections to the database</strong> the command to use is:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>protect <span class="nt">-i</span> <span class="nt">-p</span> 776
</code></pre></div></div>
<p><br />
<br /></p>
<p>that reads as <em>protect process <code class="language-plaintext highlighter-rouge">776</code> and all new forked processes</em>.
<br />
<br />
Doing all the protection manually is boring, and luckily the excellent <code class="language-plaintext highlighter-rouge">rc.d</code> system allows for the configuration of protection at the service startup. It is possible to specify the <strong><code class="language-plaintext highlighter-rouge">oomprotect</code></strong> configuration parameter for the service (all services, not only PostgreSQL!), that in turn can assume the following values:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">yes</code> enables protection for (a single) process;</li>
<li><code class="language-plaintext highlighter-rouge">all</code> enables protection for all processes (forked).
<br />
<br /></li>
</ul>
<p><strong>Unluckily, this does not apply directly to PostgreSQL since the <code class="language-plaintext highlighter-rouge">service(8)</code> script <code class="language-plaintext highlighter-rouge">/usr/local/etc/rc.d/postgresql</code> does not fully use <code class="language-plaintext highlighter-rouge">/etc/rc.subr</code> that, in turn, is in charge of examining the <code class="language-plaintext highlighter-rouge">oomprotect</code> variable.</strong> The <code class="language-plaintext highlighter-rouge">postgresql</code> script directly uses <code class="language-plaintext highlighter-rouge">pg_ctl(1)</code> to manage the cluster, without any “protection” possible.
<strong>I suspect the problem is due to the fact that <code class="language-plaintext highlighter-rouge">pg_ctl(1)</code> must be run as a normal user, and therefore there is the need to simultaneously run the <code class="language-plaintext highlighter-rouge">pg_ctl(1)</code> command without <code class="language-plaintext highlighter-rouge">root</code> privileges, as well as with such privileges to wrap it in <code class="language-plaintext highlighter-rouge">protect(1)</code>.</strong></p>
<p>In short, this means that even a configuration like the following will not apply <code class="language-plaintext highlighter-rouge">protect(1)</code> to PostgreSQL:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">postgresql_enable</span><span class="o">=</span><span class="s2">"YES"</span>
<span class="nv">postgresql_data</span><span class="o">=</span><span class="s2">"/postgres/12/data"</span>
<span class="c"># all = protect -i -p</span>
<span class="c"># yes = protect -p</span>
<span class="nv">postgresql_oomprotect</span><span class="o">=</span><span class="s2">"all"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore, in order to protect the <em>postmaster</em> or any other PostgreSQL process, you need to manually use <code class="language-plaintext highlighter-rouge">protect(1)</code> as already shown.
<br />
I am not sure if this is going to change in the future to allow the <code class="language-plaintext highlighter-rouge">rc.d</code> script to honor the <code class="language-plaintext highlighter-rouge">oomprotect</code> variable.</p>
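<p>In the meantime, the manual step can be scripted and run right after the service has started. The following is a minimal sketch under a few assumptions: the data directory is the one used in this article, the script runs with <code class="language-plaintext highlighter-rouge">root</code> privileges, and the <code class="language-plaintext highlighter-rouge">postmaster_pid</code> helper is a name invented for this example. It relies on the fact that the first line of <code class="language-plaintext highlighter-rouge">postmaster.pid</code> holds the PID of the postmaster.</p>

```shell
#!/bin/sh
# Data directory used throughout this article; override via the environment.
PGDATA=${PGDATA:-/postgres/12/data}

# Hypothetical helper: print the postmaster PID for the data directory $1,
# i.e. the first line of its postmaster.pid file.
postmaster_pid() {
    head -n 1 "$1/postmaster.pid"
}

# Do nothing if the cluster is not running (no readable PID file).
if [ -r "$PGDATA/postmaster.pid" ]; then
    # -i: future children inherit the protection; -p: PID to protect
    protect -i -p "$(postmaster_pid "$PGDATA")"
fi
```

<p>Since only future children inherit the protection, the sooner this runs after startup, the fewer already forked processes are left unprotected.</p>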
<h2 id="how-to-inspect-the-protection-status">How to inspect the protection status</h2>
<p>This has been hard for me, but again thanks to the great FreeBSD community and IRC, I discovered that <code class="language-plaintext highlighter-rouge">ps(1)</code> has a special output keyword, named <code class="language-plaintext highlighter-rouge">flags</code>, that can show the status of the single process protection. There is also the <code class="language-plaintext highlighter-rouge">flags2</code> keyword, which shows the status of the protection inheritance.
<br />
Both the <code class="language-plaintext highlighter-rouge">flags</code> and <code class="language-plaintext highlighter-rouge">flags2</code> columns contain <em>hexadecimal values</em> that indicate all the extra information tied to a process. In the case of <code class="language-plaintext highlighter-rouge">P_PROTECTED</code> the value is <code class="language-plaintext highlighter-rouge">0x100000</code> (and this is found in <code class="language-plaintext highlighter-rouge">flags</code>), while for <code class="language-plaintext highlighter-rouge">P_INHERIT_PROTECTED</code> the value is <code class="language-plaintext highlighter-rouge">0x00000001</code> (and this is found in <code class="language-plaintext highlighter-rouge">flags2</code>).
<br />
Putting it all together, you can inspect your PostgreSQL processes as follows:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>ps <span class="nt">-ax</span> <span class="nt">-o</span> flags,flags2,pid,command | <span class="nb">grep </span>postgres
10104000 00000001 3747 /usr/local/bin/postgres <span class="nt">-D</span> /postgres/12/data
10100000 00000001 3748 postgres: logger <span class="o">(</span>postgres<span class="o">)</span>
10100000 00000001 3750 postgres: checkpointer <span class="o">(</span>postgres<span class="o">)</span>
10100000 00000001 3751 postgres: background writer <span class="o">(</span>postgres<span class="o">)</span>
10100000 00000001 3752 postgres: walwriter <span class="o">(</span>postgres<span class="o">)</span>
10100000 00000001 3753 postgres: stats collector <span class="o">(</span>postgres<span class="o">)</span>
10100000 00000001 3754 postgres: logical replication launcher <span class="o">(</span>postgres<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The first process, with PID <code class="language-plaintext highlighter-rouge">3747</code>, is the already mentioned <em>postmaster</em>: it has a <code class="language-plaintext highlighter-rouge">flags</code> value of <code class="language-plaintext highlighter-rouge">10104000</code>, which means <strong>it is OOM protected</strong>, and a <code class="language-plaintext highlighter-rouge">flags2</code> value of <code class="language-plaintext highlighter-rouge">00000001</code>, which means that any spawned process will be protected too.</p>
<p>You can check this with some math and Perl:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
% <span class="nb">sudo </span>ps <span class="nt">-ax</span> <span class="nt">-o</span> flags,flags2,command <span class="se">\</span>
| <span class="nb">grep </span>postgres <span class="se">\</span>
| perl <span class="nt">-lanE</span> <span class="s1">'say "[OOM PROTECTED]\t@F[2 .. $#F]" if $F[0] =~ /^\d{2}1\d{5}$/; '</span>
<span class="o">[</span>OOM PROTECTED] /usr/local/bin/postgres <span class="nt">-D</span> /postgres/12/data
<span class="o">[</span>OOM PROTECTED] postgres: logger <span class="o">(</span>postgres<span class="o">)</span>
<span class="o">[</span>OOM PROTECTED] postgres: checkpointer <span class="o">(</span>postgres<span class="o">)</span>
<span class="o">[</span>OOM PROTECTED] postgres: background writer <span class="o">(</span>postgres<span class="o">)</span>
<span class="o">[</span>OOM PROTECTED] postgres: walwriter <span class="o">(</span>postgres<span class="o">)</span>
<span class="o">[</span>OOM PROTECTED] postgres: stats collector <span class="o">(</span>postgres<span class="o">)</span>
<span class="o">[</span>OOM PROTECTED] postgres: logical replication launcher <span class="o">(</span>postgres<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
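<p>The above Perl one-liner splits every line of the <code class="language-plaintext highlighter-rouge">ps</code> output into the internal array <code class="language-plaintext highlighter-rouge">@F</code> (thanks to the <code class="language-plaintext highlighter-rouge">-a</code> switch), so that the first element is the <code class="language-plaintext highlighter-rouge">flags</code> value and the elements from the third onward are the command line. It then checks whether the third hexadecimal digit from the left of <code class="language-plaintext highlighter-rouge">flags</code> is <code class="language-plaintext highlighter-rouge">1</code>, that is whether the <code class="language-plaintext highlighter-rouge">0x100000</code> bit is set; in such a case the process is protected against OOM killing.</p>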
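<p>The same check can also be expressed with plain arithmetic instead of a regular expression. The following is only a minimal sketch: the <code class="language-plaintext highlighter-rouge">is_oom_protected</code> function name is hypothetical, the mask is the <code class="language-plaintext highlighter-rouge">0x100000</code> value of <code class="language-plaintext highlighter-rouge">P_PROTECTED</code> mentioned above, and the third sample value is made up to show an unprotected process:</p>
<p><br />
<br /></p>
<pre><code class="language-shell">#!/bin/sh
# Hypothetical helper (not part of FreeBSD or PostgreSQL): succeeds when the
# hexadecimal "flags" value printed by ps(1) has the P_PROTECTED bit set.
P_PROTECTED=0x100000

is_oom_protected() {
    # $1 is the flags column without the 0x prefix, e.g. 10104000.
    # Integer division drops the lower bits; the remainder modulo 2
    # is the value of the P_PROTECTED bit itself.
    [ $(( (0x$1 / $P_PROTECTED) % 2 )) -eq 1 ]
}

# The first two values come from the listing above, the last one is made up.
for flags in 10104000 10100000 10004000; do
    if is_oom_protected "$flags"; then
        echo "$flags protected"
    else
        echo "$flags not protected"
    fi
done
</code></pre>
<p><br />
<br /></p>
<p>Unlike the regular expression, this approach tests the single bit, so it should also cope with a <code class="language-plaintext highlighter-rouge">flags</code> value whose third digit is not exactly <code class="language-plaintext highlighter-rouge">1</code> or that contains the hexadecimal digits <code class="language-plaintext highlighter-rouge">a</code> to <code class="language-plaintext highlighter-rouge">f</code>.</p>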
<h2 id="hey-ma-am-i-protected">Hey ‘ma, am I protected?</h2>
<p>I created an example <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/freebsd_oom.sql" target="_blank">pl/pgSQL function to check if the current connection is protected against the OOM Killer</a>.
The function is defined with <code class="language-plaintext highlighter-rouge">SECURITY DEFINER</code> and has to be created by a superuser, because it internally uses the <code class="language-plaintext highlighter-rouge">COPY</code> command to execute the <code class="language-plaintext highlighter-rouge">ps</code> utility.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_oomprotect</span><span class="p">(</span> <span class="n">pid</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="k">NULL</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">boolean</span>
<span class="k">AS</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">p_protected</span> <span class="nb">bit</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span> <span class="o">=</span> <span class="s1">'00100000'</span><span class="p">;</span>
<span class="n">is_protected</span> <span class="nb">boolean</span> <span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="n">shell</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="c1">-- if no pid supplied, use my own</span>
<span class="n">IF</span> <span class="n">pid</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">OR</span> <span class="n">pid</span> <span class="o"><</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">pid</span> <span class="p">:</span><span class="o">=</span> <span class="n">pg_backend_pid</span><span class="p">();</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Inspecting PostgreSQL process %'</span><span class="p">,</span> <span class="n">pid</span><span class="p">;</span>
<span class="n">shell</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'/bin/ps -ax -o flags,flags2 -p '</span>
<span class="o">||</span> <span class="n">pid</span>
<span class="o">||</span> <span class="s1">' | /usr/bin/tail -n 1 '</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">TEMPORARY</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span>
<span class="n">my_ps</span><span class="p">(</span> <span class="n">flags</span> <span class="nb">bit</span><span class="p">(</span><span class="mi">8</span><span class="p">),</span> <span class="n">flags2</span> <span class="nb">bit</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span> <span class="p">);</span>
<span class="k">TRUNCATE</span> <span class="n">my_ps</span><span class="p">;</span>
<span class="k">EXECUTE</span> <span class="n">format</span><span class="p">(</span> <span class="s1">' COPY my_ps( flags , flags2 ) FROM PROGRAM $$ %s $$ WITH ( DELIMITER $$ $$, FORMAT TEXT)'</span><span class="p">,</span> <span class="n">shell</span> <span class="p">);</span>
<span class="k">SELECT</span> <span class="p">(</span> <span class="n">flags</span> <span class="o">&</span> <span class="n">p_protected</span> <span class="p">)::</span><span class="nb">int</span> <span class="o">></span> <span class="mi">0</span>
<span class="k">INTO</span> <span class="n">is_protected</span>
<span class="k">FROM</span> <span class="n">my_ps</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">is_protected</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span>
<span class="k">SECURITY</span> <span class="k">DEFINER</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The idea is quite simple: the function gets a PID to check and, if none is specified, assumes we are interested in the current connection. Then the function creates (or empties) a temporary table <code class="language-plaintext highlighter-rouge">my_ps</code> to store the result of the <code class="language-plaintext highlighter-rouge">ps</code> shell command, in particular <code class="language-plaintext highlighter-rouge">flags</code> and <code class="language-plaintext highlighter-rouge">flags2</code> (even if only the former is used).
Flags are stored as bit strings, so that comparing them becomes simpler.
Last, the <code class="language-plaintext highlighter-rouge">flags</code> field is combined by means of a bitwise AND with the <code class="language-plaintext highlighter-rouge">p_protected</code> internal variable, and the boolean result is returned.
<br />
Therefore if the function returns <code class="language-plaintext highlighter-rouge">true</code> the selected connection/backend process is protected against the OOM Killer.</p>
<h1 id="conclusions">Conclusions</h1>
<p>As usual, FreeBSD reveals itself as a complex and well designed operating system. PostgreSQL can be protected against the OOM Killer in a more aggressive way than on Linux, but as usual <em>protecting everything is like protecting nothing at all</em>, so I recommend not abusing the <code class="language-plaintext highlighter-rouge">protect(1)</code> command.</p>
A glance at Raku connectivity towards PostgreSQL2021-03-29T00:00:00+00:00https://fluca1978.github.io/2021/03/29/RakuPostgreSQL<p>A glance at Raku implementation for PostgreSQL database connectivity.</p>
<h1 id="a-glance-at-raku-connectivity-towards-postgresql">A glance at Raku connectivity towards PostgreSQL</h1>
<p><a href="https://raku.org" target="_blank">Raku</a> is a great language in my opinion, and I’m using it more and more every day. I can say it is going to replace my Perl scripting.
<br />
<br />
Raku comes with an extensive module library, which of course includes <strong>database connectivity</strong>, which in turn includes features for connecting to PostgreSQL.
<br />
In this simple article, I’m going to quickly demonstrate how to use a Raku piece of code to do many of the trivial tasks that a database application can do.
<br />
The script is presented in an incremental way, so the <em>Connecting to the database</em> section must always be used as the script preamble.
<br />
<br />
The <a href="https://modules.raku.org/dist/DB::Pg:cpan:CTILMES" target="_blank"><code class="language-plaintext highlighter-rouge">DB::Pg</code></a> module is somewhat similar to Perl 5 <code class="language-plaintext highlighter-rouge">DBD::Pg</code>, so a lot of concepts and method names will be reminiscent of the latter.</p>
<h2 id="installation">Installation</h2>
<p>It is possible to use <code class="language-plaintext highlighter-rouge">zef</code> to install the <code class="language-plaintext highlighter-rouge">DB::Pg</code> module:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">% zef install DB::Pg
</code></pre>
<p><br />
<br /></p>
<p>Depending on the speed of your system and the libraries already installed, it can take a few minutes.
<br />
<br />
If you are going to use <code class="language-plaintext highlighter-rouge">LISTEN</code>/<code class="language-plaintext highlighter-rouge">NOTIFY</code>, you also need to install the <code class="language-plaintext highlighter-rouge">epoll</code> module:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">% zef install epoll
</code></pre>
<p><br />
<br /></p>
<h2 id="connecting-to-the-database">Connecting to the database</h2>
<p>It is now possible to connect to the database using the <code class="language-plaintext highlighter-rouge">DB::Pg</code> module. For example, a simple script that accepts all parameters (in clear text!) on the command line can be:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">#!raku
use DB::Pg;
sub MAIN( Str :$host = 'miguel',
Str :$username = 'luca',
Str :$password = 'secret',
Str :$database = 'testdb' ) {
"Connecting $username @ $host/$database".say;
my $connection = DB::Pg.new: conninfo => "host=$host user=$username password=$password dbname=$database";
</code></pre>
<p><br />
<br /></p>
<p>As you can see, the <code class="language-plaintext highlighter-rouge">DB::Pg</code> module accepts a <strong>conninfo</strong> string.</p>
<h2 id="read-queries-and-results">Read queries and results</h2>
<p>The <code class="language-plaintext highlighter-rouge">.query</code> method allows for issuing a read query to the database. The result is a <code class="language-plaintext highlighter-rouge">Result</code> class object, that can be used by means of different methods, most notably with <code class="language-plaintext highlighter-rouge">.hashes</code> and <code class="language-plaintext highlighter-rouge">.arrays</code> that return a sequence of hashes or arrays, one per every row extracted from the query.
<br />
Special methods like <code class="language-plaintext highlighter-rouge">.rows</code> and <code class="language-plaintext highlighter-rouge">.columns</code> provide respectively the number of rows returned by a query and the list of column names of the result set.
<br />
As an example, here is a simple query:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">my $query = 'SELECT current_role, current_time';
my $results = $connection.query: $query;
say "The query { $query } returned { $results.rows } rows with columns: { $results.columns.join( ', ' ) }";
for $results.hashes -> $row {
for $row.kv -> $column, $value {
say "Column $column = $value";
}
}
</code></pre>
<p><br />
<br /></p>
<p>The above piece of code provides an output similar to the following:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">The query SELECT current_role, current_time returned 1 rows with columns: current_role, current_time
Column current_role = luca
Column current_time = 14:48:47.147983+02
</code></pre>
<p><br />
<br /></p>
<h2 id="cursors">Cursors</h2>
<p>By default, the <code class="language-plaintext highlighter-rouge">.query</code> method will fetch all the rows from the query, which is a problem with larger datasets. It is possible to use the <code class="language-plaintext highlighter-rouge">.cursor</code> method, which accepts an optional batch size (by default <code class="language-plaintext highlighter-rouge">1000</code> tuples) and, optionally, the specifier for getting results as a sequence of hashes.
<br />
As a simple example:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">for $connection.cursor( 'select * from raku', fetch => 2, :hash ) -> %row {
say "====================";
for %row.kv -> $column, $value {
say "Column [ $column ] = $value";
}
say "====================";
}
</code></pre>
<p><br />
<br /></p>
<p>that produces an output like:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">====================
Column [ pk ] = 2
Column [ t ] = This is value 0
====================
====================
Column [ pk ] = 3
Column [ t ] = This is value 1
====================
====================
Column [ t ] = This is value 2
Column [ pk ] = 4
====================
====================
Column [ pk ] = 5
Column [ t ] = This is value 3
====================
...
</code></pre>
<p><br />
<br /></p>
<h2 id="write-statements">Write Statements</h2>
<p>Write statements can be performed by means of <code class="language-plaintext highlighter-rouge">.execute</code> method, such as:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">$connection.execute: q< insert into raku( t ) values( 'Hello World' )>;
</code></pre>
<p><br />
<br /></p>
<h2 id="transactions-and-prepared-statements">Transactions and Prepared Statements</h2>
<p>In order to handle transactions, you need to access the <em>database handler</em> that is “masked” into the <code class="language-plaintext highlighter-rouge">DB::Pg</code> main object. The <em>database</em> object provides the methods <code class="language-plaintext highlighter-rouge">.begin</code>, <code class="language-plaintext highlighter-rouge">.rollback</code>, <code class="language-plaintext highlighter-rouge">.commit</code> as usual.
<br />
Moreover, it is possible to use the <code class="language-plaintext highlighter-rouge">.prepare</code> method to obtain a <em>prepared statement</em> that can be cached and used in loops and repetitive tasks. It is worth noting that the <code class="language-plaintext highlighter-rouge">.prepare</code> method uses the <code class="language-plaintext highlighter-rouge">$1</code>, <code class="language-plaintext highlighter-rouge">$2</code>, and so on parameter placeholders, and that when a statement accepts a single value it has to be specified without the index in <code class="language-plaintext highlighter-rouge">.execute</code>.
<br />
As an example:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">my $database-handler = $connection.db;
my $statement = $database-handler.prepare: 'insert into raku( t ) values( $1 )';
$database-handler.begin;
$statement.execute( "This is value $_" ) for 0 .. 10;
$database-handler.commit;
$database-handler.finish;
</code></pre>
<p><br />
<br /></p>
<p>The above loop is equivalent to an SQL transaction like:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">BEGIN</span><span class="p">;</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">raku</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span><span class="s1">'This is value 0'</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">raku</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span><span class="s1">'This is value 1'</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">raku</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span><span class="s1">'This is value 2'</span> <span class="p">);</span>
<span class="p">...</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">raku</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span><span class="s1">'This is value 10'</span> <span class="p">);</span>
<span class="k">COMMIT</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">.finish</code> method is required because <code class="language-plaintext highlighter-rouge">DB::Pg</code> handles caching.
Please note that the <code class="language-plaintext highlighter-rouge">.commit</code> and <code class="language-plaintext highlighter-rouge">.rollback</code> methods are <em>fluent</em>, and return an object instance so that you can call <code class="language-plaintext highlighter-rouge">.commit.finish</code>.</p>
<h2 id="databases-vs-connections">Databases vs Connections</h2>
<p>Caching is handled so that when a query is issued, a new connection is opened and used. Once the work has completed, the connection is returned to the internal pool. The <code class="language-plaintext highlighter-rouge">DB::Pg::Database</code> object does the same work as the <code class="language-plaintext highlighter-rouge">DB::Pg</code> one, with the exception that it does not automatically return the connection to the pool, so you need to call <code class="language-plaintext highlighter-rouge">.finish</code> yourself.
<br />
<br />
Therefore, you can use the same <code class="language-plaintext highlighter-rouge">.query</code> and <code class="language-plaintext highlighter-rouge">.execute</code> methods on both the objects, but the <code class="language-plaintext highlighter-rouge">DB::Pg</code> one automatically returns the connection to the internal pool, while the database object gives you fine-grained control over when to return the connection to the pool.</p>
<h2 id="copy">Copy</h2>
<p>PostgreSQL provides the special <code class="language-plaintext highlighter-rouge">COPY</code> command, that can be used to copy from and into. There is a method <code class="language-plaintext highlighter-rouge">.copy-in</code> that executes a <code class="language-plaintext highlighter-rouge">COPY FROM</code>, while <code class="language-plaintext highlighter-rouge">COPY TO</code> can be used within an iteration loop:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">my $file = '/tmp/raku.csv'.IO.open: :w;
for $connection.query: 'COPY raku TO stdout (FORMAT CSV)' -> $row {
$file.print: $row;
}
</code></pre>
<p><br />
<br /></p>
<p>The above exports the CSV result to a text file.
<br />
To read the data back, it is possible to issue the <code class="language-plaintext highlighter-rouge">.copy-in</code> method, but you first need to issue an SQL <code class="language-plaintext highlighter-rouge">COPY</code>. The workflow is:</p>
<ul>
<li>issue a <code class="language-plaintext highlighter-rouge">COPY FROM STDIN</code>;</li>
<li>use <code class="language-plaintext highlighter-rouge">.copy-data</code> to slurp all the data;</li>
<li>use <code class="language-plaintext highlighter-rouge">.copy-end</code> to notify the database that the <code class="language-plaintext highlighter-rouge">COPY</code> is concluded.</li>
</ul>
<p><br />
The need for <code class="language-plaintext highlighter-rouge">.copy-end</code> is an advantage: it is possible to issue several <code class="language-plaintext highlighter-rouge">.copy-data</code> calls in a single run, for example to import data from different files.</p>
<p><br />
<br /></p>
<pre><code class="language-raku">$database-handler = $connection.db;
$database-handler.query: 'COPY raku FROM STDIN (FORMAT CSV)';
$database-handler.copy-data: '/tmp/raku1.csv'.IO.slurp;
$database-handler.copy-data: '/tmp/raku2.csv'.IO.slurp;
$database-handler.copy-end;
</code></pre>
<p><br />
<br /></p>
<h2 id="converters">Converters</h2>
<p>It is possible to specify converters, special <em>roles</em> that handle values in and out of the database; something that reminds me of the <em>inflate</em> and <em>deflate</em> options of <code class="language-plaintext highlighter-rouge">DBIx::Class</code>.
<br />
The first step is to add a role to the <code class="language-plaintext highlighter-rouge">converter</code> instance within the <code class="language-plaintext highlighter-rouge">DB::Pg</code>; such a role must:</p>
<ul>
<li>add a new type conversion;</li>
<li>add a <code class="language-plaintext highlighter-rouge">convert</code> method that handles the stringified value of the type and returns the new value (as any Raku object).</li>
</ul>
<p>As an example, the following converts a <code class="language-plaintext highlighter-rouge">text</code> PostgreSQL type into a <code class="language-plaintext highlighter-rouge">Str</code> Raku object with its content reversed and upper-cased:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">$connection.converter does role fluca-converter
{
submethod BUILD { self.add-type( text => Str ) }
multi method convert( Str:U, Str:D $value) {
$value.flip.uc;
}
}
.say for $connection.query( 'select * from raku' ).arrays;
</code></pre>
<p><br />
<br /></p>
<p>that produces an output similar to:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span>442 DLROW OLLEH]
<span class="o">[</span>454 DLROW OLLEH]
<span class="o">[</span>466 DLROW OLLEH]
</code></pre></div></div>
<p><br />
<br /></p>
<p>where the string <code class="language-plaintext highlighter-rouge">Hello World</code> is flipped and upper-cased.</p>
<h1 id="listen-and-notify">Listen and Notify</h1>
<p><code class="language-plaintext highlighter-rouge">DB::Pg</code> can also handle <code class="language-plaintext highlighter-rouge">LISTEN</code> and <code class="language-plaintext highlighter-rouge">NOTIFY</code>, which are able to interact with the <code class="language-plaintext highlighter-rouge">react</code> dynamic feature of Raku.
<br />
First of all, create a simple mechanism to notify some events:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">or</span> <span class="k">replace</span> <span class="k">rule</span> <span class="n">r_raku_insert</span>
<span class="k">as</span> <span class="k">on</span> <span class="k">insert</span> <span class="k">to</span> <span class="n">raku</span>
<span class="k">do</span> <span class="n">also</span>
<span class="k">SELECT</span> <span class="n">pg_notify</span><span class="p">(</span> <span class="s1">'insert_event'</span><span class="p">,</span> <span class="s1">'INSERTING ROW(S)'</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">RULE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">or</span> <span class="k">replace</span> <span class="k">rule</span> <span class="n">r_raku_delete</span>
<span class="k">as</span> <span class="k">on</span> <span class="k">delete</span> <span class="k">to</span> <span class="n">raku</span>
<span class="k">do</span> <span class="n">also</span>
<span class="k">SELECT</span> <span class="n">pg_notify</span><span class="p">(</span> <span class="s1">'delete_event'</span><span class="p">,</span> <span class="s1">'DELETING ROW(S)'</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">RULE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Now it is possible to create a Raku script that waits for incoming events:</p>
<p><br />
<br /></p>
<pre><code class="language-raku">react {
whenever $connection.listen( 'delete_event' ) { .say; }
whenever $connection.listen( 'insert_event' ) { .say; }
}
</code></pre>
<p><br />
<br /></p>
<p>The aim is that, every time an event is issued, the <code class="language-plaintext highlighter-rouge">.listen</code> passes the message payload to the <code class="language-plaintext highlighter-rouge">react</code> code block. Therefore, issuing some <code class="language-plaintext highlighter-rouge">DELETE</code> and <code class="language-plaintext highlighter-rouge">INSERT</code> will result in the output:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DELETING ROW<span class="o">(</span>S<span class="o">)</span>
INSERTING ROW<span class="o">(</span>S<span class="o">)</span>
INSERTING ROW<span class="o">(</span>S<span class="o">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>It is possible to stop the listening <code class="language-plaintext highlighter-rouge">react</code> block with the <code class="language-plaintext highlighter-rouge">.unlisten</code> method. It is also possible to issue an event via <code class="language-plaintext highlighter-rouge">.notify</code>.</p>
<h1 id="conclusions">Conclusions</h1>
<p><a href="https://modules.raku.org/dist/DB::Pg:cpan:CTILMES" target="_blank"><code class="language-plaintext highlighter-rouge">DB::Pg</code></a> is a great driver for PostgreSQL that allows Raku to exploit a lot of database features directly from within the language.</p>
A first look at pg_repack2021-03-25T00:00:00+00:00https://fluca1978.github.io/2021/03/25/pgrepack<p>An interesting extension that helps removing bloating from tables and databases.</p>
<h1 id="a-first-look-at-pg_repack">A first look at pg_repack</h1>
<p>I got time to have a look at <a href="https://github.com/reorg/pg_repack" target="_blank"><code class="language-plaintext highlighter-rouge">pg_repack</code></a>, an interesting extension that helps remove bloat from tables, indexes and databases with the promise of minimal locking.
<br />
Since this is a first look, I could be wrong on some aspects, so please forgive me.
<br />
<br />
The main idea behind <code class="language-plaintext highlighter-rouge">pg_repack</code> is to perform an <em>on-line copy</em> of a source (bloated) table, then switching the original table with the new one. In short, something like:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">BEGIN</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">not_bloated</span> <span class="k">AS</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">my_table</span> <span class="k">RENAME</span> <span class="k">TO</span> <span class="n">old_bloated</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">not_bloated</span> <span class="k">RENAME</span> <span class="k">TO</span> <span class="n">my_table</span><span class="p">;</span>
<span class="k">DROP</span> <span class="k">TABLE</span> <span class="n">old_bloated</span><span class="p">;</span>
<span class="k">COMMIT</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>Of course things are a lot more complex than the above description</strong>, but I think that could be a good summary of what happens.
<br />
<br />
Why would such a workflow remove bloat?
<br />
Well, the idea is that the copy of tuples from the bloated table will, of course, copy only visible tuples (i.e., those that would be left after a <code class="language-plaintext highlighter-rouge">VACUUM</code>). In other words, dead tuples are not going to hit the new table and therefore the latter will not be bloated.
<br />
<br />
What about locking?
<br />
Since the copy is done on-line, the source table (the bloated one) can be used as usual, that is DML queries can be executed against such table. This of course creates a kind of <em>race-condition</em>, since changes are not propagated automatically to the new table.
<br />
To solve the problem, <code class="language-plaintext highlighter-rouge">pg_repack</code> <em>installs</em> a trigger that will fire for every DML statement and will <em>log</em> changes to a <code class="language-plaintext highlighter-rouge">repack.log</code> table, so that <code class="language-plaintext highlighter-rouge">pg_repack</code> will be able to replay changes at the end of the copy, that is just before switching the tables.
<br />
This is important, according to me, <strong>because this means that running <code class="language-plaintext highlighter-rouge">pg_repack</code> is not the same as running <code class="language-plaintext highlighter-rouge">VACUUM</code> since the new table could have a small fraction of bloating</strong>. Why? Well, if during the copy the original table is subjected to a workload that can cause bloating (i.e., <code class="language-plaintext highlighter-rouge">UPDATE</code> and <code class="language-plaintext highlighter-rouge">DELETE</code>), such bloating will be propagated to the new table as well.
<br />
<br />
What about disk space?
<br />
Doing a copy of the original table, <code class="language-plaintext highlighter-rouge">pg_repack</code> is going to require at least double the size of the original table on disk.</p>
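<p>Since the copy needs roughly as much free space as the original table, it can be worth checking the sizes before launching a repack. The following is only a sketch of such a check, not something <code class="language-plaintext highlighter-rouge">pg_repack</code> runs for you: it uses the <code class="language-plaintext highlighter-rouge">luca.wa</code> table from the examples in this article and the standard statistics view, whose tuple counters are estimates.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- on-disk size of the table, including its indexes and TOAST data
SELECT pg_size_pretty( pg_total_relation_size( 'luca.wa' ) ) AS total_size;

-- estimated live and dead tuples, as tracked by the cumulative statistics
SELECT n_live_tup, n_dead_tup
FROM   pg_stat_user_tables
WHERE  schemaname = 'luca'
AND    relname    = 'wa';
</code></pre></div></div>
<p>If the number of dead tuples is small compared to the live ones, the repack is probably not worth the extra disk space.</p>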
<h2 id="installing-pg_repack">Installing <code class="language-plaintext highlighter-rouge">pg_repack</code></h2>
<p><code class="language-plaintext highlighter-rouge">pg_repack</code> is an extension, and therefore can be installed via <code class="language-plaintext highlighter-rouge">pgxn</code> (as well as manually, of course):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>pgxn <span class="nb">install </span>pg_repack
</code></pre></div></div>
<p><br />
<br />
Then of course, you need to install the extension into the database you are going to use (or repack):</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"CREATE EXTENSION pg_repack;"</span> testdb
CREATE EXTENSION
</code></pre></div></div>
<p><br />
<br /></p>
<h2 id="using-pg_repack">Using <code class="language-plaintext highlighter-rouge">pg_repack</code></h2>
<p><code class="language-plaintext highlighter-rouge">pg_repack</code> must be invoked from the command line as an external utility. The command accepts pretty much all the usual arguments from <em>libpq</em>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_repack <span class="nt">-t</span> <span class="s2">"luca.wa"</span> <span class="nt">-U</span> postgres testdb
INFO: repacking table <span class="s2">"luca.wa"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The above will repack a single table, but it is possible to repack all tables in a schema, all tables in a database and so on.</p>
<h2 id="the-repack-schema">The <code class="language-plaintext highlighter-rouge">repack</code> schema</h2>
<p><code class="language-plaintext highlighter-rouge">pg_repack</code> installs a <code class="language-plaintext highlighter-rouge">repack</code> schema in the database where the extension lives. This schema contains several objects, mostly <em>temporary</em> ones used while repacking. An interesting one is <code class="language-plaintext highlighter-rouge">repack.tables</code>, which contains all the details for every table that can be repacked. Querying it reveals some of the <em>tricks</em> used in the workflow of <code class="language-plaintext highlighter-rouge">pg_repack</code>:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">create_log</span><span class="p">,</span> <span class="n">create_trigger</span><span class="p">,</span> <span class="n">lock_table</span>
<span class="k">from</span> <span class="n">repack</span><span class="p">.</span><span class="n">tables</span>
<span class="k">where</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'luca.wa'</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">create_log</span> <span class="o">|</span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">repack</span><span class="p">.</span><span class="n">log_16553</span>
<span class="p">(</span><span class="n">id</span> <span class="n">bigserial</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">,</span> <span class="n">pk</span> <span class="n">repack</span><span class="p">.</span><span class="n">pk_16553</span><span class="p">,</span> <span class="k">row</span> <span class="n">luca</span><span class="p">.</span><span class="n">wa</span><span class="p">)</span>
<span class="n">create_trigger</span> <span class="o">|</span> <span class="k">CREATE</span> <span class="k">TRIGGER</span> <span class="n">repack_trigger</span>
<span class="k">AFTER</span> <span class="k">INSERT</span> <span class="k">OR</span> <span class="k">DELETE</span> <span class="k">OR</span> <span class="k">UPDATE</span> <span class="k">ON</span> <span class="n">luca</span><span class="p">.</span><span class="n">wa</span>
<span class="k">FOR</span> <span class="k">EACH</span> <span class="k">ROW</span> <span class="k">EXECUTE</span> <span class="k">PROCEDURE</span>
<span class="n">repack</span><span class="p">.</span><span class="n">repack_trigger</span><span class="p">(</span><span class="s1">'INSERT INTO repack.log_16553(pk, row)
VALUES( CASE WHEN $1 IS NULL THEN NULL
ELSE (ROW($1.pk)::repack.pk_16553) END, $2)'</span><span class="p">)</span>
<span class="n">lock_table</span> <span class="o">|</span> <span class="k">LOCK</span> <span class="k">TABLE</span> <span class="n">luca</span><span class="p">.</span><span class="n">wa</span> <span class="k">IN</span> <span class="k">ACCESS</span> <span class="k">EXCLUSIVE</span> <span class="k">MODE</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, there are SQL instructions to create a <code class="language-plaintext highlighter-rouge">log_xxx</code> table where changed tuples will be logged, as well as the definition of the trigger to attach to the table.
<br />
The <code class="language-plaintext highlighter-rouge">repack_trigger</code> is a C function that accepts an SQL string (as you can see) and that will execute an insert into the <code class="language-plaintext highlighter-rouge">log_xxx</code> table so that:</p>
<ul>
<li>in case of an <code class="language-plaintext highlighter-rouge">INSERT</code> the new tuple will be inserted as <code class="language-plaintext highlighter-rouge">(null, row)</code>;</li>
<li>in case of an <code class="language-plaintext highlighter-rouge">UPDATE</code> both the new and old tuples will be inserted as <code class="language-plaintext highlighter-rouge">(old, new)</code>;</li>
<li>in case of a <code class="language-plaintext highlighter-rouge">DELETE</code> the old tuple only will be inserted as <code class="language-plaintext highlighter-rouge">(old, null)</code>.</li>
</ul>
<p><br />
<br />
The <code class="language-plaintext highlighter-rouge">lock_table</code> is used to lock the table during the initial and final steps, that is at the time the trigger is attached and when the tables are swapped.</p>
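<p>The three cases listed above can be told apart by looking at which column of the log table is null. The query below is just an illustrative sketch: it assumes the <code class="language-plaintext highlighter-rouge">repack.log_16553</code> table from the output above, which exists only while a repack of that table is in progress, and the <code class="language-plaintext highlighter-rouge">logged_operation</code> alias is made up for the example.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- classify every logged change by inspecting the (pk, row) pair
SELECT id
     , CASE
         WHEN pk IS NULL    THEN 'INSERT'  -- logged as (null, row)
         WHEN "row" IS NULL THEN 'DELETE'  -- logged as (old, null)
         ELSE                    'UPDATE'  -- logged as (old, new)
       END AS logged_operation
FROM   repack.log_16553
ORDER  BY id;
</code></pre></div></div>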
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">pg_repack</code> is surely an interesting extension to keep in your toolbox. In the future I’m going to spend some time using this extension to see how it performs, but I already know there are happy people using it, so I’m expecting positive results!</p>
Physical Backup Privileges Check2021-03-25T00:00:00+00:00https://fluca1978.github.io/2021/03/25/PostgreSQLBackupUserPrivileges<p>A simple view to see if a user can perform backups.</p>
<h1 id="physical-backup-privileges-check">Physical Backup Privileges Check</h1>
<p>In order to perform a physical backup, PostgreSQL requires a role that is allowed to perform several operations, mainly invoking the <code class="language-plaintext highlighter-rouge">pg_start_backup()</code> and <code class="language-plaintext highlighter-rouge">pg_stop_backup()</code> functions.
<br />
On old PostgreSQL versions, before version 10, only superusers could invoke the above backup functions and, therefore, do a physical backup. Since PostgreSQL 10 things have changed, and nowadays there are more fine-grained permissions. In particular, there are a few <a href="https://www.postgresql.org/docs/12/default-roles.html" target="_blank">default roles</a> that can be used to set up a <em>backup role</em>.
<br />
This is what I usually do: since working with a superuser role can be dangerous, I create <em>low profile</em> roles and assign them the required privileges.
<br />
<br />
Depending on the backup solution you are going to implement, these privileges can be different, so I decided to create a <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/backup_privileges.sql" target="_blank">view</a> that helps inspect the status of the roles available in the database.
<br />
<br />
The view works as follows:</p>
<ul>
<li>extract a set of flags;</li>
<li>merge with logical <code class="language-plaintext highlighter-rouge">AND</code> and <code class="language-plaintext highlighter-rouge">OR</code>;</li>
<li>provide some additional flags.</li>
</ul>
<p><br />
<br />
Using the view results in something like the following:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">backupdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">vw_role_backup_privileges</span>
<span class="k">WHERE</span> <span class="n">rolname</span> <span class="k">IN</span> <span class="p">(</span> <span class="s1">'luca'</span><span class="p">,</span> <span class="s1">'backup'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">------------|-------</span>
<span class="n">rolname</span> <span class="o">|</span> <span class="n">backup</span>
<span class="n">can_do_backup</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">can_monitor_backup</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">can_create_restore_point</span> <span class="o">|</span> <span class="n">t</span>
<span class="n">can_switch_wal</span> <span class="o">|</span> <span class="n">t</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="c1">------------|-------</span>
<span class="n">rolname</span> <span class="o">|</span> <span class="n">luca</span>
<span class="n">can_do_backup</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">can_monitor_backup</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">can_create_restore_point</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">can_switch_wal</span> <span class="o">|</span> <span class="n">f</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, the <code class="language-plaintext highlighter-rouge">backup</code> role, even if not a superuser, can do backups, can monitor them, can create restore points and force a WAL switch.
<br />
Of course, the above is not a <em>one size fits all</em> solution, since every backup solution could require different permissions; however, it is a possible starting point to check the status of the users.
<br />
<br />
Below I describe the individual pieces of the view.</p>
<h2 id="what-is-required-to-do-a-physical-backup">What is required to do a physical backup?</h2>
<p>The minimal set of privileges required to perform a backup is:</p>
<ul>
<li>permission to start a replication;</li>
<li>permission to invoke <code class="language-plaintext highlighter-rouge">pg_start_backup()</code>, <code class="language-plaintext highlighter-rouge">pg_stop_backup()</code>.</li>
</ul>
<p><br />
This is done by the part:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">f</span><span class="p">.</span><span class="n">can_start_replication</span>
<span class="k">AND</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_start_backup</span>
<span class="k">AND</span> <span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_stop_backup</span> <span class="k">OR</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_stop_backup_exclusive</span> <span class="p">)</span>
</code></pre></div></div>
<p><br />
<br />
where I check the above requirements.</p>
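<p>As a sketch of how a role could be given exactly these privileges, the statements below create a non-superuser <code class="language-plaintext highlighter-rouge">backup</code> role (the name used in the view output above). The function signatures are those checked by the view and are valid for the PostgreSQL versions discussed here; password and authentication setup are deliberately left out.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- a low profile role that is allowed to start a replication connection
CREATE ROLE backup WITH LOGIN REPLICATION;

-- allow the role to start and stop a (non-exclusive) backup
GRANT EXECUTE ON FUNCTION pg_start_backup( text, boolean, boolean ) TO backup;
GRANT EXECUTE ON FUNCTION pg_stop_backup( boolean, boolean ) TO backup;
</code></pre></div></div>
<p>Note that starting from PostgreSQL 15 these functions have been renamed to <code class="language-plaintext highlighter-rouge">pg_backup_start()</code> and <code class="language-plaintext highlighter-rouge">pg_backup_stop()</code>, so the grants need to be adapted on newer versions.</p>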
<h2 id="additinal-requirements">Additional requirements</h2>
<p>Being able to start and stop a backup may not suffice: the user could also be required to monitor the backup. Monitoring always means being able to query the statistics data and the configuration of the cluster. The former can be used to see if the replication is working fine, while the latter to check the archiving setup.
<br />
PostgreSQL provides a <code class="language-plaintext highlighter-rouge">pg_monitor</code> role that can run the above queries; otherwise the user needs two different roles, namely <code class="language-plaintext highlighter-rouge">pg_read_all_settings</code> and <code class="language-plaintext highlighter-rouge">pg_read_all_stats</code>. Since <code class="language-plaintext highlighter-rouge">pg_monitor</code> includes the above two roles, assigning <code class="language-plaintext highlighter-rouge">pg_monitor</code> is equivalent to assigning the latter two roles.
It could also be required to be able to invoke the <code class="language-plaintext highlighter-rouge">pg_is_in_backup()</code> function, which indicates whether the cluster is currently in physical backup mode.
<br />
This means that I need to check:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">f</span><span class="p">.</span><span class="n">pg_monitor</span>
<span class="k">OR</span> <span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_read_all_settings</span>
<span class="k">AND</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_read_all_stats</span>
<span class="k">AND</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_is_in_backup</span>
<span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
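<p>A quick way to see the relationship between <code class="language-plaintext highlighter-rouge">pg_monitor</code> and the two finer-grained roles is to grant only the former and then test the latter. This is a minimal sketch that assumes a <code class="language-plaintext highlighter-rouge">backup</code> role already exists and inherits privileges, which is the default for newly created roles.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- grant only the umbrella monitoring role
GRANT pg_monitor TO backup;

-- both flags are expected to be true, because pg_monitor
-- is itself a member of the two roles being tested
SELECT pg_has_role( 'backup', 'pg_read_all_settings', 'USAGE' ) AS read_settings
     , pg_has_role( 'backup', 'pg_read_all_stats', 'USAGE' )    AS read_stats;
</code></pre></div></div>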
<h2 id="switch-wals-and-create-restore-points">Switch WALs and create restore points</h2>
<p>Starting a backup could also require the user to issue an immediate switch of the WALs in order to quickly start the backup.
<br />
Moreover, it could be required to create a restore point, for example to mark in the WALs that the backup has started at a specific point in time.
<br />
This means that the check is:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">f</span><span class="p">.</span><span class="n">pg_create_restore_point</span> <span class="k">AS</span> <span class="n">can_create_restore_point</span>
<span class="p">,</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_switch_wal</span> <span class="k">AS</span> <span class="n">can_switch_wal</span>
</code></pre></div></div>
<p><br />
<br /></p>
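<p>Both functions are restricted to superusers by default, but can be granted individually. The following sketch assumes a <code class="language-plaintext highlighter-rouge">backup</code> role already exists; the restore point name <code class="language-plaintext highlighter-rouge">backup_started</code> is an arbitrary label chosen for the example.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- executed by a superuser
GRANT EXECUTE ON FUNCTION pg_switch_wal() TO backup;
GRANT EXECUTE ON FUNCTION pg_create_restore_point( text ) TO backup;

-- executed later by the backup role
SELECT pg_create_restore_point( 'backup_started' );
SELECT pg_switch_wal();
</code></pre></div></div>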
<h1 id="putting-everything-together">Putting everything together</h1>
<p>Having stated the above list of requirements, the query can be split into two parts:</p>
<ul>
<li>a CTE that extracts the flags;</li>
<li>a query that composes the flags.</li>
</ul>
<p>To extract the flags, the following CTE can be used:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">WITH</span> <span class="n">flags</span> <span class="k">AS</span> <span class="p">(</span>
<span class="k">SELECT</span>
<span class="n">a</span><span class="p">.</span><span class="n">rolname</span>
<span class="p">,</span> <span class="n">a</span><span class="p">.</span><span class="n">rolsuper</span> <span class="k">AS</span> <span class="n">is_superuser</span>
<span class="p">,</span> <span class="n">a</span><span class="p">.</span><span class="n">rolreplication</span> <span class="k">AS</span> <span class="n">can_start_replication</span>
<span class="p">,</span> <span class="n">pg_has_role</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_monitor'</span><span class="p">,</span> <span class="s1">'USAGE'</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">pg_monitor</span>
<span class="p">,</span> <span class="n">pg_has_role</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_read_all_settings'</span><span class="p">,</span> <span class="s1">'USAGE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_read_all_settings</span>
<span class="p">,</span> <span class="n">pg_has_role</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_read_all_stats'</span><span class="p">,</span> <span class="s1">'USAGE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_read_all_stats</span>
<span class="p">,</span> <span class="n">has_function_privilege</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_start_backup( text, bool, bool )'</span><span class="p">,</span> <span class="s1">'EXECUTE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_start_backup</span>
<span class="p">,</span> <span class="n">has_function_privilege</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_stop_backup( bool, bool )'</span><span class="p">,</span> <span class="s1">'EXECUTE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_stop_backup</span>
<span class="p">,</span> <span class="n">has_function_privilege</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_stop_backup()'</span><span class="p">,</span> <span class="s1">'EXECUTE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_stop_backup_exclusive</span>
<span class="p">,</span> <span class="n">has_function_privilege</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_create_restore_point( text )'</span><span class="p">,</span> <span class="s1">'EXECUTE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_create_restore_point</span>
<span class="p">,</span> <span class="n">has_function_privilege</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_is_in_backup()'</span><span class="p">,</span> <span class="s1">'EXECUTE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_is_in_backup</span>
<span class="p">,</span> <span class="n">has_function_privilege</span><span class="p">(</span> <span class="n">a</span><span class="p">.</span><span class="n">rolname</span><span class="p">,</span> <span class="s1">'pg_switch_wal()'</span><span class="p">,</span> <span class="s1">'EXECUTE'</span> <span class="p">)</span> <span class="k">as</span> <span class="n">pg_switch_wal</span>
<span class="k">FROM</span>
<span class="c1">-- use pg_roles instead of pg_authid</span>
<span class="c1">-- to allow non-superuser roles to query</span>
<span class="n">pg_roles</span> <span class="n">a</span>
<span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>I query <code class="language-plaintext highlighter-rouge">pg_roles</code>, which contains the same information found in <code class="language-plaintext highlighter-rouge">pg_authid</code> (with the password blanked out) but does not require superuser privileges to be queried.
<br />
Please note that I check role group membership with the <code class="language-plaintext highlighter-rouge">USAGE</code> privilege, which means that the role does not have to issue an explicit <code class="language-plaintext highlighter-rouge">SET ROLE</code> to gain access to the privileges of the group it belongs to; that is, it has been created <code class="language-plaintext highlighter-rouge">WITH INHERIT</code>.
<br />
<br />
Then, composing the flags is as simple as:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">f</span><span class="p">.</span><span class="n">rolname</span>
<span class="p">,</span> <span class="n">f</span><span class="p">.</span><span class="n">is_superuser</span>
<span class="k">OR</span> <span class="p">(</span>
<span class="n">f</span><span class="p">.</span><span class="n">can_start_replication</span>
<span class="k">AND</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_start_backup</span>
<span class="k">AND</span> <span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_stop_backup</span> <span class="k">OR</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_stop_backup_exclusive</span> <span class="p">)</span>
<span class="p">)</span> <span class="k">AS</span> <span class="n">can_do_backup</span>
<span class="p">,</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_monitor</span>
<span class="k">OR</span> <span class="p">(</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_read_all_settings</span>
<span class="k">AND</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_read_all_stats</span>
<span class="k">AND</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_is_in_backup</span>
<span class="p">)</span> <span class="k">AS</span> <span class="n">can_monitor_backup</span>
<span class="p">,</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_create_restore_point</span> <span class="k">AS</span> <span class="n">can_create_restore_point</span>
<span class="p">,</span> <span class="n">f</span><span class="p">.</span><span class="n">pg_switch_wal</span> <span class="k">AS</span> <span class="n">can_switch_wal</span>
<span class="k">FROM</span> <span class="n">flags</span> <span class="n">f</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
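<p>Once the two parts are wrapped into the <code class="language-plaintext highlighter-rouge">vw_role_backup_privileges</code> view, it can be filtered like any other relation. As a sketch, the query below lists the roles that are able to perform a backup but not to monitor it, skipping the predefined <code class="language-plaintext highlighter-rouge">pg_</code> roles; the filter is just an example of what can be asked to the view.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- roles that can back up the cluster but cannot monitor the backup
SELECT rolname
FROM   vw_role_backup_privileges
WHERE  can_do_backup
AND    NOT can_monitor_backup
AND    rolname !~ '^pg_';
</code></pre></div></div>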
Managing Multiple PostgreSQL Instances on FreeBSD2021-03-22T00:00:00+00:00https://fluca1978.github.io/2021/03/22/PostgreSQLFreeBSDMultiInstance<p>FreeBSD <code class="language-plaintext highlighter-rouge">service(8)</code> is a fully featured system to manage services, and allows multiple instances of PostgreSQL.</p>
<h1 id="managing-multiple-postgresql-instances-on-freebsd">Managing Multiple PostgreSQL Instances on FreeBSD</h1>
<p>FreeBSD allows the management of multiple instances of PostgreSQL by means of <code class="language-plaintext highlighter-rouge">rc.conf(5)</code>.
<br />
The trick is to use <strong>profiles</strong>, that are available for the PostgreSQL rc script (<code class="language-plaintext highlighter-rouge">/usr/local/etc/rc.d/postgresql</code>) even if not well documented, at least in my opinion.
<br />
In order to understand how to deal with multiple PostgreSQL instances, consider a system with two clusters: <em>test</em> and <em>prod</em>.
<br />
In <code class="language-plaintext highlighter-rouge">/etc/rc.conf</code> you need to define the <code class="language-plaintext highlighter-rouge">postgresql_profiles</code> variable, where you list the clusters separated by spaces. Then, for each profile, you define the well-known <code class="language-plaintext highlighter-rouge">postgresql_xxx</code> variables, specifying the profile name before the variable suffix. For example, to define a <code class="language-plaintext highlighter-rouge">PGDATA</code>, which is usually set through the <code class="language-plaintext highlighter-rouge">postgresql_data</code> variable, you need to specify a <code class="language-plaintext highlighter-rouge">postgresql_&lt;profile-name&gt;_data</code> variable.
<br />
Therefore, in <code class="language-plaintext highlighter-rouge">/etc/rc.conf</code> you need to specify the following:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">postgresql_profiles</span><span class="o">=</span><span class="s2">"test prod"</span>
<span class="nv">postgresql_test_data</span><span class="o">=</span><span class="s2">"/postgres/12/test"</span>
<span class="nv">postgresql_test_enable</span><span class="o">=</span><span class="s2">"YES"</span>
<span class="nv">postgresql_prod_data</span><span class="o">=</span><span class="s2">"/postgres/12/prod"</span>
<span class="nv">postgresql_prod_enable</span><span class="o">=</span><span class="s2">"YES"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>You can now manage each instance by specifying the profile name on every <code class="language-plaintext highlighter-rouge">service(8)</code> call:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>service postgresql start <span class="nb">test</span>
% <span class="nb">sudo </span>service postgresql status <span class="nb">test
</span>pg_ctl: server is running <span class="o">(</span>PID: 35979<span class="o">)</span>
/usr/local/bin/postgres <span class="s2">"-D"</span> <span class="s2">"/postgres/12/test"</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>You need to specify the profile name as the last argument of the <code class="language-plaintext highlighter-rouge">service(8)</code> invocation</strong>.
<br />
But there is more: <strong>if you don’t specify any profile on the command line, <code class="language-plaintext highlighter-rouge">service(8)</code> will iterate over all the available profiles</strong>. As an example, the following two sequences are equivalent:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>service postgresql stop
<span class="o">===></span> postgresql profile: <span class="nb">test</span>
<span class="o">===></span> postgresql profile: prod
<span class="c"># equivalent to</span>
% <span class="nb">sudo </span>service postgresql stop <span class="nb">test</span>
% <span class="nb">sudo </span>service postgresql stop prod
</code></pre></div></div>
<p><br />
<br /></p>
<p>With this simple profile-based management, it is easy to handle and manage multiple PostgreSQL instances on the same FreeBSD host.</p>
PgTraining online webinar 2021-03-12 (Italian): video available2021-03-19T00:00:00+00:00https://fluca1978.github.io/2021/03/19/PGTrainingOnlineEventVideoAvailable<p>An online event organized by PgTraining.</p>
<h1 id="pgtraining-online-webinar--2021-03-12-italian-video-available">PgTraining online webinar 2021-03-12 (Italian): video available</h1>
<p><a href="http://pgtraining.com" target="_blank">PgTraining</a>, the amazing Italian group of people who spread the word about PostgreSQL and that I joined in recent years, organized an online <strong>free event</strong> (<em>webinar</em>) on March 12th, 2021.
<br />
There were around 45 participants, which was quite a success in our opinion for such a small event. Most notably, the participants were really interested in the topics covered and, in fact, we got a lot of live questions and were unable to close the session on time since the discussion was really active!
<br />
We made video recordings of the talks available, or better <em>off line video recordings</em>, because we decided not to record the live event due to privacy concerns. All the videos are in <em>Italian</em>.
Here is my talk, <strong>pgbackrest as a backup solution</strong>:
<br />
<br /></p>
<iframe src="https://player.vimeo.com/video/523432651" width="640" height="360" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen=""></iframe>
<p><a href="https://vimeo.com/523432651">PgTraining online event 2021-03-12: pgBackRest</a> from <a href="https://vimeo.com/user10626375">Pg Training</a> on <a href="https://vimeo.com">Vimeo</a>.</p>
<p><br />
<br /></p>
<p>There is also the video about <strong>sharding</strong>, made by Enrico, that you can watch here:</p>
<p><br />
<br /></p>
<iframe src="https://player.vimeo.com/video/520841922" width="640" height="360" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen=""></iframe>
<p><a href="https://vimeo.com/520841922">PgTraining online session 2021-03 - Sharding</a> from <a href="https://vimeo.com/user10626375">Pg Training</a> on <a href="https://vimeo.com">Vimeo</a>.</p>
<p><br />
<br /></p>
<p>And finally, the video about the <strong>JIT compiler</strong> made by Chris, that you can watch here:
<br />
<br /></p>
<iframe src="https://player.vimeo.com/video/525756192" width="640" height="360" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen=""></iframe>
<p><a href="https://vimeo.com/525756192">PgTraining online event 2021-03-12: Esperimenti con il JIT compiler di Postgres</a> from <a href="https://vimeo.com/user10626375">Pg Training</a> on <a href="https://vimeo.com">Vimeo</a>.</p>
<p><br />
<br /></p>
<p>Slides for each talk are already available via the <a href="https://gitlab.com/pgtraining/slides/-/tree/master" target="_blank">GitLab repository of the event</a> (<em>Italian</em>).</p>
PgTraining online webinar on 2021-03-12 (Italian)2021-02-13T00:00:00+00:00https://fluca1978.github.io/2021/02/13/PgTrainingOnlineEvent<p>An online event organized by PgTraining.</p>
<h1 id="pgtraining-online-webinar-on-2021-03-12-italian">PgTraining online webinar on 2021-03-12 (Italian)</h1>
<p><a href="http://pgtraining.com" target="_blank">PgTraining</a>, the amazing Italian group of people who spread the word about PostgreSQL and that I joined in recent years, is organizing an online event (<em>webinar</em>) on March 12th, 2021.
<br /></p>
<p><br />
The event will consist of three hours of talks about <strong>sharding</strong>, <strong>backup using PgBackrest</strong> and <strong>the JIT compiler</strong>.
<br /> The webinar will be in Italian and there will be room for questions and discussion at the end of every single talk.
<br />
<br />
There are only <strong>40 available seats</strong> for the event, which <strong>is totally free of charge</strong>, so <a href="https://www.eventbrite.it/e/biglietti-pgtraining-on-line-session-2021-03-141534091277" target="_blank">hurry up and register for the event</a>.</p>
PostgreSQL TOAST Data Corruption (ERROR: unexpected chunk number)2021-02-08T00:00:00+00:00https://fluca1978.github.io/2021/02/08/PostgreSQLToastCorruption<p>I wrote a simple function to test for corrupted TOAST data.</p>
<h1 id="postgresql-toast-data-corruption-error-unexpected-chunk-number">PostgreSQL TOAST Data Corruption (ERROR: unexpected chunk number)</h1>
<p><strong>T</strong>he <strong>O</strong>versized-<strong>A</strong>ttribute <strong>S</strong>torage <strong>T</strong>echnique (TOAST) is the mechanism that allows PostgreSQL to store very large attribute values in a table.
<br />
PostgreSQL stores data into data pages that have a fixed size, usually <code class="language-plaintext highlighter-rouge">8 kB</code>; this means there is no room for variable-length content (e.g., a string) that grows beyond a single data page. To solve the problem, PostgreSQL uses TOAST: when an attribute value is too large to be stored in the table data page, PostgreSQL <strong>transparently</strong> moves the content to an external storage, namely <code class="language-plaintext highlighter-rouge">pg_toast</code>, where the content is split into <em>chunks</em> (parts) and stored as a set of chunk tuples.
When you ask for your content back, PostgreSQL transparently seeks the chunks, recomposes them in the right order, and provides the result to you. It is as if the system executed a transparent join between your main table and the <code class="language-plaintext highlighter-rouge">pg_toast</code> one.</p>
<p><br />
<br />
Unluckily, the TOAST storage can sometimes be damaged, often by accident, resulting in data corruption. The problem is that such corruption often goes <em>unseen</em> until the real content is required: in other words, your table looks fine unless you select the exact content that has been stored <em>out-of-line</em> in TOAST.
<br />
<br />
In this article I introduce a couple of functions that can serve as a basis for finding damaged TOAST data.
<br />
I’ve written these functions to do exactly that job: help me identify the records that have been damaged, so that I can decide how to restore them (and here you can insert any good advice about backups you wish).
<br />
<br />
This article is divided into two parts:</p>
<ul>
<li>the first one creates an example and damages it on purpose, so that you can try the code;</li>
<li>the second part shows how to use the functions and get some results.</li>
</ul>
<p><br />
<br />
The code of the functions can be found online <a href="https://gitlab.com/fluca1978/fluca1978-pg-utils/-/blob/master/examples/toast/find_bad_toast.sql" target="_blank">in my GitLab repository</a>. As usual, comments and improvements are appreciated.
<br />
Inspiration for this technique comes <a href="http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html" target="_blank">from Josh Berkus’ excellent article</a>.</p>
<h1 id="create-an-example-corrupt-your-toast-data">Create an example: corrupt your TOAST data</h1>
<p>Assume we create the following table within our database:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">create</span> <span class="k">table</span> <span class="n">example_toast</span><span class="p">(</span> <span class="n">a</span> <span class="nb">int</span><span class="p">,</span> <span class="n">b</span> <span class="nb">text</span><span class="p">,</span> <span class="k">c</span> <span class="nb">float</span><span class="p">,</span> <span class="n">d</span> <span class="nb">varchar</span><span class="p">(</span><span class="mi">10000</span><span class="p">)</span> <span class="p">);</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">example_toast</span> <span class="k">add</span> <span class="k">column</span> <span class="n">pk</span> <span class="nb">serial</span> <span class="k">primary</span> <span class="k">key</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">insert</span> <span class="k">into</span> <span class="n">example_toast</span>
<span class="k">select</span> <span class="n">x</span><span class="p">,</span> <span class="n">repeat</span><span class="p">(</span> <span class="s1">'fluca1978'</span><span class="p">,</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">8000</span> <span class="p">),</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">1</span><span class="p">.</span><span class="mi">2</span><span class="p">,</span>
<span class="n">repeat</span><span class="p">(</span> <span class="s1">'fluca1978'</span><span class="p">,</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">10</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">210</span> <span class="p">)</span> <span class="n">x</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'example_toast'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">8192</span> <span class="n">bytes</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The table has been filled with four different types of data, each initialized with a different value; in particular, the text columns have been initialized with <em>long</em> contents.
The main table turns out to be very small and occupies exactly one data page, since <code class="language-plaintext highlighter-rouge">pg_relation_size</code> does not count the TOAST storage.
<br />
<strong>Does the table have any TOAST-ed data?</strong> We can check that <code class="language-plaintext highlighter-rouge">reltoastrelid</code> has a value:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">relname</span><span class="p">,</span> <span class="n">relfilenode</span><span class="p">,</span> <span class="n">reltoastrelid</span> <span class="k">from</span> <span class="n">pg_class</span> <span class="k">where</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span> <span class="k">and</span> <span class="n">oid</span> <span class="o">=</span> <span class="s1">'example_toast'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">relfilenode</span> <span class="o">|</span> <span class="n">reltoastrelid</span>
<span class="c1">---------------|-------------|---------------</span>
<span class="n">example_toast</span> <span class="o">|</span> <span class="mi">52367</span> <span class="o">|</span> <span class="mi">44178</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore, the table has an associated TOAST table, whose OID is <code class="language-plaintext highlighter-rouge">44178</code>.</p>
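<p>To see where the long values actually ended up, the size of the main relation can be compared with the size of its TOAST relation. The following is only a sketch run against the <code class="language-plaintext highlighter-rouge">example_toast</code> table created above; the column aliases are my own choice, and the sizes you get will depend on your installation:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- a sketch: the aliases main_size, toast_size and total_size are arbitrary
-- main_size:  the heap of the table itself
-- toast_size: the heap of the associated TOAST table
-- total_size: what pg_table_size() reports, TOAST storage included
SELECT pg_size_pretty( pg_relation_size( c.oid ) )           AS main_size,
       pg_size_pretty( pg_relation_size( c.reltoastrelid ) ) AS toast_size,
       pg_size_pretty( pg_table_size( c.oid ) )              AS total_size
FROM   pg_class c
WHERE  c.oid = 'example_toast'::regclass;
</code></pre></div></div>
<p>If the TOAST size is much larger than the main size, most of the content lives out of the main table, and that is exactly the storage we are going to damage.</p>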
<h2 id="its-time-for-a-corruption">It’s time for a corruption!</h2>
<p>In order to make the toasted data faulty, we can use an old Perl script of mine that writes a garbage string into a data file. The script is really simple, as you can see:</p>
<p><br />
<br /></p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env perl</span>
<span class="k">use</span> <span class="nn">strict</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">warnings</span><span class="p">;</span>
<span class="c1"># open the data file read/write; 'or' (not '||') makes die fire when open fails</span>
<span class="nb">open</span> <span class="k">my</span> <span class="nv">$db_file</span><span class="p">,</span> <span class="p">"</span><span class="s2">+<</span><span class="p">",</span> <span class="nv">$ARGV</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span>
<span class="ow">or</span> <span class="nb">die</span> <span class="p">"</span><span class="s2">Cannot open data file!</span><span class="se">\n\n</span><span class="p">";</span>
<span class="c1"># skip the first 8 kB page, then move to the offset given as second argument</span>
<span class="nb">seek</span> <span class="nv">$db_file</span><span class="p">,</span> <span class="p">(</span> <span class="mi">8</span> <span class="o">*</span> <span class="mi">1024</span> <span class="p">)</span> <span class="o">+</span> <span class="nv">$ARGV</span><span class="p">[</span> <span class="mi">1</span> <span class="p">],</span> <span class="mi">0</span><span class="p">;</span>
<span class="c1"># overwrite whatever is stored at that position</span>
<span class="k">print</span> <span class="p">{</span> <span class="nv">$db_file</span> <span class="p">}</span> <span class="p">"</span><span class="s2">Hello Corrupted Database!</span><span class="p">";</span>
<span class="nb">close</span> <span class="nv">$db_file</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>With the corruption script in place, we need to find the data file to corrupt: it is the TOAST table we are going to damage, and we can get the path to its disk file using the <code class="language-plaintext highlighter-rouge">pg_relation_filepath</code> function.</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">relname</span><span class="p">,</span> <span class="n">relfilenode</span><span class="p">,</span> <span class="n">reltoastrelid</span><span class="p">,</span>
<span class="n">pg_relation_filepath</span><span class="p">(</span> <span class="n">reltoastrelid</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">pg_class</span> <span class="k">where</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">and</span> <span class="n">oid</span> <span class="o">=</span> <span class="s1">'example_toast'</span><span class="p">::</span><span class="n">regclass</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">--------|-----------------</span>
<span class="n">relname</span> <span class="o">|</span> <span class="n">example_toast</span>
<span class="n">relfilenode</span> <span class="o">|</span> <span class="mi">52367</span>
<span class="n">reltoastrelid</span> <span class="o">|</span> <span class="mi">44178</span>
<span class="n">pg_relation_filepath</span> <span class="o">|</span> <span class="n">base</span><span class="o">/</span><span class="mi">24815</span><span class="o">/</span><span class="mi">52368</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>We can now corrupt the data in the data file <code class="language-plaintext highlighter-rouge">base/24815/52368</code> reported by <code class="language-plaintext highlighter-rouge">pg_relation_filepath</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres <span class="se">\</span>
perl /usr/local/bin/do_corruption.pl <span class="se">\</span>
/postgres/12/base/24815/52368 <span class="se">\</span>
12345
</code></pre></div></div>
<p><br />
<br /></p>
<p><strong>WARNING: don’t try this at home</strong>, or better, do try against a test-only database!</p>
<h2 id="find-out-the-corruption">Find out the corruption</h2>
<p>What happens if we query the table now? Well, we asked for a data corruption and we got it!</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">o</span> <span class="n">test</span><span class="p">.</span><span class="n">txt</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">b</span><span class="p">,</span><span class="n">d</span> <span class="k">from</span> <span class="n">example_toast</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">unexpected</span> <span class="n">chunk</span> <span class="n">number</span> <span class="mi">1126199148</span> <span class="p">(</span><span class="n">expected</span> <span class="mi">2</span><span class="p">)</span> <span class="k">for</span> <span class="k">toast</span> <span class="n">value</span> <span class="mi">60699</span> <span class="k">in</span> <span class="n">pg_toast_44175</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Please note that I sent the output of the query to a file, so that the huge result does not flood the blog post.</p>
<h1 id="searching-for-the-error-find-out-corrupted-toast-data">Searching for the error: find out corrupted TOAST data</h1>
<p>The data on the TOAST storage has been damaged, and <strong>it is now required to find out which tuples have been affected by the damage</strong> so that you can decide the right strategy for recovery of that data.
<br />
I have built a couple of functions that can help you find out the damaged tuples. Let’s see the final result and then allow me to discuss the details:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_find_bad_toast</span><span class="p">(</span> <span class="s1">'example_toast'</span><span class="p">,</span> <span class="s1">'pk'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">----|-------------------------------------------------------------</span>
<span class="n">total</span> <span class="o">|</span> <span class="mi">210</span>
<span class="n">ok</span> <span class="o">|</span> <span class="mi">207</span>
<span class="n">ko</span> <span class="o">|</span> <span class="mi">3</span>
<span class="n">health_ratio</span> <span class="o">|</span> <span class="mi">98</span><span class="p">.</span><span class="mi">57142857142857</span>
<span class="n">damage_ratio</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">4285714285714286</span>
<span class="n">description</span> <span class="o">|</span> <span class="k">Table</span> <span class="n">example_toast</span> <span class="n">has</span> <span class="mi">1</span><span class="p">.</span><span class="mi">4285714285714286</span><span class="o">%</span> <span class="k">toast</span>
<span class="k">data</span> <span class="n">damaged</span> <span class="p">(</span><span class="k">toast</span> <span class="n">relation</span> <span class="n">pg_toast</span><span class="p">.</span><span class="n">pg_toast_44175</span>
<span class="k">on</span> <span class="n">disk</span> <span class="n">file</span> <span class="p">[</span><span class="n">base</span><span class="o">/</span><span class="mi">24815</span><span class="o">/</span><span class="mi">85136</span><span class="p">])</span>
<span class="n">damage_tuple_ids</span> <span class="o">|</span> <span class="p">{</span><span class="mi">110</span><span class="p">,</span><span class="mi">111</span><span class="p">,</span><span class="mi">112</span><span class="p">}</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>We now have a report that tells us that 3 records out of the 210 total have been damaged: the records with <code class="language-plaintext highlighter-rouge">pk</code> ranging from 110 to 112 are the ones hit by the data corruption, and therefore their toast data is wrong. The good news is that more than 98% of our table is healthy.</p>
<p><br />
<br />
The function <code class="language-plaintext highlighter-rouge">f_find_bad_toast</code> accepts the table name and a column that must be unique (thus acting as a surrogate primary key). The function inspects every single record in the table and tries to de-toast its data. The final result is that, in this example, three tuples turn out to be corrupted.</p>
<p>The function <code class="language-plaintext highlighter-rouge">f_find_bad_toast</code> does the following:</p>
<ol>
<li>performs a few sanity checks, and <em>gets the list of TOASTable attributes of the table</em>;</li>
<li>prepares an SQL <code class="language-plaintext highlighter-rouge">SELECT</code> query that reads every single toastable attribute;</li>
<li>converts each toasted column into <code class="language-plaintext highlighter-rouge">text</code> and performs a few operations on that data, so as to force the <em>detoasting</em>;</li>
<li>if an exception arises, stores the primary key of the tuple to indicate that there is an error on that tuple.</li>
</ol>
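<p>The query preparation step listed above can be sketched in plain SQL. This is not the body of <code class="language-plaintext highlighter-rouge">f_find_bad_toast</code>, only a minimal reconstruction of mine that assumes the <code class="language-plaintext highlighter-rouge">example_toast</code> table and an arbitrary key value (161); the alias <code class="language-plaintext highlighter-rouge">query_detoast</code> is used just for readability:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- a sketch, not the actual function body: build the de-toasting
-- statement for the tuple with pk = 161 of example_toast
SELECT format( 'SELECT %s FROM %I WHERE %I = %L',
               -- one lower( column::text ) per toastable column, concatenated
               string_agg( format( 'lower( %I::text )', attname ),
                           ' || ' ORDER BY attnum ),
               'example_toast', 'pk', 161 ) AS query_detoast
FROM   pg_attribute
WHERE  attrelid = 'example_toast'::regclass
AND    attnum > 0                   -- skip system columns
AND    NOT attisdropped             -- skip dropped columns
AND    attstorage IN ( 'x', 'e' );  -- extended or external storage
</code></pre></div></div>
<p>Using <code class="language-plaintext highlighter-rouge">%I</code> and <code class="language-plaintext highlighter-rouge">%L</code> lets <code class="language-plaintext highlighter-rouge">format</code> take care of quoting identifiers and literals, which matters as soon as the table or column names are provided by the caller.</p>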
<p><br />
Internally, the function exploits another custom piece of code, <code class="language-plaintext highlighter-rouge">f_enumerate_toastable_columns</code>, which provides a list of the columns that could have been stored in TOAST.
As an example:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_enumerate_toastable_columns</span><span class="p">(</span> <span class="s1">'example_toast'</span> <span class="p">);</span>
<span class="n">f_enumerate_toastable_columns</span>
<span class="c1">-------------------------------</span>
<span class="n">b</span>
<span class="n">d</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, only variable-length columns can be stored in the TOAST area.</p>
<h2 id="how-f_find_bad_toast-works">How <code class="language-plaintext highlighter-rouge">f_find_bad_toast()</code> works</h2>
<p>You can get some hints on the internal working behavior by increasing the debug message level:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">set</span> <span class="n">client_min_messages</span> <span class="k">to</span> <span class="n">debug</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_find_bad_toast</span><span class="p">(</span> <span class="s1">'example_toast'</span><span class="p">,</span> <span class="s1">'pk'</span> <span class="p">);</span>
<span class="p">...</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Preparing</span> <span class="k">to</span> <span class="n">de</span><span class="o">-</span><span class="k">toast</span> <span class="n">record</span> <span class="n">pk</span> <span class="o">=</span> <span class="mi">161</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Prepared</span> <span class="n">query</span> <span class="p">[</span><span class="k">SELECT</span> <span class="k">lower</span><span class="p">(</span> <span class="n">b</span><span class="p">::</span><span class="nb">text</span> <span class="p">)</span> <span class="o">||</span> <span class="k">lower</span><span class="p">(</span> <span class="n">d</span><span class="p">::</span><span class="nb">text</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">example_toast</span> <span class="k">WHERE</span> <span class="n">pk</span> <span class="o">=</span> <span class="s1">'161'</span><span class="p">]</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Succesfully</span> <span class="n">executed</span> <span class="n">query</span> <span class="p">[</span><span class="k">SELECT</span> <span class="k">lower</span><span class="p">(</span> <span class="n">b</span><span class="p">::</span><span class="nb">text</span> <span class="p">)</span> <span class="o">||</span> <span class="k">lower</span><span class="p">(</span> <span class="n">d</span><span class="p">::</span><span class="nb">text</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">example_toast</span> <span class="k">WHERE</span> <span class="n">pk</span> <span class="o">=</span> <span class="s1">'161'</span><span class="p">]</span>
<span class="p">...</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, the function inspects <strong>one record at a time</strong> (i.e., it can be really slow on large tables!) and builds an appropriate query to de-toast the toastable data. Then the query is executed, the result is placed into a variable and the length of the result is computed; if this succeeds the data has been detoasted, otherwise there was a problem reading the toasted data.
In short, the function does:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">BEGIN</span>
<span class="k">EXECUTE</span> <span class="n">query_detoast</span>
<span class="k">INTO</span> <span class="n">current_detoasted_data</span><span class="p">;</span>
<span class="n">PERFORM</span> <span class="k">length</span><span class="p">(</span> <span class="n">current_detoasted_data</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Succesfully executed query [%]'</span><span class="p">,</span> <span class="n">query_detoast</span><span class="p">;</span>
<span class="n">ok_counter</span> <span class="o">=</span> <span class="n">ok_counter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">EXCEPTION</span>
<span class="k">WHEN</span> <span class="n">OTHERS</span> <span class="k">THEN</span>
<span class="n">ko_counter</span> <span class="o">=</span> <span class="n">ko_counter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">wrong_tuple_ids</span> <span class="o">=</span> <span class="n">array_append</span><span class="p">(</span> <span class="n">wrong_tuple_ids</span><span class="p">,</span> <span class="n">current_pk</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">NOTICE</span> <span class="s1">'Record with % = % of table % has corrupted toast data!'</span><span class="p">,</span> <span class="n">pk</span><span class="p">,</span> <span class="n">current_pk</span><span class="p">,</span> <span class="n">tablez</span><span class="p">;</span>
<span class="k">END</span><span class="p">;</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The <code class="language-plaintext highlighter-rouge">query_detoast</code> statement is built for every single record, for instance <code class="language-plaintext highlighter-rouge">SELECT lower( b::text ) || lower( d::text ) FROM example_toast WHERE pk = '161'</code>.</p>
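<p>If you want to try the per-record check without installing the functions, the same idea can be squeezed into an anonymous block. What follows is a simplified sketch of mine, hard-wired to the <code class="language-plaintext highlighter-rouge">example_toast</code> table and its <code class="language-plaintext highlighter-rouge">pk</code>, <code class="language-plaintext highlighter-rouge">b</code> and <code class="language-plaintext highlighter-rouge">d</code> columns; the variable names are hypothetical and it is not the code of <code class="language-plaintext highlighter-rouge">f_find_bad_toast</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- a simplified sketch hard-wired to example_toast( pk, b, d );
-- the real function builds the statement dynamically
DO $$
DECLARE
    current_pk  int;
    bad_pks     int[] := '{}';
BEGIN
    FOR current_pk IN SELECT pk FROM example_toast ORDER BY pk LOOP
        BEGIN
            -- lower() needs the whole value, hence it forces the de-toasting
            PERFORM length( lower( b::text ) || lower( d::text ) )
            FROM    example_toast
            WHERE   pk = current_pk;
        EXCEPTION
            WHEN OTHERS THEN
                -- only this inner block is rolled back, the loop goes on
                bad_pks := array_append( bad_pks, current_pk );
        END;
    END LOOP;
    RAISE NOTICE 'Tuples with damaged TOAST data: %', bad_pks;
END
$$;
</code></pre></div></div>
<p>The inner <code class="language-plaintext highlighter-rouge">BEGIN ... EXCEPTION ... END</code> block is what makes the approach work: a failure while reading one tuple is trapped, so the scan can continue with the next one. Catching <code class="language-plaintext highlighter-rouge">OTHERS</code> is a blunt choice, and any other error on the tuple would be counted as a damage too.</p>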
<h3 id="arguments">Arguments</h3>
<p>The <code class="language-plaintext highlighter-rouge">f_find_bad_toast</code> function accepts four arguments:</p>
<ul>
<li>the table name;</li>
<li>the surrogate primary key column name;</li>
<li>an optional limit clause, useful when inspecting very large tables;</li>
<li>an optional offset argument, useful when iterating over the same table.</li>
</ul>
<p>A possible improvement could be to automatically find out a table’s primary key, for example by inspecting the system catalogs.</p>
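<p>As a hint for that improvement, the primary key columns of a table can be found by joining <code class="language-plaintext highlighter-rouge">pg_index</code> and <code class="language-plaintext highlighter-rouge">pg_attribute</code>. This is a sketch against the example table and is not part of the functions discussed here:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- a sketch: list the column(s) that make up the primary key of example_toast
SELECT a.attname
FROM   pg_index i
JOIN   pg_attribute a ON  a.attrelid = i.indrelid
                      AND a.attnum   = ANY( i.indkey )
WHERE  i.indrelid = 'example_toast'::regclass
AND    i.indisprimary;
</code></pre></div></div>
<p>Please note that a multi-column primary key produces more than one row, while the function expects a single column name, so such a case would need to be handled separately.</p>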
<h3 id="f_enumerate_toastable_columns"><code class="language-plaintext highlighter-rouge">f_enumerate_toastable_columns</code></h3>
<p>The <code class="language-plaintext highlighter-rouge">f_enumerate_toastable_columns</code> function inspects the system catalogs to find out which attributes can be stored by TOAST. At its core, it returns every item in <code class="language-plaintext highlighter-rouge">pg_attribute</code> that has a storage of <code class="language-plaintext highlighter-rouge">x</code> (extended) or <code class="language-plaintext highlighter-rouge">e</code> (external), meaning that the attribute is allowed to be stored outside of the main table.</p>
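<p>The catalog lookup can be approximated with a query like the following one. It is a sketch written for the example table, assuming only what is described above (storage <code class="language-plaintext highlighter-rouge">x</code> or <code class="language-plaintext highlighter-rouge">e</code>); the actual function may apply further checks:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- a sketch of the lookup, not the function body
SELECT attname
FROM   pg_attribute
WHERE  attrelid = 'example_toast'::regclass
AND    attnum > 0                    -- skip system columns
AND    NOT attisdropped              -- skip dropped columns
AND    attstorage IN ( 'x', 'e' )    -- extended or external storage
ORDER BY attnum;
</code></pre></div></div>
<p>Against <code class="language-plaintext highlighter-rouge">example_toast</code> this should list the <code class="language-plaintext highlighter-rouge">b</code> and <code class="language-plaintext highlighter-rouge">d</code> columns, since <code class="language-plaintext highlighter-rouge">int</code> and <code class="language-plaintext highlighter-rouge">float</code> columns use plain storage.</p>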
<h1 id="conclusions">Conclusions</h1>
<p>The TOAST mechanism is great, but until you detoast the content of your data you may not notice a problem in it.
Periodically running tools based on the above functions can help you determine whether a problem has arisen. So far I’ve only experienced human-caused damage, so don’t worry about your PostgreSQL cluster as long as nobody disturbs it!</p>
PostgreSQL Literate Programming with GNU Emacs2021-01-18T00:00:00+00:00https://fluca1978.github.io/2021/01/18/PostgreSQLLiterateProgramming<p>GNU Emacs is great! I can prepare my slides with PostgreSQL snippets of code and results.</p>
<h2 id="postgresql-literate-programming-with-gnu-emacs">PostgreSQL Literate Programming with GNU Emacs</h2>
<p>What is <em>literate programming</em>? <a href="https://en.wikipedia.org/wiki/Literate_programming" target="_blank">Literate Programming</a> is a programming paradigm that makes you write a program in a more natural language, interleaving documentation and code together.
<br />
GNU Emacs allows literate programming by means of <em>Org Mode</em> and its module <em>Org Babel</em>.
<br />
I am already used to Org Mode, and I already write my own documentation, slides and papers with this great tool. But Org Babel can do much more for me: as you probably know, I write several articles, papers and presentations for training events, all related to PostgreSQL.
<br />
The classical workflow is:</p>
<ul>
<li>write a slide or piece of document;</li>
<li>execute an SQL statement (e.g. in a terminal);</li>
<li>copy and paste the SQL statement into your slide or document;</li>
<li>copy and paste the result into your slide or document.</li>
</ul>
<p>One huge problem with the above is that every time you change the initial statement, you have to repeat the process, copying and pasting the results again, and this can lead to errors, inconsistencies, and the burden of keeping the documentation up to date.
Moreover, imagine the output of a command changes from one version of PostgreSQL to another: you have to re-run every single command and repeat the copy and paste of the results.
<br />
That’s too much!</p>
<br />
<p><br />
GNU Emacs being what it is, there’s a much smarter way to do it!</p>
<h2 id="org-babel-to-the-rescue">Org Babel to the Rescue!</h2>
<p>Org Babel is a module that allows Org Mode to execute a single snippet of code. The code is executed launching external processes, like interpreters (in the case of Perl, Python, etc.), shells or, in the case of our beloved database, <code class="language-plaintext highlighter-rouge">psql</code>.
<br />
Let’s see an example: imagine writing the documentation for a PostgreSQL transaction as follows:</p>
<p><br />
<br /></p>
<pre><code class="language-org">* An example of transaction
The following is a PostgreSQL explicit transaction:
#+begin_src sql :engine postgresql :dbhost miguel :dbuser luca :database emacsdb
BEGIN;
CREATE TABLE emacs( t text );
INSERT INTO emacs
SELECT 'Foo' || v
FROM generate_series(1, 10) v;
COMMIT;
#+end_src
and when executed, the system replies with every command feedback:
</code></pre>
<p><br />
<br />
For now, let’s skip the discussion of the connection parameters, which are quite easy to guess anyway.
<br />
If you place the cursor within the code block (i.e., at any point from <code class="language-plaintext highlighter-rouge">#+begin_src</code> to <code class="language-plaintext highlighter-rouge">#+end_src</code>) and hit <code class="language-plaintext highlighter-rouge">C-c C-c</code>, Emacs will launch a <code class="language-plaintext highlighter-rouge">psql</code> connection to the database to execute the set of SQL statements. In other words, it will be as if you had manually typed the following on a command line:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> <span class="s1">'BEGIN; CREATE TABLE emacs(t text); ...'</span> | psql <span class="nt">-h</span> miguel <span class="nt">-U</span> luca emacsdb
</code></pre></div></div>
<p><br />
<br /></p>
<p>The end result will be that your document automagically changes to:</p>
<pre><code class="language-org">* An example of transaction
The following is a PostgreSQL explicit transaction:
#+begin_src sql :engine postgresql :dbhost miguel :dbuser luca :database emacsdb
BEGIN;
CREATE TABLE emacs( t text );
INSERT INTO emacs
SELECT 'Foo' || v
FROM generate_series(1, 10) v;
COMMIT;
#+end_src
and when executed, the system replies with every command feedback:
#+RESULTS:
| BEGIN |
|--------------|
| CREATE TABLE |
| INSERT 0 10 |
| COMMIT |
</code></pre>
<p><br />
<br />
which, in turn, renders as something like the following:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/emacs/literate_postgresql_programming_1.png" />
</center>
<p><br />
<br /></p>
<p>Not bad, uh?</p>
<h2 id="emacs-and-org-babel-configuration">Emacs and Org Babel Configuration</h2>
<p>Emacs does not usually ship with Org Babel configured for SQL, so you have to place the following into your configuration file:</p>
<p><br />
<br /></p>
<div class="language-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">org-babel-do-load-languages</span>
<span class="ss">'org-babel-load-languages</span>
<span class="o">'</span><span class="p">((</span><span class="nv">sql</span> <span class="o">.</span> <span class="no">t</span><span class="p">)))</span>
<span class="p">(</span><span class="k">setq</span> <span class="nv">org-confirm-babel-evaluate</span> <span class="no">nil</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
<br />
The first three lines enable the SQL language, while the last one prevents Emacs from asking for confirmation before running every single snippet of code.</p>
<h2 id="update-the-results">Update the Results</h2>
<p>In case you change a snippet of code, you can simply re-issue <code class="language-plaintext highlighter-rouge">C-c C-c</code> to update the results accordingly.</p>
<h2 id="running-all">Running All</h2>
<p>Here comes the most fun part: imagine your documentation or slides include several snippets of code, and you want to update all the code results. Remember, you are in Emacs, and there must be a way to do it.
And in fact, you can run <code class="language-plaintext highlighter-rouge">C-c C-v b</code> to create and/or update all the result sections.
<br />
This is particularly handy for me when I want to update results based on a different version of PostgreSQL.</p>
<h2 id="connection-parameters">Connection Parameters</h2>
<p>As you have probably guessed, those parameters after the <code class="language-plaintext highlighter-rouge">sql</code> tag in the header of the code snippets tell Emacs how to reach the PostgreSQL server. In particular:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">dbhost</code> is the remote hostname, with <code class="language-plaintext highlighter-rouge">localhost</code> for a local connection;</li>
<li><code class="language-plaintext highlighter-rouge">dbuser</code> is the database username;</li>
<li><code class="language-plaintext highlighter-rouge">dbpasswd</code> is the user password, in clear text (!);</li>
<li><code class="language-plaintext highlighter-rouge">database</code> is the name of the database to which you need to connect.</li>
</ul>
<h2 id="do-not-repeat-yourself">Do not Repeat Yourself</h2>
<p>You don’t have to specify the connection properties on the header of every single piece of code: you can group properties in an Org Mode tree to handle all at once.
<br />
Allow me to explain with an example document:</p>
<p><br />
<br /></p>
<pre><code class="language-org">* My experiments
#+begin_src sql :engine postgresql :dbhost miguel :dbuser luca :database emacsdb
BEGIN;
CREATE TABLE emacs( pk serial, t text );
INSERT INTO emacs(t) SELECT 'Foo' || v
FROM generate_series(1,10) v;
COMMIT;
#+end_src
#+begin_src sql :engine postgresql :dbhost miguel :dbuser luca :database emacsdb
SELECT * FROM emacs
LIMIT 2;
#+end_src
</code></pre>
<p><br />
<br /></p>
<p>The above can be replaced with a more compact version like:</p>
<p><br />
<br /></p>
<pre><code class="language-org">* My experiments
:PROPERTIES:
:header-args: sql :engine postgresql :dbhost miguel :dbuser luca :database emacsdb
:END:
#+begin_src sql
BEGIN;
CREATE TABLE emacs( pk serial, t text );
INSERT INTO emacs(t) SELECT 'Foo' || v
FROM generate_series(1,10) v;
COMMIT;
#+end_src
#+begin_src sql
SELECT * FROM emacs
LIMIT 2;
#+end_src
</code></pre>
<p><br />
<br /></p>
<p>It is now possible to manage the connection properties in a single place, so that if I, for example, need to change the hostname, I can change it on the <code class="language-plaintext highlighter-rouge">header-args</code> line and execute <code class="language-plaintext highlighter-rouge">C-c C-v b</code> to get all the required results.</p>
<h2 id="give-me-the-shell-quick">Give me the Shell, Quick!</h2>
<p>Org Babel can, of course, execute and evaluate different snippets of code and languages. This allows you to insert into your own documentation not only SQL statements, but also maintenance commands to run thru the shell, like <code class="language-plaintext highlighter-rouge">service postgresql restart</code>.
You can also execute <code class="language-plaintext highlighter-rouge">psql</code> directly, as follows:</p>
<p><br />
<br /></p>
<pre><code class="language-org">#+begin_src shell
psql -h localhost -U luca -c 'SELECT t FROM emacs LIMIT 2' emacsdb
#+end_src
#+RESULTS:
| t | |
| ------ | |
| Foo1 | |
| Foo2 | |
| (2 | rows) |
</code></pre>
<p><br />
Please note that, since in Org Mode a <code class="language-plaintext highlighter-rouge">&lt;TAB&gt;</code> is used in conjunction with a table, the output is rendered as a two-column table even if you selected a single column.
<br />
Remember that in order to allow Org Babel to evaluate the shell commands you need to enable the shell language in the Emacs configuration, therefore in your <code class="language-plaintext highlighter-rouge">.emacs</code> file you must now have something like:</p>
<p><br /></p>
<div class="language-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">org-babel-do-load-languages</span>
<span class="ss">'org-babel-load-languages</span>
<span class="o">'</span><span class="p">(</span>
<span class="p">(</span><span class="nv">shell</span> <span class="o">.</span> <span class="no">t</span><span class="p">)</span>
<span class="p">(</span><span class="nv">sql</span> <span class="o">.</span> <span class="no">t</span><span class="p">)</span>
<span class="p">)</span> <span class="p">)</span>
</code></pre></div></div>
<p><br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>Emacs is a great tool! You can improve your PostgreSQL documentation by means of Org Mode and Org Babel.
<br />
There is much more to Org Babel, and this is just a quick introduction to let you taste the power of Emacs!</p>
pgenv special keywords: earliest and latest2021-01-14T00:00:00+00:00https://fluca1978.github.io/2021/01/14/pgenv_special_keywords<p>A nice addition to the <code class="language-plaintext highlighter-rouge">pgenv</code> PostgreSQL binary manager.</p>
<h2 id="pgenv-special-keywords-earliest-and-latest">pgenv special keywords: earliest and latest</h2>
<p>I recently added support for two different keywords in <a href="https://github.com/theory/pgenv" target="_blank"><code class="language-plaintext highlighter-rouge">pgenv</code></a>: <strong>earliest</strong> and <strong>latest</strong>.
<br />
The idea is quite simple: instead of having to specify a PostgreSQL version number each time, you can now specify one of the above keywords to <em>jump</em> immediately to the oldest or newest PostgreSQL version you have installed. Of course, the newest PostgreSQL version is the most recent on a version number basis (not installation time), while the oldest is the one with the lowest version number among those installed.
<br />
Let’s understand the concept with an example:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv versions
12.1 pgsql-12.1
12.3 pgsql-12.3
12.4 pgsql-12.4
13.0 pgsql-13.0
9.6.20 pgsql-9.6.20
</code></pre></div></div>
<p><br />
<br /></p>
<p>Among the versions installed above, we have that:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">9.6.20</code> is the oldest one, and therefore is <em>mapped</em> to <code class="language-plaintext highlighter-rouge">earliest</code>;</li>
<li><code class="language-plaintext highlighter-rouge">13.0</code> is the newest one, and therefore is <em>mapped</em> to <code class="language-plaintext highlighter-rouge">latest</code>.
It is quite easy to demonstrate this by means of <code class="language-plaintext highlighter-rouge">use</code>:</li>
</ul>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv use earliest
PostgreSQL 9.6.20 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
</code></pre></div></div>
<p><br />
<br /></p>
<p>As you can see, <code class="language-plaintext highlighter-rouge">earliest</code> has been resolved to version <code class="language-plaintext highlighter-rouge">9.6.20</code>; on the other hand <code class="language-plaintext highlighter-rouge">latest</code> is going to be resolved to <code class="language-plaintext highlighter-rouge">13.0</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv use latest
PostgreSQL 9.6.20 stopped
PostgreSQL 13.0 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
</code></pre></div></div>
<p><br />
<br /></p>
<p>But that is not enough: you can also narrow down the scope of versions to a specific major number. For instance, in the <code class="language-plaintext highlighter-rouge">12</code> branch we have installed <code class="language-plaintext highlighter-rouge">12.1</code>, <code class="language-plaintext highlighter-rouge">12.3</code> and <code class="language-plaintext highlighter-rouge">12.4</code>, which means that <code class="language-plaintext highlighter-rouge">12.1</code> is the oldest version in the 12 branch, while <code class="language-plaintext highlighter-rouge">12.4</code> is the newest one. You can filter by specifying the major version number after the <code class="language-plaintext highlighter-rouge">earliest</code> or <code class="language-plaintext highlighter-rouge">latest</code> keyword:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv use latest 12
PostgreSQL 13.0 stopped
PostgreSQL 12.4 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
% pgenv use earliest 12
PostgreSQL 12.4 stopped
PostgreSQL 12.1 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
</code></pre></div></div>
<p><br />
<br /></p>
<p>Thanks to the addition of <code class="language-plaintext highlighter-rouge">earliest</code> and <code class="language-plaintext highlighter-rouge">latest</code> it becomes more intuitive and easier to automate <code class="language-plaintext highlighter-rouge">pgenv</code> usage, since you don’t have to remember exactly which version of PostgreSQL you are referring to.</p>
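<p>To give an idea of what such a resolution involves, here is a minimal shell sketch. It is <em>not</em> the actual <code class="language-plaintext highlighter-rouge">pgenv</code> implementation: the function name <code class="language-plaintext highlighter-rouge">resolve_version</code> and its arguments are made up for illustration, and it assumes a <code class="language-plaintext highlighter-rouge">sort</code> that supports the <code class="language-plaintext highlighter-rouge">-V</code> (version sort) flag, as GNU coreutils does.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT the real pgenv code: resolve "earliest" or
# "latest" against a list of installed versions, optionally limited to
# a major version (pass an empty string for "any major").
resolve_version() {
    local keyword="$1" major="$2"
    shift 2
    local sorted
    # One version per line, filtered by major, sorted by version number
    # (sort -V is assumed to be available, as in GNU coreutils).
    sorted=$(printf '%s\n' "$@" | grep -E "^${major:-[0-9]+}(\.|$)" | sort -V)
    case "$keyword" in
        earliest) printf '%s\n' "$sorted" | head -n 1 ;;
        latest)   printf '%s\n' "$sorted" | tail -n 1 ;;
        *)        return 1 ;;
    esac
}

# Same versions as in the example above.
installed="12.1 12.3 12.4 13.0 9.6.20"
resolve_version latest "" $installed
resolve_version earliest 12 $installed
```

<p>The important detail is the version sort: a plain alphabetical sort would place <code class="language-plaintext highlighter-rouge">9.6.20</code> after <code class="language-plaintext highlighter-rouge">13.0</code>, which is exactly the opposite of what the keywords promise.</p>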
<h1 id="what-about-build">What about <code class="language-plaintext highlighter-rouge">build</code>?</h1>
<p>Thanks to <a href="https://github.com/theory/pgenv/commit/95236fd43f8f7af5f1b94e3fe9259397fcb70c46" target="_blank">this commit</a>, <strong>it is now possible to issue a <code class="language-plaintext highlighter-rouge">build</code> command using the same special keywords as above</strong>.
<br />
As an example, <code class="language-plaintext highlighter-rouge">pgenv build latest 13</code> will install the latest available version in the <code class="language-plaintext highlighter-rouge">13</code> major release, while <code class="language-plaintext highlighter-rouge">pgenv build latest</code> will install the very latest available version among all.
The word <code class="language-plaintext highlighter-rouge">earliest</code> works the opposite way, and I believe that building the very oldest PostgreSQL version could be a good way to have fun!</p>
krunner and PostgreSQL Documentation Search2021-01-10T00:00:00+00:00https://fluca1978.github.io/2021/01/10/PostgreSQLKRunnerSearch<p>How to search directly into the PostgreSQL documentation from your Plasma desktop.</p>
<h2 id="krunner-and-postgresql-documentation-search">krunner and PostgreSQL Documentation Search</h2>
<p>If you, like me, are addicted to Plasma, the KDE desktop, you probably already know about <em>krunner</em>, an <strong>application launcher on steroids</strong>.
<br />
<code class="language-plaintext highlighter-rouge">krunner</code> allows you to quickly launch, kill, switch to and manage applications, as well as execute computations and, most notably, <strong>web searches</strong>. In fact, krunner exploits the <em>Konqueror</em> shortcuts for web searches. Konqueror is the default web browser for KDE/Plasma (since KDE version 2), and allows for a quick customization of shortcuts that enable you to redirect a string thru a search engine.
As an example, by default Konqueror has the <code class="language-plaintext highlighter-rouge">dd</code> and the <code class="language-plaintext highlighter-rouge">gg</code> shortcuts: the former enables the search of the remaining part of the string thru <em>DuckDuckGo</em>, while the latter thru <em>Google</em>.
<br />
So, what does it take to get krunner integrated with the PostgreSQL official documentation search engine?
<br />
There is not much work to do, after all, and in fact it suffices to:
1) create a new Konqueror shortcut;
2) no, there are no other steps involved!
<br />
The good news is that you can configure whatever you want from the krunner interface itself.</p>
<h2 id="configure-krunner">Configure krunner</h2>
<p>First of all, launch krunner by hitting <code class="language-plaintext highlighter-rouge">ALT + F2</code> or <code class="language-plaintext highlighter-rouge">ALT + &lt;space&gt;</code>, then click on the setup icon on the left of the bar.</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/plasma/postgresql_search_1.png" />
</center>
<p><br />
<br /></p>
<p>In the dialog window, scroll to the <em>Web Shortcuts</em> line and click on the <em>configure</em> icon.</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/plasma/postgresql_search_2.png" />
</center>
<p><br />
<br /></p>
<p>In the opened dialog, after having searched for the <em>key</em> sequence you want to insert, click on the <em>New</em> button to create a new shortcut.</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/plasma/postgresql_search_3.png" />
</center>
<p><br />
<br /></p>
<p>Fill the dialog as you find appropriate, but in the <code class="language-plaintext highlighter-rouge">Shortcut URL</code> field place <code class="language-plaintext highlighter-rouge">https://www.postgresql.org/search/?q=</code> and then hit the button on the right to insert the query placeholder (<code class="language-plaintext highlighter-rouge">\{@}</code>), so that the final result is <code class="language-plaintext highlighter-rouge">https://www.postgresql.org/search/?q=\{@}</code>.
<br />
Place one or more shortcuts in the <code class="language-plaintext highlighter-rouge">Shortcuts</code> entry, separated by commas, for example <code class="language-plaintext highlighter-rouge">pg</code>, then <code class="language-plaintext highlighter-rouge">postgres</code> and last <code class="language-plaintext highlighter-rouge">postgresql</code>, so that you will be able to trigger a search with a short or common character sequence.</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/plasma/postgresql_search_5.png" />
</center>
<p><br />
<br /></p>
<p>Apply the changes and get ready for your PostgreSQL related queries.</p>
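<p>If you are wondering what the <code class="language-plaintext highlighter-rouge">\{@}</code> placeholder does, the following shell sketch mimics the substitution. It is only an illustration: <code class="language-plaintext highlighter-rouge">expand_shortcut</code> is a made-up function name, and the encoding is deliberately rough (spaces become <code class="language-plaintext highlighter-rouge">+</code>, nothing else is escaped), while krunner surely does a more complete job.</p>

```shell
#!/usr/bin/env bash
# Illustrative sketch only: expand_shortcut is a hypothetical name and
# has nothing to do with the real krunner code.
expand_shortcut() {
    local template="$1"
    shift
    local placeholder='\{@}'
    local query="$*"
    local url
    # Rough encoding: only spaces are handled, turned into '+'.
    query=${query// /+}
    # Replace the (literal) placeholder with the encoded query.
    url=${template/"$placeholder"/"$query"}
    printf '%s\n' "$url"
}

expand_shortcut 'https://www.postgresql.org/search/?q=\{@}' create index
```

<p>In other words, whatever you type after the shortcut ends up in the <code class="language-plaintext highlighter-rouge">q</code> parameter of the search URL.</p>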
<h2 id="searching-for-something-postgresql-related">Searching for something PostgreSQL-related</h2>
<p>It is now time to test the searching shortcut:</p>
<ul>
<li>launch krunner by hitting <code class="language-plaintext highlighter-rouge">ALT + F2</code> or <code class="language-plaintext highlighter-rouge">ALT + &lt;space&gt;</code>;</li>
<li>enter <code class="language-plaintext highlighter-rouge">pg:</code> to activate the search engine;</li>
<li>insert a PostgreSQL string and press <code class="language-plaintext highlighter-rouge">&lt;enter&gt;</code>.</li>
</ul>
<p><br />
<br /></p>
<center>
<img src="/images/posts/plasma/postgresql_search_6.png" />
</center>
<p><br />
<br /></p>
<p>and the result will pop up in your default web browser (<em>which does not have to be Konqueror!</em>).</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/plasma/postgresql_search_7.png" />
</center>
<p><br />
<br /></p>
<h1 id="konqueror-and-web-shortcuts">Konqueror and Web Shortcuts</h1>
<p>As already written, <code class="language-plaintext highlighter-rouge">krunner</code> exploits the Konqueror Web Shortcuts, and in fact <a href="https://fluca1978.github.io/2008/01/28/ricerca-diretta-nella-documentazione-di.html" target="_blank">I wrote an article (in Italian)</a> back in <em>2008</em> about the configuration of Konqueror to access the PostgreSQL documentation. I also asked for that article to appear on the ITPUG official web site, without any success, but that is another story.</p>
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">krunner</code> is an amazing piece of software that I use every day, to the extent that I no longer use many icons to start applications and tasks, but simply pass a few characters to krunner and let it do the heavy work for me.
<br />
Being able to integrate the PostgreSQL documentation search into krunner represents a huge advantage for every PostgreSQL and Plasma user.</p>
Firefox and PostgreSQL Documentation Search2021-01-10T00:00:00+00:00https://fluca1978.github.io/2021/01/10/FirefoxPostgreSQLSearch<p>How to search directly within the PostgreSQL documentation from your Firefox web browser.</p>
<h2 id="firefox-and-postgresql-documentation-search">Firefox and PostgreSQL Documentation Search</h2>
<p>The Firefox web browser supports several search engines, extensions by means of which you can insert a search string and get it passed to a specific site for search.
<br />
It is possible to customize Firefox to search for a particular string within the PostgreSQL official documentation: the idea is to instruct the web browser to redirect the search thru the PostgreSQL web site via a <code class="language-plaintext highlighter-rouge">GET</code> URL.
<br />
In order to achieve this, you need to install a customizable search engine, and then configure the shortcuts for enabling the web engine access.</p>
<h2 id="custom-search-engine-setup">Custom Search Engine Setup</h2>
<p>The first step consists of installing the <a href="https://addons.mozilla.org/en-US/firefox/addon/add-custom-search-engine/" target="_blank">Custom Search Engine</a> to your Firefox web browser.
<br />
Then, clicking on the main Firefox menu (the hamburger icon), select the <em>Add-ons</em> entry and then go to the <em>extensions</em> menu: you should see the new searching engine there. Check the engine is active and then click on the three dots button and select <em>Preferences</em>:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/firefox/postgresql_search_engine_1.png" />
</center>
<p><br />
<br /></p>
<p>In the opened screen, edit a line to add the following details:</p>
<ul>
<li><em>key</em>, I use <code class="language-plaintext highlighter-rouge">pg</code> as the default prefix to indicate I’m going to specify a PostgreSQL documentation search;</li>
<li><em>Search Engine Name</em>, set to <code class="language-plaintext highlighter-rouge">PostgreSQL</code> or any name that makes sense to you;</li>
<li><em>URL</em>, you have to set it to <code class="language-plaintext highlighter-rouge">https://www.postgresql.org/search/?q={searchTerms}</code>, where <code class="language-plaintext highlighter-rouge">{searchTerms}</code> is going to be replaced by Firefox with the search keywords;</li>
<li><em>Description</em>, whatever makes sense to you, for example <code class="language-plaintext highlighter-rouge">PostgreSQL Official Documentation</code>.</li>
</ul>
<p><br />
As you can imagine, the important parts are the <em>key</em> and the <em>URL</em>. Note that you can also target specific PostgreSQL versions by changing the URL to include a version number: do a few searches on the official web site and inspect the URL for other arguments.
<br />
Once you are done, click on the <em>Save Preferences</em> button and then close the tab.</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/firefox/postgresql_search_engine_2.png" />
</center>
<p><br />
<br /></p>
<h2 id="searching-into-the-documentation">Searching into the documentation</h2>
<p>With the engine in place, you can search within the PostgreSQL documentation by typing:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">ms</code> to activate the custom search engine;</li>
<li><code class="language-plaintext highlighter-rouge">pg</code> to activate the PostgreSQL documentation search engine (this is the <em>key</em> specified above);</li>
<li>any keyword you need to search into the documentation.
<br />
As an example, imagine we want to search for the <code class="language-plaintext highlighter-rouge">CREATE INDEX</code> statement documentation; we need to enter:</li>
</ul>
<p><br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ms pg create index
</code></pre></div></div>
<p><br /></p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/firefox/postgresql_search_engine_3.png" />
</center>
<p><br />
<br /></p>
<p>and after pressing <code class="language-plaintext highlighter-rouge">enter</code> the search will go thru the PostgreSQL documentation web site:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/firefox/postgresql_search_engine_4.png" />
</center>
<p><br />
<br /></p>
<h1 id="conclusions">Conclusions</h1>
<p>I personally don’t much like the way Firefox handles search customization: having to type one shortcut to activate the search engine and another one to specialize it seems like too much work to me.
However, it can be useful when you live in Firefox and want to quickly search for a PostgreSQL tip!</p>
Single User Mode and -P flag2021-01-03T00:00:00+00:00https://fluca1978.github.io/2021/01/03/PostgreSQLSingleUserModeP<p>How to allow corrupted catalogs repair.</p>
<h2 id="single-user-mode-and--p-flag">Single User Mode and -P flag</h2>
<p>It could happen that you can no longer connect to a database because of an error in a system catalog.
<br />
PostgreSQL is rock-solid, so this usually does not happen, but in the case of disk corruption (or sometimes because of poor human behavior), the system could be unable to connect to the database because the system catalogs for that database are <em>bad</em>.
<br />
<br />
When the catalogs have been corrupted at the index level, there is a chance to get back your database (and data) by rebuilding the indexes of the system catalogs. In fact, the <code class="language-plaintext highlighter-rouge">REINDEX</code> command supports the <em><code class="language-plaintext highlighter-rouge">SYSTEM</code></em> option that, as the name suggests, performs a reindex at the <em>system</em> level, that is against the database catalogs.
<br />
<br />
There is however a <em>chicken and egg</em> problem: you can reindex only the catalogs of a database you are connected to, and if you cannot connect to such a database because of an index corruption, what can you do?
<br />
Luckily <code class="language-plaintext highlighter-rouge">postgres</code> (the process) allows for a <code class="language-plaintext highlighter-rouge">-P</code> flag that <em>P</em>revents the system catalog indexes from being used:</p>
<p><br />
<br /></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> -P
Ignore system indexes when reading system tables, but still update
the indexes when modifying the tables. This is useful when
recovering from damaged system indexes.
</code></pre></div></div>
<p><br />
<br /></p>
<p>Therefore the recovery can be achieved following these steps:</p>
<ul>
<li>shut down the cluster and restart it in <strong>single user mode</strong> (see <a href="https://fluca1978.github.io/2019/06/27/PostgreSQLSingleMode.html" target="_blank">my article about it</a>);</li>
<li>start a backend process ignoring the system indexes, such as
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>postgres <span class="nt">--single</span> <span class="nt">-P</span> <span class="nt">-D</span> /your/own/pgdata your_faulty_database
</code></pre></div> </div>
<p>where <code class="language-plaintext highlighter-rouge">your_faulty_database</code> is the damaged database;</p>
</li>
<li>issue a full system reindex with <code class="language-plaintext highlighter-rouge">REINDEX SYSTEM your_faulty_database;</code></li>
<li>restart the cluster in multi-user mode and try to connect to the faulty database.</li>
</ul>
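<p>The steps above can be summarized by the following dry-run sketch: it only <em>prints</em> the commands, so that you can review them before touching a damaged cluster. The function name <code class="language-plaintext highlighter-rouge">recovery_plan</code> is made up, and I am assuming that <code class="language-plaintext highlighter-rouge">pg_ctl</code> is how you stop and start your cluster (your system could use a service manager instead) and that a <em>fast</em> shutdown is acceptable. The single user backend reads its commands from standard input, hence the pipe.</p>

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the recovery commands, it does NOT execute them.
# recovery_plan is a hypothetical helper; the data directory and the
# database name are placeholders to adapt to your own cluster.
recovery_plan() {
    local pgdata="$1" db="$2"
    # 1) stop the running cluster (assumes pg_ctl manages it)
    echo "pg_ctl -D $pgdata stop -m fast"
    # 2) single user backend, ignoring system indexes, fed with REINDEX
    echo "echo 'REINDEX SYSTEM $db;' | postgres --single -P -D $pgdata $db"
    # 3) back to normal multi-user operation
    echo "pg_ctl -D $pgdata start"
}

recovery_plan /your/own/pgdata your_faulty_database
```

<p>Run the printed commands one at a time, as the operating system user that owns the data directory, and check the outcome of each one before moving on.</p>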
<p><br />
<br />
Why is it important to start the cluster in single user mode, therefore <strong>tearing down any other database and process</strong>?
Well, PostgreSQL is smart enough to prevent you from connecting <em>directly</em> to an already running cluster, that is, any <code class="language-plaintext highlighter-rouge">postgres</code> process checks for the presence of a <code class="language-plaintext highlighter-rouge">postmaster</code>:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres postgres <span class="nt">-D</span> /postgres/12 <span class="nt">-P</span>
FATAL: lock file <span class="s2">"postmaster.pid"</span> already exists
HINT: Is another postmaster <span class="o">(</span>PID 14355<span class="o">)</span> running <span class="k">in </span>data directory <span class="s2">"/postgres/12"</span>?
</code></pre></div></div>
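<p>If you want to do a similar check yourself before attempting the single user mode, a rough sketch could be the following. The function <code class="language-plaintext highlighter-rouge">postmaster_running</code> is a hypothetical helper of mine: it relies on the first line of <code class="language-plaintext highlighter-rouge">postmaster.pid</code> being the process identifier, and it is much simpler than the checks PostgreSQL performs on its own.</p>

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, far simpler than what PostgreSQL does:
# succeed if a postmaster seems to be running in the given data directory.
postmaster_running() {
    local pidfile="$1/postmaster.pid"
    local pid
    # no lock file, no postmaster
    [ -f "$pidfile" ] || return 1
    # the first line of postmaster.pid holds the postmaster PID
    pid=$(head -n 1 "$pidfile")
    # kill -0 sends no signal, it only tests that the process exists;
    # it also fails if the process belongs to another user, so run this
    # as the user that owns the cluster
    kill -0 "$pid" 2>/dev/null
}

if postmaster_running /postgres/12; then
    echo "a postmaster is running, stop it first"
else
    echo "no running postmaster found"
fi
```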
<p><br />
<br /></p>
<p>The conclusion is: every data corruption is a story of its own and cannot be <em>easily</em> fixed, but often PostgreSQL provides all the tools you need to recover.
<br />
And of course, once you have recovered, you should take all precautions to backup, verify and test your data!</p>
ITPUG has a new board of directors2020-11-09T00:00:00+00:00https://fluca1978.github.io/2020/11/09/ITPUG<p>Something is changing in ITPUG?</p>
<h1 id="itpug-has-a-new-board-of-directors">ITPUG has a new board of directors</h1>
<hr />
<p>I have recently been contacted by a friend of mine claiming that the ITPUG (<em>Italian PostgreSQL Users’ Group</em>) board of directors has changed, and that’s true as you can see from the <a href="https://www.itpug.org/about/" target="_blank">association page</a>.
<br />
Having been a member of ITPUG from its very conception to the end of 2016, I know pretty much every member of the current <em>fresh</em> board of directors, and I hope to see a jump forward in the management of the association.
<br />
<br />
AFAIK, there are members that want to make ITPUG much more user-friendly.
<br />
And there are members who deserve it!</p>
<p><br />
<br />
Good luck!</p>
Learn PostgreSQL - a new book2020-10-28T00:00:00+00:00https://fluca1978.github.io/2020/10/28/LearnPostgreSQL<p>My latest book talking about PostgreSQL</p>
<h1 id="learn-postgresql---a-new-book">Learn PostgreSQL - a new book</h1>
<hr />
<p>I’m really excited to introduce you to a new <em>PostgreSQL book</em> entitled <strong>Learn PostgreSQL</strong>, written by myself and my good friend <em>Enrico</em>.</p>
<p><br />
<br /></p>
<center>
<a href="https://www.packtpub.com/product/learn-postgresql-12/9781838985288">
<img src="/images/posts/learnpostgresql/cover.png" alt="Learn PostgreSQL book cover" />
</a>
</center>
<p><br />
<br /></p>
<center>
<iframe width="560" height="315" src="https://www.youtube.com/embed/3h47-J0rro4" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
</center>
<p><br />
<br /></p>
<p>The book covers PostgreSQL 12 and 13 and contains 20 chapters explaining how to install a cluster, how to set it up for the first connections, basic usage of SQL and data types, and the administration of a cluster including role management, statistics and performance, log analysis and tuning, with an eye on backups. The last section of the book is dedicated to replication, both physical and logical.</p>
<p><br />
<br />
It has been hard work to complete the book, and I would like to thank Enrico for having written this with me.
We are really proud and believe it delivers a good content and a complete presentation of our beloved database.</p>
<p><br />
<br />
A quick book outline is the following one:
<br /></p>
<h3 id="part-1">Part 1</h3>
<ul>
<li>Introduction to PostgreSQL</li>
<li>Getting to know your cluster</li>
<li>Managing Users and Connections</li>
</ul>
<h3 id="part-2">Part 2</h3>
<ul>
<li>Basic Statements</li>
<li>Advanced Statements</li>
<li>Window Functions</li>
<li>Server Side Programming</li>
<li>Triggers and Rules</li>
<li>Partitioning</li>
</ul>
<h3 id="part-3">Part 3</h3>
<ul>
<li>Users, Roles and Database Security</li>
<li>Transactions, MVCC, WALs and Checkpoints</li>
<li>Extending the database: the Extension ecosystem</li>
<li>Indexes and Performance Optimization</li>
<li>Logging and Auditing</li>
<li>Backup and Restore</li>
<li>Configuration and Monitoring</li>
</ul>
<h3 id="part-4">Part 4</h3>
<ul>
<li>Physical Replication</li>
<li>Logical Replication</li>
</ul>
<h3 id="part-5">Part 5</h3>
<ul>
<li>Useful tools and useful extensions</li>
<li>Towards PostgreSQL 13</li>
</ul>
<p><br />
<br />
There is of course much more to describe about the book, but so far I’m still recovering from an eye surgery so I will come back in the following weeks to discuss the book.
<br />
In the meantime you can have a look at the <a href="https://github.com/PacktPublishing/Learn-PostgreSQL" target="_blank">official repository hosting the code examples</a>.</p>
<h3 id="update-and-fix-typos">Update and Fix Typos</h3>
<p>I’m sorry, in the original post I both misspelled the title and failed to place the right image and link.
<br />
<em>This is what happens when you try to edit a post and your eyes have not recovered yet from the last surgery!</em>
<br />
This is also why I recorded the video in the first place!</p>
<h3 id="update-2-2020-11-02">Update 2 (2020-11-02)</h3>
<p>Hans-Jürgen Schönig also wrote to me to point out the misspelling in the title; apparently <em>Planet PostgreSQL</em> is not getting the update of my post.</p>
Hey there! I'm using PostgreSQL!2020-09-04T00:00:00+00:00https://fluca1978.github.io/2020/09/04/UsingPostgreSQLWhatsapp<p>A little contribution in spreading the PostgreSQL word!</p>
<h1 id="hey-there-im-using-postgresql">Hey there! I’m using PostgreSQL!</h1>
<p>A few weeks ago I changed my old mobile phone, and so I had to install again all my applications, including something I personally hate: <em>WhatsApp</em>.
<br />
While checking the configuration of the application, correctly and automatically cloned from my old phone, I came across the standard <em>status</em> that WhatsApp places for you:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/whatsapp/whatsapp_postgresql_1.png" width="50%" />
</center>
<p><br />
<br /></p>
<p>The standard phrase is <code class="language-plaintext highlighter-rouge">Hey there! I'm using WhatsApp!</code>.
<br />
I hate these automatically placed sentences, so I was trying to think about something different, and then I decided that I did not want something different because, after all, I don’t think many people spend their time reading your status.
<br />
And then, I decided to let the world know I’m using PostgreSQL:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/whatsapp/whatsapp_postgresql_2.png" width="50%" />
</center>
<p><br />
<br /></p>
<p>It’s not a very big contribution, but <strong>it is just a quick and easy way to let the world know about PostgreSQL</strong>.
<br />
<em>If you like PostgreSQL and the idea, please update your status too!</em></p>
pgenv: get to know your logs2020-08-28T00:00:00+00:00https://fluca1978.github.io/2020/08/28/pgenv_log<p>I’ve added a couple of very minimalistic features to <code class="language-plaintext highlighter-rouge">pgenv</code>.</p>
<h1 id="pgenv-get-to-know-your-logs">pgenv: get to know your logs</h1>
<p>These days a work of mine, related to PostgreSQL, is being tested. One quick way to get a fully functional PostgreSQL instance is to use <a href="https://github.com/theory/pgenv" target="_blank"><code class="language-plaintext highlighter-rouge">pgenv</code></a>.
<br />
However, one user asked me how to quickly find out why <code class="language-plaintext highlighter-rouge">pgenv</code> was unable to start his own cluster.
<br />
<br />
<strong>Do your homework and read the logs!</strong> is the correct answer to the problem.
<br />
Then you realize that part of your aim is to help people embrace the technology, so why shouldn’t <code class="language-plaintext highlighter-rouge">pgenv</code> try to teach the user to do so?
<br />
<br />
And here are two very small and ridiculous features that could help some users get used to the basics of problem solving, especially with PostgreSQL.</p>
<h2 id="a-quick-look-at-the-logs-when-things-go-wrong">A quick look at the logs when things go wrong</h2>
<p>The first problem is that when the cluster does not start, for any reason, <code class="language-plaintext highlighter-rouge">pgenv</code> correctly tells you to examine the logs.
<br />
End of the story.
<br />
That means that you have to mangle your logs thru your own favourite tool, even if you are an experienced database and system administrator. I’m lazy, so let <code class="language-plaintext highlighter-rouge">pgenv</code> provide me a quick hint:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv start
PostgreSQL 12.1 NOT started, examine logs <span class="k">in</span> /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
Following are the last 5 lines of the log, as a quick hint:
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] LOG: could not <span class="nb">bind </span>IPv4 address <span class="s2">"127.0.0.1"</span>: Address already <span class="k">in </span>use
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] HINT: Is another postmaster already running on port 5432? If not, <span class="nb">wait </span>a few seconds and retry.
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] WARNING: could not create listen socket <span class="k">for</span> <span class="s2">"localhost"</span>
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] FATAL: could not create any TCP/IP sockets
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] LOG: database system is shut down
</code></pre></div></div>
<p><br />
<br />
It is that simple: if something goes wrong, <code class="language-plaintext highlighter-rouge">pgenv</code> shows me the last bunch of lines of the logs. If I’m lucky, I will see the problem without having to manually type another command to dig into the logs (in the above, another cluster or process is holding the TCP/IP port 5432).
<br />
There is no black magic here: <code class="language-plaintext highlighter-rouge">tail</code> comes to the rescue!</p>
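<p>Just to show how little is needed, here is a minimal sketch of the idea. It is not the actual <code class="language-plaintext highlighter-rouge">pgenv</code> source code: the function name <code class="language-plaintext highlighter-rouge">show_log_hint</code>, its arguments and the log path used in the example are assumptions of mine.</p>

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the real pgenv code: print the last lines of
# a log file as a quick hint when the server fails to start.
show_log_hint() {
    local logfile="$1" lines="${2:-5}"
    # nothing to show if the log is missing or not readable
    [ -r "$logfile" ] || return 1
    echo "Following are the last $lines lines of the log, as a quick hint:"
    tail -n "$lines" "$logfile"
}

# Made-up log location: adjust it to your own installation.
show_log_hint "$HOME/pgenv/pgsql/data/server.log" 5 || echo "no readable log found"
```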
<h3 id="show-me-my-logs">Show me my logs</h3>
<p>What if you don’t remember where <code class="language-plaintext highlighter-rouge">pgenv</code> is storing your logs and want to see them, to mail them or to ask for help?
<br />
Here comes the new <code class="language-plaintext highlighter-rouge">log</code> command, that in turn invokes <code class="language-plaintext highlighter-rouge">tail</code> on the logs (assuming there is one log!). The beauty of using <code class="language-plaintext highlighter-rouge">tail</code> is that it becomes very simple to support every other flag <code class="language-plaintext highlighter-rouge">tail</code> does support, doing therefore “complex” log analysis.
<br />
So, in the case you want your logs:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv log
Dumping the content of /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
2020-08-28 03:29:19.903 CEST <span class="o">[</span>11867] LOG: aborting any active transactions
2020-08-28 03:29:19.905 CEST <span class="o">[</span>11867] LOG: background worker <span class="s2">"logical replication launcher"</span> <span class="o">(</span>PID 11874<span class="o">)</span> exited with <span class="nb">exit </span>code 1
2020-08-28 03:29:19.906 CEST <span class="o">[</span>11869] LOG: shutting down
2020-08-28 03:29:19.922 CEST <span class="o">[</span>11867] LOG: database system is shut down
2020-08-28 03:29:39.342 CEST <span class="o">[</span>13046] LOG: starting PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc <span class="o">(</span>Ubuntu 8.3.0-6ubuntu1<span class="o">)</span> 8.3.0, 64-bit
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] LOG: could not <span class="nb">bind </span>IPv4 address <span class="s2">"127.0.0.1"</span>: Address already <span class="k">in </span>use
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] HINT: Is another postmaster already running on port 5432? If not, <span class="nb">wait </span>a few seconds and retry.
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] WARNING: could not create listen socket <span class="k">for</span> <span class="s2">"localhost"</span>
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] FATAL: could not create any TCP/IP sockets
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] LOG: database system is shut down
</code></pre></div></div>
<p><br />
<br /></p>
<p>and in the case you want something different:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv log <span class="nt">-n</span> 3 <span class="nt">-f</span>
Dumping the content of /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] WARNING: could not create listen socket <span class="k">for</span> <span class="s2">"localhost"</span>
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] FATAL: could not create any TCP/IP sockets
2020-08-28 03:29:39.343 CEST <span class="o">[</span>13046] LOG: database system is shut down
</code></pre></div></div>
<p><br />
<br /></p>
<p>that prints the last three lines and waits for new logs to be displayed.</p>
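<p>As a rough illustration of the idea (a minimal sketch, not the actual <code class="language-plaintext highlighter-rouge">pgenv</code> implementation), a wrapper only needs to forward whatever flags it receives to <code class="language-plaintext highlighter-rouge">tail</code>. The function name <code class="language-plaintext highlighter-rouge">log_sketch</code> and the choice of passing the log file as first argument are assumptions made up for this example:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># log_sketch LOGFILE [tail flags...]
# Hypothetical helper: every argument after the log file
# is handed over to tail untouched.
log_sketch() {
    local logfile=$1
    shift
    echo "Dumping the content of $logfile"
    tail "$@" "$logfile"
}
</code></pre></div></div>
<p>With such a function, <code class="language-plaintext highlighter-rouge">log_sketch server.log -n 3 -f</code> ends up running <code class="language-plaintext highlighter-rouge">tail -n 3 -f server.log</code>, which is why any flag understood by <code class="language-plaintext highlighter-rouge">tail</code> comes for free.</p>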
<h1 id="conclusions">Conclusions</h1>
<p>The new <code class="language-plaintext highlighter-rouge">pgenv</code> features are just toys, but I hope they help people approaching this project, which in turn can really help to get a cluster up and running.</p>
Who needs comments?2020-08-18T00:00:00+00:00https://fluca1978.github.io/2020/08/18/PostgreSQLPGDUMPComments<p>Who cares about comments? Because you can certainly read your database schema, right?</p>
<h1 id="who-needs-comments">Who needs comments?</h1>
<p>My <a href="http://www.pgtraining.com/chi-siamo/enrico-pirozzi" target="_blank">friend and colleague Enrico</a> told me about one of those <em>hidden</em> features of <code class="language-plaintext highlighter-rouge">pg_dump</code>: <strong><code class="language-plaintext highlighter-rouge">--no-comments</code></strong>.
<br />
<br />
The option allows you to dump the database (or a part of it) without dumping any <em>user defined comment</em>, that is no comments on tables, data types, or anything else you placed with an explicit <code class="language-plaintext highlighter-rouge">COMMENT ON</code> statement.
<br />
This made me laugh at first: why would I not want comments in my dump? Are we still back in the nineties, when people thought that hiding information was a good strategy to secure their job?
<br />
However, there are some cases I can think of where you don’t want comments. For example, some extensions use comments on objects to perform some <em>magic</em>, and <code class="language-plaintext highlighter-rouge">pgaudit</code> comes to mind. But you do not always need to replicate the same configuration on another database, in which case you may want to strip off the comments.</p>
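<p>A quick way to see the option at work is to count the <code class="language-plaintext highlighter-rouge">COMMENT ON</code> statements in a schema-only dump, with and without the flag. The following is a minimal sketch: the helper name <code class="language-plaintext highlighter-rouge">count_dumped_comments</code> and the database name <code class="language-plaintext highlighter-rouge">testdb</code> are assumptions made up for this example, while <code class="language-plaintext highlighter-rouge">--schema-only</code> and <code class="language-plaintext highlighter-rouge">--no-comments</code> are regular <code class="language-plaintext highlighter-rouge">pg_dump</code> options:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># count_dumped_comments [pg_dump flags...] DBNAME
# Hypothetical helper: dump only the schema and count
# the COMMENT ON statements found in the output.
count_dumped_comments() {
    pg_dump --schema-only "$@" | grep -c '^COMMENT ON'
}

# Example usage (testdb is an assumed database name):
# count_dumped_comments testdb
# count_dumped_comments --no-comments testdb
</code></pre></div></div>
<p>Assuming the database holds some comments, the second invocation should report fewer statements than the first one.</p>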
<h2 id="how-pg_dump-avoids-comments">How <code class="language-plaintext highlighter-rouge">pg_dump</code> avoids comments</h2>
<p>Having a quick look at the <code class="language-plaintext highlighter-rouge">pg_dump</code> source code, the function <code class="language-plaintext highlighter-rouge">dumpTableComment</code> represents a good introduction to how the comments are dumped or not. In particular, at the very beginning of the function, you can find something like:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* do nothing, if --no-comments is supplied */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">dopt</span><span class="o">-></span><span class="n">no_comments</span><span class="p">)</span>
<span class="k">return</span><span class="p">;</span>
<span class="cm">/* Comments are SCHEMA not data */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">dopt</span><span class="o">-></span><span class="n">dataOnly</span><span class="p">)</span>
<span class="k">return</span><span class="p">;</span>
<span class="cm">/* Search for comments associated with relation, using table */</span>
<span class="n">ncomments</span> <span class="o">=</span> <span class="n">findComments</span><span class="p">(</span><span class="n">fout</span><span class="p">,</span>
<span class="n">tbinfo</span><span class="o">-></span><span class="n">dobj</span><span class="p">.</span><span class="n">catId</span><span class="p">.</span><span class="n">tableoid</span><span class="p">,</span>
<span class="n">tbinfo</span><span class="o">-></span><span class="n">dobj</span><span class="p">.</span><span class="n">catId</span><span class="p">.</span><span class="n">oid</span><span class="p">,</span>
<span class="o">&</span><span class="n">comments</span><span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>If the <code class="language-plaintext highlighter-rouge">--no-comments</code> command line option is set (i.e., <code class="language-plaintext highlighter-rouge">dopt->no_comments</code> is true), the function returns immediately since there is nothing to do.
<br />
Interestingly, if the user wants to dump only the data of the database, and not its schema, the comments are not dumped either. That’s quite obvious if you think about it.
<br />
The <code class="language-plaintext highlighter-rouge">findComments</code> function is in charge of going to the storage to retrieve the comments, and it does so in a <em>strange</em> way. It invokes, in turn, <code class="language-plaintext highlighter-rouge">collectComments</code>, which executes a query like the following one:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">appendPQExpBufferStr</span><span class="p">(</span><span class="n">query</span><span class="p">,</span> <span class="s">"SELECT description, classoid, objoid, objsubid "</span>
<span class="s">"FROM pg_catalog.pg_description "</span>
<span class="s">"ORDER BY classoid, objoid, objsubid"</span><span class="p">);</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>Do you see something strange there? <em>There is no <code class="language-plaintext highlighter-rouge">WHERE</code> clause in the query!</em> It means that the function is going to get all the comments for all the objects in the database, as also stated in the comment preamble of the function:</p>
<p><br />
<br /></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*
* collectComments --
*
* Construct a table of all comments available for database objects.
* We used to do per-object queries for the comments, but it's much faster
* to pull them all over at once, and on most databases the memory cost
* isn't high.
*
* The table is sorted by classoid/objid/objsubid for speed in lookup.
*/</span>
</code></pre></div></div>
<p><br />
<br /></p>
<p>The idea is that all the comments are retrieved in a single pass, and then <code class="language-plaintext highlighter-rouge">findComments</code> performs a kind of binary search to find out the exact range of comments that match the object it is dumping at that moment (i.e., the table).</p>
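<p>The lookup over a sorted table can be sketched as follows. This is only a toy illustration written in Bash (it needs Bash arrays), not the C code of <code class="language-plaintext highlighter-rouge">pg_dump</code>, and it may well differ from what <code class="language-plaintext highlighter-rouge">pg_dump</code> does internally: the array <code class="language-plaintext highlighter-rouge">oids</code> and the functions <code class="language-plaintext highlighter-rouge">lower_bound</code> and <code class="language-plaintext highlighter-rouge">comment_range</code> are made up for this example, and a single integer stands in for the real <code class="language-plaintext highlighter-rouge">classoid/objoid/objsubid</code> key:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# Hypothetical sorted table: one object OID per stored comment.
oids=(10 10 20 20 20 30 40)

# lower_bound TARGET
# Print the first index whose value is not less than TARGET.
lower_bound() {
    local target=$1 lo=0 hi=${#oids[@]} mid
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if [ "${oids[mid]}" -lt "$target" ]; then
            lo=$(( mid + 1 ))
        else
            hi=$mid
        fi
    done
    echo "$lo"
}

# comment_range TARGET
# Print "first count": where the comments of TARGET start
# in the table, and how many of them there are.
comment_range() {
    local first next
    first=$(lower_bound "$1")
    next=$(lower_bound $(( $1 + 1 )))
    echo "$first $(( next - first ))"
}
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">comment_range 20</code> prints <code class="language-plaintext highlighter-rouge">2 3</code>, that is three comments starting at index 2, without scanning the whole table; since the table is sorted, each lookup costs a logarithmic number of comparisons.</p>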
PostgreSQL 13 Explain now includes WAL information2020-07-27T00:00:00+00:00https://fluca1978.github.io/2020/07/27/PostgreSQLWALExplain<p>The upcoming version of PostgreSQL now includes new information in the EXPLAIN output.</p>
<h1 id="postgresql-13-explain-now-includes-wal-information">PostgreSQL 13 Explain now includes WAL information</h1>
<p>The upcoming PostgreSQL 13 includes a lot of new features, as is by now a well-established habit for every release. One interesting feature among the others is that <a href="https://www.postgresql.org/docs/13/sql-explain.html" target="_blank"><code class="language-plaintext highlighter-rouge">EXPLAIN</code> now supports a new <code class="language-plaintext highlighter-rouge">WAL</code> option</a> (that requires <code class="language-plaintext highlighter-rouge">ANALYZE</code> to be set).
<br />
This new <code class="language-plaintext highlighter-rouge">WAL</code> feature allows <code class="language-plaintext highlighter-rouge">EXPLAIN</code> to provide information about the generated amount of WAL traffic.
It is quite simple to see it in action:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">foo</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span> <span class="k">generated</span> <span class="n">always</span> <span class="k">as</span> <span class="k">identity</span><span class="p">,</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">);</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span> <span class="k">ANALYZE</span><span class="p">,</span> <span class="n">WAL</span><span class="p">,</span> <span class="n">FORMAT</span> <span class="n">yaml</span> <span class="p">)</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">foo</span><span class="p">(</span> <span class="n">t</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">md5</span><span class="p">(</span> <span class="n">v</span><span class="p">::</span><span class="nb">text</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">300000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">------------------------------------------</span>
<span class="o">-</span> <span class="n">Plan</span><span class="p">:</span> <span class="o">+</span>
<span class="n">Node</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"ModifyTable"</span> <span class="o">+</span>
<span class="k">Operation</span><span class="p">:</span> <span class="nv">"Insert"</span> <span class="o">+</span>
<span class="n">Parallel</span> <span class="n">Aware</span><span class="p">:</span> <span class="k">false</span> <span class="o">+</span>
<span class="n">Relation</span> <span class="n">Name</span><span class="p">:</span> <span class="nv">"foo"</span> <span class="o">+</span>
<span class="k">Alias</span><span class="p">:</span> <span class="nv">"foo"</span> <span class="o">+</span>
<span class="n">Startup</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00</span> <span class="o">+</span>
<span class="n">Total</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">6000</span><span class="p">.</span><span class="mi">00</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">300000</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="n">Width</span><span class="p">:</span> <span class="mi">36</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Startup</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">508</span><span class="p">.</span><span class="mi">168</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Total</span> <span class="nb">Time</span><span class="p">:</span> <span class="mi">508</span><span class="p">.</span><span class="mi">168</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">0</span> <span class="o">+</span>
<span class="n">Actual</span> <span class="n">Loops</span><span class="p">:</span> <span class="mi">1</span> <span class="o">+</span>
<span class="n">WAL</span> <span class="n">Records</span><span class="p">:</span> <span class="mi">309091</span> <span class="o">+</span>
<span class="n">WAL</span> <span class="n">FPI</span><span class="p">:</span> <span class="mi">0</span> <span class="o">+</span>
<span class="n">WAL</span> <span class="n">Bytes</span><span class="p">:</span> <span class="mi">28500009</span> <span class="o">+</span>
<span class="p">...</span>
</code></pre></div></div>
<p><br />
<br />
As you can see, the output of <code class="language-plaintext highlighter-rouge">EXPLAIN</code> now includes three new fields:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">WAL Records</code>, as the name suggests, is the number of WAL records inserted into the logs;</li>
<li><code class="language-plaintext highlighter-rouge">WAL FPI</code> is the number of the <em>F</em>ull <em>P</em>age <em>I</em>mages inserted into the WALs;</li>
<li><code class="language-plaintext highlighter-rouge">WAL Bytes</code> is the amount of <em>traffic</em>, in bytes, generated towards the WALs.</li>
</ul>
<p>Clearly, the number of WAL records does not exactly match the number of tuples inserted by the query, but it is equal or greater. You can check this with a small number of inserts:</p>
<p><br />
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>testdb=> EXPLAIN ( ANALYZE, WAL, FORMAT yaml )
INSERT INTO foo( t )
SELECT md5( v::text )
FROM generate_series( 1, 3 ) v;
QUERY PLAN
------------------------------------------
- Plan: +
Node Type: "ModifyTable" +
Operation: "Insert" +
...
WAL Records: 3 +
WAL FPI: 0 +
WAL Bytes: 276 +
...
</code></pre></div></div>
<p><br />
<br /></p>
<p>I think this can help in understanding the amount of traffic passing through the WALs, and therefore in tuning the checkpoint-related settings properly, even in a more aggressive way.
<br />
<strong><a href="https://www.postgresql.org/docs/13/auto-explain.html" target="_blank"><code class="language-plaintext highlighter-rouge">auto_explain</code></a> does support WAL information dump too</strong>, via the special configuration parameter <code class="language-plaintext highlighter-rouge">auto_explain.log_wal</code>.
<br />
<br /></p>
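<p>To give an idea of how this could be tried out in a single session, here is a minimal sketch. The helper name <code class="language-plaintext highlighter-rouge">explain_wal_in_log</code> is made up for this example, and it assumes you connect as a superuser (needed to <code class="language-plaintext highlighter-rouge">LOAD</code> the module and change its settings); keep in mind that <code class="language-plaintext highlighter-rouge">auto_explain.log_wal</code> has an effect only when <code class="language-plaintext highlighter-rouge">auto_explain.log_analyze</code> is enabled too:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># explain_wal_in_log DBNAME QUERY
# Hypothetical helper: run QUERY in a session where auto_explain
# logs every plan, including the WAL usage, to the server log.
explain_wal_in_log() {
    local dbname=$1 query=$2
    psql -d "$dbname" \
         -c "LOAD 'auto_explain';" \
         -c "SET auto_explain.log_min_duration = 0;" \
         -c "SET auto_explain.log_analyze = on;" \
         -c "SET auto_explain.log_wal = on;" \
         -c "$query"
}

# Example usage (testdb and foo are the names used above):
# explain_wal_in_log testdb "INSERT INTO foo( t ) SELECT md5( 'x' );"
</code></pre></div></div>
<p>The plan, with its WAL figures, ends up in the server log rather than in the terminal.</p>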
PostgreSQL 13 Beta 2: it's your time to help testing!2020-07-13T00:00:00+00:00https://fluca1978.github.io/2020/07/13/PostgreSQL13BetaPgenv<p>We are approaching the great 13 release, help the team testing it!</p>
<h1 id="postgresql-13-beta-2-its-your-time-to-help-testing">PostgreSQL 13 Beta 2: it’s your time to help testing!</h1>
<p>We are approaching very quickly (and on time) the <em><a href="https://www.postgresql.org/about/news/2047/" target="_blank">PostgreSQL 13</a></em> release, and <strong>we can all help test it</strong>, to provide feedback and get ready for the next version.
<br />
As I’ve often written, a very easy way to install and test the new version alongside the one you are already using (but not in production!) is by means of <a href="https://github.com/theory/pgenv" target="_blank">pgenv</a>.
<br />
The only thing you have to do is <code class="language-plaintext highlighter-rouge">pgenv build 13beta2</code>, or if you are more curious:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>luca@miguel ~ % pgenv available 13
Available PostgreSQL Versions
<span class="o">========================================================</span>
PostgreSQL 13
<span class="nt">------------------------------------------------</span>
13beta1 13beta2
luca@miguel ~ % pgenv build 13beta2
...
PostgreSQL, contrib, and documentation installation complete.
pgenv configuration written to file /home/luca/git/pgenv/.pgenv.13beta2.conf
PostgreSQL 13beta2 built
</code></pre></div></div>
<p><br />
<br />
Once the system has been compiled, you can start it and use it:
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>luca@miguel ~ % pgenv use 13beta2
WARNING:
your PATH enrvironemnt variable does not seem to include
/home/luca/git/pgenv/pgsql/bin
as an entry. You will not be able to use the currently
selected PostgreSQL binaries.
HINT:
adjust your PATH variable to include
/home/luca/git/pgenv/pgsql/bin
<span class="k">for </span>instance
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>/home/luca/git/pgenv/pgsql/bin:<span class="nv">$PATH</span>
Already using PostgreSQL 13beta2
waiting <span class="k">for </span>server to start.... <span class="k">done
</span>server started
PostgreSQL 13beta2 started
Logging to /home/luca/git/pgenv/pgsql/data/server.log
</code></pre></div></div>
<p><br />
As the <code class="language-plaintext highlighter-rouge">pgenv</code> output suggests, it is better to modify your <code class="language-plaintext highlighter-rouge">PATH</code> to get the new executables:</p>
<p><br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>luca@miguel ~ % <span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>/home/luca/git/pgenv/pgsql/bin:<span class="nv">$PATH</span>
</code></pre></div></div>
<p><br />
and you can make this a permanent change in your shell configuration.
<br />
<br />
Now, let’s connect to the cluster:</p>
<p><br />
<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>luca@miguel ~ % psql <span class="nt">-U</span> postgres template1 <span class="nt">-c</span> <span class="s1">'SHOW server_version;'</span>
server_version
<span class="nt">----------------</span>
13beta2
<span class="o">(</span>1 row<span class="o">)</span>
luca@miguel ~ % psql <span class="nt">-U</span> postgres template1 <span class="nt">-c</span> <span class="s1">'SELECT version();'</span>
version
<span class="nt">------------------------------------------------------------------------------------------------------------</span>
PostgreSQL 13beta2 on x86_64-unknown-freebsd12.1, compiled by gcc <span class="o">(</span>FreeBSD Ports Collection<span class="o">)</span> 9.2.0, 64-bit
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><br />
Happy testing!</p>
replace vs regexp_replace2020-07-09T00:00:00+00:00https://fluca1978.github.io/2020/07/09/ReplaceVSRegexpReplace<p>Some considerations about the usage of <code class="language-plaintext highlighter-rouge">replace</code> or <code class="language-plaintext highlighter-rouge">regexp_replace</code>.</p>
<h1 id="replace-vs-regexp_replace">replace vs regexp_replace</h1>
<p>While trying to help Stefan Stefanov with his <a href="https://github.com/stefanov-sm/pg_spreadsheetml" target="_blank"><code class="language-plaintext highlighter-rouge">pg_spreadsheetml</code></a> I came across something that should have been obvious, but was not so obvious to me.
<br />
The obvious thing is <em><code class="language-plaintext highlighter-rouge">replace</code> is generally faster than <code class="language-plaintext highlighter-rouge">regexp_replace</code></em>.
<br />
The fact is that, probably due to my heavy usage of Perl and Raku, I tend to use regular expressions even where they are not really required, and that is why I tried to change a nested invocation of <code class="language-plaintext highlighter-rouge">replace</code> into one of <code class="language-plaintext highlighter-rouge">regexp_replace</code>. The <a href="https://github.com/stefanov-sm/pg_spreadsheetml/pull/3/commits/0e931cc212572be9db190bb761ef7d758fd61b2e" target="_blank">pull request, and in particular the commit</a> did transform something like:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">replace</span><span class="p">(</span><span class="k">replace</span><span class="p">(</span><span class="k">replace</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="s1">'&'</span><span class="p">,</span> <span class="s1">'&amp;'</span><span class="p">),</span> <span class="s1">'>'</span><span class="p">,</span> <span class="s1">'&gt;'</span><span class="p">),</span> <span class="s1">'<'</span><span class="p">,</span> <span class="s1">'&lt;'</span><span class="p">);</span>
</code></pre></div></div>
<p>into something like</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">regexp_replace</span><span class="p">(</span> <span class="n">regexp_replace</span><span class="p">(</span> <span class="n">regexp_replace</span><span class="p">(</span> <span class="n">s</span><span class="p">,</span> <span class="s1">'&'</span><span class="p">,</span> <span class="s1">'&amp;'</span><span class="p">,</span> <span class="s1">'g'</span> <span class="p">)</span>
<span class="p">,</span> <span class="s1">'>'</span>
<span class="p">,</span> <span class="s1">'&gt;'</span>
<span class="p">,</span> <span class="s1">'g'</span> <span class="p">)</span>
<span class="p">,</span> <span class="s1">'<'</span>
<span class="p">,</span> <span class="s1">'&lt;'</span>
<span class="p">,</span> <span class="s1">'g'</span> <span class="p">);</span>
</code></pre></div></div>
<p><br />
<br />
Now, despite the newlines, the usage of <code class="language-plaintext highlighter-rouge">regexp_replace</code> resulted in slower code.
So we decided to benchmark, and I decided in particular to test it with <code class="language-plaintext highlighter-rouge">pgbench</code>.</p>
<h2 id="testing-with-pgbench">Testing with <code class="language-plaintext highlighter-rouge">pgbench</code></h2>
<p>I created <a href="https://github.com/fluca1978/fluca1978-pg-utils/tree/master/examples/regexp_replace_becnhmarking" target="_blank">three SQL scripts</a> that essentially do the following:</p>
<ul>
<li>loop from 1 to the <code class="language-plaintext highlighter-rouge">:scale</code>;</li>
<li>build a single piece of XML with a slightly different content at every iteration, to avoid caching;</li>
<li>perform the substitution in three different ways
<ul>
<li>with <code class="language-plaintext highlighter-rouge">replace</code></li>
<li>with <code class="language-plaintext highlighter-rouge">regexp_replace</code></li>
<li>with <code class="language-plaintext highlighter-rouge">regexp_replace</code> and backreferences</li>
</ul>
</li>
<li>store the results with timing (<code class="language-plaintext highlighter-rouge">clock_timestamp()</code>) into a table for later analysis.</li>
</ul>
<p><br />
<br />
I did run the tests in a way similar to the following:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgbench <span class="nt">-s</span> 300000 <span class="nt">-f</span> benchmark_regexp_replace_compact.sql <span class="nt">-U</span> luca testdb
</code></pre></div></div>
<p><br />
and at the end I queried the results, grouped by type of test.</p>
<h2 id="results">Results</h2>
<p>Getting the results is quite straightforward, and on my PostgreSQL 12.2 I got:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">replacement_type</span><span class="p">,</span> <span class="k">avg</span><span class="p">(</span> <span class="n">ms</span> <span class="p">),</span> <span class="k">min</span><span class="p">(</span> <span class="n">ms</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">ms</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">benchmark_replace</span> <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">replacement_type</span><span class="p">;</span>
<span class="n">replacement_type</span> <span class="o">|</span> <span class="k">avg</span> <span class="o">|</span> <span class="k">min</span> <span class="o">|</span> <span class="k">max</span>
<span class="c1">------------------------|------------------------|-----|----------</span>
<span class="n">regexp_replace</span> <span class="o">|</span> <span class="mi">2</span><span class="p">.</span><span class="mi">0656612333436503</span><span class="n">e</span><span class="o">-</span><span class="mi">05</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">039055</span>
<span class="n">regexp_replace_compact</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">00018001079899881362</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">06716</span>
<span class="k">replace</span> <span class="o">|</span> <span class="mi">4</span><span class="p">.</span><span class="mi">885953333294914</span><span class="n">e</span><span class="o">-</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">027875</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
which clearly shows that <code class="language-plaintext highlighter-rouge">replace</code> is, on average, roughly four times faster than <code class="language-plaintext highlighter-rouge">regexp_replace</code>, which in turn is roughly nine times faster than a <code class="language-plaintext highlighter-rouge">regexp_replace</code> with backreferences (the <code class="language-plaintext highlighter-rouge">regexp_replace_compact</code> row), as you could expect (even if I was hoping for a smaller difference, given the smaller number of function invocations).
<br />
It is also interesting that the maximum times are much closer to each other: each one is roughly 1.4 to 1.7 times that of the next faster approach.</p>
<h1 id="conclusions">Conclusions</h1>
<p>Even if the presented approach cannot be considered <em>good benchmarking</em>, it does emphasize how important it is to use the simplest function available for the task, in this case <code class="language-plaintext highlighter-rouge">replace</code>, when you don’t need any regular expression magic.</p>
PostgreSQL 12 Coin (by PGUS)2020-07-09T00:00:00+00:00https://fluca1978.github.io/2020/07/09/PostgreSQL12Coin<p>A very nice surprise in the mail!</p>
<h1 id="postgresql-12-coin-by-pgus">PostgreSQL 12 Coin (by PGUS)</h1>
<p>A couple of weeks ago I got a very nice surprise in the mail, but I was not able to write about it due to my eye problems and my current situation (a very few details can be found <a href="https://fluca1978.github.io/2020/06/29/PerlWeeklyChallenge67.html" target="_blank">here</a>).
<br />
Anyway, promoting PostgreSQL is important, so here I am to tell you about what I received.
<br />
<br />
Long story short: <strong>I received a PostgreSQL 12 commemorative coin!</strong></p>
<h2 id="the-envelope">The envelope</h2>
<p>First of all, I clearly recognized the name on the envelope: Mark Wong is the treasurer of the <a href="https://postgresql.us/" target="_blank">PostgreSQL US organization, namely <em>PGUS</em></a>.</p>
<p><br /></p>
<center>
<img src="/images/posts/postgresql12-coin/coin1.png" />
</center>
<p><br /></p>
<h2 id="the-content">The content</h2>
<p>The content of the envelope was <strong>an amazing PostgreSQL 12 coin</strong> and a couple of PostgreSQL 12 press kits.
I took a picture of the coin next to a pen, to give you an idea of its size.</p>
<p><br /></p>
<center>
<img src="/images/posts/postgresql12-coin/coin2.png" />
<br />
<br />
<img src="/images/posts/postgresql12-coin/coin3.png" size="50%" />
<br /><br />
<img src="/images/posts/postgresql12-coin/coin4.png" size="50%" />
</center>
<p><br />
<br /></p>
<h2 id="the-mission">The mission</h2>
<p>PostgreSQL has been an important part of my life so far.
<br />
I’m not a developer, but I started using it and being productive with it at work.
<br />
Then I organized conferences and seminars, and co-founded the Italian PostgreSQL Users’ Group (ITPUG), from which I literally escaped in 2016 due to clashes with the management.
<br />
Today I’m a PostgreSQL consultant.
<br />
Therefore, I can say that PostgreSQL has always been a part, even if sometimes a marginal one, of my working career. But this is not my mission, nor is it that of the community.
<br />
Our mission, as volunteers, is to improve PostgreSQL according to our capabilities and <em>to spread the word</em>, to let other professional and passionate people embrace this database and get inspired by it.
<br />
<br />
This coin, like other promotional material, is an important sign of our mission, and it reminds me (and us) how important it is to make PostgreSQL a <em>famous</em> product, even by means of non-technical stuff.</p>
<h2 id="2019-10-03">2019-10-03</h2>
<p>It is interesting to note that in the days right after the release of PostgreSQL 12 I was in Rome, teaching a professional course on PostgreSQL (of course!).
If my memory serves me well, it was October the 7th.</p>
<h2 id="postgresql-13">PostgreSQL 13</h2>
<p>No, it’s not a typo: PostgreSQL 13 is almost here, but I’m equally glad to have this <em>piece of art</em>.</p>
<h1 id="conclusions">Conclusions</h1>
<p>I would like to thank <a href="https://postgresql.us/" target="_blank">PGUS</a> for sending me the coin, which I take both as a reminder of how important it is to promote PostgreSQL and as a sign that I may have done my part of the mission and got <em>awarded</em> for it, even if the award could be completely made up in my mind (so please let me believe I’m right!).
<br />
<br />
Since this is a very hard time in my life, because of my eyes, it is something of a relief to know, or better to be reminded, that <strong>we are all part of an amazing community</strong>.</p>
ORA-2449 and the Constraint Dependencies2020-06-18T00:00:00+00:00https://fluca1978.github.io/2020/06/18/SQLDeveloperConstraintWarning<p>What happens if you try to drop a table that is referenced by another table?</p>
<h1 id="ora-2449-and-the-constraint-dependencies">ORA-2449 and the Constraint Dependencies</h1>
<p>Oracle clients seem somewhat goofy when you have to deal with dependencies.
<br />
Imagine you have two tables, <code class="language-plaintext highlighter-rouge">a</code> that references table <code class="language-plaintext highlighter-rouge">b</code>; you can generate the tables as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SQL</span><span class="o">></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">a</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">);</span>
<span class="k">Table</span> <span class="n">created</span><span class="p">.</span>
<span class="k">SQL</span><span class="o">></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">b</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">);</span>
<span class="k">Table</span> <span class="n">created</span><span class="p">.</span>
<span class="k">SQL</span><span class="o">></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">a</span> <span class="k">ADD</span><span class="p">(</span> <span class="n">b_ref</span> <span class="nb">int</span> <span class="k">REFERENCES</span> <span class="n">b</span><span class="p">(</span><span class="n">pk</span><span class="p">)</span> <span class="p">);</span>
<span class="k">Table</span> <span class="n">altered</span><span class="p">.</span>
<span class="k">SQL</span><span class="o">></span> <span class="k">COMMIT</span><span class="p">;</span>
<span class="k">Commit</span> <span class="n">complete</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
As you can see, the tables are empty: there is no actual data, but there is a clear reference made by the foreign key <code class="language-plaintext highlighter-rouge">b_ref</code> that connects table <code class="language-plaintext highlighter-rouge">a</code> to table <code class="language-plaintext highlighter-rouge">b</code>.
<br />
So far, so good!
<br />
Now, let’s try to drop table <code class="language-plaintext highlighter-rouge">b</code>, on which <code class="language-plaintext highlighter-rouge">a</code> depends:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SQL</span><span class="o">></span> <span class="k">DROP</span> <span class="k">TABLE</span> <span class="n">b</span><span class="p">;</span>
<span class="k">DROP</span> <span class="k">TABLE</span> <span class="n">b</span>
<span class="o">*</span>
<span class="n">ERROR</span> <span class="k">at</span> <span class="n">line</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">ORA</span><span class="o">-</span><span class="mi">02449</span><span class="p">:</span> <span class="k">unique</span><span class="o">/</span><span class="k">primary</span> <span class="n">keys</span> <span class="k">in</span> <span class="k">table</span> <span class="n">referenced</span> <span class="k">by</span> <span class="k">foreign</span> <span class="n">keys</span>
</code></pre></div></div>
<p><br />
Great! Oracle, as expected, is telling us that we cannot drop the referenced table unless we remove the dependency from the dependent object.
<br />
However, please note how <strong>Oracle is not telling us what dependency is preventing us from dropping the table</strong>!
<br />
The situation is pretty much the same if you use the <code class="language-plaintext highlighter-rouge">SQL Developer</code> client:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/oracle/oracle_constraint_warning.png" />
</center>
<p><br />
<br /></p>
<p>Please note how the <code class="language-plaintext highlighter-rouge">SQL Developer</code> warning dialog even suggests that we run a query against the Oracle catalogs to see which constraints are making the <code class="language-plaintext highlighter-rouge">DROP</code> fail. Moreover, Oracle <code class="language-plaintext highlighter-rouge">SQL Developer</code> is too lazy to complete the query for us: instead of placing the table name in the query and presenting us with a <em>copy-and-paste</em> ready statement, it tells us to execute something like <code class="language-plaintext highlighter-rouge">SELECT ... WHERE TABLE_NAME = 'tabname'</code>.
<br />
I’m pretty sure that someone, at least once, has executed a query searching for a table named <em><code class="language-plaintext highlighter-rouge">tabname</code></em>!
<br />
<br />
<strong>Why is Oracle not giving us a hint about the constraints?</strong>
<br />
<br />
Being used to PostgreSQL, I would say that giving such a hint should be the correct behavior. In fact, if you try this in PostgreSQL you get a clear message about which constraint is preventing you from dropping the table:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">a</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span> <span class="k">primary</span> <span class="k">key</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">b</span><span class="p">(</span> <span class="n">pk</span> <span class="nb">serial</span> <span class="k">primary</span> <span class="k">key</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">alter</span> <span class="k">table</span> <span class="n">a</span> <span class="k">add</span> <span class="n">b_ref</span> <span class="nb">int</span> <span class="k">references</span> <span class="n">b</span><span class="p">(</span><span class="n">pk</span><span class="p">);</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">drop</span> <span class="k">table</span> <span class="n">b</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">cannot</span> <span class="k">drop</span> <span class="k">table</span> <span class="n">b</span> <span class="n">because</span> <span class="n">other</span> <span class="n">objects</span> <span class="n">depend</span> <span class="k">on</span> <span class="n">it</span>
<span class="n">DETAIL</span><span class="p">:</span> <span class="k">constraint</span> <span class="n">a_b_ref_fkey</span> <span class="k">on</span> <span class="k">table</span> <span class="n">a</span> <span class="n">depends</span> <span class="k">on</span> <span class="k">table</span> <span class="n">b</span>
<span class="n">HINT</span><span class="p">:</span> <span class="n">Use</span> <span class="k">DROP</span> <span class="p">...</span> <span class="k">CASCADE</span> <span class="k">to</span> <span class="k">drop</span> <span class="n">the</span> <span class="n">dependent</span> <span class="n">objects</span> <span class="n">too</span><span class="p">.</span>
</code></pre></div></div>
<p><br />
In particular, the message <code class="language-plaintext highlighter-rouge">constraint a_b_ref_fkey on table a depends on table b</code> gives us a really clear explanation of what we should search for to fix the “problem”.
<br />
Moreover, PostgreSQL is reminding us that, if we want to quickly get rid of the table, we can use the <code class="language-plaintext highlighter-rouge">DROP...CASCADE</code> statement to force PostgreSQL to take action.</p>
PostgreSQL 11 Server Side Programming Errata Corrige2020-06-17T00:00:00+00:00https://fluca1978.github.io/2020/06/17/PG11SSP_errata<p>A reader gave us feedback about a wrong listing.</p>
<h2 id="postgresql-11-server-side-programming-errata-corrige">PostgreSQL 11 Server Side Programming Errata Corrige</h2>
<p>I have already written about how my first book on PostgreSQL, named <strong><a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide">PostgreSQL 11 Server Side Programming Quick Start Guide</a></strong>, gained more attention.</p>
<p><br /><br /><br /></p>
<center>
<a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide" target="_blank">
<img src="/images/posts/pg11ssp/cover.png" />
</a>
</center>
<p><br /><br /><br />
Gaining attention also means that readers may find problems and errors, <strong>and this is good</strong> (to me)!
<br /></p>
<p>The first problem that has been reported to me is described here, so that, if you are reading the book, you can better understand and deal with it.</p>
<p><br /><br /></p>
<h2 id="listing-8-on-chapter-3">Listing 8 on Chapter 3</h2>
<p>The <em>Listing 8</em> in chapter 3 is wrong, and in particular it is the very same listing as <em>Listing 13</em> later in the chapter.
The problem is that the shown listing 8 does not include a variable, namely <code class="language-plaintext highlighter-rouge">file_type</code>, that is referenced in the text.
<br />
Therefore, if you are dealing with that particular example, please consider that the right listing is reported <a href="https://github.com/PacktPublishing/PostgreSQL-11-Quick-Start-Guide/blob/master/Chapter03/Chapter03_Listing08.sql" target="_blank">on the official GitHub repository</a>.</p>
<p><br /><br />
I’m really sorry about the misplaced listing; I hope this helps make the example easier to follow.</p>
Running pgbackrest on FreeBSD2020-06-12T00:00:00+00:00https://fluca1978.github.io/2020/06/12/pgbackrestOnFreeBSD<p>pgbackrest is an amazing backup solution for PostgreSQL, quite frankly it is my favourite. And now it fully supports FreeBSD too!</p>
<h1 id="running-pgbackrest-on-freebsd">Running pgbackrest on FreeBSD</h1>
<p><a href="https://pgbackrest.org/index.html" target="_blank">pgbackrest</a> is an amazing tool for backup and recovery of a PostgreSQL database.
Quite frankly, <strong>it is my favourite backup solution</strong> because it is reliable, fast and supports a lot of interesting features including retention policies and encryption.
<br />
<a href="https://fluca1978.github.io/2019/03/04/pgbackrest_FreeBSD.html" target="_blank">I have already written</a> about some problems in running <a href="https://pgbackrest.org/index.html" target="_blank">pgbackrest</a> on FreeBSD; the problems were not related to the application itself, but rather to the compilation process.
<br />
I’m really glad that now <a href="https://pgbackrest.org/index.html" target="_blank">pgbackrest</a> fully supports non-Linux platforms, including FreeBSD, thanks to the changes in the compilation approach. It is therefore a simple process to get pgbackrest installed on your FreeBSD machine!</p>
<h2 id="installing-pgbackrest-on-freebsd">Installing pgbackrest on FreeBSD</h2>
<p>In order to see how simple it is now to install pgbackrest on FreeBSD, let’s download the latest stable release, <code class="language-plaintext highlighter-rouge">2.27</code>, and install it. The only caveat is that the project needs to be compiled with <code class="language-plaintext highlighter-rouge">GNU make</code>, which means you have to type <code class="language-plaintext highlighter-rouge">gmake</code> instead of the usual <code class="language-plaintext highlighter-rouge">make</code>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% wget https://github.com/pgbackrest/pgbackrest/archive/release/2.27.tar.gz
% <span class="nb">tar </span>xzvf 2.27.tar.gz
% <span class="nb">cd </span>pgbackrest-release-2.27
% <span class="nb">cd </span>src
% ./configure <span class="nt">--prefix</span><span class="o">=</span>/usr/local/pgbackrest
% gmake
% <span class="nb">sudo </span>gmake <span class="nb">install</span>
</code></pre></div></div>
<p>I’ve decided to install it into a specific path, <code class="language-plaintext highlighter-rouge">/usr/local/pgbackrest</code>, just to avoid messing with other binaries, but you can install it in the default FreeBSD location <code class="language-plaintext highlighter-rouge">/usr/local/</code>. If everything was successful, you can then proceed to testing the program:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>/usr/local/pgbackrest/bin:<span class="nv">$PATH</span>
% pgbackrest
pgBackRest 2.27 - General <span class="nb">help
</span>Usage:
pgbackrest <span class="o">[</span>options] <span class="o">[</span><span class="nb">command</span><span class="o">]</span>
Commands:
archive-get Get a WAL segment from the archive.
archive-push Push a WAL segment to the archive.
backup Backup a database cluster.
check Check the configuration.
expire Expire backups that exceed retention.
<span class="nb">help </span>Get help.
info Retrieve information about backups.
restore Restore a database cluster.
stanza-create Create the required stanza data.
stanza-delete Delete a stanza.
stanza-upgrade Upgrade a stanza.
start Allow pgBackRest processes to run.
stop Stop pgBackRest processes from running.
version Get version.
Use <span class="s1">'pgbackrest help [command]'</span> <span class="k">for </span>more information.
</code></pre></div></div>
<p>Great! Installing on FreeBSD is now really simple!</p>
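<p>If the same build steps have to run on both FreeBSD and Linux, the choice between <code class="language-plaintext highlighter-rouge">gmake</code> and <code class="language-plaintext highlighter-rouge">make</code> can be automated. The following is only a sketch: the helper name <code class="language-plaintext highlighter-rouge">gnu_make_cmd</code> is made up for illustration, and it assumes that on systems without a <code class="language-plaintext highlighter-rouge">gmake</code> binary the plain <code class="language-plaintext highlighter-rouge">make</code> is GNU make, as is typical on Linux.</p>

```shell
# gnu_make_cmd is a hypothetical helper, not part of pgbackrest.
# It prints "gmake" when such a command exists (as on FreeBSD once the
# GNU make package is installed) and falls back to "make" otherwise,
# assuming that plain make is GNU make on that system.
gnu_make_cmd() {
    if command -v gmake >/dev/null 2>&1; then
        echo gmake
    else
        echo make
    fi
}

# Usage, from the src directory after running ./configure:
#   MAKE=$(gnu_make_cmd)
#   "$MAKE" && sudo "$MAKE" install
```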
<h2 id="some-recent-history-about-pgbackrest">Some recent history about pgbackrest</h2>
<p>In the last few months the project has improved a lot, and I’m not going to quote the whole <a href="https://pgbackrest.org/release.html" target="_blank">release history</a> here. However, there are two major aspects that I found really interesting.</p>
<h3 id="autoconf">Autoconf</h3>
<p>As you have probably noticed in the above installation example, <a href="https://pgbackrest.org/index.html" target="_blank">pgbackrest</a> now uses <a href="https://www.gnu.org/software/autoconf/" target="_blank">autoconf</a> to <em>understand</em> how to correctly configure the project for the hosting operating system. Autoconf <a href="https://github.com/pgbackrest/pgbackrest/commit/027c2638719dffa9ba99250085c403e89a2a8a9a" target="_blank">was introduced in the previous year</a> as a reaction to a <a href="https://github.com/pgbackrest/pgbackrest/pull/690" target="_blank">pull request I opened to compile on FreeBSD</a>.</p>
<h3 id="migrating-to-c">Migrating to C</h3>
<p><a href="https://pgbackrest.org/index.html" target="_blank">pgbackrest</a> was initially developed mainly in Perl, with small parts written in C to deal with performance and the internals of the PostgreSQL WAL file format.
<br />
As of January 2020, release <code class="language-plaintext highlighter-rouge">2.21</code>, the whole codebase is in C. Well, this is not fully true, since the testing and documentation part is still written in Perl, at least to my understanding, but the whole <code class="language-plaintext highlighter-rouge">pgbackrest</code> production thing is now in C.
<br />
The fact that the application is now written in C marks a clear distinction between <code class="language-plaintext highlighter-rouge">pgbackrest</code> and other similar backup solutions, which instead rely on existing tools and act as “glue” between small pieces. Moreover, it means that the backup, and most notably the restore, can run at full speed.</p>
<h3 id="my-little-messy-contribution">My little messy contribution</h3>
<p>A long time ago… I tried to contribute to a requested feature that sounded very easy to implement, and of course it was not!
<br />
Since version <code class="language-plaintext highlighter-rouge">2.25</code> there is the <code class="language-plaintext highlighter-rouge">--dry-run</code> flag for the <code class="language-plaintext highlighter-rouge">expire</code> command:</p>
<blockquote>
<p>Add <code class="language-plaintext highlighter-rouge">--dry-run</code> option to the expire command.
Use dry-run to see which backups/archive would be removed by the expire command
without actually removing anything. (Contributed by Cynthia Shang, Luca Ferrari.
Reviewed by David Steele. Suggested by Marc Cousin.)</p>
</blockquote>
<p>Unluckily, I was unable to complete the effort because I could not get the testing system to work, <a href="https://github.com/pgbackrest/pgbackrest/pull/853" target="_blank">and it was my fault</a>: I underestimated the problem. But there are two pieces of very good news about this:</p>
<ul>
  <li>the project provided me with very quick, polite and constant support in trying to fix my issues;</li>
  <li>they required me to test my changes instead of doing the testing by themselves.</li>
</ul>
<p>Why is this good news? First of all, other projects are not so reactive when new contributions come in, and I think this is very important for the health of the project. Second, testing a feature means that the project will not introduce regressions, and forcing every developer to test their own changes is a very good habit.</p>
<h1 id="conclusions">Conclusions</h1>
<p>I have already used <code class="language-plaintext highlighter-rouge">pgbackrest</code> on FreeBSD, but now that it <em>natively</em> supports this platform I believe that the project will attract more and more users. Moreover, now that all the code has been converted to C, the already optimal performance will be even more impressive.
<br />
<br />
<strong><a href="https://pgbackrest.org/index.html" target="_blank">pgbackrest</a> is definitely my backup solution of choice</strong>, not only for its features, but also for the clean and rigorous way the project is maintained and improved.</p>
Locating the PostgreSQL configuration file2020-06-08T00:00:00+00:00https://fluca1978.github.io/2020/06/08/FindPostgreSQLConfiguration<p>How to find the PostgreSQL configuration file on an unknown system?</p>
<h1 id="locating-the-postgresql-configuration-file">Locating the PostgreSQL configuration file</h1>
<p>Sometimes you get to manage a PostgreSQL instance on an <em>unknown</em> system, and this means you don’t know how to locate the PostgreSQL configuration file.
<br />
An example could be when you are running PostgreSQL in a Docker container:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@ff20ff72ee64:/# ps <span class="nt">-auxwww</span>
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 1 0.0 0.8 288596 18076 ? Ss 16:05 0:00 postgres
postgres 22 0.0 0.1 288732 3860 ? Ss 16:05 0:00 postgres: checkpointer
postgres 23 0.0 0.1 288596 3096 ? Ss 16:05 0:00 postgres: background writer
postgres 24 0.0 0.3 288596 6260 ? Ss 16:05 0:00 postgres: walwriter
postgres 25 0.0 0.1 289024 3068 ? Ss 16:05 0:00 postgres: autovacuum launcher
postgres 26 0.0 0.1 143856 2264 ? Ss 16:05 0:00 postgres: stats collector
postgres 27 0.0 0.1 288884 2564 ? Ss 16:05 0:00 postgres: logical replication launcher
root 70 0.0 0.1 19856 2236 pts/0 Ss 16:12 0:00 /bin/bash
</code></pre></div></div>
<p><br />
Assuming you have the credentials of a PostgreSQL <em>super user</em>, you can ask PostgreSQL itself:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@ff20ff72ee64:/# psql <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s1">'SHOW config_file'</span>
config_file
<span class="nt">------------------------------------------</span>
/var/lib/postgresql/data/postgresql.conf
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<blockquote>
<p>You should have the administrator user credentials, since you have been assigned to manage this PostgreSQL instance!</p>
</blockquote>
<p>Of course, you can have your credentials stored into the <code class="language-plaintext highlighter-rouge">.pgpass</code> file, as usual.
<br />
You can also save some extra “attaching” operations by making <code class="language-plaintext highlighter-rouge">docker</code> do all the work for you:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker <span class="nb">exec</span> <span class="nt">-it</span> db psql <span class="nt">-U</span> ckan <span class="nt">-c</span> <span class="s1">'SHOW config_file'</span>
config_file
<span class="nt">------------------------------------------</span>
/var/lib/postgresql/data/postgresql.conf
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><br />
There are other methods, of course, including a search with <code class="language-plaintext highlighter-rouge">find(1)</code> or <code class="language-plaintext highlighter-rouge">locate</code>.</p>
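<p>A minimal sketch of the <code class="language-plaintext highlighter-rouge">find(1)</code> approach could look like the following. The helper name <code class="language-plaintext highlighter-rouge">find_pg_conf</code> and the directories in the usage comment are assumptions made for illustration; adjust them to the system you are inspecting.</p>

```shell
# find_pg_conf is a hypothetical helper: it prints every regular file
# named postgresql.conf found below the directories given as arguments.
# Errors (e.g., unreadable directories) are silently discarded.
find_pg_conf() {
    find "$@" -type f -name postgresql.conf 2>/dev/null
}

# Usage (the directories are only common guesses):
#   find_pg_conf /etc /var/lib /usr/local
```

<p>Keep in mind that a file found this way is only a candidate: a system can host several clusters or leftover copies, so <code class="language-plaintext highlighter-rouge">SHOW config_file</code> remains the reliable answer for a running instance.</p>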
WAL, LSN and File Names2020-05-28T00:00:00+00:00https://fluca1978.github.io/2020/05/28/PostgreSQLWalNames<p>Understanding the relationship between LSN and WAL file names.</p>
<h1 id="wal-lsn-and-file-names">WAL, LSN and File Names</h1>
<p>PostgreSQL stores the changes it is going to apply to data in the <strong>Write Ahead Logs (WALs)</strong>, which are usually <code class="language-plaintext highlighter-rouge">16 MB</code> each in size, even if you can configure your cluster (starting from version 11) to use a different size.
<br />
PostgreSQL knows at which part of the <code class="language-plaintext highlighter-rouge">16 MB</code> file (named <em>segment</em>) it is by an offset that is tied to the <strong>Log Sequence Number (LSN)</strong>. Let’s see those in action.
<br />
First of all, let’s get some information about the current status:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_current_wal_lsn</span><span class="p">(),</span>
<span class="n">pg_walfile_name</span><span class="p">(</span> <span class="n">pg_current_wal_lsn</span><span class="p">()</span> <span class="p">);;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">------|-------------------------</span>
<span class="n">pg_current_wal_lsn</span> <span class="o">|</span> <span class="k">C</span><span class="o">/</span><span class="n">CE7BAD70</span>
<span class="n">pg_walfile_name</span> <span class="o">|</span> <span class="mi">000000010000000</span><span class="n">C000000CE</span>
</code></pre></div></div>
<p>The server is currently using the WAL file named <code class="language-plaintext highlighter-rouge">000000010000000C000000CE</code>.
It is possible to see the relationship between the LSN, currently <code class="language-plaintext highlighter-rouge">C/CE7BAD70</code> and the WAL file name as follows.
With the default segment size, the LSN is made up of three pieces: <code class="language-plaintext highlighter-rouge">X/YYZZZZZZ</code> where:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">X</code> represents the middle part of the WAL file name, one or more hexadecimal symbols;</li>
  <li><code class="language-plaintext highlighter-rouge">YY</code> represents the final part of the WAL file name;</li>
  <li><code class="language-plaintext highlighter-rouge">ZZZZZZ</code> are six symbols that represent the offset within the file.</li>
</ul>
<p>Therefore, given the LSN <code class="language-plaintext highlighter-rouge">C/CE7BAD70</code> we can assume that the middle part of the WAL file name will be <code class="language-plaintext highlighter-rouge">C</code> and the last part will be <code class="language-plaintext highlighter-rouge">CE</code>, both zero padded to 8 symbols, so respectively <code class="language-plaintext highlighter-rouge">0000000C</code> and <code class="language-plaintext highlighter-rouge">000000CE</code>. Concatenated together, they provide us with a file name that ends with <code class="language-plaintext highlighter-rouge">0000000C000000CE</code>. The initial part of the file name is still missing: it is the timeline the server is running on, in this case <code class="language-plaintext highlighter-rouge">1</code>, zero padded like the other parts, so <code class="language-plaintext highlighter-rouge">00000001</code>, which gives us the final name <code class="language-plaintext highlighter-rouge">000000010000000C000000CE</code>.
<br />
To summarize, the following is the correspondence between the single parts:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LSN -> C / CE 7BAD70
WAL -> 00000001 0000000C 000000CE
</code></pre></div></div>
<p><br />
<br /></p>
<blockquote>
<p>Please consider that the above example is just meant to show you the concept; it is better to
use the function <code class="language-plaintext highlighter-rouge">pg_walfile_name()</code> to get the exact WAL file name from an LSN, since a WAL switch
may lead to incorrect results from the “manual decoding” of the LSN.</p>
</blockquote>
<p><br />
<br />
The final part of the LSN is the offset within the WAL file, and it suffices to convert it to <code class="language-plaintext highlighter-rouge">int</code> to get an idea:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="p">(</span> <span class="n">x</span><span class="s1">'7BAD70'</span> <span class="p">)::</span><span class="nb">int</span> <span class="k">AS</span> <span class="k">offset</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">---</span>
<span class="k">offset</span> <span class="o">|</span> <span class="mi">8105328</span>
</code></pre></div></div>
<p>You can get the same information with the special function <code class="language-plaintext highlighter-rouge">pg_walfile_name_offset()</code>, to which you can pass the LSN, and get the current filename and the offset in a single run:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="p">(</span> <span class="n">x</span><span class="s1">'7BAD70'</span> <span class="p">)::</span><span class="nb">int</span> <span class="k">AS</span> <span class="n">offset_computed</span><span class="p">,</span> <span class="n">pg_walfile_name_offset</span><span class="p">(</span> <span class="s1">'C/CE7BAD70'</span> <span class="p">);</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">----------|-----------------------------------</span>
<span class="n">offset_computed</span> <span class="o">|</span> <span class="mi">8105328</span>
<span class="n">pg_walfile_name_offset</span> <span class="o">|</span> <span class="p">(</span><span class="mi">000000010000000</span><span class="n">C000000CE</span><span class="p">,</span><span class="mi">8105328</span><span class="p">)</span>
</code></pre></div></div>
<p><br />
To summarize, given a specific LSN the database is (and must be) clearly aware of the WAL segment the LSN refers to and of the exact offset, within that file, where the data can be found.</p>
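<p>The “manual decoding” described above can also be sketched in a few lines of shell, which may help to fix the idea. The helper names <code class="language-plaintext highlighter-rouge">lsn_to_walfile</code> and <code class="language-plaintext highlighter-rouge">lsn_to_offset</code> are made up for this example, and the sketch assumes the default 16 MB segment size; it does not handle the segment boundary cases mentioned above, so on a real system prefer <code class="language-plaintext highlighter-rouge">pg_walfile_name()</code> and <code class="language-plaintext highlighter-rouge">pg_walfile_name_offset()</code>.</p>

```shell
# Sketch only: decode an LSN by hand, assuming 16 MB WAL segments.
# lsn_to_walfile and lsn_to_offset are hypothetical helper names.
WAL_SEGMENT_SIZE=$(( 16 * 1024 * 1024 ))

# lsn_to_walfile TIMELINE LSN
# Prints the 24 character WAL file name: timeline, high part of the
# LSN and segment number, each zero padded to 8 hexadecimal symbols.
lsn_to_walfile() {
    local timeline=$1
    local high=${2%%/*}   # part before the slash (X)
    local low=${2##*/}    # part after the slash (YYZZZZZZ)
    printf '%08X%08X%08X\n' "$timeline" "$(( 0x$high ))" "$(( 0x$low / WAL_SEGMENT_SIZE ))"
}

# lsn_to_offset LSN
# Prints the byte offset of the LSN within its segment.
lsn_to_offset() {
    local low=${1##*/}
    printf '%d\n' "$(( 0x$low % WAL_SEGMENT_SIZE ))"
}

lsn_to_walfile 1 C/CE7BAD70   # prints 000000010000000C000000CE
lsn_to_offset C/CE7BAD70      # prints 8105328
```

<p>The integer division by the segment size isolates the <code class="language-plaintext highlighter-rouge">YY</code> part, while the remainder is the <code class="language-plaintext highlighter-rouge">ZZZZZZ</code> offset, which matches the values returned by PostgreSQL in the examples above.</p>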
Inspecting Command Tags and Events in event triggers2020-05-26T00:00:00+00:00https://fluca1978.github.io/2020/05/26/PostgreSQLEventTriggerDemo<p>Event triggers are a very powerful mechanism to react to data structure changes in PostgreSQL.</p>
<h1 id="inspecting-command-tags-and-events-in-event-triggers">Inspecting Command Tags and Events in event triggers</h1>
<p>While preparing an example about event triggers for a course of mine, I realized I had never proposed a <em>catch-all</em> event trigger debugging use case. So, here it is.
<br />
Event Triggers are a powerful mechanism that PostgreSQL provides to react to database schema changes, like table or column addition and deletion, object creation, and so on. The <a href="https://www.postgresql.org/docs/12/functions-event-triggers.html" target="_blank">official documentation</a> already presents a couple of examples about <em>dropping</em> objects or <em>rewriting</em> tables, so my little example is about more common commands. I create the following function:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_event_trigger_demo</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="n">EVENT_TRIGGER</span>
<span class="k">AS</span>
<span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">event_tuple</span> <span class="n">record</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'Event trigger function called '</span><span class="p">;</span>
<span class="k">FOR</span> <span class="n">event_tuple</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">pg_event_trigger_ddl_commands</span><span class="p">()</span> <span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'TAG [%] COMMAND [%]'</span><span class="p">,</span> <span class="n">event_tuple</span><span class="p">.</span><span class="n">command_tag</span><span class="p">,</span> <span class="n">event_tuple</span><span class="p">.</span><span class="n">object_type</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>It is quite simple to understand what it does: every time the function is triggered, it asks for the tuples out of the special function <code class="language-plaintext highlighter-rouge">pg_event_trigger_ddl_commands()</code>, which provides one tuple for every single command executed. Why multiple tuples? Because you could execute one command that explodes into different sub-commands.
<br />
Then the function simply prints the command tag and the object type.
<br />
Usually <em>command tags</em> are <em>uppercase</em>, while <em>object types</em> are <em>lowercase</em>.
<br />
The trigger can be created as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">create</span> <span class="n">event</span> <span class="k">trigger</span> <span class="n">tr_demo</span> <span class="k">on</span> <span class="n">ddl_command_end</span> <span class="k">execute</span> <span class="k">function</span> <span class="n">f_event_trigger_demo</span><span class="p">();</span>
</code></pre></div></div>
<p>It is now simple enough to test the trigger:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">create</span> <span class="k">table</span> <span class="n">foo</span><span class="p">();</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Event</span> <span class="k">trigger</span> <span class="k">function</span> <span class="k">called</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">TAG</span> <span class="p">[</span><span class="k">CREATE</span> <span class="k">TABLE</span><span class="p">]</span> <span class="n">COMMAND</span> <span class="p">[</span><span class="k">table</span><span class="p">]</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">alter</span> <span class="k">table</span> <span class="n">foo</span> <span class="k">add</span> <span class="k">column</span> <span class="n">i</span> <span class="nb">int</span> <span class="k">default</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Event</span> <span class="k">trigger</span> <span class="k">function</span> <span class="k">called</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">TAG</span> <span class="p">[</span><span class="k">ALTER</span> <span class="k">TABLE</span><span class="p">]</span> <span class="n">COMMAND</span> <span class="p">[</span><span class="k">table</span><span class="p">]</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">create</span> <span class="k">index</span> <span class="n">idx_foo</span> <span class="k">on</span> <span class="n">foo</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Event</span> <span class="k">trigger</span> <span class="k">function</span> <span class="k">called</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">TAG</span> <span class="p">[</span><span class="k">CREATE</span> <span class="k">INDEX</span><span class="p">]</span> <span class="n">COMMAND</span> <span class="p">[</span><span class="k">index</span><span class="p">]</span>
<span class="k">CREATE</span> <span class="k">INDEX</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="k">RENAME</span> <span class="k">TO</span> <span class="n">baz</span><span class="p">;</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Event</span> <span class="k">trigger</span> <span class="k">function</span> <span class="k">called</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">TAG</span> <span class="p">[</span><span class="k">ALTER</span> <span class="k">TABLE</span><span class="p">]</span> <span class="n">COMMAND</span> <span class="p">[</span><span class="k">table</span><span class="p">]</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
</code></pre></div></div>
<p>You can compare the output of the trigger function with the <a href="https://www.postgresql.org/docs/12/event-trigger-matrix.html" target="_blank">event trigger firing matrix</a> to get an idea of what you can “catch”.
<br />
One last note: why have I attached the trigger to <code class="language-plaintext highlighter-rouge">ddl_command_end</code>? Looking at the <a href="https://www.postgresql.org/docs/12/event-trigger-matrix.html" target="_blank">event trigger firing matrix</a>, it seems you can attach the trigger to either <code class="language-plaintext highlighter-rouge">ddl_command_start</code> or <code class="language-plaintext highlighter-rouge">ddl_command_end</code> with the very same result, but the function <code class="language-plaintext highlighter-rouge">pg_event_trigger_ddl_commands()</code> works only on the <em>end</em> side of an event. The reason, as already explained, is that only at the <em>end</em> does the system know what a command has been exploded into.</p>
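<p>To make the difference between the two sides concrete, here is a minimal sketch of a trigger attached to <code class="language-plaintext highlighter-rouge">ddl_command_start</code>. The names <code class="language-plaintext highlighter-rouge">f_event_trigger_start_demo</code> and <code class="language-plaintext highlighter-rouge">tr_start_demo</code> are made up for this example. On the <em>start</em> side the function relies only on the special PL/pgSQL variables <code class="language-plaintext highlighter-rouge">TG_EVENT</code> and <code class="language-plaintext highlighter-rouge">TG_TAG</code>, so it can report the tag of the command but nothing about its sub-commands:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical names, used only for illustration
CREATE OR REPLACE FUNCTION f_event_trigger_start_demo()
RETURNS event_trigger
AS
$CODE$
BEGIN
   -- TG_EVENT and TG_TAG are made available by PL/pgSQL
   -- to every event trigger function
   RAISE INFO 'Event [%] fired for tag [%]', TG_EVENT, TG_TAG;
END
$CODE$
LANGUAGE plpgsql;

CREATE EVENT TRIGGER tr_start_demo
   ON ddl_command_start
   EXECUTE FUNCTION f_event_trigger_start_demo();
</code></pre></div></div>
<p>Once done with the experiment, the trigger can be removed with <code class="language-plaintext highlighter-rouge">DROP EVENT TRIGGER tr_start_demo;</code>.</p>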
<h2 id="source-code">Source Code</h2>
<p>You can find the source code of the trigger function <a href="https://gitlab.com/fluca1978/fluca1978-pg-utils/-/blob/master/examples/triggers/event_trigger_demo.sql" target="_blank">in my GitLab repository</a>.</p>
PostgreSQL 13 beta 1 on FreeBSD via pgenv2020-05-26T00:00:00+00:00https://fluca1978.github.io/2020/05/26/PostgreSQL13BetaPgenv<p>It’s time to test the new PostgreSQL 13 release!</p>
<h1 id="postgresql-13-beta-1-on-freebsd-via-pgenv">PostgreSQL 13 beta 1 on FreeBSD via pgenv</h1>
<p>Five days ago <a href="https://www.postgresql.org/about/news/2040/" target="_blank">PostgreSQL 13 beta 1</a> was released!
<br />
It’s time to test the new awesome version of our beloved database. Installing from source is quite trivial, but why not use <a href="https://github.com/theory/pgenv" target="_blank">pgenv</a> for that?
<br />
Installing on my FreeBSD machine with <code class="language-plaintext highlighter-rouge">pgenv</code> is as simple as:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>luca@miguel ~ % pgenv build 13beta1
...
PostgreSQL, contrib, and documentation installation complete.
pgenv configuration written to file /home/luca/git/pgenv/.pgenv.13beta1.conf
PostgreSQL 13beta1 built
</code></pre></div></div>
<p><br />
Are you ready to test it?
<br />
Activate it and enjoy:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>luca@miguel ~ % pgenv use 13beta1
...
server started
PostgreSQL 13beta1 started
Logging to /home/luca/git/pgenv/pgsql/data/server.log
luca@miguel ~ % psql <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"SELECT version();"</span> template1
version
<span class="nt">------------------------------------------------------------------------------------------------------------</span>
PostgreSQL 13beta1 on x86_64-unknown-freebsd12.1, compiled by gcc <span class="o">(</span>FreeBSD Ports Collection<span class="o">)</span> 9.2.0, 64-bit
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p><em>Enjoy!</em></p>
PostgresWeekly Interview2020-04-24T00:00:00+00:00https://fluca1978.github.io/2020/04/24/PostgresWeeklyInterview<p>I have been interviewed by PostgresWeekly.</p>
<h1 id="postgresweekly-interview">PostgresWeekly Interview</h1>
<p>Peter Cooper has published a <a href="https://superhighway.dev/luca-ferrari-interview" target="_blank">short interview with me</a>.
<br />
<a href="https://postgresweekly.com/" target="_blank">Postgres Weekly</a> is an <em>email roundup</em> about PostgreSQL that is sent once per week, as the name implies, and provides a <em>summary</em> of what is happening and has happened in the PostgreSQL ecosystem.
<br />
I think I first came across <a href="https://postgresweekly.com/" target="_blank">Postgres Weekly</a> one or two years ago, when I was contacted by the administrator, who asked me to publish some of the content I periodically try to write about PostgreSQL.
<br />
<br />
<em>Sure! That’s why I have a blog after all!</em>
<br />
And I have to add to the above statement that I actually opened my first blog immediately after the first Italian PGDay.IT (2007) with the purpose of writing about PostgreSQL (and other technologies), so PostgreSQL has been one of the most important reasons for me to have a blog!</p>
<p><br />
<br /></p>
<center>
<a href="https://superhighway.dev/luca-ferrari-interview" target="_blank">
<img src="/images/posts/pg11ssp/postgresweekly.png" alt="Interview with Luca Ferrari" />
</a>
</center>
<p><br />
<br />
Being interviewed by <a href="https://twitter.com/peterc" target="_blank">Peter</a> has been a pleasure, and he is a very good, nice and polite guy.
<br />
<br />
In this interview, for the first time, I also talk about a very difficult part of my life that has nothing to do with PostgreSQL:
<strong>long story short, my eyes are turning off and there is no chance to <em>rollback</em></strong>. I have to confess that this requires some <em>brainpower</em> to escape from nightmares.
<br />
That’s why <a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide" target="_blank">my book</a>, even if small and simple, has been a very important achievement for me. Pretty much the same feeling as when I was shooting at 90 meters with my bow without having any eye difficulty after a long period of repeated surgeries and medications.</p>
<p><br />
<br /></p>
<center>
<a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide" target="_blank">
<img src="https://www.packtpub.com/media/catalog/product/cache/e4d64343b1bc593f1c5348fe05efa4a6/b/1/b11208.png" alt="PostgreSQL 11 Server Side Programming by Luca Ferrari" />
</a>
</center>
<p><br />
<br />
But there is also something happier in this interview, such as the announcement that I am working on another book and we (because I’m writing with a friend of mine) are almost halfway through the journey. Guess the subject!</p>
PL/pgSQL Trends2020-04-17T00:00:00+00:00https://fluca1978.github.io/2020/04/17/PLPGSQL_trends<p>A graph that shows the trends of <code class="language-plaintext highlighter-rouge">PL/pgSQL</code>, according to <a href="https://github.com" target="_blank">Github.com</a>.</p>
<h1 id="plpgsql-trends">PL/pgSQL Trends</h1>
<p>I discovered <a href="https://tjpalmer.github.io/languish/#y=mean&names=sql%2Cplsql%2Cplpgsql" target="_blank">this excellent graphing system</a> that shows several programming language trends, according to <a href="https://github.com" target="_blank">Github.com</a>.
<br />
<br />
So, let’s compare <code class="language-plaintext highlighter-rouge">PL/pgSQL</code> with <code class="language-plaintext highlighter-rouge">SQL</code> and <code class="language-plaintext highlighter-rouge">plSQL</code>:</p>
<center>
<a href="https://tjpalmer.github.io/languish/#y=mean&names=sql%2Cplsql%2Cplpgsql" target="_blank">
<img src="/images/posts/trends/2020_plpgsql.png" />
</a>
</center>
<p>As you can see, the interest in <code class="language-plaintext highlighter-rouge">PL/pgSQL</code> has grown a lot lately, and this is due to the success PostgreSQL (and therefore the language) has achieved, at least in my opinion. I’m not sure that the comparison with <code class="language-plaintext highlighter-rouge">plSQL</code> is fair, because the latter is tied to a proprietary database and chances are there is less material available on Github, at least on a general basis.
<br />
<br />
The trends are confirmed also with regard to the issues, pull requests and stars, with the only exception that <code class="language-plaintext highlighter-rouge">plSQL</code> overtook <code class="language-plaintext highlighter-rouge">PL/pgSQL</code> on pull requests around the year 2014.
<br />
<br />
That’s interesting, especially if you compare the trends with <em>real</em> programming languages, I mean general-purpose programming languages like <code class="language-plaintext highlighter-rouge">Perl</code>:</p>
<center>
<a href="https://tjpalmer.github.io/languish/#y=issues&names=sql%2Cplsql%2Cplpgsql%2Cperl" target="_blank">
<img src="/images/posts/trends/2020_plpgsql_perl.png" />
</a>
</center>
<p><br />
and everything disappears if you compare against <em>hype languages</em> like <code class="language-plaintext highlighter-rouge">Python</code> or…ehm…<code class="language-plaintext highlighter-rouge">Javascript</code>:</p>
<center>
<a href="https://tjpalmer.github.io/languish/#y=issues&names=sql%2Cplsql%2Cplpgsql%2Cpython%2Cjavascript" target="_blank">
<img src="/images/posts/trends/2020_plpgsql_python_javascript.png" />
</a>
</center>
Are we at the ZeroConference point?2020-04-09T00:00:00+00:00https://fluca1978.github.io/2020/04/09/ZeroConf<p>I’m seeing that around all the world the conferences are being cancelled. This is also true for PostgreSQL related conferences.</p>
<h1 id="are-we-at-the-zeroconference-point">Are we at the ZeroConference point?</h1>
<p>The situation around the planet is dramatic, to the point that our lives have been deeply changed by <em>COVID-19</em>.
One small consequence of all the measures to avoid the spread of COVID-19 is that a lot of conferences have been cancelled.
<br />
<br />
PostgreSQL conferences are no exception, and a lot of events are going to be cancelled or moved to <em>video streaming</em> mode.
<br />
<br />
Among the others, <strong>PGDay.IT has been cancelled</strong>, and I think someone should announce it on the planet too, not only on the mailing list.
<br />
<br />
I was hoping for a streamed conference, but the web site clearly states the conference is postponed to the next year.
<br />
<br />
I hope PostgreSQL conferences, as all other technical conferences, can be soon</p>
PostgreSQL 11 Server Side Programming it's gaining attention2020-04-08T00:00:00+00:00https://fluca1978.github.io/2020/04/08/PG11SSP-3<p>My own first book on PostgreSQL is gaining more and more attention, and this is good news.</p>
<h2 id="postgresql-11-server-side-programming-its-gaining-attention">PostgreSQL 11 Server Side Programming it’s gaining attention</h2>
<p>I’m happy to say that my very first book on PostgreSQL, <strong><a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide">PostgreSQL 11 Server Side Programming Quick Start Guide</a></strong>, is gaining more and more attention and its statistics are improving.</p>
<p><br /><br /><br /></p>
<p><a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide"><img src="/images/posts/pg11ssp/cover.png" alt="PostgreSQL-11-ServerSideProgramming-cover-image" /></a></p>
<p><br /><br /><br />
Quite frankly, the book sold one order of magnitude more than I was expecting, and sales are still rising even though PostgreSQL 12 is the latest major version and 13 is almost here.
<br /></p>
<p><br />
I think the dramatic situation around Europe, and more generally the world, namely <em>COVID-19</em>, is also responsible for this rise: since people are forced to stay at home, one way to keep the mind occupied is to read and study other topics and subjects.
<br />
I myself have started reading technical books I never thought I would read in normal conditions.
<br />
<br />
Anyway, back to the book, please consider that even if the title includes <em>11</em> as version, many of the examples and guidelines can be applied in newer versions of PostgreSQL. I have to admit I’m using many of the examples in my <strong>training events and courses</strong> (I can, of course, being the author).
<br />
The book uses Perl and Java as the main foreign languages, with PL/pgSQL of course being the main <em>native</em> language for application development.
<br />
<br />
<br />
I hope you can enjoy the book during this particular point in time.
<br />
And if you have any suggestions or errata, please let me know and I will include credits and details in the book code repository.</p>
<h2 id="code-repository">Code Repository</h2>
<p>The code repository with examples and other information is available on the official <strong><a href="https://github.com/PacktPublishing/PostgreSQL-11-Quick-Start-Guide">GitHub space</a></strong> and is also cloned into my <strong><a href="https://gitlab.com/fluca1978/postgresql-11-quick-start-guide">GitLab repository</a></strong> so feel free to clone it from whatever is more comfortable to you!</p>
PostgreSQL 12 Generated Columns: another use case2020-03-02T00:00:00+00:00https://fluca1978.github.io/2020/03/02/PostgreSQLGeneratedColumns_part2<p>When you start realizing how useful generated columns can be, you start using them as part of your workflow.
Here is another story of mine from the adventures in PostgreSQL-land.</p>
<h1 id="postgresql-12-generated-columns-another-use-case">PostgreSQL 12 Generated Columns: another use case</h1>
<p><a href="https://fluca1978.github.io/2019/11/04/PostgreSQL12GeneratedColumns.html" target="_blank">I’ve already written</a> about the PostgreSQL 12 feature of <a href="https://www.postgresql.org/docs/12/ddl-generated-columns.html" target="_blank">automatically generated columns</a>.
<br />
A few days ago I worked on a simple table that contains a single tuple for every file on a filesystem, including the file size and hash. Having the file hash enables a lot of practical analysis, including seeing how many times the file is replicated in the file system.
<br />
But what if I want to store such duplication information into the table?
<br />
One solution could be to add a column, then run a long <code class="language-plaintext highlighter-rouge">UPDATE</code> to populate such column, and then add a trigger to catch every new table modification.
<br />
<strong>Or, I can use generated columns!</strong></p>
<h2 id="the-table-structure">The table structure</h2>
<p>The table was structured as follows, and it is quite simple to understand:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">d</span> <span class="n">my_files</span>
<span class="k">Table</span> <span class="nv">"public.my_files"</span>
<span class="k">Column</span> <span class="o">|</span> <span class="k">Type</span> <span class="o">|</span> <span class="k">Collation</span> <span class="o">|</span> <span class="k">Nullable</span> <span class="o">|</span> <span class="k">Default</span>
<span class="c1">-----------|-------------------------|-----------|----------|---------</span>
<span class="n">filename</span> <span class="o">|</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="n">directory</span> <span class="o">|</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">(</span><span class="mi">2048</span><span class="p">)</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="n">md5sum</span> <span class="o">|</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">(</span><span class="mi">128</span><span class="p">)</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="n">bytes</span> <span class="o">|</span> <span class="nb">integer</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
</code></pre></div></div>
<p>The file can be found in the filesystem in the position <code class="language-plaintext highlighter-rouge">directory || filename</code> (i.e., string concatenation). Every file has its own checksum (<code class="language-plaintext highlighter-rouge">md5sum</code>) and the size expressed in <code class="language-plaintext highlighter-rouge">bytes</code>.
<br />
Please note that this is a de-normalized schema, but it is enough for the simple use case I have been working with so far.
<br />
The size of the table is quite normal:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">reltuples</span><span class="p">,</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'my_files'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'my_files'</span> <span class="k">AND</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span><span class="p">;</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="n">relpages</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------|----------|----------------</span>
<span class="mi">1</span><span class="p">.</span><span class="mi">872529</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">40757</span> <span class="o">|</span> <span class="mi">318</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
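<p>For readers who want to follow along, here is a sketch of how a table with the same structure could be created, together with a query that lists the duplicated files on the fly. The DDL is reconstructed from the <code class="language-plaintext highlighter-rouge">\d</code> output above and is an assumption: the original table may have constraints or indexes not shown there. The aliases <code class="language-plaintext highlighter-rouge">copies</code> and <code class="language-plaintext highlighter-rouge">total_bytes</code> are made up for this example.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- DDL reconstructed from the \d output (not the original script)
CREATE TABLE my_files (
    filename   varchar(200),
    directory  varchar(2048),
    md5sum     varchar(128),
    bytes      integer
);

-- checksums that appear more than once, i.e., duplicated files
SELECT md5sum,
       count(*)     AS copies,
       sum( bytes ) AS total_bytes
FROM my_files
GROUP BY md5sum
HAVING count(*) &gt; 1
ORDER BY copies DESC;
</code></pre></div></div>
<p>Such a query always reflects the current content of the table, but it has to be computed every time it is needed.</p>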
<h2 id="adding-a-generated-column">Adding a generated column</h2>
<p>Let’s add a new column to count the occurrences of the file, that is how many times the file appears in the filesystem.
<br />
First of all, a new <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> function must be created:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">FUNCTION</span> <span class="n">f_count_occurrencies</span><span class="p">(</span> <span class="n">md5sum_to_find</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">bigint</span>
<span class="k">AS</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">SELECT</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span>
<span class="k">FROM</span> <span class="n">my_files</span>
<span class="k">WHERE</span> <span class="n">md5sum</span> <span class="o">=</span> <span class="n">md5sum_to_find</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="k">sql</span>
<span class="k">IMMUTABLE</span><span class="p">;</span>
</code></pre></div></div>
<p>It is a very simple function: it does a <code class="language-plaintext highlighter-rouge">count(*)</code> of every tuple with a specific checksum.
<br />
There are two things to note: the function must return a <code class="language-plaintext highlighter-rouge">bigint</code> because that is what <code class="language-plaintext highlighter-rouge">count()</code> returns and, most notably, it must be marked as <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> (PostgreSQL trusts this declaration and does not verify it) because that is required in order to use the function as the engine that computes the generated column values.
<br />
However, applying such a function did not complete within two hours!</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">my_files</span>
<span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">occurrencies</span> <span class="nb">int</span>
<span class="k">GENERATED</span> <span class="n">ALWAYS</span>
<span class="k">AS</span> <span class="p">(</span> <span class="n">f_count_occurrencies</span><span class="p">(</span> <span class="n">md5sum</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span><span class="p">;</span>
<span class="o">^</span><span class="n">CCancel</span> <span class="n">request</span> <span class="n">sent</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">canceling</span> <span class="k">statement</span> <span class="n">due</span> <span class="k">to</span> <span class="k">user</span> <span class="n">request</span>
<span class="n">CONTEXT</span><span class="p">:</span> <span class="k">SQL</span> <span class="k">function</span> <span class="nv">"f_count_occurrencies"</span> <span class="k">statement</span> <span class="mi">1</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">8285400</span><span class="p">,</span><span class="mi">145</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">02</span><span class="p">:</span><span class="mi">18</span><span class="p">:</span><span class="mi">05</span><span class="p">,</span><span class="mi">400</span><span class="p">)</span>
</code></pre></div></div>
<p>Therefore I decided to create an index on the field <code class="language-plaintext highlighter-rouge">md5sum</code> and try it again:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">idx_md5sum</span> <span class="k">ON</span> <span class="n">my_files</span><span class="p">(</span> <span class="n">md5sum</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">INDEX</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">3016</span><span class="p">,</span><span class="mi">571</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">03</span><span class="p">,</span><span class="mi">017</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">my_files</span>
<span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">occurrencies</span> <span class="nb">int</span>
<span class="k">GENERATED</span> <span class="n">ALWAYS</span>
<span class="k">AS</span> <span class="p">(</span> <span class="n">f_count_occurrencies</span><span class="p">(</span> <span class="n">md5sum</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">120131</span><span class="p">,</span><span class="mi">809</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">02</span><span class="p">:</span><span class="mi">00</span><span class="p">,</span><span class="mi">132</span><span class="p">)</span>
</code></pre></div></div>
<p>As you can see, this time it took two minutes to perform the update of the table structure with the automatically computed column, while before the creation of the index it had not finished within two hours.</p>
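<p>The difference is explained by the fact that the generation expression invokes the function, and therefore its <code class="language-plaintext highlighter-rouge">count(*)</code> query, once for every row: without an index, each of the roughly 1.87 million calls has to scan the whole table. A quick way to check what a single lookup does is to <code class="language-plaintext highlighter-rouge">EXPLAIN</code> the query the function wraps. The checksum literal below is only a placeholder (the MD5 of an empty input), and the resulting plan depends on your data and statistics:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- the literal is a placeholder, use any checksum stored in the table
EXPLAIN
SELECT count(*)
FROM my_files
WHERE md5sum = 'd41d8cd98f00b204e9800998ecf8427e';
</code></pre></div></div>
<p>With <code class="language-plaintext highlighter-rouge">idx_md5sum</code> in place one would expect an index-based plan instead of a sequential scan, but it is worth verifying on your own system.</p>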
<p>The table does not occupy much more space than before:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">reltuples</span><span class="p">,</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'my_files'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'my_files'</span> <span class="k">AND</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span><span class="p">;</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="n">relpages</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------|----------|----------------</span>
<span class="mi">1</span><span class="p">.</span><span class="mi">872529</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">41492</span> <span class="o">|</span> <span class="mi">324</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>So with six extra megabytes we now have the information replicated on every row. The table grew by around <code class="language-plaintext highlighter-rouge">1.8%</code> in size, but computing how many times a file is replicated is now straightforward.</p>
<h3 id="warning">WARNING!</h3>
<p><strong>edit 2020-03-04</strong>
<br />
As <em>Adam Brusselback</em> correctly pointed out in a comment to this blog post, adding the <code class="language-plaintext highlighter-rouge">occurrencies</code> column to the table does the job only if the table is <em>immutable</em> too, that is no more repeated files are added. If a file with an already existing <code class="language-plaintext highlighter-rouge">md5sum</code> is added to the table as a new entry, this last tuple will have the correct number of <code class="language-plaintext highlighter-rouge">occurrencies</code>, but the other tuples will keep the previously computed value.
<br />
<strong>I didn’t mention at the beginning of this post that I was doing inspection and computations on a historical table, that is a table where new tuples are no longer added.</strong></p>
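<p>For a table that does change over time, one possible alternative (not what I used here, just a sketch) is to compute the count at query time with a window function wrapped in a view. The names <code class="language-plaintext highlighter-rouge">v_my_files_live</code> and <code class="language-plaintext highlighter-rouge">occurrencies_live</code> are made up for this example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical view: the count is computed every time the view is queried,
-- so it never gets stale, at the price of more work per query
CREATE VIEW v_my_files_live AS
SELECT filename,
       directory,
       md5sum,
       bytes,
       count(*) OVER ( PARTITION BY md5sum ) AS occurrencies_live
FROM my_files;
</code></pre></div></div>
<p>The trade-off is the opposite of the stored generated column: nothing is persisted, so nothing can be out of date, but the value is recomputed on every read.</p>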
<h2 id="a-generated-column-cannot-be-based-on-a-generate-column">A generated column cannot be based on a generated column</h2>
<p>Once you get used to generated columns, you simply want more.
<br />
Making a column that indicates, per file, how much disk space it consumes due to its replicated copies seems easy, but it is a little tricky. <strong>A generated column cannot be based on another generated column</strong>: it would be a circular dependency or, better, a dependency that PostgreSQL cannot solve (there would have to be a generation order, and sooner or later you could end up with a circular dependency).
<br />
This means we cannot exploit the <code class="language-plaintext highlighter-rouge">occurrencies</code> column in the count of the disk space. In fact, let’s add a generated column based on that, so first let’s create a simple computation function:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">FUNCTION</span> <span class="n">f_compute_duplicated_size</span><span class="p">(</span> <span class="n">bytes</span> <span class="nb">int</span><span class="p">,</span> <span class="n">how_many_times</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">SELECT</span> <span class="n">bytes</span> <span class="o">*</span> <span class="n">how_many_times</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="k">sql</span> <span class="k">IMMUTABLE</span><span class="p">;</span>
</code></pre></div></div>
<p>Note that the function is meant to be fed with the generated column <code class="language-plaintext highlighter-rouge">occurrencies</code>, that is <em>we are generating a column on the basis of another generated column</em>, something PostgreSQL refuses to do; in fact:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">my_files</span>
<span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">duplication_bytes</span> <span class="nb">int</span>
<span class="k">GENERATED</span> <span class="n">ALWAYS</span>
<span class="k">AS</span> <span class="p">(</span> <span class="n">f_compute_duplicated_size</span><span class="p">(</span> <span class="n">bytes</span><span class="p">,</span> <span class="n">occurrencies</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">cannot</span> <span class="n">use</span> <span class="k">generated</span> <span class="k">column</span> <span class="nv">"occurrencies"</span> <span class="k">in</span> <span class="k">column</span> <span class="n">generation</span> <span class="n">expression</span>
<span class="n">DETAIL</span><span class="p">:</span> <span class="n">A</span> <span class="k">generated</span> <span class="k">column</span> <span class="n">cannot</span> <span class="n">reference</span> <span class="n">another</span> <span class="k">generated</span> <span class="k">column</span><span class="p">.</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">17</span><span class="p">,</span><span class="mi">900</span> <span class="n">ms</span>
</code></pre></div></div>
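<p>Since generated columns cannot appear in another generation expression, it can help to check which columns of the table are already generated. The following is a minimal sketch, assuming PostgreSQL 12 or later and that <code class="language-plaintext highlighter-rouge">my_files</code> is the only table with that name visible to you:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- List the generated columns of my_files together with their expressions.
SELECT column_name, is_generated, generation_expression
FROM information_schema.columns
WHERE table_name = 'my_files'
  AND is_generated = 'ALWAYS';
</code></pre></div></div>
<p>Any column returned by this query cannot be referenced by a new generated column.</p>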
<p>We need a trick to make the new generated column independent of the existing one. Therefore, we can use a function like the following, which does not reference any generated column:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">FUNCTION</span> <span class="n">f_compute_duplicated_size</span><span class="p">(</span> <span class="n">md5sum_to_find</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">bigint</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">SELECT</span> <span class="k">sum</span><span class="p">(</span> <span class="n">bytes</span><span class="p">)</span>
<span class="k">FROM</span> <span class="n">my_files</span>
<span class="k">WHERE</span> <span class="n">md5sum</span> <span class="o">=</span> <span class="n">md5sum_to_find</span><span class="p">;</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="k">sql</span> <span class="k">IMMUTABLE</span><span class="p">;</span>
</code></pre></div></div>
<p>and add the column, this time with success:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">my_files</span>
<span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">duplication_bytes</span> <span class="nb">int</span>
<span class="k">GENERATED</span> <span class="n">ALWAYS</span>
<span class="k">AS</span> <span class="p">(</span> <span class="n">f_compute_duplicated_size</span><span class="p">(</span> <span class="n">md5sum</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">119696</span><span class="p">,</span><span class="mi">310</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">01</span><span class="p">:</span><span class="mi">59</span><span class="p">,</span><span class="mi">696</span><span class="p">)</span>
</code></pre></div></div>
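<p>Again, we are exploiting the <code class="language-plaintext highlighter-rouge">md5sum</code> index to keep the table modification at a reasonable speed. Be aware that the function is declared <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> even though it reads <code class="language-plaintext highlighter-rouge">my_files</code>: the stored value is computed when a row is written, and it is not refreshed when other rows sharing the same <code class="language-plaintext highlighter-rouge">md5sum</code> change.</p>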
<h2 id="more-columnsmore-columns-quick">More columns…more columns quick!</h2>
<p>You get the point: it is now possible to enhance the table with many more generated columns.
Of course, the risk is to denormalize the data more and more, so using this approach depends on what your aim is. Since I’m using this table as a workbench to make some simulations on a filesystem, I don’t care too much about the normalization of the data; rather, I care about being able to run simple queries and get the result.
<br />
So what’s next?
<br />
Let’s add a column to decide the file <em>MIME type</em> based on a very poor approach: the file extension!
Of course, I do trust my filenames to be correct with respect to the relationship between the extension and the MIME type.
<br />
The approach is always the same:
1) create an immutable function;
2) alter the table.</p>
<p><br />
The function I use exploits the regular expression engine:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">FUNCTION</span> <span class="n">f_compute_file_type</span><span class="p">(</span> <span class="n">filename</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">text</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">SELECT</span> <span class="k">upper</span><span class="p">(</span> <span class="p">(</span> <span class="n">regexp_match</span><span class="p">(</span> <span class="k">trim</span><span class="p">(</span> <span class="n">filename</span> <span class="p">),</span>
<span class="s1">'</span><span class="se">\.</span><span class="s1">([a-zA-Z0-9]{3,4})$'</span> <span class="p">)</span> <span class="p">)[</span> <span class="mi">1</span> <span class="p">]</span> <span class="p">);</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="k">SQL</span> <span class="k">IMMUTABLE</span><span class="p">;</span>
</code></pre></div></div>
<p>The function returns the last extension, upper-cased, assuming it is three or four characters long (letters or digits).
Then adding the column is boring:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">my_files</span>
<span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">filetype</span> <span class="nb">text</span>
<span class="k">GENERATED</span> <span class="n">ALWAYS</span>
<span class="k">AS</span> <span class="p">(</span> <span class="n">f_compute_file_type</span><span class="p">(</span> <span class="n">filename</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">21452</span><span class="p">,</span><span class="mi">153</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">21</span><span class="p">,</span><span class="mi">452</span><span class="p">)</span>
</code></pre></div></div>
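<p>To get a feeling for what the regular expression accepts, the function can be tried against a few values. This is only a sketch: the file names below are made up for illustration and do not come from the real table.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Hypothetical file names, used only to exercise f_compute_file_type().
SELECT fname, f_compute_file_type( fname ) AS filetype
FROM ( VALUES ( 'photo.jpeg' ),
              ( ' notes.txt ' ),
              ( 'archive.tar.gz' ),
              ( 'README' ) ) AS t( fname );
</code></pre></div></div>
<p>The first two names yield <code class="language-plaintext highlighter-rouge">JPEG</code> and <code class="language-plaintext highlighter-rouge">TXT</code> (the surrounding blanks are trimmed), while the last two yield <code class="language-plaintext highlighter-rouge">NULL</code>: <code class="language-plaintext highlighter-rouge">gz</code> is only two characters long and <code class="language-plaintext highlighter-rouge">README</code> has no extension at all.</p>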
<h2 id="what-is-the-final-effect-on-the-table">What is the final effect on the table?</h2>
<p>We added three generated columns to the table; what is the impact on disk space?
<br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">reltuples</span><span class="p">,</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'my_files'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'my_files'</span> <span class="k">AND</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span><span class="p">;</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="n">relpages</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------|----------|----------------</span>
<span class="mi">1</span><span class="p">.</span><span class="mi">872529</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">43704</span> <span class="o">|</span> <span class="mi">341</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>Therefore the table size has grown from <code class="language-plaintext highlighter-rouge">318 MB</code> to <code class="language-plaintext highlighter-rouge">341 MB</code>, that is, by roughly <code class="language-plaintext highlighter-rouge">7%</code>. We now have a bigger table, to some extent de-normalized, but with a lot more data to be used for analysis. Moreover, we can drop the index on <code class="language-plaintext highlighter-rouge">md5sum</code>, since we might not need it anymore.</p>
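<p>As an example of the kind of simple query the extra columns make possible, the following sketch reports the ten file types that take the most space. It assumes the <code class="language-plaintext highlighter-rouge">filetype</code> column added above; the column aliases are arbitrary.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Ten file types that use the most space, according to the generated column.
SELECT filetype,
       count(*)                       AS files,
       pg_size_pretty( sum( bytes ) ) AS total_size
FROM my_files
GROUP BY filetype
ORDER BY sum( bytes ) DESC NULLS LAST
LIMIT 10;
</code></pre></div></div>
<p>Files without a recognized extension end up grouped under a <code class="language-plaintext highlighter-rouge">NULL</code> file type.</p>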
<h2 id="generated-columns-are-not-fixed">Generated Columns are not fixed!</h2>
<p>Well, this could sound trivial, but the fact is that a generated column is not a <em>compute-once</em> column: the value of the column is updated every time the columns it depends on are modified.
<br />
In the previous example you have seen that the <code class="language-plaintext highlighter-rouge">filetype</code> column depends on the <code class="language-plaintext highlighter-rouge">filename</code> one, but if we change the name of the file, for example because we don’t care anymore about the file extension, we are also going to mess up the <code class="language-plaintext highlighter-rouge">filetype</code> column.
<br />
The rule of thumb therefore is: if you need <em>computed-once</em> data use a materialized view (or a fixed column in the table), otherwise you can use generated columns.</p>
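<p>The effect can be observed with a sketch like the following, wrapped in a transaction that is rolled back so that the data is left untouched. The file name <code class="language-plaintext highlighter-rouge">holiday.jpeg</code> is hypothetical and is used only for illustration.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;

-- 'holiday.jpeg' is a made-up file name: rename it, dropping the extension.
UPDATE my_files
SET filename = 'holiday'
WHERE filename = 'holiday.jpeg';

-- filetype has been recomputed by the UPDATE: with no extension it is now NULL.
SELECT filename, filetype
FROM my_files
WHERE filename = 'holiday';

ROLLBACK;
</code></pre></div></div>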
Usage of disk space in Oracle and PostgreSQL: a simple use case2020-02-24T00:00:00+00:00https://fluca1978.github.io/2020/02/24/Oracle_vs_PostgreSQL_storage<p>A very non-scientific comparison about the two database engines.</p>
<h1 id="usage-of-disk-space-in-oracle-and-postgresql">Usage of disk space in Oracle and PostgreSQL</h1>
<p>A few days ago I built a table in Oracle (11, if that matters) to store a few hundred megabytes of data.
But I don’t feel at home using Oracle, so I decided to export the data and import it back in PostgreSQL 12.
<br />
Surprisingly, PostgreSQL requires more data space to store the same amount of data.
<br />
<br />
<strong>I’m not saying anything about which one is the best, and I don’t know the exact reasons why this happens; this is just what I’ve observed, hoping it can be useful to someone else!</strong>
<br />
<em>So please don’t flame!</em>
<br /></p>
<h2 id="table-structure">Table structure</h2>
<p>The table is really simple, and holds data about files on a disk. It does not even have a key, since it is just data I must mangle and then throw away.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">d</span> <span class="n">my_schema</span><span class="p">.</span><span class="n">my_files</span>
<span class="k">Table</span> <span class="nv">"my_schema.my_files"</span>
<span class="k">Column</span> <span class="o">|</span> <span class="k">Type</span> <span class="o">|</span> <span class="k">Collation</span> <span class="o">|</span> <span class="k">Nullable</span> <span class="o">|</span> <span class="k">Default</span>
<span class="c1">-----------|-------------------------|-----------|----------|---------</span>
<span class="n">filename</span> <span class="o">|</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="n">directory</span> <span class="o">|</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">(</span><span class="mi">2048</span><span class="p">)</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="n">md5sum</span> <span class="o">|</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">(</span><span class="mi">128</span><span class="p">)</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
<span class="n">bytes</span> <span class="o">|</span> <span class="nb">bigint</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span>
</code></pre></div></div>
<p>I’ve seen no difference between using <code class="language-plaintext highlighter-rouge">text</code> and <code class="language-plaintext highlighter-rouge">varchar</code>; I used the latter just to keep the definition as similar as possible to the Oracle one.
<br />
<em>The table is populated with <code class="language-plaintext highlighter-rouge">1872529</code> tuples (around 2 million tuples).</em></p>
<h2 id="oracle-disk-space">Oracle Disk Space</h2>
<p>Oracle requires <code class="language-plaintext highlighter-rouge">312 MB</code> to store the data:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">select</span> <span class="n">segment_name</span><span class="p">,</span><span class="k">sum</span><span class="p">(</span><span class="n">bytes</span><span class="p">)</span><span class="o">/</span><span class="mi">1024</span><span class="o">/</span><span class="mi">1024</span> <span class="n">MB</span>
<span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="n">segment_name</span><span class="p">)</span>
<span class="p">,</span> <span class="n">blocks</span> <span class="o">*</span> <span class="mi">8192</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1024</span> <span class="o">*</span> <span class="mi">1024</span> <span class="p">)</span>
<span class="k">from</span> <span class="n">user_segments</span>
<span class="k">where</span> <span class="n">segment_type</span><span class="o">=</span><span class="s1">'TABLE'</span>
<span class="k">and</span> <span class="n">segment_name</span><span class="o">=</span><span class="k">upper</span><span class="p">(</span><span class="s1">'MY_FILES'</span><span class="p">)</span>
<span class="k">group</span> <span class="k">by</span> <span class="n">segment_name</span><span class="p">,</span> <span class="n">blocks</span> <span class="p">;</span>
</code></pre></div></div>
<p>The results of the above query are:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">312 MB</code> of data;</li>
<li><code class="language-plaintext highlighter-rouge">39936</code> blocks, which are something similar to PostgreSQL data pages.</li>
</ul>
<p>The table has <code class="language-plaintext highlighter-rouge">110</code> extents, but I’m not sure how they are accounted for in the space computation.</p>
<h2 id="postgresql-disk-space">PostgreSQL Disk Space</h2>
<p>The same data in PostgreSQL requires <code class="language-plaintext highlighter-rouge">324 MB</code>, so <code class="language-plaintext highlighter-rouge">12 MB</code> more than Oracle, that is <strong>roughly 4% more disk space</strong>. It is therefore possible to say that the overall space is pretty much the same:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">reltuples</span><span class="p">,</span> <span class="n">relpages</span><span class="p">,</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'my_schema.my_files'</span> <span class="p">)</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'my_files'</span> <span class="k">AND</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span><span class="p">;</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="n">relpages</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------|----------|----------------</span>
<span class="mi">1</span><span class="p">.</span><span class="mi">872529</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">41491</span> <span class="o">|</span> <span class="mi">324</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>Please note that <code class="language-plaintext highlighter-rouge">fillfactor</code> has been set to 100% and the table has been <code class="language-plaintext highlighter-rouge">VACUUM</code>ed.</p>
<h2 id="counting-pages">Counting Pages</h2>
<p>What I can see is that PostgreSQL uses <code class="language-plaintext highlighter-rouge">41491</code> data pages, while Oracle uses <code class="language-plaintext highlighter-rouge">39936</code>, that is <code class="language-plaintext highlighter-rouge">1555</code> fewer data pages. Again, that is roughly the same 4% we already saw on the effective space, which leads me to think that Oracle data pages have the same size as PostgreSQL ones.
<br />
In fact, asking for the datapage size:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SQL</span><span class="o">></span> <span class="k">show</span> <span class="k">parameter</span> <span class="n">db_block_size</span><span class="p">;</span>
<span class="n">NAME</span> <span class="k">TYPE</span> <span class="n">VALUE</span>
<span class="c1">------------- ------- ----- </span>
<span class="n">db_block_size</span> <span class="nb">integer</span> <span class="mi">8192</span>
</code></pre></div></div>
<p>shows the same size as PostgreSQL.</p>
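<p>The arithmetic behind these numbers can be double-checked directly from the page counts reported above, assuming 8192 bytes per page on both sides. This is just a sketch, and the column aliases are arbitrary:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Pages (or blocks) times 8192 bytes, expressed in megabytes,
-- plus the relative difference in the number of pages.
SELECT 39936 * 8192 / ( 1024 * 1024.0 )              AS oracle_mb,
       41491 * 8192 / ( 1024 * 1024.0 )              AS postgresql_mb,
       round( 100.0 * ( 41491 - 39936 ) / 39936, 2 ) AS extra_pages_pct;
</code></pre></div></div>
<p>The first two columns match the <code class="language-plaintext highlighter-rouge">312 MB</code> and <code class="language-plaintext highlighter-rouge">324 MB</code> seen before, and the last one is just under 4%.</p>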
<h2 id="from-numeric-to-int">From <code class="language-plaintext highlighter-rouge">NUMERIC</code> to <code class="language-plaintext highlighter-rouge">INT</code></h2>
<p><strong>update of 2020-02-24</strong>
<br />
One possible difference between the two tables is the <code class="language-plaintext highlighter-rouge">NUMERIC</code> data type used by Oracle. After inspecting the values, I’ve seen that the <code class="language-plaintext highlighter-rouge">bytes</code> column can be handled by an <code class="language-plaintext highlighter-rouge">int4</code> (normal integer) value type, so I changed it in both Oracle and PostgreSQL. While in Oracle the size remained the same, <code class="language-plaintext highlighter-rouge">312 MB</code>, in PostgreSQL the size shrank to <code class="language-plaintext highlighter-rouge">318 MB</code>, which is much closer to the Oracle one:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">vace</span><span class="p">.</span><span class="n">my_files</span> <span class="k">ALTER</span> <span class="k">COLUMN</span> <span class="n">bytes</span> <span class="k">SET</span> <span class="k">DATA</span> <span class="k">TYPE</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">vacuum</span> <span class="k">full</span> <span class="n">vace</span><span class="p">.</span><span class="n">my_files</span><span class="p">;</span>
<span class="k">VACUUM</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">reltuples</span><span class="p">,</span> <span class="n">relpages</span><span class="p">,</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'vace.my_files'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">WHERE</span> <span class="n">relname</span> <span class="o">=</span> <span class="s1">'my_files'</span> <span class="k">AND</span> <span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span><span class="p">;</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="n">relpages</span> <span class="o">|</span> <span class="n">pg_size_pretty</span>
<span class="c1">--------------|----------|----------------</span>
<span class="mi">1</span><span class="p">.</span><span class="mi">872529</span><span class="n">e</span><span class="o">+</span><span class="mi">06</span> <span class="o">|</span> <span class="mi">40757</span> <span class="o">|</span> <span class="mi">318</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
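<p>A rough way to reason about the saving is to compare the storage size of the two integer types and multiply the difference by the number of tuples. The sketch below uses the tuple count reported above; the result is only an upper bound.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Size of a single value of each type, and the upper bound of the saving
-- when 1872529 tuples each lose 4 bytes.
SELECT pg_column_size( 1::bigint )           AS bigint_bytes,
       pg_column_size( 1::int )              AS int_bytes,
       pg_size_pretty( 1872529 * 4::bigint ) AS max_saving;
</code></pre></div></div>
<p>The upper bound is about 7 MB, while the measured reduction is 6 MB. My guess, and it is only a guess, is that alignment padding within each tuple accounts for the gap.</p>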
<h1 id="conclusions">Conclusions</h1>
<p><strong>I really don’t have any. I know too little about Oracle storage to say why there is this difference in size, and I’m sure this is neither an advantage of Oracle nor a drawback of PostgreSQL.</strong>
<br />
I don’t even know if this is the default behavior for every use-case (I doubt it), but it is interesting to know that even a simple use-case like this can require a little more space on disk.</p>
Take advantage of pg_settings when dealing with your configuration2020-02-13T00:00:00+00:00https://fluca1978.github.io/2020/02/13/pgsettings<p>The right way to get the current PostgreSQL configuration is by means of pg_settings.</p>
<h1 id="take-advantage-of-pg_settings-when-dealing-with-your-configuration">Take advantage of pg_settings when dealing with your configuration</h1>
<p>I often see messages on PostgreSQL-related mailing lists where the configuration is reported using a <em>Unix-style</em> approach. For example, imagine you have been asked to provide your <em>autovacuum</em> configuration in order to see if there’s something wrong with it; one approach I often see is the copy and paste of the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% sudo -u postgres grep autovacuum /postgres/12/postgresql.conf
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#autovacuum = on # Enable autovacuum subprocess? 'on'
#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
autovacuum_max_workers = 7 # max number of autovacuum subprocesses
autovacuum_naptime = 2min # time between autovacuum runs
autovacuum_vacuum_threshold = 500 # min number of row updates before
autovacuum_analyze_threshold = 700 # min number of row updates before
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
#autovacuum_multixact_freeze_max_age = 400000000 # maximum multixact age
#autovacuum_vacuum_cost_delay = 2ms # default vacuum cost delay for
# autovacuum, in milliseconds;
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
</code></pre></div></div>
<p>While this <em>could be a correct approach</em> and makes it simple to provide a <em>full set</em> of configuration values, it has a few drawbacks:</p>
<ul>
<li>it produces verbose output (e.g., there are comments on the right of each line);</li>
<li><em>it might not be the whole story about the configuration</em>, for example because something is in <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code>;</li>
<li><strong>it includes commented-out lines</strong>;</li>
<li><strong>it might not be the configuration your cluster is running with</strong>.</li>
</ul>
<p>Let’s examine all the drawbacks, one at a time.</p>
<h2 id="verbose-output">Verbose Output</h2>
<p>This is more an annoyance than a real problem, but please consider that people on the other side of the world could have a screen resolution, line wrapping, or other settings that make it difficult to read verbose lines.</p>
<h2 id="could-not-be-the-whole-truth-about-configuration">Could not be the whole truth about configuration</h2>
<p>I often place my own PostgreSQL configuration into <code class="language-plaintext highlighter-rouge">include_if_exists</code> files, so that I leave the <code class="language-plaintext highlighter-rouge">postgresql.conf</code> file unchanged. Let’s call it a kind of FreeBSD configuration style!
<br />
This means that, in order to use a Unix approach to find a particular setting, I have to include in the search every single configuration file in every single location. This can be as simple as doing <code class="language-plaintext highlighter-rouge">grep autovacuum *.conf*</code> or much more complicated depending on your directory structure.
<br />
In any case, I could have omitted one single file, and that’s bad both for me and whoever is trying to help me.
<br />
Moreover, since <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> is gaining more and more power, settings could also live in <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code>, and people should get used to checking that file too.</p>
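<p>Rather than searching every file by hand, it is possible to ask the running cluster where each value comes from. The following is a minimal sketch based on the <code class="language-plaintext highlighter-rouge">pg_settings</code> view; note that <code class="language-plaintext highlighter-rouge">sourcefile</code> and <code class="language-plaintext highlighter-rouge">sourceline</code> may appear as <code class="language-plaintext highlighter-rouge">NULL</code> to unprivileged roles.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Where does every autovacuum setting come from?
-- source is, for example, 'default' or 'configuration file'.
SELECT name, setting, source, sourcefile, sourceline
FROM pg_settings
WHERE name LIKE 'autovacuum%'
ORDER BY name;
</code></pre></div></div>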
<h2 id="it-does-include-commented-out-lines">It does include commented-out lines</h2>
<p>Come on, who cares about that? After all, a commented-out line means the parameter is at its default value.
<br />
And that could be the problem: do you remember all the default values for every single PostgreSQL version?
<br />
Therefore, don’t trust the default value, get the exact value!</p>
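<p>There is no need to remember the defaults, since <code class="language-plaintext highlighter-rouge">pg_settings</code> reports the built-in value next to the current one. A sketch that lists only the autovacuum parameters that differ from their built-in default could look like this:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- boot_val is the value assumed at startup when nothing else sets the parameter;
-- unit tells how to read the number (e.g., seconds or milliseconds).
SELECT name, setting, unit, boot_val, source
FROM pg_settings
WHERE name LIKE 'autovacuum%'
  AND setting IS DISTINCT FROM boot_val
ORDER BY name;
</code></pre></div></div>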
<h2 id="it-could-not-be-the-configuration-you-cluster-is-using">It might not be the configuration your cluster is using</h2>
<p>What if you modified the configuration file, but the action you took afterwards was not enough for the context of the parameter you changed? Maybe you edited a <code class="language-plaintext highlighter-rouge">postmaster</code> context parameter and issued only a <em>simple</em> <code class="language-plaintext highlighter-rouge">SIGHUP</code>? What if you forgot to notify the cluster at all?
<br />
What if another administrator changed the parameters without telling you and scheduled a <em>cluster notification</em> at night? Yes, that really happened to me…
<br />
Again, get the real configuration!</p>
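<p>Two quick checks can help in these situations. The sketch below shows when the configuration was last loaded and which parameters have been changed in the files but still wait for a restart; keep in mind that the latter are flagged only after the cluster has re-read its configuration.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- When did the cluster last load its configuration files?
SELECT pg_conf_load_time();

-- Which parameters are changed on disk but need a restart to be applied?
SELECT name, setting, context
FROM pg_settings
WHERE pending_restart;
</code></pre></div></div>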
<h1 id="how-to-get-the-real-configuration">How to get the real configuration</h1>
<p>I’m glad you asked: <code class="language-plaintext highlighter-rouge">pg_settings</code> is there for you.
<br />
It is a matter of a single query, for example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">forumdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">name</span><span class="p">,</span> <span class="n">setting</span><span class="p">,</span> <span class="n">pending_restart</span>
<span class="k">FROM</span> <span class="n">pg_settings</span>
<span class="k">WHERE</span> <span class="n">name</span> <span class="k">like</span> <span class="s1">'autovacuum%'</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">name</span> <span class="o">|</span> <span class="n">setting</span> <span class="o">|</span> <span class="n">pending_restart</span>
<span class="c1">-------------------------------------|-----------|-----------------</span>
<span class="n">autovacuum</span> <span class="o">|</span> <span class="k">on</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_analyze_scale_factor</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">1</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_analyze_threshold</span> <span class="o">|</span> <span class="mi">50</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_freeze_max_age</span> <span class="o">|</span> <span class="mi">200000000</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_max_workers</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_multixact_freeze_max_age</span> <span class="o">|</span> <span class="mi">400000000</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_naptime</span> <span class="o">|</span> <span class="mi">60</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_vacuum_cost_delay</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_vacuum_cost_limit</span> <span class="o">|</span> <span class="o">-</span><span class="mi">1</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_vacuum_scale_factor</span> <span class="o">|</span> <span class="mi">0</span><span class="p">.</span><span class="mi">2</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_vacuum_threshold</span> <span class="o">|</span> <span class="mi">50</span> <span class="o">|</span> <span class="n">f</span>
<span class="n">autovacuum_work_mem</span> <span class="o">|</span> <span class="o">-</span><span class="mi">1</span> <span class="o">|</span> <span class="n">f</span>
<span class="p">(</span><span class="mi">12</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p>You can elaborate the query as you like, but the point is that you get exact values. In this particular example, as you can see, some values differ from what you get out of the configuration file. For example, <code class="language-plaintext highlighter-rouge">autovacuum_max_workers</code> has been set to <code class="language-plaintext highlighter-rouge">7</code> in the configuration file, but the database applies a value of <code class="language-plaintext highlighter-rouge">3</code>.
<br />
Now you can inspect this problem too, and see if it has been caused by a cluster that has not been notified about configuration changes or by an included file that overrides your settings.</p>
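<p>One way to carry out such an inspection is the <code class="language-plaintext highlighter-rouge">pg_file_settings</code> view, which shows what is currently written in the configuration files, one row per occurrence. This is a sketch; by default the view is readable only by superusers.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Every place where the parameter is set in the configuration files,
-- in the order the files are read, and whether the value can be applied.
SELECT sourcefile, sourceline, seqno, setting, applied, error
FROM pg_file_settings
WHERE name = 'autovacuum_max_workers'
ORDER BY seqno;
</code></pre></div></div>
<p>If the parameter appears more than once, the occurrence with the highest <code class="language-plaintext highlighter-rouge">seqno</code> is the one that wins.</p>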
<h1 id="conclusions">Conclusions</h1>
<p>The configuration file is only a hint about what your cluster is configured for, not the real truth.
When inspecting a configuration problem, the starting point <em>to report even to others</em> is the output of <code class="language-plaintext highlighter-rouge">pg_settings</code>.</p>
Why Dropping a Column does not Reclaim Disk Space? (or better, why is it so fast?)2020-02-09T00:00:00+00:00https://fluca1978.github.io/2020/02/09/PostgreSQLDROPCOlumn<p>You may have noticed how dropping a column is fast in PostgreSQL, haven’t you?</p>
<h1 id="why-dropping-a-column-does-not-reclaim-disk-space-or-better-why-is-it-so-fast">Why Dropping a Column does not Reclaim Disk Space? (or better, why is it so fast?)</h1>
<p>Simple answer: <strong>because PostgreSQL knows how to do its job well</strong>!
<br />
<br /></p>
<p>Let’s create a dummy table to test this behavior against:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">foo</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">foo</span>
<span class="k">SELECT</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">10000000</span> <span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">10000000</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'foo'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">346</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>Now, let’s add a quite large column to the table and measure how much time it takes:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="err">\</span><span class="n">timing</span>
<span class="n">Timing</span> <span class="k">is</span> <span class="k">on</span><span class="p">.</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span>
<span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">t</span> <span class="nb">text</span>
<span class="k">DEFAULT</span> <span class="n">md5</span><span class="p">(</span> <span class="n">random</span><span class="p">()::</span><span class="nb">text</span> <span class="p">);</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">30702</span><span class="p">,</span><span class="mi">872</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">30</span><span class="p">,</span><span class="mi">703</span><span class="p">)</span>
</code></pre></div></div>
<p>What happened? In nearly <code class="language-plaintext highlighter-rouge">31 secs</code> the table has grown, with random data on every row, to <code class="language-plaintext highlighter-rouge">651 MB</code> (almost double the original size):</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'foo'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">651</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>What does PostgreSQL think about the columns in this table? Let’s query the <code class="language-plaintext highlighter-rouge">pg_attribute</code> catalog for all the user-defined attributes (i.e., those where <code class="language-plaintext highlighter-rouge">attnum</code> is a positive value) and inspect the <code class="language-plaintext highlighter-rouge">attisdropped</code> value, which indicates whether the column has been dropped from the table:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">attnum</span><span class="p">,</span> <span class="n">attname</span><span class="p">,</span> <span class="n">attisdropped</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span> <span class="n">a</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="k">c</span> <span class="k">ON</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">attrelid</span>
<span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="o">=</span> <span class="s1">'foo'</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">a</span><span class="p">.</span><span class="n">attnum</span> <span class="o">></span> <span class="mi">0</span><span class="p">;</span>
<span class="n">attnum</span> <span class="o">|</span> <span class="n">attname</span> <span class="o">|</span> <span class="n">attisdropped</span>
<span class="c1">--------|---------|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">i</span> <span class="o">|</span> <span class="n">f</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">t</span> <span class="o">|</span> <span class="n">f</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p>As you can see, both the columns <code class="language-plaintext highlighter-rouge">foo.i</code> and <code class="language-plaintext highlighter-rouge">foo.t</code> are valid in the table, that means <em>they have not been dropped</em>.</p>
<p><br />
<br />
It is now time to drop the column and see the results:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="k">DROP</span> <span class="k">COLUMN</span> <span class="n">t</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">20</span><span class="p">,</span><span class="mi">237</span> <span class="n">ms</span>
</code></pre></div></div>
<p><em>Pretty impressive, isn’t it?</em>
<br />
We waited almost <code class="language-plaintext highlighter-rouge">31</code> seconds to add the new data and almost no time (<code class="language-plaintext highlighter-rouge">20</code> <em>milliseconds</em>) to drop it?
<br />
The <a href="https://www.postgresql.org/docs/12/sql-altertable.html" target="_blank">documentation helps to understand it</a>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The DROP COLUMN form does not physically remove the column, but simply makes it invisible to SQL operations. Subsequent insert and update operations in the table will store a null value for the column. Thus, dropping a column is quick but it will not immediately reduce the on-disk size of your table, as the space occupied by the dropped column is not reclaimed. The space will be reclaimed over time as existing rows are updated.
</code></pre></div></div>
<p>There is no reason to immediately force a table rewrite: the <code class="language-plaintext highlighter-rouge">DROP COLUMN</code> <strong>invalidates the column</strong> so that it has disappeared <em>logically</em> but not <em>physically</em>. Let’s inspect the table and its attributes again:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'foo'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">651</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">attnum</span><span class="p">,</span> <span class="n">attname</span><span class="p">,</span> <span class="n">attisdropped</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span> <span class="n">a</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="k">c</span> <span class="k">ON</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">attrelid</span>
<span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="o">=</span> <span class="s1">'foo'</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">a</span><span class="p">.</span><span class="n">attnum</span> <span class="o">></span> <span class="mi">0</span><span class="p">;</span>
<span class="n">attnum</span> <span class="o">|</span> <span class="n">attname</span> <span class="o">|</span> <span class="n">attisdropped</span>
<span class="c1">--------|------------------------------|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">i</span> <span class="o">|</span> <span class="n">f</span>
<span class="mi">2</span> <span class="o">|</span> <span class="p">........</span><span class="n">pg</span><span class="p">.</span><span class="n">dropped</span><span class="p">.</span><span class="mi">2</span><span class="p">........</span> <span class="o">|</span> <span class="n">t</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p>The table size remained the same, but the <code class="language-plaintext highlighter-rouge">t</code> attribute has been renamed as <code class="language-plaintext highlighter-rouge">........pg.dropped.2........</code> and <strong>is now marked as dropped from the table (<code class="language-plaintext highlighter-rouge">attisdropped = t</code>)</strong>.
<br />
Does that mean that it is possible from SQL to query the dropped column? No, <em>this is not a recycle bin</em> like mechanism:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">i</span><span class="p">,</span> <span class="nv">"........pg.dropped.2........"</span> <span class="k">FROM</span> <span class="n">foo</span> <span class="k">limit</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">column</span> <span class="nv">"........pg.dropped.2........"</span> <span class="n">does</span> <span class="k">not</span> <span class="n">exist</span>
<span class="n">LINE</span> <span class="mi">1</span><span class="p">:</span> <span class="k">SELECT</span> <span class="n">i</span><span class="p">,</span> <span class="nv">"........pg.dropped.2........"</span> <span class="k">FROM</span> <span class="n">foo</span> <span class="k">limit</span> <span class="mi">10</span><span class="p">;</span>
</code></pre></div></div>
<p>However, many of the properties of the column data type, such as its length, are still kept in <code class="language-plaintext highlighter-rouge">pg_attribute</code> to allow the system to handle that column even if the data type itself disappears.</p>
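<p>As a minimal sketch of what is left in the catalog, the following query (my own addition, assuming the same <code class="language-plaintext highlighter-rouge">foo</code> table used so far) lists only the dropped attributes together with a few of their physical properties. According to the catalog documentation, <code class="language-plaintext highlighter-rouge">atttypid</code> is zeroed for a dropped column, while the other values depend on the original data type, so I do not show any output here:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- list the dropped attributes of foo and the physical
-- properties that survive the DROP COLUMN
SELECT attnum, attname, atttypid, attlen, attalign, attstorage
FROM pg_attribute
WHERE attrelid = 'foo'::regclass
AND   attisdropped;
</code></pre></div></div>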
<p><br />
Last, let’s fire a full table rewrite, for example with a <code class="language-plaintext highlighter-rouge">VACUUM FULL</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">VACUUM</span> <span class="k">FULL</span> <span class="n">foo</span><span class="p">;</span>
<span class="k">VACUUM</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">20231</span><span class="p">,</span><span class="mi">232</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">00</span><span class="p">:</span><span class="mi">20</span><span class="p">,</span><span class="mi">231</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'foo'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">346</span> <span class="n">MB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span><span class="mi">519</span> <span class="n">ms</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">attnum</span><span class="p">,</span> <span class="n">attname</span><span class="p">,</span> <span class="n">attisdropped</span>
<span class="k">FROM</span> <span class="n">pg_attribute</span> <span class="n">a</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="k">c</span> <span class="k">ON</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">attrelid</span>
<span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="o">=</span> <span class="s1">'foo'</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">a</span><span class="p">.</span><span class="n">attnum</span> <span class="o">></span> <span class="mi">0</span><span class="p">;</span>
<span class="n">attnum</span> <span class="o">|</span> <span class="n">attname</span> <span class="o">|</span> <span class="n">attisdropped</span>
<span class="c1">--------|------------------------------|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">i</span> <span class="o">|</span> <span class="n">f</span>
<span class="mi">2</span> <span class="o">|</span> <span class="p">........</span><span class="n">pg</span><span class="p">.</span><span class="n">dropped</span><span class="p">.</span><span class="mi">2</span><span class="p">........</span> <span class="o">|</span> <span class="n">t</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p>Judging by the time spent in <code class="language-plaintext highlighter-rouge">VACUUM FULL</code>, something good must have happened, and in fact the table size was reduced to the right (or better, the original) amount of space.
<br />
But why is the dropped column still mentioned in <code class="language-plaintext highlighter-rouge">pg_attribute</code>?
<br />
In this particular case it could have been removed quite easily from <code class="language-plaintext highlighter-rouge">pg_attribute</code> too, but imagine a more complex table where you drop a column in the middle of the attribute list: PostgreSQL would also have to renumber all the following attributes, which is a quite expensive amount of work.
<br />
However, this approach has a potential drawback: since dropped attributes are still listed in <code class="language-plaintext highlighter-rouge">pg_attribute</code> like <em>normal</em> ones, they do count as table attributes and therefore could lower the number of <em>real active</em> attributes you can have in the table.</p>
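<p>To get an idea of how many attribute slots a table is using, a query like the following can help. It is just a sketch of mine against the same <code class="language-plaintext highlighter-rouge">foo</code> table; the column aliases are made up, and the number to compare against is the documented limit of 1600 columns per table:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- count live and dropped user attributes of foo:
-- both kinds consume an attribute number
SELECT count(*) FILTER ( WHERE NOT attisdropped ) AS active_columns,
       count(*) FILTER ( WHERE attisdropped )     AS dropped_columns
FROM pg_attribute
WHERE attrelid = 'foo'::regclass
AND   attnum > 0;
</code></pre></div></div>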
<h1 id="conclusions">Conclusions</h1>
<p>The PostgreSQL way of dropping a column is really fast because it involves only a catalog update. But that also means disk space is not reclaimed, so in order to get it back you need to trigger a full table rewrite.</p>
Executing VACUUM by non-owner user2020-02-06T00:00:00+00:00https://fluca1978.github.io/2020/02/06/VacuumByNotOwners<p>VACUUM needs to be run by the object owner!</p>
<h1 id="executing-vacuum-by-non-owner-user">Executing VACUUM by non-owner user</h1>
<p>The <a href="https://www.postgresql.org/docs/12/sql-vacuum.html" target="_blank">documentation about <code class="language-plaintext highlighter-rouge">VACUUM</code> clearly states it</a>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> To vacuum a table, one must ordinarily be the table's owner or a superuser.
However, database owners are allowed to vacuum all tables in their databases,
except shared catalogs.
[...]
VACUUM cannot be executed inside a transaction block.
</code></pre></div></div>
<p>There is no ACL flag for <code class="language-plaintext highlighter-rouge">VACUUM</code>, which means <strong>you cannot <code class="language-plaintext highlighter-rouge">GRANT</code> someone else the permission to execute <code class="language-plaintext highlighter-rouge">VACUUM</code></strong>.
<br />
Period.
<br />
<br />
Therefore there is no escape: <strong>in order to run <code class="language-plaintext highlighter-rouge">VACUUM</code> you must be either (i) the <em>object owner</em> or (ii) <em>the database owner</em> or, as you can imagine, (iii) one of the cluster superusers.</strong>
<br />
<br />
Why am I insisting on this? Because some friends of mine argued that it is always possible to escape restrictions with functions and the <code class="language-plaintext highlighter-rouge">SECURITY DEFINER</code> option. In this particular case, one could think of defining a function that executes <code class="language-plaintext highlighter-rouge">VACUUM</code>, then applying the <code class="language-plaintext highlighter-rouge">SECURITY DEFINER</code> option so that the function runs as the object owner, and then providing (i.e., <code class="language-plaintext highlighter-rouge">GRANT</code>) execution permission to a normal user.
<br />
<em>WRONG!</em>
<br />
The fact that <code class="language-plaintext highlighter-rouge">VACUUM</code> cannot be executed within a transaction block means you cannot use such an approach, because a function is executed within a transaction block.
<br />
And if you are now asking yourself why <code class="language-plaintext highlighter-rouge">VACUUM</code> cannot be wrapped in a transaction block, just explain to me how to <code class="language-plaintext highlighter-rouge">ROLLBACK</code> a <code class="language-plaintext highlighter-rouge">VACUUM</code> execution: it will be an interesting and fantasyland explanation!</p>
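<p>The transaction block restriction is easy to verify by hand. The following is a minimal sketch, assuming a table named <code class="language-plaintext highlighter-rouge">foo</code> that you own; the error text is the one I expect PostgreSQL to report:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;
-- the next statement is rejected because of the explicit transaction:
-- ERROR:  VACUUM cannot run inside a transaction block
VACUUM foo;
-- the transaction is now aborted, so close it
ROLLBACK;
</code></pre></div></div>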
<p><br />
So, what is going to happen if you define a <code class="language-plaintext highlighter-rouge">VACUUM</code>-function?
Let’s quickly see what the database does:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">do_vacuum</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span> <span class="err">$$</span>
<span class="k">BEGIN</span>
<span class="k">EXECUTE</span> <span class="s1">'VACUUM FULL VERBOSE '</span>
<span class="o">||</span> <span class="n">quote_ident</span><span class="p">(</span> <span class="n">t</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>This will not work, since <strong><code class="language-plaintext highlighter-rouge">VACUUM</code> cannot be invoked by a function</strong> (have I already written this?):</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">do_vacuum</span><span class="p">(</span> <span class="s1">'foo'</span> <span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">VACUUM</span> <span class="n">cannot</span> <span class="n">be</span> <span class="n">executed</span> <span class="k">from</span> <span class="n">a</span> <span class="k">function</span>
<span class="n">CONTEXT</span><span class="p">:</span> <span class="k">SQL</span> <span class="k">statement</span> <span class="nv">"VACUUM FULL VERBOSE foo"</span>
<span class="n">PL</span><span class="o">/</span><span class="n">pgSQL</span> <span class="k">function</span> <span class="n">do_vacuum</span><span class="p">(</span><span class="nb">text</span><span class="p">)</span> <span class="n">line</span> <span class="mi">3</span> <span class="k">at</span> <span class="k">EXECUTE</span>
</code></pre></div></div>
<p>Changing the function into a procedure does not solve the problem, because <strong><code class="language-plaintext highlighter-rouge">VACUUM</code> cannot be invoked by a function</strong> (have I already written this?):</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">PROCEDURE</span>
<span class="n">do_vacuum</span><span class="p">(</span> <span class="n">t</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$$</span>
<span class="k">BEGIN</span>
<span class="k">EXECUTE</span> <span class="s1">'VACUUM FULL VERBOSE '</span>
<span class="o">||</span> <span class="n">quote_ident</span><span class="p">(</span> <span class="n">t</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">PROCEDURE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">CALL</span> <span class="n">do_vacuum</span><span class="p">(</span> <span class="s1">'foo'</span> <span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">VACUUM</span> <span class="n">cannot</span> <span class="n">be</span> <span class="n">executed</span> <span class="k">from</span> <span class="n">a</span> <span class="k">function</span>
<span class="n">CONTEXT</span><span class="p">:</span> <span class="k">SQL</span> <span class="k">statement</span> <span class="nv">"VACUUM FULL VERBOSE foo"</span>
<span class="n">PL</span><span class="o">/</span><span class="n">pgSQL</span> <span class="k">function</span> <span class="n">do_vacuum</span><span class="p">(</span><span class="nb">text</span><span class="p">)</span> <span class="n">line</span> <span class="mi">3</span> <span class="k">at</span> <span class="k">EXECUTE</span>
</code></pre></div></div>
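<p>Since ownership is what really matters, it can be handy to check who owns what before trying. This is only a sketch of mine, assuming the same <code class="language-plaintext highlighter-rouge">foo</code> table; the column aliases are made up, and the query does not take role membership into account:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- who owns the table and the database, and is the
-- current user a superuser?
SELECT c.relname,
       pg_get_userbyid( c.relowner ) AS table_owner,
       pg_get_userbyid( d.datdba )   AS database_owner,
       current_user                  AS me,
       r.rolsuper                    AS am_i_superuser
FROM pg_class c
JOIN pg_database d ON d.datname = current_database()
JOIN pg_roles r    ON r.rolname = current_user
WHERE c.relname = 'foo'
AND   c.relkind = 'r';
</code></pre></div></div>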
<h1 id="conclusions">Conclusions</h1>
<p><strong><code class="language-plaintext highlighter-rouge">VACUUM</code> can be wrapped neither in a transaction nor in a routine</strong>, therefore in order to execute it you must be a “special” user, with special simply meaning the object owner, or the database owner, or a superuser.</p>
PL/PgSQL Exception and XIDs2020-02-05T00:00:00+00:00https://fluca1978.github.io/2020/02/05/PLPGSQLExceptions<p>A few considerations on how exceptions are handled in PL/PgSQL.</p>
<h1 id="plpgsql-exception-and-xids">PL/PgSQL Exception and XIDs</h1>
<p>I read the blog post <a href="https://pgdba.org/post/2020/02/exception_block/" target="_blank">The strange case of the EXCEPTION block</a> where the author was claiming that an <code class="language-plaintext highlighter-rouge">EXCEPTION</code> block in a PL/PgSQL function was incrementing the transaction id (<em>xid</em>).
<br />
Somehow, this was not very surprising to me.
<br />
Why? It immediately reminded me of <a href="https://www.postgresql.org/message-id/CAKoxK%2B5Wm6RazPnU8AqB97XRqx5zbY7us00QSVrQgobbgmf8hQ%40mail.gmail.com" target="_blank">my own question on the general mailing list</a> when I was observing a very similar behaviour within <code class="language-plaintext highlighter-rouge">psql</code>. In particular, <a href="https://www.postgresql.org/message-id/20171106125330.gcozh4ijjrkn6shq%40alvherre.pgsql" target="_blank">this answer was illuminating</a>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> something is using subtransactions there.
My first guess would be that
there are triggers with EXCEPTION blocks
</code></pre></div></div>
<h2 id="my-guess-about-how-exceptions-are-handled">My Guess About How Exceptions Are Handled</h2>
<p><strong>I think PL/PgSQL is using subtransactions (or savepoints) to handle exceptions</strong>.
<br />
Why?
<br />
Well, think about it: when you <em>catch</em> an exception you probably want to resume your execution, which means you must have a way to <em>rollback</em> your unit of work and start over again.</p>
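<p>To make the guess more concrete, this is roughly what I imagine the same pattern looks like when written by hand in plain SQL. It is only a sketch, assuming a table <code class="language-plaintext highlighter-rouge">foo</code> with a unique column <code class="language-plaintext highlighter-rouge">i</code> (like the one defined later in this post) that does not yet contain the values used; the savepoint name is made up:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;
INSERT INTO foo( i ) VALUES( 1 );
-- mark the point to come back to if something goes wrong
SAVEPOINT before_insert;
-- this one fails with a unique violation...
INSERT INTO foo( i ) VALUES( 1 );
-- ...but only the work done after the savepoint is thrown away
ROLLBACK TO SAVEPOINT before_insert;
-- the transaction is still usable
INSERT INTO foo( i ) VALUES( 2 );
COMMIT;
</code></pre></div></div>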
<h2 id="see-transactions-in-action">See Transactions in Action!</h2>
<p>It is possible to inspect the transactions in action with a simple function and a table to abuse.
<br />
There is no need to play around with <code class="language-plaintext highlighter-rouge">VACUUM FREEZE</code> and <code class="language-plaintext highlighter-rouge">age()</code> as the original author says.
<br />
Let’s see the function:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">f_loop</span><span class="p">(</span> <span class="n">b</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="mi">0</span><span class="p">,</span> <span class="n">e</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="mi">10</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span> <span class="err">$$</span>
<span class="k">BEGIN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'TXID of the function (here should not be assigned) function: % %'</span><span class="p">,</span>
<span class="n">txid_current_if_assigned</span><span class="p">(),</span>
<span class="n">txid_status</span><span class="p">(</span> <span class="n">txid_current_if_assigned</span><span class="p">()</span> <span class="p">);</span>
<span class="k">FOR</span> <span class="n">f</span> <span class="k">IN</span> <span class="n">b</span> <span class="p">..</span> <span class="n">e</span> <span class="n">LOOP</span>
<span class="k">BEGIN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Before INSERT of % TXID: % SNAPSHOT: %'</span><span class="p">,</span>
<span class="n">f</span><span class="p">,</span>
<span class="n">txid_current_if_assigned</span><span class="p">(),</span>
<span class="n">txid_current_snapshot</span><span class="p">();</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">foo</span><span class="p">(</span> <span class="n">i</span> <span class="p">)</span> <span class="k">VALUES</span><span class="p">(</span> <span class="n">f</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'After INSERT of % TXID: % SNAPSHOT: %'</span><span class="p">,</span>
<span class="n">f</span><span class="p">,</span>
<span class="n">txid_current_if_assigned</span><span class="p">(),</span>
<span class="n">txid_current_snapshot</span><span class="p">();</span>
<span class="n">EXCEPTION</span>
<span class="k">WHEN</span> <span class="n">UNIQUE_VIOLATION</span>
<span class="k">THEN</span> <span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Exception for % TXID: % SNAPSHOT: %'</span><span class="p">,</span>
<span class="n">f</span><span class="p">,</span>
<span class="n">txid_current_if_assigned</span><span class="p">(),</span>
<span class="n">txid_current_snapshot</span><span class="p">();</span>
<span class="k">END</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span><span class="p">;</span>
<span class="err">$$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>The function accepts begin and end indexes and loops through every value between them, trying to insert each value into a table. At every step, including the exception handler, we inspect <code class="language-plaintext highlighter-rouge">txid_current_if_assigned()</code>, which reports the transaction ID (<code class="language-plaintext highlighter-rouge">xid</code>), and <code class="language-plaintext highlighter-rouge">txid_current_snapshot()</code>, which provides the current snapshot, that is, roughly the minimum and maximum xid this transaction is “flying” over.</p>
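<p>The two functions can also be tried directly from <code class="language-plaintext highlighter-rouge">psql</code>, outside of any PL/PgSQL code. The following is a small sketch of mine: it uses <code class="language-plaintext highlighter-rouge">txid_current()</code>, which is not used in the function above, just to force the assignment of an xid. Actual values depend on your cluster, so I do not show them:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;
-- a transaction that has not written anything yet has no xid: NULL
SELECT txid_current_if_assigned();
-- the snapshot is available anyway, in the form xmin:xmax:xip_list
SELECT txid_current_snapshot();
-- asking for the xid forces the assignment...
SELECT txid_current();
-- ...so that now the same value is reported here too
SELECT txid_current_if_assigned();
COMMIT;
</code></pre></div></div>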
<p><br />
The definition of the table is pretty straightforward: it has a single column declared as <code class="language-plaintext highlighter-rouge">PRIMARY KEY</code>, which implies a <code class="language-plaintext highlighter-rouge">UNIQUE</code> constraint on it. That’s the constraint the function is going to violate.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="p">(</span> <span class="n">i</span> <span class="nb">int</span> <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">);</span>
</code></pre></div></div>
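<p>One practical note: the function reports everything via <code class="language-plaintext highlighter-rouge">RAISE DEBUG</code>, and such messages are not sent to the client with the default settings. Assuming you are running the examples from an interactive session, something like the following should make them visible:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- let the current session receive DEBUG level messages
SET client_min_messages TO debug1;
</code></pre></div></div>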
<h3 id="first-run-no-exceptions">First Run: No Exceptions</h3>
<p>Since the table is empty, inserting values from <code class="language-plaintext highlighter-rouge">1</code> to <code class="language-plaintext highlighter-rouge">10</code> does not produce any exception.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">f_loop</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">10</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">TXID</span> <span class="k">of</span> <span class="n">the</span> <span class="k">function</span> <span class="p">(</span><span class="n">here</span> <span class="n">should</span> <span class="k">not</span> <span class="n">be</span> <span class="n">assigned</span><span class="p">)</span> <span class="k">function</span><span class="p">:</span> <span class="o"><</span><span class="k">NULL</span><span class="o">></span> <span class="o"><</span><span class="k">NULL</span><span class="o">></span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">1</span> <span class="n">TXID</span><span class="p">:</span> <span class="o"><</span><span class="k">NULL</span><span class="o">></span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">1</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">2</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">2</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">3</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">3</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">4</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">4</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">5</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">5</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">6</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">6</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">7</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">7</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">8</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">8</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">9</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">9</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">10</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">10</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4748</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4748</span><span class="p">:</span><span class="mi">4748</span><span class="p">:</span>
<span class="n">f_loop</span>
<span class="c1">--------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>At the beginning of the first run the <code class="language-plaintext highlighter-rouge">xid</code> is <code class="language-plaintext highlighter-rouge">NULL</code> because the function has not (yet) modified anything. That’s why I use <code class="language-plaintext highlighter-rouge">txid_current_if_assigned()</code> instead of <code class="language-plaintext highlighter-rouge">txid_current()</code>: to avoid wasting a number. Once the function starts modifying the data (i.e., after the very first <code class="language-plaintext highlighter-rouge">INSERT</code>) the transaction is <em>promoted</em> from virtual to <em>concrete</em> and so an <code class="language-plaintext highlighter-rouge">xid</code> is assigned.
<br />
Since no exception at all is raised, the <code class="language-plaintext highlighter-rouge">xid</code> of the function is <em>fixed</em> and so is the snapshot.</p>
<h2 id="second-run-half-of-exceptions">Second Run: Half of Exceptions</h2>
<p>Let’s run it with some numbers overlapping, so that roughly half of the values are inserted successfully and the others throw an exception.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">f_loop</span><span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">15</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">TXID</span> <span class="k">of</span> <span class="n">the</span> <span class="k">function</span> <span class="p">(</span><span class="n">here</span> <span class="n">should</span> <span class="k">not</span> <span class="n">be</span> <span class="n">assigned</span><span class="p">)</span> <span class="k">function</span><span class="p">:</span> <span class="o"><</span><span class="k">NULL</span><span class="o">></span> <span class="o"><</span><span class="k">NULL</span><span class="o">></span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">5</span> <span class="n">TXID</span><span class="p">:</span> <span class="o"><</span><span class="k">NULL</span><span class="o">></span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4760</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Exception</span> <span class="k">for</span> <span class="mi">5</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4762</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">6</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4762</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Exception</span> <span class="k">for</span> <span class="mi">6</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4763</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">7</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4763</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Exception</span> <span class="k">for</span> <span class="mi">7</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4764</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">8</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4764</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Exception</span> <span class="k">for</span> <span class="mi">8</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4765</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">9</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4765</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Exception</span> <span class="k">for</span> <span class="mi">9</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4766</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">10</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4766</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Exception</span> <span class="k">for</span> <span class="mi">10</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">11</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">11</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">12</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">12</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">13</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">13</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">14</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">14</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Before</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">15</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">After</span> <span class="k">INSERT</span> <span class="k">of</span> <span class="mi">15</span> <span class="n">TXID</span><span class="p">:</span> <span class="mi">4760</span> <span class="n">SNAPSHOT</span><span class="p">:</span> <span class="mi">4760</span><span class="p">:</span><span class="mi">4767</span><span class="p">:</span>
<span class="n">f_loop</span>
<span class="c1">--------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>As you can see, the first six values (5 through 10) raise an exception. The <code class="language-plaintext highlighter-rouge">xid</code> of the function remains the same, but the snapshot grows from <code class="language-plaintext highlighter-rouge">4760:4760:</code> to <code class="language-plaintext highlighter-rouge">4760:4767:</code>, that is by seven transaction identifiers (one for the function, six for the failed subtransactions).
After that, the remaining five values are successfully inserted and so the snapshot does not grow anymore.</p>
<h3 id="where-are-these-subtransactions">Where are these Subtransactions?</h3>
<p>If you now inspect the MVCC values for the table, you can see that every value inserted has a different transaction id <code class="language-plaintext highlighter-rouge">xmin</code>, without any regard to the fact that the function call did catch an exception or not.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">xmin</span><span class="p">,</span><span class="n">xmax</span><span class="p">,</span> <span class="n">cmin</span><span class="p">,</span> <span class="n">cmax</span><span class="p">,</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">foo</span><span class="p">;</span>
<span class="n">xmin</span> <span class="o">|</span> <span class="n">xmax</span> <span class="o">|</span> <span class="n">cmin</span> <span class="o">|</span> <span class="n">cmax</span> <span class="o">|</span> <span class="n">i</span>
<span class="c1">------|------|------|------|----</span>
<span class="mi">4749</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">4750</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">4751</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span> <span class="o">|</span> <span class="mi">3</span>
<span class="mi">4752</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">4753</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">4</span> <span class="o">|</span> <span class="mi">5</span>
<span class="mi">4754</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="mi">6</span>
<span class="mi">4755</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">6</span> <span class="o">|</span> <span class="mi">6</span> <span class="o">|</span> <span class="mi">7</span>
<span class="mi">4756</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">7</span> <span class="o">|</span> <span class="mi">7</span> <span class="o">|</span> <span class="mi">8</span>
<span class="mi">4757</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">9</span>
<span class="mi">4758</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">9</span> <span class="o">|</span> <span class="mi">9</span> <span class="o">|</span> <span class="mi">10</span>
<span class="mi">4767</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">6</span> <span class="o">|</span> <span class="mi">6</span> <span class="o">|</span> <span class="mi">11</span>
<span class="mi">4768</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">7</span> <span class="o">|</span> <span class="mi">7</span> <span class="o">|</span> <span class="mi">12</span>
<span class="mi">4769</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">13</span>
<span class="mi">4770</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">9</span> <span class="o">|</span> <span class="mi">9</span> <span class="o">|</span> <span class="mi">14</span>
<span class="mi">4771</span> <span class="o">|</span> <span class="mi">0</span> <span class="o">|</span> <span class="mi">10</span> <span class="o">|</span> <span class="mi">10</span> <span class="o">|</span> <span class="mi">15</span>
<span class="p">(</span><span class="mi">15</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="how-to-simulate-the-same-behavior">How to Simulate the Same Behavior</h3>
<p><strong>Savepoints</strong> do pretty much the same! Therefore, let’s truncate the table and insert new values in it with an explicit transaction and savepoints:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>testdb=> TRUNCATE foo;
TRUNCATE TABLE
testdb=> BEGIN;
BEGIN
testdb=>
testdb=> INSERT INTO foo( i ) VALUES( 1 );
INSERT 0 1
testdb=> SAVEPOINT S1;
SAVEPOINT
testdb=>
testdb=> INSERT INTO foo( i ) VALUES( 2 );
INSERT 0 1
testdb=> SAVEPOINT S2;
SAVEPOINT
testdb=>
testdb=> INSERT INTO foo( i ) VALUES( 3 );
INSERT 0 1
testdb=> SAVEPOINT S3;
SAVEPOINT
testdb=>
testdb=> COMMIT;
COMMIT
testdb=> SELECT xmin,xmax, cmin, cmax, * FROM foo;
xmin | xmax | cmin | cmax | i
------|------|------|------|---
4779 | 0 | 0 | 0 | 1
4780 | 0 | 1 | 1 | 2
4781 | 0 | 2 | 2 | 3
(3 rows)
</code></pre></div></div>
<p>As you can see, the <code class="language-plaintext highlighter-rouge">xmin</code> is incremented continuously by every <code class="language-plaintext highlighter-rouge">INSERT</code>.</p>
<h1 id="conclusions">Conclusions</h1>
<p>Exceptions are quite clearly implemented in PL/PgSQL (and possibly in other languages) by means of subtransactions. At least, the behavior is pretty much reproducible.</p>
Checking catalogues for corruption with pg_catcheck2020-01-30T00:00:00+00:00https://fluca1978.github.io/2020/01/30/PostgreSQL_pgcatcheck<p>I just discovered a new utility for checking the health of a cluster.</p>
<h1 id="checking-catalogues-for-corruption-with-pg_catcheck">Checking catalogues for corruption with pg_catcheck</h1>
<p>Today I discovered a nice tool from <a href="https://www.enterprisedb.com/" target="_blank">EnterpriseDB</a> named <a href="https://github.com/EnterpriseDB/pg_catcheck" target="_blank">pg_catcheck</a> that aims at checking the health of the PostgreSQL catalogs.
<br />
As you know, if the catalogs are damaged, the database can quickly get confused and not allow you to use it as you wish. Luckily, this is something that does not happen very often; I think I’ve seen it happen only once during my career (and I don’t remember the cause).
<br />
While I’m not sure I would be able to fix any problem in the catalogues by myself, having a tool that helps me understand whether everything is fine is a relief!</p>
<h2 id="installing-pg_catcheck-on-freebsd">Installing pg_catcheck on FreeBSD</h2>
<p>You need to get it from the <a href="https://github.com/EnterpriseDB/pg_catcheck" target="_blank">project repository</a>. There is at the moment one official release, but let’s use the <code class="language-plaintext highlighter-rouge">HEAD</code> (after all, <em>releases are for feeble people!</em>).</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% git clone https://github.com/EnterpriseDB/pg_catcheck.git
% <span class="nb">cd </span>pg_catcheck
% gmake
...
% <span class="nb">sudo </span>gmake <span class="nb">install</span>
...
</code></pre></div></div>
<p>If everything works fine, you will end up with a program named <code class="language-plaintext highlighter-rouge">pg_catcheck</code> under <code class="language-plaintext highlighter-rouge">/usr/local/bin</code>.</p>
<h2 id="using-pg_catcheck">Using pg_catcheck</h2>
<p>As you can imagine, <strong>you need a database administrator</strong> to perform the check. The application supports pretty much the same connection options as <code class="language-plaintext highlighter-rouge">psql</code>, and there’s an extra option <code class="language-plaintext highlighter-rouge">--postgresql</code> to indicate you are running against a vanilla PostgreSQL (on the other hand, with <code class="language-plaintext highlighter-rouge">--enterprisedb</code> the program will assume you are running an EnterpriseDB instance).</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_catcheck <span class="nt">--postgresql</span> <span class="nt">-U</span> postgres template1
progress: <span class="k">done</span> <span class="o">(</span>0 inconsistencies, 0 warnings, 0 errors<span class="o">)</span>
</code></pre></div></div>
<p>That’s it: if you see <code class="language-plaintext highlighter-rouge">0 inconsistencies</code>, your database is fine.
<br /></p>
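<p>If you want to run the check from a script (say, from <code class="language-plaintext highlighter-rouge">cron</code>), the summary line can be tested automatically. What follows is only a minimal sketch: it assumes that the summary always has the <code class="language-plaintext highlighter-rouge">progress: done (...)</code> layout shown above, and the function name <code class="language-plaintext highlighter-rouge">catcheck_is_clean</code> is made up for this example; it is not part of <code class="language-plaintext highlighter-rouge">pg_catcheck</code>.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper (not shipped with pg_catcheck).
# Reads the pg_catcheck output on standard input and succeeds only if the
# last summary line reports zero inconsistencies, warnings and errors.
# Assumption: the summary line looks exactly like the one shown above.
catcheck_is_clean() {
    # keep only the last summary line, if any
    summary=$( grep 'progress: done' | tail -n 1 )
    case "$summary" in
        *"(0 inconsistencies, 0 warnings, 0 errors)"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage; standard error is merged in because I have not checked
# on which stream the summary is printed:
#
#   pg_catcheck --postgresql -U postgres template1 2>&1 | catcheck_is_clean \
#       || echo "catalog problems detected, have a look!"
</code></pre></div></div>
<p>Matching the whole summary text is rough but intentional: any change in one of the three counters, or no summary at all, makes the function fail, so a problem cannot slip through silently.</p>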
<p>You can see what the program checks with the <code class="language-plaintext highlighter-rouge">--verbose</code> option:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pg_catcheck <span class="nt">--postgresql</span> <span class="nt">-U</span> postgres <span class="nt">-v</span> template1
verbose: detected server version 120001
verbose: assuming PostgreSQL server
verbose: preloading table pg_authid because it is required <span class="k">in </span>order to check pg_namespace
verbose: loading table pg_namespace
verbose: checking table pg_namespace <span class="o">(</span>6 rows<span class="o">)</span>
verbose: loading table pg_collation
verbose: checking table pg_collation <span class="o">(</span>1110 rows<span class="o">)</span>
verbose: loading table pg_tablespace
verbose: checking table pg_tablespace <span class="o">(</span>2 rows<span class="o">)</span>
verbose: loading table pg_language
verbose: checking table pg_language <span class="o">(</span>4 rows<span class="o">)</span>
verbose: loading table pg_database
verbose: checking table pg_database <span class="o">(</span>4 rows<span class="o">)</span>
verbose: loading table pg_largeobject_metadata
verbose: checking table pg_largeobject_metadata <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_publication
verbose: checking table pg_publication <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_subscription
verbose: checking table pg_subscription <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_default_acl
verbose: checking table pg_default_acl <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_largeobject
verbose: checking table pg_largeobject <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_db_role_setting
verbose: checking table pg_db_role_setting <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_auth_members
verbose: checking table pg_auth_members <span class="o">(</span>8 rows<span class="o">)</span>
verbose: preloading table pg_class because it is required <span class="k">in </span>order to check pg_type
verbose: loading table pg_type
verbose: checking table pg_type <span class="o">(</span>406 rows<span class="o">)</span>
verbose: loading table pg_proc
verbose: checking table pg_proc <span class="o">(</span>2960 rows<span class="o">)</span>
verbose: loading table pg_operator
verbose: checking table pg_operator <span class="o">(</span>770 rows<span class="o">)</span>
verbose: loading table pg_ts_parser
verbose: checking table pg_ts_parser <span class="o">(</span>1 rows<span class="o">)</span>
verbose: loading table pg_ts_config
verbose: checking table pg_ts_config <span class="o">(</span>22 rows<span class="o">)</span>
verbose: loading table pg_ts_template
verbose: checking table pg_ts_template <span class="o">(</span>5 rows<span class="o">)</span>
verbose: loading table pg_ts_dict
verbose: checking table pg_ts_dict <span class="o">(</span>22 rows<span class="o">)</span>
verbose: loading table pg_foreign_data_wrapper
verbose: checking table pg_foreign_data_wrapper <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_foreign_server
verbose: checking table pg_foreign_server <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_cast
verbose: checking table pg_cast <span class="o">(</span>216 rows<span class="o">)</span>
verbose: loading table pg_conversion
verbose: checking table pg_conversion <span class="o">(</span>132 rows<span class="o">)</span>
verbose: loading table pg_extension
verbose: checking table pg_extension <span class="o">(</span>1 rows<span class="o">)</span>
verbose: loading table pg_enum
verbose: checking table pg_enum <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_user_mapping
verbose: checking table pg_user_mapping <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_event_trigger
verbose: checking table pg_event_trigger <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_rewrite
verbose: checking table pg_rewrite <span class="o">(</span>126 rows<span class="o">)</span>
verbose: loading table pg_attrdef
verbose: checking table pg_attrdef <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_policy
verbose: checking table pg_policy <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_publication_rel
verbose: checking table pg_publication_rel <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_statistic_ext
verbose: checking table pg_statistic_ext <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_transform
verbose: checking table pg_transform <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_attribute
verbose: checking table pg_attribute <span class="o">(</span>2913 rows<span class="o">)</span>
verbose: loading table pg_foreign_table
verbose: checking table pg_foreign_table <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_inherits
verbose: checking table pg_inherits <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_aggregate
verbose: checking table pg_aggregate <span class="o">(</span>136 rows<span class="o">)</span>
verbose: loading table pg_ts_config_map
verbose: checking table pg_ts_config_map <span class="o">(</span>418 rows<span class="o">)</span>
verbose: loading table pg_statistic
verbose: checking table pg_statistic <span class="o">(</span>422 rows<span class="o">)</span>
verbose: loading table pg_init_privs
verbose: checking table pg_init_privs <span class="o">(</span>171 rows<span class="o">)</span>
verbose: loading table pg_sequence
verbose: checking table pg_sequence <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_subscription_rel
verbose: checking table pg_subscription_rel <span class="o">(</span>0 rows<span class="o">)</span>
verbose: preloading table pg_am because it is required <span class="k">in </span>order to check pg_opfamily
verbose: loading table pg_opfamily
verbose: checking table pg_opfamily <span class="o">(</span>107 rows<span class="o">)</span>
verbose: checking table pg_class <span class="o">(</span>395 rows<span class="o">)</span>
verbose: loading table pg_opclass
verbose: checking table pg_opclass <span class="o">(</span>128 rows<span class="o">)</span>
verbose: loading table pg_amop
verbose: checking table pg_amop <span class="o">(</span>715 rows<span class="o">)</span>
verbose: loading table pg_amproc
verbose: checking table pg_amproc <span class="o">(</span>447 rows<span class="o">)</span>
verbose: loading table pg_index
verbose: checking table pg_index <span class="o">(</span>159 rows<span class="o">)</span>
verbose: loading table pg_constraint
verbose: checking table pg_constraint <span class="o">(</span>2 rows<span class="o">)</span>
verbose: loading table pg_trigger
verbose: checking table pg_trigger <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_range
verbose: checking table pg_range <span class="o">(</span>6 rows<span class="o">)</span>
verbose: loading table pg_depend
verbose: checking table pg_depend <span class="o">(</span>7601 rows<span class="o">)</span>
verbose: loading table pg_shdepend
verbose: checking table pg_shdepend <span class="o">(</span>16 rows<span class="o">)</span>
verbose: loading table pg_description
verbose: checking table pg_description <span class="o">(</span>4744 rows<span class="o">)</span>
verbose: loading table pg_shdescription
verbose: checking table pg_shdescription <span class="o">(</span>3 rows<span class="o">)</span>
verbose: loading table pg_seclabel
verbose: checking table pg_seclabel <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_shseclabel
verbose: checking table pg_shseclabel <span class="o">(</span>0 rows<span class="o">)</span>
verbose: loading table pg_partitioned_table
verbose: checking table pg_partitioned_table <span class="o">(</span>0 rows<span class="o">)</span>
progress: <span class="k">done</span> <span class="o">(</span>0 inconsistencies, 0 warnings, 0 errors<span class="o">)</span>
</code></pre></div></div>
<p>Thanks to <a href="https://www.enterprisedb.com/" target="_blank">EnterpriseDB</a> for making <a href="https://github.com/EnterpriseDB/pg_catcheck" target="_blank">pg_catcheck</a> open source!</p>
PostgreSQL 12 EXPLAIN SETTINGS2019-12-05T00:00:00+00:00https://fluca1978.github.io/2019/12/05/Explain_settings<p>PostgreSQL 12 has a very interesting feature to turn on when doing an execution plan analysis.</p>
<h1 id="postgresql-12-explain-settings">PostgreSQL 12 EXPLAIN SETTINGS</h1>
<p>PostgreSQL 12 has a new feature that can be turned on in the <code class="language-plaintext highlighter-rouge">EXPLAIN</code> output: <code class="language-plaintext highlighter-rouge">SETTINGS</code>. This option provides some information about <strong>all and only those</strong> parameters that can affect an execution plan <strong>if and only if</strong> they are not at the default setting.
<br />
What does it mean in practice? Let’s see a <em>plain old <code class="language-plaintext highlighter-rouge">EXPLAIN</code></em>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">digikamdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span><span class="n">FORMAT</span> <span class="n">YAML</span><span class="p">)</span>
<span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">digikam_images</span>
<span class="k">WHERE</span> <span class="n">id</span> <span class="k">IN</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">id</span> <span class="k">FROM</span> <span class="n">digikam_images</span>
<span class="k">WHERE</span> <span class="n">modificationdate</span> <span class="o">=</span> <span class="s1">'2019-10-04'</span> <span class="p">);</span>
</code></pre></div></div>
<p>the output is as follows:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="pi">-</span> <span class="na">Plan</span><span class="pi">:</span> <span class="s">+</span>
<span class="s">Node Type</span><span class="err">:</span> <span class="s2">"</span><span class="s">Nested</span><span class="nv"> </span><span class="s">Loop"</span> <span class="s">+</span>
<span class="s">Parallel Aware</span><span class="err">:</span> <span class="no">false</span><span class="s"> +</span>
<span class="s">Join Type</span><span class="err">:</span> <span class="s2">"</span><span class="s">Inner"</span> <span class="s">+</span>
<span class="s">Startup Cost</span><span class="err">:</span> <span class="s">0.29 +</span>
<span class="s">Total Cost</span><span class="err">:</span> <span class="s">1737.95 +</span>
<span class="s">Plan Rows</span><span class="err">:</span> <span class="s">17 +</span>
<span class="s">Plan Width</span><span class="err">:</span> <span class="s">87 +</span>
<span class="s">Inner Unique</span><span class="err">:</span> <span class="no">true</span><span class="s"> +</span>
<span class="s">Plans</span><span class="err">:</span> <span class="s">+</span>
<span class="s">- Node Type</span><span class="err">:</span> <span class="s2">"</span><span class="s">Seq</span><span class="nv"> </span><span class="s">Scan"</span> <span class="s">+</span>
<span class="s">Parent Relationship</span><span class="err">:</span> <span class="s2">"</span><span class="s">Outer"</span> <span class="s">+</span>
<span class="s">Parallel Aware</span><span class="err">:</span> <span class="no">false</span><span class="s"> +</span>
<span class="s">Relation Name</span><span class="err">:</span> <span class="s2">"</span><span class="s">digikam_images"</span> <span class="s">+</span>
<span class="s">Alias</span><span class="err">:</span> <span class="s2">"</span><span class="s">digikam_images_1"</span> <span class="s">+</span>
<span class="s">Startup Cost</span><span class="err">:</span> <span class="s">0.00 +</span>
<span class="s">Total Cost</span><span class="err">:</span> <span class="s">1596.72 +</span>
<span class="s">Plan Rows</span><span class="err">:</span> <span class="s">17 +</span>
<span class="s">Plan Width</span><span class="err">:</span> <span class="s">8 +</span>
<span class="s">Filter</span><span class="err">:</span> <span class="s2">"</span><span class="s">(modificationdate</span><span class="nv"> </span><span class="s">=</span><span class="nv"> </span><span class="s">'2019-10-04'::date)"+</span>
<span class="s">- Node Type</span><span class="err">:</span> <span class="s2">"</span><span class="s">Index</span><span class="nv"> </span><span class="s">Scan"</span> <span class="s">+</span>
<span class="s">Parent Relationship</span><span class="err">:</span> <span class="s2">"</span><span class="s">Inner"</span> <span class="s">+</span>
<span class="s">Parallel Aware</span><span class="err">:</span> <span class="no">false</span><span class="s"> +</span>
<span class="s">Scan Direction</span><span class="err">:</span> <span class="s2">"</span><span class="s">Forward"</span> <span class="s">+</span>
<span class="s">Index Name</span><span class="err">:</span> <span class="s2">"</span><span class="s">idx_id"</span> <span class="s">+</span>
<span class="s">Relation Name</span><span class="err">:</span> <span class="s2">"</span><span class="s">digikam_images"</span> <span class="s">+</span>
<span class="s">Alias</span><span class="err">:</span> <span class="s2">"</span><span class="s">digikam_images"</span> <span class="s">+</span>
<span class="s">Startup Cost</span><span class="err">:</span> <span class="s">0.29 +</span>
<span class="s">Total Cost</span><span class="err">:</span> <span class="s">8.31 +</span>
<span class="s">Plan Rows</span><span class="err">:</span> <span class="s">1 +</span>
<span class="s">Plan Width</span><span class="err">:</span> <span class="s">87 +</span>
<span class="s">Index Cond</span><span class="err">:</span> <span class="s2">"</span><span class="s">(id</span><span class="nv"> </span><span class="s">=</span><span class="nv"> </span><span class="s">digikam_images_1.id)"</span>
</code></pre></div></div>
<p>The output is quite long, and the query is intentionally silly, just to generate some kind of loop.
Please note that I’m using <code class="language-plaintext highlighter-rouge">yaml</code> as the output format because it renders better on a web page.
<br />
<br />
Let’s see <code class="language-plaintext highlighter-rouge">SETTINGS</code> in action, so change the <code class="language-plaintext highlighter-rouge">EXPLAIN</code> as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">digikamdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span><span class="n">FORMAT</span> <span class="n">YAML</span><span class="p">,</span> <span class="n">SETTINGS</span> <span class="k">ON</span><span class="p">)</span>
<span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">digikam_images</span>
<span class="k">WHERE</span> <span class="n">id</span> <span class="k">IN</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">id</span> <span class="k">FROM</span> <span class="n">digikam_images</span>
<span class="k">WHERE</span> <span class="n">modificationdate</span> <span class="o">=</span> <span class="s1">'2019-10-04'</span> <span class="p">);</span>
</code></pre></div></div>
<p>that produces <em>the very same output</em>!
<br />
Why? Because nothing has changed, so nothing must be shown!
<br />
Now, let’s change a parameter or two:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">digikamdb</span><span class="o">=></span> <span class="k">SET</span> <span class="n">seq_page_cost</span> <span class="k">TO</span> <span class="mi">4</span><span class="p">;</span>
<span class="n">digikamdb</span><span class="o">=></span> <span class="k">SET</span> <span class="n">random_page_cost</span> <span class="k">TO</span> <span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>
<p>and see again the <code class="language-plaintext highlighter-rouge">EXPLAIN</code> in action:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">digikamdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span><span class="n">FORMAT</span> <span class="n">YAML</span><span class="p">,</span> <span class="n">SETTINGS</span> <span class="k">ON</span><span class="p">)</span>
<span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">digikam_images</span>
<span class="k">WHERE</span> <span class="n">id</span> <span class="k">IN</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">id</span> <span class="k">FROM</span> <span class="n">digikam_images</span>
<span class="k">WHERE</span> <span class="n">modificationdate</span> <span class="o">=</span> <span class="s1">'2019-10-04'</span> <span class="p">);</span>
<span class="p">...</span>
<span class="o">-</span> <span class="n">Plan</span><span class="p">:</span> <span class="o">+</span>
<span class="n">Node</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"Nested Loop"</span> <span class="o">+</span>
<span class="n">Parallel</span> <span class="n">Aware</span><span class="p">:</span> <span class="k">false</span> <span class="o">+</span>
<span class="k">Join</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"Inner"</span> <span class="o">+</span>
<span class="n">Startup</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">29</span> <span class="o">+</span>
<span class="n">Total</span> <span class="n">Cost</span><span class="p">:</span> <span class="mi">4353</span><span class="p">.</span><span class="mi">95</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="k">Rows</span><span class="p">:</span> <span class="mi">17</span> <span class="o">+</span>
<span class="n">Plan</span> <span class="n">Width</span><span class="p">:</span> <span class="mi">87</span> <span class="o">+</span>
<span class="k">Inner</span> <span class="k">Unique</span><span class="p">:</span> <span class="k">true</span> <span class="o">+</span>
<span class="n">Plans</span><span class="p">:</span> <span class="o">+</span>
<span class="o">-</span> <span class="n">Node</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"Seq Scan"</span> <span class="o">+</span>
<span class="p">...</span>
<span class="o">-</span> <span class="n">Node</span> <span class="k">Type</span><span class="p">:</span> <span class="nv">"Index Scan"</span> <span class="o">+</span>
<span class="p">...</span>
<span class="n">Settings</span><span class="p">:</span> <span class="o">+</span>
<span class="n">random_page_cost</span><span class="p">:</span> <span class="nv">"1"</span> <span class="o">+</span>
<span class="n">seq_page_cost</span><span class="p">:</span> <span class="nv">"4"</span>
</code></pre></div></div>
<p>As you can see, there is another section at the end of the output, titled <code class="language-plaintext highlighter-rouge">Settings</code>, that reminds us which parameters have been changed and what their current values are.
<br />
<br />
In this way, it is possible to get an idea of why a plan looks the way it does, or at least to be reminded that the system is running with non-default parameters.</p>
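<p>When experimenting like this, it can be handy to keep the changed parameters confined to a single transaction. The following is only a sketch, reusing the same <code class="language-plaintext highlighter-rouge">digikam_images</code> query as above: <code class="language-plaintext highlighter-rouge">SET LOCAL</code> makes the values last until the end of the transaction, so the <code class="language-plaintext highlighter-rouge">Settings</code> section should appear only for the <code class="language-plaintext highlighter-rouge">EXPLAIN</code> run inside it:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;
-- the values are visible only inside this transaction
SET LOCAL seq_page_cost TO 4;
SET LOCAL random_page_cost TO 1;

EXPLAIN (FORMAT YAML, SETTINGS ON)
SELECT * FROM digikam_images
WHERE id IN ( SELECT id FROM digikam_images
              WHERE modificationdate = '2019-10-04' );

-- both parameters go back to their previous values
ROLLBACK;
</code></pre></div></div>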
<h2 id="are-all-parameters-affected">Are all parameters affected?</h2>
<p>Reading the documentation <a href="https://www.postgresql.org/docs/12/sql-explain.html" target="_blank">about <code class="language-plaintext highlighter-rouge">SETTINGS</code></a> one could think that only those parameters that are part of an access method are going to be reported in the output of <code class="language-plaintext highlighter-rouge">EXPLAIN</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SETTINGS
Include information on configuration parameters.
Specifically, include options affecting query planning
with value different from the built-in default value.
This parameter defaults to FALSE.
</code></pre></div></div>
<p>However, even parameters that are not going to change the query plan are displayed. For example, when selecting all the tuples, there is no need to know that the random page cost has changed, but it is displayed anyway:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">digikamdb</span><span class="o">=></span> <span class="k">RESET</span> <span class="k">ALL</span><span class="p">;</span>
<span class="n">digikamdb</span><span class="o">=></span> <span class="k">SET</span> <span class="n">seq_page_cost</span> <span class="k">TO</span> <span class="mi">2</span><span class="p">;</span>
<span class="n">digikamdb</span><span class="o">=></span> <span class="k">SET</span> <span class="n">random_page_cost</span> <span class="k">TO</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">digikamdb</span><span class="o">=></span> <span class="k">EXPLAIN</span> <span class="p">(</span><span class="n">SETTINGS</span> <span class="k">ON</span><span class="p">)</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">digikam_images</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">----------------------------------------------------------------------</span>
<span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">digikam_images</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">2364</span><span class="p">.</span><span class="mi">58</span> <span class="k">rows</span><span class="o">=</span><span class="mi">55258</span> <span class="n">width</span><span class="o">=</span><span class="mi">87</span><span class="p">)</span>
<span class="n">Settings</span><span class="p">:</span> <span class="n">random_page_cost</span> <span class="o">=</span> <span class="s1">'1'</span><span class="p">,</span> <span class="n">seq_page_cost</span> <span class="o">=</span> <span class="s1">'2'</span>
</code></pre></div></div>
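<p>The opposite case is worth a quick check too. As a sketch, and assuming that <code class="language-plaintext highlighter-rouge">statement_timeout</code> is not among the options flagged as relevant to the planner (it is not in the list reported later in this post), changing it alone should not make any <code class="language-plaintext highlighter-rouge">Settings</code> line appear:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RESET ALL;
-- a parameter that does not take part in query planning
SET statement_timeout TO '5s';

-- assumption: no Settings line is expected in this plan
EXPLAIN (SETTINGS ON) SELECT * FROM digikam_images;
</code></pre></div></div>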
<h2 id="which-parameters">Which parameters?</h2>
<p>There are several parameters, other than the obvious <em>costs</em>, that can be reported in the <code class="language-plaintext highlighter-rouge">SETTINGS</code> section. An example is <code class="language-plaintext highlighter-rouge">work_mem</code>. Reading the commit <a href="https://github.com/postgres/postgres/commit/ea569d64ac7174d3fe657e3e682d11053ecf1866" target="_blank">ea569d64ac7174d3fe657e3e682d11053ecf1866</a> reveals that all the options marked in the source code with <code class="language-plaintext highlighter-rouge">GUC_EXPLAIN</code> are candidates to be printed.
<br />
So far, this resolves to the following long list, where I have marked in bold those that I usually configure (or that I have seen others touch):</p>
<ul>
<li><strong><code class="language-plaintext highlighter-rouge">enable_seqscan</code>, <code class="language-plaintext highlighter-rouge">enable_indexscan</code>, <code class="language-plaintext highlighter-rouge">enable_indexonlyscan</code>, <code class="language-plaintext highlighter-rouge">enable_bitmapscan</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">temp_buffers</code>, <code class="language-plaintext highlighter-rouge">work_mem</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">max_parallel_workers_per_gather</code>, <code class="language-plaintext highlighter-rouge">max_parallel_workers</code>, <code class="language-plaintext highlighter-rouge">enable_gathermerge</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">effective_cache_size</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">min_parallel_table_scan_size</code>, <code class="language-plaintext highlighter-rouge">min_parallel_index_scan_size</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">enable_parallel_append</code>, <code class="language-plaintext highlighter-rouge">enable_parallel_hash</code>, <code class="language-plaintext highlighter-rouge">enable_partition_pruning</code></strong>;</li>
<li><strong><code class="language-plaintext highlighter-rouge">enable_nestloop</code>, <code class="language-plaintext highlighter-rouge">enable_mergejoin</code>, <code class="language-plaintext highlighter-rouge">enable_hashjoin</code></strong>;</li>
<li><code class="language-plaintext highlighter-rouge">enable_tidscan</code>;</li>
<li><code class="language-plaintext highlighter-rouge">enable_sort</code>;</li>
<li><code class="language-plaintext highlighter-rouge">enable_hashagg</code>;</li>
<li><code class="language-plaintext highlighter-rouge">enable_material</code>;</li>
<li><code class="language-plaintext highlighter-rouge">enable_partitionwise_join</code>;</li>
<li><code class="language-plaintext highlighter-rouge">enable_partitionwise_aggregate</code>;</li>
<li><code class="language-plaintext highlighter-rouge">geqo</code>;</li>
<li><code class="language-plaintext highlighter-rouge">optimize_bounded_sort</code>;</li>
<li><code class="language-plaintext highlighter-rouge">parallel_leader_participation</code>;</li>
<li><code class="language-plaintext highlighter-rouge">jit</code>;</li>
<li><code class="language-plaintext highlighter-rouge">from_collapse_limit</code>;</li>
<li><code class="language-plaintext highlighter-rouge">join_collapse_limit</code>;</li>
<li><code class="language-plaintext highlighter-rouge">geqo_threshold</code>;</li>
<li><code class="language-plaintext highlighter-rouge">geqo_effort</code>;</li>
<li><code class="language-plaintext highlighter-rouge">geqo_pool_size</code>;</li>
<li><code class="language-plaintext highlighter-rouge">geqo_generations</code>;</li>
<li><code class="language-plaintext highlighter-rouge">effective_io_concurrency</code>;</li>
</ul>
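<p>To get a rough idea of what <code class="language-plaintext highlighter-rouge">EXPLAIN (SETTINGS ON)</code> would report in the current session, one can compare the current value of a parameter against its built-in default in the <code class="language-plaintext highlighter-rouge">pg_settings</code> catalog view. This is only an approximation: the <code class="language-plaintext highlighter-rouge">GUC_EXPLAIN</code> flag is not a column of the view, so the names to check (here just a handful picked from the list above) must be spelled out by hand:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- boot_val is the built-in default, setting is the current value
SELECT name, setting, boot_val
  FROM pg_settings
 WHERE name IN ( 'seq_page_cost', 'random_page_cost', 'work_mem',
                 'effective_cache_size', 'enable_seqscan', 'jit' )
   AND setting IS DISTINCT FROM boot_val
 ORDER BY name;
</code></pre></div></div>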
<h2 id="what-about-auto_explain">What about <code class="language-plaintext highlighter-rouge">auto_explain</code>?</h2>
<p>The new <code class="language-plaintext highlighter-rouge">SETTINGS</code> option also affects the <code class="language-plaintext highlighter-rouge">auto_explain</code> tuning and output: there is a new GUC named <a href="https://www.postgresql.org/docs/12/auto-explain.html"><code class="language-plaintext highlighter-rouge">auto_explain.log_settings</code></a> that provides the same functionality as above for the <code class="language-plaintext highlighter-rouge">auto_explain</code> module.</p>
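<p>As a minimal sketch, and assuming a superuser session (the <code class="language-plaintext highlighter-rouge">auto_explain</code> parameters can be changed only by superusers), the module can be tried out for the current session as follows; the threshold of zero is just an example value that logs every plan:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- load the module only for this session
LOAD 'auto_explain';
-- log the plan of every statement, whatever its duration
SET auto_explain.log_min_duration TO 0;
-- include the modified planner-related parameters in the logged plan
SET auto_explain.log_settings TO on;
</code></pre></div></div>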
<h1 id="conclusions">Conclusions</h1>
<p>The <code class="language-plaintext highlighter-rouge">EXPLAIN (SETTINGS ON)</code> new feature is something really cool in my opinion that pretty much every DBA should turn on when inspecting query execution plans.</p>
PostgreSQL ascii logo for FreeBSD boot loader2019-11-12T00:00:00+00:00https://fluca1978.github.io/2019/11/12/PostgreSQLFreeBSDLOGO<p>I spent some time making an elephant logo to be used as FreeBSD boot loader logo.</p>
<h1 id="postgresql-ascii-logo-for-freebsd-boot-loader">PostgreSQL ascii logo for FreeBSD boot loader</h1>
<p>I use FreeBSD as my main PostgreSQL server, and also as a virtual machine for training courses. A long time ago, I changed the <em>message of the day</em> (<code class="language-plaintext highlighter-rouge">/etc/motd</code>) to show the elephant logo in ascii-art, but why not also change the bootloader logo?
<br />
FreeBSD by default shows what is called the <em>orb</em> or the devil (named <em>beastie</em>), and the new <a href="https://www.lua.org/">Lua</a> based bootloader uses some <em>simple string concatenation</em> to generate a logo.
However, it was not so simple to make a new logo, since I have no idea how to debug it in production, and that forced me into a very long and repetitive <em>try and reboot</em> process to identify all the problems with my logos.
<br />
<strong>At last, I made it!</strong>
<br />
Now there are two available logos for the bootloader, providing the black-and-white and the coloured elephant respectively. Below you can see a couple of screenshots:</p>
<p><br />
<br /></p>
<center>
<img src="/images/posts/freebsd_logo/logo-postgresql-color.png" />
<br />
<br />
<img src="/images/posts/freebsd_logo/logo-postgresql-bw.png" />
</center>
<p><br /></p>
<h2 id="how-to-use-the-postgresql-bootloader-logo">How to use the PostgreSQL bootloader logo</h2>
<p>In order to use one of the logos, you have to:</p>
<ul>
<li>download the Lua script from my <a href="https://github.com/fluca1978/fluca1978-pg-utils/tree/master/logos">Github repository</a>; within the <code class="language-plaintext highlighter-rouge">logos</code> directory you can find the files
<ul>
<li><a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/logos/logo-postgresql.lua"><code class="language-plaintext highlighter-rouge">logo-postgresql.lua</code></a> that is the coloured version of the logo;</li>
<li><a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/logos/logo-postgresqlbw.lua"><code class="language-plaintext highlighter-rouge">logo-postgresqlbw.lua</code></a> that is the black-and-white version of the logo;</li>
</ul>
</li>
<li>put the chosen file into the <code class="language-plaintext highlighter-rouge">/boot/lua</code> directory and provide read permissions;</li>
<li>edit your <code class="language-plaintext highlighter-rouge">/boot/loader.conf</code> and add the setting <code class="language-plaintext highlighter-rouge">loader_logo</code> depending on the chosen logo
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># for the coloured version</span>
<span class="nv">loader_logo</span><span class="o">=</span><span class="s2">"postgresql"</span>
<span class="c"># or for the black-and-white version</span>
<span class="c"># loader_logo="postgresqlbw"</span>
</code></pre></div> </div>
</li>
<li>and of course, reboot!</li>
</ul>
<h3 id="why-cyan-instead-of-blue">Why <code class="language-plaintext highlighter-rouge">cyan</code> instead of <code class="language-plaintext highlighter-rouge">blue</code>?</h3>
<p>You have probably noticed that the coloured elephant is drawn in <code class="language-plaintext highlighter-rouge">cyan</code> and not in the well known <code class="language-plaintext highlighter-rouge">blue</code>. The reason is that the console foreground blue is too dark to make the elephant stand out. It is possible to manipulate the escape sequences in order to get a different color, but please note that, for a reason I don’t know, the bright colors (e.g., escape sequences like <code class="language-plaintext highlighter-rouge">94</code>) are not working in the bootloader.</p>
<h2 id="how-to-use-the-etcmotd-logo">How to use the <code class="language-plaintext highlighter-rouge">/etc/motd</code> logo</h2>
<p>In the <a href="https://github.com/fluca1978/fluca1978-pg-utils/tree/master/logos">logos directory</a> there is also the <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/logos/motd">motd</a> file, or better, an example of message-of-the-day. Place it on your machine and customize it as you wish.</p>
<h2 id="what-about-the-ascii-art">What about the ascii art?</h2>
<p>Credit for the ascii art goes to Oleg Bartunov, even if I’m no longer able to find the message thread where he proposed the elephant logo. Thanks also to <a href="https://www.postgresql.org/message-id/57386570.8090703%40swisspug.org">Charles Clavadetscher</a>, who provided another version.</p>
<h2 id="feel-free-to-contribute">Feel free to contribute!</h2>
<p>As usual, having posted the logos on my <a href="https://github.com/fluca1978/fluca1978-pg-utils/tree/master/logos">Github repository</a>, any contribution and improvement is welcome.</p>
PostgreSQL 12 Generated Columns2019-11-04T00:00:00+00:00https://fluca1978.github.io/2019/11/04/PostgreSQL12GeneratedColumns<p>PostgreSQL 12 provides support for automatically computed columns.</p>
<h1 id="postgresql-12-generated-columns">PostgreSQL 12 Generated Columns</h1>
<p>PostgreSQL 12 introduces the <a href="https://www.postgresql.org/docs/12/ddl-generated-columns.html">generated columns</a>, columns that are automatically computed depending on a <em>generation expression</em>.
<br />
The usage of generated columns is quite simple and can be summarized as follows:</p>
<ul>
<li>the column must be annotated with the <code class="language-plaintext highlighter-rouge">GENERATED ALWAYS AS (...) STORED</code> instruction;</li>
<li>the expression in parentheses must use only <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> functions and cannot use subqueries.
<br />
For more specific constraints, see the <a href="https://www.postgresql.org/docs/12/ddl-generated-columns.html">official documentation</a>.
<br />
<br /></li>
</ul>
<p>Please note I’ve indicated the <code class="language-plaintext highlighter-rouge">STORED</code> clause because at the moment PostgreSQL supports only that kind of column: a <code class="language-plaintext highlighter-rouge">STORED</code> generated column is saved on disk as a normal column would be; the only difference is that you cannot modify it yourself, because the database computes it for you.
<br />
<br />
You can think of a stored generated column as a trade-off between a table with a trigger and a materialized view. When <code class="language-plaintext highlighter-rouge">VIRTUAL</code> (as opposed to <code class="language-plaintext highlighter-rouge">STORED</code>) columns are implemented, the column will take no space at all and will be computed on each access, similarly to a view.</p>
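<p>Before moving to a bigger example, here is a minimal sketch of the syntax. The <code class="language-plaintext highlighter-rouge">items</code> table and its columns are made up just for illustration: the generation expression uses only plain arithmetic on other columns of the same row, so it satisfies the immutability requirement.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical table, used only to show the syntax
CREATE TABLE items(
   price numeric,
   qty   int,
   total numeric GENERATED ALWAYS AS ( price * qty ) STORED );

-- the generated column is never listed in the INSERT
INSERT INTO items( price, qty ) VALUES ( 2.5, 4 );

-- total has been computed by the database as price * qty
SELECT price, qty, total FROM items;

-- a direct assignment such as the following is rejected:
-- UPDATE items SET total = 1;
</code></pre></div></div>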
<h2 id="an-example-of-not-generated-column">An example of not-generated column</h2>
<p>Let’s see generated columns in action: consider an ordinary table with a dependency between the <code class="language-plaintext highlighter-rouge">age</code> column and the <code class="language-plaintext highlighter-rouge">birthday</code> one, since the former can be computed from the values in the latter column:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">people</span><span class="p">(</span>
<span class="n">name</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">birthday</span> <span class="nb">date</span><span class="p">,</span>
<span class="n">age</span> <span class="nb">int</span> <span class="p">);</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">WITH</span> <span class="nb">year</span> <span class="k">AS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="p">(</span> <span class="n">random</span><span class="p">()</span> <span class="o">*</span> <span class="mi">100</span> <span class="p">)::</span><span class="nb">int</span> <span class="o">%</span> <span class="mi">70</span> <span class="k">AS</span> <span class="n">y</span>
<span class="p">)</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">people</span><span class="p">(</span> <span class="n">name</span><span class="p">,</span> <span class="n">age</span><span class="p">,</span> <span class="n">birthday</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'Person '</span> <span class="o">||</span> <span class="n">v</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="k">current_date</span> <span class="o">-</span> <span class="p">(</span> <span class="n">y</span> <span class="o">*</span> <span class="mi">365</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">,</span> <span class="nb">year</span><span class="p">;</span>
</code></pre></div></div>
<p>Let’s see how much space the table occupies once it is filled with one million rows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'people'</span> <span class="p">);</span>
<span class="n">pg_relation_size</span>
<span class="c1">------------------</span>
<span class="mi">52183040</span>
</code></pre></div></div>
<h2 id="an-example-with-generated-columns">An example with generated columns</h2>
<p>Let’s now create a similar table where <code class="language-plaintext highlighter-rouge">age</code> is computed automatically.
<br />
Since the generation expression may only use <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> functions, the first step is to wrap the computation in a function:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">f_person_age</span><span class="p">(</span> <span class="n">birthday</span> <span class="nb">date</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">int</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="k">RETURN</span> <span class="k">extract</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">CURRENT_DATE</span> <span class="p">)</span>
<span class="o">-</span> <span class="k">extract</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="n">birthday</span> <span class="p">)</span>
<span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span> <span class="k">IMMUTABLE</span><span class="p">;</span>
</code></pre></div></div>
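<p>To try the function on its own before wiring it into a table, it can be called directly; the date literal below is arbitrary. The second query is a small sketch that reads the volatility recorded in the <code class="language-plaintext highlighter-rouge">pg_proc</code> catalog, which reflects what was declared rather than anything PostgreSQL has verified:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- call the function with an arbitrary birthday
SELECT f_person_age( '1978-07-19'::date );

-- provolatile is 'i' for IMMUTABLE, 's' for STABLE, 'v' for VOLATILE
SELECT proname, provolatile
  FROM pg_proc
 WHERE proname = 'f_person_age';
</code></pre></div></div>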
<p>Then it is possible to create the table using the function as <em>generation method</em>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">people_gc_stored</span> <span class="p">(</span>
<span class="n">name</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">birthday</span> <span class="nb">date</span><span class="p">,</span>
<span class="n">age</span> <span class="nb">int</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="p">(</span> <span class="n">f_person_age</span><span class="p">(</span> <span class="n">birthday</span> <span class="p">)</span> <span class="p">)</span> <span class="n">STORED</span>
<span class="p">);</span>
</code></pre></div></div>
<p>If the table is filled in a similar way, the space occupied is the same:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">people_gc_stored</span><span class="p">(</span> <span class="n">name</span><span class="p">,</span> <span class="n">birthday</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="s1">'Person '</span> <span class="o">||</span> <span class="n">v</span><span class="p">,</span> <span class="k">current_date</span> <span class="o">-</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1000000</span> <span class="p">)</span> <span class="n">v</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'people_gc_stored'</span> <span class="p">);</span>
<span class="n">pg_relation_size</span>
<span class="c1">------------------</span>
<span class="mi">52183040</span>
</code></pre></div></div>
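<p>To compare the two tables side by side, a single query along these lines could be used; it is only a convenience sketch, and <code class="language-plaintext highlighter-rouge">pg_size_pretty()</code> is there just to make the numbers easier to read:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- size of the main data fork of each table, in a human readable form
SELECT pg_size_pretty( pg_relation_size( 'people' ) )           AS people,
       pg_size_pretty( pg_relation_size( 'people_gc_stored' ) ) AS people_gc_stored;
</code></pre></div></div>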
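<p>Why use a function in the generated column? Because <code class="language-plaintext highlighter-rouge">CURRENT_DATE</code> changes over time, so placing the expression directly in the column definition raises an error at creation time. The function sidesteps the check only because it is <em>declared</em> <code class="language-plaintext highlighter-rouge">IMMUTABLE</code>: the stored value is computed when the row is written and does not change as time passes. This is the error:</p>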
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">people_gc_stored</span> <span class="p">(</span>
<span class="n">name</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">birthday</span> <span class="nb">date</span><span class="p">,</span>
<span class="n">age</span> <span class="nb">int</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="p">(</span>
<span class="k">extract</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">CURRENT_DATE</span> <span class="p">)</span>
<span class="o">-</span> <span class="k">extract</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="n">birthday</span> <span class="p">)</span>
<span class="o">+</span> <span class="mi">1</span> <span class="p">)</span> <span class="n">STORED</span>
<span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">generation</span> <span class="n">expression</span> <span class="k">is</span> <span class="k">not</span> <span class="k">immutable</span>
</code></pre></div></div>
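<p>One possible alternative, sketched here under the assumption that storing the birth year is good enough, is to keep only the truly immutable part in the generated column and apply <code class="language-plaintext highlighter-rouge">CURRENT_DATE</code> at query time. The table name <code class="language-plaintext highlighter-rouge">people_gc_birth_year</code> and the column <code class="language-plaintext highlighter-rouge">birth_year</code> are hypothetical and not part of the original example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical table: the generation expression depends only on birthday
CREATE TABLE people_gc_birth_year (
    name       text,
    birthday   date,
    birth_year int GENERATED ALWAYS AS ( extract( year FROM birthday )::int ) STORED
);

INSERT INTO people_gc_birth_year( name, birthday )
VALUES ( 'Person 1', '1978-07-19' );

-- the age is computed at query time with the same formula as f_person_age
SELECT name,
       extract( year FROM CURRENT_DATE )::int - birth_year + 1 AS age
  FROM people_gc_birth_year;
</code></pre></div></div>
<p>With this layout the age is always current, at the price of computing it in every query that needs it.</p>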
<h2 id="writing-the-generated-column">Writing the generated column</h2>
<p>As already noted, a generated column cannot be written to directly:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">UPDATE</span> <span class="n">people_gc_stored</span> <span class="k">SET</span> <span class="n">age</span> <span class="o">=</span> <span class="mi">40</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="k">column</span> <span class="nv">"age"</span> <span class="n">can</span> <span class="k">only</span> <span class="n">be</span> <span class="n">updated</span> <span class="k">to</span> <span class="k">DEFAULT</span>
<span class="n">DETAIL</span><span class="p">:</span> <span class="k">Column</span> <span class="nv">"age"</span> <span class="k">is</span> <span class="n">a</span> <span class="k">generated</span> <span class="k">column</span><span class="p">.</span>
</code></pre></div></div>
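<p>The error message hints at what is still allowed. As a small sketch on the same table (the <code class="language-plaintext highlighter-rouge">WHERE</code> condition and the date are arbitrary), assigning <code class="language-plaintext highlighter-rouge">DEFAULT</code> is accepted, and changing <code class="language-plaintext highlighter-rouge">birthday</code> makes PostgreSQL recompute <code class="language-plaintext highlighter-rouge">age</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- accepted: DEFAULT simply asks for the generated value
UPDATE people_gc_stored SET age = DEFAULT WHERE name = 'Person 1';

-- changing the base column recomputes the generated one
UPDATE people_gc_stored SET birthday = '1978-07-19' WHERE name = 'Person 1';

SELECT name, birthday, age FROM people_gc_stored WHERE name = 'Person 1';
</code></pre></div></div>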
<h2 id="querying-the-generated-column">Querying the generated column</h2>
<p>A generated column behaves like a normal column, which means that access to it can be granted or revoked like any other column:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">REVOKE</span> <span class="k">ALL</span> <span class="k">ON</span> <span class="n">people_gc_stored</span> <span class="k">FROM</span> <span class="k">public</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">GRANT</span> <span class="k">SELECT</span><span class="p">(</span> <span class="n">name</span><span class="p">,</span> <span class="n">age</span> <span class="p">)</span> <span class="k">ON</span> <span class="n">people_gc_stored</span> <span class="k">TO</span> <span class="n">harry</span><span class="p">;</span>
</code></pre></div></div>
<p>Since user <code class="language-plaintext highlighter-rouge">harry</code> has access only to the columns <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">age</code>, the user cannot see the column that <code class="language-plaintext highlighter-rouge">age</code> depends on:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">luca</span><span class="p">.</span><span class="n">people_gc_stored</span> <span class="k">LIMIT</span> <span class="mi">5</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">people_gc_stored</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">min</span><span class="p">(</span> <span class="n">age</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">age</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">luca</span><span class="p">.</span><span class="n">people_gc_stored</span><span class="p">;</span>
<span class="k">min</span> <span class="o">|</span> <span class="k">max</span>
<span class="c1">-----|------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">2740</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">min</span><span class="p">(</span> <span class="n">birthday</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">birthday</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">luca</span><span class="p">.</span><span class="n">people_gc_stored</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">people_gc_stored</span>
</code></pre></div></div>
<p>On the other hand, granting access only to the <code class="language-plaintext highlighter-rouge">birthday</code> column does not automatically grant access to <code class="language-plaintext highlighter-rouge">age</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">REVOKE</span> <span class="k">SELECT</span> <span class="k">ON</span> <span class="n">people_gc_stored</span> <span class="k">FROM</span> <span class="n">harry</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">GRANT</span> <span class="k">SELECT</span><span class="p">(</span> <span class="n">name</span><span class="p">,</span> <span class="n">birthday</span> <span class="p">)</span> <span class="k">ON</span> <span class="n">people_gc_stored</span> <span class="k">TO</span> <span class="n">harry</span><span class="p">;</span>
</code></pre></div></div>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">min</span><span class="p">(</span> <span class="n">birthday</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">birthday</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">luca</span><span class="p">.</span><span class="n">people_gc_stored</span><span class="p">;</span>
<span class="k">min</span> <span class="o">|</span> <span class="k">max</span>
<span class="c1">---------------|------------</span>
<span class="mi">0720</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">07</span> <span class="n">BC</span> <span class="o">|</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">03</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">min</span><span class="p">(</span> <span class="n">age</span> <span class="p">),</span> <span class="k">max</span><span class="p">(</span> <span class="n">age</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">luca</span><span class="p">.</span><span class="n">people_gc_stored</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">people_gc_stored</span>
</code></pre></div></div>
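<p>Column privileges can also be inspected without logging in as the restricted user. The following sketch uses <code class="language-plaintext highlighter-rouge">has_column_privilege()</code>; with the last set of grants shown above it is expected to report <code class="language-plaintext highlighter-rouge">true</code> for <code class="language-plaintext highlighter-rouge">birthday</code> and <code class="language-plaintext highlighter-rouge">false</code> for <code class="language-plaintext highlighter-rouge">age</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- arguments are: role, table, column, privilege
SELECT has_column_privilege( 'harry', 'luca.people_gc_stored', 'birthday', 'SELECT' ) AS can_read_birthday,
       has_column_privilege( 'harry', 'luca.people_gc_stored', 'age', 'SELECT' )      AS can_read_age;
</code></pre></div></div>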
PostgreSQL 12 package on FreeBSD2019-11-04T00:00:00+00:00https://fluca1978.github.io/2019/11/04/PostgreSQL12FreeBSD<p>PostgreSQL 12 is available as binary package on FreeBSD, but not in the quarterly update.</p>
<h1 id="postgresql-12-package-on-freebsd">PostgreSQL 12 package on FreeBSD</h1>
<p>If you need to install PostgreSQL 12 on FreeBSD, be aware that it has not yet reached the <code class="language-plaintext highlighter-rouge">quarterly</code> <code class="language-plaintext highlighter-rouge">pkg(1)</code> update, so installing it via <code class="language-plaintext highlighter-rouge">pkg(1)</code> gives you <em>PostgreSQL 12 rc1</em>. In the ports tree, however, PostgreSQL is already at version 12 (release).
<br />
This happens because, since FreeBSD 12, the default package repository is <code class="language-plaintext highlighter-rouge">quarterly</code>, which in short means that packages lag behind the ports tree.
<br />
<br />
In order to install the official release, a new URL for the FreeBSD repository must be set up. The repository URL is placed into the file <code class="language-plaintext highlighter-rouge">/etc/pkg/FreeBSD.conf</code>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FreeBSD: <span class="o">{</span>
url: <span class="s2">"pkg+http://pkg.FreeBSD.org/</span><span class="k">${</span><span class="nv">ABI</span><span class="k">}</span><span class="s2">/quarterly"</span>,
mirror_type: <span class="s2">"srv"</span>,
signature_type: <span class="s2">"fingerprints"</span>,
fingerprints: <span class="s2">"/usr/share/keys/pkg"</span>,
enabled: <span class="nb">yes</span>
<span class="o">}</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">pkg(1)</code> configuration allows you to override the default URL by creating the file <code class="language-plaintext highlighter-rouge">/usr/local/etc/pkg/repos/FreeBSD.conf</code>, whose properties take precedence over those shown above. Its content is:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">FreeBSD</span><span class="p">:</span> <span class="p">{</span>
<span class="n">url</span><span class="p">:</span> <span class="nv">"pkg+http://pkg.FreeBSD.org/${ABI}/latest"</span>
<span class="p">}</span>
</code></pre></div></div>
<p>After that, update the repository catalogue so that the newer packages become available, then install them:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>pkg update
% <span class="nb">sudo </span>pkg <span class="nb">install </span>postgresql12-client-12 <span class="se">\</span>
postgresql12-contrib-12 <span class="se">\</span>
postgresql12-docs-12 <span class="se">\</span>
postgresql12-plperl-12 <span class="se">\</span>
postgresql12-server-12
</code></pre></div></div>
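<p>Once the cluster has been initialized and the server started (steps not covered here), a quick check from <code class="language-plaintext highlighter-rouge">psql</code> can confirm that the release, and not the release candidate, is running:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- full version string of the running server
SELECT version();

-- just the version number
SHOW server_version;
</code></pre></div></div>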
Installing PostgreSQL on FreeBSD via Ansible2019-10-30T00:00:00+00:00https://fluca1978.github.io/2019/10/30/PostgreSQL_FreeBSD_Ansible<p>My very simple attempt at keeping PostgreSQL up-to-date on FreeBSD machines.</p>
<h1 id="installing-postgresql-on-freebsd-via-ansible">Installing PostgreSQL on FreeBSD via Ansible</h1>
<p>I’m slowly moving to <a href="https://www.ansible.com/">Ansible</a> to manage my machines, and one problem I’m trying to solve as well as I can is how to keep PostgreSQL up-to-date.
<br />
In the case of FreeBSD machines, <code class="language-plaintext highlighter-rouge">pkgng</code> is the module to use, and in the past I relied on this very simple playbook snippet:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">PostgreSQL </span><span class="m">11</span>
<span class="na">become</span><span class="pi">:</span> <span class="s">yes</span>
<span class="na">with_items</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">server</span>
<span class="pi">-</span> <span class="s">contrib</span>
<span class="pi">-</span> <span class="s">client</span>
<span class="pi">-</span> <span class="s">plperl</span>
<span class="na">pkgng</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">postgresql11-{{ item }}</span>
<span class="na">state</span><span class="pi">:</span> <span class="s">latest</span>
</code></pre></div></div>
<p>However, there is a very scary warning message when running the above:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TASK <span class="o">[</span>PostgreSQL 11]
<span class="o">[</span>DEPRECATION WARNING]: Invoking <span class="s2">"pkgng"</span> only once <span class="k">while </span>using a loop via squash_actions is deprecated. Instead of using a loop to supply multiple
items and specifying <span class="sb">`</span>name: <span class="s2">"postgresql11-"</span><span class="sb">`</span>, please use <span class="sb">`</span>name: <span class="o">[</span><span class="s1">'server'</span>, <span class="s1">'contrib'</span>, <span class="s1">'client'</span>, <span class="s1">'plperl'</span><span class="o">]</span><span class="sb">`</span> and remove the loop. This
feature will be removed <span class="k">in </span>version 2.11. Deprecation warnings can be disabled by setting <span class="nv">deprecation_warnings</span><span class="o">=</span>False <span class="k">in </span>ansible.cfg.
</code></pre></div></div>
<p>That’s easy to fix, but also annoying (at least to me), because I have to change the above snippet to the following one:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">PostgreSQL </span><span class="m">11</span>
<span class="na">become</span><span class="pi">:</span> <span class="s">yes</span>
<span class="na">pkgng</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">postgresql11-server</span>
<span class="pi">-</span> <span class="s">postgresql11-contrib</span>
<span class="pi">-</span> <span class="s">postgresql11-client</span>
<span class="pi">-</span> <span class="s">postgresql11-plperl</span>
<span class="na">state</span><span class="pi">:</span> <span class="s">latest</span>
</code></pre></div></div>
<p>So far, the best solution I’ve found that preserves readability is to use one variable for the PostgreSQL version I want and another for the list of packages I need:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="na">vars</span><span class="pi">:</span>
<span class="na">pg_version</span><span class="pi">:</span> <span class="m">11</span>
<span class="na">pg_components</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">postgresql{{ pg_version }}-server</span>
<span class="pi">-</span> <span class="s">postgresql{{ pg_version }}-contrib</span>
<span class="pi">-</span> <span class="s">postgresql{{ pg_version }}-client</span>
<span class="pi">-</span> <span class="s">postgresql{{ pg_version }}-plperl</span>
<span class="na">tasks</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">PostgreSQL {{ pg_version }}</span>
<span class="na">become</span><span class="pi">:</span> <span class="s">yes</span>
<span class="na">pkgng</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">pg_components</span><span class="nv"> </span><span class="s">}}"</span>
<span class="na">state</span><span class="pi">:</span> <span class="s">latest</span>
</code></pre></div></div>
pgenv: adjust your PATH!2019-10-25T00:00:00+00:00https://fluca1978.github.io/2019/10/25/PostgreSQL12_pgenv<p>A few days ago we added an option that suggests changes to your <code class="language-plaintext highlighter-rouge">PATH</code> to prevent version clashes.</p>
<h1 id="pgenv-adjust-your-path">pgenv: adjust your PATH!</h1>
<p>Below you can find another quick video that demonstrates how easy it is to get, almost <em>automatically</em>, a PostgreSQL 12 instance up and running on your local machine using <a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a>.</p>
<p><a href="https://asciinema.org/a/276923"><img src="https://asciinema.org/a/276923.svg" alt="asciicast" /></a></p>
<p>Please also note that, at time <code class="language-plaintext highlighter-rouge">5:35</code>, you will see how <code class="language-plaintext highlighter-rouge">pgenv</code> suggests that you adjust your <code class="language-plaintext highlighter-rouge">PATH</code> environment variable in order to use the <em>just installed</em> binaries for the cluster. The idea behind this suggestion is to prevent you from using a system-wide binary, e.g., <code class="language-plaintext highlighter-rouge">psql</code>, that may be incompatible with the <em>in-use</em> cluster.</p>
PostgreSQL 12 beta 4 up and running in less than six minutes2019-09-19T00:00:00+00:00https://fluca1978.github.io/2019/09/19/PG12Beta4<p>How hard can it be to grab a copy of PostgreSQL 12 (still in beta) and install it on your computer for testing, without having to touch your existing database?</p>
<h1 id="postgresql-12-beta-4-up-and-running-in-less-than-six-minutes">PostgreSQL 12 beta 4 up and running in less than six minutes</h1>
<p>I have made a very short and, to some extent, boring video to demonstrate how <a href="https://github.com/theory/pgenv/"><code class="language-plaintext highlighter-rouge">pgenv</code></a> can simplify the installation of <strong>PostgreSQL 12 beta 4</strong> (as well as other versions, of course).
<br />
<br />
The video shows how automated installing the beta version on a FreeBSD machine can be.
For the very impatient, the commands are essentially:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv build 12beta4
% pgenv use 12beta4
% psql -h localhost -U postgres template1
</code></pre></div></div>
<p>where the last command is, of course, just the proof that everything is up and running.
<br />
<br />
As you will see, most of the time is spent on the actual compilation of the software. The value added by <code class="language-plaintext highlighter-rouge">pgenv</code> is that you don’t have to deal with download links and commands to initialize your database. And once you are done, you can simply nuke the <code class="language-plaintext highlighter-rouge">pgsql-12beta4</code> directory, which removes both binaries and data.</p>
<p><br /></p>
<center>
<iframe width="560" height="315" src="https://www.youtube.com/embed/8-7b4nbDhns" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
</center>
<p><br />
<br />
Of course, <a href="https://github.com/theory/pgenv/"><code class="language-plaintext highlighter-rouge">pgenv</code></a> can do a lot more than just download and compile PostgreSQL, but the above demonstrates how it simplifies even the <em>boring</em> setup tasks.</p>
New Release of PL/Proxy2019-09-16T00:00:00+00:00https://fluca1978.github.io/2019/09/16/PLPROXY12<p>There is a new release of PL/Proxy out there!</p>
<h1 id="new-release-of-plproxy">New Release of PL/Proxy</h1>
<p>There is an exciting new release of PL/Proxy: <strong><a href="https://plproxy.github.io/downloads/files/2.9/plproxy-2.9.tar.gz">version 2.9 was released a few hours ago</a></strong>!
<br /> <br />
This is an important release because <a href="https://github.com/plproxy/plproxy/commit/d0a83211e1f71dac0bfd620741fee04eeded3173">it adds support for the upcoming PostgreSQL 12</a>. The main problem with PostgreSQL 12 has been that <code class="language-plaintext highlighter-rouge">Oid</code> is now a regular column, meaning that <code class="language-plaintext highlighter-rouge">HeapTupleGetOid</code> is no longer a valid macro. I first proposed a <a href="https://github.com/plproxy/plproxy/pull/38/commits/4de2dec2a91a44dae6a89d88f5bb0eb12eaaeabc">patch</a> that relied on the C preprocessor to deal with older PostgreSQL versions.
<br /><br /> <br />
The solution implemented by Marko Kreen <a href="https://github.com/plproxy/plproxy/commit/d0a83211e1f71dac0bfd620741fee04eeded3173">is of course much more elegant</a> and is based on defining <em>helper</em> functions that are pre-processed depending on the PostgreSQL version.
<br /> <br />
Enjoy <em>proxying</em>!</p>
Compute day working hours in PL/pgsql2019-08-30T00:00:00+00:00https://fluca1978.github.io/2019/08/30/postgresql_working_hours<p>How many working hours are there in a range of dates?</p>
<h1 id="compute-day-working-hours-in-plpgsql">Compute day working hours in PL/pgsql</h1>
<p>A few days ago there was a very <a href="https://www.postgresql.org/message-id/20190827222741.GB19306%40panix.com">nice thread in the <code class="language-plaintext highlighter-rouge">pgsql-general</code> mailing list</a> asking for ideas about how to compute working hours in a month.
<br />
The idea is quite simple: you must extract the working days (let’s say excluding Sundays), multiply each of them by the number of <em>hours per day</em>, and then sum the results.
<br />
<strong>There are a lot of nice and almost <em>one-liner</em> solutions in the thread, so I strongly encourage you to read it all!</strong>
<br />
<br />
I came up with my own solution, which is based on <em>functions</em>, and I’m going to explain it here in the hope that it can be useful (at least as a starting point).
<br />
<br />
You can find the code, as usual, on my <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/functions/working_hours.sql">GitHub repository related to PostgreSQL</a>.</p>
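<p>Before moving to the function, the basic idea can be sketched as a single query. This is not taken from the thread: it assumes a flat eight hours per day, skips only Sundays, and uses August 2019 as an arbitrary period:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- one row per day in the period, Sundays (dow = 0) excluded
SELECT count(*) * 8 AS working_hours
  FROM generate_series( '2019-08-01'::timestamp,
                        '2019-08-31'::timestamp,
                        '1 day'::interval ) AS d
 WHERE extract( dow FROM d ) != 0;
-- August 2019 has 27 non-Sunday days, hence 216 hours
</code></pre></div></div>
<p>A query like this is compact, but it becomes harder to read as soon as the hours change from day to day or specific dates must be excluded.</p>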
<h2 id="the-workhorse-function">The workhorse function</h2>
<p>One reason I decided to implement the algorithm as a function is that I wanted it to be configurable. There are people, like me, whose working hours differ from day to day. So, assuming the more general problem of computing the working hours between two dates, here is a possible implementation:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">compute_working_hours</span><span class="p">(</span> <span class="n">begin_day</span> <span class="nb">DATE</span><span class="p">,</span>
<span class="n">end_day</span> <span class="nb">DATE</span><span class="p">,</span>
<span class="n">_saturday</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">false</span><span class="p">,</span>
<span class="n">_hour_template</span> <span class="nb">real</span><span class="p">[]</span> <span class="k">DEFAULT</span> <span class="n">ARRAY</span><span class="p">[</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span> <span class="p">]::</span><span class="nb">real</span><span class="p">[],</span>
<span class="n">_exclude_days</span> <span class="nb">date</span><span class="p">[]</span> <span class="k">DEFAULT</span> <span class="k">NULL</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">real</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">working_hours</span> <span class="nb">real</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">working_days</span> <span class="n">daterange</span><span class="p">;</span>
<span class="n">current_day</span> <span class="nb">date</span><span class="p">;</span>
<span class="n">current_day_hours</span> <span class="nb">real</span><span class="p">;</span>
<span class="n">skip</span> <span class="nb">boolean</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="c1">-- check arguments</span>
<span class="n">IF</span> <span class="n">begin_day</span> <span class="k">IS</span> <span class="k">NULL</span>
<span class="k">OR</span> <span class="n">end_day</span> <span class="k">IS</span> <span class="k">NULL</span>
<span class="k">OR</span> <span class="n">begin_day</span> <span class="o">>=</span> <span class="n">end_day</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">EXCEPTION</span> <span class="s1">'Please check dates'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">_hour_template</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="n">_hour_template</span> <span class="p">:</span><span class="o">=</span> <span class="n">ARRAY</span><span class="p">[</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span> <span class="p">]::</span><span class="nb">real</span><span class="p">[];</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">WHILE</span> <span class="n">array_length</span><span class="p">(</span> <span class="n">_hour_template</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="o"><</span> <span class="mi">7</span> <span class="n">LOOP</span>
<span class="n">_hour_template</span> <span class="p">:</span><span class="o">=</span> <span class="n">array_append</span><span class="p">(</span> <span class="n">_hour_template</span><span class="p">,</span> <span class="mi">8</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="c1">-- create the working period date range</span>
<span class="n">working_days</span> <span class="o">=</span> <span class="n">daterange</span><span class="p">(</span> <span class="n">begin_day</span><span class="p">,</span> <span class="n">end_day</span><span class="p">,</span> <span class="s1">'[]'</span><span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Working days in the range %'</span><span class="p">,</span> <span class="n">working_days</span><span class="p">;</span>
<span class="n">current_day</span> <span class="p">:</span><span class="o">=</span> <span class="k">lower</span><span class="p">(</span> <span class="n">working_days</span> <span class="p">);</span>
<span class="n">LOOP</span>
<span class="c1">-- skip sundays</span>
<span class="n">skip</span> <span class="p">:</span><span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="n">dow</span> <span class="k">FROM</span> <span class="n">current_day</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="c1">-- skip saturdays if required</span>
<span class="n">skip</span> <span class="p">:</span><span class="o">=</span> <span class="n">skip</span> <span class="k">OR</span> <span class="p">(</span> <span class="k">NOT</span> <span class="n">_saturday</span> <span class="k">AND</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="n">dow</span> <span class="k">FROM</span> <span class="n">current_day</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">6</span> <span class="p">);</span>
<span class="c1">-- skip this particular day if specified</span>
<span class="n">skip</span> <span class="p">:</span><span class="o">=</span> <span class="n">skip</span> <span class="k">OR</span> <span class="p">(</span> <span class="n">_exclude_days</span> <span class="k">IS</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">AND</span> <span class="n">_exclude_days</span> <span class="o">@></span> <span class="n">ARRAY</span><span class="p">[</span> <span class="n">current_day</span> <span class="p">]</span> <span class="p">);</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="n">skip</span> <span class="k">THEN</span>
<span class="n">current_day_hours</span> <span class="p">:</span><span class="o">=</span> <span class="n">_hour_template</span><span class="p">[</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="n">dow</span> <span class="k">FROM</span> <span class="n">current_day</span> <span class="p">)::</span><span class="nb">int</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">];</span>
<span class="k">ELSE</span>
<span class="n">current_day_hours</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Day % counting % working hours'</span><span class="p">,</span>
<span class="n">current_day</span><span class="p">,</span>
<span class="n">current_day_hours</span><span class="p">;</span>
<span class="n">working_hours</span> <span class="p">:</span><span class="o">=</span> <span class="n">working_hours</span> <span class="o">+</span> <span class="n">current_day_hours</span><span class="p">;</span>
<span class="n">current_day</span> <span class="p">:</span><span class="o">=</span> <span class="n">current_day</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">EXIT</span> <span class="k">WHEN</span> <span class="k">NOT</span> <span class="n">current_day</span> <span class="o"><@</span> <span class="n">working_days</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="c1">-- all done</span>
<span class="k">RETURN</span> <span class="n">working_hours</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>Let’s consider the arguments: the first two are the dates you want to inspect, then there’s a boolean that indicates whether Saturday is a working day or not. The <code class="language-plaintext highlighter-rouge">_hour_template</code> is a template with the number of hours for each day of the week (Sunday first, and its value can be anything since Sundays are never working days, or at least I would like them to be!). Last, there is an array of days to exclude from the computation (holidays, vacation, and so on).
<br />
<br />
The function computes a <code class="language-plaintext highlighter-rouge">working_days</code> date range including the begin and end date, and then uses a <code class="language-plaintext highlighter-rouge">current_day</code> single date to iterate within the date range. In the main loop, there are checks to skip the current day in case it is a Sunday, a Saturday (when Saturdays are not working days), or is included in the array of excluded days.
<br />
Then the tricky part: if the day has to be excluded, the working hours will be <em>zero</em>, otherwise the working hours will be extracted from the hour template. Working hours are then summed together.
<br />
<br />
Let’s see this in action:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">compute_working_hours</span><span class="p">(</span> <span class="k">current_date</span><span class="p">,</span>
<span class="k">current_date</span> <span class="o">+</span> <span class="mi">3</span><span class="p">,</span>
<span class="k">false</span><span class="p">,</span> <span class="k">NULL</span><span class="p">,</span> <span class="n">ARRAY</span><span class="p">[</span> <span class="s1">'2019-08-28'</span> <span class="p">]::</span><span class="nb">date</span><span class="p">[]</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Working</span> <span class="n">days</span> <span class="k">in</span> <span class="n">the</span> <span class="k">range</span> <span class="p">[</span><span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">28</span><span class="p">,</span><span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">01</span><span class="p">)</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">28</span> <span class="n">counting</span> <span class="mi">0</span> <span class="n">working</span> <span class="n">hours</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">29</span> <span class="n">counting</span> <span class="mi">8</span> <span class="n">working</span> <span class="n">hours</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">30</span> <span class="n">counting</span> <span class="mi">8</span> <span class="n">working</span> <span class="n">hours</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="n">counting</span> <span class="mi">0</span> <span class="n">working</span> <span class="n">hours</span>
<span class="n">compute_working_hours</span>
<span class="c1">-----------------------</span>
<span class="mi">16</span>
</code></pre></div></div>
<p><em>I wish not to work on my beautiful wife’s birthday</em>, so within the three days I’m supposed to work only two and get <code class="language-plaintext highlighter-rouge">16</code> hours.
<br />
<br />
As you have probably noticed, the hour template is expressed as <code class="language-plaintext highlighter-rouge">real</code> values, so that it is possible to express fractions of an hour, like 8.5 to indicate 8 hours and a half. Using <code class="language-plaintext highlighter-rouge">time</code> here would probably have been a better choice, at the price of a little complication in the final sum, so I’m not yet convinced about providing such an implementation.</p>
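<p>A minimal sketch of that alternative, under the assumption that the template would be stored as <code class="language-plaintext highlighter-rouge">interval</code> values rather than <code class="language-plaintext highlighter-rouge">time</code> (this is not what the function above does): PostgreSQL can sum intervals directly, while there is no <code class="language-plaintext highlighter-rouge">sum</code> aggregate for <code class="language-plaintext highlighter-rouge">time</code>. The helper name <code class="language-plaintext highlighter-rouge">sum_hours_as_interval</code> is hypothetical, and the two sample values are made up for illustration.</p>

```shell
# Hypothetical helper: sums an hour template expressed as intervals.
# Assumes psql can connect to the testdb database used in this post.
sum_hours_as_interval() {
    psql -X -At -d testdb -c \
      "SELECT sum( h ) FROM unnest( ARRAY[ '8 hours', '8 hours 30 minutes' ]::interval[] ) AS h;"
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">sum_hours_as_interval</code> from the shell should report a total of 16 hours and 30 minutes; the returned value is an <code class="language-plaintext highlighter-rouge">interval</code>, so the function would need a different return type than <code class="language-plaintext highlighter-rouge">real</code>.</p>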
<h2 id="back-to-the-real-problem-computing-within-a-month">Back to the real problem: computing within a month</h2>
<p>Having the above function in place, it is now possible to <em>overload</em> it and provide a function that computes the working hours in a single month of the year:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">compute_working_hours</span><span class="p">(</span> <span class="n">_year</span> <span class="nb">int</span><span class="p">,</span>
<span class="n">_month</span> <span class="nb">int</span><span class="p">,</span>
<span class="n">_saturday</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">false</span><span class="p">,</span>
<span class="n">_hour_template</span> <span class="nb">real</span><span class="p">[]</span> <span class="k">DEFAULT</span> <span class="n">ARRAY</span><span class="p">[</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span> <span class="p">]::</span><span class="nb">real</span><span class="p">[],</span>
<span class="n">_exclude_days</span> <span class="nb">int</span><span class="p">[]</span> <span class="k">DEFAULT</span> <span class="k">null</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">real</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">_exclude_days_as_dates</span> <span class="nb">date</span><span class="p">[];</span>
<span class="n">current_index</span> <span class="nb">int</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="c1">-- check arguments</span>
<span class="n">IF</span> <span class="n">_year</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="n">_year</span> <span class="p">:</span><span class="o">=</span> <span class="k">extract</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">CURRENT_DATE</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">_month</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="n">_month</span> <span class="p">:</span><span class="o">=</span> <span class="k">extract</span><span class="p">(</span> <span class="k">month</span> <span class="k">FROM</span> <span class="k">CURRENT_DATE</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">_exclude_days</span> <span class="k">IS</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="k">FOR</span> <span class="n">current_index</span> <span class="k">IN</span> <span class="mi">1</span> <span class="p">..</span> <span class="n">array_upper</span><span class="p">(</span> <span class="n">_exclude_days</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="n">LOOP</span>
<span class="n">_exclude_days_as_dates</span> <span class="p">:</span><span class="o">=</span> <span class="n">array_append</span><span class="p">(</span> <span class="n">_exclude_days_as_dates</span><span class="p">,</span>
<span class="n">make_date</span><span class="p">(</span> <span class="n">_year</span><span class="p">,</span> <span class="n">_month</span><span class="p">,</span> <span class="n">_exclude_days</span><span class="p">[</span> <span class="n">current_index</span> <span class="p">]</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">compute_working_hours</span><span class="p">(</span> <span class="n">make_date</span><span class="p">(</span> <span class="n">_year</span><span class="p">,</span> <span class="n">_month</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="p">(</span> <span class="n">make_date</span><span class="p">(</span> <span class="n">_year</span><span class="p">,</span> <span class="n">_month</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="s1">'1 month - 1 day'</span><span class="p">::</span><span class="n">interval</span> <span class="p">)::</span><span class="nb">date</span><span class="p">,</span>
<span class="n">_saturday</span><span class="p">,</span>
<span class="n">_hour_template</span><span class="p">,</span>
<span class="n">_exclude_days_as_dates</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>As you can see, the function asks for the year and month (as well as other parameters like the hour template), computes the range of dates for the specified month and delegates the computation to the former implementation.
<br />
<br />
One part I’m not really proud of is the <code class="language-plaintext highlighter-rouge">_exclude_days</code> parameter, which in this version is an array of integers that I then have to convert into an array of <code class="language-plaintext highlighter-rouge">date</code>s. On one hand, I wanted the function to have coherent parameters: if I specify a single month and want to skip day <code class="language-plaintext highlighter-rouge">28</code>, I already know that’s the <code class="language-plaintext highlighter-rouge">28</code>th day of that month, so asking the user to input a full <code class="language-plaintext highlighter-rouge">date</code> would just be noise. On the other hand, the loop that converts <code class="language-plaintext highlighter-rouge">_exclude_days</code> into an array of dates named <code class="language-plaintext highlighter-rouge">_exclude_days_as_dates</code> is less than elegant!
<br />
<br />
By the way, how is this invoked?</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">compute_working_hours</span><span class="p">(</span> <span class="k">NULL</span><span class="p">,</span>
<span class="k">NULL</span><span class="p">,</span>
<span class="k">true</span><span class="p">,</span>
<span class="k">NULL</span><span class="p">,</span>
<span class="n">ARRAY</span><span class="p">[</span><span class="mi">12</span><span class="p">,</span> <span class="mi">15</span> <span class="p">,</span><span class="mi">29</span><span class="p">]</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Working</span> <span class="n">days</span> <span class="k">in</span> <span class="n">the</span> <span class="k">range</span> <span class="p">[</span><span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">01</span><span class="p">,</span><span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">01</span><span class="p">)</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">01</span> <span class="n">counting</span> <span class="mi">8</span> <span class="n">working</span> <span class="n">hours</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">02</span> <span class="n">counting</span> <span class="mi">8</span> <span class="n">working</span> <span class="n">hours</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Day</span> <span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">03</span> <span class="n">counting</span> <span class="mi">8</span> <span class="n">working</span> <span class="n">hours</span>
<span class="p">...</span>
<span class="n">compute_working_hours</span>
<span class="c1">-----------------------</span>
<span class="mi">192</span>
</code></pre></div></div>
<p>And yes, I love <em>defaults</em>, so pretty much every parameter can be omitted altogether and the function still gives a pretty decent result.</p>
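<p>As for the loop that turns <code class="language-plaintext highlighter-rouge">_exclude_days</code> into dates, it could probably be replaced by a single set-based expression. A minimal sketch, run from the shell via <code class="language-plaintext highlighter-rouge">psql</code>, with the year, the month and the days hard-coded for illustration; the helper name <code class="language-plaintext highlighter-rouge">exclude_days_as_dates</code> is hypothetical:</p>

```shell
# Hypothetical helper: builds the array of dates without an explicit loop.
# unnest() expands the integer array into rows, make_date() builds each date,
# and array_agg() packs the dates back into a date[].
# Assumes psql can connect to the testdb database used in this post.
exclude_days_as_dates() {
    psql -X -At -d testdb -c \
      "SELECT array_agg( make_date( 2019, 8, d ) ORDER BY d ) FROM unnest( ARRAY[ 12, 15, 29 ] ) AS d;"
}
```

<p>Inside the PL/pgSQL function, the same query could feed <code class="language-plaintext highlighter-rouge">_exclude_days_as_dates</code> through a <code class="language-plaintext highlighter-rouge">SELECT ... INTO</code>, using <code class="language-plaintext highlighter-rouge">_year</code>, <code class="language-plaintext highlighter-rouge">_month</code> and <code class="language-plaintext highlighter-rouge">_exclude_days</code> in place of the literals.</p>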
PgBouncer gets SCRAM!2019-08-30T00:00:00+00:00https://fluca1978.github.io/2019/08/30/pgbouncerscram<p>A few days ago a new version of PgBouncer was released, with the addition of SCRAM support!</p>
<h1 id="pgbouncer-gets-scram">PgBouncer gets SCRAM!</h1>
<p>Three days ago <a href="https://pgbouncer.github.io/changelog.html#pgbouncer-111x">PgBouncer 1.11</a> was released, and one feature that immediately caught my attention was the addition of <em>SCRAM support for passwords</em>.
<br />
<br />
<a href="https://www.postgresql.org/docs/11/auth-password.html"><code class="language-plaintext highlighter-rouge">SCRAM</code></a> is currently the most secure way to use passwords for PostgreSQL authentication and has been around since version <code class="language-plaintext highlighter-rouge">10</code> (so nearly two years). <code class="language-plaintext highlighter-rouge">SCRAM</code> support for PgBouncer has been a <em>wanted feature</em> for a while, since the lack of it prevented users of this great tool from using <code class="language-plaintext highlighter-rouge">SCRAM</code> on their clusters.
<br />
<br />
Luckily, this has now been implemented and <a href="https://pgbouncer.github.io/config.html#authentication-file-format">the configuration of the PgBouncer account</a> is similar to the one for plain and <code class="language-plaintext highlighter-rouge">md5</code> passwords, so it is very simple.
<br />
<br />
I really love PgBouncer and, with this addition, I can now upgrade my servers to <em>SCRAM</em>!
<strong>Thank you PgBouncer developers!</strong></p>
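<p>As a rough sketch of what that configuration involves: according to the PgBouncer documentation, <code class="language-plaintext highlighter-rouge">auth_type</code> can be set to <code class="language-plaintext highlighter-rouge">scram-sha-256</code>, and the authentication file can hold the SCRAM secret exactly as PostgreSQL stores it. The hypothetical helper below prints such a line for a given role by reading <code class="language-plaintext highlighter-rouge">pg_authid</code>, which requires superuser privileges; the helper name and the default connection settings are assumptions.</p>

```shell
# Hypothetical helper: prints a line for the PgBouncer authentication file,
# in the form "username" "SCRAM-SHA-256$...", for the role given as $1.
# It reads pg_authid, so it must connect as a PostgreSQL superuser.
# The query is sent on stdin so that psql expands the :'role' variable.
pgbouncer_userlist_entry() {
    echo "SELECT '\"' || rolname || '\" \"' || rolpassword || '\"' FROM pg_authid WHERE rolname = :'role';" |
        psql -X -At -v role="$1"
}
```

<p>The output of something like <code class="language-plaintext highlighter-rouge">pgbouncer_userlist_entry some_role</code> could then be appended to the file referenced by <code class="language-plaintext highlighter-rouge">auth_file</code>, assuming the role password was stored with SCRAM in the first place.</p>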
PL/Proxy on PostgreSQL 12 ?2019-08-27T00:00:00+00:00https://fluca1978.github.io/2019/08/27/PLProxyPostgreSQL12<p>I spent some more time on the PL/Proxy code base in order to make it compile against the upcoming PostgreSQL 12.</p>
<h1 id="plproxy-on-postgresql-12-">PL/Proxy on PostgreSQL 12 ?</h1>
<p>In <a href="https://fluca1978.github.io/2019/08/26/PLProxy_FreeBSD.html">yesterday’s blog post</a> I reported some stupid thoughts about compiling PL/Proxy against PostgreSQL 12.
<br />
I was too stupid when I hit the removal of <code class="language-plaintext highlighter-rouge">HeapTupleGetOid</code> (as of <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=578b229718e8f15fa779e20f086c4b6bb3776106">commit 578b229718e8f15fa779e20f086c4b6bb3776106</a>), and after having read the commit message more carefully, I found how to fix the code (<em>at least I hope so!</em>).
<br />
<br />
Essentially, wherever I found usage of <code class="language-plaintext highlighter-rouge">HeapTupleGetOid</code> I placed a preprocessor macro to extract the <code class="language-plaintext highlighter-rouge">Form_pg_</code> structure and use the normal column <code class="language-plaintext highlighter-rouge">oid</code> instead, something like:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#if PG_VERSION_NUM < 120000
</span> <span class="n">Oid</span> <span class="n">namespaceId</span> <span class="o">=</span> <span class="n">HeapTupleGetOid</span><span class="p">(</span><span class="n">tup</span><span class="p">);</span>
<span class="cp">#else
</span> <span class="n">Form_pg_namespace</span> <span class="n">form</span> <span class="o">=</span> <span class="p">(</span><span class="n">Form_pg_namespace</span><span class="p">)</span> <span class="n">GETSTRUCT</span><span class="p">(</span><span class="n">tup</span><span class="p">);</span>
<span class="n">Oid</span> <span class="n">namespaceId</span> <span class="o">=</span> <span class="n">form</span><span class="o">-></span><span class="n">oid</span><span class="p">;</span>
<span class="cp">#endif
</span></code></pre></div></div>
<p><br />
<strong>I strongly advise against using this in production, at least until one of the PL/Proxy authors has had a look at the code</strong>! However, the tests pass on PostgreSQL 12beta2 on Linux.
<br />
<br />
You can find the <a href="https://github.com/plproxy/plproxy/pull/38">pull request</a>, which also includes my previous pull request to make PL/Proxy work against PostgreSQL 11 and FreeBSD.
<br />
I hope it can help push out a new release of this tool.</p>
PL/Proxy on PostgreSQL 11 and FreeBSD 122019-08-26T00:00:00+00:00https://fluca1978.github.io/2019/08/26/PLProxy_FreeBSD<p>PL/Proxy is a procedural language implementation that makes it really easy to do database proxying, and sharding as a consequence. Unluckily, getting it to run on PostgreSQL 11 and FreeBSD 12 does not come for free.</p>
<h1 id="plproxy-on-postgresql-11-and-freebsd-12">PL/Proxy on PostgreSQL 11 and FreeBSD 12</h1>
<p><a href="https://plproxy.github.io/">PL/Proxy</a> is a project that allows database proxying, that is a way to connect to remote databases, and as a consequence allows for <em>sharding</em> implementations.
<br />
The idea behind PL/Proxy is as simple as it is elegant: define a minimalistic language to access remote (database) objects and, more specifically, execute queries.
<br />
<br />
Unluckily, the latest stable release of PL/Proxy is <code class="language-plaintext highlighter-rouge">2.8</code> and is dated <em>October 2017</em>, which means <em>PostgreSQL 10</em>! There are a couple of pull requests to make it work against PostgreSQL 11, but they have not been merged and the project seems to be on hold.
<br />
<br />
Today I created a <a href="https://github.com/plproxy/plproxy/pull/37">cumulative pull request</a> that makes a few small adjustments to allow the compilation on FreeBSD 12 against PostgreSQL 11.
<br />
<br />
My pull request is inspired by, and borrows changes from, two other pull requests:</p>
<ul>
<li><a href="https://github.com/plproxy/plproxy/pull/31">pr-31</a> and credits to <a href="https://github.com/laurenz">Laurenz Albe</a>;</li>
<li><a href="https://github.com/plproxy/plproxy/pull/33">pr-33</a> that has been merged into mine, and credits to <a href="https://github.com/df7cb">Christoph Berg</a>.</li>
</ul>
<p>Then I added a compiler flag to adjust headers on FreeBSD 12, and dropped an old Bison syntax, since this should be safe enough on modern PostgreSQL (at least 9.6 and higher).
A few more bits here and there made all the tests pass against PostgreSQL 11, and everything seems right now.
<br />
<strong>It is important to warn that <a href="https://github.com/plproxy/plproxy/pull/37">my version</a> is not <em>production ready</em> because it should be reviewed by at least one PL/Proxy developer</strong>.</p>
<h2 id="and-what-about-postgresql-12">And what about PostgreSQL 12?</h2>
<p>Well, PostgreSQL 12 drops the special <code class="language-plaintext highlighter-rouge">Oid</code> column in catalogs, with commit 578b229718e8f15fa779e20f086c4b6bb3776106. This means that the macro <code class="language-plaintext highlighter-rouge">HeapTupleGetOid</code> is no longer there, and PL/Proxy makes heavy use of it. <a href="https://github.com/fluca1978/plproxy/tree/pg12">I’ve tried to blindly substitute it with <code class="language-plaintext highlighter-rouge">->t_tableOid</code></a>, but this does not seem to work since the tests fail to look up objects. So any suggestion here is welcome!</p>
yum upgrade postgresql11 panic!2019-07-22T00:00:00+00:00https://fluca1978.github.io/2019/07/22/PostgreSQLCentosUpgrade<p>I have to say, I don’t use CentOS very much and I’m not a good user of <code class="language-plaintext highlighter-rouge">systemd</code>, which is why I went through five minutes of pure fear!</p>
<h1 id="yum-upgrade-postgresql11-panic">yum upgrade postgresql11 panic!</h1>
<p><em>How hard could it be to upgrade PostgreSQL within minor versions?</em>
<br />
Usually it is very simple, and <em>it really is</em>, but not when you don’t know your tools!
<br />
And in this case that’s my fault.
<br />
However, I’m writing this short note to help other people avoid the same problem I had.</p>
<h2 id="the-current-setup">The current setup</h2>
<p>The machine is a CentOS 7 running PostgreSQL 11.1, installed from the <a href="https://yum.postgresql.org/">packages provided by the PostgreSQL Global Development Group</a>.</p>
<h2 id="preparing-to-upgrade">Preparing to upgrade</h2>
<p>Of course, I took a full backup before proceeding, just in case. The cluster I’m talking about is a low traffic cluster with roughly 12 GB of data, which means that backup and restore are not <em>zero downtime</em> operations (and no, I’m not in the position of having a WAL based backup, but that’s another story).
<br />
Having a backup helps keep the amount of panic at a fair level.</p>
<h2 id="performing-the-upgrade">Performing the upgrade</h2>
<p>I do like <code class="language-plaintext highlighter-rouge">yum(8)</code> and its transactional approach.
Doing the upgrade was a matter of:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>yum upgrade postgresql11
</code></pre></div></div>
<p>and all dependencies are, of course, calculated and applied. Then I confirmed, waited a couple of minutes for the upgrade to apply, and <strong>I started holding my breath</strong>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>psql: could not connect to server: Connection refused
Is the server running on host <span class="s2">"xxx"</span> <span class="o">(</span>192.168.222.123<span class="o">)</span> and accepting
TCP/IP connections on port 5432?
</code></pre></div></div>
<h2 id="inspecting-and-solving-the-problem">Inspecting and solving the problem</h2>
<p>Apparently PostgreSQL was not restarted after the upgrade, but what is worse <strong>is that it is not going to start again</strong>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>10:33:25 lnx168 systemd[1]: Starting PostgreSQL 11 database server...
10:33:25 lnx168 postgresql-11-check-db-dir[10214]: <span class="s2">"/var/lib/pgsql/11/data/"</span> is missing or empty.
10:33:25 lnx168 postgresql-11-check-db-dir[10214]: Use <span class="s2">"/usr/pgsql-11/bin/postgresql-11-setup initdb"</span> to initialize the database cluster.
10:33:25 lnx168 postgresql-11-check-db-dir[10214]: See /usr/share/doc/postgresql11-11.4/README.rpm-dist <span class="k">for </span>more information.
10:33:25 lnx168 systemd[1]: postgresql-11.service: control process exited, <span class="nv">code</span><span class="o">=</span>exited <span class="nv">status</span><span class="o">=</span>1
10:33:25 lnx168 systemd[1]: Failed to start PostgreSQL 11 database server.
</code></pre></div></div>
<p><strong>What the hell!</strong> (I’m allowed to say it out loud because my colleague was on vacation and I was alone in my office).
<br />
First of all, <strong>do not run <code class="language-plaintext highlighter-rouge">initdb</code> as suggested</strong> because chances are you will destroy all your data. But that’s a good hint about the problem: <strong>systemd was trying to launch PostgreSQL with an empty PGDATA</strong>.
<br />
<br />
Of course, the <code class="language-plaintext highlighter-rouge">PGDATA</code> was not empty and was still in place, but <strong><code class="language-plaintext highlighter-rouge">yum</code> upgraded my <code class="language-plaintext highlighter-rouge">systemd</code> configuration for PostgreSQL to the CentOS default</strong>, therefore my file <code class="language-plaintext highlighter-rouge">/usr/lib/systemd/system/postgresql-11.service</code> was overwritten without any warning!
<br />
<br />
And in fact, to confirm the above, I was able to start the server manually using <code class="language-plaintext highlighter-rouge">pg_ctl</code>, and at least I had the server running.
<br />
<br />
Now that the server is running, I have more time to inspect <code class="language-plaintext highlighter-rouge">/usr/lib/systemd/system/postgresql-11.service</code> and adjust the <code class="language-plaintext highlighter-rouge">PGDATA</code> parameter to the right value:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo grep </span>PGDATA /usr/lib/systemd/system/postgresql-11.service
<span class="nv">Environment</span><span class="o">=</span><span class="nv">PGDATA</span><span class="o">=</span>/data/pgdata
</code></pre></div></div>
<p>I also double checked that the <code class="language-plaintext highlighter-rouge">systemd</code> startup script correctly links to the edited file:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">ls</span> <span class="nt">-l</span> /etc/systemd/system/multi-user.target.wants/postgresql-11.service
lrwxrwxrwx 1 root root 45 20 dic 2018 /etc/systemd/system/multi-user.target.wants/postgresql-11.service
-> /usr/lib/systemd/system/postgresql-11.service
</code></pre></div></div>
<p>Seems fine, right?</p>
<h2 id="nested-problems">Nested problems</h2>
<p>No matter how fine the setup was, <code class="language-plaintext highlighter-rouge">systemd</code> still refused to restart the cluster:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">sudo </span>service postgresql-11 restart
Redirecting to /bin/systemctl restart postgresql-11.service
Job <span class="k">for </span>postgresql-11.service failed because the control process exited with error code. See <span class="s2">"systemctl status postgresql-11.service"</span> and <span class="s2">"journalctl -xe"</span> <span class="k">for </span>details.
</code></pre></div></div>
<p>For a reason I don’t fully understand, <code class="language-plaintext highlighter-rouge">systemd</code> seems to remember that it did not start the service itself, and that the unit is in a failed state. The solution was to stop the cluster manually via <code class="language-plaintext highlighter-rouge">pg_ctl</code> and then ask <code class="language-plaintext highlighter-rouge">systemd</code> to start it again; this time it came up.</p>
<h1 id="fixing-the-problem-with-systemd-the-right-approach">Fixing the problem with <code class="language-plaintext highlighter-rouge">systemd</code>: the right approach</h1>
<p><br />
<strong>updated on 2019-07-22</strong>
<br /></p>
<p>As pointed out by <em>Andrew Gierth</em> in a comment, editing the <code class="language-plaintext highlighter-rouge">systemd</code> unit file directly is not the right way to configure a service. Here is the right approach, which keeps my changes from being overwritten by a package upgrade:
1) run <code class="language-plaintext highlighter-rouge">systemctl edit postgresql-11</code>;
2) add a line with <code class="language-plaintext highlighter-rouge">Environment=PGDATA=/data/pgdata</code> within the <code class="language-plaintext highlighter-rouge">Service</code> section:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span>Service]
<span class="nv">Environment</span><span class="o">=</span><span class="nv">PGDATA</span><span class="o">=</span>/data/pgdata
</code></pre></div></div>
<p>3) inspect the service with <code class="language-plaintext highlighter-rouge">systemctl status postgresql-11</code>, which shows the following:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>systemctl status postgresql-11
● postgresql-11.service - PostgreSQL 11 database server
Loaded: loaded <span class="o">(</span>/usr/lib/systemd/system/postgresql-11.service<span class="p">;</span> enabled<span class="p">;</span> vendor preset: disabled<span class="o">)</span>
Drop-In: /etc/systemd/system/postgresql-11.service.d
└─override.conf
Active: active <span class="o">(</span>running<span class="o">)</span> since lun 2019-07-22 15:43:50 CEST<span class="p">;</span> 31s ago
Docs: https://www.postgresql.org/docs/11/static/
Main PID: 16114 <span class="o">(</span>postmaster<span class="o">)</span>
CGroup: /system.slice/postgresql-11.service
├─16114 /usr/pgsql-11/bin/postmaster <span class="nt">-D</span> /postgres/data
├─16116 postgres: logger
├─16118 postgres: checkpointer
├─16119 postgres: background writer
├─16120 postgres: walwriter
├─16121 postgres: autovacuum launcher
├─16122 postgres: stats collector
├─16123 postgres: pg_cron scheduler
└─16124 postgres: logical replication launcher
</code></pre></div></div>
<p>The important part in the above is the <strong>Drop-In</strong> line, which points to a freshly created directory <code class="language-plaintext highlighter-rouge">/etc/systemd/system/postgresql-11.service.d</code> with a single file, <code class="language-plaintext highlighter-rouge">override.conf</code>, that contains the new <code class="language-plaintext highlighter-rouge">PGDATA</code> definition. In other words, <code class="language-plaintext highlighter-rouge">systemd</code> keeps the service units under its own control, and you have to create an <code class="language-plaintext highlighter-rouge">override.conf</code> file to set your own variable values.</p>
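<p>For scripted setups, the same drop-in can also be written without the interactive editor. The following is only a minimal sketch, assuming a Bash shell: <code class="language-plaintext highlighter-rouge">write_pgdata_dropin</code> is a hypothetical helper name of mine, not a <code class="language-plaintext highlighter-rouge">systemd</code> command, and it simply creates the directory and the <code class="language-plaintext highlighter-rouge">override.conf</code> file described above.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of systemd): write a drop-in file
# that overrides PGDATA for a unit.
#   $1 = drop-in directory (e.g. <unit name>.service.d)
#   $2 = data directory to use as PGDATA
write_pgdata_dropin() {
    local dropin_dir="$1" pgdata="$2"
    # Create the drop-in directory if it does not exist yet.
    mkdir -p "$dropin_dir" || return 1
    # The file holds only the [Service] section with the new variable.
    printf '[Service]\nEnvironment=PGDATA=%s\n' "$pgdata" > "$dropin_dir/override.conf"
}
```

<p>On the real system it would be run as root with <code class="language-plaintext highlighter-rouge">/etc/systemd/system/postgresql-11.service.d</code> as the first argument, followed by <code class="language-plaintext highlighter-rouge">systemctl daemon-reload</code> and a restart of the service; <code class="language-plaintext highlighter-rouge">systemctl edit</code> performs the reload for you.</p>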
<h1 id="conclusions">Conclusions</h1>
<p>Not knowing your tools, <code class="language-plaintext highlighter-rouge">systemd</code> in this case, can lead to panic when they do not behave as you expect.
Unluckily, there are too many little details to know about every different system, and I wish <code class="language-plaintext highlighter-rouge">systemd</code> were a little less rude and at least warned users that their files are going to be overwritten.
<br />
While the unit file states, at its very beginning, that it should not be modified, it is not clear what the best approach to redefine variables is (include or override file?).</p>
Checking PostgreSQL Version in Scripts2019-07-18T00:00:00+00:00https://fluca1978.github.io/2019/07/18/CheckPostgreSQLVersionInScripts<p><code class="language-plaintext highlighter-rouge">psql(1)</code> has basic support for conditionals, which can be used to check the PostgreSQL version and act accordingly in scripts.</p>
<h1 id="checking-postgresql-version-in-scripts">Checking PostgreSQL Version in Scripts</h1>
<p><a href="https://www.postgresql.org/docs/11/app-psql.html"><code class="language-plaintext highlighter-rouge">psql(1)</code> provides some support for conditionals</a> and this can be used in scripts to check, for instance, the PostgreSQL version.
<br />
This is quite trivial, however I had to adjust an example script of mine to act properly depending on the PostgreSQL version.</p>
<h2 id="the-problem">The problem</h2>
<p>The problem I had was with declarative partitioning: since PostgreSQL 11, declarative partitioning supports a <code class="language-plaintext highlighter-rouge">DEFAULT</code> partition, that is, a <em>catch-all bucket</em> for tuples that don’t have an explicit partition to go into.
In PostgreSQL 10 you need to create the <em>catch-all</em> partition(s) manually by defining them explicitly.
<br />
In my use case, I had a set of tables partitioned by a time range (the year, to be precise), but I didn’t want to set up a partition for each year before the starting point of <em>clean data</em>: all data from year 2015 onwards is correct, while older rows could contain dirty data with bogus years.
<br />
Therefore, I needed a partition to catch all bogus data before year 2015, that is, a partition that ranges from the beginning of time until 2015. In PostgreSQL 11 this simply requires defining a <code class="language-plaintext highlighter-rouge">DEFAULT</code> partition, and that’s it! But how can the same script create the right <em>default</em> partition on both PostgreSQL 10 and 11?
<br />
<br />
<a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/partitioning/partitioning_example.declarative.sql">I solved the problem with something like the following</a>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">\</span><span class="n">if</span> <span class="p">:</span><span class="n">pg_version_10</span>
<span class="err">\</span><span class="n">echo</span> <span class="s1">'PostgreSQL version is 10'</span>
<span class="err">\</span><span class="n">echo</span> <span class="s1">'Emulate a DEFAULT partition'</span>
<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">digikam</span><span class="p">.</span><span class="n">images_old</span>
<span class="k">PARTITION</span> <span class="k">OF</span> <span class="n">digikam</span><span class="p">.</span><span class="n">images_root</span>
<span class="k">FOR</span> <span class="k">VALUES</span> <span class="k">FROM</span> <span class="p">(</span> <span class="k">MINVALUE</span> <span class="p">)</span>
<span class="k">TO</span> <span class="p">(</span> <span class="s1">'2015-01-01'</span> <span class="p">);</span>
<span class="err">\</span><span class="k">else</span>
<span class="err">\</span><span class="n">echo</span> <span class="s1">'PostgreSQL version is at least 11'</span>
<span class="err">\</span><span class="n">echo</span> <span class="s1">'Using DEFAULT partition'</span>
<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">digikam</span><span class="p">.</span><span class="n">images_old</span>
<span class="k">PARTITION</span> <span class="k">OF</span> <span class="n">digikam</span><span class="p">.</span><span class="n">images_root</span>
<span class="k">DEFAULT</span><span class="p">;</span>
<span class="err">\</span><span class="n">endif</span>
</code></pre></div></div>
<p>The idea is quite simple: if (<code class="language-plaintext highlighter-rouge">\if</code>) PostgreSQL is at version 10, emulate a default partition; otherwise (<code class="language-plaintext highlighter-rouge">\else</code>) PostgreSQL is at version 11 or greater and can use a native <code class="language-plaintext highlighter-rouge">DEFAULT</code> partition. The partition has the same name in both cases, so that the final user does not see any difference.
<br />
<br />
But what is that <code class="language-plaintext highlighter-rouge">:pg_version_10</code> stuff? That’s a boolean <code class="language-plaintext highlighter-rouge">psql(1)</code> variable set up by another <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/pgsql.check_postgresql_version.psql">utility</a>, included in my script:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span>
<span class="k">EXISTS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">setting</span>
<span class="k">FROM</span> <span class="n">pg_settings</span>
<span class="k">WHERE</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'server_version_num'</span>
<span class="k">AND</span> <span class="n">setting</span><span class="p">::</span><span class="nb">int</span> <span class="o">>=</span> <span class="mi">120000</span>
<span class="k">AND</span> <span class="n">setting</span><span class="p">::</span><span class="nb">int</span> <span class="o"><</span> <span class="mi">130000</span>
<span class="p">)</span>
<span class="k">AS</span> <span class="n">pg_version_12</span>
<span class="p">,</span> <span class="k">EXISTS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">setting</span>
<span class="k">FROM</span> <span class="n">pg_settings</span>
<span class="k">WHERE</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'server_version_num'</span>
<span class="k">AND</span> <span class="n">setting</span><span class="p">::</span><span class="nb">int</span> <span class="o">>=</span> <span class="mi">110000</span>
<span class="k">AND</span> <span class="n">setting</span><span class="p">::</span><span class="nb">int</span> <span class="o"><</span> <span class="mi">120000</span>
<span class="p">)</span>
<span class="k">AS</span> <span class="n">pg_version_11</span>
<span class="c1">-- and so on ...</span>
<span class="p">,</span> <span class="k">EXISTS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">setting</span>
<span class="k">FROM</span> <span class="n">pg_settings</span>
<span class="k">WHERE</span> <span class="n">name</span> <span class="o">=</span> <span class="s1">'server_version_num'</span>
<span class="k">AND</span> <span class="n">setting</span><span class="p">::</span><span class="nb">int</span> <span class="o"><</span> <span class="mi">100000</span>
<span class="p">)</span>
<span class="k">AS</span> <span class="n">pg_version_less_than_10</span>
<span class="err">\</span><span class="n">gset</span>
</code></pre></div></div>
<p>The script does a very simple job: it queries the <code class="language-plaintext highlighter-rouge">server_version_num</code> setting and dynamically creates (<code class="language-plaintext highlighter-rouge">\gset</code>) boolean variables that are true or false depending on the version number of the PostgreSQL instance.
<br />
The only thing required is to include the script, for instance at the very top of your own script:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- beginning of your script</span>
<span class="err">\</span><span class="n">ir</span> <span class="p">..</span><span class="o">/</span><span class="n">pgsql</span><span class="p">.</span><span class="n">check_postgresql_version</span><span class="p">.</span><span class="n">psql</span>
</code></pre></div></div>
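<p>The same kind of check can be done from the shell, before launching <code class="language-plaintext highlighter-rouge">psql(1)</code> on a script. This is a hedged sketch and not part of my utility: <code class="language-plaintext highlighter-rouge">pg_major_version</code> is a hypothetical Bash function that turns a <code class="language-plaintext highlighter-rouge">server_version_num</code> value into a major version, assuming the usual numbering (five digits before version 10, six digits afterwards).</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a server_version_num value to the major version.
#   $1 = value of server_version_num, e.g. 110004 or 90608
pg_major_version() {
    local num="$1"
    if (( num >= 100000 )); then
        # Since PostgreSQL 10 the major version is a single number: 110004 -> 11
        echo $(( num / 10000 ))
    else
        # Before PostgreSQL 10 the major version has two parts: 90608 -> 9.6
        echo "$(( num / 10000 )).$(( (num / 100) % 100 ))"
    fi
}

pg_major_version 110004   # prints 11
pg_major_version 90608    # prints 9.6
```

<p>The input could come from <code class="language-plaintext highlighter-rouge">psql -At -c 'SHOW server_version_num'</code>, so that a wrapper script can decide which SQL file to run.</p>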
<p><br />
<br />
<em>And that’s all folks!</em>
<br />
<br />
What this allows me to do is, for instance, avoid running a declarative partitioning script at all if that feature is not supported on the server side:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">\</span><span class="n">if</span> <span class="p">:</span><span class="n">pg_version_less_than_10</span>
<span class="err">\</span><span class="n">echo</span> <span class="s1">'PostgreSQL version less than 10, cannot run declarative partitioning!'</span>
<span class="err">\</span><span class="n">echo</span> <span class="s1">'Update yourself!'</span>
<span class="err">\</span><span class="n">quit</span>
<span class="err">\</span><span class="n">endif</span>
</code></pre></div></div>
<p>Just placing the above snippet on top of my declarative partitioning script prevents me from running commands that will generate errors if the server is not at least at version 10.</p>
<h2 id="summary">Summary</h2>
<p>Thanks to <code class="language-plaintext highlighter-rouge">psql(1)</code> conditionals support it is possible to behave differently depending on the server version.
<br />
The advantage is that, clearly, you can build more robust scripts.
<br />
The drawback is that such scripts require <code class="language-plaintext highlighter-rouge">psql(1)</code> and are therefore less portable.</p>
Suggesting Single-Column Primary Keys (almost) Automatically2019-07-17T00:00:00+00:00https://fluca1978.github.io/2019/07/17/SuggestPrimaryKeys<p>Is it possible to infer primary keys automatically? If it is, I’m not able to do that yet, but at least I can try.</p>
<h1 id="suggesting-single-column-primary-keys-almost-automatically">Suggesting Single-Column Primary Keys (almost) Automatically</h1>
<p>A comment on my previous <a href="https://fluca1978.github.io/2019/07/09/GeneratePrimaryKeys.html">blog post about generating primary keys</a> with a procedure made me think about how to inspect a table to understand which columns can be candidates for primary keys.
<br />
<br />
Of course, this makes sense (at least to me) for <em>single-column</em> constraints only, because multi-column constraints require a deep knowledge of the data. Anyway, <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/suggest_primary_keys.sql">here is my first attempt</a>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">f_suggest_primary_keys</span><span class="p">(</span> <span class="n">schemaz</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'public'</span><span class="p">,</span>
<span class="n">tablez</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="k">NULL</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="nb">text</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">current_stats</span> <span class="n">record</span><span class="p">;</span>
<span class="n">is_unique</span> <span class="nb">boolean</span><span class="p">;</span>
<span class="n">is_primary_key</span> <span class="nb">boolean</span><span class="p">;</span>
<span class="n">could_be_unique</span> <span class="nb">boolean</span><span class="p">;</span>
<span class="n">could_be_primary_key</span> <span class="nb">boolean</span><span class="p">;</span>
<span class="n">current_constraint</span> <span class="nb">char</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">current_alter_table</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Inspecting schema % (table %)'</span><span class="p">,</span> <span class="n">schemaz</span><span class="p">,</span> <span class="n">tablez</span><span class="p">;</span>
<span class="k">FOR</span> <span class="n">current_stats</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="n">s</span><span class="p">.</span><span class="o">*</span><span class="p">,</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="k">AS</span> <span class="n">nspoid</span><span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="k">AS</span> <span class="n">reloid</span> <span class="k">FROM</span> <span class="n">pg_stats</span> <span class="n">s</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="k">c</span> <span class="k">ON</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">tablename</span>
<span class="k">JOIN</span> <span class="n">pg_namespace</span> <span class="n">n</span> <span class="k">ON</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">relnamespace</span>
<span class="k">WHERE</span> <span class="n">s</span><span class="p">.</span><span class="n">schemaname</span> <span class="o">=</span> <span class="n">schemaz</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="n">n</span><span class="p">.</span><span class="n">nspname</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">schemaname</span>
<span class="k">AND</span> <span class="p">(</span> <span class="n">tablez</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">OR</span> <span class="n">s</span><span class="p">.</span><span class="n">tablename</span> <span class="o">=</span> <span class="n">tablez</span> <span class="p">)</span>
<span class="n">LOOP</span>
<span class="n">is_primary_key</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="n">is_unique</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="n">could_be_unique</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="n">could_be_primary_key</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Inspecting table [%.%] (%.%) -> %'</span><span class="p">,</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">schemaname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">tablename</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">nspoid</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">reloid</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">attname</span><span class="p">;</span>
<span class="c1">-- search if this attribute is already included into</span>
<span class="c1">-- a primary key constraint</span>
<span class="k">SELECT</span> <span class="n">cn</span><span class="p">.</span><span class="n">contype</span>
<span class="k">INTO</span> <span class="n">current_constraint</span>
<span class="k">FROM</span> <span class="n">pg_constraint</span> <span class="n">cn</span>
<span class="k">JOIN</span> <span class="n">pg_attribute</span> <span class="n">a</span> <span class="k">ON</span> <span class="n">a</span><span class="p">.</span><span class="n">attnum</span> <span class="o">=</span> <span class="k">ANY</span><span class="p">(</span> <span class="n">cn</span><span class="p">.</span><span class="n">conkey</span> <span class="p">)</span>
<span class="k">WHERE</span> <span class="n">cn</span><span class="p">.</span><span class="n">conrelid</span> <span class="o">=</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">reloid</span>
<span class="k">AND</span> <span class="n">cn</span><span class="p">.</span><span class="n">connamespace</span> <span class="o">=</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">nspoid</span>
<span class="k">AND</span> <span class="n">a</span><span class="p">.</span><span class="n">attrelid</span> <span class="o">=</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">reloid</span>
<span class="k">AND</span> <span class="n">a</span><span class="p">.</span><span class="n">attname</span> <span class="o">=</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">attname</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">current_constraint</span> <span class="o">=</span> <span class="s1">'p'</span> <span class="k">THEN</span>
<span class="n">is_primary_key</span> <span class="p">:</span><span class="o">=</span> <span class="k">true</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">is_primary_key</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">current_constraint</span> <span class="o">=</span> <span class="s1">'u'</span> <span class="k">THEN</span>
<span class="n">is_unique</span> <span class="p">:</span><span class="o">=</span> <span class="k">true</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">is_unique</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- if this is already on a constraint, skip!</span>
<span class="n">IF</span> <span class="n">is_primary_key</span> <span class="k">OR</span> <span class="n">is_unique</span> <span class="k">THEN</span>
<span class="k">CONTINUE</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- check if this could be an unique attribute</span>
<span class="n">IF</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">n_distinct</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="k">THEN</span>
<span class="n">could_be_unique</span> <span class="p">:</span><span class="o">=</span> <span class="k">true</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">could_be_unique</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="c1">-- could it be promoted as a primary key?</span>
<span class="n">IF</span> <span class="n">could_be_unique</span> <span class="k">AND</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">null_frac</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">could_be_primary_key</span> <span class="p">:</span><span class="o">=</span> <span class="k">true</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">could_be_primary_key</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">could_be_primary_key</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Suggested PRIMARY KEY(%) on %.%'</span><span class="p">,</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">attname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">schemaname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">tablename</span><span class="p">;</span>
<span class="n">current_alter_table</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'ALTER TABLE %I.%I ADD PRIMARY KEY(%I)'</span><span class="p">,</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">schemaname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">tablename</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">attname</span> <span class="p">);</span>
<span class="k">ELSE</span> <span class="n">IF</span> <span class="n">could_be_unique</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Suggested UNIQUE(%) on %.%'</span><span class="p">,</span> <span class="n">current_stats</span><span class="p">.</span><span class="n">attname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">schemaname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">tablename</span><span class="p">;</span>
<span class="n">current_alter_table</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'ALTER TABLE %I.%I ADD UNIQUE(%I)'</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">schemaname</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">tablename</span><span class="p">,</span>
<span class="n">current_stats</span><span class="p">.</span><span class="n">attname</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEXT</span> <span class="n">current_alter_table</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>The idea is to wrap into a function all the logic, so that I can pass either the schema or the table name to inspect and have the arguments be set to decent defaults.
<br />
The first look is at <code class="language-plaintext highlighter-rouge">pg_stats</code> because it can provide hints about good candidates:</p>
<ul>
<li>if <code class="language-plaintext highlighter-rouge">n_distinct</code> is negative, and in particular is <code class="language-plaintext highlighter-rouge">-1</code>, the column holds a different value in every tuple, so it is (as far as we know) <em>unique</em>;</li>
<li>if <code class="language-plaintext highlighter-rouge">null_frac</code> is <code class="language-plaintext highlighter-rouge">0</code> the column contains no <code class="language-plaintext highlighter-rouge">NULL</code> values, so it can be a candidate for a primary key.</li>
</ul>
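<p>The two rules above can be summarized as a tiny decision function. The sketch below is only an illustration in shell: <code class="language-plaintext highlighter-rouge">suggest_constraint</code> is a hypothetical name, it takes the <code class="language-plaintext highlighter-rouge">n_distinct</code> and <code class="language-plaintext highlighter-rouge">null_frac</code> values of a column as arguments, and it relies on <code class="language-plaintext highlighter-rouge">awk</code> for the numeric comparison.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the pg_stats rules to a single column.
#   $1 = n_distinct, $2 = null_frac (as reported by pg_stats)
# Prints PRIMARY KEY, UNIQUE or NONE.
suggest_constraint() {
    awk -v nd="$1" -v nf="$2" 'BEGIN {
        # n_distinct = -1 means every row holds a distinct value
        if (nd + 0 == -1 && nf + 0 == 0)
            print "PRIMARY KEY"      # unique and never NULL
        else if (nd + 0 == -1)
            print "UNIQUE"           # unique, but some NULLs are present
        else
            print "NONE"             # not unique according to the statistics
    }'
}
```

<p>For instance, <code class="language-plaintext highlighter-rouge">suggest_constraint -1 0</code> prints <code class="language-plaintext highlighter-rouge">PRIMARY KEY</code>, while <code class="language-plaintext highlighter-rouge">suggest_constraint -1 0.25</code> prints <code class="language-plaintext highlighter-rouge">UNIQUE</code>.</p>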
<p><br />
<br />
<strong>Of course, this means that the statistics must be up-to-date or the whole thing will not be able to suggest constraints!</strong>
<br />
<br /></p>
<p>From <code class="language-plaintext highlighter-rouge">pg_stats</code> I get back the column names, and the first thing to check is whether the column already appears in a constraint of type <code class="language-plaintext highlighter-rouge">p</code> (primary key) or <code class="language-plaintext highlighter-rouge">u</code> (unique); this prevents the function from suggesting columns that are already involved in such a constraint, that is, it avoids suggesting obvious things.
<br />
<br />
<br />
The rest is quite simple: if the column is already involved in a constraint, skip it; otherwise consider whether it can be part of a <code class="language-plaintext highlighter-rouge">UNIQUE</code> or <code class="language-plaintext highlighter-rouge">PRIMARY KEY</code> constraint. Depending on the result, the right <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> is emitted, so that the administrator can review it and apply it with judgment.
<br />
<br />
Here is an example invocation:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span> <span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_suggest_primary_keys</span><span class="p">(</span> <span class="s1">'respi'</span><span class="p">,</span> <span class="s1">'tipo_rensom'</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Inspecting</span> <span class="k">schema</span> <span class="n">respi</span> <span class="p">(</span><span class="k">table</span> <span class="n">tipo_rensom</span><span class="p">)</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Inspecting</span> <span class="k">table</span> <span class="p">[</span><span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span><span class="p">]</span> <span class="p">(</span><span class="mi">151915</span><span class="p">.</span><span class="mi">151952</span><span class="p">)</span> <span class="o">-></span> <span class="n">pk</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Inspecting</span> <span class="k">table</span> <span class="p">[</span><span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span><span class="p">]</span> <span class="p">(</span><span class="mi">151915</span><span class="p">.</span><span class="mi">151952</span><span class="p">)</span> <span class="o">-></span> <span class="n">id_tipo_rensom</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Inspecting</span> <span class="k">table</span> <span class="p">[</span><span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span><span class="p">]</span> <span class="p">(</span><span class="mi">151915</span><span class="p">.</span><span class="mi">151952</span><span class="p">)</span> <span class="o">-></span> <span class="n">nome</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Suggested</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">(</span><span class="n">nome</span><span class="p">)</span> <span class="k">on</span> <span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Inspecting</span> <span class="k">table</span> <span class="p">[</span><span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span><span class="p">]</span> <span class="p">(</span><span class="mi">151915</span><span class="p">.</span><span class="mi">151952</span><span class="p">)</span> <span class="o">-></span> <span class="n">descrizione</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Suggested</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">(</span><span class="n">descrizione</span><span class="p">)</span> <span class="k">on</span> <span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span>
<span class="n">f_suggest_primary_keys</span>
<span class="c1">-------------------------------------------------------------------</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span> <span class="k">ADD</span> <span class="k">CONSTRAINT</span> <span class="k">UNIQUE</span><span class="p">(</span><span class="n">nome</span><span class="p">)</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">respi</span><span class="p">.</span><span class="n">tipo_rensom</span> <span class="k">ADD</span> <span class="k">CONSTRAINT</span> <span class="k">UNIQUE</span><span class="p">(</span><span class="n">descrizione</span><span class="p">)</span>
</code></pre></div></div>
<p>Unsurprisingly, the <code class="language-plaintext highlighter-rouge">pk</code> column, which is already a <code class="language-plaintext highlighter-rouge">PRIMARY KEY</code>, is inspected but skipped, while two other columns appear to be unique enough to be candidates for a constraint.</p>
<p><br />
As you can imagine, this is just a small attempt at automating boring stuff. There is a lot of room for improvement, both in terms of performance and, more importantly, in the support the function can provide to an administrator.</p>
Generate Primary Keys (almost) Automatically2019-07-09T00:00:00+00:00https://fluca1978.github.io/2019/07/09/GeneratePrimaryKeys<p>What if your database design is so poor that you need to refactor tables in order to add primary keys?</p>
<h1 id="generate-primary-keys-almost-automatically">Generate Primary Keys (almost) Automatically</h1>
<p>While working on a quite large database (in terms of number of tables) with a friend of mine, we discovered that almost all tables <strong>did not have a primary key</strong>!
<br />
Gosh!
<br />
<em>This is really baaaad!</em>
<br />
<br />
Why is that bad? Well, you should not need to ask, but let’s leave the poor database design aside and focus on some more concrete problems: in particular, not having a primary key prevents a lot of <em>smart</em> software and middleware from working on your database. As you probably know, almost every <em>ORM</em> requires each table to have a key, typically a surrogate one, in order to properly identify each row and enable persistence (that is, modification of rows).
<br />
<br />
Luckily, fixing tables for such software is quite simple: just add a surrogate key and everyone will be happy again. Unluckily, while adding a primary key is just a matter of issuing an <code class="language-plaintext highlighter-rouge">ALTER TABLE</code>, doing so for a long list of tables is boring.
<br />
<br />
Here comes the power of PostgreSQL again: thanks to its rich catalog, it is possible to automate the process.</p>
<p><br />
<br />
In this post you will see how to go from a simple query to a whole procedure that does the trick.</p>
<h2 id="a-query-to-generate-the-alter-table-commands">A query to generate the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> commands</h2>
<p>A first example is the following query, which searches for every table in the schema <code class="language-plaintext highlighter-rouge">public</code> that does not have a constraint of type <code class="language-plaintext highlighter-rouge">p</code> (<em>primary key</em>) and generates an <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> command for each such table:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">WITH</span>
<span class="n">to_be_fixed</span> <span class="k">AS</span>
<span class="p">(</span>
<span class="k">SELECT</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span><span class="p">,</span>
<span class="s1">'ALTER TABLE '</span>
<span class="o">||</span> <span class="n">quote_ident</span><span class="p">(</span> <span class="n">n</span><span class="p">.</span><span class="n">nspname</span> <span class="p">)</span>
<span class="o">||</span> <span class="s1">'.'</span>
<span class="o">||</span> <span class="n">quote_ident</span><span class="p">(</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="p">)</span>
<span class="o">||</span> <span class="s1">' ADD COLUMN pk int GENERATED ALWAYS AS IDENTITY PRIMARY KEY;'</span> <span class="k">AS</span> <span class="n">command</span>
<span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">c</span>
<span class="k">JOIN</span> <span class="n">pg_namespace</span> <span class="n">n</span> <span class="k">ON</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">relnamespace</span>
<span class="k">WHERE</span> <span class="n">n</span><span class="p">.</span><span class="n">nspname</span> <span class="o">=</span> <span class="s1">'public'</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">conname</span> <span class="k">FROM</span> <span class="n">pg_constraint</span> <span class="k">WHERE</span> <span class="n">contype</span> <span class="o">=</span> <span class="s1">'p'</span> <span class="k">AND</span> <span class="n">conrelid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="p">)</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span>
<span class="p">)</span>
<span class="k">SELECT</span> <span class="n">command</span> <span class="k">FROM</span> <span class="n">to_be_fixed</span><span class="p">;</span>
<span class="n">command</span>
<span class="c1">------------------------------------------------------------------------------------</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">bar</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
</code></pre></div></div>
<p>So a first, desperate way of doing it is to adjust the above query to your schema, save it to a file named <code class="language-plaintext highlighter-rouge">query.sql</code>, run it while redirecting the output into a text file (say <code class="language-plaintext highlighter-rouge">script.sql</code>), and finally execute that file. In other words, something like:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> luca <span class="nt">-h</span> miguel <span class="nt">-A</span> <span class="nt">-t</span> <span class="nt">-f</span> query.sql <span class="nt">-o</span> script.sql testdb
% psql <span class="nt">-U</span> luca <span class="nt">-h</span> miguel <span class="nt">-f</span> script.sql testdb
</code></pre></div></div>
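<p>As a possible shortcut, assuming you are working in an interactive <code class="language-plaintext highlighter-rouge">psql</code> session (version 9.6 or later), the <code class="language-plaintext highlighter-rouge">\gexec</code> meta-command can run each returned value as a statement, skipping the intermediate file. The following is only a sketch of the same catalog query adapted to that usage; try it on a test database first:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Build one ALTER TABLE per table without a primary key in schema public.
-- \gexec sends the query and then executes every returned value as SQL.
SELECT format( 'ALTER TABLE %I.%I ADD COLUMN pk int GENERATED ALWAYS AS IDENTITY PRIMARY KEY',
               n.nspname, c.relname )
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
AND   c.relkind = 'r'
AND   NOT EXISTS ( SELECT 1 FROM pg_constraint WHERE contype = 'p' AND conrelid = c.oid )
ORDER BY c.relname
\gexec
</code></pre></div></div>
<p>With the default autocommit setting, each generated statement runs in its own transaction, so a failure on one table does not undo the ones already altered.</p>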
<p>But let’s see a more tunable way of doing it.</p>
<h2 id="a-function-to-generate-the-alter-table-commands">A function to generate the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> commands</h2>
<p>I’ve written <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/create_primary_keys.sql">a very small function to generate the above <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> commands</a> in a way that is a little smarter and more tunable.
The function accepts four parameters, all with default values:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pk_prefix</code> (defaults to <code class="language-plaintext highlighter-rouge">pk</code>) the name of your primary key column, call it <code class="language-plaintext highlighter-rouge">id</code>, <code class="language-plaintext highlighter-rouge">pk</code> or whatever;</li>
<li><code class="language-plaintext highlighter-rouge">schemaz</code> (defaults to <code class="language-plaintext highlighter-rouge">public</code>) the schema you want to operate on;</li>
<li><code class="language-plaintext highlighter-rouge">use_identity</code> (defaults to <code class="language-plaintext highlighter-rouge">true</code>) true if you want to generate identity columns, false if you want to generate serial columns;</li>
<li><code class="language-plaintext highlighter-rouge">append_table_name</code> (defaults to <code class="language-plaintext highlighter-rouge">false</code>) in order to avoid column name clashes (you may already have an <code class="language-plaintext highlighter-rouge">id</code> column somewhere), it is possible to append the table name to the column name <code class="language-plaintext highlighter-rouge">pk_prefix</code>, so as to generate almost unique column names.</li>
</ul>
<p>The <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/create_primary_keys.sql">function looks like the following</a>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">f_generate_primary_keys</span><span class="p">(</span> <span class="n">pk_prefix</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'pk'</span><span class="p">,</span>
<span class="n">schemaz</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'public'</span><span class="p">,</span>
<span class="n">use_identity</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">true</span><span class="p">,</span>
<span class="n">append_table_name</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">false</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="k">SETOF</span> <span class="nb">text</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">current_class</span> <span class="n">pg_class</span><span class="o">%</span><span class="n">rowtype</span><span class="p">;</span>
<span class="n">current_alter_table</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">current_pk_type</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">current_pk_generation</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">current_pk_name</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">current_class</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">c</span>
<span class="k">JOIN</span> <span class="n">pg_namespace</span> <span class="n">n</span> <span class="k">ON</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">relnamespace</span>
<span class="k">WHERE</span> <span class="n">n</span><span class="p">.</span><span class="n">nspname</span> <span class="o">=</span> <span class="n">schemaz</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'r'</span>
<span class="k">AND</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="p">(</span> <span class="k">SELECT</span> <span class="n">conname</span> <span class="k">FROM</span> <span class="n">pg_constraint</span>
<span class="k">WHERE</span> <span class="n">contype</span> <span class="o">=</span> <span class="s1">'p'</span>
<span class="k">AND</span> <span class="n">conrelid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="p">)</span>
<span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Table [%] without primary key'</span><span class="p">,</span> <span class="n">current_class</span><span class="p">.</span><span class="n">relname</span><span class="p">;</span>
<span class="n">current_pk_name</span> <span class="p">:</span><span class="o">=</span> <span class="n">pk_prefix</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">append_table_name</span> <span class="k">THEN</span>
<span class="n">current_pk_name</span> <span class="p">:</span><span class="o">=</span> <span class="n">current_pk_name</span> <span class="o">||</span> <span class="s1">'_'</span> <span class="o">||</span> <span class="n">current_class</span><span class="p">.</span><span class="n">relname</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="n">use_identity</span> <span class="k">THEN</span>
<span class="n">current_pk_type</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'serial'</span><span class="p">;</span>
<span class="n">current_pk_generation</span> <span class="p">:</span><span class="o">=</span> <span class="s1">''</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="n">current_pk_type</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'int'</span><span class="p">;</span>
<span class="n">current_pk_generation</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'GENERATED ALWAYS AS IDENTITY'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">current_alter_table</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'ALTER TABLE %I.%I ADD COLUMN %I %s NOT NULL %s PRIMARY KEY;'</span><span class="p">,</span>
<span class="n">schemaz</span><span class="p">,</span>
<span class="n">current_class</span><span class="p">.</span><span class="n">relname</span><span class="p">,</span>
<span class="n">current_pk_name</span><span class="p">,</span>
<span class="n">current_pk_type</span><span class="p">,</span>
<span class="n">current_pk_generation</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">' -> %'</span><span class="p">,</span> <span class="n">current_alter_table</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEXT</span> <span class="n">current_alter_table</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">RETURN</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>Briefly, the function issues a query that is very similar to the one above, and that finds all the tuples in <code class="language-plaintext highlighter-rouge">pg_class</code> corresponding to a table without a primary key. For each table, the appropriate <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> is built and returned as a row of the result set.
<br />
Invoking the function produces the commands to be executed later against the database:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_generate_primary_keys</span><span class="p">(</span> <span class="s1">'id'</span><span class="p">,</span> <span class="s1">'public'</span><span class="p">,</span> <span class="k">true</span><span class="p">,</span> <span class="k">false</span> <span class="p">);</span>
<span class="n">f_generate_primary_keys</span>
<span class="c1">---------------------------------------------------------------------------------------------</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id</span> <span class="nb">int</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">bar</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id</span> <span class="nb">int</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_generate_primary_keys</span><span class="p">();</span>
<span class="n">f_generate_primary_keys</span>
<span class="c1">---------------------------------------------------------------------------------------------</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">bar</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">pk</span> <span class="nb">int</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">f_generate_primary_keys</span><span class="p">(</span> <span class="s1">'id'</span><span class="p">,</span> <span class="s1">'public'</span><span class="p">,</span> <span class="k">false</span><span class="p">,</span> <span class="k">true</span> <span class="p">);</span>
<span class="n">f_generate_primary_keys</span>
<span class="c1">------------------------------------------------------------------------</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id_foo</span> <span class="nb">serial</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">bar</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id_bar</span> <span class="nb">serial</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
</code></pre></div></div>
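<p>Since every parameter has a default value, PostgreSQL named notation should also work here, letting you set only the arguments you care about. A minimal sketch, assuming the function has been created exactly as shown above:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Keep the default prefix (pk), schema (public) and identity columns,
-- but append the table name to the generated column name.
SELECT * FROM f_generate_primary_keys( append_table_name => true );

-- Named arguments can be given in any order.
SELECT * FROM f_generate_primary_keys( use_identity => false, schemaz => 'public' );
</code></pre></div></div>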
<p>There is of course room for improvement, for instance executing the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> immediately within the function.</p>
<h2 id="a-procedure-to-execute-the-alter-table-commands">A procedure to execute the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> commands</h2>
<p>It is now quite straightforward to wrap the <code class="language-plaintext highlighter-rouge">f_generate_primary_keys</code> function into a <em>procedure</em> and add transaction logic. The boring part is just passing the arguments through and deciding when to issue a commit while batch processing:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">PROCEDURE</span> <span class="n">p_generate_primary_keys</span><span class="p">(</span> <span class="n">pk_prefix</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'pk'</span><span class="p">,</span>
<span class="n">schemaz</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'public'</span><span class="p">,</span>
<span class="n">use_identity</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">true</span><span class="p">,</span>
<span class="n">append_table_name</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">false</span><span class="p">,</span>
<span class="n">commit_after</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="mi">10</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">current_alter_table</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">done</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">current_alter_table</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="n">f_generate_primary_keys</span><span class="p">(</span> <span class="n">pk_prefix</span><span class="p">,</span> <span class="n">schemaz</span><span class="p">,</span> <span class="n">use_identity</span><span class="p">,</span> <span class="n">append_table_name</span> <span class="p">)</span>
<span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Executing [%]'</span><span class="p">,</span> <span class="n">current_alter_table</span><span class="p">;</span>
<span class="k">EXECUTE</span> <span class="n">current_alter_table</span><span class="p">;</span>
<span class="n">done</span> <span class="p">:</span><span class="o">=</span> <span class="n">done</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">done</span> <span class="o">%</span> <span class="n">commit_after</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Forcing a commit'</span><span class="p">;</span>
<span class="k">COMMIT</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Altered % tables in schema %'</span><span class="p">,</span> <span class="n">done</span><span class="p">,</span> <span class="n">schemaz</span><span class="p">;</span>
<span class="k">COMMIT</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>The important part here is, of course, the <code class="language-plaintext highlighter-rouge">EXECUTE</code> statement and the commit control.
Invoking the procedure produces something like:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">call</span> <span class="n">p_generate_primary_keys</span><span class="p">(</span> <span class="s1">'id'</span><span class="p">,</span> <span class="s1">'public'</span><span class="p">,</span> <span class="k">false</span><span class="p">,</span> <span class="k">true</span> <span class="p">);</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Table</span> <span class="p">[</span><span class="n">foo</span><span class="p">]</span> <span class="k">without</span> <span class="k">primary</span> <span class="k">key</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="o">-></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id_foo</span> <span class="nb">serial</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">Table</span> <span class="p">[</span><span class="n">bar</span><span class="p">]</span> <span class="k">without</span> <span class="k">primary</span> <span class="k">key</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="o">-></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">bar</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id_bar</span> <span class="nb">serial</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Executing</span> <span class="p">[</span><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id_foo</span> <span class="nb">serial</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;]</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">will</span> <span class="k">create</span> <span class="k">implicit</span> <span class="n">sequence</span> <span class="nv">"foo_id_foo_seq"</span> <span class="k">for</span> <span class="nb">serial</span> <span class="k">column</span> <span class="nv">"foo.id_foo"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="o">/</span> <span class="k">ADD</span> <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="n">will</span> <span class="k">create</span> <span class="k">implicit</span> <span class="k">index</span> <span class="nv">"foo_pkey"</span> <span class="k">for</span> <span class="k">table</span> <span class="nv">"foo"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">rewriting</span> <span class="k">table</span> <span class="nv">"foo"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">building</span> <span class="k">index</span> <span class="nv">"foo_pkey"</span> <span class="k">on</span> <span class="k">table</span> <span class="nv">"foo"</span> <span class="n">serially</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Executing</span> <span class="p">[</span><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="k">public</span><span class="p">.</span><span class="n">bar</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">id_bar</span> <span class="nb">serial</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">;]</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">will</span> <span class="k">create</span> <span class="k">implicit</span> <span class="n">sequence</span> <span class="nv">"bar_id_bar_seq"</span> <span class="k">for</span> <span class="nb">serial</span> <span class="k">column</span> <span class="nv">"bar.id_bar"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="o">/</span> <span class="k">ADD</span> <span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="n">will</span> <span class="k">create</span> <span class="k">implicit</span> <span class="k">index</span> <span class="nv">"bar_pkey"</span> <span class="k">for</span> <span class="k">table</span> <span class="nv">"bar"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">rewriting</span> <span class="k">table</span> <span class="nv">"bar"</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">building</span> <span class="k">index</span> <span class="nv">"bar_pkey"</span> <span class="k">on</span> <span class="k">table</span> <span class="nv">"bar"</span> <span class="n">serially</span>
<span class="n">DEBUG</span><span class="p">:</span> <span class="n">Altered</span> <span class="mi">2</span> <span class="n">tables</span> <span class="k">in</span> <span class="k">schema</span> <span class="k">public</span>
<span class="n">LOG</span><span class="p">:</span> <span class="n">duration</span><span class="p">:</span> <span class="mi">16</span><span class="p">.</span><span class="mi">224</span> <span class="n">ms</span> <span class="k">statement</span><span class="p">:</span> <span class="k">call</span> <span class="n">p_generate_primary_keys</span><span class="p">(</span> <span class="s1">'id'</span><span class="p">,</span> <span class="s1">'public'</span><span class="p">,</span> <span class="k">false</span><span class="p">,</span> <span class="k">true</span> <span class="p">);</span>
<span class="k">CALL</span>
</code></pre></div></div>
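<p>To double check the result, one option is to count the ordinary tables that still lack a primary key, reusing the catalog lookup from the first query. This is just a sketch, with the schema name <code class="language-plaintext highlighter-rouge">public</code> matching the examples above; once the procedure has completed successfully, the count is expected to drop to zero:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Count ordinary tables in schema public that still have no primary key.
SELECT count(*) AS tables_without_pk
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
AND   c.relkind = 'r'
AND   NOT EXISTS ( SELECT 1 FROM pg_constraint WHERE contype = 'p' AND conrelid = c.oid );
</code></pre></div></div>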
<p>Again, there is room for improvement, but this is just a quick demonstration of how easy it is to exploit PostgreSQL facilities to refactor your schema.</p>
PostgreSQL & recovery.conf2019-07-08T00:00:00+00:00https://fluca1978.github.io/2019/07/08/PostgreSQL12Recovery<p>The coming version of PostgreSQL, 12, will lose the <code class="language-plaintext highlighter-rouge">recovery.conf</code> file. It will take some time to get used to it!</p>
<h1 id="postgresql--recoveryconf">PostgreSQL & recovery.conf</h1>
<p>According to the documentation for the upcoming version <em>12</em>, the <strong><code class="language-plaintext highlighter-rouge">recovery.conf</code> file has gone!</strong>
The release notes state it clearly: <a href="https://www.postgresql.org/docs/12/release-12.html#id-1.11.6.5.4">the server will not start if <code class="language-plaintext highlighter-rouge">recovery.conf</code> is in place</a> and all the configuration parameters have moved to the classic <code class="language-plaintext highlighter-rouge">postgresql.conf</code> (or included files).
<br />
<br />
The <a href="https://www.postgresql.org/message-id/flat/CANP8+jLO5fmfudbB1b1iw3pTdOK1HBM=xMTaRfOa5zpDVcqzew@mail.gmail.com">change proposal is quite old</a>, but it represents a <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2dedf4d9a899b36d1a8ed29be5efbd1b31a8fe85">deep change in the way PostgreSQL handles the server startup and recovery</a>, and it could take a while for all the software out there to handle it too.
<br />
<br />
<em>Please note that since PostgreSQL 12 is still in beta, things could change a little</em>, even if the discussion and the implementation are nearly complete.
<br />
<br /></p>
<p><a href="https://www.postgresql.org/docs/12/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY">Two files can be created to instrument a standby node</a>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">standby.signal</code>: if present in the <code class="language-plaintext highlighter-rouge">PGDATA</code> directory, the host will work as a standby, that is, it will wait for new incoming WAL segments and replay them for the rest of its life;</li>
<li><code class="language-plaintext highlighter-rouge">recovery.signal</code>: if present, the server will stop replaying WAL as soon as all the WALs have been consumed or the <code class="language-plaintext highlighter-rouge">recovery_target</code> parameter has been reached.</li>
</ul>
<p><br />
It is interesting to note that <code class="language-plaintext highlighter-rouge">standby.signal</code> takes precedence over <code class="language-plaintext highlighter-rouge">recovery.signal</code>, meaning that if both files exist the node will act as a standby. <strong>Both files may be empty: they now act as <em>triggering</em> files rather than configuration files</strong> (hence the change in the suffix).
<br />
<br />
So, what is the rationale for this change? There are several reasons, including no longer needing a separate configuration file. But what I like the most is that having the parameters in the <em>trunk</em> configuration <strong>makes them good candidates to be changed via <a href="https://www.postgresql.org/docs/12/sql-altersystem.html"><code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> and the <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code> machinery</a></strong> (see later for an example).</p>
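<p>A minimal sketch of the two trigger files and their precedence, assuming a throw-away directory stands in for <code class="language-plaintext highlighter-rouge">PGDATA</code>. The <code class="language-plaintext highlighter-rouge">demo_pgdata</code> and <code class="language-plaintext highlighter-rouge">node_mode</code> names are only for illustration; the <code class="language-plaintext highlighter-rouge">if</code> chain merely mirrors the precedence rule described above and is not how the server itself is implemented.</p>

```shell
# Hypothetical scratch directory standing in for PGDATA (assumption for the demo).
demo_pgdata=$(mktemp -d)

# Both files are plain, empty markers: their presence is what matters.
touch "$demo_pgdata/standby.signal"
touch "$demo_pgdata/recovery.signal"

# With both files in place, standby.signal wins and the node acts as a standby.
if [ -f "$demo_pgdata/standby.signal" ]; then
    node_mode="standby"
elif [ -f "$demo_pgdata/recovery.signal" ]; then
    node_mode="targeted recovery"
else
    node_mode="normal startup"
fi
echo "expected mode: $node_mode"
```

<p>On a real node the files must be created in the actual data directory, owned by the operating system user that runs PostgreSQL.</p>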
<p><br />
<br />
While all recovery parameters have been kept the same, the <code class="language-plaintext highlighter-rouge">trigger_file</code> one has been <a href="https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-PROMOTE-TRIGGER-FILE">renamed to <code class="language-plaintext highlighter-rouge">promote_trigger_file</code></a> to clearly emphasize its meaning.
<br />
<br />
The above is not the only big difference in recovery handling: it is no longer possible to specify multiple <code class="language-plaintext highlighter-rouge">recovery_target_xxx</code> variables and “hope” to get the server to do it right (selecting the last one, effectively). The administrator is required to do a better job in selecting precisely which target to recover to! Lastly, the recovery timeline now defaults to the latest one rather than the current one.
<br />
As you can expect, <code class="language-plaintext highlighter-rouge">pg_basebackup</code> has been changed accordingly and therefore the <code class="language-plaintext highlighter-rouge">--write-recovery-conf</code> option (<code class="language-plaintext highlighter-rouge">-R</code>) now only puts a <code class="language-plaintext highlighter-rouge">standby.signal</code> file within the <code class="language-plaintext highlighter-rouge">PGDATA</code> directory. Settings are now appended to <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code>.
<br />
<br />
<br />
So, a lot of changes in the way the cluster manages the recovery/stand-by modes, and I hope all the automated backup software out there will respond properly.</p>
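<p>The “one recovery target only” rule can be roughly checked before starting the server. The sketch below assumes settings are written as <code class="language-plaintext highlighter-rouge">name = value</code> at the start of a line (PostgreSQL also accepts the form without <code class="language-plaintext highlighter-rouge">=</code>, which this simple pattern ignores) and does not follow include directives. <code class="language-plaintext highlighter-rouge">count_recovery_targets</code> is a hypothetical helper, and a scratch file replaces the real configuration.</p>

```shell
# Count the mutually exclusive recovery targets set in a configuration file:
# recovery_target, recovery_target_lsn, recovery_target_name,
# recovery_target_time and recovery_target_xid.
# count_recovery_targets is a hypothetical helper, not a PostgreSQL tool.
count_recovery_targets() {
    grep -E -c '^[[:space:]]*recovery_target(_lsn|_name|_time|_xid)?[[:space:]]*=' "$1"
}

# Demo on a scratch file (assumption: the real settings live in postgresql.conf).
demo_conf=$(mktemp)
printf '%s\n' \
    "recovery_target_time = '2019-07-08 12:00:00'" \
    "recovery_target_xid = '1234'" \
    "recovery_target_timeline = 'latest'" > "$demo_conf"

targets=$(count_recovery_targets "$demo_conf")
if [ "$targets" -gt 1 ]; then
    echo "$targets recovery targets set: PostgreSQL 12 expects at most one"
fi
```

<p>Note that <code class="language-plaintext highlighter-rouge">recovery_target_timeline</code> is deliberately not counted: it selects the timeline, not the point to stop at.</p>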
<h2 id="contexts">Contexts</h2>
<p>The contexts of the GUCs that have been moved have not changed so far:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template1</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">name</span><span class="p">,</span> <span class="n">context</span> <span class="k">FROM</span> <span class="n">pg_settings</span> <span class="k">WHERE</span> <span class="n">category</span> <span class="k">like</span> <span class="s1">'% Archiv%'</span><span class="p">;</span>
<span class="n">name</span> <span class="o">|</span> <span class="n">context</span>
<span class="c1">-------------------------|------------</span>
<span class="n">archive_cleanup_command</span> <span class="o">|</span> <span class="n">sighup</span>
<span class="n">archive_command</span> <span class="o">|</span> <span class="n">sighup</span>
<span class="n">archive_mode</span> <span class="o">|</span> <span class="n">postmaster</span>
<span class="n">archive_timeout</span> <span class="o">|</span> <span class="n">sighup</span>
<span class="n">recovery_end_command</span> <span class="o">|</span> <span class="n">sighup</span>
<span class="n">restore_command</span> <span class="o">|</span> <span class="n">postmaster</span>
</code></pre></div></div>
<h2 id="what-happens-if-you-keep-around-recoveryconf">What happens if you keep around <code class="language-plaintext highlighter-rouge">recovery.conf</code>?</h2>
<p>Let’s try it:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres <span class="nb">touch</span> /pgdata/12beta2/recovery.conf
% <span class="nb">sudo</span> <span class="nt">-u</span> postgres pg_ctl <span class="nt">-D</span> /pgdata/12beta2 start
...
FATAL: using recovery <span class="nb">command </span>file <span class="s2">"recovery.conf"</span> is not supported
LOG: startup process <span class="o">(</span>PID 5837<span class="o">)</span> exited with <span class="nb">exit </span>code 1
LOG: aborting startup due to startup process failure
LOG: database system is shut down
</code></pre></div></div>
<p>As already detailed, the database refuses to start.</p>
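<p>Since a leftover file is enough to block the startup, a pre-flight check before upgrading can save some time. This is only a sketch: <code class="language-plaintext highlighter-rouge">check_recovery_conf</code> is a hypothetical helper and the scratch directory stands in for the real <code class="language-plaintext highlighter-rouge">PGDATA</code>.</p>

```shell
# check_recovery_conf is a hypothetical helper: it only looks for a leftover
# recovery.conf in the data directory given as first argument.
check_recovery_conf() {
    if [ -e "$1/recovery.conf" ]; then
        echo "found $1/recovery.conf: PostgreSQL 12 will refuse to start"
        return 1
    fi
    echo "no recovery.conf in $1"
    return 0
}

# Demo against a scratch directory (assumption; use your real PGDATA instead).
demo_pgdata=$(mktemp -d)
touch "$demo_pgdata/recovery.conf"
check_recovery_conf "$demo_pgdata" || echo "move its settings to postgresql.conf before starting"
```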
<h2 id="what-does-happen-when-you-issue-an-alter-system">What happens when you issue an <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code>?</h2>
<p>Easy, pal: the configuration is written to <code class="language-plaintext highlighter-rouge">postgresql.auto.conf</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">%</span> <span class="n">psql</span> <span class="o">-</span><span class="n">U</span> <span class="n">postgres</span> <span class="n">template1</span>
<span class="n">psql</span> <span class="p">(</span><span class="mi">12</span><span class="n">beta2</span><span class="p">)</span>
<span class="k">Type</span> <span class="nv">"help"</span> <span class="k">for</span> <span class="n">help</span><span class="p">.</span>
<span class="n">template1</span><span class="o">=#</span> <span class="k">ALTER</span> <span class="k">SYSTEM</span> <span class="k">SET</span> <span class="n">restore_command</span> <span class="k">TO</span> <span class="s1">'cp %p %f'</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">SYSTEM</span>
</code></pre></div></div>
<p>that results in:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo cat</span> /pgdata/12beta2/postgresql.auto.conf
<span class="c"># Do not edit this file manually!</span>
<span class="c"># It will be overwritten by the ALTER SYSTEM command.</span>
restore_command <span class="o">=</span> <span class="s1">'cp %p %f'</span>
</code></pre></div></div>
PostgreSQL Administrator Account WITH NOLOGIN (recover your role)2019-06-27T00:00:00+00:00https://fluca1978.github.io/2019/06/27/PostgreSQLSingleMode<p>Today I got an email from a friend of mine who got locked out of his own database due to a little mistake.</p>
<h1 id="postgresql-administrator-account-with-nologin-recover-your-postgres-role">PostgreSQL Administrator Account WITH NOLOGIN (recover your <code class="language-plaintext highlighter-rouge">postgres</code> role)</h1>
<p>What if you get locked out of your own cluster due to a simple and, to some extent, stupid error?
Let’s see it in a quick list of steps.
<br />
First of all, lock the default <code class="language-plaintext highlighter-rouge">postgres</code> account so that the default administrator can no longer log in to the cluster:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% psql <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"ALTER ROLE postgres WITH NOLOGIN"</span> testdb
ALTER ROLE
% psql <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"SELECT version();"</span> testdb
psql: FATAL: role <span class="s2">"postgres"</span> is not permitted to log <span class="k">in</span>
</code></pre></div></div>
<p><em>What a mess!</em>
<br />
<br />
PostgreSQL has a specific recovery mode, called <strong>single user mode</strong>, that resembles the operating system single user mode and can be used for such situations. Let’s see how.
<br />
First of all, <strong>shut down the cluster</strong> to avoid doing more damage than you have already done!</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>service postgresql stop
</code></pre></div></div>
<p><br />
Now, start the <code class="language-plaintext highlighter-rouge">postgres</code> process in single user mode. You need to know the data directory of your cluster in order for it to work:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo</span> <span class="nt">-u</span> postgres postgres <span class="nt">--single</span> <span class="nt">-D</span> /mnt/pg_data/pgdata/11.1
PostgreSQL stand-alone backend 11.3
backend>
</code></pre></div></div>
<p>What happened? I used the operating system user <code class="language-plaintext highlighter-rouge">postgres</code> to launch the operating system process <code class="language-plaintext highlighter-rouge">postgres</code> (ok there’s a little name confusion here!) in single (<code class="language-plaintext highlighter-rouge">--single</code>) mode for my own data directory (<code class="language-plaintext highlighter-rouge">-D</code>). I got a prompt, I’m connected to the backend process directly, so this is not the same as a local or TCP/IP connection: I’m interacting with the backend process itself. Luckily, the backend process can speak SQL! Therefore, I can <em>reset</em> my administrator role:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">backend</span><span class="o">></span> <span class="k">ALTER</span> <span class="k">ROLE</span> <span class="n">postgres</span> <span class="k">WITH</span> <span class="n">LOGIN</span><span class="p">;</span>
<span class="n">backend</span><span class="o">></span>
</code></pre></div></div>
<p>Please note that, while the backend process can speak SQL, it does not speak the same way <code class="language-plaintext highlighter-rouge">psql</code> does: there is no need for a semicolon and an <code class="language-plaintext highlighter-rouge"><enter></code> will send the statement to the backend. Anyway, I can now release the backend process as I would do with any other operating system process, gently or not, for instance via <code class="language-plaintext highlighter-rouge">CTRL-D</code> (<em>End of File</em>).</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>backend> CTRL-D
%
</code></pre></div></div>
<p>It is now time to restart the cluster and check if the user <code class="language-plaintext highlighter-rouge">postgres</code> can connect again:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">sudo </span>service postgresql start
% psql <span class="nt">-U</span> postgres <span class="nt">-c</span> <span class="s2">"SELECT CURRENT_DATE;"</span> testdb
current_date
<span class="nt">--------------</span>
2019-06-27
<span class="o">(</span>1 row<span class="o">)</span>
</code></pre></div></div>
<p>The world is a happy place again!</p>
Importing data from MSSQL is faster than I thought2019-06-26T00:00:00+00:00https://fluca1978.github.io/2019/06/26/MSSQLImport<p>A few months ago I set up a Foreign Data Wrapper against a Microsoft SQL Server to import historical data.
I’m quite impressed by how quick the bulk import is.</p>
<h1 id="importing-data-from-mssql-is-faster-than-i-thought">Importing data from MSSQL is faster than I thought</h1>
<p>It has not been an exciting job, but having PostgreSQL pull data out of Microsoft SQL Server is a joy!
The architecture is quite dumb:</p>
<ul>
<li>PostgreSQL connects via <em>Foreign Data Wrapper</em> to the MSSQL machine;</li>
<li>a function is executed to extract records, crunch them and store the modified rows in PostgreSQL;</li>
<li>every 10 minutes, iterate.</li>
</ul>
<p>So far, it is importing about <code class="language-plaintext highlighter-rouge">3.5k</code> tuples every ten minutes, that is around <code class="language-plaintext highlighter-rouge">500000</code> tuples per day. Each run usually completes in less than a second, and I’m able to monitor that thanks to a <em>status</em> table where I store the <code class="language-plaintext highlighter-rouge">clock_timestamp()</code> values:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="n">record_count</span><span class="p">,</span>
<span class="n">ts_end</span> <span class="o">-</span> <span class="n">ts_begin</span> <span class="k">AS</span> <span class="n">elapsed_time</span>
<span class="k">FROM</span> <span class="n">pull_stat</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">pk</span> <span class="k">DESC</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">record_count</span> <span class="o">|</span> <span class="n">elapsed_time</span>
<span class="c1">--------------|-----------------</span>
<span class="mi">3659</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">440777</span>
<span class="mi">3656</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">363089</span>
<span class="mi">3694</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">385919</span>
<span class="mi">3713</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">460304</span>
<span class="mi">3695</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">02</span><span class="p">.</span><span class="mi">209158</span>
<span class="mi">3678</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">393815</span>
<span class="mi">3699</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">404685</span>
<span class="mi">3693</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">403348</span>
<span class="mi">3704</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">358293</span>
<span class="mi">3683</span> <span class="o">|</span> <span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">:</span><span class="mi">00</span><span class="p">.</span><span class="mi">355856</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="k">version</span><span class="p">();</span>
<span class="k">version</span>
<span class="c1">---------------------------------------------------------------------------------------------------------</span>
<span class="n">PostgreSQL</span> <span class="mi">11</span><span class="p">.</span><span class="mi">1</span> <span class="k">on</span> <span class="n">x86_64</span><span class="o">-</span><span class="n">pc</span><span class="o">-</span><span class="n">linux</span><span class="o">-</span><span class="n">gnu</span><span class="p">,</span> <span class="n">compiled</span> <span class="k">by</span> <span class="n">gcc</span> <span class="p">(</span><span class="n">GCC</span><span class="p">)</span> <span class="mi">4</span><span class="p">.</span><span class="mi">8</span><span class="p">.</span><span class="mi">5</span> <span class="mi">20150623</span> <span class="p">(</span><span class="n">Red</span> <span class="n">Hat</span> <span class="mi">4</span><span class="p">.</span><span class="mi">8</span><span class="p">.</span><span class="mi">5</span><span class="o">-</span><span class="mi">28</span><span class="p">),</span> <span class="mi">64</span><span class="o">-</span><span class="nb">bit</span>
</code></pre></div></div>
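<p>The daily figure quoted before the query follows from simple arithmetic. A quick check, taking 3500 tuples per run as a round approximation of the <code class="language-plaintext highlighter-rouge">record_count</code> values listed:</p>

```shell
# Figures taken from the text: about 3.5k tuples per run, one run every 10 minutes.
tuples_per_run=3500
runs_per_day=$(( 24 * 60 / 10 ))
tuples_per_day=$(( tuples_per_run * runs_per_day ))
echo "$runs_per_day runs/day, $tuples_per_day tuples/day"
```

<p>That is 144 runs and 504000 tuples per day, in line with the rough 500000 estimate.</p>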
<p>While this is surely not a benchmark, I’m quite impressed about the speed of pulling data from a foreign server.
It’s interesting to note that the <code class="language-plaintext highlighter-rouge">tds</code> version is <code class="language-plaintext highlighter-rouge">7.2</code> with <code class="language-plaintext highlighter-rouge">notice</code> messages enabled (which I suspect adds a little overhead).</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="err">\</span><span class="n">des</span><span class="o">+</span>
<span class="n">List</span> <span class="k">of</span> <span class="k">foreign</span> <span class="n">servers</span>
<span class="n">Name</span> <span class="o">|</span> <span class="n">server_mssql</span>
<span class="k">Owner</span> <span class="o">|</span> <span class="n">postgres</span>
<span class="k">Foreign</span><span class="o">-</span><span class="k">data</span> <span class="n">wrapper</span> <span class="o">|</span> <span class="n">tds_fdw</span>
<span class="k">Access</span> <span class="k">privileges</span> <span class="o">|</span>
<span class="k">Type</span> <span class="o">|</span>
<span class="k">Version</span> <span class="o">|</span>
<span class="n">FDW</span> <span class="k">options</span> <span class="o">|</span> <span class="p">(</span><span class="n">servername</span> <span class="s1">'10.0.0.1'</span><span class="p">,</span> <span class="k">database</span> <span class="s1">'AXES'</span><span class="p">,</span> <span class="n">msg_handler</span> <span class="s1">'notice'</span><span class="p">,</span> <span class="n">tds_version</span> <span class="s1">'7.2'</span><span class="p">)</span>
<span class="n">Description</span> <span class="o">|</span>
</code></pre></div></div>
A recursive CTE to get information about partitions2019-06-12T00:00:00+00:00https://fluca1978.github.io/2019/06/12/PartitioningCTE<p>I was wondering about writing a function that provides a quick status about partitioning. But wait, PostgreSQL has recursive CTEs!</p>
<h1 id="a-recursive-cte-to-get-information-about-partitions">A recursive CTE to get information about partitions</h1>
<p>I’m used to partitioning, it allows me to quickly and precisely split data across different tables.
PostgreSQL 10 introduced native partitioning, and since then I have been using native partitioning over inheritance whenever possible.
<br />
But how to get a quick overview of the partition status? I mean, knowing which partition is growing the most?
<br />
In the beginning I was thinking of writing a function to do that task, quickly finding myself iterating recursively over <code class="language-plaintext highlighter-rouge">pg_inherits</code>, the table that <em>links</em> partitions to their parents. But the keyword here is <em>recursively</em>: PostgreSQL provides <em>recursive Common Table Expressions</em>, and a quick search revealed I was right: it is possible to do it with a single CTE. Taking inspiration from <a href="https://www.postgresql.org/message-id/otalb9%245ma%241%40blaine.gmane.org">this mailing list message</a>, here is a simple CTE to get a partition status (you can find it on my <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/partitioning_report.sql">GitHub repository</a>):</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">WITH</span> <span class="k">RECURSIVE</span> <span class="n">inheritance_tree</span> <span class="k">AS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="k">AS</span> <span class="n">table_oid</span>
<span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="k">AS</span> <span class="k">table_name</span>
<span class="p">,</span> <span class="k">NULL</span><span class="p">::</span><span class="n">name</span> <span class="k">AS</span> <span class="n">table_parent_name</span>
<span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">relispartition</span> <span class="k">AS</span> <span class="n">is_partition</span>
<span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">c</span>
<span class="k">JOIN</span> <span class="n">pg_namespace</span> <span class="n">n</span> <span class="k">ON</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">relnamespace</span>
<span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'p'</span>
<span class="k">AND</span> <span class="k">c</span><span class="p">.</span><span class="n">relispartition</span> <span class="o">=</span> <span class="k">false</span>
<span class="k">UNION</span> <span class="k">ALL</span>
<span class="k">SELECT</span> <span class="n">inh</span><span class="p">.</span><span class="n">inhrelid</span> <span class="k">AS</span> <span class="n">table_oid</span>
<span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span> <span class="k">AS</span> <span class="k">table_name</span>
<span class="p">,</span> <span class="n">cc</span><span class="p">.</span><span class="n">relname</span> <span class="k">AS</span> <span class="n">table_parent_name</span>
<span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">relispartition</span> <span class="k">AS</span> <span class="n">is_partition</span>
<span class="k">FROM</span> <span class="n">inheritance_tree</span> <span class="n">it</span>
<span class="k">JOIN</span> <span class="n">pg_inherits</span> <span class="n">inh</span> <span class="k">ON</span> <span class="n">inh</span><span class="p">.</span><span class="n">inhparent</span> <span class="o">=</span> <span class="n">it</span><span class="p">.</span><span class="n">table_oid</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="k">c</span> <span class="k">ON</span> <span class="n">inh</span><span class="p">.</span><span class="n">inhrelid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="n">cc</span> <span class="k">ON</span> <span class="n">it</span><span class="p">.</span><span class="n">table_oid</span> <span class="o">=</span> <span class="n">cc</span><span class="p">.</span><span class="n">oid</span>
<span class="p">)</span>
<span class="k">SELECT</span>
<span class="n">it</span><span class="p">.</span><span class="k">table_name</span>
<span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">reltuples</span>
<span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">relpages</span>
<span class="p">,</span> <span class="k">CASE</span> <span class="n">p</span><span class="p">.</span><span class="n">partstrat</span>
<span class="k">WHEN</span> <span class="s1">'l'</span> <span class="k">THEN</span> <span class="s1">'BY LIST'</span>
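<span class="k">WHEN</span> <span class="s1">'r'</span> <span class="k">THEN</span> <span class="s1">'BY RANGE'</span>
<span class="k">WHEN</span> <span class="s1">'h'</span> <span class="k">THEN</span> <span class="s1">'BY HASH'</span>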
<span class="k">ELSE</span> <span class="s1">'not partitioned'</span>
<span class="k">END</span> <span class="k">AS</span> <span class="n">partitionin_type</span>
<span class="p">,</span> <span class="n">it</span><span class="p">.</span><span class="n">table_parent_name</span>
<span class="p">,</span> <span class="n">pg_get_expr</span><span class="p">(</span> <span class="k">c</span><span class="p">.</span><span class="n">relpartbound</span><span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">partitioning_values</span>
<span class="p">,</span> <span class="n">pg_get_expr</span><span class="p">(</span> <span class="n">p</span><span class="p">.</span><span class="n">partexprs</span><span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">sub_partitioning_values</span>
<span class="k">FROM</span> <span class="n">inheritance_tree</span> <span class="n">it</span>
<span class="k">JOIN</span> <span class="n">pg_class</span> <span class="k">c</span> <span class="k">ON</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="n">it</span><span class="p">.</span><span class="n">table_oid</span>
<span class="k">LEFT</span> <span class="k">JOIN</span> <span class="n">pg_partitioned_table</span> <span class="n">p</span> <span class="k">ON</span> <span class="n">p</span><span class="p">.</span><span class="n">partrelid</span> <span class="o">=</span> <span class="n">it</span><span class="p">.</span><span class="n">table_oid</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">;</span>
</code></pre></div></div>
<p>The bootstrap term in the CTE selects all the partitioned tables that are not themselves partitions, that is, the <em>roots</em> of a partitioning scheme. The recursive term simply joins <code class="language-plaintext highlighter-rouge">pg_inherits</code> in order to extract the children information.
The query attached to the CTE extracts information like the number of tuples and pages (that’s what I need), and a summary of the partitioning including second level partitioning. Thanks to <code class="language-plaintext highlighter-rouge">pg_get_expr</code> it is possible to get a human readable partitioning strategy.
<br />
<br />
The output results in something like the following:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">...</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">5</span> <span class="p">]</span><span class="c1">-----------|----------------------------------</span>
<span class="k">table_name</span> <span class="o">|</span> <span class="n">y2018</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="mi">0</span>
<span class="n">relpages</span> <span class="o">|</span> <span class="mi">0</span>
<span class="n">partitionin_type</span> <span class="o">|</span> <span class="k">BY</span> <span class="n">LIST</span>
<span class="n">table_parent_name</span> <span class="o">|</span> <span class="n">root</span>
<span class="n">partitioning_values</span> <span class="o">|</span> <span class="k">FOR</span> <span class="k">VALUES</span> <span class="k">IN</span> <span class="p">(</span><span class="s1">'2018'</span><span class="p">)</span>
<span class="n">sub_partitioning_values</span> <span class="o">|</span> <span class="n">date_part</span><span class="p">(</span><span class="s1">'month'</span><span class="p">::</span><span class="nb">text</span><span class="p">,</span> <span class="n">mis_ora</span><span class="p">)</span>
<span class="p">...</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">15</span> <span class="p">]</span><span class="c1">----------|----------------------------------</span>
<span class="k">table_name</span> <span class="o">|</span> <span class="n">y2018m10</span>
<span class="n">reltuples</span> <span class="o">|</span> <span class="mi">1</span><span class="p">.</span><span class="mi">48956</span><span class="n">e</span><span class="o">+</span><span class="mi">07</span>
<span class="n">relpages</span> <span class="o">|</span> <span class="mi">139212</span>
<span class="n">partitionin_type</span> <span class="o">|</span> <span class="k">not</span> <span class="n">partitioned</span>
<span class="n">table_parent_name</span> <span class="o">|</span> <span class="n">y2018</span>
<span class="n">partitioning_values</span> <span class="o">|</span> <span class="k">FOR</span> <span class="k">VALUES</span> <span class="k">IN</span> <span class="p">(</span><span class="s1">'10'</span><span class="p">)</span>
<span class="n">sub_partitioning_values</span> <span class="o">|</span>
</code></pre></div></div>
<p>This states that table <code class="language-plaintext highlighter-rouge">y2018</code> is a child of table <code class="language-plaintext highlighter-rouge">root</code>, accepts the value <code class="language-plaintext highlighter-rouge">'2018'</code> and is in turn partitioned by list, with its children partitioned by month. On the other hand, <code class="language-plaintext highlighter-rouge">y2018m10</code> is not partitioned any further and is a child of <code class="language-plaintext highlighter-rouge">y2018</code>.
<br />
That’s a quick glance at the partitioning status in the cluster! Of course, it is possible to improve on this to get more information or restrict it depending on your needs.</p>
<p><br />
<br /></p>
<h2 id="update-2019-06-15">UPDATE 2019-06-15</h2>
<p>As per the discussion reported on the <a href="https://www.postgresql.org/message-id/[email protected]"><code class="language-plaintext highlighter-rouge">bugs</code> mailing list</a>, the query I originally proposed was tricky: while it worked on v11, it did not on the upcoming v12, and the reason <a href="https://github.com/fluca1978/fluca1978-pg-utils/commit/487fb04210e4e3dd31703ae6de18a08b7b3aae17">was that I was erroneously casting <code class="language-plaintext highlighter-rouge">NULL</code> to <code class="language-plaintext highlighter-rouge">text</code> in the non-recursive term and then <em>unioning</em> with a <code class="language-plaintext highlighter-rouge">name</code> in the recursive part</a>. Thanks to the <a href="https://www.postgresql.org/message-id/27731.1560525569%40sss.pgh.pa.us">explanation by Tom Lane</a> I was able not only to fix the query, but also to gain some more knowledge about PostgreSQL!</p>
Checking the sequences status on a single pass2019-06-11T00:00:00+00:00https://fluca1978.github.io/2019/06/11/SequenceCheck<p>It is quite simple to wrap a couple of queries in a function to have a glance at all the sequences and their cycling status.</p>
<h1 id="checking-the-sequences-status-on-a-single-pass">Checking the sequences status on a single pass</h1>
<p>The catalog <code class="language-plaintext highlighter-rouge">pg_sequence</code> keeps track of the definition of each sequence, including its increment value and boundaries. Combined with <code class="language-plaintext highlighter-rouge">pg_class</code> and a few other functions, it makes it possible to create a very simple administrative function that keeps track of the overall status of the sequences.
<br />
<br />
I’ve created a <code class="language-plaintext highlighter-rouge">seq_check()</code> function that provides an output as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">seq_check</span><span class="p">()</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">remaining</span><span class="p">;</span>
<span class="n">seq_name</span> <span class="o">|</span> <span class="n">current_value</span> <span class="o">|</span> <span class="n">lim</span> <span class="o">|</span> <span class="n">remaining</span>
<span class="c1">------------------------|---------------|------------|------------</span>
<span class="k">public</span><span class="p">.</span><span class="n">persona_pk_seq</span> <span class="o">|</span> <span class="mi">5000000</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">214248</span>
<span class="k">public</span><span class="p">.</span><span class="n">root_pk_seq</span> <span class="o">|</span> <span class="mi">50000</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">2147433647</span>
<span class="k">public</span><span class="p">.</span><span class="n">students_pk_seq</span> <span class="o">|</span> <span class="mi">7</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">2147483640</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p>As you can see, the function provides the current value of the sequence, the maximum value (limit) and how many values the sequence can still provide before it overflows or cycles. For example, <code class="language-plaintext highlighter-rouge">persona_pk_seq</code> has only 214248 values left to provide. Combined with the current value, which is 5000000, this hints that the sequence probably has too large an increment.
<br />
<br />
The code of the function is as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">seq_check</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">TABLE</span><span class="p">(</span> <span class="n">seq_name</span> <span class="nb">text</span><span class="p">,</span> <span class="n">current_value</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">lim</span> <span class="nb">bigint</span><span class="p">,</span> <span class="n">remaining</span> <span class="nb">bigint</span> <span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">query</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">schemaz</span> <span class="n">name</span><span class="p">;</span>
<span class="n">seqz</span> <span class="n">name</span><span class="p">;</span>
<span class="n">seqid</span> <span class="n">oid</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">schemaz</span><span class="p">,</span> <span class="n">seqz</span><span class="p">,</span> <span class="n">seqid</span> <span class="k">IN</span> <span class="k">SELECT</span> <span class="n">n</span><span class="p">.</span><span class="n">nspname</span><span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">relname</span><span class="p">,</span> <span class="k">c</span><span class="p">.</span><span class="n">oid</span>
<span class="k">FROM</span> <span class="n">pg_class</span> <span class="k">c</span>
<span class="k">JOIN</span> <span class="n">pg_namespace</span> <span class="n">n</span> <span class="k">ON</span> <span class="n">n</span><span class="p">.</span><span class="n">oid</span> <span class="o">=</span> <span class="k">c</span><span class="p">.</span><span class="n">relnamespace</span>
<span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">relkind</span> <span class="o">=</span> <span class="s1">'S'</span> <span class="c1">--sequence</span>
<span class="n">LOOP</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Inspecting %.%'</span><span class="p">,</span> <span class="n">schemaz</span><span class="p">,</span> <span class="n">seqz</span><span class="p">;</span>
<span class="n">query</span> <span class="p">:</span><span class="o">=</span> <span class="n">format</span><span class="p">(</span> <span class="s1">'SELECT </span><span class="se">''</span><span class="s1">%s.%s</span><span class="se">''</span><span class="s1">, last_value, s.seqmax AS lim, (s.seqmax - last_value) / s.seqincrement AS remaining FROM %I.%I, pg_sequence s WHERE s.seqrelid = %s'</span><span class="p">,</span>
<span class="n">quote_ident</span><span class="p">(</span> <span class="n">schemaz</span> <span class="p">),</span>
<span class="n">quote_ident</span><span class="p">(</span> <span class="n">seqz</span> <span class="p">),</span>
<span class="n">schemaz</span><span class="p">,</span>
<span class="n">seqz</span><span class="p">,</span>
<span class="n">seqid</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Query [%]'</span><span class="p">,</span> <span class="n">query</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">QUERY</span> <span class="k">EXECUTE</span> <span class="n">query</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span>
<span class="k">STRICT</span><span class="p">;</span>
</code></pre></div></div>
<p>As you can see, the main query is a join between <code class="language-plaintext highlighter-rouge">pg_sequence</code> and the sequence relation itself (which exposes <code class="language-plaintext highlighter-rouge">last_value</code>), while the list of sequences to inspect is extracted from <code class="language-plaintext highlighter-rouge">pg_class</code>. The function iterates over all the sequences in the database and reads each of them, and this means <em>the function must run with administrator privileges</em>.
<br />
<br />
I use this handy function to check the status on other machines, and quite frankly I’ve not yet come across a <code class="language-plaintext highlighter-rouge">remaining</code> value near zero, therefore I can sleep well at night:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">seq_check</span><span class="p">()</span> <span class="k">order</span> <span class="k">by</span> <span class="n">remaining</span><span class="p">;</span>
<span class="n">seq_name</span> <span class="o">|</span> <span class="n">current_value</span> <span class="o">|</span> <span class="n">lim</span> <span class="o">|</span> <span class="n">remaining</span>
<span class="c1">---------------------------|---------------|---------------------|---------------------</span>
<span class="n">t</span><span class="p">.</span><span class="n">root_pk_seq</span> <span class="o">|</span> <span class="mi">201338</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">2147282309</span>
<span class="n">respi</span><span class="p">.</span><span class="n">rosseni_tmp_pk_seq</span> <span class="o">|</span> <span class="mi">16673</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">2147466974</span>
<span class="n">respi</span><span class="p">.</span><span class="n">pull_status_pk_seq</span> <span class="o">|</span> <span class="mi">14603</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">2147469044</span>
<span class="n">respi</span><span class="p">.</span><span class="n">tipo_rossene_pk_seq</span> <span class="o">|</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">2147483647</span> <span class="o">|</span> <span class="mi">2147483639</span>
<span class="n">respi</span><span class="p">.</span><span class="n">root_pk_seq</span> <span class="o">|</span> <span class="mi">140509487</span> <span class="o">|</span> <span class="mi">9223372036854775807</span> <span class="o">|</span> <span class="mi">9223372036714266320</span>
<span class="n">cron</span><span class="p">.</span><span class="n">jobid_seq</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">|</span> <span class="mi">9223372036854775807</span> <span class="o">|</span> <span class="mi">9223372036854775806</span>
</code></pre></div></div>
<p>Of course, it is quite easy to improve the function by adding, for instance, a percentage or a <em>near-to-cycle</em> flag.</p>
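<p>As a rough sketch of such an improvement, the same figures can be computed in a single statement on top of the <code class="language-plaintext highlighter-rouge">pg_sequences</code> view (available since PostgreSQL 10). The <code class="language-plaintext highlighter-rouge">used_pct</code> and <code class="language-plaintext highlighter-rouge">near_limit</code> columns and the 90% threshold are assumptions made for illustration, and only ascending sequences are considered:</p>

```sql
-- Sketch only: used_pct, near_limit and the 0.9 threshold are illustrative choices.
-- pg_sequences reports last_value as NULL for sequences never used
-- or not readable by the current user, so those rows are skipped.
SELECT format( '%I.%I', s.schemaname, s.sequencename )    AS seq_name,
       s.last_value                                       AS current_value,
       s.max_value                                        AS lim,
       ( s.max_value - s.last_value ) / s.increment_by    AS remaining,
       -- cast to numeric to avoid bigint overflow on very wide ranges
       round( 100 * ( s.last_value::numeric - s.min_value )
                  / ( s.max_value::numeric - s.min_value ), 2 ) AS used_pct,
       ( s.last_value::numeric - s.min_value )
           / ( s.max_value::numeric - s.min_value ) > 0.9 AS near_limit,
       s.cycle                                            AS will_cycle
FROM pg_sequences AS s
WHERE s.last_value IS NOT NULL
  AND s.increment_by > 0       -- ascending sequences only
ORDER BY remaining;
```

<p>Unlike <code class="language-plaintext highlighter-rouge">seq_check()</code>, this variant needs no dynamic SQL, at the price of silently skipping the sequences the current user cannot read.</p>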
FizzBuzz (in both plpgsql and SQL)2019-06-11T00:00:00+00:00https://fluca1978.github.io/2019/06/11/FizzBuzz<p>While listening to a great talk by Benno Rice, I was pointed to the <em>FizzBuzz</em> algorithm. How hard could it be to implement it using PostgreSQL?</p>
<h1 id="fizzbuzz-in-both-plpgsql-and-sql">FizzBuzz (in both plpgsql and SQL)</h1>
<p><em>FizzBuzz</em> is often used as a straightforward question during job interviews: the idea is that if you cannot get the algorithm right, you are not a programmer at all!
<br />
The algorithm can be described as:
<br />
<br />
<em>Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.</em>
<br />
<br />
Now, how hard could it be? You can find my implementation <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/fizzbuzz.sql">here</a>.
Well, implementing it in <code class="language-plaintext highlighter-rouge">plpgsql</code> is as simple as:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span>
<span class="n">fizzbuzz</span><span class="p">(</span> <span class="n">start_number</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="mi">1</span><span class="p">,</span> <span class="n">end_number</span> <span class="nb">int</span> <span class="k">DEFAULT</span> <span class="mi">100</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">current_number</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">current_value</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="c1">-- check arguments</span>
<span class="n">IF</span> <span class="n">start_number</span> <span class="o">>=</span> <span class="n">end_number</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">EXCEPTION</span> <span class="s1">'The start number must be lower than the end one! From % to %'</span><span class="p">,</span> <span class="n">start_number</span><span class="p">,</span> <span class="n">end_number</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">FOR</span> <span class="n">current_number</span> <span class="k">IN</span> <span class="n">start_number</span> <span class="p">..</span> <span class="n">end_number</span> <span class="n">LOOP</span>
<span class="n">current_value</span> <span class="p">:</span><span class="o">=</span> <span class="k">NULL</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">current_number</span> <span class="o">%</span> <span class="mi">3</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">current_value</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'Fizz'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">current_number</span> <span class="o">%</span> <span class="mi">5</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">current_value</span> <span class="p">:</span><span class="o">=</span> <span class="n">coalesce</span><span class="p">(</span> <span class="n">current_value</span><span class="p">,</span> <span class="s1">''</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">'Buzz'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">current_value</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="n">current_value</span> <span class="p">:</span><span class="o">=</span> <span class="n">current_number</span><span class="p">::</span><span class="nb">text</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">INFO</span> <span class="s1">'% -> %'</span><span class="p">,</span> <span class="n">current_number</span><span class="p">,</span> <span class="n">current_value</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>This is <strong>a possible implementation</strong>; as you can see, there is more code to test the input than to actually do the work. The only trick, in my opinion, in <em>FizzBuzz</em> is that the case that prints <code class="language-plaintext highlighter-rouge">FizzBuzz</code> must be handled as a different conditional from the ones that test for <code class="language-plaintext highlighter-rouge">Fizz</code> or <code class="language-plaintext highlighter-rouge">Buzz</code>.
<br />
<br /></p>
<p><strong>But PostgreSQL also has recursive CTEs</strong>, and things get more interesting.</p>
<p><br /></p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">WITH</span> <span class="k">RECURSIVE</span> <span class="n">n</span> <span class="k">AS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="mi">1</span> <span class="k">AS</span> <span class="n">current_number</span><span class="p">,</span> <span class="k">NULL</span> <span class="k">AS</span> <span class="n">mod_3</span><span class="p">,</span> <span class="k">NULL</span> <span class="k">AS</span> <span class="n">mod_5</span>
<span class="k">UNION</span>
<span class="k">SELECT</span> <span class="n">current_number</span> <span class="o">+</span> <span class="mi">1</span> <span class="k">as</span> <span class="n">current_number</span>
<span class="p">,</span> <span class="k">CASE</span> <span class="p">(</span> <span class="n">current_number</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span> <span class="o">%</span> <span class="mi">3</span> <span class="k">WHEN</span> <span class="mi">0</span> <span class="k">THEN</span> <span class="s1">'Fizz'</span>
<span class="k">ELSE</span> <span class="k">NULL</span>
<span class="k">END</span> <span class="k">AS</span> <span class="n">mod_3</span>
<span class="p">,</span> <span class="k">CASE</span> <span class="p">(</span> <span class="n">current_number</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span> <span class="o">%</span> <span class="mi">5</span> <span class="k">WHEN</span> <span class="mi">0</span> <span class="k">THEN</span> <span class="s1">'Buzz'</span>
<span class="k">ELSE</span> <span class="k">NULL</span>
<span class="k">END</span> <span class="k">AS</span> <span class="n">mod_5</span>
<span class="k">FROM</span> <span class="n">n</span> <span class="k">WHERE</span> <span class="n">current_number</span> <span class="o"><</span> <span class="mi">100</span>
<span class="p">)</span>
<span class="k">SELECT</span> <span class="n">current_number</span><span class="p">,</span>
<span class="n">coalesce</span><span class="p">(</span> <span class="n">mod_3</span> <span class="o">||</span> <span class="n">mod_5</span><span class="p">,</span>
<span class="n">mod_3</span><span class="p">,</span>
<span class="n">mod_5</span><span class="p">,</span>
<span class="n">current_number</span><span class="p">::</span><span class="nb">text</span> <span class="p">)</span>
<span class="k">FROM</span> <span class="n">n</span><span class="p">;</span>
</code></pre></div></div>
<p>The idea is pretty simple: the <code class="language-plaintext highlighter-rouge">n</code> recursive CTE provides a list of one hundred numbers, each with the strings <code class="language-plaintext highlighter-rouge">Fizz</code>, <code class="language-plaintext highlighter-rouge">Buzz</code> or both, as a set of rows. Now, such strings must be concatenated, and here comes <code class="language-plaintext highlighter-rouge">coalesce</code>. The <code class="language-plaintext highlighter-rouge">coalesce</code> function returns its first argument that is not <code class="language-plaintext highlighter-rouge">NULL</code>. Since concatenating anything with <code class="language-plaintext highlighter-rouge">NULL</code> yields <code class="language-plaintext highlighter-rouge">NULL</code>, the first argument survives only when both <code class="language-plaintext highlighter-rouge">mod_3</code> and <code class="language-plaintext highlighter-rouge">mod_5</code> are not <code class="language-plaintext highlighter-rouge">NULL</code>, in which case they are concatenated into the <code class="language-plaintext highlighter-rouge">FizzBuzz</code> string. Otherwise, at most one of <code class="language-plaintext highlighter-rouge">mod_3</code> and <code class="language-plaintext highlighter-rouge">mod_5</code> is not <code class="language-plaintext highlighter-rouge">NULL</code>, and that one passes. If neither <code class="language-plaintext highlighter-rouge">Fizz</code> nor <code class="language-plaintext highlighter-rouge">Buzz</code> is set, then the regular number is printed as a last resort.
As you can imagine, the output is similar to:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">current_number</span> <span class="o">|</span> <span class="n">coalesce</span>
<span class="c1">----------------|----------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">1</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">2</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">Fizz</span>
<span class="mi">4</span> <span class="o">|</span> <span class="mi">4</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">Buzz</span>
<span class="mi">6</span> <span class="o">|</span> <span class="n">Fizz</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">7</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">8</span>
<span class="mi">9</span> <span class="o">|</span> <span class="n">Fizz</span>
<span class="mi">10</span> <span class="o">|</span> <span class="n">Buzz</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">11</span>
<span class="mi">12</span> <span class="o">|</span> <span class="n">Fizz</span>
<span class="mi">13</span> <span class="o">|</span> <span class="mi">13</span>
<span class="mi">14</span> <span class="o">|</span> <span class="mi">14</span>
<span class="mi">15</span> <span class="o">|</span> <span class="n">FizzBuzz</span>
</code></pre></div></div>
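<p>The whole trick rests on how <code class="language-plaintext highlighter-rouge">||</code> treats <code class="language-plaintext highlighter-rouge">NULL</code>, which the first statement below shows in isolation. The second statement is a sketch of the same <code class="language-plaintext highlighter-rouge">coalesce</code> idea built on <code class="language-plaintext highlighter-rouge">generate_series</code> instead of a recursive CTE; the <code class="language-plaintext highlighter-rouge">labels</code>, <code class="language-plaintext highlighter-rouge">fizz</code> and <code class="language-plaintext highlighter-rouge">buzz</code> names are made up for this example:</p>

```sql
-- || returns NULL as soon as one operand is NULL,
-- so only the "both labels set" case survives the concatenation.
SELECT 'Fizz' || NULL   AS only_fizz,   -- NULL
       NULL   || 'Buzz' AS only_buzz,   -- NULL
       'Fizz' || 'Buzz' AS both;        -- FizzBuzz

-- Same coalesce() cascade, with the numbers produced by generate_series().
SELECT n,
       coalesce( fizz || buzz, fizz, buzz, n::text ) AS fizzbuzz
FROM generate_series( 1, 100 ) AS n
CROSS JOIN LATERAL (
     -- a CASE without ELSE yields NULL when no branch matches
     SELECT CASE WHEN n % 3 = 0 THEN 'Fizz' END AS fizz,
            CASE WHEN n % 5 = 0 THEN 'Buzz' END AS buzz
) AS labels
ORDER BY n;
```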
<p>I’m sure there are tons of other implementations, smarter than the above ones. However, what I was interested in demonstrating here was the capability to implement such an algorithm with PostgreSQL facilities.</p>
Normalize to save space2019-05-31T00:00:00+00:00https://fluca1978.github.io/2019/05/31/RefactorToSaveSpace<p>It is no surprise at all: a normalized database requires less space on disk than a non-normalized one.</p>
<h1 id="normalize-to-save-space">Normalize to save space</h1>
<p>Sometimes you get a database that <em>Just Works</em> (tm) but its data is not normalized. I’m not a big fan of data normalization, I mean it does surely matter, but I don’t tend to “over-normalize” data ahead of design.
However, one of my databases was growing more and more because of a table with some repeated extra information.
<br />
Of course a normalized database gives you some more disk space at the cost of the joins during query execution, but having a decent server and a small join table is enough to sleep at night!
<br />
Let’s see what we are talking about:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mydb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_database_size</span><span class="p">(</span> <span class="s1">'mydb'</span> <span class="p">)</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">13</span> <span class="n">GB</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>Ok, <code class="language-plaintext highlighter-rouge">13 GB</code> is nothing scary, let’s say it is a fair database to work on (please note the size is reported <em>after</em> a full <code class="language-plaintext highlighter-rouge">VACUUM</code>).
In such a database, I have a table <code class="language-plaintext highlighter-rouge">root</code> that handles a lot of data from hardware sensors; the table is of course partitioned on a time basis. One thing the table was storing was the sensor <em>name</em>, a text string repeated over and over on the child tables too. While this was not a problem in the beginning, it was wasting more and more space over time.
<br />
<br />
Shame on me!
<br />
<br />
Let’s go normalize the table!
<br />
Normalizing a table is quite straightforward, and I’m not interested in sharing the details here. Let’s say this was quite easy because <strong>my users were executing queries against a view and not the root table</strong>, therefore I simply:</p>
<ul>
<li>created a join table;</li>
<li>populated the join table extracting data from the root table;</li>
<li>(within a transaction) removed the columns from the root table, modified the view (by dropping and recreating it).</li>
</ul>
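<p>The three steps above could look roughly like the following sketch. Only <code class="language-plaintext highlighter-rouge">app.root</code> and <code class="language-plaintext highlighter-rouge">sensor_name</code> come from this post; the lookup table <code class="language-plaintext highlighter-rouge">app.sensor</code>, the <code class="language-plaintext highlighter-rouge">sensor_pk</code> column and the view name <code class="language-plaintext highlighter-rouge">app.v_root</code> are hypothetical, and the real refactoring touched more than one column:</p>

```sql
-- Hypothetical names: app.sensor, sensor_pk and app.v_root are assumptions.

-- 1) create the join (lookup) table
CREATE TABLE app.sensor (
    pk          int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    sensor_name text NOT NULL UNIQUE
);

-- 2) populate it with the distinct values found in the root table
INSERT INTO app.sensor ( sensor_name )
SELECT DISTINCT sensor_name
FROM app.root
WHERE sensor_name IS NOT NULL;

-- 3) within a single transaction: link the rows, drop the column, rebuild the view
BEGIN;
ALTER TABLE app.root ADD COLUMN sensor_pk int;

UPDATE app.root AS r
SET sensor_pk = s.pk
FROM app.sensor AS s
WHERE s.sensor_name = r.sensor_name;

DROP VIEW app.v_root;                        -- the view depends on the old column
ALTER TABLE app.root DROP COLUMN sensor_name;

CREATE VIEW app.v_root AS
SELECT r.*, s.sensor_name                    -- users keep seeing the sensor name
FROM app.root AS r
LEFT JOIN app.sensor AS s ON s.pk = r.sensor_pk;
COMMIT;
```

<p>Because the users query the view, its column list is the only contract that has to be preserved across the change.</p>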
<p><br />
How much space was I supposed to gain?
Let’s see how much space the column occupied:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mydb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_column_size</span><span class="p">(</span> <span class="s1">'app.root.sensor_name'</span> <span class="p">);</span>
<span class="n">pg_column_size</span>
<span class="c1">----------------</span>
<span class="mi">20</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">mydb</span><span class="o">=#</span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">from</span> <span class="n">app</span><span class="p">.</span><span class="n">root</span><span class="p">;</span>
<span class="k">count</span>
<span class="c1">-----------</span>
<span class="mi">126224120</span>
<span class="n">mydb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="mi">126224120</span><span class="p">::</span><span class="nb">bigint</span> <span class="o">*</span> <span class="mi">20</span> <span class="p">);</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">2408</span> <span class="n">MB</span>
</code></pre></div></div>
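<p>One caveat about this kind of estimate: <code class="language-plaintext highlighter-rouge">pg_column_size()</code> measures the value it is handed, so a quoted string is sized as a literal and is not resolved as a column reference. A sketch that measures the stored values instead, assuming the <code class="language-plaintext highlighter-rouge">sensor_name</code> column is still in place (note that it scans the whole table):</p>

```sql
-- Assumes app.root still has the sensor_name column, i.e. run it before the refactoring.
SELECT avg( pg_column_size( sensor_name ) )                   AS avg_bytes,
       pg_size_pretty( sum( pg_column_size( sensor_name ) ) ) AS total_size
FROM app.root;
```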
<p>The text column was estimated at 20 bytes, which over 126 million tuples amounts to around <code class="language-plaintext highlighter-rouge">2.4 GB</code> of disk space.
After the transaction, I did a <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> to let PostgreSQL re-arrange the disk space and I got the expected result:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mydb</span><span class="o">=#</span> <span class="k">select</span> <span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_database_size</span><span class="p">(</span> <span class="s1">'mydb'</span> <span class="p">));</span>
<span class="n">pg_size_pretty</span>
<span class="c1">----------------</span>
<span class="mi">9234</span> <span class="n">MB</span>
</code></pre></div></div>
<p>Please note that the space gained is a lot more than estimated because I also refactored other columns here and there. <strong>But the normalized database proved to be less space hungry</strong>. Remember that the starting size was already <em>vacuumed</em>, so there is no extra space gain due to dead rows lying around.
<br />
<em>All the queries are working, the space is optimized, my users are happy, I’m happy!</em></p>
PostgreSQL is almost the best (according to Stack Overflow Survey)2019-05-29T00:00:00+00:00https://fluca1978.github.io/2019/05/29/StackOverflowSurvey<p>Stack Overflow 2019 Survey results are available, and PostgreSQL is almost leading in the database field.</p>
<h1 id="postgresql-is-almost-the-best-according-to-stack-overflow-survery">PostgreSQL is almost the best (according to Stack Overflow Survey)</h1>
<p>According to the 2019 survey made by Stack Overflow and <a href="https://insights.stackoverflow.com/survey/2019#technology">available here</a>, <strong>PostgreSQL is the second top database</strong>, slightly ahead of Microsoft SQL Server and clearly ahead of Oracle. And this is true both for the <em>community</em> and the <em>professional</em> users who took the survey.</p>
<p><br /></p>
<center>
<img src="/images/posts/stack_overflow_survey/database_2019.png" />
</center>
<p><br />
PostgreSQL is keeping its high position year after year and this means that the database is growing as a professional choice. In particular, among <em>professional</em> users PostgreSQL is used more, while MySQL and MS SQL lose some points.</p>
A glance at pg_cron to automatically schedule database tasks2019-05-21T00:00:00+00:00https://fluca1978.github.io/2019/05/21/pgcron<p>I tend to use <code class="language-plaintext highlighter-rouge">cron(1)</code> to schedule some automated tasks on the database server side, and since I discovered the <code class="language-plaintext highlighter-rouge">pg_cron</code> extension, I decided to try it. Here are some impressions.</p>
<h1 id="a-glance-at-pg_cron-to-automatically-schedule-database-tasks">A glance at <code class="language-plaintext highlighter-rouge">pg_cron</code> to automatically schedule database tasks</h1>
<p><a href="https://github.com/citusdata/pg_cron"><code class="language-plaintext highlighter-rouge">pg_cron</code></a> is an interesting PostgreSQL extension by Citus Data: it includes a <em>background worker</em> (i.e., a PostgreSQL managed process) to execute database tasks on the server side. This is something I’ve done for years, I mean managing automated tasks using the operating system wide <code class="language-plaintext highlighter-rouge">cron(1)</code> and similar schedulers, but having the scheduler within the database sounds really cool, since I can keep it tied to the data itself.</p>
<h2 id="an-example-scenario">An example scenario</h2>
<p>I’ve one server pulling data regularly out of another server, via a foreign data wrapper. No matter how this design choice sounds to you, it works for me!
<br />
In order to constantly pull data, I have set up a <code class="language-plaintext highlighter-rouge">cron(1)</code> task in my user crontab to execute a function that does all the business logic I need.
Therefore my crontab file looks like:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>crontab <span class="nt">-l</span>
10,20,30,40,50,0 <span class="k">*</span> <span class="k">*</span> <span class="k">*</span> <span class="k">*</span> <span class="se">\</span>
/usr/bin/psql <span class="nt">-U</span> postgres <span class="nt">-h</span> 127.0.0.1 <span class="se">\</span>
<span class="nt">-c</span> <span class="s2">"SELECT f_pull('crontab import');"</span> mydb
</code></pre></div></div>
<p>So I’m executing the function <code class="language-plaintext highlighter-rouge">f_pull</code> on database <code class="language-plaintext highlighter-rouge">mydb</code> specifying a label <code class="language-plaintext highlighter-rouge">crontab import</code>.
Let’s see how this can be done using <code class="language-plaintext highlighter-rouge">pg_cron</code> too.</p>
<h2 id="installing-pg_cron">Installing <code class="language-plaintext highlighter-rouge">pg_cron</code></h2>
<p>While there are some packages for major Linux distributions, I find it quite easy to install it from the official repository with the following short commands:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git clone https://github.com/citusdata/pg_cron.git
<span class="nv">$ </span><span class="nb">cd </span>pg_cron
<span class="nv">$ </span><span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>/usr/pgsql-11/bin:<span class="nv">$PATH</span>
<span class="nv">$ </span>make
<span class="nv">$ </span><span class="nb">sudo </span><span class="nv">PATH</span><span class="o">=</span><span class="nv">$PATH</span> make <span class="nb">install</span>
</code></pre></div></div>
<p>In case it matters, I’m using CentOS 7 Linux here.</p>
<p>Now, in order to make <code class="language-plaintext highlighter-rouge">pg_cron</code> work, it must be loaded as a shared library, so you have to adjust the PostgreSQL configuration (usually <code class="language-plaintext highlighter-rouge">postgresql.conf</code>) as follows:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>shared_preload_libraries <span class="o">=</span> <span class="s1">'pg_cron'</span>
cron.database_name <span class="o">=</span> <span class="s1">'mydb'</span>
</code></pre></div></div>
<p>Here I use <code class="language-plaintext highlighter-rouge">mydb</code> as the database in which to store the <code class="language-plaintext highlighter-rouge">pg_cron</code> data. In fact, <code class="language-plaintext highlighter-rouge">pg_cron</code> will create a <code class="language-plaintext highlighter-rouge">cron</code> schema with a table <code class="language-plaintext highlighter-rouge">job</code> in there that plays the same role as your crontab file on any Unix machine.
Unluckily, you need to restart the PostgreSQL cluster in order to apply the changes.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">sudo </span>service postgresql-11 restart
</code></pre></div></div>
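<p>One step is implicit above: after the restart, the extension still has to be created in the database named by <code class="language-plaintext highlighter-rouge">cron.database_name</code>, since that is what actually creates the <code class="language-plaintext highlighter-rouge">cron</code> schema. A minimal sketch, assuming a superuser session connected to <code class="language-plaintext highlighter-rouge">mydb</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- confirm the library was picked up after the restart
SHOW shared_preload_libraries;

-- create the extension, and with it the cron schema, in the configured database
CREATE EXTENSION pg_cron;

-- the job table should now exist (and be empty on a fresh installation)
SELECT count(*) FROM cron.job;
</code></pre></div></div>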
<p>It is now time to decide <em>who</em> will execute the cron jobs, and in my case it is the <code class="language-plaintext highlighter-rouge">postgres</code> superuser. This may not be the optimal choice, so choose the user that fits your needs. In my case I was already using <code class="language-plaintext highlighter-rouge">cron(1)</code> with the <code class="language-plaintext highlighter-rouge">postgres</code> user, so it seemed the right and fastest way to migrate from regular cron to <code class="language-plaintext highlighter-rouge">pg_cron</code>. Why does choosing the user in advance matter? Because <code class="language-plaintext highlighter-rouge">pg_cron</code> requires that user to be able to connect to the database without providing any password, so you should either adjust <code class="language-plaintext highlighter-rouge">pg_hba.conf</code> accordingly or add a <code class="language-plaintext highlighter-rouge">.pgpass</code> file in the home directory of the user. Yes, even if a background worker is used to implement the <code class="language-plaintext highlighter-rouge">pg_cron</code> features, the connection happens through <code class="language-plaintext highlighter-rouge">libpq</code>, hence the need for the user to be allowed to connect without providing a password. Therefore I changed <code class="language-plaintext highlighter-rouge">pg_hba.conf</code> as follows:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>host mydb postgres localhost trust
host all all 127.0.0.1/32 md5
</code></pre></div></div>
<p>adding the specific line for <code class="language-plaintext highlighter-rouge">postgres</code> before the generic one (PostgreSQL applies the first matching record, so line ordering does matter) and leaving all other connections requiring a password.
Now, you can issue a <em>reload</em> and test your user connectivity. Once this is done, you can configure <code class="language-plaintext highlighter-rouge">pg_cron</code>.</p>
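<p>The file can also be checked from SQL: the <code class="language-plaintext highlighter-rouge">pg_hba_file_rules</code> view (available since PostgreSQL 10) lists the parsed rules with their line numbers and flags syntax errors. A small sketch, assuming a superuser session:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- inspect the rules as the server parses them; a non-null error marks a broken line
SELECT line_number, type, database, user_name, address, auth_method, error
  FROM pg_hba_file_rules
 ORDER BY line_number;

-- once the rules look right, ask the server to reload its configuration
SELECT pg_reload_conf();
</code></pre></div></div>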
<h2 id="configuring-a-job">Configuring a job</h2>
<p>Configuring <code class="language-plaintext highlighter-rouge">pg_cron</code> is really simple: all jobs are kept in the <code class="language-plaintext highlighter-rouge">cron.job</code> table and you can either edit that table with standard SQL or use the <code class="language-plaintext highlighter-rouge">cron.schedule()</code> function to create an initial entry and refine it later.
Since I was migrating a <code class="language-plaintext highlighter-rouge">cron(1)</code> entry, things were as simple as copying and pasting the <code class="language-plaintext highlighter-rouge">cron(1)</code> schedule, with the command wrapped in dollar quoting:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mydb</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">cron</span><span class="p">.</span><span class="n">schedule</span><span class="p">(</span><span class="s1">'5,15,25,35,45,55 * * * *'</span><span class="p">,</span>
<span class="err">$</span><span class="n">CRON</span><span class="err">$</span> <span class="k">SELECT</span> <span class="n">f_pull</span><span class="p">(</span><span class="s1">'pg_cron import'</span><span class="p">);</span> <span class="err">$</span><span class="n">CRON</span><span class="err">$</span>
<span class="p">);</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">pg_cron</code> replies with the identifier of the job, in my case <code class="language-plaintext highlighter-rouge">1</code> because it is the very first job inserted in the scheduler. I can inspect it with an ordinary <code class="language-plaintext highlighter-rouge">SELECT</code> against the <code class="language-plaintext highlighter-rouge">cron.job</code> table.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mydb</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">cron</span><span class="p">.</span><span class="n">job</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">------------------------------------------------------------</span>
<span class="n">jobid</span> <span class="o">|</span> <span class="mi">1</span>
<span class="n">schedule</span> <span class="o">|</span> <span class="mi">5</span><span class="p">,</span><span class="mi">15</span><span class="p">,</span><span class="mi">25</span><span class="p">,</span><span class="mi">35</span><span class="p">,</span><span class="mi">45</span><span class="p">,</span><span class="mi">55</span> <span class="o">*</span> <span class="o">*</span> <span class="o">*</span> <span class="o">*</span>
<span class="n">command</span> <span class="o">|</span> <span class="k">SELECT</span> <span class="k">public</span><span class="p">.</span><span class="n">f_pull</span><span class="p">(</span><span class="s1">'pg_cron import'</span><span class="p">);</span>
<span class="n">nodename</span> <span class="o">|</span> <span class="n">localhost</span>
<span class="n">nodeport</span> <span class="o">|</span> <span class="mi">5432</span>
<span class="k">database</span> <span class="o">|</span> <span class="n">mydb</span>
<span class="n">username</span> <span class="o">|</span> <span class="n">postgres</span>
<span class="n">active</span> <span class="o">|</span> <span class="n">t</span>
</code></pre></div></div>
<p>All the fields are self-explanatory; please note that the <code class="language-plaintext highlighter-rouge">schedule</code> field reports the string in the exact same format as <code class="language-plaintext highlighter-rouge">cron(1)</code>. This is because <code class="language-plaintext highlighter-rouge">pg_cron</code> uses the very same parser as <code class="language-plaintext highlighter-rouge">cron(1)</code>, making migration from <code class="language-plaintext highlighter-rouge">cron(1)</code> to <code class="language-plaintext highlighter-rouge">pg_cron</code> really easy. By default <code class="language-plaintext highlighter-rouge">cron.schedule()</code> uses the current PostgreSQL instance parameters and the current username, but you can then adjust them to something else. While I haven’t tested it, this means you could execute cron tasks from one PostgreSQL instance against a remote one.
<br />
<br />
And that’s all!
<br />
Now you can sit down and check your cron jobs.</p>
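<p>Since jobs are plain rows, adjusting one afterwards is ordinary SQL. The following is only a sketch: the job identifier <code class="language-plaintext highlighter-rouge">1</code> comes from the output above, while the new schedule is a made-up value used for illustration.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- change the schedule of job 1 (hypothetical new schedule: every quarter of an hour)
UPDATE cron.job
   SET schedule = '0,15,30,45 * * * *'
 WHERE jobid = 1;

-- temporarily disable the job without deleting it
UPDATE cron.job
   SET active = false
 WHERE jobid = 1;

-- remove the job entirely, by its identifier
SELECT cron.unschedule(1);
</code></pre></div></div>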
<h2 id="pg_cron-logging"><code class="language-plaintext highlighter-rouge">pg_cron</code> logging</h2>
<p>Things never work the first time! In case you need to investigate, consider that <code class="language-plaintext highlighter-rouge">pg_cron</code> logs at the <code class="language-plaintext highlighter-rouge">LOG</code> level and emits a line when a job begins and when it ends. A successfully executed job prints log lines such as</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LOG: cron job 1 starting: SELECT public.f_pull<span class="o">(</span><span class="s1">'pg_cron import'</span><span class="o">)</span><span class="p">;</span>
LOG: cron job 1 completed: 1 row
</code></pre></div></div>
<p>while a failing job prints lines as</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LOG: cron job 1 starting: SELECT public.f_pull<span class="o">(</span><span class="s1">'pg_cron import'</span><span class="o">)</span><span class="p">;</span>
LOG: cron job 1 connection failed
</code></pre></div></div>
<p>Problems often arise due to connection permissions or grants, so double check that your cron user is really able to do what you expect it to do. In my case it was really simple because I was already using a <code class="language-plaintext highlighter-rouge">cron(1)</code> job, so the user already had the privileges it needed.</p>
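<p>Such checks can be done from SQL with the built-in privilege inquiry functions. This is a sketch under assumptions: the signature <code class="language-plaintext highlighter-rouge">f_pull(text)</code> is guessed from the call shown above, and since a superuser such as <code class="language-plaintext highlighter-rouge">postgres</code> passes every check, the queries are mostly useful when the job runs as a less privileged role.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- can the cron user connect to the target database?
SELECT has_database_privilege('postgres', 'mydb', 'CONNECT');

-- can the cron user execute the scheduled function?
-- (the argument type text is an assumption, adjust it to the real signature)
SELECT has_function_privilege('postgres', 'public.f_pull(text)', 'EXECUTE');
</code></pre></div></div>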
<h1 id="conclusions">Conclusions</h1>
<p><code class="language-plaintext highlighter-rouge">pg_cron</code> is an awesome tool to keep in your toolbag because it makes it really easy to migrate from <code class="language-plaintext highlighter-rouge">cron(1)</code> to <code class="language-plaintext highlighter-rouge">pg_cron</code> (and back!). Moreover, being an extension, it makes all schedule configuration available within the database, and <strong>since <code class="language-plaintext highlighter-rouge">cron.job</code> is marked for inclusion in backups</strong> by the extension installation script, this means you will get scheduler backups <em>for free!</em></p>
The role of a role within another role2019-05-09T00:00:00+00:00https://fluca1978.github.io/2019/05/09/PostgreSQL_Roles<p>A recursive title for a kind of recursive topic: what does it really mean to have a role inside another one? This article tries to lay out some basic knowledge about it.</p>
<h1 id="the-role-of-a-role-within-another-role">The role of a role within another role</h1>
<p>After reading the very <a href="https://www.cybertec-postgresql.com/en/postgresql-using-create-user-with-caution/">excellent article by Hans-Jürgen Schönig</a> about roles, I decided to provide my own vision about <em>users, groups and the more abstract role concept</em>.</p>
<h2 id="the-word-role">The word <em>role</em></h2>
<p>First of all, the word <code class="language-plaintext highlighter-rouge">role</code> has little to do with PostgreSQL: it is a word used in the SQL standard, so don’t blame our favourite database for using the same word to express different concepts like <em>user</em> and <em>group</em>.</p>
<h2 id="roles-are-they-users-or-groups">Roles: are they users or groups?</h2>
<p>The wrong part of the question is <em>or</em>: <strong>roles are both users and groups</strong>. Period.
A role is a stereotype, an abstraction for saying <strong>a collection of permissions to do some stuff</strong>. Now, often a collection of permissions is granted to a user, and therefore a role smells like a user account, but in my opinion this is just a coincidence. And in fact, as in the best system administration tradition, when you have to assign a collection of permissions to more than one user you need a group; roles can therefore smell like a group.
<br />
Remember: roles are collections of permissions, and what makes them smell like a group or a user is just <strong>the way you use them</strong>. If you use a role for a single user, then it is fine to think of the role as a user account. If you use the role for more than one user, then it is fine to think of the role as a group.
<br />
Now, if you think this is trivial and simple, consider that a role can smell like a user and a group at the same time. <strong>A role is a representative of a collection of permissions</strong> and therefore can be something assigned to a single user, to a group (multiple users) or both. Somehow, it is like the chief of a company: he acts at the same time as an employee and as an employer, as well as a representative of the company itself.</p>
<h2 id="enough-lets-see-something">Enough, let’s see something!</h2>
<p>Consider a very simple example: a school with a <code class="language-plaintext highlighter-rouge">schoolars</code> table that can be written only by <code class="language-plaintext highlighter-rouge">professors</code> and read by <code class="language-plaintext highlighter-rouge">students</code>: as you can imagine, both <code class="language-plaintext highlighter-rouge">professors</code> and <code class="language-plaintext highlighter-rouge">students</code> will be groups of permissions.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">ROLE</span> <span class="n">professors</span> <span class="k">WITH</span> <span class="n">LOGIN</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">ROLE</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">ROLE</span> <span class="n">students</span> <span class="k">WITH</span> <span class="n">LOGIN</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">ROLE</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">REVOKE</span> <span class="k">ALL</span> <span class="k">ON</span> <span class="n">schoolars</span> <span class="k">FROM</span> <span class="k">PUBLIC</span><span class="p">;</span>
<span class="k">REVOKE</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">GRANT</span> <span class="k">ALL</span> <span class="k">ON</span> <span class="n">schoolars</span> <span class="k">TO</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">GRANT</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">GRANT</span> <span class="k">SELECT</span> <span class="k">ON</span> <span class="n">schoolars</span> <span class="k">TO</span> <span class="n">students</span><span class="p">;</span>
<span class="k">GRANT</span>
</code></pre></div></div>
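<p>The statements above assume the <code class="language-plaintext highlighter-rouge">schoolars</code> table already exists; its definition is not shown in this article. A minimal sketch consistent with the outputs that follow could look like the one below. The identity column is an assumption of this sketch: with a <code class="language-plaintext highlighter-rouge">serial</code> column, the inserting roles would also need <code class="language-plaintext highlighter-rouge">USAGE</code> on the underlying sequence, which an identity column avoids to my understanding.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- hypothetical definition, created by a superuser before the GRANTs above
CREATE TABLE schoolars (
    pk   int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL
);

-- the two rows visible in the first query output
INSERT INTO schoolars(name)
VALUES ('Harry Potter'), ('Luca Ferrari');
</code></pre></div></div>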
<p>Anybody playing the <code class="language-plaintext highlighter-rouge">professors</code> role can do whatever he wants against the table:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">professors</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">TABLE</span> <span class="n">schoolars</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">name</span>
<span class="c1">----|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">Harry</span> <span class="n">Potter</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">Luca</span> <span class="n">Ferrari</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="k">VALUES</span><span class="p">(</span><span class="s1">'Ron Weasly'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
</code></pre></div></div>
<p>but anybody playing the <code class="language-plaintext highlighter-rouge">students</code> role cannot:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">students</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">TABLE</span> <span class="n">schoolars</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">name</span>
<span class="c1">----|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">Harry</span> <span class="n">Potter</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">Luca</span> <span class="n">Ferrari</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">Ron</span> <span class="n">Weasly</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="k">VALUES</span><span class="p">(</span><span class="s1">'Rubeus Hagrid'</span><span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">schoolars</span>
</code></pre></div></div>
<p>So far, so good! But our groups are not very useful so far, they act as single accounts. Let’s create a professor and add it to the <code class="language-plaintext highlighter-rouge">professors</code> group and see what happens:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">ROLE</span> <span class="n">severus</span>
<span class="k">WITH</span> <span class="n">LOGIN</span>
<span class="k">IN</span> <span class="k">ROLE</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">ROLE</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">IN ROLE professors</code> clause makes the role <code class="language-plaintext highlighter-rouge">severus</code> a member of the <code class="language-plaintext highlighter-rouge">professors</code> group, and so we would expect it to be able to do whatever the group can do:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">severus</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">TABLE</span> <span class="n">schoolars</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">name</span>
<span class="c1">----|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">Harry</span> <span class="n">Potter</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">Luca</span> <span class="n">Ferrari</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">Ron</span> <span class="n">Weasly</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="k">VALUES</span><span class="p">(</span><span class="s1">'Drako Malfoy'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
</code></pre></div></div>
<p>So far so good, again!
However, the above example worked as expected because of the <strong>default INHERIT behavior</strong> as <a href="https://www.postgresql.org/docs/11/sql-createrole.html">clearly stated in the documentation</a>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The INHERIT attribute is the default for reasons of backwards compatibility:
in prior releases of PostgreSQL, users always had access to all privileges
of groups they were members of.
However, NOINHERIT provides a closer match to the
semantics specified in the SQL standard.
</code></pre></div></div>
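<p>Membership and the inheritance flag can be inspected in the system catalogs. A small sketch, joining <code class="language-plaintext highlighter-rouge">pg_auth_members</code> with <code class="language-plaintext highlighter-rouge">pg_roles</code> to list which role belongs to which group:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- one row per membership: the member, the group it belongs to,
-- and whether the member role has the INHERIT attribute
SELECT m.rolname    AS member,
       g.rolname    AS group_role,
       m.rolinherit AS member_inherits
  FROM pg_auth_members am
  JOIN pg_roles m ON m.oid = am.member
  JOIN pg_roles g ON g.oid = am.roleid
 ORDER BY g.rolname, m.rolname;
</code></pre></div></div>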
<h3 id="role-inheritance">Role inheritance</h3>
<p>When a role is <em>attached</em> to another role, and therefore is a member of the latter as if it were a group, PostgreSQL automatically applies the <code class="language-plaintext highlighter-rouge">INHERIT</code> property of <code class="language-plaintext highlighter-rouge">CREATE ROLE</code>. This property states that all permissions of the group the role is going to be a member of will be passed on to the member itself. In the above example, it means that <code class="language-plaintext highlighter-rouge">severus</code> has all the permissions of <code class="language-plaintext highlighter-rouge">professors</code> for free.
<br />
But what happens if the role has been created without inheritance?</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">ROLE</span> <span class="n">severus</span> <span class="k">WITH</span> <span class="n">LOGIN</span> <span class="k">IN</span> <span class="k">ROLE</span> <span class="n">professors</span> <span class="n">NOINHERIT</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">ROLE</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">severus</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">TABLE</span> <span class="n">schoolars</span><span class="p">;</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">schoolars</span>
</code></pre></div></div>
<p><strong>The role still owns all the permissions, but it explicitly needs to state which set of permissions must be applied</strong> and this is done via a <code class="language-plaintext highlighter-rouge">SET ROLE</code> command:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SET</span> <span class="k">ROLE</span> <span class="k">TO</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">professors</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">TABLE</span> <span class="n">schoolars</span><span class="p">;</span>
<span class="n">pk</span> <span class="o">|</span> <span class="n">name</span>
<span class="c1">----|--------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">Harry</span> <span class="n">Potter</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">Luca</span> <span class="n">Ferrari</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">Ron</span> <span class="n">Weasly</span>
<span class="mi">5</span> <span class="o">|</span> <span class="n">Drako</span> <span class="n">Malfoy</span>
<span class="p">(</span><span class="mi">4</span> <span class="k">rows</span><span class="p">)</span>
</code></pre></div></div>
<p>It is like the role <code class="language-plaintext highlighter-rouge">severus</code> is allowed to become another user, like with system command <code class="language-plaintext highlighter-rouge">sudo(1)</code>, but explicitly needs to become such user.
In the case of <code class="language-plaintext highlighter-rouge">INHERIT</code> instead (the default behavior), all permissions are automatically granted.</p>
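<p>As with <code class="language-plaintext highlighter-rouge">sudo(1)</code>, it helps to know who you really are and how to go back. A short sketch, assuming a session opened as <code class="language-plaintext highlighter-rouge">severus</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- become the group role: privilege checks now use professors
SET ROLE professors;

-- session_user is the role that logged in,
-- current_user is the role whose privileges are in effect
SELECT session_user, current_user;

-- drop the borrowed identity and go back to the login role
RESET ROLE;
SELECT session_user, current_user;
</code></pre></div></div>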
<h3 id="dynamic-behvaior">Dynamic behavior</h3>
<p>Let’s add another professor, say <code class="language-plaintext highlighter-rouge">albus</code>, so that we will have <code class="language-plaintext highlighter-rouge">albus</code> that inherits from <code class="language-plaintext highlighter-rouge">professors</code> and <code class="language-plaintext highlighter-rouge">severus</code> who does not, but before that remove the <code class="language-plaintext highlighter-rouge">INSERT</code> permission from the <code class="language-plaintext highlighter-rouge">professors</code> group:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">REVOKE</span> <span class="k">INSERT</span>
<span class="k">ON</span> <span class="n">schoolars</span>
<span class="k">FROM</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">REVOKE</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">ROLE</span> <span class="n">albus</span>
<span class="k">WITH</span> <span class="n">LOGIN</span>
<span class="k">IN</span> <span class="k">ROLE</span> <span class="n">professors</span>
<span class="n">INHERIT</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">ROLE</span>
</code></pre></div></div>
<p>Let’s see what this means at run-time:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">severus</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="k">VALUES</span><span class="p">(</span><span class="s1">'Lord Voldemort'</span><span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">schoolars</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SET</span> <span class="k">ROLE</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="k">VALUES</span><span class="p">(</span><span class="s1">'Lord Voldemort'</span><span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">schoolars</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">albus</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="k">VALUES</span><span class="p">(</span><span class="s1">'Lord Voldemort'</span><span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">schoolars</span>
</code></pre></div></div>
<p>Neither <code class="language-plaintext highlighter-rouge">albus</code> nor <code class="language-plaintext highlighter-rouge">severus</code> can insert a new tuple anymore, as we would expect.
Now let’s add again the <code class="language-plaintext highlighter-rouge">INSERT</code> permission to <code class="language-plaintext highlighter-rouge">professors</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">GRANT</span> <span class="k">INSERT</span>
<span class="k">ON</span> <span class="n">schoolars</span>
<span class="k">TO</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">GRANT</span>
</code></pre></div></div>
<p>Let’s see how both <code class="language-plaintext highlighter-rouge">severus</code> and <code class="language-plaintext highlighter-rouge">albus</code> can now perform an evil insert:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">severus</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="k">VALUES</span><span class="p">(</span><span class="s1">'Lord Voldemort'</span><span class="p">);</span>
<span class="n">ERROR</span><span class="p">:</span> <span class="n">permission</span> <span class="n">denied</span> <span class="k">for</span> <span class="k">table</span> <span class="n">schoolars</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">SET</span> <span class="k">ROLE</span> <span class="n">professors</span><span class="p">;</span>
<span class="k">SET</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="k">VALUES</span><span class="p">(</span><span class="s1">'Lord Voldemort'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
</code></pre></div></div>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">SELECT</span> <span class="k">CURRENT_USER</span><span class="p">;</span>
<span class="k">current_user</span>
<span class="c1">--------------</span>
<span class="n">albus</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="n">testdb</span><span class="o">=></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">schoolars</span><span class="p">(</span><span class="n">name</span><span class="p">)</span>
<span class="k">VALUES</span><span class="p">(</span><span class="s1">'Lord Voldemort'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="mi">0</span> <span class="mi">1</span>
</code></pre></div></div>
<p>Did you spot the difference? <strong><code class="language-plaintext highlighter-rouge">INHERIT</code> means that the privileges of the group are immediately available to the member role at run-time, while without inheritance the member must first become the group role (via <code class="language-plaintext highlighter-rouge">SET ROLE</code>) to exploit those privileges</strong>.</p>
<h2 id="summary">Summary</h2>
<p>So what is it all about? When you create a role you can assign it to another role, that is, make it a member of a group. The privileges of such a group must be enabled explicitly with a <code class="language-plaintext highlighter-rouge">SET ROLE</code> or, in the case of <code class="language-plaintext highlighter-rouge">INHERIT</code>, they are all granted to the final user automatically.
Remember: a role is just a collection of privileges, and nesting a role into another <em>merges</em> all the privileges, either flattening them (<code class="language-plaintext highlighter-rouge">INHERIT</code>) or keeping them separated (<code class="language-plaintext highlighter-rouge">NOINHERIT</code>).</p>
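<p>The whole scenario can be condensed into a minimal sketch. The <code class="language-plaintext highlighter-rouge">CREATE ROLE</code> statements below are an assumption reconstructed from the behavior seen above (<code class="language-plaintext highlighter-rouge">albus</code> inherits, <code class="language-plaintext highlighter-rouge">severus</code> does not), so adapt them to your own setup:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Assumed setup: professors is the group, albus and severus are its members.
CREATE ROLE professors NOLOGIN;
CREATE ROLE albus   LOGIN INHERIT   IN ROLE professors;  -- gets the group privileges automatically
CREATE ROLE severus LOGIN NOINHERIT IN ROLE professors;  -- must issue SET ROLE professors first

-- The privilege is granted only to the group, never to the single users.
GRANT INSERT ON schoolars TO professors;

-- Check which member has the inherit flag turned on.
SELECT rolname, rolinherit
FROM pg_roles
WHERE rolname IN ( 'albus', 'severus' );
</code></pre></div></div>
<p>With this setup <code class="language-plaintext highlighter-rouge">albus</code> can insert directly, while <code class="language-plaintext highlighter-rouge">severus</code> has to run <code class="language-plaintext highlighter-rouge">SET ROLE professors</code> before the same statement succeeds.</p>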
An article about pgenv2019-04-17T00:00:00+00:00https://fluca1978.github.io/2019/04/17/PGDay_it_Haikin9<p>A few months ago I worked to improve the great <code class="language-plaintext highlighter-rouge">pgenv</code> tool by <em>theory</em>. Today, I try to spread the word in the hope this tool can grow a little more!</p>
<h1 id="an-article-about-pgenv">An article about pgenv</h1>
<h3 id="tldr">tl;dr</h3>
<p>I proposed a talk about <a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a>, a Bash tool to manage several <a href="http://www.postgresql.org">PostgreSQL</a> instances on the same local machine, to the Italian PGDay 2019.
<br />
<em>My talk has been rejected</em>, and I hate to waste what I have already prepared, so <em>I decided to turn my talk into an article, which was quickly accepted for the <a href="https://hakin9.org/product/practical-devops/">Hakin9 DevOps issue</a></em>!</p>
<p><br />
<br />
I should have written about this a couple of months ago, but I did not have the time.
<br />
My hope is that <a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a> gets more and more users, so that it can grow and someday become a widely used tool. Quite frankly, I don’t see this happening while it is written in Bash, for reasons of both portability and flexibility, and I suspect Perl would be a better language for a more flexible implementation. However, who knows? Gathering users is also a way to gather contributors and therefore bring new ideas to this small but very useful project.
<br />
<br />
In the meantime, if you have the time and the will, try testing the <a href="https://github.com/theory/pgenv/pull/26">build from git</a> patch, which allows you to build and manage a development version of our beloved database.</p>
Estimating row count from explain output...in Perl!2019-04-04T00:00:00+00:00https://fluca1978.github.io/2019/04/04/plperl_row_estimate<p>After having read the interesting post by <a href="https://www.cybertec-postgresql.com/en/count-made-fast/#comment-4409216441">Laurenz Albe</a> on how to use <code class="language-plaintext highlighter-rouge">EXPLAIN</code> to get a quick estimate of a query count, I decided to implement the same feature in Perl.</p>
<h1 id="estimating-row-count-from-explain-outputin-perl">Estimating row count from explain output…in Perl!</h1>
<p>At the end of his blog post, <a href="https://www.cybertec-postgresql.com/en/count-made-fast/#comment-4409216441">Laurenz Albe</a> shows how to use a <em>quick and dirty</em> function to estimate the number of rows returned by an arbitrary query.
<br />
<br />
While I don’t believe it is often a good idea to judge the size of a query by the optimizer’s guesses, the approach is interesting. Laurenz shows how to exploit the <code class="language-plaintext highlighter-rouge">JSON</code> format and query facilities to extract data from the <code class="language-plaintext highlighter-rouge">EXPLAIN</code> output; why not use Perl to crunch the textual data?</p>
<p>So <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/functions/explain_row_estimate.plperl.sql">here is</a> a simple implementation to extract the estimate within Perl:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">plperl_row_estimate</span><span class="p">(</span> <span class="nv">query</span> <span class="nv">text</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">BIGINT</span>
<span class="nv">AS</span> <span class="nv">$PERL</span><span class="err">$</span>
<span class="k">my</span> <span class="p">(</span> <span class="nv">$query</span> <span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span> <span class="k">if</span> <span class="p">(</span> <span class="o">!</span> <span class="nv">$query</span> <span class="p">);</span>
<span class="nv">$query</span> <span class="o">=</span> <span class="nb">sprintf</span> <span class="p">"</span><span class="s2">EXPLAIN (FORMAT YAML) %s</span><span class="p">",</span> <span class="nv">$query</span><span class="p">;</span>
<span class="nv">elog</span><span class="p">(</span> <span class="nv">DEBUG</span><span class="p">,</span> <span class="p">"</span><span class="s2">Estimating from [</span><span class="si">$query</span><span class="s2">]</span><span class="p">"</span> <span class="p">);</span>
<span class="k">my</span> <span class="nv">@estimated_rows</span> <span class="o">=</span> <span class="nb">map</span> <span class="p">{</span> <span class="sr">s/Plan Rows:\s+(\d+)$/$1/</span><span class="p">;</span> <span class="vg">$_</span> <span class="p">}</span>
<span class="nb">grep</span> <span class="p">{</span> <span class="vg">$_</span> <span class="o">=~</span> <span class="sr">/Plan Rows:/</span> <span class="p">}</span>
<span class="nb">split</span><span class="p">(</span> <span class="p">"</span><span class="se">\n</span><span class="p">",</span> <span class="nv">spi_exec_query</span><span class="p">(</span> <span class="nv">$query</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="nv">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="p">"</span><span class="s2">QUERY PLAN</span><span class="p">"</span> <span class="p">}</span> <span class="p">);</span>
<span class="k">return</span> <span class="mi">0</span> <span class="k">if</span> <span class="p">(</span> <span class="o">!</span> <span class="nv">@estimated_rows</span> <span class="p">);</span>
<span class="k">return</span> <span class="nv">$estimated_rows</span><span class="p">[</span> <span class="mi">0</span> <span class="p">];</span>
<span class="nv">$PERL</span><span class="err">$</span>
<span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p>Let’s see an example in action:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=></span> <span class="k">select</span> <span class="n">plperl_row_estimate</span><span class="p">(</span> <span class="s1">'SELECT p.* FROM persona p JOIN persona k on k.pk = p.pk WHERE k.eta = 40'</span> <span class="p">);</span>
<span class="n">plperl_row_estimate</span>
<span class="c1">---------------------</span>
<span class="mi">69500</span>
</code></pre></div></div>
<p>How does the function work? The main trick is at this point in the code:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">my</span> <span class="nv">@estimated_rows</span> <span class="o">=</span> <span class="nb">map</span> <span class="p">{</span> <span class="sr">s/Plan Rows:\s+(\d+)$/$1/</span><span class="p">;</span> <span class="vg">$_</span> <span class="p">}</span>
<span class="nb">grep</span> <span class="p">{</span> <span class="vg">$_</span> <span class="o">=~</span> <span class="sr">/Plan Rows:/</span> <span class="p">}</span>
<span class="nb">split</span><span class="p">(</span> <span class="p">"</span><span class="se">\n</span><span class="p">",</span> <span class="nv">spi_exec_query</span><span class="p">(</span> <span class="nv">$query</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="nv">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="p">"</span><span class="s2">QUERY PLAN</span><span class="p">"</span> <span class="p">}</span> <span class="p">);</span>
</code></pre></div></div>
<p>where an <code class="language-plaintext highlighter-rouge">EXPLAIN</code> is executed through <code class="language-plaintext highlighter-rouge">spi_exec_query</code> and its output, in <code class="language-plaintext highlighter-rouge">YAML</code> format, is split into an array of strings, one entry per line. This array is then passed to <code class="language-plaintext highlighter-rouge">grep</code> to keep only the lines that contain a row estimate. Last, <code class="language-plaintext highlighter-rouge">map</code> extracts the numeric value from those lines.
<br />
<br />
After that, the <code class="language-plaintext highlighter-rouge">@estimated_rows</code> array holds one entry per plan node, each containing the row estimate of that node, with the outermost node at the beginning of the array. Only this first entry is returned by the function; all the others are discarded.
<br />
<br />
As a final note, please consider that this function accepts an arbitrary piece of text and tries to execute it as a query, <em>therefore it must be used carefully to avoid SQL injection and similar problems</em>.</p>
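<p>For comparison, the <code class="language-plaintext highlighter-rouge">JSON</code>-based idea mentioned at the beginning can be sketched in PL/pgSQL as follows. This is only my rendition of the approach, not necessarily identical to the function in Laurenz’s post, and the name <code class="language-plaintext highlighter-rouge">plpgsql_row_estimate</code> is made up for the example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE OR REPLACE FUNCTION plpgsql_row_estimate( query text )
RETURNS BIGINT
AS $CODE$
DECLARE
  plan json;
BEGIN
  -- same guard as the Perl version: nothing to estimate without a query
  IF query IS NULL OR query = '' THEN
    RETURN 0;
  END IF;

  -- EXPLAIN (FORMAT JSON) returns a single row holding a JSON array
  EXECUTE 'EXPLAIN (FORMAT JSON) ' || query INTO plan;

  -- the first array element describes the outermost plan node
  RETURN ( plan->0->'Plan'->>'Plan Rows' )::bigint;
END
$CODE$
LANGUAGE plpgsql;
</code></pre></div></div>
<p>Here the structured output makes the outermost node directly addressable, so no line filtering is needed; the same warning about executing arbitrary text applies.</p>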
psql.it Mailing List is Back!2019-03-25T00:00:00+00:00https://fluca1978.github.io/2019/03/25/psql.it_mailing_list<p>The historical mailing list of the Italian <code class="language-plaintext highlighter-rouge">psql.it</code> group has been successfully migrated!</p>
<h1 id="psqlit-mailing-list-is-back"><code class="language-plaintext highlighter-rouge">psql.it</code> Mailing List is Back!</h1>
<p>Thanks to the great work of the people behind the <em>psql.it</em> Italian group, the <strong>first</strong> (and for many years the only) Italian-language mailing list has been migrated to a new platform and is now online again!
<br />
<br />
On this mailing list you can find a few very talented people willing to help with your PostgreSQL-related problem or curiosity, to discuss the current status and the future of the development, and anything else you would expect from a very technical mailing list. <em>Of course, the language is Italian!</em>
<br />
<br />
The link to the new mailing list management panel is <a href="https://www.freelists.org/list/postgresql-it">https://www.freelists.org/list/postgresql-it</a>.
<br />
Enjoy!</p>
Running pgbackrest on FreeBSD2019-03-04T00:00:00+00:00https://fluca1978.github.io/2019/03/04/pgbackrest_FreeBSD<p>I tend to use FreeBSD as my PostgreSQL base machine, and it is not always as simple as it sounds to get software running on it. In this post I take some notes on running <code class="language-plaintext highlighter-rouge">pgbackrest</code> on FreeBSD 12.</p>
<h1 id="running-pgbackrest-on-freebsd">Running pgbackrest on FreeBSD</h1>
<p><a href="https://pgbackrest.org/index.html">pgbackrest</a> is an amazing tool for backup and recovery of a PostgreSQL database. However, and this is not a criticism at all, it has some Linux-isms that make it difficult to run on FreeBSD.
I tried to install and run it on FreeBSD 12, and got stuck immediately at the compilation step. So <a href="https://github.com/pgbackrest/pgbackrest/issues/686">I opened an issue</a> to get some help, and then tried to experiment a little more to see if at least I could compile.</p>
<p>The first attempt was to cross-compile: I created the executable (<code class="language-plaintext highlighter-rouge">pgbackrest</code> has a single executable) on a Linux machine, then moved it to the FreeBSD machine along with all the libraries reported by <code class="language-plaintext highlighter-rouge">ldd</code> (placed into <code class="language-plaintext highlighter-rouge">/compat/linux/lib64</code>). But <code class="language-plaintext highlighter-rouge">libpthread.so.0</code> prevented me from starting the command:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% ./pgbackrest
./pgbackrest: error <span class="k">while </span>loading shared libraries: libpthread.so.0:
cannot open shared object file: No such file or directory
</code></pre></div></div>
<p>So I switched back to native compilation and, as described in the <a href="https://github.com/pgbackrest/pgbackrest/issues/686">issue</a>, I made a few small changes to <code class="language-plaintext highlighter-rouge">client.c</code> and the <code class="language-plaintext highlighter-rouge">Makefile</code>. Since it compiled (using of course <code class="language-plaintext highlighter-rouge">gmake</code>), I also made a few more changes to the <code class="language-plaintext highlighter-rouge">Makefile</code> to compile and install it the FreeBSD way (i.e., under <code class="language-plaintext highlighter-rouge">/usr/local/bin</code>). The full diff is the following (some changes are not shown in the issue):</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% git diff
diff <span class="nt">--git</span> a/src/Makefile b/src/Makefile
index 73672bff..0472c7f1 100644
<span class="nt">---</span> a/src/Makefile
+++ b/src/Makefile
@@ <span class="nt">-8</span>,7 +8,7 @@
<span class="nv">CC</span><span class="o">=</span>gcc
<span class="c"># Compile using C99 and Posix 2001 standards (also _DARWIN_C_SOURCE for MacOS)</span>
<span class="nt">-CSTD</span> <span class="o">=</span> <span class="nt">-std</span><span class="o">=</span>c99 <span class="nt">-D_POSIX_C_SOURCE</span><span class="o">=</span>200112L <span class="nt">-D_DARWIN_C_SOURCE</span>
+CSTD <span class="o">=</span> <span class="nt">-std</span><span class="o">=</span>c99
<span class="c"># Compile optimizations</span>
COPT <span class="o">=</span> <span class="nt">-O2</span>
@@ <span class="nt">-51</span>,7 +51,7 @@ LDFLAGS <span class="o">=</span> <span class="nt">-lcrypto</span> <span class="nt">-lssl</span> <span class="nt">-lxml2</span> <span class="nt">-lz</span> <span class="si">$(</span>LDPERL<span class="si">)</span> <span class="si">$(</span>LDEXTRA<span class="si">)</span>
<span class="c"># Install options</span>
<span class="c">####################################################################################################################################</span>
<span class="c"># Modify destination install directory</span>
<span class="nt">-DESTDIR</span> <span class="o">=</span>
+DESTDIR <span class="o">=</span> /usr/local/
<span class="c">####################################################################################################################################</span>
<span class="c"># List of required source files. main.c should always be listed last and the rest in alpha order.</span>
@@ <span class="nt">-175</span>,8 +175,8 @@ pgbackrest: <span class="si">$(</span>OBJS<span class="si">)</span>
<span class="c"># Installation. DESTDIR can be used to modify the install location.</span>
<span class="c">####################################################################################################################################</span>
<span class="nb">install</span>: pgbackrest
- <span class="nb">install</span> <span class="nt">-d</span> <span class="si">$(</span>DESTDIR<span class="si">)</span>/usr/bin
- <span class="nb">install</span> <span class="nt">-m</span> 755 pgbackrest <span class="si">$(</span>DESTDIR<span class="si">)</span>/usr/bin
+ <span class="nb">install</span> <span class="nt">-d</span> <span class="si">$(</span>DESTDIR<span class="si">)</span>bin
+ <span class="nb">install</span> <span class="nt">-m</span> 755 pgbackrest <span class="si">$(</span>DESTDIR<span class="si">)</span>/bin
<span class="c">####################################################################################################################################</span>
<span class="c"># Compile rules</span>
diff <span class="nt">--git</span> a/src/common/io/tls/client.c b/src/common/io/tls/client.c
index ddddb790..10b1d538 100644
<span class="nt">---</span> a/src/common/io/tls/client.c
+++ b/src/common/io/tls/client.c
@@ <span class="nt">-25</span>,6 +25,7 @@ TLS Client
<span class="c">#include "common/type/keyValue.h"</span>
<span class="c">#include "common/wait.h"</span>
<span class="c">#include "crypto/crypto.h"</span>
+#include <netinet/in.h>
/<span class="k">***********************************************************************************************************************************</span>
Object <span class="nb">type</span>
</code></pre></div></div>
<p>Then, following the FreeBSD software paths, I created <code class="language-plaintext highlighter-rouge">/usr/local/etc/pgbackrest/pgbackrest.conf</code> and proceeded.
So far everything seems to be working, even though, as far as I know, <strong>FreeBSD is not a tested platform, so I’m working at my own risk (and so are you if you are doing the same installation)!</strong></p>
<p>One slightly annoying detail is the configuration file: <code class="language-plaintext highlighter-rouge">pgbackrest</code> defaults to <code class="language-plaintext highlighter-rouge">/etc/pgbackrest/pgbackrest.conf</code>, and this path seems to me to be hardcoded into the <code class="language-plaintext highlighter-rouge">config/parse.c</code> source file:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define PGBACKREST_CONFIG_FILE PROJECT_BIN ".conf"
#define PGBACKREST_CONFIG_ORIG_PATH_FILE "/etc/" PGBACKREST_CONFIG_FILE
</span><span class="n">STRING_STATIC</span><span class="p">(</span><span class="n">PGBACKREST_CONFIG_ORIG_PATH_FILE_STR</span><span class="p">,</span> <span class="n">PGBACKREST_CONFIG_ORIG_PATH_FILE</span><span class="p">);</span>
</code></pre></div></div>
<p>or at least I don’t see any convenient way to change this behavior. The problem is that having to specify the FreeBSD-style configuration file <code class="language-plaintext highlighter-rouge">/usr/local/etc/pgbackrest/pgbackrest.conf</code> is not only annoying, but can also cause weird errors, most notably an apparently unrelated error like</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>option pg1-path must be specified when relative wal paths are used
</code></pre></div></div>
<p>because the <code class="language-plaintext highlighter-rouge">archive_command</code> specified did not include the same configuration file and <code class="language-plaintext highlighter-rouge">pgbackrest</code> was looking for its default. In other words, ensure that the PostgreSQL instance has something like:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>archive_command <span class="o">=</span> <span class="s1">'/usr/local/bin/pgbackrest
--stanza=main
--config=/usr/local/etc/pgbackrest/pgbackrest.conf
archive-push %p'</span>
</code></pre></div></div>
<p>That made me think that linking the <code class="language-plaintext highlighter-rouge">/usr/local/etc/pgbackrest</code> directory to <code class="language-plaintext highlighter-rouge">/etc/pgbackrest</code> could be, at this point, a good solution to avoid some future mess.</p>
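<p>As a side note, the <code class="language-plaintext highlighter-rouge">archive_command</code> shown above can also be set from an SQL session instead of editing <code class="language-plaintext highlighter-rouge">postgresql.conf</code> by hand. This is only a sketch, assuming a superuser connection and that the stanza name and paths match your installation:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Always pass --config, so that pgbackrest does not fall back to /etc/pgbackrest/pgbackrest.conf
ALTER SYSTEM SET archive_command =
  '/usr/local/bin/pgbackrest --stanza=main --config=/usr/local/etc/pgbackrest/pgbackrest.conf archive-push %p';

-- archive_command only needs a configuration reload, not a restart
SELECT pg_reload_conf();

-- verify what the instance is actually using
SHOW archive_command;
</code></pre></div></div>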
PostgreSQL to Microsoft SQL Server Using TDS Foreign Data Wrapper2019-01-18T00:00:00+00:00https://fluca1978.github.io/2019/01/18/PostgreSQL-TDS-FDW<p>I needed to push data from a Microsoft SQL Server 2005 to our beloved database, so why not use an FDW for the purpose? It has not been as simple as with other FDWs, but it works!</p>
<h1 id="postgresql-to-microsoft-sql-server-using-tds-foreign-data-wrapper">PostgreSQL to Microsoft SQL Server Using TDS Foreign Data Wrapper</h1>
<p>At work I needed to push data out from a Microsoft SQL Server 2005 to a PostgreSQL 11 instance. <em>Foreign Data Wrappers</em> was my first thought! <em>Perl to the rescue</em> was my second, but since I had some time, I decided to investigate the first way first.</p>
<p><br />
The scenario was the following:</p>
<ul>
<li>CentOS 7 machine running PostgreSQL 11: it is not my preferred setup (I do prefer either FreeBSD or Ubuntu), but I have to deal with that;</li>
<li>Microsoft SQL Server 2005 running on a Windows Server &lt;something&gt;, surely not something I like to work with and to which I had to connect via remote desktop (argh!).</li>
</ul>
<h2 id="first-step-get-tds-working">First Step: get TDS working</h2>
<p>After a quick search on the web, I discovered that MSSQL speaks the <strong>Tabular Data Stream</strong> protocol (TDS for short), so I didn’t need to install an ODBC stack on my Linux box. And luckily, there are binaries for CentOS:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">sudo </span>yum <span class="nb">install </span>freetds
<span class="nv">$ </span><span class="nb">sudo </span>yum <span class="nb">install </span>freetds-devel freetds-doc
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">freetds</code> comes along with a <code class="language-plaintext highlighter-rouge">tsql</code> terminal command that is meant to be a diagnostic tool, so nothing as complete as a <code class="language-plaintext highlighter-rouge">psql</code> terminal. <strong>You should really test your connectivity with <code class="language-plaintext highlighter-rouge">tsql</code> before proceeding further</strong>, since it can save you hours of debugging when things do not work.
<br />
Thanks to a pragmatic test with <code class="language-plaintext highlighter-rouge">tsql</code> I discovered that I needed to open port <code class="language-plaintext highlighter-rouge">1433</code> (the default MSSQL port) on our enterprise firewall, and that I had to create a database user on the MSSQL instance (not the right term, I know) and grant read permissions to it. After that, I spent half an hour trying to understand how to send queries from <code class="language-plaintext highlighter-rouge">tsql</code> to MSSQL, because the former uses the special <code class="language-plaintext highlighter-rouge">go</code> keyword to send queries (something like <code class="language-plaintext highlighter-rouge">\g</code> in <code class="language-plaintext highlighter-rouge">psql</code>):</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>tsql <span class="nt">-H</span> 192.168.6.53 <span class="nt">-p</span> 1433 <span class="nt">-U</span> <span class="s1">'fluca1978'</span> <span class="nt">-P</span> <span class="s1">'xxxx'</span>
1> <span class="k">select </span>name from <span class="o">[</span>mydb].[dbo].[people]
2> go
...
</code></pre></div></div>
<p>Once I got the <code class="language-plaintext highlighter-rouge">tsql</code> connection working, it was time to install the foreign data wrapper.</p>
<h2 id="second-step-install-tds_fdw">Second Step: Install <code class="language-plaintext highlighter-rouge">tds_fdw</code></h2>
<p>The foreign data wrapper to connect to MSSQL is an FDW that exploits TDS: <a href="https://github.com/tds-fdw/tds_fdw/">tds_fdw</a>. Unfortunately, the available binaries target older PostgreSQL versions and, moreover, there is <a href="https://github.com/tds-fdw/tds_fdw/issues/192">a problem that prevents compilation against PostgreSQL 11</a>. Luckily there is already a patch, so it is recommended to compile the current HEAD:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git show
commit 3719a995b0ae3fc4c4b390dd8a2820d54b88e18a
Merge: 782883d 3909a44
Author: Geoff Montee <[email protected]>
Date: Thu Jan 3 15:21:14 2019 <span class="nt">-0800</span>
Merge pull request <span class="c">#190 from l-we/patch-1</span>
Update options.c
<span class="nv">$ </span>make
...
<span class="nv">$ </span><span class="nb">sudo </span>make <span class="nb">install</span>
...
/usr/bin/mkdir <span class="nt">-p</span> <span class="s1">'/usr/pgsql-11/lib'</span>
/usr/bin/mkdir <span class="nt">-p</span> <span class="s1">'/usr/pgsql-11/share/extension'</span>
/usr/bin/mkdir <span class="nt">-p</span> <span class="s1">'/usr/pgsql-11/share/extension'</span>
/usr/bin/mkdir <span class="nt">-p</span> <span class="s1">'/usr/pgsql-11/doc/extension'</span>
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 755 tds_fdw.so <span class="s1">'/usr/pgsql-11/lib/tds_fdw.so'</span>
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 644 .//tds_fdw.control <span class="s1">'/usr/pgsql-11/share/extension/'</span>
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 644 .//sql/tds_fdw--2.0.0-alpha.2.sql <span class="s1">'/usr/pgsql-11/share/extension/'</span>
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 644 .//README.tds_fdw.md <span class="s1">'/usr/pgsql-11/doc/extension/'</span>
/usr/bin/mkdir <span class="nt">-p</span> <span class="s1">'/usr/pgsql-11/lib/bitcode/tds_fdw'</span>
/usr/bin/mkdir <span class="nt">-p</span> <span class="s1">'/usr/pgsql-11/lib/bitcode'</span>/tds_fdw/src/
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 644 src/tds_fdw.bc <span class="s1">'/usr/pgsql-11/lib/bitcode'</span>/tds_fdw/src/
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 644 src/options.bc <span class="s1">'/usr/pgsql-11/lib/bitcode'</span>/tds_fdw/src/
/usr/bin/install <span class="nt">-c</span> <span class="nt">-m</span> 644 src/deparse.bc <span class="s1">'/usr/pgsql-11/lib/bitcode'</span>/tds_fdw/src/
<span class="nb">cd</span> <span class="s1">'/usr/pgsql-11/lib/bitcode'</span> <span class="o">&&</span> /usr/lib64/llvm5.0/bin/llvm-lto <span class="nt">-thinlto</span> <span class="nt">-thinlto-action</span><span class="o">=</span>thinlink <span class="nt">-o</span> tds_fdw.index.bc tds_fdw/src/tds_fdw.bc tds_fdw/src/options.bc tds_fdw/src/deparse.bc
</code></pre></div></div>
<h2 id="third-step-use-the-fdw">Third Step: Use the FDW</h2>
<p>With the foreign data wrapper in place, it is now time to create PostgreSQL objects. <strong>I recommend using <code class="language-plaintext highlighter-rouge">notice</code> as a message level because it can provide valuable information about the connection to the foreign server!</strong> In my case, it helped me understand that I needed to grant the <em>showplan</em> permission to my MSSQL user, otherwise the connection would fail with a generic “See the server log”, which made me discover that Microsoft is not very good at logging!</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">tds_fdw</span><span class="p">;</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">SERVER</span>
<span class="n">server_mssql</span>
<span class="k">FOREIGN</span> <span class="k">DATA</span> <span class="n">WRAPPER</span> <span class="n">tds_fdw</span>
<span class="k">OPTIONS</span><span class="p">(</span> <span class="n">servername</span> <span class="s1">'192.168.6.53'</span><span class="p">,</span>
<span class="k">database</span> <span class="s1">'mydb'</span><span class="p">,</span>
<span class="n">msg_handler</span> <span class="s1">'notice'</span><span class="p">,</span> <span class="c1">-- useful!</span>
<span class="n">tds_version</span> <span class="s1">'7.2'</span> <span class="c1">-- MSSQL 2005</span>
<span class="p">);</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">USER</span> <span class="n">MAPPING</span> <span class="k">FOR</span> <span class="n">luca</span>
<span class="n">SERVER</span> <span class="n">server_mssql</span>
<span class="k">OPTIONS</span> <span class="p">(</span>
<span class="n">username</span> <span class="s1">'fluca1978'</span><span class="p">,</span>
<span class="n">password</span> <span class="s1">'xxxx'</span>
<span class="p">);</span>
<span class="n">testdb</span><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span> <span class="n">mssql_people</span>
<span class="p">(</span>
<span class="p">...</span>
<span class="p">)</span>
<span class="n">SERVER</span> <span class="n">server_mssql</span>
<span class="k">OPTIONS</span> <span class="p">(</span>
<span class="k">schema_name</span> <span class="s1">'dbo'</span><span class="p">,</span>
<span class="k">table_name</span> <span class="s1">'people'</span>
<span class="c1">-- , row_estimate_method 'showplan_all' -- requires extra privileges on MSSQL side!</span>
<span class="p">);</span>
</code></pre></div></div>
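<p>Since the final aim is to bring the MSSQL data into PostgreSQL, once the foreign table can be queried the rows can be copied into a regular table. The following is only a sketch: <code class="language-plaintext highlighter-rouge">people_local</code> is a made-up name, and a full reload is assumed to be acceptable:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- First import: create a local table with the same columns as the foreign one
CREATE TABLE people_local AS
SELECT * FROM mssql_people;

-- Later refreshes: reload everything within a single transaction
BEGIN;
TRUNCATE people_local;
INSERT INTO people_local
SELECT * FROM mssql_people;
COMMIT;
</code></pre></div></div>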
<h2 id="final-step-and-here-we-go">Final Step: and here we go!</h2>
<p>If everything goes ok, it is possible to query <code class="language-plaintext highlighter-rouge">mssql_people</code> and get some <code class="language-plaintext highlighter-rouge">notice</code> messages about retrieving data:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">testdb</span><span class="o">=#</span> <span class="k">select</span> <span class="o">*</span> <span class="k">from</span> <span class="n">mssql_people</span> <span class="k">limit</span> <span class="mi">5</span><span class="p">;</span>
<span class="n">NOTIFICA</span><span class="p">:</span> <span class="n">DB</span><span class="o">-</span><span class="n">Library</span> <span class="n">notice</span><span class="p">:</span> <span class="n">Msg</span> <span class="o">#</span><span class="p">:</span> <span class="mi">5701</span><span class="p">,</span> <span class="n">Msg</span> <span class="k">state</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Msg</span><span class="p">:</span> <span class="n">Changed</span> <span class="k">database</span> <span class="n">context</span> <span class="k">to</span> <span class="s1">'mydb'</span><span class="p">.,</span> <span class="n">Server</span><span class="p">:</span> <span class="mi">192</span><span class="p">.</span><span class="mi">168</span><span class="p">.</span><span class="mi">6</span><span class="p">.</span><span class="mi">53</span><span class="p">,</span> <span class="n">Process</span><span class="p">:</span> <span class="p">,</span> <span class="n">Line</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="k">Level</span><span class="p">:</span> <span class="mi">0</span>
<span class="n">NOTIFICA</span><span class="p">:</span> <span class="n">DB</span><span class="o">-</span><span class="n">Library</span> <span class="n">notice</span><span class="p">:</span> <span class="n">Msg</span> <span class="o">#</span><span class="p">:</span> <span class="mi">5703</span><span class="p">,</span> <span class="n">Msg</span> <span class="k">state</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">Msg</span><span class="p">:</span> <span class="n">Changed</span> <span class="k">language</span> <span class="n">setting</span> <span class="k">to</span> <span class="n">us_english</span><span class="p">.,</span> <span class="n">Server</span><span class="p">:</span> <span class="mi">192</span><span class="p">.</span><span class="mi">168</span><span class="p">.</span><span class="mi">6</span><span class="p">.</span><span class="mi">53</span><span class="p">,</span> <span class="n">Process</span><span class="p">:</span> <span class="p">,</span> <span class="n">Line</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="k">Level</span><span class="p">:</span> <span class="mi">0</span>
<span class="n">NOTIFICA</span><span class="p">:</span> <span class="n">tds_fdw</span><span class="p">:</span> <span class="n">Query</span> <span class="n">executed</span> <span class="n">correctly</span>
<span class="n">NOTIFICA</span><span class="p">:</span> <span class="n">tds_fdw</span><span class="p">:</span> <span class="n">Getting</span> <span class="n">results</span>
<span class="n">NOTIFICA</span><span class="p">:</span> <span class="n">DB</span><span class="o">-</span><span class="n">Library</span> <span class="n">notice</span><span class="p">:</span> <span class="n">Msg</span> <span class="o">#</span><span class="p">:</span> <span class="mi">5701</span><span class="p">,</span> <span class="n">Msg</span> <span class="k">state</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Msg</span><span class="p">:</span> <span class="n">Changed</span> <span class="k">database</span> <span class="n">context</span> <span class="k">to</span> <span class="s1">'mydb'</span><span class="p">.,</span> <span class="n">Server</span><span class="p">:</span> <span class="mi">192</span><span class="p">.</span><span class="mi">168</span><span class="p">.</span><span class="mi">6</span><span class="p">.</span><span class="mi">53</span><span class="p">,</span> <span class="n">Process</span><span class="p">:</span> <span class="p">,</span> <span class="n">Line</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="k">Level</span><span class="p">:</span> <span class="mi">0</span>
<span class="n">NOTIFICA</span><span class="p">:</span> <span class="n">DB</span><span class="o">-</span><span class="n">Library</span> <span class="n">notice</span><span class="p">:</span> <span class="n">Msg</span> <span class="o">#</span><span class="p">:</span> <span class="mi">5703</span><span class="p">,</span> <span class="n">Msg</span> <span class="k">state</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="n">Msg</span><span class="p">:</span> <span class="n">Changed</span> <span class="k">language</span> <span class="n">setting</span> <span class="k">to</span> <span class="n">us_english</span><span class="p">.,</span> <span class="n">Server</span><span class="p">:</span> <span class="mi">192</span><span class="p">.</span><span class="mi">168</span><span class="p">.</span><span class="mi">6</span><span class="p">.</span><span class="mi">53</span><span class="p">,</span> <span class="n">Process</span><span class="p">:</span> <span class="p">,</span> <span class="n">Line</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="k">Level</span><span class="p">:</span> <span class="mi">0</span>
<span class="p">...</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">192231</span><span class="p">,</span><span class="mi">473</span> <span class="n">ms</span> <span class="p">(</span><span class="mi">03</span><span class="p">:</span><span class="mi">12</span><span class="p">,</span><span class="mi">231</span><span class="p">)</span>
</code></pre></div></div>
<p>Times are really bad, but this is due to the design of the MSSQL database I’m working with, so I blame neither MSSQL itself nor <code class="language-plaintext highlighter-rouge">tds_fdw</code>: the table is a few gigabytes in size. I really need to investigate some optimizations.</p>
PGVersion: a class to manage PostgreSQL Version (strings) within a Perl 6 Program2018-12-20T00:00:00+00:00https://fluca1978.github.io/2018/12/20/PGVersionPerl6<p>While writing a program in Perl 6 I needed to correctly parse and analyze different PostgreSQL version strings. I wrote a simple and minimal class for that purpose, and refactored it so it can escape into the wild.</p>
<h2 id="pgversion-a-class-to-manage-postgresql-version-strings-within-a-perl-6-program">PGVersion: a class to manage PostgreSQL Version (strings) within a Perl 6 Program</h2>
<p>As you probably already know, PostgreSQL has changed its version numbering scheme from a <code class="language-plaintext highlighter-rouge">major.major.minor</code> approach to a concise <code class="language-plaintext highlighter-rouge">major.minor</code> one. Both are simple enough to be evaluated with a regular expression, but I found myself writing the same logic over and over, so I decided to write a minimal class to do the job for me and provide several pieces of information.
<br />
Oh, and this is Perl 6 (that I’m still learning!).
<br />
The class is named <a href="https://github.com/fluca1978/fluca1978-coding-bits/blob/master/perl6/lib/Fluca1978/Utils/PostgreSQL/PGVersion.pm6"><code class="language-plaintext highlighter-rouge">Fluca1978::Utils::PostgreSQL::PGVersion</code></a> and is released as-is under the BSD License.</p>
<h1 id="quick-show-me-something">Quick, show me something!</h1>
<p>OK, here is how it works:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">Fluca1978::Utils::PostgreSQL::</span><span class="nv">PGVersion</span><span class="p">;</span>
<span class="k">for</span> <span class="o"><</span><span class="mf">10.1</span> <span class="mi">11</span><span class="nv">beta1</span> <span class="mf">11.1</span> <span class="mf">9.6.5</span> <span class="mf">6.11</span><span class="o">></span> <span class="p">{</span>
<span class="k">my</span> <span class="nv">$v</span> <span class="o">=</span> <span class="nv">PGVersion</span><span class="o">.</span><span class="k">new</span><span class="p">:</span> <span class="p">:</span><span class="nv">version</span><span class="o">-</span><span class="nv">string</span><span class="p">(</span> <span class="vg">$_</span> <span class="p">);</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">PostgreSQL version is </span><span class="si">$v</span><span class="p">";</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">or for short { </span><span class="si">$v</span><span class="s2">.gist }</span><span class="p">";</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">and if you want a detailed version:</span><span class="se">\n</span><span class="s2">{ </span><span class="si">$v</span><span class="s2">.Str( True ) }</span><span class="p">";</span>
<span class="nv">say</span> <span class="p">"</span><span class="s2">URL to download: { </span><span class="si">$v</span><span class="s2">.http-download-url }</span><span class="p">";</span>
<span class="nv">say</span> <span class="p">'</span><span class="s1">~~~~</span><span class="p">'</span> <span class="nv">x</span> <span class="mi">10</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The above simple loop provides the following output:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% perl6 <span class="nt">-Ilib</span> usage-example-pgversion.pl
PostgreSQL version is v10.1
or <span class="k">for </span>short 10.1
and <span class="k">if </span>you want a detailed version:
10.1 <span class="o">(</span>Major: 10, Minor: 1, stable<span class="o">)</span>
Equivalent to SHOW SERVER_VERSION_NUM is 100001
URL to download: https://ftp.postgresql.org/pub/source//v10.1/postgressql-10.1.tar.bz2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PostgreSQL version is v11beta1
or <span class="k">for </span>short 11beta1
and <span class="k">if </span>you want a detailed version:
11beta1 <span class="o">(</span>beta 1 of development branch 11<span class="o">)</span>
Equivalent to SHOW SERVER_VERSION_NUM is 119999
URL to download: https://ftp.postgresql.org/pub/source//v11beta1/postgressql-11beta1.tar.bz2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PostgreSQL version is v11.1
or <span class="k">for </span>short 11.1
and <span class="k">if </span>you want a detailed version:
11.1 <span class="o">(</span>Major: 11, Minor: 1, stable<span class="o">)</span>
Equivalent to SHOW SERVER_VERSION_NUM is 110001
URL to download: https://ftp.postgresql.org/pub/source//v11.1/postgressql-11.1.tar.bz2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PostgreSQL version is v9.6.5
or <span class="k">for </span>short 9.6.5
and <span class="k">if </span>you want a detailed version:
9.6.5 <span class="o">(</span>Major: 9.6, Minor: 5, stable<span class="o">)</span>
Equivalent to SHOW SERVER_VERSION_NUM is 090605
URL to download: https://ftp.postgresql.org/pub/source//v9.6.5/postgressql-9.6.5.tar.bz2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PostgreSQL version is v6.11
or <span class="k">for </span>short 6.11
and <span class="k">if </span>you want a detailed version:
6.11 <span class="o">(</span>Major: 6, Minor: 11, stable<span class="o">)</span>
Equivalent to SHOW SERVER_VERSION_NUM is 060011
URL to download: https://ftp.postgresql.org/pub/source//v6.11/postgressql-6.11.tar.gz
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
</code></pre></div></div>
<p>As you can see, every version string is correctly interpreted and printed out, and the major and minor numbering schemes are applied depending on the version number recognized.</p>
<h2 id="from-a-string-to-a-string">From a string to a string</h2>
<p>The class provides three <em>stringify</em> methods:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">.gist</code> provides the numeric part of the version number, e.g., <code class="language-plaintext highlighter-rouge">9.6.5</code>;</li>
<li><code class="language-plaintext highlighter-rouge">.Str</code> places a <code class="language-plaintext highlighter-rouge">v</code> (for version) in front of the numeric part of the string, e.g., <code class="language-plaintext highlighter-rouge">v9.6.5</code>;</li>
<li><code class="language-plaintext highlighter-rouge">.Str( Bool )</code>, when passed a <code class="language-plaintext highlighter-rouge">True</code> value, produces a verbose output explaining the version of PostgreSQL including its alpha or beta status (e.g., <code class="language-plaintext highlighter-rouge">9.6.5 (Major: 9.6, Minor: 5, stable)</code>); when invoked with a <code class="language-plaintext highlighter-rouge">False</code> argument it provides the same output as <code class="language-plaintext highlighter-rouge">.gist</code>.</li>
</ul>
<h2 id="input-strings">Input strings</h2>
<p>The class accepts a named argument for its construction: <code class="language-plaintext highlighter-rouge">version-string</code>. Acceptable strings are those in the form:</p>
<ul>
<li><em>va.b.c</em></li>
<li><em>a.b.c</em></li>
<li><em>va.b</em></li>
<li><em>a.b</em></li>
<li><em>xbetay</em></li>
<li><em>xalfay</em></li>
</ul>
<p>So for example all the following are good strings: <code class="language-plaintext highlighter-rouge">9.6.5</code>, <code class="language-plaintext highlighter-rouge">6.10</code>, <code class="language-plaintext highlighter-rouge">10.1</code>, <code class="language-plaintext highlighter-rouge">11beta3</code>.</p>
<h2 id="main-methods">Main methods</h2>
<p>The <code class="language-plaintext highlighter-rouge">.parse</code> method does the main work: it disassembles the version string used to construct the object into pieces that are then stored in the class's internal attributes as numbers (<code class="language-plaintext highlighter-rouge">Int</code>). You can call <code class="language-plaintext highlighter-rouge">.parse</code> after a version object has been created to change its internal state.</p>
<p><br />
The class provides several methods with easy-to-understand names:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">is-alfa</code>, <code class="language-plaintext highlighter-rouge">is-beta</code> to check if the version string identifies a development branch;</li>
<li><code class="language-plaintext highlighter-rouge">major-number</code>, <code class="language-plaintext highlighter-rouge">minor-number</code> to get the single pieces of a version number. Please note that <code class="language-plaintext highlighter-rouge">.major-number</code> returns a <code class="language-plaintext highlighter-rouge">Str</code> because in the numbering scheme of versions 7 to 9 the major number is made of two numbers separated by a dot. I’m considering using a <code class="language-plaintext highlighter-rouge">Rat</code> to return a numeric value, but I’m not sure it is a good idea. Internally, however, all the data is kept as integers;</li>
<li><code class="language-plaintext highlighter-rouge">server-version</code> and <code class="language-plaintext highlighter-rouge">server-version-num</code> are there to provide the same behavior as a <code class="language-plaintext highlighter-rouge">SHOW</code> issued on a PostgreSQL connection;</li>
<li><code class="language-plaintext highlighter-rouge">http-download-url</code> accepts an optional URL as base and returns the HTTP link to download the specified version of PostgreSQL.</li>
</ul>
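<p>To make the <code class="language-plaintext highlighter-rouge">server-version-num</code> idea concrete outside of Perl 6, here is a minimal shell sketch. It is not the implementation used by the class: the function name <code class="language-plaintext highlighter-rouge">server_version_num</code> is hypothetical, it only handles stable numeric strings (no alfa or beta), and it simply zero-pads the pieces so that the result matches the six-digit values shown in the output above.</p>

```shell
#!/bin/sh
# Hypothetical helper, NOT the PGVersion implementation: turns a stable
# version string (optionally prefixed by "v") into a six-digit number.
# The body runs in a subshell, so the IFS change does not leak out.
server_version_num() (
    v=${1#v}                       # drop an optional leading "v"
    case $v in
        ''|.*|*.|*..*|*[!0-9.]*)   # empty, stray dots, or non-numeric (e.g. 11beta1)
            echo "unsupported version string: $1" >&2
            exit 1 ;;              # leaves the subshell only
    esac
    IFS=.
    set -- $v                      # split on dots into $1, $2, $3
    case $# in
        3) printf '%02d%02d%02d\n' "$1" "$2" "$3" ;;   # a.b.c scheme
        2) printf '%02d%04d\n' "$1" "$2" ;;            # a.b scheme
        *) echo "unsupported version string: $v" >&2
           exit 1 ;;
    esac
)

server_version_num 10.1     # prints 100001
server_version_num v9.6.5   # prints 090605
```

<p>Beta and alfa strings are rejected on purpose here, because mapping them to a number (the class prints <code class="language-plaintext highlighter-rouge">119999</code> for <code class="language-plaintext highlighter-rouge">11beta1</code>) is a design choice of the class that this sketch does not try to reproduce.</p>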
<p>The class also defines the <code class="language-plaintext highlighter-rouge">ACCEPTS</code> method that is used for smart matching, so that you can tell whether two objects are the same, as well as check whether one is older or newer than the other:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">my</span> <span class="nv">$version</span><span class="o">-</span><span class="nv">a</span> <span class="o">=</span> <span class="nv">PGVersion</span><span class="o">.</span><span class="k">new</span><span class="p">:</span> <span class="p">:</span><span class="nv">version</span><span class="o">-</span><span class="nv">string</span><span class="p">(</span> <span class="p">'</span><span class="s1">10.1</span><span class="p">'</span> <span class="p">);</span>
<span class="k">my</span> <span class="nv">$version</span><span class="o">-</span><span class="nv">b</span> <span class="o">=</span> <span class="nv">PGVersion</span><span class="o">.</span><span class="k">new</span><span class="p">:</span> <span class="p">:</span><span class="nv">version</span><span class="o">-</span><span class="nv">string</span><span class="p">(</span> <span class="p">'</span><span class="s1">v9.6.5</span><span class="p">'</span> <span class="p">);</span>
<span class="nv">$version</span><span class="o">-</span><span class="nv">a</span> <span class="o">~~</span> <span class="nv">$version</span><span class="o">-</span><span class="nv">b</span><span class="p">;</span> <span class="c1"># False</span>
<span class="nv">$version</span><span class="o">-</span><span class="nv">a</span><span class="o">.</span><span class="nv">newer:</span> <span class="nv">$version</span><span class="o">-</span><span class="nv">b</span><span class="p">;</span> <span class="c1"># True</span>
<span class="nv">$version</span><span class="o">-</span><span class="nv">b</span><span class="o">.</span><span class="nv">older:</span> <span class="nv">$version</span><span class="o">-</span><span class="nv">a</span><span class="p">;</span> <span class="c1"># True</span>
</code></pre></div></div>
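<p>The same older/newer check can be approximated with a component-by-component numeric comparison. The following shell sketch is only an illustration of that idea, not a port of the class: <code class="language-plaintext highlighter-rouge">pg_newer</code> is a hypothetical name, and it assumes purely numeric version strings with an optional leading <code class="language-plaintext highlighter-rouge">v</code>.</p>

```shell
#!/bin/sh
# Hypothetical helper, NOT part of PGVersion: succeeds (exit status 0)
# when the first version is strictly newer than the second one.
# Only numeric components are supported (no alfa/beta suffixes).
pg_newer() {
    pn_a=${1#v}
    pn_b=${2#v}
    while [ -n "$pn_a" ] || [ -n "$pn_b" ]; do
        pn_x=${pn_a%%.*}           # leftmost component of each version
        pn_y=${pn_b%%.*}
        # a missing component counts as 0, so 10 and 10.0 compare equal
        [ "${pn_x:-0}" -gt "${pn_y:-0}" ] && return 0
        [ "${pn_x:-0}" -lt "${pn_y:-0}" ] && return 1
        # drop the component just compared
        case $pn_a in *.*) pn_a=${pn_a#*.} ;; *) pn_a= ;; esac
        case $pn_b in *.*) pn_b=${pn_b#*.} ;; *) pn_b= ;; esac
    done
    return 1                       # the two versions are equal
}

if pg_newer 10.1 v9.6.5; then
    echo "10.1 is newer than 9.6.5"
fi
```

<p>Comparing left to right works across both numbering schemes because the leftmost number always carries the most weight.</p>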
<p>See the <a href="https://github.com/fluca1978/fluca1978-coding-bits/blob/master/perl6/t/01-pgversion.t">tests</a> for a more comprehensive usage of the instances and of the methods.</p>
<p><br />
<br />
I hope this can be helpful to someone; one day I could collect other utilities and release them as a whole module.</p>
PostgreSQL 11 Server Side Programming - Now Available!2018-12-12T00:00:00+00:00https://fluca1978.github.io/2018/12/12/PG11SSP-2<p>A quick start guide on implementing and deploying code to PostgreSQL.</p>
<h2 id="postgresql-11-server-side-programming---now-available">PostgreSQL 11 Server Side Programming - Now Available!</h2>
<p>Near the end of November, Packt published the book <strong><a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide">PostgreSQL 11 Server Side Programming Quick Start Guide</a></strong>, authored by <em><a href="https://fluca1978.github.io">me</a></em>.
<br />
<br /></p>
<p><a href="https://www.packtpub.com/big-data-and-business-intelligence/postgresql-11-server-side-programming-quick-start-guide"><img src="/images/posts/pg11ssp/cover.png" alt="PostgreSQL-11-ServerSideProgramming-cover-image" /></a></p>
<p><br />
<br />
<em>This post has the only aim of describing the book contents and the reason behind choices related to it.</em>
<br /></p>
<p>Following a consolidated tradition, Packt is producing more and more books on PostgreSQL and related technologies, and this is the first one that covers aspects about the freshly released <em>PostgreSQL 11</em> version.
<br />
<br />
Nevertheless, this does not mean that the book is <em>only</em> for PostgreSQL 11 users and administrators: it covers topics and concepts, and provides examples, that can be used as-is or ported to older versions of PostgreSQL, and probably to newer ones as well. In fact, while the book code examples have been tested against a PostgreSQL 11 cluster, only the examples related to the new object <code class="language-plaintext highlighter-rouge">PROCEDURE</code>, introduced by PostgreSQL 11, are strongly tied to that version.
<br />
<br />
This book is a <em>Quick Start Guide</em>, and therefore it has a very practical approach to a limited scope, and only that. Therefore the book assumes you are able to install, manage and run a PostgreSQL 11 cluster, that you know how to connect and how to handle basic SQL statements. A basic knowledge in general programming is also required.
<br />
<br />
The book consists of 10 chapters, each focusing on a particular aspect of developing and deploying code within a PostgreSQL cluster. The main programming language used in the book is <strong><code class="language-plaintext highlighter-rouge">PL/pgSQL</code></strong>, the default procedural language for PostgreSQL; however, several examples are rewritten using <strong><code class="language-plaintext highlighter-rouge">Perl 5</code></strong> and <strong><code class="language-plaintext highlighter-rouge">Java</code></strong> in order to demonstrate how it is possible to also use other “foreign” languages. In fact, one of the long-standing cool features of PostgreSQL is the ability to run code written in other, non-SQL-based languages directly within the cluster. But not all languages are equal: some of them require a deployment workflow, while others, being script-based, allow you to directly <em>inject</em> the code into the cluster as you type it. Hence the choice of using <code class="language-plaintext highlighter-rouge">PL/Java</code> to show how a deployable language works, as opposed to <code class="language-plaintext highlighter-rouge">PL/Perl</code>, which, being script-based, can be typed directly within a database connection.
<br />
<br />
In more detail, the book chapters are the following:
<br />
<br /></p>
<ul>
<li>
<p><em>Chapter 1</em>, <strong>Introduction to Server Side Programming</strong> presents the basic concepts behind server-side programming, what it means with regard to PostgreSQL, and how this great database supports the paradigm, and shows the example database used throughout the whole book.</p>
</li>
<li>
<p><em>Chapter 2</em>, <strong>Query Tricks</strong> provides hints about advanced SQL statements that can help you solve day-to-day tasks without requiring a stand-alone program. Examples include discovering auto-assigned keys or computed fields (e.g., dates or timestamps) and performing recursion on a dataset.</p>
</li>
<li>
<p><em>Chapter 3</em>, <strong>The PL/pgSQL Language</strong> details the syntax and workflow of a piece of <code class="language-plaintext highlighter-rouge">PL/pgSQL</code> code, the default language in PostgreSQL to write SQL-like statements, iterations and conditionals, and to manage variables, exceptions and other programming constructs. The chapter takes you directly to the language usage via the <code class="language-plaintext highlighter-rouge">DO</code> statement, which allows you to execute code on a database connection directly.</p>
</li>
<li>
<p><em>Chapter 4</em>, <strong>Stored Procedures</strong> tells you how to <em>store</em> the code in the database for later execution and reuse. The chapter details both main types of stored procedures: <code class="language-plaintext highlighter-rouge">FUNCTION</code>s, the old well-known objects, and <code class="language-plaintext highlighter-rouge">PROCEDURE</code>s, the new PostgreSQL 11 objects able to interact with a transaction.</p>
</li>
<li>
<p><em>Chapter 5</em>, <strong>PL/Perl and PL/Java</strong> shows how to implement stored procedures (both <code class="language-plaintext highlighter-rouge">FUNCTION</code>s and <code class="language-plaintext highlighter-rouge">PROCEDURE</code>s) using either Perl 5 or Java. As already stated, the concepts are general enough to apply the implementation to other foreign languages.</p>
</li>
<li>
<p><em>Chapter 6</em>, <strong>Triggers</strong> shows you how to use <code class="language-plaintext highlighter-rouge">FUNCTION</code>s to react to data events, like changes against a table. Both Data Manipulation Triggers (DML Triggers) and Data Definition Triggers (DDL Triggers) are detailed and examples are shown in all the three book languages (<code class="language-plaintext highlighter-rouge">PL/pgSQL</code>, <code class="language-plaintext highlighter-rouge">PL/Perl</code>, <code class="language-plaintext highlighter-rouge">PL/Java</code>).</p>
</li>
<li>
<p><em>Chapter 7</em>, <strong>Rules and the Query Rewriting System</strong> provides hints about the path of a statement and the way it is possible to alter it on the fly to perform statement manipulation even before a trigger fires.</p>
</li>
<li>
<p><em>Chapter 8</em>, <strong>Extensions</strong> tells you how to organize your code in a way PostgreSQL can easily handle, from installation to upgrade. A glance at the PostgreSQL Extension Network (PGXN) and the extension build infrastructure (PGXS) is provided, as well as some practical examples about how to write an extension from scratch or use an already existing extension.</p>
</li>
<li>
<p><em>Chapter 9</em>, <strong>Intra-Process Communications</strong> tells you how PostgreSQL can make two different <em>backend</em> processes communicate via <code class="language-plaintext highlighter-rouge">LISTEN</code> and <code class="language-plaintext highlighter-rouge">NOTIFY</code>, providing examples on how even external applications (and processes) can be notified of events. The chapter then continues showing a skeleton implementation of the <em>Background Workers</em>, custom processes that can be plugged into a cluster and run as part of it.</p>
</li>
<li>
<p><em>Chapter 10</em>, <strong>Custom Data Types</strong> shows you how to extend PostgreSQL's already rich set of data types and create your own, from adding SQL enumerations to implementing a whole custom data type.</p>
</li>
</ul>
<h2 id="code-repository">Code Repository</h2>
<p>The code repository with examples and other information is available on the official <strong><a href="https://github.com/PacktPublishing/PostgreSQL-11-Quick-Start-Guide">GitHub space</a></strong> and is also cloned into my <strong><a href="https://gitlab.com/fluca1978/postgresql-11-quick-start-guide">GitLab repository</a></strong> so feel free to clone it from whatever is more comfortable to you!</p>
<h2 id="people-and-projects-id-like-to-thank">People and Projects I’d like to Thank</h2>
<p>There are several people that helped me or supported me in writing the book, and with regard to PostgreSQL I would like to thank <a href="http://www.enricopirozzi.info/">Enrico Pirozzi</a> for believing in me and encouraging me during the early stages. Thank you Enrico!
<br />
<br />
During the whole process, I used <a href="https://github.com/theory/pgenv">pgenv</a> as a quick manager for installing and re-installing PostgreSQL across machines. Thank you <em>theory</em>!
<br />
<br />
As a side note, the main part of my writing has been done using <em>Emacs and Org-Mode</em>!</p>
pgenv gets patching support2018-10-31T00:00:00+00:00https://fluca1978.github.io/2018/10/31/pgenv_patching<p><code class="language-plaintext highlighter-rouge">pgenv</code> now supports a customizable <em>patching</em> feature that allows the user to define which patches to apply when an instance is built.</p>
<h1 id="pgenv-gets-patching-support">pgenv gets patching support</h1>
<p><a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a>, the useful tool for managing several PostgreSQL installations, gets support for customizable patching.</p>
<p>What is it all about?
Well, it may happen that you need to patch the PostgreSQL source tree before you build, perhaps because something on your operating system is different from the majority of the systems PostgreSQL is built against. Never mind the reason, you need to patch it!</p>
<p><code class="language-plaintext highlighter-rouge">pgenv</code> did support a very simple patching mechanism hardcoded within the program itself, but over the last few days I worked on a different and more customizable approach. The idea is simple: the program will apply every patch file listed in an <em>index</em> for the particular version. So, if you want to build the outshining 11.0 and need to patch it, build an index text file and list all the patches there, and the <code class="language-plaintext highlighter-rouge">pgenv</code> build process will apply them before compiling.</p>
<p>Of course, what if you need to apply the same patches over and over to different versions? You will end up with several indexes, one for each version you need to patch. Uhm…not so smart! To avoid this, I designed the patching index selection in a way that allows you to group patches by operating system and brand.</p>
<p>Allow me to explain more in detail with an example.
Suppose you are on a Linux machine and need to patch version 11.0: the program will search for a file that matches any of the following:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$PGENV_ROOT</span>/patch/index/patch.11.0.Linux
<span class="nv">$PGENV_ROOT</span>/patch/index/patch.11.0
<span class="nv">$PGENV_ROOT</span>/patch/index/patch.11.Linux
<span class="nv">$PGENV_ROOT</span>/patch/index/patch.11
</code></pre></div></div>
<p>This <em>desperate</em> search works by selecting the <em>first</em> file that matches the operating system and PostgreSQL version, or a combination of the two that includes only the major (or brand, in previous versions) number.</p>
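<p>The lookup order above can be sketched in a few lines of shell. This is not the code that ships with <code class="language-plaintext highlighter-rouge">pgenv</code>: the function name <code class="language-plaintext highlighter-rouge">find_patch_index</code> and its arguments are hypothetical, and the major (or brand) number is assumed to be the version string without its last dotted component.</p>

```shell
#!/bin/sh
# Hypothetical helper, NOT pgenv's actual code: prints the first existing
# index file for the given root directory, version and operating system.
find_patch_index() {
    fpi_root=$1
    fpi_version=$2
    fpi_os=$3
    fpi_major=${fpi_version%.*}    # 11.0 -> 11, 9.6.5 -> 9.6 (assumption)
    for fpi_candidate in \
        "patch.$fpi_version.$fpi_os" \
        "patch.$fpi_version" \
        "patch.$fpi_major.$fpi_os" \
        "patch.$fpi_major"
    do
        if [ -f "$fpi_root/patch/index/$fpi_candidate" ]; then
            printf '%s\n' "$fpi_root/patch/index/$fpi_candidate"
            return 0
        fi
    done
    return 1                       # no index found: nothing to patch
}

# Possible usage, assuming PGENV_ROOT is set:
#   find_patch_index "$PGENV_ROOT" 11.0 "$(uname -s)"
```

<p>Ordering the candidates from most specific to least specific is what lets a single <code class="language-plaintext highlighter-rouge">patch.11</code> file serve every 11.x build, while still allowing an override for one exact version or operating system.</p>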
<p>Last, but not least, a new configuration variable has been introduced: <code class="language-plaintext highlighter-rouge">PGENV_PATCH_INDEX</code>. This variable allows you to override the index selection mechanism by providing a list of patches to apply whose name may be entirely unrelated to the PostgreSQL version and/or operating system. Therefore, this allows you to do something like:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nv">PGENV_PATCH_INDEX</span><span class="o">=</span>patch/patch_for_osx.txt pgenv build 11.0
</code></pre></div></div>
<p>I really hope this covers enough use cases to make this great tool more and more useful.</p>
pgenv get configuration!2018-09-24T00:00:00+00:00https://fluca1978.github.io/2018/09/24/pgenv_configuration<p>I have already written about a very useful and powerful small pearl by <a href="https://justatheory.com/"><em>theory</em></a>: <a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a>. Now the tool supports user configuration!</p>
<h1 id="pgenv-get-configuration">pgenv get configuration!</h1>
<p>I spent some time implementing a very rudimentary approach to configuration for <code class="language-plaintext highlighter-rouge">pgenv</code>.
The idea was simple: since the program is a single Bash script, the configuration can be done using a single file to source variables in.</p>
<p>But before this was possible, I had to do a little refactoring here and there in order to make all the commands behave smoothly with the configuration. And at last it seems to work, with some parts that can be improved and implemented better (as is always the case!). However, I designed it from scratch to support every single version of PostgreSQL, which means the configuration can be different depending on the specific version you are running. This allows you, for example, to set particular flags for ancient versions, without going crazy when switching to more recent ones.</p>
<p>Now <code class="language-plaintext highlighter-rouge">pgenv</code> supports a <code class="language-plaintext highlighter-rouge">config</code> command that, in turn, supports several subcommands:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">write</code> to store the configuration in a hidden file named after the PostgreSQL version (e.g., <code class="language-plaintext highlighter-rouge">.pgenv.10.5.conf</code>);</li>
<li><code class="language-plaintext highlighter-rouge">edit</code> just to launch your <code class="language-plaintext highlighter-rouge">$EDITOR</code> to manipulate the configuration;</li>
<li><code class="language-plaintext highlighter-rouge">delete</code> to remove a configuration file;</li>
<li><code class="language-plaintext highlighter-rouge">show</code> to dump the configuration.</li>
</ul>
<p>The idea is simple: each time a new PostgreSQL version is built, a configuration file is created for that instance. You can then customize the file in order to make <code class="language-plaintext highlighter-rouge">pgenv</code> behave differently for that particular version of PostgreSQL. As an example, you can set different languages (e.g., PL/Perl) or different startup/stop modes.
If the configuration file for a particular version is not found, a <em>global</em> configuration is loaded. If that is not found either, the program behaves according to some shell variables or, as a last resort, its internal defaults.</p>
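<p>The lookup order can be sketched in a few lines. This is only an illustration of the fallback chain, not the code <code class="language-plaintext highlighter-rouge">pgenv</code> runs (which is Bash): the global file name <code class="language-plaintext highlighter-rouge">.pgenv.conf</code>, the <code class="language-plaintext highlighter-rouge">MAKE_TOOL</code> setting and the function names are assumptions made up for the example.</p>

```python
import os
from pathlib import Path

# Hypothetical names, for illustration only: the global file name and
# the single built-in default are assumptions, not pgenv's real ones.
GLOBAL_CONF = ".pgenv.conf"
DEFAULTS = {"MAKE_TOOL": "make"}


def parse_conf(path):
    """Read simple KEY=value lines, skipping blanks and '#' comments."""
    settings = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip().strip("'\"")
    return settings


def load_settings(root, version, environ=None):
    """Return (source, settings) following the fallback chain:
    per-version file, then global file, then environment, then defaults."""
    environ = os.environ if environ is None else environ
    per_version = Path(root) / f".pgenv.{version}.conf"
    global_conf = Path(root) / GLOBAL_CONF
    if per_version.is_file():
        return "version", {**DEFAULTS, **parse_conf(per_version)}
    if global_conf.is_file():
        return "global", {**DEFAULTS, **parse_conf(global_conf)}
    from_env = {k: environ[k] for k in DEFAULTS if k in environ}
    if from_env:
        return "environment", {**DEFAULTS, **from_env}
    return "defaults", dict(DEFAULTS)
```

<p>Returning the source alongside the settings makes it easy to tell the user which level of configuration was actually picked up.</p>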
<p>Please read the <a href="https://github.com/theory/pgenv/blob/master/README.md">documentation</a> for a better explanation of the new <code class="language-plaintext highlighter-rouge">config</code> command.</p>
Managing Multiple PostgreSQL Installations with pgenv2018-08-30T00:00:00+00:00https://fluca1978.github.io/2018/08/30/pgenv<p><code class="language-plaintext highlighter-rouge">pgenv</code> is a shell script that allows you to quickly manage multiple PostgreSQL installations within the same host. It is somewhat reminiscent of <a href="https://perlbrew.pl/"><code class="language-plaintext highlighter-rouge">perlbrew</code></a> (for Perl 5) and similar systems. In this post I briefly show how to use <code class="language-plaintext highlighter-rouge">pgenv</code> and explain the changes I made to it.</p>
<h1 id="managing-multiple-postgresql-installations-with-pgenv">Managing Multiple PostgreSQL Installations with pgenv</h1>
<p><a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a> is another pearl from <a href="https://justatheory.com/"><em>theory</em></a>. It is a single <em>bash</em> script that allows you to download, build, start and stop (as well as <em>nuke</em>) several PostgreSQL installations within the same host.
<br />
It is worth noting that <a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a> is not, at least for now, an enterprise-level PostgreSQL management tool, but rather an easy way to keep <em>test</em> instances clean and organized. It can be very useful to keep several clusters around on which to run experiments, tests, and so on.
<br />
<br />
I first discovered <a href="https://github.com/theory/pgenv"><code class="language-plaintext highlighter-rouge">pgenv</code></a> reading this <a href="https://justatheory.com/2018/08/pgenv/">blog post by David</a>, and I thought it was cool to have a single script to help me manage several environments. I must be honest, this is not the first tool like this I have seen for PostgreSQL, but somehow it caught my attention.
I then cloned the repository and started using it. And since I’m curious, I read the source code.
Well, ehm, bash? Ok, it is not my favourite shell anymore, but surely it can speed up development while shortening the code with respect to more portable shells.
<br />
<br />
<code class="language-plaintext highlighter-rouge">pgenv</code> works with a <em>command-oriented</em> interface: as in <code class="language-plaintext highlighter-rouge">git</code> or other developer-oriented tools, you specify a command (e.g., <code class="language-plaintext highlighter-rouge">build</code>) and optionally a specific PostgreSQL version to apply the command to.
<code class="language-plaintext highlighter-rouge">pgenv</code> works on a single cluster at a time, by linking and unlinking the specific instance directories (binary and <code class="language-plaintext highlighter-rouge">PGDATA</code>) so that it always knows where to find PostgreSQL commands.</p>
<p>You can read the <a href="https://github.com/theory/pgenv/blob/master/README.md">documentation</a> for more details.</p>
<h2 id="my-contribution-to-pgenv">My Contribution to <code class="language-plaintext highlighter-rouge">pgenv</code></h2>
<p>After reading the code, I decided to provide a few improvements to the script, so I forked the repository and issued a few <a href="https://github.com/theory/pgenv/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Afluca1978+">pull requests</a>. The most complex, long and I hope useful one is the <a href="https://github.com/theory/pgenv/pull/5">implementation of an <code class="language-plaintext highlighter-rouge">available</code> command</a>.
<br />
<br />
The <code class="language-plaintext highlighter-rouge">available</code> command automatically downloads the list of available PostgreSQL versions, groups them by <em>major version number</em> (taking into account the numbering change introduced with version 10) and shows them to the user. Then I also added the capability to <a href="https://github.com/theory/pgenv/pull/5/commits/c6d5d77cf5674039f082339521ea7b5a82985139">filter by version numbers</a>, so that the user does not always get the long list of <em>all</em> PostgreSQL versions, but only the ones they are interested in.
<br />
<br />
So why did the implementation of the command take so long? First of all, because I made several minor mistakes, or inaccuracies, and I had to fix them of course. Second, because of a problem with temporary files that made the script less portable, so I had to reimplement the grouping using associative arrays (I did not want to introduce another Perl dependency).
<br />
<br />
I have to say that David is a great mentor, and even a simple project like this helped me learn more and more skills.</p>
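<p>To give an idea of the grouping logic behind <code class="language-plaintext highlighter-rouge">available</code>, here is a minimal sketch in Python, where a dictionary plays the role of the Bash associative array. It is not the code in the pull request: the function names are hypothetical and only plain numeric version strings are handled.</p>

```python
from collections import defaultdict


def major_of(version):
    """Return the major version: '10' for 10.5, '9.6' for 9.6.10."""
    parts = version.split(".")
    # From PostgreSQL 10 onwards the first number alone is the major
    # version; before that, the first two numbers together are.
    if int(parts[0]) >= 10:
        return parts[0]
    return ".".join(parts[:2])


def group_by_major(versions, wanted=()):
    """Group version strings by major version.

    When `wanted` is not empty, keep only the listed major versions,
    mimicking the optional filter on the command line.
    """
    groups = defaultdict(list)
    for version in versions:
        major = major_of(version)
        if not wanted or major in wanted:
            groups[major].append(version)
    return dict(groups)
```

<p>Calling <code class="language-plaintext highlighter-rouge">group_by_major(versions, wanted=("10",))</code> keeps only the 10 series, while an empty filter returns every group.</p>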
<h2 id="an-example-workflow">An example workflow</h2>
<p>Enough blather, let’s see <code class="language-plaintext highlighter-rouge">pgenv</code> in action and go through an example workflow.
Please note that some output could be different because, at the time of writing, I’m still submitting new contributions.</p>
<h3 id="check-everything-is-fine">Check everything is fine</h3>
<p>I’ve added the <code class="language-plaintext highlighter-rouge">check</code> command just to check that <code class="language-plaintext highlighter-rouge">pgenv</code> can work in your environment, so once installed, running the check command can help you find problems:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv check
<span class="o">[</span>OK] make: /usr/bin/make
<span class="o">[</span>OK] curl: /usr/local/bin/curl
<span class="o">[</span>OK] patch: /usr/bin/patch
<span class="o">[</span>OK] <span class="nb">tar</span>: /usr/bin/tar
<span class="o">[</span>OK] <span class="nb">sed</span>: /usr/bin/sed
<span class="o">[</span>OK] perl: /usr/local/bin/perl
</code></pre></div></div>
<p>Everything seems fine!</p>
<h3 id="choosing-a-version-to-build">Choosing a version to build</h3>
<p>Suppose I want to build one of the versions in the 10 series. I can use the new <code class="language-plaintext highlighter-rouge">available</code> command to get the full list of versions that <code class="language-plaintext highlighter-rouge">pgenv</code> can download, but that would be a rather long listing (after all, PostgreSQL has a looong history), so I can pass a major version to the <code class="language-plaintext highlighter-rouge">available</code> command to see only the 10 releases:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv available 10
Available PostgreSQL Versions
<span class="o">========================================================</span>
PostgreSQL 10
<span class="nt">------------------------------------------------</span>
10.0 10.1 10.2 10.3 10.4 10.5
</code></pre></div></div>
<p>If I’m in doubt, or if I want to see more releases, I can specify more than one major release on the command line.
As an example, to go back in time while keeping up to date:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv available 10 7.4
Available PostgreSQL Versions
<span class="o">========================================================</span>
PostgreSQL 7.4
<span class="nt">------------------------------------------------</span>
7.4 7.4.1 7.4.2 7.4.3 7.4.4 7.4.5
7.4.6 7.4.7 7.4.8 7.4.9 7.4.10 7.4.11
7.4.12 7.4.13 7.4.14 7.4.15 7.4.16 7.4.17
7.4.18 7.4.19 7.4.21 7.4.22 7.4.23 7.4.24
7.4.25 7.4.26 7.4.27 7.4.28 7.4.29 7.4.30
PostgreSQL 10
<span class="nt">------------------------------------------------</span>
10.0 10.1 10.2 10.3 10.4 10.5
</code></pre></div></div>
<h3 id="build-a-version">Build a version</h3>
<p>The <code class="language-plaintext highlighter-rouge">build</code> command downloads and compiles the chosen version:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv build 10.5
...
PostgreSQL, contrib, and documentation installation complete.
PostgreSQL 10.5 built
</code></pre></div></div>
<p>I can further iterate the process for other versions:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv build 9.6.10
...
PostgreSQL, contrib, and documentation installation complete.
PostgreSQL 9.6.10 built
</code></pre></div></div>
<h3 id="use-a-version">Use a version</h3>
<p>Let’s see which PostgreSQL versions I have installed on the machine:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv versions
10.5 pgsql-10.5
9.6.10 pgsql-9.6.10
</code></pre></div></div>
<p>Neither of the two versions above has been selected as the default; in order to use a version, the <code class="language-plaintext highlighter-rouge">use</code> command must be issued:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv use 9.6.10
The files belonging to this database system will be owned by user <span class="s2">"luca"</span><span class="nb">.</span>
This user must also own the server process.
...
Logs are <span class="k">in</span> <span class="o">[</span>/home/luca/tmp/pgsql/data/server.log]
</code></pre></div></div>
<p>The very first time a database is selected its <code class="language-plaintext highlighter-rouge">PGDATA</code> directory is properly initialized by <code class="language-plaintext highlighter-rouge">initdb</code>.
The <code class="language-plaintext highlighter-rouge">PGDATA</code> directory is set to <code class="language-plaintext highlighter-rouge">pgsql/data</code>, while the cluster has been linked to the <code class="language-plaintext highlighter-rouge">pgsql</code> directory:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">ls</span> <span class="nt">-l</span> pgsql
lrwxr-xr-x 1 luca luca 12 Aug 30 06:59 pgsql -> pgsql-9.6.10
</code></pre></div></div>
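<p>The symbolic link is the whole trick: switching version means re-pointing <code class="language-plaintext highlighter-rouge">pgsql</code> to another <code class="language-plaintext highlighter-rouge">pgsql-X.Y</code> directory. The following is a minimal sketch of that idea only, assuming the directory layout shown above; the function names are hypothetical and the real script also takes care of <code class="language-plaintext highlighter-rouge">initdb</code> and of starting and stopping the server.</p>

```python
import os


def use_version(root, version):
    """Point root/pgsql at root/pgsql-<version>, replacing any old link."""
    target = f"pgsql-{version}"
    if not os.path.isdir(os.path.join(root, target)):
        raise FileNotFoundError(f"PostgreSQL {version} is not built")
    link = os.path.join(root, "pgsql")
    if os.path.islink(link):
        os.remove(link)
    # A relative target keeps the link valid if the whole tree is moved.
    os.symlink(target, link)


def current_version(root):
    """Return the version currently linked, or None if nothing is in use."""
    link = os.path.join(root, "pgsql")
    if not os.path.islink(link):
        return None
    return os.readlink(link)[len("pgsql-"):]
```

<p>Because every command is resolved through the same link, nothing else needs to know which version is active.</p>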
<p>If I inspect again the installed versions, an asterisk will mark the currently selected version as the one in use:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv versions
10.5 pgsql-10.5
<span class="k">*</span> 9.6.10 pgsql-9.6.10
% pgenv version
9.6.10
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">versions</code> command (mind the <em>s</em>) shows all installed versions, while <code class="language-plaintext highlighter-rouge">version</code> shows the currently selected one.</p>
<h3 id="start-and-stop">Start and Stop</h3>
<p>The <code class="language-plaintext highlighter-rouge">start</code> and <code class="language-plaintext highlighter-rouge">stop</code> commands can be used to activate and deactivate the currently selected cluster:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv start
PostgreSQL 9.6.10 is already running
% pgenv stop
waiting <span class="k">for </span>server to shut down.... <span class="k">done
</span>server stopped
PostgreSQL 9.6.10 stopped
</code></pre></div></div>
<h3 id="clear-and-nuke">Clear and Nuke</h3>
<p>The <code class="language-plaintext highlighter-rouge">clear</code> command unsets the currently selected version, that is, it <em>un-uses</em> a version:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv clear
PostgreSQL 9.6.10 cleared
% pgenv versions
10.5 pgsql-10.5
9.6.10 pgsql-9.6.10
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">remove</code> command nukes the specified version, removing all the files and database content.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% pgenv remove 9.6.10
PostgreSQL 9.6.10 removed
% pgenv versions
10.5 pgsql-10.5
</code></pre></div></div>
<h2 id="there-is-more">There is more</h2>
<p>There are other commands and usage tricks; please read the <a href="https://github.com/theory/pgenv/blob/master/README.md">documentation</a> for a deeper understanding of this tool.</p>
An example of PostgreSQL rules: updating pg_settings2018-08-13T00:00:00+00:00https://fluca1978.github.io/2018/08/13/PostgreSQL-pgsettings-rules<p>Rules are a powerful mechanism by which PostgreSQL allows a statement to be <em>transformed</em> into another.
And PostgreSQL itself does use rules in order to make your life easier.</p>
<h1 id="an-example-of-postgresql-rules-updating-pg_settings">An example of PostgreSQL rules: updating pg_settings</h1>
<p>When asked for a quick and sweet example about rules I often answer with the <code class="language-plaintext highlighter-rouge">pg_settings</code> example.</p>
<p>The special view <code class="language-plaintext highlighter-rouge">pg_settings</code> offers a tabular representation of the current cluster settings; in other words, it allows you to see <code class="language-plaintext highlighter-rouge">postgresql.conf</code> (and friends) as a table to run queries against.</p>
<p>But there is more than that: you can also issue <code class="language-plaintext highlighter-rouge">UPDATE</code> commands against such a table and get the configuration updated on the fly (this does not mean <em>applied</em>, it depends on the parameter context). Internally, PostgreSQL uses a very simple <strong>rule</strong> to cascade updates to <code class="language-plaintext highlighter-rouge">pg_settings</code> into the run-time configuration. The rule can be found in the <code class="language-plaintext highlighter-rouge">system_views.sql</code> file inside the backend source code and is implemented as:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">RULE</span> <span class="n">pg_settings_u</span> <span class="k">AS</span>
<span class="k">ON</span> <span class="k">UPDATE</span> <span class="k">TO</span> <span class="n">pg_settings</span>
<span class="k">WHERE</span> <span class="k">new</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="k">old</span><span class="p">.</span><span class="n">name</span> <span class="k">DO</span>
<span class="k">SELECT</span> <span class="n">set_config</span><span class="p">(</span><span class="k">old</span><span class="p">.</span><span class="n">name</span><span class="p">,</span> <span class="k">new</span><span class="p">.</span><span class="n">setting</span><span class="p">,</span> <span class="s1">'f'</span><span class="p">);</span>
</code></pre></div></div>
<p>It simply reads as: whenever there is an update that leaves the parameter name untouched, invoke the special function <code class="language-plaintext highlighter-rouge">set_config</code> with the parameter name and its new value (the flag <code class="language-plaintext highlighter-rouge">f</code> means the change is not <em>local</em> to the current transaction, so it applies to the whole session). For more information about <code class="language-plaintext highlighter-rouge">set_config</code> see <a href="https://www.postgresql.org/docs/current/static/functions-admin.html">the function official documentation</a>.</p>
<p>How cool!</p>
pgxnclient and beta version2018-08-07T00:00:00+00:00https://fluca1978.github.io/2018/08/07/PostgreSQL-pgxnclient-patch<p><code class="language-plaintext highlighter-rouge">pgxnclient</code> is a wonderful <code class="language-plaintext highlighter-rouge">cpan</code>-like tool for the <a href="http://pgxn.org">PGXN</a> extension network. Unluckily, the client cannot handle PostgreSQL beta versions, so I submitted a really small patch to fix the issue.</p>
<h1 id="pgxnclient-and-beta-version">pgxnclient and beta version</h1>
<p>If you, like me, are addicted to terminal mode, you surely love a tool like <code class="language-plaintext highlighter-rouge">pgxnclient</code> that allows you to install extensions into PostgreSQL from the command line, much like <code class="language-plaintext highlighter-rouge">cpan</code> (and friends) does for Perl.</p>
<p>A few days ago, I ran into a problem: the <code class="language-plaintext highlighter-rouge">load</code> command did not work against a PostgreSQL 11 beta 2 server. At first I reported it with a <a href="https://github.com/dvarrazzo/pgxnclient/issues/29">ticket</a>, but then curiosity hit me and I decided to have a look at the very well written source code.</p>
<p><strong>Warning: I’m not a Python developer</strong>, or better, I’m a Python-idiot! This means the work I’ve done, even if it seems it works, could be totally wrong, so reviews are welcome.</p>
<p>First I got to the regular expression used to parse a <code class="language-plaintext highlighter-rouge">version()</code> output:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">match</span><span class="p">(</span><span class="sa">r</span><span class="s">'\S+\s+(\d+)\.(\d+)(?:\.(\d+))?'</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span>
</code></pre></div></div>
<p>where <code class="language-plaintext highlighter-rouge">data</code> is the output of a <code class="language-plaintext highlighter-rouge">SELECT version();</code>. Now, this works great for a version like <code class="language-plaintext highlighter-rouge">9.6.5</code> or <code class="language-plaintext highlighter-rouge">10.3</code>, but does not work for <code class="language-plaintext highlighter-rouge">11beta2</code>. Therefore, I decided to implement a two-level regular expression check: first search for two or three numbers separated by dots and, if that fails, search for two numbers separated by the <code class="language-plaintext highlighter-rouge">beta</code> text string.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">match</span><span class="p">(</span><span class="sa">r</span><span class="s">'\S+\s+(\d+)\.(\d+)(?:\.(\d+))?'</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span>
<span class="k">if</span> <span class="n">m</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
    <span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">match</span><span class="p">(</span> <span class="sa">r</span><span class="s">'\S+\s+(\d+)beta(\d+)'</span><span class="p">,</span> <span class="n">data</span> <span class="p">)</span>
    <span class="n">is_beta</span> <span class="o">=</span> <span class="bp">True</span>
    <span class="k">if</span> <span class="n">m</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">raise</span> <span class="n">PgxnClientException</span><span class="p">(</span>
            <span class="s">"cannot parse version number from '%s'"</span> <span class="o">%</span> <span class="n">data</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="n">is_beta</span> <span class="o">=</span> <span class="bp">False</span>
</code></pre></div></div>
<p><a href="https://github.com/fluca1978/pgxnclient/commit/9ddce97679f3e2af6aaa3c8bb9ec90e62a3ffb87">Apparently it works</a>, but I’m not sure whether other pieces of code need more attention.</p>
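<p>For the curious, the same check can also be written as a single regular expression with an alternation. This is a standalone sketch and not the code in the patch: the function name, the use of <code class="language-plaintext highlighter-rouge">ValueError</code> and the choice to report a beta as <code class="language-plaintext highlighter-rouge">major.0.0</code> plus a beta number are all assumptions of mine.</p>

```python
import re

# After the leading product name, accept either "X.Y" / "X.Y.Z"
# or "XbetaN" (e.g. "9.6.5", "10.3", "11beta2").
VERSION_RE = re.compile(r'\S+\s+(\d+)(?:\.(\d+)(?:\.(\d+))?|beta(\d+))')


def parse_pg_version(data):
    """Parse the output of SELECT version().

    Returns ((major, minor, patch), beta) where beta is the beta number
    or None for a regular release. Missing parts are reported as 0.
    """
    m = VERSION_RE.match(data)
    if m is None:
        raise ValueError("cannot parse version number from '%s'" % data)
    major, minor, patch, beta = m.groups()
    version = (int(major), int(minor or 0), int(patch or 0))
    return version, (int(beta) if beta is not None else None)
```

<p>Keeping the beta number separate from the version tuple means callers that only compare versions do not need to change.</p>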
PostgreSQL 11 and PL/Java: it can work!2018-07-25T00:00:00+00:00https://fluca1978.github.io/2018/07/25/PostgreSQL11_pljava<p>PL/Java is a wonderful piece of code that allows the definition of functions in Java directly within PostgreSQL. Unluckily, PostgreSQL 11 introduced a few changes that prevent PL/Java from compiling. But keep calm, experts are already working on this!</p>
<h1 id="postgresql-11-and-pljava-it-can-work">PostgreSQL 11 and PL/Java: it can work!</h1>
<h2 id="tldr">tl;dr</h2>
<p>If you are in a rush and want to experiment with PL/Java 1.5.1 Beta and PostgreSQL 11, have a look at this <a href="https://github.com/tada/pljava/pull/161">pull request</a>.</p>
<h2 id="trying-to-compile-pljava-against-pg11b2">Trying to compile PL/Java against PG11b2</h2>
<p>First of all, I usually run PostgreSQL on FreeBSD, and this is not the optimal situation because a lot of code is written with Linux in mind and requires some adjustments when compiled or ported to other Unix implementations.
PL/Java is an example: it compiles through Apache Maven, which in turn seems to require GCC, which is not the default compiler on FreeBSD and … you get the point.</p>
<p><br /><br /></p>
<p>However, with a little work, it is possible to compile PL/Java even on FreeBSD (as you can imagine) and this is what I’ve done so far. But last Monday, trying to compile it against PostgreSQL 11 beta 2 quickly resulted in frustration.</p>
<p><br />
<br /></p>
<p>Luckily, and thanks to the great PL/Java community, I found that, due to a change in the PostgreSQL 11 GUC definition, <a href="https://github.com/tada/pljava/issues/160#issuecomment-407104952">things could be adjusted to make it compile</a>. After one day, all of my PL/Java code seemed to be running fine with this simple workaround (and yes, I don’t have code that complex, so this does not mean the workaround is production safe!). I therefore decided to open a <a href="https://github.com/tada/pljava/pull/162">pull request about it</a>.
<br />
<br />
<em>Shame on me!</em>
<br />
My pull request resulted in a mess, because I made it against the wrong branch and tag. And in the meantime, I didn’t notice that the <a href="https://github.com/tada/pljava/pull/161">right pull request</a> had been opened by someone much more competent than me on this subject: Chapman Flack.
Thanks Chapman, hope things will be fixed soon!</p>
PostgreSQL Extended Statistics2018-06-28T00:00:00+00:00https://fluca1978.github.io/2018/06/28/PostgreSQLExtendedStatistics<p>PostgreSQL 10 allows users to define extended statistics to help the planner understand data dependencies.</p>
<h1 id="postgresql-extended-statistics">PostgreSQL Extended Statistics</h1>
<p>PostgreSQL 10 defines a set of <em>extended statistics</em>, mainly for cross-column dependencies and distinct values.
<a href="https://www.postgresql.org/docs/10/static/sql-createstatistics.html">New commands</a> have been added to create and drop such extended statistics, and this post just scratches the surface of this new feature, following what I found while experimenting with it (and as usual, comments are welcome!).</p>
<h2 id="a-sample-data-set">A sample data set</h2>
<p>Assume you have a table defined as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">expenses</span><span class="p">(</span>
<span class="n">pk</span> <span class="nb">int</span> <span class="k">GENERATED</span> <span class="n">ALWAYS</span> <span class="k">AS</span> <span class="k">IDENTITY</span><span class="p">,</span>
<span class="n">value</span> <span class="n">money</span><span class="p">,</span>
<span class="k">day</span> <span class="nb">date</span><span class="p">,</span>
<span class="n">quarter</span> <span class="nb">int</span><span class="p">,</span>
<span class="nb">year</span> <span class="nb">int</span><span class="p">,</span>
<span class="n">incoming</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">false</span><span class="p">,</span>
<span class="n">account</span> <span class="nb">text</span> <span class="k">DEFAULT</span> <span class="s1">'cash'</span><span class="p">,</span>
<span class="k">PRIMARY</span> <span class="k">KEY</span> <span class="p">(</span> <span class="n">pk</span> <span class="p">),</span>
<span class="k">CHECK</span><span class="p">(</span> <span class="n">value</span> <span class="o"><></span> <span class="mi">0</span><span class="p">::</span><span class="n">money</span> <span class="p">),</span>
<span class="k">CHECK</span><span class="p">(</span> <span class="n">quarter</span> <span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="n">quarter</span> <span class="k">FROM</span> <span class="k">day</span> <span class="p">)</span> <span class="p">),</span>
<span class="k">CHECK</span><span class="p">(</span> <span class="nb">year</span> <span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">day</span> <span class="p">)</span> <span class="p">)</span>
<span class="p">);</span>
</code></pre></div></div>
<p>Clearly there are a few dependencies:</p>
<ul>
<li>column <code class="language-plaintext highlighter-rouge">quarter</code> depends on the value of column <code class="language-plaintext highlighter-rouge">day</code>;</li>
<li>column <code class="language-plaintext highlighter-rouge">year</code> depends on the value of <code class="language-plaintext highlighter-rouge">day</code> too;</li>
<li>column <code class="language-plaintext highlighter-rouge">incoming</code> is <code class="language-plaintext highlighter-rouge">true</code> when the <code class="language-plaintext highlighter-rouge">value</code> is greater than zero, <code class="language-plaintext highlighter-rouge">false</code> otherwise.</li>
</ul>
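<p>These are <em>functional dependencies</em>: knowing one column is enough to know another. A small sketch outside the database may help to picture it; the helper names are hypothetical and the rows are made up for the example.</p>

```python
from datetime import date


def derived_columns(day, value):
    """Compute the columns that depend on day and value."""
    return {
        "quarter": (day.month - 1) // 3 + 1,
        "year": day.year,
        "incoming": value > 0,
    }


def is_functionally_dependent(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        # setdefault stores the first value seen for this determinant
        # and returns it; any later mismatch breaks the dependency.
        if seen.setdefault(row[determinant], row[dependent]) != row[dependent]:
            return False
    return True
```

<p>On such rows <code class="language-plaintext highlighter-rouge">day</code> determines both <code class="language-plaintext highlighter-rouge">quarter</code> and <code class="language-plaintext highlighter-rouge">year</code>, while the opposite does not hold: the same year spans several quarters.</p>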
<p>And since there are these dependencies, there must be something that ensures us the columns <em>move</em> together, so for instance there is a simple trigger as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">f_expenses</span><span class="p">()</span>
<span class="k">RETURNS</span> <span class="k">TRIGGER</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">BEGIN</span>
<span class="n">IF</span> <span class="k">NEW</span><span class="p">.</span><span class="k">day</span> <span class="k">IS</span> <span class="k">NULL</span> <span class="k">THEN</span>
<span class="k">NEW</span><span class="p">.</span><span class="k">day</span> <span class="p">:</span><span class="o">=</span> <span class="k">CURRENT_DATE</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">NEW</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="mi">0</span><span class="p">::</span><span class="n">money</span> <span class="k">THEN</span>
<span class="k">NEW</span><span class="p">.</span><span class="n">value</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">.</span><span class="mi">01</span><span class="p">::</span><span class="n">money</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">NEW</span><span class="p">.</span><span class="n">quarter</span> <span class="p">:</span><span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="n">quarter</span> <span class="k">FROM</span> <span class="k">NEW</span><span class="p">.</span><span class="k">day</span> <span class="p">);</span>
<span class="k">NEW</span><span class="p">.</span><span class="nb">year</span> <span class="p">:</span><span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">NEW</span><span class="p">.</span><span class="k">day</span> <span class="p">);</span>
<span class="n">IF</span> <span class="k">NEW</span><span class="p">.</span><span class="n">value</span> <span class="o">></span> <span class="mi">0</span><span class="p">::</span><span class="n">money</span> <span class="k">THEN</span>
<span class="k">NEW</span><span class="p">.</span><span class="n">incoming</span> <span class="p">:</span><span class="o">=</span> <span class="k">true</span><span class="p">;</span>
<span class="k">ELSE</span>
<span class="k">NEW</span><span class="p">.</span><span class="n">incoming</span> <span class="p">:</span><span class="o">=</span> <span class="k">false</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">NEW</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">code</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">TRIGGER</span> <span class="n">tr_expenses</span> <span class="k">BEFORE</span> <span class="k">INSERT</span> <span class="k">OR</span> <span class="k">UPDATE</span> <span class="k">ON</span> <span class="n">expenses</span>
<span class="k">FOR</span> <span class="k">EACH</span> <span class="k">ROW</span>
<span class="k">EXECUTE</span> <span class="k">PROCEDURE</span> <span class="n">f_expenses</span><span class="p">();</span>
</code></pre></div></div>
<p>In a <em>normal</em> situation the planner cannot know such dependencies and, consequently, cannot take advantage of them.
In order to see how this can change in PostgreSQL 10, let’s first insert some records:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">WITH</span> <span class="n">v</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">counter</span><span class="p">,</span> <span class="n">account</span><span class="p">)</span> <span class="k">AS</span> <span class="p">(</span>
<span class="k">SELECT</span> <span class="p">(</span> <span class="n">random</span><span class="p">()</span> <span class="o">*</span> <span class="mi">100</span> <span class="p">)::</span><span class="nb">numeric</span> <span class="o">+</span> <span class="mi">1</span> <span class="o">*</span> <span class="k">CASE</span> <span class="n">v</span> <span class="o">%</span> <span class="mi">2</span> <span class="k">WHEN</span> <span class="mi">0</span> <span class="k">THEN</span> <span class="mi">1</span> <span class="k">ELSE</span> <span class="o">-</span><span class="mi">1</span> <span class="k">END</span><span class="p">,</span>
<span class="n">v</span><span class="p">,</span>
<span class="k">CASE</span> <span class="n">v</span> <span class="o">%</span> <span class="mi">3</span> <span class="k">WHEN</span> <span class="mi">0</span> <span class="k">THEN</span> <span class="s1">'credit card'</span>
<span class="k">WHEN</span> <span class="mi">1</span> <span class="k">THEN</span> <span class="s1">'bank'</span>
<span class="k">ELSE</span> <span class="s1">'cash'</span>
<span class="k">END</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2000000</span> <span class="p">)</span> <span class="n">v</span>
<span class="p">)</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">expenses</span><span class="p">(</span> <span class="n">value</span><span class="p">,</span> <span class="k">day</span><span class="p">,</span> <span class="n">account</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">value</span><span class="p">,</span> <span class="k">CURRENT_DATE</span> <span class="o">-</span> <span class="p">(</span> <span class="n">counter</span> <span class="o">%</span> <span class="mi">1000</span> <span class="p">),</span> <span class="n">account</span>
<span class="k">FROM</span> <span class="n">v</span><span class="p">;</span>
</code></pre></div></div>
<p>which inserts 2 million rows (roughly 130 MB of data), 2000 tuples per day.
It is possible to check how many rows there are, for instance in the year 2016:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span>
<span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="n">FILTER</span><span class="p">(</span> <span class="k">WHERE</span> <span class="nb">year</span> <span class="o">=</span> <span class="mi">2016</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">total_2016_by_year</span><span class="p">,</span>
<span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="n">FILTER</span><span class="p">(</span> <span class="k">WHERE</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">day</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">2016</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">total_2016_by_day</span><span class="p">,</span>
<span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="n">FILTER</span><span class="p">(</span> <span class="k">WHERE</span> <span class="nb">year</span> <span class="o">=</span> <span class="mi">2016</span> <span class="k">AND</span> <span class="n">incoming</span> <span class="o">=</span> <span class="k">true</span> <span class="k">AND</span> <span class="k">day</span> <span class="o">=</span> <span class="s1">'2016-7-19'</span><span class="p">::</span><span class="nb">date</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">incoming_by_year</span><span class="p">,</span>
<span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="n">FILTER</span><span class="p">(</span> <span class="k">WHERE</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="nb">year</span> <span class="k">FROM</span> <span class="k">day</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">2016</span> <span class="k">AND</span> <span class="n">incoming</span> <span class="o">=</span> <span class="k">true</span> <span class="k">AND</span> <span class="k">day</span> <span class="o">=</span> <span class="s1">'2016-7-19'</span><span class="p">::</span><span class="nb">date</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">incoming_2016_by_day</span><span class="p">,</span>
<span class="n">pg_size_pretty</span><span class="p">(</span> <span class="n">pg_relation_size</span><span class="p">(</span> <span class="s1">'expenses'</span> <span class="p">)</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">table_size</span>
<span class="k">FROM</span> <span class="n">expenses</span><span class="p">;</span>
<span class="n">total_2016_by_year</span> <span class="o">|</span> <span class="mi">732000</span>
<span class="n">total_2016_by_day</span> <span class="o">|</span> <span class="mi">732000</span>
<span class="n">incoming_by_year</span> <span class="o">|</span> <span class="mi">1982</span>
<span class="n">incoming_2016_by_day</span> <span class="o">|</span> <span class="mi">1982</span>
<span class="n">table_size</span> <span class="o">|</span> <span class="mi">136</span> <span class="n">MB</span>
</code></pre></div></div>
<h2 id="without-the-extended-statistics">Without the extended statistics</h2>
<p>What happens when we search for the incoming expenses of a particular day, redundantly specifying its year as well?</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">EXPLAIN</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">expenses</span>
<span class="k">WHERE</span> <span class="n">incoming</span> <span class="o">=</span> <span class="k">true</span>
<span class="k">AND</span> <span class="n">value</span> <span class="o">></span> <span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">::</span><span class="n">money</span>
<span class="k">AND</span> <span class="k">day</span> <span class="o">=</span> <span class="s1">'2016-7-19'</span><span class="p">::</span><span class="nb">date</span>
<span class="k">AND</span> <span class="nb">year</span> <span class="o">=</span> <span class="mi">2016</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">--------------------------------------------------------------------------------------------------------</span>
<span class="n">Gather</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1000</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">35128</span><span class="p">.</span><span class="mi">37</span> <span class="k">rows</span><span class="o">=</span><span class="mi">697</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span>
<span class="n">Workers</span> <span class="n">Planned</span><span class="p">:</span> <span class="mi">2</span>
<span class="o">-></span> <span class="n">Parallel</span> <span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">expenses</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">34058</span><span class="p">.</span><span class="mi">67</span> <span class="k">rows</span><span class="o">=</span><span class="mi">290</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">incoming</span> <span class="k">AND</span> <span class="p">(</span><span class="k">day</span> <span class="o">=</span> <span class="s1">'2016-07-19'</span><span class="p">::</span><span class="nb">date</span><span class="p">)</span> <span class="k">AND</span> <span class="p">(</span><span class="nb">year</span> <span class="o">=</span> <span class="mi">2016</span><span class="p">)</span> <span class="k">AND</span> <span class="p">(</span><span class="n">value</span> <span class="o">></span> <span class="p">(</span><span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">)::</span><span class="n">money</span><span class="p">))</span>
</code></pre></div></div>
<p>The planner estimates <em>697</em> tuples while we know that the above filtering predicates will provide <em>1982</em> tuples.</p>
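<p>To put the estimate and the real count side by side, one option is to run the same query under <code class="language-plaintext highlighter-rouge">EXPLAIN</code> with the <code class="language-plaintext highlighter-rouge">ANALYZE</code> option, which executes the statement and reports the measured row counts next to the estimated ones. This is only a sketch against the <code class="language-plaintext highlighter-rouge">expenses</code> table built above; the output is omitted because it depends on the machine and on the day the data was generated:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- the ANALYZE option really runs the query: on the top plan node compare
-- the estimated "rows=" with the measured "actual ... rows="
EXPLAIN ( ANALYZE )
SELECT *
FROM expenses
WHERE incoming = true
AND value > 0.0::money
AND day = '2016-7-19'::date
AND year = 2016;
</code></pre></div></div>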
<h2 id="with-the-extended-statistics">With the extended statistics</h2>
<p>Let’s inform the system about the column dependencies:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">STATISTICS</span> <span class="n">stat_day_year</span> <span class="p">(</span> <span class="n">dependencies</span> <span class="p">)</span>
<span class="k">ON</span> <span class="k">day</span><span class="p">,</span> <span class="nb">year</span>
<span class="k">FROM</span> <span class="n">expenses</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">STATISTICS</span> <span class="n">stat_day_quarter</span> <span class="p">(</span> <span class="n">dependencies</span> <span class="p">)</span>
<span class="k">ON</span> <span class="k">day</span><span class="p">,</span> <span class="n">quarter</span>
<span class="k">FROM</span> <span class="n">expenses</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="k">STATISTICS</span> <span class="n">stat_value_incoming</span><span class="p">(</span> <span class="n">dependencies</span> <span class="p">)</span>
<span class="k">ON</span> <span class="n">value</span><span class="p">,</span> <span class="n">incoming</span>
<span class="k">FROM</span> <span class="n">expenses</span><span class="p">;</span>
</code></pre></div></div>
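<p>Creating the statistics objects only registers their definitions: the dependency degrees are computed when the table is analyzed. A minimal step to run before inspecting the results, assuming the <code class="language-plaintext highlighter-rouge">expenses</code> table above:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- collects the regular per-column statistics and the extended ones defined on the table
ANALYZE expenses;
</code></pre></div></div>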
<p>There is a special catalog <a href="https://www.postgresql.org/docs/10/static/catalog-pg-statistic-ext.html"><code class="language-plaintext highlighter-rouge">pg_statistic_ext</code></a> that holds data about the created statistics, but its computed values are filled in only by an <code class="language-plaintext highlighter-rouge">ANALYZE</code> command. Each row in the catalog corresponds to one statistics object and includes its kind and the dependencies found:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_statistic_ext</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">---|---------------------</span>
<span class="n">stxrelid</span> <span class="o">|</span> <span class="mi">51015</span>
<span class="n">stxname</span> <span class="o">|</span> <span class="n">stat_value_incoming</span>
<span class="n">stxnamespace</span> <span class="o">|</span> <span class="mi">2200</span>
<span class="n">stxowner</span> <span class="o">|</span> <span class="mi">16384</span>
<span class="n">stxkeys</span> <span class="o">|</span> <span class="mi">2</span> <span class="mi">6</span>
<span class="n">stxkind</span> <span class="o">|</span> <span class="p">{</span><span class="n">f</span><span class="p">}</span>
<span class="n">stxndistinct</span> <span class="o">|</span>
<span class="n">stxdependencies</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"2 => 6"</span><span class="p">:</span> <span class="mi">1</span><span class="p">.</span><span class="mi">000000</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">2</span> <span class="p">]</span><span class="c1">---|---------------------</span>
<span class="n">stxrelid</span> <span class="o">|</span> <span class="mi">51015</span>
<span class="n">stxname</span> <span class="o">|</span> <span class="n">stat_day_quarter</span>
<span class="n">stxnamespace</span> <span class="o">|</span> <span class="mi">2200</span>
<span class="n">stxowner</span> <span class="o">|</span> <span class="mi">16384</span>
<span class="n">stxkeys</span> <span class="o">|</span> <span class="mi">3</span> <span class="mi">4</span>
<span class="n">stxkind</span> <span class="o">|</span> <span class="p">{</span><span class="n">f</span><span class="p">}</span>
<span class="n">stxndistinct</span> <span class="o">|</span>
<span class="n">stxdependencies</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"3 => 4"</span><span class="p">:</span> <span class="mi">1</span><span class="p">.</span><span class="mi">000000</span><span class="p">}</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">3</span> <span class="p">]</span><span class="c1">---|---------------------</span>
<span class="n">stxrelid</span> <span class="o">|</span> <span class="mi">51015</span>
<span class="n">stxname</span> <span class="o">|</span> <span class="n">stat_day_year</span>
<span class="n">stxnamespace</span> <span class="o">|</span> <span class="mi">2200</span>
<span class="n">stxowner</span> <span class="o">|</span> <span class="mi">16384</span>
<span class="n">stxkeys</span> <span class="o">|</span> <span class="mi">3</span> <span class="mi">5</span>
<span class="n">stxkind</span> <span class="o">|</span> <span class="p">{</span><span class="n">f</span><span class="p">}</span>
<span class="n">stxndistinct</span> <span class="o">|</span>
<span class="n">stxdependencies</span> <span class="o">|</span> <span class="p">{</span><span class="nv">"3 => 5"</span><span class="p">:</span> <span class="mi">1</span><span class="p">.</span><span class="mi">000000</span><span class="p">}</span>
</code></pre></div></div>
<p>Therefore column <code class="language-plaintext highlighter-rouge">3</code> determines both columns <code class="language-plaintext highlighter-rouge">4</code> and <code class="language-plaintext highlighter-rouge">5</code> (see <code class="language-plaintext highlighter-rouge">stxkeys</code>) by a <em>functional</em> <code class="language-plaintext highlighter-rouge">{f}</code> dependency, and in particular the target columns are fully determined by the column they depend on (a degree of 1.0, that is 100%), as reported in the <code class="language-plaintext highlighter-rouge">stxdependencies</code> column. The same applies to column <code class="language-plaintext highlighter-rouge">2</code> (<code class="language-plaintext highlighter-rouge">value</code>), which determines column <code class="language-plaintext highlighter-rouge">6</code> (<code class="language-plaintext highlighter-rouge">incoming</code>).</p>
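<p>The numbers listed in <code class="language-plaintext highlighter-rouge">stxkeys</code> are attribute numbers of the table. One possible way to translate them into column names, sketched here under the assumption that the table is reachable as <code class="language-plaintext highlighter-rouge">expenses</code> in the current search path, is to look them up in <code class="language-plaintext highlighter-rouge">pg_attribute</code>:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- user columns have positive attribute numbers; system columns are negative
SELECT a.attnum, a.attname
FROM pg_attribute a
WHERE a.attrelid = 'expenses'::regclass
AND a.attnum > 0
AND NOT a.attisdropped
ORDER BY a.attnum;
</code></pre></div></div>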
<p>What is now the query plan?</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">EXPLAIN</span>
<span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">expenses</span>
<span class="k">WHERE</span> <span class="n">incoming</span> <span class="o">=</span> <span class="k">true</span>
<span class="k">AND</span> <span class="n">value</span> <span class="o">></span> <span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">::</span><span class="n">money</span>
<span class="k">AND</span> <span class="k">day</span> <span class="o">=</span> <span class="s1">'2016-7-19'</span><span class="p">::</span><span class="nb">date</span>
<span class="k">AND</span> <span class="nb">year</span> <span class="o">=</span> <span class="mi">2016</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">--------------------------------------------------------------------------------------------------------</span>
<span class="n">Gather</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1000</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">35251</span><span class="p">.</span><span class="mi">67</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1930</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span>
<span class="n">Workers</span> <span class="n">Planned</span><span class="p">:</span> <span class="mi">2</span>
<span class="o">-></span> <span class="n">Parallel</span> <span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">expenses</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">34058</span><span class="p">.</span><span class="mi">67</span> <span class="k">rows</span><span class="o">=</span><span class="mi">804</span> <span class="n">width</span><span class="o">=</span><span class="mi">32</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">incoming</span> <span class="k">AND</span> <span class="p">(</span><span class="k">day</span> <span class="o">=</span> <span class="s1">'2016-07-19'</span><span class="p">::</span><span class="nb">date</span><span class="p">)</span> <span class="k">AND</span> <span class="p">(</span><span class="nb">year</span> <span class="o">=</span> <span class="mi">2016</span><span class="p">)</span> <span class="k">AND</span> <span class="p">(</span><span class="n">value</span> <span class="o">></span> <span class="p">(</span><span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">)::</span><span class="n">money</span><span class="p">))</span>
</code></pre></div></div>
<p>As you can see, the planner can now estimate almost exactly the number of tuples that will be returned (<em>1930</em> estimated against the <em>1982</em> actual). The planner now knows that the predicates on <code class="language-plaintext highlighter-rouge">incoming</code> and <code class="language-plaintext highlighter-rouge">value</code>, as well as those on <code class="language-plaintext highlighter-rouge">day</code> and <code class="language-plaintext highlighter-rouge">year</code>, are tied by a functional dependency, and therefore it does not multiply the individual selectivities to get the final filtering ratio.</p>
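<p>To get a feeling for how much the multiplication of ratios costs, the following sketch computes, from the real data, what the row count would be if <code class="language-plaintext highlighter-rouge">day</code> and <code class="language-plaintext highlighter-rouge">year</code> were independent, and compares it with the real count. It is a simplification that ignores the other two predicates and uses exact counts instead of the sampled statistics the planner works with, so it does not reproduce the planner numbers exactly. With the figures reported above (2000 rows per day, 732000 rows in 2016, 2 million rows in total) the first column should come out at roughly 732 and the second at 2000:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WITH s AS (
SELECT count(*)::numeric AS total,
count(*) FILTER ( WHERE day = '2016-7-19'::date ) AS n_day,
count(*) FILTER ( WHERE year = 2016 ) AS n_year,
count(*) FILTER ( WHERE day = '2016-7-19'::date AND year = 2016 ) AS n_both
FROM expenses
)
-- independence assumption: total * P(day) * P(year)
SELECT round( total * ( n_day / total ) * ( n_year / total ) ) AS rows_if_independent,
n_both AS rows_actual
FROM s;
</code></pre></div></div>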
<h2 id="what-about-indexes">What about indexes?</h2>
<p>Initially I thought that the new extended statistics could <em>automagically</em> understand index dependencies too, so that an index built on an expression of <code class="language-plaintext highlighter-rouge">day</code> could also be used to retrieve <code class="language-plaintext highlighter-rouge">year</code>. Clearly, <a href="https://www.postgresql.org/message-id/CAKoxK%2B6C8CKdbYbbyNeYnc5aiDk%3DG-k-iDyDZMcmjJATqkLM9w%40mail.gmail.com">I was wrong</a>, and that proves that <em>I have to do a better job learning this new feature!</em></p>
Sqitch and Sqitchers2018-06-26T00:00:00+00:00https://fluca1978.github.io/2018/06/26/SqitchUpdate<p>Sqitch has nothing particular to do with PostgreSQL, except it does support our beloved database!</p>
<h1 id="sqitch-and-sqitchers">Sqitch and Sqitchers</h1>
<p>Long story short: <strong>if you are not using <a href="https://sqitch.org/"><code class="language-plaintext highlighter-rouge">sqitch</code></a> you should give it a look</strong>.
<br />
<br />
<a href="https://sqitch.org/"><code class="language-plaintext highlighter-rouge">sqitch</code></a> does not tie itself <em>only</em> to PostgreSQL: it supports a lot of relational engines. However, if you want to know how to start using Sqitch over PostgreSQL go read the excellent <a href="https://github.com/sqitchers/sqitch/blob/master/lib/sqitchtutorial.pod">Introduction to Sqitch on PostgreSQL</a>.
<br />
<br /></p>
<p>I’ve already written about <a href="https://fluca1978.github.io/2014/12/20/calendario-dellavvento-itpug-20-dicembre.html"><code class="language-plaintext highlighter-rouge">sqitch</code> in the past (in Italian)</a>.
<br />
<a href="https://sqitch.org/"><code class="language-plaintext highlighter-rouge">sqitch</code></a> is a great tool to manage database changes, mainly schema changes. The idea is to provide a <code class="language-plaintext highlighter-rouge">git</code>-like interface to manage <em>changes</em>; a <em>change</em> is made of three scripts appropriately written for the backend database:</p>
<ul>
<li>a <em>deploy</em> script (what to do);</li>
<li>a <em>revert</em> script (how to undo);</li>
<li>a <em>verify</em> script (how to check the deploy succeeded).</li>
</ul>
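<p>As a rough sketch of what the three scripts of a single change could contain on PostgreSQL, consider the following. The change name <code class="language-plaintext highlighter-rouge">notes_table</code> and the <code class="language-plaintext highlighter-rouge">notes</code> table are made up for illustration, and the file names in the comments assume the default directory layout; the check script is the one Sqitch calls <em>verify</em>, and it passes as long as it does not raise an error:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- deploy/notes_table.sql: what to do
BEGIN;
CREATE TABLE notes (
id serial PRIMARY KEY,
body text NOT NULL
);
COMMIT;

-- revert/notes_table.sql: how to undo
BEGIN;
DROP TABLE notes;
COMMIT;

-- verify/notes_table.sql: fails if the table or its columns are missing
BEGIN;
SELECT id, body FROM notes WHERE false;
ROLLBACK;
</code></pre></div></div>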
<p><br />
<br /></p>
<h3 id="introducing-sqitchers">Introducing <em>sqitchers</em>.</h3>
<p><br />
<a href="https://justatheory.com/2018/05/sqitchers/">Around a month ago</a>, the <code class="language-plaintext highlighter-rouge">sqitch</code> creator, David E. Wheeler, created a GitHub Organization named <a href="https://github.com/sqitchers"><code class="language-plaintext highlighter-rouge">sqitchers</code></a> that now holds all the Sqitch related stuff including, obviously, the codebase for the project. At the same time, the Sqitch steering committee has grown, and this is a good thing since this project quickly became a handy tool for database management.
<br />
<br />
In conclusion, <code class="language-plaintext highlighter-rouge">sqitch</code> is growing and getting more free every day. If you are curious about the project history, as explained by its own creator David E. Wheeler, <a href="https://justatheory.com/2014/09/sqitch-on-floss-weekly/">I suggest you listen to this (old) FLOSS Weekly podcast</a>.</p>
Statements with RETURNING: Perl and Java clients2018-05-31T00:00:00+00:00https://fluca1978.github.io/2018/05/31/PostgreSQL_Returning_Clients<p>PostgreSQL statements support the <code class="language-plaintext highlighter-rouge">RETURNING</code> clause, which allows a statement that manipulates tuples to return a set of columns of such tuples. It is easy to use such statements from a client to get back data that was not known when the query was written.</p>
<h1 id="statements-with-returning-perl-and-java-clients">Statements with RETURNING: Perl and Java clients</h1>
<p>Statements such as <code class="language-plaintext highlighter-rouge">INSERT</code>, <code class="language-plaintext highlighter-rouge">DELETE</code> and <code class="language-plaintext highlighter-rouge">UPDATE</code> can have a <code class="language-plaintext highlighter-rouge">RETURNING</code> clause that allows you to get back the data that the statement has manipulated. From a theoretical point of view, it is as if the following two statements were executed:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">INSERT</span><span class="o">|</span><span class="k">UPDATE</span><span class="o">|</span><span class="k">DELETE</span> <span class="n">tuples</span><span class="p">;</span>
<span class="k">SELECT</span> <span class="n">above_tuples</span><span class="p">;</span>
</code></pre></div></div>
<p>From within the database connection, such a <code class="language-plaintext highlighter-rouge">RETURNING</code> clause can be very useful to <em>see</em> which tuples have been modified, and from a client perspective it can be used to get back randomly generated and sequence-based values.
Consider a simple table defined as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">foo</span>
<span class="p">(</span>
<span class="n">pk</span> <span class="nb">serial</span>
<span class="p">,</span> <span class="n">rv</span> <span class="nb">float</span>
<span class="p">);</span>
</code></pre></div></div>
<p>and consider the following simple statement to insert values:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">foo</span><span class="p">(</span> <span class="n">rv</span> <span class="p">)</span>
<span class="k">SELECT</span> <span class="n">random</span><span class="p">()</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">10</span> <span class="p">);</span>
</code></pre></div></div>
<p>The above query inserts 10 tuples with <code class="language-plaintext highlighter-rouge">rv</code> set to a random value and <code class="language-plaintext highlighter-rouge">pk</code> set to the next value of the associated sequence. In other words, it is not possible to know in advance what values will be inserted.</p>
<p>Thanks to <code class="language-plaintext highlighter-rouge">RETURNING</code> this knowledge is pushed back to the client, and can be consumed as a normal result set, that means as if the client issued a <code class="language-plaintext highlighter-rouge">SELECT</code> statement.</p>
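<p>Directly in SQL, the insert above only needs the clause appended to it. A minimal sketch, to be run for instance from <code class="language-plaintext highlighter-rouge">psql</code> against the <code class="language-plaintext highlighter-rouge">foo</code> table defined earlier (the values returned differ at every run, so no output is shown):</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- same INSERT as before, now handing back the generated key and the random value
INSERT INTO foo( rv )
SELECT random()
FROM generate_series( 1, 10 )
RETURNING pk, rv;
</code></pre></div></div>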
<p>As a simple example, consider <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/clients/perl/returning.pl">the following Perl client</a>:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nv">DBI</span><span class="p">;</span>
<span class="k">use</span> <span class="nv">v5</span><span class="mf">.20</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$dbh</span> <span class="o">=</span> <span class="nv">DBI</span><span class="o">-></span><span class="nb">connect</span><span class="p">("</span><span class="s2">dbi:Pg:dbname=testdb;host=localhost;port=5432</span><span class="p">",</span>
<span class="p">'</span><span class="s1">luca</span><span class="p">',</span>
<span class="p">'',</span>
<span class="p">{</span><span class="s">AutoCommit</span> <span class="o">=></span> <span class="mi">0</span><span class="p">}</span> <span class="p">);</span>
<span class="k">my</span> <span class="nv">$query</span> <span class="o">=</span> <span class="s"><<'END_QUERY';
INSERT INTO foo( rv )
SELECT random()
FROM generate_series( 1, 10 )
RETURNING pk, rv;
END_QUERY
</span>
<span class="k">my</span> <span class="nv">$statement</span> <span class="o">=</span> <span class="nv">$dbh</span><span class="o">-></span><span class="nv">prepare</span><span class="p">(</span> <span class="nv">$query</span> <span class="p">);</span>
<span class="nv">$statement</span><span class="o">-></span><span class="nv">execute</span><span class="p">();</span>
<span class="k">while</span> <span class="p">(</span> <span class="k">my</span> <span class="nv">$result</span> <span class="o">=</span> <span class="nv">$statement</span><span class="o">-></span><span class="nv">fetchrow_hashref</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">say</span> <span class="nb">sprintf</span> <span class="p">'</span><span class="s1">The statement inserted pk = %d and a random value rv = %f</span><span class="p">',</span>
<span class="nv">$result</span><span class="o">-></span><span class="p">{</span> <span class="nv">pk</span> <span class="p">},</span>
<span class="nv">$result</span><span class="o">-></span><span class="p">{</span> <span class="nv">rv</span> <span class="p">};</span>
<span class="p">}</span>
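<span class="c1"># AutoCommit is off: commit explicitly before disconnecting so the inserted rows are kept</span>
<span class="nv">$dbh</span><span class="o">-></span><span class="nv">commit</span><span class="p">();</span>
<span class="nv">$dbh</span><span class="o">-></span><span class="nv">disconnect</span><span class="p">();</span>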
</code></pre></div></div>
<p>that produces an output similar to the following one:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The statement inserted pk <span class="o">=</span> 11 and a random value rv <span class="o">=</span> 0.258626
The statement inserted pk <span class="o">=</span> 12 and a random value rv <span class="o">=</span> 0.877215
The statement inserted pk <span class="o">=</span> 13 and a random value rv <span class="o">=</span> 0.900430
The statement inserted pk <span class="o">=</span> 14 and a random value rv <span class="o">=</span> 0.312273
The statement inserted pk <span class="o">=</span> 15 and a random value rv <span class="o">=</span> 0.300636
The statement inserted pk <span class="o">=</span> 16 and a random value rv <span class="o">=</span> 0.401800
The statement inserted pk <span class="o">=</span> 17 and a random value rv <span class="o">=</span> 0.446666
The statement inserted pk <span class="o">=</span> 18 and a random value rv <span class="o">=</span> 0.352235
The statement inserted pk <span class="o">=</span> 19 and a random value rv <span class="o">=</span> 0.390648
The statement inserted pk <span class="o">=</span> 20 and a random value rv <span class="o">=</span> 0.790937
</code></pre></div></div>
<p>and the <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/clients/java/returning.java">corresponding Java client</a>:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">java.sql.*</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">java.util.*</span><span class="o">;</span>
<span class="kd">class</span> <span class="nc">returning</span> <span class="o">{</span>
<span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span> <span class="nc">String</span> <span class="n">argv</span><span class="o">[]</span> <span class="o">)</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
<span class="nc">Class</span><span class="o">.</span><span class="na">forName</span><span class="o">(</span> <span class="s">"org.postgresql.Driver"</span> <span class="o">);</span>
<span class="nc">String</span> <span class="n">connectionURL</span> <span class="o">=</span> <span class="s">"jdbc:postgresql://localhost/testdb"</span><span class="o">;</span>
<span class="nc">Properties</span> <span class="n">connectionProperties</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Properties</span><span class="o">();</span>
<span class="n">connectionProperties</span><span class="o">.</span><span class="na">put</span><span class="o">(</span> <span class="s">"user"</span><span class="o">,</span> <span class="s">"luca"</span> <span class="o">);</span>
<span class="n">connectionProperties</span><span class="o">.</span><span class="na">put</span><span class="o">(</span> <span class="s">"password"</span><span class="o">,</span> <span class="s">"xyz"</span> <span class="o">);</span>
<span class="nc">Connection</span> <span class="n">conn</span> <span class="o">=</span> <span class="nc">DriverManager</span><span class="o">.</span><span class="na">getConnection</span><span class="o">(</span> <span class="n">connectionURL</span><span class="o">,</span> <span class="n">connectionProperties</span> <span class="o">);</span>
<span class="nc">String</span> <span class="n">query</span> <span class="o">=</span> <span class="s">"INSERT INTO foo( rv ) "</span>
<span class="o">+</span> <span class="s">" SELECT random() "</span>
<span class="o">+</span> <span class="s">" FROM generate_series( 1, 10 ) "</span>
<span class="o">+</span> <span class="s">" RETURNING pk, rv;"</span><span class="o">;</span>
<span class="nc">Statement</span> <span class="n">statement</span> <span class="o">=</span> <span class="n">conn</span><span class="o">.</span><span class="na">createStatement</span><span class="o">();</span>
<span class="nc">ResultSet</span> <span class="n">resultSet</span> <span class="o">=</span> <span class="n">statement</span><span class="o">.</span><span class="na">executeQuery</span><span class="o">(</span> <span class="n">query</span> <span class="o">);</span>
<span class="k">while</span> <span class="o">(</span> <span class="n">resultSet</span><span class="o">.</span><span class="na">next</span><span class="o">()</span> <span class="o">)</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="nc">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span> <span class="s">"The statement inserted pk = %d and a random value rv = %f "</span><span class="o">,</span>
<span class="n">resultSet</span><span class="o">.</span><span class="na">getLong</span><span class="o">(</span> <span class="s">"pk"</span> <span class="o">),</span>
<span class="n">resultSet</span><span class="o">.</span><span class="na">getFloat</span><span class="o">(</span> <span class="s">"rv"</span> <span class="o">)</span> <span class="o">)</span> <span class="o">);</span>
<span class="n">resultSet</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
<span class="n">statement</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p>that produces the very same result.</p>
pg_chocolate (aka the end of course in Modena)2018-05-23T00:00:00+00:00https://fluca1978.github.io/2018/05/23/pg_chocolate<p>Yesterday was my last session at the local Linux Users’ Group ConoscereLinux course on PostgreSQL. Since the LUG gave me the chance to deliver an extra session, and since all the attendees were fun and nice, I decided to “contribute back”.</p>
<h1 id="pg_chocolate"><code class="language-plaintext highlighter-rouge">pg_chocolate</code></h1>
<p>No, this is not a new project about our favourite database.
<br />
I asked my great wife to make a chocolate cake to share with all attendees, and we came up with the idea of putting the elephant logo on top of it.</p>
<p><br />
Let’s say it was not difficult for my wife to produce the elephant cake, which disappeared bite after bite… you know, even complex queries become simpler in front of a good cake!
Chances are there will be a <em>replication</em> instance coming up sooner or later!</p>
<p><br />
<img src="/images/posts/pg_chocolate/pg_chocolate_1.jpg" alt="pg_chocolate_1" />
<img src="/images/posts/pg_chocolate/pg_chocolate_2.jpg" alt="pg_chocolate_2" /></p>
<p>As a side note, <strong>I really have to thank my friend Max</strong> for patiently driving me home and, most notably, for coming up with the idea of this course. Special thanks to the Modena Linux User Group <a href="https://conoscerelinux.org/">ConoscereLinux</a> and its president <strong>Luca</strong> for hosting the course and helping me arrange the material.</p>
plperl: invoking other subroutines2018-05-04T00:00:00+00:00https://fluca1978.github.io/2018/05/04/PostgreSQLPLPERLTrampoline<p><code class="language-plaintext highlighter-rouge">plperl</code> does not allow <em>direct sub invocation</em>, so the only way is to execute a query.</p>
<h1 id="plperl-invoking-other-subroutines"><code class="language-plaintext highlighter-rouge">plperl</code>: invoking other subroutines</h1>
<p>The official <a href="https://www.postgresql.org/docs/current/static/plperl-global.html">plperl documentation</a> shows you a way to use a <code class="language-plaintext highlighter-rouge">subref</code> to invoke code shared across different <code class="language-plaintext highlighter-rouge">plperl</code> functions via the special global hash <code class="language-plaintext highlighter-rouge">%_SHARED</code>. While this is a good approach, it only works for code <em>attached to the hash</em>, that is, a kind of closure (e.g., a dispatch table), and it requires initializing the <code class="language-plaintext highlighter-rouge">%_SHARED</code> hash every time, since <code class="language-plaintext highlighter-rouge">plperl</code> interpreters do not share anything across sessions.</p>
<p>The other way, always working, is to execute a query to perform the <code class="language-plaintext highlighter-rouge">SELECT</code> that will invoke the function.
As an example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">plperl_trampoline</span><span class="p">(</span> <span class="n">fun_name</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">TEXT</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">PERL</span><span class="err">$</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">fun_name</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="k">return</span> <span class="n">undef</span> <span class="n">if</span> <span class="p">(</span> <span class="o">!</span> <span class="err">$</span><span class="n">fun_name</span> <span class="p">);</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">DEBUG</span><span class="p">,</span> <span class="nv">"Calling [$fun_name]"</span> <span class="p">);</span>
<span class="n">my</span> <span class="err">$</span><span class="n">result_set</span> <span class="o">=</span> <span class="n">spi_exec_query</span><span class="p">(</span> <span class="nv">"SELECT $fun_name() AS result;"</span> <span class="p">);</span>
<span class="k">return</span> <span class="err">$</span><span class="n">result_set</span><span class="o">-></span><span class="p">{</span> <span class="k">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="k">result</span> <span class="p">};</span>
<span class="err">$</span><span class="n">PERL</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p>so that you can simply do:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">></span> <span class="k">select</span> <span class="n">plperl_trampoline</span><span class="p">(</span> <span class="s1">'now'</span> <span class="p">);</span>
<span class="n">plperl_trampoline</span>
<span class="c1">------------------------------</span>
<span class="mi">2018</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">04</span> <span class="mi">13</span><span class="p">:</span><span class="mi">09</span><span class="p">:</span><span class="mi">17</span><span class="p">.</span><span class="mi">11772</span><span class="o">+</span><span class="mi">02</span>
</code></pre></div></div>
<p>The problem with this solution should be clear: it only works for a set of functions with the same prototype.
In fact, while it could be simple to work around the argument passing (thanks to some magic with Perl arrays), the return type and, most notably, its arity make the approach hard to generalize.</p>
<p><br />
<br />
Another introspective approach could have been to use <code class="language-plaintext highlighter-rouge">pg_proc.prosrc</code> to translate the Perl code to an anonymous function on the fly, and put it into the <code class="language-plaintext highlighter-rouge">%_SHARED</code> global hash. However, this requires special care about arguments too, and makes it less than trivial to handle the function prototype.</p>
<h1 id="an-example-of-using-_shared-to-get-sequence-values">An example of using <code class="language-plaintext highlighter-rouge">%_SHARED</code> to get sequence values</h1>
<p>One common issue when dealing with stored procedures is getting new values from sequences. While this is really trivial in <code class="language-plaintext highlighter-rouge">plpgsql</code>, where it reduces to a single call to <code class="language-plaintext highlighter-rouge">nextval()</code>, it is not so simple in <code class="language-plaintext highlighter-rouge">plperl</code>, where an <code class="language-plaintext highlighter-rouge">spi_exec_query()</code> has to be issued.
It is however possible to use the <code class="language-plaintext highlighter-rouge">%_SHARED</code> hash to add a handler for the same query:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">plperl_add_sequence_handler</span><span class="p">(</span> <span class="n">s</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">PERL</span><span class="err">$</span>
<span class="n">my</span> <span class="p">(</span> <span class="err">$</span><span class="n">sequence</span> <span class="p">)</span> <span class="o">=</span> <span class="o">@</span><span class="n">_</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span> <span class="n">if</span> <span class="p">(</span> <span class="o">!</span> <span class="err">$</span><span class="n">sequence</span> <span class="p">);</span>
<span class="n">my</span> <span class="err">$</span><span class="n">query</span> <span class="o">=</span> <span class="n">sprintf</span> <span class="nv">"SELECT nextval( '%s' )"</span><span class="p">,</span> <span class="err">$</span><span class="n">sequence</span><span class="p">;</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">DEBUG</span><span class="p">,</span> <span class="nv">"Query [$query]"</span> <span class="p">);</span>
<span class="err">$</span><span class="n">_SHARED</span><span class="p">{</span> <span class="err">$</span><span class="n">sequence</span> <span class="p">}</span> <span class="o">=</span> <span class="n">sub</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">spi_exec_query</span><span class="p">(</span> <span class="err">$</span><span class="n">query</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="k">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="n">nextval</span> <span class="p">};</span>
<span class="p">};</span>
<span class="err">$</span><span class="n">PERL</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plperl</span><span class="p">;</span>
</code></pre></div></div>
<p>The above function sets up a handler named after the sequence itself, and each time its code is executed a query is issued against the database to get the <code class="language-plaintext highlighter-rouge">nextval()</code>.
Therefore, it is quite simple to set up a sequence handler in a session and retrieve a value:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">></span> <span class="k">SELECT</span> <span class="n">plperl_add_sequence_handler</span><span class="p">(</span> <span class="s1">'persona_pk_seq'</span> <span class="p">);</span>
<span class="c1">-- and later</span>
<span class="o">></span> <span class="k">DO</span> <span class="k">language</span> <span class="n">plperl</span> <span class="err">$$</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="err">$</span><span class="n">_SHARED</span><span class="p">{</span> <span class="n">persona_pk_seq</span> <span class="p">}</span><span class="o">-></span><span class="p">()</span> <span class="p">);</span>
<span class="err">$$</span><span class="p">;</span>
</code></pre></div></div>
<p>In the above <code class="language-plaintext highlighter-rouge">plperl</code> code the coderef is invoked via a reference in the <code class="language-plaintext highlighter-rouge">%_SHARED</code> hash, in particular:</p>
<p><code class="language-plaintext highlighter-rouge">$_SHARED{ persona_pk_seq }->()</code></p>
<p>so that it is easier to get a sequence value in a <em>Perl-way</em>.</p>
PostgreSQL online course via BSD Magazine2018-05-04T00:00:00+00:00https://fluca1978.github.io/2018/05/04/PostgreSQLBSDMagazineCourse<p>I’m preparing another short course on PostgreSQL, this time online with written material.</p>
<h1 id="postgresql-online-course-via-bsd-magazine">PostgreSQL online course via BSD Magazine</h1>
<p><a href="http://bsdmag.org">BSD Magazine</a> is delivering a <a href="https://bsdmag.org/course/course-10-improve-your-postgresql-skills/">PostgreSQL intermediate course</a> with my own material.</p>
<p>I’ve been writing articles for <a href="http://bsdmag.org">BSD Magazine</a> for a long time now, many of them about <em>PostgreSQL</em>, so, in agreement with the editors, we decided to create a full course with <strong>5 modules</strong> and <em>written material</em> to give readers a more detailed view of PostgreSQL capabilities.</p>
<p><br />
<br />
The course will be delivered by providing attendees with written material, examples and exercises, and by offering online support for doubts and questions. This course is an <em>intermediate</em> one, meaning it will not cover basic concepts like installation, basic SQL statements, <code class="language-plaintext highlighter-rouge">psql</code> and connection strings, and so on. The topic list is available on the <a href="https://bsdmag.org/course/course-10-improve-your-postgresql-skills/">course page</a>.</p>
<p><br />
<br />
I would like to thank <a href="http://bsdmag.org">BSD Magazine</a> editors for the great opportunity to spread again the word of PostgreSQL!</p>
plperl: which version of Perl?2018-05-03T00:00:00+00:00https://fluca1978.github.io/2018/05/03/PostgreSQL-plperl-versions<p><code class="language-plaintext highlighter-rouge">plperl</code> is a great extension for PostgreSQL that allows the execution of Perl 5 code within the database.</p>
<h1 id="plperl-which-version-of-perl"><code class="language-plaintext highlighter-rouge">plperl</code>: which version of Perl?</h1>
<p>When executing Perl 5 code within the database, PostgreSQL uses the <em>embedded Perl 5</em> to create one (or more) instances of the interpreter. The version of the compiler and virtual machine that runs depends on how PostgreSQL has been compiled, or better, how <code class="language-plaintext highlighter-rouge">libperl.so</code> has been created.
It is possible to use a specific version of Perl without having to change the system wide Perl 5, and in particular it is possible with some effort to use <code class="language-plaintext highlighter-rouge">perlbrew</code> to this aim.</p>
<h2 id="understanding-which-perl-the-database-is-executing">Understanding which <code class="language-plaintext highlighter-rouge">perl</code> the database is executing</h2>
<p>To know which <code class="language-plaintext highlighter-rouge">perl</code> executable the server will run it is possible to use the <code class="language-plaintext highlighter-rouge">Config</code> module for a little introspection:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">DO</span> <span class="k">LANGUAGE</span> <span class="n">plperlu</span>
<span class="err">$</span><span class="n">PERL</span><span class="err">$</span>
<span class="n">use</span> <span class="n">Config</span><span class="p">;</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="s1">'Perl executable '</span> <span class="p">.</span> <span class="err">$</span><span class="n">Config</span><span class="p">{</span> <span class="n">perlpath</span> <span class="p">}</span> <span class="p">);</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="s1">'Perl version '</span> <span class="p">.</span> <span class="err">$</span><span class="n">Config</span><span class="p">{</span> <span class="k">version</span> <span class="p">}</span> <span class="p">);</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="s1">'Perl library '</span> <span class="p">.</span> <span class="err">$</span><span class="n">Config</span><span class="p">{</span> <span class="n">libperl</span> <span class="p">}</span> <span class="p">);</span>
<span class="err">$</span><span class="n">PERL</span><span class="err">$</span><span class="p">;</span>
</code></pre></div></div>
<p>For example, the above piece of code produces the following output on my system:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INFO: Perl executable /usr/local/bin/perl
INFO: Perl version 5.24.3
INFO: Perl library libperl.so.5.24.3
</code></pre></div></div>
<p>that tells clearly the <code class="language-plaintext highlighter-rouge">perl</code> executable is at version <code class="language-plaintext highlighter-rouge">5.24.3</code>.
The same could have been checked from the <code class="language-plaintext highlighter-rouge">plperl.so</code> library file, that is linked to the above version of the library:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% ldd /usr/local/lib/postgresql/plperl.so
/usr/local/lib/postgresql/plperl.so:
libthr.so.3 <span class="o">=></span> /lib/libthr.so.3 <span class="o">(</span>0x801214000<span class="o">)</span>
libperl.so.5.24 <span class="o">=></span> /usr/local/lib/perl5/5.24/mach/CORE/libperl.so.5.24 <span class="o">(</span>0x80143c000<span class="o">)</span>
libc.so.7 <span class="o">=></span> /lib/libc.so.7 <span class="o">(</span>0x800824000<span class="o">)</span>
libm.so.5 <span class="o">=></span> /lib/libm.so.5 <span class="o">(</span>0x801833000<span class="o">)</span>
libcrypt.so.5 <span class="o">=></span> /lib/libcrypt.so.5 <span class="o">(</span>0x801a5e000<span class="o">)</span>
libutil.so.9 <span class="o">=></span> /lib/libutil.so.9 <span class="o">(</span>0x801c7d000<span class="o">)</span>
</code></pre></div></div>
<p>that is <code class="language-plaintext highlighter-rouge">libperl.so.5.24</code>.</p>
<h2 id="installing-a-different-plperl-version-using-perlbrew">Installing a different <code class="language-plaintext highlighter-rouge">plperl</code> version using <code class="language-plaintext highlighter-rouge">perlbrew</code></h2>
<p><a href="http://perlbrew.pl">perlbrew</a> is a great tool to make different versions of Perl 5 co-exist on the same system.
Thanks to <code class="language-plaintext highlighter-rouge">perlbrew</code> it is possible to install another version of Perl 5 without touching the system-wide installation.
<strong>In order to be usable by PostgreSQL, Perl must be compiled with the shared library option <code class="language-plaintext highlighter-rouge">-D useshrplib</code></strong>, so as the user that owns the PostgreSQL daemon:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% perlbrew <span class="nb">install</span> <span class="nt">--multi</span> perl-5.26.2 <span class="nt">-D</span> useshrplib
...
% perlbrew switch perl-5.26.2
</code></pre></div></div>
<p>It is now possible to compile a new version of PostgreSQL, in order to get the new <code class="language-plaintext highlighter-rouge">plperl</code>:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% ./configure <span class="nt">--prefix</span><span class="o">=</span>/opt/postgresql/10/3 <span class="nt">--with-perl</span> <span class="nt">--with-python</span> <span class="nt">--without-readline</span>
...
% ldd /opt/postgresql/10/3/lib/plperl.so
/opt/postgresql/10/3/lib/plperl.so:
libperl.so <span class="o">=></span>
/var/db/postgres/perl5/perlbrew/perls/perl-5.26.2/lib/5.26.2/amd64-freebsd-multi/CORE/libperl.so
<span class="o">(</span>0x801214000<span class="o">)</span>
libc.so.7 <span class="o">=></span> /lib/libc.so.7 <span class="o">(</span>0x800824000<span class="o">)</span>
libthr.so.3 <span class="o">=></span> /lib/libthr.so.3 <span class="o">(</span>0x801612000<span class="o">)</span>
libm.so.5 <span class="o">=></span> /lib/libm.so.5 <span class="o">(</span>0x80183a000<span class="o">)</span>
libcrypt.so.5 <span class="o">=></span> /lib/libcrypt.so.5 <span class="o">(</span>0x801a65000<span class="o">)</span>
libutil.so.9 <span class="o">=></span> /lib/libutil.so.9 <span class="o">(</span>0x801c84000<span class="o">)</span>
</code></pre></div></div>
<p>As you can see, <code class="language-plaintext highlighter-rouge">plperl.so</code> is now linked to the <code class="language-plaintext highlighter-rouge">5.26.2</code> version of <code class="language-plaintext highlighter-rouge">libperl.so</code>.
It is now time to test the new installation!</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">plperl</span><span class="p">;</span>
<span class="o">#</span> <span class="k">CREATE</span> <span class="k">LANGUAGE</span> <span class="n">plperlu</span><span class="p">;</span>
<span class="o">#</span> <span class="k">DO</span> <span class="k">LANGUAGE</span> <span class="n">plperlu</span>
<span class="err">$</span><span class="n">PERL</span><span class="err">$</span>
<span class="n">use</span> <span class="n">Config</span><span class="p">;</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="s1">'Perl executable '</span> <span class="p">.</span> <span class="err">$</span><span class="n">Config</span><span class="p">{</span> <span class="n">perlpath</span> <span class="p">}</span> <span class="p">);</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="s1">'Perl version '</span> <span class="p">.</span> <span class="err">$</span><span class="n">Config</span><span class="p">{</span> <span class="k">version</span> <span class="p">}</span> <span class="p">);</span>
<span class="n">elog</span><span class="p">(</span> <span class="n">INFO</span><span class="p">,</span> <span class="s1">'Perl library '</span> <span class="p">.</span> <span class="err">$</span><span class="n">Config</span><span class="p">{</span> <span class="n">libperl</span> <span class="p">}</span> <span class="p">);</span>
<span class="err">$</span><span class="n">PERL</span><span class="err">$</span><span class="p">;</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Perl</span> <span class="n">executable</span> <span class="o">/</span><span class="n">var</span><span class="o">/</span><span class="n">db</span><span class="o">/</span><span class="n">postgres</span><span class="o">/</span><span class="n">perl5</span><span class="o">/</span><span class="n">perlbrew</span><span class="o">/</span><span class="n">perls</span><span class="o">/</span><span class="n">perl</span><span class="o">-</span><span class="mi">5</span><span class="p">.</span><span class="mi">26</span><span class="p">.</span><span class="mi">2</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">perl</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Perl</span> <span class="k">version</span> <span class="mi">5</span><span class="p">.</span><span class="mi">26</span><span class="p">.</span><span class="mi">2</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Perl</span> <span class="n">library</span> <span class="n">libperl</span><span class="p">.</span><span class="n">so</span>
<span class="k">DO</span>
</code></pre></div></div>
<p>That’s it!</p>
Generating an italian 'Codice Fiscale' via plpgsql or plperl2018-04-19T00:00:00+00:00https://fluca1978.github.io/2018/04/19/PostgreSQLCodiceFiscale<p>PostgreSQL built-in <code class="language-plaintext highlighter-rouge">plpgsql</code> can be used to build <em>stored procedures</em> and, with a few tricks, to consume data and translate it into other forms. It is also possible to generate a so-called <em>codice fiscale</em>, the Italian string that represents the <em>tax payer number</em> based on the person’s name, birth date and place.
This post will show some concepts about how to generate the single pieces of the <em>codice fiscale</em> via <code class="language-plaintext highlighter-rouge">plpgsql</code>.
And why not? Let’s compare it to a <code class="language-plaintext highlighter-rouge">plperl</code> implementation.</p>
<h1 id="generating-an-italian-codice-fiscale">Generating an italian <em>codice fiscale</em></h1>
<p>In order to provide a fairly complete example of usage of <code class="language-plaintext highlighter-rouge">plpgsql</code> for a course of mine, I developed a few functions to build up an Italian <em>codice fiscale</em> (tax payer number). The idea is not to have a fully working implementation, but rather to demonstrate the usage of different operators and functions in <code class="language-plaintext highlighter-rouge">plpgsql</code>, and to compare the implementation with a <code class="language-plaintext highlighter-rouge">plperl</code> one.</p>
<p>The full rules for building up a <em>codice fiscale</em> are available <a href="https://it.wikipedia.org/wiki/Codice_fiscale">here in italian</a>. The code shown below is freely available on my <a href="https://github.com/fluca1978/fluca1978-pg-utils">GitHub PostgreSQL-related repository</a>, and in particular there are two scripts:</p>
<ul>
<li>a <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/codice_fiscale.sql">pure <code class="language-plaintext highlighter-rouge">plpgsql</code> script</a>;</li>
<li>a <a href="https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/codice_fiscale.plperl.sql"><code class="language-plaintext highlighter-rouge">plperl</code> implementation</a>.</li>
</ul>
<p>In order to generate a full “codice fiscale” you need to extract some letters from the surname and the name, build a string representing both the date of birth and gender, a code representing the birth place and last comes a character that works as a <em>checksum</em> of all the previous parts.</p>
<p>In order to obtain the full result the following function can be used:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf</span><span class="p">(</span> <span class="n">surname</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">name</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">birth_date</span> <span class="nb">date</span><span class="p">,</span>
<span class="n">birth_place</span> <span class="nb">text</span><span class="p">,</span>
<span class="n">gender</span> <span class="nb">bool</span> <span class="k">DEFAULT</span> <span class="k">true</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">char</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">cf_string</span> <span class="nb">char</span><span class="p">(</span><span class="mi">15</span><span class="p">);</span>
<span class="n">cf_check</span> <span class="nb">char</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="k">BEGIN</span>
<span class="n">cf_string</span> <span class="p">:</span><span class="o">=</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_letters</span><span class="p">(</span> <span class="n">surname</span> <span class="p">)</span>
<span class="o">||</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_letters</span><span class="p">(</span> <span class="n">name</span><span class="p">,</span> <span class="k">true</span> <span class="p">)</span>
<span class="o">||</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_date</span><span class="p">(</span> <span class="n">birth_date</span><span class="p">,</span> <span class="n">gender</span> <span class="p">)</span>
<span class="o">||</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_place</span><span class="p">(</span> <span class="n">birth_place</span> <span class="p">);</span>
<span class="n">cf_check</span> <span class="p">:</span><span class="o">=</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_check</span><span class="p">(</span> <span class="n">cf_string</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'cf: % + %'</span><span class="p">,</span> <span class="n">cf_string</span><span class="p">,</span> <span class="n">cf_check</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="k">upper</span><span class="p">(</span> <span class="n">cf_string</span> <span class="o">||</span> <span class="n">cf_check</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>The function is quite simple: it performs a <em>simple</em> string concatenation of the results of other functions, each one responsible for a single piece of the code.</p>
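<p>Since the function returns a <code class="language-plaintext highlighter-rouge">char(16)</code> built from fixed-width pieces, a client application could run a cheap sanity check on the shape of the value it gets back. What follows is only a sketch in Java, under assumptions that are not part of the functions shown here: the class and method names are hypothetical, the pattern follows the standard layout (six letters, two digits for the year, one letter for the month, two digits for the day and gender, one letter plus three digits for the birth place, one check letter), and the sample value is a made-up illustrative one. The check character is not recomputed, and the special substitutions applied to homonyms are not handled.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.util.regex.Pattern;

class CfShape {
    // Assumed layout: 6 letters (surname + name), 2 digits (year),
    // 1 letter (month), 2 digits (day and gender),
    // 1 letter + 3 digits (birth place), 1 check letter.
    static final Pattern SHAPE =
        Pattern.compile("[A-Z]{6}[0-9]{2}[A-Z][0-9]{2}[A-Z][0-9]{3}[A-Z]");

    // Hypothetical helper: true when the candidate has the expected shape.
    // It does not verify the check character.
    static boolean looksLikeCf(String candidate) {
        if (candidate == null) {
            return false;
        }
        return SHAPE.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // a well-formed illustrative value
        System.out.println(looksLikeCf("RSSMRA80A01H501U"));
        // one character short
        System.out.println(looksLikeCf("RSSMRA80A01H501"));
    }
}
</code></pre></div></div>
<p>Such a check only catches truncated or mangled values; the real validation remains the checksum computed within the database.</p>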
<h2 id="extracting-letters-for-name-and-surname">Extracting letters for name and surname</h2>
<p>The <code class="language-plaintext highlighter-rouge">cf_letters()</code> function performs the extraction of the letters for either a name or surname. General rules are:</p>
<ul>
<li>the result will be three letters wide;</li>
<li>if possible, only consonants will be used taking them from left to right. If there are not enough consonants, vowels must be appended in the same order they appear;</li>
<li>in the case of a name, if there are enough consonants, the first, third and fourth must be chosen.</li>
</ul>
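<p>The three rules above can be cross-checked outside the database with a few lines of Java. This is a minimal sketch and not the implementation used in this post: the class and method names are hypothetical, characters that are not plain ASCII letters are simply skipped, and very short inputs are padded with <code class="language-plaintext highlighter-rouge">X</code>, a detail that comes from the official rules and is not listed above.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.util.Locale;

class CfLetters {
    static final String VOWELS = "AEIOU";
    static final String CONSONANTS = "BCDFGHJKLMNPQRSTVWXYZ";

    // Hypothetical helper: the three letters for a surname,
    // or for a name when isName is true.
    static String letters(String subject, boolean isName) {
        String upper = subject.trim().toUpperCase(Locale.ROOT);
        StringBuilder consonants = new StringBuilder();
        StringBuilder vowels = new StringBuilder();

        // split the input into consonants and vowels, keeping their order;
        // anything else (spaces, apostrophes, accented letters) is skipped,
        // which is an assumption of this sketch
        for (char c : upper.toCharArray()) {
            if (VOWELS.indexOf(c) != -1) {
                vowels.append(c);
            } else if (CONSONANTS.indexOf(c) != -1) {
                consonants.append(c);
            }
        }

        // name with at least four consonants: first, third and fourth
        if (isName) {
            if (consonants.length() > 3) {
                return "" + consonants.charAt(0)
                          + consonants.charAt(2)
                          + consonants.charAt(3);
            }
        }

        // otherwise consonants first, then vowels, then X as padding (assumed)
        String pool = consonants.toString() + vowels.toString() + "XXX";
        return pool.substring(0, 3);
    }

    public static void main(String[] args) {
        System.out.println(letters("Rossi", false));
        System.out.println(letters("Gianfranco", true));
        System.out.println(letters("Luca", true));
    }
}
</code></pre></div></div>
<p>With this sketch the surname <em>Rossi</em> yields <code class="language-plaintext highlighter-rouge">RSS</code>, the name <em>Gianfranco</em> yields <code class="language-plaintext highlighter-rouge">GFR</code> (first, third and fourth consonant) and the name <em>Luca</em> yields <code class="language-plaintext highlighter-rouge">LCU</code>, since it has only two consonants and a vowel has to be appended.</p>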
<p>These rules translate into the following <code class="language-plaintext highlighter-rouge">plpgsql</code> function:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_letters</span><span class="p">(</span> <span class="n">subject</span> <span class="nb">text</span><span class="p">,</span> <span class="n">is_name</span> <span class="nb">bool</span> <span class="k">DEFAULT</span> <span class="k">false</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">char</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">BODY</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">missing_chars</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">vowels</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">consonants</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">final_chars</span> <span class="nb">char</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
<span class="k">BEGIN</span>
<span class="c1">-- work always uppercase, avoid case-sensitiveness</span>
<span class="n">subject</span> <span class="p">:</span><span class="o">=</span> <span class="k">upper</span><span class="p">(</span> <span class="k">trim</span><span class="p">(</span> <span class="n">subject</span> <span class="p">)</span> <span class="p">);</span>
<span class="c1">-- get all the consonants</span>
<span class="n">consonants</span> <span class="p">:</span><span class="o">=</span> <span class="k">translate</span><span class="p">(</span> <span class="n">subject</span><span class="p">,</span> <span class="s1">'AEIOU'</span><span class="p">,</span> <span class="s1">''</span> <span class="p">);</span>
<span class="c1">-- extract all the vowels (negate consonants!)</span>
<span class="n">vowels</span> <span class="p">:</span><span class="o">=</span> <span class="k">translate</span><span class="p">(</span> <span class="n">subject</span><span class="p">,</span> <span class="n">consonants</span><span class="p">,</span> <span class="s1">''</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'cf_letters: [%] -> [%] + [%]'</span><span class="p">,</span>
<span class="n">subject</span><span class="p">,</span>
<span class="n">consonants</span><span class="p">,</span>
<span class="n">vowels</span><span class="p">;</span>
<span class="n">IF</span> <span class="n">is_name</span> <span class="k">THEN</span>
<span class="n">IF</span> <span class="k">length</span><span class="p">(</span> <span class="n">consonants</span> <span class="p">)</span> <span class="o">>=</span> <span class="mi">4</span> <span class="k">THEN</span>
<span class="n">consonants</span> <span class="p">:</span><span class="o">=</span> <span class="k">substring</span><span class="p">(</span> <span class="n">consonants</span> <span class="k">FROM</span> <span class="mi">1</span> <span class="k">FOR</span> <span class="mi">1</span> <span class="p">)</span>
<span class="o">||</span> <span class="k">substring</span><span class="p">(</span> <span class="n">consonants</span> <span class="k">FROM</span> <span class="mi">3</span> <span class="k">FOR</span> <span class="mi">1</span> <span class="p">)</span>
<span class="o">||</span> <span class="k">substring</span><span class="p">(</span> <span class="n">consonants</span> <span class="k">FROM</span> <span class="mi">4</span> <span class="k">FOR</span> <span class="mi">1</span> <span class="p">);</span>
<span class="n">ELSIF</span> <span class="k">length</span><span class="p">(</span> <span class="n">consonants</span> <span class="p">)</span> <span class="o">=</span> <span class="mi">3</span> <span class="k">THEN</span>
<span class="n">consonants</span> <span class="p">:</span><span class="o">=</span> <span class="k">substring</span><span class="p">(</span> <span class="n">consonants</span> <span class="k">FROM</span> <span class="mi">1</span> <span class="k">FOR</span> <span class="mi">3</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">IF</span> <span class="k">length</span><span class="p">(</span> <span class="n">consonants</span> <span class="p">)</span> <span class="o">>=</span> <span class="mi">3</span> <span class="k">THEN</span>
<span class="n">final_chars</span> <span class="p">:</span><span class="o">=</span> <span class="k">substring</span><span class="p">(</span> <span class="n">consonants</span> <span class="k">FROM</span> <span class="mi">1</span> <span class="k">FOR</span> <span class="mi">3</span> <span class="p">);</span>
<span class="k">ELSE</span>
<span class="n">missing_chars</span> <span class="p">:</span><span class="o">=</span> <span class="mi">3</span> <span class="o">-</span> <span class="k">length</span><span class="p">(</span> <span class="n">consonants</span> <span class="p">);</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Pushing % vowel(s)'</span><span class="p">,</span> <span class="n">missing_chars</span><span class="p">;</span>
<span class="n">final_chars</span> <span class="p">:</span><span class="o">=</span> <span class="n">consonants</span> <span class="o">||</span> <span class="k">substring</span><span class="p">(</span> <span class="n">vowels</span> <span class="k">FROM</span> <span class="mi">1</span> <span class="k">FOR</span> <span class="n">missing_chars</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">final_chars</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">BODY</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>The function builds two strings: one made up of all the consonants and one made up of only the vowels appearing in the input string. Then, if the subject is a name with at least four consonants, the first, third and fourth consonants are kept. In every case, if there are fewer than three consonants, vowels are appended, after first computing how many letters are <em>missing</em> from the final length of <code class="language-plaintext highlighter-rouge">3</code>.</p>
<p>The same algorithm turns out a little shorter in Perl, mainly thanks to regular expressions and array slicing:</p>
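<p>The double <code class="language-plaintext highlighter-rouge">translate()</code> trick can also be tried on its own, outside the function. What follows is only a minimal sketch: the sample word is an illustrative assumption, and the query does not cover the name-specific rule.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- remove the vowels to get the consonants,
-- then remove the consonants to get the vowels
WITH s AS ( SELECT upper( trim( ' Ferrari ' ) ) AS subject )
SELECT translate( subject, 'AEIOU', '' ) AS consonants,
       translate( subject, translate( subject, 'AEIOU', '' ), '' ) AS vowels
FROM s;
-- consonants: FRRR, vowels: EAI
</code></pre></div></div>
<p>Since there are at least three consonants, a surname like this one would need no vowels at all.</p>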
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">cf</span><span class="o">.</span><span class="nv">cf_letters</span><span class="p">(</span> <span class="nv">text</span><span class="p">,</span> <span class="nv">bool</span> <span class="nv">DEFAULT</span> <span class="nv">false</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">char</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="nv">AS</span> <span class="nv">$BODY</span><span class="err">$</span>
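<span class="c1"># without a bool transform, plperl passes booleans as the text 't' or 'f',</span>
<span class="c1"># and both are true in Perl: compare explicitly</span>
<span class="k">my</span> <span class="p">(</span><span class="nv">$subject</span><span class="p">,</span> <span class="nv">$is_name</span><span class="p">)</span> <span class="o">=</span> <span class="p">(</span> <span class="nb">lc</span><span class="p">(</span> <span class="vg">$_</span><span class="p">[</span> <span class="mi">0</span> <span class="p">]</span> <span class="p">),</span> <span class="vg">$_</span><span class="p">[</span> <span class="mi">1</span> <span class="p">]</span> <span class="ow">eq</span> <span class="p">'</span><span class="s1">t</span><span class="p">'</span> <span class="p">);</span>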
<span class="c1"># split the word into letters</span>
<span class="k">my</span> <span class="nv">@letters</span> <span class="o">=</span> <span class="nb">split</span> <span class="sr">//</span><span class="p">,</span> <span class="nv">$subject</span><span class="p">;</span>
<span class="c1"># grep out vowels and consonants</span>
<span class="k">my</span> <span class="nv">@consonants</span> <span class="o">=</span> <span class="nb">grep</span> <span class="p">{</span> <span class="vg">$_</span> <span class="o">!~</span> <span class="sr">/[aeiou]/</span> <span class="p">}</span> <span class="nv">@letters</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">@vowels</span> <span class="o">=</span> <span class="nb">grep</span> <span class="p">{</span> <span class="vg">$_</span> <span class="o">=~</span> <span class="sr">/[aeiou]/</span> <span class="p">}</span> <span class="nv">@letters</span><span class="p">;</span>
<span class="k">return</span> <span class="nb">join</span><span class="p">(</span> <span class="p">'',</span> <span class="nv">$consonants</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="nv">$consonants</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="nv">$consonants</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="p">)</span> <span class="k">if</span> <span class="p">(</span> <span class="nv">$is_name</span> <span class="o">&&</span> <span class="nv">@consonants</span> <span class="o">>=</span> <span class="mi">4</span> <span class="p">);</span>
<span class="k">return</span> <span class="nb">join</span><span class="p">(</span> <span class="p">'',</span> <span class="nv">@consonants</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">2</span><span class="p">]</span> <span class="p">)</span> <span class="k">if</span> <span class="p">(</span> <span class="nv">@consonants</span> <span class="o">>=</span> <span class="mi">3</span> <span class="p">);</span>
<span class="k">return</span> <span class="nb">join</span><span class="p">(</span> <span class="p">'',</span> <span class="nv">@consonants</span><span class="p">,</span> <span class="nv">@vowels</span><span class="p">[</span><span class="mi">0</span> <span class="o">..</span> <span class="mi">3</span> <span class="o">-</span> <span class="nv">@consonants</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="p">);</span>
<span class="nv">$BODY</span><span class="err">$</span>
<span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<h2 id="generating-the-birth-date-and-gender-part">Generating the birth date and gender part</h2>
<p>The rules are quite simple:</p>
<ul>
<li>the year comes first expressed as two digits;</li>
<li>a capital letter specifies the month;</li>
<li>the day of birth comes last, expressed as two digits and increased by <code class="language-plaintext highlighter-rouge">40</code> for females.</li>
</ul>
<p>Therefore a birth date like <code class="language-plaintext highlighter-rouge">1978-07-19</code> becomes <code class="language-plaintext highlighter-rouge">78L19</code> (where <code class="language-plaintext highlighter-rouge">L</code> is the letter representing <code class="language-plaintext highlighter-rouge">July</code>).
Computing the date string requires two input parameters: the date of birth and the gender.
The function that implements the computation is the following:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_date</span><span class="p">(</span> <span class="n">birth_date</span> <span class="nb">date</span><span class="p">,</span>
<span class="n">male</span> <span class="nb">boolean</span> <span class="k">DEFAULT</span> <span class="k">true</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">char</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">y</span> <span class="nb">int</span><span class="p">;</span> <span class="c1">-- year</span>
<span class="n">m</span> <span class="nb">int</span><span class="p">;</span> <span class="c1">-- month (1-12)</span>
<span class="n">d</span> <span class="nb">int</span><span class="p">;</span> <span class="c1">-- day</span>
<span class="n">month_decode</span> <span class="nb">char</span><span class="p">[]</span> <span class="p">:</span><span class="o">=</span> <span class="n">ARRAY</span><span class="p">[</span> <span class="s1">'a'</span> <span class="c1">-- january</span>
<span class="p">,</span> <span class="s1">'b'</span> <span class="c1">-- february</span>
<span class="p">,</span> <span class="s1">'c'</span> <span class="c1">-- march</span>
<span class="p">,</span> <span class="s1">'d'</span> <span class="c1">-- april</span>
<span class="p">,</span> <span class="s1">'e'</span> <span class="c1">-- may</span>
<span class="p">,</span> <span class="s1">'h'</span> <span class="c1">-- june</span>
<span class="p">,</span> <span class="s1">'l'</span> <span class="c1">-- july</span>
<span class="p">,</span> <span class="s1">'m'</span> <span class="c1">-- august</span>
<span class="p">,</span> <span class="s1">'p'</span> <span class="c1">-- september</span>
<span class="p">,</span> <span class="s1">'r'</span> <span class="c1">-- october</span>
<span class="p">,</span> <span class="s1">'s'</span> <span class="c1">-- november</span>
<span class="p">,</span> <span class="s1">'t'</span> <span class="c1">-- december</span>
<span class="p">]::</span><span class="nb">char</span><span class="p">[];</span>
<span class="k">BEGIN</span>
<span class="c1">-- get the year, last two digits</span>
<span class="n">y</span> <span class="p">:</span><span class="o">=</span> <span class="n">to_char</span><span class="p">(</span> <span class="n">birth_date</span><span class="p">,</span> <span class="s1">'yy'</span> <span class="p">);</span>
<span class="c1">-- get the month index</span>
<span class="n">m</span> <span class="p">:</span><span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="k">month</span> <span class="k">FROM</span> <span class="n">birth_date</span> <span class="p">);</span>
<span class="c1">-- get the day</span>
<span class="n">d</span> <span class="p">:</span><span class="o">=</span> <span class="k">EXTRACT</span><span class="p">(</span> <span class="k">day</span> <span class="k">FROM</span> <span class="n">birth_date</span> <span class="p">);</span>
<span class="c1">-- if this is for a female, add</span>
<span class="c1">-- a number to the day</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="n">male</span> <span class="k">THEN</span>
<span class="n">d</span> <span class="p">:</span><span class="o">=</span> <span class="n">d</span> <span class="o">+</span> <span class="mi">40</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'cf_date: % -> % % [%] % '</span><span class="p">,</span>
<span class="n">birth_date</span><span class="p">,</span>
<span class="n">y</span><span class="p">,</span>
<span class="n">m</span><span class="p">,</span>
<span class="n">month_decode</span><span class="p">[</span><span class="n">m</span><span class="p">],</span>
<span class="n">d</span><span class="p">;</span>
<span class="c1">-- compose and return the string</span>
<span class="k">RETURN</span> <span class="n">lpad</span><span class="p">(</span> <span class="n">y</span><span class="p">::</span><span class="nb">text</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="s1">'0'</span> <span class="p">)</span>
<span class="o">||</span> <span class="k">upper</span><span class="p">(</span> <span class="n">month_decode</span><span class="p">[</span><span class="n">m</span><span class="p">]</span> <span class="p">)</span>
<span class="o">||</span> <span class="n">lpad</span><span class="p">(</span> <span class="n">d</span><span class="p">::</span><span class="nb">text</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="s1">'0'</span> <span class="p">);</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>After extracting the individual parts of the <code class="language-plaintext highlighter-rouge">birth_date</code>, the function composes them into the final string, using <code class="language-plaintext highlighter-rouge">lpad()</code> to prepend a <code class="language-plaintext highlighter-rouge">0</code> whenever a number has fewer than two digits.</p>
<p>The tricky part of the function is the decoding of the month, from an integer value to a letter. In order to provide a <em>poor man's map</em>, I placed all the month letters in order into an array, so that given a numeric month value (e.g., <code class="language-plaintext highlighter-rouge">7</code>) I can extract the corresponding letter (<code class="language-plaintext highlighter-rouge">l</code> for <em>July</em>).</p>
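<p>The same composition can be sketched as a single plain SQL query, without the function. The sample dates and the column aliases are illustrative assumptions; the array literal repeats the month letters used above.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT d, male,
       to_char( d, 'YY' )
       -- array lookup: month number to month letter
       || upper( ( ARRAY[ 'a','b','c','d','e','h','l','m','p','r','s','t' ] )[ EXTRACT( month FROM d )::int ] )
       -- add 40 to the day for females, then left-pad to two digits
       || lpad( ( EXTRACT( day FROM d )::int + CASE WHEN male THEN 0 ELSE 40 END )::text, 2, '0' ) AS cf_date
FROM ( VALUES ( DATE '1978-07-19', true  ),
              ( DATE '1978-07-19', false ),
              ( DATE '2005-01-03', true  ) ) AS t( d, male );
-- 1978-07-19, male:   78L19
-- 1978-07-19, female: 78L59
-- 2005-01-03, male:   05A03
</code></pre></div></div>
<p>The last row shows why the padding matters: both the year and the day are smaller than ten.</p>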
<h2 id="generating-the-birth-place-code">Generating the birth place code</h2>
<p>This is quite a boring task, since there is no computation involved. A <em>simple lookup</em> is required, so I implemented it as a <em>simple name lookup</em> through a table.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">cf</span><span class="p">.</span><span class="n">places</span><span class="p">(</span>
<span class="n">code</span> <span class="nb">char</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">,</span>
<span class="n">description</span> <span class="nb">text</span> <span class="k">NOT</span> <span class="k">NULL</span><span class="p">,</span>
<span class="k">UNIQUE</span><span class="p">(</span> <span class="n">description</span> <span class="p">),</span>
<span class="n">EXCLUDE</span><span class="p">(</span> <span class="k">lower</span><span class="p">(</span> <span class="k">trim</span><span class="p">(</span> <span class="n">description</span> <span class="p">)</span> <span class="p">)</span> <span class="k">WITH</span> <span class="o">=</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_place</span><span class="p">(</span> <span class="n">birth_place</span> <span class="nb">text</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">char</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">birth_code</span> <span class="nb">char</span><span class="p">(</span><span class="mi">4</span><span class="p">);</span>
<span class="k">BEGIN</span>
<span class="k">SELECT</span> <span class="n">code</span>
<span class="k">INTO</span> <span class="n">birth_code</span> <span class="c1">-- no strict! allow NOT FOUND to work!</span>
<span class="k">FROM</span> <span class="n">cf</span><span class="p">.</span><span class="n">places</span>
<span class="k">WHERE</span> <span class="k">lower</span><span class="p">(</span> <span class="n">description</span> <span class="p">)</span> <span class="o">=</span> <span class="k">lower</span><span class="p">(</span> <span class="n">birth_place</span> <span class="p">);</span>
<span class="n">IF</span> <span class="k">NOT</span> <span class="k">FOUND</span> <span class="k">THEN</span>
<span class="n">RAISE</span> <span class="n">WARNING</span> <span class="s1">'% not in cf.places!'</span><span class="p">,</span> <span class="n">birth_place</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="s1">'XXXX'</span><span class="p">;</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">RETURN</span> <span class="n">birth_code</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">CODE</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
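<p>A quick way to try the lookup is to load a single row and call the function. This is only a sketch: it assumes the table and the function above already exist, that <code class="language-plaintext highlighter-rouge">H501</code> is the code for Roma, and that the second place name is not in the table.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- sample row, skipped if it is already there
INSERT INTO cf.places( code, description )
VALUES ( 'H501', 'Roma' )
ON CONFLICT DO NOTHING;

SELECT cf.cf_place( 'ROMA' );      -- H501, the match is case-insensitive
SELECT cf.cf_place( 'Atlantis' );  -- XXXX, plus a WARNING
</code></pre></div></div>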
<p>There is a <code class="language-plaintext highlighter-rouge">plperl</code> version of this function too, although it is not as much shorter as one might expect:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">cf</span><span class="o">.</span><span class="nv">cf_place</span><span class="p">(</span> <span class="nv">text</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">char</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
<span class="nv">AS</span> <span class="nv">$CODE</span><span class="err">$</span>
<span class="k">my</span> <span class="p">(</span><span class="nv">$birth_place</span><span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$query</span> <span class="o">=</span> <span class="p">"</span><span class="s2">SELECT code FROM cf.places WHERE lower( '</span><span class="si">$birth_place</span><span class="s2">' ) = lower( description ) </span><span class="p">";</span>
<span class="k">my</span> <span class="nv">$result_set</span> <span class="o">=</span> <span class="nv">spi_exec_query</span><span class="p">(</span> <span class="nv">$query</span> <span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
<span class="c1"># rows is an array reference, true even when empty: check the row count instead</span>
<span class="k">return</span> <span class="nv">$result_set</span><span class="o">-></span><span class="p">{</span><span class="nv">rows</span><span class="p">}[</span><span class="mi">0</span><span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="nv">code</span> <span class="p">}</span> <span class="k">if</span> <span class="p">(</span> <span class="nv">$result_set</span><span class="o">-></span><span class="p">{</span> <span class="nv">processed</span> <span class="p">}</span> <span class="p">);</span>
<span class="k">return</span> <span class="p">'</span><span class="s1">XXXX</span><span class="p">';</span>
<span class="nv">$CODE</span><span class="err">$</span>
<span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
<h2 id="generating-the-checksum-character">Generating the checksum character</h2>
<p>This is even more boring than generating the place code. Each character is assigned a value that depends on both its position within the string (even or odd) and the character itself. All the values are summed, the sum is taken <code class="language-plaintext highlighter-rouge">modulo 26</code>, and the result is looked up with the help of another table.</p>
<p>So, the whole thing has been implemented with a lookup table and a fat loop over all the characters of the string:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">cf</span><span class="p">.</span><span class="n">check_chars</span><span class="p">(</span>
<span class="k">c</span> <span class="nb">char</span> <span class="k">PRIMARY</span> <span class="k">KEY</span><span class="p">,</span>
<span class="n">odd_value</span> <span class="nb">int</span> <span class="k">NOT</span> <span class="k">NULL</span><span class="p">,</span>
<span class="n">even_value</span> <span class="nb">int</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">cf</span><span class="p">.</span><span class="n">cf_check</span><span class="p">(</span> <span class="n">subject</span> <span class="nb">char</span><span class="p">(</span><span class="mi">15</span><span class="p">)</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="nb">char</span>
<span class="k">AS</span> <span class="err">$</span><span class="n">BODY</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">check_char</span> <span class="nb">char</span><span class="p">;</span>
<span class="n">odd_sum</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">even_sum</span> <span class="nb">int</span> <span class="p">:</span><span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">i</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">current_value</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">final_value</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">odd_in</span> <span class="nb">char</span><span class="p">[]</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'{}'</span><span class="p">;</span>
<span class="n">even_in</span> <span class="nb">char</span><span class="p">[]</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'{}'</span><span class="p">;</span>
<span class="n">current_letter</span> <span class="nb">char</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="k">FOR</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="p">..</span><span class="k">length</span><span class="p">(</span> <span class="n">subject</span> <span class="p">)</span> <span class="n">LOOP</span>
<span class="n">IF</span> <span class="n">i</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">THEN</span>
<span class="n">even_in</span> <span class="p">:</span><span class="o">=</span> <span class="n">array_append</span><span class="p">(</span> <span class="n">even_in</span><span class="p">,</span>
<span class="k">upper</span><span class="p">(</span> <span class="k">substring</span><span class="p">(</span> <span class="n">subject</span> <span class="k">FROM</span> <span class="n">i</span> <span class="k">FOR</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">)::</span><span class="nb">char</span> <span class="p">);</span>
<span class="k">ELSE</span>
<span class="n">odd_in</span> <span class="p">:</span><span class="o">=</span> <span class="n">array_append</span><span class="p">(</span> <span class="n">odd_in</span><span class="p">,</span>
<span class="k">upper</span><span class="p">(</span> <span class="k">substring</span><span class="p">(</span> <span class="n">subject</span> <span class="k">FROM</span> <span class="n">i</span> <span class="k">FOR</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">)::</span><span class="nb">char</span> <span class="p">);</span>
<span class="k">END</span> <span class="n">IF</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span><span class="p">;</span>
<span class="k">SELECT</span> <span class="k">sum</span><span class="p">(</span><span class="n">even_value</span><span class="p">)</span>
<span class="k">INTO</span> <span class="n">even_sum</span>
<span class="k">FROM</span> <span class="n">cf</span><span class="p">.</span><span class="n">check_chars</span>
<span class="k">JOIN</span> <span class="k">unnest</span><span class="p">(</span> <span class="n">even_in</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">letter</span>
<span class="k">ON</span> <span class="k">c</span> <span class="o">=</span> <span class="n">letter</span><span class="p">;</span>
<span class="k">SELECT</span> <span class="k">sum</span><span class="p">(</span><span class="n">odd_value</span><span class="p">)</span>
<span class="k">INTO</span> <span class="n">odd_sum</span>
<span class="k">FROM</span> <span class="n">cf</span><span class="p">.</span><span class="n">check_chars</span>
<span class="k">JOIN</span> <span class="k">unnest</span><span class="p">(</span> <span class="n">odd_in</span> <span class="p">)</span> <span class="k">AS</span> <span class="n">letter</span>
<span class="k">ON</span> <span class="k">c</span> <span class="o">=</span> <span class="n">letter</span><span class="p">;</span>
<span class="n">final_value</span> <span class="p">:</span><span class="o">=</span> <span class="p">(</span> <span class="n">odd_sum</span> <span class="o">+</span> <span class="n">even_sum</span> <span class="p">)</span> <span class="o">%</span> <span class="mi">26</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'cf_check: % + % %% 26 = %'</span><span class="p">,</span> <span class="n">odd_sum</span><span class="p">,</span> <span class="n">even_sum</span><span class="p">,</span> <span class="n">final_value</span><span class="p">;</span>
<span class="c1">-- this is a trick: the remaining part</span>
<span class="c1">-- indicates the positional order of the letter</span>
<span class="c1">-- within the alphabet, which is</span>
<span class="c1">-- the values into the table excluding digits</span>
<span class="k">SELECT</span> <span class="k">c</span>
<span class="k">INTO</span> <span class="k">STRICT</span> <span class="n">check_char</span>
<span class="k">FROM</span> <span class="n">cf</span><span class="p">.</span><span class="n">check_chars</span>
<span class="k">WHERE</span> <span class="n">even_value</span> <span class="o">=</span> <span class="n">final_value</span>
<span class="k">AND</span> <span class="k">c</span> <span class="k">NOT</span> <span class="k">IN</span> <span class="p">(</span> <span class="s1">'0'</span><span class="p">,</span> <span class="s1">'1'</span><span class="p">,</span> <span class="s1">'2'</span><span class="p">,</span> <span class="s1">'3'</span><span class="p">,</span> <span class="s1">'4'</span><span class="p">,</span> <span class="s1">'5'</span><span class="p">,</span> <span class="s1">'6'</span><span class="p">,</span> <span class="s1">'7'</span><span class="p">,</span> <span class="s1">'8'</span><span class="p">,</span> <span class="s1">'9'</span> <span class="p">);</span>
<span class="k">RETURN</span> <span class="n">check_char</span><span class="p">;</span>
<span class="k">END</span>
<span class="err">$</span><span class="n">BODY</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>My very first implementation queried the table once per character. To avoid that, I placed the letters into two arrays, one for the odd positions and one for the even ones, and asked SQL to compute the <em>sum</em> for me.</p>
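<p>The split between odd and even positions can also be done without a loop. The following is a sketch of an alternative, not what the function above does: it relies on <code class="language-plaintext highlighter-rouge">string_to_array()</code> with a <code class="language-plaintext highlighter-rouge">NULL</code> delimiter and on <code class="language-plaintext highlighter-rouge">WITH ORDINALITY</code>, and the sample string and the aliases are illustrative assumptions.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- one row per character, numbered from 1,
-- then aggregated by position parity
SELECT string_agg( letter, '' ORDER BY pos ) FILTER ( WHERE pos % 2 = 1 ) AS odd_in,
       string_agg( letter, '' ORDER BY pos ) FILTER ( WHERE pos % 2 = 0 ) AS even_in
FROM unnest( string_to_array( upper( 'abc123' ), NULL ) )
     WITH ORDINALITY AS t( letter, pos );
-- odd_in: AC2, even_in: B13
</code></pre></div></div>
<p>The numbered rows could then be joined directly against <code class="language-plaintext highlighter-rouge">cf.check_chars</code>, picking the odd or the even value depending on <code class="language-plaintext highlighter-rouge">pos</code>.</p>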
<p>The <code class="language-plaintext highlighter-rouge">plperl</code> version is shorter thanks to postfix conditionals (Perl's statement modifiers), even though it queries the letters one by one.
The use of the <em>ellipsis</em> and <code class="language-plaintext highlighter-rouge">join</code> also simplifies the query construction:</p>
<div class="language-perl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">CREATE</span> <span class="nv">OR</span> <span class="nv">REPLACE</span> <span class="nv">FUNCTION</span> <span class="nv">cf</span><span class="o">.</span><span class="nv">cf_check</span><span class="p">(</span> <span class="nv">subject</span> <span class="nv">char</span><span class="p">(</span><span class="mi">15</span><span class="p">)</span> <span class="p">)</span>
<span class="nv">RETURNS</span> <span class="nv">char</span>
<span class="nv">AS</span> <span class="nv">$BODY</span><span class="err">$</span>
<span class="k">my</span> <span class="p">(</span><span class="nv">$subject</span><span class="p">)</span> <span class="o">=</span> <span class="nv">@_</span><span class="p">;</span>
<span class="k">my</span> <span class="p">(</span><span class="nv">$odd_sum</span><span class="p">,</span> <span class="nv">$even_sum</span><span class="p">)</span> <span class="o">=</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">);</span>
<span class="k">my</span> <span class="nv">$index</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$query</span><span class="p">;</span>
<span class="k">my</span> <span class="nv">$final_value</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span> <span class="nb">split</span> <span class="sr">//</span><span class="p">,</span> <span class="nv">$subject</span> <span class="p">)</span> <span class="p">{</span>
<span class="nv">$index</span><span class="o">++</span><span class="p">;</span>
<span class="nv">$query</span> <span class="o">=</span> <span class="nb">sprintf</span><span class="p">(</span> <span class="p">"</span><span class="s2">SELECT odd_value, even_value FROM cf.check_chars WHERE c = '%s'</span><span class="p">",</span> <span class="nb">uc</span><span class="p">(</span> <span class="vg">$_</span> <span class="p">)</span> <span class="p">);</span>
<span class="nv">$odd_sum</span> <span class="o">+=</span> <span class="nv">spi_exec_query</span><span class="p">(</span> <span class="nv">$query</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="nv">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="nv">odd_value</span> <span class="p">}</span> <span class="k">if</span> <span class="p">(</span> <span class="nv">$index</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">!=</span> <span class="mi">0</span> <span class="p">);</span>
<span class="nv">$even_sum</span> <span class="o">+=</span> <span class="nv">spi_exec_query</span><span class="p">(</span> <span class="nv">$query</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="nv">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="nv">even_value</span> <span class="p">}</span> <span class="k">if</span> <span class="p">(</span> <span class="nv">$index</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span> <span class="p">);</span>
<span class="p">}</span>
<span class="nv">$final_value</span> <span class="o">=</span> <span class="p">(</span> <span class="nv">$odd_sum</span> <span class="o">+</span> <span class="nv">$even_sum</span> <span class="p">)</span> <span class="o">%</span> <span class="mi">26</span><span class="p">;</span>
<span class="nv">elog</span><span class="p">(</span> <span class="nv">DEBUG</span><span class="p">,</span> <span class="p">"</span><span class="s2">cf_check: </span><span class="si">$subject</span><span class="s2"> -> </span><span class="si">$odd_sum</span><span class="s2"> + </span><span class="si">$even_sum</span><span class="s2"> % 26 = </span><span class="si">$final_value</span><span class="p">"</span> <span class="p">);</span>
<span class="nv">$query</span> <span class="o">=</span> <span class="nb">sprintf</span><span class="p">(</span> <span class="p">"</span><span class="s2">SELECT c FROM cf.check_chars WHERE even_value = %d AND c NOT IN ( %s ) </span><span class="p">",</span>
<span class="nv">$final_value</span><span class="p">,</span>
<span class="nb">join</span><span class="p">(</span> <span class="p">'</span><span class="s1">,</span><span class="p">',</span> <span class="p">(</span> <span class="nb">map</span> <span class="p">{</span> <span class="nb">sprintf</span><span class="p">(</span> <span class="p">"</span><span class="s2">'%1s'</span><span class="p">",</span> <span class="vg">$_</span> <span class="p">)</span> <span class="p">}</span> <span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">9</span><span class="p">)</span> <span class="p">)</span> <span class="p">)</span> <span class="p">);</span>
<span class="k">return</span> <span class="nv">spi_exec_query</span><span class="p">(</span> <span class="nv">$query</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span><span class="o">-></span><span class="p">{</span> <span class="nv">rows</span> <span class="p">}[</span> <span class="mi">0</span> <span class="p">]</span><span class="o">-></span><span class="p">{</span> <span class="nv">c</span> <span class="p">};</span>
<span class="nv">$BODY</span><span class="err">$</span>
<span class="nv">LANGUAGE</span> <span class="nv">plperl</span><span class="p">;</span>
</code></pre></div></div>
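<p>A possible invocation, as a sketch: the 15-character input below is a made-up sample value, and lowering <code class="language-plaintext highlighter-rouge">client_min_messages</code> is only needed to see the <code class="language-plaintext highlighter-rouge">elog</code> debug output in the session:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- show DEBUG messages in the current session only
SET client_min_messages TO debug;
-- hypothetical sample input, 15 characters long
SELECT cf.cf_check( 'RSSMRA80A01H501' );
-- back to the configured default
RESET client_min_messages;
</code></pre></div></div>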
What happens when you add a column?2018-04-17T00:00:00+00:00https://fluca1978.github.io/2018/04/17/PostgreSQLALterTable<p>Adding a column with a default value requires a full table rewrite, and therefore it is often suggested to avoid the default value. However, it is possible to add the default value without having PostgreSQL perform the full table rewrite.</p>
<h1 id="adding-a-column-via-alter-table-add-column">Adding a column via <code class="language-plaintext highlighter-rouge">ALTER TABLE ADD COLUMN</code></h1>
<p>To demonstrate what PostgreSQL does when a new column is added, consider the following simple table to begin with:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">></span> <span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">foo</span><span class="p">(</span> <span class="n">i</span> <span class="nb">int</span> <span class="p">);</span>
<span class="o">></span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">foo</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="k">SELECT</span> <span class="n">v</span>
<span class="k">FROM</span> <span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">100000</span><span class="p">)</span> <span class="n">v</span><span class="p">;</span>
</code></pre></div></div>
<p>Suppose a column <em>with a default value</em> must be added to the table. When an <code class="language-plaintext highlighter-rouge">ALTER TABLE ADD COLUMN</code> is issued with a default value, PostgreSQL performs a <strong>full update</strong> of the whole table, that is, all the 100k tuples are updated immediately:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="k">c</span> <span class="nb">char</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="k">DEFAULT</span> <span class="s1">'A'</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">180</span><span class="p">.</span><span class="mi">997</span> <span class="n">ms</span>
<span class="o">></span> <span class="k">SELECT</span> <span class="k">distinct</span><span class="p">(</span> <span class="k">c</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">foo</span><span class="p">;</span>
<span class="k">c</span>
<span class="c1">---</span>
<span class="n">A</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>As readers can see, the <code class="language-plaintext highlighter-rouge">c</code> column has been added and all the tuples have been updated to the default value.</p>
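<p>One way to observe the rewrite is to compare the file node of the table before and after the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code>. The following is a minimal sketch, assuming the PostgreSQL 10 behaviour described here; the column name <code class="language-plaintext highlighter-rouge">c2</code> is made up for illustration. On PostgreSQL 11 and later a constant default no longer forces a rewrite, so there the two values would be equal:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- file node before the change
SELECT pg_relation_filenode( 'foo' );
-- hypothetical extra column, added with a default value
ALTER TABLE foo ADD COLUMN c2 char(1) DEFAULT 'Z';
-- a different file node means the table has been rewritten on disk
SELECT pg_relation_filenode( 'foo' );
</code></pre></div></div>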
<p>When the number of tuples is really high, such an <code class="language-plaintext highlighter-rouge">ADD COLUMN</code> results in a huge amount of database activity. Therefore, it is often suggested to perform the <code class="language-plaintext highlighter-rouge">ADD COLUMN</code> without a default value, so that the column is added very quickly, and then to update the column value afterwards.</p>
<p>This, of course, has a different meaning: the default value is not attached to the table, and therefore it is possible to insert <code class="language-plaintext highlighter-rouge">NULL</code> values into the column.</p>
<p>There is a little trick to avoid the above problem when a default value is required: issue two different <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> commands to (1) add the column and (2) set the default value:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">v</span> <span class="nb">char</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">564</span> <span class="n">ms</span>
<span class="o">></span> <span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">foo</span> <span class="k">ALTER</span> <span class="k">COLUMN</span> <span class="n">v</span> <span class="k">SET</span> <span class="k">DEFAULT</span> <span class="s1">'B'</span><span class="p">;</span>
<span class="k">ALTER</span> <span class="k">TABLE</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">724</span> <span class="n">ms</span>
<span class="o">></span> <span class="k">SELECT</span> <span class="k">distinct</span><span class="p">(</span> <span class="n">v</span> <span class="p">)</span> <span class="k">FROM</span> <span class="n">foo</span><span class="p">;</span>
<span class="n">v</span>
<span class="c1">---</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
</code></pre></div></div>
<p>As readers can see, this time the column has a default value <strong>but the table has not been rewritten</strong>: the addition of the column is almost immediate, and the semantic meaning of the column itself is preserved. The existing tuples still hold <code class="language-plaintext highlighter-rouge">NULL</code>, so it is now possible to update the table in batches whenever appropriate, without the risk of new tuples hitting the table with a wrong value.</p>
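<p>The batch update of the old tuples could look like the following sketch. The slice boundaries on <code class="language-plaintext highlighter-rouge">i</code> are an arbitrary choice for this example, and the final <code class="language-plaintext highlighter-rouge">SET NOT NULL</code> is an optional extra step that is not part of the trick itself:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- update the old tuples one slice at a time, each statement in its own transaction
UPDATE foo SET v = 'B' WHERE v IS NULL AND i BETWEEN     1 AND  25000;
UPDATE foo SET v = 'B' WHERE v IS NULL AND i BETWEEN 25001 AND  50000;
UPDATE foo SET v = 'B' WHERE v IS NULL AND i BETWEEN 50001 AND  75000;
UPDATE foo SET v = 'B' WHERE v IS NULL AND i BETWEEN 75001 AND 100000;
-- check that nothing has been left behind
SELECT count(*) FROM foo WHERE v IS NULL;
-- optional: forbid NULL values from now on
ALTER TABLE foo ALTER COLUMN v SET NOT NULL;
</code></pre></div></div>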
PostgreSQL 10 short course in Modena2018-04-13T00:00:00+00:00https://fluca1978.github.io/2018/04/13/PostgreSQLCourseModena<p>The local Linux Users’ Group ConoscereLinux is delivering a six-part course on PostgreSQL 10. Guess who’s lecturing…</p>
<h1 id="postgresql-10-short-course-in-modena">PostgreSQL 10 short course in Modena</h1>
<p>I’m giving a short course on PostgreSQL, with particular regard to version 10, in Modena.
Thanks to the local <a href="https://conoscerelinux.org/">Linux Users’ Group (LUG) <em>ConoscereLinux</em></a>, which provided all the infrastructure for the course, I will introduce attendees to basic SQL concepts and how PostgreSQL works.</p>
<p>The course schedule is available <a href="https://conoscerelinux.org/courses/postgresql/">at the official course page</a>, and the course will be based on 6 lessons (2 already done). Attendees can come with their own laptops, and lessons will be “live”: I will show concepts while explaining on my own laptop running PostgreSQL 10.1.</p>
<p><br />
<br />
All the slides will be available for free on the course page, and are based on my work available on my <a href="https://github.com/fluca1978/fluca1978-pg-utils">github repository</a>. The course will take place every week on Tuesday evening.
So far the attendees are very interested in PostgreSQL and its technology, and are curious about evaluating all its features as a</p>
<p><br />
<br />
<strong>I really have to thank the ConoscereLinux LUG</strong>,
with particular regard to Luca and Massimiliano, both for giving me this chance and, most notably, for waiting for me to be ready after my last eye surgery, and for driving me home!</p>
PostgreSQL 10 and Python 3 (on FreeBSD)2018-04-12T00:00:00+00:00https://fluca1978.github.io/2018/04/12/PostgreSQL-plpython<p>Today a friend of mine asked me about a problem with PostgreSQL 10 and Python. Since I’m not a Pythonista, my quick answer was that PostgreSQL 10 does support Python 3. But it turned out it was not so simple (at least, not as simple as getting Perl 5 working!).</p>
<h1 id="postgresql-10-python-3-and-freebsd-and-ubuntu">PostgreSQL 10, Python 3 and FreeBSD (and Ubuntu)</h1>
<p>tl;dr <em>it can be done</em> (of course!)</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">postgres</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">pyv</span><span class="p">();</span>
<span class="n">pyv</span>
<span class="c1">----------------------------------------------------------------------------</span>
<span class="mi">3</span><span class="p">.</span><span class="mi">6</span><span class="p">.</span><span class="mi">4</span> <span class="p">(</span><span class="k">default</span><span class="p">,</span> <span class="n">Jan</span> <span class="mi">2</span> <span class="mi">2018</span><span class="p">,</span> <span class="mi">01</span><span class="p">:</span><span class="mi">25</span><span class="p">:</span><span class="mi">35</span><span class="p">)</span> <span class="o">+</span>
<span class="p">[</span><span class="n">GCC</span> <span class="mi">4</span><span class="p">.</span><span class="mi">2</span><span class="p">.</span><span class="mi">1</span> <span class="n">Compatible</span> <span class="n">FreeBSD</span> <span class="n">Clang</span> <span class="mi">4</span><span class="p">.</span><span class="mi">0</span><span class="p">.</span><span class="mi">0</span> <span class="p">(</span><span class="n">tags</span><span class="o">/</span><span class="n">RELEASE_400</span><span class="o">/</span><span class="k">final</span> <span class="mi">297347</span><span class="p">)]</span>
<span class="n">postgres</span><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">proname</span><span class="p">,</span> <span class="n">prolang</span><span class="p">,</span> <span class="n">prosrc</span> <span class="k">FROM</span> <span class="n">pg_proc</span> <span class="k">WHERE</span> <span class="n">proname</span> <span class="o">=</span> <span class="s1">'pyv'</span><span class="p">;</span>
<span class="o">-</span><span class="p">[</span> <span class="n">RECORD</span> <span class="mi">1</span> <span class="p">]</span><span class="c1">-----------------</span>
<span class="n">proname</span> <span class="o">|</span> <span class="n">pyv</span>
<span class="n">prolang</span> <span class="o">|</span> <span class="mi">16387</span>
<span class="n">prosrc</span> <span class="o">|</span> <span class="o">+</span>
<span class="o">|</span> <span class="n">import</span> <span class="n">sys</span> <span class="o">+</span>
<span class="o">|</span> <span class="k">return</span> <span class="n">sys</span><span class="p">.</span><span class="k">version</span><span class="o">+</span>
<span class="o">|</span>
</code></pre></div></div>
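<p>For reference, a function like <code class="language-plaintext highlighter-rouge">pyv</code> can be defined as in the following sketch, reconstructed from the <code class="language-plaintext highlighter-rouge">prosrc</code> column shown above. The <code class="language-plaintext highlighter-rouge">text</code> return type is an assumption, since the original definition is not shown:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- requires a server with working Python 3 support
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- return type assumed to be text
CREATE OR REPLACE FUNCTION pyv()
RETURNS text
AS $BODY$
import sys
return sys.version
$BODY$
LANGUAGE plpython3u;
</code></pre></div></div>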
<p>My default environment for running PostgreSQL is FreeBSD, so in order to get <code class="language-plaintext highlighter-rouge">plpython</code> working I jumped to the console and installed the package <code class="language-plaintext highlighter-rouge">postgresql10-plpython-10.3</code>, thinking of course that FreeBSD would do the right thing. Unfortunately, it did not!</p>
<p>Creating the <code class="language-plaintext highlighter-rouge">plpython3u</code> language did not succeed: the problem was that the above package installed only <code class="language-plaintext highlighter-rouge">plpython2.so</code> under the <code class="language-plaintext highlighter-rouge">lib</code> directory (e.g., <code class="language-plaintext highlighter-rouge">/usr/local/lib/postgresql</code>). I then tried installing the port of the very same name, but again it installed only <code class="language-plaintext highlighter-rouge">plpython2.so</code>.</p>
<p>In order to better understand what was missing, I switched to a clean <em>Ubuntu 17.10</em>, installing <a href="https://www.postgresql.org/download/linux/ubuntu/">all the <code class="language-plaintext highlighter-rouge">deb</code> packages for PostgreSQL and Python</a>, but again this did not work. The problem seemed to be that the system was using Python 3.7 while the <code class="language-plaintext highlighter-rouge">plpython</code> package required Python 3.5, which is no longer supported on that version of Ubuntu.
I then switched to another Ubuntu machine, older than the previous one, and tried installing the whole <a href="https://www.enterprisedb.com/downloads/postgres-postgresql-downloads">Enterprise DB Interactive Installer</a>, hoping it would come with a self-contained version. Again, this did not work, since the server was unable to load <code class="language-plaintext highlighter-rouge">plpython3.so</code> <em>even though it was in place</em>!
I then inspected that file in order to see what was missing:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ldd /opt/PostgreSQL/10/lib/postgresql/plpython3.so
linux-vdso.so.1 <span class="o">=></span> <span class="o">(</span>0x00007ffef10fd000<span class="o">)</span>
libpython3.4m.so.1.0 <span class="o">=></span> not found
libc.so.6 <span class="o">=></span> /lib/x86_64-linux-gnu/libc.so.6 <span class="o">(</span>0x00007fbf394cb000<span class="o">)</span>
/lib64/ld-linux-x86-64.so.2 <span class="o">(</span>0x000055d03b80f000<span class="o">)</span>
</code></pre></div></div>
<p>Fine: <code class="language-plaintext highlighter-rouge">libpython3.4m.so</code> is missing. I then searched my hard drive and found a copy lying around, so I exported <code class="language-plaintext highlighter-rouge">LD_LIBRARY_PATH</code> to include the path to that file and restarted the server.
This time I was able to create the language <code class="language-plaintext highlighter-rouge">plpython3u</code>.
But when I tried to define the above <code class="language-plaintext highlighter-rouge">pyv</code> function, the backend process crashed!
Inspecting the log I found:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting <span class="nv">$PYTHONHOME</span> to <prefix>[:<exec_prefix>]
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named <span class="s1">'encodings'</span>
LOG: server process <span class="o">(</span>PID 18776<span class="o">)</span> was terminated by signal 6: Aborted
</code></pre></div></div>
<p>So, it seemed to be a Python configuration problem. However, I was unable to solve it, probably due to my poor Python knowledge.</p>
<p>I then switched back to FreeBSD and got the latest PostgreSQL 10.3 source code.
I then compiled it, passing <code class="language-plaintext highlighter-rouge">--with-python</code> at <code class="language-plaintext highlighter-rouge">configure</code> time, and after a while I had a PostgreSQL 10.3 instance running with Python 3.6 working.</p>
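<p>To check how a running server was built, the <code class="language-plaintext highlighter-rouge">pg_config</code> system view (available since PostgreSQL 9.6 and, by default, readable only by superusers) exposes the <code class="language-plaintext highlighter-rouge">configure</code> options. The following is a minimal sketch that also lists which <code class="language-plaintext highlighter-rouge">plpython</code> flavours the installation offers:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- look for --with-python in the build options
SELECT setting FROM pg_config WHERE name = 'CONFIGURE';
-- which plpython extensions can be installed on this server?
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name LIKE 'plpython%';
</code></pre></div></div>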
<p>Therefore, at the end of this story, I was unable to get Python 3 working within the <code class="language-plaintext highlighter-rouge">plpython</code> language until I compiled PostgreSQL myself. I suspect binary packages should warn about the lack of such language support, to spare users this trouble. Of course, this could have just been due to my poor knowledge of Python itself…</p>
Creating many (really many) users in PostgreSQL2018-01-04T00:00:00+00:00https://fluca1978.github.io/2018/01/04/PostgreSQLUsers<p>PostgreSQL can really handle a lot of database users (<em>roles</em>), and it is possible to stress the system in a very simple way.</p>
<h1 id="creating-many-really-many-users-in-postgresql">Creating many (<em>really many</em>) users in PostgreSQL</h1>
<p>In his post <a href="https://www.cybertec-postgresql.com/en/creating-1-million-users-in-postgresql/">Hans-Jürgen Schönig showed how to easily and quickly create a million users in PostgreSQL</a>; taking inspiration from that post, I decided to stress one of my virtual machines with a slightly more complex user creation use case.</p>
<h2 id="roles-in-roles">Roles in Roles</h2>
<p>One feature of PostgreSQL <em>roles</em> is that they can contain other roles, creating a hierarchy of roles. Therefore, I decided to write a simple <code class="language-plaintext highlighter-rouge">plpgsql</code> function to loop creating a chain of roles at each iteration. The function <code class="language-plaintext highlighter-rouge">f_users</code> accepts two integers:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">deep</code> is the number of roles within the single role inheritance chain;</li>
<li><code class="language-plaintext highlighter-rouge">how_many</code> is the number of iterations.</li>
</ul>
<p>As a result, the function will create <code class="language-plaintext highlighter-rouge">( 1 + deep ) x how_many</code> roles. Each role name is built from a random string and the iteration number, which makes collisions very unlikely.</p>
<p>The function code is as follows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">OR</span> <span class="k">REPLACE</span> <span class="k">FUNCTION</span> <span class="n">f_users</span><span class="p">(</span> <span class="n">deep</span> <span class="nb">int</span><span class="p">,</span> <span class="n">how_many</span> <span class="nb">int</span> <span class="p">)</span>
<span class="k">RETURNS</span> <span class="n">VOID</span>
<span class="k">AS</span>
<span class="err">$</span><span class="n">BODY</span><span class="err">$</span>
<span class="k">DECLARE</span>
<span class="n">main_role_name</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">current_role_name</span> <span class="nb">text</span><span class="p">;</span>
<span class="n">current_level</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">iteration</span> <span class="nb">int</span><span class="p">;</span>
<span class="n">query</span> <span class="nb">text</span><span class="p">;</span>
<span class="k">BEGIN</span>
<span class="o"><<</span><span class="n">LP_MAIN</span><span class="o">>></span>
<span class="k">FOR</span> <span class="n">iteration</span> <span class="k">IN</span> <span class="mi">1</span><span class="p">..</span><span class="n">how_many</span> <span class="n">LOOP</span>
<span class="c1">-- main role</span>
<span class="n">main_role_name</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'role_test_'</span> <span class="o">||</span> <span class="n">md5</span><span class="p">(</span> <span class="n">random</span><span class="p">()::</span><span class="nb">text</span> <span class="p">)</span> <span class="o">||</span> <span class="s1">'_'</span> <span class="o">||</span> <span class="n">iteration</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Main role is %'</span><span class="p">,</span> <span class="n">main_role_name</span><span class="p">;</span>
<span class="n">query</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'CREATE ROLE '</span> <span class="o">||</span> <span class="n">main_role_name</span> <span class="o">||</span> <span class="s1">' WITH NOLOGIN CONNECTION LIMIT 0;'</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'%'</span><span class="p">,</span> <span class="n">query</span><span class="p">;</span>
<span class="k">EXECUTE</span> <span class="n">query</span><span class="p">;</span>
<span class="o"><<</span><span class="n">LP_DEEP</span><span class="o">>></span>
<span class="k">FOR</span> <span class="n">current_level</span> <span class="k">IN</span> <span class="mi">1</span><span class="p">..</span><span class="n">deep</span> <span class="k">BY</span> <span class="mi">1</span> <span class="n">LOOP</span>
<span class="n">current_role_name</span> <span class="p">:</span><span class="o">=</span> <span class="n">main_role_name</span> <span class="o">||</span> <span class="s1">'_'</span> <span class="o">||</span> <span class="n">current_level</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'Level % -> role %'</span><span class="p">,</span> <span class="n">current_level</span><span class="p">,</span> <span class="n">current_role_name</span><span class="p">;</span>
<span class="n">query</span> <span class="p">:</span><span class="o">=</span> <span class="s1">'CREATE ROLE '</span> <span class="o">||</span> <span class="n">current_role_name</span> <span class="o">||</span> <span class="s1">' WITH IN ROLE '</span> <span class="o">||</span> <span class="n">main_role_name</span> <span class="o">||</span> <span class="s1">' NOLOGIN CONNECTION LIMIT 0;'</span><span class="p">;</span>
<span class="n">RAISE</span> <span class="n">DEBUG</span> <span class="s1">'%'</span><span class="p">,</span> <span class="n">query</span><span class="p">;</span>
<span class="k">EXECUTE</span> <span class="n">query</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span> <span class="n">LP_DEEP</span><span class="p">;</span>
<span class="k">END</span> <span class="n">LOOP</span> <span class="n">LP_MAIN</span><span class="p">;</span>
<span class="k">END</span><span class="p">;</span>
<span class="err">$</span><span class="n">BODY</span><span class="err">$</span>
<span class="k">LANGUAGE</span> <span class="n">plpgsql</span><span class="p">;</span>
</code></pre></div></div>
<p>Please note that the above code can be optimized by reducing the number of <code class="language-plaintext highlighter-rouge">RAISE</code> statements (which imply string concatenation).
The <code class="language-plaintext highlighter-rouge">connection limit 0</code> is there for safety reasons: such automatically created roles should not be of any practical use.</p>
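<p>As a side note, the dynamic statements could also be built with <code class="language-plaintext highlighter-rouge">format()</code> and the <code class="language-plaintext highlighter-rouge">%I</code> placeholder, which quotes identifiers when needed, instead of plain concatenation. The following is only a sketch that creates a single main role and one member; the <code class="language-plaintext highlighter-rouge">_demo</code> suffix is made up for this example:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DO $BODY$
DECLARE
   -- hypothetical role name, same pattern as in f_users
   main_role_name text := 'role_test_' || md5( random()::text ) || '_demo';
BEGIN
   -- %I quotes the identifier if required
   EXECUTE format( 'CREATE ROLE %I WITH NOLOGIN CONNECTION LIMIT 0',
                   main_role_name );
   -- one member role, created inside the main role
   EXECUTE format( 'CREATE ROLE %I WITH IN ROLE %I NOLOGIN CONNECTION LIMIT 0',
                   main_role_name || '_1',
                   main_role_name );
END
$BODY$;
</code></pre></div></div>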
<h2 id="results">Results</h2>
<p>The first attempt was short and sweet: 5000 roles within 1000 groups.</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">SELECT</span> <span class="n">f_users</span><span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">1000</span> <span class="p">);</span>
<span class="n">f_users</span>
<span class="c1">---------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">965</span><span class="p">.</span><span class="mi">479</span> <span class="n">ms</span>
</code></pre></div></div>
<p>As readers can see, this took less than a second to perform. What about a 10x factor?</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">SELECT</span> <span class="n">f_users</span><span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">10000</span> <span class="p">);</span>
<span class="n">f_users</span>
<span class="c1">---------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">9118</span><span class="p">.</span><span class="mi">100</span> <span class="n">ms</span>
</code></pre></div></div>
<p>It seems time is growing linearly.
Let’s increase by a further 5x factor:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">SELECT</span> <span class="n">f_users</span><span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">50000</span> <span class="p">);</span>
<span class="n">f_users</span>
<span class="c1">---------</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span>
<span class="nb">Time</span><span class="p">:</span> <span class="mi">104680</span><span class="p">.</span><span class="mi">382</span> <span class="n">ms</span>
</code></pre></div></div>
<p>To recap, the following is the timing of role creations:</p>
<table>
<thead>
<tr>
<th style="text-align: center">Groups</th>
<th>Levels</th>
<th style="text-align: center">Roles</th>
<th style="text-align: right">Time</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">1000</td>
<td>5</td>
<td style="text-align: center">6000</td>
<td style="text-align: right">1 sec</td>
</tr>
<tr>
<td style="text-align: center"> </td>
<td>2</td>
<td style="text-align: center">3000</td>
<td style="text-align: right">0.3 sec</td>
</tr>
<tr>
<td style="text-align: center">10000</td>
<td>5</td>
<td style="text-align: center">60000</td>
<td style="text-align: right">10 sec</td>
</tr>
<tr>
<td style="text-align: center"> </td>
<td>2</td>
<td style="text-align: center">30000</td>
<td style="text-align: right">2.7 sec</td>
</tr>
<tr>
<td style="text-align: center">50000</td>
<td>5</td>
<td style="text-align: center">300000</td>
<td style="text-align: right">105 sec</td>
</tr>
<tr>
<td style="text-align: center"> </td>
<td>2</td>
<td style="text-align: center">150000</td>
<td style="text-align: right">36 sec</td>
</tr>
</tbody>
</table>
<p><em>for a total of <strong>549000</strong> roles in <strong>155 secs</strong>.</em></p>
<p>So time is not really increasing linearly, but as readers can see PostgreSQL can easily handle half a million roles in less than three minutes.
What about the virtual machine? Well, it is a <em>poor</em> <code class="language-plaintext highlighter-rouge">FreeBSD 11.1-RELEASE</code> running PostgreSQL 9.6 with 512 MB of RAM, without WAL archiving or any replication active. I cannot hit one million roles in a single shot on such a machine because it starts swapping until the swap daemon freezes.</p>
<p>To confirm this, let’s consider how many roles there are in my system:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">SELECT</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">FROM</span> <span class="n">pg_roles</span><span class="p">;</span>
<span class="k">count</span>
<span class="c1">--------</span>
<span class="mi">552018</span>
</code></pre></div></div>
<p>The final result is greater than expected because I already had a fair number of roles.</p>
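<p>To count only the generated roles, and to get rid of them afterwards, something like the following sketch could be used. It assumes that no other role name starts with <code class="language-plaintext highlighter-rouge">role_test_</code>; on a small machine, dropping hundreds of thousands of roles in a single transaction may take a while:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- the backslash escapes the underscore, which is a wildcard in LIKE
SELECT count(*) FROM pg_roles WHERE rolname LIKE 'role\_test\_%';

-- cleanup: DROP ROLE also removes the memberships of the dropped role
DO $BODY$
DECLARE
   r record;
BEGIN
   FOR r IN SELECT rolname FROM pg_roles WHERE rolname LIKE 'role\_test\_%' LOOP
      EXECUTE format( 'DROP ROLE %I', r.rolname );
   END LOOP;
END
$BODY$;
</code></pre></div></div>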
<p><em>Not so bad for a database!</em></p>
Perl blogs will be powered by PostgreSQL2017-06-11T06:54:00+00:00https://fluca1978.github.io/2017/06/11/perl-blogs-will-be-powered-by-postgresql<h1>~</h1>
<div style="text-align: justify;">There is a <a href="http://news.perlfoundation.org/2017/06/grant-proposal-revitalize-blog-1.html">grant request</a> aiming at revamping <a href="http://blogs.perl.org/">blogs.perl.org</a>.<br />I have to admit that <a href="http://blogs.perl.org/">blogs.perl.org</a> is in bad shape, and in fact I do not use it anymore for my personal content since the well-known<br /><a href="https://github.com/blogs-perl-org/blogs.perl.org/issues/291#issuecomment-69999275">login issues</a>.<br />Well, the important part about the grant request, at least with regard to the PostgreSQL community, is that…surprise! The new platform will store content on a PostgreSQL backend:<br /></div><i><br /></i><blockquote><i>[…]<br />will be written on top of Dancer2,<br />DBIx::Class, and DBI,<br />with a PostgreSQL database<br />imported from the existing<br />[…]</i><br /></blockquote><br />Great news for two of my favourite Open Source projects (Perl and PostgreSQL) and a great way to spread the word through our own content!<br />PgDay.IT 2016: schedule is online!2016-11-11T17:42:00+00:00https://fluca1978.github.io/2016/11/11/pgdayit-2016-schedule-is-online<h1>~</h1>
The <a href="http://2016.pgday.it/">PgDay.IT 2016</a> is approaching and the schedule is available online <a href="http://2016.pgday.it/en/pages/schedule.html">here</a>. Of course it may be subject to minor changes, but it is pretty much a complete list of what you are going to see at the tenth edition of the Italian PostgreSQL Day.<br /><br />PGDay.IT 2016: there's some extra time before the CFP is out!2016-10-14T19:43:00+00:00https://fluca1978.github.io/2016/10/14/pgdayit-2016-theres-some-extra-time<h1>~</h1>
<div style="text-align: justify;">Hey, the CFP for the <a href="http://2016.pgday.it/">PGDay.IT 2016</a> (the tenth edition of the Italian PGDay) has been extended until next <b>Saturday 22 October at 23:59</b> (Rome time).</div><div style="text-align: justify;">Don't miss the opportunity to be a speaker at one of the best-known PGDays!</div>PGDay.IT 2016: it's time for you to speak!2016-09-18T16:42:00+00:00https://fluca1978.github.io/2016/09/18/pgdayit-2016-its-time-for-you-to-speak<h1>~</h1>
As you probably already know, the Call For Papers for the <a href="http://2016.pgday.it/">PGDay.IT 2016</a> is now open. Please see the details <a href="http://2016.pgday.it/en/pages/call-for-papers.html">here</a> and send your contribution following the instructions. The organizing committee will review each proposal in order to deliver a great program for the tenth edition of the Italian PostgreSQL conference.PGday.IT 2016: tenth edition of the italian PostgreSQL conference2016-09-09T16:43:00+00:00https://fluca1978.github.io/2016/09/09/pgdayit-2016-tenth-edition-of-italian<h1>~</h1>
<div style="text-align: justify;">ITPUG is proud to announce the tenth edition of the Italian PostgreSQL conference, namely <a href="http://2016.pgday.it/">PGDay.IT 2016</a>, that will take place in Prato, Tuscany, on December 13th.</div><div style="text-align: justify;">The organizing committee is working to provide another great edition of the famous Italian day dedicated to PostgreSQL.</div><div style="text-align: justify;">Very soon the Call For Papers will start (see <a href="http://2016.pgday.it/">http://2016.pgday.it</a> for more details).</div><div style="text-align: justify;">In the meantime...stay tuned!</div><br /><br /><a href="http://2016.pgday.it/"><br /> <img alt="pgday_468x60_it" src="http://2016.pgday.it/images/pgday_468x60_it.png" height="60" width="468" /><br /></a>ITPUGLab @ PGDay.IT 20152015-10-26T20:49:00+00:00https://fluca1978.github.io/2015/10/26/itpuglab-pgdayit-2015<h1>~</h1>
<div style="text-align: justify;">I had the opportunity and pleasure to play an active role in the third ITPUGLab, a well-established tradition and a successful event that my friend Gianluca and I proposed a few years ago.<br />And I have to say: it was really fun and educational.</div><br /><div style="text-align: justify;">What is the ITPUGLab? In short: it is an Open Space container entirely focused on PostgreSQL.<br />Attendees meet to exchange, propose or request ideas, thoughts, approaches and experiences, getting 'hands-on' in a LAN environment and building a constructive shared experience on their laptops, or even having philosophical discussions of any kind, all centered on user experience and related to PostgreSQL, no matter what the participants' skill level is.<br />There is no predefined content: attendees come and propose or join others' proposals.<br /><div style="text-align: justify;">The evolution of the shared interactive contributions is what leads to discovering a path (not necessarily the right one) and getting to a possible goal.</div></div><div style="text-align: justify;">This translates to human networking with a PostgreSQL-social approach, allowing attendees to get acquainted in ways one cannot predict. 
</div><br /><div style="text-align: justify;">This year we had two and a half hours dedicated to the lab, a very comfortable room and very nice people attending.<br /><br />The following is the list of topics discussed and tried out:</div><ul style="text-align: justify;"><li>installation on Microsoft Windows, where the users tackled the differences in installing PostgreSQL on a Unix-unlike machine, reaching the goal of providing a running instance to other people in the room;</li><li>migration and upgrade, with particular interest in the migration of a quite old cluster from an MS Windows machine to a mature and up-to-date cluster on a *nix machine, as well as how to do it automatically and safely;</li><li>installing, configuring and using the PostGIS extension from scratch;</li><li>pl/pgsql scripting, with particular focus on editors, repos and best practices;</li><li>data integrity checks and validation with regard to the database and/or application;</li><li>periodic data dump and load from one server to one (or many) others, with regard to various scenarios and possible automations.</li></ul><br /><div style="text-align: justify;">Rules in the ITPUGLab are simple: after introducing themselves, participants start grouping spontaneously, warm up and get discussing, hands-on. Everybody can join a formed OpenSpace as well as leave it or, even, the room. 
When it's over, it's over: once the time has elapsed, pencils are down, and whatever happened is the only thing that could have happened.<br />Pictures cannot convey the excitement and fun filling the room.</div><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://3.bp.blogspot.com/-3LQDwXyBOws/ViuMFIipcWI/AAAAAAAADWc/ypm80u6_HpA/s1600/DSC00552.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="http://3.bp.blogspot.com/-3LQDwXyBOws/ViuMFIipcWI/AAAAAAAADWc/ypm80u6_HpA/s320/DSC00552.JPG" width="320" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-FW3NDrTYUTY/ViuMKeUIDMI/AAAAAAAADWk/kqjq14bQVnI/s1600/DSC00549.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="http://1.bp.blogspot.com/-FW3NDrTYUTY/ViuMKeUIDMI/AAAAAAAADWk/kqjq14bQVnI/s320/DSC00549.JPG" width="320" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://2.bp.blogspot.com/-NLfNeRgA64Y/ViuMQkhPNKI/AAAAAAAADWs/f34snLtmmK0/s1600/DSC00553.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="http://2.bp.blogspot.com/-NLfNeRgA64Y/ViuMQkhPNKI/AAAAAAAADWs/f34snLtmmK0/s320/DSC00553.JPG" width="320" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-ElZhjDYuC5A/ViuMT0qoZUI/AAAAAAAADW0/as2H-o5-8JI/s1600/DSC00554.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="http://4.bp.blogspot.com/-ElZhjDYuC5A/ViuMT0qoZUI/AAAAAAAADW0/as2H-o5-8JI/s320/DSC00554.JPG" width="320" /></a></div><br /><div style="text-align: justify;">As said, this is the third edition of the ITPUGLab, and quite frankly I'm proud of the continuous success it is getting within the PGDay.IT annual conference.<br />One thing all three editions did 
have in common is the same request by attendees for more time: we are evaluating how to extend the session at the next PGDay.IT.</div><div style="text-align: justify;">If you are coming to the next PGDay.IT, get into the lab: it's an experience you really don't want to miss!</div>PGDay.IT 2015: nine editions and counting!2015-10-24T13:38:00+00:00https://fluca1978.github.io/2015/10/24/pgdayit-2015-nine-editions-and-counting<h1>~</h1>
<div style="text-align: justify;"><div style="text-align: center;"><img alt="pgday_200x60_it" src="http://2015.pgday.it/wp-content/uploads/2015/08/pgday_200x60_it.png" height="60" width="200" /></div><br />We made it!<br /><a href="http://www.itpug.org/">ITPUG</a> (Italian PostgreSQL Users' Group) organized the ninth edition of the Italian PGDay, namely <a href="http://pgday.it/">PGDay.IT</a>.<br />We have a very strong and quite long tradition of organizing this national conference and, as in previous editions, we had a successful conference this year too.<br />The location, the Camera di Commercio di Prato, was simply great: a modern and really beautiful setting to host the two tracks and the third edition of the ITPUGLab, the Open Space container entirely dedicated to PostgreSQL.<br />The keynote speech was given by the well-known community member Andres Freund, but he was not the only member of the international community.<br />After the keynote and the usual coffee break, with many delicious Italian pastries, the conference split into two parallel tracks where a set of very competent and efficient speakers presented new projects, ideas, features and core implementations of our favorite database.<br />In the afternoon a third track was added to the two already mentioned, giving attendees the chance to take part in the ITPUGLab, another well-established tradition of the PGDay.IT.<br />Last but not least, the usual session of lightning talks, the group picture and the very good beer offered by one of the conference sponsors.<br /><br />At the end we can count one hundred attendees, ten regular talks, sixteen speakers, six sponsors and two social beers.<br /><br />It is quite difficult to recap in a few lines what this event was and has been in the past editions. 
I can only say that if you are missing this conference you are missing a very technical event within a friendly environment and fun context.</div>PGDay.IT 2015: we are here!2015-10-07T22:22:00+00:00https://fluca1978.github.io/2015/10/07/pgdayit-2015-we-are-here<h1>~</h1>
<div style="text-align: justify;">The ninth edition of the Italian PGDay (<a href="http://2015.pgday.it/">PGDay.IT 2015</a>) is really close, and ITPUG is proud to announce that the schedule is available on-line.</div><div style="text-align: justify;">As in the previous editions we have a rich set of talks and contributions, as well as the third edition of the ITPUG's own Open Space named <i>ITPUG-Lab</i>.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Check out the official website at <a href="http://2015.pgday.it/">http://2015.pgday.it</a> and see you soon at PGDay.IT 2015!</div>Thank you ITPUG2015-02-24T17:52:00+00:00https://fluca1978.github.io/2015/02/24/thank-you-itpug<h1>~</h1>
<div style="text-align: justify;">2014 was a very bad year, one I will remember forever for the things and the people I missed.</div><div style="text-align: justify;">It was also the first year I missed the PGDay.IT but today, thanks to the board of directors and volunteers, I received the shirts of the event.</div><div style="text-align: justify;">This means a lot to me, as part of this great community.</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-2UjAfmSlHeU/VOy5VouMo_I/AAAAAAAACQ8/MgZFV5MkEOo/s1600/20150224_180829.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-2UjAfmSlHeU/VOy5VouMo_I/AAAAAAAACQ8/MgZFV5MkEOo/s1600/20150224_180829.jpg" height="320" width="192" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-UcAjBCRRLQE/VOy5Xhf_XLI/AAAAAAAACRI/JGfIdturVGg/s1600/20150224_180841.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-UcAjBCRRLQE/VOy5Xhf_XLI/AAAAAAAACRI/JGfIdturVGg/s1600/20150224_180841.jpg" height="320" width="192" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://3.bp.blogspot.com/-ckmorNPPQ2I/VOy5VDcquDI/AAAAAAAACQ4/_dM3b0jQqM8/s1600/20150224_180856.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://3.bp.blogspot.com/-ckmorNPPQ2I/VOy5VDcquDI/AAAAAAAACQ4/_dM3b0jQqM8/s1600/20150224_180856.jpg" height="320" width="192" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://2.bp.blogspot.com/-JmdFqJPWyOM/VOy5aFtkcwI/AAAAAAAACRQ/GdYaDNE01oU/s1600/20150224_180909.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-JmdFqJPWyOM/VOy5aFtkcwI/AAAAAAAACRQ/GdYaDNE01oU/s1600/20150224_180909.jpg" 
height="320" width="192" /></a></div><br />Special thanks also to OpenERP Italia!ITPUG interview2015-02-22T15:53:00+00:00https://fluca1978.github.io/2015/02/22/itpug-interview<h1>~</h1>
<div style="text-align: justify;">Thanks to the effort of some of our associates, we were able to conduct a short interview with our associates themselves in order to see how ITPUG is working and how they feel within the association.</div><div style="text-align: justify;">The results, in Italian, are available <a href="http://fluca1978.blogspot.it/2015/02/sondaggio-itpug_23.html">here</a> for a first brief description.</div><div style="text-align: justify;">As a general trend, ITPUG is doing fine, or even better than it was a few years ago. However there is still a lot of work to do in order to spread the PostgreSQL word and to make our associates a little more involved in the community itself.</div><div style="text-align: justify;">As a last word, I believe this kind of interview should be performed on a regular basis in order to keep track of the work of the association and of its members.<br /><br /><b>Update: Feb 23</b><br />It seems that this kind of interview, and consequently the inspection and analysis of the results, generated a good discussion among the ITPUG members, so I'm proud of this other interesting result in the general management of the association.<b> </b> </div>ITPUG's Advent Calendar2014-12-23T22:41:00+00:00https://fluca1978.github.io/2014/12/23/itpugs-advent-calendar<h1>~</h1>
<div style="text-align: justify;">This December, borrowing a concept commonly used in other communities, I decided to challenge the <a href="http://www.itpug.org/">ITPUG</a> members to write an <a href="http://en.wikipedia.org/wiki/Advent_calendar">Advent Calendar</a>.</div><div style="text-align: justify;">The idea is quite simple: write a post (related to PostgreSQL of course) per day.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">We started a little late, on December 3rd, and were able to push an article almost every day so far. Posts were pushed to the ITPUG members mailing list, generating several interesting discussions among the Italian community.</div><div style="text-align: justify;">It is possible to find the posts at our (Italian-language) planet at <a href="http://www.planetpostgresql.it/">www.planetpostgresql.it</a>; please note that all articles on the planet are from my blog because I made the effort to "forward" them to the planet.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I hope this will be the first of several ITPUG Advent Calendars, and I hope other PostgreSQL Users' Groups will do the same in the future.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I'd like to thank all PostgreSQL developers, translators, testers for the great product they continue to deliver us.</div><div style="text-align: justify;">And I wish a Merry Christmas to you all.</div>Me @ Planet PostgreSQL2014-12-20T07:48:00+00:00https://fluca1978.github.io/2014/12/20/me-planet-postgresql<h1>~</h1>
<div style="text-align: justify;">This is my first attempt to appear on Planet PostgreSQL.</div><div style="text-align: justify;">I'm Luca, the current president of the Italian PostgreSQL Users' Group (ITPUG) and I'm a PostgreSQL addict.</div><div style="text-align: justify;">Unfortunately I'm currently not using PostgreSQL in my day-to-day job, but I'm following the evolution of the project and using it wherever possible.</div><div style="text-align: justify;">I hope to be able to contribute to the community somehow.</div>