Planet PostgreSQLhttps://planet.postgresql.orgPlanet PostgreSQLen-usMon, 16 Mar 2026 17:03:18 +0000Jobin Augustine: What Is in pg_gather Version 33 ?https://postgr.es/p/7uYIt started as a humble personal project a few years back. The objective was to convert all my PostgreSQL notes and learning into an automatic diagnostic tool, such that even a new DBA can easily spot the problems. The idea was simple: a tool which doesn&#8217;t need any installation but does all possible analysis and [&#8230;]Mon, 16 Mar 2026 17:03:18 +0000https://postgr.es/p/7uYCornelia Biacsics: Contributions for week 10, 2026https://postgr.es/p/7uV<p>On Tuesday March 10, 2026 <a href="https://www.meetup.com/postgresbe/events/313227373/">PUG Belgium met for the March edition</a>, organized by Boriss Mejias and Stefan Fercot. </p> <p>Speakers: </p> <ul> <li>Esteban Zimanyi </li> <li>Thijs Lemmens</li> <li>Yoann La Cancellera</li> </ul> <p>Robert Haas organized a Hacking Workshop on Tuesday March 10, 2026. Tomas Vondra discussed questions about one of his talks.</p> <p><a href="https://luma.com/5pglgx8h">PostgreSQL Edinburgh meetup Mar 2026</a> met on Thursday March 12, 2026</p> <p>Speakers:</p> <ul> <li>Radim Marek</li> <li>Jimmy Angelakos </li> </ul> <p><a href="https://eventyay.com/e/88882f3e">FOSSASIA Summit 2026</a> took place from Sunday March 8 to Tuesday March 10, 2026 in Bangkok. </p> <p>PostgreSQL speakers: </p> <ul> <li>Koji Annoura</li> <li>Charly Batista</li> <li>Gary Evans</li> <li>Joe Conway</li> <li>Suraj Kharage</li> <li>Robert Treat</li> <li>Sameer Kumar</li> <li>Roneel Kumar</li> <li>Sivaprasad Murali</li> <li>Yugo Nagata</li> <li>Denis Smirnov</li> <li>Vaibhav Dalvi</li> <li>Gyeongseon Park</li> <li>Bo Peng</li> <li>Brian McKerr</li> <li>Chris Travers</li> <li>Jirayut Nimsaeng</li> <li>Gilles Darold</li> <li>Rajni Baliyan</li> </ul> <p><a href="https://live.pgconf.in/">PostgreSQL Conference India</a> took place in Bengaluru (India) from March 11 to March 13, 2026. 
</p> <p>Organizers: </p> <ul> <li>Pavan Deolasee</li> <li>Ashish Kumar Mehra</li> <li>Nikhil Sontakke</li> <li>Hari Kiran</li> <li>Rushabh Lathia</li> </ul> <p>Talk Selection Committee:</p> <ul> <li>Amul Sul</li> <li>Dilip Kumar</li> <li>Marc Linster</li> <li>Thomas Munro</li> <li>Vigneshwaran c</li> </ul> <p>Speakers:</p> <ul> <li>Abhijeet Rajurkar</li> <li>Aditya Duvuri</li> <li>Ajit Awekar</li> <li>Amit Kumar Singh</li> <li>Amogh Bharadwaj</li> <li>Amul Sul</li> <li>Andreas Scherbaum</li> <li>Ashutosh Bapat</li> <li>Avinash Vallarapu</li> <li>Boopathi Parameswaran</li> <li>Claire Giordano</li> <li>Danish Khan</li> <li>Deepak R Mahto</li> <li>Dilip Kumar</li> <li>Divya Bhargov</li> <li>Dr. M. J. Shankar Raman</li> <li>Franck Pachot</li> <li>Hari Kiran</li> <li>Hari Prasad</li> <li>Harish Perumal</li> <li>Jayant Haritsa</li> <li>Jim Mlodgenski</li> <li>Jobin Augustine</li> <li>Joe Conway</li> <li>Kanthanathan S</li> <li>Kevin Biju</li> <li>Koji Annoura</li> <li>Kranthi Kiran Burada</li> <li>Lalit Choudhary</li> <li>Michael Zhilin</li> <li>Mithun Chicklore Yogendra</li> <li>Mohit Agarwal</li> <li>NarendraSingh Tawar</li> <li>Neel Patel</li> <li>Neeta Goel</li> <li>Nikhil Chawla</li> <li>Nikhil Sontakke</li> <li>Nishad Mankar</li> <li>Palak Chaturvedi</li> <li>Pavan Deolasee</li> <li>Pushkar Khadilkar</li> <li>Rahila Syed</li> <li>Rajeev Rastogi</li> <li>Rajkumar Raghuwanshi</li> <li>René Cannaò</li> <li>Ripunjay Tripathi</li> <li>Rohith BCS</li> <li>Roneel Rohitesh Kumar</li> <li>Sai Srirampur</li> <li>Sameer Kumar</li> <li>Samuel Cherukutty</li> <li>Sashikanta Pattanayak</li> <li>Sathakathullah Abdul Kafar</li> <li>Saurabh Gupta</li> <li>Shashidhar</li> <li>Shlok Kumar Kyal</li> <li>Shriram Muthukrishnan</li> <li>Srinath Reddy Sadipiralla</li> <li>Sumedh Pathak</li> <li>Suresh dash</li> <li>Tom Kincaid</li> <li>Vaibhav Popat</li> <li>Vaijayanti Bharadwaj</li> <li>Venkat Akhil Pavuluri</li> <li>Vinay Paladi</li> <li>Vishnu R Nambiar</li> <li>Wazir Ahmed</li> </ul> 
<p>Volunteers:</p> <ul> <li>Aarti Nadekar</li> <li>Aditya Sanjay Raje</li> <li>Ashesh Vashi</li> <li>Khushboo Vashi</li> <li>Pinaz Raut</li> <li>Rahila Syed</li> </ul> <p>Community Blog Posts: </p> <ul> <li><a href="https://gorthx.wordpress.com/2026/03/11/scale23x/">SCaLE23x</a> by Gabrielle Roth</li> </ul>Mon, 16 Mar 2026 08:20:47 +0000https://postgr.es/p/7uVRichard Yen: Learning AI Fast with pgEdge's RAGhttps://postgr.es/p/7uW<h1 id="introduction">Introduction</h1> <p>If you’ve been paying attention to the technology landscape recently, you’ve probably noticed that AI is <strong>everywhere</strong>. New frameworks, new terminology, and a dizzying array of acronyms and jargon: <strong>LLM</strong>, <strong>RAG</strong>, <strong>embeddings</strong>, <strong>vector databases</strong>, <strong>MCP</strong>, and more.</p> <p>Honestly, it’s been difficult to figure out where to start. Many tutorials either dive deep into machine learning theory (Bayesian transforms?) or hide everything behind a single API call to a hosted model. Neither approach really explains how these systems actually work.</p> <p>Recently I spent some time experimenting with the <a href="https://www.pgedge.com">pgEdge</a> AI tooling after hearing Shaun Thomas’ talk at a <a href="https://prairiepostgres.org/">PrairiePostgres</a> meetup. He talked about how to set up the various components of an AI chatbot system, starting from ingesting documents into a Postgres database, vectorizing the text, setting up a RAG and then an MCP server.</p> <p>When I got home I wanted to try it out for myself – props to the pgEdge team for making it all free and open-source! What surprised me most was not just that everything worked, but how easy it was to get a complete AI retrieval pipeline running locally. More importantly, it turned out to be one of the clearest ways I’ve found to understand how modern AI systems are constructed behind the scenes. 
Thanks so much, Shaun!</p> <hr /> <h1 id="the-pgedge-ai-components">The pgEdge AI Components</h1> <p>The pgEdge AI ecosystem provides several small tools that fit together naturally. I’ll go through them quickly here:</p> <ul> <li><a href="https://github.com/pgEdge/doc-converter">Doc Converter</a> – The doc-converter normalizes documents into a format that is easy to process downstream. Whether the input is PDF, HTML, Markdown, or plain text, the converter produces clean text output suitable for ingestion.</li> <li><a href="https://github.com/pgEdge/pgedge-vectorizer">Vectorizer</a> – The vectorizer handles the process of converting text chunks into embeddings. These embeddings are numeric representations of text that capture semantic meaning. Once generated, they can be stored inside PostgreSQL using <a href="https://github.com/pgvector/pgvector">pgvector</a> and queried with similarity search.</li> <li><a href="https://github.com/pgEdge/pgedge-rag-server">Retrieval-Augmented Generation (RAG) Server</a> – The RAG framework ties everything together. It orchestrates: <ol> <li>embedding the user’s query</li> <li>retrieving similar document chunks</li> <li>assembling prompt context</li> <li>sending the prompt to an LLM</li> <li>returning the generated response</li> </ol> </li> </ul> <p>When the full system is running, you essentially have ChatGPT or Gemini running on your laptop.</p> <hr /> <h1 id="running-everything-locally-with-ollama">Running Everything Locally with Ollama</h1> <p>With ChatGPT and Gemini, getting tokens or sharing my payment info was a blocker, especially when I just wanted to test stuff for educational purposes. Through Shaun’s presentation, I was introduced to <a href="https://ollama.com">Ollama</a>, which is a great alternative, if you’re okay with slower performance (especially on an 8GB M1 Mac Mini).</p> <p>I was pleasantly surprised at how easy it was to run the entire pipeline without relying on external AI APIs. 
Specifically, I used the <strong>embeddinggemma</strong> model for generating embeddings. This meant the entire stack could run locally, no API keys required! Running everything locally removes those barriers and definitely makes experimentation much easier.</p> <hr /> <h1 id="understanding-rag-by-actually-running-it">Understanding RAG by Actually Running It</h1> <p>One of the most confusing concepts in learning AI prior to Shaun’s talk was Retrieval-Augmented Generation (RAG). I learned that what a RAG does is:</p> <blockquote> <p>Before asking the LLM to answer a question, retrieve relevant information and include it in the prompt.</p> </blockquote> <p>With the pgEdge pipeline, the flow becomes very visible.</p> <ol> <li>Documents are converted into clean text</li> <li>Text is split into chunks</li> <li>Chunks are embedded into vectors</li> <li>Vectors are stored in PostgreSQL</li> <li>A question is embedded into a vector</li> <li>A similarity search finds relevant chunks</li> <li>Those chunks are inserted into the prompt</li> <li>The LLM generates the response</li> </ol> <p>From this, I realized that the LLM is not storing my data. Instead, the system retrieves relevant information <em>on demand</em> and feeds it into the prompt. The RAG is a facilitator to the LLM’s response.</p> <hr /> <h1 id="the-role-of-the-vectorizer">The Role of the Vectorizer</h1> <p>The vectorizer is a crucial step in the pipeline. Its job is to convert human language into embeddings, which are high-dimensional numeric representations of meaning. 
With vectors, searching with natural language becomes possible, instead of old-fashioned keyword matches.</p> <p>Once the embeddings (vectorized documents) are stored in PostgreSQL using pgvector, everything starts to look familiar again for database engineers:</p> <ul> <li>indexing</li> <li>storage</li> <li>similarity search</li> <li>ranking results</li> </ul> <p>Managing these things looks pretty doable for a database guy like me 😂</p> <hr /> <h1 id="dont-try-this-at-home"><del>Don’t</del> Try This At Home!</h1> <p>After writing about the pgEdge stack, I wanted to make it as easy as possible for others to reproduce the same experience, so I <a href="https://github.com/richyen/learn-ai-with-postgres">packaged everything into a Docker Compose project</a>.</p> <p>Clone the repository and run:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/richyen/learn-ai-with-postgres.git <span class="nb">cd </span>learn-ai-with-postgres <span class="nb">mkdir </span>documents <span class="c"># put some txt files in there for vectorization</span> docker compose up <span class="nt">--build</span> </code></pre></div></div> <p>That single command:</p> <ol> <li>Builds a custom PostgreSQL image with <code class="language-plaintext highlighter-rouge">pgvector</code> and <code class="language-plaintext highlighter-rouge">pgedge_vectorizer</code> compiled in</li> <li>Starts an Ollama container and pulls the <code class="language-plaintext highlighter-rouge">embeddinggemma</code> and <code class="language-plaintext highlighter-rouge">glm-4.7-flash</code> models locally</li> <li>Runs <code class="language-plaintext highlighter-rouge">pgedge-docloader</code> to ingest any documents you’ve put into the <code class="language-plaintext highlighter-rouge">documents/</code> folder</li> <li>Calls <code class="language-plaintext highlighter-rouge">pgedge_vectorizer.enable_vectorization()</code>, which starts background 
workers inside Postgres that chunk and embed every page</li> <li>Starts the RAG server on port 8080</li> </ol> <p>No API keys, no cloud services. Everything runs on your own hardware.</p> <p>Once the RAG server is up (watch for the setup container to exit cleanly), try asking it a question:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-s</span> <span class="nt">-X</span> POST http://localhost:8080/v1/pipelines/pg-docs <span class="se">\</span> <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span> <span class="nt">-d</span> <span class="s1">'{"query": "How does autovacuum decide when to run?"}'</span> <span class="se">\</span> | jq <span class="nb">.</span> </code></pre></div></div> <p>The answer comes back a few seconds later, grounded in the actual PostgreSQL documentation:</p> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w"> </span><span class="nl">"answer"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Autovacuum in PostgreSQL is triggered based on thresholds defined by two parameters: autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor. 
The daemon considers a table eligible for vacuuming when the number of dead tuples exceeds the threshold plus (scale_factor × total row count) ..."</span><span class="w"> </span><span class="p">}</span><span class="w"> </span></code></pre></div></div> <p>You can also run raw similarity searches directly in SQL to see exactly what the retrieval step is doing before the LLM touches anything:</p> <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">d</span><span class="p">.</span><span class="n">title</span><span class="p">,</span> <span class="k">left</span><span class="p">(</span><span class="k">c</span><span class="p">.</span><span class="n">content</span><span class="p">,</span> <span class="mi">200</span><span class="p">)</span> <span class="k">AS</span> <span class="n">snippet</span> <span class="k">FROM</span> <span class="n">documents_content_chunks</span> <span class="k">c</span> <span class="k">JOIN</span> <span class="n">documents</span> <span class="n">d</span> <span class="k">ON</span> <span class="k">c</span><span class="p">.</span><span class="n">source_id</span> <span class="o">=</span> <span class="n">d</span><span class="p">.</span><span class="n">id</span> <span class="k">WHERE</span> <span class="k">c</span><span class="p">.</span><span class="n">embedding</span> <span class="k">IS</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="k">c</span><span class="p">.</span><span class="n">embedding</span> <span class="o">&lt;=&gt;</span> <span class="n">pgedge_vectorizer</span><span class="p">.</span><span class="n">generate_embedding</span><span class="p">(</span><span class="s1">'autovacuum threshold configuration'</span><span class="p">)</span> <span class="k">LIMIT</span> <span class="mi">5</span><span class="p">;</span> </code></pre></div></div> <p>This is the same pgvector <code 
class="language-plaintext highlighter-rouge">&lt;=&gt;</code> (cosine distance) operator the RAG server uses internally — you can inspect the retrieval step at any time without going through the HTTP API.</p> <p>Embeddings are generated in the background by Postgres workers, so you can start querying as soon as a few hundred chunks are ready. Watch the progress with:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>psql postgresql://postgres:password@localhost:5432/pgai <span class="nt">-c</span> <span class="s2">" SELECT (SELECT count(*) FROM documents) AS total_docs, (SELECT count(*) FROM documents_content_chunks WHERE embedding IS NOT NULL) AS vectorized; "</span> </code></pre></div></div> <p>The project also includes the pgedge-postgres-mcp server on port 8081, which exposes the knowledge base via the Model Context Protocol — so it can be wired directly into Claude Desktop, VS Code Copilot, or any other MCP-compatible client.</p> <hr /> <h1 id="final-thoughts">Final Thoughts</h1> <p>There’s a lot of pressure right now to “learn AI,” but that phrase can mean many different things. For people coming from infrastructure, databases, or backend engineering, one of the most approachable paths is simply:</p> <blockquote> <p>build a small RAG pipeline and observe how the pieces fit together.</p> </blockquote> <p>The pgEdge tooling made this surprisingly straightforward. Instead of assembling half a dozen unrelated frameworks, the components already fit together:</p> <ul> <li>doc ingestion</li> <li>vectorization</li> <li>PostgreSQL storage</li> <li>retrieval</li> <li>prompt generation</li> <li>LLM response</li> </ul> <p>Once I saw the entire flow working end-to-end, the AI ecosystem makes a lot more sense. 
Setting up the pgEdge RAG stack turned out to be a surprisingly effective way to see that architecture in action.</p> <p>Enjoy!</p>Mon, 16 Mar 2026 08:00:00 +0000https://postgr.es/p/7uWDave Page: AI Features in pgAdmin: AI Insights for EXPLAIN Planshttps://postgr.es/p/7uX<p>This is the third and final post in a series covering the new AI functionality in <a href="https://www.pgadmin.org/">pgAdmin 4</a>. In the <a href="https://www.pgedge.com/blog/ai-features-in-pgadmin-configuration-and-reports">first post</a>, I covered LLM configuration and the AI-powered analysis reports, and in the <a href="https://www.pgedge.com/blog/ai-features-in-pgadmin-the-ai-chat-agent">second</a>, I introduced the AI Chat agent for natural language SQL generation. In this post, I'll walk through the AI Insights feature, which brings LLM-powered analysis to PostgreSQL EXPLAIN plans.Anyone who has spent time optimising PostgreSQL queries knows that reading EXPLAIN output is something of an acquired skill. pgAdmin has long provided a graphical EXPLAIN viewer that makes the plan tree easier to navigate, along with analysis and statistics tabs that surface key metrics, but interpreting what you're seeing and deciding what to do about it still requires a solid understanding of the query planner's behaviour. The AI Insights feature aims to bridge that gap by providing an expert-level analysis of your query plans, complete with actionable recommendations.<h2>Where to Find It</h2>AI Insights appears as a fourth tab in the EXPLAIN results panel, alongside the existing Graphical, Analysis, and Statistics tabs. It's only visible when an LLM provider has been configured, so if you don't see it, check that you've set up a provider in Preferences (as described in the first post). 
The tab header simply reads 'AI Insights'.To use it, run a query with EXPLAIN (or EXPLAIN ANALYZE for the most useful results, since actual execution timings give the AI much more to work with), and then click on the AI Insights tab. The analysis starts automatically when you switch to the tab, or you can trigger it manually with the Analyze button.<img src="https://a.storyblok.com/f/187930/950x887/76d389b7e7/picture1.png" /><h2>What the Analysis Provides</h2>The AI Insights analysis produces three sections:<h3>Summary</h3>A concise paragraph providing an overall assessment of the query plan's performance characteristics. This gives you a quick sense of whether the plan is generally healthy or has significant issues worth investigating. For well-optimised queries, the summary will confirm that the plan looks reasonable; for problematic ones, it highlights the key areas of concern.<h3>Performance Bottlenecks</h3>This is the heart of the analysis. The AI examines the plan tree and identifies specific nodes that may be causing performance problems. 
Each bottleneck is presented as a card showing:<ul><li>: Classified as high, medium, or low, with colour-coded indicators (red for high, orange for medium, blue for low) so you can quickly spot the most important issues</li></ul><ul><li>: The specific plan node involved (for example, 'Seq Scan on orders' or 'Nested Loop')</li></ul><ul><li>Issue: A brief description of the problem</li></ul><ul><li>: A more thorough explanation of why this is a problem and what impact it has on query performance</li></ul>The types of issues the analysis looks for include sequential scans on large tables where an index might help, nested loops with high row counts that suggest missing indexes or poor join ordering, large variances between estimated and actual row counts (which usually indicate stale statistics), sort operations on large datasets without supporting indexes, hash joins spilling to disk, and bitmap heap scans with excessive recheck conditions.Importantly, the analysis also applies contextual judgement. Not every sequential scan is a problem; scanning a small lookup table sequentially is often faster than using an index, and the AI takes table size and selectivity into account when deciding whether to flag something as an issue.<h3>Recommendations</h3>Each identified bottleneck comes with one or more prioritised recommendations for addressing it. Recommendations are numbered by priority, with the most impactful changes listed first. Each recommendation includes:<ul><li>: A short description of the suggested change</li></ul><ul><li>: Why this change will help, connecting the recommendation back to the specific bottleneck</li></ul><ul><li>: Where applicable, the exact SQL statement to implement the recommendation</li></ul>This last point is particularly valuable. Rather than telling you "consider adding an index" and leaving you to work out the details, the analysis provides the actual statement with the appropriate table name, column list, and index type. 
Each SQL code block has a copy button and an 'Insert into Editor' button that places the SQL directly into your query editor, so you can review and execute it with minimal friction.Recommendations aren't limited to index creation, however. You might see suggestions to run on tables with stale statistics, to adjust for queries that are spilling sorts or hash operations to disk, to rewrite suboptimal query structures, or to consider partial indexes when a full index would be unnecessarily large.<h2>A Worked Example</h2>To give a sense of what this looks like in practice, imagine you run EXPLAIN ANALYZE on a query that joins a large table with a table and filters by date range. The AI Insights analysis might produce something like this:: The query takes 2.3 seconds to execute, with the majority of time spent on a sequential scan of the table. The join to is well-optimised using an index lookup, but the date range filter on is not supported by an index, causing a full table scan of 4.2 million rows. (High Severity): Sequential Scan on , scanning 4,200,000 rows but returning only 12,500. The planner estimated 15,000 rows, suggesting statistics are reasonably up to date, but the lack of an index forces a full scan.: Create an index on the date column:: If queries typically also filter by status, consider a composite index:You could click 'Insert into Editor' on either recommendation, review the statement, execute it, and then re-run your EXPLAIN ANALYZE to see the improvement.<h2>Downloading Reports</h2>If you want to save or share the analysis, the Download button exports a complete Markdown report including the original SQL query, the raw execution plan, and the full AI analysis with all bottlenecks and recommendations. The file is named with the current date (for example, ) for easy filing.<h2>Regenerating and Stopping</h2>Because LLM responses can vary between invocations, you might occasionally want to get a second opinion on the same plan. 
The Regenerate button reruns the analysis from scratch, which can sometimes surface different insights or provide alternative recommendations. If a new EXPLAIN is run whilst the AI Insights tab is visible, the analysis will automatically trigger for the new plan.If the analysis is taking longer than expected (the timeout is five minutes, though most analyses complete in well under a minute), you can click the Stop button to cancel the in-flight request. The panel will show an 'Analysis stopped' message and you can choose to retry or move on.<h2>How It Works Under the Hood</h2>When you trigger an analysis, the frontend sends the full EXPLAIN plan (in JSON format) and the original SQL query to a backend endpoint via a streaming HTTP request. The backend constructs a prompt that instructs the LLM to act as a PostgreSQL performance expert, providing it with detailed guidelines on what to look for in query plans and how to classify severity. The LLM's response is parsed as structured JSON (with bottlenecks, recommendations, and summary as separate fields), which allows the frontend to render each piece with appropriate formatting and interactivity.The streaming architecture means you see a 'thinking' indicator whilst the analysis is in progress, with rotating messages such as 'Analyzing query plan...', 'Examining node costs...', 'Looking for sequential scans...', and 'Evaluating join strategies...'. Results appear as soon as the LLM completes its response, without needing to reload or poll.<h2>Getting the Most from AI Insights</h2>A few suggestions for making the most of this feature:<ul><li>. The actual execution timings and row counts give the AI significantly more information to work with. Plain EXPLAIN provides only the planner's estimates, which limits the depth of analysis possible.</li></ul><ul><li>. The AI provides excellent starting points, but you should consider your specific workload patterns before creating indexes. 
An index that helps one query might slow down write-heavy operations on the same table. Use the recommendations as informed suggestions that merit testing rather than directives to follow without question.</li></ul><ul><li>. Even if you're already experienced with EXPLAIN output, the detailed explanations of why specific plan nodes are problematic can help reinforce your understanding or occasionally highlight something you might have overlooked. For less experienced users, it's an excellent way to build up familiarity with how PostgreSQL executes queries.</li></ul><ul><li>. If the AI Insights analysis identifies issues but you want to explore further (perhaps to understand your data distribution or check current index usage statistics), switch to the AI Chat agent and ask follow-up questions. The two features complement each other well.</li></ul><h2>Wrapping Up the Series</h2>Across these three posts, I've covered the full range of AI functionality now available in pgAdmin 4: the LLM configuration that underpins everything, the AI-powered security, performance, and schema design reports for proactive database analysis, the AI Chat agent for natural language SQL generation and database exploration, and the AI Insights feature for query plan optimisation.All of these features are designed to enhance rather than replace your expertise. They lower the barrier to performing analyses that would otherwise require significant time and specialist knowledge, whilst keeping you firmly in control of what actually gets executed against your database. Whether you use a cloud-hosted model from Anthropic or OpenAI, or prefer to keep everything local with Ollama or Docker Model Runner, the AI features adapt to your environment and preferences.Give them a try; I think you'll find they become a natural part of your PostgreSQL workflow. 
And as always, we welcome feedback and contributions from the community.</p>Mon, 16 Mar 2026 06:31:22 +0000https://postgr.es/p/7uXPavel Luzanov: PostgreSQL 19: part 4 or CommitFest 2026-01https://postgr.es/p/7uU<p>Continuing the series of CommitFest 19 reviews, today we&rsquo;re covering the January 2026 CommitFest. <p>The highlights from previous CommitFests are available here: <a href="https://postgrespro.com/blog/pgsql/5972724">2025-07</a>, <a href="https://postgrespro.com/blog/pgsql/5972743">2025-09</a>, <a href="https://postgrespro.com/blog/===FIXME===">2025-11</a>. <ul> <li>Partitioning: merging and splitting partitions</li> <li>pg_dump[all]/pg_restore: dumping and restoring extended statistics</li> <li>file_fdw: skipping initial rows</li> <li>Logical replication: enabling and disabling WAL logical decoding without server restart</li> <li>Monitoring logical replication slot synchronization delays</li> <li>pg_available_extensions shows extension installation directories</li> <li>New function pg_get_multixact_stats: multixact usage statistics</li> <li>Improvements to vacuum and analyze progress monitoring</li> <li>Vacuum: memory usage information</li> <li>vacuumdb --dry-run</li> <li>jsonb_agg optimization</li> <li>LISTEN/NOTIFY optimization</li> <li>ICU: character conversion function optimization</li> <li>The parameter standard_conforming_strings can no longer be disabled</li> </ul> <p>...Mon, 16 Mar 2026 00:00:00 +0000https://postgr.es/p/7uUAshutosh Bapat: Professional karmahttps://postgr.es/p/7uT<p>In the very early days of my career, an incident made me realise that performing my job irresponsibly would affect me adversely, not because it would affect my position adversely, but because it could affect my life in other ways too. I was part of a team that produced software used by a financial institution where I held my account. A bug in the software caused a failure which made several accounts, including my bank account, inaccessible! 
Fortunately, I wasn't the one who introduced that bug, and neither was any other software engineer working on the product. It had simply crept through the cracks that the age-old software had developed as it went through many improvements. That is something that happens to all architectures, software or otherwise, in the world. That was an enlightening and eye-opening experience. But professional karma is not always bad; many times it's good. When the humble work I do to earn my living also improves my living, it gives me immense satisfaction. It means that it's also improving billions of lives that way across the globe.</p><p>When I was a post-graduate student at IIT Bombay, I often travelled by train - local and intercity. The online ticketing system for long-distance trains was still in its early stages. Local train tickets were still issued at stations, and getting one required standing in a long queue. Fast forward to today: you can buy a local train ticket on a mobile app or at a kiosk at the station by paying online through UPI. On my recent trip to IIT Bombay I bought such a ticket using GPay in a few seconds. And you know what, UPI uses PostgreSQL as an OLTP database in its system. I didn't have to go through the same experience, thanks to that same education and the work I am doing. 
Students studying at my alma mater no longer have to go through that painful experience, thanks to the many PostgreSQL contributors who were once students and might have had similar painful experiences in their own lives.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3gfVoagA29_N1QXwEGbKfULOQo25mDpr0ek4npRV6FoxqmcHHtwqjWkgDiY2Fmk_5HO6bsokp4ULHI3Udgdo_lSYBDsByXr_Y-W6qxuv8y_mY_e9FW-4PlYY27q4hZheC7T0Ft7MoeAKqmmLVi5NzQfmWcA7G4Me1gJTMaeVtR8Zno8qUu2Ef4sqxB3U/s1280/1773395145724.jpeg" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3gfVoagA29_N1QXwEGbKfULOQo25mDpr0ek4npRV6FoxqmcHHtwqjWkgDiY2Fmk_5HO6bsokp4ULHI3Udgdo_lSYBDsByXr_Y-W6qxuv8y_mY_e9FW-4PlYY27q4hZheC7T0Ft7MoeAKqmmLVi5NzQfmWcA7G4Me1gJTMaeVtR8Zno8qUu2Ef4sqxB3U/w400-h300/1773395145724.jpeg" width="400" /></a></div><br /><p><br /></p><p>At PGConf.India, <a href="https://www.linkedin.com/in/kojiannoura/">Koji Annoura</a>, who is a graph database expert, talked about our ongoing work on SQL/PGQ. He is also a certified coffee professional, and coffee is a drink that I greatly enjoy! He talked about improving the coffee supply chain using graph databases. He is using SQL/PGQ, the software I am co-authoring with Peter Eisentraut. I love coffee, and my work is helping me procure better-quality coffee at a cheaper price! Someone telling me that my work was useful to them gives me immense satisfaction, irrespective of the size of the cause. As software developers we don't often get to hear that from the end users. Open source software and the conferences around it give us that opportunity.</p><p>Professional life is full of stress: stress to get work done, to get paid, to secure a job, to get promoted, and whatnot. That stress is so overwhelming that we often lose sight of the greater purpose. 
We often fail to notice that our work has other, arguably greater, benefits. These simple moments are enough to motivate me to continue doing my work, leaving behind the stress that it carries.</p>Sat, 14 Mar 2026 05:48:00 +0000https://postgr.es/p/7uTShane Borden: More Obscure Things That Make It Go “Vacuum” in PostgreSQLhttps://postgr.es/p/7uS<p class="wp-block-paragraph">I previously blogged about ensuring that the &#8220;ON CONFLICT&#8221; directive is used to keep vacuum from having to do additional work. I also later demonstrated how the MERGE statement accomplishes the same thing.<br /><br />You can read the original blogs here <a href="https://stborden.wordpress.com/2024/06/18/reduce-vacuum-by-using-on-conflict-directive/" rel="noreferrer noopener" target="_blank">Reduce Vacuum by Using “ON CONFLICT” Directive</a> and here <a href="https://shaneborden.com/2024/09/04/follow-up-reduce-vacuum-by-using-on-conflict-directive/">Follow-Up: Reduce Vacuum by Using “ON CONFLICT” Directive</a></p> <p class="wp-block-paragraph">Now, in another recent customer case, I was chasing down why the application was triggering tens of thousands of foreign key and constraint violations per day, and I began to wonder whether these kinds of errors also caused additional vacuum work, as described in those previous blogs. 
Sure enough it <strong>DEPENDS</strong>.</p> <p class="wp-block-paragraph">Let&#8217;s set up a quick test to demonstrate:</p> <div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate"> /* Create related tables: */ CREATE TABLE public.uuid_product_value ( id int PRIMARY KEY, pkid text, value numeric, product_id int, effective_date timestamp(3) ); CREATE TABLE public.uuid_product ( product_id int PRIMARY KEY ); ALTER TABLE uuid_product_value ADD CONSTRAINT uuid_product_value_product_id_fk FOREIGN KEY (product_id) REFERENCES uuid_product (product_id) ON DELETE CASCADE; /* Insert some mocked up data */ INSERT INTO public.uuid_product VALUES ( generate_series(0,200)); INSERT INTO public.uuid_product_value VALUES ( generate_series(0,10000), gen_random_uuid()::text, random()*1000, ROUND(random()*100), current_timestamp(3)); /* Vacuum Analyze Both tables */ VACUUM (VERBOSE, ANALYZE) uuid_product; VACUUM (VERBOSE, ANALYZE) uuid_product_value; /* Verify that there are no dead tuples: */ SELECT schemaname, relname, n_live_tup, n_dead_tup FROM pg_stat_all_tables WHERE relname in ('uuid_product_value', 'uuid_product'); schemaname | relname | n_live_tup | n_dead_tup ------------+--------------------+------------+------------ public | uuid_product_value | 10001 | 0 public | uuid_product | 201 | 0 </pre></div> <p class="wp-block-paragraph">Then, let&#8217;s issue a simple insert that will violate the FK and check to see if dead tuples were generated:</p> <div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate"> /* Insert a row that violates the FK, without the ON CONFLICT directive */ INSERT INTO public.uuid_product_value VALUES ( generate_series(10001,10001), gen_random_uuid()::text, random()*1000, 202, /* we know this product_id doesn't exist in the parent */ current_timestamp(3)); ERROR: insert or update on table "uuid_product_value" violates foreign key constraint "uuid_mod_test_product_id_fk" DETAIL: Key 
(product_id)=(202) is not present in table "uuid_product". Time: 3.065 ms </pre></div> <p class="wp-block-paragraph">And now check the tuple stats:</p> <div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate"> SELECT schemaname, relname, n_live_tup, n_dead_tup FROM pg_stat_all_tables WHERE relname in ('uuid_product_value', 'uuid_product'); schemaname | relname | n_live_tup | n_dead_tup ------------+--------------------+------------+------------ public | uuid_product_value | 10001 | 1 public | uuid_product | 201 | 0 </pre></div> <p class="wp-block-paragraph">Sure enough, we now have a dead row as a result of the FK violation on the insert. But, will an &#8220;ON CONFLICT&#8221; directive help us in this scenario like in the others?</p> <div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate"> /* Insert a row that violates the FK, but with the ON CONFLICT directive */ INSERT INTO public.uuid_product_value VALUES ( generate_series(10001,10001), gen_random_uuid()::text, random()*1000, 202, /* we know this product_id doesn't exist in the parent */ current_timestamp(3)) ON CONFLICT DO NOTHING; /* Verify the tuple stats: */ SELECT schemaname, relname, n_live_tup, n_dead_tup FROM pg_stat_all_tables WHERE relname in ('uuid_product_value', 'uuid_product'); schemaname | relname | n_live_tup | n_dead_tup ------------+--------------------+------------+------------ public | uuid_product_value | 10001 | 2 public | uuid_product | 201 | 0 </pre></div> <p class="wp-block-paragraph">Unfortunately, it does not solve this problem: &#8220;ON CONFLICT&#8221; only handles unique and exclusion constraint violations, not foreign key violations. So we really need to be cognizant of FK violations and their effect on vacuum. Now what about trying to insert a NULL into a NOT NULL column? Will that result in a dead row? 
Let&#8217;s check.</p> <div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate"> /* Alter a column to NOT NULL */ ALTER TABLE public.uuid_product_value ALTER COLUMN pkid SET NOT NULL; /* Check the table definition */ Table &quot;public.uuid_product_value&quot; Column | Type | Collation | Nullable | Default ----------------+--------------------------------+-----------+----------+--------- id | integer | | not null | pkid | text | | not null | value | numeric | | | product_id | integer | | | effective_date | timestamp(3) without time zone | | | Indexes: &quot;uuid_mod_test_pkey&quot; PRIMARY KEY, btree (id) &quot;uuid_mod_test_product_id_idx&quot; btree (product_id) WHERE id &gt;= 1 AND id &lt;= 1000 &quot;uuid_mod_test_product_id_idx1&quot; hash (product_id) Foreign-key constraints: &quot;uuid_mod_test_product_id_fk&quot; FOREIGN KEY (product_id) REFERENCES uuid_product(product_id) ON DELETE CASCADE /* Insert a row that violates the NOT NULL constraint */ INSERT INTO public.uuid_product_value VALUES ( generate_series(10001,10001), NULL, random()*1000, 200, current_timestamp(3)); ERROR: null value in column &quot;pkid&quot; of relation &quot;uuid_product_value&quot; violates not-null constraint DETAIL: Failing row contains (10001, null, 613.162063338205, 200, 2026-03-13 14:25:28.758). /* Verify the tuple stats: */ SELECT schemaname, relname, n_live_tup, n_dead_tup FROM pg_stat_all_tables WHERE relname in (&#039;uuid_product_value&#039;, &#039;uuid_product&#039;); schemaname | relname | n_live_tup | n_dead_tup ------------+--------------------+------------+------------ public | uuid_product_value | 10001 | 2 public | uuid_product | 201 | 0 </pre></div> <p class="wp-block-paragraph">As you can see, a violation of the NOT NULL constraint does not have the same behavior as a violation of the FK constraint. 
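</p> <p class="wp-block-paragraph">The difference most likely comes down to timing: PostgreSQL checks NOT NULL constraints before the row is written to the table, while foreign keys are checked by triggers that fire after the row is already in the heap, so the aborted insert leaves a dead tuple behind. One way to sidestep the FK error is to let the parent table drive the insert. The following is a minimal sketch, not a tested fix, reusing the tables from the test above; the id value 10001 is only illustrative:</p> <div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
/* Sketch: take product_id from the parent table, so a missing
   parent produces zero rows to insert instead of an FK error */
INSERT INTO public.uuid_product_value
SELECT 10001, gen_random_uuid()::text, random()*1000, p.product_id, current_timestamp(3)
FROM public.uuid_product p
WHERE p.product_id = 202;
</pre></div> <p class="wp-block-paragraph">Because product_id 202 is absent from the parent, this statement should insert no rows and therefore write no tuple for vacuum to clean up. The FK constraint is still needed, since a parent row could be deleted concurrently.</p> <p class="wp-block-paragraph">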
It&#8217;s always good to know and relay to the application development staff what operations are going to result in more work for the database and adjust the code accordingly. Enjoy!</p>Fri, 13 Mar 2026 15:51:40 +0000https://postgr.es/p/7uSShaun Thomas: Using Patroni to Build a Highly Available Postgres Cluster—Part 2: Postgres and Patronihttps://postgr.es/p/7uR<p>Welcome to Part two of our series about building a High Availability Postgres cluster using <a href="https://www.pgedge.com/blog/using-patroni-to-build-a-highly-available-postgres-clusterpart-1-etcd">Patroni! Part one</a> focused entirely on establishing the DCS using etcd, providing the critical layer that Patroni uses to store metadata and guarantee its leadership token uniqueness across the cluster. With this solid foundation, it's now time to build the next layer in our stack: Patroni itself. Patroni does the job of managing the Postgres service and provides a command interface for node administration and monitoring. Technically the Patroni cluster is complete at the end of this article, but stick around for part three where we add the routing layer that brings everything together. Hopefully you still have the three VMs where you installed etcd. Those will be the same place where everything else happens, so if you haven’t already gone through the steps in part one, come back when you’re ready. Otherwise, let’s get started!<h2>Installing Postgres</h2>The Postgres community site has an incredibly thorough page dedicated to <a href="https://www.postgresql.org/download/"><u>installation on various platforms</u></a>. For the sake of convenience, this guide includes a simplified version of the Debian instructions. Perform these steps on all three servers. Start by setting up the PGDG repository: Then install your favorite version of Postgres. For the purposes of this guide, we’re also going to stop Postgres and drop the initial cluster the Postgres package creates. 
Patroni will recreate all of this anyway, and it should be in control. It’s also important to completely disable the default Postgres service since Patroni will be in charge: Finally, install the version of Patroni included in the PGDG repositories. This should be available on supported platforms like Debian and RedHat variants, but if it isn’t, you may have to resort to the <a href="https://patroni.readthedocs.io/en/master/installation.html"><u>official installation instructions</u></a>. Once that command completes, we should have three fresh VMs ready for configuration.<h2>Configuring Patroni the easy way</h2>The Debian Patroni package provides a tool called  that transforms a Patroni template into a configuration file customized specifically for Debian systems. Before using it, it’s necessary to modify part of that template to use etcd, as ZooKeeper is the default. Perform these steps on all three servers. Note that the YAML header shows “etcd3” rather than simply “etcd”. Patroni uses etcd2 by default for backward compatibility purposes, and version 3 requires a much different communication protocol. Then create the rest of the config with a single command: This creates a file named  in the  configuration directory, which systemd uses when managing this specific cluster. We’ll also need this for invoking .<h2>Understanding Patroni configuration</h2>Despite the fact that the configuration file is already complete, it’s important to actually understand the purpose of each section and what it does. This will enable users of other platforms to manually configure Patroni if necessary. Let’s start with the topmost section dedicated to the DCS: When Patroni writes to the DCS, all keys start at the path specified by the  parameter. Similarly, as one DCS may host multiple clusters, keys for this cluster must include  in the key path. The  indicates how Patroni should refer to this individual node. 
The configuration tool actually uses the DCS to see which names are already reserved so each VM will be uniquely identified. Go ahead and check all three to make sure they’re correct. The next section, labeled , determines how Patroni should create the initial Postgres cluster, the parameters to use, and other important information. It’s also pretty long, so let’s look at each individual portion: Normally Patroni uses  when creating a new cluster, but for full compatibility with Debian organization quirks regarding Postgres, the configuration specifies an alternative command. This short section will likely only appear on a Debian system. Next comes the  section under . All of these parameters should be covered in the <a href="https://patroni.readthedocs.io/en/master/dynamic_configuration.html"><u>Dynamic Configuration Settings</u></a> documentation, but we’ll explain the important ones. It’s important to note that any settings defined here actually persist in the DCS layer and apply to Patroni on all nodes. After initialization, the only way to change these parameters is through the  utility. It’s a good idea to make sure all of these are set properly, as changing them later is somewhat inconvenient. These parameters define how Patroni interacts with the DCS layer and how it should manage certain Postgres features. Remember that the leadership token determines which node is the primary, so  defines how long that lease should last,  controls how long to wait between lease renewals, and  says how long to wait for a response from the DCS. We’ve included  in this output because the leader race isn’t quite absolute. If Patroni promotes a node to primary, or determines Postgres has failed, it has up to this timeout before it forces a failover. This provides a grace period for crash recovery to complete, but you may find the default of five minutes is much too long. 
Another important parameter here is , which tells Patroni it should manage the  Postgres configuration setting by automatically using names of other nodes in the cluster. This is how you would enable synchronous replication in Patroni. Next is the  section under : This section defines how Patroni should operate the Postgres service. The first few parameters control how Patroni recycles old primary nodes, such as using the  utility when possible, and whether it should erase the data directory as a last resort. Patroni also uses replication slots for replicas by default to prevent unnecessary replica rebuilds in failure scenarios. You can also pass GUC settings directly to Postgres on all nodes through the  section. This is useful for providing important cluster-wide settings that may not be hardware dependent, such as , , or . The final section under the  heading is : You’ll want to customize this section before starting Patroni; it uses this to build the <a href="https://www.postgresql.org/docs/current/auth-pg-hba-conf.html"><u>pg_hba.conf</u></a> file that controls incoming connection access. The default will allow connections on the server’s subnet if you uncomment the disabled line, otherwise it’s local access only. Next is another  section, but this is a top-level header meant to tell Patroni how it should handle Postgres on this specific server. These sections are explained in more detail in the Patroni <a href="https://patroni.readthedocs.io/en/master/yaml_configuration.html"><u>YAML Configuration Settings</u></a> documentation. This example starts with some Debian-specific content: As before, this is so Debian can integrate with the other packaged Postgres tooling, so it’s safe to skip on other platforms. After that come a few pertinent parameters for handling connections: This sample effectively tells Patroni how it should connect to the local Postgres service for administrative actions. 
Patroni uses unix sockets when possible using these settings, which makes sense as Patroni runs as the postgres OS user and has direct socket access. Then comes a fun section that defines several paths: Patroni knows it will be installed in several different environments where Postgres and configuration directories may be in completely arbitrary locations. These are the defaults for Postgres 18 running on a Debian system. Lastly there’s a second parameters section, meant for parameters that should only apply to this specific Postgres server: Nothing here should be surprising; it’s mostly just log storage for the local instance and where the unix socket directory is located. These are likely to be universal across the cluster, but it’s safer to leave them out of the DCS section. If there is ever any variance caused by a hardware or OS distribution migration, you’ll want to have the ability to change these locally. In any case, take some time to examine the  file on each node to spot-check it for any mistakes.<h2>Starting and validating Patroni</h2>The Patroni package provides a standard systemd service file; simply enable and start the service on all VMs. One of the three nodes will “win” the leader race and become the primary for the cluster. Patroni then invokes the  command on that system to create the data and configuration directories before starting Postgres. On the other nodes, Patroni calls  instead to create new streaming replicas. If you want a specific node to start as the primary, simply start Patroni on that node and wait for it to establish a cluster before starting the service on the other two. The end result on all three systems should be a new “demo” database visible to : The next step is to check the status of the Patroni cluster itself. You should be able to run this command from any node as the  OS user. 
It will also work as , but now that Patroni is installed and managing the cluster, it’s best to avoid relying on the root user. This output tells us the cluster is healthy and operational, node 1 is the current primary, both replicas are streaming, and there’s no replication lag. Success!<h2>Editing the cluster configuration</h2>The last step that might be necessary is to modify the cluster configuration stored in the DCS layer. These are the Postgres parameters and pg_hba.conf entries used to bootstrap the initial state of the cluster, and it’s easy to make mistakes early on. Once again,  comes to the rescue: Patroni loads the current DCS config into the current default editor, and in our case it looks like this: Use this as an opportunity to fix any missing HBA lines, or add any Postgres parameters that should apply to all nodes. For example, add  under  to enable logical replication: Since changing the  parameter requires a Postgres restart, use  to restart the nodes in the cluster: Then check with Postgres to verify that the setting changed as expected. This is the output from node 3, even though I modified the DCS and restarted the cluster from node 1:<h2>Finishing up</h2>Now you know why this series was broken into three parts! Setting up Patroni isn’t too difficult by itself, but getting the configuration right, knowing how and why each section works the way it does, and continuing to modify the cluster after deployment, is a complex process. But if you followed along, you should have a fully operational Patroni cluster at this very moment. Technically you can even stop here and skip the third and final installment of this series. Postgres supports <a href="https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING"><u>multi-host connection strings</u></a>, and specifying  for the  restricts connections to the primary node. 
Connecting with psql might look like this: But what if, in some distant future, we change server names, or add more nodes to the cluster, or want other connection restrictions? That’s where the routing layer comes in, and what fully completes a Patroni deployment. So come back next week to learn about HAProxy and how it provides that critical and final component!</p>Fri, 13 Mar 2026 06:12:14 +0000https://postgr.es/p/7uRDeepak Mahto: PGConf India 2026: PostgreSQL Query Tuning: A Foundation Every Database Developer Should Buildhttps://postgr.es/p/7uP<p class="wp-block-paragraph">Most PostgreSQL tuning advice that folks chase consists of quick fixes, not an understanding of what made the planner choose one path or join over other, more optimal paths.</p> <blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"> <p class="wp-block-paragraph"><strong>Tuning should not start with running ANALYZE on the tables involved in the query, but with the intent to understand what is causing the issue and why the planner cannot choose the optimal path on its own.</strong></p> </blockquote> <p class="wp-block-paragraph">Most fixes we search for in SQL tuning boil down to: </p> <pre class="wp-block-preformatted">Add an index. <br />Rewrite the query. <br />Bump work_mem. <br />Done.</pre> <p class="wp-block-paragraph">Except it&#8217;s not done. The same problem comes back, different query, different table, same confusion.</p> <h2 class="wp-block-heading">The Real Problem</h2> <p class="wp-block-paragraph">A slow query is a symptom. 
Statistics, DDL, query style, and PG version are the actual culprits.</p> <p class="wp-block-paragraph">Before you touch anything, you need to answer five questions — in order:</p> <ul class="wp-block-list"> <li>Find it — which query actually hurts the most right now?</li> <li>Read the plan — what is the planner doing and where is it wrong?</li> <li>Check statistics — is the planner even working with accurate data?</li> <li>Check the DDL — is your schema helping or hiding the answer?</li> <li>Check GUCs &amp; version — are the defaults silently working against you?</li> </ul> <figure class="wp-block-image size-large"><a href="https://databaserookies.wordpress.com/wp-content/uploads/2026/03/image-2.png"><img alt="" class="wp-image-3600" height="515" src="https://databaserookies.wordpress.com/wp-content/uploads/2026/03/image-2.png?w=1024" width="1024" /></a><figcaption class="wp-element-caption">5-Dimension SQL Tuning Framework</figcaption></figure> <p class="wp-block-paragraph">Most developers skip straight to question two. 
Many skip to indexes without asking any question at all.</p> <h2 class="wp-block-heading">What I Covered at PGConf India 2026</h2> <p class="wp-block-paragraph">I presented this framework at PGConf India yesterday: a room full of developers and DBAs, sharp questions, and a lot of &#8220;I&#8217;ve hit exactly this&#8221; moments.</p> <p class="wp-block-paragraph">The slides cover core foundations for approaching query tuning and production gotchas including partition pruning, SARGability, CTE fences, and correlated column statistics.</p> <p class="wp-block-paragraph"><a href="https://docs.google.com/presentation/d/1B9aZiZYscOaha37NWSVjeZyF06KNpkaXru5JvSaDctE/edit?usp=sharing" rel="noreferrer noopener" target="_blank">Slide &#8211; PostgreSQL Query Tuning: A Foundation Every Database Developer Should Build</a></p> <figure class="wp-block-image size-large"><a href="https://databaserookies.wordpress.com/wp-content/uploads/2026/03/image-1.png"><img alt="" class="wp-image-3596" height="588" src="https://databaserookies.wordpress.com/wp-content/uploads/2026/03/image-1.png?w=1024" width="1024" /></a></figure>Fri, 13 Mar 2026 01:12:23 +0000https://postgr.es/p/7uPPavel Luzanov: PostgreSQL 19: part 3 or CommitFest 2025-11https://postgr.es/p/7uQ<p>This article reviews the November 2025 CommitFest. <p>For the highlights of the previous two CommitFests, check out our last posts: <a href="https://postgrespro.com/blog/pgsql/5972724">2025-07</a>, <a href="https://postgrespro.com/blog/pgsql/5972743">2025-09</a>. 
<ul> <li>Planner: eager aggregation</li> <li>Converting COUNT(1) and COUNT(not_null_col) to COUNT(*)</li> <li>Parallel TID Range Scan</li> <li>COPY &hellip; TO with partitioned tables</li> <li>New function error_on_null</li> <li>Planner support functions for optimizing set-returning functions (SRF)</li> <li>SQL-standard style functions with temporary objects</li> <li>BRIN indexes: using the read stream interface for vacuuming</li> <li>WAIT FOR: waiting for synchronization between replica and primary</li> <li>Logical replication of sequences</li> <li>pg_stat_replication_slots: a counter for memory limit exceeds during logical decoding</li> <li>pg_buffercache: buffer distribution across OS pages</li> <li>pg_buffercache: marking buffers as dirty</li> <li>Statistics reset time for individual relations and functions</li> <li>Monitoring the volume of full page images written to WAL</li> <li>New parameter log_autoanalyze_min_duration</li> <li>psql: search path in the prompt</li> <li>psql: displaying boolean values</li> <li>pg_rewind: skip copying WAL segments already present on the target server</li> <li>pgbench: continue running after SQL command errors</li> </ul> <p>...Fri, 13 Mar 2026 00:00:00 +0000https://postgr.es/p/7uQVibhor Kumar: Transparent Column Encryption in PostgreSQL: Security Without Changing Your SQLhttps://postgr.es/p/7uO<figure class="wp-block-image size-large"><a href="https://vibhorkumar.wordpress.com/wp-content/uploads/2026/03/gemini_generated_image_f03d4tf03d4tf03d.png"><img alt="" class="wp-image-1360" height="558" src="https://vibhorkumar.wordpress.com/wp-content/uploads/2026/03/gemini_generated_image_f03d4tf03d4tf03d.png?w=1024" width="1024" /></a></figure> <p class="wp-block-paragraph">There is a moment in many database reviews when the room becomes a little too quiet.</p> <p class="wp-block-paragraph">Someone asks:</p> <p class="wp-block-paragraph"><strong>“Which columns in this database are encrypted?”</strong></p> <p 
class="wp-block-paragraph">At first, the answers sound reassuring.</p> <p class="wp-block-paragraph">“We use TLS.”</p> <p class="wp-block-paragraph">“The disks are encrypted.”</p> <p class="wp-block-paragraph">“The application handles sensitive fields.”</p> <p class="wp-block-paragraph">And then the real picture starts to emerge.</p> <p class="wp-block-paragraph">Some values are encrypted in one service but not another.</p> <p class="wp-block-paragraph">Some migrations remembered to apply encryption.</p> <p class="wp-block-paragraph">Some scripts did not.</p> <p class="wp-block-paragraph">Some backups are safe in theory, but no one wants to test that theory the hard way.</p> <p class="wp-block-paragraph">That is the uncomfortable truth of database security:</p> <p class="wp-block-paragraph"><strong>encryption is often present, but not always enforced where the data actually lives.</strong></p> <p class="wp-block-paragraph">That is exactly the problem I wanted to explore with the PostgreSQL extension:</p> <p class="wp-block-paragraph"><strong>column_encrypt</strong>: <a href="https://github.com/vibhorkum/column_encrypt">https://github.com/vibhorkum/column_encrypt</a></p> <p class="wp-block-paragraph">This extension provides <strong>transparent column-level encryption</strong> using custom PostgreSQL datatypes so developers can read and write encrypted columns <strong>without changing their SQL queries</strong>.</p> <p class="wp-block-paragraph">And perhaps the most human part of this project is this:</p> <p class="wp-block-paragraph"><strong>the idea for this project started back in 2016.</strong></p> <p class="wp-block-paragraph">It stayed with me for years as one of those engineering ideas that never quite leaves your mind — the thought that PostgreSQL itself could enforce encryption at the column level.</p> <p class="wp-block-paragraph">Now I’ve finally decided to release it.</p> <p class="wp-block-paragraph">This is the <strong>first public version</strong>. 
It’s a starting point — useful, practical, and hopefully something the PostgreSQL community can explore and build upon.</p> <h2 class="wp-block-heading"><strong>Why This Matters</strong></h2> <p class="wp-block-paragraph">Encryption conversations often focus first on infrastructure.</p> <ul class="wp-block-list"> <li>We encrypt disks.</li> <li>We use TLS connections.</li> <li>We protect credentials.</li> </ul> <p class="wp-block-paragraph">All of these are important.</p> <p class="wp-block-paragraph">But once data is inside the database, a different question matters:</p> <p class="wp-block-paragraph"><strong>What happens if someone gains access to the database itself?</strong></p> <p class="wp-block-paragraph">That access might come from:</p> <ul class="wp-block-list"> <li>a leaked backup</li> <li>an overprivileged account</li> <li>a dump file</li> <li>a compromised service</li> <li>an operational mistake</li> </ul> <p class="wp-block-paragraph">At that point infrastructure encryption has already done its job.</p> <p class="wp-block-paragraph">The real question becomes:</p> <p class="wp-block-paragraph"><strong>Are the most sensitive columns still readable?</strong></p> <p class="wp-block-paragraph">That is where <strong>column-level encryption</strong> becomes critical.</p> <p class="wp-block-paragraph">Not as a compliance checkbox.</p> <p class="wp-block-paragraph">But as <strong>blast-radius reduction</strong>.</p> <p class="wp-block-paragraph">Because security is not only about preventing breaches — it is also about limiting the damage when something goes wrong.</p> <h2 class="wp-block-heading"><strong>The Problem with Application-Level Encryption</strong></h2> <p class="wp-block-paragraph">Many teams implement encryption in the application layer.</p> <p class="wp-block-paragraph">In theory this works.</p> <p class="wp-block-paragraph">In practice it often becomes fragmented over time.</p> <p class="wp-block-paragraph">Different services implement encryption 
differently.</p> <p class="wp-block-paragraph">Migration scripts forget encryption steps.</p> <p class="wp-block-paragraph">ETL jobs bypass application logic.</p> <p class="wp-block-paragraph">Support scripts accidentally insert plaintext values.</p> <p class="wp-block-paragraph">The result is predictable:</p> <ul class="wp-block-list"> <li>inconsistent encryption behavior</li> <li>scattered security logic</li> <li>difficult auditing</li> <li>accidental plaintext storage</li> </ul> <p class="wp-block-paragraph">The biggest limitation is simple:</p> <p class="wp-block-paragraph"><strong>the database itself cannot enforce encryption consistently.</strong></p> <p class="wp-block-paragraph">If someone forgets to encrypt a value, PostgreSQL will store it as plaintext.</p> <p class="wp-block-paragraph">That’s where database-level encryption changes the story.</p> <h2 class="wp-block-heading"><strong>Transparent Column Encryption in PostgreSQL</strong></h2> <p class="wp-block-paragraph">The column_encrypt extension moves encryption directly into PostgreSQL.</p> <p class="wp-block-paragraph">It introduces two encrypted data types:</p> <ul class="wp-block-list"> <li>ENCRYPTED_TEXT</li> <li>ENCRYPTED_BYTEA</li> </ul> <p class="wp-block-paragraph">These types perform encryption and decryption <strong>automatically at the datatype level</strong>.</p> <p class="wp-block-paragraph">That means:</p> <ul class="wp-block-list"> <li>On INSERT or UPDATE the plaintext value is encrypted</li> <li>On SELECT the ciphertext is decrypted</li> <li>SQL queries remain unchanged</li> </ul> <p class="wp-block-paragraph">In other words, developers interact with encrypted columns as if they were normal data.</p> <h2 class="wp-block-heading"><strong>Example: Using Encrypted Columns</strong></h2> <p class="wp-block-paragraph">Create a table with an encrypted column:</p> <p>CREATE TABLE secure_data (<br /> id SERIAL,<br /> ssn ENCRYPTED_TEXT<br /> );</p> <p class="wp-block-paragraph"><br />Insert 
values normally:</p> <p>INSERT INTO secure_data(ssn) VALUES ('888-999-2045');<br /> INSERT INTO secure_data(ssn) VALUES ('888-999-2046');<br /> INSERT INTO secure_data(ssn) VALUES ('888-999-2047');</p> <p class="wp-block-paragraph"><br />Query the data:</p> <p>SELECT * FROM secure_data;</p> <p class="wp-block-paragraph"><br />With the correct key loaded in the session, PostgreSQL returns decrypted values.</p> <p> id | ssn<br /> ----+--------------<br /> 1 | 888-999-2045<br /> 2 | 888-999-2046<br /> 3 | 888-999-2047</p> <p class="wp-block-paragraph"><br />Without the key loaded:</p> <p>ERROR: cannot decrypt data, because key was not set</p> <p class="wp-block-paragraph">This behavior ensures encrypted columns remain protected.</p> <h2 class="wp-block-heading"><strong>Why Database-Level Encryption Can Be Better</strong></h2> <p class="wp-block-paragraph">Moving encryption into PostgreSQL has several advantages.</p> <h3 class="wp-block-heading"><strong>Encryption becomes part of the schema</strong></h3> <p class="wp-block-paragraph">Encrypted columns are visible in the table definition. Security becomes part of database design rather than scattered across application code.</p> <h3 class="wp-block-heading"><strong>SQL remains simple</strong></h3> <p class="wp-block-paragraph">Developers can use normal SQL statements without wrapping every query in encryption functions.</p> <h3 class="wp-block-heading"><strong>Consistent enforcement</strong></h3> <p class="wp-block-paragraph">PostgreSQL itself ensures encrypted storage. 
Developers cannot accidentally bypass encryption.</p> <h3 class="wp-block-heading"><strong>Safer backups and dumps</strong></h3> <p class="wp-block-paragraph">Even if a database dump leaks, sensitive columns remain encrypted.</p> <h3 class="wp-block-heading"><strong>Easier security audits</strong></h3> <p class="wp-block-paragraph">Encrypted data types make it easy to identify which columns contain protected data.</p> <h2 class="wp-block-heading"><strong>How It Works Internally</strong></h2> <p class="wp-block-paragraph">The extension registers two custom PostgreSQL base types backed by bytea.</p> <h3 class="wp-block-heading"><strong>On INSERT / UPDATE</strong></h3> <p class="wp-block-paragraph">The type input functions:</p> <ul class="wp-block-list"> <li>col_enc_text_in</li> <li>col_enc_bytea_in</li> </ul> <p class="wp-block-paragraph">encrypt plaintext values using the active Data Encryption Key (DEK).</p> <h3 class="wp-block-heading"><strong>On SELECT</strong></h3> <p class="wp-block-paragraph">The output functions:</p> <ul class="wp-block-list"> <li>col_enc_text_out</li> <li>col_enc_bytea_out</li> </ul> <p class="wp-block-paragraph">decrypt ciphertext values using the loaded key.</p> <p class="wp-block-paragraph">Encryption and decryption occur <strong>transparently at the type boundary</strong>.</p> <h2 class="wp-block-heading"><strong>Key Management Model</strong></h2> <p class="wp-block-paragraph">The extension uses a <strong>two-tier key architecture</strong>.</p> <h3 class="wp-block-heading"><strong>Data Encryption Key (DEK)</strong></h3> <p class="wp-block-paragraph">Used to encrypt column data.</p> <p class="wp-block-paragraph">Stored encrypted in the database.</p> <h3 class="wp-block-heading"><strong>Key Encryption Key (KEK)</strong></h3> <p class="wp-block-paragraph">A master passphrase used to wrap the DEK.</p> <p class="wp-block-paragraph">The KEK is <strong>never stored inside PostgreSQL</strong>.</p> <p class="wp-block-paragraph">Each session must 
load the key before accessing encrypted data.</p> <h2 class="wp-block-heading"><strong>Security Features Built Into the Extension</strong></h2> <p class="wp-block-paragraph">Several operational safeguards are included.</p> <h3 class="wp-block-heading"><strong>Log masking</strong></h3> <p class="wp-block-paragraph">A PostgreSQL emit_log_hook prevents sensitive key material from appearing in logs or pg_stat_activity.</p> <h3 class="wp-block-heading"><strong>Row-Level Security</strong></h3> <p class="wp-block-paragraph">The internal cipher_key_table uses Row-Level Security to restrict access.</p> <h3 class="wp-block-heading"><strong>Secure memory cleanup</strong></h3> <p class="wp-block-paragraph">Keys stored in session memory are securely wiped when removed.</p> <h3 class="wp-block-heading"><strong>Key version header</strong></h3> <p class="wp-block-paragraph">Each ciphertext contains a key version identifier to support key rotation.</p> <h2 class="wp-block-heading"><strong>Key Rotation Support</strong></h2> <p class="wp-block-paragraph">The extension provides a helper function to re-encrypt existing data with a new key.</p> <p>SELECT cipher_key_reencrypt_data(<br /> 'public',<br /> 'secure_data',<br /> 'ssn'<br /> );</p> <p class="wp-block-paragraph"><br />This allows encrypted data to be rotated to new keys without losing access to existing values.</p> <h2 class="wp-block-heading"><strong>Querying Encrypted Data</strong></h2> <p class="wp-block-paragraph">The extension supports <strong>hash indexes</strong> for equality comparisons.</p> <p class="wp-block-paragraph">Example:</p> <p>CREATE INDEX idx_ssn<br /> ON secure_data USING hash(ssn);</p> <p class="wp-block-paragraph"><br />This allows queries such as:</p> <p>SELECT * FROM secure_data<br /> WHERE ssn = '888-999-2045';</p> <p>┌─────────────────────────────────────────────────────────────────────────────┐<br /> │ QUERY PLAN │<br /> 
├─────────────────────────────────────────────────────────────────────────────┤<br /> │ Index Scan using idx_ssn on secure_data (cost=0.00..12.02 rows=1 width=36) │<br /> │ Index Cond: (ssn = &#8216;888-999-2045&#8217;::encrypted_text) │<br /> └─────────────────────────────────────────────────────────────────────────────┘<br /> (2 rows)</p> <p class="wp-block-paragraph"></p> <h2 class="wp-block-heading"><strong>What Should Be Considered Before Using It</strong></h2> <p class="wp-block-paragraph">Because this is the <strong>first public version</strong> of a project originally started in 2016, it should be approached thoughtfully.</p> <p class="wp-block-paragraph">Before production use:</p> <ul class="wp-block-list"> <li>perform code and security reviews</li> <li>validate it against your PostgreSQL version</li> <li>test backup and failover behavior</li> <li>evaluate connection pooling scenarios</li> <li>carefully manage session keys</li> <li>practice key rotation procedures</li> </ul> <p class="wp-block-paragraph">Encryption components deserve extra scrutiny.</p> <h2 class="wp-block-heading"><strong>When Column Encryption Is Most Useful</strong></h2> <p class="wp-block-paragraph">This approach works well for sensitive identifiers such as:</p> <ul class="wp-block-list"> <li>social security numbers</li> <li>financial account numbers</li> <li>API tokens</li> <li>healthcare identifiers</li> <li>personal identification data</li> </ul> <p class="wp-block-paragraph">These fields typically require strong protection but are not heavily indexed.</p> <h2 class="wp-block-heading"><strong>When It May Not Be the Right Fit</strong></h2> <p class="wp-block-paragraph">Be cautious when encrypting columns that:</p> <ul class="wp-block-list"> <li>are frequently used in range queries</li> <li>participate in large joins</li> <li>require advanced indexing strategies</li> </ul> <p class="wp-block-paragraph">Encryption works best when applied selectively to high-value data.</p> <h2 
class="wp-block-heading"><strong>PostgreSQL’s Extensibility Makes This Possible</strong></h2> <p class="wp-block-paragraph">PostgreSQL’s extension architecture allows developers to extend the database engine itself.</p> <p class="wp-block-paragraph">Well-known examples include:</p> <ul class="wp-block-list"> <li>PostGIS</li> <li>pgvector</li> <li>TimescaleDB</li> <li>pgcrypto</li> </ul> <p class="wp-block-paragraph">column_encrypt explores how PostgreSQL’s extensibility can also strengthen <strong>database security</strong>.</p> <h2 class="wp-block-heading"><strong>Try It Yourself</strong></h2> <p class="wp-block-paragraph">Clone and install the extension:</p> <p>git clone <a href="https://github.com/vibhorkum/column_encrypt.git" rel="nofollow">https://github.com/vibhorkum/column_encrypt.git</a><br /> cd column_encrypt<br /> make<br /> make install</p> <p class="wp-block-paragraph">Add to postgresql.conf:</p> <p>shared_preload_libraries = '$libdir/column_encrypt'</p> <p class="wp-block-paragraph">Restart PostgreSQL and create the extension:</p> <p>CREATE EXTENSION column_encrypt;</p> <p class="wp-block-paragraph">You can then start defining encrypted columns using ENCRYPTED_TEXT or ENCRYPTED_BYTEA.</p> <h2 class="wp-block-heading"><strong>Feedback Welcome</strong></h2> <p class="wp-block-paragraph">This project is being released as a <strong>first version</strong>, and I would genuinely value feedback from the PostgreSQL community.</p> <p class="wp-block-paragraph">If you try it, review it, or have ideas for improving it, please share your feedback on the GitHub repository:</p> <p class="wp-block-paragraph"><a href="https://github.com/vibhorkum/column_encrypt">https://github.com/vibhorkum/column_encrypt</a></p> <p class="wp-block-paragraph">That could include:</p> <ul class="wp-block-list"> <li>design suggestions</li> <li>security observations</li> <li>performance considerations</li> <li>compatibility findings</li> <li>operational lessons</li> <li>ideas 
for future enhancements</li> </ul> <p class="wp-block-paragraph">Good open source gets better through review, discussion, and real-world use. This project should be no different.</p> <h2 class="wp-block-heading"><strong>Final Thoughts</strong></h2> <p class="wp-block-paragraph">Some ideas take time.</p> <p class="wp-block-paragraph">The concept behind column_encrypt began in <strong>2016</strong>, and this release represents its <strong>first public version</strong>.</p> <p class="wp-block-paragraph">It is not the final word on column-level encryption in PostgreSQL, but it explores a design direction I still believe is important: moving encryption closer to where the data actually lives.</p> <p class="wp-block-paragraph">Security often fails when it depends on every developer, every script, and every service remembering to do the right thing.</p> <p class="wp-block-paragraph">Systems become safer when guardrails are built into the architecture.</p> <p class="wp-block-paragraph">Transparent column encryption is one way to move toward that goal.</p> <p class="wp-block-paragraph">And if this project sparks feedback, discussion, or improvements from the community, that would be a very good next chapter.</p> <h2 class="wp-block-heading"><strong>Repository</strong></h2> <p class="wp-block-paragraph"><a href="https://github.com/vibhorkum/column_encrypt">https://github.com/vibhorkum/column_encrypt</a></p> <p class="wp-block-paragraph"></p>Thu, 12 Mar 2026 15:19:49 +0000https://postgr.es/p/7uORichard Yen: Debugging RDS Proxy Pinning: How a Hidden JIT Toggle Created Thousands of Pinned Connectionshttps://postgr.es/p/7uN<h1 id="introduction">Introduction</h1> <p>When using AWS RDS Proxy, the goal is to achieve connection multiplexing – many client connections share a much smaller pool of backend PostgreSQL connections, giving more resources per connection and keeping query execution running smoothly.</p> <p>However, if the proxy detects that a session has changed internal 
state in a way it cannot safely track, it <strong>pins</strong> the client connection to a specific backend connection. Once pinned, that connection cannot be multiplexed again for the rest of the session. This was the case with a recent database I worked on.</p> <p>In this case, we observed the following:</p> <ul> <li>extremely high CPU usage</li> <li>relatively high LWLock wait times</li> <li>OOM killer activity on the database, maybe once every day or two</li> <li>thousands of active connections</li> </ul> <p>What was strange about it all was that the queries involved were relatively simple, with at most one join.</p> <hr /> <h1 id="finding-the-pinning-source">Finding the Pinning Source</h1> <p>To get to the root cause, one option was to look in <code class="language-plaintext highlighter-rouge">pg_stat_statements</code>. However, that approach had two problems:</p> <ol> <li>Getting a clean snapshot of the statistics while thousands of queries were being actively processed would be tricky.</li> <li><code class="language-plaintext highlighter-rouge">pg_stat_statements</code> normalizes queries and does not expose the values passed to parameter placeholders.</li> </ol> <p>Instead, to see the actual parameters, we briefly enabled <code class="language-plaintext highlighter-rouge">log_statement = 'all'</code>. This immediately surfaced something interesting in the logs, which could be downloaded and reviewed at my own pace.</p> <p>What we saw were statements like <code class="language-plaintext highlighter-rouge">SELECT set_config($2,$1,$3)</code> with parameters related to JIT configuration – that was the first real clue.</p> <hr /> <h1 id="getting-to-the-bottom">Getting to the Bottom</h1> <p>After tracing the behavior through the stack, the root cause turned out to be surprisingly indirect. 
The application created new connections through SQLAlchemy’s asyncpg dialect, and we needed to drill down into that driver’s behavior.</p> <hr /> <h3 id="step-1--reviewing-how-sqlalchemy-registers-json-codecs">Step 1 – Reviewing how SQLAlchemy registers JSON codecs</h3> <p>During connection initialization, SQLAlchemy runs an <code class="language-plaintext highlighter-rouge">on_connect</code> hook:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">connect</span><span class="p">(</span><span class="n">conn</span><span class="p">):</span>
    <span class="n">conn</span><span class="p">.</span><span class="n">await_</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">setup_asyncpg_json_codec</span><span class="p">(</span><span class="n">conn</span><span class="p">))</span>
    <span class="n">conn</span><span class="p">.</span><span class="n">await_</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">setup_asyncpg_jsonb_codec</span><span class="p">(</span><span class="n">conn</span><span class="p">))</span>
</code></pre></div></div> <p>This registers optimized JSON and JSONB codecs.</p> <hr /> <h3 id="step-2--observing-how-asyncpg-introspects-type-metadata">Step 2 – Observing how asyncpg introspects type metadata</h3> <p>Registering those codecs requires looking up type OIDs in <code class="language-plaintext highlighter-rouge">pg_catalog</code>.</p> <p>That triggers asyncpg’s internal function: <code class="language-plaintext highlighter-rouge">_introspect_types()</code></p> <hr /> <h3 id="step-3--catching-asyncpg-temporarily-disabling-jit">Step 3 – Catching asyncpg temporarily disabling JIT</h3> <p>Inside <code class="language-plaintext highlighter-rouge">_introspect_types()</code> there is this block:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">_introspect_types</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">typeoids</span><span class="p">,</span> <span class="n">timeout</span><span class="p">):</span>
    <span class="k">if</span> <span class="bp">self</span><span class="p">.</span><span class="n">_server_caps</span><span class="p">.</span><span class="n">jit</span><span class="p">:</span>
        <span class="n">cfgrow</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="k">await</span> <span class="bp">self</span><span class="p">.</span><span class="n">__execute</span><span class="p">(</span>
            <span class="s">"""SELECT current_setting('jit') AS cur, set_config('jit', 'off', false) AS new"""</span><span class="p">,</span>
        <span class="p">)</span>
</code></pre></div></div> <p>The intent is harmless: asyncpg temporarily disables JIT, runs the introspection query, and then restores the setting, which avoids rare edge cases with complex type queries. For direct PostgreSQL connections, this is perfectly fine.</p> <p>Unfortunately, <code class="language-plaintext highlighter-rouge">set_config()</code> changes session state. RDS Proxy cannot safely track this change, so it pins the client connection to a backend session. Once pinned, that connection cannot be multiplexed again for the duration of the session.</p> <p>In short, since every connection initialization triggers the JIT toggle, every RDS Proxy connection gets pinned to a database connection, effectively defeating RDS Proxy’s purpose of connection multiplexing. 
With thousands of live connections doing relatively little, Postmaster develops a lot of LWLock overhead, memory buffers don’t get flushed, and the OOM killer can be invoked when the conditions are right.</p> <hr /> <h1 id="the-fix">The Fix</h1> <p>The key observation is that asyncpg only runs the JIT toggle if it believes the server supports JIT.</p> <p>That capability is stored in an internal structure <code class="language-plaintext highlighter-rouge">_server_caps</code>. If <code class="language-plaintext highlighter-rouge">jit</code> is set to <code class="language-plaintext highlighter-rouge">False</code>, asyncpg skips the entire block.</p> <p>So we added a SQLAlchemy connection hook:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">@</span><span class="n">event</span><span class="p">.</span><span class="n">listens_for</span><span class="p">(</span><span class="n">engine</span><span class="p">.</span><span class="n">sync_engine</span><span class="p">,</span> <span class="s">"connect"</span><span class="p">,</span> <span class="n">insert</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">_prevent_rds_proxy_session_pinning</span><span class="p">(</span><span class="n">dbapi_connection</span><span class="p">,</span> <span class="n">connection_record</span><span class="p">):</span>
    <span class="n">raw_conn</span> <span class="o">=</span> <span class="n">dbapi_connection</span><span class="p">.</span><span class="n">_connection</span>
    <span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">raw_conn</span><span class="p">,</span> <span class="s">"_server_caps"</span><span class="p">)</span> <span class="ow">and</span> <span class="n">raw_conn</span><span class="p">.</span><span class="n">_server_caps</span><span class="p">.</span><span class="n">jit</span><span class="p">:</span>
        <span 
class="n">raw_conn</span><span class="p">.</span><span class="n">_server_caps</span> <span class="o">=</span> <span class="n">raw_conn</span><span class="p">.</span><span class="n">_server_caps</span><span class="p">.</span><span class="n">_replace</span><span class="p">(</span><span class="n">jit</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span> </code></pre></div></div> <p>This configuration does the following:</p> <ol> <li>Registers a connection hook so that it runs every time a new connection is created.</li> <li>Passes <code class="language-plaintext highlighter-rouge">insert=True</code> so that our handler runs <strong>before</strong> SQLAlchemy’s <code class="language-plaintext highlighter-rouge">on_connect</code> logic. That is important because the JSON codec registration is what triggers the introspection.</li> <li>Disables the JIT capability flag. By using <code class="language-plaintext highlighter-rouge">_server_caps._replace(jit=False)</code>, we tell asyncpg to skip the <code class="language-plaintext highlighter-rouge">set_config()</code> block entirely.</li> </ol> <hr /> <h1 id="the-result">The Result</h1> <p>After deploying the asyncpg fix, we saw the number of pinned sessions drop precipitously:</p> <p><img alt="RDS Proxy Pinning Graph" src="https://raw.githubusercontent.com/richyen/richyen.github.io/refs/heads/gh-pages/img/rds_proxy_pinning.png" /></p> <p>Of course, we were still seeing many pinned sessions, which we continued to deal with through other fixes, but this first step produced an improvement of over 50%.</p> <hr /> <h1 id="other-fix-attempts-that-didnt-work">Other Fix Attempts That Didn’t Work</h1> <p>Before landing on this fix, we attempted a few other approaches.</p> <p>First, we attempted to disable JIT via connection parameters by setting <code class="language-plaintext highlighter-rouge">server_settings={"jit": "off"}</code>. 
This fails because RDS Proxy rejects it with a message like:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FeatureNotSupportedError: RDS Proxy currently doesn't support the option jit </code></pre></div></div> <p>We also tried disabling prepared statement caching with <code class="language-plaintext highlighter-rouge">prepared_statement_cache_size=0</code> in the configuration. This didn’t work because it prevents named prepared statement pinning, but it does not prevent <code class="language-plaintext highlighter-rouge">set_config()</code> pinning.</p> <p>The only fix that worked was to add the pin-prevention hook as described above.</p> <hr /> <h1 id="lessons-learned">Lessons Learned</h1> <p>A few takeaways from this debugging experience:</p> <ol> <li>RDS Proxy pinning can come from unexpected places. Even small session-level changes can disable multiplexing.</li> <li><code class="language-plaintext highlighter-rouge">pg_stat_statements</code> hides parameter values. It’s great for query patterns, but it does not expose bound parameters, which can hide critical clues. 
Sometimes the fastest diagnostic tool is temporarily enabling <code class="language-plaintext highlighter-rouge">log_statement = 'all'</code>; in our case it quickly exposed the parameters in the <code class="language-plaintext highlighter-rouge">set_config()</code> call.</li> <li>SQLAlchemy and asyncpg do have some quirks that need to be addressed when using them with RDS Proxy.</li> </ol> <hr /> <h1 id="final-thoughts">Final Thoughts</h1> <p>The entire chain looked like this:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SQLAlchemy connection → asyncpg codec registration → asyncpg type introspection → temporary JIT disable via set_config() → RDS Proxy detects session state change → connection gets pinned </code></pre></div></div> <p>A single hidden configuration toggle resulted in <strong>thousands of pinned sessions</strong>.</p> <p>Once identified, the fix was only a few lines of code.</p> <p>But getting there required following the entire stack – from SQLAlchemy to asyncpg to PostgreSQL to RDS Proxy.</p> <p>Hopefully this saves someone else a few hours (or days) of debugging.</p>Thu, 12 Mar 2026 08:00:00 +0000https://postgr.es/p/7uNgabrielle roth: SCaLE23xhttps://postgr.es/p/7uLI&#8217;m back from Pasadena after SCaLE23x and another installment of PostgreSQL@SCaLE! It was really just wonderful this year, seeing old friends and making new ones, talking to people and soaking up knowledge. I&#8217;m looking forward to implementing what I learned. Expo Hall: We had a lot of booth volunteers this year. 
Thank you all so much; [&#8230;]Thu, 12 Mar 2026 00:38:49 +0000https://postgr.es/p/7uLBruce Momjian: The MySQL Shadowhttps://postgr.es/p/7uJ<p>For much of Postgres's <a class="txt2html" href="https://www.postgresql.org/docs/current/history.html" style="text-decoration: underline dotted;">history,</a> it has lived in the shadow of other relational systems, and for a time even in <a class="txt2html" href="https://momjian.us/main/blogs/pgblog/2013.html#March_5_2013" style="text-decoration: underline dotted;">the shadow</a> of <a class="txt2html" href="https://momjian.us/main/writings/pgsql/central.pdf#page=21" style="text-decoration: underline dotted;">NoSQL</a> systems. Those shadows have faded, but it is helpful to reflect on this outcome. </p> <p>On the proprietary side, <a class="txt2html" href="https://momjian.us/main/writings/pgsql/forever.pdf#page=14" style="text-decoration: underline dotted;">most database products</a> are now in maintenance mode. The only database to be consistently compared to Postgres was Oracle. Long-term, Oracle was never going to be able to compete against an open source development team, just like Sun's Solaris wasn't able to <a class="txt2html" href="https://arstechnica.com/information-technology/2009/04/oracle-acquires-sun-ars-explores-the-impact-on-open-source/" style="text-decoration: underline dotted;">compete</a> against open source Linux. Few people would choose Oracle's database today, so it is effectively in legacy mode. The Oracle shadow is clearly fading. In fact, almost all enterprise infrastructure software is open source today. </p> <p>The MySQL shadow is more complex. MySQL is not proprietary, since it is distributed as open source, so it had the potential to ride the open source wave into the enterprise, and it clearly did from the mid-1990s to the mid-2000s. 
However, something changed, and MySQL has been in steady <a class="txt2html" href="https://analyticsindiamag.com/ai-trends/the-end-of-mysql-as-we-knew-it" style="text-decoration: underline dotted;">decline</a> for decades. Looking back, people want to ascribe a reason for the decline: </p> <ul> <li>Sun <a class="txt2html" href="https://www.techpowerup.com/49858/sun-acquires-mysql-developer-of-the-worlds-most-popular-open-source-database" style="text-decoration: underline dotted;">buying</a> MySQL AB </li><li>Oracle <a class="txt2html" href="https://en.wikipedia.org/wiki/Acquisition_of_Sun_Microsystems_by_Oracle_Corporation" style="text-decoration: underline dotted;">buying</a> Sun </li><li>Poor <a class="txt2html" href="https://www.theregister.com/2025/09/11/oracle_slammed_for_mysql_job/" style="text-decoration: underline dotted;">stewardship</a> of MySQL by Oracle, including recent layoffs </li></ul> <p><a href="https://momjian.us/main/blogs/pgblog/2026.html#March_11_2026">Continue Reading &raquo;</a></p>Wed, 11 Mar 2026 14:15:02 +0000https://postgr.es/p/7uJVibhor Kumar: Beyond Features: What a PostgreSQL Strategy Discussion Taught Me About Calm, Modern Platformshttps://postgr.es/p/7uI<figure class="wp-block-image size-large"><a href="https://vibhorkumar.wordpress.com/wp-content/uploads/2026/03/gemini_generated_image_bu5x4pbu5x4pbu5x.png"><img alt="" class="wp-image-1348" height="571" src="https://vibhorkumar.wordpress.com/wp-content/uploads/2026/03/gemini_generated_image_bu5x4pbu5x4pbu5x.png?w=1024" width="1024" /></a></figure> <p class="wp-block-paragraph">Last December, I was part of a long enterprise discussion centered on PostgreSQL.</p> <p class="wp-block-paragraph">On paper, it looked familiar: a new major release, high availability and scale, Aurora migration, monitoring, operational tooling, and the growing conversation around AI-assisted operations.</p> <p class="wp-block-paragraph">The usual ingredients were all there.</p> <p 
class="wp-block-paragraph">But somewhere in the middle of that day, the tone of the room changed.</p> <p class="wp-block-paragraph">It did not change when we talked about new PostgreSQL capabilities. It changed when the conversation moved to upgrades, patching, monitoring quality, and operational control.</p> <p class="wp-block-paragraph">That was the moment I realized this was not really a feature discussion.</p> <p class="wp-block-paragraph">It was a trust discussion.</p> <p class="wp-block-paragraph">Not trust in PostgreSQL as a database. That question is mostly behind us.</p> <p class="wp-block-paragraph">It was trust in something more practical: can this platform evolve without exhausting the team responsible for it? Can it scale without becoming harder to reason about? Can it be upgraded without becoming a quarterly trauma ritual? Can it be monitored without operators drowning in false signals? Can it support modernization without making every change feel dangerous?</p> <p class="wp-block-paragraph">That, to me, is where the PostgreSQL conversation has matured.</p> <p class="wp-block-paragraph">A modern PostgreSQL platform is not defined only by what it can do. It is defined by how calmly it can change.</p> <h2 class="wp-block-heading"><strong>Why this matters now</strong></h2> <p class="wp-block-paragraph">This matters because PostgreSQL is no longer entering the enterprise through side doors. In many organizations, it is already trusted with serious workloads and is increasingly central to modernization plans.</p> <p class="wp-block-paragraph">That changes the questions.</p> <p class="wp-block-paragraph">A few years ago, teams often asked whether PostgreSQL was ready for enterprise use. 
Today, the better question is whether the <strong>operating model around PostgreSQL</strong> is ready for enterprise reality.</p> <p class="wp-block-paragraph">Because the database can be strong while the surrounding practice is weak.</p> <p class="wp-block-paragraph">That is where many teams struggle. They like PostgreSQL, but lag on upgrades. They have HA designs, but unclear failure playbooks. They have monitoring, but poor signal quality. They use managed PostgreSQL services, but feel boxed in over time. They want automation and intelligent assistance, but are not always clear what uncertainty they are trying to remove.</p> <p class="wp-block-paragraph">The technology is often ready. The operating discipline around it is where the real work lives.</p> <h2 class="wp-block-heading"><strong>What we were really discussing</strong></h2> <p class="wp-block-paragraph">Even though the conversation touched several areas, the underlying questions were surprisingly consistent.</p> <p class="wp-block-paragraph">We were really discussing:</p> <ol class="wp-block-list" start="1"> <li>How to stay current on PostgreSQL releases without upgrade drama</li> <li>How to think about availability and scale without operational chaos</li> <li>How to evaluate Aurora-to-PostgreSQL migration without turning it into ideology</li> <li>How to improve monitoring quality so operators can act faster and guess less</li> <li>How to use guided operations only where they reduce real uncertainty</li> </ol> <p class="wp-block-paragraph">That is a more useful frame than just listing technologies.</p> <p class="wp-block-paragraph">Most platform pain does not come from lack of features. It comes from lack of confidence in change.</p> <h2 class="wp-block-heading"><strong>PostgreSQL 18: a release only creates value if it is reachable</strong></h2> <p class="wp-block-paragraph">One obvious part of the discussion was PostgreSQL 18.</p> <p class="wp-block-paragraph">A new major release always matters. 
It brings new capabilities, new opportunities, and another step forward in a long tradition of serious engineering.</p> <p class="wp-block-paragraph">But in real environments, the most important question is not, “What’s new?”</p> <p class="wp-block-paragraph">It is this: can we adopt it without making life harder for the team?</p> <p class="wp-block-paragraph">That is where enterprise PostgreSQL gets real.</p> <p class="wp-block-paragraph">A release only creates value if it is reachable. Not downloadable. Not demo-ready. Operationally reachable.</p> <p class="wp-block-paragraph">That means teams can:</p> <ol class="wp-block-list" start="1"> <li>Validate it with confidence</li> <li>Test application behavior early</li> <li>Align rollout with workload patterns</li> <li>Standardize the upgrade motion</li> <li>Avoid turning major version change into a hero project</li> </ol> <p class="wp-block-paragraph">Too many environments still postpone upgrades until they become politically and technically painful. The version gap grows. Dependencies pile up. Risk accumulates quietly. Then one day the “future upgrade” becomes a transformation project nobody wants to sponsor.</p> <p class="wp-block-paragraph">The better pattern is less dramatic: stay reasonably current, validate earlier, and make major upgrades boring.</p> <p class="wp-block-paragraph">That word is worth defending.</p> <p class="wp-block-paragraph">In PostgreSQL operations, boring is beautiful. A quiet upgrade is not a lack of ambition. 
It is a sign that the team has built enough discipline to let progress happen without ceremony and fear.</p> <h3 class="wp-block-heading"><strong>A practical PostgreSQL upgrade test</strong></h3> <p class="wp-block-paragraph">If a team says it is “ready” for a major PostgreSQL upgrade, I think it should be able to answer these questions clearly:</p> <ol class="wp-block-list" start="1"> <li>Do we know which applications, extensions, and drivers need compatibility testing?</li> <li>Do we have a rehearsal path in a lower environment that resembles production closely enough to matter?</li> <li>Do we know our acceptable rollback posture if the cutover does not behave as expected?</li> <li>Have we aligned the upgrade window to real workload behavior instead of habit?</li> <li>Can we explain the upgrade sequence in plain language without relying on one hero engineer?</li> </ol> <p class="wp-block-paragraph">If those answers are fuzzy, the problem is usually not PostgreSQL. It is upgrade discipline.</p> <h2 class="wp-block-heading"><strong>Availability and scale: PostgreSQL gets harder when the business stops tolerating pauses</strong></h2> <p class="wp-block-paragraph">High availability is one of those areas where teams can sound mature long before they actually are.</p> <p class="wp-block-paragraph">The vocabulary is easy: resilience, failover, RPO, RTO, locality, five nines.</p> <p class="wp-block-paragraph">The hard part is building a PostgreSQL platform that behaves predictably when real conditions get messy.</p> <p class="wp-block-paragraph">That is where the distributed side of the discussion became important. 
Not because “distributed” sounds impressive, but because some environments genuinely need:</p> <ol class="wp-block-list" start="1"> <li>Broader availability expectations</li> <li>Regional workload handling</li> <li>Scale without conflict</li> <li>Architecture that remains understandable during failure</li> </ol> <p class="wp-block-paragraph">One capability that stood out in the conversation was regional workload transfer to the closest database while ensuring no conflicts occur.</p> <p class="wp-block-paragraph">That is not just a technical feature description. It points to something bigger: keeping the platform coherent while demand shifts and geography matters.</p> <p class="wp-block-paragraph">And coherence matters more than many teams admit.</p> <p class="wp-block-paragraph">Adding nodes is easy to put on a slide. Adding trust is much harder.</p> <p class="wp-block-paragraph">A PostgreSQL platform only becomes more resilient if the people operating it understand:</p> <ol class="wp-block-list" start="1"> <li>What kinds of failures it is designed to absorb</li> <li>What behavior is automatic versus manual</li> <li>What consistency guarantees matter for the workload</li> <li>How upgrades and patching will happen without destabilizing the design</li> </ol> <p class="wp-block-paragraph">Without that clarity, scale becomes ambiguity. And ambiguity is expensive.</p> <p class="wp-block-paragraph">Five nines is not something you announce. 
It is something you earn through engineering discipline and operational clarity.</p> <h3 class="wp-block-heading"><strong>What good HA/DR looks like in PostgreSQL</strong></h3> <p class="wp-block-paragraph">A high-availability PostgreSQL design is not “good” just because replication is configured.</p> <p class="wp-block-paragraph">A mature design should let the team answer these questions fast:</p> <ol class="wp-block-list" start="1"> <li>If the primary fails, what happens next?</li> <li>Is failover automatic, manual, or operator-assisted?</li> <li>Who is responsible for deciding whether the system should fail over?</li> <li>What happens to client routing and reconnection?</li> <li>How do we prevent split-brain or conflicting writes in more advanced topologies?</li> <li>How do we patch and upgrade the environment without violating the resilience design?</li> </ol> <p class="wp-block-paragraph">If those answers are not operationally clear, the architecture may look strong but still behave weakly under pressure.</p> <h2 class="wp-block-heading"><strong>Aurora migration: the real question is not “managed or unmanaged?”</strong></h2> <p class="wp-block-paragraph">Another meaningful part of the day was around Aurora and the path beyond it.</p> <p class="wp-block-paragraph">Aurora solves real problems. It gives many teams speed, convenience, and a managed experience that can be a good fit, especially early on.</p> <p class="wp-block-paragraph">But enterprises have a way of growing into harder questions.</p> <p class="wp-block-paragraph">Eventually the conversation stops being “Is it managed?” and becomes “Is it manageable on our terms?”</p> <p class="wp-block-paragraph">That is a different question altogether.</p> <p class="wp-block-paragraph">Because the trade-off is not simply convenience versus complexity. 
It is often convenience versus control.</p> <p class="wp-block-paragraph">Control over:</p> <ol class="wp-block-list" start="1"> <li>Upgrade timing</li> <li>Patching choices</li> <li>Extension strategy</li> <li>Operational consistency across environments</li> <li>Visibility into how the system behaves</li> <li>Where responsibility truly lives when something goes wrong</li> </ol> <p class="wp-block-paragraph">These edges do not always show up on day one. They tend to appear later, when the estate gets bigger, the stakes get higher, and standardization starts to matter more than convenience.</p> <p class="wp-block-paragraph">The mistake here is to turn the conversation ideological.</p> <p class="wp-block-paragraph">This is not “managed is bad” or “self-managed is pure.” That is not serious thinking.</p> <p class="wp-block-paragraph">The serious question is this: how do we regain control without reintroducing chaos?</p> <p class="wp-block-paragraph">That is why good Aurora-to-PostgreSQL thinking starts with the operating model, not the migration utility.</p> <p class="wp-block-paragraph">Before teams move, they should be clear on:</p> <ol class="wp-block-list" start="1"> <li>Upgrade and patch discipline</li> <li>Backup and recovery assumptions</li> <li>Workload-aware change windows</li> <li>Extension needs</li> <li>Monitoring expectations</li> <li>Blast-radius containment during transition</li> </ol> <p class="wp-block-paragraph">Good migration planning is not just about moving bytes. 
It is about rebuilding confidence.</p> <h3 class="wp-block-heading"><strong>A practical Aurora migration checkpoint</strong></h3> <p class="wp-block-paragraph">Before moving off Aurora into a broader PostgreSQL operating model, teams should test themselves with a few honest questions:</p> <ol class="wp-block-list" start="1"> <li>What problem are we actually trying to solve: cost, control, extensions, standardization, or something else?</li> <li>What operating burden will we newly own after the move?</li> <li>Do we already have the monitoring, backup, patching, and upgrade discipline to own that burden well?</li> <li>Are we migrating to a clearer operating model, or just migrating away from frustration?</li> <li>Can we stage the move in a way that limits blast radius and preserves confidence?</li> </ol> <p class="wp-block-paragraph">A surprising number of migrations get weaker not because PostgreSQL is hard, but because the target operating model was never designed properly.</p> <h2 class="wp-block-heading"><strong>Monitoring: the hidden tax most teams normalize</strong></h2> <p class="wp-block-paragraph">If there was one part of the conversation that felt universally familiar, it was monitoring and operational noise.</p> <p class="wp-block-paragraph">Most operational pain does not arrive as one dramatic outage. It arrives as accumulation.</p> <p class="wp-block-paragraph">Too many alerts. Too many dashboards. Too many things that look urgent but are not. Too many moments where capable people have to guess which signal matters.</p> <p class="wp-block-paragraph">That creates a tax that is easy to underestimate.</p> <p class="wp-block-paragraph">Alert fatigue becomes decision fatigue. Decision fatigue becomes hesitation. Hesitation becomes fear of change. 
And fear of change is where modernization quietly slows down.</p> <p class="wp-block-paragraph">This is why observability in PostgreSQL environments should be judged by a stricter standard than “we collect enough metrics.”</p> <p class="wp-block-paragraph">The real question is whether the system helps operators decide what to do next.</p> <p class="wp-block-paragraph">That means:</p> <ol class="wp-block-list" start="1"> <li>Reducing false positives</li> <li>Improving actionability</li> <li>Surfacing likely next actions</li> <li>Making the operational context clearer, not noisier</li> </ol> <p class="wp-block-paragraph">It also means treating workload-aware time blocks as a serious operational tool.</p> <p class="wp-block-paragraph">Good teams do not only ask, “When is the maintenance window?” They also ask, “When is this workload least sensitive to change?” and “When will the platform absorb this patch, tuning adjustment, or upgrade most safely?”</p> <p class="wp-block-paragraph">That shift sounds small. It is not.</p> <p class="wp-block-paragraph">It is one of the clearest signs that a PostgreSQL team has moved from reactive maintenance to intentional operations.</p> <h3 class="wp-block-heading"><strong>A simple observability test for PostgreSQL teams</strong></h3> <p class="wp-block-paragraph">I like to ask teams four blunt questions:</p> <ol class="wp-block-list" start="1"> <li>How many alerts fired last week?</li> <li>How many required human action?</li> <li>How many were false positives or low-value noise?</li> <li>For the alerts that mattered, was the next action obvious?</li> </ol> <p class="wp-block-paragraph">If the team cannot answer those questions, observability may be collecting data without improving decisions.</p> <p class="wp-block-paragraph">That is not observability maturity. 
That is metric accumulation.</p> <h2 class="wp-block-heading"><strong>Guided operations and AI: useful only if they reduce ambiguity</strong></h2> <p class="wp-block-paragraph">The conversation also touched on AI-assisted operations.</p> <p class="wp-block-paragraph">This is an area where it is very easy to sound futuristic and very hard to be useful.</p> <p class="wp-block-paragraph">So I prefer a simple standard.</p> <p class="wp-block-paragraph">AI has value in PostgreSQL operations only if it reduces ambiguity.</p> <p class="wp-block-paragraph">That means helping with things like:</p> <ol class="wp-block-list" start="1"> <li>Identifying which signals matter</li> <li>Reducing noise</li> <li>Suggesting meaningful next actions</li> <li>Improving patching and upgrade readiness</li> <li>Offering explainable optimization guidance</li> <li>Aligning recommendations with workload behavior</li> </ol> <p class="wp-block-paragraph">If it does those things, it helps.</p> <p class="wp-block-paragraph">If it simply adds another glossy layer of “intelligence” without reducing operator burden, then it is not solving the real problem.</p> <p class="wp-block-paragraph">PostgreSQL teams do not need more spectacle. They need fewer uncertain moments.</p> <p class="wp-block-paragraph">That is where guided operations can actually matter: not by replacing human judgment, but by helping good teams use that judgment more effectively.</p> <h2 class="wp-block-heading"><strong>What teams often get wrong</strong></h2> <p class="wp-block-paragraph">Many platform problems are not mysterious. They are patterns.</p> <h3 class="wp-block-heading"><strong>1. Treating upgrades as rare events</strong></h3> <p class="wp-block-paragraph">When teams delay major PostgreSQL upgrades too long, they are not preserving stability. They are usually accumulating future pain.</p> <h3 class="wp-block-heading"><strong>2. 
Confusing HA design with resilience</strong></h3> <p class="wp-block-paragraph">Replication and topology alone do not create trust. Resilience requires clear failure behavior and clear operator understanding.</p> <h3 class="wp-block-heading"><strong>3. Measuring observability by volume</strong></h3> <p class="wp-block-paragraph">More dashboards and more alerts are not signs of maturity. Actionability is.</p> <h3 class="wp-block-heading"><strong>4. Migrating without redesigning the operating model</strong></h3> <p class="wp-block-paragraph">Moving from Aurora or any managed environment without rethinking patching, monitoring, and change discipline is a recipe for disappointment.</p> <h3 class="wp-block-heading"><strong>5. Talking about AI without defining the ambiguity it removes</strong></h3> <p class="wp-block-paragraph">If intelligent assistance does not reduce guesswork, it is decoration.</p> <h2 class="wp-block-heading"><strong>What good looks like in a mature PostgreSQL platform</strong></h2> <p class="wp-block-paragraph">A mature PostgreSQL platform is not one that never has incidents. That is fantasy.</p> <p class="wp-block-paragraph">It is one where the team has built enough clarity and discipline that change does not feel theatrical.</p> <p class="wp-block-paragraph">In practice, that means:</p> <ol class="wp-block-list" start="1"> <li>Major upgrades are planned early and executed routinely</li> <li>Patching follows a repeatable cadence</li> <li>HA/DR design is explainable under pressure</li> <li>Monitoring produces clear actions instead of constant noise</li> <li>Maintenance windows reflect workload behavior, not just habit</li> <li>Automation supports judgment instead of masking weak process</li> <li>Intelligent assistance is used where it reduces real uncertainty</li> </ol> <p class="wp-block-paragraph">That is what calm looks like in a serious PostgreSQL estate.</p> <p class="wp-block-paragraph">Not perfection. Not magic. 
Just fewer surprises and better decisions.</p> <h2 class="wp-block-heading"><strong>The five pillars of calm PostgreSQL operations</strong></h2> <p class="wp-block-paragraph">If I had to reduce this whole experience into a practical framework, it would be this:</p> <h3 class="wp-block-heading"><strong>1. Currency</strong></h3> <p class="wp-block-paragraph">Stay reasonably current on major PostgreSQL releases so upgrades remain manageable.</p> <h3 class="wp-block-heading"><strong>2. Clarity</strong></h3> <p class="wp-block-paragraph">Make HA, failover, ownership, and change behavior understandable before they are tested under pressure.</p> <h3 class="wp-block-heading"><strong>3. Control</strong></h3> <p class="wp-block-paragraph">Know when managed convenience is still helping and when it has become an operational constraint.</p> <h3 class="wp-block-heading"><strong>4. Signal</strong></h3> <p class="wp-block-paragraph">Reduce false alerts and improve actionability so operators can focus on what matters.</p> <h3 class="wp-block-heading"><strong>5. Change discipline</strong></h3> <p class="wp-block-paragraph">Patch, tune, and upgrade in workload-aware windows with repeatable playbooks.</p> <p class="wp-block-paragraph">That is not flashy. It is effective.</p> <h2 class="wp-block-heading"><strong>What to do on Monday</strong></h2> <p class="wp-block-paragraph">A good blog post should not just leave readers with ideas. It should leave them with work worth doing.</p> <p class="wp-block-paragraph">If you run PostgreSQL today, start with five practical steps.</p> <h3 class="wp-block-heading"><strong>1. List the top three changes your team currently avoids</strong></h3> <p class="wp-block-paragraph">Be honest. Which changes create the most hesitation? Major upgrades? Failover testing? Tuning changes? Patch cycles?</p> <h3 class="wp-block-heading"><strong>2. Identify why those changes feel risky</strong></h3> <p class="wp-block-paragraph">Is it testing weakness? Tooling gaps? 
Poor rollback confidence? Unclear ownership? Monitoring noise?</p> <h3 class="wp-block-heading"><strong>3. Review your version strategy</strong></h3> <p class="wp-block-paragraph">Are you staying reasonably current, or are you drifting into large, painful upgrade gaps?</p> <h3 class="wp-block-heading"><strong>4. Audit your alerts</strong></h3> <p class="wp-block-paragraph">How many alerts actually require action? How many are just operational wallpaper?</p> <h3 class="wp-block-heading"><strong>5. Revisit your maintenance windows</strong></h3> <p class="wp-block-paragraph">Are they based on habit and calendar tradition, or on actual workload behavior?</p> <p class="wp-block-paragraph">These five steps will usually tell you more about PostgreSQL maturity than a dozen architecture slides.</p> <h2 class="wp-block-heading"><strong>How I helped the team move the conversation forward</strong></h2> <p class="wp-block-paragraph">One part of the discussion I valued most was helping turn broad platform themes into practical operating questions.</p> <p class="wp-block-paragraph">That meant helping the team think through:</p> <ol class="wp-block-list" start="1"> <li>How to approach PostgreSQL modernization as an operating model, not just a technology choice</li> <li>How to make major version upgrades and patching more repeatable and less disruptive</li> <li>How to evaluate HA/DR and distributed PostgreSQL through the lens of clarity, resilience, and operational predictability</li> <li>How to frame Aurora migration as a question of control, standardization, and long-term manageability</li> <li>How to improve monitoring by reducing false alerts and making signals more actionable</li> <li>How workload-aware maintenance windows can reduce risk and make change easier to absorb</li> <li>How guided operations and intelligent assistance can be useful only when they reduce real uncertainty for operators</li> </ol> <p class="wp-block-paragraph">For me, that is where these conversations 
become valuable: when architecture discussion turns into operational clarity, and when strategy starts becoming something a team can actually execute.</p> <p class="wp-block-paragraph">Because most teams do not need more theory. They need a clearer path.</p> <h2 class="wp-block-heading"><strong>Final thought</strong></h2> <p class="wp-block-paragraph">What stayed with me from that day was not the product list or the roadmap sequence.</p> <p class="wp-block-paragraph">It was the pattern in the questions.</p> <p class="wp-block-paragraph">The best teams in the room were not looking for magic. They were looking for confidence.</p> <p class="wp-block-paragraph">Confidence that PostgreSQL could keep evolving without exhausting the people responsible for it. Confidence that upgrades could become routine. Confidence that scale would not create confusion. Confidence that operations could become calmer, not louder.</p> <p class="wp-block-paragraph">That, to me, is where the real value of PostgreSQL lives now.</p> <p class="wp-block-paragraph">Not just in features. Not just in architecture diagrams. Not just in benchmarks.</p> <p class="wp-block-paragraph">But in helping real teams make real changes safely, repeatedly, and without losing their nerve.</p> <p class="wp-block-paragraph">That is not fluff.</p> <p class="wp-block-paragraph">That is the work.</p> <p class="wp-block-paragraph"></p>Wed, 11 Mar 2026 13:36:44 +0000https://postgr.es/p/7uIFloor Drees: The Future of Postgres on the agenda: EDB’s PGConf.dev Previewhttps://postgr.es/p/7uMPGConf.dev is heading to Vancouver, Canada, from May 19–22, bringing together the users, developers, and community organizers driving the future of PostgreSQL. EDB is proud to be a Gold-level sponsor this year, with our own Robert Haas serving as an organizer and Jacob Champion contributing to the Program Committee. 
Following a highly successful Call for Papers, we’ve put together this preview of the EDB-led sessions you won't want to miss.Wed, 11 Mar 2026 12:29:11 +0000https://postgr.es/p/7uMLukas Fittl: The Dilemma of the ‘AI DBA’https://postgr.es/p/7uGLike many in the industry, my perspective on AI tools has shifted considerably over the past year, specifically when it comes to software engineering tasks. Going from “this is nice, but doesn’t really solve complex tasks for me” to “this actually works pretty well for certain use cases.” But the more capable these tools become, the sharper one dilemma gets: you can hand off the work, but an AI agent won’t ultimately be responsible when the database goes down and your app stops working. For…Wed, 11 Mar 2026 00:00:00 +0000https://postgr.es/p/7uGLætitia AVROT: work_mem: it's a trap!https://postgr.es/p/7uHMy friend Henrietta Dombrovskaya pinged me on Telegram. Her production cluster had just been killed by the OOM killer after eating 2 TB of RAM. work_mem was set to 2 MB. Something didn&rsquo;t add up. Hetty, like me, likes playing with monster hardware. 2 TB of RAM is not unusual in her world. But losing the whole cluster to a single query during peak operations is a very different kind of problem from a 3am outage.Wed, 11 Mar 2026 00:00:00 +0000https://postgr.es/p/7uHVirender Singla: The Part of PostgreSQL We Discuss the Most — 2https://postgr.es/p/7uC<h4><strong>PostgreSQL and Oracle Implementation</strong></h4><p>In <a href="https://medium.com/@virender-cse/the-part-of-postgresql-we-discuss-the-most-1-6c69c9d15f16">Part 1</a>, we explored the general concepts of MVCC and the implications of storing data snapshots either out-of-place or within heap storage. We can now map these methodologies to specific database engines.</p><p>The PostgreSQL MVCC implementation aligns with the DatabaseI model, whereas Oracle and MySQL are closely related to the DatabaseO model. 
Specifically, Oracle utilizes block versioning and stores older versions in a separate storage area known as UNDO, while PostgreSQL employs row versioning.</p><p>These engines further optimize their respective in-place or out-of-place MVCC strategies:</p><ul><li><strong>Oracle (DatabaseO) Delta Storage: </strong>To improve efficiency, Oracle avoids copying an entire block to UNDO. Instead, it only stores the modified columns as a “delta.” Consequently, when a query requires an older image, the engine applies this delta to the current heap block to reconstruct the previous state.</li><li><strong>PostgreSQL (DatabaseI) Visibility Map (VM): </strong>To mitigate the overhead of scanning the entire heap for garbage collection, PostgreSQL uses a <a href="https://www.postgresql.org/docs/current/storage-vm.html">Visibility Map</a>. This data structure maintains per-block information about the heap, allowing the garbage collector to identify specific blocks containing garbage instead of performing a full table scan.</li><li><strong>Heap Only Tuple (HOT) Optimization: </strong>PostgreSQL addresses the continuous index churn caused by new physical addresses (ctid) through <a href="https://www.postgresql.org/docs/current/storage-hot.html">HOT</a> optimization. If a new row version fits within the same block as the previous version and no indexed column has changed, the indexes are not updated. Instead, index access lands on the heap block at the old version, which then chains directly to the new version within the same block. Note that it’s still a single block fetch.</li><li><strong>Row Locking Mechanism: </strong>PostgreSQL utilizes the visibility counters to manage row locking as well, whereas Oracle employs a distinct data structure located in the block header for this purpose.</li><li><strong>Handling Multiple Data Versions:</strong> When a row undergoes multiple updates, Oracle maintains all historical versions in UNDO, linking them via pointers with the head of the chain anchored in the block header. 
Hence, the header only needs a single counter to redirect to this UNDO chain. However, if a record has been updated many times (ten, for example), a <strong>SELECT</strong> operation may have to traverse the entire UNDO sequence to reconstruct the required snapshot.</li></ul><h4><strong>Scenario when the Block fills</strong></h4><p>PostgreSQL has a higher probability of filling blocks quickly because of its in-place storage method, whereas Oracle typically only fills a block further when an update causes a row’s width to expand. PostgreSQL and Oracle differ in how they handle index updates when the modified row version must be stored in a different heap block due to lack of free space within the same heap block. While the physical address changes in both systems, PostgreSQL updates the index entries in this scenario, whereas Oracle maintains the existing index entry. In Oracle, an index fetch will still land on the original heap block and then be redirected to the new heap block, a process known as “Row Migration” that results in two I/O operations. In summary, the architecture of PostgreSQL results in increased UPDATE latency due to heightened index churn. This stands in contrast to Oracle, where the performance impact is instead shifted to SELECT operations, which will require an additional fetch to retrieve data. This design choice, however, is independent of their respective MVCC implementations.</p><p><strong>In MySQL</strong>, data is structured within a clustered index, and secondary indexes reference this data using logical primary key pointers rather than physical addresses. Because of this logical mapping, secondary indexes remain unaffected when the physical location of rows within MySQL changes.</p><h3>Inefficiencies in the PostgreSQL Garbage Collection!</h3><h4><strong>More Bloat?</strong></h4><p>The issue of bloat caused by PostgreSQL’s MVCC implementation is a frequent subject of debate. 
While Oracle bloat is less commonly discussed, long-running queries in that system can still experience significant performance degradation when forced to retrieve older data versions from UNDO blocks. To illustrate this, consider a scenario where an UPDATE statement modifies every row in a table. If a large query begins just before this update and performs a full table scan before any garbage collection occurs, PostgreSQL must fetch nearly double the actual table size because two versions of every row now exist within the heap. Oracle must also access these previous images, doing so by reaching into both the heap and the dedicated UNDO storage. In this sense, “garbage” or versioning bloat impacts the I/O of both engines.</p><p><strong>However, a critical distinction remains:</strong> Oracle queries are only impacted when they specifically require a previous block image; all new queries are served directly from the updated heap blocks. In contrast, PostgreSQL bloat continues to impact every query, regardless of when it started, because the outdated versions remain interleaved with live data. Moreover, PostgreSQL bloat is often permanent. Even after the autovacuum process removes “dead tuples,” the physical table size typically does not shrink, meaning full table scans must still traverse empty space. This is usually not a significant concern, as incoming data will eventually occupy these partially filled blocks. Although PostgreSQL can <a href="https://www.postgresql.org/docs/current/runtime-config-vacuum.html#:~:text=19.10.3.%C2%A0Default%20Behavior-,vacuum_truncate,-(boolean)">truncate</a> empty blocks at the end of a file to return space to the OS, this requires an exclusive lock and can introduce other operational challenges, such as query conflicts on replicas.</p><h4><strong>Slow Default Configurations</strong></h4><p>In PostgreSQL, vacuuming is the primary garbage collection mechanism, managed by the “autovacuum” background worker. 
This process purges obsolete row versions based on a number of <a href="https://www.postgresql.org/docs/current/runtime-config-vacuum.html">configurable parameters</a>. The instance-wide default settings, such as triggering a cleanup only when dead rows reach roughly 20% of the table (the default autovacuum_vacuum_scale_factor of 0.2), are often too conservative for modern production workloads. This fixed percentage scales poorly in enterprise environments where tables reach hundreds of gigabytes, leading to significant bloat before maintenance begins. Beyond these conservative defaults, garbage collection can be further delayed or blocked by long-running transactions that require access to older row versions, preventing the reclamation of space. Also, concerns that autovacuum workers might consume resources and interfere with live traffic often lead administrators to favor less aggressive configurations or rely on manual vacuuming during off-peak hours. This necessitates time-consuming, table-level tuning based on specific sizes and workloads rather than relying on a one-size-fits-all approach.</p><h4><strong>Index Churn</strong></h4><p>Index maintenance adds another layer of complexity; cleaning indexes is significantly more expensive than heap cleanup. It requires the autovacuum process to collect row pointers (ctid) and perform exhaustive index scans, a process that slows down considerably as the number of indexes increases.</p><h4><strong>Right value for FILLFACTOR!</strong></h4><p>Both systems provide configuration settings, such as PostgreSQL’s FILLFACTOR, to determine the percentage of a block to fill versus the amount reserved for future updates. To reduce index churn, <strong>FILLFACTOR</strong> (defaults to 100, or 100% occupancy) is a critical setting in PostgreSQL. By reducing this value, administrators reserve space within blocks to facilitate HOT updates; however, identifying the optimal setting can be challenging. 
This contrasts with Oracle’s architecture, which typically leaves 10% of each block reserved for future updates and is generally less sensitive to these issues. Further details regarding FILLFACTOR can be found in articles <a href="https://medium.com/nerd-for-tech/moving-oracle-to-postgres-be-aware-of-fillfactor-and-indexing-implications-fcec6596827a">1</a> and <a href="https://medium.com/nerd-for-tech/postgres-fillfactor-baf3117aca0a">2</a>.</p><h4>Oracle Garbage Collection</h4><p>Oracle’s garbage collection is largely autonomous; it manages versioning pressure internally without requiring the same level of granular manual tuning or exposure of maintenance settings to the user. Oracle users primarily focus on two administrative tasks: ensuring adequate UNDO space (similar to managing standard data file storage) and setting a Time-to-Live via the undo_retention parameter to regulate garbage cleanup.</p><p>It is also important to note that Oracle indexes accumulate garbage, as previously detailed in the Index MVCC versioning section. Unlike PostgreSQL, Oracle does not feature a dedicated garbage collection process for its indexes. It performs cleanup lazily or opportunistically, reclaiming space only when subsequent transactions happen to traverse those specific index blocks. Notably, index garbage is only produced during DELETE operations or UPDATEs that modify an indexed column; standard UPDATEs, even those that alter a row’s physical address (due to Row Migration), do not contribute to this.</p><p><strong>MySQL</strong> has the same index garbage problem. To address this, MySQL includes primary keys in undo records, enabling the removal of garbage from clustered indexes during the cleanup of these undo records.</p><h3>The (In)Famous Transaction ID Wraparound Issue</h3><p>The “Transaction ID Wraparound” issue is a well-known phenomenon in PostgreSQL that can lead to significant database downtime. 
Why is this predominantly a PostgreSQL concern?</p><h4><strong>Storage and Performance Concerns</strong></h4><p>PostgreSQL utilizes 4-byte Transaction ID (XID) counters, known as xmin and xmax, for visibility tracking on a per-row basis. With a 4-billion transaction limit, the system must eventually recycle these IDs. While the autovacuum process typically manages this cleanup, high-volume workloads can exhaust these IDs rapidly, so autovacuum needs to be tuned carefully for such workloads. If XID recycling is delayed or obstructed by a blocker, the exhaustion of all available IDs triggers an enforced outage known as a Transaction Wraparound.</p><p>In contrast, Oracle stores XIDs within the block header. This architectural choice allowed Oracle to utilize larger data types: 6 bytes prior to Oracle 12, and 8 bytes in subsequent versions. While the 6-byte implementation did cause a notable <a href="https://datageek.blog/2012/01/19/oracles-scn-flaw-could-it-happen-in-db2/">scare</a> in 2012 due to a specific bug, the larger capacity significantly extends the runway. MySQL uses a 6-byte <strong>DB_TRX_ID</strong>.</p><blockquote><strong>And 8 bytes is not just double 4 bytes; it’s 4,294,967,296 times larger </strong>🙂</blockquote><p>Here is the straightforward math behind it:</p><blockquote><strong>4-byte capacity:</strong> 32 bits (2<sup>32</sup>) gives you <strong>4,294,967,296</strong> possible values.</blockquote><blockquote><strong>6-byte capacity:</strong> 48 bits (2<sup>48</sup>) gives you <strong>281,474,976,710,656</strong> possible values.</blockquote><blockquote><strong>8-byte capacity:</strong> 64 bits (2<sup>64</sup>) gives you <strong>18,446,744,073,709,551,616</strong> possible values.</blockquote><p><strong>The Burn Rate:</strong> At a very high enterprise throughput of 20,000 transactions per second (TPS), it would take over <strong>440 years</strong> of continuous, uninterrupted processing to exhaust the 6-byte limit (2<sup>48</sup> ÷ 20,000 ≈ 1.4 × 10<sup>10</sup> seconds, or roughly 446 years). 
Even at an extreme 100,000 TPS, a system would still have nearly 90 years of runway.</p><p><strong>Why has PostgreSQL not adopted larger XID storage?</strong> While larger XIDs are technically feasible, the primary deterrent is that storing these visibility counters on a per-row basis raises significant concerns regarding performance and storage overhead. Interestingly, there is an ongoing <a href="https://www.postgresql.org/message-id/flat/4eb56320-744e-49ba-b766-702bc2fb61a8%40tantorlabs.com#5fc58da444be6766d84c7fae266f030c">hacker thread</a> discussing the possibility of storing 8-byte XIDs at the block level and combining them with row-level XIDs to determine visibility. Also, some PostgreSQL variants already utilize such methods to bypass the 4-byte limitation.</p><h4><strong>Index Visibility</strong></h4><p>PostgreSQL omits Transaction IDs (XIDs) from its indexes, likely to avoid the substantial per-row storage overhead that would otherwise be required for each index entry. Consider the impact of indexing a 2-byte SMALLINT column: adding two 4-byte XIDs alongside the CTID would significantly inflate the entry size. Hence, row visibility cannot be determined solely at the index level; the system must instead consult the heap row to verify visibility. To optimize this process and avoid constant heap access, PostgreSQL utilizes the Visibility Map (VM), allowing a transaction to quickly check if an entire block is visible. This architecture is why Index-Only Scans in PostgreSQL are not truly “index-only” in the absolute sense.</p><p>For this exact reason, a DELETE operation in PostgreSQL does not immediately modify any index entries. Cleaning up these entries is prohibited at that stage because concurrent sessions might still be interested in accessing those deleted records. Because the index carries no visibility counters, other sessions must independently verify a row’s deletion status by consulting the visibility information of the heap row itself. 
These index entries are only permanently removed once the autovacuum process confirms that no active transactions can still access the previous versions. That means after a DELETE and before VACUUM happens, every query accessing those rows has to consult the heap to check the row status. As an optimization, PostgreSQL does keep delete bits within the index to signify deletions, but a DELETE does not set them either. PostgreSQL sets these bits opportunistically during a SELECT if it determines that the old entries are no longer visible to any other transaction. Hence, subsequent SELECTs do not need to consult the heap for those entries.</p><h4><strong>Segregation of work</strong></h4><p>Another challenge with XID recycling is the multifaceted nature of the autovacuum process, which handles dead tuple cleanup, statistics collection, and recycling. Hence, even when XID exhaustion is imminent, autovacuum may still be occupied with heap and index cleaning. This has been largely addressed through a <a href="https://postgresqlco.nf/doc/en/param/vacuum_failsafe_age/">fail-safe</a> mechanism: once XID consumption reaches a specific configurable threshold, the system bypasses other tasks to focus exclusively on XID recycling.</p><h4><strong>Incremental XID recycling?</strong></h4><p>The XID cleanup process is not incremental in PostgreSQL. That means there is no mechanism to isolate and prioritize the blocks or rows containing the oldest XIDs; the system cannot simply clean those first to bring the database back online while continuing to recycle remaining IDs in the background. 
Instead, the autovacuum process is forced to scan every block identified by the Visibility Map as requiring recycling, a task that can span many hours for exceptionally large tables.</p><h4><strong>The Role of Blockers</strong></h4><p>While the default autovacuum configuration is frequently blamed for the buildup of dead tuples and transaction wraparound risks, the true cause often lies with various blockers, such as long-running queries and transactions, replication issues, prepared transactions, or even <a href="https://medium.com/@virender-cse/idle-session-triggers-a-transaction-wraparound-585cd1bc55d3">temporary tables</a>, as I recently came across. It is crucial to pay close attention to these blockers and mitigate them by utilizing timeout functionality, particularly by setting limits on query run times.</p><p>In contrast, Oracle utilizes a specific <em>undo_retention</em> parameter that acts as a Time-to-Live (TTL) for garbage data; once this threshold is met, the data is purged regardless of whether other transactions might still require it. PostgreSQL, however, maintains a stricter dependency: a single transaction executing a command like <em>pg_sleep(infinite)</em> will effectively stall all garbage collection and XID recycling for any transaction IDs following it.</p><h4>Visibility of Overflow Pages</h4><p>In databases, a row that is too wide to fit within a standard heap block (often due to Large Object or LOB data) is split and stored in fixed-size segments across overflow pages, with a redirection pointer remaining in the main page row. In PostgreSQL this is known as a TOAST table.</p><p>A critical architectural nuance in PostgreSQL, chosen for <strong>simplicity</strong>, is that these overflow pages maintain their own visibility counters, xmin and xmax, despite having no independent identity outside of the main table. TOAST entries are always accessed via the main table row, which governs overall visibility, so these separate counters create a maintenance requirement of their own. 
For instance, a 1TB table that stores only 10GB in the main table while the rest resides in TOAST still requires XID recycling at both locations.</p><h3><strong>Customer User Journeys (CUJs)</strong></h3><p>Ultimately, the typical end-user experience for each system can be summarized as follows:</p><h4><strong>Oracle Journey</strong></h4><ol><li>Issue: A daily query suddenly fails with a “snapshot too old” error.</li><li>Cause: Older UNDO data versions have been cleaned up.</li><li>Investigation: Determine why the query duration increased (e.g., higher data volume or a changed execution plan).</li><li>Resolution: Apply a quick fix by increasing UNDO space or pursue a long-term solution by tuning the query.</li></ol><h4><strong>PostgreSQL Journey</strong></h4><ol><li>Issue: The system approaches a transaction ID wraparound or performance degrades due to accumulated dead tuples.</li><li>Cause: Garbage collection is lagging far enough behind to impact the workload.</li><li>Investigation: Evaluate if autovacuum is too slow; this often leads to a repetitive cycle of tuning flags at the table level.</li><li>Resolution: The administrator must pinpoint the specific blocker and terminate it to permit autovacuum to catch up.</li></ol><p>The distinction between these journeys is clear: in one scenario, the impact is localized to a specific query, whereas in the other, it affects the entire system workload.</p><h3>Final Thoughts</h3><p>Conceptually, both in-place and out-of-place data versioning strategies present distinct advantages and challenges, but in practice PostgreSQL’s in-place MVCC implementation has more drawbacks. In fact, a discussion was initiated a few years ago regarding <strong>zheap</strong>, an initiative to implement out-of-place versioning in Postgres. The community has also made rapid progress in optimizing and refining the autovacuum process. 
Recent advancements in the cleanup process include the ability to skip indexes during maintenance, parallel index cleanup, and a more streamlined memory architecture. Additionally, the system now features bottom-up deletion and reduced WAL thrashing.</p><p>While managed service providers offer database solutions, the shared responsibility model means that customers remain responsible for the intricate task of tuning autovacuum. Until PostgreSQL achieves a fully autonomous garbage collection system for its in-place MVCC implementation, we must remain diligent in monitoring and tuning our critical production databases.</p><p><strong>Thank you for reading. Suggestions and feedback are appreciated.</strong></p>Tue, 10 Mar 2026 17:27:35 +0000https://postgr.es/p/7uCVirender Singla: The Part of PostgreSQL We Discuss the Most — 1https://postgr.es/p/7uD<p>Early in my PostgreSQL journey, I often sensed that a conversation between two Postgres professionals inevitably revolved around vacuuming. That <strong>lighthearted</strong> observation remains relevant, as my LinkedIn feed is often filled with discussions around vacuuming and comparisons of PostgreSQL’s Multi-Version Concurrency Control (MVCC) implementation with other engines like Oracle or MySQL. Given that people are naturally drawn to the most complex components of a system, I will continue this journey by exploring a detailed comparison of these database architectures focused on the MVCC implementations.</p><h3><strong>What is MVCC?</strong></h3><p>Stone age databases relied on strict locking mechanisms to handle concurrency, which proved inefficient under heavy load. In these traditional models, a read operation required a shared lock that prevented other transactions from updating the record. 
Conversely, write operations required exclusive locks that blocked incoming reads. This resulted in significant lock contention, where <strong>readers blocked writers and writers blocked readers</strong>.</p><p>To solve this, RDBMS implemented MVCC. The idea was very simple. Rather than overwriting data immediately, maintain multiple versions of data simultaneously. This allows transactions to view a consistent snapshot of the database as it existed at a specific point in time. <strong>For instance,</strong> if User 1 starts reading a table just before User 2 starts modifying a record, User 1 sees the original version of the data without hindering User 2’s progress. Without MVCC, the system would be forced to either serialize all access — making User 2 wait — or risk data consistency anomalies like dirty or non-repeatable reads where User 1 sees uncommitted changes that might eventually be rolled back.</p><p>Database engines utilize various architectures to manage this data versioning. A particularly notable point of discussion is the comparison between “in-place” and “out-of-place” data versioning techniques. Let’s examine these approaches more closely.</p><h3><strong>Explaining In-Place and Out-of-Place Data Versioning</strong></h3><h4><strong>Theoretical Framework:</strong></h4><p>To explore the core distinctions between MVCC implementations, let us consider two RDBMS utilizing row-based storage models: <strong>DatabaseI</strong> (in-place data versioning) and <strong>DatabaseO</strong> (out-of-place data versioning). These placeholder names are intended to represent the primary methodologies of PostgreSQL and Oracle, respectively, facilitating a comparison of their MVCC implementations without delving into exhaustive internal details at this level. 
In the subsequent analysis, we will map these conceptual models to the practical implementations found in PostgreSQL and Oracle.</p><h4><strong>Fundamentals of Table Storage:</strong></h4><p>In RDBMS, the fundamental unit of storage is a block or page — typically sized in kilobytes (though column-oriented models often utilize larger block size). A table, or heap, is essentially a collection of these blocks. Each block contains data rows along with a header that stores essential metadata for maintaining consistency and performing checksums. The block represents the smallest granular unit for reading. When a transaction queries a table, it accesses these heap blocks to retrieve the required data.</p><h4><strong>In-Place and Out-of-Place Data Versioning:</strong></h4><p>To understand the nuances of MVCC, consider the implications of a standard update operation, such as modifying a user’s account balance:</p><pre>UPDATE accounts SET balance = balance + 100 WHERE user_id = 100;</pre><p>While both DatabaseO and DatabaseI must preserve the before and after states of the data to ensure isolation, they diverge in their storage strategies for these versions.</p><p><strong><em>DatabaseO: Block-Based Versioning (Copy-on-Write):</em></strong></p><p>Operating as a Copy-on-Write (CoW) system, DatabaseO employs a block-based strategy. When an update occurs, the engine reads the block into memory, copies the original version to a dedicated storage area for older images, modifies the data in memory, and then writes the updated block back to the heap. This ensures the heap always contains the most current data, while older versions are retrieved from separate storage via redirection structures in the heap block header.</p><p><strong><em>DatabaseI: Row-Level Versioning (In-Place):</em></strong></p><p>Conversely, DatabaseI uses a row-level approach where the original data is not moved. 
Instead, it creates a new data version with the updated data and stores it directly within the existing heap block alongside the previous version. The process involves reading the block, adding the new row in its complete format (as it is a row-based engine), and writing the modified block back to the heap. Hence, a single block holds multiple iterations of the same row, keeping both states in one physical location.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Nt1wtbZy86jLADhpmS0LZQ.png" /></figure><p>In essence, one approach utilizes block versioning while the other relies on row versioning.</p><h3><strong>Understanding Snapshot Visibility</strong></h3><p>As MVCC functions by maintaining multiple versions of data, allowing a transaction to view data as it existed when that transaction began, the fundamental question then becomes: <strong>how does a transaction determine which version of this snapshot to read?</strong></p><p>To manage this, the database assigns each transaction a unique Transaction ID (call it as <strong>XID</strong>), which acts as a database clock. These XIDs must be stored alongside the data to track when changes occurred and ensure that the appropriate transaction reads the correct data version.</p><p>DatabaseO, which utilizes block versioning, stores the XID in the block header so it can distinguish between two block versions. Conversely, DatabaseI, which employs row versioning, stores the XID within the header of each individual row so it can distinguish between two row versions.</p><h4><strong>A Practical Look at Snapshot Visibility</strong></h4><p>Consider a scenario where a row is inserted at XID=0 and subsequently updated at XID=100:</p><p><strong><em>Visibility in DatabaseO (Block-Level Versioning):</em></strong></p><p>In DatabaseO, an update causes the original XID=0 block image to be moved to separate storage while the heap block is updated with new data and a new XID of 100. 
Now if a transaction with an XID &lt; 100 (already running before you executed the UPDATE) attempts to read the heap block, it finds the heap block header at XID=100. Recognizing that the data has changed since it began, it retrieves the old block image from the separate storage. Transactions with an XID ≥ 100 can read the current heap block directly as they are interested in reading the most recent version of the data.</p><p><strong>At INSERT (XID = 0):</strong></p><blockquote>Data-Snapshot1: XID=0 /* heap block */</blockquote><p><strong>At UPDATE (XID = 100):</strong></p><blockquote>Data-Snapshot1: XID=0 /* separate storage */</blockquote><blockquote>Data-Snapshot2: XID=100 /* heap block */</blockquote><p><strong><em>Visibility in DatabaseI (Row-Level Versioning):</em></strong></p><p>In DatabaseI, both row versions (XID=0 and XID=100) are stored within the heap block. When a transaction with an XID &lt; 100 reads the block, it reads both rows but only returns the version at XID=0, filtering out the newer version because its XID is greater than the transaction XID, meaning the data did not exist when the transaction started.</p><p>For transactions where the XID ≥ 100, the process becomes slightly more complex, as the transaction XID is now greater than or equal to the XIDs of both rows, so the insertion XID alone cannot tell the reader which version is current. To resolve this, DatabaseI has to include two XID markers per row: one set upon insertion (XID1) and another set upon deletion (XID2):</p><p><strong>At INSERT (XID = 0):</strong></p><blockquote>Data-Snapshot1: XID1=0, XID2=NULL /* Inserted by XID 0 */</blockquote><p><strong>At UPDATE (XID = 100):</strong></p><blockquote>Data-Snapshot1: XID1=0, XID2=100 /* Inserted by XID 0, Deleted by XID 100 */</blockquote><blockquote>Data-Snapshot2: XID1=100, XID2=NULL /* Inserted by XID 100 */</blockquote><p>A row is only visible to a transaction if that transaction’s XID is at or after the row’s XID1 and before its XID2 (or XID2 is NULL). 
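</p><p>This simplified rule can be sketched in Python. The <code>RowVersion</code> class, the <code>is_visible</code> and <code>read_block</code> functions, and the balance values are hypothetical names and data invented for this illustration. The sketch ignores commit status, locks, and isolation levels, so it is a model of DatabaseI as described here, not PostgreSQL's actual visibility code.</p>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    xid1: int                   # XID of the transaction that inserted this version
    xid2: Optional[int] = None  # XID of the transaction that deleted or replaced it


def is_visible(row: RowVersion, txid: int) -> bool:
    """Simplified rule: visible when xid1 is at or before txid and txid is before xid2."""
    if row.xid1 > txid:
        return False  # the version was created after the reader's XID
    if row.xid2 is not None and txid >= row.xid2:
        return False  # the version was already replaced as of the reader's XID
    return True


def read_block(block: List[RowVersion], txid: int) -> List[str]:
    """Return the values in a heap block that a reader with the given XID may see."""
    return [row.value for row in block if is_visible(row, txid)]


# Heap block after the INSERT at XID 0 and the UPDATE at XID 100 (values are made up).
block = [
    RowVersion("balance=500", xid1=0, xid2=100),
    RowVersion("balance=600", xid1=100),
]

if __name__ == "__main__":
    print(read_block(block, txid=50))   # reader that started before the UPDATE
    print(read_block(block, txid=150))  # reader that started after the UPDATE
```

<p>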
Thus, a transaction with an XID ≥ 100 knows to ignore the original row because it was effectively “deleted” (replaced) at XID=100.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1002/1*4Q5cKBQFEcTibNVfqz2fxQ.png" /></figure><h4><strong>Final Considerations</strong></h4><p>These examples illustrate simplified visibility rules. Real-world concurrent environments involve additional complexities, such as row locking, multiple isolation levels, and the status of transactions (committed vs. rolled back).</p><h3>The Impact of MVCC on Index Scans</h3><p>The analysis thus far has focused on transactions accessing heap blocks directly. In OLTP environments where low latency is critical, a transaction typically targets an index entry before reaching the heap. This raises a key question: how does the MVCC architecture handle index-based access?</p><h4><strong>Index Storage and Row Locators</strong></h4><p>Each row in a heap block is identified by a physical ID, serving as its unique address in the heap block. B+ Tree indexes store column values alongside this row locator to precisely identify the row’s position within the heap block.</p><h4><strong>Impact of Data Versioning on Indexes</strong></h4><p>The choice of MVCC architecture significantly impacts index maintenance:</p><ol><li><strong>DatabaseO:</strong> Because the heap block contains the modified data within the same row, the physical ID location could remain constant. Consequently, the indexes do not require updates when data changes.</li><li><strong>DatabaseI:</strong> Each new row version within the heap block receives a new physical ID. This necessitates updating all associated indexes to point to the new location. 
That means a lot of index churn.</li></ol><p><em>Note: </em>If an indexed column value itself is modified, the index entry must be updated regardless of whether the engine is DatabaseO or DatabaseI.</p><h3>The Intricacies of Index Versioning</h3><p>Above, we discussed data versioning in the heap and its impact on indexes. However, indexes may also hold multiple versions of data. Implementing MVCC within an index is inherently more complex than managing versioning within heap blocks. In DatabaseO, while heap versioning can simply redirect transactions to read older data versions in separate storage, index versioning forces both DatabaseI and DatabaseO to store data snapshot versions directly within the index blocks to maintain searchability for active queries. Imagine if a value “5” were deleted from the index and moved to a separate image: a search operation would be unable to locate it in the index structure.</p><p>For example, consider updating an indexed value from <strong>awesome</strong> to <strong>outstanding</strong>. Because tree indexes keep entries in sorted order, these entries may reside on different leaf blocks, each assigned specific visibility counters.</p><p><strong>DatabaseO:</strong></p><p><strong>At INSERT (XID = 0):</strong></p><blockquote>“awesome”: XID=0 /* Index Block */</blockquote><p><strong>At UPDATE (XID = 100):</strong></p><blockquote>“awesome”: XID=0 /* Index Block 1 */</blockquote><blockquote>“outstanding”: XID=100 /* Index Block 2 */</blockquote><p>In the separate storage, a previous image of Index Block 2 is captured prior to the insertion of the “outstanding” value. 
Conversely, since no modifications were made to Index Block 1, no previous image was generated for it.</p><p><strong>DatabaseI:</strong></p><p><strong>At INSERT (XID = 0):</strong></p><blockquote>“awesome”: XID1=0, XID2=NULL /* Index Block, Inserted by XID 0 */</blockquote><p><strong>At UPDATE (XID = 100):</strong></p><blockquote>“awesome”: XID1=0, XID2=100 /* Index Block 1, Inserted by XID 0, Deleted by XID 100 */</blockquote><blockquote>“outstanding”: XID1=100, XID2=NULL /* Index Block 2, Inserted by XID 100 */</blockquote><h4>Querying the New Value: “outstanding”</h4><p>When DatabaseO receives a query for “outstanding,” it accesses the relevant leaf block and checks the XID in the block header. The system then determines whether to read the current version (if transaction XID &gt;= block header XID) or to seek a before-image (if transaction XID &lt; block header XID), though that before-image does not contain the “outstanding” value itself. Similarly, DatabaseI evaluates the XIDs stored within the index entry to decide if the value should be read.</p><h4>Querying the Old Value: “awesome”</h4><p>DatabaseI evaluates the XIDs stored within the index entry to decide if the value should be read. But in DatabaseO retrieving the legacy value “awesome” is more difficult because the system must signal to new transactions that the entry is replaced or deleted. This typically requires visibility counters paired with a dedicated delete bit. 
In DatabaseO, a query for “awesome” checks the current block for a deletion marker or fetches the older image from separate storage where the delete bit is not set, depending on the transaction’s relationship to the block header XID.</p><p><strong>DatabaseO:</strong></p><p><strong>At INSERT (XID = 0):</strong></p><blockquote>“awesome”: XID=0, Del_Bit=0 /* Index Block */</blockquote><p><strong>At UPDATE (XID = 100):</strong></p><blockquote>“awesome”: XID=100, Del_Bit=1 /* Index Block 1 */</blockquote><blockquote>“outstanding”: XID=100, Del_Bit=0 /* Index Block 2 */</blockquote><p>Now the previous images of both Index Block 1 and Index Block 2 are captured in the separate storage. It is also important to note that indexes now retain “garbage” (obsolete values), which necessitates a structured cleanup regardless of whether <strong>DatabaseO</strong> or <strong>DatabaseI</strong> is utilized.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*I-2K4MwMGyIW8PYPLRXy-Q.png" /></figure><h3>Comparing In-Place and Out-of-Place Data Versioning</h3><p>Now let’s look at the high-level differences between these two versioning models:</p><h4><strong>Write Performance during Version Creation</strong></h4><p>At the moment of version creation, DatabaseI appears more efficient. By placing the new version in the same block as the old one, it minimizes I/O; the system fetches a block, produces the new version within that same block, and writes it back in a single operation. In contrast, DatabaseO requires at least two random I/O operations: one to write the heap block and another to write the old image to a separate file.</p><h4><strong>Read Performance</strong></h4><p>For read operations, DatabaseI maintains an I/O advantage by fetching only a single block and filtering the versions locally. 
DatabaseO, however, may need to fetch the current heap page and then perform an additional fetch from separate storage if the transaction requires a version older than what the current heap block provides. Of course, if a transaction requires the most recent image of the data, both databases would only need to fetch the heap block.</p><h4><strong>Block Density</strong></h4><p>In terms of storage density, however, DatabaseI encounters a notable hurdle. Its practice of maintaining multiple versions within the same block causes heap blocks to reach capacity much more quickly. Conversely, DatabaseO stores the pre-update data image in a separate location, allowing the row to occupy essentially the same amount of space within the heap. Both models share a common exception: when a short value is replaced by a substantially larger one (for example, updating “country” from “USA” to “United States of America”), the block may fill up rapidly regardless of the underlying architecture. This accelerated filling of blocks in DatabaseI creates additional downstream effects on both read and write performance, which will be explored in greater detail in a subsequent section.</p><h4><strong>Efficiency of the Cleanup Process</strong></h4><p>The continuous generation of new row versions presents a fundamental challenge in MVCC design. Databases would encounter unrestricted growth without a structured approach to purge outdated data. This maintenance, termed garbage collection, identifies and removes legacy data snapshots once it is certain that no active transaction requires access to them. DatabaseI stores the garbage directly with live data in the heap, while DatabaseO relocates it to a separate storage area.</p><p>When it comes to the actual garbage collection, DatabaseO has a distinct advantage. By isolating versioned data in a dedicated area, it can clean up obsolete rows more efficiently. 
DatabaseI, by contrast, must scan entire datasets to differentiate between live rows and garbage mixed within the heap blocks, making the identification and removal process more complex.</p><h4><strong>Snapshot Visibility</strong></h4><p>Snapshot visibility is more straightforward in DatabaseO than in DatabaseI. DatabaseI is at a notable disadvantage because it must maintain two visibility counters for every individual row, whereas DatabaseO only requires a single counter (along with a redirection structure) located within the block header.</p><h4><strong>Rollback Mechanism</strong></h4><p>In the event of a transaction rollback, the architectural differences between the two systems become even more apparent. For DatabaseI, rollback cleanup is essentially part of its normal garbage collection routine; it must eventually clean up a row version regardless of whether the transaction is committed or rolled back, and it simply uses the transaction status to determine which version is obsolete. Conversely, DatabaseO always maintains the most recent image directly in the heap block. To perform a rollback, it must actively undo the changes by restoring the original data from separate storage back to the heap. This creates additional I/O overhead during the rollback process. Moreover, until the rollback is finalized, any other transaction attempting to access that heap block will detect the uncommitted XID in the heap block and be redirected to separate storage to find a consistent version of the data.</p><h4><strong>Index Churn</strong></h4><p>Just as with heap blocks, DatabaseI indexes naturally consume more storage because they must maintain visibility counters for every entry. Also, the shifting physical addresses of rows in DatabaseI appear to cause significantly higher index churn. 
While data visibility within the index follows a similar logic across both systems, DatabaseO requires additional bits to track entry deletions.</p><h3>Closing</h3><p>Conceptually, both in-place and out-of-place data versioning strategies present distinct advantages and challenges. We will map these fundamental concepts to the practices followed by PostgreSQL and Oracle in <a href="https://medium.com/@virender-cse/the-part-of-postgresql-we-discuss-the-most-2-485190fa8a4e">Part2</a>…</p>Tue, 10 Mar 2026 17:26:58 +0000https://postgr.es/p/7uDFloor Drees: Shaping SQL in São Paulohttps://postgr.es/p/7uELast week, EDB engineers Matheus Alcantara and Euler Taveira attended the ISO/IEC SQL Standards Committee meeting in São Paulo as invited guests, supported remotely by veteran member Peter Eisentraut. The duo compared the collaborative environment to a PostgreSQL "Commitfest," where technical papers are proposed, debated, and refined much like code patches.Tue, 10 Mar 2026 13:37:56 +0000https://postgr.es/p/7uEAndrew Dunstan: Validating the shape of your JSON datahttps://postgr.es/p/7uB<p>One of the great things about PostgreSQL's jsonb type is the flexibility it gives you: you can store whatever structure you need without defining columns up front. But that flexibility comes with a trade-off: there's nothing stopping bad data from getting in. 
You can slap a CHECK constraint on a jsonb column, but writing validation logic in SQL or PL/pgSQL for anything beyond the trivial gets ugly fast.</p><p>I've been working on a PostgreSQL extension called <code>json_schema_validate</code> that solves this problem by letting you validate JSON and JSONB data against <a href="https://json-schema.org/" target="_blank">JSON Schema </a>specifications directly in the</p>Tue, 10 Mar 2026 10:13:17 +0000https://postgr.es/p/7uBDave Page: AI Features in pgAdmin: The AI Chat Agenthttps://postgr.es/p/7uK<p>This is the second in a series of three blog posts covering the new AI functionality in <a href="https://www.pgadmin.org/"><u>pgAdmin 4</u></a>. In the <a href="https://www.pgedge.com/blog/ai-features-in-pgadmin-configuration-and-reports">first post</a>, I covered LLM configuration and the AI-powered analysis reports. In this post, I'll introduce the AI Chat agent in the query tool, and in the third, I'll explore the AI Insights feature for EXPLAIN plan analysis.If you've ever found yourself staring at a database schema you didn't design, trying to work out the right joins to answer a seemingly simple question, you'll appreciate what the AI Chat agent brings to pgAdmin's query tool. Rather than having to alt-tab to an external AI service, paste in your schema, describe what you need, and then copy the resulting SQL back into your editor, the entire conversation now happens within the query tool itself, with full awareness of your actual database structure.<h2>Finding the AI Assistant</h2>The AI Chat agent appears as a new tab alongside the Query and Query History tabs in the left panel of the query tool. It's labelled 'AI Assistant' and is only visible when an LLM provider has been configured (as described in the first post in this series). 
The panel header shows which LLM provider and model are currently active, so you always know what's generating your responses.<img src="https://a.storyblok.com/f/187930/950x713/021a274608/picture1.png" /><h2>Natural Language to SQL</h2>The core capability of the AI Chat agent is translating natural language questions into SQL queries. You type what you want to know in plain English (or whatever language you're comfortable with), and the assistant generates the corresponding SQL, complete with an explanation of what it does and why it was written that way.For example, you might type something like:The assistant will first inspect your database schema to understand the available tables and relationships, then generate an appropriate query. The response includes both the SQL and a brief explanation, so you can understand what the query is doing before you run it.What makes this particularly useful is that the assistant doesn't just guess at your schema; it actively inspects the database using a set of tools that allow it to discover schemas, tables, columns, constraints, and indexes. 
This means the generated SQL uses your actual table and column names, respects your foreign key relationships, and takes advantage of your existing indexes where appropriate.<h2>How the Agent Works</h2>Behind the scenes, the AI Chat agent operates as a tool-using LLM agent with access to four database inspection tools:<ul><li><code>get_database_schema</code>: Lists all schemas, tables, and views in the connected database</li><li><code>get_table_info</code>: Retrieves detailed column, constraint, and index information for a specific table</li><li><code>get_table_columns</code>: Gets column names, data types, nullability, and defaults for a table</li><li><code>execute_sql_query</code>: Runs read-only SELECT queries to understand data structure and content</li></ul>When you send a message, the assistant typically begins by calling <code>get_database_schema</code> to understand what tables are available, then drills into specific tables with the table inspection tools to understand columns and relationships, and finally constructs the appropriate SQL. This tool-use loop can iterate multiple times for complex requests; the assistant might need to inspect several tables, check column types, or even run a quick exploratory query before it can generate the final answer. All of this happens within a strict safety boundary. The <code>execute_sql_query</code> tool runs exclusively within a read-only transaction, results are capped at 1,000 rows, and the maximum number of tool call iterations is configurable (defaulting to 20) through the preferences. 
The assistant cannot modify your data; it can only read and inspect the database structure.<h2>Working with Generated SQL</h2>When the assistant generates a SQL query, it's presented in a syntax-highlighted code block with three action buttons:<ul><li>Copy: Copies the SQL to your clipboard</li><li>Insert at Cursor: Inserts the SQL at the current cursor position in the query editor, which is handy if you want to incorporate it into a larger script</li><li>Replace Query: Replaces the entire contents of the query editor with the generated SQL</li></ul>The generated SQL is automatically formatted according to your editor preferences for keyword case, identifier case, data type case, and function case, so it blends naturally with the rest of your code.<h2>Conversational Context</h2>The chat maintains a full conversation history within the session, so you can refine your requests iteratively. If the first query isn't quite what you wanted, you can say something like "Actually, filter that to just orders from the last 30 days" and the assistant will adjust the previous query accordingly. The assistant is also smart enough to ask clarifying questions when your request is ambiguous; if you ask for 'the users table' but there are multiple schemas each containing a <code>users</code> table, it will ask which one you mean rather than guessing. You can navigate through your previous messages using the up and down arrow keys, much like command-line history, which is convenient when you want to rephrase or resubmit an earlier question. The Shift+Enter combination lets you type multi-line messages, whilst pressing Enter on its own sends the message.<h2>Beyond SELECT Queries</h2>The AI Chat agent isn't limited to SELECT queries. It can generate INSERT, UPDATE, DELETE, and DDL statements as well. If you ask it to "add a created_at timestamp column to the users table with a default of now()", it will generate the appropriate statement. 
For UPDATE and DELETE operations, the assistant is instructed to always include WHERE clauses, providing a useful safety net against accidentally modifying every row in a table. That said, it's worth emphasising that the generated SQL is always presented for your review before execution. The assistant never runs modification queries automatically; it generates the SQL and presents it to you, and you decide whether to run it. This keeps you firmly in control.<h2>Streaming Responses</h2>Responses are streamed to the browser via Server-Sent Events (SSE), so you see progress in real time rather than waiting for the complete response. Whilst the assistant is working, you'll see animated thinking messages with PostgreSQL-themed phrases such as 'Consulting the elephant...', 'Traversing the B-tree...', and 'Vacuuming the catalog...' that rotate every couple of seconds to let you know the analysis is in progress. If a request is taking too long (there is a five-minute timeout), you can click the Stop button to cancel the in-flight request and try a different approach.<h2>Practical Tips</h2>Having worked with the AI Chat agent extensively during development, I have a few observations that might help you get the most from it:<ul><li>Be specific about what you want. "Show me user activity" is vague, but "show me the number of logins per day for the last month, grouped by user role" gives the assistant enough context to generate precise SQL.</li><li>Use it for exploration. When you're working with an unfamiliar database, asking questions like "what tables contain customer data?" or "how are orders related to products?" can be faster than manually browsing through the schema tree.</li><li>Review the generated SQL before running it. The assistant is generally very good, but it's working with an LLM under the hood, and LLMs can occasionally produce incorrect or suboptimal queries. 
Always review what's been generated, especially for modification operations.</li><li>Take advantage of the conversation flow. Start broad and refine iteratively; it's much more natural than trying to specify everything in a single message.</li></ul><h2>What's Next</h2>In the final post in this series, I'll cover the AI Insights feature in the EXPLAIN plan viewer, which analyses your query execution plans and provides actionable optimisation recommendations, including specific index creation statements that you can insert directly into the editor. If you've ever found EXPLAIN output difficult to interpret, this feature is for you.</p>Tue, 10 Mar 2026 05:44:17 +0000https://postgr.es/p/7uKYuwei Xiao: Introducing pg_duckpipe: Real-Time CDC for Your Lakehousehttps://postgr.es/p/7uAAutomatically keep a fast, analytical copy of your PostgreSQL tables, updated in real time with no external tools needed.Tue, 10 Mar 2026 00:00:00 +0000https://postgr.es/p/7uAUmair Shahid: Thinking of PostgreSQL High Availability as Layershttps://postgr.es/p/7uy<div class="elementor elementor-29874"> <div class="elementor-element elementor-element-6af0e46d e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-5003493e elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">High availability for PostgreSQL is often treated as a single, big, dramatic decision: “Are we doing HA or not?”</span></p><p><span style="font-weight: 400;">That framing pushes teams into two extremes:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">a “hero architecture” that costs a lot and still feels tense to operate, or</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">a minimalistic architecture that everyone hopes will just keep running.</span></li></ul><p><span style="font-weight: 400;">A calmer way to design this is to treat HA and DR as layers. 
You start with a baseline, then add specific capabilities only when your RPO/RTO and budget justify them.</span></p><p><span style="font-weight: 400;">Let us walk through the layers from “single primary” to “multi-site DR posture”.</span></p><h2><span style="font-weight: 400;">Start with outcomes</span></h2><p><span style="font-weight: 400;">Before topology, align on three things:</span></p><p><span style="font-weight: 400;">1. Failure scope</span></p><ul><li><ul><li><span style="font-weight: 400;">A database host fails</span></li><li><span style="font-weight: 400;">A zone or data center goes away</span></li><li><span style="font-weight: 400;">A full region outage happens</span></li><li><span style="font-weight: 400;">Human error</span></li></ul></li></ul><p><span style="font-weight: 400;">2. RPO (Recovery Point Objective)</span></p><ul><li><ul><li style="font-weight: 400;"><span style="font-weight: 400;">We can tolerate up to 15 minutes of data loss</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">We want close to zero</span></li></ul></li></ul><p><span style="font-weight: 400;">3. RTO (Recovery Time Objective)</span></p><ul><li><ul><li style="font-weight: 400;"><span style="font-weight: 400;">We can be back in 30 minutes</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">We want service back in under 2 minutes</span></li></ul></li></ul><p><span style="font-weight: 400;">Here is my stance (and it saves money!): You get strong availability outcomes by layering in the right order.</span></p><h2><span style="font-weight: 400;">Layer 0 &#8211; Single primary (baseline, no backups)</span></h2><p><span style="font-weight: 400;">This is the baseline: one PostgreSQL primary in one site. All reads and writes go to it.</span></p><p><span style="font-weight: 400;">That is it. No replicas. No archiving. 
No backup flow in this model.</span></p> </div> </div> </div> <div class="elementor-element elementor-element-90e11aa e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-1c07acb elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/improving-postgres-performance-with-partitioning"> <img alt="" class="attachment-large size-large wp-image-29876" height="360" src="https://stormatics.tech/wp-content/uploads/2026/03/1-1024x576.webp" width="640" /> </a> </div> </div> </div> <div class="elementor-element elementor-element-3b9c8f7 e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-4bc6d07 elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">What you get:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">simplicity</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">low cost</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">low operational overhead</span></li></ul><p><span style="font-weight: 400;">What it means operationally:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">Your “recovery plan” is effectively “rebuild and rehydrate from wherever you can” (which might be infrastructure snapshots, application-level rebuilds, or other ad hoc processes depending on your environment).</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">Your availability depends heavily on the stability of the underlying host, storage, and platform.</span></li></ul><p><span style="font-weight: 400;">If you are running Layer 0, the best mindset is: keep it stable and observable.</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">solid monitoring (latency, errors, saturation)</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">sane maintenance (bloat, stats, 
connection hygiene)</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">predictable change management</span></li></ul><p><span style="font-weight: 400;">Layer 0 is not a “bad” architecture. It is simply the baseline. The moment you want a reliable recovery posture, you move to Layer 1.</span></p><h2><span style="font-weight: 400;">Layer 1 &#8211; Add offsite backups (your first real safety net)</span></h2><p><span style="font-weight: 400;">Layer 1 keeps the same single primary in Site A, and adds backup storage in Site B.</span></p><p><span style="font-weight: 400;">This model introduces a defined recovery path.</span></p> </div> </div> </div> <div class="elementor-element elementor-element-a6eda30 e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-fcf0466 elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/improving-postgres-performance-with-partitioning"> <img alt="" class="attachment-large size-large wp-image-29877" height="360" src="https://stormatics.tech/wp-content/uploads/2026/03/2-1024x576.webp" width="640" /> </a> </div> </div> </div> <div class="elementor-element elementor-element-047f020 e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-c103d77 elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">What you gain:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">You can lose the primary server and still recover your data.</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">You can meet an RPO that is “last successful backup” (which is often perfectly acceptable for many systems).</span></li></ul><p><span style="font-weight: 400;">Practical ways teams implement this:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">pgBackRest or Barman sending backups to object storage (often in 
another region/account)</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">retention policies that reflect compliance and business needs</span></li></ul><p><span style="font-weight: 400;">An important point to note here &#8211; a backup is only as good as its ‘restorability’. If you can’t restore a backup, there is no point in taking one. Best practice is to run periodic drills to test the restore procedure, measure the time it takes, and verify the data it restores. </span></p><h2><span style="font-weight: 400;">Layer 2 &#8211; Add WAL archiving (PITR-ready recovery)</span></h2><p><span style="font-weight: 400;">Layer 2 builds on Layer 1 by adding WAL archiving from Site A to Site B.</span></p><p><span style="font-weight: 400;">This is where recovery becomes precise and continuous.</span></p><p><span style="font-weight: 400;">Backups alone restore you to “the last backup.” WAL archiving lets you restore to a point in time.</span></p> </div> <div class="elementor-element elementor-element-1ffe307 elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/improving-postgres-performance-with-partitioning"> <img alt="" class="attachment-large size-large wp-image-29879" height="360" src="https://stormatics.tech/wp-content/uploads/2026/03/3-1024x576.webp" width="640" /> </a> </div> </div> </div> <div class="elementor-element elementor-element-8b03a98 e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-5113d40 elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">What you gain:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">PITR (Point-in-Time Recovery)</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">Tighter RPO</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">A clean response to human error</span></li></ul><p><span style="font-weight: 400;">The habit that makes 
this layer valuable:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">restore drills</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">timed drills</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">runbooks that a tired engineer can follow at 2 AM</span></li></ul><p><span style="font-weight: 400;">Layer 2 is one of the highest-ROI layers in the entire model because it turns recovery into a controlled process rather than improvisation.</span></p><h2><span style="font-weight: 400;">Layer 3 &#8211; Add a hot standby</span></h2><p><span style="font-weight: 400;">Layer 3 keeps backups + WAL archiving, and adds a hot standby in Site A (often in a different zone or DC).</span></p><p><span style="font-weight: 400;">Primary → standby uses asynchronous streaming replication.</span></p> </div> <div class="elementor-element elementor-element-2ec2341 elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/improving-postgres-performance-with-partitioning"> <img alt="" class="attachment-large size-large wp-image-29878" height="360" src="https://stormatics.tech/wp-content/uploads/2026/03/4-1024x576.webp" width="640" /> </a> </div> </div> </div> <div class="elementor-element elementor-element-fd08d5e e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-3397218 elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">What you gain:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">much faster RTO (fail over to the standby instead of rebuilding)</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">the option for load balancing (route read queries to the standby)</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">planned switchovers for maintenance that do not disrupt operations</span></li></ul><p><span style="font-weight: 
400;">Additional monitoring requirements:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">replication lag</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">WAL generation rate</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">standby replay delay</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">failover readiness </span></li></ul><p><span style="font-weight: 400;">This is also where teams choose between:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">disciplined manual failover</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">Auto failover using an HA manager</span></li></ul><p><span style="font-weight: 400;">Either path works when it is tested and documented.</span></p><h2><span style="font-weight: 400;">Layer 4 &#8211; Add synchronous replication</span></h2><p><span style="font-weight: 400;">Layer 4 is where teams typically run a primary and multiple standbys, using:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">synchronous replication for stronger data guarantees, and</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">asynchronous replication for flexibility and additional redundancy.</span></li></ul> </div> <div class="elementor-element elementor-element-e2c42eb elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/improving-postgres-performance-with-partitioning"> <img alt="" class="attachment-large size-large wp-image-29880" height="360" src="https://stormatics.tech/wp-content/uploads/2026/03/5-1024x576.webp" width="640" /> </a> </div> <div class="elementor-element elementor-element-c86a10d elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">What you gain:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">near-zero data loss for transactions protected by synchronous 
commit</span></li></ul><p><span style="font-weight: 400;">What you accept:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">added write latency</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">more explicit failure handling</span></li></ul><p><span style="font-weight: 400;">An important part of the policy:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">When the synchronous standby is unavailable, do you prefer continued writes (async mode) or do you prefer waiting until sync returns?</span></li></ul><p><span style="font-weight: 400;">Teams that decide this up front operate Layer 4 calmly. Teams that leave it implicit tend to discover their “real” policy during an incident.</span></p><h2><span style="font-weight: 400;">Layer 5 &#8211; Add a warm standby in Site B</span></h2><p><span style="font-weight: 400;">Layer 5 is where you treat a second site as a true recovery location, adding regional redundancy. </span></p><p><span style="font-weight: 400;">You keep your HA setup in Site A and maintain a warm standby in Site B, fed by backups and WAL archives that are continuously applied to the standby node.</span></p> </div> <div class="elementor-element elementor-element-9c9fe8d elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/improving-postgres-performance-with-partitioning"> <img alt="" class="attachment-large size-large wp-image-29881" height="360" src="https://stormatics.tech/wp-content/uploads/2026/03/6-1024x576.webp" width="640" /> </a> </div> <div class="elementor-element elementor-element-4b719bc elementor-widget elementor-widget-text-editor"> <p><span style="font-weight: 400;">What you gain:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">a cleaner plan for site-level outages</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">a faster recovery path to Site B, reducing RTO</span></li></ul><p><span 
style="font-weight: 400;">This layer also forces a useful reality check: DR is not only a database design problem. You also want:</span></p><ul><li style="font-weight: 400;"><span style="font-weight: 400;">routing (DNS/LB) that can switch cleanly</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">application configuration that supports failover</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">secrets and access that work in the DR site</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">rehearsed runbooks</span></li></ul><p><span style="font-weight: 400;">When those pieces are ready, Layer 5 feels like a controlled switchover instead of a high-stress scramble.</span></p><h2><span style="font-weight: 400;">Common gotchas that show up in production</span></h2><p><span style="font-weight: 400;">These are the ones I see repeatedly:</span></p><ol><li style="font-weight: 400;"><span style="font-weight: 400;">Backups exist; restore is untested. At best, this is Schrödinger’s backup: you will only know whether it works when there is an outage. </span></li><li style="font-weight: 400;"><span style="font-weight: 400;">WAL archiving is configured but not monitored. You want to make sure the consumer is consuming the files, so they don’t pile up on the producer. </span></li><li style="font-weight: 400;"><span style="font-weight: 400;">Replication slots retain WAL longer than expected. This needs to be monitored, and you need to ask ‘why’. </span></li><li style="font-weight: 400;"><span style="font-weight: 400;">Synchronous replication without a clear failure policy. Write the rule down, test it, and make it visible to the on-call team.</span></li><li style="font-weight: 400;"><span style="font-weight: 400;">Read traffic routed to standbys without thinking about staleness. 
Replica reads are great when you choose the right queries and accept the consistency model.</span></li></ol> </div> </div> </div> <div class="elementor-element elementor-element-5d9367e e-flex e-con-boxed e-con e-parent"> <div class="e-con-inner"> <div class="elementor-element elementor-element-fbc4296 elementor-widget elementor-widget-image"> <a href="https://resources.stormatics.tech/postgresql-training-one-day-course"> <img alt="" class="attachment-large size-large wp-image-28541" height="360" src="https://stormatics.tech/wp-content/uploads/2025/04/Training-1-1024x576.webp" width="640" /> </a> </div> </div> </div> </div> <p>The post <a href="https://stormatics.tech/blogs/thinking-of-postgresql-high-availability-as-layers">Thinking of PostgreSQL High Availability as Layers</a> appeared first on <a href="https://stormatics.tech">Stormatics</a>.</p>Mon, 09 Mar 2026 14:03:16 +0000https://postgr.es/p/7uyCornelia Biacsics: Contributions for week 9, 2026https://postgr.es/p/7ux<p>The community met on Wednesday, March 4, 2026 for the <a href="https://www.meetup.com/postgresql-user-group-nrw/events/313229102/">7. PostgreSQL User Group NRW MeetUp (Cologne, ORDIX AG)</a>. It was organised by Dirk Krautschick and Andreas Baier. </p> <p>Speakers: </p> <ul> <li>Robin Riel</li> <li>Jan Karremans</li> </ul> <p><a href="https://www.meetup.com/postgresql-meetup-berlin/events/313412510/">PostgreSQL Berlin March 2026 Meetup</a> took place on March 5, 2026 organized by Andreas Scherbaum and Sergey Dudoladov. 
</p> <p>Speakers: </p> <ul> <li>Andreas Scherbaum</li> <li>Tudor Golubenco</li> <li>Narendra Tawar</li> <li>Kai Wagner</li> </ul> <p>Kai Wagner wrote about his experience at the meetup <a href="https://www.linkedin.com/pulse/postgresql-berlin-meetup-march-2026-kai-wagner-dvwqf/">PostgreSQL Berlin Meetup - March 2026 </a></p> <p>Andreas Scherbaum <a href="https://andreas.scherbaum.la/post/2026-03-06_postgresql-berlin-march-2026-meetup/">wrote a blog posting about the Meetup</a>.</p> <p><a href="https://www.socallinuxexpo.org/scale/23x">SCALE 23x</a> (March 5-8, 2026) had a dedicated PostgreSQL track, filled by the following contributions</p> <p>Trainings: </p> <ul> <li>Elizabeth Christensen</li> <li>Devrim Gunduz</li> <li>Ryan Booz</li> </ul> <p>Talks: </p> <ul> <li>Nick Meyer</li> <li>Tristan Ahmadi</li> <li>Alexandra Wang</li> <li>Christophe Pettus</li> <li>Max Englander</li> <li>Magnus Hagander</li> <li>Bruce Momjian</li> <li>Robert Treat</li> <li>Payal Singh</li> <li>German Eichberger</li> <li>Jimmy Angelakos</li> <li>Justin Frye</li> </ul> <p>SCALE 23x PostgreSQL Booth volunteers: </p> <ul> <li>Bruce Momjian </li> <li>Christine Momjian</li> <li>Gabrielle Roth</li> <li>Jennifer Scheuerell</li> <li>Magnus Hagander</li> <li>Devrim Gunduz</li> <li>Elizabeth Garret Christensen</li> <li>Robert Treat</li> <li>Pavlo Golub</li> <li>Phill Vacca</li> <li>Jimmy Angelakos</li> <li>Erika Miller</li> <li>Aya Griswold</li> <li>Alex Wood</li> <li>Donald Wong </li> <li>Derya Gumustel</li> </ul>Mon, 09 Mar 2026 10:31:43 +0000https://postgr.es/p/7uxDave Page: AI Features in pgAdmin: Configuration and Reportshttps://postgr.es/p/7uz<p>This is the first in a series of three blog posts covering the new AI functionality coming in <a href="https://www.pgadmin.org/"><u>pgAdmin 4</u></a>. 
In this post, I'll walk through how to configure the LLM integration and introduce the AI-powered analysis reports; in the second, I'll cover the AI Chat agent in the query tool; and in the third, I'll explore the AI Insights feature for EXPLAIN plan analysis. Anyone who manages PostgreSQL databases in a professional capacity knows that keeping on top of security, performance, and schema design is an ongoing endeavour. You might have a checklist of things to review, or perhaps you rely on experience and intuition to spot potential issues, but it is all too easy for something to slip through the cracks, especially as databases grow in complexity. We've been thinking about how AI could help with this, and I'm pleased to introduce a suite of AI-powered features in pgAdmin 4 that bring large language model analysis directly into the tool you already use every day.<h2>Configuring the LLM Integration</h2>Before any of the AI features can be used, you'll need to configure an LLM provider. pgAdmin supports four providers out of the box, giving you flexibility to choose between cloud-hosted models and locally-running alternatives:<ul><li>Anthropic (Claude models)</li><li>OpenAI (GPT models)</li><li>Ollama (locally-hosted open-source models)</li><li>Docker Model Runner (built into Docker Desktop 4.40 and later)</li></ul><h3>Server Configuration</h3>At the server level, there is a master switch, <code>LLM_ENABLED</code>, in the pgAdmin configuration file (or, more typically, in a local override of it) that controls whether AI features are available at all. When <code>LLM_ENABLED</code> is set to <code>False</code>, all AI functionality is hidden from users and cannot be enabled through preferences. 
This gives administrators full control over whether AI features are permitted in their environment, which is particularly important in organisations with strict data governance policies. Below the master switch, you'll find default configuration for each provider. For the cloud providers (Anthropic and OpenAI), API keys are read from files on disk rather than being stored directly in the configuration, which is a deliberate security choice. The key file should contain nothing but the API key itself, with no additional whitespace or formatting. For Ollama and Docker Model Runner, you simply provide the API URL for the local service.<h3>User Preferences</h3>Whilst the server configuration sets the defaults and boundaries, individual users can customise their AI settings through the Preferences dialog under the 'AI' section. The preferences are organised into categories. AI Configuration contains the general settings:<ul><li>Default Provider: Users can select their preferred provider from a dropdown, or choose 'None (Disabled)' to turn off AI features for their account. This setting only takes effect if <code>LLM_ENABLED</code> is <code>True</code> in the server configuration.</li><li>Max Tool Iterations: Controls how many tool call rounds the AI is allowed to perform during a single conversation, with a default of 20. Higher values allow more complex analyses but consume more resources.</li></ul>Each provider has its own category with provider-specific settings:<ul><li>Anthropic: API Key File path and Model selection</li><li>OpenAI: API Key File path and Model selection</li><li>Ollama: API URL and Model selection</li><li>Docker Model Runner: API URL and Model selection</li></ul>One particularly nice touch is that the model selection dropdowns are populated dynamically. 
When you configure an API key or URL and click the refresh button, pgAdmin queries the provider's API to fetch the list of available models. For Ollama, it even shows the model sizes so you can see at a glance how much disk space each model is using. The model selectors also support typing in custom model names, so you're not limited to whatever the API returns; if you know the exact model identifier you want to use, you can simply type it in.<img src="https://a.storyblok.com/f/187930/950x378/9b5be90313/picture2.png" /><h2>AI Analysis Reports</h2>With the LLM configured, you gain access to three types of AI-powered analysis reports that can be generated from the browser tree context menu. Simply right-click on a server, database, or schema and select the appropriate report from the 'AI Analysis' submenu.<h3>Security Reports</h3>The security report examines your PostgreSQL configuration from a security perspective, covering a comprehensive range of areas:<ul><li>Authentication Configuration: Password policies, SSL/TLS settings, authentication methods, and connection security</li><li>Access Control and Roles: Superuser accounts, privileged roles, login roles without password expiry, and role privilege assignments</li><li>Network Security: Listen addresses, connection limits, and <code>pg_hba.conf</code> rules</li><li>Encryption and SSL: SSL/TLS configuration, password encryption methods, and data-at-rest encryption settings</li><li>Object Permissions: Schema, table, and function access control lists, default privileges, and ownership (at database scope)</li><li>Row-Level Security: RLS policies, RLS-enabled tables, and policy coverage analysis</li><li>Security Definer Functions: Functions running with elevated privileges and their permission settings</li><li>Audit and Logging: Connection logging, statement logging, error logging, 
and audit trail configuration</li><li>Extensions: Installed extensions and their security implications</li></ul>Security reports can be generated at the server level (covering server-wide configuration such as authentication and network settings), the database level (adding object permissions and RLS analysis), or the schema level (focusing on a specific schema's security posture).<img src="https://a.storyblok.com/f/187930/950x934/8c43fbb4aa/picture3.png" /><h3>Performance Reports</h3>The performance report analyses your server and database configuration for potential optimisation opportunities:<ul><li>Memory Configuration: <code>shared_buffers</code>, <code>work_mem</code>, <code>effective_cache_size</code>, <code>maintenance_work_mem</code>, and related settings</li><li>Checkpoint and WAL: Checkpoint settings, WAL configuration, and background writer statistics</li><li>Autovacuum Configuration: Autovacuum settings, tables needing vacuum, and dead tuple accumulation</li><li>Query Planner Settings: Cost parameters, statistics targets, JIT compilation, and planner optimisation settings</li><li>Parallelism and Workers: Parallel query configuration and worker process settings</li><li>Connection Management: Maximum connections, reserved connections, timeouts, and current connection status</li><li>Cache Efficiency: Buffer cache hit ratios, database-level cache statistics, and table-level I/O patterns</li><li>Index Analysis: Index utilisation, unused indexes, tables that might benefit from additional indexes, and index size analysis</li><li>Query Performance: Slowest queries and most frequent queries (when <code>pg_stat_statements</code> is available)</li><li>Replication Status: Replication lag, standby status, and WAL sender statistics</li></ul>Performance 
reports are available at both the server and database levels, with database-level reports including additional detail on index usage and cache efficiency for that specific database.<h3>Schema Design Reports</h3>The design review report examines your database schema for structural quality and best practices:<ul><li>Table Structure: Table definitions, column counts, sizes, ownership, and documentation coverage</li><li>Primary Key Analysis: Primary key design and tables lacking primary keys</li><li>Referential Integrity: Foreign key relationships, orphan references, and relationship coverage</li><li>Index Strategy: Index definitions, duplicate indexes, index types, and coverage analysis</li><li>Constraints: Check constraints, unique constraints, and data validation coverage</li><li>Normalisation Analysis: Repeated column patterns, potential denormalisation issues, and data redundancy</li><li>Naming Conventions: Table and column naming patterns, consistency analysis, and naming standard compliance</li><li>Data Type Review: Data type usage patterns, type consistency, and type appropriateness</li></ul>Design reports are available at the database and schema levels, allowing you to review either an entire database's schema design or focus on a specific schema.<h2>How the Reports Work</h2>Under the hood, the report generation follows a sophisticated multi-stage pipeline that keeps each LLM interaction within manageable token limits whilst still producing comprehensive output:<ul><li>Planning: The LLM first reviews the available analysis sections and the database context (server version, table count, available extensions, and so on), then selects which sections are most relevant to analyse. 
This means the report is tailored to your specific environment rather than running every possible check regardless of applicability.</li><li>Data Gathering: For each selected section, pgAdmin executes a set of SQL queries against the database to collect the relevant configuration data, statistics, and metadata.</li><li>Section Analysis: Each section's data is sent to the LLM independently for analysis. The LLM classifies findings by severity (Critical, Warning, Advisory, or Good) and provides specific, actionable recommendations, including SQL commands where relevant.</li><li>Synthesis: Finally, the individual section analyses are combined into a cohesive report with an executive summary, a critical issues section aggregating the most important findings, the detailed section analyses, and a prioritised list of recommendations.</li></ul>As the pipeline works through these stages, the UI shows real-time progress updates: the current stage name (Planning Analysis, Gathering Data, Analysing Sections, Creating Report), a description of what's being processed (for example, 'Analysing Memory Configuration...'), and a progress bar showing how many sections have been completed out of the total. Once all four stages are finished, the completed report is rendered in the panel in one go. Each report can also be downloaded as a Markdown file for archiving or sharing with colleagues. The reports are designed to be genuinely useful rather than generic. Because the LLM receives actual data from your database (configuration settings, role definitions, table statistics, and index information), its analysis is grounded in reality. 
A security report will flag your specific  rules that might be overly permissive, a performance report will identify your specific tables that are missing useful indexes, and a design report will point out your specific naming inconsistencies.<h2>A Note on Privacy and Data</h2>It is worth noting that when using cloud-hosted LLM providers (Anthropic or OpenAI), the database metadata and configuration data gathered for reports is sent to those providers' APIs. No actual table data is sent for the reports (only metadata, configuration settings, and statistics), but administrators should be aware of this and ensure it aligns with their organisation's data handling policies. For environments where sending any data externally is not acceptable, the Ollama and Docker Model Runner options allow you to run models entirely locally.<h2>Getting Started</h2>If you'd like to try the AI features, the quickest way to get started is to configure an API key for either Anthropic or OpenAI, set the default provider in Preferences, and then right-click on a server in the browser tree to generate your first report. If you prefer to keep everything local, installing Ollama and pulling a model such as  is straightforward, and Docker Desktop users on version 4.40 or later can enable the built-in model runner without any additional setup.In the next post, I'll cover the AI Chat agent in the query tool, which brings natural language to SQL translation directly into your workflow, along with database-aware conversational assistance. Stay tuned.</p>Mon, 09 Mar 2026 05:31:29 +0000https://postgr.es/p/7uzRadim Marek: Production Query Plans Without Production Datahttps://postgr.es/p/7uw<p>In the <a href="https://boringsql.com/posts/postgresql-statistics/">previous article</a> we covered how the PostgreSQL planner reads <code>pg_class</code> and <code>pg_statistic</code> to estimate row counts, choose join strategies, and decide whether an index scan is worth it. 
The message was clear: when statistics are wrong, everything else goes with it.</p> <div class="sidenote">Streaming replication provides bit-for-bit replication, so all replicas share the same statistics as the primary server.</div> <p>But there was one thing we didn't talk about. Statistics are specific to the database cluster that generated them. The primary way to populate them is <code>ANALYZE</code>, which requires the actual data.</p> <p>PostgreSQL 18 changed that. Two new functions, <code>pg_restore_relation_stats</code> and <code>pg_restore_attribute_stats</code>, write numbers directly into the catalog tables. Combined with <code>pg_dump --statistics-only</code>, they let you treat optimizer statistics as a deployable artifact: compact, portable, plain SQL.</p> <p>The feature was <a href="https://www.cybertec-postgresql.com/en/preserve-optimizer-statistics-during-major-upgrades-with-postgresql-v18/" rel="external">driven by the upgrade use case</a>. In the past, major version upgrades left <code>pg_statistic</code> empty, forcing you to run <code>ANALYZE</code>, which might take hours on large clusters. With PostgreSQL 18, upgrades now transfer statistics automatically. But that's just the beginning. The same logic lets you export statistics from production and inject them anywhere: a test database, a local debugging session, or a CI pipeline.</p> <h2 id="the-problem">The problem<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#the-problem"></a> </h2> <p>Your CI database has 1,000 rows. Production has 50 million. The planner makes completely different decisions for each. Running <code>EXPLAIN</code> in CI tells you nothing about the production plan. This is the core premise behind <a href="https://boringsql.com/products/regresql">RegreSQL</a>. Catching query plan regressions in CI is far more reliable when the planner sees production-scale statistics.</p> <p>The same applies to <strong>debugging</strong>. 
A query is slow in production and you want to reproduce the plan locally, but your local database has different statistics, so the planner chooses the plan that suits the small local data set. Porting production statistics gives you a snapshot of the reasoning the planner does in production, without actually touching production.</p> <h2 id="pg-restore-relation-stats">pg_restore_relation_stats<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#pg-restore-relation-stats"></a> </h2> <p>The first function behind portable PostgreSQL statistics is <code>pg_restore_relation_stats</code>. It takes variadic name/value pairs and writes table-level statistics directly into <code>pg_class</code>.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">SELECT</span><span> pg_restore_relation_stats(</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'schemaname'</span><span>, </span><span style="color: #9ECBFF;">'public'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relname'</span><span>, </span><span style="color: #9ECBFF;">'orders'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relpages'</span><span>, </span><span style="color: #79B8FF;">123513</span><span>::</span><span style="color: #F97583;">integer</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'reltuples'</span><span>, </span><span style="color: #79B8FF;">50000000</span><span>::</span><span style="color: #F97583;">real</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relallvisible'</span><span>, </span><span style="color: #79B8FF;">123513</span><span>::</span><span style="color: #F97583;">integer</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relallfrozen'</span><span>, </span><span style="color: #79B8FF;">120000</span><span>::</span><span style="color: 
#F97583;">integer</span></span> <span class="giallo-l"><span>);</span></span></code></pre> <p>But that's just an example. Let's modify some real statistics to see the full value. We will create a small table, inject fake production-like statistics, and watch the planner change its mind.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">CREATE TABLE</span><span style="color: #B392F0;"> test_orders</span><span> (</span></span> <span class="giallo-l"><span> id </span><span style="color: #F97583;">integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY</span><span>,</span></span> <span class="giallo-l"><span> customer_id </span><span style="color: #F97583;">integer NOT NULL</span><span>,</span></span> <span class="giallo-l"><span> amount </span><span style="color: #F97583;">numeric</span><span>(</span><span style="color: #79B8FF;">10</span><span>,</span><span style="color: #79B8FF;">2</span><span>)</span><span style="color: #F97583;"> NOT NULL</span><span>,</span></span> <span class="giallo-l"><span style="color: #F97583;"> status text NOT NULL DEFAULT</span><span style="color: #9ECBFF;"> 'pending'</span><span>,</span></span> <span class="giallo-l"><span> created_at </span><span style="color: #F97583;">date NOT NULL DEFAULT</span><span> CURRENT_DATE</span></span> <span class="giallo-l"><span>);</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #F97583;">INSERT INTO</span><span> test_orders (customer_id, amount, </span><span style="color: #F97583;">status</span><span>, created_at)</span></span> <span class="giallo-l"><span style="color: #F97583;">SELECT</span></span> <span class="giallo-l"><span> (random()</span><span style="color: #F97583;"> *</span><span style="color: #79B8FF;"> 9999</span><span style="color: #F97583;"> +</span><span style="color: #79B8FF;"> 1</span><span>)::</span><span style="color: #F97583;">int</span><span>,</span></span> <span 
class="giallo-l"><span> (random()</span><span style="color: #F97583;"> *</span><span style="color: #79B8FF;"> 5000</span><span style="color: #F97583;"> +</span><span style="color: #79B8FF;"> 5</span><span>)::</span><span style="color: #F97583;">numeric</span><span>(</span><span style="color: #79B8FF;">10</span><span>,</span><span style="color: #79B8FF;">2</span><span>),</span></span> <span class="giallo-l"><span> (</span><span style="color: #F97583;">ARRAY</span><span>['pending','shipped','delivered','cancelled'])[floor(random()*4+1)::int],</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> '2024-01-01'</span><span>::</span><span style="color: #F97583;">date +</span><span> (random()</span><span style="color: #F97583;"> *</span><span style="color: #79B8FF;"> 365</span><span>)::</span><span style="color: #F97583;">int</span></span> <span class="giallo-l"><span style="color: #F97583;">FROM</span><span style="color: #79B8FF;"> generate_series</span><span>(</span><span style="color: #79B8FF;">1</span><span>, </span><span style="color: #79B8FF;">10000</span><span>);</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #F97583;">CREATE INDEX</span><span style="color: #B392F0;"> ON</span><span> test_orders (created_at);</span></span> <span class="giallo-l"><span style="color: #F97583;">CREATE INDEX</span><span style="color: #B392F0;"> ON</span><span> test_orders (</span><span style="color: #F97583;">status</span><span>);</span></span> <span class="giallo-l"><span>ANALYZE test_orders;</span></span></code></pre> <p>When you check the current statistics, it has predictable data.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">SELECT</span><span> relname, relpages, reltuples</span></span> <span class="giallo-l"><span style="color: #F97583;">FROM</span><span> pg_class </span><span style="color: #F97583;">WHERE</span><span> relname 
</span><span style="color: #F97583;">=</span><span style="color: #9ECBFF;"> 'test_orders'</span><span>;</span></span></code></pre><pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span> relname | relpages | reltuples</span></span> <span class="giallo-l"><span>-------------+----------+-----------</span></span> <span class="giallo-l"><span> test_orders | 74 | 10000</span></span> <span class="giallo-l"><span>(1 row)</span></span></code></pre> <p>With 10,000 rows across 74 pages, the planner picks a sequential scan.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span>EXPLAIN </span><span style="color: #F97583;">SELECT * FROM</span><span> test_orders </span><span style="color: #F97583;">WHERE</span><span> created_at </span><span style="color: #F97583;">&gt;</span><span style="color: #9ECBFF;"> '2024-06-01'</span><span>;</span></span></code></pre><pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span> QUERY PLAN</span></span> <span class="giallo-l"><span>-----------------------------------------------------------------</span></span> <span class="giallo-l"><span> Seq Scan on test_orders (cost=0.00..199.00 rows=5891 width=26)</span></span> <span class="giallo-l"><span> Filter: (created_at &gt; '2024-06-01'::date)</span></span> <span class="giallo-l"><span>(2 rows)</span></span></code></pre> <p>Now inject production-scale table stats:</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">SELECT</span><span> pg_restore_relation_stats(</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'schemaname'</span><span>, </span><span style="color: #9ECBFF;">'public'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relname'</span><span>, </span><span style="color: 
#9ECBFF;">'test_orders'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relpages'</span><span>, </span><span style="color: #79B8FF;">123513</span><span>::</span><span style="color: #F97583;">integer</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'reltuples'</span><span>, </span><span style="color: #79B8FF;">50000000</span><span>::</span><span style="color: #F97583;">real</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relallvisible'</span><span>, </span><span style="color: #79B8FF;">123513</span><span>::</span><span style="color: #F97583;">integer</span></span> <span class="giallo-l"><span>);</span></span></code></pre> <p>And you might be surprised by the result.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span>EXPLAIN </span><span style="color: #F97583;">SELECT * FROM</span><span> test_orders </span><span style="color: #F97583;">WHERE</span><span> created_at </span><span style="color: #F97583;">&gt;</span><span style="color: #9ECBFF;"> '2024-06-01'</span><span>;</span></span></code></pre><pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span> QUERY PLAN</span></span> <span class="giallo-l"><span>------------------------------------------------------------------</span></span> <span class="giallo-l"><span> Seq Scan on test_orders (cost=0.00..448.45 rows=17649 width=26)</span></span> <span class="giallo-l"><span> Filter: (created_at &gt; '2024-06-01'::date)</span></span></code></pre> <p>The planner is still using a sequential scan. Only the estimated number of rows has changed. Why? As covered in the previous article, this is where column-level statistics come into play. 
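</p> <p>A quick back-of-the-envelope check makes the new estimate less mysterious. The sketch below is plain Python, not anything PostgreSQL runs. It assumes the planner derives a tuple density from the injected <code>reltuples</code> and <code>relpages</code>, multiplies it by the 74 pages the table really occupies, and applies the same selectivity as before. The helper names are made up for illustration; the figures come from the plans above.</p>

```python
# Figures copied from the examples above; helper names are illustrative only.
INJECTED_RELTUPLES = 50_000_000
INJECTED_RELPAGES = 123_513
PAGES_ON_DISK = 74  # relpages reported before the injection


def scaled_tuple_count(reltuples: float, relpages: int, pages_on_disk: int) -> float:
    """Tuple density from the catalog multiplied by the pages actually on disk."""
    density = reltuples / relpages
    return density * pages_on_disk


def estimated_rows(selectivity: float, tuple_count: float) -> int:
    """Row estimate for a filter with the given selectivity."""
    return round(selectivity * tuple_count)


# Selectivity implied by the first plan: 5891 of 10000 rows.
selectivity = 5891 / 10_000

tuples = scaled_tuple_count(INJECTED_RELTUPLES, INJECTED_RELPAGES, PAGES_ON_DISK)
rows = estimated_rows(selectivity, tuples)

print(f"assumed tuples on disk: {tuples:.0f}")
print(f"estimated rows:         {rows}")
```

<p>Under these assumptions the result lands within a few rows of the 17,649 in the plan: the scale changed, the selectivity did not.</p> <p>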
The histogram bounds still describe the original 10,000 rows we inserted.</p> <h2 id="pg-restore-attribute-stats">pg_restore_attribute_stats<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#pg-restore-attribute-stats"></a> </h2> <p>This function writes column-level statistics into <code>pg_statistic</code>, the same catalog that <a href="https://boringsql.com/posts/postgresql-statistics/#how-analyze-works">ANALYZE populates</a> with <a href="https://boringsql.com/posts/postgresql-statistics/#pg_statistic-via-pg_stats---column-level-stats">MCVs, histograms, and correlation</a>.</p> <p>In the previous section, we left the planner stuck on a sequential scan despite it believing the table has 50 million rows. The missing piece is column-level statistics. Let's pick up where we left off and inject histogram bounds for <code>created_at</code>.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">SELECT</span><span> pg_restore_attribute_stats(</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'schemaname'</span><span>, </span><span style="color: #9ECBFF;">'public'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relname'</span><span>, </span><span style="color: #9ECBFF;">'test_orders'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'attname'</span><span>, </span><span style="color: #9ECBFF;">'created_at'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'inherited'</span><span>, false::</span><span style="color: #F97583;">boolean</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'null_frac'</span><span>, </span><span style="color: #79B8FF;">0</span><span>.</span><span style="color: #79B8FF;">0</span><span>::</span><span style="color: #F97583;">real</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 
'avg_width'</span><span>, </span><span style="color: #79B8FF;">4</span><span>::</span><span style="color: #F97583;">integer</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'n_distinct'</span><span>, </span><span style="color: #F97583;">-</span><span style="color: #79B8FF;">0</span><span>.</span><span style="color: #79B8FF;">05</span><span>::</span><span style="color: #F97583;">real</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'histogram_bounds'</span><span>, </span><span style="color: #9ECBFF;">'{2019-01-01,2019-07-01,2020-01-01,2020-07-01,2021-01-01,2021-07-01,2022-01-01,2022-07-01,2023-01-01,2023-07-01,2024-01-01}'</span><span>::</span><span style="color: #F97583;">text</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'correlation'</span><span>, </span><span style="color: #79B8FF;">0</span><span>.</span><span style="color: #79B8FF;">98</span><span>::</span><span style="color: #F97583;">real</span></span> <span class="giallo-l"><span>);</span></span></code></pre> <p>Now the planner knows the data spans 5 years. 
A query filtering on the last 6 months of 2024 covers a narrow slice.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span>EXPLAIN SELECT * FROM test_orders WHERE created_at &gt; '2024-06-01';</span></span></code></pre><pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span> QUERY PLAN</span></span> <span class="giallo-l"><span>----------------------------------------------------------------------------------------------------</span></span> <span class="giallo-l"><span> Index Scan using test_orders_created_at_idx on test_orders (cost=0.29..153.21 rows=6340 width=26)</span></span> <span class="giallo-l"><span> Index Cond: (created_at &gt; '2024-06-01'::date)</span></span></code></pre><div class="sidenote"> Histogram bounds divide the non-MCV portion of the data into equal-population buckets. If <code>most_common_vals</code> accounts for most of the data, the histogram covers only the remaining tail. The number of buckets is controlled by <code>default_statistics_target</code> (default 100, meaning 101 bounds). </div> <p>And that's a plan flip! The histogram tells the planner the data spans 2019–2024, so <code>&gt; '2024-06-01'</code> matches a narrow tail. A small fraction of 50 million rows. The index scan that was ignored before is now the obvious choice. Table-level stats set the scale, column-level stats shaped the selectivity, and together they changed the plan.</p> <div class="callout"> The <code>correlation</code> statistic tells the planner how closely the physical row order matches the column's sort order. A value near 1.0 means sequential access patterns - making <a href="https://boringsql.com/posts/postgresql-statistics/#correlation-and-index-scan-cost">index scan cheaper</a> because the next row is likely on the same or adjacent page. 
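<p>To make the idea concrete, here is a small Python sketch (not PostgreSQL's implementation) that measures correlation as the Pearson coefficient between each row's physical position and the sort rank of its value. The helper names and the toy data are assumptions for illustration.</p>

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def physical_correlation(values):
    """Correlate physical position (list index) with each value's sort rank."""
    positions = list(range(len(values)))
    by_value = sorted(positions, key=lambda pos: values[pos])
    ranks = [0] * len(values)
    for rank, pos in enumerate(by_value):
        ranks[pos] = rank
    return pearson(positions, ranks)


# Toy "created_at" values: day offsets stored in insertion order.
chronological = list(range(1000))

# The same values stored in random physical order.
shuffled = chronological[:]
random.Random(42).shuffle(shuffled)

print(f"chronological: {physical_correlation(chronological):.2f}")
print(f"shuffled:      {physical_correlation(shuffled):.2f}")
```

<p>Rows stored in insertion order score 1.0, while the same values shuffled on disk score close to 0, which is the case where an index scan has to jump between pages.</p>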
For time-series data like <code>created_at</code> where rows are inserted chronologically, correlation is typically very high. </div> <h2 id="injecting-a-skewed-distribution">Injecting a skewed distribution<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#injecting-a-skewed-distribution"></a> </h2> <p>The same function handles <a href="https://boringsql.com/posts/postgresql-statistics/#selectivity-in-action">MCV lists</a>. In production, your <code>status</code> column isn't uniform, 95% of orders are delivered, 1.5% are pending.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">SELECT</span><span> pg_restore_attribute_stats(</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'schemaname'</span><span>, </span><span style="color: #9ECBFF;">'public'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'relname'</span><span>, </span><span style="color: #9ECBFF;">'test_orders'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'attname'</span><span>, </span><span style="color: #9ECBFF;">'status'</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'inherited'</span><span>, false::</span><span style="color: #F97583;">boolean</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'null_frac'</span><span>, </span><span style="color: #79B8FF;">0</span><span>.</span><span style="color: #79B8FF;">0</span><span>::</span><span style="color: #F97583;">real</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'avg_width'</span><span>, </span><span style="color: #79B8FF;">9</span><span>::</span><span style="color: #F97583;">integer</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'n_distinct'</span><span>, </span><span style="color: #79B8FF;">5</span><span>::</span><span 
style="color: #F97583;">real</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'most_common_vals'</span><span>, </span><span style="color: #9ECBFF;">'{delivered,shipped,cancelled,pending,returned}'</span><span>::</span><span style="color: #F97583;">text</span><span>,</span></span> <span class="giallo-l"><span style="color: #9ECBFF;"> 'most_common_freqs'</span><span>, </span><span style="color: #9ECBFF;">'{0.95,0.015,0.015,0.015,0.005}'</span><span>::</span><span style="color: #F97583;">real</span><span>[]</span></span> <span class="giallo-l"><span>);</span></span></code></pre> <p>You can see</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span>EXPLAIN </span><span style="color: #F97583;">SELECT * FROM</span><span> test_orders </span><span style="color: #F97583;">WHERE status =</span><span style="color: #9ECBFF;"> 'pending'</span><span>;</span></span></code></pre><pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span> QUERY PLAN</span></span> <span class="giallo-l"><span>---------------------------------------------------------------------------------------</span></span> <span class="giallo-l"><span> Bitmap Heap Scan on test_orders (cost=8.93..90.42 rows=599 width=27)</span></span> <span class="giallo-l"><span> Recheck Cond: (status = 'pending'::text)</span></span> <span class="giallo-l"><span> -&gt; Bitmap Index Scan on test_orders_status_idx (cost=0.00..8.78 rows=599 width=0)</span></span> <span class="giallo-l"><span> Index Cond: (status = 'pending'::text)</span></span> <span class="giallo-l"><span>(4 rows)</span></span></code></pre> <p>and compare it with</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span>EXPLAIN </span><span style="color: #F97583;">SELECT * FROM</span><span> test_orders </span><span style="color: #F97583;">WHERE status =</span><span style="color: 
#9ECBFF;"> 'delivered'</span><span>;</span></span></code></pre><pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span> QUERY PLAN</span></span> <span class="giallo-l"><span>------------------------------------------------------------------</span></span> <span class="giallo-l"><span> Seq Scan on test_orders (cost=0.00..448.45 rows=28458 width=27)</span></span> <span class="giallo-l"><span> Filter: (status = 'delivered'::text)</span></span> <span class="giallo-l"><span>(2 rows)</span></span></code></pre> <p>Same column, same operator, different plans. The planner uses a bitmap index scan for <code>pending</code> (1.5%, rare enough to justify the index) and a sequential scan for <code>delivered</code> (95%, most of the table). The selectivity ratios from the MCV list drive the plan choice.</p> <div class="callout"> You might have noticed the row estimates (599 and 28,458) are lower than you'd expect for a 50-million-row table. The planner checks the actual physical file size: our table is only 74 pages on disk, not the 123,513 we injected, so the planner scales <code>reltuples</code> down proportionally. The absolute numbers shrink, but the <i>ratios</i> between them stay correct, and it's the ratios that determine plan shape. When you use <code>pg_dump --statistics-only</code> in practice, you're typically restoring into a database with comparable data volume, so the estimates align naturally. </div> <h2 id="pg-dump">pg_dump<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#pg-dump"></a> </h2> <p>The functions we covered are the mechanics. For operational use, <code>pg_dump</code> provides everything you need. 
PostgreSQL 18 added three flags.</p> <table><thead><tr><th>Flag</th><th>Effect</th></tr></thead><tbody> <tr><td><code>--statistics</code></td><td>dump the statistics (you have to request it explicitly)</td></tr> <tr><td><code>--statistics-only</code></td><td>dump only the statistics, not schema or data</td></tr> <tr><td><code>--no-statistics</code></td><td>do not dump statistics</td></tr> </tbody></table> <p>When you export the statistics for your production database</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span>pg_dump --statistics-only -d production_db &gt; stats.sql</span></span> <span class="giallo-l"></span></code></pre> <p>you will see that the output is a series of <code>SELECT pg_restore_relation_stats(...)</code> and <code>SELECT pg_restore_attribute_stats(...)</code> calls, exactly as we explained above.</p> <p>The full workflow to turn your production statistics into testable plans might look like this:</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #6A737D;"># 1. dump schema from production</span></span> <span class="giallo-l"><span style="color: #B392F0;">pg_dump</span><span style="color: #79B8FF;"> --schema-only -d</span><span style="color: #9ECBFF;"> production_db</span><span style="color: #F97583;"> &gt;</span><span style="color: #9ECBFF;"> schema.sql</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #6A737D;"># 2. dump statistics from production</span></span> <span class="giallo-l"><span style="color: #B392F0;">pg_dump</span><span style="color: #79B8FF;"> --statistics-only -d</span><span style="color: #9ECBFF;"> production_db</span><span style="color: #F97583;"> &gt;</span><span style="color: #9ECBFF;"> stats.sql</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #6A737D;"># 3. 
create test database with schema</span></span> <span class="giallo-l"><span style="color: #B392F0;">createdb</span><span style="color: #9ECBFF;"> test_db</span></span> <span class="giallo-l"><span style="color: #B392F0;">psql</span><span style="color: #79B8FF;"> -d</span><span style="color: #9ECBFF;"> test_db</span><span style="color: #79B8FF;"> -f</span><span style="color: #9ECBFF;"> schema.sql</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #6A737D;"># 4. load fixture data (optional; masked, minimal)</span></span> <span class="giallo-l"><span style="color: #B392F0;">psql</span><span style="color: #79B8FF;"> -d</span><span style="color: #9ECBFF;"> test_db</span><span style="color: #79B8FF;"> -f</span><span style="color: #9ECBFF;"> fixtures.sql</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #6A737D;"># 5. inject production statistics</span></span> <span class="giallo-l"><span style="color: #B392F0;">psql</span><span style="color: #79B8FF;"> -d</span><span style="color: #9ECBFF;"> test_db</span><span style="color: #79B8FF;"> -f</span><span style="color: #9ECBFF;"> stats.sql</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #6A737D;"># 6. query plans now match production</span></span> <span class="giallo-l"><span style="color: #B392F0;">psql</span><span style="color: #79B8FF;"> -d</span><span style="color: #9ECBFF;"> test_db</span><span style="color: #79B8FF;"> -c</span><span style="color: #9ECBFF;"> &quot;EXPLAIN SELECT * FROM test_orders WHERE status = 'pending'&quot;</span></span></code></pre><div class="callout"> Statistics dumps are tiny. A database with hundreds of tables and thousands of columns produces a statistics dump under 1MB. The production data might be hundreds of GB. The statistics that describe it fit in a text file. 
 </div> <h2 id="keeping-injected-statistics-alive">Keeping injected statistics alive<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#keeping-injected-statistics-alive"></a> </h2> <p>Now you might ask yourself: where's the catch? There's a big one. Autovacuum will eventually kick in and run <code>ANALYZE</code>, which will overwrite your injected statistics with real numbers, and you are back where you started.</p> <p>To prevent this, stop autovacuum from analyzing the tables you've injected.</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #6A737D;">-- disable autovacuum for this table</span></span> <span class="giallo-l"><span style="color: #F97583;">ALTER TABLE</span><span> test_orders </span><span style="color: #F97583;">SET</span><span> (autovacuum_enabled </span><span style="color: #F97583;">=</span><span> false);</span></span> <span class="giallo-l"></span> <span class="giallo-l"><span style="color: #6A737D;">-- or set the analyze threshold so high it never kicks in</span></span> <span class="giallo-l"><span style="color: #F97583;">ALTER TABLE</span><span> test_orders </span><span style="color: #F97583;">SET</span><span> (autovacuum_analyze_threshold </span><span style="color: #F97583;">=</span><span style="color: #79B8FF;"> 2147483647</span><span>);</span></span></code></pre><div class="callout"> <strong>Be careful here.</strong> <p> If you're also writing data to these tables in dev (running migrations, loading fixtures, testing inserts), the injected statistics will drift further from reality with every write. The planner will plan based on a production distribution that no longer reflects the local data. </p> <p>For read-only query plan testing this is exactly what you want. 
For integration tests that modify data, you may need to re-inject statistics after each test run.</p> <p>And please, never ever do this in production!</p> </div> <h2 id="what-s-not-covered">What's not covered?<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#what-s-not-covered"></a> </h2> <p>As we saw earlier, injecting <code>relpages</code> has limited effect, because the planner checks the actual file size and scales the row count proportionally. This caps the absolute row counts the planner will estimate: to get numbers comparable to production, you would still need a comparable data volume (which isn't a problem for the primary use case of this feature, restoring backups).</p> <p>It's also worth noting that extended statistics created with <code>CREATE STATISTICS</code> (<a href="https://boringsql.com/posts/postgresql-statistics/#extended-statistics">multivariate correlations, distinct counts across column groups and MCV lists for column combinations</a>) are not covered in PostgreSQL 18. Those still require <code>ANALYZE</code> after restore. PostgreSQL 19 will close this gap with <code>pg_restore_extended_stats()</code>.</p> <h2 id="security">Security<a class="zola-anchor" href="https://boringsql.com/posts/portable-stats/#security"></a> </h2> <p>The restore functions require the <code>MAINTAIN</code> privilege on the target table. 
This is the same privilege needed for <code>ANALYZE</code>, <code>VACUUM</code>, <code>REINDEX</code>, and <code>CLUSTER</code>, and it was <a href="https://boringsql.com/posts/postgresql-predefined-roles/">introduced in PostgreSQL 17</a>.</p> <p>The easiest way to grant it for automation:</p> <pre class="giallo" style="color: #E1E4E8; background-color: #24292E;"><code><span class="giallo-l"><span style="color: #F97583;">GRANT</span><span> pg_maintain </span><span style="color: #F97583;">TO</span><span> ci_service_account;</span></span></code></pre> <p>This grants <code>MAINTAIN</code> on all tables in the database, which is enough for a CI pipeline to inject statistics without needing superuser.</p>Sun, 08 Mar 2026 21:15:56 +0000https://postgr.es/p/7uwBruce Momjian: New Presentationhttps://postgr.es/p/7uu<p>I just gave a new presentation at <a class="txt2html" href="https://www.socallinuxexpo.org/scale/23x" style="text-decoration: underline dotted;">SCALE</a> titled <a class="major" href="https://momjian.us/main/presentations/administration.html#wal" style="text-decoration: underline;">The Wonderful World of WAL.</a> I am excited to have a <a class="txt2html" href="https://momjian.us/main/blogs/pgblog/2026.html#January_28_2026" style="text-decoration: underline dotted;">second</a> new talk this year. (I have <a class="txt2html" href="https://momjian.us/main/presentations/ai.html#mcp" style="text-decoration: underline dotted;">one more</a> queued up.) </p> <p>I have always wanted to do a presentation about the write-ahead log (WAL) but I was worried there was not enough content for a full talk. As more features were added to Postgres that relied on the WAL, the talk became more feasible, and at 103 slides, maybe I waited too long. <img src="https://momjian.us/main/img/blog/wink.png" /> </p> <p>I had a full hour to give the talk at SCALE, and that was helpful. 
I was able to answer many questions during the talk, and that was important — many of the later features rely on earlier ones, e.g., point-in-time recovery (PITR) relies heavily on crash recovery, and if you don't understand how crash recovery works, you can't understand PITR. By taking questions at the end of each section, I could be sure everyone understood. The questions showed that the audience of 46 understood the concepts because they were asking about the same issues we dealt with in designing the features: </p> <ul> <li>How does server start know if crash recovery is needed? </li><li>Can dirty shared buffers be written to storage before the WAL for the transaction that dirtied them is written? </li><li>Can the WAL and heap/index storage get out of sync? </li><li>How is the needed WAL accurately retained for replica servers? </li><li>Can logical replicas be used as failover servers? </li></ul> <p><a href="https://momjian.us/main/blogs/pgblog/2026.html#March_7_2026">Continue Reading &raquo;</a></p>Sat, 07 Mar 2026 18:45:01 +0000https://postgr.es/p/7uuGabriele Bartolini: From proposal to PR: how to contribute to the new CloudNativePG extensions projecthttps://postgr.es/p/7us<p><em>In this article I walk you through the journey of adding the <code>pg_crash</code> extension to the new CloudNativePG extensions project. It explores the transition from legacy standalone repositories to a unified, Dagger-powered build system designed for PostgreSQL 18 and beyond. By focusing on the <em>Image Volume</em> feature and minimal operand images, the post provides a step-by-step guide for community members to contribute and maintain their own extensions within the CloudNativePG ecosystem.</em></p>Sat, 07 Mar 2026 06:36:35 +0000https://postgr.es/p/7us