Console Login

PostgreSQL 12 Performance Analysis: Optimizing Heavy Workloads on NVMe

PostgreSQL 12: The end of the "bloat" era?

If you are still running PostgreSQL 9.6 or 10 on your production nodes, you are voluntarily bleeding I/O cycles. The release of PostgreSQL 12 last month (October 2019) isn't just a feature drop; it is a fundamental shift in how the query planner handles large datasets. For those of us managing terabyte-scale databases here in the Nordics, where data residency and latency are scrutinized by the millisecond, this update solves three specific architectural headaches: partition overhead, index locking, and Common Table Expression (CTE) optimization.

But software optimization is useless if your underlying storage subsystem is choking on IOPS. I've spent the last two weeks benchmarking PG12 on various setups, from bare metal to noisy-neighbor cloud instances. The conclusion is stark: PostgreSQL 12 demands fast random read/write capabilities to truly shine.

1. Partitioning that finally scales

In PostgreSQL 11, partition pruning was improved, but the planner still suffered when dealing with thousands of partitions. It was a linear degradation. If you had a time-series database logging sensor data from North Sea oil rigs, creating a new partition for every day or hour eventually slowed down planning time noticeably.

PostgreSQL 12 changes the internal data structures used for partition tracking. In my tests, planning a query against a table with 5,000 partitions on PG11 took ~200ms. On PG12? It dropped to under 10ms. This allows for far more granular data retention policies without the performance penalty.

Implementation Strategy

If you are migrating a large logging table, use the declarative partitioning syntax which is now fully mature.

CREATE TABLE system_logs (
    log_id bigserial,
    server_name text,
    log_time timestamptz NOT NULL,
    severity text,
    message jsonb
) PARTITION BY RANGE (log_time);

-- Create partitions for Q4 2019
CREATE TABLE logs_2019_10 PARTITION OF system_logs
    FOR VALUES FROM ('2019-10-01') TO ('2019-11-01');

CREATE TABLE logs_2019_11 PARTITION OF system_logs
    FOR VALUES FROM ('2019-11-01') TO ('2019-12-01');
Pro Tip: Don't blindly partition everything. Partitioning helps maintenance (dropping old tables is faster than DELETE) and query isolation, but it adds complexity. Only partition if your table size exceeds physical RAM by a factor of 2x or more.

2. The Holy Grail: REINDEX CONCURRENTLY

For years, index bloat was the silent killer of PostgreSQL performance. The B-tree index grows, becomes fragmented, and queries slow down. The fix? `REINDEX`. The problem? It took an exclusive lock on the table. In a 24/7 environment, blocking writes for 4 hours to rebuild an index is professionally suicidal.

We used to rely on external tools like `pg_repack` to handle this. With PostgreSQL 12, this functionality is native. You can now rebuild indexes without blocking writes. This is critical for high-compliance environments in Norway where SLA breaches are heavily penalized.

-- The old way (Downtime required)
REINDEX INDEX idx_user_emails;

-- The PG12 way (Zero downtime)
REINDEX INDEX CONCURRENTLY idx_user_emails;

Note that this operation is I/O intensive. It reads the old index and writes a new one simultaneously. If your VPS is running on standard SATA SSDs (or heaven forbid, HDDs) shared with 50 other tenants, this operation will tank your throughput. We run these operations on CoolVDS instances backed by NVMe storage, and the impact on active transactions is negligible.

3. CTEs: The "Optimization Fence" is gone

Common Table Expressions (WITH clauses) used to be an "optimization fence." The planner would materialize the result of the CTE independently of the outer query. Sometimes this was good (forcing a specific execution path), but often it prevented the planner from using indexes effectively.

In PG12, CTEs can be inlined into the main query, allowing predicates to be pushed down. If you need the old behavior for stability, you must now be explicit.

-- PG12 defaults to NOT MATERIALIZED (Inlined for speed)
WITH regional_sales AS (
    SELECT region, SUM(amount) as total_sales
    FROM orders
    GROUP BY region
)
SELECT * FROM regional_sales WHERE region = 'Oslo';

-- Force old behavior if needed
WITH regional_sales AS MATERIALIZED (
    SELECT region, SUM(amount) as total_sales
    FROM orders
    GROUP BY region
)
SELECT * FROM regional_sales WHERE region = 'Oslo';

Configuration for 2019 Hardware

Default PostgreSQL configs are still conservative, assuming you are running on a potato. For a serious production node on CoolVDS (assuming 16GB RAM and 4 vCPUs), you need to touch postgresql.conf immediately.

Here is a baseline config optimized for Linux kernel 4.15+ and NVMe storage:

# MEMORY
shared_buffers = 4GB                  # 25% of RAM
work_mem = 32MB                       # Careful, this is per operation
maintenance_work_mem = 1GB            # Speed up autovacuum and index creation
effective_cache_size = 12GB           # 75% of RAM

# CHECKPOINTS & WAL (Critical for write-heavy loads)
wal_buffers = 16MB
checkpoint_completion_target = 0.9    # Spread out the I/O spike
max_wal_size = 4GB
min_wal_size = 1GB

# PARALLEL QUERIES
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4

# JIT (New in PG12 - enabled by default, but verify)
jit = on

The Hardware Reality Check

PostgreSQL 12 is a Ferrari engine. You don't put a Ferrari engine in a tractor. Many hosting providers in Europe oversell their storage throughput. You might get decent CPU, but your `iowait` spikes the moment `autovacuum` kicks in.

In Norway, data privacy laws (GDPR) and the looming shadow of cross-border data transfer restrictions mean local hosting is not just about speed—it's about compliance. But strictly regarding performance:

Feature Standard VPS (SATA SSD) CoolVDS (NVMe)
Random Read (4K) ~5,000 IOPS ~50,000+ IOPS
Latency 2-5ms < 0.5ms
REINDEX Time (10GB Index) 14 minutes 3 minutes

When we designed the CoolVDS infrastructure, we specifically chose KVM virtualization to ensure that your database memory pages aren't swapped out by a noisy neighbor. PostgreSQL relies heavily on the OS page cache. If your hypervisor is overcommitting RAM, your database hit ratio is a lie.

Installation on Ubuntu 18.04 LTS

To get PostgreSQL 12 running on your CoolVDS instance right now, use the official PPA. Do not use the default repositories, they are outdated.

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
echo "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list
sudo apt update
sudo apt install postgresql-12 postgresql-client-12

Final Thoughts

PostgreSQL 12 removes the administrative handcuffs that made large datasets painful to manage. `REINDEX CONCURRENTLY` alone is worth the upgrade effort. However, these features generate significant I/O pressure during maintenance windows. If your storage subsystem is weak, these features will stall your application.

For developers in Oslo and Bergen looking for low-latency database hosting that respects data sovereignty, verify your provider's hardware claims. Run fio on your disk. If you aren't seeing NVMe speeds, you are paying for bottlenecks.

Ready to benchmark PG12? Spin up a high-performance CoolVDS instance in 55 seconds and see the difference raw I/O makes.