Beyond the Monolith: Architecting Database Sharding for Scale on Bare Metal & VPS

There is a specific type of dread that hits a SysAdmin at 03:00 CET. It's not a server crash—crashes are loud and obvious. It's the silence of a database lock queue piling up while the CPU sits at 100% on your largest available instance. You have maximized innodb_buffer_pool_size. You have upgraded to the most expensive hardware your provider offers. You have optimized every query.

And it is still slow.

Welcome to the vertical scaling wall. In the Nordic hosting market, where data sovereignty and speed are paramount, throwing more RAM at the problem eventually stops working. This is where we stop optimizing and start architecting. We need to talk about sharding.

The Brutal Physics of Horizontal Scaling

Sharding is not a feature; it is a complexity tax you pay for infinite scale. It involves splitting your dataset across multiple distinct database instances (shards) so that no single node handles the full load.

Before you even touch a config file, you must accept a hard truth: Network latency is your new bottleneck.

In a monolithic setup, a `JOIN` is a memory operation. Nanoseconds. In a sharded setup, a cross-shard `JOIN` is a network operation. Milliseconds. If your VPS provider routes traffic through a congested public switch instead of a dedicated private VLAN, your sharded architecture will be slower than the monolith you are trying to replace. This is why we built CoolVDS with a distinct focus on internal routing efficiency within our Oslo datacenter. When you split a query across four nodes, the round-trip time (RTT) between them must be negligible.

The Strategy: Hash vs. Range vs. Directory

Choosing a sharding key is a one-way door decision. Get it wrong, and you will spend weeks migrating data manually.

  • Key-Based (Hash) Sharding: You take a value (like user_id), hash it, and use the result to determine which server the data lives on. Great for uniform distribution. Terrible for range queries.
  • Range-Based Sharding: Data is split by ranges (e.g., created_at dates). Excellent for logging systems. Dangerous for hot spots (e.g., everyone writing to the "current month" shard).
  • Directory-Based Sharding: A lookup table tells the app where data lives. Flexible, but the lookup table becomes a single point of failure (SPOF). A sketch of such a table follows this list.
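
To make the directory approach concrete, here is a minimal sketch of what such a lookup table could look like in PostgreSQL. The table and column names (shard_directory, tenant_id, shard_host, shard_db) are illustrative, not part of any standard schema.

-- Hypothetical directory table: every query first asks it which shard owns the tenant.
CREATE TABLE shard_directory (
    tenant_id  bigint PRIMARY KEY,
    shard_host inet NOT NULL,
    shard_db   text NOT NULL
);

-- One extra lookup per request, and a SPOF if this table becomes unavailable.
SELECT shard_host, shard_db
FROM shard_directory
WHERE tenant_id = 1042;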

Implementation: PostgreSQL 16 Native Partitioning

Before jumping to full distributed sharding (like Citus or Vitess), use what the database engine gives you. PostgreSQL 16 refined declarative partitioning, which is often enough for datasets under 5TB.

Here is how you implement a hash-partitioned table structure that prepares you for physical sharding later. The examples assume a CoolVDS instance running Ubuntu 24.04 LTS.

1. Designing the Parent Table

-- The parent table doesn't hold data, it routes it.
CREATE TABLE traffic_logs (
    log_id     uuid NOT NULL DEFAULT gen_random_uuid(),
    server_id  int NOT NULL,
    bytes_sent bigint,
    log_time   timestamptz NOT NULL
) PARTITION BY HASH (server_id);

2. Creating Partitions

We create partitions that will eventually live on different physical disks (NVMe storage is critical here to prevent I/O wait during parallel scans).

-- Create 4 partitions. In a future migration, these can be moved to foreign tables.
CREATE TABLE traffic_logs_0 PARTITION OF traffic_logs FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE traffic_logs_1 PARTITION OF traffic_logs FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE traffic_logs_2 PARTITION OF traffic_logs FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE traffic_logs_3 PARTITION OF traffic_logs FOR VALUES WITH (MODULUS 4, REMAINDER 3);
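
A quick sanity check, assuming the tables above (the values are arbitrary): insert a row and confirm the planner prunes to a single partition.

-- Insert a sample row; PostgreSQL routes it by hash(server_id).
INSERT INTO traffic_logs (server_id, bytes_sent, log_time)
VALUES (42, 1048576, now());

-- The plan should show a scan of exactly one traffic_logs_N partition.
EXPLAIN SELECT * FROM traffic_logs WHERE server_id = 42;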

3. Tuning the Kernel for High Concurrency

Sharding increases the number of open file descriptors and connections. The default Linux networking stack is too conservative for a database node handling thousands of inter-shard connections.

On your CoolVDS instance, edit /etc/sysctl.conf:

# Allow more connections to queue up
net.core.somaxconn = 4096

# Reuse TIME-WAIT sockets for new connections (critical for internal shard chatter)
net.ipv4.tcp_tw_reuse = 1

# Increase port range for outgoing connections to other shards
net.ipv4.ip_local_port_range = 1024 65535

# Maximize file handles for Postgres
fs.file-max = 2097152

Apply with sysctl -p.
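
The kernel limits are only half of the picture; PostgreSQL enforces its own connection and file-handle ceilings. A minimal database-side sketch follows, with purely illustrative values that you should size to your own workload and RAM.

-- Illustrative values only; both settings require a PostgreSQL restart to take effect.
ALTER SYSTEM SET max_connections = 500;
ALTER SYSTEM SET max_files_per_process = 4096;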

Pro Tip: Never use standard HDD storage for sharded indexes. The random read/write patterns of a rebalancing shard will kill mechanical drive performance. CoolVDS standardizes on NVMe to ensure that when a shard splits, the I/O throughput is limited only by the network, not the disk.
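To check whether a rebalance is actually disk-bound rather than network-bound, PostgreSQL 16 exposes per-backend I/O statistics in the pg_stat_io view. The query below is only a starting point; enable track_io_timing if you also want the timing columns.

-- Which backends are generating the most reads/writes right now?
SELECT backend_type, object, context, reads, writes
FROM pg_stat_io
WHERE reads IS NOT NULL OR writes IS NOT NULL
ORDER BY COALESCE(reads, 0) + COALESCE(writes, 0) DESC
LIMIT 10;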

Moving to Physical Sharding with Foreign Data Wrappers (FDW)

When a single VPS can no longer hold all partitions, you move them to other servers. PostgreSQL's postgres_fdw allows the parent table to query partitions that exist on remote CoolVDS instances transparently.

This is where infrastructure location matters. If your Master node is in Oslo and Shard 1 is in Frankfurt, your latency is ~20ms. For a transaction requiring 10 round trips, that is 200ms of pure lag added to the user experience. By keeping all nodes within the CoolVDS ecosystem in Norway, you keep latency sub-millisecond.

Connecting a Remote Shard

-- Install the extension
CREATE EXTENSION postgres_fdw;

-- Define the remote server (Shard Node 2)
CREATE SERVER shard_node_2 
FOREIGN DATA WRAPPER postgres_fdw 
OPTIONS (host '10.10.0.5', dbname 'traffic_db', port '5432');

-- Create the user mapping
CREATE USER MAPPING FOR current_user 
SERVER shard_node_2 
OPTIONS (user 'db_user', password 'secure_password_here');

-- Import the schema
IMPORT FOREIGN SCHEMA public 
LIMIT TO (traffic_logs_1) 
FROM SERVER shard_node_2 
INTO public;
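
Importing the foreign table is not the last step; it still has to be attached to the parent. One possible sequence, sketched on the assumption that the local partition's rows have already been copied over to shard_node_2:

-- Detach the local partition and move it aside (keeps the old rows around).
ALTER TABLE traffic_logs DETACH PARTITION traffic_logs_1;
ALTER TABLE traffic_logs_1 RENAME TO traffic_logs_1_local;

-- Run the IMPORT FOREIGN SCHEMA statement above at this point, so that
-- traffic_logs_1 now refers to the foreign table living on shard_node_2.

-- Attach the remote table as the partition for the same hash slot.
ALTER TABLE traffic_logs ATTACH PARTITION traffic_logs_1
FOR VALUES WITH (MODULUS 4, REMAINDER 1);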

The Compliance Angle: Datatilsynet & Schrems II

Technical architecture does not exist in a vacuum. If you are operating in Norway or the EU, sharding introduces a compliance risk. Where does Shard #4 live? Is it replicating to a backup bucket in a non-adequate jurisdiction?

Under GDPR and the Schrems II ruling, you must know the physical location of every byte. Using abstract cloud providers often obfuscates this (