Scaling Persistence: Database Sharding Strategies for High-Throughput Systems
It starts with a slow query log. Then, the connection pool saturation alerts begin to fire during peak hours. You throw more RAM at the problem—upgrading from 32GB to 64GB, tweaking innodb_buffer_pool_size or shared_buffers—but eventually, you hit the wall. The wall where vertical scaling becomes financially ruinous or technically impossible.
With Black Friday 2022 just around the corner, relying on a single monolithic database writer is a gamble many CTOs in Oslo are losing sleep over. If you are serving traffic to the Nordic market, latency matters, and data sovereignty matters. When a single node can no longer handle the write throughput (IOPS) or the storage volume, you have to shard.
Sharding is not a silver bullet. It is complex operational overhead. But done right, on the right infrastructure, it is the only path to infinite scale.
The Latency Trap: Why Geography Matters
Before we touch the SQL, let's talk physics. Sharding splits your data across multiple nodes. If your application needs to join data from Shard A and Shard B, that operation is network-bound.
In a distributed setup, network latency is the new disk I/O. If you host your shards on budget providers with oversold network uplinks, your distributed queries will hang. This is critical for Norwegian businesses dealing with GDPR and Schrems II compliance. You cannot simply scatter shards across US cloud providers without attracting the attention of Datatilsynet.
Pro Tip: Keep your shards in the same physical region unless you are strictly sharding by geography (e.g., EU users on EU servers, US users on US servers). For a Norwegian user base, hosting Shard 1 in Oslo and Shard 2 in Frankfurt adds ~15-20ms round-trip time per fetch. Multiply that by 1000 queries per second, and your app feels broken. We built the CoolVDS network directly on the NIX (Norwegian Internet Exchange) backbone to ensure sub-millisecond internal latency between instances.
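Before you commit to a topology, measure it. The snippet below is a minimal sketch that times a TCP handshake against each shard endpoint as a rough round-trip probe; the IPs and port are placeholders for your own inventory, and a handshake is only a lower bound on real query latency.
import socket
import time

# Hypothetical shard endpoints; replace with your own private IPs
SHARD_HOSTS = ["10.0.0.5", "10.0.0.6"]
DB_PORT = 5432  # PostgreSQL; use 3306 for MySQL

def tcp_rtt_ms(host, port, samples=5):
    """Rough RTT estimate: time a full TCP handshake, averaged over N samples."""
    total = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        sock = socket.create_connection((host, port), timeout=2)
        total += time.perf_counter() - start  # handshake complete at this point
        sock.close()
    return (total / samples) * 1000

for host in SHARD_HOSTS:
    print(f"{host}: {tcp_rtt_ms(host, DB_PORT):.2f} ms")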
Strategy 1: Range-Based Sharding (The Time-Series Savior)
This is often the entry point, particularly for logging, analytics, or financial ledgers. You partition data based on a value range, typically a timestamp or a primary key ID.
In PostgreSQL 14 (standard in our production images), declarative partitioning makes this native. You don't need complex application logic to route the data; the database engine handles it.
Implementation: Monthly Log Partitioning
-- Parent table
CREATE TABLE app_logs (
    id uuid DEFAULT gen_random_uuid(),
    event_time timestamptz NOT NULL,
    payload jsonb,
    severity text
) PARTITION BY RANGE (event_time);
-- Partitions for late 2022
CREATE TABLE app_logs_2022_10 PARTITION OF app_logs
    FOR VALUES FROM ('2022-10-01') TO ('2022-11-01');
CREATE TABLE app_logs_2022_11 PARTITION OF app_logs
    FOR VALUES FROM ('2022-11-01') TO ('2022-12-01');
-- Indexing happens on the partitions automatically in modern PG
CREATE INDEX ON app_logs (event_time);
The Problem: hotspots. If you partition by date and every client writes to "today," one partition (and the node hosting it) absorbs 100% of the write load while the rest sit idle. Range-based sharding is excellent for storage management, since dropping an old partition is instant, but it does not fix a write-throughput bottleneck.
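That storage-management upside is easy to automate. Below is a minimal maintenance sketch, assuming the psycopg2 driver and the app_logs schema above: run it monthly from cron to pre-create the next partition and drop the oldest one. The retention window and connection string are placeholders.
import datetime
import psycopg2  # assumption: psycopg2 as the PostgreSQL driver

def month_start(offset_months):
    """First day of the month, shifted by offset_months from the current month."""
    today = datetime.date.today()
    total = today.year * 12 + (today.month - 1) + offset_months
    return datetime.date(total // 12, total % 12 + 1, 1)

def rotate_partitions(conn, keep_months=12):
    nxt, after = month_start(1), month_start(2)   # bounds of next month
    old = month_start(-keep_months)               # oldest month to keep
    with conn.cursor() as cur:
        # Pre-create next month's partition so midnight inserts never fail
        cur.execute(
            f"CREATE TABLE IF NOT EXISTS app_logs_{nxt:%Y_%m} PARTITION OF app_logs "
            f"FOR VALUES FROM ('{nxt}') TO ('{after}')"
        )
        # Dropping a whole partition is instant compared to DELETE + VACUUM
        cur.execute(f"DROP TABLE IF EXISTS app_logs_{old:%Y_%m}")
    conn.commit()

# Example cron entry point (connection details are illustrative):
# rotate_partitions(psycopg2.connect("host=10.0.0.5 dbname=app user=app"))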
Strategy 2: Hash-Based Sharding (Key Based)
To distribute write load evenly, we use a hash function on a key (like user_id or tenant_id). This ensures that User A always goes to Shard 1, and User B always goes to Shard 2. The write load is perfectly balanced, assuming your users are somewhat uniform in activity.
However, this makes resharding (adding a new node) painful. Naive hash-modulo-N routing remaps almost every key when the node count changes, which means a bulk data migration. Consistent hashing softens this by remapping only about 1/N of the keys (see the ring sketch after the routing example below), but some data still has to move.
Application-Side Routing Logic (Pseudo-Python)
import hashlib

# Configuration mapping Shard IDs to CoolVDS Private IPs
SHARD_MAP = {
    0: "10.0.0.5",  # db-shard-01
    1: "10.0.0.6",  # db-shard-02
    2: "10.0.0.7",  # db-shard-03
    3: "10.0.0.8",  # db-shard-04
}

def get_db_host(user_id):
    # Create a deterministic hash of the user_id
    hash_obj = hashlib.md5(str(user_id).encode())
    hash_int = int(hash_obj.hexdigest(), 16)
    # Modulo operator determines the shard index
    shard_index = hash_int % len(SHARD_MAP)
    return SHARD_MAP[shard_index]
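If you already know you will add shards later, a consistent-hash ring is worth a few extra lines. The sketch below is not from any particular library; it spreads each host across many virtual points on a ring, so adding a fifth host takes a slice of keys from each neighbour instead of remapping nearly everything the way plain modulo does.
import bisect
import hashlib

def _point(key):
    # Stable hash for ring placement; md5 is for distribution, not security
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def build_ring(hosts, vnodes=100):
    """Place vnodes virtual points per host on a sorted ring."""
    return sorted((_point(f"{host}#{i}"), host)
                  for host in hosts for i in range(vnodes))

def lookup(ring, user_id):
    """Walk clockwise to the first virtual point at or after the key's hash."""
    points = [p for p, _ in ring]
    idx = bisect.bisect(points, _point(str(user_id))) % len(ring)
    return ring[idx][1]

hosts = ["10.0.0.5", "10.0.0.6", "10.0.0.7", "10.0.0.8"]  # same hosts as SHARD_MAP
ring = build_ring(hosts)
print(lookup(ring, 1054))  # deterministic: 1054 always lands on the same host
When db-shard-05 arrives, you rebuild the ring with five hosts and migrate only the keys whose closest point changed, roughly a fifth of the data.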
Hardware Reality Check: IOPS or Death
Logical sharding solves the CPU bottleneck. It does not solve the Disk I/O bottleneck if the underlying storage is garbage. Many VPS providers use network-attached storage (Ceph, GlusterFS) over crowded 1Gbps links. When your database flushes dirty pages to disk, latency spikes.
You need local storage. Specifically, local NVMe.
| Feature | Standard SATA SSD VPS | CoolVDS NVMe Instance |
|---|---|---|
| Random Read (4K) | ~5,000 IOPS | ~80,000+ IOPS |
| Latency | 1-3 ms | < 0.1 ms |
| Wait I/O (under load) | High (CPU stealing) | Negligible |
When running a sharded cluster, if one node hangs on I/O wait, the entire application request thread can hang. We use KVM (Kernel-based Virtual Machine) to ensure that your RAM and CPU are actually yours, not oversold resources that vanish when a neighbor runs a backup script.
Directory-Based Sharding: The Flexible Middle Ground
If you don't want to hardcode hashing logic, use a Lookup Service. You store a mapping table in a highly available, cached store (like Redis or a small standard SQL DB).
user_id: 1054 -> db-node-03
This allows you to move heavy users to their own dedicated hardware without re-sharding the entire universe. If a specific enterprise client in Oslo decides to import 10 million rows, you can migrate just that tenant to a dedicated high-memory CoolVDS instance.
Example: Redis Lookup
$ redis-cli
127.0.0.1:6379> SET user:1054:shard "db-node-03"
OK
127.0.0.1:6379> GET user:1054:shard
"db-node-03"
Database Configuration for Sharded Nodes
Once you split the data, you must tune the individual nodes. A 16GB RAM instance hosting a shard needs different tuning than a monolith. Since the dataset per node is smaller, you can likely fit the entire "working set" in RAM.
Here is a battle-tested my.cnf snippet for a MySQL 8.0 shard node with 16GB RAM running on NVMe:
[mysqld]
# 70-80% of RAM for buffer pool is standard, but leave room for OS overhead
innodb_buffer_pool_size = 12G
# Raise InnoDB's background flushing budget so it actually uses the NVMe headroom
innodb_io_capacity = 20000
innodb_io_capacity_max = 40000
# Redo log sizing - crucial for heavy write workloads
innodb_log_file_size = 2G
# Doublewrite protects against torn-page corruption on crash; keep it enabled.
# It costs some write bandwidth, but NVMe absorbs the penalty easily.
innodb_doublewrite = 1
# Connection handling
max_connections = 500
thread_cache_size = 50
Conclusion: Architecture is the Limit
Sharding introduces complexity: backups become distributed, transactions across shards require Two-Phase Commit (2PC) or careful saga patterns, and reporting becomes harder. Don't shard just because it's trendy.
However, if you are hitting the limits of vertical scaling, or if you need to isolate tenant data for strict GDPR compliance within Norway, sharding is inevitable. The success of this architecture relies on two things: intelligent routing logic and brutal I/O performance.
You write the code. We provide the iron. Don't let slow I/O kill your SEO or your Black Friday sales.
Ready to architect for scale? Deploy a high-frequency NVMe instance on CoolVDS in Oslo today and verify the IOPS for yourself.