Database Sharding Strategies: Surviving the Julehandel Traffic Spike
There is a specific kind of silence that falls over an engineering room when the primary database CPU hits 100% and stays there. It's not peaceful. It's the sound of revenue dying.
I experienced this firsthand two years ago during a Black Friday event for a major Norwegian retailer. We had vertically scaled our primary node to the absolute limit: 128 vCPUs, 2TB of RAM. It wasn't enough. The write locks piled up. The connection pool was exhausted. The site didn't just slow down; it ceased to exist.
Vertical scaling has a ceiling. Eventually, you hit physics. Or your CFO hits you with a budget freeze. That is when you must look at sharding.
But be warned: Sharding is not a feature. It is a headache you accept because a single node can no longer carry the load.
The "Vertical Wall": When to Actually Shard
Most developers shard too early. If your dataset is under 2TB, you probably don't need sharding yet. You need better indexing, query caching, or faster storage.
Before you architect a complex distributed system, look at your hardware. Are you running on spinning rust or standard SSDs? In 2025, NVMe is the baseline for serious database workloads. Moving a bottlenecked MySQL instance from standard SSD to the high-performance NVMe storage we use at CoolVDS often yields a 400% IOPS increase without changing a single line of code.
Shard only when:
- Write throughput exceeds the capacity of a single master node.
- Your dataset size exceeds the storage capacity of a single physical node.
- You need geographic distribution for latency or legal reasons (e.g., keeping Norwegian user data strictly within NIX boundaries).
Sharding Architectures: The Good, The Bad, and The Complex
1. Key-Based (Hash) Sharding
This is the most common method for evenly distributing load. You take a value (like user_id), hash it, and use the result to determine which shard the data lives on.
The logic looks like this:
import hashlib

def get_shard_id(user_id, total_shards):
    # Simple modulo hashing
    # In production, use a consistent hashing ring to minimize rebalancing
    hash_value = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16)
    return hash_value % total_shards
Pros: Excellent distribution. No "hot spots" where one server does all the work.
Cons: Resharding is a nightmare. Adding a new node changes the modulo result, requiring you to migrate huge chunks of data.
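The comment in the snippet above already hints at the mitigation: a consistent hashing ring, where adding or removing a node only remaps the keys between two neighbouring ring positions instead of reshuffling everything. A minimal sketch, with illustrative shard names and virtual-node count:

import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent hashing ring: adding or removing a shard only
    remaps keys that fall between neighbouring ring positions."""

    def __init__(self, shards, vnodes=100):
        # Place many virtual points per shard on the ring for smoother balance
        self.ring = {}
        self.sorted_keys = []
        for shard in shards:
            for i in range(vnodes):
                point = self._hash(f"{shard}:{i}")
                self.ring[point] = shard
                bisect.insort(self.sorted_keys, point)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_shard(self, key):
        # Walk clockwise to the first ring point at or after the key's hash
        point = self._hash(str(key))
        idx = bisect.bisect(self.sorted_keys, point) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
print(ring.get_shard(42817))   # e.g. 'shard-b'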
2. Range-Based Sharding
You split data based on ranges of values. IDs 1-1,000,000 go to Shard A. IDs 1,000,001-2,000,000 go to Shard B.
Pros: Easy to implement. Easy to query ranges.
Cons: Terrible for sequential writes. If your app generates sequential IDs, all new writes will hammer the last shard while the others sit idle. I've seen this melt a server while 9 others sat at 2% load.
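The router itself is trivial, which is part of the appeal: a sorted list of boundaries and a binary search. A sketch with made-up boundary values:

import bisect

# Inclusive upper bound of each shard's ID range -- illustrative values
RANGE_BOUNDARIES = [1_000_000, 2_000_000, 3_000_000]
SHARDS = ["shard-a", "shard-b", "shard-c", "shard-d"]  # last shard takes everything above

def get_shard_for_id(record_id):
    # bisect_left returns the index of the first boundary >= record_id
    return SHARDS[bisect.bisect_left(RANGE_BOUNDARIES, record_id)]

print(get_shard_for_id(1_000_000))   # 'shard-a'
print(get_shard_for_id(1_000_001))   # 'shard-b'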
3. Directory-Based (Lookup) Sharding
You maintain a separate "lookup service" that knows exactly where every piece of data lives. This provides maximum flexibility but introduces a Single Point of Failure (SPOF).
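In its simplest form the directory is just a rewritable mapping from entity to shard; the store behind it could be Redis, etcd, or a tiny Postgres table. A hypothetical sketch of the idea:

# Hypothetical directory lookup: the mapping lives in its own small,
# heavily cached store, so individual records can move between shards
# without touching any hashing or range logic.
DIRECTORY = {}   # stand-in for the real lookup store

def assign_shard(entity_id, shard):
    DIRECTORY[entity_id] = shard       # one row per entity: (entity_id, shard)

def lookup_shard(entity_id):
    # Every query pays this extra hop -- cache aggressively and replicate
    # the directory so it does not become the SPOF the text warns about.
    return DIRECTORY[entity_id]

assign_shard("customer:9913", "shard-oslo-2")
print(lookup_shard("customer:9913"))   # 'shard-oslo-2'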
Implementation: Configuring PostgreSQL for Partitioning
While full sharding often requires application-level logic or middleware like Citus, PostgreSQL 16 (the version that ships with Ubuntu 24.04 LTS) offers robust declarative partitioning that serves as a precursor to physical sharding.
Here is how you set up a table partitioned by range (e.g., for time-series data like logs or orders):
-- Create the parent table
CREATE TABLE orders (
    order_id    bigint NOT NULL,
    customer_id bigint NOT NULL,
    order_date  date   NOT NULL,
    amount      decimal
) PARTITION BY RANGE (order_date);

-- Create the partition for Q4 2025
CREATE TABLE orders_2025_q4 PARTITION OF orders
    FOR VALUES FROM ('2025-10-01') TO ('2026-01-01');

-- Create the partition for Q1 2026
CREATE TABLE orders_2026_q1 PARTITION OF orders
    FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');

-- Indexes created on the parent cascade to every partition in modern Postgres
CREATE INDEX idx_orders_customer ON orders (customer_id);
This keeps your working set small. Queries filtering by date will only touch the relevant partition, drastically reducing I/O.
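If you want to verify that the planner is actually pruning, run EXPLAIN on a date-filtered query and check that only one partition appears in the plan. A quick check from Python, assuming psycopg2 and a placeholder connection string:

import psycopg2

# Placeholder DSN -- adjust to your own host, database and credentials
conn = psycopg2.connect("dbname=shop user=app host=127.0.0.1")

with conn.cursor() as cur:
    # If pruning works, the plan should mention orders_2025_q4 and nothing else
    cur.execute(
        "EXPLAIN SELECT count(*) FROM orders "
        "WHERE order_date BETWEEN %s AND %s",
        ("2025-11-01", "2025-11-30"),
    )
    for (line,) in cur.fetchall():
        print(line)

conn.close()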
The Latency Trap and Norwegian Data Sovereignty
Sharding introduces network latency. If your application server is in Oslo, Shard A is in Stockholm, and Shard B is in Frankfurt, a JOIN operation effectively becomes a distributed network call. You are now governed by the speed of light.
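Concretely, what used to be a single JOIN becomes a fan-out, gather, and merge step in application code, and the slowest (usually the most distant) shard sets your response time. A rough sketch of the pattern, with illustrative shard hostnames and an orders table assumed on every shard:

from concurrent.futures import ThreadPoolExecutor

import psycopg2

# Illustrative shard DSNs -- in the scenario above each one sits in a different city
SHARD_DSNS = [
    "dbname=shop host=shard-oslo.internal",
    "dbname=shop host=shard-stockholm.internal",
    "dbname=shop host=shard-frankfurt.internal",
]

def total_on_shard(dsn, customer_id):
    # Each call is a full network round trip; three shards means three of them
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT coalesce(sum(amount), 0) FROM orders WHERE customer_id = %s",
                (customer_id,),
            )
            return cur.fetchone()[0]
    finally:
        conn.close()

def customer_total(customer_id):
    # Fan out in parallel, then merge in the application
    with ThreadPoolExecutor(max_workers=len(SHARD_DSNS)) as pool:
        parts = pool.map(lambda dsn: total_on_shard(dsn, customer_id), SHARD_DSNS)
        return sum(parts)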
For Norwegian businesses, this is compounded by compliance. The Datatilsynet is increasingly strict about where user data physically resides, and the Schrems II ruling is still relevant in late 2025. Keeping your shards within Norwegian borders isn't just about shaving milliseconds; it's about not risking a fine of up to 4% of your global turnover.
Pro Tip: When deploying a sharded cluster, ensure your VPS instances are in the same physical datacenter or availability zone. At CoolVDS, our Oslo facility allows for private networking between instances with near-zero latency, ensuring that cross-shard communication doesn't become your new bottleneck.
Optimizing the Node: MySQL Configuration
Whether you have 1 database or 50 shards, if the individual node is poorly configured, you lose. A common mistake I see on VPS deployments is running with the default my.cnf untouched.
For a dedicated 16GB RAM shard, your config should look closer to this:
[mysqld]
# 70-80% of available RAM
innodb_buffer_pool_size = 12G
# Essential for heavy write loads (on MySQL 8.0.30+ prefer innodb_redo_log_capacity)
innodb_log_file_size = 2G
innodb_flush_log_at_trx_commit = 1
# Bypass the OS page cache to avoid double buffering
innodb_flush_method = O_DIRECT
# Connection handling
max_connections = 500
thread_cache_size = 50
If you are running this on a shared hosting plan where resources are "burstable" (read: stolen by neighbors), these settings will cause the OOM killer to terminate your database. This is why we insist on KVM virtualization with dedicated resource allocation at CoolVDS. Database memory must be guaranteed, not theoretical.
The Verdict
Sharding is complex. It breaks ACID guarantees across nodes (unless you take on the overhead of two-phase commit). It complicates backups. It makes reporting difficult.
Don't do it because it's trendy. Do it because you have no choice.
If you are approaching that point, start by auditing your infrastructure. High-frequency trading firms and top-tier e-commerce sites in Norway trust CoolVDS because we provide the raw NVMe I/O and low-latency network required to make distributed databases actually work.
Ready to test your architecture? Spin up a high-performance KVM instance in our Oslo datacenter in under 60 seconds and run your benchmarks.