Database Sharding: The Nuclear Option for Scaling High-Traffic Apps
Let's get one thing straight: sharding is a nightmare.
I have spent the last decade fixing fractured architectures where an over-eager CTO decided to shard a database that was only holding 50GB of data. The result? Cross-shard join complexity, broken transactions, and latency spikes that ruined the user experience. Sharding is the nuclear option. You only press that button when your write throughput physically exceeds the IOPS capability of the fastest hardware money can buy.
But when you do need it—when you are hitting 50,000+ writes per second and your replication lag is measured in minutes—it is the only path to survival. In this guide, we are going to look at how to implement sharding strategies using tools available right now in 2020, specifically for the Norwegian market, where data residency and latency to NIX (the Norwegian Internet Exchange) matter.
Step 0: Do You Actually Need to Shard?
Before we start modifying application logic, look at your hardware. In 2020, storage is fast. If you are running on spinning rust or standard SSDs, you are bottlenecked by choice. I recently migrated a client from a standard cloud instance to a CoolVDS NVMe High-Performance VPS. They thought they needed sharding.
They didn't. They just needed high I/O throughput.
Pro Tip: Check your disk latency. If `iostat -x 1` shows your `%util` near 100% but your CPU is idle, sharding is premature. Upgrade to local NVMe storage first. The random read/write speeds on our CoolVDS instances often eliminate the need for complex architectures entirely.
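For reference, a quick check looks something like this (exact column names vary slightly between sysstat versions):
# Refresh extended device statistics every second
iostat -x 1
# Columns to watch:
#   r_await / w_await  - average read/write latency in milliseconds
#   %util              - how busy the device is
# High %util and climbing await with an idle CPU points at storage, not your schema.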
The Architecture of a Sharded System
If you have maximized vertical scaling (innodb_buffer_pool_size is 80% of RAM, you are on NVMe, and your slow queries are indexed and tuned—remember that MySQL 8.0 dropped the query cache entirely) and are still choking, let's talk strategy. The two most robust methods we use in production environments today are Directory-Based Sharding and Key-Based (Hash) Sharding.
Key-Based Sharding with MySQL & ProxySQL
For MySQL 8.0 environments, we rely heavily on ProxySQL. It sits between your application and your database nodes, routing queries based on rules. This abstracts the topology from your code.
Here is a battle-tested set of rules, entered through the ProxySQL admin interface (and persisted to disk from there), that splits traffic based on a `user_id` modulus. This assumes you have multiple backend servers running on CoolVDS instances within our Oslo datacenter to minimize internal latency.
# Define backend servers (Shards)
INSERT INTO mysql_servers (hostgroup_id, hostname, port, weight)
VALUES
(10, '10.0.0.5', 3306, 1), -- Shard 1
(20, '10.0.0.6', 3306, 1); -- Shard 2
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
# Define Sharding Rules in mysql_query_rules
# Note: these regex rules only fire if the application embeds the modulus
# predicate literally in the query text (e.g. "... WHERE user_id % 2 = 0")
# If user_id % 2 == 0, go to Shard 1 (Hostgroup 10)
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (1, 1, "^SELECT.*WHERE user_id % 2 = 0", 10, 1);
# If user_id % 2 == 1, go to Shard 2 (Hostgroup 20)
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (2, 1, "^SELECT.*WHERE user_id % 2 = 1", 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
This approach is clean but rigid. Resharding (moving data from 2 shards to 4) requires a painful migration script. However, for steady growth, it's efficient.
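Before trusting the routing in production, verify that the rules actually fire. ProxySQL keeps per-rule hit counters that you can query through the same admin interface used above:
-- Run against the ProxySQL admin interface
-- Every query that matches a rule increments its hit counter
SELECT rule_id, hits FROM stats_mysql_query_rules;
-- Confirm the backends are loaded into the runtime and ONLINE
SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;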
PostgreSQL: The Citus Approach
If you are in the PostgreSQL camp (and with version 12, why wouldn't you be?), the Citus extension turns Postgres into a distributed database. Unlike the manual hacking required in MySQL, Citus handles the distribution logic natively.
On a CoolVDS instance running CentOS 8, installation looks like this:
# Add Citus repository
curl https://install.citusdata.com/community/rpm.sh | sudo bash
# Install PostgreSQL 12 and Citus
sudo yum install -y citus92_12
# Initialize the database
sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
# Configure postgresql.conf to preload the Citus library
# (the data directory is owned by postgres, so append via sudo tee)
echo "shared_preload_libraries = 'citus'" | sudo tee -a /var/lib/pgsql/12/data/postgresql.conf
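One step worth spelling out: the server has to be started (and restarted after any change to shared_preload_libraries) before the extension is usable. With the PGDG packages on CentOS 8, that is roughly:
# Start PostgreSQL 12 now and on every boot (PGDG ships a postgresql-12 unit)
sudo systemctl enable --now postgresql-12
# Restart after any later change to shared_preload_libraries
sudo systemctl restart postgresql-12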
Once running, creating a distributed table is trivial compared to the manual routing logic of the past:
-- Enable Citus extension
CREATE EXTENSION citus;
-- Create a standard table
CREATE TABLE events (
device_id bigint,
event_time timestamptz,
payload jsonb
);
-- Distribute it across workers based on device_id
SELECT create_distributed_table('events', 'device_id');
This setup allows you to scale out horizontally by simply adding more CoolVDS nodes to the cluster (see the snippet below for registering a new worker). The coordinator node handles the query planning, and the worker nodes execute in parallel. This is particularly effective for the heavy analytical workloads common in Norway's energy and finance sectors.
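For completeness, here is roughly how a new worker joins the cluster in Citus 9.2, and what a typical fan-out query looks like. The worker IPs are placeholders for additional instances on the same private network, each running the same PostgreSQL 12 + Citus install:
-- Register worker nodes with the coordinator (IPs are placeholders)
SELECT * FROM master_add_node('10.0.0.7', 5432);
SELECT * FROM master_add_node('10.0.0.8', 5432);
-- Grouping on the distribution column lets every worker aggregate its own
-- shards in parallel; the coordinator only merges the partial results
SELECT device_id, count(*) AS events_last_hour
FROM events
WHERE event_time > now() - interval '1 hour'
GROUP BY device_id
ORDER BY events_last_hour DESC
LIMIT 10;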
The Latency Trap: Why Geography Matters
In a sharded environment, a single user request might fan out to multiple database nodes. If those nodes are scattered across different datacenters, you are introducing network latency that compounds with every join.
For Norwegian businesses, hosting your primary shards in Frankfurt or Amsterdam while your users are in Oslo is a mistake. The round-trip time (RTT) adds up.
Latency comparison (Average):
| Route | Latency (ms) | Impact on 100 serial queries |
|---|---|---|
| Oslo User -> Amsterdam Server | ~25 | 2.5 seconds |
| Oslo User -> CoolVDS Oslo | ~2 | 0.2 seconds |
When you split a database, you increase the "chatter" between services. Hosting on CoolVDS in Norway ensures that this internal traffic stays on high-speed local peering, keeping your application snappy. Plus, with Datatilsynet (the Norwegian Data Protection Authority) keeping a close eye on GDPR compliance, keeping your data physically within Norwegian borders (or at least the EEA) simplifies your legal posture significantly.
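Don't take my numbers on faith; measure from your own application host. The hostname below is a placeholder:
# Rough round-trip time from the app server to a candidate database location
ping -c 20 db1.example.com
# Per-hop latency and packet loss along the path
mtr --report --report-cycles 20 db1.example.com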
Critical Configuration: Optimizing the Node
Regardless of your sharding strategy, the individual nodes must be tuned. A default MySQL installation is designed for a laptop, not a high-performance VPS. On a 16GB RAM CoolVDS instance, your my.cnf should explicitly define the buffer pool and log flushing behavior to prevent I/O stalls.
[mysqld]
# 70-80% of RAM for InnoDB Buffer Pool
innodb_buffer_pool_size = 12G
# Keep the log file large enough to handle bursts
innodb_log_file_size = 2G
# 0 = faster but risks data loss on crash. 1 = safest. 2 = good compromise.
innodb_flush_log_at_trx_commit = 1
# Essential for SSD/NVMe drives
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_flush_method = O_DIRECT
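Once the node has seen some real traffic, a quick way to sanity-check the buffer pool sizing is to compare logical read requests against reads that actually had to hit the disk:
-- Logical read requests vs. reads that went to disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- Rough rule of thumb: if Innodb_buffer_pool_reads is more than a tiny fraction
-- of Innodb_buffer_pool_read_requests, the working set does not fit in RAM.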
The Verdict
Sharding is a powerful tool, but it increases operational complexity by an order of magnitude. Before you deploy Vitess or Citus, ensure you aren't just suffering from slow I/O.
Start by migrating your monolithic database to a CoolVDS NVMe instance. Test the performance. If you are still bottlenecked, then—and only then—start slicing your data. We provide the raw, unthrottled compute power required for both vertical scaling and the orchestration of complex sharded clusters.
Need to test a Citus cluster or ProxySQL setup? Deploy a test instance on CoolVDS in under 55 seconds and see the difference low latency makes.