Database Sharding: The Nuclear Option for Scaling (and Why You Probably Just Need Better IOPS)

I hate sharding. Any systems architect who tells you they enjoy splitting a monolithic database into distributed chunks is either lying or hasn't been paged at 3:00 AM because a re-balancing script corrupted the user index. It is complex, it introduces latency, and it makes your application logic significantly harder to maintain.

Yet, here we are. Your users table just hit 500GB, your write locks are causing timeouts, and your CTO is asking why the checkout page takes four seconds to load. You have optimized your queries, you have added read replicas, and you have cached everything in Redis. You are out of options.

This guide covers how to approach database sharding correctly using the tools available to us today, in early 2019. We will look at why raw hardware performance (specifically NVMe) is your first line of defense, and how to architect a sharded solution when hardware alone is no longer enough.

The "Vertical Scaling" Reality Check

Before we write a single line of sharding logic, we need to look at the hardware. In 2015, spinning rust (HDD) was the bottleneck. In 2019, if you are not running your database on NVMe storage, you are solving the wrong problem.

I recently audited a Norwegian e-commerce platform struggling with MySQL performance. They were planning a massive sharding project estimated to take three months. I looked at `iostat` on their master node:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           15.4    0.0     4.1    78.2     0.0     2.3

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     3.00  450.00  320.00  18000.0  24000.0   109.09    15.50   20.12   18.50   22.40   1.30 100.00

Notice the 78.2% iowait and 100% utilization: their disk I/O was completely saturated. We migrated them to a CoolVDS NVMe instance, `iowait` dropped to 2%, and the sharding project was cancelled. The lesson? Latency kills databases. Modern NVMe drives provide upwards of 350,000 IOPS, compared with the few hundred to few thousand IOPS that typical network-attached "standard SSD" cloud volumes are capped at. Always scale up (vertical) on high-performance infrastructure before you scale out (horizontal).

Strategies for Breaking the Monolith

If you are pushing 10,000 writes per second and have terabytes of data, hardware won't save you. You need to shard. There are two primary locations to place your sharding logic: the Application Level or the Middleware Level.

1. Application-Level Sharding

This is the "traditional" approach. Your application code knows which database node holds the data. You select a Sharding Key (usually `user_id` or `tenant_id`) and route queries accordingly.

The Modulo Approach

The simplest method is `shard_id = user_id % number_of_shards`.

def get_db_connection(user_id):
    # We have 4 database shards defined in config
    SHARD_COUNT = 4
    shard_id = user_id % SHARD_COUNT

    config = {
        0: 'db-shard-01.internal.coolvds.net',
        1: 'db-shard-02.internal.coolvds.net',
        2: 'db-shard-03.internal.coolvds.net',
        3: 'db-shard-04.internal.coolvds.net'
    }

    host = config[shard_id]
    # connect_to_host() stands in for your driver's connect call,
    # e.g. pymysql.connect(host=host, ...)
    return connect_to_host(host)

The Trap: If you want to add a 5th shard later, the modulo changes, and most of your existing users suddenly map to the wrong server. You have to migrate massive amounts of data to rebalance. Avoid simple modulo sharding unless your shard count is fixed forever (it never is).
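
If that sounds abstract, here is a quick sketch (plain Python, assuming nothing beyond the modulo scheme above) counting how many of the first 100,000 user IDs land on a different shard when you grow from 4 shards to 5:

# Count how many users change shards when growing from 4 to 5 shards
# under simple modulo placement.
OLD_SHARDS = 4
NEW_SHARDS = 5
USERS = 100_000

moved = sum(
    1 for user_id in range(USERS)
    if user_id % OLD_SHARDS != user_id % NEW_SHARDS
)

print(f"{moved} of {USERS} users must move ({moved / USERS:.0%})")
# Prints: 80000 of 100000 users must move (80%) - four out of five rows
# have to be migrated just to add a single node.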

The Directory (Lookup) Approach

A better way is to maintain a separate, highly available lookup table that maps IDs to physical shards. This allows you to move users between shards without changing the routing logic.

function getShardHost($userId, $pdoLookup, $redis) {
    // Check Redis first for speed
    $cacheKey = "user_shard_" . $userId;
    $host = $redis->get($cacheKey);

    if (!$host) {
        // Fall back to the lookup database
        $stmt = $pdoLookup->prepare("SELECT shard_host FROM user_shards WHERE user_id = ?");
        $stmt->execute([$userId]);
        $host = $stmt->fetchColumn();

        // Cache it for 24 hours, but only if the user actually exists
        if ($host !== false) {
            $redis->setex($cacheKey, 86400, $host);
        }
    }

    return $host;
}

2. Middleware Sharding (ProxySQL)

In 2019, ProxySQL has become the gold standard for MySQL traffic management. Instead of hardcoding logic in PHP or Python, you send all traffic to a local ProxySQL instance, which handles the routing. This decouples your code from your infrastructure.

You can configure query rules in ProxySQL to route based on comments or regex. For example, routing traffic to different hostgroups based on a shard ID injected into the query comment.

-- Inside the ProxySQL admin interface

-- Rule 1: any query tagged SHARD_ID=1 (users 0-10000) goes to hostgroup 10
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (1, 1, 'SHARD_ID=1\b', 10, 1);

-- Rule 2: any query tagged SHARD_ID=2 (users 10001-20000) goes to hostgroup 20
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (2, 1, 'SHARD_ID=2\b', 20, 1);

LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
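
On the application side, you simply tag each statement and send everything to the local ProxySQL listener. Here is a rough sketch, assuming PyMySQL and ProxySQL's default traffic port 6033; the hostname, credentials, and the `fetch_user()` helper are illustrative, not a prescribed API:

import pymysql

# The application always talks to the local ProxySQL listener (6033 by
# default) and lets the query rules pick the hostgroup, i.e. the shard.
# Host and credentials below are placeholders.
proxy = pymysql.connect(host="127.0.0.1", port=6033,
                        user="app", password="secret", db="app")

def fetch_user(conn, user_id, shard_id):
    # The SHARD_ID tag in the comment is what match_pattern keys on.
    sql = f"SELECT /* SHARD_ID={shard_id} */ * FROM users WHERE id = %s"
    with conn.cursor() as cur:
        cur.execute(sql, (user_id,))
        return cur.fetchone()

row = fetch_user(proxy, 42, shard_id=1)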

This allows you to migrate data in the background and simply flip a switch in ProxySQL to redirect traffic, often with zero downtime for the application.

The Importance of Network Latency

When you shard, you inevitably run into the "Cross-Shard Join" problem. You want to generate a report combining data from Shard A and Shard B. Since you can't join them in SQL, you have to pull data to the application layer and join it in memory.

This generates massive network traffic. If your VPS instances are on a congested network, your reporting jobs will time out.
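
To make that concrete, a bare-bones scatter-gather report might look like the sketch below. It reuses the shard hostnames from the Python example earlier; the `orders` table, the reporting credentials, and the PyMySQL usage are illustrative assumptions:

import pymysql

SHARD_HOSTS = [
    "db-shard-01.internal.coolvds.net",
    "db-shard-02.internal.coolvds.net",
    "db-shard-03.internal.coolvds.net",
    "db-shard-04.internal.coolvds.net",
]

def total_revenue_per_country(since):
    """Scatter the same aggregate query to every shard, then merge in memory."""
    totals = {}
    for host in SHARD_HOSTS:
        conn = pymysql.connect(host=host, user="report",
                               password="secret", db="app")
        try:
            with conn.cursor() as cur:
                cur.execute(
                    "SELECT country, SUM(amount) FROM orders "
                    "WHERE created_at >= %s GROUP BY country",
                    (since,),
                )
                for country, amount in cur.fetchall():
                    # Merge the partial aggregates from each shard
                    totals[country] = totals.get(country, 0) + amount
        finally:
            conn.close()
    return totals

Every extra millisecond of round-trip between those nodes gets multiplied by the number of shards, which is exactly why the next point matters.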

Pro Tip: Ensure your hosting provider offers a private internal network with unmetered bandwidth. At CoolVDS, we use a private backend LAN for inter-node communication. If your shards talk to each other over the public internet, you are adding 10-20ms of latency per round trip and exposing your data stream. Keep it private, keep it fast.

Configuration Tuning for Shards

Sharded nodes are often smaller than the monolith. You must adjust your `my.cnf` (MySQL) or `postgresql.conf` accordingly. Do not copy the config from your massive master to your smaller shards.

For a standard 8GB RAM shard on a CoolVDS instance, a conservative starting point for InnoDB in MySQL 5.7/8.0 looks like this:

[mysqld]
# Allocate 60-70% of RAM to the pool, leave room for OS
innodb_buffer_pool_size = 5G

# Essential for data integrity on crash, but heavy on I/O. 
# Since we use NVMe, we can afford strict safety.
innodb_flush_log_at_trx_commit = 1

# One file per table makes maintenance and reclaiming disk space far easier
innodb_file_per_table = 1

# Log file size - make it big enough to handle 1 hour of write traffic
innodb_log_file_size = 512M

# Disable name resolving for speed
skip-name-resolve

GDPR and Data Residency

Since the implementation of GDPR last year, we have to be extremely careful about where shards live. If you are sharding by geography (e.g., German users on Shard A, Norwegian users on Shard B), you must ensure the physical server for Shard B is actually in Norway.
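
The routing logic itself is simple; here is a minimal sketch of geo-pinned routing, with purely illustrative hostnames:

# Map each country code to a shard that is physically pinned to a
# datacenter inside the right jurisdiction. Hostnames are illustrative.
GEO_SHARDS = {
    "NO": "db-shard-oslo-01.internal.coolvds.net",
    "DE": "db-shard-frankfurt-01.internal.coolvds.net",
}

def get_geo_shard(country_code):
    # Fail loudly rather than silently writing regulated data to the wrong region.
    try:
        return GEO_SHARDS[country_code]
    except KeyError:
        raise ValueError(f"No shard configured for country {country_code!r}")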

Using a provider like CoolVDS with a clear presence in Oslo simplifies compliance. You can map your "Norwegian Shard" specifically to our Oslo datacenter, ensuring data covered by strict Norwegian interpretations of GDPR never leaves the jurisdiction. This is harder to guarantee with hyper-scale clouds where "Region: Europe" is often a nebulous concept.

Conclusion

Sharding is powerful, but it comes with a steep complexity tax. Before you shard, benchmark your current workload on high-performance NVMe VPS infrastructure. If you must shard, choose a directory-based strategy or use middleware like ProxySQL to maintain flexibility.

Don't let slow I/O force you into a premature architecture change. Deploy a high-performance test instance on CoolVDS in under 55 seconds and see how far vertical scaling can actually take you.