Database Sharding in 2018: Surviving the Write-Heavy Load Without Melting Your Servers
It's 3:00 AM. Your monitoring dashboard is screaming red. Your primary MySQL node isn't just busy; it's locked up. You've already maxed out the RAM, you've upgraded to the fastest CPUs available, and your read-replicas are lagging so far behind they're serving data from yesterday. Welcome to the scaling wall.
If you are running a high-growth platform targeting users in Oslo or broader Europe, you know that Vertical Scaling (buying a bigger server) eventually fails. You can't buy a CPU with infinite cores. The only path forward is Horizontal Scaling, specifically Sharding.
But sharding is terrifying. It breaks your `JOIN`s, complicates your app logic, and if done wrong, results in data corruption that no backup can cleanly restore. In this guide, we're going to look at how to shard pragmatically using tools available today, like ProxySQL and MySQL 5.7/8.0, while keeping Datatilsynet (The Norwegian Data Protection Authority) happy.
The Bottleneck: When Read-Replicas Are Not Enough
Most sysadmins start by adding slaves (replicas). This works great for read-heavy workloads (blogs, news sites). But if you are building an e-commerce platform or a SaaS tool, you have a write problem. Every `INSERT` or `UPDATE` must go to the master. When your write throughput exceeds the IOPS capacity of your storage or the single-threaded replication limits of MySQL, your application dies.
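Not sure whether you are actually write-bound? Here is a rough sketch (assuming PyMySQL and a monitoring user; the function name and credentials are placeholders) that samples MySQL's write counters twice and reports writes per second:
import time
import pymysql

def writes_per_second(host, user, password, interval=10):
    # Sample Com_insert/Com_update/Com_delete twice and compute the delta.
    conn = pymysql.connect(host=host, user=user, password=password)

    def snapshot():
        with conn.cursor() as cur:
            cur.execute(
                "SHOW GLOBAL STATUS WHERE Variable_name IN "
                "('Com_insert', 'Com_update', 'Com_delete')"
            )
            return sum(int(value) for _name, value in cur.fetchall())

    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    conn.close()
    return (after - before) / interval
If that number keeps climbing toward what a single node can flush to disk, more read replicas will not save you.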
Pro Tip: Before you shard, verify your storage speed. Running a database on spinning rust or standard SSDs in 2018 is a suicide mission. We use NVMe storage exclusively on CoolVDS instances because database I/O latency needs to be measured in microseconds, not milliseconds.
Sharding Strategies: Choosing Your Poison
Sharding partitions your data across multiple servers. Three main approaches dominate the landscape today; a short code sketch comparing them follows the list.
1. Key-Based (Hash) Sharding
You take a value (like `user_id`), hash it, and use the result to pick a server. It distributes load evenly, but adding new shards is painful: changing the shard count changes the hash mapping, so resharding means physically moving a large fraction of your rows.
2. Range-Based Sharding
Users 1-1,000,000 go to Shard A. Users 1,000,001-2,000,000 go to Shard B. This is easy to implement but creates "hotspots." If users in the newest range are the most active, Shard B melts while Shard A sleeps.
3. Directory-Based (Lookup) Sharding
You keep a separate lookup table that maps IDs to physical shards. It's flexible (you can move heavy users to their own dedicated hardware), but the lookup table itself becomes a single point of failure.
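To make the trade-offs concrete, here is a rough sketch of all three selectors in Python. The shard names, range size, and directory contents are made up for illustration, not a library API:
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def hash_shard(user_id):
    # Key-based: even spread, but adding a fifth shard remaps most users.
    return SHARDS[user_id % len(SHARDS)]

def range_shard(user_id, range_size=1000000):
    # Range-based: easy to reason about, but the newest range becomes the hotspot.
    return SHARDS[min((user_id - 1) // range_size, len(SHARDS) - 1)]

DIRECTORY = {815493: "shard-3"}  # a heavy customer pinned to dedicated hardware

def directory_shard(user_id):
    # Directory-based: flexible, but the lookup store is now critical infrastructure.
    return DIRECTORY.get(user_id, hash_shard(user_id))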
Implementation: The ProxySQL Approach
Hardcoding shard logic into your PHP or Python application is risky. In 2018, the smarter move is using ProxySQL as middleware. It sits between your app and your database nodes, routing queries based on rules you define.
Here is a battle-tested configuration for routing based on `user_id`. We assume you have deployed three CoolVDS instances: one ProxySQL node and two MySQL shard nodes.
Step 1: Configure MySQL Nodes
On your shards (`10.0.0.2` and `10.0.0.3`), optimization is key. Since you are splitting the dataset, you can tune the InnoDB buffer pool aggressively.
In /etc/mysql/my.cnf:
[mysqld]
# 70-80% of the node's RAM for the InnoDB buffer pool (12G assumes a 16G instance)
innodb_buffer_pool_size = 12G
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 2 # Log flushed ~once per second: faster writes, but a power loss can drop up to ~1s of commits; use 1 for strict durability
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
# Essential for replication stability
server-id = 101 # (Use 102 for the second node)
binlog_format = ROW
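Once the shard has been running under real traffic, check whether the buffer pool is actually big enough. A quick sanity check (a sketch assuming PyMySQL and a monitoring user) compares logical read requests against reads that had to go to disk:
import pymysql

def buffer_pool_hit_ratio(host, user, password):
    # Innodb_buffer_pool_reads counts reads that missed the pool and hit disk.
    conn = pymysql.connect(host=host, user=user, password=password)
    with conn.cursor() as cur:
        cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'")
        status = {name: int(value) for name, value in cur.fetchall()}
    conn.close()
    requests = status["Innodb_buffer_pool_read_requests"]
    misses = status["Innodb_buffer_pool_reads"]
    return 1.0 - (misses / requests) if requests else 0.0
As a rough rule of thumb, anything much below 0.99 on a dedicated shard suggests the working set still does not fit in the pool.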
Step 2: ProxySQL Routing Rules
Install ProxySQL (v1.4.x is the stable choice right now). We will configure it to send even user IDs to Shard 1 and odd user IDs to Shard 2. This is a rudimentary form of sharding suitable for testing or simple splits.
-- Connect to ProxySQL Admin interface
mysql -u admin -padmin -h 127.0.0.1 -P 6032
-- Add backend servers
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, '10.0.0.2', 3306);
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (20, '10.0.0.3', 3306);
-- Load them
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
-- Add sharding rules
-- Traffic with user_id % 2 == 0 goes to HG 10
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (1, 1, 'user_id=[0-9]*[02468]$', 10, 1);
-- Traffic with user_id % 2 != 0 goes to HG 20
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (2, 1, 'user_id=[0-9]*[13579]$', 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
Now, your application simply connects to ProxySQL's traffic port (6033 by default) instead of MySQL. It doesn't know sharding exists; it just sends queries, and ProxySQL handles the routing. The one catch with `match_pattern` rules like the ones above is that the `user_id` value has to appear as a literal in the query text for the regex to match it.
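For example, with PyMySQL the application code looks like any single-database connection, just pointed at the ProxySQL node. The host, credentials, and schema below are placeholders, and the app user must also be registered in ProxySQL's mysql_users table:
import pymysql

# 10.0.0.1 stands in for the ProxySQL node from the setup above.
proxy = pymysql.connect(host="10.0.0.1", port=6033,
                        user="app_user", password="app_password",
                        database="shop")

with proxy.cursor() as cur:
    # 1042 ends in an even digit, so rule 1 routes this query to hostgroup 10.
    cur.execute("SELECT * FROM orders WHERE user_id=1042")
    rows = cur.fetchall()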
The GDPR Angle: Geo-Sharding
With GDPR now fully enforceable as of May this year, data sovereignty is a massive headache. If you have customers in Norway and customers in Russia or the US, you might be legally required to store their data in specific physical locations.
Sharding solves this. You can create a "Norway Shard" running on a CoolVDS instance in Oslo (optimizing for latency to the NIX) and a "Global Shard" elsewhere. Your lookup service checks the user's registration country and routes them to the correct compliant server.
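In code, that lookup can start out as something as simple as a country-to-shard map. This is a sketch only; the hostnames and helper are made up, and in production the map would live in a database or configuration service rather than in source:
import pymysql

SHARD_BY_COUNTRY = {
    "NO": {"host": "db-oslo.internal", "port": 3306},        # Norway shard, Oslo DC
    "DEFAULT": {"host": "db-global.internal", "port": 3306},  # everyone else
}

def connect_for_user(registration_country, user, password, database):
    # Route by the country recorded at sign-up, not by the request's current IP.
    shard = SHARD_BY_COUNTRY.get(registration_country, SHARD_BY_COUNTRY["DEFAULT"])
    return pymysql.connect(host=shard["host"], port=shard["port"],
                           user=user, password=password, database=database)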
| Factor | Single Big Server | Sharded Cluster (CoolVDS) |
|---|---|---|
| Write Throughput | Limited by single disk I/O | Near-linear scaling as shards are added |
| Complexity | Low | High (requires routing logic) |
| Maintenance | Easy backups, hard upgrades | Complex backups, zero-downtime upgrades possible |
| Cost | Expensive enterprise hardware | Cost-effective commodity VPS nodes |
Application Logic Changes
Even with ProxySQL, your application needs to stop doing cross-shard JOINs. You cannot join a table on Server A with a table on Server B in a single query. You must handle data aggregation in code.
Here is a Python pseudo-code example of how you might fetch data from multiple shards if you need a global report:
def get_global_stats(shards):
    # Scatter-gather: run the same aggregate on every shard, sum the results in the app.
    total_revenue = 0
    for shard_connection in shards:
        cursor = shard_connection.cursor()
        cursor.execute("SELECT SUM(amount) FROM orders")
        result = cursor.fetchone()
        cursor.close()
        # SUM() returns NULL (None in Python) when a shard has no matching rows
        if result and result[0] is not None:
            total_revenue += result[0]
    return total_revenue
# This is slower than a single DB query, but it scales indefinitely.
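Calling it is straightforward. Here is one way to build the connection list for the two shard nodes from the setup above, assuming PyMySQL; the reporting credentials and schema name are placeholders:
import pymysql

shards = [
    pymysql.connect(host="10.0.0.2", user="report", password="secret", database="shop"),
    pymysql.connect(host="10.0.0.3", user="report", password="secret", database="shop"),
]
print(get_global_stats(shards))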
Infrastructure Matters: The Hardware Reality
Sharding software is useless if the underlying hardware chokes. In 2018, many VPS providers still oversell their CPUs. You think you are getting 4 cores, but you are sharing them with 50 other noisy tenants. When their WordPress sites get hacked, your database shard slows down.
This is why we built CoolVDS on KVM with strict resource isolation. We don't play the "burst" game. If you pay for 4 vCPUs and NVMe storage, you get that throughput dedicated to you. For a database shard, latency consistency is more important than raw peak speed. A 200ms spike in disk latency can cause a transaction pile-up that crashes the whole cluster.
Summary
Sharding is not a silver bullet. It introduces complexity. But when you are pushing thousands of writes per second, it is the only way to survive. Start with a directory-based or ProxySQL approach. Test your failover scripts, because a shard will fail eventually.
Need a sandbox to test your sharding logic? Spin up a high-performance CoolVDS instance today. Our Oslo datacenter offers the low latency your Norwegian users expect, and our NVMe drives eat IOPS for breakfast.