Surviving the Write Cliff: Practical Database Sharding Strategies for 2015

Database Sharding: Because innodb_buffer_pool_size Can't Fix Everything

We have all been there. It is 3:00 AM. The load average on the master database server just crossed 50.0. You have already maxed out the RAM, you have tuned your indexes, and you have switched to Percona Server. But the I/O wait is still killing your application.

Welcome to the "Write Cliff." Vertical scaling—throwing more money at a bigger server—has stopped working. It is time to talk about sharding.

In the Norwegian hosting market, where data sovereignty laws (personopplysningsloven) are strict, you cannot just blindly throw data onto a massive US cloud instance. You need architecture that respects boundaries while handling massive throughput.

The Sharding Reality Check

Sharding is not a magic bullet. It is complex. It breaks your JOIN queries. It makes backups a nightmare. But if you are pushing 5,000+ writes per second, it is often your only path forward.

Essentially, we are moving from "One Big Database" to "Many Small Databases" that share a schema but split the data.

Strategy 1: Key-Based (Hash) Sharding

This is the most common approach for SaaS platforms. You take a unique identifier (like a user_id), hash it, and use the result to determine which server gets the data.

shard_id = user_id % number_of_shards

The Good: Even distribution of data. Hotspots are rare unless a single key is orders of magnitude more active than the rest.

The Bad: Adding new shards is painful. If you go from 10 to 11 shards, the modulo result changes for almost every key, so you have to rebalance (migrate) nearly all of your data. Consistent hashing reduces that churn, but it is more complex to implement.
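A minimal sketch of the modulo routing above, plus a quick measurement of the rebalancing problem. The user IDs and shard counts are illustrative:

```python
# Key-based sharding: route each user_id to a shard by modulo.
def shard_for(user_id: int, num_shards: int) -> int:
    """Map a user_id to a shard index."""
    return user_id % num_shards

# How many users move when we grow from 10 to 11 shards?
users = range(100_000)
moved = sum(1 for uid in users if shard_for(uid, 10) != shard_for(uid, 11))
print(f"{moved / len(users):.0%} of users must migrate")  # → 91% of users must migrate
```

That 91% is the whole argument for planning your shard count generously up front, or for adopting consistent hashing before you need it.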

Strategy 2: Directory-Based Sharding

You maintain a "Lookup Service"—a lightweight database that knows where every user lives. When a request comes in, the app asks the Lookup Service: "Where is User 452?" The service responds: "Shard-03."

The Good: Flexibility. You can move users between shards without code changes. You can put premium users on high-performance NVMe instances (like the ones we just rolled out at CoolVDS) and free users on standard SSD tiers.

The Bad: The Lookup Service becomes a Single Point of Failure, and every request pays an extra round trip to it. If it goes down, nobody knows where their data lives, so replicate the directory and cache lookups aggressively in the application layer.
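A toy version of the directory pattern. In production the directory would live in a small replicated datastore (a MySQL table, Redis, etc.); here a plain dict stands in, and the shard names are made up:

```python
# Directory-based sharding sketch: a lookup service that maps users to shards.
class ShardDirectory:
    def __init__(self):
        self._location = {}  # user_id -> shard name

    def assign(self, user_id, shard):
        self._location[user_id] = shard

    def locate(self, user_id):
        return self._location[user_id]

    def move(self, user_id, new_shard):
        # Moving a user is a metadata update -- no rehash, no code change.
        self._location[user_id] = new_shard

directory = ShardDirectory()
directory.assign(452, "shard-03")
print(directory.locate(452))  # → shard-03

# Promote a premium user to a faster tier without touching app logic:
directory.move(452, "shard-nvme-01")
print(directory.locate(452))  # → shard-nvme-01
```

The `move()` call is the flexibility the hash approach cannot give you: the data migration still has to happen, but routing follows a single metadata update.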

Infrastructure Matters: The Latency Trap

Here is where most developers fail. They design a perfect sharding logic but deploy it on scattered infrastructure. If your App Server is in Oslo and your Shard 1 is in a budget datacenter in Germany, the round-trip time (RTT) will destroy your performance.

When you split a database, network latency becomes your new bottleneck.

Pro Tip: Keep your shards local. At CoolVDS, our internal network between instances in our Oslo facility operates at sub-millisecond latency. This is critical when your application needs to aggregate data from three different shards to render a single dashboard.
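Back-of-the-envelope arithmetic makes the latency trap concrete. Assume a dashboard render issues a few sequential queries to each of three shards; the RTT figures below are illustrative:

```python
# Cost of sequentially fanning a page render out across shards.
def fanout_cost_ms(rtt_ms: float, queries_per_shard: int, shards: int = 3) -> float:
    """Total network time spent if every query waits for the previous one."""
    return rtt_ms * queries_per_shard * shards

# Same-facility link (sub-millisecond) vs. cross-border link (~20 ms):
print(fanout_cost_ms(0.5, 4))  # → 6.0   (barely noticeable)
print(fanout_cost_ms(20, 4))   # → 240.0 (visible page lag)
```

Parallelizing the queries helps, but even then the slowest shard's RTT sets your floor, which is why the shards belong on the same low-latency network.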

Configuration: Tuning for Shards

Since shards are smaller, you do not need the massive configuration used for a monolith. Reliability, however, is paramount: with more servers in play, the probability that at least one of them suffers a hardware failure at any given time goes up. You need robust replication.

Here is a snippet for my.cnf focusing on durability for a write-heavy shard node (MySQL 5.6):

[mysqld]
# Use GTID for easier failover and replication management
gtid_mode = ON
enforce_gtid_consistency = ON

# Safety first - do not risk data loss for speed here
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1

# Optimize for heavy writes
innodb_io_capacity = 2000  # only if backed by SSD; far too high for spinning disks
innodb_log_file_size = 512M

The Datatilsynet Factor

Operating in Norway means respecting the Data Inspectorate (Datatilsynet). If you shard your database based on geography, be extremely careful. Sharding "EU users" to a server in Frankfurt and "Norwegian users" to Oslo is a valid strategy, but ensure your routing logic never accidentally writes Norwegian medical or financial data to a jurisdiction with weaker privacy protections.
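One way to enforce that boundary is a routing guard that fails loudly instead of mis-routing. This is a sketch under assumptions: the shard names, jurisdiction map, and data categories are all invented for illustration.

```python
# Jurisdiction-aware shard routing: refuse to place sensitive Norwegian
# records on shards outside Norway.
SHARD_JURISDICTION = {"shard-oslo-01": "NO", "shard-frankfurt-01": "DE"}
SENSITIVE_CATEGORIES = {"medical", "financial"}

def pick_shard(user_country: str, data_category: str, preferred_shard: str) -> str:
    """Return the preferred shard, unless routing there would break data residency."""
    if user_country == "NO" and data_category in SENSITIVE_CATEGORIES:
        if SHARD_JURISDICTION[preferred_shard] != "NO":
            raise ValueError("Norwegian sensitive data must stay in Norway")
    return preferred_shard

print(pick_shard("NO", "medical", "shard-oslo-01"))  # → shard-oslo-01
# pick_shard("NO", "medical", "shard-frankfurt-01") would raise ValueError
```

A hard failure at write time is ugly, but it is far cheaper than discovering a compliance breach in an audit.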

Using a provider like CoolVDS ensures that your "Oslo Shard" physically stays in Oslo. We don't silently migrate your VM to a different country for load balancing.

Start Small, but Start Now

Don't wait until your site crashes on Black Friday to think about sharding. You can start by "vertical sharding"—moving different tables to different servers (e.g., users DB on Server A, logs DB on Server B).
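Vertical sharding needs almost no routing logic, which is why it is a good first step. A minimal sketch, with hostnames as placeholders:

```python
# Vertical sharding: route by table, not by key.
TABLE_HOST = {
    "users":  "db-a.internal",
    "orders": "db-a.internal",  # keep tables you JOIN often on the same host
    "logs":   "db-b.internal",  # write-heavy, JOIN-free tables can move out first
}

def host_for(table: str) -> str:
    """Return the database host responsible for a table."""
    return TABLE_HOST[table]

print(host_for("logs"))  # → db-b.internal
```

Note the comment on `orders`: co-locating tables that you JOIN is what keeps vertical sharding painless. Split apart the tables nothing joins against, like logs, first.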

If you are ready to test a clustered setup, spin up three CoolVDS instances connected via our private LAN. Benchmark the latency yourself. If ping reports over 1ms, you are on the wrong host.