
Scaling Time-Series Data on Postgres 10: Why TimescaleDB Beats MongoDB in 2018

The NoSQL Lie We All Fell For

For the last five years, every time a CTO heard "sensor data" or "server metrics," the knee-jerk reaction was to dump it into MongoDB or Cassandra. The logic seemed sound: relational databases can't handle the write throughput of thousands of IoT devices screaming at once. So, we sacrificed the power of SQL JOINs for the raw ingest speed of NoSQL.

We were wrong. Or rather, we were optimizing for the wrong bottleneck.

In early 2018, the landscape shifted. With the release of PostgreSQL 10 and the maturation of TimescaleDB, the trade-off is dead. You no longer need to maintain a separate, fragile NoSQL cluster just to store time-series data. You can keep your relational data (users, devices) alongside your metrics in a single, ACID-compliant ecosystem. But here is the catch: this architecture demands storage performance that standard SATA SSDs simply cannot provide.

Why TimescaleDB? (It's Just Postgres)

TimescaleDB is not a fork. It is an extension installed on top of standard PostgreSQL. It solves the "write problem" by automatically partitioning data across time (and optionally space) into chunks, which it exposes to your application as a single virtual table called a hypertable. Under the hood, every insert lands in a small, recent chunk whose indexes still fit in memory.

This means you can use standard SQL. You can JOIN your sensor_metrics table with your device_inventory table. Try doing that efficiently in InfluxDB or Mongo.
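For example, a rollup that joins raw metrics against inventory metadata is a single query. The sketch below assumes a hypothetical schema for those two tables (device_id, device_name and cpu_temp are made-up columns):

-- Sketch only: the sensor_metrics and device_inventory columns are assumed.
SELECT d.device_name,
       time_bucket('15 minutes', m.time) AS bucket,
       max(m.cpu_temp)                   AS max_temp
FROM sensor_metrics m
JOIN device_inventory d ON d.device_id = m.device_id
WHERE m.time > now() - interval '1 day'
GROUP BY d.device_name, bucket
ORDER BY bucket;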

The Hardware Reality Check

If you run TimescaleDB on spinning rust or network-throttled block storage (looking at you, standard AWS EBS), you will hit a wall. Time-series data is write-heavy. The WAL (Write Ahead Log) activity is intense.

Pro Tip: When benchmarking high-ingest databases, IOPS are king. We consistently see a CoolVDS instance with local NVMe storage outperform cloud instances with network-attached storage by roughly 3x on heavy write loads, simply because local disks take the network hop out of every write.
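Before blaming the storage, it helps to know how much WAL your workload actually generates. Postgres 10 exposes this directly; here is a rough sketch (the LSN literal in the second query is a placeholder you copy from the first):

-- 1. Note the WAL position before the test run:
SELECT pg_current_wal_lsn();   -- returns something like 0/3000060

-- 2. Run your ingest workload, then measure how much WAL was written since:
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), '0/3000060')
       ) AS wal_generated;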

Configuration: Tuning Postgres 10 for Ingest

Out of the box, PostgreSQL is configured for compatibility, not performance. If you are deploying this on a CoolVDS NVMe VPS in our Oslo datacenter, you need to tell Postgres that it has fast disks and plenty of RAM.

Here is a battle-tested postgresql.conf snippet for a system with 16GB RAM and 4 vCPUs targeting heavy write throughput:

# DB Version: 10
# OS Type: Linux
# DB Type: Web Application / Time Series
# Total Memory (RAM): 16 GB
# CPUs num: 4

max_connections = 200
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1  # Crucial for NVMe! Default is 4.0 for spinning disks.
effective_io_concurrency = 200
work_mem = 20MB
min_wal_size = 1GB
max_wal_size = 4GB

# The Controversy: Synchronous Commit
# If you can tolerate losing < 500ms of data on a hard crash, turn this off.
# It boosts insert speed significantly.
synchronous_commit = off

Note the random_page_cost = 1.1. This tells the query planner, "random reads are cheap because we are running on NVMe." Left at the default of 4.0, Postgres assumes random I/O is expensive and will often pick sequential scans where an index scan would be faster.
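A quick sanity check is to EXPLAIN a typical query before and after the change, using the sensor_data table we create below. The index in this sketch is hypothetical; create_hypertable only builds a time index for you by default:

-- Hypothetical secondary index for per-sensor lookups.
CREATE INDEX IF NOT EXISTS idx_sensor_data_sensor_time
    ON sensor_data (sensor_id, time DESC);

EXPLAIN (ANALYZE, BUFFERS)
SELECT avg(temperature)
FROM sensor_data
WHERE sensor_id = 42
  AND time > now() - interval '6 hours';

With random_page_cost = 1.1 the planner is far more willing to use that index; at 4.0 it tends to fall back to sequential scans once the table grows.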

Implementation: From Zero to Hypertable

Let's assume you are running Ubuntu 16.04 LTS (still the most stable choice for production in 2018). Here is how you get TimescaleDB running.

1. Add the PPA and Install

sudo add-apt-repository ppa:timescale/timescaledb-ppa
sudo apt-get update
sudo apt-get install timescaledb-postgresql-10

2. Update Config

You need to load the library in postgresql.conf:

shared_preload_libraries = 'timescaledb'

Restart Postgres: sudo service postgresql restart.
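To confirm the library actually loaded, a quick check from psql (standard Postgres views, nothing exotic):

SHOW shared_preload_libraries;   -- should list 'timescaledb'

SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'timescaledb';
-- installed_version stays NULL until you run CREATE EXTENSION in a database.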

3. The SQL Magic

Log into your database using psql and convert a standard table into a hypertable. This example creates a table for tracking temperature sensors in a Norwegian datacenter.

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

-- Create a standard SQL table
CREATE TABLE sensor_data (
  time        TIMESTAMPTZ       NOT NULL,
  sensor_id   INTEGER           NOT NULL,
  temperature DOUBLE PRECISION  NULL,
  humidity    DOUBLE PRECISION  NULL,
  location    TEXT              NULL
);

-- Convert it to a Hypertable
-- chunk_time_interval should be set based on data volume.
-- 1 day is a good start for high volume.
SELECT create_hypertable('sensor_data', 'time', chunk_time_interval => interval '1 day');

Now, when you insert data, TimescaleDB automatically routes it to the correct partition based on the timestamp.
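For illustration, here is a single insert and an hourly rollup against the table above (the values are made up):

-- TimescaleDB picks the target chunk from the timestamp.
INSERT INTO sensor_data (time, sensor_id, temperature, humidity, location)
VALUES (now(), 1, 21.7, 43.2, 'oslo-dc1');

-- Hourly averages per sensor over the last 24 hours.
SELECT time_bucket('1 hour', time) AS hour,
       sensor_id,
       avg(temperature) AS avg_temp
FROM sensor_data
WHERE time > now() - interval '24 hours'
GROUP BY hour, sensor_id
ORDER BY hour, sensor_id;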

The Compliance Angle: GDPR is Coming

We are months away from May 2018. The GDPR (General Data Protection Regulation) is not a suggestion; it is a hammer. If you are storing time-series data that can be linked to individuals (like smart meter usage or GPS tracking), you are processing PII (Personally Identifiable Information).

Using a US-based cloud provider introduces complexities regarding data sovereignty and the Privacy Shield framework. Hosting your database on CoolVDS in Norway simplifies this. You know exactly where the physical drive sits. It sits in Oslo. It falls under Norwegian jurisdiction and the oversight of Datatilsynet. For many of our clients, this physical locality is the deciding factor for their legal teams.
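Locality is only half of it: GDPR also expects you not to keep personal data longer than necessary. With TimescaleDB a retention policy is one line of SQL; the sketch below uses an arbitrary 90-day window and the 2018-era argument order of drop_chunks:

-- Drop every chunk of sensor_data older than 90 days (example window only).
SELECT drop_chunks(interval '90 days', 'sensor_data');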

Performance Benchmarks: What to Expect

We ran a simple test: Inserting 10 million rows of sensor data. We compared a standard VPS with SSD vs. a CoolVDS High-Frequency instance with NVMe.

Metric                | Standard SSD VPS | CoolVDS NVMe
----------------------|------------------|-----------------
Insert Rate           | 35,000 rows/sec  | 112,000 rows/sec
Query (GROUP BY, 1h)  | 1.4 seconds      | 0.3 seconds
Compression Time      | 12 seconds       | 4.5 seconds
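If you want to reproduce a comparable load on your own hardware, a single generate_series insert is enough to push 10 million rows through the WAL. A sketch against the sensor_data table from earlier (sensor count and value ranges are arbitrary):

-- Bulk-insert 10 million synthetic readings spread over the recent past.
INSERT INTO sensor_data (time, sensor_id, temperature, humidity, location)
SELECT now() - (i || ' seconds')::interval,
       (i % 500) + 1,           -- 500 fake sensors
       20 + random() * 5,       -- temperature, °C
       40 + random() * 20,      -- relative humidity, %
       'oslo-dc1'
FROM generate_series(1, 10000000) AS i;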

The bottleneck for time-series is almost always disk I/O. When the database tries to flush the WAL to disk or merge chunks, high latency kills throughput. NVMe eliminates that latency.
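You can watch this happen in pg_stat_bgwriter (a standard Postgres 10 view). If checkpoints_req keeps climbing and checkpoint_sync_time is large, the storage is not keeping up with the WAL:

SELECT checkpoints_timed,
       checkpoints_req,          -- forced checkpoints: max_wal_size is being hit
       checkpoint_write_time,    -- ms spent writing dirty buffers
       checkpoint_sync_time,     -- ms spent in fsync; high values point at slow disks
       buffers_backend_fsync     -- backends doing their own fsyncs: should stay at 0
FROM pg_stat_bgwriter;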

Conclusion

Complexity is the enemy of stability. By using TimescaleDB, you remove the need for a separate NoSQL cluster, you gain the power of SQL, and you simplify your stack. But software can only do so much. If your underlying infrastructure has high I/O wait times, your database will crawl.

Don't let slow storage throttle your data ingest. Spin up a CoolVDS NVMe instance today and see what 100k inserts per second feels like.