Session

TimescaleDB: Re-engineering PostgreSQL as a time-series database

Date: 2018-09-06
Time: 23:00 - 23:50
Room: Stockton
Level: Intermediate

Many developers working with time-series data today operate polyglot solutions: a NoSQL database to store the time-series data (for scale) and a relational database for associated metadata and key business data. This leads to engineering complexity, operational challenges, and even referential integrity concerns. As this type of data proliferates, many have concluded that they need a purpose-built time-series database. Yet the current generation of time-series databases falls short, and still leaves users running complex polyglot stacks or immature solutions.

In this talk, I describe why these operational headaches are unnecessary and how we re-engineered PostgreSQL as a time-series database in order to simplify time-series application development. In particular, time-series workloads (which mostly append data about recent events) place different demands on a database than transactional (OLTP) workloads do. By taking advantage of these differences, TimescaleDB exhibits 20x higher insert rates than vanilla PostgreSQL and achieves much faster queries. Compared to Cassandra, a 5-node TimescaleDB cluster outperforms 30 Cassandra nodes, with higher insert rates, up to 5800x faster queries, 10% of the cost, and a more flexible data model, all while offering full SQL (including JOINs). This simplifies one’s product and stack down to a single database, PostgreSQL, while enabling users to ask much more complex and ad-hoc questions about their data.
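To make the point about appending data on recent events concrete, here is a toy Python sketch of time-based partitioning, where rows are routed into fixed-width time chunks so that new inserts and recent-range queries touch only a small part of the data. It is an illustration under stated assumptions, not TimescaleDB's implementation: the names `ChunkedTable`, `chunk_start`, and `CHUNK_INTERVAL` are hypothetical, and the one-day chunk width is arbitrary.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical chunk width chosen for illustration only.
CHUNK_INTERVAL = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts):
    """Return the start of the fixed-width time interval that contains ts."""
    n = (ts - EPOCH) // CHUNK_INTERVAL  # whole intervals since the epoch
    return EPOCH + n * CHUNK_INTERVAL


class ChunkedTable:
    """Toy table that stores rows in per-interval chunks keyed by time."""

    def __init__(self):
        self.chunks = defaultdict(list)

    def insert(self, ts, value):
        # An append about a recent event lands in the newest chunk only.
        self.chunks[chunk_start(ts)].append((ts, value))

    def query(self, start, end):
        """Return rows with start <= ts < end, skipping non-overlapping chunks."""
        rows = []
        for cs, chunk in self.chunks.items():
            if cs < end and cs + CHUNK_INTERVAL > start:
                rows.extend(r for r in chunk if start <= r[0] < end)
        return sorted(rows)


# Three days of hourly readings, inserted in time order.
table = ChunkedTable()
first = datetime(2018, 9, 4, tzinfo=timezone.utc)
for hour in range(72):
    table.insert(first + timedelta(hours=hour), float(hour))

# A query over the most recent hours only needs the latest chunk.
recent = table.query(
    datetime(2018, 9, 6, tzinfo=timezone.utc),
    datetime(2018, 9, 6, 6, tzinfo=timezone.utc),
)
```

In this sketch the 72 hourly rows end up in three one-day chunks, and the query over the first six hours of the last day scans only the chunk for that day.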

TimescaleDB is packaged as a PostgreSQL extension, released under the Apache 2 license.

Speaker

David Kohn