Schedule - PGDay Chicago 2024

Automating Postgres Index Selection Using Constraint Programming

Date: 2024-04-26
Time: 10:50–11:35
Room: Forum
Level: Advanced

In this talk, you will get an introduction to a new approach for automatically determining which set of indexes to create for a given Postgres query workload, based on objectives chosen by the developer or DBA.

We'll discuss how to process Postgres query workload statistics derived from pg_stat_statements, turning them into "scans", and finding a set of possible indexes for these scans. Further, we'll discuss why its better to optimize index selection for a given table, not just a single query.

To resolve the conflict between "too many indexes = high write overhead" and "no indexes = slow queries", the talk introduces a constraint programming optimization model that finds the mathematically optimal solution (set of index choices) based on a given set of constraints and objectives.

This approach is flexible, by allowing developers and DBAs to specify their intent on how a table should be indexed, whilst avoiding manual analysis of individual queries. Further it supports Postgres specific constructs, such as the option to limit indexes based on the impact they have on HOT Updates.

After this talk, you will have learned how to guide a constraint programming solver to find the right indexes for a table, and how you can extract the relevant data from Postgres. You will also have access to a fully working open-source example that can be utilized directly on a Postgres database.


Lukas Fittl