Schedule - PGDay Chicago 2023

    Why Machine Learning for Automatically Optimizing Databases Doesn't Work

    Date: 2023-04-20
    Time: 13:40–14:30
    Room: Room 3
    Level: Intermediate

    Database management systems (DBMSs) like PostgreSQL are complex software that requires sophisticated tuning to work efficiently for a given workload and operating environment. Such tuning requires considerable effort from experienced administrators, an approach that does not scale to large DBMS fleets. This problem has led to research on using machine learning (ML) to devise strategies to optimize DBMS configurations for any application, including automatic physical database design, knob configuration, and query tuning. Despite the many academic papers that tout the benefits of using ML to optimize databases, there have been only a few major success stories in industry in the last decade.

    In this talk, I will discuss the challenges of using ML-enhanced tuning methods to optimize PostgreSQL databases. I will address specific incorrect assumptions that researchers make about production database environments and identify why ML is not always the best solution to real-world database problems. As part of this, I will discuss state-of-the-art academic research and industry tuning implementations.


    Andy Pavlo