Level: Beginner
None of us were Postgres experts when we took on one of Semgrep’s biggest system overhauls: a rewrite of the data model and query layer behind our slowest but most critical API. P99 latency was measured in minutes, and our goal was to bring it under four seconds. We expected a few quick wins. Instead, the work became a four-month effort that touched nearly every layer of the stack and gave us a hands-on crash course in Postgres.
Early on, when feature requests were often tied to closing deals, we prioritized shipping speed over query performance. Storing data in a large JSON column made it easy to iterate quickly. But as we landed more enterprise customers, the performance cost became impossible to ignore. We needed to revamp our systems to keep pace with our success.
We’ll share what we tried, what worked, and what didn’t: from identifying bottlenecks by profiling top queries and validating improvements with cloned production data, to using index-only scans, Common Table Expressions (CTEs), and bulk write operations. Along the way, we hit plenty of surprises, such as primary key lookups writing to temporary files and the query planner unexpectedly favoring less efficient indexes.
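To give a flavor of the techniques the talk covers, here is a rough sketch of what a few of them look like in Postgres. The findings table, its columns, and the example queries are hypothetical stand-ins rather than our actual schema:

```sql
-- Profiling: find the slowest queries by total execution time
-- (requires the pg_stat_statements extension; column names shown are for Postgres 13+).
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Index-only scan: a covering index lets Postgres answer the query from the
-- index alone, without touching the table heap (assuming the visibility map
-- is mostly up to date).
CREATE INDEX findings_repo_status_idx
    ON findings (repository_id, status)
    INCLUDE (rule_id);

SELECT rule_id, count(*)
FROM findings
WHERE repository_id = 42 AND status = 'open'
GROUP BY rule_id;

-- Bulk write with a CTE: insert many rows in a single statement
-- instead of issuing one INSERT per row.
WITH new_findings (repository_id, rule_id, status) AS (
    VALUES (42, 'sqli-tainted-query', 'open'),
           (42, 'hardcoded-secret',   'open')
)
INSERT INTO findings (repository_id, rule_id, status)
SELECT repository_id, rule_id, status
FROM new_findings;
```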
We didn’t hit our four-second target, but getting latency down to under seven seconds was a major win. More importantly, the project reshaped how we think about managing databases. Performance is now a core design consideration, not just a post-launch optimization.
If you’re scaling Postgres in a fast-growing product, or are just curious how we reworked our systems to meet enterprise-scale demands, this talk offers practical takeaways: the lessons we learned, the tradeoffs we made, and the steps we took to cut query latency by over 90%.