PostgreSQL tips and tricks: performance
By definition, it is impossible to finely tune all the complex ad hoc queries an application generates at runtime: you have to rely on the query optimizer to make them fast, much more than you do with carefully handcrafted queries.
PostgreSQL has no query optimizer hints. This is obviously a long-debated issue in the PostgreSQL community. Let's not argue, and try to move forward without query hints.
What are the options at hand when queries are too slow? The usual basic checklist is as follows:
- Configure PostgreSQL to log all slow queries: `auto_explain` is your friend. Use the excellent tool at explain.depesz.com to read the execution plans it logs (see the configuration sketch after this list).
- Configure the planner cost constants to fit your system: amount of RAM, SSD or HDD storage… (also covered in the sketch below).
- Make sure there are enough statistics available: raise `default_statistics_target`, up to its maximum value of 10000 (also in the sketch below).
- Make sure planner statistics are up to date; if not, run `ANALYZE` to update them (see the SQL sketch after this list).
- The more joins in your query, the more likely the optimizer is to pick a wrong plan. As an advanced user, prefer explicit joins over implicit joins, and instruct the optimizer to run explicit joins in the order you write them, as shown in the last sketch after this list.
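The first three items of this checklist map to a handful of postgresql.conf settings. Here is a minimal sketch; the values are illustrative assumptions for an SSD-backed server with 16 GB of RAM, not universal recommendations:

```
# Log any statement running longer than 2 seconds (value in milliseconds).
log_min_duration_statement = 2000

# auto_explain logs the execution plan of slow queries (requires a restart).
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '2s'
auto_explain.log_analyze = on       # adds overhead: times every query

# Planner cost constants, assuming SSD storage and 16 GB of RAM.
random_page_cost = 1.1              # the default of 4.0 is tuned for spinning disks
effective_cache_size = 12GB         # roughly the RAM available for caching

# More detailed statistics; 10000 is the maximum and increases ANALYZE time.
default_statistics_target = 10000
```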
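To see whether statistics are stale before reaching for `ANALYZE`, the `pg_stat_user_tables` view records when each table was last analyzed. A short sketch, with `orders` as a hypothetical table name:

```sql
-- Tables never analyzed, or analyzed longest ago, come first.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY GREATEST(last_analyze, last_autoanalyze) NULLS FIRST;

-- Refresh statistics for one table (or run a bare ANALYZE for the whole database).
ANALYZE orders;
```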
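Fixing the join order relies on `join_collapse_limit`: when set to 1, the planner executes explicit `JOIN` clauses exactly in the order they are written. A sketch with hypothetical tables:

```sql
-- With join_collapse_limit = 1, explicit JOINs keep their written order.
-- (from_collapse_limit plays the same role for implicit joins.)
SET join_collapse_limit = 1;

SELECT o.id, c.name, i.sku
FROM orders o
JOIN customers c   ON c.id = o.customer_id   -- this join runs first
JOIN order_items i ON i.order_id = o.id;     -- then this one
```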
Troubleshooting CTEs and joins
We experienced slow joins with CTEs (Common Table Expressions) executed as nested loops when hash joins were the better choice. In such a case, one might be tempted to disable nested loops with
enable_nestloop = false. Indeed, the difference was pretty impressive: the query took more than 5 minutes to run with default settings, and about 2 seconds with nested loops disabled. But this was not an option, as it could have adverse effects on other queries.
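If you still want to experiment with it, the change can at least be confined to a single transaction with `SET LOCAL` instead of being applied server-wide. A minimal sketch:

```sql
BEGIN;
-- SET LOCAL lasts only until COMMIT/ROLLBACK, so other queries keep nested loops.
SET LOCAL enable_nestloop = off;
-- ... run the slow query here ...
COMMIT;
```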
CTEs are optimization fences (in PostgreSQL 11 and earlier; since version 12 they can be inlined unless declared MATERIALIZED) and are often presented as a way to trick the query planner. In our particular case, CTEs were triggering bad execution plans. A closer look revealed that the row estimates were completely wrong: the optimizer thought each CTE result contained only one row, whereas both actually held around 2,500 rows.
By rewriting the CTEs to use index scans instead of sequential scans, we got their row counts properly estimated and, in the end, a correct execution plan.
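The sketch below illustrates the kind of misestimate we observed; the schema and numbers are hypothetical, not our actual tables:

```sql
EXPLAIN ANALYZE
WITH recent_orders AS (   -- add MATERIALIZED on PostgreSQL 12+ to keep the fence
    SELECT * FROM orders
    WHERE created_at > now() - interval '7 days'
)
SELECT *
FROM recent_orders r
JOIN order_items i ON i.order_id = r.order_id;

-- A telltale line in the output looks like:
--   CTE Scan on recent_orders (cost=... rows=1 ...) (actual time=... rows=2500 ...)
-- rows=1 estimated vs. rows=2500 actual is exactly the mismatch that pushes
-- the planner toward a nested loop instead of a hash join.
```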