Tuesday, April 2, 2013

Database Optimization: Query Performance vs. Request Performance

When it comes time to optimize your web application's database performance, there are (at least) two types of tools you can leverage. You can use an analysis tool like pgFouine, which is focused on individual query performance, or a tool like Scout or New Relic, which is focused on request performance.

pgFouine is fantastic at isolating any single query that, in aggregate, is slowing down your application the most. This is a function of the query's invocation count multiplied by its average execution time. Of course, it will report not just the single most expensive query, but the top N most expensive queries. (The tool will also will show you the most frequently executed queries and the overall single slowest queries, but these metrics are less useful for determining what queries to focus on).

But what if the top N queries are all taking roughly the same amount of time? Which one(s) should you try to optimize or eliminate? pgFouine cannot help you make this decision intelligently. Instead you need a tool that will allow you to focus on performance at the level web requests.  Consider that each request your web application handles is likely comprised of many queries. So if you have one type of request that is taking 80% of your web application's processing time, then you probably want to see the full set of queries that together are causing these requests to be non-performant. Perhaps you can eliminate many of these queries, with an improved application design. This is when a tool such as New Relic (or Scout's "Application" feature) becomes invaluable.

For each type of request, you can see the most expensive queries, but more importantly the full set of queries that are being issued for a given request. (These tools are equally useful for finding application layer code that is non-performant, but I'm only concerned with database performance here.) With a request-level view you can start to evaluate your overall design to determine whether you can eliminate certain queries altogether. For example, you might realize that you are issuing two similar but different queries that can be combined into a single query. pgFouine might show that these two queries are equally performant, while a request-level analysis tool will show you that they are being executed side-by-side while serving a single request. This is the hint one needs to start understanding where optimizations can be made at a design level higher than an individual query.