Performance Monitor

Do not stand in the way of indexes

We’re being approached a lot by customers who’re asking us “why the database doesn’t use my index?”. Well, that’s a great question, with endless possible answers. But, in this article, we’ll try to provide several common options we see a lot, so hopefully, you’ll find them useful for your own use case.

Example #1 – Avoid wrapping indexed columns with functions

Consider this query, which counts the number of hot dogs purchased in the US on 2018. Just in case you’re curious, 18,000,000,000 hot dogs were sold in the US in 2018.

SELECT count(*) FROM us_hotdog_purchases WHERE YEAR(purchase_time) = ‘2018’

As you can see, we are using the YEAR function to grab the year part from the purchase_time column. This function call will prevent the database from being able to use an index for the purchase_time column search, because we indexed the value of purchase_time, but not the return value of YEAR(purchase_time).

To overcome this challenge and tune this SQL query, you can index the function’s result, by using  Generated Columns, which are available starting MySQL 5.7.5.

Another solution can be to find an alternative way to write the same query, without using the function call. In this example, we can transform that condition to a 2-way range condition, which will return the same results:

SELECT count(*) FROM us_hotdog_purchases WHERE purchased_at >= ‘2018-01-01’ AND purchased_at < ‘2019-01-01’

Example #2 – avoid OR conditions

Consider this query, which selects the amount of posts on Facebook posted after new year’s eve, or posted by a user named Mark.

SELECT count(*) FROM fb_posts WHERE username = ‘Mark’ OR post_time > ‘2018-01-01’

Having an index on both the username and post_time columns might sound helpful, but in most cases, the database won’t use it, at least not in full. The reason will be the connection between the two conditions – the OR operator, which makes the database fetch the results of each part of the condition separately.

An alternative way to look at this query can be to ‘split’ the OR condition and ‘combine’ it using a UNION clause. This alternative will allow you to index each of the conditions separately, so the database will use the indexes to search for the results and then combine the results with the UNION clause.

WHERE username = ‘Mark’
WHERE post_time > ‘2018-01-01’

Please note that if you don’t mind duplicate records in your result set, you can also use UNION ALL (which will perform better than the default UNION DISTINCT).

Example #3 – Avoid sorting with a mixed order

Consider this query, which selects all posts from Facebook and sorts them by the username in an ascending order, and then by the post date in a descending order.

0 Comment

Leave a Reply