Why does MySQL not use a hash table for UNIQUE indexes?

I am wondering about the performance impact of a UNIQUE index in MySQL. I read that these use a B-Tree algorithm behind the scenes, just like normal indexes, but I am trying to understand why.

My thought process: The database already knows that there can only be a single record with a given value, so it can use a hash table to optimise reads and writes to O(1) time complexity instead of O(log n).

Please let me know if I am completely off here.

How to solve :

Method 1

Hash tables come with problems of their own:

  1. Collisions

    A collision is when two different data values produce the same hash value.

    The probability of at least one collision is actually quite high, and it rises quickly as the number of stored values grows.

  2. Size and resources

    To keep collisions rare, the hash table has to be much larger than the data it stores. Check this article if you want to know more. It also describes the extra CPU work needed to manage a table of that size and the extra time needed to insert data.
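The collision argument above can be made concrete with the standard birthday-bound approximation. The following is only a rough sketch: it assumes a perfectly uniform hash function, and the function name `collision_probability` and the 2^32 hash space are illustrative choices, not anything MySQL actually uses.

```python
import math


def collision_probability(n_keys: int, n_buckets: int) -> float:
    """Approximate chance that at least two of n_keys share a hash value.

    Assumes every key is hashed uniformly at random into n_buckets slots
    and uses the birthday-bound approximation 1 - exp(-n(n-1) / (2m)).
    """
    exponent = n_keys * (n_keys - 1) / (2.0 * n_buckets)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


# Hypothetical 32-bit hash space, chosen only for illustration.
for n in (1_000, 100_000, 1_000_000):
    p = collision_probability(n, 2**32)
    print(f"{n:>9} keys into 2^32 slots: P(collision) ~ {p:.4f}")
```

Under these assumptions the probability is negligible for a thousand keys but grows toward certainty well before the number of keys approaches the number of slots, which is why a hash-based structure needs both a large table and a collision-handling strategy.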

A balanced tree has its drawbacks too, but overall it is a very good middle ground: it is relatively small, and search, insert, and delete all take O(log n), as shown here.
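To get a feel for how small O(log n) is in practice, the sketch below counts how many levels a balanced tree needs to hold a given number of keys. It is a simplified model: the helper `levels_needed` and the fan-out values are hypothetical, and it ignores page fill factors and every other detail of MySQL's real on-disk layout.

```python
def levels_needed(n_keys: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= n_keys.

    Simplified model: every node has exactly `fanout` children.
    Integer arithmetic is used to avoid floating-point log rounding.
    """
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


# Fan-out 2 models a binary tree; 100 is an assumed fan-out for a
# wide, B-tree-like node. Both values are illustrative only.
for n in (1_000_000, 1_000_000_000):
    print(n, "keys:",
          levels_needed(n, 2), "levels at fan-out 2,",
          levels_needed(n, 100), "levels at fan-out 100")
```

In this model even a billion keys need only a handful of levels once nodes are wide, so the gap between O(log n) and O(1) is a small constant number of steps rather than a dramatic difference.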

This link outlines an implementation, so you can test it for yourself.

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.