Fuzzy searching through multiple fields in postgreSQL

All we need is an easy explanation of the problem, so here it is.

I’ve to do fuzzy search on multiple fields (in an attempt to create something like autocomplete similar to product search in amazon).
I tried this through ElasticSearch but was wondering if there’s something equivalent to it in postgreSQL.

Here’s sample code for elasticsearch: (both the fields, title and description, are index as type: text)

GET index_name/_search
{
    "query": {
          "bool": {
                "must": [
                      {
                           "multi_match": {
                                 fields: ["description", "title"],
                                 query: "postgres",
                                 fuzziness: 1
                           }
                      }
                ]
          }
    }
}

I’ve tried the same using pg_tram in postgreSQL, it worked for one field with similarity() (% operator) but I don’t know how to extend this on multiple fields.

This is what I did in postgreSQL, not sure if it’s good way though:

select * from table t
   where similarity("title", "postgres") > 0.5;
select * from table t
   where similarity("title", "postgres") > 0.5 OR similarity("description", "postgres") > 0.5;

Also is there any way to introduce fuzziness in ts_vector (FTS) query ?
Will appreciate any help/guidance in this context.
P.S: let me know if my description is missing something.

Thanks 🙂

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

You could use the word similarity operator <%:

SELECT ... FROM tab
WHERE 'postgres' <% concat(title, ' ', description);

To speed that up, you can create a GIN index on that expression:

CREATE INDEX ON tab USING gin (concat(title, ' ', description) gin_trgm_ops);

You can adjust the parameter pg_trgm.word_similarity_threshold to get the desired sensitivity.

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply