Query optimization with joins and unions and fulltext boolean search

I have the following base tables, each with a subtable that holds its search bait (keywords). This is on AWS Aurora with MySQL 5.7 compatibility.

tblA (id, userId, title, ...)
tblB (id, userId, title, ...)

tblA_searchBait (id->tblA.id, keywords)
tblB_searchBait (id->tblB.id, keywords)

I need to do a boolean fulltext search to obtain combined results in a single query. Currently I use the following:

select          base.title as title,
                match(search.keywords) against (? in boolean mode) as relevance
from            tblA as base
                join tblA_searchBait as search on search.id = base.id           
where           base.userId = ? and 
                match(search.keywords) against (? in boolean mode)

union all

select          base.title as title,
                match(search.keywords) against (? in boolean mode) as relevance 
from            tblB as base
                join tblB_searchBait as search on search.id = base.id           
where           base.userId = ? and 
                match(search.keywords) against (? in boolean mode)

order by        relevance desc              
limit ?, ?;

This works, but I am wondering whether the query can be made more performant. (Please note: I cannot change the schema of the tables.) Specifically, I'm wondering whether it makes any difference to move the userId filter into the join condition, like this:

select          base.title as title,
                match(search.keywords) against (? in boolean mode) as relevance
from            tblA as base
                join tblA_searchBait as search on search.id = base.id
                and base.userId = ?
where           match(search.keywords) against (? in boolean mode)

union all

select          base.title as title,
                match(search.keywords) against (? in boolean mode) as relevance
from            tblB as base
                join tblB_searchBait as search on search.id = base.id
                and base.userId = ?
where           match(search.keywords) against (? in boolean mode)

order by        relevance desc
limit ?, ?;

How to solve :

Method 1

I would not trust the "relevance" values provided by two different tables to be comparable, since each FULLTEXT index scores matches against its own contents. So, I would work on putting all the FULLTEXT data into a single table.

Why have 2 similar subtables?

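If the keywords lived in a single column of one table (a shared super-table, if you have such a thing), the relevance values would be directly comparable and the query would probably run about twice as fast, because only one FULLTEXT search would be needed.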
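As a rough sketch of what a single FULLTEXT table could look like if the schema were ever open to change: the table name all_searchBait, the source column, the copied userId column, and all column types below are hypothetical and do not come from the question.

```sql
-- Hypothetical combined search table (names and types are assumptions).
CREATE TABLE all_searchBait (
    source   ENUM('A', 'B') NOT NULL,   -- which base table the row belongs to
    id       INT UNSIGNED   NOT NULL,   -- tblA.id or tblB.id, depending on source
    userId   INT UNSIGNED   NOT NULL,   -- copied from the base table to avoid a join
    keywords TEXT           NOT NULL,
    PRIMARY KEY (source, id),
    FULLTEXT KEY ft_keywords (keywords)
) ENGINE=InnoDB;

-- One FULLTEXT search, one relevance scale, no UNION.
SELECT   source,
         id,
         MATCH(keywords) AGAINST (? IN BOOLEAN MODE) AS relevance
FROM     all_searchBait
WHERE    userId = ?
  AND    MATCH(keywords) AGAINST (? IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT    ?, ?;
```

The title would still have to be fetched from tblA or tblB using source and id, or copied into this table as well.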

"I cannot change the schema of the tables."

Sorry, but changing the schema is often a step in optimization.

WHERE vs ON

"Proper" coding specifies the relationship between JOINed tables in the ON clause and the filtering in the WHERE clause. I don't know of a case with a plain (inner) JOIN where it matters in which of the two places a condition is put. (It does matter for LEFT JOIN.)
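To illustrate the LEFT JOIN exception, here is a small sketch using the question's tables. The LIKE '%apple%' filter is only a stand-in condition chosen for the example.

```sql
-- Filter in ON: every tblA row is returned; search.keywords is NULL
-- for rows that have no matching search row.
SELECT base.id, search.keywords
FROM   tblA AS base
       LEFT JOIN tblA_searchBait AS search
              ON search.id = base.id
             AND search.keywords LIKE '%apple%';

-- Filter in WHERE: rows where search.keywords is NULL fail the test and
-- are dropped, so this returns the same rows as an inner JOIN would.
SELECT base.id, search.keywords
FROM   tblA AS base
       LEFT JOIN tblA_searchBait AS search
              ON search.id = base.id
WHERE  search.keywords LIKE '%apple%';
```

With an inner JOIN, both placements describe the same set of rows, which is why the two versions in the question should behave identically.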

You can check that they are identical by running EXPLAIN SELECT ...; SHOW WARNINGS; for each version. The warning should contain the SELECT spelled out as the Optimizer sees it. That will include, among other things, ON conditions moved into WHERE (for whatever reason).
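A minimal sketch of that check on one half of the UNION. The literal values 42 and '+apple' are made-up stand-ins for the ? placeholders.

```sql
-- Version 1: userId filter in WHERE
EXPLAIN
SELECT base.title AS title
FROM   tblA AS base
       JOIN tblA_searchBait AS search ON search.id = base.id
WHERE  base.userId = 42
  AND  MATCH(search.keywords) AGAINST ('+apple' IN BOOLEAN MODE);
SHOW WARNINGS;   -- shows the statement as rewritten by the Optimizer

-- Version 2: userId filter in ON
EXPLAIN
SELECT base.title AS title
FROM   tblA AS base
       JOIN tblA_searchBait AS search ON search.id = base.id
                                     AND base.userId = 42
WHERE  MATCH(search.keywords) AGAINST ('+apple' IN BOOLEAN MODE);
SHOW WARNINGS;
```

If the two rewritten statements and the two EXPLAIN plans come out the same, the placement of the userId condition makes no difference to performance.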

UNION and parens

The final ORDER BY sits right after the second SELECT. Does it belong to that SELECT, or to the UNION as a whole? I like to make it unambiguous by doing this:

( SELECT ... )
UNION 
( SELECT ... )
ORDER BY ...

Note that each SELECT could also have its own ORDER BY, hence there could be a total of 3 ORDER BYs. (There are obscure cases where such is valid.)
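Applied to the query from the question, the parenthesized form might look like the following sketch. It keeps the question's UNION ALL and placeholders unchanged.

```sql
( SELECT base.title AS title,
         MATCH(search.keywords) AGAINST (? IN BOOLEAN MODE) AS relevance
  FROM   tblA AS base
         JOIN tblA_searchBait AS search ON search.id = base.id
  WHERE  base.userId = ?
    AND  MATCH(search.keywords) AGAINST (? IN BOOLEAN MODE)
)
UNION ALL
( SELECT base.title AS title,
         MATCH(search.keywords) AGAINST (? IN BOOLEAN MODE) AS relevance
  FROM   tblB AS base
         JOIN tblB_searchBait AS search ON search.id = base.id
  WHERE  base.userId = ?
    AND  MATCH(search.keywords) AGAINST (? IN BOOLEAN MODE)
)
ORDER BY relevance DESC   -- applies to the combined result
LIMIT ?, ?;
```

An ORDER BY inside one of the parenthesized SELECTs is generally only useful together with a LIMIT; without one, MySQL is free to ignore it.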

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.