How to improve simple = condition on a large table?

I have the following simple query:

SELECT * FROM "teammgr_team" WHERE ("teammgr_team"."real" = true AND "teammgr_team"."name" = 'abc');

But it takes too long:

-------------------------------------------------------------------------------------------------------------------
 Seq Scan on teammgr_team  (cost=0.00..114772.49 rows=118 width=121) (actual time=24.544..618.185 rows=12 loops=1)
   Filter: ("real" AND ((name)::text = 'abc'::text))
   Rows Removed by Filter: 4752431
 Planning time: 0.066 ms
 Execution time: 618.217 ms
(5 rows)

I assume that is because the table is very large:

  count
---------
 4752443
(1 row)

This is the table and relevant column:

                                       Table "public.teammgr_team"
        Column        |         Type          |                         Modifiers
----------------------+-----------------------+-----------------------------------------------------------
 id                   | integer               | not null default nextval('teammgr_team_id_seq'::regclass)
 name                 | character varying(40) | not null

Indexes:
    "teammgr_team_pkey" PRIMARY KEY, btree (id)
    "teammgr_team_club_id" btree (club_id)

I am not sure if adding an index to a character column is advisable. I would think so but I don’t know enough about databases.

So I am thinking about adding a simple index:

CREATE INDEX teammgr_team_name ON teammgr_team (name);

I am keeping in mind that it shouldn’t be UNIQUE, because team names are not unique.

  • Would adding this index help improve execution time?

  • I’ve gone through the docs, but is there any option that would be beneficial to my goal?

How to solve:

Method 1

Your query filters millions of rows to return a handful of rows. So, yes, adding this index will help massively:

CREATE INDEX teammgr_team_name_idx ON teammgr_team (name);
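
On a table of this size it may be worth building the index without blocking writes, and then confirming that the planner actually uses it. The following is a minimal sketch of a variant of the statement above, assuming PostgreSQL and that a slower index build is acceptable. CONCURRENTLY and the verification steps are additions for illustration, not part of the original answer:

```sql
-- Variant of the CREATE INDEX above: build without blocking writes.
-- Assumption: this is run outside a transaction block, which
-- CREATE INDEX CONCURRENTLY requires.
CREATE INDEX CONCURRENTLY teammgr_team_name_idx ON teammgr_team (name);

-- Refresh planner statistics for the table.
ANALYZE teammgr_team;

-- Re-run the original query with timing to compare against the Seq Scan plan.
EXPLAIN ANALYZE
SELECT * FROM teammgr_team WHERE "real" AND name = 'abc';
```

If the index is picked up, the Seq Scan node should give way to an Index Scan or a Bitmap Heap Scan on teammgr_team_name_idx, with a much lower execution time.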

If your query always asks for teammgr_team."real" = true and that case is not the common case in the table, a partial index would be even better:

CREATE INDEX teammgr_team_name_real_idx ON teammgr_team (name)
WHERE real;

Or maybe a multicolumn index:

CREATE INDEX teammgr_team_name_real_idx ON teammgr_team (name, real);

But adding a boolean column as index expression typically has limited benefit. A partial index on the rare case is usually more effective.

That all depends on exact data distribution. And, possibly, on typical write patterns: highly volatile columns (updated a lot) are more expensive to index.
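
To get a feel for the data distribution and write activity mentioned above, something like the following could be run first. This is only a sketch: the column aliases are made up for illustration, and the counters in pg_stat_user_tables are per table, not per column, so they give just a rough picture of volatility:

```sql
-- How many rows have "real" set? If true is the rare case,
-- the partial index (WHERE real) is the better fit.
SELECT "real", count(*) AS row_count
FROM   teammgr_team
GROUP  BY "real";

-- Rough write activity on the table since statistics were last reset.
-- pg_stat_user_tables is a standard PostgreSQL statistics view.
SELECT n_tup_ins AS inserted, n_tup_upd AS updated, n_tup_del AS deleted
FROM   pg_stat_user_tables
WHERE  relname = 'teammgr_team';
```

A high update count relative to the table size suggests that every additional index will add noticeable write cost.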

Aside 1

Simplify:

SELECT * FROM teammgr_team WHERE real AND name = 'abc';

WHERE teammgr_team."real" = true is just a noisy way of saying WHERE real.

Aside 2

Don’t use basic type names like real as identifiers. That leads to confusion. And "name" is not a good name, either.
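
If renaming is an option, it could look roughly like this. The new names is_real and team_name are hypothetical, chosen only for illustration, and any application code or ORM mapping that refers to the old names would have to be changed at the same time:

```sql
-- Hypothetical new column names; adjust to your own naming scheme.
-- Existing indexes on these columns keep working after the rename.
BEGIN;
ALTER TABLE teammgr_team RENAME COLUMN "real" TO is_real;
ALTER TABLE teammgr_team RENAME COLUMN name TO team_name;
COMMIT;
```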

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
