Is there a difference between text_pattern_ops and COLLATE "C"?

If I have a text column "name" with a non-C collation, for example tr-TR, and I run the query

SELECT * FROM t WHERE name LIKE 'a%'

then the query will use a sequential scan.

If I now create an index

CREATE INDEX ON t(name text_pattern_ops)

the above query can use a bitmap index scan instead. But there is another technique that achieves the same result:

CREATE INDEX ON t(name COLLATE "C")

Are those approaches completely equivalent or are there differences?

How to solve :

Method 1

The manual:

Note that you should also create an index with the default operator
class if you want queries involving ordinary <, <=, >, or >=
comparisons to use an index. Such queries cannot use the
xxx_pattern_ops operator classes. (Ordinary equality comparisons can
use these operator classes, however.) It is possible to create
multiple indexes on the same column with different operator classes.
If you do use the C locale, you do not need the xxx_pattern_ops
operator classes, because an index with the default operator class is
usable for pattern-matching queries in the C locale.
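
To illustrate the manual's point about multiple indexes on the same column, here is a minimal sketch. It assumes the table t and the column name from the question; the index names are made up for this example.

```sql
-- Hypothetical index names; table t and column name come from the question.

-- Serves left-anchored patterns such as LIKE 'a%' and plain equality.
CREATE INDEX t_name_pattern_idx ON t (name text_pattern_ops);

-- Default operator class: serves <, <=, >, >= in the column's own collation.
CREATE INDEX t_name_default_idx ON t (name);
```

Both indexes have to be maintained on every write, which is the cost of covering both kinds of query this way.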

So the index with COLLATE "C" does everything a text_pattern_ops index would do, plus more that the latter cannot, such as supporting this query:

SELECT * FROM t
WHERE name >= 'z' COLLATE "C";

COLLATE "C" must be specified explicitly in the query to make the COLLATE "C" index applicable (unless the column uses the "C" collation to begin with, but then there would be no need for a special index).
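
One way to check this on your own data is to compare query plans. The sketch below assumes the table t from the question already exists with a non-C collation on name; the index name is hypothetical. The planner may still prefer a sequential scan on a small table, so no particular plan output is claimed here.

```sql
-- Hypothetical index name; t and name come from the question.
CREATE INDEX t_name_c_idx ON t (name COLLATE "C");
ANALYZE t;

-- Left-anchored pattern: the case from the question.
EXPLAIN SELECT * FROM t WHERE name LIKE 'a%';

-- Range comparison with an explicit collation that matches the index.
EXPLAIN SELECT * FROM t WHERE name >= 'z' COLLATE "C";

-- Same comparison without COLLATE "C": it uses the column's collation,
-- so the "C" index is not applicable.
EXPLAIN SELECT * FROM t WHERE name >= 'z';
```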

The above query requires the feature of "per-column collation support" (as does your index). The feature was added with Postgres 9.1. The operator class text_pattern_ops is much older and made a lot of sense before Postgres 9.1.

Since Postgres 9.1 the COLLATE "C" variant is superior as it is more versatile. The only reason I can think of to still prefer text_pattern_ops would be to intentionally not support the additional use cases, for some odd reason. The manual might actually add a deprecation notice for the operator classes xxx_pattern_ops.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
