Is there an index that would support comparison of 2-tuple values?

All we need is an easy explanation of the problem, so here it is.

I want to implement cursor-based pagination for a large data set.

With OFFSET based pagination, when the user wants page N, you just OFFSET N * page_size.
The query ends up like:

FROM books
OFFSET 100000

But OFFSET gets slower the larger the value it’s given because PostgreSQL has to load and discard the preceding rows.

A cursor-based approach is where we tell the user "here’s page N, and since the last record on it has id of X, to get the next page you should ask me for records with id > X."
The query ends up like:

FROM books
WHERE id > 100000

In that case, PostgreSQL can load only the needed rows.

That works great when sorting by id.
But I’d like to be able to sort by other columns – for instance, by title – and still paginate.

The problem is that title is not unique. So if the final record on page N has the title "About Weasels" and there are multiple books with that title, requesting the next page with WHERE title > 'About Weasels' may skip some of them.

I can get unique values by having the user request WHERE (title, id) > ('About Weasels', 100000), but that performs poorly because PostgreSQL has to compute (title, id) for every row.

I tried adding an index to pre-compute that 2-tuple: CREATE INDEX books_title_and_id ON books (title, id);

…but that index makes no difference to the query plan.

Is there an index I could create which would speed up this query?

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

Here is an example using your index:

explain analyze 
select * from books WHERE 
(title, id) > ('KVFNdl5F', 994364) 
order by title, id 
limit 10;

Note that the first occurrence of title, id (in the > comparison) must be in parentheses, and the second occurrence (in the ORDER BY) must not be in parentheses.

                                                                QUERY PLAN                                                                 
 Limit  (cost=0.42..1.09 rows=10 width=45) (actual time=0.062..0.081 rows=10 loops=1)
   ->  Index Scan using books_title_and_id on books  (cost=0.42..42823.96 rows=644090 width=45) (actual time=0.060..0.077 rows=10 loops=1)
         Index Cond: (ROW(title, id) > ROW('KVFNdl5F'::text, 994364))
 Planning Time: 0.107 ms
 Execution Time: 0.109 ms

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from or, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply