How to quickly return the number of rows in a distributed hypertable with more than 100 million rows in TimescaleDB?

In a ‘vanilla’ PostgreSQL 12.7 database, I generally run the following query to learn the estimated number of rows in tables with 100+ million rows:

----------------------------------------------------------
-- Return the estimated number of rows for the given table
----------------------------------------------------------
SELECT reltuples::bigint AS estimate_number_of_rows
FROM   pg_class
WHERE  oid = to_regclass('name_of_some_big_table');

This type of query doesn’t work for a distributed hypertable on our multi-node TimescaleDB installation.

I checked the TimescaleDB API Reference but couldn’t find what I’m looking for.

Is there such a straightforward query to quickly return the estimated number of rows in a distributed hypertable?

How to solve:

Method 1

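TimescaleDB provides the approximate_row_count function, which is described in the TimescaleDB documentation. It returns an estimate based on table statistics rather than scanning the data, so its accuracy depends on when ANALYZE or VACUUM was last run on the table.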

For your table the query will be:

SELECT approximate_row_count('name_of_some_big_table');
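
Because the estimate is only as fresh as the table statistics, it can help to refresh them before reading the count. The following is a minimal sketch, assuming that running ANALYZE on this table is acceptable in your environment and that your TimescaleDB version propagates statistics for distributed hypertables; the table name is the placeholder from the question and the column alias is illustrative only.

```sql
-- Refresh planner statistics first.
-- Assumption: an ANALYZE on this table is affordable at this point in time.
ANALYZE name_of_some_big_table;

-- Read the estimate from the refreshed statistics.
-- The explicit cast to regclass is optional; the alias is illustrative.
SELECT approximate_row_count('name_of_some_big_table'::regclass)
       AS estimated_number_of_rows;
```

If statistics are already kept current by autovacuum or a scheduled job, the ANALYZE step can be skipped and the SELECT used on its own.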


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
