All we need is an easy explanation of the problem, so here it is.
explain select count(x.*) from customer x; ... -> Partial Aggregate (cost=27005.45..27005.46 rows=1 width=8) -> Parallel Seq Scan on customer x (cost=0.00..26412.56 rows=237156 width=994)
explain select count(*) from customer x; ... -> Partial Aggregate (cost=27005.45..27005.46 rows=1 width=8) -> Parallel Seq Scan on customer x (cost=0.00..26412.56 rows=237156 width=0)
COUNT(x.*) here makes the
width in the explain result read unnecessary row data.
I thought they should be identical, but it seems not, why?
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
Logically, both are identical – because
x.* always counts, even when all columns are
But Postgres has a separate implementation for
It does not bother with any expression at all and only considers the existence of live rows. That’s slightly faster, which sums up to a relevant difference over many rows.
The performance penalty for
count(x.*) grows with the number of columns / width of rows, and will be rather substantial for wide rows like yours (
It’s even documented explicitly:
Computes the number of input rows.
Computes the number of input rows in which the input value is not
The gist of it: whenever you don’t care whether an expression is
Some other RDBMS do not have the same fast path for
count(*). OTOH, counting all rows in a table is comparatively slow in Postgres due to its MVCC model that forces checking row visibility. See:
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂