I’m using Postgres 13.3 with inner and outer queries that both only produce a single row (just some stats about row counts).
I can’t figure out why Query2 below is so much slower than Query1. They should basically be almost exactly the same, maybe a few ms difference at most …
Query1: takes 49 seconds

```sql
WITH t1 AS (
  SELECT
    (SELECT count(*) FROM racing.all_computable_xformula_bday_combos) AS all_count,
    (SELECT count(*) FROM racing.xday_todo_all) AS todo_count,
    (SELECT count(*) FROM racing.xday) AS xday_row_count
  OFFSET 0  -- this is to prevent inlining
)
SELECT
  t1.all_count,
  t1.all_count - t1.todo_count AS done_count,
  t1.todo_count,
  t1.xday_row_count
FROM t1;
```
Query2: takes 4 minutes and 30 seconds, and I only added one line:

```sql
WITH t1 AS (
  SELECT
    (SELECT count(*) FROM racing.all_computable_xformula_bday_combos) AS all_count,
    (SELECT count(*) FROM racing.xday_todo_all) AS todo_count,
    (SELECT count(*) FROM racing.xday) AS xday_row_count
  OFFSET 0  -- this is to prevent inlining
)
SELECT
  t1.all_count,
  t1.all_count - t1.todo_count AS done_count,
  t1.todo_count,
  t1.xday_row_count,
  -- the line below is the only difference to Query1:
  util.divide_ints_and_get_percentage_string(todo_count, all_count) AS todo_percentage
FROM t1;
```
Before this point, and with some extra columns in the outer query (which should have made almost zero difference), the whole query was insanely slow (around 25 minutes), which I think was due to inlining. Hence the OFFSET 0 added to both queries, which does help a lot. I've also been swapping between the above CTEs and subqueries, but with the OFFSET 0 included it doesn't seem to make any difference.
Definitions of the functions being called in Query2:

```sql
CREATE OR REPLACE FUNCTION util.ratio_to_percentage_string(FLOAT, INTEGER)
  RETURNS TEXT AS
$$
BEGIN
  RETURN ROUND($1::NUMERIC * 100, $2)::TEXT || '%';
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE OR REPLACE FUNCTION util.divide_ints_and_get_percentage_string(BIGINT, BIGINT)
  RETURNS TEXT AS
$$
BEGIN
  RETURN CASE
           WHEN $2 > 0 THEN util.ratio_to_percentage_string($1::FLOAT / $2::FLOAT, 2)
           ELSE 'divide_by_zero'
         END;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
```
As you can see it’s a very simple function, which is only being called once, from the single row the whole thing produces. How can this cause such a massive slowdown? And why is it affecting whether Postgres inlines the initial subquery / CTE? (Or whatever else might be going on here?)
Also, it doesn’t matter what the function does at all: simply replacing it with a function that does nothing but return 'hello' causes the exact same slowdown of the initial inner query. So it’s not about anything the function "does", but more like some kind of "Schrödinger’s cat" effect, where stuff in the outer query is affecting how the inner query is initially performed. Why does a tiny change in the outer query (which should have basically zero effect on performance) affect the initial inner query?
EXPLAIN ANALYZE outputs:
How to solve:
Function inlining is important, and applies here, too. Your PL/pgSQL function cannot be inlined. (Besides being overkill to even call another function for the trivial expression.) But since it’s still very cheap and only called once, it’s not the issue here.
Whether you use the OFFSET 0 hack or WITH t1 AS MATERIALIZED (...), either prevents repeated evaluation. (If you are going to use the OFFSET 0 hack, you might as well use a slightly cheaper subquery, but the clean way in modern Postgres is a MATERIALIZED CTE.) That’s also not the issue. (Or not any more, after you successfully prevented repeated evaluation, to be precise.)
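A sketch of that cleaner spelling, reusing the CTE from the question (Postgres 12 or later; the MATERIALIZED keyword replaces the OFFSET 0 hack):

```sql
WITH t1 AS MATERIALIZED (  -- explicit materialization, no OFFSET 0 needed
  SELECT
    (SELECT count(*) FROM racing.all_computable_xformula_bday_combos) AS all_count,
    (SELECT count(*) FROM racing.xday_todo_all) AS todo_count,
    (SELECT count(*) FROM racing.xday) AS xday_row_count
)
SELECT t1.all_count,
       t1.all_count - t1.todo_count AS done_count,
       t1.todo_count,
       t1.xday_row_count
FROM t1;
```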
The all-important issue is parallelism. User functions are PARALLEL UNSAFE by default. The manual:

> PARALLEL UNSAFE indicates that the function can’t be executed in parallel mode and the presence of such a function in an SQL statement **forces a serial execution plan**. This is the default.
Bold emphasis mine.
Your 1st (fast) query plan shows 2x Parallel Seq Scan and 1x Parallel Index Only Scan. Your 2nd (slow) query plan has no parallel scans at all. Damage done.
Mark your functions PARALLEL SAFE (because they qualify!) and the issue goes away.
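Marking the existing functions needs no redefinition; standard ALTER FUNCTION syntax covers it, with the signatures taken from the definitions above:

```sql
ALTER FUNCTION util.ratio_to_percentage_string(float, integer) PARALLEL SAFE;
ALTER FUNCTION util.divide_ints_and_get_percentage_string(bigint, bigint) PARALLEL SAFE;
```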
I ran performance tests with a couple of variants.
This equivalent function is substantially faster and can be inlined:

```sql
CREATE OR REPLACE FUNCTION util.divide_ints_and_get_percentage_string(bigint, bigint)
  RETURNS text
  LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
$func$
SELECT CASE
         WHEN $2 = 0 THEN 'divide_by_zero'
         ELSE round($1 * 100 / $2::numeric, 2)::text || '%'  -- explicit cast!
       END
$func$;
```
Most importantly, it is LANGUAGE sql, which allows function inlining (unlike LANGUAGE plpgsql).
Notably, we need that explicit cast ::text. The concatenation operator || is resolved to one of several internal functions, depending on the involved data types, and not all of them are IMMUTABLE. Without the explicit cast, Postgres would pick a variant that is only STABLE, which would disagree with the function declaration and prevent function inlining. Sneaky details! Related:
- How to concatenate columns in a Postgres SELECT?
- Can declaring function volatility IMMUTABLE harm performance?
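You can see this for yourself in the system catalog. The query below inspects the volatility of the functions behind || (textcat implements text || text; anytextcat and textanycat handle the mixed-type variants): 'i' means IMMUTABLE, 's' means STABLE.

```sql
SELECT proname, provolatile
FROM   pg_proc
WHERE  proname IN ('textcat', 'anytextcat', 'textanycat');
```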
Fixed a logic issue while being at it: $2 = 0 checks for division by zero properly (unlike $2 > 0). Now, count(*) can never be negative, but since you put the logic into a function, it should not depend on that precondition.
Or just put the simple expression into the query directly. No function call. That’s not susceptible to any of the mentioned issues.
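A minimal sketch of that, assuming the t1 CTE from the queries above; the CASE mirrors the function body, just written inline:

```sql
SELECT t1.*,
       CASE
         WHEN t1.all_count = 0 THEN 'divide_by_zero'
         ELSE round(t1.todo_count * 100 / t1.all_count::numeric, 2)::text || '%'
       END AS todo_percentage
FROM t1;
```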
It looks like you’ve hit some sort of optimisation fence in PostgreSQL whereby your functions, instead of being evaluated once after the CTE, are being evaluated multiple times!
What I would do in your case is the following:

```sql
WITH t1 AS (  -- as before
  SELECT
    (SELECT count(*) FROM racing.all_computable_xformula_bday_combos) AS all_count,
    (SELECT count(*) FROM racing.xday_todo_all) AS todo_count,
    (SELECT count(*) FROM racing.xday) AS xday_row_count
  OFFSET 0  -- this is to prevent inlining
)
SELECT
  t1.all_count,
  t1.all_count - t1.todo_count AS done_count,
  t1.todo_count,
  t1.xday_row_count,
  -- The CASE expression below is the only difference to your
  -- original query - it does the same thing, but removes the
  -- function calls.
  -- All your function calls appear to do is produce a TEXT
  -- representation of a percentage. The CASE expression
  -- calculates the percentage "directly" and returns the
  -- desired string.
  CASE
    WHEN t1.all_count = 0 THEN 'divide by zero'
      -- (or maybe an actual 0 might be suitable?
      --  It avoids the cast to ::TEXT below)
    ELSE ((t1.todo_count::REAL / t1.all_count::REAL) * 100)::TEXT
  END AS todo_percentage
FROM t1;
```
The use of the casts to ::REAL means that you will "only" get a percentage accurate to 6 decimal places (see the PostgreSQL documentation), but I have rarely come across situations where more than this was required. FLOAT without precision is, in fact, DOUBLE PRECISION (15 places).
From the documentation:

> PostgreSQL also supports the SQL-standard notations float and float(p) for specifying inexact numeric types. Here, p specifies the minimum acceptable precision in binary digits. PostgreSQL accepts float(1) to float(24) as selecting the real type, while float(25) to float(53) select double precision. Values of p outside the allowed range draw an error. float with no precision specified is taken to mean double precision.
There are other ways and means of doing what you require…
Take a look here for a couple of suggestions from the PostgreSQL site if you don’t require an exact count and an exact percentage. And then there is (yet another) magisterial answer by @Erwin Brandstetter here – he gives you a few ways to accomplish your goal and explains the pros and cons of each…
Some closing points:
Your functions: you appear to go to a lot of trouble to perform what is (or at least should be) final formatting steps long before they’re necessary. Many would argue that what you are doing in your functions should be done in the client/presentation layer. I would refrain from performing this sort of manipulation at least until the very last SQL step! Databases are for storing data, not presenting it!
Another solution (if you absolutely insist on using your functions) could be to wrap your query in another SELECT and have the functions operate on the results of that query; that should remove the optimisation fence. A bit convoluted perhaps, but then so are your functions!
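A sketch of that wrapper, reusing Query1 from the question verbatim as the inner query (note, per the first answer above, that the statement can still be forced serial while the function remains PARALLEL UNSAFE):

```sql
SELECT q.*,
       util.divide_ints_and_get_percentage_string(q.todo_count, q.all_count) AS todo_percentage
FROM (
  -- Query1 (the fast variant, without any function call) goes here:
  WITH t1 AS (
    SELECT
      (SELECT count(*) FROM racing.all_computable_xformula_bday_combos) AS all_count,
      (SELECT count(*) FROM racing.xday_todo_all) AS todo_count,
      (SELECT count(*) FROM racing.xday) AS xday_row_count
    OFFSET 0
  )
  SELECT t1.all_count,
         t1.all_count - t1.todo_count AS done_count,
         t1.todo_count,
         t1.xday_row_count
  FROM t1
) q;
```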
Following your edit, what you call "some kind of Schrödinger's cat effect" is, in fact, an "optimisation fence", and has been an issue with CTEs for years. It was meant to be fixed by the WITH cte_name AS [NOT] MATERIALIZED (...) directive. From that answer, your query is not side effect free!
Now, you will say "but, all it does is calculate a percentage…", but the optimiser can’t know that in advance and "takes no chances" and appears to be evaluating your function multiple times instead of once.
Finally, I did point out that you obviously haven't given us all of the necessary information: there are table names in the PLAN which don't appear in your question, which implies to me that you are querying VIEWs, which may very well be a confounding factor.
I suggest that you provide a test case on dbfiddle.uk (with underlying base tables), the views you build on those and all of your queries and functions – otherwise, no further help can be forthcoming.
Your analogy with "Schrödinger's cat" is perhaps particularly apt: we don't have all the information. Do the postulated VIEWs exist or not? Are they VIEWs? Do they VIEW anything? If a VIEW is DROPped in the middle of a forest and nobody hears it, has it really been DROPped? Without full disclosure, we can be of little assistance. As far as I'm concerned, I've answered the question as asked and (according to yourself) have provided solutions which work. Granted, it may not be entirely satisfactory, but with what we have, it's the best you're going to get!