# Is a correlated function in the FROM clause executed for every row?


I have a heavy function, let’s call it `fcalc(x,y) -> my_z`. I need the result `my_z` both as a filter (if it is too low, the row is discarded) and in the result set (so my client can see it). I write the query like so:

```
SELECT *, my_z
FROM big_table t, (SELECT * FROM fcalc(t.x, t.y)) AS my_z
WHERE condition1 AND condition2 AND ... AND my_z > $threshold
```

My question is: will all the other conditions be applied first, filtering out a very large number of rows, before `fcalc` is executed? I’m very new to databases.


## Answer

No, Postgres typically does not evaluate the function in the `LATERAL` subquery for all rows.
It applies simple filters on `big_table` first and executes the function only for rows that survive those filters.

#### Fixed query

Assuming `fcalc()` returns a single value, this would work:

```
SELECT *, my_z
FROM   big_table t, LATERAL (SELECT * FROM fcalc(t.x, t.y)) AS f(my_z)
WHERE  $condition1 AND $condition2 AND ... AND f.my_z > $threshold
```

This should be untangled to just:

```
SELECT *
FROM   big_table t
JOIN   LATERAL fcalc(t.x, t.y) AS f(my_z) ON f.my_z > $threshold
WHERE  $condition1
AND    $condition2
AND ...
```

Moving the `f.my_z > $threshold` filter from the `WHERE` clause to the join condition makes the query easier to read and has no effect on the query plan whatsoever (as long as it is an `[INNER] JOIN`). This produces the exact same query plan:

```
SELECT *
FROM   big_table t, fcalc(t.x, t.y) f(my_z)
WHERE  $condition1
AND    f.my_z > $threshold
AND    $condition2
AND ...
```

#### Query plan

Either of the fixed queries will first apply predicates filtering rows in `big_table`, before executing `fcalc()` and filtering on the result.

You can check with `EXPLAIN ANALYZE`. Say your `big_table` has 8 rows, 5 of which don’t pass your `$conditionN` filters, and 1 of the remaining 3 doesn’t pass `f.my_z > $threshold`. You’ll see something like:

```
Nested Loop  (cost=0.00..1.17 rows=3 width=79) (actual time=0.026..0.027 rows=1 loops=1)
  ->  Seq Scan on big_table t  (cost=0.00..1.10 rows=3 width=75) (actual time=0.007..0.009 rows=3 loops=1)
        Filter: (id > 5)
        Rows Removed by Filter: 5
  ->  Function Scan on fcalc f  (cost=0.00..0.02 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=3)
        Filter: (my_z > 9)
        Rows Removed by Filter: 1
Planning Time: 0.101 ms
Execution Time: 0.043 ms
```
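
For reference, here is a self-contained setup that can produce a plan of this general shape. All names and values are illustrative (an 8-row table, `t.id > 5` standing in for your `$conditionN` filters, and a trivial PL/pgSQL body standing in for your expensive computation) — your real schema and function will differ:

```
CREATE TABLE big_table (id int, x int, y int, payload text);
INSERT INTO big_table
SELECT g, g, g, 'row ' || g
FROM   generate_series(1, 8) g;

-- Stand-in for the expensive function; PL/pgSQL is never inlined,
-- so it also shows up in function statistics (see below)
CREATE FUNCTION fcalc(x int, y int)
  RETURNS int
  LANGUAGE plpgsql STABLE AS
$$
BEGIN
   RETURN x + y;
END
$$;

EXPLAIN ANALYZE
SELECT *
FROM   big_table t, fcalc(t.x, t.y) f(my_z)
WHERE  t.id > 5        -- stands in for $conditionN
AND    f.my_z > 9;     -- the threshold filter
```

The `loops` count on the `Function Scan` node tells you how many times `fcalc()` was actually executed.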

Meaning, `fcalc()` was only executed 3 times in the example. In reality you should see index scans on the big table, but the principle is the same.

You can further verify this by setting the GUC `track_functions` to `pl` before executing the query, with or without `EXPLAIN ANALYZE`. The manual:

> Enables tracking of function call counts and time used. Specify `pl`
> to track only procedural-language functions, `all` to also track SQL
> and C language functions. The default is `none`, which disables
> function statistics tracking. Only superusers can change this setting.
>
> **Note:** SQL-language functions that are simple enough to be “inlined” into the
> calling query will not be tracked, regardless of this setting.
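
In a session, that could look like this (a sketch; `pg_stat_reset()` clears all statistics counters for the current database and requires appropriate privileges, so only use it where that is acceptable):

```
SET track_functions = 'pl';   -- session-level; needs sufficient privileges
SELECT pg_stat_reset();       -- optional: start counting from zero
-- ... now run your query ...
```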

Then check how often your function has actually been called, before and after executing your query:

```
SELECT calls
FROM   pg_catalog.pg_stat_user_functions
WHERE  funcid = 'fcalc'::regproc
```

Consult the manual for details about the cast `'fcalc'::regproc`.
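
In short, the cast resolves the function name to its object identifier (OID). For example (assuming `fcalc` exists; the `(int, int)` signature is an assumption for illustration — use your actual argument types):

```
SELECT 'fcalc'::regproc::oid;             -- OID by name; fails if the name is ambiguous
SELECT 'fcalc(int, int)'::regprocedure;   -- disambiguates overloaded functions by argument types
```

If you have several overloaded variants of `fcalc`, prefer `regprocedure`, since the plain `regproc` cast raises an error for ambiguous names.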

#### Aside

Postgres also prioritizes filters on the same level by their estimated cost. You can verify this with the tools laid out above. Tinker with the `COST` setting of simple PL/pgSQL functions …
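
A sketch of such an experiment (the `fcalc(int, int)` signature and the filter values are assumptions; adjust them to your actual function and query):

```
-- Tell the planner the function is expensive, so its filter is
-- pushed behind cheaper predicates on the same level
ALTER FUNCTION fcalc(int, int) COST 10000;

-- Compare the plans before and after changing COST:
EXPLAIN ANALYZE
SELECT *
FROM   big_table t, fcalc(t.x, t.y) f(my_z)
WHERE  t.id > 5
AND    f.my_z > 9;
```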
