# Optimization of recursive view

I have an optimization problem when using a recursive view on PostgreSQL. When I perform a trivial query using this view, the execution time is abnormally long.

To illustrate the problem, here is a database with the view and the query that causes it: http://sqlfiddle.com/#!17/9d39e/13

The main table is `v_univ_st` and the view is called `v_univ_bf`.

## EXPLAIN

I ran `EXPLAIN ANALYZE` in PostgreSQL; here is the result (the table I’m working on is much bigger than the one in the fiddle):

``````
"Hash Right Join  (cost=4724036209.97..4757553915.67 rows=1510012 width=23746) (actual time=5172.917..24833.100 rows=1869 loops=1)"
"  Hash Cond: ((recipes_flat."PRODUCT_ID")::text = (u_sample_tasks."PRODUCT_ID")::text)"
"  ->  CTE Scan on recipes_flat  (cost=4723991547.19..4728513371.25 rows=164429966 width=23705) (actual time=0.197..19761.488 rows=367312 loops=1)"
"        CTE recipes_flat"
"          ->  Recursive Union  (cost=0.00..4723991547.19 rows=164429966 width=15645) (actual time=0.181..17845.438 rows=367312 loops=1)"
"                ->  Seq Scan on v_univ_st  (cost=0.00..8024046.44 rows=279636 width=1824) (actual time=0.171..3060.524 rows=279684 loops=1)"
"                      SubPlan 1"
"                        ->  Aggregate  (cost=28.54..28.55 rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=279684)"
"                              ->  Index Only Scan using idx_recipe_blends_ingredient_id on v_univ_st sr  (cost=0.42..28.52 rows=6 width=0) (actual time=0.004..0.006 rows=0 loops=279684)"
"                                    Index Cond: ("INGREDIENT_ID" = (v_univ_st."PRODUCT_ID_COMP")::text)"
"                                    Heap Fetches: 68468"
"                ->  Nested Loop  (cost=0.42..471267890.14 rows=16415033 width=15645) (actual time=0.262..799.437 rows=9736 loops=9)"
"                      ->  WorkTable Scan on recipes_flat s  (cost=0.00..55927.20 rows=2796360 width=14718) (actual time=0.010..22.153 rows=40812 loops=9)"
"                      ->  Index Scan using idx_recipe_blends_ingredient_id on v_univ_st e  (cost=0.42..0.58 rows=6 width=999) (actual time=0.005..0.008 rows=0 loops=367312)"
"                            Index Cond: (("INGREDIENT_ID")::text = (s."PRODUCT_ID_COMP")::text)"
"                      SubPlan 2"
"                        ->  Aggregate  (cost=28.54..28.55 rows=1 width=0) (actual time=0.038..0.038 rows=1 loops=87628)"
"                              ->  Index Only Scan using idx_recipe_blends_ingredient_id on v_univ_st sr_1  (cost=0.42..28.52 rows=6 width=0) (actual time=0.007..0.037 rows=0 loops=87628)"
"                                    Index Cond: ("INGREDIENT_ID" = (e."PRODUCT_ID_COMP")::text)"
"                                    Heap Fetches: 19160"
"  ->  Hash  (cost=44661.15..44661.15 rows=131 width=9) (actual time=4983.487..4983.487 rows=129 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 14kB"
"        ->  Seq Scan on u_sample_tasks  (cost=0.00..44661.15 rows=131 width=9) (actual time=954.011..4983.337 rows=129 loops=1)"
"              Filter: (("EXAMPLE")::text = 'EXAMPLE'::text)"
"              Rows Removed by Filter: 365517"
"Planning time: 193.957 ms"
"Execution time: 24903.973 ms"
``````

I also viewed the plan on https://explain.depesz.com/.

Indexes are used. I already tried:

``````
SET enable_seqscan = OFF;
SET enable_nestloop = OFF;
``````

but this does not help; the execution time gets even worse.

Here is the table on dbfiddle.uk. I use PostgreSQL 9.6.

On this small sample database the query is not slow. On my real PostgreSQL database the table is 261 MB and has about 279,000 rows.

`v_univ_st` is the table itself. It has no primary key, because I work on tables that are data extractions rather than strictly "relational" tables. The view I create is `v_univ_bf`, and in it I want columns that give the "depth" level of each ingredient. In the simplified example I walk the table recursively to get this information.

## How to solve

### Method 1

By using a materialized view and `EXISTS`, I lowered the execution time of the query by about 96%.
I changed:

``````
CREATE OR REPLACE VIEW public.v_univ_bf
``````

to:

``````
CREATE MATERIALIZED VIEW public.v_univ_bf
``````

The materialized view is not refreshed by each query that uses it; it must be refreshed explicitly whenever the underlying table has changed.

For my case, I refresh the materialized view every day with a query in my ETL, but here is a topic that presents all the methods to refresh a materialized view.
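As a rough sketch of that daily refresh step: the statement below recomputes the stored result. The index that follows is purely an assumption for illustration (its name is hypothetical, and it presumes the materialized view exposes an `id_product` column as in the view definition from Method 2); it is not part of the original setup.

```sql
-- Recompute the stored result after v_univ_st has changed.
-- A plain refresh blocks reads of the materialized view while it runs.
REFRESH MATERIALIZED VIEW public.v_univ_bf;

-- Optional, hypothetical index: a materialized view can be indexed like a table,
-- which may help queries that join or filter on id_product.
CREATE INDEX IF NOT EXISTS idx_v_univ_bf_id_product
    ON public.v_univ_bf (id_product);
```

`REFRESH MATERIALIZED VIEW CONCURRENTLY` avoids blocking readers, but it requires a unique index on the materialized view, which this data may not allow since the source table has no key.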

### Method 2

I see that the OP already solved the problem with a materialized view.

Still, I see a possible optimization: replace `(SELECT COUNT(1) ...) = 0` with `NOT EXISTS`, which can stop at the first matching row instead of counting all of them:

``````
CREATE OR REPLACE VIEW public.v_univ_bf AS
WITH RECURSIVE recipes_flat AS (
    -- Anchor member: every row of v_univ_st, at level 1.
    SELECT v_univ_st.PRODUCT_ID,
           v_univ_st.NAME_PRODUCT,
           v_univ_st.INGREDIENT_ID,
           v_univ_st.PRODUCT_ID_COMP,
           /* Original form, replaced by NOT EXISTS below:
           (( SELECT count(1) AS count
              FROM v_univ_st sr
              WHERE sr.INGREDIENT_ID::text = v_univ_st.PRODUCT_ID_COMP::text)) = 0 AS last_comp_blend,
           */
           NOT EXISTS (
               SELECT 1
               FROM v_univ_st sr
               WHERE sr.INGREDIENT_ID::text = v_univ_st.PRODUCT_ID_COMP::text
           ) AS last_comp_blend,
           1 AS level_blend
    FROM v_univ_st
    UNION ALL
    -- Recursive member: join rows whose ingredient is the previous level's component.
    SELECT e.PRODUCT_ID,
           s.NAME_PRODUCT,
           s.INGREDIENT_ID,
           e.PRODUCT_ID_COMP,
           NOT EXISTS (
               SELECT 1
               FROM v_univ_st sr
               WHERE sr.INGREDIENT_ID::text = e.PRODUCT_ID_COMP::text
           ) AS last_comp_blend,
           /* Original form, replaced by NOT EXISTS above:
           (( SELECT count(1) AS count
              FROM v_univ_st sr
              WHERE sr.INGREDIENT_ID::text = e.PRODUCT_ID_COMP::text)) = 0 AS last_comp_blend,
           */
           s.level_blend + 1
    FROM v_univ_st e
    JOIN recipes_flat s ON s.PRODUCT_ID_COMP::text = e.INGREDIENT_ID::text
)
SELECT recipes_flat.PRODUCT_ID AS id_product,
       recipes_flat.NAME_PRODUCT AS name_product,
       recipes_flat.INGREDIENT_ID AS ingredient_id,
       recipes_flat.PRODUCT_ID_COMP AS product_id_comp,
       recipes_flat.level_blend,
       recipes_flat.last_comp_blend,
       CASE
           WHEN recipes_flat.last_comp_blend AND recipes_flat.level_blend = 1 THEN 'CF'::text
           WHEN recipes_flat.last_comp_blend AND recipes_flat.level_blend > 0 THEN 'F'::text
           WHEN NOT recipes_flat.last_comp_blend AND recipes_flat.level_blend = 1 THEN 'C'::text
           ELSE ''::text
       END AS compact_flat_view
FROM recipes_flat;

-- EXPLAIN ANALYZE
select * from v_univ_bf;
``````

http://sqlfiddle.com/#!17/9d39e/45

I’m curious how much this speeds up the original view, or `REFRESH MATERIALIZED VIEW`, on the real data.
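One way to check this on a small scale is to run both forms under `EXPLAIN (ANALYZE, BUFFERS)`. The sketch below uses a hypothetical temporary table `st_demo` with made-up rows, not the real `v_univ_st`, so the timings it prints say nothing about the production data; it only shows how the comparison could be set up.

```sql
-- Hypothetical miniature of v_univ_st (table name and rows are invented for illustration).
CREATE TEMP TABLE st_demo (
    product_id      text,
    ingredient_id   text,
    product_id_comp text
);

INSERT INTO st_demo VALUES
    ('P1', 'I1', 'I2'),   -- component I2 also appears as an ingredient: not a leaf
    ('P1', 'I2', 'I3');   -- component I3 never appears as an ingredient: a leaf

CREATE INDEX ON st_demo (ingredient_id);

-- COUNT form: the subquery counts every matching row before comparing with 0.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.*,
       (SELECT count(1)
        FROM st_demo sr
        WHERE sr.ingredient_id = t.product_id_comp) = 0 AS last_comp_blend
FROM st_demo t;

-- NOT EXISTS form: the subquery can stop at the first matching row.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.*,
       NOT EXISTS (SELECT 1
                   FROM st_demo sr
                   WHERE sr.ingredient_id = t.product_id_comp) AS last_comp_blend
FROM st_demo t;
```

On the real table, the same pair of statements could be pointed at `v_univ_st` to compare the per-row subplan times with those in the plan from the question.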
