Postgres "unnesting" multiple json arrays simultaneouesly with different depths

All we need is an easy explanation of the problem, so here it is.

I am trying to explode/unnest a row of jsonb arrays into individual rows and keep any rows that would result in nulls.

Fiddle: https://www.db-fiddle.com/f/pgd6E5yKf8NzPtgboeoC6c/7

Input table: (I am stuck with the input being in this form)

id name color price description
1 ["Banana"] [["Green"]] [[[1]]] [[["Good"]]]
2 ["Banana"] [["Yellow"]] [[[]]] [[[]]]
3 ["Apple","Carrot"] [["Red"],["Orange"]] [[[10]],[[14]]] [[["Round"]],[["Long"]]]
4 ["Grape","Pineapple"] [["Green"],["Brown","Pink"]] [[[2]],[[13],[29]]] [[["Small"]],[["Short"],["Tall"]]]
5 ["Orange","Potato"] [["Orange"],["Purple","Brown"]] [[[2]],[[9],[]]] [[["Bright"]],[["Gross"],[]]]

Desired Output:

id name color price description
1 Banana Green 1 Good
2 Banana Yellow null null
3 Apple Red 10 Round
3 Carrot Orange 14 Long
4 Grape Green 2 Small
4 Pineapple Brown 13 Short
4 Pineapple Pink 29 Tall
5 Orange Orange 2 Bright
5 Potato Purple 9 Gross
5 Potato Brown null null

I can get the output I want with this query, but I am trying to generate these queries using a templating engine and I’m trying to account for any number of elements being any depth of nested arrays. Trying to template this to handle that will be less than ideal.

-- Gives expected results
WITH first_level AS (
  SELECT
      id,
      jsonb_array_elements(name) AS name,
      jsonb_array_elements(color) AS color,
      jsonb_array_elements(price) AS price,
      jsonb_array_elements(description) AS description
  FROM food
),
second_level AS (
SELECT
      id,
      name,
      jsonb_array_elements(color) AS color,
      jsonb_array_elements(price) AS price,
      jsonb_array_elements(description) AS description
  FROM first_level
  )
SELECT
      id,
      name,
      color,
      CASE
            WHEN price = '[]'::jsonb THEN null
            ELSE jsonb_array_elements(price)
      END AS price,
      CASE
            WHEN description = '[]'::jsonb THEN null
            ELSE jsonb_array_elements(description)
      END AS description
FROM second_level

I’ve tried unnesting them all at once but with pg 9.6 that gives back a bad result and doesnt have the nulls, It works better on pg10+ but it still drops the rows with empty arrays.

Is there a cleaner way to achieve this result? I’ve run across examples of using lateral joins but none of them are exact matches and I cant figure them out to even get a result close to what I need.

Thanks

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

Unnesting multiple set-returning functions in parallel behaved in mysterious ways (borderline broken) in Postgres 9.6. That got fixed with Postgres 10. See:

The same fix also made set-returning functions in a CASE construct illegal. Your query only works on account of that superseded behaviour in Postgres 9.6 and breaks in later versions. Same as my simpler rCTE solution:

db<>fiddle here

You didn’t specify what’s allowed in your input format. Making some assumptions, this works in modern Postgres (and in Postgres 9.6, too):

Custom function (accepts non-arrays, too):

CREATE OR REPLACE FUNCTION f_jsonb_array_elements_if_any(jsonb)
  RETURNS SETOF jsonb
  LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE AS
$func$
BEGIN
   IF jsonb_typeof($1) = 'array' THEN 
      IF $1 = '[]' THEN
         RETURN NEXT jsonb 'null';
      ELSE
         RETURN QUERY SELECT jsonb_array_elements($1);
      END IF;
   ELSE
      RETURN NEXT $1;
   END IF;
END
$func$;

Main query (with multiple advanced SQL features):

WITH RECURSIVE cte AS (
   SELECT id, 1 AS lvl, ARRAY[ord] AS path, t.*
   FROM   food
   LEFT   JOIN LATERAL ROWS FROM (
      f_jsonb_array_elements_if_any(name)
    , f_jsonb_array_elements_if_any(color)
    , f_jsonb_array_elements_if_any(price)
    , f_jsonb_array_elements_if_any(description)
      ) WITH ORDINALITY t(n, c, p, d, ord) ON true

   UNION ALL
   SELECT id, lvl + 1, path || t.ord, t.*
   FROM   cte c
   LEFT   JOIN LATERAL ROWS FROM (
      f_jsonb_array_elements_if_any(n)
    , f_jsonb_array_elements_if_any(c)
    , f_jsonb_array_elements_if_any(p)
    , f_jsonb_array_elements_if_any(d)
      ) WITH ORDINALITY t(n, c, p, d, ord) ON true
   WHERE  'array' IN (jsonb_typeof(c.n), jsonb_typeof(c.c), jsonb_typeof(c.p), jsonb_typeof(c.d))
   )
SELECT id, (first_value(n) OVER (PARTITION BY id, grp ORDER BY path))->>0 AS name
     , c->>0 AS color, p->>0 AS price, d->>0 AS description
FROM  (
   SELECT count(n) OVER (PARTITION BY id ORDER BY path) AS grp, *
   FROM   cte c
   WHERE  lvl = (SELECT max(lvl) FROM cte)
   ) sub;

db<>fiddle here

Cycle with a recursive CTE, unnesting arrays until no array is left.

In the outer SELECT only use rows from the lat iteration (max(lvl))

Where a single element in name was combined with multiple in the other columns, we get genuine null values (as opposed to a jsonb 'null' from an empty array). Fill in the last actual value for the same group of rows.

Finally extract text from the jsonb wrapper.

This is one of the rare occasions where a ROWS FROM construct is instrumental. See:

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply