Join variable number of rows in Postgres into json array

All we need is an easy explanation of the problem, so here it is.

I have the following table:

timestamp | type | value | source
----------+------+-------+--------
2020-01-01| 1    | 10    | 2
2020-01-01| 2    | 20    | 2
2020-01-02| 1    | 5     | 4
2020-01-02| 2    | 6     | 3
2020-01-02| 5    | 4     | 3
2020-01-02| 4    | 8     | 1

Each timestamp + type pair is unique. For each timestamp, there can be a variable number of rows (but each with a distinct type). value and source are the actual data.

Now I’d like to aggregate by timestamp into a json array where, for each timestamp, the available (type, value, source) tuples are included. Ideally, something like:

[
  {
    "2020-01-01": [{"type": 1, "value": 10, "source": 2}, {"type": 2, "value": 20, "source": 2}]
  },
  {
    "2020-01-02": [{"type": 1, "value": 5, "source": 4}, {"type": 2, "value": 6, "source": 3}, {"type": 5, "value": 4, "source": 3}, {"type": 4, "value": 8, "source": 1}]
  }
]

I don’t really have strong preferences on the output format, as long as the information is there and grouped correctly, so for example this would be fine too:

[
  {
    "timestamp": "2020-01-01", "data": [{"type": 1, "value": 10, "source": 2}, {"type": 2, "value": 20, "source": 2}]
  },
  {
    "timestamp": "2020-01-02", "data": [{"type": 1, "value": 5, "source": 4}, {"type": 2, "value": 6, "source": 3}, {"type": 5, "value": 4, "source": 3}, {"type": 4, "value": 8, "source": 1}]
  }
]

Or this (which I guess would require casting type to string, which is not a problem):

[
  {
    "timestamp": "2020-01-01", "1": {"value": 10, "source": 2}, "2": {"value": 20, "source": 2}
  },
  {
    "timestamp": "2020-01-02", "1": {"value": 5, "source": 4}, "2": {"value": 6, "source": 3}, "5": {"value": 4, "source": 3}, "4": {"value": 8, "source": 1}
  }
]

Even this, as long as the field ordering is known in advance:

[
  {
    "2020-01-01": [[1, 10, 2], [2, 20, 2]]
  },
  {
    "2020-01-02": [[1, 5, 4], [2, 6, 3], [5, 4, 3], [4, 8, 1]]
  }
]

In short, whatever is easiest/most efficient.

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

WITH cte AS (
    SELECT "timestamp", json_agg(json_build_object('type', "type", 
                                                   'value', "value",
                                                   'source', "source")) intermediate_json
    FROM test
    GROUP BY 1
)
SELECT json_agg(json_build_object("timestamp", intermediate_json)) final_json
FROM cte

https://dbfiddle.uk/?rdbms=postgres_12&fiddle=2fc37233058437189b8e0a8eea05a01b

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply