Combining multiple SELECT/WHERE into a result with multiple columns with alias names

I have a TimescaleDB / PostgreSQL table (DataTable) which looks like this:

[Image not preserved: sample rows of DataTable with the columns time, subsystem, metric, value]

I’m trying to find a query which returns a separate column for each subsystem/metric combination, like this:

[Image not preserved: desired result with the columns time, Latitude, Longitude, Temperature]

Furthermore, I would like to reduce the number of requested data points using the TimescaleDB function time_bucket_gapfill() to keep the response time low for large time ranges. I would also like to use different aggregation functions for different subsystem/metric combinations.
I use the queried data to plot curves in Grafana.
What is the most time-efficient way to achieve that?

At the moment I use a separate query for each metric, which quickly gets very slow. For the example above, the queries could look like this:

SELECT
    time_bucket_gapfill('30s',time,start=>'2021-07-19T09:06:26.605Z',finish=>'2021-07-19T20:11:12.340Z') AS "time",
    max(value) AS "Latitude"
FROM DataTable
WHERE
    time BETWEEN '2021-07-19T09:06:26.605Z' AND '2021-07-19T20:11:12.340Z' AND
    subsystem = 'position' AND
    metric = 'lat'
GROUP BY 1,metric,subsystem
ORDER BY time;

SELECT
    time_bucket_gapfill('30s',time,start=>'2021-07-19T09:06:26.605Z',finish=>'2021-07-19T20:11:12.340Z') AS "time",
    min(value) AS "Longitude"
FROM DataTable
WHERE
    time BETWEEN '2021-07-19T09:06:26.605Z' AND '2021-07-19T20:11:12.340Z' AND
    subsystem = 'position' AND
    metric = 'lon'
GROUP BY 1,metric,subsystem
ORDER BY time;

SELECT
    time_bucket_gapfill('30s',time,start=>'2021-07-19T09:06:26.605Z',finish=>'2021-07-19T20:11:12.340Z') AS "time",
    avg(value) AS "Temperature"
FROM DataTable
WHERE
    time BETWEEN '2021-07-19T09:06:26.605Z' AND '2021-07-19T20:11:12.340Z' AND
    subsystem = 'health' AND
    metric = 'temperature'
GROUP BY 1,metric,subsystem
ORDER BY time;

I would much appreciate it if somebody could point me in the right direction.

Update:

Using the FILTER clause as suggested by @Charlieface doesn’t work for me, because it doesn’t remove the other rows but instead places NULLs there. Here is an example with some real data (the example before used simplified data and names).

[Image not preserved: query result with real data, showing NULLs in the metric columns]

I also noticed that the timestamps of metrics that belong together differ slightly, which would prevent different metrics from ending up on the same row, but that is easy to change in the unit which feeds the data into the DB.

Update 2:

The timestamp issue has been fixed and the WHERE clause has been added back as suggested by @Charlieface, but there are still NULLs in the data, which makes Grafana plot the data points without lines in between.

[Image not preserved: current query result, still containing NULLs]

I would like the result to look like this instead.

[Image not preserved: desired result]

Update 3:

I missed that @Charlieface's answer groups by time only. If I do that, I get the result I wanted.

How to solve :

Method 1

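You can use conditional aggregation for this.

In PostgreSQL you can use the FILTER clause on each aggregate. Group by the time bucket only, so that all metrics of a bucket end up on the same row.

In other DBMSs you can put the condition inside the aggregate instead:
avg(case when ... then value end)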

SELECT
    time_bucket_gapfill('30s', time, start => '2021-07-19T09:06:26.605Z', finish => '2021-07-19T20:11:12.340Z') AS "time",
    max(value) FILTER (WHERE subsystem = 'position' AND metric = 'lat') AS "Latitude",
    min(value) FILTER (WHERE subsystem = 'position' AND metric = 'lon') AS "Longitude",
    avg(value) FILTER (WHERE subsystem = 'health' AND metric = 'temperature') AS "Temperature"
FROM DataTable
WHERE
    time BETWEEN '2021-07-19T09:06:26.605Z' AND '2021-07-19T20:11:12.340Z'
  AND (subsystem = 'position' AND metric IN ('lat', 'lon') OR
       subsystem = 'health' AND metric = 'temperature')
GROUP BY 1
ORDER BY time;
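
The CASE form mentioned above for other DBMSs can be sketched as follows. This is only a minimal sketch under stated assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module instead of TimescaleDB, the sample rows and the integer-second time column are made up, and (time / 30) * 30 is a stand-in for a 30-second bucket that does no gap filling. The same query is run twice, once grouped by the bucket only and once with the question's original GROUP BY, to show where the NULLs come from.

```python
import sqlite3

# Hypothetical sample rows in the question's (time, subsystem, metric, value) layout.
# Times are plain integer seconds so the sketch does not depend on TimescaleDB.
ROWS = [
    (0,  "position", "lat", 48.10),
    (0,  "position", "lon", 11.50),
    (0,  "health", "temperature", 20.0),
    (10, "position", "lat", 48.20),
    (10, "position", "lon", 11.40),
    (10, "health", "temperature", 22.0),
    (30, "position", "lat", 48.30),
    (30, "position", "lon", 11.60),
    (30, "health", "temperature", 25.0),
]

# Conditional aggregation with CASE: rows that do not match the condition
# yield NULL, and the aggregate functions ignore NULLs.
# (time / 30) * 30 is an assumed stand-in for a 30-second time bucket.
QUERY = """
SELECT
    (time / 30) * 30 AS bucket,
    max(CASE WHEN subsystem = 'position' AND metric = 'lat' THEN value END) AS "Latitude",
    min(CASE WHEN subsystem = 'position' AND metric = 'lon' THEN value END) AS "Longitude",
    avg(CASE WHEN subsystem = 'health' AND metric = 'temperature' THEN value END) AS "Temperature"
FROM DataTable
GROUP BY {group_by}
ORDER BY bucket
"""


def run(group_by):
    """Load the sample rows into a fresh in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DataTable (time INTEGER, subsystem TEXT, metric TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO DataTable VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY.format(group_by=group_by)).fetchall()
    conn.close()
    return rows


# Grouping by the bucket only: one row per bucket, every column filled.
wide = run("bucket")

# Grouping by bucket, metric and subsystem (as in the original queries):
# one row per combination, so two of the three metric columns are NULL.
sparse = run("bucket, metric, subsystem")

for row in wide:
    print(row)
for row in sparse:
    print(row)
```

With this sample data, the first result has one row per bucket, while the second has one row per bucket/metric/subsystem combination with a single non-NULL metric column each, which matches the behaviour described in the updates to the question.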

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.