How to calculate % change for nearest date

All we need is an easy explanation of the problem, so here it is.

I have a Postgres 14 database with a table recording water levels in a number of locations. Readings are only taken on business days:

CREATE TABLE water_level (
   reading_id BIGINT GENERATED ALWAYS AS IDENTITY,
   location_id BIGINT,
   FOREIGN KEY(location_id) 
      REFERENCES locations(id),
   temperature NUMERIC,
   water_level NUMERIC,
   d_date DATE DEFAULT NOW()
);

An example of the data is:

| reading_id | location_id | temperature | water_level | d_date
| -----------| ------------| ------------| ------------|-------
| 1          | 1           | 17.9        | 145.2       | 2019-01-04
| 2          | 1           | 17.5        | 145.4       | 2019-01-05
| 3          | 1           | 17.4        | 145.5       | 2019-01-06
| 4          | 2           | 18.5        | 180.1       | 2019-01-04
| 5          | 2           | 18.7        | 180.2       | 2019-01-05
| 6          | 2           | 18.7        | 180.2       | 2019-01-06

I have a view that shows the changes over time:

CREATE VIEW current_status AS
SELECT location_id,
   MAX(e.water_level) filter (where e.d_date >= (CURRENT_DATE - INTERVAL '30 DAY')::DATE) day_high_30,
   MIN(e.water_level) filter (where e.d_date >= (CURRENT_DATE - INTERVAL '30 DAY')::DATE) day_low_30,
   MAX(e.water_level) all_time_high,
   MIN(e.water_level) all_time_low
   FROM water_level e
GROUP by location_id;

My Problem – I want the % change from 30 days ago but there may not be a record exactly 30 days ago (because it was a weekend or public holiday), it may be 29 or 28 days ago. So I want to find the water level <= 30 days ago, then calculate % compared to most recent reading
Expected Result I am trying to achieve:

CREATE VIEW current_status AS
SELECT location_id,
   MAX(e.water_level) filter (where e.d_date >= (CURRENT_DATE - INTERVAL '30 DAY')::DATE) day_high_30,
   MIN(e.water_level) filter (where e.d_date >= (CURRENT_DATE - INTERVAL '30 DAY')::DATE) day_low_30,
   MAX(e.water_level) all_time_high,
   MIN(e.water_level) all_time_low,
   (current_value - (water level <= 30 days ago) / (water level <= 30 days ago) * 100 percent_change_30_days
   FROM water_level e
GROUP by location_id;

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

You need windows functions, see : https://www.postgresql.org/docs/14/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS

In the exemple below, the value used for water level change calculation is the first value among the 30 days preceding the current value

SELECT DISTINCT ON (location_id)
location_id,
MAX(e.water_level) filter (where e.d_date >= (CURRENT_DATE - INTERVAL '30 DAY')::DATE)  OVER (PARTITION BY e.location_id) day_high_30,
MIN(e.water_level) filter (where e.d_date >= (CURRENT_DATE - INTERVAL '30 DAY')::DATE)  OVER (PARTITION BY e.location_id) day_low_30,
MAX(e.water_level) OVER (PARTITION BY e.location_id) all_time_high ,
MIN(e.water_level) OVER (PARTITION BY e.location_id) all_time_low,
    (e.water_level /
    FIRST_VALUE(e.water_level) OVER(PARTITION BY e.location_id ORDER BY d_date ASC RANGE '30 day' PRECEDING)) *100
    AS percent_change_30_days
FROM a.water_level e
ORDER BY location_id, d_date DESC 

You can avoid considering too close water level (<28 days) by adding an end_frame. Ex : RANGE BETWEEN '30 day' PRECEDING AND '28 day' PRECEDING. So, if no value between 30 and 28 days, the change will be NULL

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply