Cumulative sum in a period of months

I have this table:

month_rep   fruits  harvested
2021-09-01  139     139
2021-10-01  143     11
2021-11-01  152     14
2021-12-01  112     9
2022-01-01  133     10
2022-02-01  145     12
2022-03-01  123     5
2022-04-01  111     4
2022-05-01  164     9
2022-06-01  135     12
2022-07-01  124     14
2022-08-01  144     18
2022-09-01  111     111
2022-10-01  108     13
2022-11-01  123     7
2022-12-01  132     20

I want to create a new column called sold, based on a running sum of harvested within a period of months (Sep-Jun). Every September, sold is always 1 (or 100 in percent). The calculation for Oct 2021 is fruits / (harvested + harvested_Sep) = 143 / (11 + 139).

The remaining months of the period follow the same pattern: fruits / (harvested + harvested_until_Sep). The running sum covers every month from the September that opens the period through the current month.

Another example for 2022 is the calculation for Mar 2022 = fruits / (harvested + harvested_Feb_2022 + harvested_Jan_2022 + harvested_Dec_2021 + harvested_Nov_2021 + harvested_Oct_2021 + harvested_Sep_2021) = 123 / (5+12+10+9+14+11+139).
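
The arithmetic for Mar 2022 can be checked with a throwaway query. This is just the numbers from the example above typed in by hand; the column aliases are made up for illustration.

```sql
-- Mar 2022: fruits divided by harvested summed from Sep 2021 through Mar 2022.
SELECT 5 + 12 + 10 + 9 + 14 + 11 + 139                  AS harvested_running_sum  -- 200
     , 123::numeric / (5 + 12 + 10 + 9 + 14 + 11 + 139) AS sold_unrounded;        -- 0.615 (123 / 200)
```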

The table should look like this:

month_rep   fruits  harvested  sold
2021-09-01  139     139        1
2021-10-01  143     11         0.95
2021-11-01  152     14         0.93
2021-12-01  112     9          0.65
2022-01-01  133     10         ..
2022-02-01  145     12         ..
2022-03-01  123     5          ..
2022-04-01  111     4          ..
2022-05-01  164     9          ..
2022-06-01  135     12         ..
2022-07-01  124     14         null
2022-08-01  144     18         null
2022-09-01  111     111        1
2022-10-01  108     13         0.87
2022-11-01  123     7          0.94
2022-12-01  132     20         ..

I tried this:

select 
    month_rep,
    fruits,
    harvested,
    case when extract(month from "month_rep") in (7, 8) then null
         when extract(month from "month_rep") = 9 then 1
        else ROUND(fruits / sum(harvested) over (order by month_rep), 2) end sold
from my_table 

This works, but only while the data stops before September 2022. Jul and Aug should have a null sold, which works. After Aug, Sep 2022 should start a new period where sold is 1, and Oct 2022 should be calculated as fruits / (harvested + harvested_Sep_2022), as part of the second period (Sep 2022 to Jun 2023). Instead, the running sum keeps accumulating from Sep 2021.

Is there a way to group these "periods" and compute the running sum within each one? I probably need to derive a period key and use it in PARTITION BY.

How to solve :

Method 1

Shift the date back by 8 months, so that Sep-Jun moves to Jan-Oct (and Jul-Aug to Nov-Dec). Each period then falls within a single calendar year, so you can partition by year:

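SELECT *
     , CASE extract(month FROM month_rep)
         WHEN 7 THEN NULL  -- Jul and Aug fall outside the Sep-Jun period
         WHEN 8 THEN NULL
         WHEN 9 THEN 1     -- every period opens at 1 in September
         ELSE round(fruits::numeric  -- cast avoids integer division
                  / sum(harvested) OVER (PARTITION BY date_trunc('year', month_rep - interval '8 mon')
                                         ORDER BY month_rep), 2)  -- running sum restarts each September
       END AS sold
FROM   tbl
ORDER  BY month_rep;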

Produces your desired result exactly.
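
To see why the 8-month shift groups the rows correctly, it can help to list the partition key for each month. The sketch below uses generate_series in place of the real table so it runs on its own; the column names shifted and period_key are made up for illustration.

```sql
-- One row per month from Sep 2021 to Dec 2022, with the shifted date
-- and the year-truncated value that serves as the partition key.
SELECT d::date                                  AS month_rep
     , (d - interval '8 mon')::date             AS shifted
     , date_trunc('year', d - interval '8 mon')::date AS period_key
FROM   generate_series(timestamp '2021-09-01'
                     , timestamp '2022-12-01'
                     , interval '1 mon') AS d
ORDER  BY d;
-- 2021-09-01 .. 2022-08-01 all get period_key 2021-01-01
-- 2022-09-01 .. 2022-12-01 all get period_key 2022-01-01
```

Jul and Aug land in the same partition as the preceding Sep-Jun months, but since they sort after June they never contribute to any running sum that is actually reported.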

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.