id      creation                    operation      value  running sum
SyJw-c  2016-09-01 00:11:08.307419  positive_op_1   1.33  28.82
SyJw-c  2016-08-21 08:32:54.431662  negative_op_1  -1     27.49
SyJw-c  2016-08-18 07:38:33.878365  positive_op_2   1     28.49
SyJw-c  2016-08-14 18:12:03.599797  negative_op_1  -1     27.49
SyJw-c  2016-08-02 15:44:29.693303  positive_op_1   1.33  28.49
SyJw-c  2016-07-31 12:08:50.659905  override_op_1   4.66  27.16
SyJw-c  2016-06-26 06:53:54.537603  negative_op_1  -3.5   22.5
SyJw-c  2016-05-31 13:34:08.005687  negative_op_1  -1     26
SyJw-c  2016-05-31 13:34:04.776970  negative_op_1  -1     27
SyJw-c  2016-05-31 11:27:09.502983  override_op_2  28     28
But my case is more complex. Not only do I need to sum up the values, I first need to convert some rows based on the running sum of the row directly beneath each of them.
Let me first explain the motivation:
Currently I have a table with incremental, decremental and override operations. I would like to port the data to a table with only incremental and decremental operations such that I would be able to straightforwardly sum up the values. I am not looking to maintain the old table, simply a way to migrate the data into a simpler model and henceforward to append data to the new table only.
Given the "raw" table above, I would like to write a query (I am running PostgreSQL 9.5) that returns a table resembling the one below as closely as possible. (Alternatively, I would like to know if what I am attempting is impossible.)
Note that the override operators are interspersed between the normal operators, and they may appear more often than the two times shown in the example. Also, all initial operators (the earliest in the table) are overrides with an initial value that should be taken into account, as in the example below. Moreover, I have shown only data belonging to one group (same id), but the general idea is to perform this migration for all groups. Lastly, I show the math in parentheses; I don't need that in the result, it is for the example only:
id      creation   oper      transformed_op  value  transformed_value       running sum
SyJw-c  2016- ...  pos_op_1                   1.33  1.33                    10.98
SyJw-c  2016- ...  neg_op_1                  -1     -1                       9.65
SyJw-c  2016- ...  pos_op_2                   1     1                       10.65
SyJw-c  2016- ...  neg_op_1                  -1     -1                       9.65
SyJw-c  2016- ...  pos_op_1                   1.33  1.33                    10.65
SyJw-c  2016- ...  ovr_op_1  new_rel_op_1     4.66  (4.66-22.5) = -17.84     4.66
SyJw-c  2016- ...  neg_op_1                  -3.5   -3.5                    22.5
SyJw-c  2016- ...  neg_op_1                  -1     -1                      26
SyJw-c  2016- ...  neg_op_1                  -1     -1                      27
SyJw-c  2016- ...  ovr_op_2  new_rel_op_2    28     (28-0) = 28             28
The table is shown ordered from last to first. The running sum of a row is the running sum of the row beneath it plus the row's own value, for example 22.5 = 26 - 3.5. Whether the subtraction (this_value - previous_sum) is performed depends on the value of transformed_op. When the original op was an override op, I would like to perform some action based on the running sum of the row preceding it (the row directly beneath it when ordered by creation desc), in this case subtract that running sum from the row's value.
How to solve :
Based on this table definition:
CREATE TABLE tbl (  -- no PK?
   id          text NOT NULL
 , creation    timestamp UNIQUE NOT NULL
 , operation   text NOT NULL
 , value       numeric NOT NULL
 , running_sum numeric  -- optional (not needed for task)
);
Data types and constraints are almost always essential.
(creation does not strictly have to be unique. But if there can be duplicate values per group (id), you need to do more, for example add a tiebreaker column to ORDER BY so that the row order is deterministic.)
Basic query to compute your special running sum
SELECT id, creation, operation, value
     , sum(value) OVER (PARTITION BY id, run ORDER BY creation) AS running_sum
FROM  (
   SELECT *, count(*) FILTER (WHERE operation LIKE 'override_op_%')
                      OVER (PARTITION BY id ORDER BY creation) AS run
   FROM   tbl
   ) t
ORDER  BY id, creation DESC;
Any operation name starting with ‘override_op_’ indicates the start of a new run (group, patch, partition).
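To see what the run counter does, here is a minimal Python sketch that mimics the two window functions on the sample rows from the question. It only illustrates the logic and is not part of the query; the names rows and running_sums are made up for this example, it handles a single id, and it uses a plain prefix test where the query uses LIKE.

```python
from decimal import Decimal

# Sample rows from the question for one id, oldest first: (operation, value).
rows = [
    ("override_op_2", "28"),
    ("negative_op_1", "-1"),
    ("negative_op_1", "-1"),
    ("negative_op_1", "-3.5"),
    ("override_op_1", "4.66"),
    ("positive_op_1", "1.33"),
    ("negative_op_1", "-1"),
    ("positive_op_2", "1"),
    ("negative_op_1", "-1"),
    ("positive_op_1", "1.33"),
]

def running_sums(rows):
    """Return (run, running_sum) per row, restarting the sum at each override."""
    result = []
    run = 0
    total = Decimal("0")
    for operation, value in rows:
        if operation.startswith("override_op_"):
            run += 1              # the filtered count(*) goes up by one here
            total = Decimal("0")  # a new run is a new partition for sum()
        total += Decimal(value)
        result.append((run, total))
    return result

for (operation, value), (run, total) in zip(rows, running_sums(rows)):
    print(run, operation, value, total)
```

The first four rows land in run 1 and the rest in run 2, so the sum restarts at 4.66 on the second override instead of continuing from 22.5.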
In addition to the related answer you already linked to, consider this related question for details on how to partition rows into groups (called run in this query, since you are using the term "group" for rows sharing the same id).
I use the aggregate FILTER clause (new in PostgreSQL 9.4) for the partial count.
You could use the simpler (less clear) expression in older versions:
count(operation LIKE 'override_op_%' OR NULL)
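The shorter expression works because count(expr) skips NULL values, and in SQL's three-valued logic false OR NULL is NULL while true OR NULL is true. A rough Python sketch of that behaviour, with None standing in for NULL (the helper names sql_or and sql_count are hypothetical, chosen for this illustration):

```python
def sql_or(a, b):
    """Three-valued OR: True wins, otherwise None (NULL) wins over False."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def sql_count(values):
    """Like count(expr): count every value that is not NULL (None)."""
    return sum(1 for v in values if v is not None)

operations = ["override_op_2", "negative_op_1", "negative_op_1", "override_op_1"]
flags = [sql_or(op.startswith("override_op_"), None) for op in operations]
print(flags)             # [True, None, None, True]
print(sql_count(flags))  # 2
```

Without the OR NULL part, false values would be counted too, since count() only ignores NULL.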
Based on this, you can easily compute the desired delta:
SELECT *
     , running_sum - lag(running_sum, 1, numeric '0')  -- data type must match!
                     OVER (PARTITION BY id ORDER BY creation) AS transformed_value
FROM  (
   SELECT id, creation, operation, value
        , sum(value) OVER (PARTITION BY id, run ORDER BY creation) AS running_sum
   FROM  (
      SELECT *, count(*) FILTER (WHERE operation LIKE 'override_op_%')
                         OVER (PARTITION BY id ORDER BY creation) AS run
      FROM   tbl
      ) t
   ) t
ORDER  BY id, creation DESC;
I use the 3-parameter form of the window function lag() to provide 0 (data type must match!) as default value for the first row of each partition (id).
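As a cross-check of the arithmetic, this minimal Python sketch applies the same "running sum minus previous running sum, defaulting to 0" step to the per-run running sums that the inner query yields for the sample id. The list running and the function deltas are names assumed for this illustration only.

```python
from decimal import Decimal

# Per-run running sums for the sample id, oldest first, as the inner query
# would compute them (the sum restarts at each override row).
running = [Decimal(s) for s in (
    "28", "27", "26", "22.5",                        # run 1, starts at override_op_2
    "4.66", "5.99", "4.99", "5.99", "4.99", "6.32",  # run 2, starts at override_op_1
)]

def deltas(sums, default=Decimal("0")):
    """Mimic running_sum - lag(running_sum, 1, default) within one id."""
    previous = [default] + sums[:-1]  # lag(): shift by one row, fill with default
    return [cur - prev for cur, prev in zip(sums, previous)]

print([str(d) for d in deltas(running)])
```

The override row at 4.66 becomes -17.84 (4.66 - 22.5), and the very first row keeps its value of 28 because the default 0 is subtracted, which agrees with the transformed_value column in the question.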