SQL Multiple CASE conditions with SUM and count Functions along with JOIN Condition

Please help me correct the syntax of my query. I would really appreciate it if you could edit my existing query into a correct one rather than offering alternatives. Thanks in advance.

Table schema

CREATE TABLE customer (
  id INTEGER PRIMARY KEY,
  customer_name TEXT,
  dept TEXT
);

CREATE TABLE invoice (
  invoice_number INTEGER PRIMARY KEY,
  customer_id INTEGER,
  time_Canceled DATETIME,
  time_refunded DATETIME,
  time_issued DATETIME,
  time_paid DATETIME,
  total_price float
);

The task is to return customer_name, number_of_invoices, lifetime_Value, most_recent_invoice_time and most_recent_invoice_total_price, ordered by lifetime_Value descending and then by customer_name.

Rules

number_of_invoices – the total number of invoices associated with the customer over all time, regardless of whether an invoice was cancelled or refunded. Return zero if that customer has no invoices.

lifetime_Value – the sum of how much a customer has spent (total_price), totaled across all invoices. Exclude invoices that were cancelled or refunded. Return zero if that customer has no qualifying invoices. Zero values should be returned as an unsigned integer.

most_recent_invoice_time – the most recent time an invoice was issued for a customer, regardless of whether it was cancelled or refunded. Return NULL if that customer has no invoices.

most_recent_invoice_total_price – the sum of how much a customer has spent (total_price) on the invoices with the most recent time an invoice was paid for that customer, regardless of whether they were cancelled or refunded. Return NULL if that customer has no qualifying invoices. Zero values should be returned as an unsigned integer.

I get the following error: Invalid use of group function

Below is the query I have written so far:

SELECT cu.customer_name,
(CASE 
WHEN inv.invoice_number IS NOT NULL THEN count(inv.invoice_number)
ELSE 0
END) as number_of_invoices ,
SUM(
CASE 
  WHEN (inv.time_Canceled AND inv.time_refunded IS NOT NULL) Then inv.total_price 
  Else 0 
End ) as lifetime_Value,
(CASE 
  WHEN inv.invoice_number IS NOT NULL Then max(inv.time_issued) -- ORDER BY inv.time_issued DESC limit 1
  Else NULL 
End) as most_recent_invoice_time,
SUM(
CASE 
  WHEN inv.invoice_number IS NOT NULL Then max(inv.total_price) -- ORDER BY inv.time_paid DESC limit 1 
  Else NULL
End ) as most_recent_invoice_total_price

FROM customer cu inner join invoice inv ON cu.id = inv.customer_id

GROUP BY cu.customer_name, most_recent_invoice_time 

ORDER BY lifetime_Value DESC , customer_name

How to solve :

Method 1

To answer this question, I did the following (all of the code below is available on the fiddle here):

Tables – as per question.

Data – simulated.

INSERT INTO customer VALUES
(1, 'cust1', 'Sales'),
(2, 'cust2', 'Sales'),
(3, 'cust3', 'Sales'),
(4, 'cust4', 'Sales'),
(5, 'cust5', 'Sales');

and

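-- The values below are ordered issued, paid, canceled, refunded (order inferred
-- from the results further down), which differs from the column order in the
-- CREATE TABLE above, so the column list is given explicitly.
INSERT INTO invoice
  (invoice_number, customer_id, time_issued, time_paid, time_Canceled, time_refunded, total_price)
VALUES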
(1,  1, '2022-01-01', '2022-01-03',         NULL,         NULL,   1.0),
(2,  1, '2022-01-03', '2022-01-08',         NULL,         NULL,   1.0),
(3,  1, '2022-01-05', '2022-01-13',         NULL,         NULL,   1.0),

(4,  2, '2022-02-01', '2022-02-03',         NULL,         NULL,   2.0),
(5,  2, '2022-02-03', '2022-02-08',         NULL,         NULL,   2.0),
(6,  2, '2022-02-05', '2022-02-13',         NULL,         NULL,   2.0),


(7,  3, '2022-03-01',         NULL, '2022-03-07',         NULL,  3.0),
(8,  3, '2022-03-08',         NULL, '2022-03-14',         NULL,  3.4),
(9,  3, '2022-03-14',         NULL, '2022-03-21',         NULL,  3.4),

(10, 4, '2022-04-01', '2022-04-06', '2022-04-10', '2022-04-22',  4.0),  -- with 4, some refunded
(11, 4, '2022-04-02', '2022-04-07', '2022-04-12', '2022-04-25',  4.1),  -- some not!
(12, 4, '2022-04-03', '2022-04-09', '2022-04-16', '2022-04-28',  4.2),

(13, 4, '2022-04-20', '2022-04-27',         NULL,         NULL, 100.1),
(14, 4, '2022-04-21', '2022-04-29',         NULL,         NULL, 100.2);

The first query is "exploratory" – i.e. getting the required information together. Use is made of window functions – these are very powerful and are well worth getting to know – they will repay any effort spent on learning them many times over!

SELECT
  ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY c.id, i.time_issued) AS rn,
  i.invoice_number AS invno, c.id AS cid, c.customer_name AS cname, c.dept AS cdept,
  i.time_issued AS idate, i.time_paid AS ipaid, 
  i.time_canceled AS icancel, 
  i.time_refunded AS irefund,
  MAX(i.time_issued) OVER (PARTITION BY c.id) AS l_inv_date,  -- latest issue date per customer
  ROUND(COALESCE(i.total_price, 0), 2)   AS tot_price,
  ROUND(SUM(i.total_price) OVER (PARTITION By c.id ORDER BY c.id), 2) AS s_tot,
  ROUND(SUM(NULLIF(time_refunded IS NOT NULL, NULL) * total_price)
     OVER (PARTITION BY c.id ORDER BY c.id), 2) AS refund,

  ROUND(SUM(i.total_price) OVER (PARTITION By c.id ORDER BY c.id) -
  SUM(NULLIF(time_refunded IS NOT NULL, NULL) * total_price)
     OVER (PARTITION BY c.id ORDER BY c.id), 2) AS billed


FROM
  customer c
LEFT JOIN invoice i
  ON i.customer_id = c.id
ORDER BY cid, idate;

Result:

For the result, see the fiddle.
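
The refund column in the exploratory query relies on MySQL evaluating (time_refunded IS NOT NULL) to 1 or 0 and multiplying that by total_price; the NULLIF(..., NULL) wrapper around it does not change the value, because a comparison with NULL is never true. As a rough sketch of the same idea written with a CASE expression and a plain GROUP BY (the aliases cid and refund are reused from above, the rest is illustrative only):

```sql
-- Refunded amount per customer, written with CASE instead of boolean arithmetic.
-- Rows that were not refunded contribute NULL, which SUM ignores;
-- COALESCE turns "no refunded rows at all" into 0.
SELECT
  c.id AS cid,
  ROUND(COALESCE(SUM(CASE
                       WHEN i.time_refunded IS NOT NULL THEN i.total_price
                     END), 0), 2) AS refund
FROM
  customer c
LEFT JOIN invoice i
  ON i.customer_id = c.id
GROUP BY c.id
ORDER BY c.id;
```

One difference to be aware of: without the COALESCE, a customer with invoices but no refunds would get NULL here, whereas the boolean-multiplication form returns 0 for that customer.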

And then, we run the query:

SELECT
  cid, cname, inv_count, l_inv_date, s_tot, refund, valid_billed
FROM
(
  SELECT
    ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY c.id, i.time_issued) AS rn,
    c.id AS cid, c.customer_name AS cname,
    
    COUNT(i.invoice_number) OVER (PARTITION BY c.id ORDER BY c.id) AS inv_count,
    
    MAX(i.time_issued) OVER (PARTITION BY c.id) AS l_inv_date,  -- latest issue date per customer
    
    ROUND(COALESCE(SUM(i.total_price) OVER (PARTITION BY c.id ORDER BY c.id), 0), 2) AS s_tot,
    ROUND(COALESCE(SUM(NULLIF(time_refunded IS NOT NULL, NULL) * total_price)
      OVER (PARTITION BY c.id ORDER BY c.id), 0), 2) AS refund,

    ROUND(COALESCE(SUM(i.total_price) OVER (PARTITION By c.id ORDER BY c.id) -
      SUM(NULLIF(time_refunded IS NOT NULL, NULL) * total_price)
        OVER (PARTITION BY c.id ORDER BY c.id), 0), 2) AS valid_billed
FROM
  customer c
LEFT JOIN invoice i
  ON i.customer_id = c.id
) AS sub1
WHERE rn = 1
ORDER BY cid;

Result:

cid  cname  inv_count   l_inv_date  s_tot   refund  valid_billed
1    cust1          3   2022-01-05      3        0             3
2    cust2          3   2022-02-05      6        0             6
3    cust3          3   2022-03-14    9.8        0           9.8
4    cust4          5   2022-04-21  212.6     12.3         200.3
5    cust5          0         NULL      0        0             0

  • In future, when asking questions such as this, please include a fiddle with your tables and data. This is useful in two ways: it creates a single source of truth for the question, and it saves duplicated effort for those trying to help you.
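
For completeness, here is a sketch of how the query from the question itself might be repaired while keeping its GROUP BY shape. As far as I can tell, the "Invalid use of group function" error comes from nesting one aggregate inside another (SUM around a CASE that contains MAX), and grouping by the alias of an aggregate (most_recent_invoice_time) is a second problem. The sketch below is not tested against the original data and rests on some assumptions: MySQL as the database, "most recent invoice" for the last column being decided by time_paid (as the rules text suggests), and a derived table lp with a column last_paid, both of which are names I have made up here. It also does not attempt the "unsigned integer" requirement.

```sql
SELECT
  cu.customer_name,
  -- COUNT of a column ignores NULLs, so customers without invoices get 0.
  COUNT(inv.invoice_number) AS number_of_invoices,
  -- Only invoices that were neither cancelled nor refunded count towards the total.
  COALESCE(SUM(CASE
                 WHEN inv.time_Canceled IS NULL AND inv.time_refunded IS NULL
                 THEN inv.total_price
               END), 0) AS lifetime_Value,
  -- MAX over an empty group is NULL, so no CASE is needed.
  MAX(inv.time_issued) AS most_recent_invoice_time,
  -- Assumption: the "most recent" invoice is the one with the latest time_paid.
  -- Stays NULL when the customer has no paid invoices.
  SUM(CASE
        WHEN inv.time_paid = lp.last_paid THEN inv.total_price
      END) AS most_recent_invoice_total_price
FROM customer cu
LEFT JOIN invoice inv
  ON inv.customer_id = cu.id
-- lp (hypothetical name): latest payment time per customer, computed separately
-- so that no aggregate has to be nested inside another one.
LEFT JOIN (
  SELECT customer_id, MAX(time_paid) AS last_paid
  FROM invoice
  GROUP BY customer_id
) AS lp
  ON lp.customer_id = cu.id
GROUP BY cu.id, cu.customer_name
ORDER BY lifetime_Value DESC, cu.customer_name;
```

The main changes from the original are the LEFT JOIN (an INNER JOIN drops customers with no invoices, so zero and NULL could never be returned for them), the IS NULL test on both time_Canceled and time_refunded, and grouping by the customer rather than by an aggregated column.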

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
