How SUM() OVER works


Given this table:

CREATE TABLE Table1
(
  [Classroom] int,
  [CourseName] varchar(8),
  [Lesson] varchar(9),
  [StartTime] char(4),
  [EndTime] char(4)
);

and this data:

INSERT INTO Table1
(
  [Classroom], 
  [CourseName], 
  [Lesson], 
  [StartTime], 
  [EndTime]
)
VALUES
    (1001, 'Course 1', 'Lesson 1', '0800', '0900'),
    (1001, 'Course 1', 'Lesson 2', '0900', '1000'),
    (1001, 'Course 1', 'Lesson 3', '1000', '1100'),
    (1001, 'Course 1', 'Lesson 6', '1100', '1200'),
    (1001, 'Course 2', 'Lesson 10', '1100', '1200'),
    (1001, 'Course 2', 'Lesson 11', '1200', '1300'),
    (1001, 'Course 1', 'Lesson 4', '1300', '1400'),
    (1001, 'Course 1', 'Lesson 5', '1400', '1500');

And my query is:

With A AS 
(
  SELECT 
      ClassRoom
    , CourseName
    , StartTime
    , EndTime
    , PrevCourse = LAG(CourseName, 1, CourseName) OVER (ORDER BY StartTime)
  FROM   Table1
), B AS (
  SELECT 
      ClassRoom
    , CourseName
    , StartTime
    , EndTime
    , Ranker = SUM(CASE WHEN CourseName = PrevCourse THEN 0 ELSE 1 END)
                OVER (ORDER BY StartTime, CourseName)
  FROM   A
)
SELECT B.* FROM B;

This gives me the following result:

ClassRoom CourseName StartTime EndTime Ranker
1001      Course 1   0800      0900    0
1001      Course 1   0900      1000    0
1001      Course 1   1000      1100    0
1001      Course 1   1100      1200    0
1001      Course 2   1100      1200    1
1001      Course 2   1200      1300    1
1001      Course 1   1300      1400    2
1001      Course 1   1400      1500    2

Please focus on the Ranker column. If I have not misunderstood, on the very first row where the current course differs from the previous course the CASE adds 1, and on the following rows, where the current course equals the previous course, it adds 0. So I expected Ranker to be (0,0,0,0), (1,1), (1,1), but it gives me (0,0,0,0), (1,1), (2,2). Why does it give (2,2) at the end? Or am I missing something?

How to solve:

Method 1

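Your OVER clause has an ORDER BY but no PARTITION BY and no explicit frame, so the window frame for the SUM defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: all previous rows plus the current one. The SUM is therefore a running total across both courses, and the second change of course (back to Course 1 at 1300) brings it to 2. Partitioning by course restarts the total for each course. Is this closer to what you want?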

, Ranker = SUM(CASE WHEN CourseName = PrevCourse THEN 0 ELSE 1 END)
               OVER (PARTITION BY CourseName ORDER BY StartTime)

ClassRoom CourseName StartTime EndTime Ranker
1001      Course 1   0800      0900    0
1001      Course 1   0900      1000    0
1001      Course 1   1000      1100    0
1001      Course 1   1100      1200    0
1001      Course 2   1100      1200    1
1001      Course 2   1200      1300    1
1001      Course 1   1300      1400    1
1001      Course 1   1400      1500    1
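
To see where the 2 comes from, it can help to look at the 0/1 flag on each row next to both sums. The following is only a sketch, and it runs against SQLite through Python's sqlite3 module rather than SQL Server, so the syntax is adapted: aliases use AS, and COALESCE(LAG(...), CourseName) stands in for the three-argument LAG. The Flag, RunningTotal and PerCourse column names are made up for this illustration, and CourseName is added to the LAG ordering as an assumed tie-breaker for the two rows that start at 1100.

```python
import sqlite3

# Same rows as the INSERT in the question.
rows = [
    (1001, "Course 1", "Lesson 1", "0800", "0900"),
    (1001, "Course 1", "Lesson 2", "0900", "1000"),
    (1001, "Course 1", "Lesson 3", "1000", "1100"),
    (1001, "Course 1", "Lesson 6", "1100", "1200"),
    (1001, "Course 2", "Lesson 10", "1100", "1200"),
    (1001, "Course 2", "Lesson 11", "1200", "1300"),
    (1001, "Course 1", "Lesson 4", "1300", "1400"),
    (1001, "Course 1", "Lesson 5", "1400", "1500"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Classroom INT, CourseName TEXT, Lesson TEXT,"
    " StartTime TEXT, EndTime TEXT)"
)
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH A AS (
  SELECT CourseName, StartTime,
         -- Assumption: CourseName breaks the tie at 1100 so LAG is deterministic.
         COALESCE(LAG(CourseName) OVER (ORDER BY StartTime, CourseName),
                  CourseName) AS PrevCourse
  FROM Table1
), B AS (
  SELECT CourseName, StartTime,
         -- 1 on a row whose course differs from the previous row, else 0.
         CASE WHEN CourseName = PrevCourse THEN 0 ELSE 1 END AS Flag
  FROM A
)
SELECT CourseName, StartTime, Flag,
       -- No PARTITION BY: running total over all rows so far.
       SUM(Flag) OVER (ORDER BY StartTime, CourseName) AS RunningTotal,
       -- PARTITION BY: the running total restarts for each course.
       SUM(Flag) OVER (PARTITION BY CourseName ORDER BY StartTime) AS PerCourse
FROM B
ORDER BY StartTime, CourseName
"""

result = conn.execute(query).fetchall()
flags = [r[2] for r in result]
running_total = [r[3] for r in result]
per_course = [r[4] for r in result]

for r in result:
    print(r)
```

The flag is 1 on exactly two rows: Course 2 at 1100 and Course 1 at 1300. Without PARTITION BY the first 1 is carried forward and the second one is added on top of it, which gives (2,2). With PARTITION BY CourseName each course keeps its own total, so both the Course 2 rows and the later Course 1 rows end up at 1.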


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
