Query to find the count of active days (days with status = ON) between specified dates

I have the following table:

create table z_test_duration
( Days     date,
  Status   char(8)
);

Sample data:

Days      Status
1/1/2022  on
1/2/2022  on
1/3/2022  on
1/4/2022  off
1/5/2022  on
1/6/2022  off
1/7/2022  on
1/8/2022  on
1/9/2022  off
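
For anyone who wants to reproduce this, a setup script along these lines should load the sample rows. It is only a sketch: it assumes SQL Server, reads the dates above as month/day/year, and uses the unseparated yyyymmdd literal form so the strings convert to date the same way regardless of session language settings.

-- Assumes the z_test_duration table defined above already exists.
-- Dates are written as yyyymmdd; the sample data is read as month/day/year.
INSERT INTO z_test_duration (Days, Status)
VALUES
    ('20220101', 'on'),
    ('20220102', 'on'),
    ('20220103', 'on'),
    ('20220104', 'off'),
    ('20220105', 'on'),
    ('20220106', 'off'),
    ('20220107', 'on'),
    ('20220108', 'on'),
    ('20220109', 'off');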

The desired result is:

ON_DATE   OFF_DATE  COUNT_OF_ACTIVE_DAYS
1/1/2022  1/4/2022  3
1/5/2022  1/6/2022  1
1/7/2022  1/9/2022  2

My solution so far:

select min(days) on_date,
       off_day off_date,
       datediff(day, min(days), off_day) cnt
  from (select t1.off_day,
               t1.prev_offday,
               t2.days
          from (
                -- each 'off' day with the previous 'off' day; the first one
                -- falls back to a date well before the data starts
                select t.days off_day,
                       coalesce(lag(t.days, 1) over(order by t.days),
                                dateadd(day, -100, convert(date, '20220101'))) prev_offday
                  from z_test_duration t
                 where t.status = 'off'
                ) t1
         inner join z_test_duration t2
            on t2.days > t1.prev_offday
           and t2.days < t1.off_day) x
 group by off_day;

I’m wondering whether there is a better way to solve this. I would appreciate it if you shared your approach.

Thanks in advance.

How to solve :

Method 1

This is an ‘islands’ problem (often called ‘gaps and islands’).

One popular and efficient solution is to number rows in the desired order. When there is a gap in the sequence, the difference between the ordering column and the row number also jumps.

Let’s look at that step by step. First, the numbering:

SELECT 
    Z.*, 
    Seq = Z.[Days], -- ordering column
    rn = ROW_NUMBER() OVER (ORDER BY Z.[Days]) -- numbering
FROM dbo.z_test_duration AS Z
WHERE Z.[Status] = 'on';

Days        Status  Seq         rn
2022-01-01  on      2022-01-01  1
2022-01-02  on      2022-01-02  2
2022-01-03  on      2022-01-03  3
2022-01-05  on      2022-01-05  4
2022-01-07  on      2022-01-07  5
2022-01-08  on      2022-01-08  6

Notice the Seq values increase at the same rate as rn until there is a gap. We can see this more clearly by subtracting rn from the Seq value.

The only slight complication here is Seq is a date, so we need to convert that to a number before subtracting. I have used the DATEDIFF function here, but any consistent method of turning a date into a number would do.

SELECT 
    Z.*, 
    Seq = Z.[Days],
    diff = 
        DATEDIFF(DAY, '2022-01-01', Z.[Days]) - 
            ROW_NUMBER() OVER (
                ORDER BY Z.[Days]) 
FROM dbo.z_test_duration AS Z
WHERE Z.[Status] = 'on';

Days        Status  Seq         diff
2022-01-01  on      2022-01-01  -1
2022-01-02  on      2022-01-02  -1
2022-01-03  on      2022-01-03  -1
2022-01-05  on      2022-01-05  0
2022-01-07  on      2022-01-07  1
2022-01-08  on      2022-01-08  1

The diff values are the same for every contiguous element in a group.
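
As an illustration of another consistent conversion, the sketch below skips the integer step and subtracts rn days from the date itself, so the group key comes out as a date. The column name grp_date is made up for this example; the table and data are the ones from the question.

-- Variant sketch: subtract rn days from the date itself.
-- Every row in a contiguous run of days lands on the same grp_date.
SELECT
    Z.[Days],
    grp_date =
        DATEADD(DAY,
            -1 * ROW_NUMBER() OVER (ORDER BY Z.[Days]),
            Z.[Days])
FROM dbo.z_test_duration AS Z
WHERE Z.[Status] = 'on';

With the sample data, the first three rows share grp_date 2021-12-31, the row for 2022-01-05 gets 2022-01-01, and the last two rows share 2022-01-02. The particular values carry no meaning; only their equality within a run matters.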

Now we know how to group, the final query follows directly:

SELECT
    ON_DATE = MIN(G.Seq), 
    OFF_DATE = DATEADD(DAY, 1, MAX(G.Seq)),
    COUNT_OF_ACTIVE_DAYS = 1 + DATEDIFF(DAY, MIN(G.Seq), MAX(G.Seq))
FROM 
(
    SELECT 
        Z.*, 
        Seq = Z.[Days],
        grp = 
            DATEDIFF(DAY, '2022-01-01', Z.[Days]) - 
                ROW_NUMBER() OVER (
                    ORDER BY Z.[Days]) 
    FROM dbo.z_test_duration AS Z
    WHERE Z.[Status] = 'on'
) AS G
GROUP BY G.grp;

ON_DATE     OFF_DATE    COUNT_OF_ACTIVE_DAYS
2022-01-01  2022-01-04  3
2022-01-05  2022-01-06  1
2022-01-07  2022-01-09  2
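
The query above derives OFF_DATE as the day after the last 'on' day, which relies on the table holding one row per calendar day. If that cannot be guaranteed, one possible alternative (a sketch, not part of the original answer) is to group each row by the number of 'off' rows that precede it and read the off date from the data. It assumes SQL Server 2012 or later for the window frame, and that each run of 'on' rows is closed by an 'off' row; a run that is still open gets a NULL OFF_DATE.

SELECT
    ON_DATE = MIN(CASE WHEN G.[Status] = 'on' THEN G.[Days] END),
    OFF_DATE = MAX(CASE WHEN G.[Status] = 'off' THEN G.[Days] END),
    COUNT_OF_ACTIVE_DAYS = COUNT(CASE WHEN G.[Status] = 'on' THEN 1 END)
FROM
(
    SELECT
        Z.[Days],
        Z.[Status],
        -- number of 'off' rows strictly before this row (0 for the first row)
        grp = ISNULL(
            SUM(CASE WHEN Z.[Status] = 'off' THEN 1 ELSE 0 END) OVER (
                ORDER BY Z.[Days]
                ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0)
    FROM dbo.z_test_duration AS Z
) AS G
GROUP BY G.grp
-- skip groups made up only of 'off' rows (consecutive 'off' days)
HAVING COUNT(CASE WHEN G.[Status] = 'on' THEN 1 END) > 0
ORDER BY ON_DATE;

Note that this version counts 'on' rows instead of calendar days, so the two queries agree on the sample data but can differ when days are missing from the table.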

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
