# Get a count of consecutive dates


I want to get a list of the runs of consecutive dates, together with the number of dates in each run.

For example, if I have the following data set:

``````
Date
2021-07-28
2021-07-27
2021-07-26
2021-07-25
2021-07-24
2021-07-23
2021-07-22
2021-07-21
2021-07-18
2021-07-17
2021-07-14
2021-07-11
2021-07-09
2021-07-06
2021-07-04
2021-07-03
2021-07-02
``````

The result I would like is the first date and the length of every run where the consecutive date count is greater than x:

``````
2021-07-21  8
2021-07-17  2
2021-07-02  3
``````

I’m not really sure how to approach this problem. If an explanation could be provided with the query that would be great, although not required.

## How to solve


### Method 1

As correctly noted by Charlieface, this is a Gaps and Islands problem. Another way of solving this specific variation – also involving a window function, though a different one this time – would go like this:

``````
WITH
  partitioned AS
  (
    SELECT
      *
      -- day number minus row position: constant within a run of consecutive dates
    , DATEDIFF(Date, '1970-01-01') - ROW_NUMBER() OVER (ORDER BY Date ASC) AS PartID
    FROM
      YourTable
  )
SELECT
  MIN(Date) AS StartDate
, COUNT(*)  AS DayCount
FROM
  partitioned
GROUP BY
  PartID
HAVING
  COUNT(*) > 1   -- keep only runs of two or more dates
ORDER BY
  PartID
;
``````

This solution relies on the fact that the difference between a date represented as an integer (`DATEDIFF(...)`) and the date’s position in the ordered sequence (`ROW_NUMBER() OVER ...`) stays constant for as long as the dates are consecutive. If we looked at the intermediate values returned by the functions in the `PartID` expression, we would find the following:

| Date | `DATEDIFF(Date, '1970-01-01')` | `ROW_NUMBER() OVER (ORDER BY Date ASC)` | PartID |
|---|---|---|---|
| 2021-07-02 | 18810 | 1 | 18809 |
| 2021-07-03 | 18811 | 2 | 18809 |
| 2021-07-04 | 18812 | 3 | 18809 |
| 2021-07-06 | 18814 | 4 | 18810 |
| 2021-07-09 | 18817 | 5 | 18812 |
| 2021-07-11 | 18819 | 6 | 18813 |
| 2021-07-14 | 18822 | 7 | 18815 |
| 2021-07-17 | 18825 | 8 | 18817 |
| 2021-07-18 | 18826 | 9 | 18817 |
| 2021-07-21 | 18829 | 10 | 18819 |
| 2021-07-22 | 18830 | 11 | 18819 |
| 2021-07-23 | 18831 | 12 | 18819 |
| 2021-07-24 | 18832 | 13 | 18819 |
| 2021-07-25 | 18833 | 14 | 18819 |
| 2021-07-26 | 18834 | 15 | 18819 |
| 2021-07-27 | 18835 | 16 | 18819 |
| 2021-07-28 | 18836 | 17 | 18819 |

As you can see, the difference between `DATEDIFF` and `ROW_NUMBER` (represented by the column `PartID`) is the same where dates are consecutive, and it is different for different sequences, which makes it a perfect candidate for a GROUP BY criterion. And that is exactly what the query is using it for. By the way, the date `1970-01-01` has no specific meaning in this case. Any date could be used instead of it as long as it is a constant value.
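The same arithmetic can be checked outside the database. The following is a minimal Python sketch, not part of the original answer: `date.toordinal()` stands in for `DATEDIFF(Date, '1970-01-01')` and `enumerate` stands in for `ROW_NUMBER()`. The function name `consecutive_runs` and the `min_length` parameter are made up for illustration.

```python
from datetime import date
from itertools import groupby


def consecutive_runs(dates, min_length=2):
    """Return (start_date, day_count) for each run of consecutive dates.

    Assumes the dates are unique, as the SQL version does.
    """
    ordered = sorted(dates)
    # Key = day number minus 1-based position; constant within a run.
    keyed = [(d.toordinal() - i, d) for i, d in enumerate(ordered, start=1)]
    runs = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        members = [d for _, d in group]
        if len(members) >= min_length:  # same filter as HAVING COUNT(*) > 1
            runs.append((members[0], len(members)))
    return runs


sample = [date(2021, 7, day) for day in
          (2, 3, 4, 6, 9, 11, 14, 17, 18, 21, 22, 23, 24, 25, 26, 27, 28)]

for start, count in consecutive_runs(sample):
    print(start, count)
```

With the question's data this prints the three runs starting on 2021-07-02, 2021-07-17 and 2021-07-21, with lengths 3, 2 and 8.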

Another important note to make – and it makes this answer substantially different from Charlieface’s suggestion – is that all the dates must be unique for the method to work as expected.
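To see why uniqueness matters, here is a small Python sketch (an illustration, not taken from the original answer) that applies the same "day number minus position" key to a list containing a duplicate. The helper name `part_ids` is hypothetical. In SQL, the analogous fix would be to deduplicate the dates before numbering the rows.

```python
from datetime import date


def part_ids(dates):
    """Day number minus 1-based position, mirroring the PartID expression."""
    ordered = sorted(dates)
    return [d.toordinal() - i for i, d in enumerate(ordered, start=1)]


with_duplicate = [date(2021, 7, 2), date(2021, 7, 2), date(2021, 7, 3)]

# The duplicate shifts every later row number, so one run is split in two.
broken = part_ids(with_duplicate)
print(len(set(broken)))  # 2 distinct keys for what is really one run

# Removing duplicates first gives a single key again.
fixed = part_ids(set(with_duplicate))
print(len(set(fixed)))  # 1
```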

A live demo of this solution can be found at db<>fiddle.

### Method 2

This is a type of gaps-and-islands problem, for which there are a number of solutions.

Here is one:

• We can identify the starting points of each island by using `LAG` to check the previous row (with a default for the first row)
• We can then number the islands using a running `COUNT`
• Then simply group by that number
``````
WITH StartingPoints AS (
    SELECT *,
      -- flag a row as an island start when the previous date plus one day is still before it
      CASE WHEN DATE_ADD(
          LAG(`Date`, 1, '1900-01-01') OVER (ORDER BY `Date`),
          INTERVAL 1 DAY
        ) < `Date` THEN 1 END AS IsStart
    FROM YourTable
),
Grouped AS (
    SELECT *,
      -- a running count of the start flags numbers the islands
      COUNT(IsStart) OVER (ORDER BY `Date` ROWS UNBOUNDED PRECEDING) AS GroupId
    FROM StartingPoints
)
SELECT
  MIN(`Date`) AS StartDate,
  COUNT(*) AS `Count`
FROM Grouped
GROUP BY GroupId
HAVING COUNT(*) > 1
ORDER BY GroupId DESC;
``````
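The three steps of this method can also be traced in plain Python. This is a sketch for illustration only, not the answer author's code: comparing each date with its predecessor plays the role of `LAG`, and a running counter plays the role of the windowed `COUNT(IsStart)`. The name `islands_by_start_flags` is hypothetical, and the result is returned in ascending order, whereas the query sorts by `GroupId DESC`.

```python
from datetime import date, timedelta


def islands_by_start_flags(dates, min_length=2):
    """Group dates into islands by flagging the first row of each island."""
    groups = {}      # group id -> dates belonging to that island
    group_id = 0
    previous = None  # stands in for LAG(...) and its far-past default
    for current in sorted(dates):
        # A row starts a new island when the previous date plus one day
        # is still before it, or when there is no previous row at all.
        is_start = previous is None or previous + timedelta(days=1) < current
        if is_start:
            group_id += 1  # running count of start flags
        groups.setdefault(group_id, []).append(current)
        previous = current
    return [(members[0], len(members))
            for members in groups.values()
            if len(members) >= min_length]


sample = [date(2021, 7, day) for day in
          (2, 3, 4, 6, 9, 11, 14, 17, 18, 21, 22, 23, 24, 25, 26, 27, 28)]

for start, count in islands_by_start_flags(sample):
    print(start, count)
```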


### Method 3

A solution that uses neither user-defined variables nor CTEs:

``````
SELECT t1.`Date` range_start,
       MIN(t2.`Date`) range_finish,
       DATEDIFF(MIN(t2.`Date`), t1.`Date`) + 1 range_length
FROM test t1
JOIN test t2 ON t2.`Date` >= t1.`Date`
WHERE NOT EXISTS ( SELECT NULL
                   FROM test t3
                   WHERE t3.`Date` = t1.`Date` - INTERVAL 1 DAY )
  AND NOT EXISTS ( SELECT NULL
                   FROM test t4
                   WHERE t4.`Date` = t2.`Date` + INTERVAL 1 DAY )
GROUP BY range_start
HAVING range_length > 1
ORDER BY range_start
``````

https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=dd87574aea9b024ebf1fe5096e31210f

`t1` selects the start of each consecutive range (the `NOT EXISTS` over `t3` ensures that there is no date on the previous day).

Likewise, `t2` selects the end of each consecutive range (the `NOT EXISTS` over `t4` ensures that there is no date on the following day).

For each start, the join condition keeps every range end on or after it, all of which are candidates for the end of that particular range. Grouping with `MIN()` then picks the earliest candidate, which is the end that belongs to that start.

`HAVING` removes the ranges whose length is only 1 day.
