Get the most recent 2 entries for each group based on timestamp

All we need is an easy explanation of the problem, so here it is.

I am using MySQL 5.7 database with the following columns

  • item_symbol
  • source_date
  • price

I have composite primary keys which are a combination of item_symbol and source_date. I have around 100k entries in the table.

My question here is, for each item_symbol, I would like to select item_symbol and price for the most recent (based on source date) 2 entries from the table.

An example of my table:

item_symbol price source_date
A 20.1 2021-06-10
A 18.2 2021-06-11
A 10.9 2021-06-13
A 21.0 2021-06-15
B 88.2 2021-06-10
B 60.9 2021-06-11
B 78.16 2021-06-13
B 79.0 2021-06-15

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

I went to my favourite MySQL "tips and tricks" site here and went to the common queries link and looked for the Top N per group section. The great thing about this site is that it tells you how to do stuff in MySQL for all versions – well, going back at least to MySQL 5.5 – and if you’re still running that, well…

I came up with the following adapted from above (all of the DDL, DML and SQL below is available on the fiddle here):

I used the DDL and DML from @nbk, kudos to him (and +1):

CREATE TABLE item
(
  item_symbol CHAR(1), 
  price       DECIMAL(10,2), 
  source_date DATE
);

and populate it:

INSERT INTO item
(item_symbol, price, source_date)
VALUES
('A', 20.1, '2021-06-10'),
('A', 18.2, '2021-06-11'),
('A', 10.9, '2021-06-13'),
('A', 21.0, '2021-06-15'),
('B', 88.2, '2021-06-10'),
('B', 60.9, '2021-06-11'),
('B', 78.16, '2021-06-13'),
('B', 79.0, '2021-06-15');

MySQL allows the use of user variables which are a godsend when you don’t have capabilities such as the ROW_NUMBER() window function which would have made this query trivial. I would strongly urge you to upgrade to version 8, it has many other goodies – CTEs, CHECK constraints…

Anyway, I’ll demonstrate the steps, partly to explain them to you, and partly to explain them to myself! 🙂

SELECT
  item_symbol, price, source_date,
  IF 
  (
    @prev <> item_symbol,
    @row_num := 1,
    @row_num := @row_num + 1
  ) AS my_rank,
  @prev := item_symbol
FROM item
JOIN 
  (
    SELECT @row_num := NULL, @prev := 0
  ) AS r
ORDER BY item_symbol, source_date DESC, price DESC;

Result:

item_symbol     price   source_date     my_rank     @prev := item_symbol
          A     21.00   2021-06-15            1     A
          A     10.90   2021-06-13            2     A
          A     18.20   2021-06-11            3     A
          A     20.10   2021-06-10            4     A
          B     79.00   2021-06-15            1     B
          B     78.16   2021-06-13            2     B
          B     60.90   2021-06-11            3     B
          B     88.20   2021-06-10            4     B
8 rows

So, we have the items (‘A’, ‘B’) order by date DESC (most recent first) with the price. Note that 12 lines of that query could be replaced by one ROW_NUMBER() function line!

So, now we wrap that in a query, pulling out those results whose my_rank value is <= 2 – which gives us the two most recent dates!

SELECT item_symbol, price, source_date, my_rank  -- this last one is not 
FROM                                             -- required - for clarity...
(
  SELECT
    item_symbol, price, source_date,
    IF 
    (
      @prev <> item_symbol,
      @row_num := 1,
      @row_num := @row_num + 1
    ) AS my_rank,
    @prev := item_symbol
    FROM item
    JOIN (SELECT @row_num := NULL, @prev := 0) AS r
    ORDER BY item_symbol, source_date DESC, price DESC  -- in case of ties!
) AS t
WHERE t.my_rank <= 2
ORDER BY item_symbol, source_date DESC; -- change this as required

Result:

item_symbol     price   source_date     my_rank
          A     21.00   2021-06-15            1
          A     10.90   2021-06-13            2
          B     79.00   2021-06-15            1
          B     78.16   2021-06-13            2

I would suggest that you spend some time browsing the artful softare site!

Method 2

If you have more than one price for one date you have to expand teh soltion and add a row_number which are made in Mysql 5,7 with @ variqables

CREATE TABLE item
    (`item_symbol` varchar(1), `price` DECIMAL(10,2), `source_date` Date)
;
    
INSERT INTO item
    (`item_symbol`, `price`, `source_date`)
VALUES
    ('A', 20.1, '2021-06-10'),
    ('A', 18.2, '2021-06-11'),
    ('A', 10.9, '2021-06-13'),
    ('A', 21.0, '2021-06-15'),
    ('B', 88.2, '2021-06-10'),
    ('B', 60.9, '2021-06-11'),
    ('B', 78.16, '2021-06-13'),
    ('B', 79.0, '2021-06-15')
;
SELECT i.`item_symbol`, `price` 
FROm item i INNER JOIN (SELECT `item_symbol`,MAX(`source_date`) maxdate FROM item  GROUP By `item_symbol`) t1
ON i.`item_symbol` = t1.`item_symbol` AND i.`source_date` = t1.maxdate
item_symbol | price
:---------- | ----:
A           | 21.00
B           | 79.00
SELECT `item_symbol`, `price`,rn
FROM(SELECT  `price` ,`source_date`
, IF(@id =  `item_symbol`,@rn:= @rn +1,@rn := 1) As rn ,@id :=  `item_symbol` as item_symbol 
FROM item , (SELECT @rn := 0, @id := 0) t1 ORDER BY `item_symbol`,`source_date` DESC) t2
WHERE rn <= 2
item_symbol | price | rn
:---------- | ----: | -:
A           | 21.00 |  1
A           | 10.90 |  2
B           | 79.00 |  1
B           | 78.16 |  2

db<>fiddle here

Method 3

You should be able to pull this one off by using something alike:

select [I].field1, [I].field2 ,[MR].Price]
from item [I]
cross apply ( 
        select top (2) Price from Item [s] where [s.].key = [I].Key 
        order by [S].DateField desc
         ) MR

Just out of mind, so adjust with your columns/tables.

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply