I am using a MySQL 5.7 database. The table has a composite primary key that includes source_date, and it holds around 100k entries.
My question here is, for each item_symbol, I would like to select price for the most recent (based on source date) 2 entries from the table.
An example of my table:
How to solve :
I went to my favourite MySQL "tips and tricks" site here and went to the common queries link and looked for the Top N per group section. The great thing about this site is that it tells you how to do stuff in MySQL for all versions – well, going back at least to MySQL 5.5 – and if you’re still running that, well…
I came up with the following adapted from above (all of the DDL, DML and SQL below is available on the fiddle here):
I used the DDL and DML from @nbk, kudos to him (and +1):
CREATE TABLE item ( item_symbol CHAR(1), price DECIMAL(10,2), source_date DATE );
and populate it:
INSERT INTO item (item_symbol, price, source_date) VALUES
  ('A', 20.1,  '2021-06-10'),
  ('A', 18.2,  '2021-06-11'),
  ('A', 10.9,  '2021-06-13'),
  ('A', 21.0,  '2021-06-15'),
  ('B', 88.2,  '2021-06-10'),
  ('B', 60.9,  '2021-06-11'),
  ('B', 78.16, '2021-06-13'),
  ('B', 79.0,  '2021-06-15');
MySQL allows the use of user variables, which are a godsend when you don’t have capabilities such as the ROW_NUMBER() window function, which would have made this query trivial. I would strongly urge you to upgrade to version 8; it has many other goodies – CTEs, CHECK constraints…
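For comparison, here is a rough sketch of what the window-function version might look like after an upgrade. It assumes MySQL 8.0 or later and the same item table created above; the alias my_rank is only a label chosen to match the 5.7 query that follows.

```sql
-- Sketch only: needs MySQL 8.0+ (window functions); it will NOT run on 5.7.
SELECT item_symbol, price, source_date
FROM
(
  SELECT
    item_symbol,
    price,
    source_date,
    -- number the rows of each item_symbol, most recent date first
    ROW_NUMBER() OVER (
      PARTITION BY item_symbol
      ORDER BY source_date DESC, price DESC   -- price breaks ties on date
    ) AS my_rank
  FROM item
) AS t
WHERE t.my_rank <= 2
ORDER BY item_symbol, source_date DESC;
```

The PARTITION BY clause restarts the numbering for every item_symbol, which is the job the user variables have to do by hand on 5.7.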
Anyway, I’ll demonstrate the steps, partly to explain them to you, and partly to explain them to myself! 🙂
SELECT
  item_symbol,
  price,
  source_date,
  IF
  (
    @prev <> item_symbol,
    @row_num := 1,
    @row_num := @row_num + 1
  ) AS my_rank,
  @prev := item_symbol
FROM item
JOIN
(
  SELECT
    @row_num := NULL,
    @prev := 0
) AS r
ORDER BY item_symbol, source_date DESC, price DESC;
item_symbol  price  source_date  my_rank  @prev := item_symbol
A            21.00  2021-06-15   1        A
A            10.90  2021-06-13   2        A
A            18.20  2021-06-11   3        A
A            20.10  2021-06-10   4        A
B            79.00  2021-06-15   1        B
B            78.16  2021-06-13   2        B
B            60.90  2021-06-11   3        B
B            88.20  2021-06-10   4        B
8 rows
So, we have the items (‘A’, ‘B’) ordered by date DESC (most recent first) with the price. Note that 12 lines of that query could be replaced by one ROW_NUMBER() function line!
So, now we wrap that in a query, pulling out those results whose my_rank value is <= 2 – which gives us the two most recent dates!
SELECT item_symbol, price, source_date, my_rank  -- this last one is not required - for clarity...
FROM
(
  SELECT
    item_symbol,
    price,
    source_date,
    IF
    (
      @prev <> item_symbol,
      @row_num := 1,
      @row_num := @row_num + 1
    ) AS my_rank,
    @prev := item_symbol
  FROM item
  JOIN (SELECT @row_num := NULL, @prev := 0) AS r
  ORDER BY item_symbol, source_date DESC, price DESC  -- in case of ties!
) AS t
WHERE t.my_rank <= 2
ORDER BY item_symbol, source_date DESC;  -- change this as required
item_symbol  price  source_date  my_rank
A            21.00  2021-06-15   1
A            10.90  2021-06-13   2
B            79.00  2021-06-15   1
B            78.16  2021-06-13   2
I would suggest that you spend some time browsing the Artful Software site!
If you have more than one price for one date, you have to expand the solution and add a row number, which in MySQL 5.7 is built with @ variables.
CREATE TABLE item (
  `item_symbol` varchar(1),
  `price` DECIMAL(10,2),
  `source_date` Date
);

INSERT INTO item (`item_symbol`, `price`, `source_date`) VALUES
  ('A', 20.1,  '2021-06-10'),
  ('A', 18.2,  '2021-06-11'),
  ('A', 10.9,  '2021-06-13'),
  ('A', 21.0,  '2021-06-15'),
  ('B', 88.2,  '2021-06-10'),
  ('B', 60.9,  '2021-06-11'),
  ('B', 78.16, '2021-06-13'),
  ('B', 79.0,  '2021-06-15');
SELECT i.`item_symbol`, `price`
FROM item i
INNER JOIN
  (SELECT `item_symbol`, MAX(`source_date`) maxdate
   FROM item
   GROUP BY `item_symbol`) t1
  ON i.`item_symbol` = t1.`item_symbol`
 AND i.`source_date` = t1.maxdate

item_symbol | price
:---------- | ----:
A           | 21.00
B           | 79.00
SELECT `item_symbol`, `price`, rn
FROM
  (SELECT
     `price`,
     `source_date`,
     IF(@id = `item_symbol`, @rn := @rn + 1, @rn := 1) AS rn,
     @id := `item_symbol` AS item_symbol
   FROM item, (SELECT @rn := 0, @id := 0) t1
   ORDER BY `item_symbol`, `source_date` DESC) t2
WHERE rn <= 2
item_symbol | price | rn
:---------- | ----: | -:
A           | 21.00 |  1
A           | 10.90 |  2
B           | 79.00 |  1
B           | 78.16 |  2
You should be able to pull this one off by using something like:
SELECT [I].field1, [I].field2, [MR].[Price]
FROM item AS [I]
CROSS APPLY
(
  SELECT TOP (2) [S].[Price]
  FROM item AS [S]
  WHERE [S].[Key] = [I].[Key]
  ORDER BY [S].DateField DESC
) AS [MR]
This is just off the top of my head, so adjust it to your columns/tables.
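Note that CROSS APPLY and TOP are SQL Server constructs and are not available in MySQL 5.7. A rough MySQL 5.7 sketch of the same "top 2 per key" idea, using a correlated subquery, might look like the following. It assumes the item table from the answers above, with at most one row per item_symbol and source_date; the alias newer is just an illustrative name.

```sql
-- Sketch only: keep a row when fewer than 2 rows of the same item_symbol
-- have a later source_date, i.e. the row is among the 2 most recent.
SELECT i.item_symbol, i.price, i.source_date
FROM item AS i
WHERE
(
  SELECT COUNT(*)
  FROM item AS newer
  WHERE newer.item_symbol = i.item_symbol
    AND newer.source_date > i.source_date
) < 2
ORDER BY i.item_symbol, i.source_date DESC;
```

This avoids user variables entirely, but the subquery is evaluated per row, so on a table of around 100k entries it will probably want an index that leads with item_symbol and source_date.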