Use CTE to fetch most recent row for each ID

All we need is an easy explanation of the problem, so here it is.

I just recently started learning about the wonders of CTEs and how they can be used, instead of WHILE loops, to optimize performance.

For context, I am automating the process of updating firmware on devices in my company. The devices send UDP data to the server and store the messages into a single table, so the table is quite large

There are 2 modules on each device, high and low frequency and the messages are very similar

Previously, I had temp tables, counters, and a while loop to get the most recent firmware message for each device on each module. I have since used the following CTE to extract the same information (only high frequency code shown, table and column names changed for confidentiality)

;WITH HFCTE (DIdentifier, CachedHFFWVersion, CurrentHFFWVersion, HFFWMessage, HFFWMessageDate) AS
(
    --Get all HF Firmware Messages for each device
    SELECT  udp.DIdentifier,
            LEFT(RIGHT(UDPMessage,7),3),
            RIGHT(UDPMessage, 3),
            UDPMessage,
            DateRecorded
        FROM UDPTable udp WITH (NOLOCK)
            INNER JOIN FirmwareInfoTable fw ON udp.DIdentifier = fw.DIdentifier
        WHERE UDPMessage LIKE '%FWMessage:%' 
            AND UDPMessage NOT LIKE '%FWMessage:LF%' 
            AND fw.Status = 'HF Pending'
)

-- Get the most recent HF Firmware Message for each device
SELECT cte.DIdentifier, 
cte.CachedHFFWVersion, 
cte.CurrentHFFWVersion, 
cte.HFFWMessage, 
cte.HFFWMessageDate
FROM HFCTE cte 
    INNER JOIN (
                SELECT DIdentifier,
                    MAX(HFFWMessageDate) AS MaxDate
                FROM HFCTE
                GROUP BY DIdentifier
               ) AS gcte
        ON cte.DIdentifier= gcte.DIdentifierAND cte.HFFWMessageDate = gcte.MaxDate
ORDER BY cte.DIdentifier DESC

The main problem is that before picking out the most recent firmware message using the last SELECT statement, the CTE has to get all FWMessage records for each device, which could be very bad for performance during peak hours.

Is there any way to only pull the one record I need for each device in my initial WITH CTE query? Would it even optimize the overall performance of the query? Should I be using a different method entirely?

P.S. DIdentifier is a PRIMARY_KEY in the FirmwareInfo table and a FOREIGN_KEY (Not to the Firmware table but to the table that holds devices) in the UDP table. The UDP table also has 2 relevant indexes, both non-clustered & non-unique. One includes DIdentifier and UDPMessage, the other includes DIdentifier and DateRecorded. I cannot create a new index for UDP table due to performance constraints

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

You can try using a window function like ROW_NUMBER() or DENSE_RANK() to generate an ID that represents the latest rows per your sort criteria. Here’s an example:

;WITH HFCTE (DIdentifier, CachedHFFWVersion, CurrentHFFWVersionn, HFFWMessage, HFFWMessageDate, SortId) AS
(
    --Get all HF Firmware Messages for each device
    SELECT  udp.DIdentifier,
        LEFT(RIGHT(UDPMessage,7),3),
         RIGHT(UDPMessage, 3),
         UDPMessage,
         DateRecorded,
         DENSE_RANK() OVER (PARTITION BY udp.DIdentifier ORDER BY DateRecorded DESC) AS SortId
    FROM UDPTable udp WITH (NOLOCK)
        INNER JOIN FirmwareInfoTable fw ON udp.DIdentifier = fw.DIdentifier
    WHERE UDPMessage LIKE '%FWMessage:%' 
        AND UDPMessage NOT LIKE '%FWMessage:LF%' 
        AND fw.Status = 'HF Pending'
)
     
SELECT cte.DIdentifier, 
    cte.CachedHFFWVersion, 
    cte.CurrentHFFWVersion, 
    cte.HFFWMessage, 
    cte.HFFWMessageDate
FROM HFCTE cte 
WHERE cte.SortId = 1
ORDER BY cte.DIdentifier DESC

This eliminates the need to join your CTE to itself in a subquery and further upstreams the logic for determining the latest records by date. Also the window functions tend to be rather performant for this kind of problem.

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply