SQL query max aggregate selecting correct nonaggregate columns?

I am confused about the max() aggregate function. I think this query makes sense:

select title, vote_avg, floor(cast(substr(date, 1, 4) as integer)/10)*10 as decade
from films
where (vote_avg, decade) in (
  select max(vote_avg), floor(cast(substr(date, 1, 4) as integer)/10)*10 as decade
  from films
  where vote_cnt > 100
  group by decade
)
and vote_cnt > 100
group by decade
order by decade desc;
title                            vote_avg  decade
-------------------------------  --------  ------
Homecoming: A Film by Beyoncé    8.6       2010  
Spirited Away                    8.5       2000  
Dilwale Dulhania Le Jayenge      8.9       1990  
The Empire Strikes Back          8.4       1980  
We All Loved Each Other So Much  8.6       1970  
Psycho                           8.4       1960  
12 Angry Men                     8.4       1950  
The Great Dictator               8.4       1940  
City Lights                      8.4       1930  
Sherlock Jr.                     8.2       1920  
The Immigrant                    7.5       1910  
The Great Train Robbery          7.2       1900  

But I am not sure why this one works:

select title, max(vote_avg), floor(cast(substr(date, 1, 4) as integer)/10)*10 as decade
from films
where vote_cnt > 100
group by decade
order by decade desc;
title                            max(vote_avg)  decade
-------------------------------  -------------  ------
Homecoming: A Film by Beyoncé    8.6            2010  
Spirited Away                    8.5            2000  
Dilwale Dulhania Le Jayenge      8.9            1990  
The Empire Strikes Back          8.4            1980  
We All Loved Each Other So Much  8.6            1970  
Psycho                           8.4            1960  
12 Angry Men                     8.4            1950  
The Great Dictator               8.4            1940  
City Lights                      8.4            1930  
Sherlock Jr.                     8.2            1920  
The Immigrant                    7.5            1910  
The Great Train Robbery          7.2            1900  

I’m using sqlite3. Which of these queries is better practice? (Or are they both incorrect?)

How to solve:

Method 1

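A bare column is one that appears in the select list of an aggregate query without being aggregated or listed in GROUP BY, like title in the second query. Such a bare column is not allowed by the SQL standard. But SQLite allows it, and with min() or max() it returns the intuitively correct result, as the SQLite documentation states: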

When the min() or max() aggregate functions are used in an aggregate query, all bare columns in the result set take values from the input row which also contains the minimum or maximum.
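
The behavior can be checked on a small in-memory table. The following is a minimal sketch using Python's built-in sqlite3 module; the table layout follows the question, but the rows are illustrative sample data (only the Psycho and Spirited Away ratings come from the output above; the other titles, dates, and vote counts are assumptions made for the demo). The floor() call is left out because integer division already truncates here and floor() is not available in every SQLite build.

```python
import sqlite3

# In-memory database with the columns used in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE films (title TEXT, date TEXT, vote_avg REAL, vote_cnt INTEGER)"
)

# Illustrative sample rows (assumed values, not the asker's real data).
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?, ?)",
    [
        ("Psycho", "1960-06-16", 8.4, 5000),
        ("The Apartment", "1960-06-15", 8.2, 1200),
        ("Shrek", "2001-05-18", 7.7, 11000),
        ("Spirited Away", "2001-07-20", 8.5, 9000),
    ],
)

# title is a bare column: neither aggregated nor part of GROUP BY.
# SQLite takes it from the row that holds max(vote_avg) in each group.
rows = conn.execute(
    """
    SELECT title,
           max(vote_avg),
           (CAST(substr(date, 1, 4) AS INTEGER) / 10) * 10 AS decade
    FROM films
    WHERE vote_cnt > 100
    GROUP BY decade
    ORDER BY decade DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Each printed title belongs to the row that holds its decade's maximum, regardless of the order in which the rows were inserted.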

If you want to write portable SQL, don’t use this feature.
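
One way to avoid the bare-column extension is to compute each decade's maximum in a subquery and join it back to the rows. This is a sketch under assumptions, not the only approach: the CTE name rated, the alias best, and all sample rows are made up for illustration, and substr() and integer division may still need adjusting in other database systems. Unlike the bare-column query, this version returns every film that ties for the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE films (title TEXT, date TEXT, vote_avg REAL, vote_cnt INTEGER)"
)

# Illustrative sample rows (assumed values), including a tie in the 1960s
# and a highly rated film with too few votes to qualify.
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?, ?)",
    [
        ("Psycho", "1960-06-16", 8.4, 5000),
        ("Hypothetical Tie", "1965-01-01", 8.4, 300),
        ("Shrek", "2001-05-18", 7.7, 11000),
        ("Spirited Away", "2001-07-20", 8.5, 9000),
        ("Obscure Film", "2005-01-01", 9.9, 12),
    ],
)

# Every selected column is either grouped or comes from a joined row,
# so the query does not rely on SQLite's bare-column behavior.
portable_rows = conn.execute(
    """
    WITH rated AS (
        SELECT title,
               vote_avg,
               (CAST(substr(date, 1, 4) AS INTEGER) / 10) * 10 AS decade
        FROM films
        WHERE vote_cnt > 100
    )
    SELECT r.title, r.vote_avg, r.decade
    FROM rated AS r
    JOIN (
        SELECT decade, max(vote_avg) AS best
        FROM rated
        GROUP BY decade
    ) AS m
      ON m.decade = r.decade AND m.best = r.vote_avg
    ORDER BY r.decade DESC, r.title
    """
).fetchall()

for row in portable_rows:
    print(row)
```

Both 1960s films appear in the result because they share the top rating, which makes the tie explicit instead of leaving the choice of title to the database.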

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
