All we need is an easy explanation of the problem, so here it is.
Given database of the website “eMovies” that manage data about cinema movies.
Write in SQL query the most common genre (that have the biggest number of movies)
Actors ActorID PK Firstname LastName Directors DirectorID PK Firstname LastName Genres GenreID PK GenreDescr Movies MovieID PK MovieName Year ActorsMovies MovieID PK ActorID PK DirectorsMovies MovieID PK DirectorID PK GenresMovies MovieID PK GenreID PK
SELECT GenresDescr FROM GenresMovies NATURAL JOIN Genres WHERE GenresMovies.GenresID=Genres.GenresID GROUP BY GenreDescr HAVING MAX(COUNT(GenresMovies.MovieID));
I’m stuck, my code does not work.
EDIT: attampt number 2 after using @Darwin von Corax hint:
SELECT Genres.GenreDescr,COUNT(GenresMovies.GenreID) AS MCount FROM GenresMovies,Genres WHERE GenresMovies.GenreID = Genres.GenreID GROUP BY GenresMovies.GenreID,Genres.GenreDescr HAVING MCount=MAX(MCount);
But still the code does not work
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
I think this one needs a subquery, although someone else may prove me wrong.
Lets take it step-by-step: First, lets get the genre of every movie in
GenresMovies, along with the description:
-- query 1 -- SELECT gm.GenreID, g.GenreDescr FROM GenresMovies gm, Genres g WHERE gm.GenreID = g.GenreID;
Note this is an alternate syntax for performing an inner join; I prefer it simply because it’s what I’m used to.
Now use an aggregate function to count the occurrences of each genre:
-- query 2 -- SELECT COUNT(gm.GenreID) AS movieCount, g.GenreDescr FROM GenresMovies gm, Genres g WHERE gm.GenreID = g.GenreID GROUP BY gm.GenreID;
Note that most databases require all columns not being aggregated to be listed in the
GROUP BY clause, in which case you would use
GROUP BY gm.GenreID, g.GenreDescr;
which in this case will produce the same result. Also note I’ve given the
COUNT() column an alias; this will be handy in the next step.
You now have a list of genres, along with the number of movies in each genre, and you want to find the largest value in that list. One way to do this is to treat the above query as though it were a table, and query it. This is known as a subquery (or derived table in this case); you simply take the previous query, wrap it in parentheses, and drop it into the outer query’s
FROM clause in place of a table name:
-- query 3 -- SELECT MAX(res.movieCount) AS movieCount, res.descr FROM ( SELECT COUNT(gm.GenreID) AS movieCount, g.GenreDescr AS descr FROM GenresMovies gm, Genres g WHERE gm.GenreID = g.GenreID GROUP BY gm.GenreID, g.GenreDescr ) res;
The outer query has no
GROUP BY clause, as you want to aggregate a single value from the entire result of the inner query. The
res at the end simply gives a name to the subquery; this is required for syntactic completeness.
This solution returns a single row. For the situation where several genres are tied for first place, see below.
Note my use of indentation; the database doesn’t care about indentation, but you should, as it helps you to visualize the structure of your code.
Now the problem with query 3 is that – except for MySQL and not all its versions, even there – it doesn’t work. It will give some syntax error because for the second aggregation, this part:
SELECT MAX(res.movie_count) AS movie_count, res.descr FROM (...) res;
descr is in the
SELECT list but not in the
GROUP BY list. And this is not allowed because we are aggregating many rows into one (to find the
MAX of the
movie_count) but if there are more than one, different descriptions, which one should be shown? The query is rejected. In MySQL, where it allowed, you’d get a random indeterminate result, one of the descriptions.
So, lets correct the query to get only the maximum of the movie_count:
-- query 4 -- SELECT MAX(res.movieCount) AS movieCount FROM ( SELECT COUNT(gm.GenreID) AS movieCount, g.GenreDescr as descr FROM GenresMovies gm, Genres g WHERE gm.GenreID = g.GenreID GROUP BY gm.GenreID, g.GenreDescr ) res;
We can improve it a bit by removing the join to
Genres, since we don’t use any of its columns in the result:
-- query 5 -- SELECT MAX(res.movieCount) AS movieCount FROM ( SELECT COUNT(gm.GenreID) AS movieCount FROM GenresMovies gm GROUP BY gm.GenreID ) res;
We now have two working queries that give us the maximum count of movies per genre. But we don’t have the genres themselves!
There are many ways to solve this. One would be to use this result and
JOIN it back to our original query. It gets a bit complicated but remember that we can just use any result as a table. Just remember to put it in parentheses and give a name (alias). check how we are combining queries 2 and 5:
-- query 6 -- SELECT q2.movieCount, q2.GenreDescr FROM ( SELECT COUNT(gm.GenreID) AS movieCount, g.GenreDescr FROM GenresMovies gm, Genres g WHERE gm.GenreID = g.GenreID GROUP BY gm.GenreID, g.GenreDescr ) q2, ( SELECT MAX(res.movieCount) AS movieCount FROM ( SELECT COUNT(gm.GenreID) AS movieCount FROM GenresMovies gm GROUP BY gm.GenreID ) res ) q5 WHERE q2.movieCount = q5.movieCount ;
This is one way to do it; there are others, but I hope this helps you understand the process of developing a solution.
It turns out that Access has the
TOP operator which can be combined with
ORDER BY to give for example, using
TOP 1, the first result only. But Access has a rather peculiar way of dealing with ties. Instead of returning only the 1st row, it returns all tied rows (similar to
TOP 1 WITH TIES in SQL Server). So, we can modify query 2 to get the same result as query 7 but with a far more simpler syntax:
-- query 8 -- SELECT TOP 1 COUNT(gm.GenreID) AS movieCount, g.GenreDescr AS descr FROM GenresMovies gm, Genres g WHERE gm.GenreID = g.GenreID GROUP BY gm.GenreID, g.GenreDescr ORDER BY movieCount DESC ;
In this query, the result, after the
GROUP BY and the
SELECT have been calculated, are orderer by the
movieCount in reverse order. First the higher movie counts, then the lower, until the smallest count. Then, due to the
TOP 1 only the rows that tie in first place, so only the highest movie counts are returned.
This code is working well:
SELECT GenreDescr FROM (SELECT Genres.GenreDescr,COUNT(GenresMovies.GenreID) AS MCount FROM GenresMovies,Genres WHERE GenresMovies.GenreID =Genres.GenreID GROUP BY GenresMovies.GenreID,Genres.GenreDescr) WHERE MCount= ( SELECT MAX(MCount) FROM (SELECT COUNT(GenresMovies.GenreID) AS MCount FROM GenresMovies, Genres WHERE GenresMovies.GenreID =Genres.GenreID GROUP BY GenresMovies.GenreID) );
THis will only give you count of movies
after that you can apply having or where to get max etc
and for movie or genre name use inner join with their parent tables
Select Count(*),MovieID From GenresMovies Group By MovieID
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂