HIVE SQL [Error 10025]: Expression not in GROUP BY key


my_table has 10+ columns (e.g., day, ip_address, user, request), a mix of strings and numbers. I want to GROUP BY ip_address and use HAVING to keep only the IP addresses that have more than 20 records.

Here is my SQL statement:

SELECT day, ip_address, user, request 
FROM my_table
WHERE DAY = current_date()
GROUP BY ip_address
HAVING count(ip_address) > 20

I got this error message:

Error while compiling statement: FAILED: SemanticException [Error 10025]: Expression not in GROUP BY key xxx

But I need to keep all columns.

How to solve:

Hive raises Error 10025 because every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function. Here day, user, and request are neither, so a plain GROUP BY cannot return all columns.
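As a point of comparison, a version of the query that selects only the grouping key and an aggregate should compile, although it returns one summary row per IP address rather than the original rows. This is a minimal sketch against the question's my_table; the alias cnt is illustrative and not part of the original query.

```sql
-- Every selected expression is either the GROUP BY key or an aggregate,
-- so Hive has no ungrouped column to complain about.
SELECT ip_address,
       COUNT(*) AS cnt          -- cnt is an illustrative alias
FROM my_table
WHERE day = current_date()
GROUP BY ip_address
HAVING COUNT(*) > 20;
```

Adding day, user, or request back to the SELECT list without aggregating them is what triggers the error again.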

Method 1

Use a window function to count the rows per IP address without collapsing them. If an IP address has more than 20 rows, all of its rows are returned; IP addresses with 20 or fewer rows are not returned.

WITH cte AS ( SELECT *, COUNT(ip_address) OVER (PARTITION BY ip_address) cnt
              FROM my_table
              WHERE DAY = current_date() )
SELECT *
FROM cte
WHERE cnt > 20;
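If a window function is not an option, one possible alternative is to compute the qualifying IP addresses in a subquery and join the table back to it. The sketch below assumes the same my_table and day filter as the question; the aliases t and busy are hypothetical names chosen for illustration.

```sql
-- Inner query: IP addresses with more than 20 rows today.
-- Outer query: all columns of today's rows for those IP addresses.
SELECT t.*
FROM my_table t
JOIN (
    SELECT ip_address
    FROM my_table
    WHERE day = current_date()
    GROUP BY ip_address
    HAVING COUNT(*) > 20
) busy
  ON t.ip_address = busy.ip_address
WHERE t.day = current_date();
```

Unlike the window-function version, this returns only the original columns, with no extra cnt column. Rows with a NULL ip_address are excluded in both versions: the equality join never matches NULL, and COUNT(ip_address) ignores NULL values.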


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
