Finding rows that are missing a specific value in a set

The database I am using is MySQL. I have an options table that looks roughly like this:

id          int(10) PRIMARY
annex_id    int(10)
title       varchar(255)
sort_order  INT(10) NOT NULL

Every annex (annex_id) has a bunch of options enumerated by the sort_order column, which runs from 1 to N and is controlled by humans.

Due to a software bug from a couple of months ago, a lot of annexes do not have an option with sort_order == 1 but rather start from 2 or 3.

What I want to do is get all the annex_id values whose options do not include one with sort_order == 1. I’ll show an example table because I feel like I’m explaining it badly (which could be the reason I can’t come up with a query for it, and I’ve been at it for hours).

ID  Annex ID  Title             Sort Order
12  567       Title #1          1
13  567       Title #2          2
14  567       Title #3          3
15  890       Another title #1  2
16  890       Another title #2  3

From the given table I want to craft a query that returns annex id 890, because it doesn’t have a row with sort order == 1.

The table has hundreds of thousands of rows and going at it by hand is insane. I’ve been dealing with databases for years and either I’m overthinking it or idk what but I can’t come up with an efficient query to do this job.

Is it even possible to have a query like this?

How to solve:

Method 1

You definitely have more than one option here. One method uses grouping and aggregation with conditional counting:

SELECT
  annex_id
FROM
  options
GROUP BY
  annex_id
HAVING
  COUNT(CASE sort_order WHEN 1 THEN 1 END) = 0
;

Conditional counting itself can be expressed in several ways in MySQL. Instead of

COUNT(CASE sort_order WHEN 1 THEN 1 END)

you could also use any of these:

COUNT(IF(sort_order = 1, 1, NULL))
COUNT(sort_order = 1 OR NULL)
SUM(sort_order = 1)
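
To check that these variants agree, here is a minimal sketch that runs them against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only to keep the example self-contained. The table name options is taken from the question's description, and the helpers make_db and annexes_missing_first are hypothetical. The IF() form is left out because it is not available in every SQLite version.

```python
import sqlite3

# Sample rows from the question: (id, annex_id, title, sort_order).
SAMPLE_ROWS = [
    (12, 567, "Title #1", 1),
    (13, 567, "Title #2", 2),
    (14, 567, "Title #3", 3),
    (15, 890, "Another title #1", 2),
    (16, 890, "Another title #2", 3),
]

# Conditional-count expressions from the answer that SQLite also accepts.
VARIANTS = [
    "COUNT(CASE sort_order WHEN 1 THEN 1 END)",
    "COUNT(sort_order = 1 OR NULL)",
    "SUM(sort_order = 1)",
]


def make_db():
    """Create an in-memory table shaped like the one in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE options ("
        " id INTEGER PRIMARY KEY,"
        " annex_id INTEGER,"
        " title TEXT,"
        " sort_order INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO options VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def annexes_missing_first(conn, count_expr):
    """Return annex ids whose group has no row with sort_order = 1."""
    sql = (
        "SELECT annex_id FROM options "
        "GROUP BY annex_id "
        f"HAVING {count_expr} = 0 "
        "ORDER BY annex_id"
    )
    return [row[0] for row in conn.execute(sql)]


conn = make_db()
results = {expr: annexes_missing_first(conn, expr) for expr in VARIANTS}
for expr, ids in results.items():
    print(expr, "->", ids)
```

On the sample data, every variant should report only annex 890.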

But you could also take a completely different approach. You could use the EXCEPT set operator (available in MySQL 8.0.31 and later) to get the difference between the set of all annex_id values and the set of those that have sort_order = 1:

SELECT
  annex_id
FROM
  options

EXCEPT

SELECT
  annex_id
FROM
  options
WHERE
  sort_order = 1
;

Note that EXCEPT on its own stands for EXCEPT DISTINCT, which means that duplicate values are automatically eliminated.
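
As a rough illustration of the EXCEPT approach and its duplicate elimination, the sketch below runs it on the question's sample rows. It again assumes Python's built-in sqlite3 as a stand-in for MySQL and a table named options. The NOT EXISTS query is not from the original answer; it is one common equivalent, shown here as an assumption for servers where EXCEPT is unavailable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE options (id INTEGER PRIMARY KEY, annex_id INTEGER,"
    " title TEXT, sort_order INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO options VALUES (?, ?, ?, ?)",
    [
        (12, 567, "Title #1", 1),
        (13, 567, "Title #2", 2),
        (14, 567, "Title #3", 3),
        (15, 890, "Another title #1", 2),
        (16, 890, "Another title #2", 3),
    ],
)

# Set difference: all annex ids minus those that have a sort_order = 1 row.
EXCEPT_SQL = """
SELECT annex_id FROM options
EXCEPT
SELECT annex_id FROM options WHERE sort_order = 1
"""

# Not from the original answer: an anti-join alternative without EXCEPT.
NOT_EXISTS_SQL = """
SELECT DISTINCT o.annex_id
FROM options AS o
WHERE NOT EXISTS (
    SELECT 1 FROM options AS f
    WHERE f.annex_id = o.annex_id AND f.sort_order = 1
)
"""

except_ids = sorted(row[0] for row in conn.execute(EXCEPT_SQL))
not_exists_ids = sorted(row[0] for row in conn.execute(NOT_EXISTS_SQL))
print(except_ids, not_exists_ids)
```

Annex 890 has two rows in the sample data, yet it appears only once in the EXCEPT result because duplicates are removed.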

Method 2

For anyone wondering, as per matigo’s suggestion the query would look like the following:

SELECT annex_id FROM options GROUP BY annex_id HAVING MIN(sort_order) > 1
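
The MIN() query relies on the question's statement that sort_order runs from 1 to N. The sketch below, which assumes Python's built-in sqlite3 as a stand-in for MySQL and a table named options, compares it with the conditional count from Method 1. Annex 999, with a sort_order of 0, is hypothetical data that is not in the question; it is added only to show where the two queries would diverge if a value below 1 ever slipped in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE options (id INTEGER PRIMARY KEY, annex_id INTEGER,"
    " title TEXT, sort_order INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO options VALUES (?, ?, ?, ?)",
    [
        (12, 567, "Title #1", 1),
        (13, 567, "Title #2", 2),
        (14, 567, "Title #3", 3),
        (15, 890, "Another title #1", 2),
        (16, 890, "Another title #2", 3),
        # Hypothetical rows, not in the question: no 1, but a 0 is present.
        (17, 999, "Hypothetical #1", 0),
        (18, 999, "Hypothetical #2", 2),
    ],
)

MIN_SQL = (
    "SELECT annex_id FROM options GROUP BY annex_id "
    "HAVING MIN(sort_order) > 1 ORDER BY annex_id"
)
COUNT_SQL = (
    "SELECT annex_id FROM options GROUP BY annex_id "
    "HAVING COUNT(CASE sort_order WHEN 1 THEN 1 END) = 0 ORDER BY annex_id"
)

min_ids = [row[0] for row in conn.execute(MIN_SQL)]
count_ids = [row[0] for row in conn.execute(COUNT_SQL)]
print("MIN(sort_order) > 1:", min_ids)
print("conditional count:  ", count_ids)
```

With the hypothetical rows included, the MIN() query finds only annex 890, while the conditional count also finds annex 999. If sort_order can never be lower than 1, the two queries return the same annexes.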

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
