I have two tables I need to join together. However, I only want to join records when there is a unique match, rather than picking one of several candidates for the join.
Versioning:
Using MariaDB version 10.3.34
Example Data:
Core (LEFT) data towns:
id | town | postcode |
---|---|---|
1 | Hudderfield | HD11 4ER |
2 | Manchester | MN14 3JE |
3 | Macklesfield | MK17 9FL |
4 | Edinburgh | ED5 3MJ |
5 | Liverpool | LV9 8XT |
Joined (RIGHT) data peoples:
id | names | postcode |
---|---|---|
1 | Jimmy Saville | HD11 4ER |
2 | Jason Bomb | IP14 8FK |
3 | Micky Mouse | MK17 9FL |
4 | Bobby Dillian | ED5 3MJ |
5 | Lenny Davies | ED5 3MJ |
My SQL:
My initial query would be something like:
SELECT towns.id, towns.town, peoples.names FROM towns
LEFT JOIN peoples ON towns.postcode = peoples.postcode
But this will include Edinburgh, where there are two people; I only want to join when there’s a single unique row to join on.
I use a LEFT JOIN because I need to return all of towns but only the unique rows of peoples.
Expected results:
id | town | names |
---|---|---|
1 | Hudderfield | Jimmy Saville |
2 | Manchester | <null> |
3 | Macklesfield | Micky Mouse |
4 | Edinburgh | <null> |
5 | Liverpool | <null> |
What I’ve tried
I’ve tried using COUNT() in the JOIN but can’t get this to work:
SELECT towns.id, towns.town, peoples.names FROM towns
LEFT JOIN peoples ON towns.postcode = peoples.postcode AND count(peoples.id) = 1
This comes up with a syntax error.
I can’t think how to qualify this join so that it only joins when a single matching row is found. Internet searching mostly turns up vague or off-topic references.
I’m sure it’s simple but I can’t do it. Also, I’d like to avoid subquerying if possible?
How to solve:
Method 1
An alternative to Lennart’s window function answer is to use a GROUP BY and HAVING clause against the peoples table to filter out the rows that share a postcode, like so:
SELECT towns.id, towns.town, peoples.names
FROM towns
LEFT JOIN
(
    -- keep only the postcodes that belong to exactly one person
    SELECT MAX(names) AS names, postcode
    FROM peoples
    GROUP BY postcode
    HAVING COUNT(names) = 1
) peoples
    ON towns.postcode = peoples.postcode
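To check the query above against the sample data from the question, here is a minimal sketch that loads both tables into an in-memory SQLite database from Python. SQLite is only a stand-in for MariaDB 10.3 (an assumption, made so the example runs without a server), and the ORDER BY is added purely to make the row order predictable; the GROUP BY / HAVING subquery itself is unchanged.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE towns (id INTEGER PRIMARY KEY, town TEXT, postcode TEXT);
    CREATE TABLE peoples (id INTEGER PRIMARY KEY, names TEXT, postcode TEXT);
    INSERT INTO towns VALUES
        (1, 'Hudderfield', 'HD11 4ER'),
        (2, 'Manchester', 'MN14 3JE'),
        (3, 'Macklesfield', 'MK17 9FL'),
        (4, 'Edinburgh', 'ED5 3MJ'),
        (5, 'Liverpool', 'LV9 8XT');
    INSERT INTO peoples VALUES
        (1, 'Jimmy Saville', 'HD11 4ER'),
        (2, 'Jason Bomb', 'IP14 8FK'),
        (3, 'Micky Mouse', 'MK17 9FL'),
        (4, 'Bobby Dillian', 'ED5 3MJ'),
        (5, 'Lenny Davies', 'ED5 3MJ');
""")

# The derived table keeps only postcodes that occur exactly once in peoples.
query = """
    SELECT towns.id, towns.town, peoples.names
    FROM towns
    LEFT JOIN (
        SELECT MAX(names) AS names, postcode
        FROM peoples
        GROUP BY postcode
        HAVING COUNT(names) = 1
    ) peoples
        ON towns.postcode = peoples.postcode
    ORDER BY towns.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Edinburgh comes back with no name because its postcode appears twice in peoples and is therefore dropped by the HAVING clause before the join happens.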
Method 2
You should be able to left join peoples with:
(select names, postcode
 from (
     select names, postcode, count(1) over (partition by postcode) as cnt
     from peoples
 ) as t
 where cnt = 1)
i.e.
SELECT t.id, t.town, p.names
FROM towns t
LEFT JOIN (SELECT names, postcode
           FROM (
               SELECT names, postcode
                    , count(1) over (partition by postcode) as cnt
               FROM peoples
           ) as x
           WHERE cnt = 1
          ) p
USING (postcode)
EDIT:
Given the DDL provided in the update, I created a db<>fiddle with this query:
SELECT t.id, t.town, p.names
FROM towns t
LEFT JOIN (
SELECT names, postcode
FROM (
SELECT names, postcode
, count(1) over (partition by postcode) as cnt
FROM peoples
) as x
WHERE cnt = 1
) p
USING (postcode);
It appears to give the expected result.
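To see what the inner window query contributes, the following sketch prints the per-row cnt value for the sample peoples data. It runs the SQL through Python's sqlite3 module as a stand-in for MariaDB (an assumption for illustration; window functions need SQLite 3.25 or newer), and the ORDER BY and the unique_names variable are additions of this sketch, not part of the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE peoples (id INTEGER PRIMARY KEY, names TEXT, postcode TEXT);
    INSERT INTO peoples VALUES
        (1, 'Jimmy Saville', 'HD11 4ER'),
        (2, 'Jason Bomb', 'IP14 8FK'),
        (3, 'Micky Mouse', 'MK17 9FL'),
        (4, 'Bobby Dillian', 'ED5 3MJ'),
        (5, 'Lenny Davies', 'ED5 3MJ');
""")

# Every row is kept; cnt says how many rows share that row's postcode.
counted = conn.execute("""
    SELECT names, postcode,
           COUNT(1) OVER (PARTITION BY postcode) AS cnt
    FROM peoples
    ORDER BY id
""").fetchall()
for row in counted:
    print(row)

# Rows with cnt = 1 are the ones the outer WHERE clause lets through to the join.
unique_names = [names for names, _postcode, cnt in counted if cnt == 1]
print(unique_names)
```

Unlike GROUP BY, the window function does not collapse rows, so no MAX() is needed to pick the name; the filter on cnt simply discards both Edinburgh rows.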
Method 3
Even simpler:
SELECT t.id, t.town,
IF (COUNT(DISTINCT p.names) = 1, MAX(p.names), NULL) AS names
FROM towns AS t
LEFT JOIN peoples AS p ON t.postcode = p.postcode
GROUP BY t.id
I believe it avoids the "only full group by" issue mentioned in previous Comments. (If not, see the comments on this Answer.)
Method 4
Look, mum, no subqueries and aggregate or windowing functions!
SELECT t.id
, t.town
, IF( p2.id IS NOT NULL, NULL, p1.names ) AS names
FROM towns AS t
LEFT JOIN peoples AS p1 ON p1.postcode = t.postcode
LEFT JOIN peoples AS p2 ON p2.postcode = p1.postcode
AND p2.id != p1.id
GROUP BY t.id
;
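The double self-join is easier to follow when the ungrouped rows are visible: p2 is only non-NULL when some other person shares p1's postcode. The sketch below shows those intermediate rows and then the final result for the sample data. It uses Python's sqlite3 module as a stand-in for MariaDB, and replaces IF() with an equivalent CASE expression because not every SQLite build provides IF(); both choices, and the added ORDER BY clauses, are assumptions of this sketch.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE towns (id INTEGER PRIMARY KEY, town TEXT, postcode TEXT);
    CREATE TABLE peoples (id INTEGER PRIMARY KEY, names TEXT, postcode TEXT);
    INSERT INTO towns VALUES
        (1, 'Hudderfield', 'HD11 4ER'),
        (2, 'Manchester', 'MN14 3JE'),
        (3, 'Macklesfield', 'MK17 9FL'),
        (4, 'Edinburgh', 'ED5 3MJ'),
        (5, 'Liverpool', 'LV9 8XT');
    INSERT INTO peoples VALUES
        (1, 'Jimmy Saville', 'HD11 4ER'),
        (2, 'Jason Bomb', 'IP14 8FK'),
        (3, 'Micky Mouse', 'MK17 9FL'),
        (4, 'Bobby Dillian', 'ED5 3MJ'),
        (5, 'Lenny Davies', 'ED5 3MJ');
""")

joins = """
    FROM towns AS t
    LEFT JOIN peoples AS p1 ON p1.postcode = t.postcode
    LEFT JOIN peoples AS p2 ON p2.postcode = p1.postcode
                           AND p2.id != p1.id
"""

# Ungrouped rows: p2 is filled in only where a second person shares the postcode.
pairs = conn.execute(
    "SELECT t.town, p1.names, p2.names" + joins + "ORDER BY t.id, p1.id"
).fetchall()
for row in pairs:
    print(row)

# Final result: blank out the name whenever such a second person exists.
rows = conn.execute(
    "SELECT t.id, t.town,"
    " CASE WHEN p2.id IS NOT NULL THEN NULL ELSE p1.names END AS names"
    + joins + "GROUP BY t.id ORDER BY t.id"
).fetchall()
for row in rows:
    print(row)
```

Edinburgh produces two ungrouped rows, each with a non-NULL p2, so whichever row the GROUP BY keeps yields NULL for names.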
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.