# How to write a query for exact pairs of groups

## All we need is an easy explanation of the problem, so here it is.

I am given the schema:

EMPLOYEE(Fname,Ssn,Dno)

DEPARTMENT(Dname,Dnumber,Mgr_ssn,Mgr_start_date)

PROJECT(Pname,Pnumber,Plocation,Dno)

WORKS_ON(Essn,Pno)

Primary keys are given in bold.

The question: Is it possible to define a view that will report the following detail for each employee: Ssn and the number of employees who work on exactly the same set of projects? Explain why not or given an SQL query.

My idea was to find unique pairs of employees and check if projects worked by first employee equal those of the second one (this was done by using set operation except as if s1 except s2 union s2 except s1 is empty than it seems like s1 and s2 are equal)

My attempt for this was:

``````SELECT e1.ssn,count(e2.ssn)
FROM EMPLOYEE e1, EMPLOYEE e2
WHERE e1.ssn < e2.ssn
AND NOT EXISTS
(
(
(SELECT Pno
FROM WORKS_ON w1
WHERE w1.Essn = e1.Ssn)
EXCEPT ALL
(SELECT Pno
FROM WORKS_ON w2
WHERE w2.Essn = e2.Ssn)
)

UNION ALL

(
(SELECT Pno
FROM WORKS_ON w2
WHERE w2.Essn = e2.Ssn)
EXCEPT ALL
(SELECT Pno
FROM WORKS_ON w1
WHERE w1.Essn = e1.Ssn)
)
)
GROUP BY(e1.ssn)
``````

Does this query look correct? Also is there a neater way of doing this?

As a final side note what exactly determines if it is impossible for us to construct an SQL query for a problem?

## How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

### Method 1

A simple join should be sufficient.

Now i collect every project_id an user worked on in an aeeay, and compare

Query #1

``````WITH CTE AS (SELECT Essn,array_agg(Pno ORDER BY Pno) AS AerPno FROM WORKS_ON
GROUP BY Essn)
SELECT c1.Essn, COUNT(*) FROM CTE c1 JOIN CTE c2 on c1.AerPno = c2.AerPno AND c1.Essn <> c2.Essn
GROUP BY c1.Essn;
``````
essn count
5 1
1 1

View on DB Fiddle

### Method 2

This is a question of Relational Division Without Remainder. Joe Celko has a very good article on this.

There are a number of ways to slice this cake. Here are a couple:

1. Get a windowed count of each employee’s projects (you can also do this as a scalar subquery).
Left-join employees with each other, matching on `Pno`.
Group up by each pair of employees, exclude any which either have too many or too few Projects than the other.
Then do a final grouping of each employee, counting up the matching employees.

``````WITH WORKS_ONwithCount AS (
SELECT
*,
COUNT(*) OVER (PARTITION BY Essn) AS CountPno
FROM WORKS_ON
),
EmployeesCrossJoin AS (
SELECT this.Essn,
other.Essn
FROM cte this
LEFT JOIN cte other ON other.Essn <> this.Essn AND other.Pno = this.Pno
GROUP BY this.Essn, other.Essn, other.CountPno
HAVING COUNT(*) = COUNT(other.Pno) AND COUNT(*) = other.CountPno
)
SELECT emp.Essn, COUNT(*) AS CountMatching
FROM EmployeesCrossJoin emp
GROUP BY emp.Essn;
``````

2. Group up each employee, with their project numbers in an array.
Then take the count of other employees who: after un-nesting and full-joining both arrays, there are no `NULL` values on either side.

``````WITH cte AS (
SELECT
Essn,
array_agg(Pno ORDER BY Pno) AS AerPno
FROM WORKS_ON
GROUP BY Essn
)
SELECT Essn,
(SELECT COUNT(*)
FROM cte other
WHERE this.Essn <> other.Essn
AND NOT EXISTS (SELECT 1
FROM unnest(this.AerPno) AS nThis(value)
FULL JOIN unnest(other.AerPno) AS nOther(value) ON nOther.value = nThis.value
WHERE nOther.value IS NULL OR nThis.value IS NULL
) AS
FROM cte this;
``````

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂