All we need is an easy explanation of the problem, so here it is.
I am given the schema:
Primary keys are given in bold.
The question: Is it possible to define a view that will report the following detail for each employee: Ssn and the number of employees who work on exactly the same set of projects? Explain why not or given an SQL query.
My idea was to find unique pairs of employees and check if projects worked by first employee equal those of the second one (this was done by using set operation except as if s1 except s2 union s2 except s1 is empty than it seems like s1 and s2 are equal)
My attempt for this was:
SELECT e1.ssn,count(e2.ssn) FROM EMPLOYEE e1, EMPLOYEE e2 WHERE e1.ssn < e2.ssn AND NOT EXISTS ( ( (SELECT Pno FROM WORKS_ON w1 WHERE w1.Essn = e1.Ssn) EXCEPT ALL (SELECT Pno FROM WORKS_ON w2 WHERE w2.Essn = e2.Ssn) ) UNION ALL ( (SELECT Pno FROM WORKS_ON w2 WHERE w2.Essn = e2.Ssn) EXCEPT ALL (SELECT Pno FROM WORKS_ON w1 WHERE w1.Essn = e1.Ssn) ) ) GROUP BY(e1.ssn)
Does this query look correct? Also is there a neater way of doing this?
As a final side note what exactly determines if it is impossible for us to construct an SQL query for a problem?
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
A simple join should be sufficient.
Now i collect every project_id an user worked on in an aeeay, and compare
WITH CTE AS (SELECT Essn,array_agg(Pno ORDER BY Pno) AS AerPno FROM WORKS_ON GROUP BY Essn) SELECT c1.Essn, COUNT(*) FROM CTE c1 JOIN CTE c2 on c1.AerPno = c2.AerPno AND c1.Essn <> c2.Essn GROUP BY c1.Essn;
This is a question of Relational Division Without Remainder. Joe Celko has a very good article on this.
There are a number of ways to slice this cake. Here are a couple:
1. Get a windowed count of each employee’s projects (you can also do this as a scalar subquery).
Left-join employees with each other, matching on
Group up by each pair of employees, exclude any which either have too many or too few Projects than the other.
Then do a final grouping of each employee, counting up the matching employees.
WITH WORKS_ONwithCount AS ( SELECT *, COUNT(*) OVER (PARTITION BY Essn) AS CountPno FROM WORKS_ON ), EmployeesCrossJoin AS ( SELECT this.Essn, other.Essn FROM cte this LEFT JOIN cte other ON other.Essn <> this.Essn AND other.Pno = this.Pno GROUP BY this.Essn, other.Essn, other.CountPno HAVING COUNT(*) = COUNT(other.Pno) AND COUNT(*) = other.CountPno ) SELECT emp.Essn, COUNT(*) AS CountMatching FROM EmployeesCrossJoin emp GROUP BY emp.Essn;
2. Group up each employee, with their project numbers in an array.
Then take the count of other employees who: after un-nesting and full-joining both arrays, there are no
NULL values on either side.
WITH cte AS ( SELECT Essn, array_agg(Pno ORDER BY Pno) AS AerPno FROM WORKS_ON GROUP BY Essn ) SELECT Essn, (SELECT COUNT(*) FROM cte other WHERE this.Essn <> other.Essn AND NOT EXISTS (SELECT 1 FROM unnest(this.AerPno) AS nThis(value) FULL JOIN unnest(other.AerPno) AS nOther(value) ON nOther.value = nThis.value WHERE nOther.value IS NULL OR nThis.value IS NULL ) AS FROM cte this;
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂