How to address aggregate fan-out in the query below


I have a table user. Every user has a one-to-many relationship with topic_like and with topic_dislike. I want to query for a user and return each one-to-many relationship in its own column of the result, as an array of JSON objects.

Let's say this is the user table:

-- "user" is double-quoted because user is a reserved word in PostgreSQL
CREATE TABLE "user"(
    id UUID NOT NULL PRIMARY KEY,
    email VARCHAR(254) NOT NULL UNIQUE,
    username TEXT NOT NULL,
    creationTimestamp TIMESTAMPTZ NOT NULL
);

topic_like:

CREATE TABLE topic_like(
    user_id UUID NOT NULL REFERENCES "user"(id) ON UPDATE RESTRICT ON DELETE CASCADE,
    likeable_topic_id INTEGER REFERENCES likeable_topic(id) ON UPDATE RESTRICT ON DELETE RESTRICT,
    index INTEGER NOT NULL,
    UNIQUE(user_id, index),
    UNIQUE(user_id, likeable_topic_id)
);

and topic_dislike:

CREATE TABLE topic_dislike(
    user_id UUID NOT NULL REFERENCES "user"(id) ON UPDATE RESTRICT ON DELETE CASCADE,
    dislikeable_topic_id INTEGER REFERENCES dislikeable_topic(id) ON UPDATE RESTRICT ON DELETE RESTRICT,
    index INTEGER NOT NULL,
    UNIQUE(user_id, index),
    UNIQUE(user_id, dislikeable_topic_id)
);

What I have now is a query like so:

SELECT "user".id, email, username,
    json_agg(json_build_object('likeable_topic_id', likeable_topic_id, 'index', topic_like.index)),
    json_agg(json_build_object('dislikeable_topic_id', dislikeable_topic_id, 'index', topic_dislike.index))
FROM "user"
LEFT JOIN topic_like ON "user".id = topic_like.user_id
LEFT JOIN topic_dislike ON "user".id = topic_dislike.user_id
WHERE "user".id = '58b6fe31-f3f6-4781-af06-93e29cb05bca'
GROUP BY "user".id;

But when I do this, every topic_like record and every topic_dislike record is duplicated in its JSON array. For example, if a user has 5 topic_like records and 5 topic_dislike records, each topic_like record is repeated once per topic_dislike record and vice versa, so each JSON array contains 25 elements even though there are only 5 records in the corresponding table.
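To see where the 25 comes from, it may help to count the rows the two joins produce before any JSON is built. The query below is only a diagnostic sketch against the tables above; the column aliases joined_rows, likes and dislikes are made up for illustration, and "user" is double-quoted on the assumption that this runs on PostgreSQL, where user is a reserved word.

```sql
-- Diagnostic sketch: compare the size of the joined row set with the
-- number of distinct child records behind it.
SELECT count(*)                            AS joined_rows, -- rows after both joins
       count(DISTINCT topic_like.index)    AS likes,       -- distinct topic_like rows for this user
       count(DISTINCT topic_dislike.index) AS dislikes     -- distinct topic_dislike rows for this user
FROM "user"
LEFT JOIN topic_like    ON "user".id = topic_like.user_id
LEFT JOIN topic_dislike ON "user".id = topic_dislike.user_id
WHERE "user".id = '58b6fe31-f3f6-4781-af06-93e29cb05bca';
```

For a user with 5 likes and 5 dislikes this reports 25 joined rows but only 5 likes and 5 dislikes. json_agg runs once per joined row, which is why both arrays end up with 25 elements.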

How can I restructure this query so that each JSON array contains only as many elements as there are matching records in the corresponding table?

How to solve:


Method 1

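Use correlated sub-selects rather than left joins. Each sub-select aggregates one table on its own, so the rows of topic_like are never multiplied by the rows of topic_dislike: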

SELECT "user".id, email, username,
    (SELECT json_agg(json_build_object('likeable_topic_id', likeable_topic_id, 'index', topic_like.index))
     FROM topic_like
     WHERE "user".id = topic_like.user_id) AS liked_json,
    (SELECT json_agg(json_build_object('dislikeable_topic_id', dislikeable_topic_id, 'index', topic_dislike.index))
     FROM topic_dislike
     WHERE "user".id = topic_dislike.user_id) AS disliked_json
FROM "user"
WHERE "user".id = '58b6fe31-f3f6-4781-af06-93e29cb05bca';
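If you would rather keep the joins, one possible alternative (a sketch, not part of the original answer) is to aggregate each child table in its own derived table first and then join the results, which have at most one row per user. The aliases u, l and d are made up for illustration. The COALESCE and ORDER BY are additions that assume you want an empty array instead of NULL for a user with no rows, and elements sorted by index.

```sql
-- Sketch: pre-aggregate each child table, then join one row per user.
SELECT u.id, u.email, u.username,
       COALESCE(l.liked_json, '[]'::json)    AS liked_json,    -- no likes: empty array, not NULL
       COALESCE(d.disliked_json, '[]'::json) AS disliked_json  -- no dislikes: empty array, not NULL
FROM "user" AS u
LEFT JOIN (
    SELECT topic_like.user_id,
           json_agg(json_build_object('likeable_topic_id', topic_like.likeable_topic_id,
                                      'index', topic_like.index)
                    ORDER BY topic_like.index) AS liked_json
    FROM topic_like
    GROUP BY topic_like.user_id
) AS l ON l.user_id = u.id
LEFT JOIN (
    SELECT topic_dislike.user_id,
           json_agg(json_build_object('dislikeable_topic_id', topic_dislike.dislikeable_topic_id,
                                      'index', topic_dislike.index)
                    ORDER BY topic_dislike.index) AS disliked_json
    FROM topic_dislike
    GROUP BY topic_dislike.user_id
) AS d ON d.user_id = u.id
WHERE u.id = '58b6fe31-f3f6-4781-af06-93e29cb05bca';
```

Because each derived table is grouped by user_id, joining them cannot multiply rows. This shape may suit queries that return many users at once, while the correlated sub-selects stay simpler for a single-user lookup.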


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
