What is more efficient on select – objects identified only by their ID, or by their ID + customer id?

I have a collection of ~500K configurations. Each configuration belongs to a specific customer. There are up to a few hundred customers.

I want to store the configurations in a table, and it will be used in JOIN statements.
In the select query, there is always a condition on the customer id. It is always joined with the configurations table. Sometimes there is a condition on one or more of the columns in the configuration table.

I would like to know which approach is better:

  • PK of the configurations table is customer id + configuration id. I will also include the customer id condition in the join clause.
  • PK of the configurations table is only configuration id.

I want to understand:

  • Should the presence of the customer id in the PK have a major effect on performance?
  • Are there any disadvantages to using a 2-column PK? Assume there is ALWAYS a condition on the customer, so I will never query the configurations table on configuration id only.

Thanks.

Method 1

Q: Should the presence of the customer id in the PK have a major effect on performance?

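A: Doing so makes PostgreSQL create a B-tree index on both columns, sorted in the order they are listed in the key, for example (customer_id, configuration_id). That index can serve queries that filter or join on both columns and, with customer_id as the leading column, queries that filter on customer_id alone. This is a composite (multicolumn) index; it acts as a covering index only for queries that read nothing but the indexed columns. So yes, the effect on performance is usually beneficial.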
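As a rough sketch of the two-column key, assuming a hypothetical schema (the customers and devices tables, the settings column, and the customer id 42 are made up for illustration; only customer_id and configuration_id come from the question):

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE customers (
    customer_id integer PRIMARY KEY,
    name        text NOT NULL
);

CREATE TABLE configurations (
    customer_id      integer NOT NULL REFERENCES customers (customer_id),
    configuration_id integer NOT NULL,
    settings         jsonb,
    -- customer_id is listed first, so the backing B-tree index is
    -- sorted by customer and then by configuration.
    PRIMARY KEY (customer_id, configuration_id)
);

-- A hypothetical table that references a configuration.
CREATE TABLE devices (
    device_id        integer PRIMARY KEY,
    customer_id      integer NOT NULL,
    configuration_id integer NOT NULL,
    FOREIGN KEY (customer_id, configuration_id)
        REFERENCES configurations (customer_id, configuration_id)
);

-- The customer id appears both in the join clause and in the filter.
EXPLAIN
SELECT d.device_id, c.settings
FROM devices AS d
JOIN configurations AS c
  ON c.customer_id = d.customer_id
 AND c.configuration_id = d.configuration_id
WHERE d.customer_id = 42;
```

EXPLAIN shows which plan the planner picks. Whether it uses the primary key index depends on table size and statistics, so it is worth checking against realistic data rather than assuming.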

Q: Are there any disadvantages to using a 2-column PK? Assume there is ALWAYS a condition on the customer, so I will never query the configurations table on configuration id only.

A: No, there is nothing inherently wrong with a two-column Primary Key. But keep in mind that the main goal of the Primary Key is to ensure uniqueness of the rows in that table. That PostgreSQL creates an index on the Primary Key is only a secondary benefit.

So if adding another column to the Primary Key would break the uniqueness rule that makes logical sense for your data, don't do it. Instead, keep the Primary Key as it is and add your own secondary index on both columns.
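A minimal sketch of that alternative, assuming configuration_id is unique on its own (the settings column and the index name are hypothetical):

```sql
-- Alternative layout: uniqueness is enforced on configuration_id alone.
CREATE TABLE configurations (
    configuration_id integer PRIMARY KEY,
    customer_id      integer NOT NULL,
    settings         jsonb
);

-- Secondary, non-unique index for queries that always filter on the
-- customer; customer_id leads because every query constrains it.
CREATE INDEX configurations_customer_idx
    ON configurations (customer_id, configuration_id);
```

The trade-off is a second index to maintain on writes, in exchange for keeping the uniqueness rule on configuration_id by itself.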

In your case, it sounds like you’d want the data uniqueness to be dependent on customer_id as well, so you’re probably fine to just include it in the Primary Key.

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
