Is a clustered index of 3 columns too big?

My goal is to design a table that can be queried through an external id (uniqueidentifier) or an internal id (bigint), always in combination with companyId (bigint), userId (bigint) and dashboardId (bigint), or in combination with dashboardId IN (@0, ..., @n), where n ranges from 0 to 10. Both sets of conditions serve as an ownership check.

I came up with the following index compositions:

CREATE CLUSTERED INDEX Mytable_createdBy_cix ON Mytable(companyId, createdBy, dashboardId)

CREATE UNIQUE NONCLUSTERED INDEX Mytable_extId_nix ON Mytable(extId) INCLUDE (valueD, valueN)

CREATE UNIQUE NONCLUSTERED INDEX Mytable_chartId_nix ON Mytable (chartId) INCLUDE (valueD, valueN)

I do not know the answers to the following questions:

  • Is the clustered index bad because it is non-unique? Should I add a surrogate key instead of relying on the auto-assigned uniqueifier?
  • Are 3 * 8-byte columns + a 4-byte uniqueifier (28 bytes total) too much for a clustered index? I read that the clustering key gets included in the leaf pages of each unique nonclustered index (on top of the additional 16 or 8 bytes of that index's own key).
  • Does this index design even make sense for the queries below?

I plan on running queries, similar to those:

SELECT chartId, valueD, valueN FROM Mytable WHERE companyId = @companyId AND createdBy = @userId

SELECT chartId, valueD, valueN FROM Mytable WHERE companyId = @companyId AND createdBy = @userId AND dashboardId = @dashboardId

SELECT chartId, valueD, valueN FROM Mytable WHERE dashboardId IN (@0, @1, @2)

SELECT chartId, valueD, valueN FROM Mytable WHERE (companyId = @companyId AND createdBy = @userId AND dashboardId = @dashboardId) OR dashboardId IN (@0, @1, @2)

UPDATE Mytable SET valueD = @valueD WHERE companyId = @companyId AND createdBy = @userId AND chartId = @chartId

UPDATE Mytable SET valueD = @valueD WHERE companyId = @companyId AND createdBy = @userId AND extChartId= @extId

UPDATE Mytable SET valueD = @valueD WHERE ((companyId = @companyId AND createdBy = @userId) OR dashboardId IN (@0, @1, @2)) AND extChartId= @extId

I do know it is better to test, evaluate execution plans, and share them when asking questions on Stack Exchange, but this is the design phase, so no actual data or tables exist yet.

I can adjust the keys/indexes/table structure to better fit the queries. I just hope to get it at least partially right this first time while creating them, so the question does not get revisited.

Big thanks for any help in advance.

How to solve :

Method 1

Firstly, note that the uniqueifier on a non-unique key only takes up space in rows that actually have duplicate key values; a row whose key is not duplicated carries no extra bytes. So unless two rows have exactly the same companyId, createdBy and dashboardId, it adds nothing.
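Once the table holds some data, a query along these lines could show whether the proposed clustering key actually has duplicates, and therefore whether the uniqueifier would come into play. This is only a sketch; it assumes the table and column names from the question.

```sql
-- Lists every (companyId, createdBy, dashboardId) combination that occurs
-- more than once. An empty result means the key is unique in practice.
SELECT companyId, createdBy, dashboardId, COUNT(*) AS rowsPerKey
FROM Mytable
GROUP BY companyId, createdBy, dashboardId
HAVING COUNT(*) > 1
ORDER BY rowsPerKey DESC;
```

If one user can own several charts on the same dashboard, duplicates should be expected here.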


Wide clustering keys can be problematic, but they also solve some deadlocking issues, so that may be a factor. It is not by any means clear that your choice of clustering key is right, but on the other hand: how do two UNIQUE non-clustered indexes make sense given the table design? If they are unique then why are all the predicates necessary in these queries?

It seems from your comments that the extra chartId is simply there to have a smaller index column. I think this is probably a premature optimization: it simply adds extra indexing costs because you now need to index that column as well. I would recommend you remove it and rely purely on extChartId, even though it's wider.


For the given queries, you need to work through them and decide how best to satisfy them with indexes. The question of which one should be the clustering index is somewhat orthogonal, as a clustering index effectively INCLUDEs all columns automatically.

Each one may be able to use an index that also satisfies a different one, as long as the leading key columns are the same, irrespective of any additional columns in the key or INCLUDE.

  1. SELECT extChartId, valueD, valueN FROM Mytable WHERE companyId = @companyId AND createdBy = @userId
    This can be satisfied with the following index
    (companyId, createdBy) INCLUDE (extChartId, valueD, valueN)

  2. SELECT extChartId, valueD, valueN FROM Mytable WHERE companyId = @companyId AND createdBy = @userId AND dashboardId = @dashboardId
    This can be satisfied with the following index
    (companyId, createdBy, dashboardId) INCLUDE (extChartId, valueD, valueN)

  3. SELECT extChartId, valueD, valueN FROM Mytable WHERE dashboardId IN (@0, @1, @2)
    This can be satisfied with the following index
    (dashboardId) INCLUDE (extChartId, valueD, valueN)

  4. SELECT extChartId, valueD, valueN FROM Mytable WHERE (companyId = @companyId AND createdBy = @userId AND dashboardId = @dashboardId) OR dashboardId IN (@0, @1, @2)
    This one is more difficult, and needs an index union (you might need to rewrite the query to get that). The required indexes would be the same as #1 and #3

  5. We change this one to only use the natural key, so it's exactly the same as #6

  6. UPDATE Mytable SET valueD = @valueD WHERE companyId = @companyId AND createdBy = @userId AND extChartId = @extId
    Because extChartId is unique, the other columns can go in the INCLUDE
    This therefore needs an index
    (extChartId) INCLUDE (companyId, createdBy, valueD)

  7. UPDATE Mytable SET valueD = @valueD WHERE ((companyId = @companyId AND createdBy = @userId) OR dashboardId IN (@0, @1, @2)) AND extChartId = @extId
    Again this is difficult to satisfy, due to the OR. It may be necessary to split this into two separate updates. But given that extChartId is unique, we can again just rely on that same index.

Looking at those indexes, we arrive at the following conclusions:

  • The predicates are all = equality predicates, or IN on a short list, so the key columns can be in any order. This helps us immensely in combining the indexes.
  • An index suitable for #1 could have extra columns to suit #2, but not #3. Equally one suitable for #3 could also work for #2 but not for #1. So we need separate indexes. The question remains which of those combinations could also satisfy other queries.
  • #4 can work with the same indexes as the first three, so we won’t worry about that.
  • #6 and #7 need extChartId, which you say is unique. Therefore all the rest of the columns can go in the INCLUDE with little performance impact.

So it follows that the best combination of indexes is something like this

(companyId, createdBy, dashboardId) INCLUDE (extChartId, valueD, valueN)
(dashboardId) INCLUDE (extChartId, valueD, valueN)
(extChartId) INCLUDE (companyId, createdBy, dashboardId, valueD)

The question remains which of these you choose for your clustering key. Whichever you choose will INCLUDE all other columns.

The first or third index makes the most sense to me. Given that extChartId is unique by itself, it could make more sense to use that due to the size, as you rightly noted.

But deadlocking may also be a problem, depending on the complexity of your transactional updates etc. Start with that, and switch the clustering key if you see it’s a problem.
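To make the extChartId option concrete, here is a minimal sketch of the table with extChartId as the clustering key and the other two indexes as nonclustered. The index names and the types of valueD and valueN are assumptions made for illustration; the question does not state them.

```sql
CREATE TABLE Mytable (
    extChartId  uniqueidentifier NOT NULL,
    companyId   bigint           NOT NULL,
    createdBy   bigint           NOT NULL,
    dashboardId bigint           NOT NULL,
    valueD      datetime2        NULL,  -- assumed type
    valueN      decimal(18, 4)   NULL   -- assumed type
);

-- Unique clustering key: no uniqueifier is needed, and the leaf level
-- holds every column of the table.
CREATE UNIQUE CLUSTERED INDEX Mytable_extChartId_cix
    ON Mytable (extChartId);

-- Nonclustered indexes carry the clustering key automatically,
-- so extChartId does not have to be listed in INCLUDE.
CREATE NONCLUSTERED INDEX Mytable_owner_nix
    ON Mytable (companyId, createdBy, dashboardId)
    INCLUDE (valueD, valueN);

CREATE NONCLUSTERED INDEX Mytable_dashboardId_nix
    ON Mytable (dashboardId)
    INCLUDE (valueD, valueN);
```

If deadlocks do show up later, the clustered and nonclustered roles of the first two indexes could be swapped without changing the key columns.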

Method 2

"I do know, it is better to test, evaluate execution plans, share them when asking questions on stackexchange, but this is the design phase so no actual data or tables exist yet."

In the design phase, add a unique index for every key, and an additional non-clustered index for every foreign key that is not already supported by a key.
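Applied to this table, that rule might look like the sketch below. It assumes Mytable already exists, that extChartId is its only key, and that dashboardId references a hypothetical Dashboard table; neither the Dashboard table nor the constraint names appear in the question.

```sql
-- A unique constraint for the key; SQL Server backs it with a unique index.
ALTER TABLE Mytable
    ADD CONSTRAINT Mytable_extChartId_uq UNIQUE (extChartId);

-- Hypothetical foreign key to an assumed Dashboard table.
ALTER TABLE Mytable
    ADD CONSTRAINT Mytable_dashboardId_fk
    FOREIGN KEY (dashboardId) REFERENCES Dashboard (dashboardId);

-- The foreign key column is not the leading column of any key,
-- so it gets its own nonclustered index.
CREATE NONCLUSTERED INDEX Mytable_dashboardId_nix
    ON Mytable (dashboardId);
```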

Then as you develop, evaluate query execution and consider additional indexes.

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.