Imagine we have received the results of a health survey on daily consumption habits of 3 different items, like the following:
I want to model this in a star schema. In the fact table, I create foreign key relationships to the date and item dimensions, as well as to a demographics dimension holding country and age. I then sum up the number of respondents per demographic group. If the number of respondents is above 100, I mark the group as representative of the population. Finally, I calculate the total and average consumption for each group.
For instance, there were 70 respondents from demographic 1 (e.g. country = US, age = 18). On average, they consumed 4 of item 1 (e.g. cigarettes).
Generally, we should strive to hold only facts and foreign keys in the fact table. However, I personally don’t think that a separate dimension for the boolean flag provides any value. Can this flag be considered a degenerate dimension, or is it considered bad design to have it in the fact table?
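To make the setup concrete, here is a rough sketch of the model with the flag kept directly in the fact table, using Python's built-in `sqlite3` module. All table and column names (`dim_date`, `dim_item`, `dim_demographic`, `fact_consumption`, `is_representative`) and the helper `load_group` are made up for illustration, as is the survey date; the 70 respondents, the average of 4 and the threshold of 100 come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, survey_date TEXT);
    CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE dim_demographic (
        demographic_key INTEGER PRIMARY KEY, country TEXT, age INTEGER
    );
    CREATE TABLE fact_consumption (
        date_key          INTEGER REFERENCES dim_date(date_key),
        item_key          INTEGER REFERENCES dim_item(item_key),
        demographic_key   INTEGER REFERENCES dim_demographic(demographic_key),
        respondent_count  INTEGER,
        total_consumption REAL,
        avg_consumption   REAL,
        is_representative INTEGER  -- boolean flag stored in the fact table
    );
    -- Hypothetical dimension rows; the date is an assumption.
    INSERT INTO dim_date VALUES (1, '2020-01-01');
    INSERT INTO dim_item VALUES (1, 'cigarettes');
    INSERT INTO dim_demographic VALUES (1, 'US', 18);
""")


def load_group(conn, date_key, item_key, demographic_key, answers, threshold=100):
    """Aggregate one demographic group's answers into a single fact row.

    Assumes `answers` is a non-empty list of per-respondent daily consumption.
    """
    count = len(answers)
    total = sum(answers)
    conn.execute(
        "INSERT INTO fact_consumption VALUES (?, ?, ?, ?, ?, ?, ?)",
        (date_key, item_key, demographic_key,
         count, total, total / count,
         int(count > threshold)),  # representative only if above the threshold
    )


# 70 respondents from demographic 1, each reporting 4 of item 1.
load_group(conn, 1, 1, 1, [4] * 70)

row = conn.execute(
    "SELECT respondent_count, total_consumption, avg_consumption, is_representative "
    "FROM fact_consumption"
).fetchone()
print(row)
```

With only 70 respondents the group falls below the threshold, so the flag is stored as 0.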
How to solve:
It’s not really an answer, but for lack of better ideas, I decided to put the flag in a dimension table after all.
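For anyone curious what that might look like, here is a minimal sketch of the flag moved into its own small dimension table, again with Python's built-in `sqlite3`. The names `dim_representative`, `representative_key`, `label` and the helper `representative_key_for` are my own assumptions, and the second group (150 respondents) is made-up data added only so that both flag values show up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two-row dimension holding the flag and a readable label.
    CREATE TABLE dim_representative (
        representative_key INTEGER PRIMARY KEY,
        is_representative  INTEGER NOT NULL,
        label              TEXT NOT NULL
    );
    INSERT INTO dim_representative VALUES
        (0, 0, 'Not representative'),
        (1, 1, 'Representative');

    -- Fact table now carries only a foreign key instead of the flag itself.
    CREATE TABLE fact_consumption (
        demographic_key    INTEGER,
        item_key           INTEGER,
        representative_key INTEGER REFERENCES dim_representative(representative_key),
        respondent_count   INTEGER,
        avg_consumption    REAL
    );
""")


def representative_key_for(respondent_count, threshold=100):
    """Map a respondent count to the matching dimension key."""
    return 1 if respondent_count > threshold else 0


# (demographic_key, item_key, respondent_count, avg_consumption)
groups = [
    (1, 1, 70, 4.0),    # the example from the question
    (2, 1, 150, 2.5),   # made-up group, above the threshold
]
for demographic_key, item_key, count, avg in groups:
    conn.execute(
        "INSERT INTO fact_consumption VALUES (?, ?, ?, ?, ?)",
        (demographic_key, item_key, representative_key_for(count), count, avg),
    )

labels = conn.execute("""
    SELECT f.demographic_key, d.label
    FROM fact_consumption AS f
    JOIN dim_representative AS d USING (representative_key)
    ORDER BY f.demographic_key
""").fetchall()
print(labels)
```

The main thing this buys is a place to hang a descriptive label (or further attributes later) without touching the fact table; whether that is worth the extra join is a judgment call.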