Briefly, what are the main advantages and disadvantages of normalisation as a technique for database design?
It may be easier to answer this with a snapshot of some unorganised data, as a concrete example:
Here’s a Microsoft article that does a pretty good job explaining normalization and the different normal forms. It’s a little dated in its references to Microsoft Access, but the theory and principles still apply.
In short, database normalization provides the following benefits:
Reduces data redundancy, which improves maintainability. As you can see in your example table Customer_Sale, things that aren’t central to a Sale are repeated unnecessarily. A good example of this is the ItemDesc column. Imagine the scenario where ItemNo = 123 has been sold to Customers for the past year, say tens of thousands of sales. Then the business realizes the ItemDesc is wrong and needs to be updated. With the current denormalized design of the Customer_Sale table, you would need to update all of those tens of thousands of records to fix the ItemDesc.
Normalization here would mean having another table called Items, which stores one row per unique Item and is where the ItemDesc would live. The primary key of that table would likely be ItemNo (assuming that’s the unique identifier for an Item here). So there would be only one record for ItemNo = 123 carrying its ItemDesc, and the Customer_Sale table would no longer have an ItemDesc column (you’d reference it via the Items table by joining on the ItemNo field in both tables). Now if the description of an Item needs to change, you’d only have to update it in one place: that single row in the Items table.
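The update anomaly described above can be sketched with Python’s built-in sqlite3 module. The table and column names (Customer_Sale, Items, ItemNo, ItemDesc) come from the question; the sale volume and the "Widget" description are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized design: ItemDesc is repeated on every sale row.
cur.execute("CREATE TABLE Customer_Sale "
            "(SaleNo INTEGER PRIMARY KEY, ItemNo INTEGER, ItemDesc TEXT)")
cur.executemany("INSERT INTO Customer_Sale (ItemNo, ItemDesc) VALUES (?, ?)",
                [(123, "Wdiget") for _ in range(10000)])  # a typo'd description, 10,000 times over

# Fixing the typo has to touch every one of those sale rows.
cur.execute("UPDATE Customer_Sale SET ItemDesc = 'Widget' WHERE ItemNo = 123")
denormalized_rows_touched = cur.rowcount
print(denormalized_rows_touched)  # → 10000

# Normalized design: the description lives exactly once, in Items.
cur.execute("CREATE TABLE Items (ItemNo INTEGER PRIMARY KEY, ItemDesc TEXT)")
cur.execute("INSERT INTO Items VALUES (123, 'Wdiget')")
cur.execute("UPDATE Items SET ItemDesc = 'Widget' WHERE ItemNo = 123")
normalized_rows_touched = cur.rowcount
print(normalized_rows_touched)  # → 1
```

Same fix, four orders of magnitude less work, and in a real system that difference also shows up as shorter lock times.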
For similar reasons, normalization can improve performance by minimizing the amount of work the above type of maintenance requires. Fewer rows to update generally means shorter lock times and less chance of lock escalation (where applicable), so overall your database system and the applications that consume it will be able to run more efficiently.
Another reason performance may improve with normalization is that your tables (and, more to the point, the on-disk structures they live in, generally referred to as data pages) become smaller.
This helps a SQL engine locate and load those data pages from disk, which is typically the biggest bottleneck in a server’s provisioned hardware. Since your table, and effectively each of its rows, becomes smaller, more rows fit in a single data page, which in turn means fewer pages need to be located and loaded from disk.
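To put rough numbers on this (the 8 KB page size is what SQL Server uses; the row widths are purely illustrative assumptions, and real pages also carry per-row and per-page overhead):

```python
PAGE_SIZE = 8192  # bytes per data page; 8 KB is SQL Server's page size

# Hypothetical row widths, for illustration only.
wide_row = 200    # denormalized Customer_Sale row carrying a repeated ItemDesc
narrow_row = 60   # normalized row carrying just the ItemNo foreign key

rows = 1_000_000
pages_wide = -(-rows * wide_row // PAGE_SIZE)     # ceiling division
pages_narrow = -(-rows * narrow_row // PAGE_SIZE)

print(pages_wide, pages_narrow)  # → 24415 7325
```

Shrinking the row width shrinks the page count proportionally, and it is pages, not rows, that the engine reads off disk.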
From the consuming application perspective, you also generally gain more flexibility when the architecture of your database is normalized appropriately.
By breaking the fields of your data points out into appropriately narrower tables that make sense for your domain model, and keeping the closely related fields of a particular entity together in the same table, you maximize your ability to utilize, query, and manipulate those data points and entities as needed in your consuming applications (a refactoring of your data, in a very loose sense of the word).
An example of this, again using your Customer_Sale table, would be a Sales Order application with two screens: one showing the unique list of Items the business sells along with their descriptions, and the other showing the list of Customer_Sales the business has made so far.
If you didn’t have the normalized Items table storing the ItemDesc field (as my first point exemplified), then to support both of those screens and their use cases you’d have a more difficult time with the less flexible, denormalized Customer_Sale table, because of its data redundancy in the Item fields.
Of course, in your consuming programming language you can use a distinct operator of sorts to transform the data of your Customer_Sale table to fit the model of the Items-available-for-sale screen, but that is additional work the consuming application has to do every time that screen is loaded. It also becomes dicey fast from a code-management perspective, especially as more complex business rules come into play over time, as opposed to a normalized database architecture where the Items table already exists.
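To make the two-screens example concrete, here is a sketch in Python’s sqlite3 (the sale data is invented): with only the denormalized table, the item-list screen must deduplicate on every load, while a normalized Items table already is the unique list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Customer_Sale "
            "(SaleNo INTEGER PRIMARY KEY, ItemNo INTEGER, ItemDesc TEXT)")
cur.executemany("INSERT INTO Customer_Sale (ItemNo, ItemDesc) VALUES (?, ?)",
                [(123, "Widget"), (123, "Widget"), (456, "Gadget")])

# Denormalized only: the "Items for sale" screen has to deduplicate every time.
cur.execute("SELECT DISTINCT ItemNo, ItemDesc FROM Customer_Sale ORDER BY ItemNo")
deduplicated = cur.fetchall()
print(deduplicated)  # → [(123, 'Widget'), (456, 'Gadget')]

# Normalized: the Items table already holds exactly one row per item.
cur.execute("CREATE TABLE Items (ItemNo INTEGER PRIMARY KEY, ItemDesc TEXT)")
cur.executemany("INSERT INTO Items VALUES (?, ?)",
                [(123, "Widget"), (456, "Gadget")])
cur.execute("SELECT ItemNo, ItemDesc FROM Items ORDER BY ItemNo")
item_list = cur.fetchall()
print(item_list)  # → [(123, 'Widget'), (456, 'Gadget')]
```

The DISTINCT version also becomes harder to keep correct as business rules accumulate (discontinued items, renamed items, and so on), whereas the Items table is a single place to encode them.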
Possible disadvantages of normalization are:
Over-normalization can lead to performance problems of its own. By breaking fields up into too many tables, you may overcomplicate querying: getting the rows you need then always involves re-joining most of those tables back together. Some database systems struggle more with many joins than others, so your mileage may vary.
In a database structure meant to support heavy OLAP (Online Analytical Processing, essentially data warehousing and heavy reporting purposes), denormalized tables sometimes perform better by keeping cached and pre-calculated, commonly needed facts and figures close at hand.
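The first disadvantage above, the join cost of over-splitting, can be sketched as follows. The four-table schema here is a deliberately exaggerated, hypothetical example (even the description text has been pushed into its own table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Over-normalized schema: every attribute lives behind another key.
cur.executescript("""
CREATE TABLE Sales     (SaleNo INTEGER PRIMARY KEY, CustomerNo INTEGER, ItemNo INTEGER);
CREATE TABLE Customers (CustomerNo INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Items     (ItemNo INTEGER PRIMARY KEY, DescNo INTEGER);
CREATE TABLE ItemDescs (DescNo INTEGER PRIMARY KEY, ItemDesc TEXT);
INSERT INTO Customers VALUES (1, 'Acme');
INSERT INTO ItemDescs VALUES (10, 'Widget');
INSERT INTO Items     VALUES (123, 10);
INSERT INTO Sales     VALUES (1, 1, 123);
""")

# Even "show one sale with its customer and item description" needs three joins.
cur.execute("""
    SELECT s.SaleNo, c.CustomerName, d.ItemDesc
    FROM Sales s
    JOIN Customers c ON c.CustomerNo = s.CustomerNo
    JOIN Items i     ON i.ItemNo     = s.ItemNo
    JOIN ItemDescs d ON d.DescNo     = i.DescNo
""")
sale_view = cur.fetchall()
print(sale_view)  # → [(1, 'Acme', 'Widget')]
```

Every query against this schema pays the join cost, which is why splitting a single-valued attribute like ItemDesc away from Items buys nothing and costs a join.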
There’s a lot more in-depth and technical reasoning, not covered in this answer, in the article linked above, so I highly recommend reading through it after getting this brief overview.