Monday, 5 December 2016

Key Points of Normalization (DBMS Topic)

Normalization: It is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity and scalability. This improvement is balanced against an increase in complexity and potential performance losses from the joining of the normalized tables at query-time.
There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. 
 Normalization is also called “Bottom-up-approach”, because this technique requires full knowledge of every participating attribute and its dependencies on the key attributes, if you try to add new attributes after normalization is done, it may change the normal form of the database design.
Without Normalization Problems: Without Normalization, it becomes difficult to handle and update the database, without facing data loss. Insertion, Updation and Deletion Anamolies are very frequent if Database is not normalized. 

    ·Insert Anomaly - Due to lack of data i.e., all the data available for insertion such that null values in keys should be avoided. This kind of anomaly can seriously damage a database
   ·Update Anomaly - It is due to data redundancy i.e. multiple occurrences of same values in a column. This can lead to inefficiency.
    ·Deletion Anomaly - It leads to loss of data for rows that are not stored elsewhere. It could result in loss of vital data.

On decomposition of a relation into smaller relations with fewer attributes on normalization the resulting relations whenever joined must result in the same relation without any extra rows. The join operations can be performed in any order. This is known as Lossless Join decomposition. The resulting relations (tables) obtained on normalization should possess the properties such as each row must be identified by a unique key, no repeating groups, homogenous columns, each column is assigned a unique name etc.

Normalization has Five Normal Forms:

A relation is considered to be in first normal   form if all of its attributes have domain that are indivisible or atomic.

A table is in 1NF if and only if its satisfies the following five conditions:
       · There is no top-to-bottom ordering to the rows.
       · There is no left-to-right ordering to the columns.
       · There are no duplicate rows.
       · Every row and column intersection contains exactly one value from the applicable domain.
      · All columns are regular
      · Each attribute must contain only a single value from its predefined domain.


      · Table is in 1NF (First normal form)
      ·         No non-prime attribute is dependent on the proper subset of any candidate key of table.

An attribute that is not part of any candidate key is known as non-prime attribute.


A functional dependency is said to be transitive if it is indirectly formed by two functional dependencies. For e.g.
X -> Z is a transitive dependency if the following three functional dependencies hold true:
· X->Y
·Y does not 
· Y->Z
A table design is said to be in 3NF if both the following conditions hold:
·  Table must be in 2NF
·  Transitive functional dependency  of non-prime attribute on any super key should be removed.

An attribute that is not part of any candidate key is known as non-prime attribute.
In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and for each functional dependency X-> Y at least one of the following conditions hold:
·   X is a super key of table
·   Y is a prime attribute of table

An attribute that is a part of one of the candidate keys is known as prime attribute.


A relational schema R is considered to be in Boyce–Codd normal form (BCNF) if, for every one of its dependencies X → Y, one of the following conditions holds true:
·   X → Y is a trivial functional dependency (i.e., Y is a subset of X)
·       X is a superkey for schema R
 “Each attribute must represent a fact about the key, the whole key, and nothing but the key.

A database is said to be in 5NF, if and only if,
· It's in 4NF
· If we can decompose table further to eliminate redundancy and anomaly, and when we re-join the decomposed tables by means of candidate keys, we should not be losing the original data or any new record set should not arise. In simple words, joining two or more decomposed table should not lose records nor create new records.

No comments:

Post a Comment