January 29, 2012

Normal Forms

First Normal Form

First Normal Form eliminates repeating groups by putting each into a separate table and connecting them with a one-to-many relationship.
Two rules follow this definition:
  • Each table has a primary key made of one or several fields and uniquely identifying each record
  • Each field is atomic, it does not contain more than one value
For instance, assuming a table WAGON to follow each wagon in every station.

wagon_id    integer  (ex. 101)
description string   (ex. 'wagon_type, empty_weight, capacity, designer, design_date')
state       string   (ex. 'under repair')
timestamp   datetime (ex. '22/12/2008 17:37')
station     string   (ex. 'New York Grand Central')

The primary key is (wagon_id, timestamp).
This table is not in 1NF because "description" is not atomic. To move it in 1NF we have to split "description" field in its components:

wagon_id     integer
wagon_type   string
empty_weight number
capacity     number
designer     string
design_date  datetime
state        string
timestamp    datetime
station      string

[edit]Second Normal Form

Second Normal Form eliminates functional dependencies on a partial key by putting the fields in a separate table from those that are dependent on the whole key.
In our example, "wagon_type", "empty_weight", "capacity"... only depends on "wagon_id" but not on "timestamp" field of the primary key, so this table is not in 2NF. In order to reach 2NF, we have to split the table in two in the way that each field of each table depends on all the fields of each primary key:

wagon_id     integer
wagon_type   string
empty_weight number
capacity     number
designer     string
design_date  datetime
wagon_id     integer
timestamp    datetime
state        string
station      string

[edit]Third Normal Form

Third Normal Form eliminates functional dependencies on non-key fields by putting them in a separate table. At this stage, all non-key fields are dependent on the key, the whole key and nothing but the key.
In our example, in the first table it is most likely that "empty_weight", "capacity", "designer" and "design_date" depend on "wagon_type", so we have to split this table in two:

wagon_id     integer
wagon_type   string
wagon_type   string
empty_weight number
capacity     number
designer     string
design_date  datetime

Now our example with its 3 tables is in 3NF.

[edit]Fourth Normal Form

Fourth Normal Form separates independent multi-valued facts stored in one table into separate tables.
In the last table of our example, it is clear that "empty_weight" and "capacity" are interesting for every day wagon management whereas "designer" and "design_date" for historical purpose. So we can think they will never be used at the same time and have to split the table in two to fit the 4NF.

wagon_type   string
empty_weight number
capacity     number
wagon_type   string
designer     string
design_date  datetime

[edit]Fifth Normal Form

Fifth Normal Form breaks out data redundancy that is not covered by any of the previous normal forms.

No comments:

Post a Comment

I'm certainly not an expert, but I'll try my hardest to explain what I do know and research what I don't know.

My Favorite Site's List

#update below script more than 500 posts