RELATIONAL DATABASE THEORY ON NORMAL FORMS

Click here for audio-text lecture (for both this unit and the previous one) and feed it to the speech agent
Click here for an audio lecture that can be played using RealPlayer

FUNCTIONAL DEPENDENCY

X -> Y, or X functionally determines Y if and only if whenever two tuples of a relational instance r(R) of R agree on their X-value, they must also agree on the Y-value.

Example: SSN -> NAME

We say X determines Y, or Y is dependent on X

Example: {S#,P#} -> QTY

Since QTY depends on BOTH attributes S# and P#, this is called a FULL functional dependency.

Example: {SSN, NAME} -> ADDRESS

Since SSN -> ADDRESS, this is called a partial functional dependency.

We also write:

S# P# -> QTY
SSN NAME -> ADDRESS

NORMALIZATION

The relation on the left-hand side is unnormalized and contains nested relations. The relation on the right-hand side is normalized. This is the first normal form.


NORMAL FORMS

  • A relation is in a "normal form" if it satisfies a certain set of constraints.

  • First Normal Form (1NF): A relation's underlying domains contain atomic values only.

  • Second Normal Form (2NF): A relation's every nonkey attribute is fully dependent on the primary key.

  • Third Normal Form (3NF): A relation's nonkey attributes are: (a) mutually independent, and (b) fully dependent on the primary key.

  • EVERY RELATION MUST BE IN 1NF. This is one of the basic properties of a relation. Not all 1NF relation is in 2NF. Not all 2NF relation is in 3NF.

    EXAMPLE OF NORMALIZATION

    The following relation FIRST is in first normal form.
    S#       STATUS     CITY      P#    QTY
    ---      ------   --------   ---   ----
     S1          20   London      P1   300
     S1          20   London      P2   200
     S1          20   London      P3   400
     S1          20   London      P4   200
     S1          20   London      P5   100
     S1          20   London      P6   100
     S2          10    Paris      P1   300
     S2          10    Paris      P2   400
     S3          10    Paris      P2   200
     S4          20   London      P2   200
     S4          20   London      P4   300
     S4          20   London      P5   400
    


    The FIRST relation has the following functional dependencies:



    The functional dependencies in relations S, P and SP:


    UPDATE ANOMALIES

  • Relation FIRST(S#,STATUS,CITY,P#,QTY)

  • INSERT anomaly

    We cannot insert (S5,20,London) into the relation FIRST.

  • DELETE anomaly

    If we delete (S3,10,Paris,P2,200), we lose the information that S3 is in Paris.

    SOLUTION

  • Decompose FIRST into two relations: SECOND(S#,STATUS,CITY) and SP(S#,P#,QTY)

  • Decompose SECOND into two relations: SC(S#, CITY) and CS(CITY, STATUS)

  • We end up with three 3NF relations SP, SC and CS.
    Functional dependencies in the relations SECOND and SP.


    KINDS OF NORMAL FORMS

    The following diagram illustrates the various kinds of normal forms.


    KINDS OF RELATIONS

    (duplicated page)

  • Base relations: The real relations. Called "base table" in SQL.

  • Views: The virtual relations. A view is a named, derived relation.

  • Snapshots: A snapshot is a real, not virtual, named derived relation.

  • Query results: The final output relation from a specified query. It may not be named and has no permanent existence.

  • Temporary relations: A nonpermanent named derived relation.

    INFERENCE RULES FOR FUNCTIONAL DEPENDENCIES


    IR1 (Reflective rule) IF X contains Y, then X -> Y
    IR2 (Augmentation rule) {X->Y} implies XZ -> YZ
    IR3 (Transitive rule) {X->Y,Y->Z} imples X->Z
    IR4 (Projection rule) {X->YZ} implies X->Y
    IR5 (Union rule) {X->Y,X->Z} implies X->YZ
    IR6 (Pseudotransitive rule) {X->Y,WY->Z} implies WX->Z

    Inference rules IR1 to IR3 (Armstrong) are SOUND and COMPLETE
    SOUND (inferred dependencies hold if the original set of dependencies hold for r(R))
    COMPLETE (don't need more rules)