Database Management System 16: Normalization

Download as pdf or txt
Download as pdf or txt
You are on page 1of 15

Normalization

Chittaranjan Pradhan

Database Management Normalization


Lossless Decomposition

System 16 Lossless - Join Algorithm


Dependency Preservation

Guidelines followed in

Normalization Designing Good


Database

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University
16.1
Normalization
Normalization
Chittaranjan Pradhan

Normalization
Normalization
Normalization is the process of decomposing or breaking a Lossless Decomposition

relation schema R into fragments (i.e. smaller schemas) R1 , Lossless - Join Algorithm
Dependency Preservation
R2 , ... Rn such that the following conditions hold: Guidelines followed in
Designing Good
• Lossless decomposition: The fragments should contain Database

the same information as the original relation


• Dependency preservation: All the functional
dependencies should be preserved within each fragment
Ri
• Good form: Each fragment Ri should be free from any
type of redundancy
In other words, normalization is the process of refining the
relational data model. It is used because of the following
reasons:
• It improves database design
• It ensures minimum redundancy of data
• It removes anomalies for database activities
16.2
Normalization
Lossless Decomposition
Chittaranjan Pradhan
Lossless Decomposition
The decomposition of a base relation is said to be lossless if Normalization
Lossless Decomposition
the original relation can be recovered back by joining the Lossless - Join Algorithm
Dependency Preservation
fragment relations
Guidelines followed in
Let R be the base relation, which is decomposed into R1 , R2 , ... Designing Good
Database
Rn . This decomposition is lossless iff R=R1 o
n R2 on... o
n Rn . In
other words, the consecutive fragments are interrelated by the
primary key and foreign key relationships
A B
a 1
R a 2
b 1
b 2

A B
R1 R2 a 1
b 2

A B
a 1
R1 o
n R2 a 2
b 1
b 2

As R1 o
n R2 = R, the decomposition is lossless 16.3
Normalization
Lossless Decomposition...
Chittaranjan Pradhan

Normalization
A B Lossless Decomposition
a 1
R a 2
Lossless - Join Algorithm
Dependency Preservation

b 1 Guidelines followed in
Designing Good
Database
A
R1 a
b

B
R2 1
2

A B
a 1
R1 o
n R2 a 2
b 1
b 2

n R2 6= R, the decomposition is lossy


As R1 o

16.4
Normalization
Lossless Decomposition...
Chittaranjan Pradhan
Lossless Decomposition...
When the base relation schema is decomposed into the Normalization
Lossless Decomposition
fragmented relation schemas, the consecutive relations should Lossless - Join Algorithm
Dependency Preservation
be related by primary key - foreign key pair on the common
Guidelines followed in
column; so that natural join be possible on the common Designing Good
Database
column. Thus, we have to check whether the common column
is the key of any relation or not:
• If the common column is the key, the decomposition is
lossless
• If the common column is not the key, the decomposition is
lossy

Ex: R(A, B, C, D) with F={A→B, B→C, C→D} is decomposed


to R1 (A, B, C) and R2 (C, D) with F1 ={A→B, B→C} and
F2 ={C→D} respectively

Ex: R(A, B, C, D) with F={A→B, B→C, D→C} is decomposed


to R1 (A, B, C) and R2 (C, D) with F1 ={A→B, B→C} and
F2 ={D→C} respectively
16.5
Normalization
Lossless - Join Algorithm
Chittaranjan Pradhan

Lossless - Join Algorithm


Normalization
Lossless Decomposition
This algorithm is used to check whether the decomposition of a Lossless - Join Algorithm

relation is lossless or lossy type. The steps are: Dependency Preservation

Guidelines followed in
• Construct a table with n columns, where n is the number of Designing Good
Database
attributes in the original relation and k rows, where k is the
number of decomposed relations. Label the columns as
A1 , A2 , ... An
• Fill the entries in this table as follows: for each attribute Ai ,
check if this attribute is one of the attributes of the relation
schema Rj
• If attribute Ai is in the Relation Rj , then the entry (Ai , Rj ) of
the table will be ai
• If attribute Ai is not an attribute in the relation Rj , then the
entry (Ai , Rj ) of the table will be bij
• For each of FD X→Y of F, do the following until it is not
possible to make any more changes to the table. When the
table no longer changes, continue with step4

16.6
Normalization
Lossless - Join Algorithm...
Chittaranjan Pradhan

Lossless - Join Algorithm... Normalization


Lossless Decomposition

• . Lossless - Join Algorithm


Dependency Preservation

• If there are two or more rows with the same value under the Guidelines followed in
attribute or attributes of the determinant X, then make equal Designing Good
Database
their entries under attribute Y (i.e. RHS of FD)
• When making equal two or more entries under any column, if
one of them is ai , then make all of them ai
• If none of them is ai , then make all the entries uniform by
considering one b term. If they are bij and bkl , chose one of
these two values as the representative value and make the
other values equal to it. That means, make all the entries as
either bij or bkl . Continue with step3
• If there are no two rows with the same value under the
attribute or attributes of the determinant X, continue with
step3
• Check the rows of the table. If there is a row with its
entries equal to a1 , a2 ... an , then the decomposition is
lossless. Otherwise, the decomposition is lossy

16.7
Normalization
Lossless - Join Algorithm...
Chittaranjan Pradhan

Normalization
Lossless Decomposition
Lossless - Join Algorithm... Lossless - Join Algorithm
Dependency Preservation
R=(X, Y, Z), F={X→Y, Z→Y }, R1 =(X, Y), R2 =(Y, Z) Guidelines followed in
Designing Good
Database
A1 (X) A2 (Y) A3 (Z)
R1
R2

A1 (X) A2 (Y) A3 (Z)


R1 a1 a2 b31
R2 b12 a2 a3

For X→Y, the table remains unchanged

For Z→Y, the table remains unchanged

Decomposition is lossy type

16.8
Normalization
Lossless - Join Algorithm...
Chittaranjan Pradhan

Lossless - Join Algorithm... Normalization


Lossless Decomposition
R=(X, Y, Z), F={X→Y, Y→Z }, R1 =(X, Y), R2 =(Y, Z) Lossless - Join Algorithm
Dependency Preservation

A1 (X) A2 (Y) A3 (Z) Guidelines followed in


Designing Good
R1 Database
R2

A1 (X) A2 (Y) A3 (Z)


R1 a1 a2 b31
R2 b12 a2 a3

For X→Y, the table remains unchanged

For Y→Z, the table will be modified


A1 (X) A2 (Y) A3 (Z)
R1 a1 a2 a3
R2 b12 a2 a3

Decomposition is lossless type

16.9
Normalization
Dependency Preservation
Chittaranjan Pradhan

Dependency Preservation Normalization


Lossless Decomposition
Lossless - Join Algorithm
The decomposition of a relation schema R with FDs F is a set Dependency Preservation

of fragment relations Ri with FDs Fi , where Fi is the subset of Guidelines followed in


Designing Good
dependencies in F + that include only attributes in Ri . The Database

decomposition is dependency preserving if and only if

(∪i Fi )+ = F +

R = (A, B, C) and F={A→B, B→C, A→C}

A B C
1 2 3
R 2 2 3
3 2 3
4 3 4

Key=A

16.10
Normalization
Dependency Preservation...
Chittaranjan Pradhan

Normalization
Lossless Decomposition
A B Lossless - Join Algorithm
1 2 Dependency Preservation

R1 = (A, B) and F={A→B} 2 2 Guidelines followed in


3 2 Designing Good
4 3 Database

A C
1 3
R2 = (A, C) and F={A→C} 2 3
3 3
4 4

The decomposition is lossless because R1 o


n R2 = R

But, the decomposition is not dependency preserving because


(F1 ∪ F2 )+ ={A→B, A→C}

Thus, (F1 ∪ F2 )+ 6= F + as we lost the FD B→C

16.11
Normalization
Dependency Preservation...
Chittaranjan Pradhan

Normalization
Lossless Decomposition
Lossless - Join Algorithm
A B Dependency Preservation
1 2
Guidelines followed in
R1 = (A, B) and F={A→B} 2 2 Designing Good
3 2 Database
4 3

B C
R2 = (B, C) and F={B→C} 2 3
3 4

The decomposition is lossless because R1 o


n R2 = R

Similarly, the decomposition is dependency preserving


because (F1 ∪ F2 )+ ={A→B, B→C, A→C}

Thus, (F1 ∪ F2 )+ = F +

16.12
Normalization
Guidelines followed in Designing Good Database
Chittaranjan Pradhan

Guideline 1
Normalization
Design a relation schema so that it is easy to explain its Lossless Decomposition
Lossless - Join Algorithm
meaning. Do not combine attributes from multiple entity sets Dependency Preservation

and relationship sets into a single relation Guidelines followed in


Designing Good
Database

If a relation schema corresponds to one entity set or one


relationship set, then the meaning tends to be simple and
clear. Otherwise, the relation corresponds to a mixture of
multiple entities and relationships and hence becomes
complex and unclear

Only foreign keys should be used to refer to other entities.


Entity and relationship attributes should be kept apart as much
as possible

Bottom Line: design the schemas that can be explained easily


relation by relation. In such cases, the semantics of attributes
should be easy to interpret
16.13
Normalization
Guidelines followed in Designing Good Database...
Chittaranjan Pradhan

Guideline 2
Design the base relation schemas in such a way that the Normalization
Lossless Decomposition
anomalies such as insertion, deletion, or updation anomalies Lossless - Join Algorithm
Dependency Preservation
are removed from the relations
Guidelines followed in
Designing Good
Database
If any anomalies are present, note them clearly and make sure
that the programs that modify (update) the database will
operate correctly

Guideline 3
Avoid placing attributes in a base relation whose values may
frequently be NULL

If NULLs are unavoidable, make sure that they apply in


exceptional cases only and do not apply to a majority of tuples
in the relation

Attributes that are NULL frequently could be placed in separate


relations (with the primary key)
16.14
Normalization
Guidelines followed in Designing Good Database...
Chittaranjan Pradhan

Normalization
Lossless Decomposition
Lossless - Join Algorithm
Dependency Preservation

Guidelines followed in
Designing Good
Database
Guideline 4
Design the relation schemas so that they can be joined in a
such a way that no spurious tuples are generated

Avoid relations that contain matching attributes that are not


(foreign key and primary key) combinations, because joining on
such attributes may produce spurious tuples

16.15

You might also like