Chapter 5 PDF
Chapter 5 PDF
Chapter 5 PDF
Dr.Amjad Hawash
An-Najah National University
Database Design and Programming
Chapter 5
Advanced Data Modeling
5.1 THE EXTENDED ENTITY
RELATIONSHIP MODEL
●
There has been an increasing need to capture more
information in the data model due to the increase in
complexity in data structures and applications.
●
The extended entity relationship model (EERM),
sometimes referred to as the enhanced entity relationship
model, is the result of adding more semantic constructs to
the original entity relationship (ER) model.
●
A diagram using this model is called an EER diagram
(EERD).
5.1.1 Entity Supertypes and
Subtypes
●
Grouping things according to some properties.
●
For instance, a retail company could group employees as
salaried and hourly employees, while a university could
group employees as faculty, staff, and administrators.
●
Benefits:
– It avoids unnecessary nulls in the employee attributes
when some employees have characteristics that are not
shared by other employees.
– It enables a particular employee type to participate in
relationships that are unique to that employee type.
5.1.1 Entity Supertypes and
Subtypes
●
For example, unlike other employees, pilots must meet
special requirements such as flight hour restrictions, flight
checks, and periodic training.
●
Therefore, if all employee characteristics and special
qualifications were stored in a single EMPLOYEE entity,
you would have a lot of nulls or you would have to make a
lot of needless dummy entries.
5.1.1 Entity Supertypes and
Subtypes
●
In this case, special pilot characteristics such as
EMP_LICENSE, EMP_RATINGS, and EMP_MED_TYPE
will generate nulls for employees who are not pilots.
●
In addition, pilots participate in some relationships that are
unique to their qualifications.
●
For example, not all employees can fly airplanes; only
employees who are pilots can participate in the “employee
flies airplane” relationship.
5.1.1 Entity Supertypes and
Subtypes
●
Based on the preceding discussion, you would correctly deduce
that the PILOT entity stores only those attributes that are unique
to pilots, and that the EMPLOYEE entity stores attributes that are
common to all employees.
●
Based on that hierarchy, you can conclude that PILOT is a
subtype of EMPLOYEE, and that EMPLOYEE is the supertype of
PILOT.
●
In modeling terms, an entity supertype is a generic entity type
that is related to one or more entity subtypes, where the entity
supertype contains the common characteristics, and the entity
subtypes contain the unique characteristics of each entity
subtype.
5.1.1 Entity Supertypes and
Subtypes
●
There are two criteria that help the designer determine
when to use subtypes and supertypes:
– There must be different, identifiable kinds or types of
the entity in the user’s environment.
– The different kinds or types of instances should have
one or more attributes that are unique to that kind or
type of instance.
5.1.2 Specialization Hierarchy
●
Entity supertypes and subtypes are organized in a
specialization hierarchy, which depicts the arrangement
of higher-level entity supertypes (parent entities) and
lower-level entity subtypes (child entities).
●
The relationships depicted within the specialization
hierarchy are sometimes described in terms of “is-a”
relationships.
●
For example, a pilot is an employee, a mechanic is an
employee, and an accountant is an employee.
5.1.2 Specialization Hierarchy
5.1.2 Specialization Hierarchy
●
Specialization hierarchies enable the data model to
capture additional semantic content (meaning) into the
ERD.
●
A specialization hierarchy provides the means to:
– Support attribute inheritance.
– Define a special supertype attribute known as the
subtype discriminator.
– Define disjoint/overlapping constraints and
complete/partial constraints.
5.1.3 Inheritance
●
Enables an entity subtype to inherit the attributes and relationships of
the supertype.
●
A supertype contains those attributes that are common to all of its
subtypes.
●
Subtypes contain only the attributes that are unique to the subtype.
●
Pilots, mechanics, and accountants all inherit the employee number, last
name, first name, middle initial, hire date, and so on from the
EMPLOYEE entity.
●
One important inheritance characteristic is that all entity subtypes inherit
their primary key attribute from their supertype.
●
Note in Figure 5.2 that the EMP_NUM attribute is the primary key for
each of the subtypes.
5.1.3 Inheritance
●
Entity subtypes inherit all relationships in which the
supertype entity participates.
●
For example, Figure 5.2 shows the EMPLOYEE entity
supertype participating in a 1:M relationship with a
DEPENDENT entity.
●
Through inheritance, all subtypes also participate in that
relationship.
●
In specialization hierarchies with multiple levels of supertype/
subtypes, a lower-level subtype inherits all of the attributes
and relationships from all of its upper-level supertypes.
5.1.4 Subtype Discriminator
●
Is the attribute in the supertype entity that determines to which
subtype the supertype occurrence is related.
●
As seen in Figure 5.2, the subtype discriminator is the employee
type (EMP_TYPE).
●
Using Figure 5.2 as your guide, note that the supertype is
related to a PILOT subtype if the EMP_TYPE has a value of “P.”
●
If the EMP_TYPE value is “M,” the supertype is related to a
MECHANIC subtype.
●
And if the EMP_TYPE value is “A,” the supertype is related to
the ACCOUNTANT subtype.
5.1.5 Disjoint and Overlapping
Constraints
●
An entity supertype can have disjoint or overlapping entity
subtypes.
●
For example, in the aviation example, an employee can be
a pilot or a mechanic or an accountant.
●
Disjoint subtypes, also known as nonoverlapping
subtypes, are subtypes that contain a unique subset of the
supertype entity set; in other words, each entity instance of
the supertype can appear in only one of the subtypes.
5.1.5 Disjoint and Overlapping
Constraints
●
Overlapping subtypes are subtypes that contain nonunique
subsets of the supertype entity set; that is, each entity instance
of the supertype may appear in more than one subtype.
●
For example, in a university environment, a person may be an
employee or a student or both.
●
In turn, an employee may be a professor as well as an
administrator.
●
Because an employee may also be a student, STUDENT and
EMPLOYEE are overlapping subtypes of the supertype
PERSON, just as PROFESSOR and ADMINISTRATOR are
overlapping subtypes of the supertype EMPLOYEE.
5.1.5 Disjoint and Overlapping
Constraints
5.1.6 Completeness Constraint
●
The completeness constraint specifies whether each entity
supertype occurrence must also be a member of at least one
subtype.
●
The completeness constraint can be partial or total.
●
Partial completeness (symbolized by a circle over a single
line) means that not every supertype occurrence is a member
of a subtype; that is, there may be some supertype
occurrences that are not members of any subtype.
●
Total completeness (symbolized by a circle over a double
line) means that every supertype occurrence must be a
member of at least one subtype.
5.1.6 Completeness Constraint
●
The ERDs in Figures 5.2 and 5.4 represent the
completeness constraint based on the Visio Category
shape.
●
A single horizontal line under the circle represents a partial
completeness constraint; a double horizontal line under
the circle represents a total completeness constraint.
5.1.7 Specialization and
Generalization
●
Specialization is the top-down process of identifying
lower-level, more specific entity subtypes from a higher-
level entity supertype.
●
Specialization is based on grouping unique characteristics
and relationships of the subtypes.
●
Generalization is the bottom-up process of identifying a
higher-level, more generic entity supertype from lower-
level entity subtypes.
●
Generalization is based on grouping common
characteristics and relationships of the subtypes.
5.2 ENTITY CLUSTERING
●
Few entities at the beginning of database design.
●
A lot of entities after that, could lead to crowed design that requires
some kind of clustering.
●
An entity cluster is a “virtual” entity type used to represent multiple
entities and relationships in the ERD.
●
An entity cluster is formed by combining multiple interrelated entities into
a single abstract entity object.
●
An entity cluster is considered “virtual” or “abstract” in the sense that it is
not actually an entity in the final ERD.
●
Instead, it is a temporary entity used to represent multiple entities and
relationships, with the purpose of simplifying the ERD and thus
enhancing its readability.
5.2 ENTITY CLUSTERING
5.3 ENTITY INTEGRITY:
SELECTING PRIMARY KEYS
●
The most important characteristic of an entity is its primary
key (a single attribute or some combination of attributes),
which uniquely identifies each entity instance.
●
The primary key’s function is to guarantee entity integrity.
●
Furthermore, primary keys and foreign keys work together
to implement relationships in the relational model.
●
Therefore, the importance of properly selecting the primary
key has a direct bearing on the efficiency and
effectiveness of database implementation.
5.3.1 Natural Keys and Primary Keys
●
A natural key or natural identifier is a real-world, generally
accepted identifier used to distinguish—that is, uniquely
identify—real-world objects.
●
Usually, if an entity has a natural identifier, a data modeler
uses that as the primary key of the entity being modeled.
●
Generally, most natural keys make acceptable primary key
identifiers.
5.3.2 Primary Key Guidelines
●
Why was EMP_NUM selected as a primary key of
EMPLOYEE and not a combination of EMP_LNAME,
EMP_FNAME, EMP_INITIAL, and EMP_DOB?
●
In particular, given a primary key value—that is, the
determinant—the relational model can determine the value
of all dependent attributes that “describe” the entity.
●
The function of the primary key is to guarantee entity
integrity, not to “describe” the entity.
●
Primary keys and foreign keys are used to implement
relationships among entities.
5.3.2 Primary Key Guidelines
●
PK characteristics:
– Unique values.
– Nonintelligent.
– No change over time.
– Preferably single-attribute.
– Preferably numeric.
– Security-compliant.
5.3.3 When to Use Composite
Primary Keys
●
Composite primary keys are particularly useful in two
cases:
– As identifiers of composite entities, where each primary
key combination is allowed only once in the M:N
relationship.
– As identifiers of weak entities, where the weak entity
has a strong identifying relationship with the parent
entity.
5.3.3 When to Use Composite
Primary Keys
●
To illustrate the first case, assume that you have a STUDENT
entity set and a CLASS entity set. In addition, assume that
those two sets are related in an M:N relationship via an
ENROLL entity set in which each student/class combination
may appear only once in the composite entity.
●
Figure 5.6 shows the ERD to represent such a relationship.
●
the composite primary key automatically provides the benefit
of ensuring that there cannot be duplicate values—that is, it
ensures that the same student cannot enroll more than once
in the same class.
5.3.3 When to Use Composite
Primary Keys
5.3.3 When to Use Composite
Primary Keys
●
In the second case, a weak entity in a strong identifying
relationship with a parent entity is normally used to
represent one of two situations:
– A real-world object that is existence-dependent on
another real-world object. Those types of objects are
distinguishable in the real world.
– A real-world object that is represented in the data model
as two separate entities in a strong identifying
relationship.
5.3.4 When to Use Surrogate Primary
Keys
●
There are some instances when a primary key doesn’t
exist in the real world or when the existing natural key
might not be a suitable primary key.
●
A surrogate key is a primary key created by the database
designer to simplify the identification of entity instances.
The surrogate key has no meaning in the user’s
environment.
5.3.4 When to Use Surrogate Primary
Keys
5.4 DESIGN CASES: LEARNING
FLEXIBLE DATABASE DESIGN
●
Data modeling and database design require skills that are
acquired through experience.
●
In turn, experience is acquired through practice—regular
and frequent repetition, applying the concepts learned to
specific and different design problems.
●
This section presents four special design cases that
highlight the importance of flexible designs, proper
identification of primary keys, and placement of foreign
keys.