dbms-unit-1_aktu
dbms-unit-1_aktu
dbms-unit-1_aktu
Objectives
At the end of this chapter the reader will be able to:
• Distinguish between data and information and Knowledge
• Distinguish between file processing system and DBMS
• Describe DBMS its advantages and disadvantages
• Describe data models, schemas and instances.
• Describe Database users including data base administrator
• Describe DBMS Architecture & Data Independence
• Describe Data Languages
• A database management system coordinates both the physical and the logical access to the data,
whereas a file-processing system coordinates only the physical access.
• A database management system reduces the amount of data duplication by ensuring that a physical
piece of data is available to all programs authorized to have access to it, whereas data written by one
program in a file-processing system may not be readable by another program.
• A database management system is designed to allow flexible access to data (i.e., queries), whereas a
file-processing system is designed to allow predetermined access to data (i.e., compiled programs).
• A database management system is designed to coordinate multiple users accessing the same data at
the same time. A file-processing system is usually designed to allow one or more programs to access
different data files at the same time. In a file-processing system, a file can be accessed by two
programs concurrently only if both programs have read-only access to the file.
• Redundancy is control in DBMS, but not in file system
• Unauthorized access is restricted in DBMS but not in file system.
Subject: Database Management System
• DBMS provide backup and recovery. When data is lost in file system then it not recover.
• DBMS provide multiple user interfaces. Data is isolated in file system
Disadvantages of a DBMS
• Danger of a Overkill: For small and simple applications for single users a database system is often
not advisable.
• Complexity: A database system creates additional complexity and requirements. The supply and
operation of a database management system with several users and databases is quite costly and
demanding.
• Qualified Personnel: The professional operation of a database system requires appropriately trained
staff. Without a qualified database administrator nothing will work for long.
• Costs: Through the use of a database system new costs are generated for the system itself but also
for additional hardware and the more complex handling of the system.
• Lower Efficiency: A database system is a multi-use software which is often less efficient than
specialized software which is produced and optimized exactly for one problem.
A database system is a collection of interrelated files and a set of programs that allow users to access
and modify these files. A major purpose of a database system is to provide users with an abstract view of
the data. That is, the system hides certain details of how the data are stored and maintained
Data Abstraction
Since many database-systems users are not computer trained, developers hide the complexity from users
through several levels of abstraction, to simplify users’ interactions with the system:
• Physical level. The lowest level of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in detail.
Subject: Database Management System
• Logical level. The next-higher level of abstraction describes what data are stored in the database,
and what relationships exist among those data. The logical level thus describes the entire
database in terms of a small number of relatively simple structures. Although implementation of
the simple structures at the logical level may involve complex physical-level structures, the user
of the logical level does not need to be aware of this complexity. Database administrators, who
must decide what information to keep in the database, use the logical level of abstraction.
• View level. The highest level of abstraction describes only part of the entire database. The view
level of abstraction exists to simplify their interaction with the system. The system may provide
many views for the same database.
Database Instance
The collection of information stored in the database at a particular moment is called an instance of the
database. A database instance is a state of operational database with data at any given time. It contains a
snapshot of the database. Database instances tend to change with time. A DBMS ensures that its every
instance (state) is in a valid state, by diligently following all the validations, constraints, and conditions
that the database designers have imposed.
Database Schema
The overall design of the database is called the database schema. Database systems have several
schemas, partitioned according to the levels of abstraction.
• The physical schema describes the database design at the physical level,
• The logical schema describes the database design at the logical level.
• A database may also have several schemas at the view level, sometimes called sub-schemas, that
describe different views of the database.
Database schema skeleton structure of and it represents the logical view of entire database. It tells about how the data is
organized and how relation among them is associated. A database schema defines its entities and the relationship
among them. Database schema is a descriptive detail of the database, which can be depicted by means of schema
diagrams.
Subject: Database Management System
Data Independence
Data independence is the feature that allows the schema of one layer of the database system to be
changed without any impact on the schema of the next higher level of the database system. Through data
independence, we can build an environment in which data is independent of all programs, and through
the three schema architectures, data independence will be more understandable.
Data Models
Data model: A collection of conceptual tools for describing data, data relationships, data semantics, and
consistency constraints.
Entity-Relationship Model
The entity-relationship (E-R) data model perceives the real world as consisting of basic objects, called
entities, and relationships among these objects. It was developed to facilitate database design by allowing
specification of an enterprise schema, which represents the overall logical structure of a database.
The E-R model is very useful in mapping the meanings and interactions of real-world enterprises onto a
conceptual schema. Because of this usefulness, many database-design tools draw on concepts from the E-R
model.
Entity
An entity can be a real-world object, either animate or inanimate, that can be easily identifiable. For
example, in a school database, students, teachers, classes, and courses offered can be considered as entities.
All these entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities with attribute
sharing similar values. For example, a Students set may contain all the students of a school; likewise a
Teachers set may contain all the teachers of a school from all faculties. Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values. For
example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's name
cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.
Types of Attributes
• Simple attribute − Simple attributes are atomic values, which cannot be divided further. For
example, a student's BirthsDate is an atomic.
• Composite attribute − Composite attributes are made of more than one simple attribute. For
example, a student's complete name may have first_name and last_name.
Derived attribute − Derived attributes are the attributes that do not exist in the physical database, but their
values are derived from other attributes present in the database. For another example, age can be derived from
BirthDate.
Relationship: The logical association among entities is called relationship. Relationships are mapped with
entities in various ways. Mapping cardinalities define the number of association between two entities.
A. Relational model
The most common model, the relational model sorts data
into tables, also known as relations, each of which
consists of columns and rows. Each column lists an
attribute of the entity in question, such as price, zip code,
or birth date. Together, the attributes in a relation are
called a domain. A particular attribute or combination of
attributes is chosen as a primary key that can be referred
to in other tables, when it’s called a foreign key.
Each row, also called a tuple, includes data about a
specific instance of the entity in question, such as a
particular employee.
The model also accounts for the types of relationships
between those tables, including one-to-one, one-to-many,
and many-to-many relationships. Here’s an example:
Within the database, tables can be normalized, or brought to comply with normalization rules that make the
database flexible, adaptable, and scalable. When normalized, each piece of data is atomic, or broken into the
smallest useful pieces.
B. Hierarchical model
The hierarchical model organizes data into a tree-like structure, where each record has a single parent or root.
Sibling records are sorted in a particular order. That order is used as the physical order for storing the
Subject: Database Management System
database. This model is good for describing many real-world relationships.
C. Network model
The network model builds on the hierarchical model by allowing
many-to-many relationships between linked records, implying
multiple parent records. Based on mathematical set theory, the model
is constructed with sets of related records. Each set consists of one
owner or parent record and one or more member or child records. A
record can be a member or child in multiple sets, allowing this model
to convey complex relationships.
Database Languages
A database system provides a data definition language to specify the database schema and a data
manipulation language to express database queries and updates. The data definition and data manipulation
languages are not two separate languages; instead, they simply form parts of a single database language,
such as the widely used SQL language.
A data-manipulation language (DML) is a language that enables users to access or manipulate data as
organized by the appropriate data model. There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
Subject: Database Management System
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data
are needed without specifying how to get those data.
Data Control Language (DCL): DCL is used for granting and revoking user access on a database
• To grant access to user – GRANT
• To revoke access from user – REVOKE
Transaction Control Language (TCL) : Statements are used to manage the changes made by DML
statements. It allows statements to be grouped together into logical transactions.
• COMMIT - save work done
• SAVEPOINT - identify a point in a transaction to which you can later roll back
• ROLLBACK - restore database to original since the last COMMIT
• SET TRANSACTION - Change transaction options like isolation level and what rollback
segment to use.
Database Users
Database users are the one who really use and take the benefits of database. There will be different types
of users depending on their need and way of accessing the database.
• Naive Users - these are the users who use the existing application to interact with the database.
For example, online library system, ticket booking systems, ATMs etc. which has existing
application and users use them to interact with the database to fulfill their requests.
• Application Programmers - They are the developers who interact with the database by means
of DML queries. These DML queries are written in the application programs like C, C++, JAVA,
Pascal etc. These queries are converted into object code to communicate with the database. For
example, writing a C program to generate the report of employees who are working in particular
department will involve a query to fetch the data from database. It will include a embedded SQL
query in the C Program.
• Sophisticated Users - They are database developers, who write SQL queries to
select/insert/delete/update data. They do not use any application or programs to request the
database. They directly interact with the database by means of query language like SQL. These
users will be scientists, engineers, analysts who thoroughly study SQL and DBMS to apply the
concepts in their requirement. In short, we can say this category includes designers and developers
of DBMS and SQL.
• Specialized Users - These are also sophisticated users, but they write special database
application programs. They are the developers who develop the complex programs to the
requirement.
• Stand-alone Users - These users will have stand –alone database for their personal use. These
kinds of database will have readymade database packages which will have menus and graphical
interfaces.
Database Administrator
One of the main reasons for using DBMSs is to have central control of both the data and the programs
that access those data. A person who has such central control over the system is called a database
administrator (DBA). The functions of a DBA include:
• Schema definition. The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
• Schema and physical-organization modification. The DBA carries out changes to the schema
and physical organization to reflect the changing needs of the organization, or to alter the
physical organization to improve performance.
• Granting of authorization for data access. By granting different types of authorization, the
Subject: Database Management System
database administrator can regulate which parts of the database various users can access. The
authorization information is kept in a special system structure that the database system consults
whenever someone attempts to access the data in the system.
• Routine maintenance. Examples of the database administrator‘s routine maintenance activities
are:
o Periodically backing up the database, either onto tapes or onto remote servers, to prevent
loss of data in case of disasters such as flooding.
o Ensuring that enough free disk space is available for normal operations, and upgrading
disk space as required.
o Monitoring jobs running on the database and ensuring that performance is not degraded
by very expensive tasks submitted by some users.
1. Two-tier Architecture: In two-tier architecture, the application is partitioned into a component that
resides at the client machine, which invokes database system functionality at the server machine
through query language statements. Application program interface standards like ODBC and JDBC
are used for interaction between the client and the server.
2. Three-tier Architecture: In a three-tier architecture, the client machine acts as merely a front end
and does not contain any direct database calls. Instead, the client end communicates with an
application server, usually through a forms interface. The application server in turn communicates
with a database system to access data. The business logic of the application, which says what actions
to carry out under what conditions, is embedded in the application server, instead of being
distributed across multiple clients. Three-tier applications are more appropriate for large
applications, and for applications that run on theWorld WideWeb.
Subject: Database Management System
Other Architecture:
• Centralized Systems
• Client-Server Systems
Centralized Systems
• Run on a single computer system and do not interact with other computer systems.
• General-purpose computer system: one to a few CPUs and a number of device controllers that
are connected through a common bus that provides access to shared memory.
• Single-user system (e.g., personal computer or workstation): desk-top unit, single user, usually
has only one CPU and one or two hard disks; the OS may support only one user.
• Client-Server Systems: Server systems satisfy requests generated at n‘client systems, whose general structure is
shown below:
Storage Manager:
The storage manager is the component of a database system that provides the interface between the low-
level data stored in the database and the application programs and queries submitted to the system. The
storage manager is responsible for storing, retrieving, and updating data in the database.
Relationship
The association among entities is called a relationship. For example, an employee works_at a
department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set: A set of relationships of similar type is called a relationship set. Like entities, a
relationship too can have attributes. These attributes are called descriptive attributes.
Degree of Relationship: The number of participating entities in a relationship defines the degree of the
relationship.
• Binary = degree 2
• Ternary = degree 3
• n-ary = degree n
Mapping Cardinalities
Cardinality defines the number of entities in one entity set, which can be associated with the number of
entities of other set via relationship set.
• One-to-one − One entity from entity set A can be associated with at most one entity of entity set
B and vice versa.
• One-to-many − One entity from entity set A can be associated with more than one entities of
entity set B however an entity from entity set B, can be associated with at most one entity.
• Many-to-one − More than one entities from entity set A can be associated with at most one
entity of entity set B, however an entity from entity set B can be associated with more than one
entity from entity set A.
Subject: Database Management System
• Many-to-many − One entity from A can be associated with more than one entity from B and
vice versa.
Participation Constraints
Participation constraints in a database management system (DBMS) define whether entities on one side of
a relationship must be related to entities on the other side. There are two types of participation constraints:
total participation and partial participation.
1. Total Participation − Each entity is involved in the relationship. Total participation is represented by
double lines.
2. Partial participation − Not all entities are involved in the relationship. Partial participation is
represented by single lines.
Subject: Database Management System
Entity-Relationship Symbols:
Weak Entity Sets: An entity set may not have sufficient attributes to form a primary key. Such an entity set
is termed a weak entity set. An entity set that has a primary key is termed a strong entity set.
For a weak entity set to be meaningful, it must be associated with another entity set, called the identifying or
owner entity set. Every weak entity must be associated with an identifying entity; that is, the weak entity set
is said to be existence dependent on the identifying entity set. The identifying entity set is said to own the
weak entity set that it identifies. The relationship associating the weak entity set with the identifying entity set
is called the identifying relationship. The identifying relationship is many to one from the weak entity set to
the identifying entity set, and the participation of the weak entity set in the relationship is total.
Subject: Database Management System
Generalization, Specialization and Aggregation in ER Model
Using the ER model for bigger data creates a lot of complexity while designing a database model, So in order to
minimize the complexity Generalization, Specialization, and Aggregation were introduced in the ER model and
these were used for data abstraction in which an abstraction mechanism is used to hide details of a set of
objects. Some of the terms were added to the Enhanced ER Model, where some new concepts were added.
These new concepts are:
• Generalization
• Specialization
• Aggregation
Specialization
In specialization, an entity is divided into sub-entities based on
its characteristics. It is a top-down approach where the higher-
level entity is specialized into two or more lower-level entities.
For Example, an EMPLOYEE entity in an Employee
management system can be specialized into DEVELOPER,
TESTER, etc. as shown in Figure. In this case, common
attributes like E_NAME, E_SAL, etc. become part of a higher
entity (EMPLOYEE), and specialized attributes like
TES_TYPE become part of a specialized entity (TESTER).
Specialization is also called as ” Top-Down approach”.
Generalization:
Generalization is the process of extracting common
properties from a set of entities and creating a generalized
entity from it. It is a bottom-up approach in which two or
more entities can be generalized to a higher-level entity if
they have some attributes in common. For Example,
STUDENT and FACULTY can be generalized to a higher-
level entity called PERSON. In this case, common attributes
like P_NAME, and P_ADD become part of a
higher entity (PERSON), and specialized attributes like
S_FEE become part of a specialized entity (STUDENT).
Aggregation:
One limitation of the E-R model is that it cannot express relationships among relationships. The best way to
model a situation such as the one just described is to use aggregation. Aggregation is an abstraction through
which relationships are treated as higher level entities.
For example, the following figure shows the ternary relationship works-on, which we saw earlier, between
a employee, branch, and job.
Subject: Database Management System
There is redundant information in the figure, however, since every employee, branch, job combination in
manages is also in works-on. That is, from the above figure following results may be depicted:
• Employee manages job
• Job manages Employee
• Employee manages branch and vice-versa
2. An E-R diagram for a hospital with a set of patients and a set of medical doctors. Associate with
each patient a log of the various tests and examinations conducted.
3. An E-R diagram for a car-insurance company whose customers own one or more cars each. Each
car has associated with it zero to any number of recorded accidents.
5. Consider a database used to record the marks that students get in different exams of different
course offerings.
a. Construct an E-R diagram that models exams as entities, and uses a ternary relationship,
for the above database
Subject: Database Management System
6. Construct an alternative E-R diagram that uses only a binary relationship between students and
course-offerings. Make sure that only one relationship exists between a particular student and
course-offering pair, yet you can represent the marks that a student gets in different exams of a
course offering.
7. Design an E-R diagram for keeping track of the exploits of your favourite sports team. You
should store the matches played, the scores in each match, the players in each match and
individual player statistics for each match. Summary statistics should be modeled as derived
attributes