Database Management Quantum Seb
Database Management Quantum Seb
Database Management Quantum Seb
QUANTUM SERIES
For
B.Tech Students of Third Year
of All Engineering Colleges Affiliated to
Dr. A.P.J. Abdul Kalam Technical University,
Uttar Pradesh, Lucknow
(Formerly Uttar Pradesh Technical University)
By
Anjana Gautam
TM
CONTENTS
KCS–501 : DATABASE MANAGEMENT SYSTEM
1 Introduction
CONTENTS
Part-1 : Overview, Database .................................... 1–2A to 1–8A
System vs File System,
Database System Concept
and Architecture
PART-1
Overview, Database System vs File System,
Database System Concept and Architecture.
Questions-Answers
Answer
1. Database management system (DBMS) is a software which is use to
manage the database. For example, MySQL, Oracle, are commercial
database which is used in different applications.
2. DBMS provides an interface to perform various operations like database
creation, storing data, updating data, creating a table in the database
etc.
3. It provides protection and security to the database. In case of multiple
users, it also maintains data consistency.
DBMS allows users the following tasks :
1. Data definition : It is used for creation, modification, and removal of
database objects that defines the organization of data in the database.
2. Data updation : It is used for the insertion, modification, and deletion
of the actual data in the database.
3. Data retrieval : It is used to retrieve the data from the database
which can be used by applications for various purposes.
4. User administration : It is used for registering and monitoring users,
maintaining data integrity, enforcing data security, dealing with
concurrency control, monitoring performance and recovering
information corrupted by unexpected failure.
Answer
Advantages of DBMS :
1. Database redundancy : It controls data redundancy because it stores
all the data in one single database file and that recorded data is placed
in the database.
Database Management System 1–3 A (CS/IT-Sem-5)
Answer
Database users are the one who use and take the benefits of database. The
different types of users depending on the need and way of accessing the
database are :
1. Application programmers :
a. They are the developers who interact with the database by means
of DML queries.
b. These DML queries are written in the application programs like C,
C++, JAVA, Pascal etc.
c. These queries are converted into object code to communicate with
the database.
2. Sophisticated users :
a. They are database developers, who write SQL queries to select/
insert/delete/update data.
b. They directly interact with the database by means of query language
like SQL.
c. These users can be scientists, engineers, analysts who thoroughly
study SQL and DBMS to apply the concepts in their requirement.
Introduction 1–4 A (CS/IT-Sem-5)
3. Specialized users :
a. These are also sophisticated users, but they write special database
application programs.
b. They are the developers who develop the complex programs
according to the requirement.
4. Standalone users :
a. These users will have standalone database for their personal use.
b. These kinds of database will have predefined database packages
which will have menus and graphical interfaces.
5. Native users :
a. These are the users who use the existing application to interact
with the database.
b. For example, online library system, ticket booking systems, ATMs
etc.
Que 1.4. Who are data administrators ? What are the functions
of database administrator ?
OR
Discuss the role of database administrator.
AKTU 2017-18, Marks 10
Answer
Database administrators are the personnel’s who has control over data and
programs used for accessing the data.
Functions/role of database administrator (DBA) :
1. Schema definition :
a. Original database schema is defined by DBA.
b. This is accomplished by writing a set of definitions, which are
translated by the DDL compiler to a set of labels that are permanently
stored in the data dictionary.
2. Storage structure and access method definition :
a. The creation of appropriate storage structure and access method.
b. This is accomplished by writing a set of definitions, which are
translated by the data storage and definition language compiler.
3. Schema and physical organization and modification :
a. Modification of the database schema or the description of the physical
storage organization.
b. These changes are accomplished by writing a set of definition to do
modification to the appropriate internal system tables.
Database Management System 1–5 A (CS/IT-Sem-5)
Answer
Data abstraction is the process of finding irrelevant details from user i.e.,
hiding the background details from the users.
Different levels of data abstraction :
View level
View 1 View 1 ............ View n
Logical level
Physical level
1. Physical level :
i. Physical level is the lowest level of abstraction and describes how
the data are actually stored.
ii. The physical level describes the complex low-level data structures
in details.
2. Logical level :
i. Logical level is the next-higher level of abstraction and it describes
what data are stored in the database, and what relationship exists
among those data.
ii. The logical level thus describes the entire database in terms of a
small number of relatively simple structures.
3. View level :
i. View level is the highest level of abstraction; it describes only part
of the entire database.
ii. The view level of abstraction exists to simplify their interaction
with the system.
iii. The system may provide many views for the same database.
Introduction 1–6 A (CS/IT-Sem-5)
Answer
Answer
Que 1.8. Discuss the architecture of DBMS. What are the types
of DBMS architecture ?
Answer
1. The DBMS design depends upon its architecture. The basic client/
server architecture is used to deal with a large number of PCs, web
servers, database servers and other components that are connected
with networks.
2. DBMS architecture depends upon how users are connected to the
database to get their request done.
Types of DBMS architecture :
i. 1-Tier architecture :
1. In this architecture, the database is directly available to the user.
2. Any changes done are directly done on the database itself. It does
not provide a handy tool for end users.
3. The 1-Tier architecture is used for development of the local
application, where programmers can directly communicate with
the database for the quick response.
ii. 2-Tier architecture :
1. The 2-Tier architecture is same as basic client-server.
2. In the two-tier architecture, applications on the client end can
directly communicate with the database at the server side. For
this interaction, API’s such as : ODBC, JDBC are used.
2. The user interfaces and application programs are run on the
client-side.
Introduction 1–8 A (CS/IT-Sem-5)
3. The server side is responsible to provide the functionalities like
query processing and transaction management.
4. To communicate with the DBMS, client-side application establishes
a connection with the server side.
Application
Client
User
Database
Server
Application server
Application client
Client
User
PART-2
Data Model Schema and Instances, Data Independence and
Database Language and Interfaces, Data Definition
Language, DML, Overall Database Structure.
Questions-Answers
Que 1.9. What are data models ? Briefly explain different types of
data models.
Answer
Data models :
1. Data models define how the logical structure of the database is modeled.
2. Data models are a collection of conceptual tools for describing data, data
relationships, data semantics and consistency constraints.
3. Data models define how data is connected to each other and how they
are processed and stored inside the system.
Types of data models :
1. Entity relationship model :
a. The entity relationship (ER) model consists of a collection of basic
objects, called entities and of relationships among these entities.
b. Entities are represented by means of their properties, called
attributes.
2. Relational model :
a. The relational model represents data and relationships among data
by a collection of tables, each of which has a number of columns
with unique names.
Introduction 1–10 A (CS/IT-Sem-5)
b. Relational data model is used for data storage and processing.
c. This model is simple and it has all the properties and capabilities
required to process data with storage efficiency.
3. Hierarchical model :
a. In hierarchical model data elements are linked as an inverted tree
structure (root at the top with branches formed below).
b. Below the single root data element are subordinate elements each
of which in turn has its own subordinate elements and so on, the
tree can grow to multiple levels.
c. Data element has parent child relationship as in a tree.
4. Network model :
a. This model is the extension of hierarchical data model.
b. In this model there exist a parent child relationship but a child data
element can have more than one parent element or no parent at
all.
5. Object-oriented model :
a. Object-oriented models were introduced to overcome the
shortcomings of conventional models like relational, hierarchical
and network model.
b. An object-oriented database is collection of objects whose behaviour,
state, and relationships are defined in accordance with object-
oriented concepts (such as objects, class, etc.).
Answer
1. The description of a database is called the database schema, which
specified during database design and is not expected to change
frequently.
2. Most of the data models have certain convention for displaying schema
as diagram which is called as schema diagram.
3. A schema diagram displays only some aspects of a schema, such as the
names of record types and data items, and some types of constraints.
For example : Schema diagram for studentinfo database
Student (Name, Student_number, Class, Branch)
Course (Course_name, Course_number, Department)
Instances :
1. The data in the database at a particular moment is called a database
state or snapshot. It is also called the current set of occurrences or
instances in the database.
Database Management System 1–11 A (CS/IT-Sem-5)
2. In a database state, each schema construct has its own current set of
instances.
3. Many database states can be constructed to correspond to a particular
database schema. Every time we insert or delete a record or change the
value of a data item in a record, we change one state of the database into
another state.
Answer
Data independence : Data independence is defined as the capacity to change
the schema at one level of a database system without having to change the
schema at the next higher level.
Types of data independence :
1. Physical data independence :
a. Physical data independence is the ability to modify internal schema
without changing the conceptual schema.
b. Modification at the physical level is occasionally necessary in order
to improve performance.
c. It refers to the immunity of the conceptual schema to change in the
internal schema.
d. Examples of physical data independence are reorganizations of
files, adding a new access path or modifying indexes, etc.
2. Logical data independence :
a. Logical data independence is the ability to modify the conceptual
schema without having to change the external schemas or
application programs.
b. It refers to the immunity of the external model to changes in the
conceptual model.
c. Examples of logical data independence are addition/removal of
entities.
Answer
Classification of database languages :
1. Data Definition Language (DDL) :
a. DDL is set of SQL commands used to create, modify and delete
database structures but not data.
b. They are used by the DBA to a limited extent, a database designer,
or application developer.
c. Create, drop, alter, truncate are commonly used DDL command.
2. Data Manipulation Language (DML) :
a. A DML is a language that enables users to access or manipulates
data as organized by the appropriate data model.
b. There are two types of DMLs :
i. Procedural DMLs : It requires a user to specify what data
are needed and how to get those data.
ii. Declarative DMLs (Non-procedural DMLs) : It requires
a user to specify what data are needed without specifying
how to get those data.
c. Insert, update, delete, query are commonly used DML commands.
3. Data Control Language (DCL) :
a. It is the component of SQL statement that control access to data
and to the database.
b. Commit, rollback command are used in DCL.
4. Data Query Language (DQL) :
a. It is the component of SQL statement that allows getting data
from the database and imposing ordering upon it.
b. It includes select statement.
5. View Definition Language (VDL) :
1. VDL is used to specify user views and their mapping to conceptual
schema.
2. It defines the subset of records available to classes of users.
3. It creates virtual tables and the view appears to users like conceptual
level.
4. It specifies user interfaces.
SQL is a DML language.
Answer
Database languages : Refer Q. 1.12, Page 1–11A, Unit-1.
Examples :
DDL :
CREATE, ALTER, DROP, TRUNCATE, COMMENT, GRANT, REVOKE
statement
DML :
INSERT, UPDATE, DELETE statement
DCL :
GRANT and REVOKE statement
DQL :
SELECT statement
VDL :
1. create view emp5 as
select * from employee
where dno = 5 ;
Creates view for dept 5 employees.
2. create view empdept as
select fname, lname, dno, dname
from employee, department
where dno=dnumber ;
Creates view using two tables.
Que 1.14. Explain DBMS interfaces. What are the various DBMS
interfaces ?
Answer
DBMS interfaces : A database management system (DBMS) interface is a
user interface which allows for the ability to input queries to a database
without using the query language itself.
Various DBMS interfaces are :
1. Menu-based interfaces for web clients or browsing :
a. These interfaces present the user with lists of options (called menus)
that lead the user through the formulation of a request.
b. Pull-down menus are a very popular technique in Web-based user
interfaces.
c. They are also often used in browsing interfaces, which allow a user
to look through the contents of a database in an exploratory and
unstructured manner.
Introduction 1–14 A (CS/IT-Sem-5)
2. Forms-based interfaces :
a. A forms-based interface displays a form to each user.
b. Users can fill out all of the form entries to insert new data, or they
can fill out only certain entries, in which the DBMS will retrieve
matching data for the remaining entries.
3. Graphical user interfaces (GUI) :
a. A GUI typically displays a schema to the user in diagrammatic
form.
b. The user then can specify a query by manipulating the diagram. In
many cases, GUIs utilize both menus and forms.
4. Natural language interfaces :
a. A natural language interface has its own schema, which is similar
to the database conceptual schema, as well as a dictionary of
important words.
b. The natural language interface refers to the words in its schema,
as well as to the set of standard words in its dictionary to interpret
the request.
c. If the interpretation is successful, the interface generates a high-
level query corresponding to the natural language request and
submits it to the DBMS for processing; otherwise, a dialogue is
started with the user to clarify the request.
5. Speech input and output :
a. The speech input is detected using a library of predefined words
and used to set up the parameters that are supplied to the queries.
b. For output, a similar conversion from text or numbers into speech
takes place.
6. Interfaces for the DBA :
a. Most database systems contain privileged commands that can be
used only by the DBA’s staff.
b. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema,
and reorganizing the storage structures of a database.
Que 1.15. Briefly describe the overall structure of DBMS.
OR
Draw the overall structure of DBMS and explain its components in
brief. AKTU 2018-19, Marks 07
Answer
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system. The functional components of a database
system can be broadly divided into two components :
Database Management System 1–15 A (CS/IT-Sem-5)
Query Evaluation
Engine
Query processor
Storage manager
Indices Data
Disk storage
Dictionary
PART-3
Data Modeling using the Entity Relationship Model : ER Model Concepts,
Notation for ER Diagram, Mapping Constraints.
Questions-Answers
Answer
An entity relationship model (ER model) is a way of representing the entities
and the relationships between the entities in order to create a database.
Database Management System 1–17 A (CS/IT-Sem-5)
Elements/notation of ER model/diagram :
1. Entity :
a. An entity is a real world object that can be easily identifiable.
b. An entity can be abstract.
c. An entity is an object that exists and is distinguishable from other
objects.
2. Entity set :
a. Entity set is a collection of similar type of entities.
b. An entity set may contain entities with attribute sharing similar
values.
3. Attribute :
a. An attribute gives the characteristics of the entity.
b. It is also called as data element, data field, a field, a data item, or an
elementary item.
4. Relationship :
a. A relationship is the association between entities or entity
occurrence.
b. Relationship is represented by diamond with straight lines
connecting the entities.
Que 1.17.
3.17. What do you understand by attributes and domain ?
Explain various types of attributes used in conceptual data model.
Answer
Attributes :
1. Attributes are properties which are used to represent the entities.
2. All attributes have values. For example, a student entity may have
name, class, and age as attributes.
3. There exists a domain or range of values that can be assigned to attributes.
4. For example, a student’s name cannot be a numeric value. It has to be
alphabetic. A student’s age cannot be negative, etc.
Domain :
1. A domain is an attribute constraint which determines the type of data
values that are permitted for that attribute.
2. Attribute domains can be very large, or very short.
Types of attributes used in conceptual data model :
1. Simple attribute : Simple attributes are atomic values, which cannot
be divided further. For example, a student’s phone number is an atomic
value of 10 digits.
Introduction 1–18 A (CS/IT-Sem-5)
2. Composite attribute : Composite attributes are made of more than
one simple attribute. For example, a student's complete name may have
first_name and last_name.
3. Derived attribute : Derived attributes are the attributes that do not
exist in the physical database, but their values are derived from other
attributes present in the database. For example, average_salary in a
department should not be saved directly in the database, instead it can
be derived.
4. Single-value attribute : Single-value attributes contain single value.
For example, Social_Security_Number.
5. Multi-value attribute : Multi-value attributes may contain more than
one values. For example, a person can have more than one phone
number, email_address, etc.
Answer
Purpose of the ER diagram :
1. ER diagram is used to represent the overall logical structure of the
database.
2. ER diagrams emphasis on the schema of the database and not on the
instances because the schema of the database is changed rarely.
3. It is useful to communicate the logical structure of database to end
users.
4. It serves as a documentation tool.
5. It helps the database designer in understanding the information to be
contained in the database.
ER diagram :
Course
Professors Department University Course duration
type
Course Course
name department
Answer
Que 1.20.
Que 1.19.
Availability
Payee name
Initiates Work Available
Payment mode
Database Management System
Fig.1.19.1.
Confirmation
Number of Product ID
items
Store No.
database, assuming your own data requirements.
Shipping type
Answer
In this ER diagram, the main entity sets are student, course, course offering
and instructor. The entity set course offering is a weak entity set dependent
on course. The assumptions made are :
a. A class meets only at one particular place and time. This ER diagram
cannot model a class meeting at different places at different times.
b. There is no guarantee that the database does not have two classes
meeting at the same place and time.
s_id name time secno room i_id name
enrolls course
student offerings teaches instructor
syllabus courseno
prerequisite
requires course title
maincourse
credits
Answer
1. Mapping constraints act as a rule followed by contents of database.
2. Data in the database must follow the constraints.
Types of mapping constraints are :
1. Mapping cardinalities :
a. Mapping cardinalities (or cardinality ratios) specifies the number of
entities of which another entity can be associated via a relationship
set.
b. Mapping cardinalities are used in describing binary relationship
sets, although they contribute to the description of relationship
sets that involve more than two entity sets.
c. For binary relationship set R between entity sets A and B, the
mapping cardinality must be one of the following :
i. One to one : An entity in A is associated with at most one
entity in B and an entity in B is associated with at most one
entity in A.
A B
a1 b1
a2 b2
a3 b3
a4 b4
Fig. 1.21.1.
b1
a1 b2
a2 b3
a3 b4
b5
Fig. 1.21.2.
a1
a2 b1
a3 b2
a4 b3
a5
Fig. 1.21.3.
iv. Many to many : An entity in A is associated with any number
of entities in B, and an entity in B is associated with any number
of entities in A.
A B
a1 b1
a2 b2
a3 b3
a4 b4
Fig. 1.21.4.
PART-4
Keys, Concept of Super Key, Candidate Key,
Primary Key, Generalization, Aggregation.
Questions-Answers
Que 1.22. Discuss the candidate key, primary key, super key,
composite key and alternate key.
Database Management System 1–23 A (CS/IT-Sem-5)
OR
Explain the primary key, super key, foreign key and candidate key
with example. AKTU 2017-18, Marks 10
OR
Define key. Explain various types of keys.
AKTU 2019-20, Marks 07
Answer
1. Key is a attribute or set of attributes that is used to identify data in
entity sets.
2. Key is defined for unique identification of rows in table.
Consider the following example of an Employee table :
Employee (EmployeeID, FullName, SSN, DeptID)
Various types of keys are :
1. Primary key :
a. Primary key uniquely identifies each record in a table and must
never, be the same for records. Here in Employee table we can
choose either EmployeeID or SSN columns as a primary key.
b. Primary key is a candidate key that is used for unique identification
of entities within the table.
c. Primary key cannot be null.
d. Any table has a unique primary key.
2. Super key :
a. A super key for an entity is a set of one or more attribute whose
combined value uniquely identifies the entity in the entity set.
b. For example : Here in employee table (EmployeeID, FullName) or
(EmployeeID, FullName, DeptID) is a super key.
3. Candidate key :
a. A candidate key is a column, or set of column, in the table that can
uniquely identify any database record without referring to any
other data.
b. Candidate key are individual columns in a table that qualifies for
uniqueness of all the rows. Here in Employee table EmployeeID
and SSN are candidate keys.
c. Minimal super keys are called candidate keys.
Introduction 1–24 A (CS/IT-Sem-5)
4. Composite key :
a. A composite key is a combination of two or more columns in a
table that can be used to uniquely identify each row in the table.
b. It is used when we cannot identify a record using single attributes.
c. A primary key that is made by the combination of more than one
attribute is known as a composite key.
5. Alternate key :
a. The alternate key of any table are those candidate keys which are
not currently selected as the primary key.
b. Exactly one of those candidate keys is chosen as the primary key
and the remainders, if any are then called alternate keys.
c. An alternate key is a function of all candidate keys minus the
primary key.
d. Here in Employee table if EmployeeID is primary key then SSN
would be the alternate key.
6. Foreign key :
a. Foreign key represents the relationship between tables and
ensures the referential integrity rule.
b. A foreign key is derived from the primary key of the same or some
other table.
c. Foreign key is the combination of one or more columns in a table
(parent table) at references a primary key in another table (child
table).
d. A foreign key value can be left null.
For example : Consider another table :
Project (ProjectName, TimeDuration, EmployeeID)
a. Here, the ‘EmployeeID’ in the ‘Project’ table points to the
‘EmployeeID’ in ‘Employee’ table
b. The ‘EmployeeID’ in the ‘Employee’ table is the primary key.
c. The ‘EmployeeID’ in the ‘Project’ table is a foreign key.
Que 1.23. What do you mean by a key to the relation ? Explain the
differences between super key, candidate key and primary key.
AKTU 2015-16, Marks 10
Answer
Key : Refer Q. 1.22, Page 1–22A, Unit-1.
Database Management System 1–25 A (CS/IT-Sem-5)
Registration Vehicle_id
Colour Make
Answer
Generalization :
a. Generalization is a process in which two lower level entities combine
to form higher level entity.
b. It is bottom-up approach.
c. Generalization is used to emphasize the similarities among lower level
entity sets and to hide the differences.
For example :
Generalization
Account
IS
A
Saving Current
Fig. 1.24.1.
Specialization :
a. Specialization is a process of breaking higher level entity into lower
level entity.
b. It is top-down approach.
c. It is opposite to generalization.
For example :
Specialization
Person
IS
A
Employee Customer
Fig. 1.24.2.
Aggregation :
a. Aggregation is an abstraction through which relationships are treated
as higher level entities.
For example :
1. The relationship works_on (relating the entity sets employee, branch
and job) act as a higher-level entity set.
2. We can then create a binary relationship ‘Manages’, between works on
and manager to represent who manages what tasks.
Database Management System 1–27 A (CS/IT-Sem-5)
Job
Manages
Manager
Fig. 1.24.3. ER diagram with aggregation.
Comparison :
PART-5
Reduction of an ER Diagram to Tables, Extended ER Model,
Relationship of Higher Degree.
Questions-Answers
Answer
1. In ER model, database are represented using the different notations or
diagrams, and these notations can be reduced to a collection of tables.
2. In the database, every entity set or relationship set can be represented
in tabular form.
Consider following ER diagram :
Subject_Name
Course_Name
Lecturer_ID
Subject_ID
Course_ID
Course
Subject
M
1
Has
Course_ID
N
N
1
Pin
k es
Attends
Teaches
State
Ta
Course_ID
M
M
1
City
Address
Lecturer
Teaches
Student
Lecturer_Name
M
1
Street
Door_No
Lecturer_ID
Student_Name
Student_ID
DoB
Age
Hobby
Database Management System 1–29 A (CS/IT-Sem-5)
Stud_Hobbey
Student_ID
Hobby
Answer
1. The ER model that is supported with the additional semantic concepts
is called the extended entity relationship model or EER model.
2. The EER model includes all the concepts of the original ER model
together with the following additional concepts :
a. Specialization : Refer Q. 1.24, Page 1–26A, Unit-1.
b. Generalization : Refer Q. 1.24, Page 1–26A, Unit-1.
c. Aggregation : Refer Q. 1.24, Page 1–26A, Unit-1.
3. The super class/subclass entity types (or super type /subtype entities)
is one of the most important modelling constructs that is included in
the EER model.
4. This feature enables us to model a general entity and then subdivide it
into several specialized entity types (subclasses or subtypes).
5. EER diagrams are used to capture business rules such as constraints
in the super type/subtype relations. Thus, a super class is an entity
type that includes distinct subclasses that require to be represented in
a data model.
6. A subclass is an entity type that has a distinct role and is also a member
of a super class.
Database Management System 1–31 A (CS/IT-Sem-5)
Shared attributes
Superclass
Subclass Subclass
Answer
1. Unified Modeling Language (UML) is a standardized modeling language
enabling developers to specify, visualize, construct and document artifacts
of a software system.
2. UML makes these artifacts scalable, secure and robust in execution.
3. UML is an important aspect involved in object-oriented software
development.
4. It uses graphic notation to create visual models of software systems.
Types of UML :
1. Activity diagram :
a. It is generally used to describe the flow of different activities and
actions.
b. These can be both sequential and in parallel.
c. They describe the objects used, consumed or produced by an activity
and the relationship between the different activities.
2. Use case diagram :
a. Case diagrams are used to analyze the system’s high-level
requirements.
Introduction 1–32 A (CS/IT-Sem-5)
b. These requirements are expressed through different use cases.
3. Interaction overview diagram :
a. The interaction overview diagram is an activity diagram made of
different interaction diagrams.
4. Timing diagram :
a. Timing UML diagrams are used to represent the relations of objects
when the center of attention rests on time.
b. Each individual participant is represented through a lifeline, which
is essentially a line forming steps since the individual participant
transits from one stage to another.
c. The main components of a timing UML diagram are :
i. Lifeline ii. State timeline
iii. Duration constraint iv. Time constraint
v. Destruction occurrence
5. Sequence UML diagram :
a. Sequence diagrams describe the sequence of messages and
interactions that happen between actors and objects.
b. Actors or objects can be active only when needed or when another
object wants to communicate with them.
c. All communication is represented in a chronological manner.
6. Class diagram :
a. Class diagrams contain classes, alongside with their attributes (also
referred to as data fields) and their behaviours (also referred to as
member functions).
b. More specifically, each class has three fields : the class name at the
top, the class attributes right below the name, the class operations/
behaviours at the bottom.
c. The relation between different classes (represented by a connecting
line), makes up a class diagram.
Database Management System 2–1 A (CS/IT-Sem-5)
2
UNIT
Relational Data Model
and Language
CONTENTS
Part-1 : Relational Data Model ................................ 2–2A to 2–6A
Concept, Integrity Constraints,
Entity Integrity, Referential
Integrity, Keys Constraints,
Domain Constraints
PART-1
Relational Data Model Concept, Integrity Constraints,
Entity Integrity, Referential Integrity, Keys
Constraints, Domain Constraints.
Questions-Answers
Answer
1. A relational model is a collection of conceptual tools for describing data,
data relationships, data semantics and consistency constraints.
2. It is the primary data model for commercial data processing applications.
3. The relational model uses collection of tables to represent both data and
the relationships among those data.
4. Each table has multiple columns and each column has a unique name.
For example :
1. The tables represent a simple relational database.
2. The Table 2.1.1 shows details of bank customers, Table 2.1.2 shows
accounts and Table 2.1.3 shows which accounts belong to which customer.
Table 2.1.1 : Customer table
cust_id c_name c_city
C_101 Ajay Delhi
C_102 Amit Mumbai
C_103 Alok Kolkata
C_104 Akash Chennai
Table 2.1.2 : Account table
acc_no. balance
A-1 1000
A-2 2000
A-3 3000
A-4 4000
Database Management System 2–3 A (CS/IT-Sem-5)
Answer
1. A constraint is a rule that is used for optimization purposes.
2. Constraints enforce limits to the data or type of data that can be inserted/
updated/deleted from a table.
3. The whole purpose of constraints is to maintain the data integrity during
an update/delete/insert into a table.
Types of constraints :
1. NOT NULL :
i. NOT NULL constraint makes sure that a column does not hold
NULL value.
ii. When we do not provide value for a particular column while inserting
a record into a table, it takes NULL value by default.
iii. By specifying NULL constraint, we make sure that a particular
column cannot have NULL values.
2. UNIQUE :
i. UNIQUE constraint enforces a column or set of columns to have
unique values.
ii. If a column has a unique constraint, it means that particular column
cannot have duplicate values in a table.
3. DEFAULT :
i. The DEFAULT constraint provides a default value to a column when
there is no value provided while inserting a record into a table.
4. CHECK :
i. This constraint is used for specifying range of values for a particular
column of a table.
Relational Data Model & Language 2–4 A (CS/IT-Sem-5)
ii. When this constraint is being set on a column, it ensures that the
specified column must have the value falling in the specified range.
5. Key constraints :
i. Primary key :
a. Primary key uniquely identifies each record in a table.
b. It must have unique values and cannot contain null.
ii. Foreign key :
a. Foreign keys are the columns of a table that points to the
primary key of another table.
b. They act as a cross-reference between tables.
6. Domain constraints :
i. Each table has certain set of columns and each column allows a
same type of data, based on its data type.
ii. The column does not accept values of any other data type.
Answer
1. Integrity constraints provide a way of ensuring that changes made to
the database by authorized users do not result in a loss of data
consistency.
2. A form of integrity constraint with ER models is :
a. key declarations : certain attributes form a candidate key for the
entity set.
b. form of a relationship : mapping cardinalities 1-1, 1-many and
many-many.
3. An integrity constraint can be any arbitrary predicate applied to the
database.
4. Integrity constraints are used to ensure accuracy and consistency of
data in a relational database.
Answer
i. Entity integrity constraint :
a. This rule states that no attribute of primary key will contain a null
value.
Database Management System 2–5 A (CS/IT-Sem-5)
Sourabh 6
th 19
Foreign Key
Table 2.4.2.
ENO NAME Age DNO
1 Ankit 19 10
2 Srishti 18 11
3 Somvir 22 14 Not Allowed, as DNO 14 is not
4 Sourabh 19 10 defined as a primary key of Table 2.4.3,
and in Table 2.4.2, DNO is a foreign
key defined
Relationship
Table 2.4.3.
DNO D.Location
10 Rohtak
11 Bhiwani
12 Hansi
Primary Key
PART-2
Relational Algebra.
Questions-Answers
Answer
1. The relational algebra is a procedural query language.
2. It consists of a set of operations that take one or two relations as input
and produces a new relation as a result.
3. The operations in the relational algebra are select, project, union, set
difference, cartesian product and rename.
Basic relational algebra operations are as follows :
1. Select operation :
a. The select operation selects tuples that satisfies a given predicate.
b. Select operation is denoted by sigma ().
c. The predicate appears as a subscript to .
d. The argument relation is in parenthesis after the .
2. Project operation :
a. The project operation is a unary operation that returns its
argument relation with certain attributes left out.
Database Management System 2–7 A (CS/IT-Sem-5)
Answer
a. code (Registered)
b. title (Course Registered)
c. code (Course) – code (Registered)
d. name ((code (Course) – code (Registered)) Course)
e. name, title (Student Registered Course))
f&g. ssn (Student Registered (title = ‘Database Systems’ Course))
ssn (Student Registered (title = ‘Analysis of Algorithms’ Course))
h. A = ssn (Student Registered (title = ‘Database System’ Course))
ssn (Student Registered (title = ‘Analysis of Algorithms’ Course))
name (A Student)
A = ( ) function
i. code, ssn (Registered) / ssn (Student)
j. code, ssn (Registered) / ssn (major = ‘ECMP’ Student)
Answer
The additional operations of relational algebra are :
1. Set intersection operation :
a. Set intersection is denoted by , and returns a relation that contains
tuples that are in both of its argument relations. The set intersection
operation is written as :
r s = r – (r – s)
2. Natural join operation :
a. The natural join is a binary operation that allows us to combine
certain selections and a cartesian product into one operation. It is
denoted by the join symbol .
b. The natural join operation forms a cartesian product of its two
arguments, performs a selection forcing equality on those attributes
that appear in both relation schemas and finally removes duplicate
attributes.
Database Management System 2–9 A (CS/IT-Sem-5)
3. Division operation :
1. In division operation, division operator is denoted by the symbol E ().
2. The relation r ÷ s is a relation on schema R – S. A tuple t is in r ÷ s if and
only if both of two conditions hold :
a. t is in R–S (r).
b. For every tuple ts in s, there is a tuple tr in r satisfying both of the
following :
i. tr[S] = ts[S]
ii. tr[R – S] = t
3. The division operation can be written in terms of fundamental operation
as follows :
r ÷ s = R–S (r) – R–S (( R–S (r) × s) – R–S, S (r))
4. Assignment operation : The assignment operation, denoted by ,
works like assignment in a programming language.
Answer
i. name(code = cs3020(student enrolledin))
ii. code(name = Hector(student enrolledin))
iii. lecturer(code = cs1500(subject))
iv. lecturer(code = cs1500 code = cs3020(subject))
v. For this query we have to relate subject to itself. To disambiguate the
relation, we will call the subject relation R and S.
lecturer(R.lecture = S.lecturer R.code< >S.code(R S))
vi. name(code = cs1500(student enrolledin)) (name(code = cs307(student
enrolledin)))
Relational Data Model & Language 2–10 A (CS/IT-Sem-5)
PART-3
Relational Calculus, Tuple and Domain Calculus.
Questions-Answers
Answer
1. Relational calculus is a non-procedural query language.
2. Relational calculus is a query system where queries are expressed as
formulas consisting of a number of variables and an expression involving
these variables.
3. In a relational calculus, there is no description of how to evaluate a
query.
Important characteristics of relational calculus :
1. The relational calculus is used to measure the selective power of
relational languages.
2. Relational calculus is based on predicate calculus.
3. In relational calculus, user is not concerned with the procedure to
obtain the results.
4. In relational calculus, output is available without knowing the method
about its retrieval.
Tuple Relational Calculus (TRC) :
1. The TRC is a non-procedural query language.
2. It describes the desired information without giving a specific procedure
for obtaining that information.
Database Management System 2–11 A (CS/IT-Sem-5)
PART-4
Introduction on SQL : Characteristics of SQL, Advantage of SQL.
Questions-Answers
Answer
1. SQL stands for Structured Query Language.
2. It is a non-procedural language that can be used for retrieval and
management of data stored in relational database.
3. It can be used for defining the structure of data, modifying data in the
database and specifying the security constraints.
4. The two major categories of SQL commands are :
a. Data Definition Language (DDL) : DDL provides commands
that can be used to create, modify and delete database objects.
b. Data Manipulation Language (DML) : DML provides commands
that can be used to access and manipulate the data, that is, to
retrieve, insert, delete and update data in a database.
Characteristics of SQL :
1. SQL usage is extremely flexible.
2. It uses a free form syntax that gives the user the ability to structure
SQL statements in a way best suited to them.
3. Each SQL request is parsed by the RDBMS before execution, to check
for proper syntax and to optimize the request.
4. Unlike certain programming languages, there is no need to start SQL
statements in a particular column or be finished in a single line. The
same SQL request can be written in a variety of ways.
Answer
Advantages of SQL :
1. Faster query processing : Large amount of data is retrieved quickly
and efficiently. Operations like insertion, deletion, manipulation of data
is done in almost no time.
2. No coding skills : For data retrieval, large number of lines of code is
not required. All basic keywords such as SELECT, INSERT INTO,
UPDATE, etc are used and also the syntactical rules are not complex in
SQL, which makes it a user-friendly language.
3. Standardised language : Due to documentation it provides a uniform
platform worldwide to all its users.
4. Portable : It can be used in programs in PCs, server, laptops independent
of any platform (Operating System, etc). Also, it can be embedded with
other applications as per need/requirement/use.
5. Interactive language : Easy to learn and understand, answers to
complex queries can be received in seconds.
Disadvantages of SQL :
1. Complex interface : SQL has a difficult interface that makes few
users uncomfortable while dealing with the database.
2. Cost : Some versions are costly and hence, programmers cannot access
it.
3. Partial control : Due to hidden business rules, complete control is not
given to the database.
PART-5
SQL Data Type and Literals, Types of SQL Commands,
SQL Operators and their Procedure.
Questions-Answers
Answer
SQL supports following datatypes :
1. char (n) : A fixed length character string with user specified maximum
length n.
Relational Data Model & Language 2–14 A (CS/IT-Sem-5)
2. varchar (n) : A variable length character string with user specified
maximum length n.
3. int : An integer which is a finite subset of the integers that is machine
dependent.
4. small int : A small integer is machine independent subset of integer
domain type.
5. numeric (p, d) : A fixed point number with user defined precision. It
consists of p digits and d of the p digits are to the right of the decimal
point.
6. real or double precision : Floating point and double precision floating
point numbers with machine dependent precision.
7. float (n) : A floating point number with precision of at least n digits.
8. date : A calendar date containing a year (four digit), month (two digit)
and day (two digit) of the month.
9. time : The time of the day in hours, minutes and seconds.
Answer
The four kinds of literal values supported in SQL are :
1. Character string :
a. Character strings are written as a sequence of characters enclosed
in single quotes.
b. The single quote character is represented within a character string
by two single quotes. For example, ‘Computer Engg’, ‘Structured
Query Language’
2. Bit string :
a. A bit string is written either as a sequence of 0s and 1s enclosed in
single quotes and preceded by the letter ‘B’ or as a sequence of
hexadecimal digits enclosed in single quotes and preceded by the
letter ‘X’.
b. For example, B’ 1011011’, B’1’, B’0’, X’A 5’ , X’T’
3. Exact numeric :
a. These literals are written as a signed or unsigned decimal number
possibly with a decimal point.
b. For example, 9, 90, 90.00, 0.9, + 99.99, – 99.99.
4. Approximate numeric :
a. Approximate numeric literals are written as exact numeric literals
followed by the letter ‘E’, followed by a signed or unsigned integer.
b. For example, 5E5, 55.5E5, + 55E–5, + 55E–5, 055E, – 5.55E–9.
Database Management System 2–15 A (CS/IT-Sem-5)
Answer
Different types of SQL commands are :
1. Insert :
a. This command is used to insert tuples in a table.
b. This command adds a single tuple at a time in a table.
Syntax :
Insert into table_name (attribute1, ..., attributen) values (values_list);
2. Update :
a. This command is used to make changes in the values of attributes
of the table.
b. It use set and where clause.
Syntax :
Update table_name set attribute_name = new_value where condition;
3. Delete :
a. This command is used to remove tuples.
b. Tuples can be deleted from only one table at a time.
Syntax :
Delete from table_name where condition;
4. Select : This command is used to retrieve a subset of tuples from one
or more table.
Syntax :
Select attribute1, ..., attributen from table_name where condition;
5. Alter table :
a. This command is used to make changes in the structure of a table.
b. This command is used :
i. to add an attribute
ii. to drop an attribute
iii. to rename an attribute
iv. to add and drop a constraint
Syntax :
Alter table table_name add column_name datatype;
Alter table table_name drop column column_name;
Alter table table_name drop constraint constraint_name;
Answer
a. SQL DDL is used to define relation of a system. The general syntax of
SQL sentence is :
VERB (parameter1, parameter2; ......, parametern)
b. The relations are created using CREATE verb.
1. CREATE TABLE : This command is used to create a new relation
and the corresponding syntax is :
CREATE TABLE relation_name
(field1 datatype (size), field2 datatype (size),..., fieldn datatype
(size));
2. CREATE TABLE ... AS SELECT ... : This type of create command
is used to create the structure of a new table from the structure of
existing table.
The generalized syntax of this form is :
CREATE TABLE relation_name1
(field1, field2, ...., fieldn)
AS SELECT field1, field2, ..., fieldn
FROM relation_name2;
c. Structure of relations are changed using ALTER verb.
1. ALTER TABLE ... ADD ... : This is used to add some extra columns
into an existing table. The generalized format is :
ALTER TABLE relation_name
ADD (new field1 datatype (size),
new field2 datatype (size), ......,
new fieldn datatype (size)) ;
2. ALTER TABLE .... MODIFY ... : This form is used to change the
width as well as data type of existing relations. The generalized
syntax is :
ALTER TABLE relation_name
MODIFY (field1 new data type (size),
field2 new data type (size),
----------------
fieldn new data type (size)) ;
Answer
Relational schemas :
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street, customer-city, customer-id)
account (account-number, balance)
loan (loan-number, amount)
employee (employee-id, employee-name, telephone-number, start- date,
employment length, dependent-name)
payment (payment-number, payment-amount, payment-date)
saving-account (interest-rate)
checking-account (overdraft-amount)
branch-city
branch-name assets
branch
loan-branch
customer-name payment-date
access-date
account-number balance
cust-banker type
Aggregation
depositor account
manager
Generalization
Specialization
dependent-name telephone-number
employment
length start-date interest-rate overdraft-amount
Que 2.17.
3.17. Describe the operators and its types in SQL.
Answer
Operators and conditions are used to perform operations such as addition,
subtraction or comparison on the data items in an SQL statement.
Different types of SQL operators are :
1. Arithmetic operators : Arithmetic operators are used in SQL
expressions to add, subtract, multiply, divide and negate data values.
The result of this expression is a number value.
Database Management System 2–19 A (CS/IT-Sem-5)
Operator Definition
AND Returns true if both component conditions
are true; otherwise returns false.
OR Returns true if either component condition
is true otherwise returns false
NOT Returns true if the condition is false;
otherwise returns false.
Operator Definition
UNION Returns all distinct rows from both queries
INTERSECT Returns common rows selected by both queries
MINUS Returns all distinct rows that are in the first
query, but not in second one.
Answer
103 Suraj J2
104 XX
105 BB
106 CC
Piid PiName S
101 Jai J1
101 Jai J2
103 Suraj J1
103 Suraj J2
104 XX J1
104 XX J2
105 BB J1
105 BB J1
106 CC J1
106 CC J2
5. Rename :
Consider the Book relation with attributes Title, Author, Year and
Price. The rename operator is used on Book relation as follows :
* Temp(Bname, Aname, Pyear, Bprice) (Book)
Here both the relation name and the attribute names are renamed.
PART-6
Tables, Views and Indexes, Queries and Sub Queries.
Questions-Answers
Answer
1. A view is a virtual relation, whose contents are derived from already
existing relations and it does not exist in physical form.
2. The contents of view are determined by executing a query based on
any relation and it does not form the part of database schema.
3. Each time a view is referred to, its contents are derived from the
relations on which it is based.
Database Management System 2–23 A (CS/IT-Sem-5)
4. A view can be used like any other relation that is, it can be queried,
inserted into, deleted from and joined with other relations or views.
5. Views can be based on more than one relation and such views are
known as complex views.
6. A view in SQL terminology is a single table that is derived from other
tables. These other tables can be base tables or previously defined
views.
Syntax for creating view :
CREATE VIEW view_name
AS SELECT * FROM table_name
WHERE Category IN (‘attribute1’, ‘attribute2’);
For example : Command to create a view consisting of attributes
Book_title, Category, Price and P_ID of the BOOK relation, Pname
and State of the PUBLISHER relation can be specified as
CREATE VIEW BOOK_3
AS SELECT BOOK_title, Category, Price, BOOK.P_ID, Pname, State
FROM BOOK, PUBLISHER
WHERE BOOK.P_ID = PUBLISHER.P_ID;
Answer
1. Indexes are special lookup tables that the database search engine can
use to speed up data retrieval.
2. An index helps to speed up SELECT queries and WHERE clauses, but it
slows down data input, with the UPDATE and the INSERT statements.
3. Indexes can be created or dropped with no effect on the data.
4. Indexes are used to retrieve data from the database more quickly.
5. The users cannot see the indexes; they are just used to speed up searches/
queries.
6. Syntax :
CREATE INDEX index
ON TABLE column;
where the index is the name given to that index and TABLE is the name
of the table on which that index is created and column is the name of
that column for which it is applied.
7. Unique indexes are used for the maintenance of the integrity of the
data present in the table as well as for the fast performance; it does not
allow multiple values to enter into the table.
Syntax for creating unique index is :
CREATE UNIQUE INDEX index_name
Relational Data Model & Language 2–24 A (CS/IT-Sem-5)
ON TABLE column;
8. To remove an index from the data dictionary by using the DROP INDEX
command.
DROP INDEX index_name;
Answer
1. A sub-query is a SQL query nested inside a larger query.
2. Sub-queries must be enclosed within parenthesis.
3. The sub-query can be used with the SELECT, INSERT, UPDATE, or
DELETE statement along with the operators like =, >, <, >=, <=, IN,
ANY, ALL, BETWEEN.
4 A sub-query is usually added within the WHERE clause of another SQL
SELECT statement.
5. A sub-query is also called an inner query while the statement containing
a sub-query is also called an outer query.
6. The inner query executes first before its parent query so that the result
of an inner query can be passed to the outer query.
Syntax of SQL sub-query : A sub-query with the IN operator,
SELECT column_names
FROM table_name1
WHERE column_name IN (SELECT column_name
FROM table_name2
WHERE condition);
Example :
We have the following two tables ‘student’ and ‘marks’ with common field
‘StudentID’.
Student Marks
StudentID Name StudentID Total_marks
V001 Abha V001 95
V002 Abhay V002 80
V003 Anand V003 74
V004 Amit V004 81
Now considering table ‘Student’, we want to write a query to identify all
students who get more marks than the student whose StudentID is ‘V002’,
but we do not know the marks of ‘V002’.
Database Management System 2–25 A (CS/IT-Sem-5)
So, consider another table ‘Marks’ containing total marks of the student and
apply query considering both tables.
SQL code with sub-query :
SELECT a.StudentID, a.Name, b.Total_marks
FROM student a, marks b
WHERE a.StudentID = b.StudentID AND b.Total_marks >
(SELECT Total_marks
FROM marks
WHERE StudentID = ‘V002’);
Query result :
PART-7
Aggregate Functions, Insert,Update and Delete Operations.
Questions-Answers
Que 2.22. Write full relation operation in SQL. Explain any one
of them.
OR
Explain aggregate function in SQL.
Answer
In SQL, there are many full relation operations like :
i. Eliminating duplicates
ii. Duplicating in union, intersection and difference
iii. Grouping
iv. Aggregate function
Aggregate function :
1. Aggregate functions are functions that take a collection of clues as
input and return a single value.
Relational Data Model & Language 2–26 A (CS/IT-Sem-5)
2. SQL offers five built-in aggregate functions :
a. Average : avg
Syntax : avg ( [ Distinct |All ] n)
Purpose : Returns average value of n, ignoring null values.
Example : Let us consider a SQL query :
select avg(unit price) as “Average Price” from book;
Output :
Average Price
359.8
b. Minimum : min
Syntax : min ( [ Distinct | All ] expr)
Purpose : Returns minimum value of expression
Example :
SQL> select min(unit_price) as “Minimum Price”
from book ;
Output :
Minimum Price
250
c. Maximum : max
Syntax : max ([ Distinct | All ] expr)
Purpose : Returns maximum value of expression
Example :
SQL> select max(unit_price) as “Maximum Price”
from book;
Output :
Maximum Price
450
d. Sum : sum
Syntax : sum ([ Distinct | All ] n)
Purpose : Returns sum of values of n
Example :
SQL> select sum(unit price) as “Total”
from book;
Output :
Total
1799
Database Management System 2–27 A (CS/IT-Sem-5)
e. Count : count
Syntax : count ([ Distinct | All ] expr)
Purpose : Returns the number of rows where expr is not null
Example :
SQL> select count(title) as “No. of Books”
from book;
Output :
No. of Books
5
Que 2.23. Explain how the GROUP BY clause in SQL works. What
is the difference between WHERE and HAVING clause ?
Answer
GROUP BY :
1. GROUP BY was added to SQL because aggregate functions (like SUM)
return the aggregate of all column values every time they are called,
and without the GROUP BY function it is not possible to find the sum
for each individual group of column values.
2. The syntax for the GROUP BY function is :
SELECT columns, SUM(column) FROM table GROUP BY column
Example :
This “Sales” Table :
Company Amount
TCS 5500
IBM 4500
TCS 7100
Company Amount
TCS 12600
IBM 4500
Relational Data Model & Language 2–28 A (CS/IT-Sem-5)
Difference :
S. No. WHERE HAVING
Answer
Different operations that modify the contents of the database are :
1. Delete :
a. The delete operation is used to delete all or specific rows from
database.
b. Delete command do not delete values of particular attributes.
c. A delete command operates only on relation or table.
Syntax :
delete from table_name
where condition;
2. Insert :
a. Insert command is used to insert data into a relation/table.
b. The attribute values for inserted tuples must be members of the
attribute’s domain specified in the same order as in the relation
schema.
Syntax : Insert into table_name values (attribute1, attribute2, attribute3,
...... attributeN);
3. Updates : Update command is used to update a value in a tuple.
Database Management System 2–29 A (CS/IT-Sem-5)
PART-8
Joins, Unions, Intersections, Minus.
Questions-Answers
Answer
A join clause is used to combine rows from two or more tables, based on a
related column between them.
Various types of join operations are :
1. Inner join :
a. Inner join returns the matching rows from the tables that are
being joined.
For example : Consider following two relations :
Employee (Emp_Name, City)
Employee_Salary (Emp_Name, Department, Salary)
These two relations are shown in Table 2.25.1 and 2.25.2.
Table. 2.25.1. The Employee relation.
Employee
Emp_Name City
Hari Pune
Om Mumbai
Suraj Nashik
Jai Solapur
Relational Data Model & Language 2–30 A (CS/IT-Sem-5)
Table. 2.25.2. The Employee_Salary relation.
Employee_Salary
Emp_Name Department Salary
Hari Computer 10000
Om IT 7000
Billu Computer 8000
Jai IT 5000
Emp_Name Salary
Hari 10000
Om 7000
Jai 5000
2. Outer join :
a. An outer join is an extended form of the inner join.
b. It returns both matching and non-matching rows for the tables
that are being joined.
c. Types of outer join are as follows :
i. Left outer join : The left outer join returns matching rows
from the tables being joined and also non-matching rows
from the left table in the result and places null values in the
attributes that comes from the right table.
For example :
Select Employee.Emp_Name, Salary
from Employee left outer join Employee_Salary
on Employee.Emp_Name = Employee_Salary.Emp_Name;
Result : The result of preceding query with selected fields of
Table 2.25.1 and Table 2.25.2.
Database Management System 2–31 A (CS/IT-Sem-5)
Emp_Name Salary
Hari 10000
Om 7000
Jai 5000
Suraj null
ii. Right outer join : The right outer join operation returns
matching rows from the tables being joined, and also non-
matching rows from the right table in the result and places
null values in the attributes that comes from the left table.
For example :
Select Employee.Emp_Name, City, Salary from Employee
right outer join
Employee_Salary on Employee.Emp_Name =
Employee_Salary. Emp_Name;
Result : The result of preceding query with selected fields of
Table 2.25.1 and Table 2.25.2.
Emp_Name City Salary
Hari Pune 10000
Om Mumbai 7000
Jai Solapur 5000
Billu null 8000
Que 2.26. Write difference between cross join, natural join, left
outer join and right outer join with suitable example.
AKTU 2018-19, Marks 07
Answer
Cross join :
1. Cross join produces a result set which is the product of number of rows
in the first table multiplied by the number of rows in the second table
if no where clause is used along with cross join. This kind of result is
known as Cartesian product.
2. If where clause is used with cross join, it functions like an inner join.
Relational Data Model & Language 2–32 A (CS/IT-Sem-5)
For example :
Beers Select * From Beers
Name Cross Join Likes
Manf
Name Manf Drinker Beer
Beer 1 XYZ
Beer 1 XYZ Kanika Beer 1
Beer 2 ABC
Beer 1 XYZ Aditya Beer 1
Beer 3 ABC
Beer 1 XYZ Mahesh Beer 3
Beer 2 ABC Kanika Beer 1
Likes Beer 2 ABC Aditya Beer 1
Drinker Beer Beer 2 ABC Mahesh Beer 3
Kanika Beer 1 Beer 3 ABC Kanika Beer 1
Aditya Beer 2 Beer 3 ABC Aditya Beer 1
Beer 3 ABC Mahesh Beer 3
Mahesh Beer 3
Natural join :
1. Natural join joins two tables based on same attribute name and data
types.
2. The resulting table will contain all the attributes of both the table but
keep only one copy of each common column.
3. In natural join, if there is no condition specifies then it returns the
rows based on the common column.
For example : Consider the following two relations :
Student (Roll_No, Name)
Marks (Roll_No, Marks)
These two relations are shown in Table 2.26.1 and 2.26.2.
Table 2.26.1. The Student relation.
Student
Roll_No Name
1 A
2 B
3 C
Table 2.26.2. The Marks relation.
Marks
Roll_No Marks
2 70
3 50
4 85
Consider the query :
Select * from Student natural join Marks ;
Database Management System 2–33 A (CS/IT-Sem-5)
Result :
Roll_No Name Marks
2 B 70
3 C 50
Left outer join and right outer join : Refer Q. 2.25, Page 2–29A,
Unit-2.
Que 2.27. Describe the SQL set operations.
Answer
The SQL set operations are :
1. Union operation : Union clause merges the output of two or more
queries into a single set of rows and column.
Common
records
in both
queries
Records
only in
query one
PART-9
Cursors, Triggers, Procedures in SQL/PL SQL.
Questions-Answers
Answer
Cursors :
1. A cursor is a temporary work area created in the system memory when
a SQL statement is executed.
2. A cursor contains information on a select statement and the rows of
data accessed by it.
3. A cursor can hold more than one row, but can process only one row at a
time.
4. The set of rows the cursor holds is called the active set.
5. There are two types of cursors :
a. Implicit cursors :
i. These are created by default when DML statements like,
INSERT, UPDATE, and DELETE statements are executed.
ii. They are also created when a SELECT statement that returns
just one row is executed.
b. Explicit cursors :
i. They must be created when we are executing a SELECT
statement that returns more than one row.
ii. When we fetch a row the current row position moves to next
row.
Sequences :
Sequences are frequently used in databases because many applications
require each row in a table to contain a unique value and sequences provide
an easy way to generate them.
Syntax :
CREATE SEQUENCE [schema.]sequence_name
Database Management System 2–35 A (CS/IT-Sem-5)
[ AS datatype ]
[ START WITH value ]
[ INCREMENT BY value ]
[ MINVALUE value | NO MINVALUE ]
[ MAXVALUE value | NO MAXVALUE ]
[ CYCLE | NO CYCLE ]
[ CACHE value | NO CACHE ];
Procedures :
1. A procedure is a sub-program that performs a specification.
2. A procedure has two parts :
i. Specification : The procedure specification begins with the keyword
procedure and ends with the procedure name or parameter list.
ii. Body : The procedure body begins with the keyword is and ends
with the keyword end.
Syntax : To create a procedure,
create or replace procedure <proc name> [parameter list] is
< local declaration >
begin
(executable statements)
[exception] (exception handlers)
end ;
Syntax : To execute a procedure,
exec < proce_name > (parameters);
Answer
Triggers :
1. A trigger is a procedure (code segment) that is executed automatically
when some specific events occur in a table/view of a database.
2. Triggers are mainly used for maintaining integrity in a database.
Triggers are also used for enforcing business rules, auditing changes
in the database and replicating data.
Following are different types of triggers :
Relational Data Model & Language 2–36 A (CS/IT-Sem-5)
1. Data Manipulation Language (DML) triggers :
a. DML triggers are executed when a DML operation like INSERT,
UPDATE OR DELETE is fired on a Table or View.
b. DML triggers are of two types :
i. AFTER triggers :
1. AFTER triggers are executed after the DML statement
completes but before it is committed to the database.
2. AFTER triggers if required can rollback its actions and
source DML statement which invoked it.
ii. INSTEAD OF triggers :
1. INSTEAD OF triggers are the triggers which get
executed automatically in place of triggering DML (i.e.,
INSERT, UPDATE and DELETE) action.
2. It means if we are inserting a record and we have a
INSTEAD OF trigger for INSERT then instead of INSERT
whatever action is defined in the trigger that gets
executed.
2. Data Definition Language (DDL) triggers :
a. DDL triggers are executed when a DDL statements like CREATE,
ALTER, DROP, GRANT, DENY, REVOKE, and UPDATE
STATISTICS statements are executed.
b. DDL triggers can be DATABASE scoped or SERVER scoped. The
DDL triggers with server level scope gets fired in response to a
DDL statement with server scope like CREATE DATABASE,
CREATE LOGIN, GRANT_SERVER, ALTER DATABASE, ALTER
LOGIN etc.
c. Where as DATABASE scoped DDL triggers fire in response to
DDL statement with DATABASE SCOPE like CREATE TABLE,
CREATE PROCEDURE, CREATE FUNCTION, ALTER TABLE,
ALTER PROCEDURE, ALTER FUNCTION etc.
3. LOGON triggers :
a. LOGON triggers get executed automatically in response to a
LOGON event.
b. They get executed only after the successful authentication but
before the user session is established.
c. If authentication fails the LOGON triggers will not be fired.
4. CLR triggers :
a. CLR triggers are based on the Sql CLR.
b. We can write DML and DDL triggers by using the supported .NET
CLR languages like C#, VB.NET etc.
Database Management System 2–37 A (CS/IT-Sem-5)
Answer
i. Select E.employee_name, city
from employee E, works W
where W.company_name = ‘XYZ Bank’ and
W.employee_name = E.employee_name
ii. Select * from employee
where employee_name in
select employee_name from Works
where company_name=‘XYZ Bank’ and salary>10000
select E. employee_name, street address, city
from Employee as E, Works as W
where E. employee_name=W.person_name and
W.company_name = ‘XYZ Bank’ and W.salary>10000
iii. Select E. employee _name
from Employee as E, Works as W, Company as C
where E. employee _name=W.person_name and E.city=C.city
and W.company_name=C.company_name
Answer
i. In relational algebra :
Roll No, Name (Branch = “CSE” (Student))
In SQL :
Select Roll No, Name from Students
where Branch = “CSE”;
ii. In relational algebra :
Name Publisher = “BPB” and Student_Roll No = P.Roll No (Student (Roll No,
(Issue.ISBN = Book.ISBN P (Book Issue))))
Publisher
In SQL :
Select Student.name from Student inner join
(Select Book.Publisher, Issue.Roll No from Issue inner join Book on
Issue.ISBN = Book.ISBN as P)
ON Student.Roll No = P. Roll No
where P.Publisher = “BPB”;
iii. In relational algebra :
S.Title, S.Author S.Name like(‘a%’) (T.Name, Book.Author, Book.Title Book.ISBN = T.ISBN
S (Book (Name, ISBN Student.Roll No = Issue.Roll No T (Student
Issue))));
In SQL :
Select S.title, S.Author from
(Select T.Name, Book.Author, Book.Title from Book inner join (Select
Student.Name, Issue.ISBN from Student inner join Issue
ON Student.Roll No = Issue.Roll No as T)
ON Book.ISBN = T.ISBN as S)
where S.Name like ‘a%’;
iv. In relational algebra :
Title (date > = 20/09/2012 (Book Issue))
Database Management System 2–39 A (CS/IT-Sem-5)
In SQL :
Select Book.Title from Book inner join Issue ON Book.ISBN = Issue.ISBN
as R
where R.date > = 20/09/2012;
iv. In relational algebra :
Name (Author = “Sanjeev” and Student.Roll No = Q.Roll No (Student Roll No, Author
Issue.ISBN = Book.ISBN Q (Book Issue)))
In SQL :
Select Student.Name from Student inner join
(Select Issue.Roll No, Book.Author from Issue inner join Book ON
Issue.ISBN = Book.ISBN as Q)
ON Student.Roll No = Q.Roll No
where Q.Author = “Sanjeev”;
Que 2.32. Suppose there are two relations R(A, B, C), S(D, E, F).
Write TRC and SQL for the following RAs :
i. A, B (R)
ii. B = 45 (R)
iii. A, F(C = D(R × S)) AKTU 2018-19, Marks 07
Answer
i. A, B (R) :
TRC : {s.A, s.B|R(s)}
SQL : Select A, B from R ;
ii. B = 45 (R) :
TRC : {s |R(s) s.B = 45}
SQL : Select * from R where B = 45 ;
iii. A, F(C = D(R × S)) :
TRC : {t|pr qs (t[A] = p[A] t[F] = q[F] p[C] = q[D])}
SQL : Select A, F from R inner join S ON R.C = S.D ;
Answer
i. Select person_name from Works
Where company_name=‘ABC Bank’
ii. Select E1.person_name
From Employee as E1, Employee as E2, Manages as M
Where E1.person_name=M.person_name
and E2.person_name=M.manager_name
and E1.street=E2.street and E1.city=E2.city
iii. Select * from employee
where person_name in
(select person_name from Works
where company_name= ‘ABC Bank’ and salary>7000
select E.person_name, street,
city from Employee as E, Works as W
where E.person_name = W.person_name
and W.company_name=‘ABC Bank’ and W.salary>7000
iv. Select person_name from Works
where salary > all
(select salary from Works
where company_name=‘XYZ’)
select person_name from Works
where salary>(select max(salary) from Works
where company_name=‘XYZ’)
v. Update Works
set salary=salary*1.07
where company_name=‘ABC Bank’
vi. Delete from Works
Database Management System 2–41 A (CS/IT-Sem-5)
Answer
Embedded SQL :
1. The SQL standard defines embeddings of SQL in a variety of
programming languages such as Pascal, PL/I, Fortran, C and COBOL.
2. A language in which SQL queries are embedded is referred to as a host
language and the SQL structures permitted in the host language
constitute embedded SQL.
3. Programs written in the host language can use the embedded SQL
syntax to access and update data stored in a database.
4. In embedded SQL, all query processing is performed by the database
system.
5. The result of the query is then made available to the program one tuple
at a time.
6. Embedded SQL statements must be completely present at compile time
and compiled by the embedded SQL preprocessor.
7. To identify embedded SQL requests to the preprocessor, we use the
EXEC, SQL statement as :
EXEC SQL <embedded SQL statement> END.EXEC
8. Variable of the host language can be used within embedded SQL
statements, but they must be preceded by a colon (:) to distinguish them
from SQL variables.
Dynamic SQL :
1. The dynamic SQL component of SQL allows programs to construct and
submit SQL queries at run time.
2. Using dynamic SQL, programs can create SQL queries as strings at run
time and can either have them executed immediately or have them
prepared for subsequent use.
3. Preparing a dynamic SQL statement compiles it, and subsequent uses of
the prepared statement use the compiled version.
4. SQL defines standards for embedding dynamic SQL calls in a host
language, such as C, as in the following example,
Relational Data Model & Language 2–42 A (CS/IT-Sem-5)
char * sqlprog = “update account set balance = balance * 1.05
where account_number = ?”;
EXEC SQL prepare dynprog from : sqlprog;
char account[10] = “A-101”;
EXEC SQL execute dynprog using : account;
Answer
1. PL/SQL is a block-structured language that enables developers to combine
the power of SQL with procedural statements.
2. A stored procedure in PL/SQL is nothing but a series of declarative SQL
statements which can be stored in the database catalogue.
3. A procedure can be thought of as a function or a method.
4. They can be invoked through triggers, other procedures, or applications
on Java, PHP etc.
5. All the statements of a block are passed to Oracle engine all at once
which increases processing speed and decreases the traffic.
Advantages of procedures in PL/SQL :
1. They result in performance improvement of the application. If a
procedure is being called frequently in an application in a single
connection, then the compiled version of the procedure is delivered.
2. They reduce the traffic between the database and the application, since
the lengthy statements are already fed into the database and need not
be sent again and again via the application.
3. They add to code reusability, similar to how functions and methods
work in other languages such as C/C++ and Java.
Disadvantages of procedures in PL/SQL :
1. Stored procedures can cause a lot of memory usage. The database
administrator should decide an upper bound as to how many stored
procedures are feasible for a particular application.
2. MySQL does not provide the functionality of debugging the stored
procedures.
Database Management System 3–1 A (CS/IT-Sem-5)
3
UNIT
Database
Design & Normalization
CONTENTS
Part-1 : Functional Dependencies ........................... 3–2A to 3–7A
PART-1
Functional Dependencies.
Questions-Answers
Answer
Functional dependency :
1. A functional dependency is a constraint between two sets of attributes
from the database.
2. A functional dependency is denoted by X Y, between two sets of
attributes X and Y that are subsets of R specifies a constraint on the
possible tuples that can form a relation r.
3. The constraint for any two tuples t1 and t2, in r which have
t1 [X] = t2 [X];
Also, must have
t1 [Y] = t2 [Y];
4. This means that the values of the Y component of a tuple in r depends
on, or are determined by the value of the X components, or alternatively,
the values of the X component of a tuple uniquely (or functionally)
determine the value of the Y component.
Role of functional dependency :
1. Functional dependency allows the database designer to express facts
about the enterprise that the designer is modeling with the enterprise
databases.
2. It allows the designers to express constraints, which cannot be expressed
with super keys.
Inference rules for functional dependencies :
1. Reflexivity rule : If is a set of attributes and then holds.
2. Augmentation rule : If holds and is a set of attributes then
holds.
3. Transitivity rule : If holds and holds, then holds.
4. Complementation rule : If hold, then {R – ( )}
holds.
Database Management System 3–3 A (CS/IT-Sem-5)
Answer
Functional dependency : Refer Q. 3.1, Page 3–2A, Unit-3.
Trivial functional dependency : The dependency of an attribute on a set
of attributes is known as trivial functional dependency if the set of attributes
includes that attribute.
A B is trivial functional dependency if B is a subset of A.
Non-trivial functional dependency : If a functional dependency X Y
holds true where Y is not a subset of X then this dependency is called non-
trivial functional dependency.
For example :
Let a relation R (A, B, C)
The following functional dependencies are non-trivial :
A B (B is not a subset of A)
A C (C is not a subset of A)
The following dependencies are trivial :
{A, B} B [B is a subset of {A, B}]
Canonical cover : A canonical cover of a set of functional dependencies F is
a simplified set of functional dependencies that has the same closure as the
original set F.
Numerical :
There are two functional dependencies with the same set of attributes
A BC
AB
Database Design & Normalization 3–4 A (CS/IT-Sem-5)
These two can be combined to get
A BC
Now, the revised set F becomes :
F= {
A BC
BC
AB C
}
There is an extraneous attribute in AB C because even after removing AB
C from the set F, we get the same closures. This is because B C is
already a part of F.
Now, the revised set F becomes :
F= {
A BC
BC
}
C is an extraneous attribute in A BC, also A B is logically implied by
A B and B C (by transitivity)
F= {
AB
BC
}
After this step, F does not change anymore.
Hence, the required canonical cover is,
F = {A B, B C}
Answer
Full functional dependency :
1. Given a relation scheme R and functional dependency X Y, Y is fully
functionally dependent on X, if there is no Z, where Z is a proper subset
of Y such that Z Y.
2. The dependency X Y is left reduced, there being no extraneous
attributes in the L.H.S of the dependency.
For example : In the relational schema R (ABCDEH) with the FDs.
F = {A BC, CD E, E C, CD AH, ABH BD, DH BC}.
Database Management System 3–5 A (CS/IT-Sem-5)
Fig. 3.3.1.
i. In Fig. 3.3.1, [Name + Course] is a candidate key, So Name and Course
are prime attributes, Grade is fully functionally dependent on the
candidate keys and Phone no., Course-deptt. and roll no. are partially
functional dependent on the candidate key.
ii. Given R (A, B, C, D) and F = {AB C, B D}. Then key of this relation
is AB and D is partially dependent on the key.
Answer
Partial functional dependency : Refer Q. 3.3, Page 3–4A, Unit-3.
Numerical :
From F,
E AD
E A (By Decomposition Rule)
ED
Also given that
EH
So, E AH (By Union Rule)
which is a FD of set G.
Again A C and AC D
Database Design & Normalization 3–6 A (CS/IT-Sem-5)
Imply A D (By Pseudotransitivity Rule)
A CD (by Union Rule)
which is FD of set G.
Hence, F and G are equivalent.
Que 3.5. Write the algorithm to find minimal cover F for set of
functional dependencies E.
Answer
Algorithm :
1. Set F : = E.
2. Replace each functional dependency X {A1, A2, ..., An} in F by the n
functional dependencies X A1, X A2, ..., X An.
3. For each functional dependency X A in F
for each attribute B that is an element of X
if { {F – {X A} } { (X – {B}) A} } is equivalent to F,
then replace X A with (X – {B} ) A in F.
4. For each remaining functional dependency X A in F
if {F – {X A} } is equivalent to F,
then remove X A from F.
Answer
Minimal cover : A minimal cover of a set of FDs F is a minimal set of
functional dependencies Fmin that is equivalent to F.
Numerical :
Given : R(A, B, C)
Non-redundant cover for F :
Step 1 : Only one attribute on right hand side
F=A B
B C
A C
AB B
AB C
AC B
Database Management System 3–7 A (CS/IT-Sem-5)
PART-2
Normal Forms, First, Second, Third Normal Form, BCNF.
Questions-Answers
Que 3.7. Define normal forms. List the definitions of first, second
and third normal forms. Explain BCNF with a suitable example.
AKTU 2015-16, Marks 10
OR
Explain 1NF, 2NF, 3NF and BCNF with suitable example.
AKTU 2016-17, Marks 7.5
Answer
1. Normal forms are simply stages of database design, with each stage
applying more strict rules to the types of information which can be
stored in a table.
2. Normal form is a method to normalize the relations in database.
3. Normal forms are based on the functional dependencies among the
attributes of a relation.
Different normal forms are :
1. First Normal Form (1NF) :
a. A relations R is in 1NF if all domains are simple i.e., all elements
are atomic.
Database Design & Normalization 3–8 A (CS/IT-Sem-5)
For example : The relation LIVED-IN given in Table 3.7.1 is not in
1NF because the domain values of the attribute ADDRESS are not
atomic.
Table 3.7.1. LIVED-IN
Name Address
CITY Year-moved-in Year-left
Ashok Kolkata 2007 2015
Delhi 2011 2015
CITY Year-moved-in Year-left
Ajay Mumbai 2000 2004
Chennai 2005 2009
Answer
(AB)+ = ABC AB C
= ABCDE A DE
Database Design & Normalization 3–10 A (CS/IT-Sem-5)
= ABCDEF BF
= ABCDEFGH F GH
= ABCDEFGHIJ D IJ
So, AB is key of R.
In the given relation, R has a composite primary key [A,B].
The non-prime attribute are [C, D, E, F, G, H, I, J].
In this case, FDs are AB C, A DE, B F which is only part of the
primary key. Therefore, this table does not satisfy 2NF.
To bring this table to 2NF, we break the table into three relation as :
R1(A, B, C), R2(A, D, E, I, J) and R3(B, F, G, H).
Answer
Answer
1. A relation, R, is in 3NF iff for every dependency X A satisfied by R at
least one of the following conditions :
a. X A is trivial (i.e., A is subset of X)
b. X is a superkey for R, or
c. A is a key attribute for R.
BCNF does not permit the third of these options.
2. BCNF identifies some of the anomalies that are not addressed by 3NF.
3. A relation in BCNF is also in 3NF but vice-versa is not true.
Hence, BCNF is more strict / stronger than 3NF.
Que 3.11. Write the difference between 3NF and BCNF. Find the
normal form of relation R(A, B, C, D, E) having FD s et
F= {A B, BC E, ED A}. AKTU 2018-19, Marks 07
Answer
Difference : Refer Q. 3.9, Page 3–10A, Unit-3.
Numerical :
Given : R(A, B, C, D, E) and
F= AB
BC E
ED A
(ACD)+ = ACDB AB
= ABCDE BC E
= ABCDE ED A
So, ACD is a key of R.
Relation R is in 1NF as all domains are simple i.e., all elements are atomic.
PART-3
Inclusion Dependence, Lossless Join Decomposition.
Questions-Answers
Answer
1. An inclusion dependency R. X < S.Y between two set of attributes X of
relation schema R, and Y of relation schema S specifies the constraint
that, at any specific time when r is a relation state of R and s a relation
state of S, we must have
X(r(R)) Y(s(S))
2. The set of attributes on which the inclusion dependency is specified X of
R and Y of S must have the same number of attributes. Also domains for
each pair of corresponding attributes should be compatible.
3. Inclusion dependencies are defined in order to formalize two types of
interrelational constraints :
a. The foreign key (or referential integrity) constraint cannot be
specified as a functional or multivalued dependency because it
relates attributes across relations.
b. The constraint between two relations that represent a class/subclass
relationship also has no formal definition in terms of the functional,
multivalued, and join dependencies.
4. For example, if X = {A1, A2, . . . , An} and Y = {B1, B2, . . . , Bn}, one possible
correspondence is to have dom(Ai) compatible with dom(Bi) for 1 i n.
In this case, we say that Ai corresponds to Bi.
Answer
Functional dependency : Refer Q. 3.1, Page 3–2A, Unit-3.
Lossless decomposition : A decomposition {R1, R2, ... Rn} of a relation R is
called a lossless decomposition for R if the natural join of R1, R2, ..., Rn
produces exactly the relation R.
Following are the condition to show that decompositions are lossless using
FD set :
1. Union of attributes of R1 and R2 must be equal to attribute of R. Each
attribute of R must be either in R1 or in R2.
Att(R1) Att(R2) = Att(R)
2. Intersection of attributes of R1 and R2 must not be NULL.
Database Management System 3–13 A (CS/IT-Sem-5)
Answer
To check the decomposition is lossless following condition should hold.
1. R1 R2 R3 R4 R5 = (X, W) (X, Y) (Y, Q) (Z, W, Q) (X, Q) =
(X, Y, Z, W, Q) = R
2. (R1 R2) (R3 R4) R5 = ((X, W) (X, Y)) ((Y, Q)
(Z, W, Q)) (X, Q) = X Q {X, Q)
= XQ=
Since, condition 2 violates the condition of lossless join decomposition. Hence
decomposition is lossy.
PART-4
Normalization using FD, MVD and JDs, Alternative
Approaches to Database Design.
Questions-Answers
Answer
1. Normalization is the process of reducing data redundancy in a relational
database.
2. Normalization is a refinement process that the database designer
undertakes. After identifying the data objects of the proposed database,
their relationships define the tables required and columns within each
table.
3. The fundamental principle of normalization is, “The same data should
not be stored in multiple places.” No information is lost in the process;
however, the number of tables generally increases as the rules are
applied.
Types of normalization : Refer Q. 3.7, Page 3–7A, Unit-3.
Advantages :
1. It helps to remove the redundancy from the relation.
2. It helps in easy manipulation of data.
3. It helps to provide more information to the user.
4. It eliminates modification anomalies.
Answer
Multivalued Dependency (MVD) :
1. MVD occurs when two or more independent multivalued facts about
the same attribute occur within the same relation.
2. MVD is denoted by X Y specified on relation schema R, where X
and Y are both subsets of R.
3. Both X and Y specifies the following constraint on any relation state r
of R : If two tuples t1 and t2 exist in r such that t1(X) = t2(X), then two
tuples t3 and t4 should also exist in r with the following properties,
where we use Z to denote (R – (X Y))
t3(X) = t4(X) = t1(X) = t2(X)
t3(Y) = t1(Y) and t3 (Z) = t2(Z)
Database Management System 3–15 A (CS/IT-Sem-5)
Que 3.17. Explain the fourth and fifth normal with suitable
example.
Answer
Fourth Normal Form (4NF) :
1. A table is in 4NF, if it is in BCNF and it contains multivalued
dependencies.
2. A relation schema R is in 4NF, with respect to a set of dependencies F
(that includes FD and multivalued dependencies) if, for every non-
trivial multivalued dependency X Y in F+, X is superkey for R.
Database Design & Normalization 3–16 A (CS/IT-Sem-5)
For example : A Faculty has multiple courses to teach and he is
leading several committees. This relation is in BCNF, since all the
three attributes concatenated together constitutes its key. The rule
for decomposition is to decompose the offending table into two, with
the multi-determinant attribute or attributes as part of the key of
both. In this case to put the relation in 4NF, two separate relations are
formed as follows :
FACULTY_COURSE (FACULTY, COURSE)
FACULTY_COMMITTEE (FACULTY, COMMITTEE)
Faculty Course Faculty Committee
John Subject John Placement
John Networking John Scholarship
John MIS
Company_Product Company_Supplier
Company Product Company Supplier
Godrej Soap Godrej Mr. X
Godrej Shampoo Godrej Mr. X
H.Lever Soap Godrej Mr. Z
H.Lever Shampoo H.Lever Mr. X
H.Lever Mr. Y
The redundancy has been eliminated but we have lost the information.
Now suppose that the original table to be decomposed in three parts,
Company_Product, Company_Supplier and Product_Supplier, which
is as follows :
Product_Supplier
PRODUCT SUPPLIER
Soap Mr. X
Soap Mr. Y
Shampoo Mr. X
Shampoo Mr. Y
Shampoo Mr. Z
Answer
Attribute preservation condition on decomposition :
1. The relational database design algorithms start from a single universal
relation schema R = {A1, A2, ..., An} that includes all the attributes of the
database.
2. We implicitly make the universal relation assumption, which states that
every attribute name is unique.
3. Using the functional dependencies, the algorithms decompose the
universal relation schema R into a set of relation schemas D = {R1, R2, ...,
Rm} that will become the relational database schema; D is called a
decomposition of R.
Database Design & Normalization 3–18 A (CS/IT-Sem-5)
4. Each attribute in R must appear in at least one relation schema Ri in the
decomposition so that no attributes are lost; formally, we have
m
R
i 1
i =R
After applying first two functional dependencies first row contain all “a”
symbols. Hence it is lossless join.
Answer
An alternate approach to database design is dangling tuples :
1. Tuples that “disappear” in computing a join are known as dangling tuples.
a. Let r1(R1), r2(R2), ..., rn(Rn) be a set of relations.
b. A tuple t of relation Ri is a dangling tuple if t is not in the relation :
Ri (r1 r2 ... rn)
2. The relation r1 r2 ... rn is called a universal relation since it
involves all the attributes in the “universe” defined by R1 R2 ... Rn.
3. If dangling tuples are allowed in the database, instead of decomposing a
universal relation, we may prefer to synthesize a collection of normal
form schemas from a given set of attributes.
Q. 2. Define normal forms. Explain 1NF, 2NF, 3NF and BCNF with
suitable example.
Ans. Refer Q. 3.7.
Database Management System 4–1 A (CS/IT-Sem-5)
4
UNIT
Transaction Processing
Concept
CONTENTS
Part-1 : Transaction System, .................................... 4–2A to 4–6A
Testing of Serializability,
Serializability of Schedules
PART-1
Transaction System, Testing of Serializability,
Serializability of Schedules.
Questions-Answers
Answer
1. A transaction is a logical unit of database processing that includes one
or more database access operations; these include insertion, deletion,
modification or retrieval operations.
2. The database operations that form a transaction can be embedded
within an application program.
3. By specifying explicit begin transaction and end transaction we can
specify the transaction boundaries.
4. If the database operations in a transaction do not update the database
but only retrieve data, the transaction is called a read-only transaction.
Answer
Transaction : Refer Q. 4.1, Page 4–2A, Unit-4.
To ensure integrity of data, the database system maintains some properties
of transaction. These properties are known as ACID properties.
Let us consider an example for the set of operations :
Database Management System 4–3 A (CS/IT-Sem-5)
Answer
ACID properties : Refer Q. 4.2, Page 4–2A, Unit-4.
Ensuring the atomicity :
1. To ensure atomicity, database system keeps track of the old values of
any data on which a transaction performs a write.
2. If the transaction does not complete its execution, the database system
restores the old values.
Transaction Processing Concept 4–4 A (CS/IT-Sem-5)
Que 4.4. List the ACID properties. Explain the usefulness of each
property.
Answer
ACID properties of transaction : Refer Q. 4.2, Page 4–2A, Unit-4.
Usefulness of ACID properties :
Atomicity : Atomicity is useful to ensure that if for any reason an error
occurs and the transaction is unable to complete all of its steps, then the
system is returned to the state it was in before the transaction was started.
Consistency : The consistency property is useful to ensure that a complete
execution of transaction from beginning to end is done without interference
of other transactions.
Isolation : Isolation property is useful to ensure that a transaction should
appear isolated from other transactions, even though many transactions
are executing concurrently.
Durability : Durability is useful to ensure that the changes applied to the
database by a committed transaction must persist in the database.
Answer
Transaction : Refer Q. 4.1, Page 4–2A, Unit-4.
State diagram of transaction :
1. Active : The transaction is said to be in the active state till the final
statement is executed.
Database Management System 4–5 A (CS/IT-Sem-5)
Partially
Committed
committed
Active
Failed Aborted
Fig. 4.5.1.
Answer
Implementation of atomicity in transaction can be done in two ways :
1. Completeness :
a. All of the operations encapsulated within a database transaction
represent an atomic unit of work.
b. According to atomicity either all of transaction will run to
completion (Commit) or none of them.
c. There will not be any partial transaction in left over state from
incomplete execution of one or more operations in a transaction.
d. If the user decides to cancel everything (Rollback), all of the
changes made by the transaction will be undone and the state
would be as if the transaction never began by using undo operation.
e. For every change made by operations in the database, it logs undo
data to be used to rollback the effects of operations.
2. Mutual exclusion/locking :
a. Only one transaction will be allowed to progress by taking an
exclusive lock on the particular data item.
Transaction Processing Concept 4–6 A (CS/IT-Sem-5)
b. The lock will not be released until the transaction ends (either
through rollback, commit or abort).
c. Any other concurrent transaction interested in updating the same
row will have to wait.
Que 4.7. What is serializability ? Why serializability is required ?
Write short note on serializability of schedule.
Answer
Serializability : Serializability is a property of a transaction schedule which
is used to keep the data in the data item in consistent state. It is the classical
concurrency scheme.
Serializability is required :
1. To control concurrent execution of transaction.
2. To ensure that the database state remains consistent.
Serializability of schedule :
1. In DBMS, the basic assumption is that each transaction preserves
database consistency.
2. Thus, the serial execution of a set of transaction preserves database
consistency.
3. A concurrent schedule is serializable if it is equivalent to a serial
schedule.
PART-2
Conflict and View Serializable Schedule, Recoverability,
Recovery from Transaction Failures.
Questions-Answers
Answer
1. Consider a schedule S, in which there are two consecutive instructions
Ii and Ij of transactions Ti and Tj respectively (i j).
2. If Ii and Ij refer to different data items, then swap Ii and Ij without
affecting the results of any instruction in the schedule.
Database Management System 4–7 A (CS/IT-Sem-5)
3. However, if Ii and Ij refer to the same data item Q, then the order of the
two steps matter.
4. Following are four possible cases :
Ii Ij Swapping possible
Answer
1. The schedule S and S are said to be view equivalent if following
three conditions met :
a. For each data item Q, if transaction Ti reads the initial value of Q
in schedule S, then transaction Ti in schedule S, must also read
the initial value of Q.
b. For each data item Q if transaction Ti executes read (Q) in schedule
S and if that value produced by a write (Q) operation executed by
transaction Tj, then the read (Q) operation of transaction Ti, in
schedule S, must also read the value of Q that was produced by
the same write (Q) operation of transaction Tj.
c. For each data item Q, the transaction (if any) that performs the
final write (Q) operation in schedule S must perform the final
write (Q) operation in schedule S.
2. Conditions (a) and (b) ensure that each transaction reads the same
values in both schedules and therefore, performs the same computation.
Condition (c), coupled with condition (a) and condition (b) ensure that
both schedules result in the same final system state.
3. The concept of view equivalence leads to the concept of view
serializability.
4. We say that schedule S is view serializable, if it is view equivalent to
serial schedule.
5. Every conflict serializable schedule is also view serializable but there
are view serializable schedules that are not conflict serializable.
Example :
Schedule S1 Schedule S2
T1 T2 T1 T2
Answer
Schedule : A schedule is a set of transaction with the order of execution of
instruction in the transaction.
Recoverable schedule :
A recoverable schedule is one in which for each pair of transaction Ti and Tj
if Tj reads a data item previously written by Ti, the commit operation of Ti
appears before the commit operation of Tj.
For example : In schedule S, let T2 commits immediately after executing
read (A) i.e., T2 commits before T1 does. Now let T1 fails before it commits,
we must abort T2 to ensure transaction atomicity. But as T2 has already
committed, it cannot be aborted. In this situation, it is impossible to recover
correctly from the failure of T1.
Schedule S
T1 T2
read (A)
write (A)
read (A)
read (B)
Cascadeless schedule :
1. A cascadeless schedule is one, where for each pair of transaction T i
and Tj such that Tj reads a data item previously written by T i, the
commit operation to Tj appears before the read operation of T i.
2. Even if a schedule is recoverable, to recover correctly from the
failure of a transaction Ti, we may have to rollback several transactions.
Such situations occur if transactions have read data written by T i.
Strict schedule :
1. A schedule is called strict if every value written by a transaction T is
not read or changed by other transaction until T either aborts or
commits.
2. A strict schedule avoids cascading and recoverability.
Transaction Processing Concept 4–10 A (CS/IT-Sem-5)
Answer
Precedence graph :
1. A precedence graph is a directed graph G = (N, E) that consists of set of
nodes N = {T1, T2, ...., Tn} and set of directed edges E = [e1, e2...., em].
2. There is one node in the graph for each transaction Ti in the schedule.
3. Each edge ei in the graph is of the form (Tj Tk), 1 j n, 1 k n,
where Tj is the starting node of ei and Tk is the ending node of ei.
4. Such an edge is created if one of the operations in Tj appears in the
schedule before some conflicting operation in Tk.
Algorithm for testing conflict serializability of schedule S :
a. For each transaction Ti participating in schedule S, create a node
labeled Ti in the precedence graph.
b. For each case in S where T j executes a read_item(X) after T i
executes a write_item(X), create an edge (Ti Tj) in the precedence
graph.
c. For each case in S where Tj executes a write_item(X) after T i
executes read_item(X), create an edge (Ti Tj) in the precedence
graph.
d. For each case in S where Tj executes a write_item(X) after T i
executes a write_item(X), create an edge (Ti Tj) in the precedence
graph.
e. The schedule S is serializable if and only if the precedence graph
has no cycles.
5. The precedence graph is constructed as described in given algorithm.
6. If there is a cycle in the precedence graph, schedule S is not (conflict)
serializable; if there is no cycle, S is serializable.
7. In the precedence graph, and edge from Ti to Tj means that transaction
Ti must come before transaction Tj in any serial schedule that is
equivalent to S, because two conflicting operations appear in the schedule
in that order.
8. If there is no cycle in the precedence graph, we can create an equivalent
serial schedule S that is equivalent to S, by ordering the transactions
that participate in S as follows : Whenever an edge exists in the precedence
graph from Ti to Tj, Ti must appear before Tj in the equivalent serial
schedule S.
Database Management System 4–11 A (CS/IT-Sem-5)
Example :
T1 T2 T3
read (Y) ;
read (Z) ;
read (X) ;
write (X) ;
write (Y) ;
write (Z) ;
read (Z) ;
read (Y) ;
write (Y) ;
read (Y) ;
write (Y) ;
read (X) ;
write (X) ;
Schedule S
X,Y
T1 T2
Y Y, Z
T3
Answer
i. The serialization graph is :
T1 T2
T3
Fig. 4.12.1.
There are two cycles. It is not serializable.
ii. The serialization graph is :
T1 T2
T3
Fig. 4.12.2.
Transaction Processing Concept 4–12 A (CS/IT-Sem-5)
Answer
Cascadeless schedule : Refer Q. 4.10, Page 4–9A, Unit-4.
Cascading rollback : Cascading rollback is a phenomenon in which a
single failure leads to a series of transaction rollback.
For example :
Schedule S
T1 T2 T3
read (A )
read (B )
write (A )
read (A )
write (A )
read ( A)
In the example, transaction T1 writes a value of A that is read by transaction
T2. Transaction T2 writes a value of A that is read by transaction T3. Suppose
that at this point T1 fails. T1 must be rolled back. Since T2 is dependent on T1,
T2 must be rolled back, since T3 is dependent on T2, T3 must be rolled back.
Need for cascadeless schedules :
Cascadeless schedules are desirable because the failure of a transaction
does not lead to the aborting of any other transaction. This comes at the
cost of less concurrency.
Answer
The set of rules which must be followed for preparing serializable
schedule are :
1. Take any concurrent schedule.
2. Draw the precedence graph for concurrent schedule.
3. If there is a cycle in precedence graph then schedule is not serializable.
4. If there is no cycle the schedule is serializable.
5. Prepare serializable schedule using precedence graph.
Database Management System 4–13 A (CS/IT-Sem-5)
Answer
Schedule : Refer Q. 4.10, Page 4–9A, Unit-4.
Difference between conflict and view serializability :
S. No. Conflict serializability View serializability
1. Easy to achieve. Difficult to achieve.
2. Cheaper to test. Expensive to test.
3. Every conflict serializable is Every view serializable is not
view serializable. conflict serializable.
4. Used in most concurrency Not used in concurrency control
control scheme. scheme.
Que 4.16. What is schedule ? What are its types ? Explain view
serializable and cascadeless schedule with suitable example of each.
AKTU 2018-19, Marks 07
Answer
Schedule : Refer Q. 4.10, Page 4–9A, Unit-4.
Types of schedule are :
1. Recoverable schedule
2. Cascadeless schedule
3. Strict schedule
View serializable : Refer Q. 4.9, Page 4–8A, Unit-4.
Cascadeless schedule : Refer Q. 4.10, Page 4–9A, Unit-4.
Transaction Processing Concept 4–14 A (CS/IT-Sem-5)
Answer
For S1 :
T1 T2 T3
r1(x)
r3(x)
w3(x)
w1(x)
r2(x)
x
T2
T1
x T3
Fig. 4.17.1.
Since, the graph does not contain cycle. Hence, it is conflict serializable.
For S2 :
T1 T2 T3
r3(x)
r2(x)
w3(x)
r1(x)
w1(x)
T1 T2
x
x
T3
Fig. 4.17.2.
Database Management System 4–15 A (CS/IT-Sem-5)
Since, the graph does not contain cycle. Hence, it is conflict serializable.
For S3 :
T1 T2 T3
r3(x)
r3(x)
r1(x)
w3(x)
w1(x)
T1 T2
x
T3
Fig. 4.17.3.
Since, the graph contains cycle. Hence, it is not conflict serializable.
PART-3
Log Based Recovery, Checkpoints.
Questions-Answers
Answer
1. The log / system log is a sequence of log records, recording all the update
activities in the database.
2. Various types of log records are denoted as :
a. <Ti start> : Transaction Ti has started.
b. <Ti, Xj, V1, V2> : Transaction Ti has performed a write on data item
Xj. Xj had value V1 before the write, and will have value V2 after the
write.
Transaction Processing Concept 4–16 A (CS/IT-Sem-5)
Answer
1. Shadow paging is a technique in which multiple copies (known as shadow
copies) of the data item to be modified are maintained on the disk.
2. Shadow paging considers the database to be made up of fixed-size logical
units of storage called pages.
3. These pages are mapped into physical blocks of storage with the help of
page table (or directory).
4. The physical blocks are of the same size as that of the logical blocks.
5. A page table with n entries is constructed in which the ith entry in the
page table points to the ith database page on the disk as shown in
Fig. 4.19.1.
6. The main idea behind this technique is to maintain two page tables.
a. In current page the entries points to the most recent database
pages on the disk. When a transaction starts, the current page
table is copied into a shadow page table (or shadow directory).
b. The shadow page table is then saved on the disk and the current
page table is used by the transaction. The shadow page table is
never modified during the execution of the transaction.
Shadow page
table Database Disk pages Current page table
1 1 Page 1 (old) 1
2 2 Page 2 2
3 3 Page 3 (old) 3
4 4 Page 4 4
5 5 Page 5 5
6 6 Page 6 6
7 7 Page 7 7
8 8 Page 8 8
9 Page 1 (new)
10 Page 3 (new)
11
Answer
Checkpointing :
1. It is a process of saving a snapshot of the application’s state, so that it
can restart from that point in case of failure.
2. Checkpoint is a point of time at which a record is written onto the
database from the buffers.
3. Checkpointing shortens the recovery process.
Types of checkpointing techniques :
1. Consistent checkpointing :
a. Consistent checkpointing creates a consistent image of the database
at checkpoint.
b. During recovery, only those transactions which take place after
last checkpoint are undone or redone.
c. The transactions that take place before the last consistent
checkpoint are already committed and need not be processed again.
d. The actions taken for checkpointing are :
i. All changes in main-memory buffers are written onto the
disk.
ii. A “checkpoint” record is written in the transaction log.
iii. The transaction log is written to the disk.
2. Fuzzy checkpointing :
a. In fuzzy checkpointing, at the time of checkpoint, all the active
transactions are written in the log.
b. In case of failure, the recovery manager processes only those
transactions that were active during checkpoint and later.
c. The transactions that have been committed before checkpoint are
written to the disk and hence need not be redone.
Que 4.21. What is log file ? Write the steps for log based recovery
Answer
Log file : A log file is a file that records all the update activities occur in the
database.
Database Management System 4–19 A (CS/IT-Sem-5)
Answer
There are many different database recovery techniques to recover a database :
1. Deferred update recovery : Refer Q. 4.18, Page 4–15A, Unit-4.
Advantages :
a. Recovery is easy.
b. Cascading rollback does not occur because no other transaction
sees the work of another until it is committed.
Disadvantages :
a. Concurrency is limited.
2. Immediate update recovery : Refer Q. 4.18, Page 4–15A, Unit-4.
Transaction Processing Concept 4–20 A (CS/IT-Sem-5)
Advantages :
a. It allo ws higher concurrency because transactions write
continuously to the database rather than waiting until the commit
point.
Disadvantages :
a. It leads to cascading rollbacks.
b. It is time consuming and may be problematic.
PART-4
Deadlock Handling.
Questions-Answers
Answer
Deadlock :
1. A deadlock is a situation in which two or more transactions are waiting
for locks held by the other transaction to release the lock.
2. Every transaction is waiting for another transaction to finish its
operations.
Methods to handle a deadlock :
1. Deadlock prevention protocol : This protocol ensures that the system
will not go into deadlock state. There are different methods that can be
used for deadlock prevention :
a. Pre-declaration method : This method requires that each
transaction locks all its data item before it starts execution.
b. Partial ordering method : In this method, system imposes a
partial ordering of all data items and requires that a transaction
can lock a data item only in the order specified by partial order.
c. Timestamp method : In this method, the data item are locked
using timestamp of transaction.
Database Management System 4–21 A (CS/IT-Sem-5)
2. Deadlock detection :
a. When a transaction waits indefinitely to obtain a lock, system
should detect whether the transaction is involved in a deadlock or
not.
b. Wait-for-graph is one of the methods for detecting the deadlock
situation.
c. In this method a graph is drawn based on the transaction and
their lock on the resource.
d. If the graph created has a closed loop or a cycle, then there is a
deadlock.
Wait-for-lock(R1)
T1 T2
Wait-for-lock(R2)
Fig. 4.23.1. Wait-for-graph.
Answer
Deadlock prevention schemes :
1. Wait-die scheme :
i. In this scheme, if a transaction request to lock a resource (data
item), which is already held with conflicting lock by some other
transaction, one of the two possibilities may occur :
a. If TS(Ti) < TS(Tj), i.e., Ti, which is requesting a conflicting
lock, is older than Tj, Ti is allowed to wait until the data item
is available.
b. If TS(Ti) > TS(Tj), i.e., Ti is younger than Tj, so Ti dies. Ti is
restarted later with random delay but with same timestamp.
ii. This scheme allows the older transaction to wait but kills the
younger one.
2. Wound-wait scheme :
i. In this scheme, if a transaction request to lock a resource (data
item), which is already held with conflicting lock by some other
transaction, one of the two possibilities may occur :
a. If TS(Ti) < TS(Tj), i.e., Ti, which is requesting a conflicting
lock, is older than Tj, Ti forces Tj to be rolled back, that is Ti
wounds Tj. Tj is restarted later with random delay but with
same timestamp.
b. If TS(Ti) > TS(Tj), i.e., Ti is younger than Tj, Ti is forced to
wait until the resource (i.e., data item) is available.
ii. This scheme, allows the younger transaction to wait but when an
older transaction request an item held by younger one, the older
transaction forces the younger one to abort and release the item.
In both cases, transaction, which enters late in the system, is
aborted.
Answer
Deadlock : Refer Q. 4.23, Page 4–20A, Unit-4.
Necessary condition for deadlock : A deadlock situation can arise if the
following four conditions hold simultaneously in a system :
1. Mutual exclusion : At least one resource must be held in a non-
sharable mode; that is, only one process at a time can use the resource.
If another process requests that resource, the requesting process must
be delayed until the resource has been released.
2. Hold and wait : A process must be holding at least one resource and
waiting to acquire additional resources that are currently being held by
other processes.
3. No pre-emption : Resources cannot be pre-empted; i.e., a resource can
be released only by the process holding it, after that process has
completed its task.
4. Circular wait : A set {P0, P1, ..., Pn} of waiting processes must exist
such that P0 is waiting for a resource held by P1, P1 is waiting for a
resource held by P2, ..., Pn–1 is waiting for a resource held by Pn, and Pn
is waiting for a resource held by P0.
Deadlock detection and recovery : Refer Q. 4.23, Page 4–20A, Unit-4.
PART-5
Distributed Database : Distributed Data Storage.
Questions-Answers
Answer
Distributed database :
Fragments
Fragments of Main
of Main database
database
User 2
User 1
Fragments Fragments
of Main of Main
database database
User 3
User 5
User 4 Fragments
of Main
database
Disadvantages of DDBMS :
1. Complex software is required for a distributed database environment.
2. The various sites must exchange message and perform additional
calculations to ensure proper coordination among the sites.
3. A by-product of the increased complexity and need for coordination is
the additional exposure to improper updating and other problems of
data integrity.
4. If the data are not distributed properly according to their usage, or if
queries are not formulated correctly, response to requests for data can
be extremely slow.
Answer
1. Atomic commit protocols are the key element in supporting global
atomicity of distributed transactions.
2. Two-phase commit protocol (2PC) is the standard atomic commit protocol.
3. 2PC is important to guarantee correctness properties in the complex
distributed world whilst at the same time it reduces parallelism due
to high disk and message overhead and locking during windows of
vulnerability.
4. An atomic commitment problem requires processes to agree on a
common outcome which can be either commit or abort.
5. An atomic commit protocol must guarantee the following atomic
commitment properties :
a. AC1 : All processes that reach an outcome reach the same one.
b. AC2 : A process cannot reverse its outcome after it has reached
one.
c. AC3 : The commit outcome can only be reached if all participant
voted Yes.
d. AC4 : If there are no failures and all participants voted Yes,
then the outcome will be commit.
e. AC5 : Consider any execution containing only failures that the
protocol is designed to tolerate.
6 At any point in the execution, if all existing failures are repaired and
no new failures occur for sufficiently long, then all processes will
eventually reach an outcome.
Que 4.28. Explain replication and its types in distributed system.
Answer
1. Replication is a technique of replicating data over a system.
Transaction Processing Concept 4–26 A (CS/IT-Sem-5)
Passive replication
Client
Answer
Fragmentation :
1. It is the decomposition of a relation into fragments.
2. It permits to divide a single query into a set of multiple sub-queries that
can execute parallel on fragments.
3. Fragmentation is done according to the data selection patterns of
applications running on the database.
Fragmentation techniques/types are as follows :
1. Vertical fragmentation :
a. It divides a relation into fragments which contain a subset of
attributes of a relation along with the primary key attribute of the
relation.
2. Horizontal fragmentation :
a. It divides a relation into fragments along its tuples. Each fragment
is a subset of tuples of a relation.
b. It identifies some specific rows based on some criteria and marks it
as a fragment.
Name Reg. No. Course Depth
Fragmentation1
Fragmentation2
Fragmentation3
Fragmentation4
Fig. 4.29.2. Horizontal fragmentation.
c. Various horizontal fragmentation techniques are :
i. Primary horizontal fragmentation : This type of
fragmentation is done where the tables in a database are
neither joined nor have dependencies. So, no relationship exists
among the tables.
ii. Derived horizontal fragmentation : Derived horizontal
fragmentation is used for parent relation. It is used where
tables are interlinked with the help of foreign keys. It ensures
that the fragments which are joined together are put on the
same site.
3. Hybrid/mixed fragmentation :
a. The mixed/hybrid fragmentation is combination of horizontal and
vertical fragmentations.
b. This type is most complex one, because both types are used in
horizontal and vertical fragmentation of the DB application.
c. The original relation is obtained back by join or union operations.
R
HF HF
R1 R2
VF VF VF VF
VF
Answer
Distributed database : Refer Q. 4.26, Page 4–23A, Unit-4.
Advantages of data replication :
i. Availability : If one of the sites containing relation r fails, then the
relation r can be found in another site. Thus, the system can continue to
process queries involving ‘r’, despite the failure of one site.
ii. Increased parallelism : Number of transactions can read relation r in
parallel. The more replicas of ‘r’ there are, the greater parallelism is
achieved.
Disadvantages of data replication :
i. Increased overhead on update : The system must ensure that all
replicas of a relation r are consistent; otherwise, erroneous computation
may result. Thus, whenever r is updated, the update must be propagated
to all sites containing replicas. The result is increased overhead.
Advantages of data fragmentation :
i. Parallelized execution of queries by different sites is possible.
ii. Data management is easy as fragments are smaller compare to the
complete database.
iii. Increased availability of data to the users/queries that are local to the
site in which the data stored.
iv. As the data is available close to the place where it is most frequently
used, the efficiency of the system in terms of query processing,
transaction processing is increased.
v. Data that are not required by local applications are not stored locally. It
leads to reduced data transfer between sites, and increased security.
Disadvantages of data fragmentation :
i. The performance of global application that requires data from several
fragments located at different sites may be slower.
ii. Integrity control may be more difficult if data and functional dependencies
are fragmented and located at different sites.
Transaction Processing Concept 4–30 A (CS/IT-Sem-5)
Answer
Distributed databases are classified as :
1. Homogeneous distributed database :
a. In this, all sites have identical database management system
software.
b. All sites are aware of one another, and agree to co-operate in
processing user’s requests.
2. Heterogeneous distributed database :
a. In this, different sites may use different schemas, and different
database management system software.
b. The sites may not be aware of one another, and they may provide
only limited facilities for co-operation in transaction processing.
PART-6
Concurrent Control, Directory System.
Database Management System 4–31 A (CS/IT-Sem-5)
Questions-Answers
Answer
1. Concurrency Control (CC) is a process to ensure that data is updated
correctly and appropriately when multiple transactions are concurrently
executed in DBMS.
2. It is a mechanism for correctness when two or more database
transactions that access the same data or dataset are executed
concurrently with time overlap.
3. In general, concurrency control is an essential part of transaction
management.
Concurrency control is needed :
1. To ensure consistency in the database.
2. To prevent following problem :
a. Lost update :
i. A second transaction writes a second value of a data item on
top of a first value written by a first concurrent transaction,
and the first value is lost to other transactions running
concurrently which need, by their precedence, to read the
first value.
ii. The transactions that have read the wrong value end with
incorrect results.
b. Dirty read :
i. Transactions read a value written by a transaction that has
been later aborted.
ii. This value disappears from the database upon abort, and should
not have been read by any transaction (“dirty read”).
iii. The reading transactions end with incorrect results.
Answer
Following are concurrency control mechanism in distributed
database :
a. Two-phase commit protocol : Two-phase commit protocol is
designed to allow any participant to abort its part of transaction. Due to
the requirement for atomicity, if one part of a transaction is aborted
then the whole transaction must also be aborted.
Following are the two phase used in this protocol :
Phase 1 (voting phase) :
1. The co-ordinator sends a canCommit? request to each of the
participants in the transaction.
2. When a participant receives canCommit? request it replies with its
vote (Yes or No) to the co-ordinator. Before voting Yes, it prepares
to commit by saving objects in permanent storage. If the vote is No,
the participant aborts immediately.
Phase 2 (completion according to outcome of vote) :
1. The co-ordinator collects the votes (including its own).
a. If there are no failures and all the votes are Yes the co-ordinator
decides to commit the transaction and sends a doCommit
request to each of the participants.
b. Otherwise the co-ordinator decides to abort the transaction
and sends doAbort requests to all participants that voted Yes.
2. Participants that voted Yes are waiting for a doCommit or doAbort
request from the co-ordinator. When a participant receives one of
these messages it acts accordingly and in the case of commit, makes
a haveCommitted calls as confirmation to the co-ordinator.
Co-ordinator Participant
status step status step
canCommit?
1 prepared to commit 2 prepared to
(waiting for votes) Yes commit
(uncertain)
doCommit
3 committed 4 committed
have Committed
done
Answer
1. A directory is a listing of information about some class of objects such
as persons.
2. Directories can be used to find information about a specific object, or in
the reverse direction to find objects that meet a certain requirement.
3. In the networked world, the directories are present over a computer
network, rather than in a physical (paper) form.
4. A directory system is implemented as one of more servers, which
service multiple clients.
5. Clients use the application programmer interface defined by the
directory system to communicate with the directory servers.
Directory access protocols :
1. Directory access protocol is a protocol that allows to access directory
information through program.
2. Directory access protocols also define a data model and access control.
3. For instance, web browsers can store personal bookmarks and other
browser settings in a directory system. A user can thus access the
same settings from multiple locations, such as at home and at work,
without having to share a file system.
Transaction Processing Concept 4–34 A (CS/IT-Sem-5)
Database Management System 5–1 A (CS/IT-Sem-5)
5
UNIT
Concurrency Control
Techniques
CONTENTS
Part-1 : Concurrency Control, ............................... 5–2A to 5–11A
Locking Techniques for
Concurrency Control
PART-1
Concurrency Control, Locking Techniques for Concurrency Control.
Questions-Answers
Answer
Refer Q. 4.32, Page 4–31A, Unit-4.
Answer
Lock : A lock is a variable associated with each data item that indicates
whether read or write operation is applied.
Different types of locks are :
1. Binary lock :
i. A binary lock can be two states or values : locked and unlocked (or
1 and 0).
ii. A distinct lock is associated with each database item X.
iii. If the value of the lock on X is 1, item X cannot be accessed by a
database operation that requests the item.
iv. If the value of the lock on X is 0, the item can be accessed when
requested.
v. We refer to the current value (or state) of the lock associated with
item X as lock(X).
vi. Two operations, lock_item and unlock_item, are used with binary
locking.
If the simple binary locking scheme is used, every transaction
must obey the following rules :
i. A transaction T must issue the operation lock_item(X) before any
read_item(X) or write_item(X) operations are performed in T.
Database Management System 5–3 A (CS/IT-Sem-5)
Answer
1. Lock compatibility determines whether locks can be acquired on a data
item by multiple transactions at the same time.
2. Suppose a transaction Ti requests a lock of mode m1 on a data item Q on
which another transaction Tj currently holds a lock of mode m2.
3. If mode m2 is compatible with mode m1, the request is immediately
granted, otherwise rejected.
Concurrency Control Techniques 5–4 A (CS/IT-Sem-5)
Shared YES NO
Exclusive NO NO
Answer
Implementation of locking :
1. The locking or unlocking of data items is implemented by a subsystem of
the database system known as the lock manager.
2. It receives the lock requests from transactions and replies them with a
lock grant message or rollback message (in case of deadlock).
3. In response to an unlock request, the lock manager only replies with an
acknowledgement. In addition, it may result in lock grant messages to
other waiting transactions.
The lock manager handles the requests by the transaction to lock
and unlock a data item in the following way :
1. Lock request :
a. When a first request to lock a data item arrives, the lock manager
creates a new linked list to record the lock request for the data
item.
b. It immediately grants the lock request of the transaction.
c. If the linked list for the data item already exists, it includes the
request at the end of the linked list.
d. The lock request will be granted only if the lock request is compatible
with all the existing locks and no other transaction is waiting for
acquiring lock on this data item otherwise, the transaction has to
wait.
2. Unlock request :
a. When an unlock request for the data items arrives, the lock manager
deletes the record corresponding to that transaction from the linked
list for the data item.
b. It then checks whether other waiting requests on that data item
can be granted.
Database Management System 5–5 A (CS/IT-Sem-5)
Answer
Implementation of lock manager :
1. A typical lock manager is implemented with a hash table, also called lock
table, with the data object identifier as the key.
2. A lock table entry contains the following information :
a. The number of transactions currently holding a lock on the object.
b. The nature of the lock.
c. A pointer to a queue of lock requests.
Reason for lock and unlock being atomic operations : Lock and
unlock must be atomic operations because it may be possible for two
transactions to obtain an exclusive lock on the same object, thereby
destroying the principles of 2PL.
Difference between lock and latch :
S. No. Lock Latch
1. It is used when we lock any It is used when we release lock.
data item.
2. Hold for long duration. Hold for short duration.
3. It is used at initial stage of It is used when all the operations
transaction. are completed.
Convoy :
1. Convoy is a queue of waiting transactions.
2. It occurs when a transaction holding a heavily used lock is suspended
by the operating system, and every other transaction that needs this
lock is queued.
Lock manager handle convoy by allowing a transaction to acquire lock only
once.
Concurrency Control Techniques 5–6 A (CS/IT-Sem-5)
Answer
1. Lock based protocol indicates when a transaction may lock and unlock
the data items, during the concurrent execution. It restricts the number
of possible schedules.
2. It ensures that the data items must be accessed in mutual exclusive
manner and for this we use different lock modes.
3. There are two modes in which a data item may be locked :
i. Shared mode lock : If a transaction Ti has obtained a shared
mode lock on item Q then Ti can read but cannot write Q. It is
denoted by S.
ii. Exclusive lock : If a transaction Ti has obtained an exclusive
mode lock on item Q then Ti can read and also write Q. It is denoted
by X.
Answer
1. Two-phase locking is a procedure in which a transaction is said to
follow the two-phase locking protocol if all locking operations precede
the first unlock operation in the transaction.
2. In 2PL, each transaction lock and unlock the data item in two phases :
a. Growing phase : In the growing phase, the transaction acquires
locks on the desired data items.
b. Shrinking phase : In the shrinking phase, the transaction
releases the locks acquired by the data items.
3. According to 2PL, the transaction cannot acquire a new lock, after it
has unlocked any of its existing locked items.
4. Given below, the two transactions T1 and T2 that do not follow the
two-phase locking protocol.
Database Management System 5–7 A (CS/IT-Sem-5)
T1 T2
Answer
1. Cascading rollbacks can be avoided by a modification of two-phase locking
called the strict two-phase locking protocol.
2. This protocol requires not only that locking be two phase, but also that
all exclusive-mode locks taken by a transaction be held until that
transaction commits.
3. This requirement ensures that any data written by an uncommitted
transaction are locked in exclusive mode until the transaction commits,
preventing any other transaction from reading the data.
4. Strict two-phase is the most widely used locking protocol in concurrency
control. This protocol has two rules :
a. If a transaction T wants to read (modify) an object, it first requests
a shared (exclusive) lock on the object.
Concurrency Control Techniques 5–8 A (CS/IT-Sem-5)
Fig. 5.8.1.
Answer
Salient features of graph based locking protocol are :
1. The graph based locking protocol ensures conflict serializability.
2. Free from deadlock.
3. Unlocking may occur earlier in the graph based locking protocol than
in the two phase locking protocol.
4. Shorter waiting time, and increase in concurrency.
5. No rollbacks are required.
6. Data items may be unlocked at any time.
7. Only exclusive locks are considered.
8. The first lock by T1 may be on any data item. Subsequently, a data Q
can be locked by T1 only if the parent of Q is currently locked by T1.
9. A data item that has been locked and unlocked by T 1 cannot
subsequently be relocked by T1.
Database Management System 5–9 A (CS/IT-Sem-5)
For example :
We have three transactions in this schedule, i.e., we will only see how
locking and unlocking of data item.
A
C
B
F
D E
G H
T1 T2 T3
Lock-X(A)
Lock-X(D)
Lock-X(H)
Unlock-X(D)
Lock-X(E)
Lock-X(D)
Unlock-X(B)
Unlock-X(E)
Lock-X(B)
Lock-X(E)
Unlock-X(H)
Answer
Two-phase locking technique : Refer Q. 5.7, Page 5–6A, Unit-5.
Two-phase locking guarantee the serializability :
1. Two-phase locking protocol restricts the unwanted read/write by
applying exclusive lock.
2. Moreover, when there is an exclusive lock on an item it will only be
released in shrinking phase.
3. Due to this restriction, there is no chance of getting any inconsistent
state. Because any inconsistency may only be created by write operation.
In this way the two-phase locking protocol ensures serializability.
Concurrency Control Techniques 5–10 A (CS/IT-Sem-5)
OR
Describe the problem faced when concurrent transactions are
executing in uncontrolled manner. Give an example and explain.
Answer
Concurrent transaction : Concurrent transaction means multiple
transactions are active at the same time.
Following problems can arise if many transactions try to access a common
database simultaneously :
1. The lost update problem :
a. A second transaction writes a second value of a data item on top of
a first value written by a first concurrent transaction, and the first
value is lost to other transactions running concurrently which
need, by their precedence, to read the first value.
b. The transactions that have read the wrong value end with
incorrect results.
Example :
Transaction T1 Transaction T2
Read X A 2
Read Y A1
A2 + A1 A2
Read X A2
A2 + 1 A2
Write A2 X
Write A1 Y Time Write A2 X
Example :
Transaction T1 Transaction T2
Read Y A1
Read X A 2
A2 + A1 A2
Write A2 X
Read X A2
Commit
Fails Time
Transaction T1 Transaction T2
Read Y A2 Read X A2
Read X A 1 A2 + 1 A2
A2 + A1 A2 Write A2 X
Write A2 X Rollback
Time
PART-2
Time Stamping Protocols for Concurrency Control,
Validation Based Protocol.
Concurrency Control Techniques 5–12 A (CS/IT-Sem-5)
Questions-Answers
Answer
Timestamp based protocols :
Timestamp based protocol ensures serializability. It selects an ordering among
transactions in advance using timestamps.
Timestamps :
1. With each transaction in the system, a unique fixed timestamp is
associated. It is denoted by TS(Ti).
2. This timestamp is assigned by the database system before the transaction
Ti starts execution.
3. If a transaction Ti has been assigned timestamp TS(Ti), and a new
transaction Tj enters the system, then TS(Ti) < TS(Tj).
4. The timestamps of the transactions determine the serializability order.
Thus, if TS(Tj) > TS(Ti ) , then the system must ensure that in produced
schedule, transaction Ti appears before transaction Tj .
5. To implement this scheme, two timestamps are associated with each
data item Q.
a. W-timestamp (Q) : It denotes the largest timestamp of any
transaction that executed write(Q) successfully.
b. R-timestamp (Q) : It denotes the largest timestamp of any
transaction that executed read(Q) successfully.
These timestamps are updated whenever a new read(Q) or write(Q)
instruction is executed.
The timestamp ordering protocol :
The timestamp ordering protocol ensures that any conflicting read and write
operations are executed in timestamp order. This protocol operates as follows :
1. Suppose that transaction Ti issues read(Q).
a. If TS(Ti ) < W-timestamp(Q), then Ti needs a value of Q that was
already overwritten. Hence, read operation is rejected, and Ti is
rolledback.
Database Management System 5–13 A (CS/IT-Sem-5)
Answer
i. Thomas’ write rule : Thomas’ write rule is a modified version of
timestamp ordering protocol. Suppose that transaction Ti issues
write(Q) :
1. If TS (Ti) < R-timestamp(Q), then the value of Q that Ti is producing
was previously needed, and it had been assumed that the value would
never be produced. Hence, the system rejects the write operation and
rolls Ti back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete
value of Q. Hence, this write operation can be ignored.
Concurrency Control Techniques 5–14 A (CS/IT-Sem-5)
Answer
Validation protocol in concurrency control consists of following
three phase :
1. Read phase :
a. During this phase, the system executes transaction Ti.
b. It reads the values of the various data items and stores them in
variables local to Ti.
c. It performs all write operations on temporary local variables,
without updates of the actual database.
2. Validation phase : Transaction Ti performs a validation test to
determine whether it can copy to the database, the temporary local
variables that hold the results of write operations without causing a
violation of serializability.
3. Write phase : If transaction Ti succeeds in validation phase, then the
system applies the actual updates to the database, otherwise, the system
rolls back Ti.
All three phases of concurrently executing transactions can be interleaved.
To perform the validation test, we should know when the various phases of
transactions Ti took place. We shall, therefore, associate three different
timestamps with transaction Ti :
1. Start (Ti), the time when Ti started its execution.
2. Validation (Ti), the time when Ti finished its read phase and started its
validation phase.
3. Finish (Ti), the time when Ti finished its write phase.
Database Management System 5–15 A (CS/IT-Sem-5)
Answer
Phantom phenomenon :
1. A deadlock that is detected but is not really a deadlock is called a phantom
deadlock.
2. In distributed deadlock detection, information about wait-for relationship
between transactions is transmitted from one server to another.
3. If there is a deadlock, the necessary information will eventually be
collected in one place and a cycle will be detected.
4. As this procedure will take some time, there is a chance that one of the
transactions that hold a lock will meanwhile have released it; in this
case the deadlock will no longer exist.
For example :
1. Consider the case of global deadlock detector that receives local wait-for
graph from servers X and Y as shown in Fig. 5.15.1 and 5.15.2.
T U
V T
U V
PART-3
Multiple Granularity, Multiversion Schemes.
Questions-Answers
Answer
Multiple granularity :
1. Multiple granularity can be defined as hierarchically breaking up the
database into blocks which can be locked.
2. It maintains the track of what to lock and how to lock.
3. It makes easy to decide either to lock a data item or to unlock a data
item.
Implementation :
1. Multiple granularity is implemented in transaction system by defining
multiple levels of granularity by allowing data items to be of various
sizes and defining a hierarchy of data granularity where the small
granularities are nested within larger ones.
2. In the tree, a non leaf node represents the data associated with its
descendents.
3. Each node is an independent data item.
4. The highest level represents the entire database.
5. Each node in the tree can be locked individually using shared or exclusive
mode locks.
6. If a node is locked in an intention mode, explicit locking is being done at
lower level of the tree (that is, at a finer granularity).
7. Intention locks are put on all the ancestors of a node before that node is
locked explicitly.
8. While traversing the tree, the transaction locks the various nodes in an
intention mode. This hierarchy can be represented graphically as a tree.
DB Database
A1 A2 Area
Fb Fc File
Fa
Que 5.17.
3.17. What is multiple granularity protocol of concurrency
control ?
Answer
1. Multiple granularity protocol is a protocol in which we lock the data
items in top-down order and unlock them in bottom-up order.
2. In multiple granularity locking protocol, each transaction Ti can lock a
node Q in any locking mode by following certain rules, which ensures
serializability. These rules are as follows :
i. Ti must follow the compatibility matrix as shown in Fig. 5.17.1 to
lock a node Q. This matrix contain following additional locks :
a. Intension-Shared (IS) : Explicit locking at a lower level of
tree but only with shared locks.
b. Intension-Exclusive (IX) : Explicit locking at a lower level
with exclusive or shared locks.
c. Shared and Intension-Exclusive (SIX) : The sub-tree
rooted by that node is locked explicitly in shared mode and
explicit locking is being done at a lower level with exclusive
mode locks.
ii. It first locks the root of the tree and then locks the other nodes.
iii. It can lock a node Q in S or IS mode only if it currently has the
parent of Q locked in either IX or IS mode.
iv. It can lock node Q in X, SIX or IX mode only if it currently has the
parent of Q locked in either IX or SIX mode.
v. It can lock a node if it has not previously unlocked any node.
vi. It can unlock a node Q only if it currently has none of the children
of Q locked.
Requested mode X SIX IX S IS
X NO NO NO NO NO
SIX NO NO NO NO YES
IX NO NO YES NO YES
S NO NO NO YES YES
IS NO YES YES YES YES
Answer
Granularity locking :
1. Granularity locking is a concept of locking the data item on the basis of
size of data item.
2. It is based on the hierarchy of data where small granularities are nested
within larger one. The lock may be granted at any level from bottom to
top.
Effect of granularity of data item over the performance of
concurrency control :
1. The larger the data item size is, the lower the degree of concurrency
permitted. For example, if the data item size is disk block, a transaction
T that need to lock a record B must lock the whole disk block X that
contains B. If the other transactions want to lock record C which resides
in same lock then it is forced to wait.
2. If the data item size is small then the number of items in the database
increases. Because every item is associated with a lock, the system will
have a larger number of active locks to be handled by the lock manager.
3. More lock and unlock operations will be performed which cause higher
overhead.
Factors affecting the selection of granularity size of data items :
1. It depends on the types of transaction involved.
2. If a typical transaction accesses a small number of records, it is
advantageous to have the data item granularity be one record.
3. If a transaction typically accesses many records in the same file, it may
be better to have block or file granularity so that the transaction will
consider all those records as one (or a few) data items.
B E
C D F G
T1 wants to access item C in read mode
T2 wants to access item D in exclusive mode
T3 wants to read all the children of item B
T4 wants to access all items in read mode
AKTU 2018-19, Marks 07
Concurrency Control Techniques 5–20 A (CS/IT-Sem-5)
Answer
Multi granularity : Refer Q. 5.16, Page 5–17A, Unit-5.
Concurrency in multi granularity protocol : Refer Q. 5.17,
Page 5–17A, Unit-5.
Numerical :
1. Transaction T1 reads the item C in B. Then, T2 needs to lock the item
the item A and B in IS mode (and in that order), and finally to lock the
item C in S mode.
2. Transaction T2 modifies the item D in B. Then, T2 needs to lock the
item A AND B (and in that order) in IX mode, and at last to lock the
item D in X mode.
3. Transaction T3 reads all the records in B. Then, T3 needs to lock the A
and B (and in that order) in IS mode, and at last to lock the item B in S
mode.
4. Transaction T4 read the all item. It can be done after locking the item
A in S mode.
Answer
Multiversion concurrency control :
1. Multiversion concurrency control is a schemes in which each write(Q)
operation creates a new version of Q.
2. When a transaction issues a read(Q) operation, the concurrency-control
manager selects one of the version of Q to be read.
3. The concurrency control scheme must ensure that the version to be
read is selected in manner that ensures serializability.
Multiversion timestamping protocol :
1. The most common transaction-ordering technique used by multiversion
schemes is timestamping.
2. With each transaction Ti in the system, we associate a unique static
timestamp, denoted by TS(Ti).
3. This timestamp is assigned before the transaction starts execution.
4. Concurrency can be increased if we allow multiple versions to be stored,
so that the transaction can access the version that is consistent for them.
5. With this protocol, each data item Q is associated with a sequence of
versions < Q1, Q2, . . . , Qm >.
6. Each version Qk contains three data fields :
a. Content is the value of version Qk.
Database Management System 5–21 A (CS/IT-Sem-5)
Answer
1. The multiversion two-phase locking attempts to combine the advantages
of multiversion concurrency control with the advantages of two-phase
locking.
2. In addition to read and write lock modes, multiversion two-phase locking
provides another lock mode, i.e., certify.
3. In order to determine whether these lock modes are compatible with
each other or not, consider Fig. 5.21.1
Answer
Problems that can arise during concurrent execution of two or
more transaction : Refer Q. 5.11, Page 5–10A, Unit-5.
Methods to avoid these problems :
1. Lock based protocol :
a. It requires that all data items must be accessed in a mutually
exclusive manner.
b. In this protocol, concurrency is controlled by locking the data
items.
c. A lock guarantees exclusive use of a data item to current
transaction.
d. Locks are used as a means of synchronizing the access by
concurrent transaction to the database items.
2. Timestamp based protocol : Refer Q. 5.12, Page 5–12A, Unit-5.
3. Multiversion scheme :
a. Multiversion timestamping protocol : Refer Q. 5.20,
Page 5–20A, Unit-5.
b. Multiversion two-phase locking : Refer Q. 5.21, Page 5–21A,
Unit-5.
PART-4
Recovery with Concurrent Transaction, Case Study of Oracle.
Questions-Answers
Answer
Recovery from concurrent transaction can be done in the following four
ways :
1. Interaction with concurrency control :
a. In this scheme, the recovery scheme depends greatly on the
concurrency control scheme that is used.
b. So to rollback a failed transaction, we must undo the updates
performed by the transaction.
2. Transaction rollback :
a. In this scheme we rollback a failed transaction by using the log.
b. The system scans the log backward, for every log record found in
the log the system restores the data item.
3. Checkpoints :
a. In this scheme we used checkpoints to reduce the number of log
records that the system must scan when it recovers from a crash.
b. In a concurrent transaction processing system, we require that
the checkpoint log record be of the form <checkpoint L>, where
‘L’ is a list of transactions active at the time of the checkpoint.
4. Restart recovery :
a. When the system recovers from a crash, it constructs two lists.
b. The undo-list consists of transactions to be undone, and the redo-
list consists of transaction to be redone.
c. The system constructs the two lists as follows : Initially, they are
both empty. The system scans the log backward, examining each
record, until it finds the first <checkpoint> record.
Answer
1. The Oracle database (commonly referred to as Oracle RDBMS or simply
Oracle) consists of a relational database management system (RDBMS).
2. Oracle is a multi-user database management system. It is a software
package specializing in managing a single, shared set of information
among many concurrent users.
3. Oracle is one of many database servers that can be plugged into a
client/server equation.
4. Oracle works to efficiently manage its resource, a database of
information, among the multiple clients requesting and sending data
in the network.
Concurrency Control Techniques 5–24 A (CS/IT-Sem-5)
Storage :
1. The Oracle RDBMS stores data logically in the form of table spaces and
physically in the form of data files.
2. Table spaces can contain various types of memory segments, such as
data segments, index segments, etc.
Que 5.25. Write the name of disk files used is Oracle. Explain
database schema.
Answer
Disk files consists two files which are as follows :
1. Data files :
a. At the physical level, data files comprise one or more data blocks,
where the block size can vary between data files.
b. Data files can occupy pre-allocated space in the file system of a
computer server, utilize raw disk directly, or exist within ASM
logical volumes.
2. Control files : One (or multiple multiplexed) control files (also known
as “control files”) store overall system information and statuses.
Database schema :
1. Oracle database conventions refer to defined groups of object ownership
as schemas.
2. Most Oracle database installation has a default schema called SCOTT.
3. After the installation process has set up the sample tables, the user can
log into the database with the username scott and the password tiger.
5. The SCOTT schema has seen less use as it uses few of the features of
the more recent releases of Oracle.
6. Most recent examples supplied by Oracle Corporation reference the
default HR or OE schemas.
Answer
i. Tablespace :
1. A tablespace is a logical portion of an Oracle database used to allocate
storage for table and index data.
2. Each tablespace corresponds to one or more physical database files.
Database Management System 5–25 A (CS/IT-Sem-5)
Que 5.27. Explain SQL Plus, SQL * Net and SQL & LOADER.
Answer
SQL Plus :
1. SQL Plus is the front end tools for Oracle.
2. The SQL Plus window looks much like a DOS window with a white
background similar Notepad.
3. This tool allows us to type in our statements, etc., and see the results.
SQL * Net :
1. This is Oracle’s own middleware product which runs on both the client
and server to hide the complexity of the network.
2. SQL * Net’s multiprotocol interchange allows client/server connections
to span multiple communication protocols without the need for bridges
and routers, etc., SQL * Net will work with any configuration design.
SQL * LOADER :
1. A utility used to load data from external files into Oracle tables.
2. It can load data from as ASCII fixed-format or delimited file into an
Oracle table.
Concurrency Control Techniques 5–26 A (CS/IT-Sem-5)
Answer
Locking techniques :
1. The locking technique is used to control concurrency execution of
transactions which is based on the concept of locking data items.
2. The purpose of locking technique is to obtain maximum concurrency
and minimum delay in processing transactions.
3. A lock is a variable associate with a data item in the database and
describes the status of that data item with respect to possible operations
that can be applied to the item; there is one lock for each data item in the
database.
Following are the locking techniques with concurrent transaction :
1. Lock based locking technique : Refer Q. 5.2, Page 5–2A, Unit-5.
2. Two-phase locking technique : Refer Q. 5.7, Page 5–6A, Unit-5.
Following are the recovery techniques with concurrent transaction :
1. Log based recovery : Refer Q. 4.18, Page 4–15A, Unit-4.
2. Checkpoint : Refer Q. 4.20, Page 4–18A, Unit-4.
Database Management System SQ–1 A (CS/IT-Sem-5)
1
UNIT
Introduction
(2 Marks Questions)
1.4. What is data model ? List the types of data model used.
AKTU 2016-17, Marks 02
Ans. Data model is a logical structure of the database. It is a collection of
conceptual tools for describing data, data relationships, data
semantics and consistency constraints.
Types of data model used :
1. Hierarchical model
2. Network model
3. Relational model
4. Object-oriented model
5. Object-relational model
2 Marks Questions SQ–2 A (CS/IT-Sem-5)
Student name
Fig. 1.11.1.
1.11. Write the difference between super key and candidate key.
AKTU 2018-19, Marks 02
Ans.
S. No. Super key Candidate key
1. Super key is a set of one or A candidate key is a column, or set
mo re attribute s take n of column, in the table that can
collectively that allows us to uniquely identify any database
identify uniquely an entity in record without referring to any
the entity set. other data.
2. Supe r ke y is a bro adest Candidate key is a subset of super
unique identifier. key.
2 Marks Questions SQ–4 A (CS/IT-Sem-5)
1.12. Give example for one to one and one to many relationships.
AKTU 2016-17, Marks 02
Ans.
A B A B
b1
a1 b1
a1 b2
a2 b2
a2 b3
a3 b3
a3 b4
a4 b4
b5
1.16. Define super key, candidate key, primary key and foreign
key. AKTU 2019-20, Marks 02
Ans. Super key : It is a set of one or more attributes that, taken
collectively, allows us to identify uniquely an entity in the entity
set.
Candidate key : A candidate key is a column, or set of column, in
the table that can uniquely identify any database record without
referring to any other data.
Primary key : Primary key is a candidate key that is used for
unique identification of entities within the table.
Database Management System SQ–5 A (CS/IT-Sem-5)
Foreign key : A foreign key is derived from the primary key of the
same or some other table. Foreign key is the combination of one or
more columns in a table (parent table) at references a primary key
in another table (child table).
2 Marks Questions SQ–6 A (CS/IT-Sem-5)
2
UNIT
Relational Data Model
and Language
(2 Marks Questions)
Foreign Key
Table 1.
ENO NAME Age DNO
1 Ankit 19 10
2 Srishti 18 11
3 Somvir 22 14 Not Allowed as DNO 14 is not
4 Sourabh 19 10 defined as a primary key of Table 2,
and in Table 1, DNO is a foreign
key defined
Relationship
Table 2.
DNO D.Location
10 Rohtak
11 Bhiwani
12 Hansi
Primary Key
2 Marks Questions SQ–10 A (CS/IT-Sem-5)
3
UNIT
Database Design &
Normalization
(2 Marks Questions)
2 Marks Questions SQ–14 A (CS/IT-Sem-5)
4
UNIT
Transaction Processing
Concept
(2 Marks Questions)
Demerits :
i. Commit overhead
ii. Data fragmentation
iii. Garbage collection
Database Management System SQ–17 A (CS/IT-Sem-5)
5
UNIT
Concurrency Control
Techniques
(2 Marks Questions)
Database Management System SP–1 A (CS/IT-Sem-5)
B. Tech.
(SEM. IV) EVEN SEMESTER THEORY
EXAMINATION, 2015-16
DATABASE MANAGEMENT SYSTEMS
SECTION – A
Note : Attempt all parts. All parts carry equal marks. Write answer
of each part in short. (2 × 10 = 20)
SECTION – B
Note : Attempt any five questions from this section : (10 × 5 = 50)
2. A university registrar’s office maintains data about the
following entities (a) courses, including number, title,
credits, syllabus and prerequisites; (b) course offerings,
including course number, year, semester section number,
instructor(s), timings and classroom; (c) students, including
student-id, name and program; and (d) instructors,
Solved Paper (2015-16) SP–2 A (CS/IT-Sem-5)
SECTION – C
Note : Attempt any two questions from this section : (15 × 2 = 30)
10. Describe major problems associated with concurrent
processing with examples. What is the role of locks in
avoiding these problems ?
Solved Paper (2015-16) SP–4 A (CS/IT-Sem-5)
SECTION – A
Note : Attempt all parts. All parts carry equal marks. Write answer
of each part in short. (2 × 10 = 20)
Student name
Fig. 1.
Ans. In RDBMS, certain update anomalies can arise, which are as follows :
i. Insertion anomaly : It leads to a situation in which certain
information cannot be inserted in a relation unless some other
information is stored.
ii. Deletion anomaly : It leads to a situation in which deletion of data
representing certain information results in losing data representing
some other information that is associated with it.
iii. Modification anomaly : It leads to a situation in which repeated
data changed at one place results in inconsistency unless the same
data are also changed at other places.
SECTION – B
Note : Attempt any five questions from this section : (10 × 5 = 50)
2. A university registrar’s office maintains data about the
following entities (a) courses, including number, title,
credits, syllabus and prerequisites; (b) course offerings,
including course number, year, semester section number,
instructor(s), timings and classroom; (c) students, including
student-id, name and program; and (d) instructors,
including identification number, name department and title.
Further the enrollment of students in courses and grades
awarded to students in each course they are enrolled for
must be appropriately modeled. Construct an ER diagram
for the registrar’s office. Document all assumption that
you make about the mapping constraints.
Ans. In this ER diagram, the main entity sets are student, course, course
offering and instructor. The entity set course offering is a weak
entity set dependent on course. The assumptions made are :
a. A class meets only at one particular place and time. This ER diagram
cannot model a class meeting at different places at different times.
b. There is no guarantee that the database does not have two classes
meeting at the same place and time.
Database Management System SP–7 A (CS/IT-Sem-5)
enrolls course
student offerings teaches instructor
syllabus courseno
prerequisite
requires course title
maincourse
credits
Fig. 2. ER diagram for University.
Registration Vehicle_id
Colour Make
Partially
Committed
committed
Active
Failed Aborted
Fig. 4.
2. Partially committed : A transaction is said to be entered in the
partial state when final statement gets executed. But it is still
possible that it may have to be aborted, since its actual operation
is still resided in main memory in which the power failure brings
failure of its execution.
3. Failed : A transaction enters a failed state after the system
determines that the transaction can no longer proceeds with its
normal execution.
Database Management System SP–13 A (CS/IT-Sem-5)
SECTION – C
Note : Attempt any two questions from this section : (15 × 2 = 30)
Example :
Transaction T1 Transaction T2
Read X A 2
Read Y A1
A2 + A1 A2
Read X A2
A2 + 1 A2
Write A2 X
Write A1 Y Time Write A2 X
Transaction T1 Transaction T2
Read Y A1
Read X A 2
A2 + A1 A2
Write A2 X
Read X A2
Commit
Fails Time
Example :
Transaction T1 Transaction T2
Read Y A2 Read X A2
Read X A 1 A2 + 1 A2
A2 + A1 A2 Write A2 X
Write A2 X Rollback
Time
An example of unrepeatable read in which if T1 were to read the
value of X after T2 had updated X, the result of T1 would be different.
Role of locks :
1. It locks the data item in the transaction in correct order.
2. If any data item is locked than it must be unlock at the end of
operation.
T U
V T
U V
A1 A2 Area
Fb Fc File
Fa
Database Management System SP–1 A (CS/IT-Sem-5)
B. Tech.
(SEM. IV) EVEN SEMESTER THEORY
EXAMINATION, 2016-17
DATABASE MANAGEMENT SYSTEMS
Section-A
1. Answer all parts. All parts carry equal marks. Write answer of
each part in short. (2 × 10 = 20)
e. What is normalization ?
j. Define timestamp.
Section-B
2. Attempt any five question from this section. (10 × 5 = 50)
a. Consider the following relational database employee
(employee_name, street, city works (employee_name,
company_name, salary) company (company_name, city)
manage (employee_name, manager_name).
Give an expression in SQL to express each of the following
queries :
i. Find the names and cities of residence of all employees who
work for XYZ bank.
Solved Paper (2016-17) SP–2 A (CS/IT-Sem-5)
ii. Find the names, street address, and cities of residence of all
employee who works for XYZ Bank and earn more than
Rs. 10,000 per annum.
iii. Find the names of all employees in this database who live
in the same city as the company for which they work.
Section-C
Note : Attempt any two questions from this section. (15 × 2 = 30)
3. a. What are the relational algebra operation supported in
SQL ? Write the SQL statement for each operation.
Database Management System SP–3 A (CS/IT-Sem-5)
Section-A
1. Answer all parts. All parts carry equal marks. Write answer of
each part in short. (2 × 10 = 20)
b1
a1 b1
a1 b2
a2 b2
a2 b3
a3 b3
a3 b4
a4 b4
b5
Foreign Key
Table 1.
ENO NAME Age DNO
1 Ankit 19 10
2 Srishti 18 11
3 Somvir 22 14 Not Allowed as DNO 14 is not
4 Sourabh 19 10 defined as a primary key of Table 2,
and in Table 1, DNO is a foreign
key defined
Relationship
Table 2.
DNO D.Location
10 Rohtak
11 Bhiwani
12 Hansi
Primary Key
e. What is normalization ?
Ans. Normalization is the process of organizing a database to reduce
redundancy and improve data integrity.
j. Define timestamp.
Ans. A timestamp is a unique identifier created by the DBMS to identify
a transaction. This timestamp is used in timestamp based
concurrency control techniques.
Section-B
Ans.
S. No. Physical level Conceptual/ View level
Logical level
1. This is the lowest This is the middle level This is the highest
le ve l of data of data abstraction. level of data
abstraction. abstraction.
2. It describes how It describes what data It describes the user
data is actually is stored in database. interaction with
stored in database. database system.
3. It de scribe s the It de scribe s the It describes only those
complex low-level structure o f whole part of the database
data structures in database and hides in which the users are
detail. de tails of physical interested and hides
storage structure. rest of all information
from the users.
4. A user is not aware A user is not aware of A user is aware of the
of the complexity the co mple xity o f complexity of
of database. database. database.
Dynamic SQL :
1. The dynamic SQL component of SQL allows programs to construct
and submit SQL queries at run time.
2. Using dynamic SQL, programs can create SQL queries as strings at
run time and can either have them executed immediately or have
them prepared for subsequent use.
3. Preparing a dynamic SQL statement compiles it, and subsequent
uses of the prepared statement use the compiled version.
4. SQL defines standards for embedding dynamic SQL calls in a host
language, such as C, as in the following example,
char * sqlprog = “update account set balance = balance * 1.05
where account_number = ?”;
EXEC SQL prepare dynprog from : sqlprog;
char account[10] = “A-101”;
EXEC SQL execute dynprog using : account;
Shadow page
table Database Disk pages Current page table
1 1 Page 1 (old) 1
2 2 Page 2 2
3 3 Page 3 (old) 3
4 4 Page 4 4
5 5 Page 5 5
6 6 Page 6 6
7 7 Page 7 7
8 8 Page 8 8
9 Page 1 (new)
10 Page 3 (new)
11
Serializability :
1. Serializability is a property of a transaction schedule which is used
to keep the data in the data item in consistent state.
2. It is the classical concurrency scheme.
3. There are two type of serializability :
a. Conflict serializatility.
b. View serializability.
4. Serializability of schedule :
a. In DBMS, the basic assumption is that each transaction
preserves database consistency.
b. Thus, the serial execution of a set of transaction preserves
database consistency.
c. A concurrent schedule is serializable if it is equivalent to a
serial schedule.
Section-C
Note : Attempt any two questions from this section. (15 × 2 = 30)
3. a. What are the relational algebra operation supported in
SQL ? Write the SQL statement for each operation.
Ans. Basic relational algebra operations are as follows :
1. Select operation :
a. The select operation selects tuples that satisfies a given
predicate.
b. Select operation is denoted by sigma ().
c. The predicate appears as a subscript to .
d. The argument relation is in parenthesis after the .
2. Project operation :
a. The project operation is a unary operation that returns its
argument relation with certain attributes left out.
b. In project operation duplicate rows are eliminated.
c. Projection is denoted by pi ().
3. Set difference operation :
a. The set difference operation denoted by allows us to find
tuples that are in one relation but are not in another.
b. The expression r s produces a relation containing those
tuples in r but not in s.
4. Cartesian product operation :
a. The cartesian product operation, denoted by a cross (×), allows
us to combine information from any two relations. The
cartesian product of relations r1 and r2 is written as r1 × r2.
5. Rename operation :
a. The rename operator is denoted by rho ().
b. Given a relational algebra expression E, x(E)
returns the result of expression E under the name x.
c. The rename operation can be used to rename a relation r to
get the same relation under a new name.
d. The rename operation can be used to obtain a new relation
with new names given to the original attributes of original
relation as
Database Management System SP–11 A (CS/IT-Sem-5)
103 Suraj J2
104 XX
105 BB
106 CC
We want to manipulate the × operation between [personnel ×
software_packages] using SQL as :
Select * from personnel, software_packages
Piid PiName S
101 Jai J1
101 Jai J2
103 Suraj J1
103 Suraj J2
104 XX J1
104 XX J2
105 BB J1
105 BB J1
106 CC J1
106 CC J2
Solved Paper (2016-17) SP–12 A (CS/IT-Sem-5)
5. Rename :
Consider the Book relation with attributes Title, Author, Year
and Price. The rename operator is used on Book relation as follows :
Select title as Bname, Author as Aname, year as Pyear, price as
Bprice from Book as Temp
Here both the relation name and the attribute names are renamed.
Products purchased
Payment mode
Amount paid
Payee name
Invoice
Purchase No.
Product ID
Store No.
Store address
Product name
Expiry date
Price
Confirmation
Payment
Store
Store name
Available
Product
Product ID
Address
Salary
Shipping
Employee
Demand
Work
Employee name
delivery
Date of
Delivery
Employee
Number of
ID
No. of items
items
Promo name
Promotion
Shipping type
Initiates
Order
Tracking No.
Order ID
Promo code
Availability
location
Product ID
Order
Product
Price
name
Fig. 3.
Ans.
1. First Normal Form (1NF) :
a. A relations R is in 1NF if all domains are simple i.e., all elements
are atomic.
For example : The relation LIVED-IN given in Table 3.7.1 is not
in 1NF because the domain values of the attribute ADDRESS are
not atomic.
Table 1. LIVED-IN
Name Address
CITY Year-moved-in Year-left
Ashok Kolkata 2007 2015
Delhi 2011 2015
CITY Year-moved-in Year-left
Ajay Mumbai 2000 2004
Chennai 2005 2009
Relation not in 1NF and can be normalized by replacing the non-
simple domain with simple domains. The normalized form of
LIVED-IN is given in Table 2.
Table 2. LIVED-IN
Name City Year-moved-in Year-left
Ashok Kolkata 2007 2010
Ashok Delhi 2011 2015
Ajay Mumbai 2000 2004
Ajay Chennai 2005 2009
Database Management System SP–1 A (CS/IT-Sem-5)
B. Tech.
(SEM. V) ODD SEMESTER THEORY
EXAMINATION, 2017-18
DATABASE MANAGEMENT SYSTEM
c. Define DML.
f. Define 2 NF.
h. Define schedule.
SECTION-B
SECTION-C
Solved Paper (2017-18) SP–4 A (CS/IT-Sem-5)
SECTION-A
c. Define DML.
Ans. A DML is a language that enables users to access and manipulate
data as organized by the appropriate data model. Insert, update,
delete, query are commonly used DML commands.
f. Define 2 NF.
Ans. A relation R is in second normal form (2NF) if and only if it is in
1NF and every non-key attribute is fully dependent on the primary
key. A relation R is in 2NF if every non-prime attribute of R is
fully functionally dependent on each relation key.
Database Management System SP–5 A (CS/IT-Sem-5)
h. Define schedule.
Ans. A schedule is a list of operations (actions) ordered by time,
performed by a set of transactions that are executed together in
the system.
SECTION-B
Employee
Emp_Name City
Hari Pune
Om Mumbai
Suraj Nashik
Jai Solapur
Table. 2. The Employee_Salary relation.
Employee_Salary
Emp_Name Department Salary
Hari Computer 10000
Om IT 7000
Billu Computer 8000
Jai IT 5000
Select Employee.Emp_Name, Employee_Salary.Salary from
Employee inner join Employee_Salary on Employee.Emp_Name
= Employee_Salary.Emp_Name;
Result : The result of preceding query with selected fields of
Table 1 and Table 2.
Emp_Name Salary
Hari 10000
Om 7000
Jai 5000
2. Outer join :
a. An outer join is an extended form of the inner join.
Database Management System SP–7 A (CS/IT-Sem-5)
Ans.
S. No. BCNF 3NF
1. In BCNF, for any FDs for a In 3NF, there should be no
relation R, A B, A should be a transitive dependency that is no
super key of relation. non prime attribute should be
transitively dependent on the
candidate key.
2. It is comparatively stronger than It is less strong than BCNF.
3NF.
3. In BCN F, the functional In 3N F, the functio nal
dependencies are already in dependencies are already in
1NF, 2NF and 3NF. 1NF and 2NF.
4. Redundancy is low. Redundancy is high.
5. In BCNF, there may or may not In 3NF, there is preservation of
be preservation of all. all functional dependencies.
6. It is difficult to achieve. It is comparatively easier to
achieve.
7. Lossless decomposition is hard Lossless decomposition can be
to achieve in BCNF. achieved by 3NF.
T1 T2
SECTION-C
3. However, if Ii and Ij refer to the same data item Q, then the order
of the two steps matter.
4. Following are four possible cases :
Ii Ij Swapping possible
iii. Both schedules will produce the same final system state.
6. If a schedule S can be transformed into a schedule S by a series of
swaps of non-conflicting instructions, we say that S and S are
conflict equivalent.
7. The concept of conflict equivalence leads to the concept of conflict
serializability and the schedule S is conflict serializable.
View serializability :
1. The schedule S and S are said to be view equivalent if following
three conditions met :
a. For each data item Q, if transaction Ti reads the initial value
of Q in schedule S, then transaction Ti in schedule S, must
also read the initial value of Q.
b. For each data item Q if transaction Ti executes read (Q) in
schedule S and if that value produced by a write (Q) operation
executed by transaction Tj, then the read (Q) operation of
transaction Ti, in schedule S, must also read the value of Q
that was produced by the same write (Q) operation of
transaction Tj.
c. For each data item Q, the transaction (if any) that performs
the final write (Q) operation in schedule S must perform the
final write (Q) operation in schedule S.
2. Conditions (a) and (b) ensure that each transaction reads the same
values in both schedules and therefore, performs the same
computation. Condition (c), coupled with condition (a) and condition
(b) ensure that both schedules result in the same final system
state.
3. The concept of view equivalence leads to the concept of view
serializability.
4. We say that schedule S is view serializable, if it is view equivalent
to serial schedule.
5. Every conflict serializable schedule is also view serializable but
there are view serializable schedules that are not conflict
serializable.
Example :
Schedule S1 Schedule S2
T1 T2 T1 T2
Exclusive YES NO NO
Certify NO NO NO
branch-city
branch-name assets
branch
loan-branch
customer-name payment-date
access-date
account-number balance
cust-banker type
Aggregation
depositor account
manager
Generalization
Specialization
dependent-name telephone-number
employment
length start-date interest-rate overdraft-amount
3. Candidate key :
a. A candidate key is a column, or set of column, in the table
that can uniquely identify any database record without
referring to any other data.
b. Candidate key are individual columns in a table that qualifies
for uniqueness of all the rows. Here in Employee table
EmployeeID and SSN are candidate keys.
c. Minimal super keys are called candidate keys.
4. Composite key :
a. A composite key is a combination of two or more columns in
a table that can be used to uniquely identify each row in the
table.
b. It is used when we cannot identify a record using single
attributes.
c. A primary key that is made by the combination of more than
one attribute is known as a composite key.
5. Alternate key :
a. The alternate key of any table are those candidate keys which
are not currently selected as the primary key.
b. Exactly one of those candidate keys is chosen as the primary
key and the remainders, if any are then called alternate keys.
c. An alternate key is a function of all candidate keys minus the
primary key.
d. Here in Employee table if EmployeeID is primary key then
SSN would be the alternate key.
6. Foreign key :
a. Foreign key represents the relationship between tables and
ensures the referential integrity rule.
b. A foreign key is derived from the primary key of the same or
some other table.
c. Foreign key is the combination of one or more columns in a
table (parent table) at references a primary key in another
table (child table).
d. A foreign key value can be left null.
For example : Consider another table :
Project (ProjectName, TimeDuration, EmployeeID)
a. Here, the ‘EmployeeID’ in the ‘Project’ table points to the
‘EmployeeID’ in ‘Employee’ table
b. The ‘EmployeeID’ in the ‘Employee’ table is the primary key.
c. The ‘EmployeeID’ in the ‘Project’ table is a foreign key.
Ans.
i. MVD or JD :
1. MVD occurs when two or more independent multivalued facts
about the same attribute occur within the same relation.
2. MVD is denoted by X Y specified on relation schema R,
where X and Y are both subsets of R.
3. Both X and Y specifies the following constraint on any relation
state r of R : If two tuples t1 and t2 exist in r such that
t1(X) = t2(X), then two tuples t3 and t4 should also exist in r
with the following properties, where we use Z to denote
(R – (X Y))
t3(X) = t4(X) = t1(X) = t2(X)
t3(Y) = t1(Y) and t3 (Z) = t2(Z)
t4(Y) = t2(Y) and t4(Z) = t1(Z)
4. An MVD X Y in R is called a trivial MVD if
a. X is a subset of Y or
b. X Y = R
An MVD that satisfies neither (a) nor (b) is called a non-trivial
MVD.
For example :
Relation with MVD
Faculty Subject Committee
John DBMS Placement
John Networking Placement
John MIS Placement
John DBMS Scholarship
John Networking Scholarship
John MIS Scholarship
Join Dependency (JD) :
1. A Join Dependency (JD), denoted by (R1, R2, ..., Rn) specified
on relation scheme R, specifies a constraints on the states r of
R.
2. The constraint states that every legal state r of R should have
a lossless join decomposition into R1, R2, ..., Rn. That is, for
every such r, we have
( R1(r), R2(r),...., Rn(r)) = r
3. A join dependency JD (R1, R2, ..., Rn), specified on relation
schema R, is a trivial JD if one of the relation schemas Ri in
JD (R1, R2, ..., Rn) is equal to R.
4. Such a dependency is called trivial because it has the lossless
join property for any relation state r of R and hence does not
specify any constraint on R.
Solved Paper (2017-18) SP–22 A (CS/IT-Sem-5)
ii.
1. Normalization is the process of reducing data redundancy in a
relational database.
2. Normalization is a refinement process that the database
designer undertakes. After identifying the data objects of the
proposed database, their relationships define the tables
required and columns within each table.
3. The fundamental principle of normalization is, “The same data
should not be stored in multiple places.” No information is lost
in the process; however, the number of tables generally
increases as the rules are applied.
Advantages :
1. It helps to remove the redundancy from the relation.
2. It helps in easy manipulation of data.
3. It helps to provide more information to the user.
4. It eliminates modification anomalies.
VDL :
1. create view emp5 as
select * from employee
where dno = 5 ;
Creates view for dept 5 employees.
2. create view empdept as
select fname, lname, dno, dname
from employee, department
where dno=dnumber ;
Creates view using two tables.
Fragmentation1
Fragmentation2
Fragmentation3
Fragmentation4
Fig. 4. Horizontal fragmentation.
Solved Paper (2017-18) SP–26 A (CS/IT-Sem-5)
R1 R2
VF VF VF VF
VF
Database Management System SP–1 A (CS/IT-Sem-5)
B. Tech.
(SEM. V) ODD SEMESTER THEORY
EXAMINATION, 2018-19
DATABASE MANAGEMENT SYSTEM
SECTION-A
SECTION-B
c. What is log file ? Write the steps for log based recovery of a
system with suitable example.
Solved Paper (2018-19) SP–2 A (CS/IT-Sem-5)
SECTION-C
B E
C D F G
T1 wants to access item C in read mode
T2 wants to access item D in exclusive mode
T3 wants to read all the children of item B
T4 wants to access all items in read mode
Solved Paper (2018-19) SP–4 A (CS/IT-Sem-5)
SECTION-A
Ans.
S. No. Physical data Logical data
independence independence
1. Physical data independence is Logical data independence is the
the ability to modify physical ability to modify the conceptual
schema without causing the schema without having to change
conceptual schema or the external schemas or application
application programs to be programs.
rewritten.
2. Examples of physical Examples of logical data
independence are independence are addition/removal
re organisations of files, of entities.
adding a new access path etc.
SECTION-B
T1 T2
read (A)
write (A)
read (A)
read (B)
2. Cascadeless schedule :
a. A cascadeless schedule is one, where for each pair of
transaction Ti and Tj such that Tj reads a data item previously
written by T i, the commit operation to T j appears before
the read operation of T i.
b. Even if a schedule is recoverable, to recover correctly from
the failure of a transaction T i, we may have to rollback
several transactions. Such situations occur if transactions
have read data written by T i.
3. Strict schedule :
a. A schedule is called strict if every value written by a transaction
T is not read or changed by other transaction until T either
aborts or commits.
b. A strict schedule avoids cascading and recoverability.
View serializability :
1. The schedule S and S are said to be view equivalent if following
three conditions met :
a. For each data item Q, if transaction Ti reads the initial value
of Q in schedule S, then transaction Ti in schedule S, must
also read the initial value of Q.
b. For each data item Q if transaction Ti executes read (Q) in
schedule S and if that value produced by a write (Q) operation
Solved Paper (2018-19) SP–8 A (CS/IT-Sem-5)
T1 T2 T1 T2
c. What is log file ? Write the steps for log based recovery of a
system with suitable example.
Ans. Log file : A log file is a file that records all the update activities
occur in the database.
Database Management System SP–9 A (CS/IT-Sem-5)
T1 T2
Wait-for-lock(R2)
Fig. 1. Wait-for-graph.
2. Recovery from deadlock :
a. Selection of a victim : In this we determine which
transaction (or transactions) to roll back to break the deadlock.
We should rollback those transactions that will incur the
minimum cost.
b. Rollback : The simplest solution is a ‘‘total rollback’’. Abort
the transaction and then restart it.
c. Starvation : In a system where selection of transactions, for
rollback, is based on the cost factor, it may happen that the
some transactions are always picked up.
Query Evaluation
Engine
Query processor
Storage manager
Indices Data
Disk storage
Dictionary
SECTION-C
Ans.
Comparison :
S. No. Generalization Specialization Aggregation
For example :
1. Generalization :
Generalization
Account
IS
A
Saving Current
Fig. 3.
2. Specialization :
Specialization
Person
IS
A
Employee Customer
Fig. 4.
Solved Paper (2018-19) SP–14 A (CS/IT-Sem-5)
3. Aggregation :
For example :
a. The relationship works_on (relating the entity sets employee,
branch and job) act as a higher-level entity set.
b. We can then create a binary relationship ‘Manages’, between
works on and manager to represent who manages what tasks.
Job
Manages
Manager
Fig. 5. ER diagram with aggregation.
Natural join :
1. Natural join joins two tables based on same attribute name and
data types.
Database Management System SP–15 A (CS/IT-Sem-5)
2. The resulting table will contain all the attributes of both the table
but keep only one copy of each common column.
3. In natural join, if there is no condition specifies then it returns the
rows based on the common column.
For example : Consider the following two relations :
Student (Roll_No, Name)
Marks (Roll_No, Marks)
These two relations are shown in Table 1 and 2.
Table 1. The Student relation.
Student
Roll_No Name
1 A
2 B
3 C
Table 2. The Marks relation.
Marks
Roll_No Marks
2 70
3 50
4 85
Consider the query :
Select * from Student natural join Marks ;
Result :
Roll_No Name Marks
2 B 70
3 C 50
Left outer join and right outer join :
Consider following two relations :
Employee (Emp_Name, City)
Employee_Salary (Emp_Name, Department, Salary)
These two relations are shown in Table 3 and 4.
Solved Paper (2018-19) SP–16 A (CS/IT-Sem-5)
Employee
Emp_Name City
Hari Pune
Om Mumbai
Suraj Nashik
Jai Solapur
Table. 4. The Employee_Salary relation.
Employee_Salary
Emp_Name Department Salary
Hari Computer 10000
Om IT 7000
Billu Computer 8000
Jai IT 5000
Fig. 6.
i. In Fig. 6, [Name + Course] is a candidate key, So Name and Course
are prime attributes, Grade is fully functionally dependent on the
candidate keys and Phone no., Course-deptt. and roll no. are
partially functional dependent on the candidate key.
Database Management System SP–17 A (CS/IT-Sem-5)
Ans.
1. Two-phase locking is a procedure in which a transaction is said to
follow the two-phase locking protocol if all locking operations
precede the first unlock operation in the transaction.
2. In 2PL, each transaction lock and unlock the data item in two
phases :
a. Growing phase : In the growing phase, the transaction acquires
locks on the desired data items.
b. Shrinking phase : In the shrinking phase, the transaction
releases the locks acquired by the data items.
3. According to 2PL, the transaction cannot acquire a new lock, after
it has unlocked any of its existing locked items.
4. Given below, the two transactions T1 and T2 that do not follow the
two-phase locking protocol.
T1 T2
A
C
B
F
D E
G H
T1 T2 T3
Lock-X(A)
Lock-X(D)
Lock-X(H)
Unlock-X(D)
Lock-X(E)
Lock-X(D)
Unlock-X(B)
Unlock-X(E)
Lock-X(B)
Lock-X(E)
Unlock-X(H)
x
T2
T1
x T3
Fig. 7.
Since, the graph does not contain cycle. Hence, it is conflict
serializable.
For S2 :
T1 T2 T3
r3(x)
r2(x)
w3(x)
r1(x)
w1(x)
T1 T2
x
x
T3
Fig. 8.
Since, the graph does not contain cycle. Hence, it is conflict
serializable.
Database Management System SP–21 A (CS/IT-Sem-5)
For S3 :
T1 T2 T3
r3(x)
r3(x)
r1(x)
w3(x)
w1(x)
T1 T2
x
T3
Fig. 9.
Since, the graph contains cycle. Hence, it is not conflict serializable.
Numerical :
Given : R(A, B, C, D, E) and
F= AB
BC E
ED A
(ACD)+ = ACDB AB
= ABCDE BC E
= ABCDE ED A
So, ACD is a key of R.
Relation R is in 1NF as all domains are simple i.e., all elements are
atomic.
B E
C D F G
T1 wants to access item C in read mode
T2 wants to access item D in exclusive mode
T3 wants to read all the children of item B
T4 wants to access all items in read mode
Ans. Multi granularity :
1. Multiple granularity can be defined as hierarchically breaking up
the database into blocks which can be locked.
2. It maintains the track of what to lock and how to lock.
3. It makes easy to decide either to lock a data item or to unlock a data
item.
Database Management System SP–23 A (CS/IT-Sem-5)
Numerical :
1. Transaction T1 reads the item C in B. Then, T2 needs to lock the
item the item A and B in IS mode (and in that order), and finally to
Solved Paper (2018-19) SP–24 A (CS/IT-Sem-5)
Database Management System SP–1 A (CS/IT-Sem-5)
B. Tech.
(SEM. V) ODD SEMESTER THEORY
EXAMINATION, 2019-20
DATA BASE MANAGEMENT SYSTEM
SECTION-A
SECTION-B
SECTION-C
Solved Paper (2019-20) SP–4 A (CS/IT-Sem-5)
SECTION-A
SECTION-B
a1 b1
a2 b2
a3 b3
a4 b4
Fig. 1.
ii. One to many : An entity in A is associated with any number of
entities in B. An entity in B, however, can be associated with at
most one entity in A.
A B
b1
a1 b2
a2 b3
a3 b4
b5
Fig. 2.
iii. Many to one : An entity in A is associated with at most one entity
in B, and an entity in B, however, can be associated with any
number of entities in A.
A B
a1
a2 b1
a3 b2
a4 b3
a5
Fig. 3.
iv. Many to many : An entity in A is associated with any number of
entities in B, and an entity in B is associated with any number of
entities in A.
Database Management System SP–7 A (CS/IT-Sem-5)
A B
a1 b1
a2 b2
a3 b3
a4 b4
Fig. 4.
2. Participation constraints : It tells the participation of entity
sets. There are two types of participations :
i. Partial participation
ii. Total participation
For example :
1. Consider the case of global deadlock detector that receives local
wait-for graph from servers X and Y as shown in Fig. 9 and 10.
T U
V T
U V
3. Ti will need to read all the buckets in that index which have key
values in that range.
4. Ti will need to write one of the buckets in that index when any
deletion or insertion operation on the tuple is done.
5. Thus the logical conflict is converted to a conflict on an index bucket,
and the phantom phenomenon is avoided.
SECTION-C
Employee
Emp_Name City
Hari Pune
Om Mumbai
Suraj Nashik
Jai Solapur
Table. 2. The Employee_Salary relation.
Employee_Salary
Emp_Name Department Salary
Hari Computer 10000
Om IT 7000
Billu Computer 8000
Jai IT 5000
Select Employee.Emp_Name, Employee_Salary.Salary from
Employee inner join Employee_Salary on Employee.Emp_Name
= Employee_Salary.Emp_Name;
Result : The result of preceding query with selected fields of
Table 1 and Table 2.
Solved Paper (2019-20) SP–12 A (CS/IT-Sem-5)
Emp_Name Salary
Hari 10000
Om 7000
Jai 5000
2. Outer join :
a. An outer join is an extended form of the inner join.
b. It returns both matching and non-matching rows for the tables
that are being joined.
c. Types of outer join are as follows :
i. Left outer join : The left outer join returns matching rows from
the tables being joined and also non-matching rows from the left
table in the result and places null values in the attributes that
comes from the right table.
For example :
Select Employee.Emp_Name, Salary
from Employee left outer join Employee_Salary
on Employee.Emp_Name = Employee_Salary.Emp_Name;
Result : The result of preceding query with selected fields of
Table 1 and Table 2.
Emp_Name Salary
Hari 10000
Om 7000
Jai 5000
Suraj null
ii. Right outer join : The right outer join operation returns
matching rows from the tables being joined, and also non-matching
rows from the right table in the result and places null values in
the attributes that comes from the left table.
For example :
Select Employee.Emp_Name, City, Salary from Employee right
outer join
Employee_Salary on Employee.Emp_Name =
Employee_Salary. Emp_Name;
Result : The result of preceding query with selected fields of
Table 1 and Table 2.
Database Management System SP–13 A (CS/IT-Sem-5)
ii. This scheme, allows the younger transaction to wait but when
an older transaction request an item held by younger one,
the older transaction forces the younger one to abort and
release the item. In both cases, transaction, which enters late
in the system, is aborted.
ii. Finds the name of all employees who live in the same city
and on the same street as do their managers.
iii. Find the name street address and cities of residence of all
employees who work for ABC bank and earn more than
7,000 per annum.
iv. Find the name of all employee who earn more than every
employee of XYZ.
v. Give all employees of corporation ABC a 7% salary raise.
vi. Delete all tuples in the works relation for employees of
ABC.
vii. Find the name of all employees in this DATABASE who live
in the same city as the company for which they work.
Ans.
i. Select person_name from Works
Where company_name=‘ABC Bank’
ii. Select E1.person_name
From Employee as E1, Employee as E2, Manages as M
Where E1.person_name=M.person_name
and E2.person_name=M.manager_name
and E1.street=E2.street and E1.city=E2.city
iii. Select * from employee
where person_name in
(select person_name from Works
where company_name= ‘ABC Bank’ and salary>7000
select E.person_name, street,
city from Employee as E, Works as W
where E.person_name = W.person_name
and W.company_name=‘ABC Bank’ and W.salary>7000
iv. Select person_name from Works
where salary > all
(select salary from Works
where company_name=‘XYZ’)
select person_name from Works
where salary>(select max(salary) from Works
where company_name=‘XYZ’)
v. Update Works
set salary=salary*1.07
where company_name=‘ABC Bank’
vi. Delete from Works
where company_name=‘ABC Bank’
vii. Select E.person_name
from Employee as E, Works as W, Company as C
where E.person_name=W.person_name and E.city=C.city
and W.company_name=C.company_name