Unit 1 (Notes)


Unit-1

Introduction of DBMS

A Database Management System (DBMS) is a software system that is designed to
manage and organize data in a structured manner. It allows users to create, modify,
and query a database, as well as manage the security and access controls for that
database.

Some key features of a DBMS include:

1. Data modeling: A DBMS provides tools for creating and modifying data models,
which define the structure and relationships of the data in a database.
2. Data storage and retrieval: A DBMS is responsible for storing and retrieving data
from the database, and can provide various methods for searching and querying the
data.
3. Concurrency control: A DBMS provides mechanisms for controlling concurrent
access to the database, to ensure that multiple users can access the data without
conflicting with each other.
4. Data integrity and security: A DBMS provides tools for enforcing data integrity
and security constraints, such as constraints on the values of data and access
controls that restrict who can access the data.
5. Backup and recovery: A DBMS provides mechanisms for backing up and
recovering the data in the event of a system failure.

A DBMS can be classified into two types: Relational Database Management System
(RDBMS) and Non-Relational Database Management System (NoSQL or Non-SQL).

1. RDBMS: Data is organized in tables, and each table has a set of rows and
columns. Tables are related to each other through primary and foreign keys.
2. NoSQL: Data is organized as key-value pairs, documents, graphs, or column
families. These systems are designed to handle large-scale, high-performance
scenarios.
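
The way primary and foreign keys relate tables can be sketched with Python's built-in
sqlite3 module. This is only an illustration: the department and student tables, their
columns, and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Physics')")
conn.execute("INSERT INTO student VALUES (101, 'Asha', 1)")

# A student row that points at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (102, 'Ravi', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key lets the two tables be combined in one query.
joined = conn.execute(
    "SELECT s.name, d.name FROM student s JOIN department d ON s.dept_id = d.dept_id"
).fetchall()
print(rejected, joined)
```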
A database is a collection of interrelated data, organized in the form of tables,
views, schemas, reports, etc., so that data can be retrieved, inserted, and deleted
efficiently. For example, a university database organizes the data about students,
faculty, admin staff, etc.
Database Management System: The software which is used to manage
databases is called a Database Management System (DBMS). For example, MySQL,
Oracle, etc. are popular DBMSs used in different applications. A DBMS
allows users to perform the following tasks:
 Data Definition: It helps in the creation, modification, and removal of definitions
that define the organization of data in the database.
 Data Updation: It helps in the insertion, modification, and deletion of the actual
data in the database.
 Data Retrieval: It helps in the retrieval of data from the database which can be
used by applications for various purposes.
 User Administration: It helps in registering and monitoring users, enforcing data
security, monitoring performance, maintaining data integrity, dealing with
concurrency control, and recovering information corrupted by unexpected failure.

Paradigm Shift from File System to DBMS

File System manages data using files on a hard disk. Users are allowed to create,
delete, and update the files according to their requirements. Let us consider the
example of file-based University Management System. Data of students is available to
their respective Departments, Academics Section, Result Section, Accounts Section,
Hostel Office, etc. Some of the data is common for all sections like Roll No, Name,
Father Name, Address, and Phone number of students but some data is available to a
particular section only like Hostel allotment number which is a part of the hostel
office. Let us discuss the issues with this system:
 Redundancy of data: Data is said to be redundant if the same data is copied at
many places. If a student wants to change their Phone number, he or she has to get
it updated in various sections. Similarly, old records must be deleted from all
sections representing that student.
 Inconsistency of Data: Data is said to be inconsistent if multiple copies of the
same data do not match each other. If the Phone number is different in Accounts
Section and Academics Section, it will be inconsistent. Inconsistency may be
because of typing errors or not updating all copies of the same data.
 Difficult Data Access: A user should know the exact location of the file to access
data, so the process is cumbersome and tedious. For example, finding the hostel
allotment number of one student among 10000 unsorted student records means
scanning the records one by one.
 Unauthorized Access: File Systems may lead to unauthorized access to data. If a
student gets access to a file having his marks, he can change it in an unauthorized
way.
 No Concurrent Access: The access of the same data by multiple users at the same
time is known as concurrency. A plain file system provides no mechanism to
coordinate concurrent updates, so data is typically restricted to one user at a time.
 No Backup and Recovery: The file system does not incorporate any backup and
recovery of data if a file is lost or corrupted.
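
The redundancy and inconsistency problems can be sketched in a few lines of Python. In
this hypothetical example, two dictionaries stand in for the separate files kept by the
Accounts and Academics sections, and an in-memory SQLite table stands in for the shared
database; all names and values are invented for illustration.

```python
import sqlite3

# File-based approach: each section keeps its own copy of the phone number.
accounts_file = {101: {"name": "Asha", "phone": "555-0100"}}
academics_file = {101: {"name": "Asha", "phone": "555-0100"}}

# Only one copy is updated, so the two copies no longer match.
accounts_file[101]["phone"] = "555-0199"
inconsistent = accounts_file[101]["phone"] != academics_file[101]["phone"]

# DBMS approach: one shared table, so there is a single copy to update.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO student VALUES (101, 'Asha', '555-0100')")
conn.execute("UPDATE student SET phone = '555-0199' WHERE roll_no = 101")

def phone_seen_by(section):
    # Every section queries the same table; the argument is only a label.
    return conn.execute("SELECT phone FROM student WHERE roll_no = 101").fetchone()[0]

print(inconsistent, phone_seen_by("accounts"), phone_seen_by("academics"))
```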

ADVANTAGES AND DISADVANTAGES:

Advantages of using a DBMS:

1. Data organization: A DBMS allows for the organization and storage of data in a
structured manner, making it easy to retrieve and query the data as needed.
2. Data integrity: A DBMS provides mechanisms for enforcing data integrity
constraints, such as constraints on the values of data and access controls that
restrict who can access the data.
3. Concurrent access: A DBMS provides mechanisms for controlling concurrent
access to the database, to ensure that multiple users can access the data without
conflicting with each other.
4. Data security: A DBMS provides tools for managing the security of the data, such
as controlling access to the data and encrypting sensitive data.
5. Backup and recovery: A DBMS provides mechanisms for backing up and
recovering the data in the event of a system failure.
6. Data sharing: A DBMS allows multiple users to access and share the same data,
which can be useful in a collaborative work environment.

Disadvantages of using a DBMS:

1. Complexity: DBMS can be complex to set up and maintain, requiring specialized
knowledge and skills.
2. Performance overhead: The use of a DBMS can add overhead to the performance
of an application, especially in cases where high levels of concurrency are
required.
3. Scalability: The use of a DBMS can limit the scalability of an application, since it
requires the use of locking and other synchronization mechanisms to ensure data
consistency.
4. Cost: The cost of purchasing, maintaining and upgrading a DBMS can be high,
especially for large or complex systems.
5. Limited use cases: Not all use cases are suitable for a DBMS. Some solutions don’t
need high reliability, consistency or security and may be better served by other
types of data storage.
The issues with the file system described earlier are the main reasons for the shift
from file system to DBMS.
A Database Management System (DBMS) is a software system that allows users to
create, maintain, and manage databases. It is a collection of programs that enables
users to access and manipulate data in a database. A DBMS is used to store, retrieve,
and manipulate data in a way that provides security, privacy, and reliability.

File System Approach


File-based systems were an early attempt to computerize the manual filing system.
This is also called the traditional approach: a decentralized approach in which each
department stored and controlled its own data with the help of a data processing
specialist. The main role of a data processing specialist was to create the necessary
computer file structures, manage the data within those structures, and design
application programs that create reports based on the file data.

In the above figure:

Consider an example of a student's file system. The student file will contain
information regarding the student (i.e. roll no, student name, course etc.). Similarly,
we have a subject file that contains information about the subject and the result file
which contains the information regarding the result.
Some fields are duplicated in more than one file, which leads to data redundancy. So
to overcome this problem, we need to create a centralized system, i.e. DBMS
approach.

DBMS:
A database approach is a well-organized collection of data that are related in a
meaningful way which can be accessed by different users but stored only once in a
system. The various operations performed by the DBMS system are: Insertion,
deletion, selection, sorting etc.

In the above figure, duplication of data is reduced due to centralization of data.

DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture
is used to deal with a large number of PCs, web servers, database servers and other
components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get
their request done.
Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, multi-tier
database architecture is of two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture

o In this architecture, the database is directly available to the user. It means the user
works directly on the DBMS.
o Any changes done here will directly be done on the database itself. It doesn't provide
a handy tool for end users.
o The 1-Tier architecture is used for development of local applications, where
programmers can directly communicate with the database for a quick response.

2-Tier Architecture

o The 2-Tier architecture is the same as the basic client-server architecture. In the
two-tier architecture, applications on the client end can directly communicate with
the database at the server side. For this interaction, APIs such as ODBC and JDBC
are used.
o The user interfaces and application programs are run on the client-side.
o The server side is responsible to provide the functionalities like: query processing and
transaction management.
o To communicate with the DBMS, client-side application establishes a connection with
the server side.
Fig: 2-tier Architecture

3-Tier Architecture

o The 3-Tier architecture contains another layer between the client and server. In this
architecture, client can't directly communicate with the server.
o The application on the client-end interacts with an application server which further
communicates with the database system.
o End user has no idea about the existence of the database beyond the application
server. The database also has no idea about any other user beyond the application.
o The 3-Tier architecture is used in case of large web application.
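
The separation of tiers can be sketched in Python. This is a simplified, single-process
illustration: in a real deployment the three tiers usually run as separate programs on
separate machines. The class names, the request format, and the sample data are all
assumptions made for this example.

```python
import sqlite3

class DatabaseTier:
    """Data tier: the only layer that talks to the DBMS."""
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
        self._conn.execute("INSERT INTO student VALUES (101, 'Asha')")
        self._conn.commit()

    def find_name(self, roll_no):
        row = self._conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return row[0] if row else None

class ApplicationServer:
    """Middle tier: validates requests and forwards them to the data tier."""
    def __init__(self, db):
        self._db = db

    def handle(self, request):
        roll_no = request.get("roll_no")
        if not isinstance(roll_no, int):
            return {"status": "error", "message": "roll_no must be an integer"}
        name = self._db.find_name(roll_no)
        if name is None:
            return {"status": "not_found"}
        return {"status": "ok", "name": name}

def client(server, roll_no):
    """Client tier: sends a request and never sees SQL or the connection."""
    return server.handle({"roll_no": roll_no})

server = ApplicationServer(DatabaseTier())
print(client(server, 101))
```

The client function only knows the application server, which mirrors the point that the
end user has no idea about the database behind it.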

Three schema Architecture


o The three schema architecture is also called ANSI/SPARC architecture or three-level
architecture.
o This framework is used to describe the structure of a specific database system.
o The three schema architecture is also used to separate the user applications and
physical database.
o The three schema architecture contains three-levels. It breaks the database down into
three different categories.

The three-schema architecture is as follows:

In the above diagram:

o It shows the DBMS architecture.
o Mapping is used to transform the request and response between various database
levels of architecture.
o Mapping adds processing time, so it is a poor fit for a small DBMS.
o In External / Conceptual mapping, it is necessary to transform the request from
external level to conceptual schema.
o In Conceptual / Internal mapping, DBMS transform the request from the conceptual
to internal level.

Objectives of Three schema Architecture


The main objective of three level architecture is to enable multiple users to access
the same data with a personalized view while storing the underlying data only once.
Thus it separates the user's view from the physical structure of the database. This
separation is desirable for the following reasons:

o Different users need different views of the same data.
o The way in which a particular user needs to see the data may change over time.
o The users of the database should not worry about the physical implementation and
internal workings of the database such as data compression and encryption
techniques, hashing, optimization of the internal structures etc.
o All users should be able to access the same data according to their requirements.
o DBA should be able to change the conceptual structure of the database without
affecting the user's view.
o Internal structure of the database should be unaffected by changes to physical
aspects of the storage.

1. Internal Level

o The internal level has an internal schema which describes the physical storage
structure of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It defines how the data will be stored in
blocks.
o The physical level is used to describe complex low-level data structures in detail.

The internal level is generally concerned with the following activities:

o Storage space allocations.
For Example: B-Trees, Hashing etc.
o Access paths.
For Example: Specification of primary and secondary keys, indexes, pointers and
sequencing.
o Data compression and encryption techniques.
o Optimization of internal structures.
o Representation of stored fields.

2. Conceptual Level

o The conceptual schema describes the design of a database at the conceptual level.
Conceptual level is also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
o In the conceptual level, internal details such as an implementation of the data
structure are hidden.
o Programmers and database administrators work at this level.

3. External Level

o At the external level, a database contains several schemas that are sometimes called
subschemas. A subschema is used to describe a particular view of the database.
o An external schema is also known as view schema.
o Each view schema describes the part of the database that a particular user group is
interested in and hides the remaining database from that user group.
o The view schema describes the end user interaction with database systems.
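
In a relational DBMS, an external schema is commonly realized with SQL views. The sketch
below uses Python's sqlite3 module; the student table, the two view names, and the
columns are hypothetical and serve only to show two user groups seeing different parts
of the same stored data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: one table that describes the whole student record.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT, fee_due REAL)"
)
conn.execute("INSERT INTO student VALUES (101, 'Asha', '555-0100', 250.0)")

# External schema for the hostel office: names and phones, no fee data.
conn.execute("CREATE VIEW hostel_view AS SELECT roll_no, name, phone FROM student")
# External schema for the accounts section: fee data, no phone numbers.
conn.execute("CREATE VIEW accounts_view AS SELECT roll_no, name, fee_due FROM student")

# Column names visible through the hostel view.
hostel_cols = [d[0] for d in conn.execute("SELECT * FROM hostel_view").description]
# Rows visible through the accounts view.
accounts_rows = conn.execute("SELECT * FROM accounts_view").fetchall()
print(hostel_cols, accounts_rows)
```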

Mapping between Views


The three levels of DBMS architecture don't exist independently of each other. There
must be correspondence between the three levels i.e. how they actually correspond
with each other. DBMS is responsible for correspondence between the three types of
schema. This correspondence is called Mapping.
There are basically two types of mapping in the database architecture:

o Conceptual/ Internal Mapping


o External / Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual level and the internal
level. Its role is to define the correspondence between the records and fields of the
conceptual level and files and data structures of the internal level.

External/ Conceptual Mapping

The External/ Conceptual Mapping lies between the external level and the conceptual
level. Its role is to define the correspondence between a particular external view and
the conceptual view.

Fig: 3-tier Architecture

Data Models
A data model describes the data, its semantics, and the consistency constraints on
the data. It provides the conceptual tools for describing the design of a database at
each level of data abstraction. The following four data models are used for
understanding the structure of the database:
1) Relational Data Model: This type of model designs the data in the form of rows
and columns within a table. Thus, a relational model uses tables for representing data
and the relationships among them. Tables are also called relations. This model was
initially described by Edgar F. Codd, in 1969. The relational data model is the most
widely used model and is primarily used by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of
data as objects and relationships among them. These objects are known as entities,
and a relationship is an association among these entities. This model was designed by
Peter Chen and published in a 1976 paper. It is widely used in database design.
A set of attributes describes each entity. For example, student_name and student_id
describe the 'student' entity. A set of entities of the same type is known as an 'entity
set', and a set of relationships of the same type is known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of
functions, encapsulation, and object identity. This model supports a rich type
system that includes structured and collection types. In the 1980s, various database
systems following the object-oriented approach were developed. Here, an object is
the data together with its properties.

4) Semistructured Data Model: This type of data model is different from the other
three data models (explained above). The semistructured data model allows the data
specifications at places where the individual data items of the same type may have
different attributes sets. The Extensible Markup Language, also known as XML, is
widely used for representing the semistructured data. Although XML was initially
designed for including the markup information to the text document, it gains
importance because of its application in the exchange of data.

Data Models
Data models define how the logical structure of a database is modeled. Data Models
are fundamental entities to introduce abstraction in a DBMS. Data models define
how data is connected to each other and how they are processed and stored inside
the system.
The very first data model could be flat data-models, where all the data used are to be
kept in the same plane. Earlier data models were not so scientific, hence they were
prone to introduce lots of duplication and update anomalies.
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and
relationships among them. While formulating real-world scenario into the database
model, the ER Model creates entity set, relationship set, general attributes and
constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
 Entities and their attributes.
 Relationships among entities.
These concepts are explained below.

o Entity − An entity in an ER Model is a real-world entity having properties
called attributes. Every attribute is defined by its set of permitted values,
called its domain. For example, in a school database, a student is considered as
an entity. Student has various attributes like name, age, class, etc.
o Relationship − The logical association among entities is called relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities
define the number of associations between two entities.

Mapping cardinalities −
o one to one
o one to many
o many to one
o many to many
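
A many to many cardinality is usually stored in a relational database through a third
table that holds one row per association. The following sketch, using Python's sqlite3
module, assumes hypothetical student, course, and enrollment tables to show the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- Many to many: one row per (student, course) pair.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student is associated with many courses ...
courses_of_asha = [r[0] for r in conn.execute(
    "SELECT c.title FROM enrollment e JOIN course c ON c.course_id = e.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]
# ... and one course is associated with many students.
students_in_databases = [r[0] for r in conn.execute(
    "SELECT s.name FROM enrollment e JOIN student s ON s.student_id = e.student_id "
    "WHERE e.course_id = 10 ORDER BY s.name")]
print(courses_of_asha, students_in_databases)
```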
Relational Model
The most popular data model in DBMS is the Relational Model. It has a more rigorous
formal basis than the earlier models: it is based on first-order predicate logic and
defines a table as an n-ary relation.

The main highlights of this model are −

o Data is stored in tables called relations.
o Relations can be normalized.
o In normalized relations, values saved are atomic values.
o Each row in a relation is unique.
o Each column in a relation contains values from the same domain.
Database Schema
A database schema is the skeleton structure that represents the logical view of the
entire database. It defines how the data is organized and how the relations among
them are associated. It formulates all the constraints that are to be applied on the
data.
A database schema defines its entities and the relationship among them. It contains
a descriptive detail of the database, which can be depicted by means of schema
diagrams. It’s the database designers who design the schema to help programmers
understand the database and make it useful.

A database schema can be divided broadly into two categories −


 Physical Database Schema − This schema pertains to the actual storage of
data and its form of storage like files, indices, etc. It defines how the data will
be stored in a secondary storage.
 Logical Database Schema − This schema defines all the logical constraints
that need to be applied on the data stored. It defines tables, views, and
integrity constraints.
Database Instance
It is important to distinguish a database schema from a database instance. A database
schema is the skeleton of the database. It is designed before the database exists.
Once the database is operational, it is very difficult to make any changes to the
schema. A database schema does not contain any data or information.
A database instance is the state of an operational database, with its data, at a given
time. It contains a snapshot of the database. Database instances tend to change with
time. A DBMS ensures that every instance (state) is valid, by diligently
following all the validations, constraints, and conditions that the database designers
have imposed.
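
The difference between schema and instance can be observed directly. The sketch below
uses Python's sqlite3 module and SQLite's sqlite_master catalog table; the student table
and its rows are assumptions for illustration. The schema text stays the same while the
instance changes, and a constraint from the schema blocks an invalid state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema():
    # sqlite_master is SQLite's catalog; it stores the definition of each table.
    return conn.execute("SELECT sql FROM sqlite_master WHERE name = 'student'").fetchone()[0]

def instance():
    # The instance is the set of rows stored at this moment.
    return conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()

schema_before, instance_before = schema(), instance()
conn.execute("INSERT INTO student VALUES (101, 'Asha')")
schema_after, instance_after = schema(), instance()

# The NOT NULL constraint from the schema keeps an invalid row out.
try:
    conn.execute("INSERT INTO student VALUES (102, NULL)")
    accepted_invalid_row = True
except sqlite3.IntegrityError:
    accepted_invalid_row = False

print(schema_before == schema_after, instance_before, instance_after)
```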
If a database system is not multi-layered, then it becomes difficult to make any
changes in the database system. Database systems are designed in multi-layers as
we learnt earlier.
Data Independence
A database system normally contains a lot of data in addition to users’ data. For
example, it stores data about data, known as metadata, to locate and retrieve data
easily. It is rather difficult to modify or update a set of metadata once it is stored in
the database. But as a DBMS expands, it needs to change over time to satisfy the
requirements of the users. If the entire data is dependent, it would become a tedious
and highly complex job.

Metadata itself follows a layered architecture, so that when we change data at one
layer, it does not affect the data at another level. This data is independent but
mapped to each other.

Logical Data Independence


Logical data is data about the database; that is, it stores information about how data
is managed inside. For example, a table (relation) stored in the database and all the
constraints applied on that relation.
Logical data independence is a mechanism that decouples this logical description from
the actual data stored on the disk. If we make some changes to the table format, it
should not change the data residing on the disk.

Physical Data Independence


All the schemas are logical, and the actual data is stored in bit format on the disk.
Physical data independence is the power to change the physical data without
impacting the schema or logical data.
For example, in case we want to change or upgrade the storage system itself −
suppose we want to replace hard-disks with SSD − it should not have any impact on
the logical data or schemas.
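
Swapping disks cannot be shown in a short program, but a smaller physical change can:
adding an index changes how the data is reached on storage without changing the table
definition or the query. The sketch below uses Python's sqlite3 module; the table, the
index name, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meera", "Pune")],
)

query = "SELECT name FROM student WHERE city = 'Pune' ORDER BY name"
before = conn.execute(query).fetchall()

# Physical change: add an access path. The table and the query text stay the same.
conn.execute("CREATE INDEX idx_student_city ON student(city)")
after = conn.execute(query).fetchall()

print(before == after, after)
```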
Fig: Data Independence

Database Languages in DBMS


o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.

Types of Database Languages

1. Data Definition Language (DDL)


o DDL stands for Data Definition Language. It is used to define database structure or
pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.
o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.

These commands are used to update the database schema, which is why they come
under Data definition language.
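
A few of these DDL tasks can be tried with Python's sqlite3 module, as in the sketch
below. The table and column names are invented for the example. SQLite is used only
because it ships with Python; it has no TRUNCATE or COMMENT statement, so those two
tasks are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

def column_names(table):
    # PRAGMA table_info returns one row per column; the name is the second field.
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]

# Create: define a new table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
# Alter: change the structure of an existing table.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")
columns_after_alter = column_names("student")
# Rename: give the table a new name.
conn.execute("ALTER TABLE student RENAME TO learner")
tables_after_rename = table_names()
# Drop: remove the table and its definition.
conn.execute("DROP TABLE learner")
tables_after_drop = table_names()

print(columns_after_alter, tables_after_rename, tables_after_drop)
```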

2. Data Manipulation Language (DML)


DML stands for Data Manipulation Language. It is used for accessing and
manipulating data in a database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table (all records, if no condition is
given).
o Merge: It performs UPSERT operation, i.e., insert or update operations.
o Call: It is used to call a PL/SQL or a Java subprogram.
o Explain Plan: It describes the access path that the DBMS will use to run a statement.
o Lock Table: It controls concurrency.
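
The four basic DML statements can be sketched with Python's sqlite3 module. The student
table and the marks are hypothetical. Note that Update and Delete act only on the rows
that match the WHERE condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")

# Insert: add new rows.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 82), (2, "Ravi", 45), (3, "Meera", 67)])
# Update: change existing data in the matching row.
conn.execute("UPDATE student SET marks = 50 WHERE roll_no = 2")
# Delete: remove only the rows that satisfy the condition.
deleted = conn.execute("DELETE FROM student WHERE marks < 60").rowcount
# Select: retrieve what is left.
remaining = conn.execute("SELECT roll_no, name, marks FROM student ORDER BY roll_no").fetchall()

print(deleted, remaining)
```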

3. Data Control Language (DCL)


o DCL stands for Data Control Language. It is used to control access to the data stored
in the database.
o The DCL execution is transactional. It also has rollback parameters.

(But in Oracle database, the execution of data control language does not have
the feature of rolling back.)

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.
o Revoke: It is used to take back permissions from the user.

The following privileges can be granted or revoked:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language (TCL)


TCL is used to manage the changes made by DML statements. These statements can be
grouped into a logical transaction.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction on the database.
o Rollback: It is used to restore the database to its state as of the last Commit.
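
Commit and Rollback can be sketched with Python's sqlite3 module, where they are exposed
as the commit() and rollback() methods of the connection. The account table and the
amounts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()  # Commit: the inserted row is now saved.

# A change made after the commit is visible inside the open transaction ...
conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
balance_inside_txn = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# ... until Rollback undoes everything since the last commit.
conn.rollback()
balance_after_rollback = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

print(balance_inside_txn, balance_after_rollback)
```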

Interfaces in DBMS
A database management system (DBMS) interface is a user interface that allows for
the ability to input queries to a database without using the query language itself. User-
friendly interfaces provided by DBMS may include the following:
 Menu-Based Interfaces
 Forms-Based Interfaces
 Graphical User Interfaces
 Natural Language Interfaces
 Speech Input and Output Interfaces
 Interfaces for Parametric Users
 Interfaces for the Database Administrator (DBA)
Menu-Based Interfaces
These interfaces present the user with lists of options (called menus) that lead the user
through the formation of a request. The basic advantage of using menus is that they
remove the need to remember the specific commands and syntax of a query
language. The query is composed step by step by picking
options from a menu that is shown by the system. Pull-down menus are a very popular
technique in Web-based interfaces. They are also often used in browsing interfaces
which allow a user to look through the contents of a database in an exploratory and
unstructured manner.
Forms-Based Interfaces
A forms-based interface displays a form to each user. Users can fill out all of the form
entries to insert new data, or they can fill out only certain entries, in which case the
DBMS will retrieve matching data for the remaining entries. These types of
forms are usually designed and programmed for users with little technical
expertise. Many DBMSs have form specification languages,
which are special languages that help specify such forms.

Example: SQL Forms is a form-based language that specifies queries using a form
designed in conjunction with the relational database schema.
Graphical User Interface
A GUI typically displays a schema to the user in diagrammatic form. The user then
can specify a query by manipulating the diagram. In many cases, GUIs utilize both
menus and forms. Most GUIs use a pointing device, such as a mouse, to pick a certain
part of the displayed schema diagram.
Natural Language Interfaces
These interfaces accept requests written in English or some other language and
attempt to understand them. A Natural language interface has its own schema, which
is similar to the database conceptual schema as well as a dictionary of important
words.
The natural language interface refers to the words in its schema as well as to the set of
standard words in a dictionary to interpret the request. If the interpretation is
successful, the interface generates a high-level query corresponding to the natural
language and submits it to the DBMS for processing, otherwise, a dialogue is started
with the user to clarify any provided condition or request. The main disadvantage of
this type of interface is that its capabilities are still limited.
Speech Input and Output Interfaces
Limited use of speech, whether as an input query or as the answer to a question or the
result of a request, is becoming commonplace. Applications with a limited vocabulary,
such as inquiries for telephone directory, flight arrival/departure, and bank account
information, allow speech for input and output so that the general public can access
this information.
The Speech input is detected using predefined words and used to set up the parameters
that are supplied to the queries. For output, a similar conversion from text or numbers
into speech takes place.

Interface for Parametric Users

Parametric users, such as bank tellers, perform a small set of operations repeatedly.
Interfaces for parametric users provide a small set of commands, each of which can be
issued with a minimum of keystrokes. Such an interface is generally used in bank
transactions, for example for transferring money.
Interfaces for Database Administrators (DBA)
Most database systems contain privileged commands that can be used only by the
DBA’s staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing the
storage structures of databases.
