Introduction To Database Management System


Data: unprocessed information, i.e. raw facts and figures.
Information: processed data.
Data is converted into information, and
information is converted into knowledge.
DBMS
A Database Management System (DBMS) is a collection
of interrelated data and a set of programs to
access those data.
Examples of DBMSs include Microsoft Access,
Oracle, and SQL Server.
Goals of DBMS

The primary goal of a DBMS is to provide an efficient and
convenient way to store and retrieve large volumes of data.
Posing of data retrieval queries in a standard manner.
Retrieval of query results efficiently.
Concurrent use of the system by a large number of users in a
consistent manner.
Guaranteed availability of data irrespective of system
failures.
Data security.
Database Applications:-
Banking: transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized
recommendations
Manufacturing: production, inventory, orders etc.
Human resources: employee records, salaries, tax
deductions
Purpose of DBMS
File based System:-
File based systems were an early attempt to
computerize the manual filing system that we
all are familiar with.

A file-based system is a collection of application
programs that perform services for the end-users, such as the
production of reports. Each program defines
and manages its own data.
The various disadvantages of keeping organizational
information in a file processing system are:
1. Data redundancy and inconsistency

The same information may be duplicated in several files.
For example, the address and telephone number of a
particular customer may appear in a file that consists
of savings-account records and in a file that consists
of checking-account records. This redundancy leads
to higher storage and access cost.
In addition, it may lead to data inconsistency, i.e.,
the various copies of the same data may no longer
agree.
For example, a changed address may be reflected
in savings-account records but not elsewhere in the
system.
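The address example above can be sketched in a few lines of Python. This is only an illustration: the two dictionaries stand in for the savings and checking files, and the customer id, addresses, and function name are hypothetical.

```python
# Hypothetical file-based layout: each application keeps its own copy
# of the customer's address (all names and values are illustrative).
savings_file = {"C101": {"address": "12 Lake Road", "balance": 5000}}
checking_file = {"C101": {"address": "12 Lake Road", "balance": 1200}}


def change_address_in_savings(customer_id, new_address):
    """Update only the savings file, as a savings-only program would."""
    savings_file[customer_id]["address"] = new_address


change_address_in_savings("C101", "7 Hill Street")

# The two copies of the same fact now disagree.
inconsistent = (
    savings_file["C101"]["address"] != checking_file["C101"]["address"]
)
print(inconsistent)
```

Storing the address once and letting both applications refer to it removes both the duplication and the chance of disagreement.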
2. Difficulty in accessing data
In file processing, users manipulate the
information through a number of application
programs. When we want a particular set of
information from a file system, a system
programmer has to write the necessary
application program.
For example: the names of all customers in a particular city.
That is, file processing environments do not allow
needed data to be retrieved in a convenient
and efficient manner.
3. Separation and isolation of data
Since data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve
the appropriate data is difficult.
Data isolation here means that a query should retrieve
consistent and complete data, and that one user's data should
not be affected by other users' actions.

4. Integrity problems
The data values stored in the database must satisfy certain
types of consistency constraints.

For example, the balance of a bank account may never fall below a
prescribed amount. Developers enforce these constraints in the
system by adding appropriate code in the various application
programs. However, when new constraints are added, it is
difficult to change the program to enforce them. The problem is
compounded when constraints involve several data items from
different files.
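In a DBMS, a constraint like the minimum balance can be declared once with the data instead of being coded into every program. A minimal sketch using Python's built-in sqlite3 module; the table name, column names, and the minimum of 500 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint lives in the schema, not in each application program.
conn.execute(
    """
    CREATE TABLE account (
        account_no TEXT PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance >= 500)
    )
    """
)
conn.execute("INSERT INTO account VALUES ('A', 5000)")
conn.commit()


def withdraw(amount):
    """Try a withdrawal; return True if the constraint allowed it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? "
                "WHERE account_no = 'A'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = withdraw(1000)   # leaves 4000, allowed
second_ok = withdraw(4000)  # would leave 0, below the minimum
balance = conn.execute("SELECT balance FROM account").fetchone()[0]
```

Changing the rule later means altering one table definition rather than hunting through every application program.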
5. Atomicity problems
A computer is subject to failure. In many
applications, it is crucial that, once a failure occurs and
has been detected, the data are restored to the
consistent state that existed prior to the failure. For example,
consider transferring money from account A to B. If a
system failure occurs during the execution of a program,
it is possible that the money was removed from account
A but was not credited to account B, resulting in an
inconsistent database state. The fund transfer must
therefore be atomic: it must happen in its entirety or
not at all.
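The fund transfer can be sketched as a transaction with Python's built-in sqlite3 module. The table layout, starting balances, and the `fail_midway` switch that simulates a crash are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A", 5000), ("B", 1000)]
)
conn.commit()


def transfer(amount, fail_midway=False):
    """Move money from A to B as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
    except RuntimeError:
        pass  # the debit from A has already been rolled back


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(500, fail_midway=True)
after_failure = balances()   # unchanged: the half-done transfer was undone
transfer(500)
after_success = balances()   # both updates applied together
```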
6. Concurrent access anomalies.

For the sake of overall performance of the system and
faster response, many systems allow multiple users to
update the data simultaneously. In such an environment,
interaction of concurrent updates may result in inconsistent
data.
Consider bank account A, containing Rs.5000. If two
customers withdraw funds (say Rs.500 and Rs.1000
respectively) from account A at about the same time, the
result of the concurrent executions may leave the account
in an incorrect state. If the two programs run concurrently,
they may both read the value Rs.5000, and write back
Rs.4500 and Rs.4000 respectively. Depending on which one
writes the value last, the account may contain Rs.4500 or
Rs.4000, rather than the correct value of Rs.3500. To guard
against this possibility, the system must maintain some form of
supervision.
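The interleaving described above can be replayed step by step in Python. The first half spells out the bad schedule by hand (no real threads, so the outcome is deterministic); the second half uses a lock as one simple, assumed form of supervision. A real DBMS uses its own concurrency-control mechanisms.

```python
import threading

# Part 1: replay the problematic schedule by hand.
account = {"A": 5000}
read_1 = account["A"]          # program 1 reads 5000
read_2 = account["A"]          # program 2 also reads 5000
account["A"] = read_1 - 500    # program 1 writes 4500
account["A"] = read_2 - 1000   # program 2 writes 4000, losing the first update
lost_update_result = account["A"]

# Part 2: serialize each read-modify-write with a lock.
safe_account = {"A": 5000}
lock = threading.Lock()


def withdraw(amount):
    with lock:  # only one withdrawal may read and write at a time
        current = safe_account["A"]
        safe_account["A"] = current - amount


threads = [threading.Thread(target=withdraw, args=(a,)) for a in (500, 1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```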
7. Security problems.
Not every user of the database system should be
able to access all the data. For example, in a
university, payroll personnel need to see only
that part of the database that has financial
information. They do not need access to
information about academic records. But, since
application programs are added to the file-
processing system in an ad hoc manner,
enforcing such security constraints is difficult.
Benefits of Database Approach
The data can be shared.
Redundancy can be reduced.
Inconsistency can be avoided.
Transaction support can be provided.
Improved security.
Improved data integrity.
Improved maintenance through data
independence.
Increased concurrency.
Improved backup and recovery services.
Views of Data: The Three-Level ANSI-SPARC Architecture
The major purpose of a database system is to provide
users with an abstract view of the data. That is, the
system hides certain details of how the data are stored
and maintained.
Data Abstraction
To retrieve data efficiently, complex data structures are
used to represent data in the database. Since many
database-system users are not computer trained,
developers hide the complexity from users through
several levels of abstraction to simplify users'
interactions with the system.
The Three-Level ANSI-SPARC Architecture
[Figure: the three levels, top to bottom. External / view level (View 1, View 2, ..., View n); conceptual level (conceptual schema); internal level (internal schema); database (physical data organization).]
1. Internal or Physical Level:
The lowest level of abstraction describes how the
data are actually stored. At the physical level, the
complexity of data structures is described in
detail.
The internal level covers the physical
implementation of the database to achieve
optimal runtime performance and storage space
utilization.
It covers the data structures and file
organizations used to store data on storage
devices.
2. Conceptual or Logical Level:
The next higher level of abstraction describes what
data are stored in the database and what
relationships exist among those data. It thus
describes the entire database in terms of a small
number of relatively simple structures. The
conceptual level represents
all entities, their attributes, and their
relationships;
the constraints on the data;
semantic information about the data;
security and integrity information.
3. View Level:
The highest level of abstraction describes only part
of the entire database. Many users of the database
system do not need all the information stored in
the database; instead, they need to access only a
part of the database. The view level of abstraction
exists to simplify their interaction with the system.
The system may provide many views for the same
database.
Views provide a level of security. Views can be set
up to exclude data that some users should not see.
Views provide a mechanism to customize the
appearance of the database. A view can present a
consistent, unchanging picture of the structure of
the database, even if the underlying database is
changed.
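A view that hides a sensitive column can be sketched with Python's built-in sqlite3 module. The `instructor` table, its columns, and the view name are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO instructor VALUES (1, 'Asha', 'Physics', 90000)")

# The view exposes everything except the salary column.
conn.execute(
    "CREATE VIEW instructor_public AS "
    "SELECT id, name, dept_name FROM instructor"
)

cur = conn.execute("SELECT * FROM instructor_public")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

Users who are given only `instructor_public` work with a simpler table and never see salaries.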
Instances and Schemas:-
The collection of information stored in the database
at a particular moment is called an instance of the
database.
A schema describes the organization of data and
relationships within the database; that is, the overall
design of the database is called the database
schema. A database schema corresponds to the
variable declarations in a program. Each variable
has a particular value at a given instant. The values
of the variables in a program at a point in time
correspond to an instance of a database schema.
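The difference can be seen in a small sketch with Python's built-in sqlite3 module: the column list (schema) stays fixed while the stored rows (instance) change over time. The `student` table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema: declared once, like a variable declaration.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def schema_columns():
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute("PRAGMA table_info(student)")]


def instance():
    # Instance: whatever rows are stored at this moment.
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_t0, instance_t0 = schema_columns(), instance()
conn.execute("INSERT INTO student VALUES (1, 'Ravi')")
schema_t1, instance_t1 = schema_columns(), instance()
```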
According to the levels of abstraction,
database systems have several schemas.
The physical schema (internal schema)
describes the database design at the
physical level,
The logical schema (conceptual schema)
describes the database design at the logical
level.
The schemas at the view level are called
subschemas (external schemas), and they
describe views of the database for
particular users.
Data Independence:-
The ability to modify a schema definition in one level
without affecting a schema definition in the next higher
level is called data independence. The interfaces between
the various levels and components should be well defined
so that changes in some parts do not seriously influence
others.
Two levels of data independence :
1. Physical Data Independence
It is the ability to modify the physical schema
without causing application programs to be rewritten. Such
modifications are occasionally needed to improve performance. It refers to the immunity of
the conceptual schema to changes in the internal schema.
Changes to the internal schema, such as using different
storage devices, modifying indexes, or hashing algorithms,
should be possible without having to change the conceptual
or external schemas. From the user's point of view, the only
effect that may be noticed is a change in performance.
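Adding an index is a typical physical-level change. In the sketch below (Python's built-in sqlite3 module, with a hypothetical `customer` table), the same query text returns the same rows before and after the index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ravi", "Kochi"), (2, "Meera", "Pune"), (3, "Anil", "Kochi")],
)

QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"

before = conn.execute(QUERY, ("Kochi",)).fetchall()

# Physical change only: a new access path, no change to the table's columns.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

after = conn.execute(QUERY, ("Kochi",)).fetchall()
```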
2. Logical Data Independence
It is the ability to modify the logical schema
without causing application programs to be
rewritten. It is needed whenever the logical
structure of the database changes. It is difficult
to achieve, since application programs are heavily
dependent on the logical structure.
Changes to the logical schema, such as the
addition or removal of entities, attributes,
or relationships, should be possible without
having to change existing external schemas or
having to rewrite application programs.
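One way to picture this: an application that reads through a view keeps working after an attribute is added to the underlying table. A minimal sketch with Python's built-in sqlite3 module; the `employee` table, the `employee_names` view, and the added `email` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Latha', 40000)")
conn.execute("CREATE VIEW employee_names AS SELECT id, name FROM employee")


def application_query():
    """Stands in for an existing application that uses the external schema."""
    return conn.execute("SELECT id, name FROM employee_names").fetchall()


before = application_query()

# Logical change: a new attribute is added to the conceptual schema.
conn.execute("ALTER TABLE employee ADD COLUMN email TEXT")

after = application_query()
```

Removing or renaming an attribute that a view depends on is harder to hide, which is part of why logical data independence is difficult to achieve.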
Data Models
A collection of conceptual tools for describing
data, data relationships, data semantics, and
consistency constraints. A data model
provides a way to describe the design of a
database at the physical, logical, and view
levels.
Relational Model.
The relational model uses a collection of
tables to represent both data and the
relationships among those data. Each table
has multiple columns, and each column has a
unique name. Tables are also known as
relations. The relational model is an example
of a record-based model.
Record-based models are so named because the
database is structured in fixed-format records
of several types. Each table contains records
of a particular type. Each record type defines a
fixed number of fields, or attributes.
The columns of the table correspond to the
attributes of the record type. The relational
data model is the most widely used data
model, and a vast majority of current
database systems are based on the relational
model.
Entity-Relationship Model.
The entity-relationship (E-R) data model uses a
collection of basic objects, called entities, and
relationships among these objects.
An entity is a thing or object in the real
world that is distinguishable from other
objects. The entity-relationship model is
widely used in database design.
Object-Based Data Model.
Object-oriented programming (especially in
Java, C++, or C#) has become the dominant
software-development methodology. This led to
the development of an object-oriented data
model that can be seen as extending the E-R
model with notions of encapsulation,
methods (functions), and object identity. The
object-relational data model combines features
of the object-oriented data model and relational
data model.
Semistructured Data Model.
The semistructured data model permits the
specification of data where individual data items
of the same type may have different sets of
attributes. This is in contrast to the data
models mentioned earlier, where every data
item of a particular type must have the same
set of attributes. The Extensible Markup
Language (XML) is widely used to represent
semistructured data.
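As a small illustration, the XML fragment below holds two items of the same type with different sets of attributes. The element names and values are made up, and the fragment is parsed with Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Two <person> items of the same type, each with its own set of fields.
XML = """
<people>
  <person><name>Ravi</name><phone>555-0101</phone></person>
  <person><name>Meera</name><email>meera@example.com</email><office>B-12</office></person>
</people>
"""

root = ET.fromstring(XML.strip())

# Collect the field names present in each person record.
attribute_sets = [
    sorted(child.tag for child in person) for person in root.findall("person")
]
print(attribute_sets)
```

In a relational table, both rows would need the same columns, with missing values left empty.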
Database Languages:-

A database system provides a data definition
language to specify the database schema and a
data manipulation language to express database
queries and updates.
DDL

We specify a database schema by a set of
definitions expressed by a special language
called a data-definition language (DDL). The
DDL is also used to specify additional
properties of the data.
We specify the storage structure and access
methods used by the database system by a set
of statements in a special type of DDL called a
data storage and definition language (DSDL).
These statements define the implementation
details of the database schemas, which are
usually hidden from the users.
Integrity constraints
The data values stored in the database must
satisfy certain consistency constraints.
Domain Constraints.
A domain of possible values must be associated
with every attribute (for example, integer
types, character types, date/time types).
Declaring an attribute to be of a particular
domain acts as a constraint on the values that
it can take. Domain constraints are the most
elementary form of integrity constraint. They
are tested easily by the system whenever a
new data item is entered into the database.
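A domain check can be sketched with Python's built-in sqlite3 module. SQLite is dynamically typed, so the sketch spells the domain out in a CHECK clause; the `course` table, the `credits` column, and the range 1 to 6 are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        credits   INTEGER NOT NULL
                  CHECK (typeof(credits) = 'integer'
                         AND credits BETWEEN 1 AND 6)
    )
    """
)


def try_insert(course_id, credits):
    """Return True if the value was accepted into the domain."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO course VALUES (?, ?)", (course_id, credits)
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("CS-101", 4)            # an integer in range
rejected_text = try_insert("CS-102", "four")  # not an integer
rejected_range = try_insert("CS-103", 40)     # outside the allowed range
```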
Referential Integrity.
There are cases where we wish to ensure that a
value that appears in one relation for a given
set of attributes also appears in a certain
set of attributes in another relation (referential
integrity). For example, the department listed
for each course must be one that actually
exists. More precisely, the dept_name value in
a course record must appear in the dept_name
attribute of some record of the department
relation.
Database modifications can cause violations of
referential integrity. When a referential-
integrity constraint is violated, the normal
procedure is to reject the action that caused
the violation.
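The course and department example can be sketched with Python's built-in sqlite3 module. SQLite enforces foreign keys only when `PRAGMA foreign_keys = ON` is set on the connection; the column list and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        title     TEXT,
        dept_name TEXT REFERENCES department (dept_name)
    )
    """
)
conn.execute("INSERT INTO department VALUES ('Physics')")
conn.commit()


def add_course(course_id, title, dept_name):
    """Return True if the insert was accepted, False if it was rejected."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO course VALUES (?, ?, ?)",
                (course_id, title, dept_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid = add_course("PHY-101", "Mechanics", "Physics")
invalid = add_course("MU-199", "Music Theory", "Music")  # no such department
```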
Assertions:-
An assertion is any condition that the database must
always satisfy. Domain constraints and referential-
integrity constraints are special
forms of assertions. However, there are many
constraints that we cannot express by using only
these special forms. For example, "Every department
must have at least five courses offered every
semester" must be expressed as
an assertion. When an assertion is created, the system
tests it for validity. If the assertion is valid, then any
future modification to the database is allowed only if
it does not cause that assertion to be violated.
Authorization.
We may want to differentiate among the users
as far as the type of access they are permitted
on various data values in the database. These
differentiations are expressed in terms of
authorization, the most common
being:
read authorization,
which allows reading, but not modification, of data.
insert authorization,
which allows insertion of new data, but not
modification of existing data.
update authorization,
which allows modification, but not deletion, of data.
delete authorization,
which allows deletion of data.
We may assign the
user all, none, or a combination of these types of
authorization.
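The four authorization types and their combinations can be modelled in plain Python with the standard enum module. This is only a sketch of the idea, not how any particular DBMS stores its grants; the user names and the `grants` table are hypothetical.

```python
from enum import Flag, auto


class Auth(Flag):
    NONE = 0
    READ = auto()
    INSERT = auto()
    UPDATE = auto()
    DELETE = auto()


# Hypothetical grants: each user holds all, none, or a combination.
grants = {
    "clerk": Auth.READ | Auth.INSERT,
    "auditor": Auth.READ,
    "admin": Auth.READ | Auth.INSERT | Auth.UPDATE | Auth.DELETE,
    "guest": Auth.NONE,
}


def is_allowed(user, needed):
    """Check whether the user's grants include the needed authorization."""
    return needed in grants.get(user, Auth.NONE)


clerk_can_insert = is_allowed("clerk", Auth.INSERT)
clerk_can_delete = is_allowed("clerk", Auth.DELETE)
guest_can_read = is_allowed("guest", Auth.READ)
```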
Note
The DDL, just like any other programming language,
gets as input some instructions (statements) and
generates some output. The output of the DDL is
placed in the data dictionary, which contains
metadata, that is, data about data. The data
dictionary is considered to be a special type of table
that can only be accessed and updated by the
database system itself (not a regular user). The
database system consults the data dictionary before
reading or modifying actual data.
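As a rough illustration, SQLite keeps the output of DDL statements in a built-in catalog table named `sqlite_master`, used here as a stand-in for the data dictionary. The `department` table is hypothetical; the sketch uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A DDL statement: its result is recorded in the catalog, not in user data.
conn.execute(
    "CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT)"
)

# Query the metadata: which tables exist, and how were they defined?
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

table_names = [name for _type, name, _sql in catalog]
definition = catalog[0][2]  # the stored CREATE TABLE text
```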
Data Manipulation Language (DML)
DML is a language that provides a set of operations to
support the basic data manipulation operations on
the data held in the database.
Data manipulation operations usually include:
The retrieval of information stored in the database
The insertion of new information into the
database.
The deletion of information from the database.
The modification of information stored in the
database.
A DML is a language that enables users to access or
manipulate data as organized by the appropriate data
model.
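The four operations listed above map onto the SQL statements INSERT, UPDATE, DELETE, and SELECT. A minimal sketch using Python's built-in sqlite3 module, with a hypothetical `instructor` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)

# Insertion of new information.
conn.execute("INSERT INTO instructor VALUES (1, 'Asha', 60000)")
conn.execute("INSERT INTO instructor VALUES (2, 'Binu', 55000)")

# Modification of stored information.
conn.execute("UPDATE instructor SET salary = 65000 WHERE id = 1")

# Deletion of information.
conn.execute("DELETE FROM instructor WHERE id = 2")

# Retrieval of stored information.
remaining = conn.execute(
    "SELECT id, name, salary FROM instructor ORDER BY id"
).fetchall()
```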
The part of a DML that involves data retrieval
is called a query language. A query language
can be defined as a high-level special-purpose
language used to satisfy diverse requests for
the retrieval of data held in the database.

Two types of DML:
Procedural DML: requires a user to specify
what data are needed and how to get those
data.
Nonprocedural DML (Declarative DML):
requires a user to specify what data are
needed without specifying how to get
those data.
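The contrast can be sketched by answering the same question, the names of customers in a given city, in both styles. The Python loop stands in for a procedural approach, and the SQL query (run through Python's built-in sqlite3 module) for a declarative one; the data and names are made up.

```python
import sqlite3

customers = [
    {"name": "Ravi", "city": "Kochi"},
    {"name": "Meera", "city": "Pune"},
    {"name": "Anil", "city": "Kochi"},
]

# Procedural style: say what is needed AND how to get it, step by step.
procedural = []
for customer in customers:
    if customer["city"] == "Kochi":
        procedural.append(customer["name"])
procedural.sort()

# Declarative style: say only what is needed; the system decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (:name, :city)", customers
)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE city = 'Kochi' ORDER BY name"
    )
]
```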
Database Users
There are four different types of database-
system users, differentiated by the way
they expect to interact with the system.
Different types of user interfaces have
been designed for the different types of users.
Naive users are unsophisticated users who
interact with the system by invoking one of
the application programs that have been
written previously.
For example, a clerk in the university who
needs to add a new instructor to the database.
Application programmers are computer
professionals who write application programs.
Application programmers can choose from many tools
to develop user interfaces. Rapid application
development (RAD) tools are tools that enable an
application programmer to construct forms and reports
with minimal programming effort.
Sophisticated users interact with the system without
writing programs. Instead, they form their requests
either using a database query language or by
using tools such as data analysis software. Analysts
who submit queries to explore data in the database fall
in this category.
Specialized users are sophisticated users who
write specialized database applications that
do not fit into the traditional data-processing
framework.
Among these applications are computer-aided
design systems, knowledge-base and expert
systems, systems that store data with complex
data types (for example, graphics data and
audio data), and environment-modeling
systems.
Database Administrator
One of the main reasons for using DBMSs is to have
central control of both the data and the programs
that access those data. A person who has such
central control over the system is called a database
administrator (DBA).
The functions of a DBA include:
Schema definition. The DBA creates the original
database schema by executing a set of data
definition statements in the DDL.
Storage structure and access-method
definition.
Schema and physical-organization
modification.
The DBA carries out changes to the schema and
physical organization to reflect the changing
needs of the organization, or to alter the
physical organization to improve performance.
Granting of authorization for data access.
By granting different types of authorization,
the database administrator can regulate which
parts of the database various users can access.
The authorization information is kept in a
special system structure that the database
system consults whenever someone
attempts to access the data in the system.
Routine maintenance.
Examples of the database administrator's
routine maintenance activities are:
Periodically backing up the database, either
onto tapes or onto remote servers, to prevent
loss of data in case of disasters such as
flooding.
Ensuring that enough free disk space is
available for normal operations, and upgrading
disk space as required.
Monitoring jobs running on the database and
ensuring that performance is not degraded by
very expensive tasks submitted by some users.
