Using Basic Structured Query Language
What is SQL?
SQL stands for Structured Query Language
History of SQL
During the 1970s, a group at IBM San Jose Research Laboratory developed the System R
relational database management system, based on the model introduced by Edgar F. Codd
in his influential paper, A Relational Model of Data for Large Shared Data Banks.
The language was originally named SEQUEL (Structured English Query Language). The acronym
SEQUEL was later changed to SQL because "SEQUEL" was a trademark of the UK-based
Hawker Siddeley aircraft company.
At the highest level, SQL statements can be broadly categorized into three types:
Data Definition Language (DDL), Data Manipulation Language (DML), and Data Control
Language (DCL).
DDL and DCL statements are commonly used by a database designer and database
administrator for establishing the database structures used by an application.
The DDL part of SQL permits database tables to be created or deleted. It also
defines indexes (keys), specifies links between tables, and imposes constraints
between tables. The most important DDL statements in SQL are:
1. CREATE DATABASE - creates a new database
2. ALTER DATABASE - modifies a database
3. CREATE TABLE - creates a new table
- Creates a table with the column names the user provides. The user also
needs to specify the data type for each column. Unfortunately, data
types vary slightly from one RDBMS to another, so the user might
need to consult the metadata to establish the data types used for a
particular database.
- It is normally used less often than the data manipulation commands
because a table is created only once, whereas inserting and deleting
rows or changing individual values generally occurs more
frequently.
A table is a collection of related data entries, and it consists of columns and rows.
Almost all modern Relational Database Management Systems, such as MS SQL Server, Microsoft
Access, MSDE, Oracle, DB2, Sybase, MySQL, Postgres, and Informix, use SQL as their standard
database language.
2. SQL Basic
2.1. Language elements
The SQL language is sub-divided into several language elements, including:
The SQL language consists of many statements, which are summarized below. Each
statement requests a specific action from the DBMS, such as creating a new table, retrieving
data, or inserting new data into the database.
Statement       Description
Data Manipulation
SELECT          Retrieves data from the database
INSERT          Adds new rows of data to the database
DELETE          Removes rows of data from the database
UPDATE          Modifies existing database data
Data Definition
CREATE TABLE    Adds a new table to the database
DROP TABLE      Removes a table from the database
ALTER TABLE     Changes the structure of an existing table
Every SQL statement begins with a verb, a keyword that describes what the statement does.
CREATE, INSERT, and DELETE are typical verbs. The statement continues with one or
SQL keywords are reserved words and cannot be used as user-defined names. The most
commonly used keywords in the ANSI/ISO SQL standard are as follows:
ADA, ALL, AND, ANY, AS, ASC, AUTHORIZATION, AVG, BEGIN, BETWEEN, BY, C,
CHAR, CHARACTER, CHECK, CLOSE, COBOL, COMMIT, CONTINUE, COUNT, CREATE,
CURRENT, CURSOR, DEC, DECIMAL, DECLARE, DEFAULT, DELETE, DESC, DISTINCT,
DOUBLE, END, ESCAPE, EXEC, EXISTS, FETCH, FLOAT, FOR, FOREIGN, FORTRAN,
FOUND, FROM, GO, GOTO, GRANT, GROUP, HAVING, IN, INDICATOR, INSERT, INT,
INTEGER, INTO, IS, KEY, LANGUAGE, LIKE, MAX, MIN, MODULE, NOT, NULL,
NUMERIC, OF, ON, OPEN, OPTION, OR, ORDER, PRIMARY, REAL, SELECT, SET,
SOME, SUM, TABLE, TO, UNION, UPDATE, USER, VIEW, WHERE, WITH, WORK
The SQL standard specifies the types of data that can be stored in an SQL-based database and
manipulated by the SQL language.
Character strings:
Data type Description Storage
Binary types:
Data type Description Storage
bit Allows 0, 1, or NULL
binary(n) Fixed-length binary data. Maximum 8,000 bytes
varbinary(n) Variable-length binary data. Maximum 8,000 bytes
varbinary(max) Variable-length binary data. Maximum 2GB
image Variable-length binary data. Maximum 2GB
Number types:
The p parameter indicates the maximum total number of digits that can
be stored (both to the left and to the right of the decimal point). p must
be a value from 1 to 38. Default is 18.
Date types:
sql_variant Stores up to 8,000 bytes of data of various data types, except text, ntext, and
timestamp
Fixed-length character strings: columns holding this type of data typically store names of
people and companies, addresses, descriptions, and so on.
Date: stores a date such as June 30, 2009 or 30 June 2009.
2.4. Constants
In some SQL statements, a numeric, character, or date data value must be expressed in text form.
For example, consider an INSERT statement, which adds a student to the database:
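A minimal sketch of such a statement is given here. The student table and its Name and dept columns are taken from the other examples in this section; the values themselves are made up for illustration.

```sql
-- Hypothetical values: both are character constants written in single quotes
INSERT INTO student (Name, dept)
VALUES ('Abebe Kebede', 'Computer Science');
```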
The value for each column in the newly inserted row is specified in the VALUES clause. Constant
data values are also used in expressions, such as in the SELECT statement:
SELECT city
FROM offices
If a single quote is to be included in the constant text, it is written within the constant
as two consecutive single quote characters.
Example: the constant value I can't is written as 'I can''t'
2.4.3. Date and time constants
In SQL products that support date/time data, constant values for dates, times, and time
intervals are specified as string constants. The format of these constants varies from one
DBMS to the next.
Example:
SELECT Name, dept
FROM student
SELECT city
FROM offices
WHERE sales > target +50000.00
2.6. Missing data (Null Values)
3. SQL Commands
3.1. Create:-
3.1.1. Create SQL DB
Database Tables
A database most often contains one or more tables. Each table is identified by a name (e.g.
"Customers" or "Orders"). Tables contain records (rows) with data.
Below is an example of a table called "Persons":
P_Id  LastName  FirstName  Address  City
1     Aberu     Ola        AA       Areda Subsity
2     Chala     sanyi      W/shoa   Ambo
3     Mohammad  Kari       Arsi     Asella
The table above contains three records (one for each person) and five columns (P_Id,
LastName, FirstName, Address, and City).
Now we want to create a table called "Persons" that contains five columns: P_Id, LastName,
FirstName, Address, and City.
The P_Id column is of type int and will hold a number. The LastName, FirstName, Address, and
City columns are of type varchar with a maximum length of 255 characters.
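The description above can be written roughly as follows. This is a sketch in SQL Server syntax; the column names and types are the ones given in the text.

```sql
-- Creates an empty Persons table with one int column and four text columns
CREATE TABLE Persons (
    P_Id      int,
    LastName  varchar(255),
    FirstName varchar(255),
    Address   varchar(255),
    City      varchar(255)
);
```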
The empty table can be filled with data with the INSERT INTO statement.
Constraints are used to limit the type of data that can go into a table.
Constraints can be specified when a table is created (with the CREATE TABLE statement) or after
the table is created (with the ALTER TABLE statement).
The NOT NULL constraint enforces a column to NOT accept NULL values.
The NOT NULL constraint enforces a field to always contain a value. This means that you cannot
insert a new record, or update a record without adding a value to this field.
The following SQL enforces the "P_Id" column and the "LastName" column to
not accept NULL values:
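A sketch of such a definition, using the Persons table from the earlier example:

```sql
-- P_Id and LastName must always be given a value
CREATE TABLE Persons (
    P_Id      int NOT NULL,
    LastName  varchar(255) NOT NULL,
    FirstName varchar(255),
    Address   varchar(255),
    City      varchar(255)
);
```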
The UNIQUE and PRIMARY KEY constraints both provide a guarantee for uniqueness for a
column or set of columns.
Note that you can have many UNIQUE constraints per table, but only one PRIMARY KEY
constraint per table.
The following SQL creates a UNIQUE constraint on the "P_Id" column when the
"Persons" table is created:
To allow naming of a UNIQUE constraint, and for defining a UNIQUE constraint on multiple
columns, use the following SQL syntax:
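A sketch of the named, multi-column form. The constraint name uc_PersonID is an arbitrary name chosen for illustration.

```sql
-- The combination (P_Id, LastName) must be unique across all rows
CREATE TABLE Persons (
    P_Id      int NOT NULL,
    LastName  varchar(255) NOT NULL,
    FirstName varchar(255),
    Address   varchar(255),
    City      varchar(255),
    CONSTRAINT uc_PersonID UNIQUE (P_Id, LastName)
);
```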
To create a UNIQUE constraint on the "P_Id" column when the table is already
created, use the following SQL:
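For an existing Persons table, the statement could look like this:

```sql
-- Adds an unnamed UNIQUE constraint to an existing table
ALTER TABLE Persons
ADD UNIQUE (P_Id);
```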
The PRIMARY KEY constraint uniquely identifies each record in a database table.
Each table should have a primary key, and each table can have only one primary key.
The following SQL creates a PRIMARY KEY on the "P_Id" column when the
"Persons" table is created:
To allow naming of a PRIMARY KEY constraint, and for defining a PRIMARY KEY
constraint on multiple columns, use the following SQL syntax:
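A sketch of the named, multi-column form. The constraint name pk_PersonID is an arbitrary name chosen for illustration.

```sql
-- The primary key is the combination of P_Id and LastName
CREATE TABLE Persons (
    P_Id      int NOT NULL,
    LastName  varchar(255) NOT NULL,
    FirstName varchar(255),
    Address   varchar(255),
    City      varchar(255),
    CONSTRAINT pk_PersonID PRIMARY KEY (P_Id, LastName)
);
```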
To create a PRIMARY KEY constraint on the "P_Id" column when the table is
already created, use the following SQL:
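For an existing Persons table in which P_Id was declared NOT NULL, the statement could look like this:

```sql
-- Adds an unnamed PRIMARY KEY constraint to an existing table
ALTER TABLE Persons
ADD PRIMARY KEY (P_Id);
```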
To allow naming of a PRIMARY KEY constraint, and for defining a PRIMARY KEY
constraint on multiple columns, when the table is already created, use the following SQL syntax:
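A sketch of the ALTER TABLE form with a named constraint. The name pk_PersonID is an arbitrary name chosen for illustration.

```sql
-- Both columns must already be declared NOT NULL
ALTER TABLE Persons
ADD CONSTRAINT pk_PersonID PRIMARY KEY (P_Id, LastName);
```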
Note: If you use the ALTER TABLE statement to add a primary key, the primary key column(s)
must already have been declared to not contain NULL values (when the table was first created).
Let's illustrate the foreign key with an example. Look at the following two tables:
Note that the "P_Id" column in the "Orders" table points to the "P_Id" column in the "Persons" table.
The "P_Id" column in the "Persons" table is the PRIMARY KEY in the "Persons" table.
The "P_Id" column in the "Orders" table is a FOREIGN KEY in the "Orders" table.
The FOREIGN KEY constraint is used to prevent actions that would destroy links between tables.
The FOREIGN KEY constraint also prevents invalid data from being inserted into the foreign key
column, because the value has to be one of the values contained in the table it points to.
The following SQL creates a FOREIGN KEY on the "P_Id" column when the
"Orders" table is created:
To allow naming of a FOREIGN KEY constraint, and for defining a FOREIGN KEY
constraint on multiple columns, use the following SQL syntax:
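A sketch of the named form. The constraint name fk_PerOrders and the O_Id and OrderNo columns are assumptions made for illustration.

```sql
CREATE TABLE Orders (
    O_Id    int NOT NULL,   -- assumed column
    OrderNo int NOT NULL,   -- assumed column
    P_Id    int,
    PRIMARY KEY (O_Id),
    CONSTRAINT fk_PerOrders FOREIGN KEY (P_Id)
        REFERENCES Persons(P_Id)
);
```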
To create a FOREIGN KEY constraint on the "P_Id" column when the "Orders" table is already
created, use the following SQL:
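For an existing Orders table, the statement could look like this:

```sql
-- Adds an unnamed FOREIGN KEY constraint to an existing table
ALTER TABLE Orders
ADD FOREIGN KEY (P_Id) REFERENCES Persons(P_Id);
```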
To allow naming of a FOREIGN KEY constraint, and for defining a FOREIGN KEY
constraint on multiple columns, when the table is already created, use the following SQL syntax:
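A sketch of the ALTER TABLE form with a named constraint. The name fk_PerOrders is an arbitrary name chosen for illustration.

```sql
ALTER TABLE Orders
ADD CONSTRAINT fk_PerOrders
    FOREIGN KEY (P_Id) REFERENCES Persons(P_Id);
```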
The CHECK constraint is used to limit the value range that can be placed in a column.
If you define a CHECK constraint on a single column it allows only certain values for this column.
If you define a CHECK constraint on a table it can limit the values in certain columns based on
values in other columns in the row.
The following SQL creates a CHECK constraint on the "P_Id" column when the
"Persons" table is created. The CHECK constraint specifies that the column
"P_Id" must only include integers greater than 0.
To create a CHECK constraint on the "P_Id" column when the table is already
created, use the following SQL:
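For an existing Persons table, the statement could look like this:

```sql
-- Adds an unnamed CHECK constraint to an existing table
ALTER TABLE Persons
ADD CHECK (P_Id > 0);
```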
The default value will be added to all new records, if no other value is specified.
The following SQL creates a DEFAULT constraint on the "City" column when the
"Persons" table is created:
The DEFAULT constraint can also be used to insert system values, by using
functions like GETDATE():
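A sketch using GETDATE(), which is a SQL Server function. The Orders columns O_Id, OrderNo, and OrderDate are assumptions made for illustration.

```sql
-- OrderDate defaults to the current date and time at insert
CREATE TABLE Orders (
    O_Id      int NOT NULL,   -- assumed column
    OrderNo   int NOT NULL,   -- assumed column
    P_Id      int,
    OrderDate datetime DEFAULT GETDATE()
);
```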
To create a DEFAULT constraint on the "City" column when the table is already
created, use the following SQL:
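In SQL Server, a default is added to an existing column as a named constraint. The name df_City and the value 'Sandnes' are assumptions made for illustration; other database systems use a different syntax for this.

```sql
ALTER TABLE Persons
ADD CONSTRAINT df_City DEFAULT 'Sandnes' FOR City;
```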
Most of the actions you need to perform on a database are done with SQL statements.
The following SQL statement will select all the records in the "Persons" table:
In this tutorial we will teach you all about the different SQL statements.
Some database systems require a semicolon at the end of each SQL statement.
Semicolon is the standard way to separate each SQL statement in database systems that allow more
than one SQL statement to be executed in the same call to the server. For most systems, every SQL
statement is terminated by a semicolon (;).
We are using MS Access and SQL Server 2000 and we do not have to put a semicolon after each
SQL statement, but some database programs force you to use it.
An SQL statement can be entered on one line or split across several lines for clarity.
For most systems SQL is not case sensitive. We can mix uppercase and lowercase when referencing
SQL keywords (such as SELECT and INSERT), table names, and column names. However, case
does matter when referring to the contents of a column.
The "Persons" table:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
Now we want to select the content of the columns named "LastName" and "FirstName" from the
table above.
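A statement along these lines lists only the two named columns:

```sql
SELECT LastName, FirstName
FROM Persons;
```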
LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
SELECT * Example
Now we want to select all the columns from the "Persons" table.
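The asterisk (*) is shorthand for all columns:

```sql
SELECT *
FROM Persons;
```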
Navigation in a Result-set
Most database software systems allow navigation in the result-set with programming functions, like:
Move-To-First-Record, Get-Record-Content, Move-To-Next-Record, etc.
In a table, some of the columns may contain duplicate values. This is not a problem; however,
sometimes you will want to list only the different (distinct) values in a table.
The DISTINCT keyword can be used to return only distinct (different) values.
Now we want to select only the distinct values from the column named "City" from the table above.
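A statement along these lines returns each city only once:

```sql
SELECT DISTINCT City
FROM Persons;
```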
City
Sandnes
Stavanger
The WHERE clause is used to extract only those records that fulfill a specified criterion.
Now we want to select only the persons living in the city "Sandnes" from the table above.
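A statement along these lines keeps only the rows whose City column equals the given text value:

```sql
SELECT *
FROM Persons
WHERE City = 'Sandnes';
```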
SQL uses single quotes around text values (most database systems will also accept double quotes).
Numeric values should not be enclosed in quotes.
For text values, this is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong:
SELECT * FROM Persons WHERE FirstName=Tove
For numeric values, this is correct:
SELECT * FROM Persons WHERE Year=1965
This is wrong:
SELECT * FROM Persons WHERE Year='1965'
Operator  Description
=         Equal
<>        Not equal
>         Greater than
<         Less than
>=        Greater than or equal
<=        Less than or equal
BETWEEN   Between an inclusive range
LIKE      Search for a pattern
IN        If you know the exact value you want to return for at least one of the columns
The AND operator displays a record if both the first condition and the second condition are true.
The OR operator displays a record if either the first condition or the second condition is true.
Now we want to select only the persons with the first name equal to "Tove" AND the last name
equal to "Svendson":
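A statement along these lines requires both conditions to hold for a row to be returned:

```sql
SELECT *
FROM Persons
WHERE FirstName = 'Tove'
  AND LastName = 'Svendson';
```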
OR Operator Example
Now we want to select only the persons with the first name equal to "Tove" OR the first name equal
to "Ola":
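A statement along these lines returns a row if either condition holds:

```sql
SELECT *
FROM Persons
WHERE FirstName = 'Tove'
   OR FirstName = 'Ola';
```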
You can also combine AND and OR (use parentheses to form complex expressions).
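As an illustration, the following sketch selects persons whose last name is "Svendson" and whose first name is either "Tove" or "Ola". The particular combination of conditions is an example made up for this purpose.

```sql
-- The parentheses make the OR apply before the AND
SELECT *
FROM Persons
WHERE LastName = 'Svendson'
  AND (FirstName = 'Tove' OR FirstName = 'Ola');
```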
If you want to sort the records in a descending order, you can use the DESC keyword.
ORDER BY Example
Now we want to select all the persons from the table above, however, we want to sort the persons by
their last name.
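A statement along these lines sorts the rows by last name in ascending order, which is the default:

```sql
SELECT *
FROM Persons
ORDER BY LastName;
```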
Now we want to select all the persons from the table above, however, we want to sort the persons
descending by their last name.
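Adding the DESC keyword reverses the sort order:

```sql
SELECT *
FROM Persons
ORDER BY LastName DESC;
```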
The second form specifies both the column names and the values to be
inserted:
The following SQL statement will add a new row, but only add data in the
"P_Id", "LastName" and the "FirstName" columns:
Note: Notice the WHERE clause in the UPDATE syntax. The WHERE clause specifies which
record or records should be updated. If you omit the WHERE clause, all records will be
updated!
Now we want to update the person "Tjessem, Jakob" in the "Persons" table.
UPDATE Persons SET Address='Nissestien 67', City='Sandnes'
WHERE LastName='Tjessem' AND FirstName='Jakob'
Be careful when updating records. If we had omitted the WHERE clause in the
example above, like this:
UPDATE Persons
SET Address='Nissestien 67', City='Sandnes'
Note: Notice the WHERE clause in the DELETE syntax. The WHERE clause specifies which record
or records should be deleted. If you omit the WHERE clause, all records will be deleted!
Now we want to delete the person "Tjessem, Jakob" in the "Persons" table.
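A statement along these lines removes only the matching row:

```sql
DELETE FROM Persons
WHERE LastName = 'Tjessem'
  AND FirstName = 'Jakob';
```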
It is possible to delete all rows in a table without deleting the table. This means
that the table structure, attributes, and indexes will be intact:
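Omitting the WHERE clause removes every row but keeps the table itself:

```sql
-- Deletes all rows; the Persons table still exists afterwards
DELETE FROM Persons;
```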
Note: Be very careful when deleting records. You cannot undo this statement!
Indexes allow the database application to find data fast, without reading the whole table.
Indexes
An index can be created in a table to find data more quickly and efficiently.
Note: Updating a table with indexes takes more time than updating a table without (because the
indexes also need an update). So you should only create indexes on columns (and tables) that will be
frequently searched against.
Note: The syntax for creating indexes varies amongst different databases. Therefore: Check the
syntax for creating indexes in your database.
The SQL statement below creates an index named "PIndex" on the "LastName"
column in the "Persons" table:
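A sketch of the statement described above:

```sql
CREATE INDEX PIndex
ON Persons (LastName);
```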
If you want to create an index on a combination of columns, you can list the
column names within the parentheses, separated by commas:
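For example, an index on last name and first name together could look like this. The choice of the FirstName column as the second column is an assumption made for illustration.

```sql
CREATE INDEX PIndex
ON Persons (LastName, FirstName);
```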
What if we only want to delete the data inside the table, and not the table itself?
The ALTER TABLE statement is used to add, delete, or modify columns in an existing table.
To delete a column in a table, use the following syntax (notice that some
database systems don't allow deleting a column):
To change the data type of a column in a table, use the following syntax:
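In SQL Server the change is made with ALTER COLUMN; other database systems use different keywords, such as MODIFY. The column and new type below are assumptions chosen only to show the form of the statement.

```sql
-- Hypothetical change: widen the City column
ALTER TABLE Persons
ALTER COLUMN City varchar(500);
```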
Notice that the new column, "DateOfBirth", is of type date and is going to hold a date. The data type
specifies what type of data the column can hold.
Now we want to change the data type of the column named "DateOfBirth" in the "Persons" table.
Notice that the "DateOfBirth" column is now of type year and is going to hold a year in a two-digit
or four-digit format.
Next, we want to delete the column named "DateOfBirth" in the "Persons" table.
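A statement along these lines removes the column and all the data stored in it:

```sql
ALTER TABLE Persons
DROP COLUMN DateOfBirth;
```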
Very often we would like the value of the primary key field to be created automatically every time a
new record is inserted.
The MS SQL Server uses the IDENTITY keyword to perform an auto-increment feature.
By default, the starting value for IDENTITY is 1, and it will increment by 1 for each new record.
To specify that the "P_Id" column should start at value 10 and increment by 5, change the identity to
IDENTITY(10,5).
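A sketch of a Persons table whose primary key is generated automatically in SQL Server:

```sql
-- IDENTITY alone starts at 1 and increments by 1;
-- write IDENTITY(10,5) to start at 10 and increment by 5
CREATE TABLE Persons (
    P_Id      int IDENTITY PRIMARY KEY,
    LastName  varchar(255) NOT NULL,
    FirstName varchar(255),
    Address   varchar(255),
    City      varchar(255)
);
```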
To insert a new record into the "Persons" table, we will not have to specify a
value for the "P_Id" column (a unique value will be added automatically):
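A statement along these lines leaves P_Id out of the column list:

```sql
-- P_Id is generated by the IDENTITY property
INSERT INTO Persons (FirstName, LastName)
VALUES ('Lars', 'Monsen');
```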
The SQL statement above would insert a new record into the "Persons" table. The "P_Id" column
would be assigned a unique value. The "FirstName" column would be set to "Lars" and the
"LastName" column would be set to "Monsen".
A view contains rows and columns, just like a real table. The fields in a view are fields from one or
more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the
data were coming from one single table.
Note: A view always shows up-to-date data! The database engine recreates the data, using the view's
SQL statement, every time a user queries a view.
If you have the Northwind database you can see that it has several views installed by default.
The view "Current Product List" lists all active products (products that are not
discontinued) from the "Products" table. The view is created with the following
SQL:
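A sketch of how such a view could be defined in SQL Server. It assumes that the Products table has ProductID, ProductName, and Discontinued columns and that Discontinued is stored as 0 for active products.

```sql
-- Square brackets are needed because the view name contains spaces
CREATE VIEW [Current Product List] AS
SELECT ProductID, ProductName
FROM Products
WHERE Discontinued = 0;
```

The view can then be queried like a table: SELECT * FROM [Current Product List].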
Another view in the Northwind sample database selects every product in the
"Products" table with a unit price higher than the average unit price:
Another view in the Northwind database calculates the total sale for each
category in 1997. Note that this view selects its data from another view called
"Product Sales for 1997":
We can also add a condition to the query. Now we want to see the total sale
only for the category "Beverages":
Now we want to add the "Category" column to the "Current Product List" view.
We will update the view with the following SQL:
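In SQL Server an existing view is redefined with ALTER VIEW; some other systems use CREATE OR REPLACE VIEW instead. The sketch assumes the Products table has a column named Category, as the text states, and that Discontinued is stored as 0 for active products.

```sql
ALTER VIEW [Current Product List] AS
SELECT ProductID, ProductName, Category
FROM Products
WHERE Discontinued = 0;
```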
4. SQL Functions
4.1. Overview of SQL Functions
SQL aggregate functions return a single value, calculated from values in a column.
SQL scalar functions return a single value, based on the input value.
Tip: The aggregate functions and the scalar functions will be explained in detail in the next
chapters.
OrderAverage
950
Now we want to find the customers that have an OrderPrice value higher than the average
OrderPrice value.
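One way to write this is with a subquery that computes the average first. The sketch assumes an Orders table with Customer and OrderPrice columns, as used in the results shown in this chapter.

```sql
SELECT Customer
FROM Orders
WHERE OrderPrice > (SELECT AVG(OrderPrice) FROM Orders);
```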
Customer
Hansen
Nilsen
Note: COUNT(DISTINCT) works with ORACLE and Microsoft SQL Server, but not with
Microsoft Access.
CustomerNilsen
2
NumberOfOrders
6
Now we want to count the number of unique customers in the "Orders" table.
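A statement along these lines counts each customer name only once:

```sql
SELECT COUNT(DISTINCT Customer) AS NumberOfCustomers
FROM Orders;
```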
NumberOfCustomers
3
which is the number of unique customers (Hansen, Nilsen, and Jensen) in the "Orders" table.
The FIRST() function returns the first value of the selected column.
FirstOrderPrice
1000
The LAST() function returns the last value of the selected column.
LastOrderPrice
100
The MAX() function returns the largest value of the selected column.
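For example, to find the largest value of the OrderPrice column in the Orders table:

```sql
SELECT MAX(OrderPrice) AS LargestOrderPrice
FROM Orders;
```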
LargestOrderPrice
2000
The MIN() function returns the smallest value of the selected column.
SmallestOrderPrice
100
OrderTotal
5700
The GROUP BY statement is used in conjunction with the aggregate functions to group the
result-set by one or more columns.
Now we want to find the total sum (total order) of each customer.
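A statement along these lines produces one row per customer:

```sql
SELECT Customer, SUM(OrderPrice)
FROM Orders
GROUP BY Customer;
```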
Customer SUM(OrderPrice)
Hansen 2000
Nilsen 1700
Jensen 2000
Customer SUM(OrderPrice)
Hansen 5700
Nilsen 5700
Hansen 5700
Hansen 5700
Jensen 5700
Nilsen 5700
Explanation of why the above SELECT statement cannot be used: The SELECT statement above has
two columns specified (Customer and SUM(OrderPrice)). The "SUM(OrderPrice)" returns a single
value (that is, the total sum of the "OrderPrice" column), while "Customer" returns 6 values (one
value for each row in the "Orders" table). This will therefore not give us the correct result. However,
you have seen that the GROUP BY statement solves this problem.
We can also use the GROUP BY statement on more than one column, like this:
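A sketch of grouping by two columns. The OrderDate column is an assumption about the Orders table.

```sql
-- One row per distinct (Customer, OrderDate) pair
SELECT Customer, OrderDate, SUM(OrderPrice)
FROM Orders
GROUP BY Customer, OrderDate;
```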
The HAVING clause was added to SQL because the WHERE keyword could not be used with
aggregate functions.
Now we want to find if any of the customers have a total order of less than 2000.
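A statement along these lines filters the groups after the sums have been computed:

```sql
SELECT Customer, SUM(OrderPrice)
FROM Orders
GROUP BY Customer
HAVING SUM(OrderPrice) < 2000;
```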
Customer SUM(OrderPrice)
Nilsen 1700
Now we want to find if the customers "Hansen" or "Jensen" have a total order of more than 1500.
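Here an ordinary WHERE clause restricts the rows before grouping, and HAVING restricts the groups afterwards:

```sql
SELECT Customer, SUM(OrderPrice)
FROM Orders
WHERE Customer = 'Hansen' OR Customer = 'Jensen'
GROUP BY Customer
HAVING SUM(OrderPrice) > 1500;
```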
Customer SUM(OrderPrice)
Hansen 2000
Jensen 2000
Now we want to select the content of the "LastName" and "FirstName" columns above, and convert
the "LastName" column to uppercase.
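In SQL Server the conversion is done with the UPPER() function; MS Access uses UCASE() for the same purpose. A sketch against the Persons table:

```sql
SELECT UPPER(LastName) AS LastName, FirstName
FROM Persons;
```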
LastName FirstName
HANSEN Ola
SVENDSON Tove
PETTERSEN Kari