Advanced SQL Programming in AS400 Iseries

This document provides an overview of advanced SQL techniques for improving programming efficiency and performance. It covers topics like different join types, subqueries, unions, views, and performance optimization. The goal is to show how letting the database do more work can reduce programming effort and improve performance over traditional programming methods. Examples are provided for many of the techniques discussed.

Uploaded by

Ramprasad Pasupuleti

Advanced SQL Programming

Q: How does the document suggest using SQL to improve database performance, and what are some specific strategies mentioned?

The document suggests various strategies to enhance SQL performance, including creating indexes over columns that limit data in WHERE clauses or are used in JOIN, ORDER BY, and GROUP BY operations. Additionally, Encoded Vector Indexes (EVIs) are recommended for large datasets with a modest number of distinct values. It also advises encouraging the optimizer to use indexes by using keyed columns in WHERE clauses, and keeping statements simple so the optimizer is less likely to choose a sub-optimal query plan. The document also points to DB2's parallelism features as a way to process queries more efficiently.

Q: What are the implications of referential integrity and constraints in SQL, and how can they influence database integrity?

Referential integrity ensures relationships between tables are maintained, such as a child row requiring a parent row. Constraints enforce various rules like ensuring a primary key must be unique, while foreign keys maintain relationships by linking tables. These constraints can automatically manage deletions or updates of related data. For example, the 'ON DELETE CASCADE' constraint ensures all child records are also deleted when a parent record is removed, safeguarding data accuracy and consistency across the database.

Q: What restrictions and rules are associated with maintaining primary and foreign keys in SQL tables as described in the document?

The document specifies that primary keys must be unique within their table, serving as unique identifiers for rows. Foreign keys link child tables to parent tables, maintaining referential integrity by ensuring child rows correspond to parent rows. Restrictions include parent-based rules, such as a parent row not being deletable if dependent children exist, unless 'CASCADE' or 'SET NULL/DEFAULT' rules are specified. Foreign keys ensure consistency and validity of data relationships across tables.

Q: In what ways do SQL Views simplify data handling, and what benefits do they offer according to the document?

SQL Views simplify data handling by allowing users to encapsulate SQL logic for repeated use, making complex queries easier to manage and understand. They provide consistency by incorporating business rules and allow database structures to be more comprehensible to users. Views can also help departments like human resources perform specific queries efficiently, such as identifying 'new' employees hired within the past two years.

Q: What role do subqueries play in SQL, and how are different types of subqueries utilized?

Subqueries in SQL allow for selecting only necessary data without separate statements. The document distinguishes between correlated and non-correlated subqueries. A correlated subquery refers to the outer query and is evaluated multiple times, while a non-correlated subquery does not relate to the outer query and is evaluated once. Subqueries are powerful when selecting data, such as listing employees making more than the average salary. The document also gives subquery optimization tips, such as comparing against MAX(...) instead of using > ALL, and using a correlated EXISTS instead of IN.

Q: How does the document address the optimization of complex SQL queries and what are the recommended practices?

The document provides several tips for optimizing complex SQL queries, such as keeping statements simple to allow better optimization opportunities. It recommends using indexes strategically on columns frequently used in WHERE clauses or involved in joins. Avoiding full table scans by indexing can significantly enhance performance. Additionally, enabling parallelism in DB2 can distribute the query processing load, improving speed and efficiency. These measures help ensure that SQL queries are processed optimally by the SQL optimizer.

Q: What are the different types of joins mentioned, and how do they differ in terms of data retrieval from tables?

The document describes four types of joins: Inner Join, Left Outer Join, Exception Join, and Cross Join. Inner Join is used to find related data between tables by returning only rows that have matching values. Left Outer Join returns all rows from the left table and the matched rows from the right table, with NULLs for rows in the left table that have no match in the right table. Exception Join fetches rows from the left table that do not have corresponding entries in the right table, thus identifying 'orphaned' rows. Cross Join returns every possible combination of rows from both tables, resulting in a Cartesian product.

Q: What are the performance diagnosis tools mentioned in the document, and how do they contribute to SQL query optimization?

The document mentions performance diagnosis tools such as STRDBG, which outputs debug messages in the job log, STRDBMON for putting optimizer information into a file, QAQQINI to force messages, and CHGQRYA which outputs messages when a time limit is set to zero. These tools help diagnose and optimize SQL queries by providing insights into how the optimizer processes them, allowing developers to make data-driven adjustments for enhanced performance.

Q: What are the SQL CASE statements, and how do they improve data manipulation in queries as given in the document?

SQL CASE statements improve data manipulation by allowing conditional logic within SQL queries. The document provides examples where CASE is used to perform different calculations based on certain conditions, such as region-based categorization or preventing division-by-zero errors in calculations. This conditional logic simplifies data handling and reduces errors in query processing.

Q: Discuss the conditions under which the UNION and UNION ALL operations are used in SQL, according to the document. What differentiates them?

The UNION operation in SQL is used to combine the results of two or more SELECT statements into a single result set, containing only distinct rows. In contrast, UNION ALL also combines results but does not remove duplicates, thus preserving all rows. Each SELECT within a UNION must have the same number of columns in their result sets and the columns must have compatible data types. UNION ALL is beneficial when duplicates are allowed, which can lead to more efficient performance since no duplicate removal is necessary.

Mark Holm, Centerfield Technology

Goals
Introduce some useful advanced SQL programming techniques
Show you how to let the database do more work to reduce programming effort
Go over some basic techniques and tips to improve performance

Notes
V4R3 and higher syntax used in examples
Examples show only a small subset of what can be done!

Agenda
Joining files - techniques, dos and don'ts
Query within a query - Subqueries
Stacking data - Unions
Simplifying data with Views
Referential Integrity and constraints
Performance, performance, performance

Joining files
Joins are used to relate data from different tables
Data can be retrieved with one open file rather than many
Concept is identical to join logical files, but without an associated permanent object (except if the join is done with an SQL view)

Join types
Inner Join: used to find related data
Left Outer (or simply Outer) Join: used to find related data and orphaned rows
Exception Join: used to find only orphaned rows
Cross Join: joins all rows to all rows

Sample tables
Employee table

FirstName  LastName  Dept
John       Doe       397
Cindy      Smith     450
Sally      Anderson  250

Department table

Dept  Area
397   Development
550   Marketing
250   Sales
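The sample tables can be set up with a few statements so the join examples can be tried directly. This is a minimal sketch: the column types and lengths are assumptions, since the slides give only column names and values.

```sql
-- Column types and lengths are assumed; the slides show only names and values.
CREATE TABLE Employee (
  FirstName CHAR(20),
  LastName  CHAR(30),
  Dept      INTEGER
);

CREATE TABLE Department (
  Dept INTEGER,
  Area CHAR(20)
);

INSERT INTO Employee VALUES ('John',  'Doe',      397);
INSERT INTO Employee VALUES ('Cindy', 'Smith',    450);
INSERT INTO Employee VALUES ('Sally', 'Anderson', 250);

INSERT INTO Department VALUES (397, 'Development');
INSERT INTO Department VALUES (550, 'Marketing');
INSERT INTO Department VALUES (250, 'Sales');
```

Note the two deliberate mismatches: Smith's department (450) has no row in Department, and Marketing (550) has no employees. These are what make the join types return different results.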

Inner Join
Method #1 - Using the WHERE Clause
SELECT LastName, Area
FROM Employee, Department
WHERE Employee.Dept = Department.Dept

Method #2 - Using the JOIN Clause
SELECT LastName, Area
FROM Employee INNER JOIN Department
ON Employee.Dept = Department.Dept
NOTE: This method is useful if you need to influence the order in which the tables are joined for performance reasons. Only works on releases prior to V4R4.

Results
Returns the list of employees that are in a valid department
Employee Smith is not returned because she is not in a department listed in the Department table

Result table

LastName  Area
Doe       Development
Anderson  Sales

Left Outer Join

Must use Join Syntax
SELECT LastName, Area
FROM Employee LEFT OUTER JOIN Department
ON Employee.Dept = Department.Dept

Results
Returns the list of employees even if they are not in a valid department
Employee Smith has a NULL Area because she could not be associated with a valid Dept

Result table

LastName  Area
Doe       Development
Smith     -
Anderson  Sales

Exception Join

Must use Join Syntax
SELECT LastName, Area
FROM Employee EXCEPTION JOIN Department
ON Employee.Dept = Department.Dept

Results
Returns the list of employees only if they are NOT in a valid department
Employee Smith is the only one without a valid department

Result table

LastName  Area
Smith     -
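EXCEPTION JOIN is not available in most other SQL dialects. As a sketch, the same orphaned rows can usually be found with standard syntax in either of the two forms below. These are alternatives written for comparison, not statements from the slides.

```sql
-- Form 1: outer join, then keep only the rows that found no department.
SELECT LastName, Area
FROM Employee LEFT OUTER JOIN Department
  ON Employee.Dept = Department.Dept
WHERE Department.Dept IS NULL;

-- Form 2: NOT EXISTS subquery (returns LastName only, since no
-- Department row is available for the orphaned employees).
SELECT LastName
FROM Employee
WHERE NOT EXISTS (SELECT 1
                  FROM Department
                  WHERE Department.Dept = Employee.Dept);
```

Against the sample tables, both forms return only Smith.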

WARNING!
The order in which tables are listed in the FROM clause is important
For OUTER and EXCEPTION joins, the database must join the tables in that order
The result may be horrible performance (more on this topic later)

Observations
Joins provide one way to bury application logic in the database
Each join type has a purpose and can be used not only to get the data you want but also to identify incomplete information
With some exceptions, if joined properly, performance should be at least as good as an application
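Cross Join is listed among the join types but has no example of its own. A minimal sketch against the sample Employee and Department tables:

```sql
-- Every employee is paired with every department.
-- With 3 employees and 3 departments the result has 3 x 3 = 9 rows.
SELECT LastName, Area
FROM Employee CROSS JOIN Department
```

Because the row count is the product of the two table sizes, a cross join over large tables gets expensive quickly.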

Subqueries
Subqueries are a powerful way to select only the data you need without separate statements.
Example: List employees making a higher than average salary

Subquery Example
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE)

SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE WHERE LNAME = 'JONES')

Subqueries - types
Correlated
Inner select refers to part of the outer (parent) select (multiple evaluations)

Non-Correlated
Inner select does not relate to outer query (one evaluation)
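Both statements on the Subquery Example slide are non-correlated. For contrast, here is a sketch of a correlated subquery that lists employees earning more than the average for their own department. The DEPT column on EMPLOYEE is an assumption made for illustration.

```sql
-- Correlated: the inner select references E.DEPT from the outer row,
-- so it is logically evaluated once per outer row.
-- DEPT is an assumed column; FNAME, LNAME and SALARY come from the slides.
SELECT E.FNAME, E.LNAME
FROM EMPLOYEE E
WHERE E.SALARY > (SELECT AVG(X.SALARY)
                  FROM EMPLOYEE X
                  WHERE X.DEPT = E.DEPT)
```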

Subquery Tips 1
Subquery optimization (2nd statement will be faster)
SELECT name
FROM employee
WHERE salary > ALL (SELECT salary FROM salscale)

SELECT name
FROM employee
WHERE salary > (SELECT max(salary) FROM salscale)

Subquery Tips 2
Subquery optimization (2nd statement will be faster)
SELECT name
FROM employee
WHERE salid IN (SELECT salid FROM salscale)

SELECT name
FROM employee
WHERE EXISTS (SELECT salary FROM salscale WHERE employee.salid = salscale.salid)

UNIONs
Unions provide a way to append multiple row sets (files) in one statement
Example: Process all of the orders from January and February
SELECT * FROM JanOrders WHERE SKU = 199976
UNION
SELECT * FROM FebOrders WHERE SKU = 199976

Unions
Each SELECT statement that is UNIONed together must have the same number of result columns and have compatible types
Two forms of syntax:
UNION ALL -- allows duplicate records
UNION -- returns only distinct rows
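A sketch of the UNION ALL form with explicit column lists, which makes the "same number of columns, compatible types" rule visible. OrderId and Quantity are hypothetical column names; only SKU appears in the slides.

```sql
-- UNION ALL keeps duplicate rows, so the database can skip the
-- duplicate-removal step that plain UNION requires.
-- OrderId and Quantity are assumed column names.
SELECT OrderId, SKU, Quantity FROM JanOrders WHERE SKU = 199976
UNION ALL
SELECT OrderId, SKU, Quantity FROM FebOrders WHERE SKU = 199976
ORDER BY OrderId
```

The ORDER BY applies to the combined result, not to the second SELECT alone.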

Views
Views provide a convenient way to store SQL logic permanently
Create once and use many times
Also make the database more understandable to users
Can put simple business rules into views to ensure consistency

Views
Example: Make it easy for the human resources department to run a report that shows new employees.
CREATE VIEW HR/NEWBIES (EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE) AS
SELECT concat(concat(strip(last_name),','),strip(first_name)), department, hire_date
FROM HR/EMPLOYEE
WHERE (year(current date)-year(hire_date)) < 2
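Once created, the view is queried like a table. The first statement below is a sketch of such a report. The second shows an assumed alternative predicate: the view compares calendar years only, so someone hired in December two calendar years ago is already excluded after about 13 months, whereas date arithmetic counts two years to the day.

```sql
-- Run the HR report against the view.
SELECT EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE
FROM HR/NEWBIES
ORDER BY HIRE_DATE DESC;

-- Alternative predicate (an assumption, not from the slides):
-- employees hired within the last two years to the day.
SELECT last_name, first_name, hire_date
FROM HR/EMPLOYEE
WHERE hire_date > CURRENT DATE - 2 YEARS;
```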

Performance
SQL performance is harder to predict and tune than native I/O.
SQL provides a powerful way to manipulate data, but you have little control over HOW it does it.
The query optimizer takes responsibility for doing it right.

Performance - diagnosis
Getting information about how the optimizer processed a query is crucial
Can be done via one or all of the following:
STRDBG: debug messages in job log
STRDBMON: optimizer info put in file
QAQQINI: can be used to force messages
CHGQRYA: messages put out when time limit set to 0

Performance tips
Create indexes
Over columns that significantly limit data in WHERE clause
Over columns that join tables together
Over columns used in ORDER BY and GROUP BY clauses
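As a sketch, indexes matching these guidelines for the sample Employee table might look like this. The index names are hypothetical.

```sql
-- Index names are hypothetical.
-- Join column: relates Employee rows to Department rows.
CREATE INDEX EmpDeptIx ON Employee (Dept);

-- Selective WHERE column that also serves ORDER BY LastName, FirstName.
CREATE INDEX EmpNameIx ON Employee (LastName, FirstName);
```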

Performance tips
Create Encoded Vector Indexes (EVIs)
Most useful in heavy query environments with a lot of data (e.g. large data warehouses)
Helps queries that process between 20-60% of a table's data
Create over columns with a modest number of distinct values and those with data skew
EVIs bridge the gap between traditional indexes and table scans
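A minimal sketch of creating an EVI. The Sales table and the index name are hypothetical; RegionCode stands in for a column with only a handful of distinct values.

```sql
-- Hypothetical table and index name: a large table where RegionCode
-- has few distinct values, the case the slide describes.
CREATE ENCODED VECTOR INDEX SalesRegionEvi ON Sales (RegionCode)
```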

Performance tips
Encourage optimizer to use indexes
Use keyed columns in WHERE clause if possible
Use ANDed conditions as much as possible
OPTIMIZE FOR n ROWS
Don't do things that eliminate index use:
  Data conversion (binary-key = 1.5)
  LIKE clause w/leading wildcard (NAME LIKE '%JOE')
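A sketch of index-friendly predicates on the sample Employee table, assuming indexes exist over LastName and Dept and that Dept is an integer column:

```sql
-- Trailing wildcard only, so an index over LastName remains usable.
-- OPTIMIZE FOR tells the optimizer only the first few rows matter.
SELECT FirstName, LastName
FROM Employee
WHERE LastName LIKE 'JOE%'
OPTIMIZE FOR 10 ROWS;

-- Compare the key with a literal of the same type (integer to integer)
-- so no data conversion is needed on the key column.
SELECT FirstName, LastName
FROM Employee
WHERE Dept = 397;
```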

Performance tips
Keep statements simple
Complex statements are much more difficult to optimize
They provide more opportunity for the optimizer to choose a sub-optimal plan of attack

Performance tips
Enable DB2 to use parallelism
Query processed by many tasks (CPU parallelism) or by getting data from many disks at once (I/O parallelism)
CPU parallelism requires IBM's SMP feature and a machine with multiple processors
Enabled via the QQRYDEGREE system value, CHGQRYA, or the QAQQINI file

Other useful features

CASE clause - conditional calculations
ALIAS - access to multi-member files
Primary/Foreign keys - referential integrity
Constraints

CASE
Conditional calculations with CASE
SELECT Warehouse, Description,
  CASE RegionCode
    WHEN 'E' THEN 'East Region'
    WHEN 'S' THEN 'South Region'
    WHEN 'M' THEN 'Midwest Region'
    WHEN 'W' THEN 'West Region'
  END
FROM Locations

CASE
Avoiding calculation errors (e.g. division by 0)
SELECT Warehouse, Description,
  CASE NumInStock
    WHEN 0 THEN NULL
    ELSE CaseUnits/NumInStock
  END
FROM Inventory
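The same guard can be written in two other ways, sketched here side by side: the searched form of CASE, and NULLIF, which turns a zero divisor into NULL so the division yields NULL instead of an error. The result column names are hypothetical.

```sql
-- UnitsRatio and UnitsRatio2 are hypothetical result column names.
SELECT Warehouse, Description,
       -- Searched CASE: the condition is written out in each WHEN.
       CASE WHEN NumInStock = 0 THEN NULL
            ELSE CaseUnits / NumInStock
       END AS UnitsRatio,
       -- NULLIF returns NULL when NumInStock is 0; dividing by NULL gives NULL.
       CaseUnits / NULLIF(NumInStock, 0) AS UnitsRatio2
FROM Inventory
```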

ALIAS names
The CREATE ALIAS statement creates an alias on a table, view, or member of a database file.
CREATE ALIAS alias-name FOR table (member)

Example: Create an alias over the second member of a multi-member physical file
CREATE ALIAS February FOR MonthSales (February)
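Once the alias exists, it can be used wherever a table name is allowed, which gives SQL statements access to a specific member. A short sketch:

```sql
-- Reads only the rows in the member the alias points to.
SELECT * FROM February;

-- Removes the alias itself; the file and its member are left untouched.
DROP ALIAS February;
```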

Referential Integrity
Keeps two or more files in synch with each other
Ensures that children rows have parents
Can also be used to automatically delete children when parents are deleted

Referential Integrity Rules

A row inserted into a child table must have a parent row (typically in another table).
Parent rules:
A parent row can not be deleted if there are dependent children (Restrict rule) OR
All children are also deleted (Cascade rule) OR
All children's foreign keys are changed (Set Null and Set Default rules)

Primary key must be unique
[Diagram: the parent table's primary key is referenced by the child table's foreign key]

Referential Integrity syntax

ALTER TABLE Hr/Employee
  ADD CONSTRAINT EmpPK PRIMARY KEY (EmployeeId)

ALTER TABLE Hr/Department
  ADD CONSTRAINT EmpFK FOREIGN KEY (EmployeeId)
  REFERENCES Hr/Employee (EmployeeId)
  ON DELETE CASCADE
  ON UPDATE RESTRICT
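A sketch of what the Cascade rule means in practice with the constraints above. The key value 1001 is made up, and EmployeeId is assumed to be numeric.

```sql
-- With EmpFK defined ON DELETE CASCADE, deleting the parent row also
-- deletes every Hr/Department row whose EmployeeId is 1001.
-- 1001 is a made-up key value.
DELETE FROM Hr/Employee WHERE EmployeeId = 1001;

-- To stop enforcing the rule later, drop the constraint by name.
ALTER TABLE Hr/Department DROP CONSTRAINT EmpFK;
```

Had the constraint been defined with the Restrict rule instead, the DELETE would fail while dependent rows exist.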

Check Constraints
Rules which limit the allowable values in one or more columns:
CREATE TABLE Employee
  (FirstName CHAR(20),
   LastName CHAR(30),
   Salary DECIMAL(9,2) CHECK (Salary>0 AND Salary<200000))
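A check constraint can also be added to a table that already exists. The sketch below assumes an Employee table with a numeric Salary column; the constraint name SalaryRange is hypothetical.

```sql
-- SalaryRange is a hypothetical constraint name.
ALTER TABLE Employee
  ADD CONSTRAINT SalaryRange CHECK (Salary > 0 AND Salary < 200000);

-- Expected to be rejected by the database, whatever interface issues it:
-- the salary is outside the allowed range.
INSERT INTO Employee (FirstName, LastName, Salary)
VALUES ('John', 'Doe', 250000);
```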

Check Constraints
Effectively does data checking at the database level.
Data checking done with display files or application logic can now be done at the database level.
Ensures that it is always done and closes back doors like DFU, ODBC, and third-party utilities.

Other resources
Database Design and Programming for DB2/400 - book by Paul Conte
SQL for Smarties - book by Joe Celko
SQL Tutorial - www.as400network.com
AS/400 DB2 web site at http://www.as400.ibm.com/db2/db2main.htm
Publications at http://publib.boulder.ibm.com/pubs/html/as400/
Our web site at http://www.centerfieldtechnology.com

Summary
SQL is a powerful way to access and process data
Used effectively, it can reduce the time it takes to build applications
Once tuned, it can perform very close to (and sometimes better than) HLLs alone

Good Luck and Happy SQLing
