Advanced SQL Programming in AS400 Iseries
The document suggests several strategies to improve SQL performance. Create indexes over columns that limit the data selected in WHERE clauses or that are used in JOIN, ORDER BY, and GROUP BY operations. Encoded Vector Indexes (EVIs) are recommended for large data sets with a moderate number of distinct values. The document also advises encouraging the optimizer to use indexes by referencing keyed columns in WHERE clauses, and keeping statements simple so that the optimizer does not settle on a sub-optimal query plan. Finally, it stresses enabling parallel processing through DB2's parallelism features so that queries are handled efficiently.
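The effect of an index on a WHERE-clause column can be observed directly in a query plan. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for DB2 for i; the `orders` table, the `orders_customer_ix` index, and the `plan` helper are hypothetical names chosen for illustration, and SQLite's EXPLAIN QUERY PLAN output is not what the iSeries optimizer reports.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; table and index names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT order_id, amount FROM orders WHERE customer_id = ?"


def plan(sql, params=()):
    """Return the plan steps SQLite reports for a statement, as text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


# Without an index on customer_id the table is scanned.
plan_before = plan(query, (42,))

# Index the column that limits the data in the WHERE clause.
conn.execute("CREATE INDEX orders_customer_ix ON orders (customer_id)")
plan_after = plan(query, (42,))

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, so the sketch prints the plans rather than assuming their text. On DB2 for i the equivalent information would come from the diagnostic tools the document describes.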
Referential integrity ensures that relationships between tables are maintained, for example that every child row has a parent row. Constraints enforce rules such as the uniqueness of a primary key, while foreign keys maintain relationships by linking a child table to its parent. These constraints can also manage deletions or updates of related data automatically. For example, the 'ON DELETE CASCADE' rule deletes all child rows when their parent row is removed, which keeps data accurate and consistent across the database.
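A cascading delete can be sketched as follows. This is an illustration only, assuming SQLite through Python's sqlite3 module rather than DB2 for i; the `department` and `employee` tables are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, which is a SQLite detail and not something the source document describes.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; table names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on FK enforcement
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL
            REFERENCES department (dept_id) ON DELETE CASCADE
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Support');
INSERT INTO employee VALUES (10, 'Ana', 1), (11, 'Ben', 1), (12, 'Cy', 2);
""")

# A child row without a parent row is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (13, 'Dee', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent row removes its child rows as well.
conn.execute("DELETE FROM department WHERE dept_id = 1")
remaining = [row[0] for row in conn.execute("SELECT name FROM employee ORDER BY emp_id")]

print(orphan_rejected, remaining)
```

Only the employee in the surviving department is left after the delete, and the orphan insert never takes effect.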
The document specifies that primary keys must be unique within their table, serving as unique identifiers for rows. Foreign keys link child tables to parent tables, maintaining referential integrity by ensuring child rows correspond to parent rows. Restrictions include parent-based rules, such as a parent row not being deletable if dependent children exist, unless 'CASCADE' or 'SET NULL/DEFAULT' rules are specified. Foreign keys ensure consistency and validity of data relationships across tables.
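The delete rules described above can be compared side by side. The sketch below again assumes SQLite via Python's sqlite3 module in place of DB2 for i, with hypothetical `customer`, `invoice`, and `note` tables: one child table uses a restricting rule and the other uses SET NULL.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; table names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on FK enforcement
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE invoice (
    inv_id  INTEGER PRIMARY KEY,
    cust_id INTEGER NOT NULL REFERENCES customer (cust_id) ON DELETE RESTRICT
);
CREATE TABLE note (
    note_id INTEGER PRIMARY KEY,
    cust_id INTEGER REFERENCES customer (cust_id) ON DELETE SET NULL
);
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Birch');
INSERT INTO invoice VALUES (100, 1);
INSERT INTO note VALUES (500, 2);
""")

# Primary key values must be unique within the table.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Customer 1 still has an invoice, so the restricting rule blocks the delete.
try:
    conn.execute("DELETE FROM customer WHERE cust_id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Customer 2 is referenced only by a note, whose foreign key is set to NULL.
conn.execute("DELETE FROM customer WHERE cust_id = 2")
note_cust = conn.execute("SELECT cust_id FROM note WHERE note_id = 500").fetchone()[0]

print(duplicate_rejected, delete_blocked, note_cust)
```

Choosing between a restricting rule, CASCADE, and SET NULL is a data-modeling decision: the first protects dependent rows, the second removes them, and the third keeps them but detaches them from the parent.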
SQL Views simplify data handling by allowing users to encapsulate SQL logic for repeated use, making complex queries easier to manage and understand. They provide consistency by incorporating business rules and allow database structures to be more comprehensible to users. Views can also help departments like human resources perform specific queries efficiently, such as identifying 'new' employees hired within the past two years.
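The 'new employees' rule mentioned above could be encapsulated in a view along these lines. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 module rather than DB2 for i, the `employee` table and `new_employee` view are hypothetical, and the date arithmetic uses SQLite's `date('now', '-2 years')`, which would be written differently in DB2 SQL.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: SQLite stands in for DB2 for i; names and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, hire_date TEXT)"
)

today = date.today()
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "Ana", (today - timedelta(days=100)).isoformat()),   # hired recently
        (2, "Ben", (today - timedelta(days=1100)).isoformat()),  # hired about 3 years ago
        (3, "Cy", (today - timedelta(days=400)).isoformat()),    # hired about 1 year ago
    ],
)

# The business rule "hired within the past two years" lives in one place.
conn.execute("""
CREATE VIEW new_employee AS
SELECT emp_id, name, hire_date
FROM employee
WHERE hire_date >= date('now', '-2 years')
""")

# Users query the view without repeating the rule.
new_names = [row[0] for row in conn.execute("SELECT name FROM new_employee ORDER BY name")]
print(new_names)
```

If the definition of 'new' changes, only the view needs to be redefined; queries written against it stay the same.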
Subqueries let a single SQL statement select only the data it needs, without running separate statements. The document distinguishes between correlated and non-correlated subqueries. A correlated subquery refers to columns of the outer query and is conceptually evaluated once for each outer row, while a non-correlated subquery does not refer to the outer query and is evaluated once. Subqueries are useful for selections such as listing employees who earn more than the average salary. Optimizing subqueries can be essential for performance; one technique mentioned is to avoid ALL and use MAX in the subquery instead.
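The two kinds of subquery, and the MAX form, can be illustrated with a small data set. The sketch assumes SQLite via Python's sqlite3 module rather than DB2 for i, and the `employee` table and its rows are made up for the example. SQLite does not support the ALL quantifier, so only the MAX variant is shown.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; the data is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ana", "IT", 90000), ("Ben", "IT", 70000), ("Cy", "HR", 50000), ("Dee", "HR", 40000)],
)


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# Non-correlated: the inner query does not reference the outer query.
above_company_avg = names("""
    SELECT name FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY name
""")

# Correlated: the inner query references e.dept from the outer row.
above_dept_avg = names("""
    SELECT e.name FROM employee e
    WHERE e.salary > (SELECT AVG(d.salary) FROM employee d WHERE d.dept = e.dept)
    ORDER BY e.name
""")

# MAX in the subquery: employees paid more than every HR employee.
above_all_hr = names("""
    SELECT name FROM employee
    WHERE salary > (SELECT MAX(salary) FROM employee WHERE dept = 'HR')
    ORDER BY name
""")

print(above_company_avg, above_dept_avg, above_all_hr)
```

Note that a MAX comparison and an ALL comparison can differ when the subquery returns no rows or contains NULLs, so the substitution should be checked against the data involved.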
The document provides several tips for optimizing complex SQL queries, such as keeping statements simple to allow better optimization opportunities. It recommends using indexes strategically on columns frequently used in WHERE clauses or involved in joins. Avoiding full table scans by indexing can significantly enhance performance. Additionally, enabling parallelism in DB2 can distribute the query processing load, improving speed and efficiency. These measures help ensure that SQL queries are processed optimally by the SQL optimizer.
The document describes four types of joins: Inner Join, Left Outer Join, Exception Join, and Cross Join. Inner Join finds related data between tables by returning only the rows that have matching values in both. Left Outer Join returns all rows from the left table together with the matching rows from the right table; where a left-table row has no match, the right table's columns are returned as NULL. Exception Join returns the rows from the left table that have no corresponding rows in the right table, which identifies 'orphaned' rows. Cross Join returns every possible combination of rows from both tables, producing a Cartesian product.
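The four joins can be compared on a small example. This sketch assumes SQLite via Python's sqlite3 module as a stand-in for DB2 for i, with hypothetical `customer` and `orders` tables. SQLite has no EXCEPTION JOIN keyword, so that join is emulated here with a left outer join filtered on a NULL right-hand key; on DB2 for i the dedicated syntax would be used instead.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; tables and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER, name TEXT);
CREATE TABLE orders (order_id INTEGER, cust_id INTEGER);
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Birch'), (3, 'Cedar');
INSERT INTO orders VALUES (100, 1), (101, 1), (102, 2);
""")


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Inner join: only customers that have orders.
inner = run("""
    SELECT c.name, o.order_id
    FROM customer c INNER JOIN orders o ON o.cust_id = c.cust_id
    ORDER BY o.order_id
""")

# Left outer join: every customer; order columns are NULL when there is no match.
left_outer = run("""
    SELECT c.name, o.order_id
    FROM customer c LEFT OUTER JOIN orders o ON o.cust_id = c.cust_id
    ORDER BY c.cust_id, o.order_id
""")

# Exception join (emulated): customers with no orders at all.
exception = run("""
    SELECT c.name
    FROM customer c LEFT OUTER JOIN orders o ON o.cust_id = c.cust_id
    WHERE o.cust_id IS NULL
""")

# Cross join: every customer paired with every order.
cross = run("SELECT c.name, o.order_id FROM customer c CROSS JOIN orders o")

print(inner, left_outer, exception, len(cross), sep="\n")
```

With three customers and three orders, the cross join yields nine rows, which shows how quickly a Cartesian product grows.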
The document mentions several performance diagnosis tools: STRDBG, which writes debug messages to the job log; STRDBMON, which writes optimizer information to a file; QAQQINI, which can be used to force messages; and CHGQRYA, which produces messages when the query time limit is set to zero. These tools help diagnose and optimize SQL queries by showing how the optimizer processes them, so developers can base performance adjustments on data.
SQL CASE statements improve data manipulation by allowing conditional logic within SQL queries. The document provides examples where CASE is used to perform different calculations based on certain conditions, such as region-based categorization or preventing division-by-zero errors in calculations. This conditional logic simplifies data handling and reduces errors in query processing.
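Both uses of CASE mentioned above, region-based categorization and a guard against division by zero, can be sketched in one query. The example assumes SQLite via Python's sqlite3 module rather than DB2 for i; the `sales` table, the state-to-region mapping, and the choice to report 0 for a zero divisor are all illustrative assumptions.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; data and region mapping are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (rep TEXT, state TEXT, revenue REAL, units INTEGER);
INSERT INTO sales VALUES
    ('Ana', 'NY', 1000.0, 4),
    ('Ben', 'CA', 500.0, 0),
    ('Cy',  'TX', 300.0, 3);
""")

rows = conn.execute("""
    SELECT rep,
           -- Categorize each row by region.
           CASE WHEN state IN ('NY', 'NJ') THEN 'East'
                WHEN state IN ('CA', 'WA') THEN 'West'
                ELSE 'Other'
           END AS region,
           -- Only divide when the divisor is non-zero.
           CASE WHEN units = 0 THEN 0
                ELSE revenue / units
           END AS avg_price
    FROM sales
    ORDER BY rep
""").fetchall()

for row in rows:
    print(row)
```

SQLite itself returns NULL for a division by zero, so the guard matters most on engines that raise an error in that case; the CASE expression makes the intended result explicit either way.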
The UNION operation in SQL combines the results of two or more SELECT statements into a single result set that contains only distinct rows. UNION ALL also combines the results but does not remove duplicates, so every row is preserved. Each SELECT within a UNION must return the same number of columns, and corresponding columns must have compatible data types. UNION ALL is the better choice when duplicates are acceptable, and it can perform better because no duplicate removal is needed.
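The difference between the two operators shows up as soon as both inputs contain the same row. The sketch below assumes SQLite via Python's sqlite3 module rather than DB2 for i, and the `current_customer` and `archived_customer` tables are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for DB2 for i; tables and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE current_customer (name TEXT, city TEXT);
CREATE TABLE archived_customer (name TEXT, city TEXT);
INSERT INTO current_customer VALUES ('Acme', 'Austin'), ('Birch', 'Boston');
INSERT INTO archived_customer VALUES ('Birch', 'Boston'), ('Cedar', 'Chicago');
""")

# UNION removes the duplicate ('Birch', 'Boston') row.
union_rows = conn.execute("""
    SELECT name, city FROM current_customer
    UNION
    SELECT name, city FROM archived_customer
    ORDER BY name
""").fetchall()

# UNION ALL keeps every row from both SELECT statements.
union_all_rows = conn.execute("""
    SELECT name, city FROM current_customer
    UNION ALL
    SELECT name, city FROM archived_customer
    ORDER BY name
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

Both SELECT statements return two text columns, which satisfies the column-count and compatible-type requirement described above.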