Project Instructions: SQL Analysis on a Schema
Objective
The objective of this project is to perform SQL queries on a schema, demonstrating your
understanding of SQL concepts and data analysis techniques. You will write and execute
queries to extract meaningful insights from the data and present your findings.
Steps and Guidelines
1. Set Up Your Environment
o Ensure you have access to a SQL database (MySQL).
o Download and set up a schema.
2. Project Setup
o Create a new directory for your project.
o Use a SQL script for your analysis.
o Create a README file that briefly describes the dataset and the steps you
plan to take in your analysis.
3. Data Import and Schema Overview
o Import the data into your SQL database.
o Provide an overview of the schema with ER diagram.
o Use DESCRIBE or SHOW COLUMNS commands to understand the structure of
each table.
4. Data Analysis Using SQL
o Descriptive Statistics:
Write queries to calculate basic statistics such as count, sum, average,
minimum, and maximum for numerical columns.
Example: Calculate the total sales, average order value, and total
number of transactions.
o Data Cleaning:
Write queries to identify and handle missing or inconsistent data.
Example: Find rows with NULL values in important columns and
decide how to handle them (e.g., filling, removing).
o Aggregation and Grouping:
Use GROUP BY to aggregate data by different dimensions.
Example: Calculate total sales by product category, region, or time
period.
o Joins and Relationships:
Write queries to join tables to extract meaningful insights.
Example: Join the sales with the product table to get product names
and categories.
o Subqueries and CTEs:
Use subqueries and Common Table Expressions (CTEs) to write
complex queries.
Example: Use a CTE to calculate the monthly sales and then find the
month with the highest sales.
o Advanced SQL Functions:
Use window functions to perform calculations across a set of table
rows related to the current row.
Example: Calculate running totals or moving averages.
5. Insights and Conclusions
o Summarize your key findings from the SQL analysis.
o Discuss any patterns, anomalies, or interesting relationships you discovered.
o Highlight any potential areas for further analysis or questions that emerged
from your analysis.
6. Documentation and Presentation
o Ensure your SQL script is well-documented. Include comments to explain
each query, the rationale behind your choices, and your findings.
o Prepare a brief presentation (5-10 slides) summarizing your project. Include
key findings, interesting queries, and potential next steps.
o Submit your SQL script, the README file, and the presentation slides.
Submission Deadline
Please submit your project by [Insert Deadline Here].
If you have any questions or need further assistance, feel free to reach out during office hours
or via email.
Evaluation Rubric for SQL Analysis Project
Total Marks: 20
Criteria Description Marks Scoring Details
1. Environment and Project Setup (2 Marks)
Ensure access to a SQL database 1: Fully set up, 0.5: Partially set up,
Environment Setup 1
(MySQL) and setup of the schema. 0: Not set up
Project Directory Creation of a new directory and 1: Fully created, 0.5: Partially
1
and Script use of a SQL script for analysis. created, 0: Not created
2. Data Import and Schema Overview (2 Marks)
Overview of the schema with ER 1: Comprehensive, 0.5: Partial, 0:
Schema Overview 1
diagram and table descriptions. Missing or unclear
Use of DESCRIBE or SHOW 1: Thoroughly understood, 0.5:
Table Structure COLUMNS to understand table 1 Partially understood, 0: Not
structure. understood
3. Data Analysis Using SQL (8 Marks)
Descriptive Queries to calculate basic statistics 2: Comprehensive, 1: Partial, 0:
2
Statistics (count, sum, average, min, max). Missing or incorrect
Queries to identify and handle 2: Effectively handled, 1: Partially
Data Cleaning 2
missing or inconsistent data. handled, 0: Not handled
Criteria Description Marks Scoring Details
Aggregation and Use of GROUP BY to aggregate 2: Effectively used, 1: Partially used,
2
Grouping data by different dimensions. 0: Not used
Joins and Queries to join tables to extract 2: Comprehensive, 1: Partial, 0:
2
Relationships meaningful insights. Missing or incorrect
4. Subqueries and CTEs (2 Marks)
Use of subqueries and Common
Subqueries and 2: Effectively used, 1: Partially used,
Table Expressions (CTEs) for 2
CTEs 0: Not used
complex queries.
5. Advanced SQL Functions (2 Marks)
Use of window functions to
Advanced SQL 2: Effectively used, 1: Partially used,
perform calculations across a set 2
Functions 0: Not used
of table rows.
6. Insights and Conclusions (2 Marks)
Summary of key findings, patterns, 1: Highly significant and relevant,
Insights anomalies, or interesting 1 0.5: Somewhat significant, 0: Not
relationships. significant
1: Clear and well-supported, 0.5:
Clear and well-supported
Conclusions 1 Somewhat clear, 0: Unclear or
conclusions based on analysis.
unsupported
7. Documentation and Presentation (2 Marks)
Well-documented SQL script with 1: Clear and thorough, 0.5: Partially
Documentation 1
comments explaining each query. clear, 0: Unclear or missing
Brief presentation (5-10 slides) 1: Clear and well-organized, 0.5:
Presentation summarizing the project with key 1 Partially clear, 0: Unclear or
findings. disorganized
Summary of Marks:
Environment and Project Setup: 2 marks
Data Import and Schema Overview: 2 marks
Data Analysis Using SQL: 8 marks
Subqueries and CTEs: 2 marks
Advanced SQL Functions: 2 marks
Insights and Conclusions: 2 marks
Documentation and Presentation: 2 marks