SQL Internship for CS Students

Analysis of diabetes

Uploaded by

suvejah18

Name: Neha Upadhyay

Email: [email protected]

Introduction
Hello there! Welcome to our online internship on SQL for Diabetes Prediction. In this
program, we delve into the fascinating realm of data science and healthcare, combining the
power of SQL with the critical task of predicting diabetes. As a BTech student in Computer
Science with a keen interest in software development, you'll find this internship to be a
valuable opportunity to apply your skills and expand your knowledge.
Throughout the internship, we’ll be working with a dataset encompassing various factors
such as gender, age, hypertension, heart disease, smoking history, BMI, HbA1c levels,
blood glucose levels, and the presence or absence of diabetes. This real-world dataset
mirrors the complexity of healthcare data and provides a rich environment for honing your
SQL skills.

Aim: To analyze the given dataset ‘Diabetes_prediction.xlsx’ and perform the following
queries in MySQL.

1. Retrieve the Patient_id and ages of all patients.

Ans : SELECT Patient_id, age FROM dp;

2. Select all female patients who are older than 40.
Ans : SELECT * FROM dp WHERE gender = 'Female' AND age > 40;
3. Calculate the average BMI of patients.
Ans : SELECT AVG(bmi) AS average_bmi FROM dp;
4. List patients in descending order of blood glucose levels.
Ans : SELECT * FROM dp ORDER BY blood_glucose_level DESC;
5. Find patients who have hypertension and diabetes.
Ans : SELECT * FROM dp WHERE hypertension = 1 AND diabetes = 1;
6. Determine the number of patients with heart disease.
Ans : SELECT COUNT(*) AS number_of_patients_with_heart_disease FROM dp WHERE
heart_disease = 1;
7. Group patients by smoking history and count how many smokers and nonsmokers
there are.
Ans :
SELECT smoking_history, COUNT(*) AS number_of_patients
FROM dp
GROUP BY smoking_history;
8. Retrieve the Patient_ids of patients who have a BMI greater than the
average BMI.
Ans :
SELECT Patient_id
FROM dp
WHERE bmi > (SELECT AVG(bmi) FROM dp);
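To see how the scalar subquery in question 8 behaves, here is a minimal sketch that runs the same statement against a tiny in-memory table. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the four rows are made up for illustration rather than taken from the dataset.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table `dp`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dp (Patient_id TEXT PRIMARY KEY, bmi REAL)")
conn.executemany(
    "INSERT INTO dp VALUES (?, ?)",
    [("P1", 20.0), ("P2", 25.0), ("P3", 30.0), ("P4", 35.0)],  # made-up rows
)

# The subquery returns one value, the overall average BMI (27.5 for these rows),
# and the outer query keeps only the rows above it.
above_average = [
    row[0]
    for row in conn.execute(
        "SELECT Patient_id FROM dp "
        "WHERE bmi > (SELECT AVG(bmi) FROM dp) "
        "ORDER BY Patient_id"
    )
]
print(above_average)
```

Only the patients whose BMI exceeds the average of the whole table are returned, which is what the question asks for.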
9. Find the patient with the highest HbA1c level and the patient with the lowest
HbA1c level.
Ans :
• Patient with the highest HbA1c level
SELECT *
FROM dp
WHERE HbA1_level = (SELECT MAX(HbA1_level) FROM dp);
• Patient with the lowest HbA1c level
SELECT *
FROM dp
WHERE HbA1_level = (SELECT MIN(HbA1_level) FROM dp);
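10. Calculate the age of patients in years (assuming the current date as
of now).
Ans :
-- The age column already stores each patient's age in years, so it is
-- returned directly. Passing a numeric age to STR_TO_DATE yields NULL,
-- because the value is not a date string.
SELECT Patient_id, age AS calculated_age FROM dp;
-- If the table had a date-of-birth column (hypothetical name date_of_birth;
-- not part of this dataset), the age in years could be computed as:
-- SELECT Patient_id, TIMESTAMPDIFF(YEAR, date_of_birth, CURDATE()) AS calculated_age FROM dp;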

11. Rank patients by blood glucose level within each gender group

Ans :
SELECT Patient_id, gender, blood_glucose_level,
RANK() OVER (PARTITION BY gender ORDER BY blood_glucose_level DESC) AS
glucose_level_rank
FROM dp;
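The ranking query in question 11 can be tried on a handful of rows to see how ties are handled. The sketch below again uses Python's sqlite3 module in place of MySQL (window functions need SQLite 3.25 or later), and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dp (Patient_id TEXT, gender TEXT, blood_glucose_level INTEGER)"
)
conn.executemany(
    "INSERT INTO dp VALUES (?, ?, ?)",
    [  # made-up rows; P1 and P2 tie on purpose
        ("P1", "Female", 200),
        ("P2", "Female", 200),
        ("P3", "Female", 140),
        ("P4", "Male", 160),
        ("P5", "Male", 130),
    ],
)

# Same window function as in the answer, with an ORDER BY added so the
# output order is deterministic.
ranked = conn.execute(
    """
    SELECT Patient_id, gender, blood_glucose_level,
           RANK() OVER (PARTITION BY gender
                        ORDER BY blood_glucose_level DESC) AS glucose_level_rank
    FROM dp
    ORDER BY gender, glucose_level_rank, Patient_id
    """
).fetchall()

for row in ranked:
    print(row)
```

The ranking restarts at 1 for each gender, and RANK() gives tied rows the same rank and then skips the next one, so the two tied patients are both ranked 1 and the next patient is ranked 3. DENSE_RANK() would be the alternative if gaps are not wanted.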
12. Update the smoking history of patients who are older than 50 to
"Ex-smoker."
Ans :
UPDATE dp
SET smoking_history = 'Ex-smoker'
WHERE age > 50;
13. Insert a new patient into the database with sample data.
Ans :
INSERT INTO dp
(EmployeeName, Patient_id, gender, age, hypertension, heart_disease,
smoking_history, bmi, HbA1_level, blood_glucose_level, diabetes)
VALUES
('John Doe', 'P123456', 'Male', 35, 0, 0, 'non-smoker', 25.5, 5.7,
120, 0);
14. Delete all patients with heart disease from the database.
Ans :
DELETE FROM dp
WHERE heart_disease = 1;
15. Find patients who have hypertension but not diabetes using the EXCEPT
operator.
Ans : (EXCEPT is available in MySQL 8.0.31 and later.)
SELECT Patient_id

FROM dp
WHERE hypertension = 1
EXCEPT
SELECT Patient_id
FROM dp
WHERE diabetes=1;
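As a quick check of the EXCEPT query in question 15, the sketch below runs it next to a plain WHERE filter on a few invented rows, using Python's sqlite3 module as a stand-in for MySQL. The two forms agree here because Patient_id is unique and the diabetes column has no NULLs; that is an assumption of this example, not something the dataset guarantees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dp ("
    "Patient_id TEXT PRIMARY KEY, "
    "hypertension INTEGER NOT NULL, "
    "diabetes INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO dp VALUES (?, ?, ?)",
    [("P1", 1, 0), ("P2", 1, 1), ("P3", 0, 1), ("P4", 1, 0), ("P5", 0, 0)],  # made-up rows
)

# Set difference: hypertensive patients minus diabetic patients.
via_except = conn.execute(
    "SELECT Patient_id FROM dp WHERE hypertension = 1 "
    "EXCEPT "
    "SELECT Patient_id FROM dp WHERE diabetes = 1 "
    "ORDER BY Patient_id"
).fetchall()

# Equivalent single-pass filter under the assumptions stated above.
via_filter = conn.execute(
    "SELECT Patient_id FROM dp "
    "WHERE hypertension = 1 AND diabetes = 0 "
    "ORDER BY Patient_id"
).fetchall()

print(via_except, via_filter)
```

On a server that does not support EXCEPT, the filter form is a simple fallback for this particular question.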
16. Define a unique constraint on the "patient_id" column to ensure its values
are unique.
Ans :
ALTER TABLE dp ADD UNIQUE (patient_id);

17. Create a view that displays the Patient_ids, ages, and BMI of patients.
Ans :
CREATE VIEW patient_info AS
SELECT Patient_id, age, bmi
FROM dp;
18. Suggest improvements in the database schema to reduce data redundancy and
improve data integrity.

Ans : To reduce data redundancy and improve data integrity in your database schema, you
can consider the following suggestions:

• Normalization: Ensure your database follows normalization rules, particularly up to at
least the third normal form (3NF). This minimizes redundancy by organizing data
efficiently.
• Use of Primary Keys: Ensure each table has a primary key to uniquely identify each
record. This helps in avoiding duplicate entries.
• Foreign Keys: Use foreign keys to establish relationships between tables. This
maintains referential integrity and prevents inconsistencies.
• Data Types and Constraints: Choose appropriate data types for columns to minimize
storage space.
• Composite Keys: In cases where a combination of columns can uniquely identify a
record, consider using a composite key instead of a single column as the primary key.
• Avoid Storing Derived Data: Avoid storing data that can be derived from other
columns. This reduces redundancy and ensures data consistency.
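The suggestions above can be illustrated with a small schema sketch. It uses Python's sqlite3 module rather than MySQL, and the table and column names (patient, smoking_status, status_id, label) are hypothetical choices for this example, not part of the original dataset. The idea is that a primary key, a foreign key to a lookup table, and CHECK constraints let the database itself reject bad rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical normalized layout: smoking categories live in a lookup table,
# and each patient row refers to one of them by id.
conn.executescript(
    """
    CREATE TABLE smoking_status (
        status_id INTEGER PRIMARY KEY,
        label     TEXT NOT NULL UNIQUE
    );
    CREATE TABLE patient (
        patient_id   TEXT PRIMARY KEY,
        gender       TEXT NOT NULL,
        age          INTEGER NOT NULL CHECK (age >= 0),
        hypertension INTEGER NOT NULL CHECK (hypertension IN (0, 1)),
        status_id    INTEGER NOT NULL REFERENCES smoking_status(status_id)
    );
    INSERT INTO smoking_status (status_id, label) VALUES (1, 'never'), (2, 'former');
    INSERT INTO patient VALUES ('P1', 'Female', 45, 1, 1);
    """
)


def rejected(sql):
    """Return True if the statement is refused by a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_id = rejected("INSERT INTO patient VALUES ('P1', 'Male', 30, 0, 1)")     # primary key
unknown_status = rejected("INSERT INTO patient VALUES ('P2', 'Male', 30, 0, 99)")  # foreign key
bad_flag = rejected("INSERT INTO patient VALUES ('P3', 'Male', 30, 'no', 1)")      # CHECK
patient_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
print(duplicate_id, unknown_status, bad_flag, patient_count)
```

All three invalid inserts are refused, so only the one valid patient row remains in the table.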

19. Explain how you can optimize the performance of SQL queries on this dataset.
Ans :
Optimizing the performance of SQL queries involves various strategies, and the specific
approach depends on the nature of your dataset, the complexity of your queries, and the
underlying database management system. Here are some general tips to optimize SQL
queries on your dataset:

Use Indexing:
Identify columns frequently used in WHERE clauses and JOIN conditions and create indexes
on those columns.
Be cautious not to over-index, as it can impact write performance.

Optimize JOIN Operations:
Use INNER JOIN instead of OUTER JOIN when unmatched rows are not needed, as INNER JOIN is
generally more efficient. Ensure that columns used in JOIN conditions are indexed.

Limit the Result Set:
Only retrieve the columns you need. Avoid using SELECT * if you don't need all
columns. Use the LIMIT clause to restrict the number of rows returned, especially
for large datasets.

Avoid SELECT DISTINCT:
Use SELECT DISTINCT sparingly, as it can be resource-intensive. Consider whether it's
necessary or if alternative approaches can achieve the same result.
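The indexing and result-limiting tips can be checked by looking at the query plan before and after an index is created. The sketch below does this with Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN; in MySQL the corresponding tool is EXPLAIN, whose output looks different. The index name idx_dp_age and the generated rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dp (Patient_id TEXT, age INTEGER, bmi REAL)")
conn.executemany(
    "INSERT INTO dp VALUES (?, ?, ?)",
    [(f"P{i}", i % 90, 20.0 + i % 15) for i in range(1000)],  # synthetic rows
)

# Select only the needed columns and cap the number of rows returned.
query = "SELECT Patient_id, bmi FROM dp WHERE age = 42 LIMIT 10"


def plan(sql):
    """Return the planner's description of how it will run the statement."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


before = plan(query)  # no index yet, so the table is scanned
conn.execute("CREATE INDEX idx_dp_age ON dp (age)")
after = plan(query)   # the planner can now look rows up through the index

print("before:", before)
print("after: ", after)
```

Comparing the two plans is a cheap way to confirm that an index is actually used by the query it was created for, before paying its cost on every write.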
