Final Report
DISEASE PREDICTION USING MACHINE LEARNING
A Mini Project Report
Submitted by
of
COMPUTER SCIENCE AND ENGINEERING
Under the Guidance of
December, 2018
DECLARATION
Place: KALYANI
Date:
CERTIFICATE
This is to certify that RAHUL MAZUMDER (380116010069), RAJU KUMAR
SHARMA (380116010070), SAKSHI (380116020081), SAURAV
CHOWDHURY (380116010086) and KUMAR SOURABH ANAND
(380117114001) have completed their project entitled DISEASE PREDICTION
USING MACHINE LEARNING under the guidance of Dr. Dharmpal Singh,
in partial fulfillment of the requirements for the award of the Bachelor of
Technology in Computer Science and Engineering from JIS College of
Engineering (An Autonomous Institute). The project is an authentic record of
their own work carried out during the academic year 2018-19 and, to the best of
our knowledge, this work has not been submitted elsewhere as part of the process
of obtaining a degree, diploma, fellowship or any other similar title.
--------------------------------- -------------------------------
_________________________________
Place:
Date:
ABSTRACT
Date:
RAHUL MAZUMDER
B.TECH in Computer Science and Engineering
3rd YEAR/6th SEMESTER
Univ Roll: 380116010069
SAKSHI
B.TECH in Computer Science and Engineering
3rd YEAR/6th SEMESTER
Univ Roll: 380116020081
SAURAV CHOWDHURY
B.TECH in Computer Science and Engineering
3rd YEAR/6th SEMESTER
Univ Roll: 380116010086
Title Page
Declaration of the Student
Certificate of the Guide
Abstract
Acknowledgement
1. INTRODUCTION
1.1 Problem Definition
1.2 Project Overview/Specifications
1.3 Hardware Specification
1.4 Software Specification
2. LITERATURE SURVEY
2.1 Existing System
2.2 Proposed System
2.3 Feasibility Study
4. RESULTS / OUTPUTS
5. CONCLUSIONS & FUTURE SCOPE
6. REFERENCES
1. INTRODUCTION
Logistic Regression
Random Forest
Furthermore, some steps were taken to optimize the algorithms and
thereby improve their accuracy. These steps include cleaning the dataset
and pre-processing the data. The algorithms were judged on their
accuracy, and Logistic Regression was observed to be the most
accurate of the three, with an accuracy of 83.11%. Hence, it was selected for
the main application. The main application is a web application that
accepts various parameters from the user as input and computes the
result. The result is displayed along with the accuracy of the prediction.
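The accuracy-based comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the data is a synthetic stand-in generated with scikit-learn, the hyperparameters are arbitrary, and only the two algorithms named in this section are included, so the scores it prints are not the results reported in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data stands in for the cleaned project dataset.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate models; scaling is applied before Logistic Regression as a
# simple pre-processing step.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out test split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Select the model with the highest test accuracy.
best_model = max(accuracies, key=accuracies.get)
for name, acc in accuracies.items():
    print(f"{name}: {acc:.2%}")
print("Selected:", best_model)
```

The model chosen this way would then be the one loaded by the web application to compute predictions from user input.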
1.3 Hardware Specification
1.4 Software Specification
1.4.1 CODELAB
1.4.2 JUPYTER
1.4.3 PANDAS LIBRARY
1.4.4 NUMPY LIBRARY
1.4.5 SKLEARN LIBRARY
1.4.6 FLASK
1.4.7 HTML
1.4.8 CSS
1.4.9 BOOTSTRAP
2. LITERATURE SURVEY
2.1 Existing System
The existing system has various problems, which are
mentioned below:
The system uses only one dataset for validation, which is not sufficient to produce reliable outcomes.
The system reports only the overall predictive performance of its models, without considering the F-score and precision as measures.
Most studies do not provide statistical test results to demonstrate the level of significance of their experimental results.
Most studies related to ensemble classifiers do not compare the performance of the individual classifiers with that of the ensemble classifier built from them.
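The precision and F-score measures mentioned above can be computed directly from true and predicted labels. The following is a minimal sketch; the function name `precision_recall_f1` and the sample labels are hypothetical and chosen only for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting these measures alongside accuracy shows how a model handles false positives and false negatives, which accuracy alone can hide on imbalanced data.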
2.2 Proposed System
Fig.3: Dataset
…
Best parameters
Fig.17: Final accuracy by Logistic Regression
Result:
Because the new features chosen according to feature importance led to
overfitting, we use the features selected by correlation, which give
better predictions.
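Correlation-based feature selection of the kind referred to above can be sketched as follows. This is an assumption-laden illustration, not the project's actual procedure: the feature names, values, and the threshold of 0.5 are hypothetical, and Pearson correlation with the target is assumed as the selection criterion.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def select_by_correlation(columns, target, threshold):
    """Keep features whose |correlation| with the target meets the threshold,
    ordered from strongest to weakest."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


# Hypothetical feature columns and binary target.
columns = {
    "feature_a": [1, 2, 3, 4, 5, 6],
    "feature_b": [2, 1, 2, 1, 2, 1],
    "feature_c": [9, 8, 7, 3, 2, 1],
}
target = [0, 0, 0, 1, 1, 1]

selected = select_by_correlation(columns, target, threshold=0.5)
print(selected)
```

Features that fall below the threshold are dropped before training, which reduces the number of inputs the model can overfit to.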
FUTURE SCOPE
The proposed work can be further developed to automate the
prediction of other diseases more accurately.
6. REFERENCES