Diabetes_prediction_India

Diabetes prediction using machine learning. Dataset used: the PIMA Indians Diabetes dataset from Kaggle. There are an estimated 72.96 million cases of diabetes in the adult population of India. Among people aged 20 years and above, prevalence ranges from 10.9% to 14.2% in urban areas and from 3.0% to 7.8% in rural India, with a much higher prevalence among individuals aged over 50 years (INDIAB Study).

Early prediction is therefore paramount in preventing the disease.

The dataset consists of several medical predictor (independent) variables and one target (dependent) variable, Outcome.

The independent variables are the number of pregnancies the patient has had, their BMI, insulin level, age, skin thickness, glucose level, blood pressure, and diabetes pedigree.

The correlation between the independent variables is computed, and the predictor columns are checked for zero values that stand in for missing measurements.
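The correlation and zero-value checks described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the small data frame is made up, the column names are assumed to follow the Kaggle CSV, and the choice of which columns treat zero as missing is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset; the values are made up for illustration.
df = pd.DataFrame({
    "Glucose": [148, 85, 0, 89, 137, 116],
    "BloodPressure": [72, 66, 64, 0, 40, 74],
    "BMI": [33.6, 26.6, 23.3, 28.1, 0.0, 25.6],
    "Age": [50, 31, 32, 21, 33, 30],
    "Outcome": [1, 0, 1, 0, 1, 0],
})

# Pairwise Pearson correlation between the predictor columns.
predictors = df.drop(columns="Outcome")
corr = predictors.corr()

# Assumption: a zero in these columns is physiologically implausible,
# so it is treated as a missing reading rather than a real measurement.
zero_as_missing = ["Glucose", "BloodPressure", "BMI"]
zero_counts = (df[zero_as_missing] == 0).sum()

# Recode those zeros as NaN so a later imputation step can recognise them.
df[zero_as_missing] = df[zero_as_missing].mask(df[zero_as_missing] == 0)

print(corr.round(2))
print(zero_counts)
```

Recoding zeros as NaN before imputation matters because an imputer only fills values it recognises as missing.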

SimpleImputer is used to fill the missing values, and the algorithm used is RandomForestClassifier, which achieves an accuracy of 76%. Hyperparameter tuning is also done using RandomizedSearchCV.
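A minimal sketch of how imputation, the random forest, and the randomized search might be combined is shown below. It runs on synthetic data from `make_classification`, so it will not reproduce the 76% figure. The median imputation strategy, the search grid, and the train/test split are all assumptions chosen for illustration, not the settings used in the notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic data with 8 predictors, standing in for the real dataset.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5, random_state=0)

# Knock out about 5% of the entries to simulate missing measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Imputation sits inside the pipeline so it is refit on each training fold.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # strategy is an assumption
    ("forest", RandomForestClassifier(random_state=0)),
])

# Hypothetical search space; the notebook's actual grid is not stated.
param_distributions = {
    "forest__n_estimators": [50, 100, 200],
    "forest__max_depth": [None, 4, 8],
    "forest__min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    pipeline, param_distributions, n_iter=5, cv=3, scoring="accuracy", random_state=0
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Keeping the imputer inside the pipeline means the fill values are computed from training folds only, which avoids leaking information from the validation folds into the search.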

An XGBClassifier() is also used: cross-validation is run, a score is obtained for each fold, and the mean of the scores is computed.
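The cross-validation step can be sketched as below, again on synthetic data. The fold count, scoring metric, and `n_estimators` value are assumptions. If the `xgboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier`, which is a different implementation swapped in only so that the example still runs.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

try:
    from xgboost import XGBClassifier
    model = XGBClassifier(n_estimators=50, random_state=0)
except ImportError:
    # Fallback for environments without xgboost; not the classifier named above.
    from sklearn.ensemble import GradientBoostingClassifier
    model = GradientBoostingClassifier(n_estimators=50, random_state=0)

# Synthetic data with 8 predictors, standing in for the real dataset.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5, random_state=0)

# One accuracy score per fold (5 folds assumed), then the mean across folds.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
mean_score = scores.mean()

print(scores)
print(f"mean accuracy: {mean_score:.2f}")
```

Reporting the mean across folds gives a more stable estimate than a single train/test split, which is useful on a dataset of this size.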

solanki1993/Diabetes_prediction_India

Folders and files

Latest commit

History

Repository files navigation

Diabetes_prediction_India

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages