- The source of the dataset is the following repo
- The task is binary classification: predict whether water is potable from its water-quality feature measurements
- The dataset contains about 3.2K samples
- This repo contains a water potability Machine Learning FastAPI application deployment
- For MLOps (experiment tracking and model logging), MLflow is used
- For deployment, an API has been developed with FastAPI and packaged with Docker
- For the training, the dataset is split into 90% - 10% for train and test sets respectively
- To get the latest model from the MLflow logs for production, use the script get_model_for_production.py
- The python packages are listed in requirements.txt
- The Docker image for the container is built from the Dockerfile
- For training and logging the model, use the modeling/ml_model_dev.py script
- The FastAPI app deployment code is in the app.py script
- To test the deployed FastAPI app on a local machine, the test_post_request.py script can be used
- To build the container, run the following command
docker build -t fastapi_water_potability .
- To run the container, run the following command
docker run -p 5000:5000 -t fastapi_water_potability
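
With the container running and port 5000 published as above, a request to the app can be sketched roughly as follows. This is only an illustration and not the contents of test_post_request.py: the `/predict` path, the feature names (taken from the commonly used public water potability dataset) and the sample values are all assumptions, so check app.py for the actual route and input schema.

```python
import json
import urllib.request

# Assumption: feature names follow the public water potability dataset.
# The values are made up purely for illustration.
sample = {
    "ph": 7.0,
    "Hardness": 200.0,
    "Solids": 20000.0,
    "Chloramines": 7.0,
    "Sulfate": 330.0,
    "Conductivity": 420.0,
    "Organic_carbon": 14.0,
    "Trihalomethanes": 66.0,
    "Turbidity": 4.0,
}


def build_request(payload, url="http://localhost:5000/predict"):
    """Build a JSON POST request; '/predict' is a hypothetical route."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(payload, url="http://localhost:5000/predict"):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(payload, url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Uncomment once the container is running:
# print(send_request(sample))
```

The request is built separately from the network call so the payload and headers can be inspected without a running server.
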
- To deploy the container on a Kubernetes cluster, refer to kubernetes_deployment/README.md
- The FastAPI application with appropriate changes has also been deployed to HuggingFace
- To test the deployed FastAPI app on HuggingFace, use the test_post_request.py script in the HuggingFace repo since the endpoint is different
- The documentation generated with Sphinx is available in docs/_build/html/index.html
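
For reference, the 90% / 10% train/test split mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the code in modeling/ml_model_dev.py, which may well use a library helper such as scikit-learn's `train_test_split`. The function name, the seed and the stand-in rows are hypothetical.

```python
import random


def split_train_test(rows, test_fraction=0.1, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the ~3.2K dataset rows (hypothetical placeholder data).
rows = list(range(3200))
train, test = split_train_test(rows)
print(len(train), len(test))  # prints: 2880 320
```

Fixing the seed keeps the split reproducible across training runs, which makes metrics logged to MLflow easier to compare.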