The Real-Time Anomaly Detection Dashboard is a Python-based application that simulates a real-time data stream, detects anomalies using River's Predictive Anomaly Detection algorithm, and visualizes the results through an interactive Dash web interface. This application is designed for monitoring and analyzing streaming data, providing users with the ability to start, stop, reset the data stream, and download detected anomalies for further analysis.
- Real-Time Data Simulation: Generates streaming data with dynamic concept drift and occasional anomalies.
- Anomaly Detection: Utilizes River's Predictive Anomaly Detection to identify anomalies in the data stream.
- Interactive Dashboard: Visualizes data and anomalies in real-time using Plotly and Dash.
- Control Panel: Start, stop, and reset the data stream with intuitive buttons.
- Data Export: Download detected anomalies as a CSV file for offline analysis.
- Threading Support: Ensures smooth and responsive performance during data streaming and processing.
- Robust Error Handling: Improved error handling and data validation to ensure application stability.
The algorithm used for anomaly detection in this code is River's Predictive Anomaly Detection, which leverages a time series model (SNARIMAX) to predict future values based on past observations. The effectiveness of this approach lies in its ability to adapt to changes in the data stream, such as concept drift and seasonal patterns, making it suitable for real-time applications.
The model calculates an anomaly score for each incoming data point, and if the score exceeds a predefined threshold, the point is flagged as an anomaly. This method is robust for detecting outliers in dynamic environments, ensuring timely identification of potential issues.
SNARIMAX stands for (S)easonal (N)on-linear (A)uto(R)egressive (I)ntegrated (M)oving-(A)verage with e(X)ogenous inputs model. This model generalizes many established time series models in a single interface that can be trained online. It assumes that the provided training data is ordered in time and is uniformly spaced. It is made up of the following components:
- S (Seasonal): Captures seasonal patterns in the data.
- N (Non-linear): Any online regression model can be used, not necessarily linear regression.
- AR (Autoregressive): Lags of the target variable are used as features.
- I (Integrated): The model can be fitted on a differenced version of a time series.
- MA (Moving average): Lags of the errors are used as features.
- X (Exogenous): Users can provide additional features, ensuring they are available at both training and prediction time.
Figure: Real-Time Anomaly Detection Dashboard
- Python 3.8 or higher
pip
package manager
-
Clone the Repository
git clone https://github.com/HimanshuKumar17052001/real-time-anomaly-detection.git cd real-time-anomaly-detection
-
Create a Virtual Environment
It's recommended to use a virtual environment to manage dependencies.
python3 -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
-
Install Dependencies
pip install -r requirements.txt
-
Run the Application
python app.py
-
Access the Dashboard
Open your web browser and navigate to
http://127.0.0.1:8050/
to view the dashboard. -
Dashboard Controls
- Start Streaming: Begin the real-time data stream.
- Stop Streaming: Halt the data stream.
- Reset: Clear the current data and reset the simulation.
- Download Anomalies CSV: Download a CSV file containing all detected anomalies.
real-time-anomaly-detection/
│
├── app.py # Main application script
├── requirements.txt # Python dependencies
├── README.md # Project documentation
├── screenshots/ # Directory for screenshots
│ └── dashboard.png
└── LICENSE # License information
Contains the core functionality, including data simulation, anomaly detection, and the Dash dashboard setup.
Contributions are welcome! Please follow these steps:
-
Fork the Repository
-
Create a Feature Branch
git checkout -b feature/NewFeature
-
Commit Your Changes
git commit -m "Add new feature"
-
Push to the Branch
git push origin feature/NewFeature
-
Open a Pull Request
This project is licensed under the MIT License. You are free to use, modify, and distribute this software in any project.
- Dash by Plotly for providing the framework for building interactive web applications.
- River for the Predictive Anomaly Detection implementation.
- NumPy and Pandas for data manipulation and analysis.