PROJECT REVIEW I
for
MACHINE LEARNING-BASED
NETWORK INTRUSION DETECTION
SYSTEM
Approved By: Dr. S Nagarajan
Prepared By: Alex Joy
PRK23DS1021
6-2-25
TABLE OF CONTENTS
Contents
1. INTRODUCTION
1.1 Overview of the Project
1.2 Document Conventions
1.3 Motivation and Background
1.4 Problem Statement
1.5 Objective of the Project
1.6 Relevance and Need of the Project in the Present Context
1.7 Project Scope
1.8 References
2. LITERATURE REVIEW
2.1 Introduction
2.2 Review of the Existing Systems
2.2.1 Existing System 1: Snort IDS [Technologies used, Features, Drawbacks]
2.2.2 Existing System 2: Suricata IDS [Technologies used, Features, Drawbacks]
2.2.3 Existing System 3: Bro IDS (Zeek) [Technologies used, Features, Drawbacks]
2.2.4 Existing System 4: OSSEC [Technologies used, Features, Drawbacks]
2.2.5 Existing System 5: Cisco [Technologies used, Features, Drawbacks]
2.3 Summary of Drawbacks of the Existing Systems/Research Gaps
2.4 Proposed Approach
2.5 Unique Features of the Proposed System
2.6 Utility Value of the Proposed System
2.7 Scalability and Environmental Sustainability
3. SYSTEM REQUIREMENTS
3.1 Introduction
3.2 Users of the System
3.3 Functional Requirements
3.4 Non-Functional Requirements
3.5 Hardware and Software Requirements
3.6 External Interface Requirements
3.6.1 User Interfaces
3.6.2 Database Interfaces
3.7 Feasibility Analysis of the Requirements
4. USER INTERFACE DESIGN
4.1 Screen Element Requirement Analysis
4.2 Screen Interfaces
5. APPENDICES
5.1 Context Diagram
5.2 Data Flow Diagram
5.3 Use Case Diagram
CHAPTER 1
INTRODUCTION
1.1. OVERVIEW OF THE PROJECT
Intrusion refers to unauthorized access or activities within a network or system, typically with the
intent to compromise its confidentiality, integrity, or availability. Intrusions can range from minor
unauthorized access, like hacking attempts, to more severe activities such as malware installation,
data theft, or system manipulation. These activities can be conducted by external attackers or
malicious insiders who exploit vulnerabilities in a network or system.
An Intrusion Detection System (IDS) is a security mechanism designed to monitor and analyze
network or system traffic to detect potential intrusions or unauthorized access attempts. An IDS works
by identifying suspicious activities, which could be signs of attacks such as denial-of-service
(DoS), malware infection, or unauthorized access. There are two primary types of IDS: signature-
based, which detects known attack patterns, and anomaly-based, which identifies deviations from
normal behavior that may indicate new or unknown threats. The primary purpose of an IDS is to
provide alerts, enabling network administrators to respond to and mitigate potential security
breaches before they cause significant harm.
Division of Digital Science Karunya Institute of Technology and Sciences
Machine Learning-Based Network Intrusion Detection System
Background
Network security has always been a critical concern in the digital era. Traditional Intrusion
Detection Systems (IDS) are broadly classified into two types:
1. Signature-Based IDS – Detects known attacks by matching traffic patterns against a
predefined database of signatures (e.g., Snort, Suricata).
2. Anomaly-Based IDS – Detects deviations from normal traffic behavior, which helps in
identifying zero-day attacks.
While signature-based methods are effective against known threats, they fail to detect novel or
evolving cyberattacks. Anomaly-based detection, on the other hand, can flag unknown threats but
often suffers from high false-positive rates. A hybrid approach that combines machine learning
with both signature-based and anomaly-based techniques can significantly enhance intrusion
detection capabilities. This project employs Random Forest to detect known threats, trained on
labeled datasets such as NSL-KDD and CICIDS2017, and Isolation Forest to identify anomalies,
which makes it well suited to flagging zero-day attacks. Feature engineering and PCA for
dimensionality reduction are used to keep the models accurate and efficient, while real-time
monitoring applies them to live traffic. By integrating machine learning into intrusion detection,
ML-NIDS aims to provide a scalable, adaptive, and proactive cybersecurity solution capable of
safeguarding modern networks against evolving cyber threats.
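The hybrid idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: it assumes scikit-learn and NumPy are installed, uses synthetic four-feature data in place of NSL-KDD or CICIDS2017, and the function name `hybrid_predict`, the label strings, and the `contamination` value are hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for labeled traffic features (assumption: four numeric features).
normal = rng.normal(loc=0.0, scale=1.0, size=(300, 4))
known_attack = rng.normal(loc=5.0, scale=1.0, size=(300, 4))
X_train = np.vstack([normal, known_attack])
y_train = np.array([0] * 300 + [1] * 300)  # 0 = normal, 1 = known attack

# Supervised model for attack classes that appear in the labeled data.
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Unsupervised model fitted on normal traffic only, to flag deviations from it.
iso = IsolationForest(n_estimators=100, contamination=0.05, random_state=0).fit(normal)


def hybrid_predict(X):
    """Return 'known_attack', 'anomaly', or 'normal' for each row of X."""
    rf_label = rf.predict(X)    # 1 where the classifier recognises a known attack
    iso_label = iso.predict(X)  # -1 where the sample looks unlike normal traffic
    out = np.full(len(X), "normal", dtype=object)
    out[iso_label == -1] = "anomaly"
    out[rf_label == 1] = "known_attack"  # a recognised attack takes precedence
    return out


samples = np.array([
    [0.1, -0.2, 0.0, 0.3],      # close to the normal cluster
    [5.2, 4.9, 5.1, 4.8],       # close to the known-attack cluster
    [-8.0, -8.0, -8.0, -8.0],   # far from anything seen in training
])
results = hybrid_predict(samples)
print(list(results))
```

Letting the Random Forest verdict override the Isolation Forest verdict is one possible combination rule; a deployment could equally report both signals side by side.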
relevant features, and classify malicious activities in real time. This project, Machine Learning-
Based Network Intrusion Detection System (ML-NIDS), aims to address these challenges by
leveraging machine learning algorithms such as Random Forest for detecting known threats and
Isolation Forest for identifying anomalies. Using benchmark datasets like NSL-KDD and
CICIDS2017, the system applies feature engineering, Principal Component Analysis (PCA), and
real-time traffic monitoring to enhance detection accuracy.
methods generate high false-positive rates. The growing complexity of network traffic and the rise
in ransomware, phishing, and distributed denial-of-service (DDoS) attacks highlight the urgent
need for intelligent, adaptive, and automated security solutions.
This project, Machine Learning-Based Network Intrusion Detection System (ML-NIDS),
addresses these challenges by leveraging machine learning techniques to detect both known and
unknown cyber threats in real time. In the present digital landscape, where cyberattacks are
becoming more frequent and complex, ML-driven intrusion detection is essential for strengthening
cybersecurity defenses and ensuring the resilience of modern networks.
CHAPTER 2
LITERATURE REVIEW
2.1. INTRODUCTION
A literature review serves as a critical analysis of existing research and developments in the field
of network intrusion detection systems (NIDS). With the rapid evolution of cyber threats,
researchers and practitioners have been continuously exploring new methodologies and
technologies to enhance the effectiveness and efficiency of NIDS. The focus has shifted from
traditional signature-based approaches to more advanced techniques, including machine learning,
anomaly detection, and hybrid models that offer better adaptability to emerging threats. This
review examines existing systems and their underlying technologies, and identifies the gaps
and challenges that drive the need for more robust and scalable solutions in the field of network
security.
• Features:
o Multi-threaded architecture for high-performance traffic analysis
o Combination of signature-based, anomaly-based, and protocol analysis
o Advanced logging and alerting features with JSON output
o Supports both intrusion detection and prevention
o High-speed packet capture and analysis
• Drawbacks:
o Performance can be affected on high-traffic networks
o Complex configuration and management
o Less community support compared to other IDS tools like Snort
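To illustrate why JSON-formatted alert output, listed among the features above, is convenient for downstream tooling, the sketch below parses one alert record using only the standard library. The record is an invented example shaped like an EVE-style JSON log entry; the field names are assumed to follow that format, and the values and the helper name `summarize_alert` are hypothetical.

```python
import json

# Hypothetical alert line in the style of an EVE JSON log (all values are invented).
line = (
    '{"timestamp": "2025-02-06T10:15:30.000000+0000", "event_type": "alert", '
    '"src_ip": "192.0.2.10", "src_port": 44321, "dest_ip": "198.51.100.5", '
    '"dest_port": 80, "proto": "TCP", '
    '"alert": {"signature": "Example suspicious HTTP request", "severity": 2}}'
)


def summarize_alert(raw_line):
    """Return a short summary dict for alert events, or None for other events."""
    event = json.loads(raw_line)
    if event.get("event_type") != "alert":
        return None
    alert = event.get("alert", {})
    return {
        "time": event.get("timestamp"),
        "source": f'{event.get("src_ip")}:{event.get("src_port")}',
        "target": f'{event.get("dest_ip")}:{event.get("dest_port")}',
        "signature": alert.get("signature"),
        "severity": alert.get("severity"),
    }


summary = summarize_alert(line)
print(summary)
```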
2.2.3. Bro IDS (Zeek)
• Technologies Used: C++, Python, Lua, MySQL
• Features:
o Focuses on network monitoring and behavior analysis
o Provides high-level logging for network traffic and security events
o Extensible with scripting support for custom protocols and rules
o Can analyze HTTP, DNS, FTP, and many other protocols
• Drawbacks:
o Limited signature-based detection for known attacks
o High resource consumption, especially in high-traffic environments
o Requires advanced knowledge for configuration and tuning
2.2.4. OSSEC
• Technologies Used: C, Python, MySQL
• Features:
o Host-based IDS that focuses on log analysis, file integrity checking, and rootkit
detection
o Real-time alerts for system events, file integrity changes, and unauthorized access
attempts
o Integrates well with external security tools like firewalls and VPNs
o Can work across distributed systems and remote servers
• Drawbacks:
and Cisco Firepower. Finally, many of these solutions are costly, with proprietary tools like Cisco
Firepower requiring specialized knowledge for deployment and ongoing management. These
limitations highlight the need for more scalable, adaptive, and easy-to-manage solutions capable
of detecting both known and unknown threats effectively in real time.
system’s ability to correctly identify attacks while minimizing false positives. The evaluation
process ensures that the model is reliable and robust for real-world applications.
CHAPTER 3
SYSTEM REQUIREMENTS
3.1 INTRODUCTION
Implementing the project requires infrastructure that can handle data processing, model training,
and real-time monitoring. The system's requirements cover both hardware and software
components and are chosen so that the system integrates with existing network environments,
supports large-scale data processing, and provides accurate intrusion detection. Adequate data
storage and computational power are essential for the system to remain reliable and scalable
while detecting and helping to prevent cyberattacks.
visualizes network traffic, detected intrusions, attack logs, and predictive analytics. The dashboard
allows administrators to easily access and interpret security data, receive alerts for suspicious
activities, and track the system’s overall health. It offers actionable insights and enables efficient
management of the network’s security, empowering users to make informed decisions and take
necessary actions swiftly.
3.3.2. End User (Admin)
The Admin user plays a crucial role in overseeing the operation of the ML-NIDS. Admins are
responsible for managing user access, configuring system settings, and ensuring that the system
functions smoothly across the network. They can review alerts, monitor real-time
traffic, and assess the performance of the intrusion detection system. Admins are also responsible
for responding to detected intrusions, generating reports, and managing ongoing system updates.
They play an essential role in maintaining the security and efficiency of the network.
3.3.3. ML-IDS Model: Build and Train
The ML-IDS Model: Build and Train feature involves developing and training machine learning
models using historical network traffic data. The model uses algorithms like Random Forest and
Isolation Forest to classify network traffic as either normal or malicious. This phase includes data
preprocessing steps such as normalization, feature selection, and PCA to ensure optimal
performance. The system will train the model using benchmark datasets like NSL-KDD and
CICIDS2017, refining the model to accurately detect a variety of attacks including DoS, R2L,
U2R, and zero-day attacks.
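The build-and-train phase described above (normalization, PCA, then a Random Forest classifier) can be sketched as a single scikit-learn pipeline. This is an illustrative sketch only: it assumes scikit-learn and NumPy are installed, uses synthetic data rather than NSL-KDD or CICIDS2017, omits the feature-selection step, and the number of features, the number of PCA components, and the train/test split ratio are assumed values.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for a labeled dataset: 10 numeric features, binary label.
n_per_class = 400
normal = rng.normal(0.0, 1.0, size=(n_per_class, 10))
attack = rng.normal(3.0, 1.0, size=(n_per_class, 10))
X = np.vstack([normal, attack])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = normal, 1 = malicious

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Normalization -> PCA -> Random Forest, all fitted on the training split only.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),  # assumed number of components
    RandomForestClassifier(n_estimators=100, random_state=42),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier means they are fitted on the training split only, so no statistics from the held-out data leak into preprocessing.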
3.3.4. Intrusion Detection (Real-Time)
The Intrusion Detection (Real-Time) functionality continuously analyzes incoming network
traffic to identify signs of malicious activity or network intrusion. The system uses the trained
machine learning model to classify traffic in real time, distinguishing between legitimate and
potentially harmful behavior. Because the network is monitored continuously, abnormal
patterns are flagged for further investigation as soon as they are detected. Detecting and
responding to threats in real time enhances network security by minimizing the window of
exposure to cyberattacks.
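The real-time loop described above can be sketched as a generator that classifies each flow as it arrives and yields only the flagged ones. This is a simplified sketch: the `monitor` function, the `recent_flag_rate` field, and the `toy_classifier` rule (a packet-rate threshold standing in for the trained model) are hypothetical, and a real deployment would read flows from a capture source rather than a list.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Iterator

Flow = Dict[str, float]


def monitor(
    flows: Iterable[Flow],
    classify: Callable[[Flow], str],
    window: int = 100,
) -> Iterator[dict]:
    """Classify flows one at a time and yield a record for each non-normal flow."""
    recent = deque(maxlen=window)  # rolling record of the latest decisions
    for index, flow in enumerate(flows):
        label = classify(flow)
        recent.append(label != "normal")
        if label != "normal":
            yield {
                "index": index,
                "label": label,
                "flow": flow,
                # share of flagged flows in the current window, e.g. for a dashboard gauge
                "recent_flag_rate": sum(recent) / len(recent),
            }


def toy_classifier(flow: Flow) -> str:
    """Stand-in for the trained model: flags very high packet rates (assumed rule)."""
    return "malicious" if flow["packets_per_second"] > 1000 else "normal"


stream = [
    {"packets_per_second": 12.0},
    {"packets_per_second": 5400.0},
    {"packets_per_second": 30.0},
    {"packets_per_second": 9100.0},
]
flagged = list(monitor(stream, toy_classifier))
for item in flagged:
    print(item["index"], item["label"], round(item["recent_flag_rate"], 2))
```

Because `monitor` is a generator, each flagged flow is handed to the caller as soon as it is classified, without waiting for the rest of the stream.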
3.3.5. Alert Generation
The Alert Generation function is activated whenever the system detects suspicious or malicious
activity in the network. Alerts are automatically generated based on the detected anomalies and
predefined thresholds set by the administrator. These alerts include detailed information about the
nature of the threat, the affected system, and recommended actions for mitigation. Alerts are
immediately visible on the Web Dashboard, where they can be reviewed, analyzed, and acted
upon by the administrator. The alert system is designed to prioritize threats based on severity to
facilitate swift decision-making.
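A minimal sketch of threshold-based alert generation with severity prioritization follows. It is only an illustration of the behavior described above: the threshold values, severity names, the `Alert` fields, and the recommendation text are all assumptions, and the detection score is taken to be a number between 0 and 1.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed administrator-defined thresholds on a 0..1 detection score, high to low.
THRESHOLDS = [(0.9, "critical"), (0.7, "high"), (0.5, "medium")]
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2}


@dataclass(order=True)
class Alert:
    rank: int       # lower rank = more urgent
    sequence: int   # tie-breaker: earlier alerts first
    threat: str = field(compare=False)
    affected_system: str = field(compare=False)
    severity: str = field(compare=False)
    recommendation: str = field(compare=False)


def generate_alert(sequence, threat, affected_system, score) -> Optional[Alert]:
    """Create an alert if the score reaches any threshold, else return None."""
    for minimum, severity in THRESHOLDS:
        if score >= minimum:
            return Alert(
                rank=SEVERITY_RANK[severity],
                sequence=sequence,
                threat=threat,
                affected_system=affected_system,
                severity=severity,
                recommendation=f"Investigate {threat} activity on {affected_system}",
            )
    return None


detections = [
    ("Probe", "10.0.0.7", 0.55),
    ("DoS", "10.0.0.2", 0.95),
    ("R2L", "10.0.0.9", 0.30),  # below every threshold: no alert
    ("U2R", "10.0.0.4", 0.75),
]

queue: List[Alert] = []
for seq, (threat, host, score) in enumerate(detections):
    alert = generate_alert(seq, threat, host, score)
    if alert is not None:
        heapq.heappush(queue, alert)

# Pop alerts in priority order, most severe first.
ordered = [heapq.heappop(queue) for _ in range(len(queue))]
for alert in ordered:
    print(alert.severity, alert.threat, alert.affected_system)
```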
3.3.6. Model Evaluation
The Model Evaluation process assesses the performance of the machine learning model after it
has been trained and deployed. It measures the model's ability to accurately identify network
intrusions and classify them into appropriate categories (e.g., DoS, Probe, R2L, U2R). The
evaluation uses metrics such as accuracy, precision, recall, and F1-score to quantify the model’s
effectiveness. This step ensures that the model can be trusted to perform reliably in real-world
scenarios, identifying and responding to cyber threats with minimal false positives or false
negatives. Regular evaluation and model fine-tuning are essential for maintaining the system’s
efficiency over time.
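The four metrics named above can be computed directly from the confusion-matrix counts. The sketch below uses plain Python and a small hand-made set of labels; the function names and the example labels are illustrative assumptions, with 1 standing for an intrusion and 0 for normal traffic.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary labeling."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1-score as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # flagged items that were real attacks
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # real attacks that were flagged
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative labels: 4 attacks and 6 normal flows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In this example one attack is missed (a false negative) and one normal flow is flagged (a false positive), which is why precision and recall both fall below 1 even though accuracy is 0.8.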
• Reliability
The system should be highly reliable, operating without interruption. It must provide consistent
intrusion detection and alerting capabilities, even during hardware failures or system overloads,
ensuring 24/7 availability and minimal downtime.
• Security
Security is crucial, ensuring that all data and communications are protected from unauthorized
access. The system should employ strong encryption methods and secure authentication protocols
to safeguard sensitive information and prevent potential breaches.
• Usability
The user interface should be intuitive and easy to navigate for administrators and cybersecurity
analysts. The system must provide a clear and simple design for monitoring, analyzing, and
responding to security alerts, making it user-friendly for non-technical users as well.
• Maintainability
The system should be easy to maintain and update. New features or security patches should be
easily integrated, with minimal disruption to the ongoing operation. The system should be designed
with modularity to allow easy debugging, testing, and updates.
• Compatibility
The system must be compatible with a variety of network environments and devices. It should
integrate seamlessly with existing infrastructure and support common network protocols and
configurations.
CHAPTER 4
USER INTERFACE DESIGN
4.1 SCREEN ELEMENT REQUIREMENT ANALYSIS
The project will require several key screen elements to provide a user-friendly interface for
administrators to monitor and manage the system effectively. Below is an analysis of the essential
screen elements:
1. Login/Authentication Screen:
• Purpose: Secure access to the system.
• Elements:
o Username and password fields
o Login button
o Forgot password link
o Additional verification (e.g., CAPTCHA, 2FA) for enhanced security
2. Dashboard Screen:
• Purpose: Provide a real-time overview of the network status and alerts.
• Elements:
o Overview of network traffic (graphs, charts)
o Recent activity and attack logs
o Real-time attack detection status
o Alerts section showing triggered intrusions
o Summary of detected threats (e.g., DoS, Probe, R2L, U2R)
3. Attack Logs Screen:
• Purpose: Display historical attack data and details of each intrusion.
• Elements:
o Table listing detected attacks with timestamps, attack type, and severity
o Filter and search options (by date, severity, attack type)
o Option to export logs (CSV, PDF)
4. Intrusion Detection Configuration Screen:
• Purpose: Allow administrators to configure system parameters, including thresholds for
attack detection.
• Elements:
o Dropdown/select fields for setting detection parameters (e.g., sensitivity levels)
o Option to enable/disable specific attack detection models (Random Forest, Isolation
Forest)
o Configuration save button
5. Alert Management Screen:
• Purpose: View and manage active alerts.
• Elements:
o Table or list of active alerts with severity, attack type, and status
o Option to acknowledge, resolve, or escalate alerts
o Alert history and resolution logs
o Option to view alert details (including possible attack patterns and
recommendations)
6. Model Evaluation Screen:
• Purpose: Display the performance metrics of the ML model.
• Elements:
o Metrics: Accuracy, Precision, Recall, F1-Score
o Visual graphs (e.g., confusion matrix, ROC curve)
o Option to download evaluation report
o Option to retrain or adjust the model based on evaluation results
7. Settings/Preferences Screen:
• Purpose: Customize user preferences and system settings.
• Elements:
o User profile settings (change username, password)
o Notification preferences (e.g., email alerts, SMS)
o System update and maintenance options
o Backup and restore configuration
These screens will work together to provide a seamless and intuitive user experience, allowing
administrators to manage and respond to network intrusions effectively. The system will ensure
that key elements such as alerts, logs, and performance metrics are easily accessible and actionable.
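As an illustration of the Attack Logs Screen behavior listed above (filtering by date, severity, and attack type, then exporting to CSV), the following sketch uses only the Python standard library. The log entries, field names, severity ordering, and function names are hypothetical; PDF export is not covered.

```python
import csv
import io
from datetime import datetime

SEVERITY_ORDER = {"low": 1, "medium": 2, "high": 3}
FIELDS = ["timestamp", "attack_type", "severity"]

# Invented log entries for demonstration.
logs = [
    {"timestamp": "2025-01-10T08:00:00", "attack_type": "DoS", "severity": "high"},
    {"timestamp": "2025-01-11T09:30:00", "attack_type": "Probe", "severity": "low"},
    {"timestamp": "2025-01-12T14:45:00", "attack_type": "R2L", "severity": "medium"},
    {"timestamp": "2025-01-13T23:10:00", "attack_type": "DoS", "severity": "medium"},
]


def filter_logs(entries, attack_type=None, min_severity=None, since=None):
    """Return entries matching every filter that is not None."""
    selected = []
    for entry in entries:
        if attack_type is not None and entry["attack_type"] != attack_type:
            continue
        if min_severity is not None and (
            SEVERITY_ORDER[entry["severity"]] < SEVERITY_ORDER[min_severity]
        ):
            continue
        if since is not None and datetime.fromisoformat(entry["timestamp"]) < since:
            continue
        selected.append(entry)
    return selected


def export_csv(entries):
    """Serialize entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


recent_serious = filter_logs(logs, min_severity="medium", since=datetime(2025, 1, 11))
csv_text = export_csv(recent_serious)
print(csv_text)
```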
CHAPTER 5
APPENDICES
5.1 CONTEXT DIAGRAM
1-level DFD
In the 1-level DFD, the context diagram is decomposed into multiple bubbles/processes. At this
level, we highlight the main functions of the system and break down the high-level process of the
0-level DFD into subprocesses.
2-level DFD
The 2-level DFD goes one step deeper into parts of the 1-level DFD. It can be used to plan or
record the specific detail needed about the system’s functioning.