skip to main content
research-article

Real-Time Anomaly Detection in Edge Streams

Published: 08 January 2022 Publication History

Abstract

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges.
In this work, we propose Midas, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. We further propose Midas-F, to solve the problem by which anomalies are incorporated into the algorithm’s internal states, creating a “poisoning” effect that can allow future anomalies to slip through undetected. Midas-F introduces two modifications: (1) we modify the anomaly scoring function, aiming to reduce the “poisoning” effect of newly arriving edges; (2) we introduce a conditional merge step, which updates the algorithm’s data structures after each time tick, but only if the anomaly score is below a threshold value, also to reduce the “poisoning” effect. Experiments show that Midas-F has significantly higher accuracy than Midas.
In general, the algorithms proposed in this work have the following properties: (a) they detects microcluster anomalies while providing theoretical guarantees about the false positive probability; (b) they are online, thus processing each edge in constant time and constant memory, and also processes the data orders-of-magnitude faster than state-of-the-art approaches; and (c) they provides up to 62% higher area under the receiver operating characteristic curve than state-of-the-art approaches.

References

[1]
Charu C. Aggarwal, Yuchen Zhao, and Philip S. Yu. 2010. On clustering graph streams. In SDM.
[2]
Charu C. Aggarwal, Yuchen Zhao, and Philip S. Yu. 2011. Outlier detection in graph streams. In ICDE.
[3]
Leman Akoglu, Mary McGlohon, and Christos Faloutsos. 2010. Oddball: Spotting anomalies in weighted graphs. In PAKDD.
[4]
Leman Akoglu, Hanghang Tong, and Danai Koutra. 2015. Graph based anomaly detection and description: A survey. Data Mining and Knowledge Discovery 29, 3 (2015), 626–688.
[5]
Mohamed Jaward Bah, Hongzhi Wang, Mohamed Hammad, Furkh Zeshan, and Hanan Aljuaid. 2019. An effective minimal probing approach with micro-cluster for distance-based outlier detection in data streams. IEEE Access 7 (2019), 154922–154934.
[6]
Maroua Bahri, Silviu Maniu, and Albert Bifet. 2018. A sketch-based naive Bayes algorithms for evolving data streams. In IEEE Big Data.
[7]
Caleb Belth, Xinyi Zheng, and Danai Koutra. 2020. Mining persistent activity in continually evolving networks. In KDD.
[8]
Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, and Christos Faloutsos. 2013. Copycatch: Stopping group attacks by spotting lockstep behavior in social networks. In WWW.
[9]
Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, and Christos Faloutsos. 2020. MIDAS: Microcluster-based detector of anomalies in edge streams. In AAAI.
[10]
Elnaz Bigdeli, Mahdi Mohammadi, Bijan Raahemi, and Stan Matwin. 2018. Incremental anomaly detection using two-layer cluster-based structure. Information Sciences 429 (2018), 315–331.
[11]
Petko Bogdanov, Christos Faloutsos, Misael Mongiovì, Evangelos E. Papalexakis, Razvan Ranca, and Ambuj K. Singh. 2013. NetSpot: Spotting significant anomalous regions on dynamic networks. In SDM.
[12]
Paul Boniol and Themis Palpanas. 2020. Series2graph: Graph-based subsequence anomaly detection for time series. Proceedings of the VLDB Endowment 13, 12 (2020), 1821–1834.
[13]
Deepayan Chakrabarti. 2004. Autopart: Parameter-free graph partitioning and outlier detection. In PKDD.
[14]
Yen-Yu Chang, Pan Li, Rok Sosic, M. H. Afifi, Marco Schweighauser, and Jure Leskovec. 2021. F-FADE: Frequency factorization for anomaly detection in edge streams. In WSDM.
[15]
Graham Cormode and Shan Muthukrishnan. 2005. An improved data stream summary: The count-min sketch and its applications. Journal of Algorithms 55, 1 (2005), 58–75.
[16]
Dhivya Eswaran and Christos Faloutsos. 2018. Sedanspot: Detecting anomalies in edge streams. In ICDM.
[17]
Dhivya Eswaran, Christos Faloutsos, Sudipto Guha, and Nina Mishra. 2018. SpotLight: Detecting anomalies in streaming graphs. In KDD.
[18]
Yixiang Fang, Xin Huang, Lu Qin, Ying Zhang, Wenjie Zhang, Reynold Cheng, and Xuemin Lin. 2020. A survey of community search over big graphs. The VLDB Journal 29, 1 (2020), 353–392.
[19]
Sebastian Garcia, Martin Grill, Jan Stiborek, and Alejandro Zunino. 2014. An empirical comparison of botnet detection methods. Computers & Security 45 (2014), 100–123.
[20]
Manish Gupta, Jing Gao, Yizhou Sun, and Jiawei Han. 2012. Integrating community matching and outlier detection for mining evolutionary community outliers. In KDD.
[21]
Liang He, Bin Shao, Yatao Li, and Enhong Chen. 2015. Distributed real-time knowledge graph serving. In BIGCOMP.
[22]
Bryan Hooi, Kijung Shin, Hyun Ah Song, Alex Beutel, Neil Shah, and Christos Faloutsos. 2017. Graph-based fraud detection in the face of camouflage. ACM Transactions on Knowledge Discovery from Data 11, 4 (2017), 1–26.
[23]
Meng Jiang, Peng Cui, Alex Beutel, Christos Faloutsos, and Shiqiang Yang. 2016. Catching synchronized behaviors in large networks: A graph mining approach. ACM Transactions on Knowledge Discovery from Data 10, 4 (2016), 1–27.
[24]
Arijit Khan and Sixing Yan. 2018. Composite hashing for data stream sketches. ArXiv abs/1808.06800 (2018).
[25]
Jon M. Kleinberg. 1999. Authoritative sources in a hyperlinked environment. Journal of the ACM 46, 5 (1999), 604–632.
[26]
Danai Koutra, Joshua T. Vogelstein, and Christos Faloutsos. 2013. Deltacon: A principled massive-graph similarity function. In SDM.
[27]
Philipp Kranen, Ira Assent, Corinna Baldauf, and Thomas Seidl. 2011. The ClusTree: Indexing micro-clusters for anytime stream mining. Knowledge and Information Systems 29, 2 (2011), 249–272.
[28]
Adarsh Kulkarni, Priya Mani, and Carlotta Domeniconi. 2017. Network-based anomaly detection for insider trading. ArXiv abs/1702.05809 (2017).
[29]
Panagiotis Liakos, Katia Papakonstantinopoulou, Alexandros Ntoulas, and Alex Delis. 2020. Rapid detection of local communities in graph streams. IEEE Transactions on Knowledge and Data Engineering (2020), 1–1.
[30]
Richard Lippmann, Robert K. Cunningham, David J. Fried, Isaac Graf, Kris R. Kendall, Seth E. Webster, and Marc A. Zissman. 1999. Results of the DARPA 1998 offline intrusion detection evaluation. In Recent Advances in Intrusion Detection.
[31]
Wenjuan Luo, Han Zhang, Xiaodi Yang, Lin Bo, Xiaoqing Yang, Zang Li, Xiaohu Qie, and Jieping Ye. 2020. Dynamic heterogeneous graph neural network for real-time event prediction. In KDD.
[32]
Nour Moustafa and Jill Slay. 2015. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In MilCIS.
[33]
Xin Mu, Feida Zhu, Juan Du, Ee-Peng Lim, and Zhi-Hua Zhou. 2017. Streaming classification with emerging new class by class matrix sketching. In AAAI.
[34]
Takaaki Nakamura, Makoto Imamura, Ryan Mercer, and Eamonn Keogh. 2020. MERLIN: Parameter-free discovery of arbitrary length anomalies in massive time series archives. In ICDM.
[35]
Caleb C. Noble and Diane J. Cook. 2003. Graph-based anomaly detection. In KDD.
[36]
Stephen Ranshous, Steve Harenberg, Kshitij Sharma, and Nagiza F. Samatova. 2016. A scalable approach for outlier detection in edge streams using sketch-based approximations. In SDM.
[37]
Shebuti Rayana and Leman Akoglu. 2016. Less is more: Building selective anomaly ensembles. ACM Transactions on Knowledge Discovery from Data 10, 4 (2016), 1–33.
[38]
Florin Rusu and Alin Dobra. 2009. Sketching sampled data streams. In ICDE.
[39]
Mandana Saebi, Jian Xu, Lance M. Kaplan, Bruno Ribeiro, and Nitesh V. Chawla. 2020. Efficient modeling of higher-order dependencies in networks: from algorithm to application for anomaly detection. EPJ Data Science 9, 1 (2020), 15.
[40]
Konstantinos Semertzidis, Evaggelia Pitoura, Evimaria Terzi, and Panayiotis Tsaparas. 2019. Finding lasting dense subgraphs. Data Mining and Knowledge Discovery 33 (2019), 1417–1445.
[41]
Kijung Shin, Tina Eliassi-Rad, and Christos Faloutsos. 2018. Patterns and anomalies in k-cores of real-world graphs with applications. Knowledge and Information Systems 54, 3 (2018), 677–710.
[42]
Kijung Shin, Bryan Hooi, Jisu Kim, and Christos Faloutsos. 2017. DenseAlert: Incremental dense-subtensor detection in tensor streams. In KDD.
[43]
Kumar Sricharan and Kamalika Das. 2014. Localizing anomalous changes in time-evolving graphs. In SIGMOD.
[44]
Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, and Philip S. Yu. 2007. GraphScope: Parameter-free mining of large time-evolving graphs. In KDD.
[45]
Jimeng Sun, Dacheng Tao, and Christos Faloutsos. 2006. Beyond streams and graphs: Dynamic tensor analysis. In KDD.
[46]
Hanghang Tong and Ching-Yung Lin. 2011. Non-negative residual matrix factorization with application to graph anomaly detection. In SDM.
[47]
Da Yan, Guimu Guo, Md Mashiur Rahman Chowdhury, M. Tamer Özsu, Wei-Shinn Ku, and John C. S. Lui. 2020. G-thinker: A distributed framework for mining subgraphs in a big graph. In ICDE.
[48]
Minji Yoon, Bryan Hooi, Kijung Shin, and Christos Faloutsos. 2019. Fast and accurate anomaly detection in dynamic graphs with a two-pronged approach. In KDD.
[49]
Weiren Yu, Charu C. Aggarwal, Shuai Ma, and Haixun Wang. 2013. On anomalous hotspot discovery in graph streams. In ICDM.

Cited By

View all
  • (2024)Anomaly Detection in Dynamic Graphs: A Comprehensive SurveyACM Transactions on Knowledge Discovery from Data10.1145/366990618:8(1-44)Online publication date: 29-May-2024
  • (2024)G2A2: Graph Generator with Attributes and AnomaliesProceedings of the 21st ACM International Conference on Computing Frontiers10.1145/3649153.3649206(3-11)Online publication date: 7-May-2024
  • (2024)Anomalous Link Detection in Dynamically Evolving Scale-Free-Like Networked Systems2024 IEEE International Systems Conference (SysCon)10.1109/SysCon61195.2024.10553458(1-8)Online publication date: 15-Apr-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Knowledge Discovery from Data
ACM Transactions on Knowledge Discovery from Data  Volume 16, Issue 4
August 2022
529 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3505210
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 January 2022
Accepted: 01 October 2021
Revised: 01 September 2021
Received: 01 June 2021
Published in TKDD Volume 16, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Anomaly detection
  2. streaming
  3. real-time
  4. dynamic graphs
  5. edge streams
  6. microcluster

Qualifiers

  • Research-article
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)255
  • Downloads (Last 6 weeks)37
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Anomaly Detection in Dynamic Graphs: A Comprehensive SurveyACM Transactions on Knowledge Discovery from Data10.1145/366990618:8(1-44)Online publication date: 29-May-2024
  • (2024)G2A2: Graph Generator with Attributes and AnomaliesProceedings of the 21st ACM International Conference on Computing Frontiers10.1145/3649153.3649206(3-11)Online publication date: 7-May-2024
  • (2024)Anomalous Link Detection in Dynamically Evolving Scale-Free-Like Networked Systems2024 IEEE International Systems Conference (SysCon)10.1109/SysCon61195.2024.10553458(1-8)Online publication date: 15-Apr-2024
  • (2024)A graph embedding based fault detection framework for process systems with multi-variate time-series datasetsDigital Chemical Engineering10.1016/j.dche.2023.10013510(100135)Online publication date: Mar-2024
  • (2024)Statistical methods utilizing structural properties of time-evolving networks for event detectionData Mining and Knowledge Discovery10.1007/s10618-024-01060-938:6(3831-3867)Online publication date: 1-Nov-2024
  • (2024)Network security AIOps for online stream data monitoringNeural Computing and Applications10.1007/s00521-024-09863-z36:24(14925-14949)Online publication date: 1-Aug-2024
  • (2023)IoT Network Attack Detection: Leveraging Graph Learning for Enhanced SecurityProceedings of the 18th International Conference on Availability, Reliability and Security10.1145/3600160.3605053(1-7)Online publication date: 29-Aug-2023
  • (2023)Towards Anomaly Detection using Multiple Instances of Micro-Cluster Detection2023 7th Cyber Security in Networking Conference (CSNet)10.1109/CSNet59123.2023.10339735(185-191)Online publication date: 16-Oct-2023
  • (2023)State of the art on quality control for data streamsComputer Science Review10.1016/j.cosrev.2023.10055448:COnline publication date: 1-May-2023

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Full Text

View this article in Full Text.

Full Text

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media