TY - GEN
T1 - Minimizing Noise in HyperLogLog-Based Spread Estimation of Multiple Flows
AU - Dao, Dinh Nguyen
AU - Jang, Rhongho
AU - Jung, Changhun
AU - Mohaisen, David
AU - Nyang, Dae Hun
N1 - Funding Information:
This research was supported by the Global Research Laboratory (GRL) Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2016K1A1A2912757), by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2020R1A2C2009372), and by the Ewha Womans University Research Grant of 2020 (1-2020-0311-001-1). DaeHun Nyang is the corresponding author. We thank the anonymous reviewers and the shepherd, Dr. Eduardo Alchieri, for their valuable feedback.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Cardinality estimation has become an essential building block of modern network monitoring systems due to increasing concerns about cyberattacks (e.g., Denial-of-Service, worms, spammers, and scanners). However, the ever-increasing attack scale and the diversity of patterns (i.e., flow size distribution) cause existing solutions to produce biased estimates if they apply a monotonic hypothesis to network traffic. The most representative solution is virtual HyperLogLog (vHLL), which extends the proven HLL, a single-element cardinality estimation solution, to a multi-tenant version using random memory sharing and noise elimination. In this paper, we show that the assumption made by vHLL does not hold for large-scale network traffic with diverse flow distributions. To resolve the issue, we propose a novel noise elimination method, called Rank Recovery-based Spread Estimator (RRSE), which is tolerant to both attack and normal traffic scenarios while using limited computation and storage. We show that our recovery function is more reliable than state-of-the-art approaches. Moreover, we implemented RRSE in a programmable switch to demonstrate its feasibility.
AB - Cardinality estimation has become an essential building block of modern network monitoring systems due to increasing concerns about cyberattacks (e.g., Denial-of-Service, worms, spammers, and scanners). However, the ever-increasing attack scale and the diversity of patterns (i.e., flow size distribution) cause existing solutions to produce biased estimates if they apply a monotonic hypothesis to network traffic. The most representative solution is virtual HyperLogLog (vHLL), which extends the proven HLL, a single-element cardinality estimation solution, to a multi-tenant version using random memory sharing and noise elimination. In this paper, we show that the assumption made by vHLL does not hold for large-scale network traffic with diverse flow distributions. To resolve the issue, we propose a novel noise elimination method, called Rank Recovery-based Spread Estimator (RRSE), which is tolerant to both attack and normal traffic scenarios while using limited computation and storage. We show that our recovery function is more reliable than state-of-the-art approaches. Moreover, we implemented RRSE in a programmable switch to demonstrate its feasibility.
KW - Cardinality Estimation
KW - Network Anomaly Detection
KW - Programmable Switch
KW - Sketch
UR - http://www.scopus.com/inward/record.url?scp=85136322570&partnerID=8YFLogxK
U2 - 10.1109/DSN53405.2022.00042
DO - 10.1109/DSN53405.2022.00042
M3 - Conference contribution
AN - SCOPUS:85136322570
T3 - Proceedings - 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2022
SP - 331
EP - 342
BT - Proceedings - 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 June 2022 through 30 June 2022
ER -