TY - GEN
T1 - SketchFlow
AU - Jang, Rhongho
AU - Min, Dae Hong
AU - Moon, Seong Kwang
AU - Mohaisen, David
AU - Nyang, Dae Hun
N1 - Funding Information:
This research was supported by the Global Research Laboratory (GRL) Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2016K1A1A2912757). This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2017R1A2B4010657). DaeHun Nyang is the corresponding author.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Sampling is a powerful tool to reduce the processing overhead in various systems. NetFlow uses a local table for counting records per flow, and sFlow sends out the collected packet headers periodically to a collecting server over the network. Any measurement system falls into one of these two models. To reduce the overhead, as in sFlow, simple random sampling (SRS) has been widely used in practice because of its simplicity. However, SRS provides non-uniform sampling rates for different fine-grained flows (defined by 5-tuple), because it samples packets over an aggregated data flow (defined by switch port or VLAN). Consequently, some flows are sampled at more than the designated sampling rate (resulting in over-estimation), and others are sampled at less (resulting in under-estimation). Starting with the simple idea that independent per-flow packet sampling provides the most accurate estimation of each flow, we introduce a new concept of per-flow systematic sampling, aiming to provide the same sampling rate across all flows. In addition, we provide a concrete sampling method called SketchFlow, which approximates per-flow systematic sampling using a sketch saturation event. We demonstrate SketchFlow's performance in terms of accuracy, sampling rate, and overhead using real-world datasets, including a backbone network trace, an I/O trace, and a Twitter dataset. Experimental results show that SketchFlow outperforms SRS (i.e., sFlow) and the non-linear sampling method while requiring a small CPU overhead to measure high-speed traffic in real time.
AB - Sampling is a powerful tool to reduce the processing overhead in various systems. NetFlow uses a local table for counting records per flow, and sFlow sends out the collected packet headers periodically to a collecting server over the network. Any measurement system falls into one of these two models. To reduce the overhead, as in sFlow, simple random sampling (SRS) has been widely used in practice because of its simplicity. However, SRS provides non-uniform sampling rates for different fine-grained flows (defined by 5-tuple), because it samples packets over an aggregated data flow (defined by switch port or VLAN). Consequently, some flows are sampled at more than the designated sampling rate (resulting in over-estimation), and others are sampled at less (resulting in under-estimation). Starting with the simple idea that independent per-flow packet sampling provides the most accurate estimation of each flow, we introduce a new concept of per-flow systematic sampling, aiming to provide the same sampling rate across all flows. In addition, we provide a concrete sampling method called SketchFlow, which approximates per-flow systematic sampling using a sketch saturation event. We demonstrate SketchFlow's performance in terms of accuracy, sampling rate, and overhead using real-world datasets, including a backbone network trace, an I/O trace, and a Twitter dataset. Experimental results show that SketchFlow outperforms SRS (i.e., sFlow) and the non-linear sampling method while requiring a small CPU overhead to measure high-speed traffic in real time.
UR - http://www.scopus.com/inward/record.url?scp=85090295219&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM41043.2020.9155252
DO - 10.1109/INFOCOM41043.2020.9155252
M3 - Conference contribution
AN - SCOPUS:85090295219
T3 - Proceedings - IEEE INFOCOM
SP - 1339
EP - 1348
BT - INFOCOM 2020 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 6 July 2020 through 9 July 2020
ER -