SketchFlow: Per-Flow Systematic Sampling Using Sketch Saturation Event

Rhongho Jang, Dae Hong Min, Seong Kwang Moon, David Mohaisen, Dae Hun Nyang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

17 Scopus citations

Abstract

Sampling is a powerful tool for reducing processing overhead in various systems. NetFlow uses a local table to count records per flow, whereas sFlow periodically sends the collected packet headers over the network to a collecting server. Any measurement system falls into one of these two models. To reduce the overhead, as in sFlow, simple random sampling (SRS) has been widely used in practice because of its simplicity. However, SRS yields non-uniform sampling rates across fine-grained flows (defined by 5-tuple), because it samples packets over an aggregated data flow (defined by switch port or VLAN). Consequently, some flows are sampled at more than the designated sampling rate (resulting in over-estimation), and others are sampled at less (resulting in under-estimation). Starting from the simple idea that independent per-flow packet sampling provides the most accurate estimation of each flow, we introduce a new concept, per-flow systematic sampling, which aims to provide the same sampling rate across all flows. In addition, we provide a concrete sampling method called SketchFlow, which approximates per-flow systematic sampling using a sketch saturation event. We demonstrate SketchFlow's performance in terms of accuracy, sampling rate, and overhead using real-world datasets, including a backbone network trace, an I/O trace, and a Twitter dataset. Experimental results show that SketchFlow outperforms SRS (i.e., sFlow) and the non-linear sampling method while requiring only a small CPU overhead to measure high-speed traffic in real time.
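The contrast the abstract draws between SRS over an aggregated stream and per-flow systematic sampling can be illustrated with a small sketch. This is not SketchFlow itself: it assumes an exact per-flow counter (the idealized scheme that, per the abstract, SketchFlow approximates with a sketch saturation event), and the flow names, flow sizes, sampling interval, and all function names are hypothetical values chosen for illustration.

```python
import random
from collections import Counter


def simple_random_sampling(packets, rate, seed=0):
    """Sample each packet of the aggregated stream independently with probability `rate`."""
    rng = random.Random(seed)
    return [flow for flow in packets if rng.random() < rate]


def per_flow_systematic_sampling(packets, interval):
    """Keep every `interval`-th packet of each flow.

    Assumption: an exact counter per flow is affordable. This is the
    idealized scheme only, not the sketch-based approximation.
    """
    seen = Counter()
    sampled = []
    for flow in packets:
        seen[flow] += 1
        if seen[flow] % interval == 0:
            sampled.append(flow)
    return sampled


def estimate_sizes(sampled, scale):
    """Scale per-flow sample counts back up to estimated flow sizes."""
    return {flow: count * scale for flow, count in Counter(sampled).items()}


def make_trace(sizes, seed=1):
    """Build a shuffled packet stream in which each packet is labeled by its flow."""
    packets = [flow for flow, n in sizes.items() for _ in range(n)]
    random.Random(seed).shuffle(packets)
    return packets


# Hypothetical flows: one large, one medium, one small.
true_sizes = {"A": 10000, "B": 1000, "C": 100}
trace = make_trace(true_sizes)

# Both schemes target a 1% sampling rate.
srs_est = estimate_sizes(simple_random_sampling(trace, rate=0.01), scale=100)
sys_est = estimate_sizes(per_flow_systematic_sampling(trace, interval=100), scale=100)

print("true      :", true_sizes)
print("SRS       :", srs_est)
print("systematic:", sys_est)
```

Because every flow size here is a multiple of the interval, the systematic estimate recovers each size exactly, while the SRS estimate for a small flow such as `C` depends on how many of its packets happen to be drawn and may miss the flow entirely.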

Original language: English
Title of host publication: INFOCOM 2020 - IEEE Conference on Computer Communications
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1339-1348
Number of pages: 10
ISBN (Electronic): 9781728164120
DOIs
State: Published - Jul 2020
Event: 38th IEEE Conference on Computer Communications, INFOCOM 2020 - Toronto, Canada
Duration: 6 Jul 2020 → 9 Jul 2020

Publication series

Name: Proceedings - IEEE INFOCOM
Volume: 2020-July
ISSN (Print): 0743-166X

Conference

Conference: 38th IEEE Conference on Computer Communications, INFOCOM 2020
Country/Territory: Canada
City: Toronto
Period: 6/07/20 → 9/07/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.
