Posted by: atri | May 9, 2008

April 28th talk

(Guest Post by Karthik Dwarakanath)

I presented the paper Streaming algorithms for Robust, real-time detection of DDoS attacks. I presented two data structures and algorithms for the efficient tracking of potentially malicious connections. The key element of this approach is a novel data-streaming algorithm for efficiently tracking, in guaranteed small space and time, destination IP addresses that are large with respect to the number of distinct source IP addresses that access them.

The main idea in the basic estimator is to employ the distinct-count sketch synopsis to build a (appropriately-sized) distinct sample of the observed active source – destination pairs. Once such a distinct sample is available, the top-k estimation process is fairly straightforward: the destinations with the k highest occurrence frequencies in the distinct sample are identified, and are returned with their frequencies appropriately scaled by the distinct-sampling rate. This algorithm can incur a high overhead for producing the top-k frequency estimates and corresponding destination addresses. The update time for this algorithm is O(r \log m) and the query time is O(r s \log^2{m}), where r and s are the number of “first level” and “second level” hash tables.

They provide a second algorithm the “Tracking-DCS-based top-k estimation algorithm” which offers guaranteed poly-logarithmic update and query times, while increasing the overall storage space by only a small constant factor over thebaseline distinct-count sketch synopsis. The update time for this algorithm is O(r\log^2{m}) and the query time is O(k \log m). Experimental results with synthetic data showed that the tracking version of the algorithm in the paper provides a very efficient way to track large distinct frequencies in small space and guaranteed small update/query time.


Leave a response

Your response:

Categories