Posted by: atri | April 12, 2008

April 4th presentation

(Guest post by Anh Le)

I presented the sliding window model which computes aggregates and statistics over data streams of the last N data elements seen so far. There are a wide range of applications that require statistics over the most recently observed data elements, such as intrusion detection in network monitoring, stock price prediction, weblog mining, user click stream for personalization etc. First, we considered a basic counting problem whose solution is a prerequisite for efficient maintenance of a variety of more complex statistical aggregates. We showed a solution for basic counting which use O(\frac{1}{\epsilon}log^2N) bits of memory to estimate the number of ones to within a factor of 1+\epsilon. Then, we showed the lower bound of \Omega(\frac{1}{\epsilon}log^2N) memory bits for any deterministic or randomized algorithms. We then extended the scheme to maintain the sum of the last N positive integers. Finally, we briefly discussed the scheme to adapt several other problems in the sliding window model. However, maintaining other statistics is more complex than those simple aggregates.


Leave a response

Your response:

Categories