(Guest post by Anh Le)
I presented the sliding window model which computes aggregates and statistics over data streams of the last data elements seen so far. There are a wide range of applications that require statistics over the most recently observed data elements, such as intrusion detection in network monitoring, stock price prediction, weblog mining, user click stream for personalization etc. First, we considered a basic counting problem whose solution is a prerequisite for efficient maintenance of a variety of more complex statistical aggregates. We showed a solution for basic counting which use
bits of memory to estimate the number of ones to within a factor of
. Then, we showed the lower bound of
memory bits for any deterministic or randomized algorithms. We then extended the scheme to maintain the sum of the last
positive integers. Finally, we briefly discussed the scheme to adapt several other problems in the sliding window model. However, maintaining other statistics is more complex than those simple aggregates.