(Guest post by Denis Mindolin)
In my presentation, I showed the results from the following paper: Stabbing the Sky: Efficient Skyline Computation over Sliding Windows, ICDE ‘05.
The paper deals with the problem of computing skylines over data streams. Given a data set of elements with
attributes
, a skyline of
is a set of all undominated elements of
. An element
dominates another element
if for all
,
, and for at least one
,
. This problem is a topic of active research for the past thirty years. However, the paper I presented is one of the first ones considering skylines in the data stream framework.
The stream model considered in the paper is append only. A stream element is a vector having components each of which is a number. It also assumed that there’s a sliding window of the most recent
elements, and the queries issued are
- Compute the skyline of the most recent
elements (
), and
- Compute the skyline of the elements between the most recent
and
(
).
To answer queries of the first type (-of-
), the authors propose to keep in main memory a set of nonredundant elements
. The size of
is
for some restricted streams. In order to avoid the quadratic size of the dominance graph over
, it is proposed to keep at most two dominance relationships per element in
which are enough to answer
-of-
queries. This makes the dominance graph size linear in the size of
. To answer skyline queries in
time, the authors propose to store the graph as a set of intervals (
space) which reduces the skyline problem to the already solved problem of answering stabbing queries.
The second query model ( -of-
) is handled in a similar way with an exception that here the entire set of the most recent
elements should be kept in main memory. As a result, the space requirement here is higher than in the first approach.
Posted in presentation