<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Data Stream Algorithms with applications in Networks and Databases</title>
	<atom:link href="http://datastreams.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://datastreams.wordpress.com</link>
	<description>CSE 725 &#38; 728 joint seminar @ CSE SUNY Buffalo</description>
	<lastBuildDate>Sat, 10 May 2008 15:20:54 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<image>
		<url>http://www.gravatar.com/blavatar/c93eca1054323344bc9127f110e7201a?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Data Stream Algorithms with applications in Networks and Databases</title>
		<link>http://datastreams.wordpress.com</link>
	</image>
			<item>
		<title>April 14th talk</title>
		<link>http://datastreams.wordpress.com/2008/05/10/april-14th-talk/</link>
		<comments>http://datastreams.wordpress.com/2008/05/10/april-14th-talk/#comments</comments>
		<pubDate>Sat, 10 May 2008 15:20:54 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=42</guid>
		<description><![CDATA[(Guest post by Omar Mukhtar)
I presented the paper Efficient Computation of Frequent and Top-k Elements in Data
Streams.
I presented the SpaceSaving data structure and algorithms for finding the Top-k and Frequent elements in an insert-only setting. The key element of the paper is a unified approach to solving the very common Top-k and Frequent element problem [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest post by <strong>Omar Mukhtar</strong>)</em></p>
<p>I presented the paper <a href="http://www.cs.ucsb.edu/~dsl/publications/2005/ICDT2005-metwally.pdf"><span><span style="text-decoration:underline;">Efficient Computation of Frequent and Top-k Elements in Data<br />
Streams</span></span></a>.</p>
<p>I presented the SpaceSaving data structure and algorithms for finding the Top-<img src='http://s1.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> and Frequent elements in an insert-only setting. The key element of the paper is a unified approach to solving the very common Top-<img src='http://s2.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> and Frequent element problems, with guarantees on the results. The main hurdle to a unified approach is that a Top-<img src='http://s3.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> query can&#8217;t preprocess a Frequent element query and vice versa, so the authors use a common data structure, SpaceSaving, to maintain stream statistics and then answer queries. Thus, only one pass is made over the data to answer both the Top-<img src='http://s1.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> and Frequent element queries.</p>
<p>The main idea is to have a set of counters which keep a frequency count for the individual elements they monitor. Each counter also keeps track of the over-estimation error in the element&#8217;s frequency. The SpaceSaving data structure has two parts. The first is a set of counters, each of which has a field representing the element and a field for the over-estimation error, <img src='http://s2.wordpress.com/latex.php?latex=%5Cepsilon&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\epsilon' title='\epsilon' class='latex' />. The second part is a sorted doubly linked list of buckets; each element is attached to the bucket that represents that element&#8217;s frequency. Once the pass over the data stream is made and the SpaceSaving data structure constructed, a pass over the counters is made to find the Top-<img src='http://s3.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> and the Frequent elements. The over-estimation field is used to guarantee the accuracy of the Top-<img src='http://s1.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> and Frequent element queries.</p>
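<p>The counter scheme described above can be sketched roughly as follows. This is a simplified illustration, not the paper&#8217;s implementation: it assumes a plain dictionary in place of the sorted bucket list (so finding the minimum counter is linear in the number of counters), and the class and method names are made up for this example.</p>

```python
class SpaceSaving:
    """Simplified SpaceSaving summary with at most `capacity` counters.

    Assumption: a plain dict replaces the sorted bucket list, so the
    minimum counter is found by a linear scan.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.count = {}   # element -> estimated frequency (never an underestimate)
        self.error = {}   # element -> maximum possible over-estimation
        self.n = 0        # number of stream elements seen so far

    def update(self, item):
        self.n += 1
        if item in self.count:
            self.count[item] += 1
        elif self.capacity > len(self.count):
            self.count[item] = 1
            self.error[item] = 0
        else:
            # Evict the element with the smallest count; the newcomer
            # inherits that count as its possible over-estimation.
            victim = min(self.count, key=self.count.get)
            floor = self.count.pop(victim)
            del self.error[victim]
            self.count[item] = floor + 1
            self.error[item] = floor

    def top_k(self, k):
        """Return the k monitored elements with the largest estimated counts."""
        ranked = sorted(self.count.items(), key=lambda kv: -kv[1])
        return ranked[:k]

    def frequent(self, phi):
        """Map each element whose estimate exceeds phi * n to a guarantee flag.

        The flag is True when even the lower bound (count - error)
        exceeds phi * n, so the element is certainly frequent.
        """
        threshold = phi * self.n
        return {
            e: (c - self.error[e]) > threshold
            for e, c in self.count.items()
            if c > threshold
        }
```

<p>Both query methods read the same counters, which is the point of the unified approach: one pass builds the summary, and the error field tells us which answers are guaranteed.</p>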
<p>The space bounds of the algorithm are lower than those of other algorithms that solve the Top-<img src='http://s2.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> and Frequent element queries separately. Experimental results with synthetic as well as Zipfian data showed that the algorithm in the paper provides a very efficient way to track frequencies in small space with guaranteed results.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/05/10/april-14th-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>April 28th talk</title>
		<link>http://datastreams.wordpress.com/2008/05/09/april-28th-talk/</link>
		<comments>http://datastreams.wordpress.com/2008/05/09/april-28th-talk/#comments</comments>
		<pubDate>Fri, 09 May 2008 02:39:51 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=41</guid>
		<description><![CDATA[(Guest Post by Karthik Dwarakanath)
I presented the paper Streaming algorithms for Robust, real-time detection of DDoS attacks. I presented two data structures and algorithms for the efficient tracking of potentially malicious connections. The key element of this approach is a novel data-streaming algorithm for efficiently tracking, in guaranteed small space and time, destination IP addresses [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest Post by </em><strong><em>Karthik Dwarakanath</em></strong><em>)</em></p>
<p>I presented the paper <a href="http://www.cs.berkeley.edu/~minos/Papers/icdcs07ddos-full.pdf"><span style="text-decoration:underline;">Streaming algorithms for Robust, real-time detection of DDoS attacks</span></a>. I presented two data structures and algorithms for the efficient tracking of potentially malicious connections. The key element of this approach is a novel data-streaming algorithm for efficiently tracking, in guaranteed small space and time, destination IP addresses that are large with respect to the number of distinct source IP addresses that access them.</p>
<p>The main idea in the basic estimator is to employ the distinct-count sketch synopsis to build an (appropriately sized) distinct sample of the observed active source-destination pairs. Once such a distinct sample is available, the top-<img src='http://s3.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> estimation process is fairly straightforward: the destinations with the <img src='http://s1.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> highest occurrence frequencies in the distinct sample are identified, and are returned with their frequencies appropriately scaled by the distinct-sampling rate. This algorithm can incur a high overhead for producing the top-<img src='http://s2.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> frequency estimates and corresponding destination addresses. The update time for this algorithm is <img src='http://s3.wordpress.com/latex.php?latex=O%28r+%5Clog+m%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(r \log m)' title='O(r \log m)' class='latex' /> and the query time is <img src='http://s1.wordpress.com/latex.php?latex=O%28r+s+%5Clog%5E2%7Bm%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(r s \log^2{m})' title='O(r s \log^2{m})' class='latex' />, where <img src='http://s2.wordpress.com/latex.php?latex=r&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='r' title='r' class='latex' /> and <img src='http://s3.wordpress.com/latex.php?latex=s&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='s' title='s' class='latex' /> are the numbers of &#8220;first level&#8221; and &#8220;second level&#8221; hash tables, respectively.</p>
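<p>To make the &#8220;sample, count, then scale&#8221; step concrete, here is a rough sketch. It assumes a plain hash-based distinct sample with a fixed sampling level instead of the paper&#8217;s distinct-count sketch, and the function names and the choice of SHA-256 are purely illustrative.</p>

```python
import hashlib
from collections import Counter


def level(pair):
    """Number of trailing zero bits in a fixed 64-bit hash of the pair."""
    digest = hashlib.sha256(repr(pair).encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    if value == 0:
        return 64
    bits = bin(value)
    return len(bits) - len(bits.rstrip("0"))


def distinct_sample(pairs, sample_level):
    """Keep each distinct pair with probability about 2 ** -sample_level.

    Duplicates hash to the same level, so repeating a pair in the
    stream never changes the sample.
    """
    return {p for p in pairs if level(p) >= sample_level}


def top_k_destinations(pairs, k, sample_level):
    """Estimate the k destinations contacted by the most distinct sources."""
    sample = distinct_sample(pairs, sample_level)
    per_destination = Counter(dst for _src, dst in sample)
    scale = 2 ** sample_level  # inverse of the distinct-sampling rate
    return [(dst, freq * scale) for dst, freq in per_destination.most_common(k)]
```

<p>With <code>sample_level</code> set to 0 every distinct pair is kept and the counts are exact; larger levels trade accuracy for a smaller sample.</p>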
<p>The authors provide a second algorithm, the &#8220;Tracking-DCS-based top-k estimation algorithm&#8221;, which offers guaranteed poly-logarithmic update and query times, while increasing the overall storage space by only a small constant factor over the baseline distinct-count sketch synopsis. The update time for this algorithm is <img src='http://s1.wordpress.com/latex.php?latex=O%28r%5Clog%5E2%7Bm%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(r\log^2{m})' title='O(r\log^2{m})' class='latex' /> and the query time is <img src='http://s2.wordpress.com/latex.php?latex=O%28k+%5Clog+m%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(k \log m)' title='O(k \log m)' class='latex' />. Experimental results with synthetic data showed that the tracking version of the algorithm in the paper provides a very efficient way to track large distinct frequencies in small space and guaranteed small update/query time.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/05/09/april-28th-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>April 25th talk</title>
		<link>http://datastreams.wordpress.com/2008/05/06/april-25th-talk/</link>
		<comments>http://datastreams.wordpress.com/2008/05/06/april-25th-talk/#comments</comments>
		<pubDate>Mon, 05 May 2008 20:34:51 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=40</guid>
		<description><![CDATA[(Guest post by Denis Mindolin)
In my presentation, I showed the results from the following paper:  Stabbing the Sky: Efficient Skyline Computation over Sliding Windows, ICDE &#8216;05.
The paper deals with the problem of computing skylines over data streams. Given a data set S of elements with d attributes (A_1,...A_d), a skyline of S is a [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest post by <strong>Denis Mindolin</strong>)</em></p>
<p>In my presentation, I showed the results from the following paper:  <a href="http://www.cse.unsw.edu.au/~lxue/xlin_skyline.ps"><span style="text-decoration:underline;">Stabbing the Sky: Efficient Skyline Computation over Sliding Windows</span></a>, <a href="http://www.informatik.uni-trier.de/~ley/db/conf/icde/icde2005.html">ICDE &#8216;05</a>.</p>
<p>The paper deals with the problem of computing skylines over data streams. Given a data set <img src='http://s3.wordpress.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S' title='S' class='latex' /> of elements with <img src='http://s1.wordpress.com/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' /> attributes <img src='http://s2.wordpress.com/latex.php?latex=%28A_1%2C...A_d%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(A_1,...A_d)' title='(A_1,...A_d)' class='latex' />, the skyline of <img src='http://s3.wordpress.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S' title='S' class='latex' /> is the set of all undominated elements of <img src='http://s1.wordpress.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S' title='S' class='latex' />. An element <img src='http://s2.wordpress.com/latex.php?latex=X&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='X' title='X' class='latex' /> dominates another element <img src='http://s3.wordpress.com/latex.php?latex=Y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Y' title='Y' class='latex' /> if for all <img src='http://s1.wordpress.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i' title='i' class='latex' />, <img src='http://s2.wordpress.com/latex.php?latex=X_%7BA_i%7D+%5Cleq+Y_%7BA_i%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='X_{A_i} \leq Y_{A_i}' title='X_{A_i} \leq Y_{A_i}' class='latex' />, and for at least one <img src='http://s3.wordpress.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i' title='i' class='latex' />, <img src='http://s1.wordpress.com/latex.php?latex=X_%7BA_i%7D+%3C+Y_%7BA_i%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='X_{A_i} &lt; Y_{A_i}' title='X_{A_i} &lt; Y_{A_i}' class='latex' />. This problem has been a topic of active research for the past thirty years. However, the paper I presented is one of the first to consider skylines in the data stream framework.</p>
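<p>The dominance test and the skyline it defines can be written down directly. The snippet below is only a quadratic-time baseline over a static list of tuples, with smaller values treated as better; the function names are chosen for this example and it is not the streaming algorithm from the paper.</p>

```python
def dominates(x, y):
    """True if x is no worse than y in every attribute and strictly better in one.

    Smaller values are better, matching the definition above.
    """
    no_worse = all(b >= a for a, b in zip(x, y))
    strictly_better = any(b > a for a, b in zip(x, y))
    return no_worse and strictly_better


def skyline(points):
    """Return the undominated points, in input order (quadratic-time baseline)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

<p>For example, among the points (1, 5), (2, 2), (3, 3) and (5, 1), only (3, 3) is dominated (by (2, 2)), so the other three form the skyline.</p>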
<p>The stream model considered in the paper is append only. A stream element is a vector having <img src='http://s2.wordpress.com/latex.php?latex=d&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d' title='d' class='latex' /> components, each of which is a number. It is also assumed that there&#8217;s a sliding window of the most recent <img src='http://s3.wordpress.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N' title='N' class='latex' /> elements, and the queries issued are</p>
<ol>
<li>Compute the skyline of the most recent <img src='http://s1.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> elements (<img src='http://s2.wordpress.com/latex.php?latex=n+%5Cleq+N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n \leq N' title='n \leq N' class='latex' />), and</li>
<li>Compute the skyline of the elements between the most recent <img src='http://s3.wordpress.com/latex.php?latex=n_1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n_1' title='n_1' class='latex' /> and <img src='http://s1.wordpress.com/latex.php?latex=n_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n_2' title='n_2' class='latex' /> (<img src='http://s2.wordpress.com/latex.php?latex=n_1+%5Cleq+n_2+%5Cleq+N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n_1 \leq n_2 \leq N' title='n_1 \leq n_2 \leq N' class='latex' />).</li>
</ol>
<p>To answer queries of the first type (<img src='http://s3.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />-of-<img src='http://s1.wordpress.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N' title='N' class='latex' />), the authors propose to keep in main memory a set of nonredundant elements <img src='http://s2.wordpress.com/latex.php?latex=R_N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_N' title='R_N' class='latex' />. The size of <img src='http://s3.wordpress.com/latex.php?latex=R_N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_N' title='R_N' class='latex' /> is <img src='http://s1.wordpress.com/latex.php?latex=O%28log%5Ed+N%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(log^d N)' title='O(log^d N)' class='latex' /> for some restricted streams. In order to avoid the quadratic size of the dominance graph over <img src='http://s2.wordpress.com/latex.php?latex=R_N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_N' title='R_N' class='latex' />, it is proposed to keep at most two dominance  relationships per element in <img src='http://s3.wordpress.com/latex.php?latex=R_N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_N' title='R_N' class='latex' /> which are enough to answer <img src='http://s1.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />-of-<img src='http://s2.wordpress.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N' title='N' class='latex' /> queries. This makes the dominance graph size linear in the size of <img src='http://s3.wordpress.com/latex.php?latex=R_N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_N' title='R_N' class='latex' />. 
To answer skyline queries in <img src='http://s1.wordpress.com/latex.php?latex=O%28log+%7CR_N%7C%29+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(log |R_N|) ' title='O(log |R_N|) ' class='latex' /> time, the authors propose to store the graph as a set of intervals (<img src='http://s2.wordpress.com/latex.php?latex=O%28%7CR_N%7C%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(|R_N|)' title='O(|R_N|)' class='latex' /> space) which reduces the skyline problem to  the already solved problem of answering stabbing queries.</p>
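<p>A naive way to see why keeping only non-redundant elements is enough for the first query type: an element that is dominated by a younger element will expire first, so it can never appear in a skyline again. The sketch below assumes exactly that pruning rule and answers a query by a direct scan of the kept elements; it does not implement the interval encoding or the stabbing-query structure, and the class name is hypothetical.</p>

```python
from collections import deque


def dominates(x, y):
    """True if x is no worse than y everywhere and strictly better somewhere."""
    no_worse = all(b >= a for a, b in zip(x, y))
    strictly_better = any(b > a for a, b in zip(x, y))
    return no_worse and strictly_better


class NofNSkyline:
    """Keeps only the non-redundant elements among the most recent N."""

    def __init__(self, window):
        self.window = window
        self.seen = 0
        self.kept = deque()  # (arrival index, point), oldest first

    def append(self, point):
        self.seen += 1
        oldest_alive = self.seen - self.window + 1
        # Drop expired elements and elements dominated by the newcomer:
        # they leave the window before the newcomer does.
        self.kept = deque(
            (i, p) for i, p in self.kept
            if i >= oldest_alive and not dominates(point, p)
        )
        self.kept.append((self.seen, point))

    def query(self, n):
        """Skyline of the most recent n elements, for n no larger than N."""
        first = self.seen - n + 1
        recent = [p for i, p in self.kept if i >= first]
        return [p for p in recent if not any(dominates(q, p) for q in recent)]
```

<p>The query here costs time quadratic in the number of kept elements, which is the cost the interval representation in the paper is designed to avoid.</p>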
<p>The second query model (<img src='http://s3.wordpress.com/latex.php?latex=%28n_1%2Cn_2%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n_1,n_2)' title='(n_1,n_2)' class='latex' />-of-<img src='http://s1.wordpress.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N' title='N' class='latex' />) is handled in a similar way, with the exception that the entire set of the most recent <img src='http://s2.wordpress.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N' title='N' class='latex' /> elements must be kept in main memory. As a result, the space requirement is higher than in the first approach.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/05/06/april-25th-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>Reminders</title>
		<link>http://datastreams.wordpress.com/2008/05/05/reminders/</link>
		<comments>http://datastreams.wordpress.com/2008/05/05/reminders/#comments</comments>
		<pubDate>Mon, 05 May 2008 14:48:39 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[announcement]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=39</guid>
		<description><![CDATA[
The report is due today by midnight.
If you haven&#8217;t submitted the blog entry for your presentation, please do so by tomorrow.

]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><ul>
<li>The <a href="http://datastreams.wordpress.com/2008/04/12/report/"><span style="text-decoration:underline;">report</span></a> is due <strong>today</strong> by midnight.</li>
<li>If you haven&#8217;t submitted the blog entry for your presentation, please do so by tomorrow.</li>
</ul>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/05/05/reminders/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>Lunch</title>
		<link>http://datastreams.wordpress.com/2008/04/28/lunch/</link>
		<comments>http://datastreams.wordpress.com/2008/04/28/lunch/#comments</comments>
		<pubDate>Mon, 28 Apr 2008 14:26:55 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[announcement]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=38</guid>
		<description><![CDATA[I would like to take you guys out for lunch this Wednesday or Thursday at noon. Use the comments section to let us know if you can make it (and if you prefer one day over the other).
Update (04/29) Based upon the comments, Thursday looks like the better option. So let&#8217;s meet at noon on [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I would like to take you guys out for lunch this Wednesday or Thursday at noon. Use the comments section to let us know if you can make it (and if you prefer one day over the other).</p>
<p><strong>Update (04/29)</strong> Based upon the comments, Thursday looks like the better option. So let&#8217;s meet at noon on Thursday, May 1 in my office (Bell 123).</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/04/28/lunch/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>April 18th talk</title>
		<link>http://datastreams.wordpress.com/2008/04/25/april-18th-talk/</link>
		<comments>http://datastreams.wordpress.com/2008/04/25/april-18th-talk/#comments</comments>
		<pubDate>Thu, 24 Apr 2008 19:13:19 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=37</guid>
		<description><![CDATA[(Guest post by Yang Wang)
In my presentation, I showed results from the following three papers.

On the streaming model augmented with a sorting primitive
Trading off space for passes in graph streaming problems
Adapting Parallel Algorithms to the W-Stream Model, with Applications to Graph Problems

The three papers were developed in sequence. In the first paper, the authors [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest post by <strong>Yang Wang</strong>)</em></p>
<p>In my presentation, I showed results from the following three papers.</p>
<ol>
<li><a href="http://www.cse.buffalo.edu/%7Eatri/courses/data-stream/data-stream-bib.html#sort"><span style="text-decoration:underline;">On the streaming model augmented with a sorting primitive</span></a></li>
<li><a href="http://portal.acm.org/citation.cfm?id=1109635"><span style="text-decoration:underline;">Trading off space for passes in graph streaming problems</span></a></li>
<li><a href="http://www.springerlink.com/content/ewjw4w187x041334/"><span style="text-decoration:underline;">Adapting Parallel Algorithms to the W-Stream Model, with Applications to Graph Problems</span></a></li>
</ol>
<p>The three papers were developed in sequence. In the first paper, the authors proposed using multiple passes with the ability to write onto an intermediate stream and also to sort that stream (for some difficult problems, such as graph problems, the data stream model is too weak). This increases the power of the &#8220;basic&#8221; data stream model. Define the <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7Bpolylog+StrSort%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{polylog StrSort}' title='\mathrm{polylog StrSort}' class='latex' /> class to be the set of problems that can be solved in polylog passes and polylog (internal memory) space. They showed that graph connectivity, max spanning tree and many other problems are in this class. However, there do exist problems that are not in the polylog class. In particular, they showed an example that needs at least <img src='http://s2.wordpress.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' /> passes and <img src='http://s3.wordpress.com/latex.php?latex=m&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='m' title='m' class='latex' /> space with <img src='http://s1.wordpress.com/latex.php?latex=pm+%3D+n%5E%7B1%2F3%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='pm = n^{1/3}' title='pm = n^{1/3}' class='latex' />, where <img src='http://s2.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> is the input length.</p>
<p>In the second paper, the authors consider the case when we can write to intermediate streams but cannot sort them. This new model is called the <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathrm%7BW%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{W}' title='\mathrm{W}' class='latex' />-stream model. They use ideas from parallel algorithms that solve the transitive closure problem to attack the connected components and single-source shortest paths problems. Later, they extended the idea to more general cases in the third paper. They found the following relationship between parallel algorithms and the <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7BW%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{W}' title='\mathrm{W}' class='latex' />-stream model: every <img src='http://s2.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />-processor, <img src='http://s3.wordpress.com/latex.php?latex=m&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='m' title='m' class='latex' />-memory, <img src='http://s1.wordpress.com/latex.php?latex=T&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T' title='T' class='latex' />-time PRAM algorithm can be simulated on the <img src='http://s2.wordpress.com/latex.php?latex=%5Cmathrm%7BW%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{W}' title='\mathrm{W}' class='latex' />-stream in <img src='http://s3.wordpress.com/latex.php?latex=%5Cfrac%7BTn%5Clog%7Bm%7D%7D%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{Tn\log{m}}{s}' title='\frac{Tn\log{m}}{s}' class='latex' /> passes with space <img src='http://s1.wordpress.com/latex.php?latex=s&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='s' title='s' class='latex' />.
In the next step, they claimed that although the <img src='http://s2.wordpress.com/latex.php?latex=%5Cmathrm%7BW%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{W}' title='\mathrm{W}' class='latex' />-stream model can be used to simulate PRAM, it is still more powerful than PRAM, in that in the <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathrm%7BW%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{W}' title='\mathrm{W}' class='latex' />-stream model every memory cell in the entire stream can be accessed, while in PRAM only a constant number of cells can be accessed. Thus the strategy is to change a PRAM algorithm into an RPRAM algorithm and then into a <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7BW%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{W}' title='\mathrm{W}' class='latex' />-stream algorithm. In this way, the authors solve many problems, such as maximum independent set and biconnected components.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/04/25/april-18th-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>April 21st talk</title>
		<link>http://datastreams.wordpress.com/2008/04/22/april-21st-talk/</link>
		<comments>http://datastreams.wordpress.com/2008/04/22/april-21st-talk/#comments</comments>
		<pubDate>Tue, 22 Apr 2008 18:21:48 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=36</guid>
		<description><![CDATA[(Guest post by Xi Zhang)
I presented [JKV07] on how to compute aggregates aggr, where aggr ∈ {SUM, COUNT, AVG, MIN, MAX}, over a probabilistic data stream. Only these five aggregates are considered, as they are the main aggregates of interest in databases.
First, I introduced the probabilistic data stream model, where in contrast to the classical (deterministic) stream model,  we have pdfs instead [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest post by <strong>Xi Zhang</strong>)</em></p>
<p>I presented <a href="http://www.cse.buffalo.edu/%7Eatri/courses/data-stream/data-stream-bib.html#prob-ds">[JKV07]</a> on how to compute aggregates <img src='http://s3.wordpress.com/latex.php?latex=aggr&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='aggr' title='aggr' class='latex' />, where <img src='http://s1.wordpress.com/latex.php?latex=aggr%5Cin%5C%7B+%5Cmathrm%7BSUM%2C+COUNT%2C+AVG%2C+MIN%2C+MAX%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='aggr\in\{ \mathrm{SUM, COUNT, AVG, MIN, MAX}\}' title='aggr\in\{ \mathrm{SUM, COUNT, AVG, MIN, MAX}\}' class='latex' />, over a probabilistic data stream. Only these five aggregates are considered, as they are the main aggregates of interest in databases.</p>
<p>First, I introduced the probabilistic data stream model, where in contrast to the classical (deterministic) stream model,  we have pdfs instead of elements from some domain. In a probabilistic data stream <img src='http://s2.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D%3D%5Cvartheta_1%2C+%5Cvartheta_2%2C+%5Cldots%2C+%5Cvartheta_n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}=\vartheta_1, \vartheta_2, \ldots, \vartheta_n' title='\mathcal{P}=\vartheta_1, \vartheta_2, \ldots, \vartheta_n' class='latex' />, each pdf <img src='http://s3.wordpress.com/latex.php?latex=%5Cvartheta_i&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\vartheta_i' title='\vartheta_i' class='latex' /> is over the base domain <img src='http://s1.wordpress.com/latex.php?latex=%5C%7B1%2C%5Cldots%2CR%5C%7D+%5Ccup+%5C%7B+%5Cbot+%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\{1,\ldots,R\} \cup \{ \bot \}' title='\{1,\ldots,R\} \cup \{ \bot \}' class='latex' />, where the special symbol <img src='http://s2.wordpress.com/latex.php?latex=%5Cbot&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\bot' title='\bot' class='latex' /> indicates no element is produced. Any deterministic stream which is a possible outcome of <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}' title='\mathcal{P}' class='latex' /> is a possible stream of <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}' title='\mathcal{P}' class='latex' />. 
The semantics of <img src='http://s2.wordpress.com/latex.php?latex=aggr&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='aggr' title='aggr' class='latex' /> over <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}' title='\mathcal{P}' class='latex' /> is the expected value of <img src='http://s1.wordpress.com/latex.php?latex=aggr&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='aggr' title='aggr' class='latex' /> over all the possible streams of <img src='http://s2.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}' title='\mathcal{P}' class='latex' />.</p>
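<p>To make the possible-streams semantics concrete, here is a minimal brute-force sketch in Python. It is only an illustration, not an algorithm from the paper: it enumerates every possible stream, so it takes exponential time. The representation of a pdf as a dictionary, the name <code>BOT</code> for the special symbol, and the example pdfs are all assumptions made for this sketch.</p>

```python
from itertools import product

BOT = None  # assumed stand-in for the special symbol "no element is produced"

def possible_streams(pdfs):
    """Yield (stream, probability) for every possible stream of the pdf sequence."""
    supports = [list(pdf.items()) for pdf in pdfs]
    for choice in product(*supports):
        prob = 1.0
        stream = []
        for value, p in choice:
            prob *= p
            if value is not BOT:
                stream.append(value)
        yield stream, prob

def expected(aggr, pdfs):
    """Expected value of aggr over all possible streams (exponential time)."""
    return sum(prob * aggr(stream) for stream, prob in possible_streams(pdfs))

# Hypothetical example: each pdf maps a value (or BOT) to its probability.
pdfs = [{1: 0.5, 3: 0.5}, {2: 0.25, BOT: 0.75}]
expected_sum = expected(sum, pdfs)    # expected SUM
expected_count = expected(len, pdfs)  # expected COUNT
```

<p>The point of the streaming algorithms discussed next is to obtain the same expectations without enumerating the possible streams.</p>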
<p>Then, I went on to talk about how to evaluate those five aggregates over a probabilistic stream one by one.</p>
<p>For <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathrm%7BSUM%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{SUM}' title='\mathrm{SUM}' class='latex' /> and <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7BCOUNT%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{COUNT}' title='\mathrm{COUNT}' class='latex' />, <a href="http://almaden.ibm.com/cs/people/jayram/papers/vldb05.pdf">[BDJ05]</a> gives exact algorithms that compute these two aggregates in one pass with constant space and update time.</p>
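<p>One way to see why constant space suffices for these two aggregates is linearity of expectation: the expected SUM is the sum of the per-pdf expected contributions, and likewise for COUNT. The sketch below illustrates that idea under the same assumed dictionary representation of pdfs; it is not taken from [BDJ05], and the function name is hypothetical.</p>

```python
BOT = None  # assumed stand-in for the special symbol "no element is produced"

def one_pass_sum_count(pdf_stream):
    """Return (E[SUM], E[COUNT]) using two counters and one pass over the pdfs."""
    exp_sum = 0.0
    exp_count = 0.0
    for pdf in pdf_stream:
        for value, p in pdf.items():
            if value is not BOT:
                exp_sum += value * p  # expected contribution of this pdf to SUM
                exp_count += p        # probability that this pdf produces an element
    return exp_sum, exp_count

# Hypothetical example stream of two pdfs.
result = one_pass_sum_count([{1: 0.5, 3: 0.5}, {2: 0.25, BOT: 0.75}])
```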
<p><img src='http://s2.wordpress.com/latex.php?latex=%5Cmathrm%7BAVG%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{AVG}' title='\mathrm{AVG}' class='latex' /> is an interesting case. The paper presents algorithms from the generating function point of view.<br />
It defines a generating function <img src='http://s3.wordpress.com/latex.php?latex=h_%7B%5Ctextnormal%7BAVG%7D%7D%28x%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h_{\textnormal{AVG}}(x)' title='h_{\textnormal{AVG}}(x)' class='latex' /> such that <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7BAVG%7D%3D%5Cint%5E1_%7B0%7Dh_%7B%5Cmathrm%7BAVG%7D%7D%28x%29dx&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{AVG}=\int^1_{0}h_{\mathrm{AVG}}(x)dx' title='\mathrm{AVG}=\int^1_{0}h_{\mathrm{AVG}}(x)dx' class='latex' />. <img src='http://s2.wordpress.com/latex.php?latex=h_%7B%5Cmathrm%7BAVG%7D%7D%28x%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h_{\mathrm{AVG}}(x)' title='h_{\mathrm{AVG}}(x)' class='latex' /> is in the form of a high-degree polynomial. The paper exhibits a data stream algorithm that estimates this definite integral to within a relative factor of <img src='http://s3.wordpress.com/latex.php?latex=%281%2B%5Cvarepsilon%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(1+\varepsilon)' title='(1+\varepsilon)' class='latex' /> in <img src='http://s1.wordpress.com/latex.php?latex=O%28%5Clog+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\log n)' title='O(\log n)' class='latex' /> passes over the data, with <img src='http://s2.wordpress.com/latex.php?latex=O%28%5Cfrac%7B1%7D%7B%5Cvarepsilon%7D%5Clog%5E2+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\frac{1}{\varepsilon}\log^2 n)' title='O(\frac{1}{\varepsilon}\log^2 n)' class='latex' /> space and <img src='http://s3.wordpress.com/latex.php?latex=O%28%5Cfrac%7B1%7D%7B%5Cvarepsilon%7D%5Clog+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\frac{1}{\varepsilon}\log n)' title='O(\frac{1}{\varepsilon}\log n)' class='latex' /> update time per pdf. The approximation error comes from approximating the definite integral numerically, for which a rectangle approximation is used.</p>
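<p>The following sketch illustrates the generating function idea on a small example. It rests on assumptions that may differ from the paper: it uses the identity 1/k = &int; x^(k-1) dx over [0, 1] to derive one natural choice of polynomial, h(x) = sum over i of a_i times the product over j != i of (1 - p_j + p_j x), where a_i is the expected value contributed by pdf i and p_j is the probability that pdf j produces an element; it takes AVG of an empty stream to be 0; and it evaluates h offline with full access to the pdfs, using a plain midpoint rectangle rule rather than the paper's multi-pass algorithm. All function names are hypothetical.</p>

```python
from itertools import product

BOT = None  # assumed stand-in for the special symbol "no element is produced"

def h_avg(pdfs, x):
    """Evaluate h(x) = sum_i a_i * prod_{j != i} (1 - p_j + p_j * x)."""
    total = 0.0
    for i, pdf_i in enumerate(pdfs):
        # a_i: expected value contributed by pdf i (BOT contributes nothing).
        term = sum(v * p for v, p in pdf_i.items() if v is not BOT)
        for j, pdf_j in enumerate(pdfs):
            if j != i:
                # p_j: probability that pdf j produces an element.
                p_j = sum(p for v, p in pdf_j.items() if v is not BOT)
                term *= 1.0 - p_j + p_j * x
        total += term
    return total

def avg_by_rectangles(pdfs, steps=1000):
    """Midpoint rectangle rule for the integral of h over [0, 1]."""
    width = 1.0 / steps
    return width * sum(h_avg(pdfs, (k + 0.5) * width) for k in range(steps))

def avg_brute_force(pdfs):
    """Expected AVG by enumerating all possible streams (empty stream counts as 0)."""
    total = 0.0
    for choice in product(*(pdf.items() for pdf in pdfs)):
        prob = 1.0
        values = []
        for v, p in choice:
            prob *= p
            if v is not BOT:
                values.append(v)
        if values:
            total += prob * sum(values) / len(values)
    return total

# Hypothetical example with three pdfs.
pdfs = [{1: 0.5, 3: 0.5}, {2: 0.25, BOT: 0.75}, {4: 0.5, BOT: 0.5}]
```

<p>On small inputs, <code>avg_by_rectangles(pdfs)</code> and <code>avg_brute_force(pdfs)</code> should agree up to the error of the rectangle rule.</p>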
<p>The paper proves a lower bound of <img src='http://s2.wordpress.com/latex.php?latex=%5COmega%28n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\Omega(n)' title='\Omega(n)' class='latex' /> for any one-pass exact algorithm for computing <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7BAVG%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{AVG}' title='\mathrm{AVG}' class='latex' />. Complementing this result, the paper also presents an exact algorithm for computing <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathrm%7BAVG%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{AVG}' title='\mathrm{AVG}' class='latex' />, which runs in <img src='http://s1.wordpress.com/latex.php?latex=O%28n%5Clog%5E2+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(n\log^2 n)' title='O(n\log^2 n)' class='latex' /> time and <img src='http://s2.wordpress.com/latex.php?latex=O%28n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(n)' title='O(n)' class='latex' /> space. This still improves on the previously best-known exact algorithm, which uses the same amount of space but runs in <img src='http://s3.wordpress.com/latex.php?latex=O%28n%5E3%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(n^3)' title='O(n^3)' class='latex' /> time.</p>
<p>For <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathrm%7BMIN%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{MIN}' title='\mathrm{MIN}' class='latex' /> and <img src='http://s2.wordpress.com/latex.php?latex=%5Cmathrm%7BMAX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathrm{MAX}' title='\mathrm{MAX}' class='latex' />, the paper presents a one-pass data stream algorithm with relative accuracy <img src='http://s3.wordpress.com/latex.php?latex=%281%2B%5Cvarepsilon%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(1+\varepsilon)' title='(1+\varepsilon)' class='latex' />, using <img src='http://s1.wordpress.com/latex.php?latex=O%28%5Cfrac%7B1%7D%7B%5Cvarepsilon%7D%5Clg+R%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\frac{1}{\varepsilon}\lg R)' title='O(\frac{1}{\varepsilon}\lg R)' class='latex' /> space and constant update time per pdf (in fact, the update time is <img src='http://s2.wordpress.com/latex.php?latex=O%28%5Cell%5Clg+%5Cell%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\ell\lg \ell)' title='O(\ell\lg \ell)' class='latex' />, where <img src='http://s3.wordpress.com/latex.php?latex=%5Cell&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell' title='\ell' class='latex' /> is the maximum support size over all pdfs; whether this is really a &#8220;constant&#8221; is arguable). The approximation technique used here is &#8220;binning&#8221;.</p>
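<p>As a rough illustration of how binning can give such a guarantee for MIN, the sketch below rounds every value down to the nearest power of (1 + eps) and keeps, for each bin boundary, the probability that every element produced so far is at least that boundary. This is a guess at the flavour of the technique, not the paper's algorithm: the update here is not constant time, an empty stream is taken to contribute 0 to the expectation, and the function name is hypothetical. MAX can be handled symmetrically.</p>

```python
import math

BOT = None  # assumed stand-in for the special symbol "no element is produced"

def approx_expected_min(pdf_stream, R, eps):
    """One-pass lower estimate of E[MIN] that is within a factor (1 + eps)."""
    # Bin boundaries (1 + eps)^k covering [1, R]; one spare bin guards
    # against floating-point rounding in the logarithm.
    num_bins = int(math.log(R) / math.log(1.0 + eps)) + 2
    bounds = [(1.0 + eps) ** k for k in range(num_bins)]
    # at_least[k] = Pr[every element produced so far is >= bounds[k]]
    at_least = [1.0] * num_bins
    all_bot = 1.0  # Pr[no element has been produced so far]
    for pdf in pdf_stream:
        p_bot = pdf.get(BOT, 0.0)
        for k, b in enumerate(bounds):
            tail = sum(p for v, p in pdf.items() if v is not BOT and v >= b)
            at_least[k] *= p_bot + tail
        all_bot *= p_bot
    # Pr[rounded MIN = bounds[k]] is the difference of consecutive entries.
    estimate = 0.0
    for k, b in enumerate(bounds):
        nxt = at_least[k + 1] if k + 1 < num_bins else all_bot
        estimate += b * (at_least[k] - nxt)
    return estimate

# Hypothetical example: the exact expected MIN of this stream is 1.875.
pdfs = [{1: 0.5, 3: 0.5}, {2: 0.25, BOT: 0.75}]
coarse = approx_expected_min(pdfs, R=4, eps=1.0)
fine = approx_expected_min(pdfs, R=4, eps=0.01)
```

<p>The state consists of one number per bin, which matches the flavour of the space bound quoted above.</p>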
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/04/22/april-21st-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>April 11th talk</title>
		<link>http://datastreams.wordpress.com/2008/04/14/april-11th-talk/</link>
		<comments>http://datastreams.wordpress.com/2008/04/14/april-11th-talk/#comments</comments>
		<pubDate>Mon, 14 Apr 2008 17:36:00 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=35</guid>
		<description><![CDATA[(Guest post by Steve Uurtamo)
I presented a few algorithms for the detection of superspreaders, where superspreaders are defined to be sources in a networking data stream that contact many unique destinations.  The paper was  New algorithms for fast detection of superspreaders, by S. Venkataraman, D. Song, P. Gibbons, and A. Blum.
The basic algorithm [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest post by <strong>Steve Uurtamo</strong>)</em></p>
<p>I presented a few algorithms for the detection of superspreaders, where superspreaders are defined to be sources in a networking data stream that contact many unique destinations.  The paper was  <a href="http://www.cs.cmu.edu/~shobha/research/superspreaders.ps"><span style="text-decoration:underline;">New algorithms for fast detection of superspreaders</span></a>, by <a href="http://www.cs.cmu.edu/~shobha">S. Venkataraman</a>, <a href="http://www.cs.berkeley.edu/~dawnsong/">D. Song</a>, <a href="http://www.pittsburgh.intel-research.net/people/gibbons/">P. Gibbons</a>, and <a href="http://www.cs.cmu.edu/~avrim/">A. Blum</a>.</p>
<p>The basic algorithm is to store a uniformly chosen fraction of the <img src='http://s3.wordpress.com/latex.php?latex=%28s%2Cd%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(s,d)' title='(s,d)' class='latex' /> pairs, using a hash function to make sure that we don&#8217;t count any <img src='http://s1.wordpress.com/latex.php?latex=%28s%2Cd%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(s,d)' title='(s,d)' class='latex' /> pair more than once.  Assuming uniformity of the hash function, if we want to detect a <img src='http://s2.wordpress.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' />-superspreader, it should be sufficient to hold onto <img src='http://s3.wordpress.com/latex.php?latex=O%28%5Cfrac%7Bn%5Clog%28%5Cfrac%7B1%7D%7B%5Cdelta%7D%29%7D%7Bk%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\frac{n\log(\frac{1}{\delta})}{k})' title='O(\frac{n\log(\frac{1}{\delta})}{k})' class='latex' /> of the resulting datastream of <img src='http://s1.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> elements. If superspreaders are as rare and as chatty as expected, the storage will be modest.</p>
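<p>A minimal sketch of this hash-based sampling idea follows. The hash construction, the function names, and the <code>sample_rate</code> and <code>threshold</code> parameters are assumptions made for illustration; the paper specifies how to choose the rate and threshold from k and the desired failure probability, which is not reproduced here. Because the decision to keep a pair depends only on the hash of the pair, a repeated pair is either always kept or always dropped, so it is never counted twice.</p>

```python
import hashlib
from collections import defaultdict

def pair_hash(src, dst):
    """Map an (s, d) pair deterministically to a number in [0, 1)."""
    digest = hashlib.sha256(f"{src}|{dst}".encode("utf-8")).digest()
    # Keep the top 53 bits so the result is an exact float strictly below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2 ** 53

def find_superspreaders(stream, sample_rate, threshold):
    """Report sources with at least `threshold` distinct sampled destinations."""
    sampled = defaultdict(set)
    for src, dst in stream:
        if pair_hash(src, dst) < sample_rate:
            sampled[src].add(dst)  # duplicates of a sampled pair collapse here
    return {src for src, dsts in sampled.items() if len(dsts) >= threshold}

# Hypothetical stream: "a" contacts 100 destinations, "b" only 2, with repeats.
stream = [("a", d) for d in range(100)] * 3 + [("b", d) for d in range(2)] * 50
found = find_superspreaders(stream, sample_rate=1.0, threshold=50)
```

<p>With a smaller sample rate the threshold would be scaled down accordingly, and only the sampled pairs need to be stored.</p>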
<p>Modifications of this algorithm were presented for the sliding window model, where we are continuously monitoring a datastream over a fixed length window, and for parallel monitoring, where we need to combine the information in several different datastreams to detect superspreaders.</p>
<p>Experimental results with synthetic data showed that the algorithms in the paper were much more space-efficient than simply combining existing approximate counting and distinct counting stream algorithms.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/04/14/april-11th-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>Report</title>
		<link>http://datastreams.wordpress.com/2008/04/12/report/</link>
		<comments>http://datastreams.wordpress.com/2008/04/12/report/#comments</comments>
		<pubDate>Fri, 11 Apr 2008 20:16:48 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[annoucement]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=34</guid>
		<description><![CDATA[A 2-3 page report is due on May 5 (midnight). When you are done, email it to both Hung and me. We basically want you to summarize your thoughts on some problem related to the seminar that you thought about. If you have any questions and/or comments, please use the comments section of this post.
Update [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>A 2-3 page report is due on <strong>May 5</strong> (midnight). When you are done, email it to both Hung and me. We basically want you to summarize your thoughts on some problem related to the seminar that you thought about. If you have any questions and/or comments, please use the comments section of this post.</p>
<p><strong>Update (4/14)</strong> When you suggest an open problem, please also summarize what is already known about the problem.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/04/12/report/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
		<item>
		<title>April 7th presentation</title>
		<link>http://datastreams.wordpress.com/2008/04/12/april-7th-presentation/</link>
		<comments>http://datastreams.wordpress.com/2008/04/12/april-7th-presentation/#comments</comments>
		<pubDate>Fri, 11 Apr 2008 20:12:14 +0000</pubDate>
		<dc:creator>atri</dc:creator>
				<category><![CDATA[presentation]]></category>

		<guid isPermaLink="false">http://datastreams.wordpress.com/?p=33</guid>
		<description><![CDATA[(Guest post by Demian Lessa)
I presented the main results and algorithms of the SODA08 paper &#8220;Declaring Independence via the Sketching of Sketches,&#8221; by Indyk and McGregor.
The authors consider the problem of computing the correlation, that is, the degree of independence, of data streams. In particular, if (i,j) ∈ [n]^2 are pairs appearing in a data stream, the [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>(Guest post by <strong>Demian Lessa</strong>)</em></p>
<p>I presented the main results and algorithms of the <a href="http://www.siam.org/meetings/da08/">SODA08</a> paper &#8220;<a href="http://www.cse.buffalo.edu/%7Eatri/courses/data-stream/data-stream-bib.html#andrew"><span style="text-decoration:underline;">Declaring Independence via the Sketching of Sketches</span></a>,&#8221; by <a href="http://people.csail.mit.edu/indyk/">Indyk</a> and <a href="http://talk.ucsd.edu/andrewm/">McGregor</a>.</p>
<p>The authors consider the problem of computing the correlation, that is, the degree of independence, of data streams. In particular, if <img src='http://s1.wordpress.com/latex.php?latex=%28i%2Cj%29+%5Cin+%5Bn%5D%5E2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(i,j) \in [n]^2' title='(i,j) \in [n]^2' class='latex' /> are pairs appearing in a data stream, the frequencies of such pairs define a joint distribution <img src='http://s2.wordpress.com/latex.php?latex=%28X%2CY%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(X,Y)' title='(X,Y)' class='latex' />, and the goal is to compute the correlation between <img src='http://s3.wordpress.com/latex.php?latex=%28X%2CY%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(X,Y)' title='(X,Y)' class='latex' /> and the product of the marginals. In the centralized model, the coordinates <img src='http://s1.wordpress.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i' title='i' class='latex' /> and <img src='http://s2.wordpress.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='j' title='j' class='latex' /> appear together in the stream, while in the distributed model each coordinate may appear separately. Three measures of closeness are used to approximate the correlation: the <img src='http://s3.wordpress.com/latex.php?latex=%5Cell_1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_1' title='\ell_1' class='latex' /> norm, the <img src='http://s1.wordpress.com/latex.php?latex=%5Cell_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_2' title='\ell_2' class='latex' /> norm, and the mutual information between <img src='http://s2.wordpress.com/latex.php?latex=X&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='X' title='X' class='latex' /> and <img src='http://s3.wordpress.com/latex.php?latex=Y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Y' title='Y' class='latex' />. All positive results in the paper are obtained in the centralized model.</p>
<p>In order to obtain a <img src='http://s1.wordpress.com/latex.php?latex=%5Cwidetilde%7BO%7D%28%5Cepsilon%5E%7B-2%7D%5Clog%5Cdelta%5E%7B-1%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\widetilde{O}(\epsilon^{-2}\log\delta^{-1})' title='\widetilde{O}(\epsilon^{-2}\log\delta^{-1})' class='latex' />-space, <img src='http://s2.wordpress.com/latex.php?latex=%281%2B%5Cepsilon%2C+%5Cdelta%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(1+\epsilon, \delta)' title='(1+\epsilon, \delta)' class='latex' />-approximation for the <img src='http://s3.wordpress.com/latex.php?latex=%5Cell_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_2' title='\ell_2' class='latex' /> distance between the joint distribution and the product of the marginals, the authors build on the techniques in <a href="http://www.cse.buffalo.edu/%7Eatri/courses/data-stream/data-stream-bib.html#AMS">[AMS]</a> for the computation of <img src='http://s1.wordpress.com/latex.php?latex=F_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F_2' title='F_2' class='latex' />. 
In particular, they utilize two <img src='http://s2.wordpress.com/latex.php?latex=4&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='4' title='4' class='latex' />-wise independent vectors <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathbf%7Bx%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{x}' title='\mathbf{x}' class='latex' /> and <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathbf%7By%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{y}' title='\mathbf{y}' class='latex' /> (of size <img src='http://s2.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />), constructed using the parity check matrix of BCH codes, to compute vector <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathbf%7Bz%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{z}' title='\mathbf{z}' class='latex' /> (of size <img src='http://s1.wordpress.com/latex.php?latex=n%5E2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n^2' title='n^2' class='latex' />) defined as the outer product of <img src='http://s2.wordpress.com/latex.php?latex=%5Cmathbf%7Bx%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{x}' title='\mathbf{x}' class='latex' /> and <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathbf%7By%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{y}' title='\mathbf{y}' class='latex' />. They show that, even though <img src='http://s1.wordpress.com/latex.php?latex=%5Cmathbf%7Bz%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{z}' title='\mathbf{z}' class='latex' /> is not <img src='http://s2.wordpress.com/latex.php?latex=4&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='4' title='4' class='latex' />-wise independent, the variance can still be bounded as necessary by observing the geometric relationship on the indices of <img src='http://s3.wordpress.com/latex.php?latex=%5Cmathbf%7Bz%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathbf{z}' title='\mathbf{z}' class='latex' />. 
The algorithm is defined naturally by composing sketches and producing an estimate for the square of the <img src='http://s1.wordpress.com/latex.php?latex=%5Cell_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_2' title='\ell_2' class='latex' /> distance. A set of weak estimates followed by an application of the &#8220;median trick&#8221; yields the claimed approximation and space bounds.</p>
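<p>The composition of sketches can be illustrated with the following simplified Python sketch. It departs from the paper in ways that should be kept in mind: the sign vectors are fully random rather than 4-wise independent vectors built from BCH codes, they are stored explicitly (so the space bound is not achieved), and the group sizes are arbitrary. The function names are hypothetical. Each basic estimator keeps three running sums over the stream, which correspond to the sketch of the joint distribution under the outer product of x and y and to the sketches of the two marginals.</p>

```python
import random
from statistics import mean, median

def single_estimate(pairs, x, y):
    """One weak estimate of the squared l2 distance, from one pass over pairs."""
    joint = left = right = count = 0
    for i, j in pairs:
        joint += x[i] * y[j]  # sketch of the joint distribution under x (outer) y
        left += x[i]          # sketch of the first marginal
        right += y[j]         # sketch of the second marginal
        count += 1
    return (joint / count - (left / count) * (right / count)) ** 2

def l2_squared_estimate(pairs, n, groups=9, per_group=200, seed=0):
    """Median of group means of weak estimates (the 'median trick')."""
    rng = random.Random(seed)
    group_means = []
    for _group in range(groups):
        estimates = []
        for _rep in range(per_group):
            x = [rng.choice((-1, 1)) for _ in range(n)]
            y = [rng.choice((-1, 1)) for _ in range(n)]
            estimates.append(single_estimate(pairs, x, y))
        group_means.append(mean(estimates))
    return median(group_means)

# Hypothetical stream over [n]^2 with n = 2 (indices are 0-based here).
pairs = [(0, 0), (1, 1), (0, 0), (1, 0)]
estimate = l2_squared_estimate(pairs, n=2)
```

<p>For this small stream the true squared distance is 0.0625, and averaging <code>single_estimate</code> over all sign vectors recovers it exactly; the randomized estimate should land near that value.</p>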
<p>In order to obtain a <img src='http://s2.wordpress.com/latex.php?latex=%5Cwidetilde%7BO%7D%28%5Clog%5Cdelta%5E%7B-1%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\widetilde{O}(\log\delta^{-1})' title='\widetilde{O}(\log\delta^{-1})' class='latex' />-space, <img src='http://s3.wordpress.com/latex.php?latex=%28O%28%5Clog+n%29%2C+%5Cdelta%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(O(\log n), \delta)' title='(O(\log n), \delta)' class='latex' />-approximation for the <img src='http://s1.wordpress.com/latex.php?latex=%5Cell_1+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_1 ' title='\ell_1 ' class='latex' /> distance between the joint distribution and the product of the marginals, the authors build on the techniques in <a href="http://www.cse.buffalo.edu/%7Eatri/courses/data-stream/data-stream-bib.html#stable">[I06]</a> for the computation of <img src='http://s2.wordpress.com/latex.php?latex=%5Cell_1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_1' title='\ell_1' class='latex' />. In particular, they utilize vectors drawn from the Cauchy and <img src='http://s3.wordpress.com/latex.php?latex=T&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T' title='T' class='latex' />-Truncated Cauchy distributions to produce an estimate for <img src='http://s1.wordpress.com/latex.php?latex=%5Cell_1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_1' title='\ell_1' class='latex' />. Unlike the algorithm for <img src='http://s2.wordpress.com/latex.php?latex=%5Cell_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_2' title='\ell_2' class='latex' />, the estimate for <img src='http://s3.wordpress.com/latex.php?latex=%5Cell_1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_1' title='\ell_1' class='latex' /> is produced by the median of <img src='http://s1.wordpress.com/latex.php?latex=O%281%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(1)' title='O(1)' class='latex' /> estimators. 
By repeating the process <img src='http://s2.wordpress.com/latex.php?latex=O%28%5Clog%5Cdelta%5E%7B-1%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\log\delta^{-1})' title='O(\log\delta^{-1})' class='latex' /> times, the claimed approximation and space bounds follow.</p>
<p>The remaining results for the centralized model consist of extensions to the <img src='http://s3.wordpress.com/latex.php?latex=%5Cell_1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\ell_1' title='\ell_1' class='latex' /> algorithm, and an approximation of the distance by the mutual information between <img src='http://s1.wordpress.com/latex.php?latex=X&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='X' title='X' class='latex' /> and <img src='http://s2.wordpress.com/latex.php?latex=Y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Y' title='Y' class='latex' />. Finally, negative results were presented for the distributed model.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://datastreams.wordpress.com/2008/04/12/april-7th-presentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/717332b41db68bbbe7f1ec1f667a74fa?s=96&#38;d=identicon" medium="image">
			<media:title type="html">atri</media:title>
		</media:content>
	</item>
	</channel>
</rss>