Know When to Persist: Deriving Value from a Stream Buffer
Journal Articles
Overview
Research
Identity
Additional Document Info
View All
Overview
abstract
We consider \textsc{Persistence}, a new online problem concerning optimizing
weighted observations in a stream of data when the observer has limited buffer
capacity. A stream of weighted items arrive one at a time at the entrance of a
buffer with two holding locations. A processor (or observer) can process
(observe) an item at the buffer location it chooses, deriving this way the
weight of the observed item as profit. The main constraint is that the
processor can only move {\em synchronously} with the item stream; as a result,
moving from the end of the buffer to the entrance, it crosses paths with the
item already there, and will never have the chance to process or even identify
it. \textsc{Persistence}\ is the online problem of scheduling the processor
movements through the buffer so that its total derived value is maximized under
this constraint.
We study the performance of the straight-forward heuristic {\em Threshold},
i.e., forcing the processor to "follow" an item through the whole buffer only
if its value is above a threshold. We analyze both the optimal offline and
Threshold algorithms in case the input stream is a random permutation, or its
items are iid valued. We show that in both cases the competitive ratio achieved
by the Threshold algorithm is at least $2/3$ when the only statistical
knowledge of the items is the median of all possible values. We generalize our
results by showing that Threshold, equipped with some minimal statistical
advice about the input, achieves competitive ratios in the whole spectrum
between $2/3$ and $1$, following the variation of a newly defined density-like
measure of the input. This result is a significant improvement over the case of
arbitrary input streams, since in this case we show that no online algorithm
can achieve a competitive ratio better than $1/2$.