From: Matthias Troyer (troyer_at_[hidden])
Date: 2007-01-31 16:01:06


On Jan 31, 2007, at 7:47 PM, Eric Niebler wrote:

>
> james.jones_at_[hidden] wrote:
>> From: Matthias Troyer <troyer_at_[hidden]>
>>> These were easy, trickier are robust estimators of variance and
>>> moments, even trickier will be median estimators where I currently
>>> even do not see how this should be done without storing most of the
>>> time series.
>>
>> Isn't this already a problem with sequences? Suppose you're
>> storing the current median, and a new value comes along. What's
>> the new median - without checking all the previous values? There
>> are statistical estimates for this, but I don't know any exact way
>> other than essentially resorting the data and checking the new
>> median.
>
> To calculate an exact median, you're correct. Of course, nothing stops
> you from writing an accumulator that calculates the exact median by
> storing all the samples seen so far. There are approximate median
> algorithms that don't need to do that, though, and that's what
> Matthias
> is referring to. Those might be tricky to combine.

Correct. These approximate (e.g. the p_square methods) algorithms are
impossible to combine. Also many of the algorithms for correlated
sequences of data are hard or impossible to combine.

Matthias