Subject: Re: [boost] [Itl] Review
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2010-02-24 08:36:49


Joachim Faulhaber wrote:
> 2010/2/23 Phil Endecott <spam_from_boost_dev_at_[hidden]>:
>> I was hoping to find that this library was an implementation of an interval
>> tree. ?But it is not: it is built on top of normal standard-library
>> containers. ?The implementation used can find the intervals that overlap a
>> point in O(M + log N) time, but to do so it needs worst-case O(N^2) space
>> and I think it needs O(N^2 log N) time to build that structure.
>
> This is a very limited view on the library. I never claimed it has it's
> strengths in performing stabbing queries. I has so many more aspects
> and applications where it is useful and efficient.

It might help the discussion to describe some of these applications in
depth. I can only think of examples where the O(N^2) space happens.
Example: consider a building access-control system where users check in
and out with cards; the system records a set of
(in_time,out_time)->name records. If N people work in the building, it
will take O(N^2) space to store each day's data.

> If you insert N intervals in an interval_map the number of intervals
> (nodes) in the map is limited to 2*N-1. So space is O(N). A single
> addition of an interval into an interval_map has worst case time performance
> of O(N). So the worst case time for the addition of N interval value pairs is
> O(N^2). Which is not as bad as you write.

No, I believe my assessment is correct. See Luke's explanation.

> The most frequent usage of interval_maps are aggregations, where
> you need all of the aggregated values. If, in the worst case N^2 aggregations
> are involved, you have to perform them anyway. If you'd use an interval
> tree to hold the data, you would not be able to store the the aggregation
> results for N+k segments in your N nodes. For what interval containers
> and specifically interval_maps are used most of the time, interval_trees
> are not an adequate data structure as far as I can see.

I don't entirely understand that paragraph. In an interval tree, I
don't store the "aggregation results"; I just store the key-intervals
and data. Aggregation occurs when I perform a lookup; I find all of
the intervals that overlap my point or interval of interest, and merge
their data in some appropriate way at that time.

Regards, Phil.