$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
From: Joachim Faulhaber (afojgo_at_[hidden])
Date: 2008-05-16 22:16:35
2008/5/16, vicente.botet <vicente.botet_at_[hidden]>:
> Hello,
>
> I'm mainly interested in an implementation of set<T> using set<interval<T>
> and on how it comapres to the stl::set in space and time, when the set are
> dense or scattered.
>
> I have just take a look to the documentation and the implementation, and I
> have some questions?
Thank you for looking at my code. I am quite thrilled to get so many
good responses.
Let me note, before answering your questions, that I followed the
boost advice from the web page to post suggestions *before* investing
work on adjusting or producing library code to conform the boost
standards and requirements, in order to avoid unnecessary work.
I know the library might not conform some boost requirements yet, but
I am of course happy to adjust the code and design if there is
significant interest in my work.
To keep the text simple I will talk mostly of sets and interval_sets,
when I want to say things about sets/maps and
interval_sets/interval_maps because most statements apply to both.
> * Why not follow the STL set and map interfaces as far as possible?
I intended to follow the STL and there is a large set of functions and
operators common to std::set/map and itl::interval_sets/maps. But
there are differences
also.
(1) One very interesting point is that interval_sets in the design presented
do have a twofold character. An interval_set is intended to implement
set. Yet at the same time it is a set of intervals. So the question
how a function like e.g.
find should be implemented has no obvious answer.
(2) In my design I am exploiting concepts (Inplace)Addable and
(Inplace)Subtractable in many ways. That is I do rely on the existence
of a += operator for almost all and -= for many datatypes in my
library. So I have added +=, -= and also *= (intersection) to
interval_set/map. A propagation of += and -= through container
operations is a core element of my design and a base of the useful
'aggregate on overlap' (aggrovering;) mechanics of interval_maps.
(see examples party and history from http://sourceforge.net/projects/itl.)
I have defined o= operators as memberfunctions because this seemed
natural to me. But I think the whole library can be easily refactored,
making them global function templates.
> * Why do you use virtual functions? Are the classes sets, and
> interval_base_set public? I think this can and should be avoided.
The original design was classical OO-design, defining an interface
class that allows different implementations being used uniformly by a
weak polymophic interfacepointer. Furthermore there is a body of
baseclass functions that are common to interval_set and
split_interval_set. They only vary in the handling of interval
borders. This has been expressed by basecalss interval_base_set,
interitence and a virtual function 'handle_neighbours'.
All those virtual functions can be devirtualized without problems. The
interface class templates can be omitted and handle_neighbours might
be expressed via a policy template parameter.
> * Does it works with boost::interval?
>
Unfortunately not. Interval containers rely on the notion of open
interval bounds. This was necessary to express operations for
flotingpoint numbers:
interval_set<double> is;
is.insert([1,pi)); is.insert([pi,4]); // is=={[1,4]} merging INTENDED
is.insert([1,pi)); is.insert((pi,4]); // is=={[1,pi), (pi,4]} must not merge
As I've read in the archives an extension for open interval bounds
have been suggested for boost::interval in the past.
> > The benefits of interval sets and maps are
> > 1. The implementation of sets/maps via interval_set/map is very
> > space and also time efficient.
>
> Have you done some messures, in space and time? In the worst case {1, 11,
> 21...}?
>
Not yet. Space can be reckoned easily. Wost case is that every
interval inserted has exactly one value element x per interval [x,x].
In this case you have double data content and one byte extra for
borderinformation so roughly 2 times the space of stl::set.
The 'nice' case is that you can represent a set of uncountable values like
{[0.0, pi)} quite compact ;)
>
> Do you plan to boostify your code?
>
I would very much like to boostify the code. I admit thou that I am
not yet familiar with all of the boostification ;) I hope guidance
will be provided.
> I'm no sure interval_set is a model of Unique Associative Container.
>
> Given the following :
> interval_set<int> is; //not quite: interval_set<interval<int>> is;
> is.insert([1,3));
> is.insert([3,5));
>
> What is the result of
> assert(find([1,3))!= end);
> And of
> assert(find([1,5))== end);
>
I thought about find more than twice and decided not to provide it.
You can say
interval_set<int> found;
is.intersect(found, rightopen_interval([1,5]); // Intersect to find contents
As interval_set<T> implements set<T> if find is implemented it should
deliver an iterator pointing to a value, but such iterators are not
available. The closest one can provide is an iterator that points to
the interval containing the value or end(), if such interval does not
exist.
> I'm no sure interval_set is a model of Unique Associative Container.
No, I'm afraid it is not.
> Which operator would you use for set_symmetric_difference? This is a detail
> but it will be better to provided the function version of this operators as
> well.
don't know, probably function name. Are there boost suggestions for that?
>
> > split_interval_map:
> > A split_interval_map implements a map as a map of intervals
> > and associated values. For dealing with the insertion
> > of overlapping intervals the principle of "aggregation on
> > overlap" has proved to be very useful: It is shown by example:
> >
> > typedef set<string> guests;
> > split_interval_map<time, guests> party;
> > guests mary; mary.insert("Mary");
> > guests harry; harry.insert("Harry");
> > party.insert(make_pair(rightopen_interval(20:00, 22:00), mary));
> > party.insert(make_pair(rightopen_interval(21:00, 23:00), harry));
> > // party now contains
> > [20:00, 21:00)->{"Mary"}
> > [21:00, 22:00)->{"Harry","Mary"} //guest sets aggregated on overlap
> > [22:00, 23:00)->{"Harry"}
> >
> > As can be seen from the example a split_interval_map has both
> > a decompositional behavior (on the time dimension) as well as
> > a accumulative one (on the associated values).
>
> Quite interesting!
>
> > Quality, validation, portability:
> >
> > The library code of the itl has been validated with a law based
> > test automaton (LaBatea;) which is included in
> > the current release. It includes a testcase generator and law tester
> > that checks laws like e.g. commutativity<interval_set<double>, +=>:
> > the commutativity of interval_set<double> w.r.t. the operation +=.
> >
> > The code has been compiled and tested with three compilers
> > (gcc 3.4.4, conceptgcc-boostcon, msvc 8.0 sp1) and with three
> > implementations of the stl (linux/gnu, stlport, msvc8) under
> > suse linux 10.1. and ms-windows xp.
>
> > I'd be happy if you found interval conainers as presented in this
> > post an interesting contribution to look at.
>
> I'm sure that this library will be a good candidate to boost.
>
> > If so, I am looking forward to an inspiring discussion.
>
> Best regards
>
> Vicente Botet
>
Thank you for your good questions and valuable feedback on my work.
Best regards
Joachim Faulhaber