Subject: Re: [boost] new proposal: order statistics tree
From: Szymon Wojciechowski (sw10_at_[hidden])
Date: 2013-01-13 04:10:54


"Vadim Stadnik" <vadimstdk_at_[hidden]> pisze:
> On Tue, Jan 8, 2013 at 7:47 AM, Szymon Wojciechowski wrote:
>
> > I implemented augmented red-black tree which doesn't check the uniqueness
> > of values. I have conducted a simple survey: I prepared 5 input files with
> > unique 2,7*10e7 integers. The first data set has sorted values as other has
> > randomized. I repeated 7 times on every input. A simple program reads from
> > input integers and puts it to set (maybe I should use multiset) from
> > VisualStudio or my order statistics tree. After that I measured time. Then
> > program erases values from data structure in ordered way (from 0, 1, 2...).
> > After that I also measured how much time it takes. I uploaded the results
> > here: http://wyslijto.pl/plik/q5om39aoua - it's a little bit strange.
> > What kind of performance test would you think about?
> >
>
> Did you implement or is it possible to implement for this augmented red
> black tree split, join and splice operations with logarithmic computational
> complexity?
>
> As for tests, there are a number of performance tests of basic container
> operations in the following projects:
>
> 1. For augmented dynamically allocated B+ trees:
> https://github.com/vstadnik/stl_ext_adv
>
> I have added this link, since this project has results of performance tests
> for Boost.MultiIndex.
>
> 2. For augmented array based B+ trees:
> https://github.com/vstadnik/stl_ext_adv_review
>
> This project is prepared for Boost review. It compares performance of new
> containers with standard containers and Boost.Container.
> The file "boost_performance.cpp" uses Boost.Chrono timer. If your
> containers are STL compliant you should not have any problem with this test
> code.
>
> Regards,
> Vadim Stadnik

I have performed such test for 6 data containers (including boost::multi_index described by Tony and Vadim): the comparison is uploaded here: http://wyslijto.pl/plik/mxxpjbpmfn
I prepared 5 datasets (first one was sorted). I repeated 5 times whole test for each container and dataset. The test inserts and accumulates as in Vadim's tests. However I added the erasure performance.

I haven't thought about possiblility to implement such operations like splice, join, split. However, it seems to me to be impossible. When adding treeA + treeB, the elements of treeB should be considered independently where to place them in result. For example if thereA contains 1, 3, 5, 7, 9 and treeB contains 2, 4, 6, 8, 10 all treeB nodes will be independently located and inserted in O(n).

Best regards,
Szymon Wojciechowski