Subject: Re: [boost] radix sort
From: Steven Ross (spreadsort_at_[hidden])
Date: 2013-07-22 06:49:08
On Mon, Jul 22, 2013 at 1:05 AM, Jeremy Murphy <jeremy.william.murphy_at_[hidden]> wrote:
>
> > Reverse sorting could be done by passing reverse iterators, though
> > that eliminates the possibility of using raw pointers, which are
> > slightly faster with some of the compilers I tested. I'd be happy to
> > remove the reverse calls if there is general agreement that the
> > performance difference isn't a problem.
>
> If performance is the only reason, can you provide or link to some
> empirical data? Presumably the same decision must have been broached
> with regard to std::sort(), and there is no std::reverse_sort(). This
> discussion on SO suggests that with enough optimization all iterators
> are equal:
> http://stackoverflow.com/questions/9025084/sorting-a-vector-in-descending-order
> It also points out that reverse sorting can be achieved with forward
> iterators by using the opposite sorting functor.
I just ran my sample.cpp, and std::sort runtime drops from 169.67 to
164.01 seconds when using pointers instead of iterators on the vector. In
general, I've seen about a 1% speedup in sorting when using pointers
instead of iterators. It isn't a big deal, but I don't think there is any
such thing as a "reverse pointer" corresponding to a reverse iterator.
What you pointed me to was somebody who forgot to turn compiler
optimizations on and therefore saw huge differences.
Iterators are just a bit more complex than the built-in pointer support.
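
To make the trade-off concrete, here is a minimal sketch (the sample data
and values are illustrative) of the three approaches we've been
discussing: an inverted comparator over forward iterators, raw pointers,
and reverse iterators wrapped around raw pointers:

#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

int main()
{
    std::vector<int> v;
    for (int i = 0; i < 8; ++i)
        v.push_back((i * 37) % 11);  // arbitrary sample data

    // 1) Descending sort with plain forward iterators and an inverted
    //    comparator -- no reverse iterators needed.
    std::sort(v.begin(), v.end(), std::greater<int>());

    // 2) The raw-pointer form of an ascending sort; &v[0] and
    //    &v[0] + v.size() delimit the vector's contiguous storage.
    std::sort(&v[0], &v[0] + v.size());

    // 3) Raw pointers can still be walked backwards by wrapping them in
    //    std::reverse_iterator: sorting "ascending" through the wrappers
    //    leaves the underlying array in descending order.
    std::sort(std::reverse_iterator<int*>(&v[0] + v.size()),
              std::reverse_iterator<int*>(&v[0]));
    return 0;
}

Option 3 is the moral equivalent of a "reverse pointer", but it
reintroduces an iterator adaptor, which is exactly the overhead option 2
was trying to avoid.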
> I certainly haven't done extensive testing, but so far the stable
> counting-sort has been much quicker than std::sort(). It has even been
> quicker than your code for small (< 1000) values of k (the maximum
> value). :) The non-stable counting-sort (based on Rosetta Code) is
> faster still, so I'm curious to see some extensive performance tests.
> My LSD radix-sort is almost finished; I'll let you know when it is.
>
My code uses std::sort when there are fewer than 1000 elements, because
the radix approach doesn't provide a consistent speedup on randomized
data sets smaller than that.
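
For concreteness, here's roughly the shape of that dispatch combined with
the stable counting sort you described. This is a sketch only: the 1000
cutoff is the one mentioned above, but the function name and signature
are illustrative, not my actual code. It assumes unsigned values with
known maximum k:

#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch: fall back to std::sort below a cutoff where a distribution
// sort rarely pays off; otherwise do a stable counting sort of values
// in [0, k].
void counting_or_std_sort(std::vector<unsigned>& v, unsigned k)
{
    if (v.size() < 1000) {
        std::sort(v.begin(), v.end());
        return;
    }
    std::vector<std::size_t> count(k + 1, 0);
    for (std::size_t i = 0; i < v.size(); ++i)
        ++count[v[i]];                       // histogram of values
    std::size_t pos = 0;
    for (unsigned b = 0; b <= k; ++b) {      // exclusive prefix sum:
        std::size_t c = count[b];            // count[b] becomes the first
        count[b] = pos;                      // output slot for value b
        pos += c;
    }
    std::vector<unsigned> out(v.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        out[count[v[i]]++] = v[i];           // forward pass keeps it stable
    v.swap(out);
}

Stability only matters once you're sorting records by key rather than
bare integers, but the forward placement pass is where it comes from, and
it's the same property an LSD radix sort relies on between digit passes.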
What type of data are you sorting, and how is it distributed?