$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [boost] [algorithm] LSD radix sort
From: Jeremy Murphy (jeremy.william.murphy_at_[hidden])
Date: 2014-03-30 09:43:22
On 30 March 2014 23:27, Philippe Vaucher <philippe.vaucher_at_[hidden]> wrote:
>
>
> I think it'd be nice to see the results of these speed tests, compared to
> std::sort etc.
>
> Philippe
>
Good idea.  Here are the speed test results from my x86_64 laptop with gcc.
 Let me explain them a bit.  There are two sets of data based on the
distribution of the random numbers generated.  The first is a poisson
distribution, which was really just an easy way to get a small range (max -
min) on the input values.  The second is a uniform distribution, which is
also the worst-case scenario.
Then each data distribution is tested four times: unsigned char (h), short
(t), int (j) and long (m).
The first column is the log base 2 of the input size, n.  Second column is
time (t) divided by n, so time per element.  Third column is something I am
still working on, sorry, just ignore it for now.  Fourth and fifth columns
are how many times (x) faster radix sort is compared to std::sort and
std::stable_sort.
So you'll see, LSD radix sort performs pretty well except for on unsigned
long with a large range of values.  Oh and the blank rows are because I
haven't got around to measuring n repetitions of a sort, so the blanks are
because one execution of the sort is not measurable.  Certainly there are
improvements to be made but this is a crude and effective metric to begin
with.  I'm very curious to compare results with different architectures.
 Cheers.
Jeremy
Apologies if the formatting is a mess, it *should* look OK in a fixed-width
font.
*** poisson_distribution, mean = std::numeric_limits<T>::max() / 2
=== Test (seed = 1,396,184,394, T = h). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17
     18
     19
     20
     21
     22     2.38        0.09    15.0    25.0
     23     2.38        0.09    15.0    26.0
     24     2.38        0.08    15.8    28.5
     25     2.98        0.08    12.7    22.1
     26     2.83        0.08    13.6    23.9
     27     2.76        0.07    14.4    25.3
     28     2.76        0.07    14.6    26.3
=== Test (seed = 1,396,184,394, T = t). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17
     18
     19
     20     9.54        0.45    5.0     7.0
     21     4.77        0.19    10.0    15.0
     22    14.31        0.64    3.2     5.2
     23    10.73        0.43    4.3     7.0
     24    12.52        0.50    3.9     6.2
     25    13.41        0.52    3.6     6.0
     26    16.54        0.62    3.0     4.9
     27    20.49        0.74    2.5     4.1
     28    23.06        0.82    2.2     3.7
=== Test (seed = 1,396,184,394, T = j). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17
     18
     19    19.07        1.00    4.0     4.0
     20    28.61        1.40    2.7     2.7
     21    23.84        1.10    3.0     3.2
     22    28.61        1.27    2.5     2.9
     23    32.19        1.39    2.2     2.7
     24    32.19        1.33    2.2     2.7
     25    36.66        1.44    1.9     2.5
     26    43.36        1.65    1.7     2.1
=== Test (seed = 1,396,184,394, T = m). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17    76.29        4.47    0.0     1.0
     18    38.15        2.11    2.0     2.0
     19    57.22        3.00    1.0     1.7
     20    57.22        2.85    1.3     1.5
     21    57.22        2.71    1.4     1.6
     22    61.99        2.77    1.3     1.8
     23    61.99        2.65    1.4     1.6
     24    64.37        2.67    1.4     1.6
     25    66.46        2.64    1.4     1.6
     26    67.50        2.58    1.4     1.7
*** uniform_int_distribution, min = 0, k = std::numeric_limits<T>::max()
=== Test (seed = 1,396,184,394, T = h). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17
     18
     19    19.07        1.00    2.0     3.0
     20
     21     4.77        0.19    9.0     12.0
     22     4.77        0.18    8.5     13.5
     23     3.58        0.13    11.7    18.0
     24     6.56        0.25    6.6     10.0
     25     7.75        0.28    5.7     9.0
     26     8.05        0.31    5.6     8.9
     27    11.77        0.41    3.9     6.2
     28    15.87        0.54    3.0     4.7
=== Test (seed = 1,396,184,394, T = t). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17
     18
     19
     20     9.54        0.45    7.0     8.0
     21     9.54        0.43    7.0     9.0
     22    11.92        0.50    5.6     7.2
     23    15.50        0.65    4.2     5.8
     24    17.29        0.71    3.9     5.3
     25    19.67        0.76    3.4     5.0
     26    24.74        0.92    2.7     3.9
     27    31.66        1.15    2.2     3.1
     28    36.47        1.29    1.9     2.8
=== Test (seed = 1,396,184,394, T = j). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17    76.29        4.47    0.0     1.0
     18
     19    19.07        1.00    3.0     4.0
     20    19.07        0.95    3.5     4.0
     21    33.38        1.57    2.3     2.6
     22    35.76        1.59    2.2     2.4
     23    38.15        1.65    2.2     2.4
     24    41.13        1.71    2.1     2.4
     25    53.35        2.12    1.7     1.9
     26    62.73        2.38    1.5     1.6
=== Test (seed = 1,396,184,394, T = m). ===
log2(n) t/n (ns)        c       x sort  x stable_sort
-------------------------------------------------------------
     16
     17    76.29        4.47    1.0     0.0
     18    38.15        2.11    2.0     2.0
     19    76.29        4.00    1.0     1.0
     20    57.22        2.85    1.3     1.7
     21    71.53        3.38    1.1     1.3
     22    81.06        3.68    1.0     1.2
     23    85.83        3.70    1.0     1.2
     24   103.71        4.29    0.9     1.0
     25   122.49        4.88    0.8     0.9
     26   131.28        5.04    0.7     0.9