Subject: Re: [boost] [Heaps] benchmarks
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-06-09 17:30:28


Tim Blechmann wrote:
>> Tim, you're proposing this b-heap purely as a performance optimisation
>> compared to the "traditional" heap, but we've not yet seen any cases
>> where it is actually faster. If it can be faster we need to see some
>> examples, and the documentation should qualify when it can be expected
>> to be better. If it's not faster, it should be dropped.
>
> just for the record:
> core i7 920, x86_64, linux-3.0-rc2, hyperthreading/frequency scaling disabled,
> 12gb triple-channel ddr3 ram
>
> stressed with:
> stress -m 1 --vm-bytes 4G --io 16 -c 3
> and two boinc processes running at nice 10
>
> configured the problem size to 15000000
> changed work_t from boost::function to int (24byte to 4byte)
>
>
> comparison of b-heap and dary_heap (with arity 2) ... running all at nice 1 to
> increase the number of context switches.
>
> Performance counter stats for './bench_boostbheap':
[snip]
> 615.223322417 seconds time elapsed
>
> Performance counter stats for './bench_boostdary':
[snip]
> 659.415916004 seconds time elapsed

That's great. But I was comparing with your priority_queue<> and the
std::push|pop_heap, not your d_ary_heap<>. Could you do the same test
with one of those?

> Performance counter stats for './bench_boostbheap':
> 1,421,230,219,856 cycles
> 928,444,569,746 stalled-cycles-frontend
> 643,217,557,503 stalled-cycles-backend
> 1,517,920,671,665 instructions

> Performance counter stats for './bench_boostdary':
> 1,536,548,826,172 cycles
> 1,304,937,894,945 stalled-cycles-frontend
> 999,244,398,224 stalled-cycles-backend
> 747,525,733,877 instructions

That's the interesting part: the b-heap takes twice as many
instructions as the d-array heap, so the improvement in cache locality
needs to be substantial enough to double the instruction throughput
before the b-heap does better. You've had to pull all the stops out to
make that happen.

Regards, Phil.