Subject: Re: [boost] [lockfree::fifo] Review
From: Gottlob Frege (gottlobfrege_at_[hidden])
Date: 2009-12-20 14:44:27
On Sun, Dec 20, 2009 at 11:17 AM, Tim Blechmann <tim_at_[hidden]> wrote:
> On 12/20/2009 04:57 PM, Chris M. Thomasson wrote:
>> "Tim Blechmann" <tim_at_[hidden]> wrote in message
>> news:4B2E4492.3050201_at_klingt.org...
>>
>>>> Well, IMO, it should perform better because the producer and consumer
>>>> are
>>>> not thrashing each other wrt the head and tail indexes.
>>
>>> the performance difference is almost the same.
>>
>> Interesting. Thanks for profiling it. Have you tried aligning everything on
>> cache line boundaries? I would try to ensure that the buffer is aligned and
>> padded along with the head and tail variables. For 64-byte cache line and
>> 32-bit pointers you could do:
>
How about we go through the ring buffer in steps of 2^n - 1, so that
each next element is on a separate cache line? I.e., instead of
    m_head = (m_head == T_depth - 1) ? 0 : (m_head + 1);
we do
    m_head = (m_head + 7) % T_depth;
You still use each slot, just in a different order, as long as the
resulting step size is relatively prime to T_depth (they share no
common factor other than 1). You calculate 'n' to be whatever you
need based on the cell size.
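
A minimal sketch of the stepping idea, to check which step sizes still
reach every slot. The helper names (next_index, step_covers_all_slots,
distinct_slots_visited) are made up for illustration and are not part of
the fifo under review; std::gcd needs C++17. The property it relies on is
that a fixed step visits every slot exactly when gcd(step, depth) == 1.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>   // std::gcd (C++17)
#include <vector>

// Hypothetical helper: advance a ring-buffer index by `step` slots,
// mirroring m_head = (m_head + 7) % T_depth from the text above.
inline std::size_t next_index(std::size_t index, std::size_t step,
                              std::size_t depth)
{
    assert(depth > 0 && index < depth);
    return (index + step) % depth;
}

// True if stepping by `step` reaches every slot before repeating,
// i.e. step and depth share no common factor other than 1.
inline bool step_covers_all_slots(std::size_t step, std::size_t depth)
{
    return depth > 0 && std::gcd(step, depth) == 1;
}

// Brute-force check: count the distinct slots reached from slot 0
// after `depth` advances.
inline std::size_t distinct_slots_visited(std::size_t step,
                                          std::size_t depth)
{
    std::vector<bool> seen(depth, false);
    std::size_t index = 0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < depth; ++i) {
        if (!seen[index]) {
            seen[index] = true;
            ++count;
        }
        index = next_index(index, step, depth);
    }
    return count;
}
```

With a depth of 16 and a step of 7 the visiting order is
0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9, so all 16 slots are
used. With a depth of 14 the same step only ever touches slots 0 and 7.
If T_depth is a power of two, any odd step (including 2^n - 1) qualifies.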
I'm not sure if the false-sharing avoidance would be worth the cost of
using up more cache lines. Probably depends on how full the queue is,
etc.
Or might that be worse?
Tony