Subject: Re: [Boost-users] Thread local storage
From: Oliver Abert (abert_at_[hidden])
Date: 2009-03-30 04:25:10
> Thanks for alerting me to this thread Peter.
>
> Oliver Abert <abert_at_[hidden]> writes:
>
>> On 29.03.2009, at 19:36, Peter Dimov wrote:
>>
>>> Oliver Abert:
>>>> Hi Everyone,
>>>>
>>>> I am using Boost Threads (1.38) as threading library and I also use
>>>> the thread_specific_ptr to store a minor amount of data per thread
>>>> (I think currently it is like 5 different pointer values per
>>>> thread). Technically everything works out fine, but I am having a
>>>> performance problem on Mac OS X. On Linux the performance is 10
>>>> times faster than on Mac OS. If I use pthreads on Mac OS I have
>>>> identical performance to the Linux version. Both versions are
>>>> running on the same machine using 8 threads both.
>>>
>>> What does your profiler say?
>>
>> about 80% of the time is spent in __spin_lock, which in turn was called
>> by pthread_once. If I use only one thread (instead of 8) the
>> percentage goes down to 2.5%, which is still a bit much for my
>> taste.
>
> pthread_once is called by the thread_specific_ptr code to ensure that
> the TLS key it uses has been allocated and is valid. It's a real pain
> if that is too slow.
Yes, I understand that so far, but there seems to be a more serious
problem. Is it possible that there is some unintended mutex lock? It
seems like exactly that is happening. Maybe it is related to the static
variables, which might get guarded by a mutex automatically? I heard
there is a bug with the Apple gcc 4.0.1 regarding statics, but this
morning I also tried the Intel 11.0 compiler with the same
disappointing results. What puzzles me is that the same code runs just
fine on Linux.
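For reference, this is roughly the pattern as I understand it from the description above. It is only a sketch written against plain pthreads, not the actual Boost source; all names (key_once, tls_key, get_thread_value) are made up for illustration. The point is that every lookup passes through pthread_once before it reaches the key:

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdlib>

// Hypothetical names; this is NOT Boost's implementation, just the
// general "pthread_once guards the TLS key" pattern.
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static pthread_key_t  tls_key;

// Called by the pthreads runtime at thread exit for non-null values.
static void destroy_value(void* p) { std::free(p); }

// Runs exactly once per process, no matter how many threads call in.
static void create_key() {
    int rc = pthread_key_create(&tls_key, destroy_value);
    assert(rc == 0);
    (void)rc;
}

// Every call goes through pthread_once before touching the key.
// If pthread_once is expensive on a platform, that cost is paid here
// on each lookup.
int* get_thread_value() {
    pthread_once(&key_once, create_key);
    void* p = pthread_getspecific(tls_key);
    if (!p) {
        // First use on this thread: allocate a zero-initialized slot.
        p = std::calloc(1, sizeof(int));
        pthread_setspecific(tls_key, p);
    }
    return static_cast<int*>(p);
}
```

If pthread_once takes a spin lock on every call on Mac OS X, rather than only on the first one, that would match the __spin_lock time in my profile. That is a guess on my side, though.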
Some more background information: the problem is definitely caused by
calls to get() on the thread_specific_ptr. I am using it in a relatively
hot section of my code. Profiling is not very helpful, because there are
a bunch of unknown libraries between my call and the pthread_once
call. And yes, I also used a debug build of Boost, but I do not have a
clue what is happening in between.
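One workaround I could try in the meantime is to do the per-thread lookup once outside the hot section and keep the raw pointer in a local variable. The sketch below uses a plain pthreads accessor as a stand-in for the real lookup; per_thread_counter and both sum functions are hypothetical names, and I have not measured whether this helps on Mac OS X:

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical stand-in for a per-thread accessor that has a fixed
// cost on every call (here: pthread_once plus a key lookup).
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;
static pthread_key_t  counter_key;

static void free_counter(void* p) { delete static_cast<long*>(p); }
static void make_counter_key() { pthread_key_create(&counter_key, free_counter); }

long* per_thread_counter() {
    pthread_once(&counter_once, make_counter_key);
    void* p = pthread_getspecific(counter_key);
    if (!p) {
        p = new long(0);
        pthread_setspecific(counter_key, p);
    }
    return static_cast<long*>(p);
}

// One lookup per iteration: the accessor cost is paid n times or more.
long sum_lookup_each_time(const int* data, std::size_t n) {
    *per_thread_counter() = 0;
    for (std::size_t i = 0; i < n; ++i)
        *per_thread_counter() += data[i];
    return *per_thread_counter();
}

// Lookup hoisted out of the loop: the accessor cost is paid once.
long sum_lookup_hoisted(const int* data, std::size_t n) {
    long* counter = per_thread_counter();
    *counter = 0;
    for (std::size_t i = 0; i < n; ++i)
        *counter += data[i];
    return *counter;
}
```

Both functions compute the same result. The hoisted version is only valid as long as the pointer is not handed to another thread, since the storage belongs to the calling thread.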
Oliver