From: Peter Dimov (pdimov_at_[hidden])
Date: 2007-08-23 10:35:11
Howard Hinnant wrote:
> On Aug 22, 2007, at 2:59 PM, Zach Laine wrote:
>
>> Could the folks who object to the
>> current design spell it out for me a bit more explicitly -- what in
>> the design is dangerous/inconvenient enough to throw away one or more
>> of the 4 goals?
>
> I would like to see an answer to Zach's question too. I do not know
> what the major objection is with the "current proposal". I only know
> that people are suggesting alternatives.
I deliberately only submitted an alternative for consideration instead of
poking holes in your design or argumentation. But if you insist...
I don't believe that my suggested alternative throws away any of the four
goals, hypothetical malicious vendor D notwithstanding. It is still possible
for the vendor to meet the goals. In addition, it adds:
Goal 5: the ability to control the level of checking globally;
Goal 6: ... without source changes or even recompilation.
This can help if hypothetical user D, lured by the "no overhead, fewer L1
cache misses!" slogan, uses unchecked<> as a matter of habit. "Checking is
for other people."
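To make Goal 6 concrete: since the checked condition already stores the
mutex pointer, the check itself can be gated by a process-wide flag read at
run time. A sketch of one possible mechanism (the environment variable and
the lk.mutex() accessor are illustrative assumptions, not part of the
proposal):

#include <cstdlib> // getenv, abort

inline bool condition_checking_enabled()
{
    // read once per process, e.g. from the environment;
    // flipping it needs neither source changes nor recompilation
    static const bool enabled = std::getenv( "CONDITION_CHECKING" ) != 0;
    return enabled;
}

template< class Mutex > class condition
{
    Mutex * pm_;

public:

    explicit condition( Mutex * pm = 0 ): pm_( pm ) {}

    template< class Lock > void wait( Lock & lk )
    {
        if( pm_ != 0 && condition_checking_enabled()
            && lk.mutex() != pm_ )
        {
            std::abort(); // lock does not own the associated mutex
        }

        // ... block on the underlying condition variable ...
    }
};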
On reflection though, I'll change the constructor from

    explicit condition( Mutex * pm = 0 );

to

    explicit condition( Mutex * pm );

as it's too easy to accidentally disable checking by not including the
condition in the member init list.
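The failure mode, in a sketch (class X and its members are illustrative):

class X
{
    mutex mx_;
    condition<mutex> cond_; // meant to be tied to mx_

    X() // oops: cond_ omitted from the init list; with pm = 0 as a
        // default this compiles quietly and checking is silently off
    {
    }

    // intended: X(): cond_( &mx_ ) {}
    // without the default argument, the omission is a compile error
};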
Going back to:
class shared_mutex
{
    typedef mutex mutex_t;
    typedef condition< unchecked<mutex_t> > cond_t;

    mutex_t mut_;   // protects state_
    cond_t gate1_;  // waited on only under contention
    cond_t gate2_;
    unsigned state_;
    ...
and the L1 cache miss argument for:
class A
{
    shared_mutex mx_;
    ...
};

vector<A> v;
1. The size of shared_mutex, according to your numbers, is 104 bytes. If we
assume 16 bytes of state in A, this makes sizeof(A) 120. The addition of two
pointers makes it 128. This is a 7% increase, but it also happens to round
the size of A up to 128, so an A never straddles a (64-byte) cache line; the
"more bloated" version will actually be faster. Note that I picked the
number 16 before realizing that. :-) If A had 24 bytes of state, the two
pointers would of course be detrimental.
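The arithmetic, spelled out (a sketch; the 104-byte figure comes from your
numbers, and the pointers are assumed to be 4 bytes each, consistent with
the totals above):

// mocked-up layouts, just to check the sizes
struct shared_mutex_now     { char data_[ 104 ]; };                // 104
struct shared_mutex_checked { char data_[ 104 ];
                              void * p1_; void * p2_; };           // 112

struct A_now     { shared_mutex_now mx_;     char state_[ 16 ]; }; // 120
struct A_checked { shared_mutex_checked mx_; char state_[ 16 ]; }; // 128

// with 64-byte cache lines, 128 = 2 * 64, so an A_checked in an array
// never straddles a line boundary; a 120-byte A_now periodically does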
2. I'm having a hard time imagining a program where the L1 cache misses due
to the increased size of A would matter. An object of type A holds a mutex
and two condition variables, a not insignificant amount of kernel resources,
so it would be hard to keep enough A's in memory for the +7% L1 misses to
show up.
3. What is sizeof(pthread_rwlock_t) on your platform? Is it not something
like 48-56 bytes? That's half the size of the above shared_mutex, so users
who are conscious of L1 cache misses will not use shared_mutex anyway.
4. The vector v would need to be protected by its own rwlock as well, since
you can't reallocate it while someone is accessing the A's; this creates a
central bottleneck. vector< shared_ptr<A> > does not suffer from this
problem, as you can reallocate it while someone holds a reference to an
element in the form of a shared_ptr.
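A sketch of the difference (the thread annotations are illustrative):

vector<A> v;                 // reader: A & a = v[ i ];
                             // writer: v.push_back( ... );
                             // reallocation invalidates a, so every access,
                             // even through an existing reference, must go
                             // through the central rwlock

vector< shared_ptr<A> > v2;  // reader: shared_ptr<A> p = v2[ i ];
                             // (copied out under a brief lock)
                             // writer: v2.push_back( ... );
                             // reallocation moves the shared_ptrs,
                             // but *p is unaffected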
5. If optimizing the L1 cache friendliness of shared_mutex is an important
goal, I would consider moving the conditions to the heap as they aren't
accessed on the fast path.
class shared_mutex
{
    mutex mx_;
    unsigned state_;
    void * rest_;   // the conditions, heap-allocated, off the fast path
};
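Here rest_ would point to the cold part, allocated once in the constructor
(a sketch; the struct and its members are my assumptions):

struct shared_mutex_rest
{
    condition< unchecked<mutex> > gate1_;
    condition< unchecked<mutex> > gate2_;

    explicit shared_mutex_rest( mutex * pm ): gate1_( pm ), gate2_( pm )
    {
    }
};

// in shared_mutex's constructor:
//     rest_ = new shared_mutex_rest( &mx_ );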
I know that your actual shared_mutex will use atomics on state_, so you
might even be able to make
class shared_mutex
{
    unsigned state_; // , state2_?
    void * rest_;
};
work. Uncontended access now doesn't touch *rest_, and L1 cache misses are a
fraction of what they used to be. Contended access does cause more cache
misses, but this is overshadowed by the cost of the contention itself.
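For illustration, the uncontended path could then look something like this
(a sketch; atomic_load/atomic_cas stand in for whatever primitives the
implementation uses, and write_entered_ for the writer bit in state_):

bool shared_mutex::try_lock_shared_fast()
{
    unsigned st = atomic_load( &state_ );

    if( st & write_entered_ ) return false; // writer active or pending

    // claim a reader slot; only state_ is touched, *rest_ stays cold
    return atomic_cas( &state_, st, st + 1 );
}

void shared_mutex::lock_shared()
{
    if( try_lock_shared_fast() ) return; // common case, one cache line

    lock_shared_slow(); // contended path: goes through *rest_ and
                        // waits on the gate conditions
}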