Subject: Re: [boost] [thread 1.48] Multiple interrupt/timed_join leads to deadlock
From: Gaetano Mendola (mendola_at_[hidden])
Date: 2012-12-12 03:19:01


On 12/12/2012 08.33, Vicente J. Botet Escriba wrote:
> Le 12/12/12 00:32, Gaetano Mendola a écrit :
>> On 11/12/2012 21.23, Vicente J. Botet Escriba wrote:
>>> Le 11/12/12 19:12, Gaetano Mendola a écrit :
>>>> On 11/12/2012 16.13, Vicente Botet wrote:
>>>>> Gaetano Mendola-3 wrote
>>>>>>
>>>>>> 1) thread_group is now thread-safe: it can be used concurrently by
>>>>>> multiple threads
>>>>>
>>>>> Why should a thread group be inherently thread-safe? It seems to me
>>>>> that having a thread container is already useful.
>>>>
>>>> It handles, without pestering the developers, the case where one
>>>> entity spawns a batch of threads and then waits for their completion,
>>>> blocked in join_all(), while another entity (a user interface, for
>>>> example) can stop the whole process if it is taking too much time.
>>>> Otherwise, as soon as someone calls boost::thread_group::join_all,
>>>> nothing can be done from outside to stop the process. It seems a
>>>> natural use to me.
>>> OK, I think I understand your use case. Here is an alternative that
>>> doesn't use any mutex to protect the group of threads.
>>> I would choose an owner for all these threads and insert them in a
>>> container. Only this owner thread is able to join/interrupt the threads.
>>> I would use some way (an atomic?) to transfer the request from the user
>>> interface thread to the owner, saying that this is taking too much time.
>>> The owner will try to join each thread using try_join_until with the
>>> desired expiration time. If the thread is joined, it is removed
>>> from the container. If there is a timeout, the owner will check the
>>> protected state 'take_too_much_time' and, if it is set, will interrupt
>>> all the other threads and then join all of them. As you can see, the
>>> contention is reduced.
>>>
>>> Note that this is a specific behavior that cannot be added to the
>>> thread_group class. I would be in favour of adding an algorithm/free
>>> function that tries to join the threads in a container/range for a
>>> given duration or until an expiration time (removing the joined
>>> threads).
>>
>> I don't know who the boost thread maintainer is, or how and by whom it
>> is decided whether a design is good enough to be implemented. Or does
>> it work the other way around?
> Here I was talking about my alternative solution.
>>
>>>>>> 2) thread_group now maintains a list of handlers with the
>>>>>> responsibility to:
>>>>>> -) Prevent join and interrupt from being called concurrently
>>>>>> on a thread
>>>>>> -) Avoid calling join on a joined thread
>>>>>> -) Avoid calling interrupt on a joined/interrupted thread
>>>>>
>>>>> IMO, all the threads in a thread_group are owned by the group and
>>>>> use move semantics, so there is no need to use pointers to threads.
>>>>> As a consequence, there is no need for the handler/wrapper.
>>>>
>>>> This is true if the thread_group does not permit being used by
>>>> multiple threads interrupting/joining.
>>> I understand now why you did it this way. But I will not do that.
>>
>> Then the maintainer is you?
> Here I was talking as a user; that is, I will not use the design of
> your application.
> And yes, I'm the maintainer, together with Anthony Williams, who is the
> principal author.
>>
>>>>>> 4) Because mutexes are not fair, a thread issuing an interrupt_all
>>>>>> will most likely starve if another thread is issuing a join_all
>>>>>> (especially if the group contains a single thread). I can work
>>>>>> on it.
>>>>>
>>>>> Could you clarify your concern?
>>>>
>>>> Sure, if a thread performs a closed loop:
>>>>
>>>> <snip>
>>>> then it starves. We have observed this (even in a deterministic
>>>> way), so I had to make interrupt and timed_join on the thread
>>>> handler fair to each other.
>>>> Our platform is Linux with a 3.2.0 kernel.
>>>>
>>> IMO, only one thread should join/interrupt all the threads. This
>>> avoids all these issues.
>>
>> That avoids the issues at the thread group level, but those issues
>> will be present in an upper layer.
> Maybe. Have you identified these issues in the design I have proposed
> above?

Of course not: if thread_group is treated as not thread-safe, you do
indeed avoid all those issues at once.
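
For reference, the single-owner design quoted above could be sketched
roughly as follows. This is only an illustration under stated
assumptions, not code from Boost.Thread: it uses the standard library so
that it is self-contained, with std::future::wait_for standing in for
boost::thread::try_join_until and a shared atomic flag standing in for
boost::thread::interrupt(). The names Worker, spawn, join_or_cancel and
stop_requested are hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <list>
#include <thread>
#include <utility>

// Hypothetical helper type: a thread plus a future that becomes ready
// when the thread function returns (used here as a "timed join").
struct Worker {
    std::thread thread;
    std::future<void> done;
};

// Starts `work` on a new thread and returns the handle to the owner.
template <class F>
Worker spawn(F work) {
    std::packaged_task<void()> task(std::move(work));
    Worker w;
    w.done = task.get_future();
    w.thread = std::thread(std::move(task));
    return w;
}

// Must be called by the owner thread only; the container is never shared,
// so it needs no mutex. Returns true if the workers were asked to stop.
//  - take_too_much_time: set by another thread (e.g. the user interface).
//  - stop_requested: polled by the workers; cooperative stand-in for
//    boost::thread::interrupt().
bool join_or_cancel(std::list<Worker>& workers,
                    const std::atomic<bool>& take_too_much_time,
                    std::atomic<bool>& stop_requested,
                    std::chrono::milliseconds poll) {
    bool cancelled = false;
    while (!workers.empty()) {
        for (auto it = workers.begin(); it != workers.end();) {
            if (it->done.wait_for(poll) == std::future_status::ready) {
                it->thread.join();        // finished: join and drop it
                it = workers.erase(it);
                continue;
            }
            // Timed out: check whether the outside world asked us to stop.
            if (!cancelled && take_too_much_time.load()) {
                stop_requested.store(true);
                cancelled = true;
            }
            ++it;
        }
    }
    return cancelled;
}
```

The only state shared between the owner and the other threads is the two
atomic flags; join and the stop request are always issued from the owner,
so they can never race on the same thread object.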

>> It seems it was coded when it was not a problem for a thread to be
>> joined even if joined/interrupted.
> You are right, the last change in thread had some undesirable
> impacts on thread_group. IMO, the two fixes I reported in this thread
> resolve the issues.
> Please let me know if this is not the case.

It solves the issue, yes. One further consideration: the mutex
protection inside thread_group could be removed completely, because it
is useless in a class that is not meant to be used by multiple threads.
Also, since it is a header-only class, anyone inspecting the
implementation is given false expectations.

Regards
Gaetano Mendola