Subject: Re: [Boost-users] thread_group::interrupt_all is not reliable
From: Roland Bock (rbock_at_[hidden])
Date: 2009-12-01 16:07:13
Stonewall Ballard wrote:
> The test code is parked at <http://sb.org/ThreadTest.zip>, 20KB. It's
> an XCode 3.2 project, but the five source files could be readily
> compiled and run in any Unix environment.
I admit it took several readings of this thread until I finally thought
I got it (which might have something to do with several rather long
days of PHP & JavaScript hacking).
Apart from the original problem, I wonder if you could shed some light 
on a comment in your WorkQueue code:
CONTROL:
     void increment()
     {
         boost::mutex::scoped_lock lock(the_mutex);
         ++tasks_available;
         lock.unlock();
         // notify_one may fail to wake a worker if there
         // are multiple items in the queue, so it's better
         // to waste a bit of CPU and notify_all
         the_condition_variable.notify_all();
     }
WORKER:
     void wait_and_decrement()
     {
         boost::mutex::scoped_lock lock(the_mutex);
         // re-check the predicate after every wake-up (spurious or not)
         while ( tasks_available == 0 ) {
             the_condition_variable.wait(lock);
         }
         --tasks_available;
     }
What would be a scenario in which notify_one would fail? I would assume
that a problem with notify_one could only occur if you did something
like
tasks_available += 42;
instead of
++tasks_available;
In such a case, a single notify_one would wake fewer threads than
required. But how can that happen when notify_one is called for each
and every task?
Regards,
Roland
PS: Thank you for starting this thread (and Peter and Anthony for
participating, of course). Very interesting, because I wrote a
different kind of threadpool a few days ago: no queues, but you can
call a method addTask() which hands the task to the next available
thread, blocking for as long as all threads are working on previously
added tasks. The destructor calls interrupt_all() with potentially many
threads in wait(). I probably would have run into the same problems as
you did when using it in production (8-core).