From: williamkempf_at_[hidden]
Date: 2001-11-12 12:19:31
--- In boost_at_y..., "Kevin S. Van Horn" <kevin.vanhorn_at_n...> wrote:
> I finally managed to eke out enough free time to finish reviewing and
> commenting on the documentation for the Boost threads library (as a
> prelude to attempting some formal correctness proofs.)  Here are my
> remarks.  They include a mixture of comments on design, exposition,
> wording, and sometimes grammar.
> 
> GENERAL COMMENTS
> ================
> 
> Fairness and strongness
> -----------------------
> 
> Missing from the documentation is a discussion of the three levels of
> increasingly-strong guarantees for mutexes, which parallel the same
> categories for semaphores:
> 
> 1. Weak mutexes.  When a thread unlocks a weak mutex and there are
> other threads blocked waiting on the mutex, the mutex may enter the
> unlocked state for a short period of time until some thread locks it
> again.  The locking thread may not have been one of the threads
> waiting at the moment the mutex was unlocked.
> 
> Consider the following scenario with a weak mutex.  Thread A is in a
> loop where it repeatedly acquires the mutex, does something, then
> releases the mutex:
> 
>   while (1) {
>     <processing outside of the critical section>;
>     boost::mutex::scoped_lock l(mtx);
>     <critical section>;
>   }
> 
> If there are one or more other threads waiting to acquire the mutex,
> thread A may monopolize the mutex, starving the other threads so that
> they *never* acquire the mutex.  This can occur because thread A may
> reacquire the mutex after releasing it, before one of the waiting
> threads manages to acquire it.
> 
> Weak mutexes are nearly useless.  As far as I can tell, it is
> impossible to build starvation-free synchronization mechanisms on top
> of them.  As one example, consider trying to build semaphores on top
> of weak mutexes.  The access to the integer giving the semaphore
> count must be protected by a mutex.  If this is a weak mutex, then a
> signal/up_count/V operation may block forever if there is heavy
> activity on the semaphore!
> 
> *** I have looked over the POSIX implementations of condition
> variables and the various Mutex types, and several of these are weak
> mutexes. ***
From a documentation standpoint I can't really address this without
causing (theoretical at least) problems with portability.  However,
the implementation is a definite issue.  If you have solutions for
the implementation I'd be very interested.
> 2. Strong mutexes.  Suppose that a thread A holds a lock on the
> mutex while there are other threads blocked trying to acquire the
> mutex.  When A "unlocks" the mutex, it does not actually enter an
> unlocked state; instead, the mutex remains locked, but one of the
> blocked threads becomes unblocked and acquires the mutex.  This
> prevents any one thread from monopolizing the mutex.  However, it is
> not sufficient to prevent starvation: two threads in a lock/unlock
> loop such as presented in the discussion of weak mutexes can starve
> a third by playing a game of "keep away".  That is, when thread A
> releases the mutex, thread B always acquires it, and when thread B
> releases the mutex, thread A always acquires it, leaving thread C
> forever waiting to acquire the mutex.
Again, I can't really mention this in the documentation.  However,
starvation can occur even with "fair" mutexes (by one definition of
fair).  POSIX mutexes, for instance, are open to priority starvation
problems if you aren't careful.
 
> 3. Fair mutexes.  A fair mutex is a strong mutex with one additional
> guarantee: as long as every thread that acquires the mutex eventually
> releases it, any thread that is blocked on the mutex will eventually
> acquire it.  Note that mutexes that use FIFO scheduling are fair, but
> a mutex can be fair without using FIFO scheduling.
> 
> I seem to recall that one can implement fair semaphores on top of
> strong semaphores, though I do not recall the algorithm.  I'm not
> sure whether or not one can implement a fair mutex on top of strong
> mutexes.  Note the important difference between binary semaphores
> and mutexes: any thread can unlock a locked binary semaphore, but
> only the locking thread can unlock a locked mutex.  This added
> capability of binary semaphores is very useful for ensuring lack of
> starvation in synchronization algorithms.
> 
> As with mutexes, condition objects can also be strong or fair.  (The
> definition of the notify_one() and notify_all() operations rules out
> weak condition objects.)  Suppose that a thread A is blocked in a
> wait() on a condition object.  If the condition object is fair, then
> A can only remain blocked forever because of a lack of notify()
> calls on the condition object, and not because A always gets passed
> over in favor of some other thread to unblock on a notify_one()
> call.  The Boost.Threads documentation needs to specify whether or
> not condition objects are fair in this sense.
Again, I don't think the documentation can do this without causing
problems with portability.
>  Also, the documentation should make clear that there are two kinds
> of blocking in a wait() operation: a thread may be blocked on the
> condition object itself, or it may have been unblocked and then
> subsequently blocked trying to reacquire the mutex.
I'm not sure how this knowledge (which should be obvious when you
think about it) will be useful.
 
> If at all possible, and if the implementation cost is not too high,
> all mutexes and condition objects should be fair.  Otherwise you run
> the risk of users having mysterious, intermittent and irreproducible
> thread "lock-ups."  If making the mutexes or condition variables
> fair adds a significant implementation cost, then Boost.Threads
> should have both efficient-but-unfair and fair-but-less-efficient
> mutexes.
In general, your comments are very relevant here.  I can't document 
some of this because I need the leeway for implementing over existing 
MT libraries, but these are issues that should be addressed 
sufficiently for a standardized library (where the implementers won't 
be restricted in the same manner for the implementation).  So I'll 
need to document this stuff in some way.  I'm also very concerned 
about your findings on the implementation.  If you have any thoughts 
on how to correct these areas I'd be very interested.
 
> Handling of programmer errors
> -----------------------------
> 
> There are many places where the documentation specifies that
> exceptions *will* be thrown if some precondition is not met.  Do you
> really want to guarantee this, or do you just want to say that the
> exception *may* be thrown?  At debug time you do not want exceptions
> thrown for what amounts to programmer error (e.g., unlocking a mutex
> that you do not have locked), as this throws away a lot of useful
> information as the stack unwinds.  At debug time it is much
> preferable for the program to dump core so that you can examine the
> core dump with a debugger and have full information on the context
> of the error.  I recognize that a core dump is unacceptable for some
> production environments, so it would be nice to be able to choose
> between the core-dump and exception options via a compile-time flag.
This is a complex decision that has been discussed a lot, and
honestly the final decision probably hasn't been reached yet.
Generally, the documentation is correct IMHO, though the
implementation may want to assert on debug builds.
 
> In addition, I would suggest that even when you do throw an
> exception, the exception object should include the file and line
> number.  (See discussion below for overview.html.)
A useful concept for debugging purposes, but not one that should be
guaranteed I think.  In other words, I'll consider adding this to the
exception's what() but I think it would be wrong to put this in the
documentation as a requirement (thinking in terms of standardization).
 
> Atomic reads and writes
> -----------------------
> 
> I think a discussion is needed of why even reads and writes of
> variables of primitive types may not be atomic, and that there is
> *one* type for which reads and simple writes are guaranteed atomic
> (sig_atomic_t).
Outside of the scope of the documentation, though discussion of this
topic here and possibly in the FAQ would make sense.
 
> DOCUMENT-SPECIFIC COMMENTS
> ==========================
> 
> overview.html:
> --------------
> 
> Quote: "Priority failures (priority inversion, infinite overtaking,
> starvation, etc.)"
> 
> Comment: The programmer cannot do anything about priority inversion;
> that has to be addressed at the OS level.
This isn't true.  The simplest way for the programmer to avoid
priority inversion is to never share resources between threads of
varying priority.  This can be done by changing a thread's priority
before and after acquiring a mutex, for instance, and in fact POSIX
specifies a mechanism by which this can be done automatically.  This
is an area in which Boost.Threads may need some modification, but
even without this support you can avoid this problem.
>  I don't believe that "infinite overtaking"
> and "starvation" are defined anywhere.
Probably not.  We'll need to add definitions.
>  I'm not sure that "priority failure" is an appropriate term to
> describe starvation.  Absence of starvation just means that a thread
> waiting its turn to do something does, eventually, get its turn; the
> concept is meaningful even in the absence of any notion of thread
> priorities.
I agree and disagree.  Starvation is often caused by priority
failures.  You're quite correct that it can also be caused in other
ways, so the documentation should be made clearer here, but it's not
strictly incorrect.
> Quote: "Every multithreaded program must be designed carefully to
> avoid race conditions and deadlock."
> 
> Comment: I would add starvation to the list of things to avoid.
> This is a commonly-overlooked error in multithreaded code.
Agreed.
 
> Quote: "Common requirements for all Boost.Threads components."
> 
> For implementors they are requirements, but for users (the great
> majority of those who will be reading the document) they are
> guarantees / specifications.
Hmmm... different perspective.  I'll look at this.
 
> introduction.html
> -----------------
> 
> Quote: Under "Flexiblity" it says, "When functionality might be
> compromised by the desire to keep the interface safe, Boost.Threads
> has been designed to provide the functionality, but to make it's use
> prohibitive for general use."
> 
> Comment: I have no idea what that final clause means.  Also, "it's"
> means "it is"; the possessive of "it" is "its." (You write "whose",
> not "who's"; "his", not "he's"; "theirs", not "their's"; and so
> forth.)
Yes, that would be a typo ;).  I'm not sure how to reword the clause
to be more meaningful, however.
 
> Quote: "Recently there has been a lot of research in other concepts,
> such as in 'Communicating Sequential Processes.'"
> 
> Comment: CSP is not new.  It wasn't new in 1985, when I first
> learned about it.
Depends on your definition of new ;).  However, I was referring to
new research, not to the invention of the concept.  There is a lot of
new research for CSP that's been started recently.
 
> faq.html
> --------
> 
> Quote: "1. Are lock objects thread-safe?"
> 
> Comment: The hyphen is necessary only when the adjective phrase
> "thread safe" immediately precedes a noun or noun phrase.  For
> example, "f() is a thread-safe function."  The hyphen indicates
> grouping: "thread" modifies "safe", and not "safe function".
OK.
 
> Quote: "They are meant to be short lived objects..."
> 
> Comment: Should be "short-lived" objects: "short" modifies "lived",
> not "lived objects."
OK.
 
> Quote: "One easy way of doing this [locking mutexes in a standard
> order] is to use each mutex's address to determine the order..."
> 
> Comment: This is not portable.  The standard only allows comparisons
> other than == and != between pointers that point into or just past
> the end of the same object.  The example code for item 5, using "<"
> and ">" to compare mutex pointers, is not portable for this reason.
> Interestingly enough, however, I seem to recall that function
> objects of type std::less<T *> are guaranteed to provide a total
> ordering.
That's a debate I don't want to get into, because I lose every time.
If it's enough to satisfy everyone I can simply change the code to
use std::less.
 
> Quote: The example code for item 5 includes the lines "m_value =
> other.m_value;" and "return m_value;", which are protected by mutex
> locks.
> 
> Comment: It's worth mentioning why the protection is necessary: with
> the exception of the sig_atomic_t type, you don't know whether
> reading or writing a value of even a primitive type can be done as a
> single atomic action (single memory read/write.)
There is no standard sig_atomic_t type, so this comment will just
lead to confusion.  As it is, I think the reasons are self-evident.
 
> Quote: [Under item 6] "The internal state of mutex types could have
> been made mutable, with all lock calls made via const functions, but
> this does a poor job of documenting the actual semantics."
> 
> Comment: In fact, such a const declaration would be outright
> incorrect, since the logical state of a locked mutex clearly differs
> from the logical state of an unlocked mutex!
Depends on your definition of incorrect, which is precisely why
things are worded the way they are.
 
> Quote: "Semaphore was removed as too error prone.  The same effect
> can be achieved with greater safety by the combination of a mutex
> and a condition variable."
> 
> Comment: This is not so easy to do correctly if you wish to avoid
> starvation!
Care to elaborate?
 
> definitions.html
> ----------------
> 
> Quote: "Thread State." [Followed by definitions of "Ready" and
> "Running".]
> 
> Comment: You are getting into scheduler implementation details that
> have no place in a behavioral specification.  The usual way of
> reasoning about concurrent programs is that you assume nothing about
> the relative speeds of different threads, and that arbitrarily long
> (but finite) delays may occur at any point in the execution of a
> thread.  The change of state from Running to Ready and then
> eventually back to Running again looks just like an arbitrary pause
> in the execution of the thread.
I don't see any scheduler implementation details here.  There is
certainly nothing here that assumes anything about the relative
speeds of different threads or about delays.
 
> Quote: [Under "Race Condition"] "The result of a race condition may
> be a bit pattern which isn't even a valid value for the data type."
> 
> Comment: It's worth mentioning why this is so (see previous comment
> on atomicity.)
I think you're trying to overspecify the terms of atomicity for
native types.  There's no knowing if a native type can be manipulated
atomically in a portable manner and there's no standard sig_atomic_t
type.  The description given here is accurate and complete without a
discussion of non-portable concerns about atomic read/write
operations.
> Quote: "A priority failure (such as priority inversion or infinite
> overtaking)..."
> 
> Comment: Need to define priority inversion and infinite overtaking.
> Also need to define and discuss starvation.
I'll look into this.
 
> rationale.html
> --------------
> 
> Quote: "However, computer science research has shown that use of
> these primitives is difficult since there's no way to mathematically
> prove that a usage pattern is correct, meaning it doesn't result in
> race conditions or deadlocks."
> 
> Comment: The above statement is false.  CS researchers have been
> proving the correctness of concurrent algorithms using semaphores,
> mutexes, or even just reads and writes of shared variables with busy
> waiting (e.g., the Bakery Algorithm) for decades.  It would be
> correct, however, to say that doing so is *difficult* for low-level
> synchronization constructs.
OK, some re-wording is needed here.  What was meant is that there's
no universal formula for proving this.  Each algorithm requires
different proofs, which, as you say, are non-trivial and difficult to
do.
> Quote: "Though it's possible to create a lock object outside of
> block scope and to share it between threads to do so would not be a
> typical usage."
> 
> Comment: I would make a stronger statement: "to do so would very
> likely be an error."  Also, you need a comma after "between threads."
OK.
 
> Quote: [Under "Rationale for Non-Copyable Thread Type".]  "... could
> provide more efficient reference management then wrappers."
> 
> Comment: "then" (order) should be "than" (comparison.)
Typo.  I'll fix.
 
> Quote: "Analysis of whether or not these arguments would hold true
> don't appear to bear them out."
> 
> Comment: "Analysis" is the simple subject, so "don't" should be
> "doesn't."
OK.
 
> Quote: "...but thread_ref no longer suffers because of it's
> reference management,..."
> 
> Comment: "it's" should be "its."
Yep.  My favorite typo (well, one of).
 
> mutex_concepts.html:
> --------------------
> 
> There is a systematic misuse of "concept" and "model" in this
> document.  To review:
> - A concept is a set of requirements on a type.
> - Mutex is a concept.
> - A type T is a model of concept C if T satisfies all of the
>   requirements of C.
> - The type boost::mutex is a model of Mutex.
> - Actual objects are not models of concepts nor of classes.
> 
> The phrase "mutex model" is repeatedly used when "mutex" or "mutex
> object" (object whose type is a model of Mutex) is the correct term.
> Sometimes "mutex concept" is used when "mutex object" would be
> correct.
Yeah, several people have been hammering this into me for some time.
I'll look at the documentation (again).
> Quote: "A mutex (short for mutual-exclusion) concept serializes
> access..."
> 
> Comment: Concepts don't *do* anything; they're just collections of
> type requirements.  You mean "mutex" or "mutex object".
> 
> Quote: "A model that implements Mutex and its refinements has two
> states:"
> 
> Comment: Concept models are types, and types don't have state;
> objects do.  Suggested rewrite: "A mutex (object whose type models
> Mutex) has two states..."  The "and its refinements" is redundant,
> since any type that models a refinement of Mutex also models Mutex
> itself.
> 
> Quote: "mutex model object" [Appears in many places.]
> 
> Comment: Replace with "mutex" or "mutex object."
> 
> Quote: "mutex model" [Appears in many places; e.g., "unlock a mutex
> model."]
> 
> Comment: Replace with "mutex" or "mutex object."
> 
> Quote: "With this pattern, a lock concept is employed where the lock
> model's constructor locks the associated mutex..."
> 
> Comment: "lock concept" and "lock model" should be "lock object."
> 
> Quote: "... mutex model ..."
> 
> Comment: Should be "model of Mutex" or "Mutex model", as "Mutex,"
> not "mutex," is the name of the concept.
> 
> Quote: "With an unspecified locking strategy, when a thread attempts
> to acquire a lock on a mutex model for which the thread already owns
> a lock the operation results in undefined behavior.  When a mutex
> model has an unspecified locking strategy the programmer must assume
> that the mutex model instead uses an unchecked strategy."
> 
> Comment: The model *may* use an unchecked strategy.  If the strategy
> is unspecified, the programmer shouldn't assume that the thread is
> *guaranteed* to deadlock; "undefined behavior" means that anything
> can happen, including deadlock, aborting the program, or the
> computer going "Star Trek" and exploding in a shower of sparks :-).
> 
> Scheduling policies: What is the difference between "Undefined" and
> "Unspecified"?
Not enough to be relevant, I suppose.  I'll remove Undefined.
>  Under "Priority Driven", it might be worth mentioning the usual way
> of handling priorities without starving low-priority threads: use
> aging, where a thread's priority increases the longer it waits, then
> goes back to its original level after it is serviced.
I'll look into this.
 
> Quote: [Under "Scheduling Policies," "Undefined"] "Users should
> assume that low-priority threads may wait indefinitely, and that
> threads of the same priority acquire the lock in essentially random
> order."
> 
> Comment: Does "indefinitely" mean "forever", or just "an unspecified
> but finite amount of time"?  See also general notes on weak vs.
> strong vs. fair mutexes.
It means what it says ;).  There is no "forever" since the program
will execute in a finite amount of time, but the distinction isn't
worth arguing about.  You can't write reasonable code regardless of
whether it's "forever" or "an unspecified but finite amount of time".
 
> Quote: [In Mutex concept, for expression "M::scoped_lock".]  "A type
> meeting the ScopedLock requirements."
> 
> Comment: Suggest "A model of ScopedLock."  Similar comment applies
> to TryMutex and TimedMutex.
OK.
 
> Quote: [In TryMutex concept.]  "A TryMutex must meet the Mutex
> requirements."
> 
> Comment: Suggest "TryMutex is a refinement of Mutex."  Similar
> comment applies to TimedMutex.
OK.
 
> Quote: "Boost.Threads currently supplies six classes which model
> mutex concepts."
> 
> Comment: Suggest "Boost.Threads currently supplies six models of
> Mutex."  In the table underneath this, suggest replacing "Classes
> Modeling the Concept" with "Models."
OK.
 
> mutex.html
> ----------
> 
> Quote: "The mutex, try_mutex and timed_mutex classes are full
> featured models of the Mutex, TryMutex, and TimedMutex concepts."
> 
> Comment: Somewhat redundant.  Suggest "The mutex, try_mutex, and
> timed_mutex classes are models of Mutex, TryMutex, and TimedMutex
> respectively."
OK.
 
> Table in Introduction: There is a column labeled "Implementation
> defined Lock Type."  This should be removed as an implementation
> detail on which library users should not rely.  Take a look at the
> description of the standard library containers; do they tell you the
> "implementation defined iterator type" for each container class?
Well, at one time there was a reason for documenting these
implementation details, but since we've fully moved this into detail
land you're right.  I'll fix this.
> Quote: "The mutex, try_mutex and timed_mutex classes use an
> Unspecified locking strategy."
> 
> Comment: This isn't really a strategy; it's a refusal to specify the
> strategy.  Suggest "The mutex, try_mutex and timed_mutex locking
> strategies are unspecified."
Well, Unspecified *was* listed as a viable strategy, but this is a
better description anyway.  I'll change it.
 
> Quote: "Programmers should assume that threads waiting for a lock on
> objects of these types acquire the lock in a random order, even
> though the specific behavior for a given platform may be different."
> 
> Comment: "Random" has probabilistic connotations.  Suggest
> "Programmers should make no assumption as to the order in which
> waiting threads acquire a lock."  One exception: for strong mutexes,
> programmers can assume that when a thread holding a mutex releases
> it and there are threads blocked on the mutex, one of the waiting
> threads acquires the lock immediately.  (At least, logically this is
> what happens.)
Again, I don't think I can (or should) mention strong mutexes here.
Otherwise I agree and I'll make the changes.
> Quote: "Class mutex Synopsis"
> 
> Comment: It's worth mentioning that mutex is a model of Mutex and
> NonCopyable, and that the class provides no additional facilities
> beyond the requirements of those two concepts.  Similar comment
> applies to try_mutex and timed_mutex.
I'll look at this.
 
> recursive_mutex.html
> --------------------
> 
> The comments for mutex.html apply here also.
> 
> Quote: "Attempts to unlock them by a thread that does not own a lock
> on them will result in a lock_error exception being thrown."
> 
> Comment: Why not "undefined behavior"?  Do you really mean to
> *guarantee* that a lock_error exception will be thrown?  During
> debugging, I would prefer an assert() that causes a core dump over
> throwing an exception, as the latter loses all information as to the
> context in which this programming error occurred.
Complex subject.  I'll continue to look at this.
 
> lock_concept.html
> -----------------
> 
> As with mutex_concepts.html, this document systematically misuses
> "concept" and "model."  The phrase "lock model" should, in most
> cases, be replaced by "lock object."
> 
> Quote: "The lock concepts provide exception safe means for locking
> and unlocking a mutex model."
> 
> Comment: Concepts are specifications, not mechanisms.  Suggest "Lock
> objects provide ... a mutex."
> 
> Quote: "Instances of lock models are meant to be short lived."
> 
> Comment: Suggest "Lock objects are meant to be short lived."
> 
> Quote: "Lock models must maintain state to indicate whether or not
> they've been locked and this state is not protected by any
> synchronization concepts."
> 
> Comment: Concepts are not mechanisms, and thus cannot protect.
> Suggest "Lock objects maintain state to indicate whether they've
> been locked, and this state is not protected by any synchronization
> mechanism."
> 
> Quote: [Lock requirements table.]
>   Expression
>   (&clk)->operator const void *()
> 
> Comment: I would suggest
>   Expression     Effects
>   clk            Convertible to void *.  Returns nonzero if ...
Hmm... I'll look at this.  Makes sense, just want to make sure I
follow what the standard already does.
> Quote: [Lock requirements table.]
>   Expression    Effects
>   clk.locked()  Returns a bool, (&clk)->operator const void *() != 0
> 
> Comment:  I would suggest
>   Expression    Effects
>   clk.locked()  clk != (void *)0
> 
> Quote: [Lock requirements table.]
>   Expression   Effects
>   lk.lock()    Throws a lock_error if locked().  If the associated
>                mutex is already locked by some other thread, places
>                the current thread in the Blocked state until the
>                associated mutex is unlocked, after which the current
>                thread is placed in the Ready state, eventually to be
>                returned to the Running state.
> 
> Comment:  Do you mean to guarantee that an exception will be thrown?
> When debugging, I would rather have an assert() that causes a core
> dump, so that I can investigate the context in which the programming
> error occurred.  Secondly, the wording seems to imply that when the
> thread holding the lock on a mutex releases it, *any and all*
> threads blocked on the mutex get unblocked.  Finally, the discussion
> of a Ready state brings in thread scheduler implementation details
> that should be of no concern to the programmer.
> 
> Quote: [Lock requirements table, entry for "lk.unlock()."]  "If
> !locked(), throws lock_error..."
> 
> Comment: Do you really mean to *guarantee* that an exception is
> thrown?
> 
> Quote: "A ScopedLock must meet the Lock requirements."
> 
> Comment: Suggest "ScopedLock is a refinement of Lock."
> 
> ScopedLock Concept: If mutex m is of type M, isn't it required that
> type L and M::scoped_lock be the same type in the expression
> "L lk(m);"?
> 
> ScopedTryLock Concept: Similar comments as for ScopedLock.  Also,
> for lk.try_lock() the table guarantees that an exception will be
> thrown for a programming error.
> 
> ScopedTimedLock Concept: Same comments as for ScopedTryLock and
> ScopedLock.
> Other comments:
> - I'm uncomfortable with the use of absolute times (xtime) instead
>   of durations.  The xtime document refers to "the duration since
>   the epoch specified by clock_type."  This "clock_type" is passed
>   to xtime_get(), but how does the timed lock know which clock_type
>   value to use?
Absolute times are basically required for condition variables
(because of the looping) and so I've used absolute times for
consistency.  This was the general consensus I got (from the
admittedly limited responses) when I discussed this during
development on this list.  As for the "clock_type"... there's
currently only one valid clock_type, making this a non-issue.  Things
will change when we settle on a true Boost solution for
times/durations.
> - lk.timed_lock(t) is specified to "return true if successful within
>   the specified time t, otherwise false."  You can't really
>   guarantee this, due to scheduler delays, or delays in acquiring
>   the auxiliary mutex used in the POSIX implementation of
>   timed_mutex.  The best you can say is that the thread won't wait
>   forever, but will wait at least until time t before giving up.
Neither description is fully accurate, but yours is closer.  I'll
work on this.
 
> Models table at end of document: The "Refines" column is wrong for
> every entry.  Every entry should have "Lock" in its "Refines"
> column.
Quite true.
 
> scoped_lock.html:
> -----------------
> 
> I'm not sure this document is necessary or desirable.  The C++
> Standard doesn't go into separate detail on C::iterator for each
> container class C; it just says which concept it models, and any
> additional needed info is included in the description of class C.
> Likewise, all of the information in this document that users should
> see belongs either in the description of the appropriate concept or
> in the description of the mutex class itself.
Another consequence of having the scoped_lock classes in a nebulous
detail/non-detail state during development.  I'll fix this.
> The mutex_type typedef is not mentioned in the Lock concept; it
> should be.
Hmmm... probably true.  It was put there for use by boost::condition,
but this usage is beneficial for other generic usages.  I just didn't
think this one through.
 
> Quote: "Although this class is an implementation detail..."
> 
> Comment: Stop right there.  Implementation details don't belong in
> specifications, except as examples clearly labeled as possibilities
> only.
The idea was that developers could use the templates for the Lock
concepts in their own Mutex refinements.  However, the types have now
been fully moved into detail land and should simply not be documented.
 
> Quote: [For "operator const void * () const"] "If the associated
> mutex is currently locked, a value convertible to true, else a value
> convertible to false."
> 
> Comment: Don't you mean "If the associated mutex is currently locked
> BY THIS OBJECT..."  Also, void * values are NOT convertible to bool,
> although they can be used as the test of an if, while, or for, and
> can be used as an argument to &&, ||, and !.
Correct on both counts.
 
> Quote: "A const void * conversion is considered safer than a
> conversion to bool."
> 
> Comment: You should explain why.
This is the rationale, and re-iterating this rationale every time
it's used is needless IMHO.  However, it's a non-issue when I remove
this documentation ;).
 
> scoped_timed_lock.html
> scoped_try_lock.html
> ----------------------
> 
> See comments for scoped_lock.html.
> 
> 
> condition.html
> --------------
> 
> Quote: "...a mutex object modeling a Mutex concept."
> 
> Comment: Types model concepts; objects do not.
> 
> Quote: "...a lock object modeling a Lock Concept..."
> 
> Comment: Types model concepts; objects do not.
> 
> Quote: "The mutex must be locked prior to waiting on the condition,
> which is ensured by passing a lock object modeling a Lock Concept to
> the object's wait functions."
> 
> Comment: This doesn't ensure that the mutex is locked, although it 
does ensure
> that the mutex gets unlocked.
Actually, it does ensure it, through a call to locked() on the lock 
object passed in.
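The check can be sketched like this (the names wait_sketch, fake_lock, and this lock_error are illustrative stand-ins, not the library's actual source):

```cpp
#include <cassert>
#include <stdexcept>

// Illustrative stand-in for the library's exception type.
struct lock_error : std::logic_error
{
    lock_error() : std::logic_error("lock_error") {}
};

// Sketch of a condition wait() that ensures the mutex is locked by
// querying the lock object before doing anything else.
template <typename ScopedLock>
void wait_sketch(ScopedLock & lock)
{
    if (!lock.locked())
        throw lock_error();  // caller failed to lock the mutex first
    // ... atomically unlock the mutex and block on the condition,
    // then reacquire the mutex before returning ...
}

// Minimal stand-in for a scoped lock, for demonstration only.
struct fake_lock
{
    bool held;
    bool locked() const { return held; }
};
```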
> Quote: "While the thread is waiting on the condition object, the mutex
> associated with the lock is unlocked."
> 
> Comment: The use of "While..." would seem to indicate that "is unlocked" is
> not the passive form of "lock", but a simple statement of the state of the
> mutex.  That is, the sentence seems to say that the mutex remains unlocked
> during the entire time that the thread is waiting on the condition object.
> Suggest "Upon blocking on the condition object, the thread unlocks/releases
> the mutex."
>
> Quote: "When the thread returns from a call to one of the condition object's
> wait functions, the mutex is again locked."
> 
> Comment: Does not make clear who locks the mutex, and is not clear on whether
> this happens just before or just after the call.  Suggest "The wait functions
> reacquire the mutex just before returning."
I'll work on this.
> Quote: "The tricky lock/unlock/lock sequence is performed automatically by the
> condition object's wait function."
> 
> Comment: Only the unlock/lock part is performed by the wait function.
True.  This description needs some cleaning up.
 
> Quote: "notify_one()... Effects: If there is a thread waiting on *this, change
> that thread's state to ready."
> 
> Comment: This is not completely accurate.  After being unblocked, the thread
> then has to reacquire the mutex, which may cause it to block again.  Also, the
> issue of fairness (lack of starvation) needs to be addressed: in the case
> where some thread t is waiting on a condition and there are always multiple
> threads waiting when a notify_one() occurs, is there any guarantee that
> eventually thread t will be chosen, or may thread t *always* be passed up in
> favor of another thread and wait forever on the condition?  (I'm not
> considering here the issue of subsequently reacquiring the mutex.)
The re-locking semantics are covered by the documentation of the wait 
methods.  I'm not sure that changing the wording here would be 
useful.  And again, I'm not sure that specifying fairness is possible 
here; even with fair implementations you can get the same problems, 
as illustrated by starve_phil.
> Quote: "void wait(ScopedLock & lock); ... All effects occur in an atomic
> fashion."
> 
> Comment: This is not and cannot be true.  The initial unlock/suspend can be
> done atomically, but becoming unblocked is a two-stage process: (1) the thread
> gets awakened by some other thread calling notify_one() or notify_all(); (2)
> the thread reacquires the mutex.  For conditions, we really need to
> distinguish two separate blocking states: blocked on the condition itself, and
> blocked on the mutex while attempting to reacquire the mutex.
Technically accurate, but I'm not sure the distinction is useful.
 
> Quote: "timed_wait ... Throws: lock_error if !lock.locked()"
> 
> Comment: Do you mean to guarantee that an exception is thrown?
This is what ensures the mutex is locked ;).  Seriously, though, I'm 
still thinking about this issue.
 
> Quote: "bool timed_wait(ScopedLock & lock, const xtime & xt);"
> 
> Comment: See my previous reservations about using an absolute time instead of
> a duration.
The rationale for absolute times is well documented by POSIX.  Using 
durations for timed waits on conditions is very problematic given the 
presence of "spurious wakeups" and the need to check the predicate in 
a loop.  A duration would have to be recalculated on each loop 
iteration, which complicates the code and can affect overall 
performance.
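The shape of that argument, sketched in modern C++ terms (std::condition_variable rather than the 2001 Boost interface):

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// With an absolute deadline the wait target is computed once; every
// spurious wakeup simply re-waits against the same point in time.
// With a duration, the remaining time would have to be recalculated
// on each iteration of the predicate loop.
std::mutex mtx;
std::condition_variable cond;
bool ready = false;

bool wait_for_ready()
{
    std::chrono::steady_clock::time_point deadline =
        std::chrono::steady_clock::now() + std::chrono::milliseconds(50);
    std::unique_lock<std::mutex> lk(mtx);
    while (!ready)  // predicate loop guards against spurious wakeups
    {
        if (cond.wait_until(lk, deadline) == std::cv_status::timeout)
            return ready;  // deadline reached; report the final state
    }
    return true;
}
```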
 
> Quote: "timed_wait ... Returns: false if xt is reached, otherwise true."
> 
> Comment: You can't really guarantee this.  See my previous comments on
> ScopedTimedLock.
> 
> 
> thread.html:
> ------------
> 
> Quote: "This includes completion of all thread cleanup handlers,..."
> 
> Comment: What thread cleanup handlers?  Boost.Threads has no such notion.
A misuse of a term familiar to those knowledgeable in POSIX.  I meant 
thread specific storage cleanup handlers.
 
> Quote: "if a thread object is destroyed without join() having first been
> called, the thread of execution continues until its initial function
> completes."
> 
> Comment: This sounds scary.  Another possibility is to have the thread dtor
> call join().  If you think the existing semantics are best, it would be nice
> to have a section in rationale.html explaining your reasoning.
This is the POSIX concept of a detached thread.
 
> Quote: "thread(const boost::function0<void> & threadfunc);"
> 
> Comment: I would like to see a hyperlink here to a description of the
> function0 template.  I am not familiar with the function0 template, so I have
> to ask: does the const prevent me from using a function object whose state is
> intended to be modifiable by the newly-created thread?
Yes and no, with the emphasis on yes.  This is an error in the 
implementation.
 
> Quote: "Comparison Operators..."
> 
> Comment: Gives requirements on *this (left-hand side of ==), but not on rhs.
Hmm... I'll fix this.
> Quote: "static void sleep(const xtime & xt);"
> 
> Comment: As previously mentioned, I have doubts about the use of absolute time
> instead of duration.  Again, how does sleep() know the appropriate
> clock_type/epoch to use?
> 
> 
> thread_group.html:
> ------------------
> 
> Quote: "void remove_thread(thread* thrd);
>         Effects: Removes *this's list of managed thread objects.
>         Throws: ? if thrd is not in *this's list of managed thread objects."
> 
> Comment: What is the "?" for?
This was missed during documentation.  I'll fix this.
>  For effects, don't you mean "Removes thrd from
>  *this's list of managed thread objects"?
Yep.
 
> call_once.html
> --------------
> 
> Why does call_once() take a void (*func)() argument instead of a function
> object?
Originally, for safety (constructing the function object could have 
synchronization issues as well).  I expect I'll change this, though.  
It's been brought up before.
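That is indeed the direction standard C++ eventually took; a sketch in modern terms, where std::call_once accepts an arbitrary callable rather than a plain void(*)():

```cpp
#include <cassert>
#include <mutex>

// std::call_once runs its callable exactly once per once_flag, no
// matter how many times it is invoked -- the change to a generic
// callable anticipated in the reply above.
std::once_flag flag;
int init_count = 0;

void init() { ++init_count; }

void startup()
{
    std::call_once(flag, init);  // runs init() the first time only
}
```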
 
> thread_specific_ptr.html
> ------------------------
> 
> Quote: "Note: There is an implementation specific limit to the number of
> thread specific storage objects that can be created, and this limit may be
> small."
> 
> Comment: This note is not very useful without *some* way of knowing what the
> limit may be.  Perhaps there should be a const static value giving the limit,
> which will vary from one platform to another.  It would be nice to know a
> lower bound on this limit (i.e., you're guaranteed at least 5 thread-specific
> pointers), and it would be nice to know the limit (or a reasonably tight lower
> bound) on the most common platforms.
There's a limit on memory, too, but we get along well without a means 
to tell how much memory is left for our use.  There's also the issue 
that on Win32 the number depends on the actual OS (which can be 
worked around with runtime checks).  Then there's the fact that we're 
considering an implementation that limits this only by the amount of 
available memory.  This is something I've thought about but left out 
of the initial release for all of these reasons.
 
> Quote: "Destructor... Does not destroy any data that may be stored in any
> thread's thread specific storage.  For this reason you should not destroy a
> thread_specific_ptr object until you are certain there are no threads running
> that have made use of its thread specific storage."
> 
> Comment: This sounds unsafe and prone to resource leaks.
But it's also all that POSIX guarantees.  In fact, many books 
recommend never calling pthread_key_delete in portable programs for 
this very reason.  This may all change at some point if we change the 
implementation, but for now this is the guarantee.
>  A rationale is
> needed here for why you consider this better than having thread_specific_ptr
> be like auto_ptr or scoped_ptr (dtor automatically deletes pointer).
I don't consider it better; I just consider it the best guarantee 
given by the current implementations.  Until a final decision is made 
here, I'm not sure that a rationale will help anyone.
 
> xtime.html
> ----------
> 
> Why not provide the obvious two-argument ctor for xtime?  It would still be
> compatible with the C version of xtime.
Because I find it highly unlikely that the C++ standard would do 
this.  I instead expect we'll either replace xtime entirely with a 
C++ specific time concept, or that a C++ specific time concept will 
have a conversion (explicit or not) to xtime.  In other words, I 
don't expect a C++ programmer to use xtime without some sort of OO 
wrapper, if he'll use it at all.
Don't get too hung up on xtime.  It's there solely as a placeholder 
until Boost can decide upon a direction for time/duration handling 
concepts.
> Quote: "xtime_get(struct xtime * xtp, int clock_type);"
> 
> Comment: It's not crystal clear what clock_type is.  I gather that it's
> supposed to be one of the enums in the synopsis, but this isn't explicitly
> stated anywhere.
You're correct.  I need to work on this a bit.
 
> lock_error.html
> ---------------
> 
> Quote: "class lock_error : public std::runtime_error"
> 
> Comment: A lock_error is not a runtime error.  It is an error which it is
> reasonable to expect the programmer to avoid.  Since it is a programmer error,
> it should derive from std::logic_error.
I can buy that.
 
> I would recommend that exceptions indicating programmer errors include the
> file name and line number where the error was detected, to aid in debugging.
> Example:
> 
>   class lock_error : public std::logic_error
>   {
>     char const * file_;
>     char const * line_;
>   public:
>     lock_error(char const * file, char const * line)
>     : logic_error(string("Lock error at file: ") + file + ", line: " + line),
>       file_(file), line_(line)
>     { }
>     char const * file() { return file_; }
>     char const * line() { return line_; }
>   };
Storing the line as a separate string is strange.  If it were to be 
stored separately I'd store it as an integral type.  However, this 
whole concept is foreign to the design of the C++ standard and would 
be considered an implementation detail useful for debugging.  So I'll 
consider adding this info to the what() string, but I'm not sure it 
should be documented as a requirement of the implementation.
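One possible shape for folding the location into what(), with the line kept as an integral type as suggested above (illustrative only, not a committed interface):

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Sketch: the location is part of the what() string rather than being
// exposed through documented accessors, so it stays an implementation
// detail useful for debugging.
class lock_error : public std::logic_error
{
    static std::string make_what(char const * file, int line)
    {
        std::ostringstream os;
        os << "lock error at " << file << ":" << line;
        return os.str();
    }
public:
    lock_error(char const * file, int line)
        : std::logic_error(make_what(file, line)) {}
};

// A macro captures __FILE__/__LINE__ at the throw site:
#define THROW_LOCK_ERROR() (throw lock_error(__FILE__, __LINE__))
```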
 
>   ...
>   throw lock_error(__FILE__, __LINE__);  // Perhaps define a macro
>   ...
And this illustrates why C++ wasn't designed this way ;).
 
> config.html
> -----------
> 
> You should make clear whether these macros are intended to be used as
> 
>   #ifdef MACRO_NAME
> 
> or
> 
>   #if MACRO_NAME
That would depend solely on the code using them, no?  Seriously, 
though, I think this is fairly obvious from the macro names, given 
they are all BOOST_HAS_* variants.
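For the record, the intended form is the definition test; a fragment using BOOST_HAS_PTHREADS as a representative example:

```cpp
// The BOOST_HAS_* macros are either defined or left undefined, so test
// for definition; #if would break if a macro were ever defined with an
// empty value.
#ifdef BOOST_HAS_PTHREADS
// ... select the pthreads-based implementation ...
#else
// ... fall back to another threading API (e.g. Win32 threads) ...
#endif
```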
Thanks for your comments.  They are definitely useful.
Bill Kempf