From: Howard Hinnant (hinnant_at_[hidden])
Date: 2007-04-23 15:16:21
My apologies. It was not my intent to start a mini-firestorm and then
leave town for the next 10 days. ;-)
On Apr 16, 2007, at 4:32 PM, Ion Gaztañaga wrote:
> I'm terrible explaining my points.
I can relate. :-)
There exists a small gap between what was voted into the WP last week
regarding move semantics, and what my intent is. Below I attempt to
explain my intent, and where that differs from what is in the WP, I
will attempt to fix with defect reports.
---
In general, when std:: code deals with rvalue reference parameters
that are overloaded with lvalue reference parameters, it is allowed to
assume that it is dealing with an actual temporary, and not a
moved-from lvalue. Reference:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1858.html#23.1%20-%20Container%20requirements
paragraph 13:
> -13- Those container member functions with rvalue-reference
> parameters may assume that the reference truly refers to an rvalue,
> and specifically is not a reference into the same container.
Example:
template <class T, class Allocator = allocator<T> > class vector {
    ...
    iterator insert(iterator position, const T& x);
    iterator insert(iterator position, T&& x);
    ...
};
Inside the insert overload taking a T&&, the code is allowed to assume
that x has bound to a temporary. This means that the code does not
have to check for the self-referencing case, and is thus faster. If a
client moves an lvalue to the insert signature, the onus is on the
client to make sure that all existing references to that lvalue
promise not to notice (or care) that the lvalue has been moved from.
void foo()
{
    std::vector<A> v(10);
    v.insert(v.end(), std::move(v.front()));
}
The above is a run time error similar to:
void foo()
{
    std::vector<A> v(10);
    v.insert(v.end(), v.begin(), v.begin()+1);
}
(the latter is currently forbidden by 23.1.1, p4)
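
A minimal sketch of the overload selection described above. The
toy_vector class and its insert_back member are hypothetical names
invented for illustration (this is not std::vector); each overload only
reports which one was chosen. It shows that a std::move'd lvalue lands
in the same overload as a true temporary, which is why the no-aliasing
promise falls on the client:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy container, not std::vector: it only records which
// overload of insert_back was selected.
template <class T>
class toy_vector {
public:
    // Lvalue overload: x might alias an element of *this, so a real
    // implementation would have to guard against self-reference.
    const char* insert_back(const T& x) {
        data_.push_back(x);
        return "copy";
    }

    // Rvalue overload: allowed to assume x is a true temporary.
    const char* insert_back(T&& x) {
        data_.push_back(std::move(x));
        return "move";
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// Returns the overload names chosen for an lvalue, a temporary, and a
// std::move'd lvalue, in that order.
inline std::string overloads_chosen() {
    toy_vector<std::string> v;
    std::string s("abc");
    std::string result;
    result += v.insert_back(s);                   // lvalue
    result += ",";
    result += v.insert_back(std::string("tmp"));  // temporary
    result += ",";
    result += v.insert_back(std::move(s));        // client-promised rvalue
    return result;
}
```

Calling overloads_chosen() from main and printing the result shows the
three selections in order.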
---
Now let's look at the other side:
void foo()
{
    A a;
    std::vector<A> v;
    v.insert(v.end(), std::move(a));
    // Here "a" is required to be in a valid state, but the value of
    // "a" is unknown
}
The definition of "valid" is up to the author of A. At a minimum,
generic code must be able to destruct "a" and/or assign it a new value.
However the author of A is free to define as part of the class
invariant:
After an A is moved from, you can't call A::bar()
I don't think that is a very good design myself. But I also don't
think the standard should prohibit it.
For std::types I do not want to see any such restrictions on moved-
from values. For example if a vector gets moved from, I expect it to
still be fully functional, just with an unknown state (but you can
test the state with the existing interface, e.g. size(), capacity(),
empty(), etc.), and set the state however you like.
With perhaps a few exceptions, I do not want to see the value of a
moved-from std::type specified. For example:
std::string s1("123");
std::string s2(std::move(s1));
// value of s1 not reliable here. "123" and "" are two good guesses
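
As a small sketch of what "fully functional, unknown state" permits,
the hypothetical helper below (reuse_after_move is a name made up for
this example) never reads the moved-from value. It only uses operations
that have no precondition on the current value to put s1 back into a
known state:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Moves from s1, then puts s1 back into a known state using only
// operations that have no preconditions on the current value.
inline std::string reuse_after_move() {
    std::string s1("123");
    std::string s2(std::move(s1));
    // The value of s1 is not relied upon here; only its validity is.
    s1.clear();   // set a known state
    s1 += "456";  // s1 is fully functional again
    return s2 + s1;
}
```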
---
Originally I preferred vector move assignment as just swap(). However
I've recently become convinced that this is not the best definition.
I now prefer the semantics of clear(); swap();. This means that move
assignment is O(N) instead of O(1). Ok Ion, take a deep breath and
stay with me for just a minute longer. :-)
The semantics of ~thread() is going to be cancel(); detach();. That
is, if thread A is holding a std::thread referencing thread B, and
thread C cancels thread A, as thread A's stack unwinds, it will
execute b.~thread() which cancels B, but does not wait around for B to
respond to the cancel. Thus canceling A will not hang waiting on B to
finish up.
Now consider vector<thread>:
vector<thread> v1;
...
vector<thread> v2;
...
v1 = std::move(v2);
After the move assign, I think it best that whatever threads v1
happened to previously refer to, are now busy canceling themselves,
instead of continuing to run under v2's ownership. If you want the
latter, you've got swap.
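
To make the difference between the two definitions concrete, here is a
sketch under stated assumptions: bag is a hypothetical stand-in for
vector<thread>, and shared_ptr<int> elements stand in for threads so
that a weak_ptr can observe whether the target's old resource was
released. Neither member function is a real std::vector API:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a container of resources (e.g. vector<thread>).
struct bag {
    std::vector<std::shared_ptr<int>> items;

    // Option 1: move assignment as plain swap().
    void assign_by_swap(bag&& other) { items.swap(other.items); }

    // Option 2: move assignment as clear(); swap();.
    void assign_by_clear_swap(bag&& other) {
        items.clear();            // the target's old resources are released now
        items.swap(other.items);  // the target's old buffer goes to other
    }
};

// Returns true if the resource originally owned by the target is still
// alive right after the move assignment.
inline bool old_resource_survives(bool clear_first) {
    bag v1, v2;
    v1.items.push_back(std::make_shared<int>(1));
    v2.items.push_back(std::make_shared<int>(2));
    std::weak_ptr<int> old = v1.items.front();

    if (clear_first)
        v1.assign_by_clear_swap(std::move(v2));
    else
        v1.assign_by_swap(std::move(v2));

    return !old.expired();
}
```

With plain swap the old resource lives on inside v2 until v2 is
destroyed; with clear-then-swap it is gone as soon as the assignment
returns.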
Cost analysis: It isn't nearly as bad as it sounds. Promise!!! :-)
First, for vector<type with trivial destructor>, clear() is O(1) in
practice (or it should be). And clear() does not dump capacity, so
the capacity gets reused via the swap. So we only have to worry about
vector<type with non-trivial destructor>.
Consider vector<vector<thread>>::insert. And we're going to insert a
new vector<thread> at begin(). The case where there is insufficient
capacity only uses move construction, and not move assignment. So
assume there is sufficient capacity.
In this case we first move construct *--end() into *end(). This move
construction leaves a zero capacity vector<thread> at *--end() (though
that value should not be guaranteed, there is really little else
vector can do for move construction).
The next step is:
*(end()-1) = std::move(*(end()-2));
Now we already said that *(end()-1) is a zero capacity vector<thread>
prior to this move assignment. Therefore when the move assignment
clears it, that clear is a no-op. Then the states are swapped, making
*(end()-2) a zero capacity vector (again, that state should not be
guaranteed, but it is practical).
Repeat:
*(end()-2) = std::move(*(end()-3));
Again we are move assigning into a zero-capacity vector, so
clear();swap(); and just swap(); are virtually identical sequences of
instructions. This repeats all the way down to:
*(begin()+1) = std::move(*begin());
leaving *begin() as a zero capacity vector<thread>. Then the outside
vector<thread> is move assigned into *begin(). The outside temporary
vector<thread> is then a zero capacity vector and presumably destructs.
Summary: For vector<vector<thread>>::insert, defining vector move
assignment as clear()+swap() instead of just swap() has a nearly
identical cost. There are no extra destructions.
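
The walk-through above can be sketched as a small simulation. Everything
here is an assumption made for illustration: inner is a hypothetical
element type whose move assignment is clear(); swap(); and which counts
how many elements the clear step actually destroys, and
shift_and_insert_cost performs the insert-at-begin() steps by hand
rather than calling vector::insert:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical inner container with move assignment defined as
// clear(); swap();. It counts elements destroyed by the clear step.
struct inner {
    std::vector<int> data;
    static std::size_t cleared_elements;

    inner() = default;
    explicit inner(std::size_t n) : data(n, 0) {}
    // Move construction by swapping with an empty vector, so the
    // source is left empty by construction.
    inner(inner&& o) noexcept { data.swap(o.data); }
    inner& operator=(inner&& o) noexcept {
        cleared_elements += data.size();  // cost of the clear() step
        data.clear();
        data.swap(o.data);
        return *this;
    }
};
std::size_t inner::cleared_elements = 0;

// Simulates inserting a new element at the front of a sequence of n
// live elements (n >= 1) with sufficient capacity. Returns the number
// of elements destroyed by the clear() steps along the way.
inline std::size_t shift_and_insert_cost(std::size_t n) {
    std::vector<inner> slots;
    slots.reserve(n + 1);  // guarantee no reallocation below
    for (std::size_t i = 0; i < n; ++i)
        slots.emplace_back(3);  // each live element holds 3 ints
    inner::cleared_elements = 0;

    // Step 1: move construct the last element into the spare slot.
    slots.emplace_back(std::move(slots[n - 1]));

    // Step 2: move assign each element one position to the right.
    for (std::size_t i = n - 1; i > 0; --i)
        slots[i] = std::move(slots[i - 1]);

    // Step 3: move assign the new value into the front slot.
    inner value(3);
    slots[0] = std::move(value);

    return inner::cleared_elements;
}
```

Every move assignment in the sequence targets an element that was just
moved from, so the counter stays at zero.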
I've gone through the same analysis with vector::erase, and all
sequence modifying algorithms in <algorithm> which make use of move
semantics. During all of these generic algorithms, move assignment is
always assigning to a target that has already been moved from. Thus
"clearing" the target is a no-op. And I consider
container::insert/erase and the <algorithm> algorithms to be among the
biggest consumers of move semantics, and even more importantly, typical
use cases of move semantics.
In general, if you are move assigning *to* something, you've probably
already move assigned or move constructed *from* it. That being said,
if you haven't already moved from the target, and the target is a
std::container or other std::type which owns objects, I think a
"clear" is prudent so that move assignment has the same semantics as
copy assignment.
Getting back to shared_ptr move assignment, I would like to see the
target's reference count decremented if it is not already at 0 prior
to the move assignment (just as in copy assignment - this is the
"clear" part of the algorithm). I would also like to not see any
*new* constraints placed on the source as a result of being moved
from. And finally I do not see the need to specify the value of the
source after the move. I would like vendors to have the freedom to
implement shared_ptr move assignment exactly as they do shared_ptr
copy assignment. If atomic operations can be avoided (for example) by
assuming that the source is a temporary (under move assignment) then
so much the better.
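
A short sketch of the "clear" part for shared_ptr, using only standard
std::shared_ptr operations. The helper name is hypothetical, and nothing
is asserted about the source after the move, in keeping with the intent
above:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns the use_count of the object the target used to share, observed
// through a second owner, after the target has been move assigned to.
inline long old_target_use_count_after_move_assign() {
    std::shared_ptr<int> target = std::make_shared<int>(1);
    std::shared_ptr<int> other_owner = target;  // use_count == 2
    std::shared_ptr<int> source = std::make_shared<int>(2);

    target = std::move(source);  // target lets go of its old object

    // Nothing is asserted about `source` here.
    return other_owner.use_count();
}
```

The count drops from 2 to 1, just as it would under copy assignment.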
-Howard