Subject: Re: [boost] several messages
From: John Maddock (boost.regex_at_[hidden])
Date: 2012-08-05 06:52:06


>> When ETs are turned on, this will always IMO be slower: consider an
>> operator returning an expression template - it's basically returning a
>> pair of references (and larger objects still for more complex
>> expressions), whereas say a wrapped integer is actually returning a
>> smaller cheaper to copy object in that case. So for a wrapped integer,
>> returning the result by value will always win out, and that's before you
>> even consider the cost of unpacking the expression template.
>
> I believe you are a bit pessimistic. Compilers are not bad at inlining
> functions and removing wrappers that do nothing, even for
> expression-templates. And yes, returning a pair of references is doing
> nothing if your function is inlined. Some analysis of what remains would
> be interesting. What usually happens is that:
> * expression template wrappers sometimes contain runtime checks (e.g. for
> aliasing between variables), some of which can be impossible to determine
> at compile-time;
> * some compiler optimization opportunities that would have happened early
> are now only exposed after a lot of inlining / simplification has taken
> place, which may be too late for some compilers (but then it is not too
> hard for compilers to make progress there, if it is pointed out to them).

Nod. Lots to investigate I guess ... of course if a returned ET can be
optimised away, so can the wrapped type that's returned directly (if it's
small enough).
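
To make the size argument concrete, here is a rough sketch. All names in it
(WrappedInt, AddExpr, lazy_add, eager_add) are made up for illustration and
are not taken from the sandbox code. It only compares what each style of
operator hands back to the caller; whether the optimiser removes either one
after inlining is a separate question that would need measuring.

```cpp
#include <cstddef>

// Hypothetical wrapped integer: the "number" is just one int.
struct WrappedInt { int v; };

// Hypothetical expression-template node for a + b: it stores a pair of
// references to its operands and does the arithmetic only on request.
struct AddExpr {
    const WrappedInt& lhs;
    const WrappedInt& rhs;
    WrappedInt eval() const { return WrappedInt{lhs.v + rhs.v}; }
};

// ET-style operator: returns the node (two references), computes nothing.
inline AddExpr lazy_add(const WrappedInt& a, const WrappedInt& b) {
    return AddExpr{a, b};
}

// Non-ET operator: computes immediately and returns the result by value.
inline WrappedInt eager_add(const WrappedInt& a, const WrappedInt& b) {
    return WrappedInt{a.v + b.v};
}

// Object sizes as seen by the caller. On mainstream ABIs the node is two
// pointers wide while the wrapped integer is a single int.
constexpr std::size_t kExprSize  = sizeof(AddExpr);
constexpr std::size_t kValueSize = sizeof(WrappedInt);
```

Both routes give the same value once the node is evaluated; the difference
is only in the size of the returned object and the extra unpacking step.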

>>>> Note that changing the return type from Number&& to Number cancels the
>>>> allocation gain when using a type like GMP that doesn't have an empty
>>>> state.
>>>>
>>>
>>> Huh, really? That's no good.
>>
>> Don't panic it's OK ;-)
>>
>> The current sandbox code does have these rvalue ref operator overloads,
>> does return by value for safety, and still manages to avoid the extra
>> allocations - so for example in my horner test case, evaluating:
>>
>> Real result = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) *
>> x + a[1]) * x + a[0];
>>
>> Results in just one allocation even when expression templates are turned
>> off - the first operator overload called generates a temporary, which
>> then gets moved and reused, eventually ending up in the result.
>
> Uh?
> This is indeed what happens, but for GMP types, unless you added in your
> wrapper a special 0 state (which you then have to test in every
> operation), every constructor has to allocate, including the move
> constructor, since a moved-from object must still be in a valid state.
>
> Did you add an empty state then?

The move constructor doesn't allocate - it takes ownership of the GMP
variable, and sets the variable in the moved-from object to a null state.
The *destructor* then has an added check to ensure it doesn't try and clear
null GMP objects: that's basically the only change. IMO the cost of the
extra if statement in the destructor is worth it - and should be trivial
compared to calling the external library routine to clear the GMP variable.
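
A minimal sketch of that scheme follows. It uses a hypothetical stand-in
for the GMP variable (Limbs, limbs_init, limbs_clear and the g_allocs
counter are invented here so the example is self-contained); it is not the
sandbox code and makes no real GMP calls. The Number class and the horner()
helper are likewise illustrative only.

```cpp
#include <cassert>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for an external library variable that owns heap
// storage; init allocates, clear frees.
static int g_allocs = 0;  // number of limbs_init calls so far
struct Limbs { long* data; };
inline void limbs_init(Limbs& v, long x) {
    v.data = static_cast<long*>(std::malloc(sizeof(long)));
    *v.data = x;
    ++g_allocs;
}
inline void limbs_clear(Limbs& v) { std::free(v.data); v.data = nullptr; }

class Number {
    Limbs v_;
public:
    explicit Number(long x) { limbs_init(v_, x); }
    Number(const Number& o) { limbs_init(v_, o.value()); }
    // Move: take ownership and leave the source in a null state.
    // No allocation happens here.
    Number(Number&& o) noexcept : v_(o.v_) { o.v_.data = nullptr; }
    Number& operator=(Number o) noexcept { std::swap(v_, o.v_); return *this; }
    // The only extra cost: one null check before clearing.
    ~Number() { if (v_.data) limbs_clear(v_); }

    // A moved-from Number may only be destroyed or assigned to.
    long value() const { assert(v_.data); return *v_.data; }
    Number& operator+=(const Number& o) { *v_.data += o.value(); return *this; }
    Number& operator*=(const Number& o) { *v_.data *= o.value(); return *this; }
};

// Both operands are lvalues: a fresh temporary has to be allocated.
inline Number operator*(const Number& a, const Number& b) {
    Number r(a); r *= b; return r;
}
inline Number operator+(const Number& a, const Number& b) {
    Number r(a); r += b; return r;
}
// Left operand is a temporary: reuse its storage, then return by value
// (which moves, so the storage is handed on without allocating).
inline Number operator*(Number&& a, const Number& b) { a *= b; return std::move(a); }
inline Number operator+(Number&& a, const Number& b) { a += b; return std::move(a); }

// The Horner expression from the test case, for a degree-6 polynomial.
inline Number horner(const Number (&a)[7], const Number& x) {
    return (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) *
            x + a[1]) * x + a[0];
}
```

In this sketch only the innermost a[6] * x picks the lvalue overload and
allocates; every later operator sees a temporary on its left, so the same
storage is moved along until it ends up in the result.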

John.