$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [boost] [optional] generates unnessesary code for trivial types
From: Hite, Christopher (Christopher.Hite_at_[hidden])
Date: 2012-01-26 10:22:36
Thanks for your feedback.
> I think the generated code gets somewhat simplified once issue (1) is addressed.
It would help, but I think won't get rid of all the branches.  Your refactoring might help more.
> I think it would be a mistake to just blindly copy the value of b when b.m_initialized is false,
> if for no other reason than doing so will lead to endless user complaints about compiler and
> valgrind warnings. Also, invoking undefined behavior can result in the compiler doing very
> nasty and unexpected things, even in the absence of runtime issues from reading an
> "uninitialized" location. Consider the possibility that the compiler can prove that the
> optional being copied from is uninitialized, and so can conclude that the read of its value
> is undefined behavior. Probably the *best* one can hope for in such a situation is a
> compiler warning, and many far worse results are possible.
Consider the completely legal code below:
struct cheap_optional_int{
        cheap_optional_int() : m_initialized() {} // don't init m_data
        bool m_initialized;
        int  m_data;
};
void assign_boost_cheap_optional_int(cheap_optional_int& a,cheap_optional_int& b){
        a=b;  // default impl
}
The compiler generates nothing but 32-bit moves from the source to the destination.  This is completely fine for valgrind.  It only complains if a branch based is taken based on uninitialized data.
00000000 <assign_boost_cheap_optional_int(cheap_optional_int&, cheap_optional_int&)>:
   0:   53                      push   %ebx
   1:   8b 44 24 0c             mov    0xc(%esp),%eax
   5:   8b 58 04                mov    0x4(%eax),%ebx
   8:   8b 08                   mov    (%eax),%ecx
   a:   8b 44 24 08             mov    0x8(%esp),%eax
   e:   89 08                   mov    %ecx,(%eax)
  10:   89 58 04                mov    %ebx,0x4(%eax)
  13:   5b                      pop    %ebx
  14:   c3                      ret
Sorry the assembler is so poorly formatted after it's mailed.
The cool thing is cheap_optional_int has_trivial_destructor and has_trivial_copy because we haven't overridden the defaults.
Unfotunately overriding the default ctor/dtor always breaks these, even if the code could be optimized out.  It may not even be possible for a compiler to solve.
Chris
_____________________________________________
From: Hite, Christopher
Sent: Wednesday, January 25, 2012 6:29 PM
To: 'boost_at_[hidden]'
Subject: [optional] generates unnessesary code for trivial types
When decompiling my code I noticed a bunch of unnessesary code caused by boost::optional.
1) deconstruction
typedef boost::optional<int> optional_int;
void deconstruct_boost_optional(optional_int& o){
        o.~optional_int();
}
One would expect this to do nothing.  Instead gcc 4.6.0  with O3 generates:
        if(m_initialized){
                // do nothing
                m_initialized = false;
        }
00000000 <deconstruct_boost_optional(boost::optional<int>&)>:
   0:   8b 44 24 04             mov    0x4(%esp),%eax
   4:   80 38 00                cmpb   $0x0,(%eax)
   7:   74 03                   je     c <deconstruct_boost_optional(boost::optional<int>&)+0xc>
   9:   c6 00 00                movb   $0x0,(%eax)
   c:   f3 c3                   repz ret
This one could be easily fixed by removing the bit that sets m_initialized to false, since we're deconstructing anyway.
2) assignment also generates these problems:
void assign_boost_optional(optional_int& o){
        o=13;
}
Here there's a semantic issue: we have to decide to use the copy constructor or operator=.  This is also wasteful for POD types or any type which has_trivial_copy<>.
3) Even more expensive is if we want to copy an optional<int>
void assign_boost_optional(optional_int& a,optional_int& b){
        a=b;
}
00000000 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)>:
   0:   8b 44 24 04             mov    0x4(%esp),%eax
   4:   8b 54 24 08             mov    0x8(%esp),%edx
   8:   80 38 00                cmpb   $0x0,(%eax)
   b:   74 0b                   je     18 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x18>
   d:   80 3a 00                cmpb   $0x0,(%edx)
  10:   75 16                   jne    28 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x28>
  12:   c6 00 00                movb   $0x0,(%eax)
  15:   c3                      ret
  16:   66 90                   xchg   %ax,%ax
  18:   80 3a 00                cmpb   $0x0,(%edx)
  1b:   74 09                   je     26 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x26>
  1d:   8b 52 04                mov    0x4(%edx),%edx
  20:   c6 00 01                movb   $0x1,(%eax)
  23:   89 50 04                mov    %edx,0x4(%eax)
  26:   f3 c3                   repz ret
  28:   8b 52 04                mov    0x4(%edx),%edx
  2b:   89 50 04                mov    %edx,0x4(%eax)
  2e:   c3                      ret
Three possible branches!  Theoretically single 64 bit copy do the job.  I'm tempted to say: it would be best if for any T has_trivial_copy< optional<T> > iff has_trivial_copy<T>.  It might make a sense to make an exception for huge T, where the copying an unused T is more expensive than the branching.
4) has_trivial_destructor<T> should impl has_trivial_destructor< optional<T> > , but this is hard to implement without specialization of optional.
Checking has_trivial_destructor might take care of the complexity of optional<T&> since has_trivial_destructor< T& >.
I'd be willing to fix #1.  The other issues need some discussion.
Chris