From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2007-08-21 13:11:53


Giovanni Piero Deretta wrote:

> IIRC the latest version of boost::function does exactly that. It uses
> a small object optimization for simple functors (like stateless ones
> and ptr to member function bound to a 'this' ptr). I do not think it
> actually *guarantees* that it will never do dynamic allocations, only
> that it will do its best, so probably you can't rely on it. The size
> of the small object buffer is probably implementation (and
> architecture) dependent.

That's good. However, there is still something that function object
generators like boost.lambda could improve.

Take this for example:

int main()
{
     int a = 12;
     int b = 68;

     do_something(_1 + a + b);
}

Technically, this could probably generate something along the lines of:

struct functor
{
     int&& a;
     int&& b;

     functor(int&& a_, int&& b_) : a(a_), b(b_)
     {
     }

     template<typename T>
     decltype(std::make<T>() + a + b) operator()(T&& t) const
     {
         return t + a + b;
     }
};

(imagining we had rvalue references, decltype and std::make)

However, the size of that object grows linearly with the number of
variables we reference, and thus we might easily exceed the buffer used
by the small object optimization.
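
To make the size argument concrete, here is a minimal sketch of the same
functor in C++ as it was later standardized. It is only an illustration
under a few assumptions: std::declval stands in for the imagined
std::make, plain const references replace the int&& members (standard
rvalue references do not bind to lvalues), and functor3, demo_functor and
the two size helpers are hypothetical names added for the example. That a
reference member occupies one pointer is typical of mainstream compilers,
not something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hand-written stand-in for what `_1 + a + b` could generate.
// Each referenced variable costs one reference member.
struct functor
{
    const int& a;
    const int& b;

    functor(const int& a_, const int& b_) : a(a_), b(b_) {}

    template<typename T>
    auto operator()(T&& t) const -> decltype(std::declval<T>() + a + b)
    {
        return t + a + b;
    }
};

// Referencing a third variable adds a third member (hypothetical example).
struct functor3
{
    const int& a;
    const int& b;
    const int& c;
};

std::size_t size_two_refs()   { return sizeof(functor); }
std::size_t size_three_refs() { return sizeof(functor3); }

// Evaluates the equivalent of (_1 + a + b)(1) with a = 12 and b = 68.
int demo_functor()
{
    int a = 12;
    int b = 68;
    functor f(a, b);
    assert(&f.a == &a);   // the functor refers to a, it does not copy it
    return f(1);
}
```

On a typical implementation size_three_refs() is one pointer larger than
size_two_refs(), which is the linear growth described above.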

I don't know how boost.lambda actually does it, but on my x86 box,
sizeof(_1 + a + b) yields 12, which means there is an overhead of one
word for some reason.

We can be fairly sure it is well optimized if we do this instead:

struct context
{
     int&& a;
     int&& b;

     context(int&& a_, int&& b_) : a(a_), b(b_)
     {
     }
};

int main()
{
    int a = 12;
    int b = 68;

    // Ideally the context could even own the variables a and b directly,
    // but that may be difficult to codegen with just templates and macros.
    context cxt(a, b);

    do_something(functor2(cxt));
}

struct functor2
{
     context&& cxt;

     functor2(context&& cxt_) : cxt(cxt_)
     {
     }

     template<typename T>
     decltype(std::make<T>() + cxt.a + cxt.b) operator()(T&& t) const
     {
         return t + cxt.a + cxt.b;
     }
};
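
For reference, a sketch of the context approach that should compile with
a C++11 compiler. The assumptions: const references replace the &&
members, std::declval replaces the imagined std::make, and demo_context
and functor2_is_one_word are hypothetical helpers added for
illustration. The one-word size holds on mainstream ABIs, where a
reference member is stored as a pointer, but the standard does not
promise it.

```cpp
#include <cassert>
#include <utility>

// Groups every referenced local variable in one place.
struct context
{
    const int& a;
    const int& b;

    context(const int& a_, const int& b_) : a(a_), b(b_) {}
};

// Holds a single reference to the context, however many variables it groups.
struct functor2
{
    const context& cxt;

    explicit functor2(const context& cxt_) : cxt(cxt_) {}

    template<typename T>
    auto operator()(T&& t) const
        -> decltype(std::declval<T>() + cxt.a + cxt.b)
    {
        return t + cxt.a + cxt.b;
    }
};

// True where a reference member takes exactly one pointer of storage.
bool functor2_is_one_word()
{
    return sizeof(functor2) == sizeof(void*);
}

// Same computation as the functor version, routed through the context.
int demo_context()
{
    int a = 12;
    int b = 68;
    context cxt(a, b);
    functor2 f(cxt);
    assert(&f.cxt == &cxt);   // the functor only points at the context
    return f(1);
}
```

The context has to outlive every copy of functor2, just as the
referenced variables had to outlive the first version of the functor.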

Here, however many variables we reference, our functor is always one
word in size, meaning boost::function needs only two words for it: one
for the code/type identification, another for the context (plus
probably a boolean that says whether we need to delete or not).
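
To illustrate the two-word layout, here is a rough sketch of a
non-owning callable made of exactly a code pointer and a context
pointer. This is not how boost::function is actually implemented; every
name in it (int_function_ref, owning_context, add_context,
demo_two_words, is_two_words) is made up for the example, the signature
is fixed to int(int) to keep it short, and the context owns a and b
rather than referencing them.

```cpp
#include <cassert>

// Word 1: the code to run. Word 2: the context it runs against.
struct int_function_ref
{
    int (*invoke)(const void*, int);
    const void* cxt;

    int operator()(int x) const { return invoke(cxt, x); }
};

// A context that owns the variables instead of referring to them.
struct owning_context
{
    int a;
    int b;
};

// The "code" half: recovers the context type and evaluates _1 + a + b.
int add_context(const void* p, int x)
{
    const owning_context* c = static_cast<const owning_context*>(p);
    return x + c->a + c->b;
}

// True where function pointers and object pointers have the same size.
bool is_two_words()
{
    return sizeof(int_function_ref) == 2 * sizeof(void*);
}

int demo_two_words()
{
    owning_context cxt = { 12, 68 };
    int_function_ref f = { &add_context, &cxt };
    assert(f.cxt == &cxt);   // only a pointer to the context is stored
    return f(1);
}
```

Nothing is allocated dynamically here. The price is that the caller
must keep the context alive for as long as the callable is used.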

It would be nice if we had a lambda engine that could automatically do
such magic. This would probably require the use of macros though.