$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [boost] lifetime of ranges vs. iterators
From: David Abrahams (dave_at_[hidden])
Date: 2008-09-03 13:09:04
on Wed Sep 03 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:
>> > rng | filtered( funcA ) 
>> >     | filtered( funcB ) 
>> >     | filtered( funcC ) 
>> >     | filtered( funcD ) 
>> >     | filtered( funcE )
>
>> 1. _Do_ they write that?  
>
> Why not? They did for transform_range, and the notation is getting easier...
Yes, I never contested that they *could* write it.
I'm still asking you the same question: what real life computations
demand a solution to this problem?
>> 2. If they do that, I believe they are still paying an enormous
>>    unnecessary runtime penalty even with your optimization because of
>>    the endpoint checks in each layer of the composite iterator or
>>    range.  The only way IIUC to avoid such a penalty is to compress the
>>    filter adapters into one, essentially by using expression
>>    templates.
>
> See answer to 4.
>
>> 3. I am not yet convinced that you are suffering a significant
>>    performance penalty due to the wasted space in real applications.
>>    Have you measured it?  Unless you're actually *storing* lots of
>>    these stacked iterators or passing them around by value in your
>>    inner loops, it seems unlikely to make much of a difference.
>
> Giovanni has seen the quadratic stack expansions. 
Sure, but that does not necessarily amount to a significant performance
penalty.
> If you are worried about performance of pointers, why not worry about
> the performance of wrappers around pointers?
I actually am a bit worried about it.  However, as I am trying to tell
you, I don't believe that what we're talking about here addresses the
whole underlying problem.  You're talking about eliminating redundancy
across pairs of iterators, but AFAICT not the redundancy through the
vertical "stack."
>> 4. I can understand wanting to improve these components so that they
>>    can be efficient under more conditions, and avoiding the redundant
>>    storage would undoubtedly be a real improvement.  However, I think
>>    there's a deeper inefficiency in stacking all those filter iterators
>>    together, and it's neither a range library's job, nor I think is it
>>    possible in this case, to protect users from "blissful unawareness"
>>    of performance implications.
>
> The design we came up with addresses the general problem of stacking
> range_adaptors that must store their end, like
> difference_range/filter_range/union_range/[other algorithms that I cannot think
> of now], in any combination. 
OK, let me try to spell this out.
Let's suppose for a moment that you build a
filter_iterator<strided_iterator<int*,5>, F>.  A strided_iterator is one
that visits every Nth element.  Unless you know that there are an even
multiple of N elements, it needs to store an endpoint because otherwise
the iterator might stride right past the last stored location.
  Thus sizeof(strided_iterator<int*,5>) == 2*sizeof(int*)
and its increment operator looks like:
  if (pos <= end - N)
      pos += N;
  else
      pos = end;
Now as you know, a filter_iterator also needs to store an endpoint.  So
assuming F is empty and EBO is used to eliminate its storage, then
  sizeof(filter_iterator<strided_iterator<int*,5>, F>) 
  == 2*sizeof(strided_iterator<int*,5>)
  == 2*2*sizeof(int*)
and the filter iterator's increment looks like:
  do {
      ++pos2;
  }
  while (pos2 != end2 && !f(*pos2));
Now if we expand the incrementation of pos2, it's
  do {
      if (pos <= end - N)
          pos += N;
      else
          pos = end;
  }
  while (pos2 != end2 && !f(*pos2));
Whereas it should only be:
  do {
      if (pos <= end - N)
          pos += N;
      else
          pos = end;
          break;
  }
  while (!f(*pos2));
No approach that only addresses the space wasted across pairs of
iterators can help with these issues, so I think a more fundamental look
at the problem is called for.  And again, I think the problem is very
much like what happens with segmented iterators so its worth looking
there for inspiration.  It may be that, as with segmented iterators, a
more general form for the algorithms is called for.
> If you have thought up expression templates that cover all of these in
> a unified fashion with better performance than stacked iterators, I'd
> be very happy and try to implement that instead.
Again, I'm not ready for answers yet.  I'm only just now discovering the
questions.
>> I'm not yet convinced we're solving a real problem here,
>
> I have enough of that stuff in my code that I am convinced we do.
That fact alone does not make it a real problem in my book.
>> and if it _is_ real, I am not quite convinced the general approach of
>> squeezing out redundant storage would address all of the problem.
>
> It does not, you already mentioned the superfluous end checks. As I
> said, if you have the whole solution, let me know.
I'm not there yet, but I know enough about the problem now that I
wouldn't "buy" a partial solution.
-- Dave Abrahams BoostPro Computing http://www.boostpro.com