From: Thorsten Ottosen (nesotto_at_[hidden])
Date: 2005-05-15 16:35:56
"Pavol Droba" <droba_at_[hidden]> wrote in message 
news:464931589.20050515230222_at_topmail.sk...
| Hi Boosters,
|
| There was a discussion about char[] support in the Boost.Range
| library. The issue seems important and I'd like to express my
| ideas about a possible solution.
|
| First let's summarize the problems and goals.
| The problems:
|  char[] and possibly any other type that can be used as a c-string
|  (this includes wchar_t, but also int, long, etc. when used as
|  Unicode code points) might represent two different things:
|  1.) c-string literal
|  2.) arbitrary c-array
|
|  The two views differ in length calculation, which makes them totally
|  incompatible and, what's worse, can lead to accidental access
|  violations when used improperly.
|
|  An example:
|  char str[] = "Hello";
|  // typeof(str) is char[6], str=={'H','e','l','l','o',0}
|
|  In the c-string view, str has 5 letters and ends at the 'o'.
|  So the range should be  <'H','o')
|  In the c-array view, str is 6 elements long and ends with '\0'.
|  The range is <'H','\0')
|
|  From the user's perspective, both views are equally important;
|  however, depending on the usage scenario, one might be preferable
|  to the other. An important aspect to keep in mind is that the choice
|  is strictly relative to the use case. For example, for string
|  algorithms the c-string literal is the obvious default, while for a
|  data processing library the second choice is better.
|
|  The current implementation is not ideal. First of all, there is a
|  difference between char[], wchar_t[] and the rest of the types.
|  This brings some confusion. Secondly, it is not possible to use
|  char[] as an ordinary array.
No default can meet everyone's expectations.
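To make the two views concrete, here is a minimal sketch for the "Hello"
example above. The helper names array_length and cstring_length are made up
for illustration and are not part of Boost.Range.

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::strlen

// c-array view: the element count is known at compile time and
// includes the terminating '\0' when the array holds a literal.
template <typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N]) {
    return N;
}

// c-string view: count characters up to (not including) the first '\0'.
// Precondition: s points to a NUL-terminated sequence.
inline std::size_t cstring_length(const char* s) {
    return std::strlen(s);
}
```

For char str[] = "Hello", array_length(str) is 6 while cstring_length(str)
is 5. Calling cstring_length on an array without a terminator, such as
{'H','e','l','l','o'}, reads past the end of the array, which is the kind of
access violation the quoted text warns about.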
| The goals:
|  From the problem analysis above, the following goals can be derived:
|
|  1. we need to support both views equally
if you remove "equally" I agree.
|  2. a user must always be able to explicitly specify what type
|     of view he requires
|  3. it should be possible for a library writer to select the default
|     view for his library.
|     However point (2) must hold, so the user must be able
|     to override this default.
|  4. Support must be present in the Boost.Range library.
|     It is not feasible to ask library writers to provide
|     specific workarounds/hacks. It would simply break the idea
|     of the Boost.Range library as a unified interface to range-like
|     data structures.
|
| The solution:
|
|  I propose to have two free-standing functions
|  as_string() and as_array() (naming is not important now).
|
|  Both should have the same generic signature:
|
|  template<typename RangeT>
|  boost::sub_range<RangeT> as_string(RangeT& aRange);
|  template<typename RangeT>
|  boost::sub_range<RangeT> as_array(RangeT& aRange);
|
+ const overloads
|  By default, the functions only copy the input range to the target.
|  However, for types like char[], the result will differ:
|  as_string() will create a sub_range delimiting the string
|  literal (using char_traits<char>::length for instance), while as_array()
|  will use compile-time boundaries.
sounds fair.
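A rough sketch of how the two manipulators could behave for char arrays.
simple_range is a hypothetical stand-in for boost::sub_range so that the
example needs only the standard library, and the terminator search is bounded
by the array size instead of calling char_traits<char>::length, so that an
unterminated array is not overrun. Both are choices of this sketch, not part
of the proposal.

```cpp
#include <algorithm>  // std::find
#include <cstddef>    // std::size_t

// Hypothetical stand-in for boost::sub_range: a half-open [first, last) range.
template <typename Iterator>
struct simple_range {
    Iterator first;
    Iterator last;
    std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// c-string view: the range ends just before the first '\0'.
// If no terminator is found within the array, the whole array is returned.
template <std::size_t N>
simple_range<char*> as_string(char (&a)[N]) {
    return simple_range<char*>{a, std::find(a, a + N, '\0')};
}

// const overload of the c-string view.
template <std::size_t N>
simple_range<const char*> as_string(const char (&a)[N]) {
    return simple_range<const char*>{a, std::find(a, a + N, '\0')};
}

// c-array view: the range is given by the compile-time bound N.
template <std::size_t N>
simple_range<char*> as_array(char (&a)[N]) {
    return simple_range<char*>{a, a + N};
}

// const overload of the c-array view.
template <std::size_t N>
simple_range<const char*> as_array(const char (&a)[N]) {
    return simple_range<const char*>{a, a + N};
}
```

With char str[] = "Hello", as_string(str).size() is 5 and
as_array(str).size() is 6.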
|  In addition, we might consider opening this interface to
|  user-defined types, even if I'm not sure how it could be used.
With ADL. The library says:
using boost::as_string;
foo( as_string(bar));
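Spelled out a bit more, the ADL idea could look like the sketch below. The
namespaces lib and user, the count_chars algorithm and the name_buffer type
are all hypothetical; only the using-declaration followed by an unqualified
call comes from the snippet above.

```cpp
#include <cstddef>  // std::size_t
#include <string>   // std::string

namespace lib {
    // Generic fallback: a type that already models a string-like range
    // is passed through unchanged.
    template <typename RangeT>
    const RangeT& as_string(const RangeT& r) {
        return r;
    }

    // A library algorithm that goes through the customization point.
    template <typename RangeT>
    std::size_t count_chars(const RangeT& r) {
        using lib::as_string;        // make the fallback visible
        return as_string(r).size();  // unqualified call: ADL may pick a better overload
    }
}

namespace user {
    // Hypothetical user-defined type: a fixed buffer holding a
    // NUL-terminated name.
    struct name_buffer {
        char data[16];
    };

    // Found by ADL because name_buffer lives in namespace user.
    // Assumes b.data is NUL-terminated.
    inline std::string as_string(const name_buffer& b) {
        return std::string(b.data);
    }
}
```

For a user::name_buffer the non-template overload in namespace user is
selected; for a std::string the generic fallback in lib is used.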
|  Please note that once either of these manipulators has been applied
|  to a range, subsequent applications will have no effect.
|
| Let's see how this facility can be used:
|
|  A library writer can set the default by writing an algorithm like
|  this:
|
|  template<typename RangeT>
|  ... AnAlgorithm(const RangeT& aRange)
|  {
|     boost::sub_range<const RangeT> StrRange=as_string(aRange);
|
|     // Do something with StrRange
|  }
|
|  If a user calls AnAlgorithm directly:
|  char str[]="hello";
|  AnAlgorithm(str);
|
|  str will be converted to a range delimiting the string literal.
|  However, he can also use as_array():
|
|  char str[]={'h', 'e', 'l', 'l', 'o'};
|  AnAlgorithm(as_array(str));
|
|  This time no conversion will take place, since as_array() already
|  returns a sub_range.
|
|  Note that for AnAlgorithm it does not matter which default is
|  used in the Range library.
|
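The usage described above can be sketched as follows, with a hypothetical
simple_range in place of boost::sub_range and a terminator search bounded by
the array size (both assumptions of this sketch). The extra as_string
overload for simple_range models the rule that a manipulator applied to an
already converted range has no effect.

```cpp
#include <algorithm>  // std::find
#include <cstddef>    // std::size_t

// Hypothetical stand-in for boost::sub_range: a half-open [first, last) range.
template <typename Iterator>
struct simple_range {
    Iterator first;
    Iterator last;
};

// c-string view of an array: stop before the first '\0' (bounded by N).
template <std::size_t N>
simple_range<const char*> as_string(const char (&a)[N]) {
    return simple_range<const char*>{a, std::find(a, a + N, '\0')};
}

// c-array view of an array: use the compile-time bound.
template <std::size_t N>
simple_range<const char*> as_array(const char (&a)[N]) {
    return simple_range<const char*>{a, a + N};
}

// Already a range: applying as_string again changes nothing.
template <typename Iterator>
simple_range<Iterator> as_string(simple_range<Iterator> r) {
    return r;
}

// A library algorithm that chooses the c-string view as its default.
// It returns the number of elements it would process.
template <typename RangeT>
std::size_t AnAlgorithm(const RangeT& aRange) {
    auto StrRange = as_string(aRange);
    return static_cast<std::size_t>(StrRange.last - StrRange.first);
}
```

Called directly on char str[] = "hello", AnAlgorithm sees 5 elements. Called
as AnAlgorithm(as_array(str)), it sees all 6, because the user's choice of
view overrides the library default.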
| Open questions:
|
| - I have intentionally not included a proposal for the default view
|  that the Range library should provide.
|  The goal of this solution is to provide a way that does not depend
|  on it.
|  I'd like to leave that for discussion. Right now it seems that
|  most of the people who entered the discussion prefer the c-array view.
|  I would prefer the c-string view, but I'm probably biased by the fact
|  that I'm the author of the StringAlgo library.
I prefer the string view too.
That's how boost.range was designed.
There is one problem with the default today IMO: char[]
should call char_traits<char>::length().
| - There is room for possible extensions to the basic proposal.
|  For instance, as_string() might take a second parameter that
|  identifies the terminator.
|
| - String literal length can be calculated in two ways: either by
|  using strlen() (or similar), or by using the compile-time size (N)
|  decreased by 1 (N-1).
For const char[], this would be the way to go... and that is also how it is
implemented by default.
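A small sketch of the two calculations and where they disagree. The names
length_by_scan and length_by_size are made up for illustration.

```cpp
#include <cstddef>  // std::size_t
#include <string>   // std::char_traits

// Run-time calculation: scan for the terminator.
// Precondition: the array contains a '\0'.
template <std::size_t N>
std::size_t length_by_scan(const char (&a)[N]) {
    return std::char_traits<char>::length(a);
}

// Compile-time calculation: array size minus one for the terminator.
template <std::size_t N>
constexpr std::size_t length_by_size(const char (&)[N]) {
    return N - 1;
}
```

For const char lit[] = "Hello" both give 5. For a buffer such as
char buf[16] = "hi" the scan gives 2 while N-1 gives 15, so the compile-time
version is only equivalent when the array was sized by its literal
initializer.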
-Thorsten