From: Jeff Garland (jeff_at_[hidden])
Date: 2006-07-02 14:51:32
David Abrahams wrote:
> Jeff Garland <jeff_at_[hidden]> writes:
> 
>> I've been working on a little project where I've had to do lots of string 
>> processing, so I decided to put together a string type that wraps up 
>> boost.regex and boost.string_algo into a string type. I also remember a 
>> discussion in the LWG about whether the various string algorithms should be 
>> built in or not -- well consider this a test -- personally I find it easier 
>> built into the string than as standalone functions.
> 
> I appreciate the convenience of such an interface, I really do, but
> doesn't this design just compound the "fat interface" problems that
> std::string already has?
Yes, that's partially the point :-) I understand std::string is too big for 
some. Sadly, the members it has make it hard to do the things I most often 
need to do with strings. The fact is, if you look around at the languages 
people are using most for string processing, they offer just as many features 
as super_string and then some.  Somehow, programmers are managing to deal 
with this.  I'd buy more into the fat interface being a problem if something 
in the string class went beyond string processing, but it doesn't. String 
processing is a big, complex domain -- whole languages have been optimized 
for it -- and it needs a lot of functions to cover the domain and make code 
easy to read. Any way you slice it, the current basic_string is inferior to 
what most modern languages offer.
Needless to say, I understand all about the STL, free functions, their power, 
etc.  But the big thing this misses is that having a single type that unifies 
the string processing interface means there's a single set of documentation 
to start from when figuring out how to do a string manipulation.  I don't 
have to wade through 50 pages of string_algo docs, 50 pages of regex docs, 
and so on -- there are hundreds of functions for dealing with strings there. 
Not to mention the templatization factor in the docs of these libraries, 
which mostly detracts from figuring out how to process the string.  If I'm a 
Boost novice, much of this great and useful string processing capability 
might be lost among so many other libraries.
The other thing that gets me is the readability of code.  With a member 
function, there's one less parameter to remember when calling these 
functions. It seems trivial, but I believe the code is ultimately easier to 
understand. Simple example:
    std::string s1("foo");
    std::string s2("bar");
    std::string s3("foo");
    //The next line makes me go read the docs again, every time
    replace_all(s1,s2,s3);  //which string is modified exactly?
or
    s1.replace_all(s2, s3);  //obvious which string is modified here
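
To make the member-function form concrete, here is a minimal sketch using only the standard library. The type mini_string is hypothetical and exists only for illustration; it is not the actual super_string implementation, which wraps boost.string_algo and boost.regex.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration type (an assumption, not super_string itself).
class mini_string : public std::string {
public:
    mini_string(const char* s) : std::string(s) {}
    mini_string(const std::string& s) : std::string(s) {}

    // Replace every occurrence of `from` with `to`, modifying *this in place.
    // Returns *this so that calls can be chained.
    mini_string& replace_all(const std::string& from, const std::string& to) {
        if (from.empty()) return *this;  // nothing sensible to search for
        std::size_t pos = 0;
        while ((pos = find(from, pos)) != npos) {
            replace(pos, from.size(), to);
            pos += to.size();  // continue after the inserted text
        }
        return *this;
    }
};
```

With this sketch, s.replace_all("foo", "bar") can only modify s, because s is the object the member is called on; the two arguments are plainly the search text and its replacement.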
I understand this flies in the face of established C++ wisdom, but that's 
part of the reason I've done it.  After thinking about it, I think the 
'wisdom' is wrong.  Usability and readability have been lost -- my code is 
harder to understand. I expect that super_string has little chance of ever 
making it into Boost because it goes too radically against some of these 
deeply held beliefs.  That said, I think there's a group of folks out there 
who agree with me and are afraid to speak up.  Now they can at least download 
it from the vault -- but maybe they'll speak up -- we'll see.  In any case, 
it's up to individuals to decide whether to download and use super_string, or 
continue using their inferior string class ;-)
> Even Python's string, which has a *lot* built in, doesn't try to
> handle the regex stuff directly.
There are plenty of counterexamples: Perl, Java, JavaScript, and Ruby all 
build regex directly into the library/language.  It's very powerful and useful 
in my experience.  And, of course, super_string doesn't take anything away; 
it just makes these powerful tools more accessible and easier to use.
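
As a rough illustration of what "regex built into the string" can look like, here is a small sketch. The type regex_string and its member names are assumptions made up for this example, and it uses std::regex as a stand-in for boost.regex; it is not super_string's actual interface.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Hypothetical illustration type; the names here are assumptions.
class regex_string {
public:
    explicit regex_string(std::string s) : value_(std::move(s)) {}

    // True if any part of the string matches `pattern` (ECMAScript syntax).
    bool contains_regex(const std::string& pattern) const {
        return std::regex_search(value_, std::regex(pattern));
    }

    // Replace every match of `pattern` with `fmt`, modifying this string.
    regex_string& replace_all_regex(const std::string& pattern,
                                    const std::string& fmt) {
        value_ = std::regex_replace(value_, std::regex(pattern), fmt);
        return *this;
    }

    const std::string& str() const { return value_; }

private:
    std::string value_;
};
```

The caller never constructs a regex object or picks among free-function overloads; the pattern is just another argument to a member of the string.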
Jeff