From: Rogier van Dalen (rogiervd_at_[hidden])
Date: 2004-10-21 09:28:37
On Wed, 20 Oct 2004 23:05:08 +0300, Peter Dimov <pdimov_at_[hidden]> wrote:
> Rogier van Dalen wrote:
> >
> > // The actual Unicode string
> > template <class CodeUnits, class NormalisationForm,
> > class ErrorChecking>
> > class string
>
> By using ErrorChecking as a template parameter, you are encoding it as part
> of the string type, but this is not necessary, because there is no
> difference between values of strings with different ErrorChecking policies
> (ErrorChecking does not change the invariant). You should just provide
> different member functions for the two ErrorChecking behaviors, or pass the
> ErrorChecking parameter to the member functions that require it.
I hadn't yet looked at it this way, but you are right from a
theoretical point of view at least. To get more to practical matters,
what do you think this should do:
unicode::string s = ...;
s += 0xDC01; // An isolated surrogate, which is nonsense
?
Should it throw, or convert the isolated surrogate to U+FFFD
REPLACEMENT CHARACTER (Unicode Standard 4.0, Section 2.7), or do
something else? And what should the member function with the opposite
behaviour be called?
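
To make the question concrete, here is a rough sketch of what the two
member functions could look like if error checking is taken out of the
type, as suggested above. Everything in it is an assumption for the sake
of discussion: the namespace `sketch`, the names `append_or_throw` and
`append_or_replace`, and storing code points in a `std::u32string` are
placeholders, not a proposed interface.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not the proposed library

// A Unicode scalar value is any code point up to U+10FFFF
// that is not in the surrogate range U+D800..U+DFFF.
inline bool is_scalar_value(char32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// One string type; the error-handling behaviour is chosen per call,
// not encoded as a template parameter.
class string {
public:
    // Strict flavour: reject anything that is not a scalar value.
    // The string is left unchanged when this throws.
    void append_or_throw(char32_t cp) {
        if (!is_scalar_value(cp))
            throw std::invalid_argument("not a Unicode scalar value");
        data_.push_back(cp);
    }

    // Lenient flavour: substitute U+FFFD REPLACEMENT CHARACTER.
    void append_or_replace(char32_t cp) {
        data_.push_back(is_scalar_value(cp) ? cp : char32_t(0xFFFD));
    }

    const std::u32string& code_points() const { return data_; }

private:
    std::u32string data_;  // placeholder storage, one element per code point
};

}  // namespace sketch
```

With this, `s.append_or_replace(0xDC01)` would store U+FFFD, while
`s.append_or_throw(0xDC01)` would throw and leave `s` untouched. That
still leaves open which of the two `operator+=` should forward to.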
Rogier