From: Rogier van Dalen (rogiervd_at_[hidden])
Date: 2004-10-20 08:09:46


> > So unicode::string<unicode::codepoint_string<std::string> > would be a
> > UTF8-encoded string that is manipulated using its characters.
>
> Encoded characters or abstract characters? (See section 2.4 of Unicode standard
> for definitions)

I mean a base character with its combining characters. I don't think
this is the same as "abstract character", is it?

My plan was to decompose all characters in unicode::string. This makes
manipulation of diacritics easier. Correct me if I'm wrong, but your
example of finding "ü" in a string would then come down to finding the
codepoint sequence "U+0075 U+0308" and checking that it is not
followed by another combining character, which is still pretty trivial.
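
To make that concrete, here is a minimal sketch of the search over an
already-decomposed codepoint sequence. It is not the proposed
unicode::string interface: find_character and is_combining are
hypothetical names, std::u32string stands in for the codepoint string,
and is_combining only recognises the Combining Diacritical Marks block
(U+0300..U+036F), whereas a real version would have to look up the
character properties in the Unicode Character Database.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplifying assumption: only the Combining Diacritical Marks block
// (U+0300..U+036F) is treated as combining. A real implementation
// would consult the Unicode Character Database instead.
bool is_combining(char32_t cp) {
    return cp >= 0x0300 && cp <= 0x036F;
}

// Hypothetical helper: returns the index of the first occurrence of
// `needle` (a base character plus its combining characters, already
// decomposed) in the decomposed text `haystack` that is not followed
// by a further combining character, or std::u32string::npos if there
// is no such occurrence.
std::size_t find_character(const std::u32string& haystack,
                           const std::u32string& needle) {
    assert(!needle.empty());
    std::size_t pos = haystack.find(needle);
    while (pos != std::u32string::npos) {
        const std::size_t next = pos + needle.size();
        // Accept the match only if the character really ends here.
        if (next == haystack.size() || !is_combining(haystack[next]))
            return pos;
        // Otherwise this is a different character (e.g. u + diaeresis
        // + acute), so keep searching further along.
        pos = haystack.find(needle, pos + 1);
    }
    return std::u32string::npos;
}
```

With this, searching U"abu\u0308c" for U"u\u0308" yields index 2, while
searching U"u\u0308\u0301" finds nothing, because the diaeresis is
followed by a combining acute accent there.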

Regards,
Rogier