Subject: [boost] [gsoc] unicode tools and an unicode string type
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2009-03-29 21:40:36
I plan to submit my Summer of Code proposal on Unicode later this
week.
I plan to provide:
- iterator adaptors to iterate sequences of code units, code points and
graphemes, and eventually more, from a sequence in UTF-8, UTF-16, UCS-2
or UTF-32/UCS-4.
- miscellaneous utilities, such as categorization of code points
- normalization functions
- comparison, but not collation
- substring search algorithms
- and finally, a Unicode string type
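To illustrate the first item, here is a minimal sketch of what a code
point iterator adaptor over UTF-8 could look like. The names
utf8_code_point_iterator and decode_utf8 are hypothetical and not part
of the proposal, and the sketch assumes well-formed input (it performs
no validation).

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical adaptor: wraps an iterator over UTF-8 code units (bytes)
// and yields one code point per step. Assumes well-formed input.
template <typename ByteIt>
class utf8_code_point_iterator {
public:
    // Input category because operator* returns by value, not by reference.
    using iterator_category = std::input_iterator_tag;
    using value_type = char32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const char32_t*;
    using reference = char32_t;

    explicit utf8_code_point_iterator(ByteIt pos) : pos_(pos) {}

    // Decode the code point starting at the current position.
    char32_t operator*() const {
        ByteIt it = pos_;
        const unsigned char lead = static_cast<unsigned char>(*it);
        const std::size_t len = sequence_length(lead);
        // Strip the length marker bits from the lead byte.
        char32_t cp = (len == 1) ? lead : (lead & (0x7Fu >> len));
        for (std::size_t i = 1; i < len; ++i) {
            ++it;
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (static_cast<unsigned char>(*it) & 0x3Fu);
        }
        return cp;
    }

    // Skip over every code unit of the current code point.
    utf8_code_point_iterator& operator++() {
        const std::size_t len =
            sequence_length(static_cast<unsigned char>(*pos_));
        std::advance(pos_, static_cast<difference_type>(len));
        return *this;
    }

    utf8_code_point_iterator operator++(int) {
        utf8_code_point_iterator old = *this;
        ++*this;
        return old;
    }

    friend bool operator==(const utf8_code_point_iterator& a,
                           const utf8_code_point_iterator& b) {
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const utf8_code_point_iterator& a,
                           const utf8_code_point_iterator& b) {
        return !(a == b);
    }

private:
    // Number of code units in the sequence introduced by this lead byte.
    static std::size_t sequence_length(unsigned char lead) {
        if (lead < 0x80) return 1;
        if ((lead >> 5) == 0x06) return 2;
        if ((lead >> 4) == 0x0E) return 3;
        return 4;
    }

    ByteIt pos_;
};

// Convenience helper for the sketch: collect all code points of a string.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    using It = utf8_code_point_iterator<std::string::const_iterator>;
    return std::vector<char32_t>(It(bytes.begin()), It(bytes.end()));
}
```

The same shape would apply to UTF-16 input, and a grapheme iterator
could in principle be layered on top of a code point iterator like this
one.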
I am well aware that defining yet another string type is quite
controversial, but I believe it would be useful. A dedicated type
would be able to maintain certain invariants, such as keeping its
contents in a particular normalization form.
Also, I believe it is possible to come up with a string design that
integrates easily with any existing string type, such as the ones from
the standard library or Qt.
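
As a rough sketch of the idea (not the proposed design), a wrapper
template can take an existing string type as its storage and enforce an
invariant when it is constructed. The names checked_utf8 and
is_structurally_valid_utf8 are hypothetical. Because real normalization
needs Unicode data tables, the invariant here is a simpler stand-in:
structurally well-formed UTF-8. The storage type is assumed to expose
begin() and end() over bytes.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Structural check only: every lead byte must be followed by the right
// number of continuation bytes. Overlong encodings and surrogates are
// NOT rejected in this sketch.
template <typename Bytes>
bool is_structurally_valid_utf8(const Bytes& bytes) {
    int pending = 0;  // continuation bytes still expected
    for (auto it = bytes.begin(); it != bytes.end(); ++it) {
        const unsigned char b = static_cast<unsigned char>(*it);
        if (pending > 0) {
            if ((b & 0xC0) != 0x80) return false;
            --pending;
        } else if (b < 0x80) {
            pending = 0;  // ASCII, single code unit
        } else if ((b & 0xE0) == 0xC0) {
            pending = 1;
        } else if ((b & 0xF0) == 0xE0) {
            pending = 2;
        } else if ((b & 0xF8) == 0xF0) {
            pending = 3;
        } else {
            return false;  // stray continuation byte or invalid lead
        }
    }
    return pending == 0;
}

// Hypothetical wrapper: Storage can be any existing string-like type.
// The invariant is checked once, at construction, and the storage is
// only exposed read-only so that it cannot be broken afterwards.
template <typename Storage>
class checked_utf8 {
public:
    explicit checked_utf8(Storage s) : storage_(std::move(s)) {
        if (!is_structurally_valid_utf8(storage_)) {
            throw std::invalid_argument("not well-formed UTF-8");
        }
    }

    const Storage& storage() const { return storage_; }

private:
    Storage storage_;
};
```

With this shape, checked_utf8<std::string> and
checked_utf8<std::vector<char>> share the same checking code, and a
normalization invariant could be substituted for the validity check
without changing how the wrapper relates to the underlying type.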