From: James Porter (porterj_at_[hidden])
Date: 2007-09-23 20:11:23
Instead of defining character types per character set, you could use a 
specialized char_traits class. char_traits contains a state_type member, 
which is used with codecvt from the I/O stream library. For the standard 
char and wchar_t specializations, state_type is a typedef for mbstate_t, 
which is also the state type in the standard specializations of codecvt. 
(codecvt performs code conversion between character types; wfstream uses 
it to convert a stream of chars on disk to wchar_ts in memory.)
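
To make the relationship concrete, here is a rough sketch of how state_type 
ties char_traits to codecvt. The names narrow_to_wide and widen_ascii are 
made up for illustration, and the conversion assumes plain ASCII input in 
the classic "C" locale:

```cpp
#include <cwchar>
#include <locale>
#include <string>
#include <type_traits>

// The standard char_traits specializations both use mbstate_t as state_type.
static_assert(std::is_same<std::char_traits<char>::state_type,
                           std::mbstate_t>::value, "char uses mbstate_t");
static_assert(std::is_same<std::char_traits<wchar_t>::state_type,
                           std::mbstate_t>::value, "wchar_t uses mbstate_t");

// The standard facet that converts external chars to internal wchar_ts.
// (narrow_to_wide is an illustrative alias, not a standard name.)
typedef std::codecvt<wchar_t, char, std::mbstate_t> narrow_to_wide;

// Hypothetical helper: widen a string through the classic locale's codecvt.
// Returns an empty string if the facet does not report success.
inline std::wstring widen_ascii(const std::string& in) {
    if (in.empty()) return std::wstring();
    const narrow_to_wide& cvt =
        std::use_facet<narrow_to_wide>(std::locale::classic());
    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    std::wstring out(in.size(), L'\0');
    const char* from_next = 0;
    wchar_t* to_next = 0;
    std::codecvt_base::result r =
        cvt.in(state, in.data(), in.data() + in.size(), from_next,
               &out[0], &out[0] + out.size(), to_next);
    if (r != std::codecvt_base::ok) return std::wstring();
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```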
If you change state_type in the char_traits, you'd be able to 
differentiate the various basic_string types and include information 
about the character encoding without writing a whole lot of new code.
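
Something along these lines is what I have in mind. This is only a sketch: 
latin1_state, latin2_state and encoded_traits are names I made up, and I 
haven't checked how well custom state types work with the stream classes. 
The string types themselves come out distinct, though:

```cpp
#include <cassert>
#include <cwchar>
#include <ios>
#include <string>
#include <type_traits>

// Hypothetical state types; each one tags a character encoding.
struct latin1_state { std::mbstate_t st; };
struct latin2_state { std::mbstate_t st; };

// Reuse everything from char_traits<char>, replacing only the state type
// (and pos_type, which is conventionally fpos<state_type>).
template <class State>
struct encoded_traits : std::char_traits<char> {
    typedef State state_type;
    typedef std::fpos<State> pos_type;
};

typedef std::basic_string<char, encoded_traits<latin1_state> > latin1string;
typedef std::basic_string<char, encoded_traits<latin2_state> > latin2string;

// Different traits give unrelated string types with no implicit conversion.
static_assert(!std::is_same<latin1string, latin2string>::value,
              "distinct types");
static_assert(!std::is_convertible<latin1string, latin2string>::value,
              "no implicit conversion between encodings");

inline bool demo() {
    latin1string s1("abcdefgh");
    latin1string s2 = s1.substr(1, 5);      // fine: substr keeps the traits
    // latin2string s3 = s1.substr(1, 5);   // would not compile
    assert(s2.size() == 5);
    return s2 == latin1string("bcdef");
}
```

Note that these are no longer std::string, so passing them to code that 
expects std::string needs an explicit conversion.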
To be honest, I'm only just beginning to look into this myself, so I'm 
afraid I don't have a whole lot of information to give you, but I do 
think this would be the simplest way to handle this part of your project.
- James
Phil Endecott wrote:
[snip]
> If latin1string has a constructor from std::string (which is its own 
> base type) that's fine, i.e. we can still write:
> 
> latin1string s2 = s1.substr(1,5);
> 
> but unfortunately we can also write
> 
> latin2string s3 = s1.substr(1,5);
> 
> which is not so good.
> 
> So a different approach is to define a set of character-set-specific 
> character types, and build string types from them:
> 
> typedef char8_t latin1char;
> typedef char8_t latin2char;
[/snip]