From: Allan Odgaard (gusixpl02_at_[hidden])
Date: 2008-08-25 15:58:10
It looks like the traits aspect of Xpressive is geared toward  
characters, so I assume that Xpressive is not directly usable with  
UTF-8 encoded text, am I correct?
It might work by making the character type a 32-bit integer and
then using iterator adapters which expose the sequence as UCS-4 code
points (since the underlying sequence is, after all, encoded), but that
leads me to the next question: diacritics.
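To make the code point idea concrete, here is a minimal sketch of the decoding step. It is only an illustration: decode_utf8 is a hypothetical helper, not part of Xpressive, it decodes eagerly into a vector rather than lazily through an iterator adapter, it assumes C++11 (for char32_t) and well-formed UTF-8 input, and whether Xpressive would accept such a character type is exactly the open question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xpressive): decode a UTF-8 byte string
// into UCS-4 code points. Assumes well-formed input; performs no validation.
std::vector<char32_t> decode_utf8(const std::string& bytes)
{
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < bytes.size(); ) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        // The lead byte determines the sequence length (1 to 4 bytes).
        std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        // Keep only the payload bits of the lead byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len && i + k < bytes.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

With this, the two bytes "\xC3\xA9" (precomposed é) decode to the single code point U+00E9, so a matcher working on the decoded sequence would at least see whole code points instead of bytes.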
For example, something like é in decomposed Unicode is two code points
(e followed by a combining ´ mark), so even when the sequence is
iterated as UCS-4 code points, a regexp of "." will match just the e,
not the actual (rendered) character.
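A small sketch of what matching the rendered character would involve: grouping each base code point with the combining marks that follow it. This is a deliberate simplification and not anything Xpressive provides. The names is_combining_mark and split_clusters are hypothetical, C++11 is assumed, and only the Combining Diacritical Marks block (U+0300 to U+036F) is treated as combining, whereas real grapheme cluster segmentation covers far more cases.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplifying assumption: only U+0300..U+036F (Combining Diacritical Marks)
// counts as combining. Full Unicode segmentation rules are much broader.
bool is_combining_mark(char32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

// Hypothetical helper: split a code point sequence into clusters, each
// consisting of one base code point plus any combining marks after it.
std::vector<std::u32string> split_clusters(const std::u32string& text)
{
    std::vector<std::u32string> clusters;
    for (char32_t cp : text) {
        // Start a new cluster unless cp is a mark attaching to the previous one.
        if (clusters.empty() || !is_combining_mark(cp))
            clusters.push_back(std::u32string());
        clusters.back().push_back(cp);
    }
    return clusters;
}
```

For the decomposed é, that is U"e\u0301", the string holds two code points but split_clusters returns a single cluster. A "." that matches one code point consumes only the e, while a "." that matched one cluster would consume both.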
Since I was unable to find any discussion of this while searching for  
Xpressive, I am curious to hear if any thoughts have gone into these  
issues.