From: Simon Buchan (simon_at_[hidden])
Date: 2005-09-28 20:35:42


Caleb Epstein wrote:
> On 9/28/05, David Abrahams <dave_at_[hidden]> wrote:
>
>
>>Hmm... Also, is the apparent dependency on ASCII encoding truly portable?
>
>
>
> Doubtful. Wouldn't testing for std::isalnum || '-' || '_' be a better idea?
> Perhaps not quite as performant (once the lookup table was made static), but
> certainly more portable and simpler to read.
>
> --
> Caleb Epstein
> caleb dot epstein at gmail dot com
>
In most implementations, the is*() functions are implemented using exactly
the same method. The ASCII assumption is generally safe, as pretty much
every code page barring EBCDIC (or whatever it is) that isn't obviously
non-Latin (CJK code pages, especially) uses the ASCII characters for values
less than 128, including the Unicode code points (UTF-8 is particularly good
at this: every byte of a multi-byte sequence is > 127). If it is a CJK code
page, you're screwed no matter what you do (wide characters, and just what
is an "alphabetic" ideograph?).
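
For reference, a minimal sketch of the std::isalnum-based test Caleb suggests might look like the following. The names is_token_char and is_token are hypothetical, made up for illustration and not taken from the code under review. It also assumes the default "C" locale, where std::isalnum accepts only the basic letters and digits.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if c is alphanumeric in the current C locale,
// or is '-' or '_'.
inline bool is_token_char(char c)
{
    // Convert to unsigned char first: passing a negative value (other than
    // EOF) to std::isalnum is undefined behaviour.
    return std::isalnum(static_cast<unsigned char>(c)) != 0
        || c == '-' || c == '_';
}

// Hypothetical helper: true if s is non-empty and every character passes
// is_token_char.
inline bool is_token(const std::string& s)
{
    if (s.empty())
        return false;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (!is_token_char(s[i]))
            return false;
    return true;
}
```

This makes no assumption about the numeric values of the characters, so it should not care whether the execution character set is ASCII or EBCDIC. The cost is one library call per character instead of one table lookup.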