Subject: Re: [boost] [locale] Review part 1: headers
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-09 07:38:19


>
> > I don't think neither ICU nor Boost supports
> > non-32 bit platform.
> >
> > I understand that strictly speaking it is non-standard
> > but de-facto there are no problems with this and this
> > assumption is quite safe.
>
> Since this assumption arbitrarily limits the library to particular
> platforms and it is trivial to avoid relying it, I see no point in doing so.
>

As I told you I can fix it, but practically it does
not give any advantage.

Give me a name of modern compiler that supports C++03
and has sizeof(unsigned) < 4?

>
> >
> > I can change this to uint32_t
> >
> >> line 131: Please be consistent about using unsigned vs. uint32_t.
> >
> > Good point
> >
> >> line 140: shouldn't offset be std::size_t?
> >> Why should the size of the string be limited
> >> to 4GB? std::size_t is guaranteed to be
> >> large enough to handle anything that can
> >> be stored in memory.
> >>
> >
> > You are right, I'll fix this.
> >
> > However, two points:
> >
> > 1. I already know compilers were size_t != sizeof(void *)
> > so actually it must be uintptr_t... but it is not in C++03
> > but rather in C99. (it is in C++0x)
>
> 1) You work with contiguous memory, so size_t is the correct one, not
> uintptr_t
>

size_t is used in boost as it is closest way to use something
in C++03 standard that is similar to uintptr_t.

That is way C99 and C++03 created uintptr_t.

In any case as I told I'll change this for better consistency/

> 2) uintptr_t should ideally be available from boost/cstdint.hpp (it
> doesn't seem to be ATM, that's a bug)
>
> 3) I don't know of compilers where those are different, mind saying
> which ones exhibit this?
>

As I said before I prefer to relay on native types rather
then boost's types especially if it is feasible.

As I told above I'll change this were necessary but
it is just for "code-beauty".

>
> > 2. There is no reason on planet earth you need to
> > create boundary mapping for text>= 4G!
> >
> > Even entire War& Peace book takes about 3MB.
> >
> > So if you do this you are probably doing
> > something very-very wrong.
> >
> > But this is different story.
>
> Another unnecessary arbitrary limit.
>
> Why would you want to prevent people from treating very large data with
> your library?
>
> On the Internet, there is text much bigger than War & Peace, even if not
> necessarily in the form of books.
>

As I told above I'll change it to size_t but...

You don't want to run boundary analysis on such chunks of text
in any case - and if you do it, you are doing something wrong.

> >
> >
> >> Why do you have explicit specializations of
> >> boundary_indexing? They're all the same.
> >> The compiler is perfectly capable of instantiating
> >> them all from a single definition.
> >>
> >
> > Ok this is something that you'll see all over the Boost.Locale
> > code when it comes to deal with facets generations.
> >
> > It is in order to support (brain damaged) DLL platform.
>
> Both MSVC and GCC support extern template, which is also a C++0x feature.
>

Believe me, it was only feasible solution I've reached after
spending many man hours fighting DLL's, std::locale facets and
both compilers.

(AFAIR MSVC uses specialization as well for its own facets)

So after fighting both compilers and spending days on solving and tuning
this problem I can say this is what should be done.

It requires some good experience with std::locale facets, DLLs
exporting, and having specific tests.

So what I did is unfortunately required and most reliable solution.

>
> > Yes... However all supported platforms are at least 32bits.
> >
> > There is one small points. As far as I remember I
> > always used native types over strict types in the interfaces
> > (function calls) because if it is changed between
> > small releases (like stdint.h found) it would not
> > break binary compatibility so native types (as long
> > as they good enough) are better alternative.
> >
> > (We are talking about C++ compilers that does not
> > handle C99 stdint well)
>
> Boost has a portable stdint equivalent.
>

I know and use it, see the description above.

> In any case, you shouldn't even use uint32_t, but uint_least32_t, or
> better yet, char32_t if it is available.
>

No, see:

   
http://cppcms.sourceforge.net/boost_locale/html/appendix.html#why_no_special_character_type

In any case I use uint32_t or uint16_t when I do relate to handling
of UTF-16 or UTF-32

> boost/cuchar.cpp in my Unicode library shows the right way to define the
> types Unicode needs to work with.
>

Same link as above, typedef of uint16_t or uint32_t is unsuitable for
Boost.Locale
character representation.

See your Unicode library is orthogonal to Boost.Locale, it
handles things very different and has different purpose.

Artyom