From: Beman Dawes (bdawes_at_[hidden])
Date: 2004-11-15 10:14:43
At 07:13 AM 11/15/2004, Peter Dimov wrote:
 >Peter Dimov wrote:
 >> Choosing the wrong native character type causes redundant roundtrip
 >> conversions, one in Boost.Filesystem, one in the OS.
 >
 >Let me expand on that a little.
 >
 >It is _fundamentally wrong_ to assume that all present and future OS APIs 
 >have a single native character type.
The actual wording of PJP's paper was that for paths (not the entire OS 
API), one type could be considered "fundamental".
 >Consider a case where a dual API OS has access to two logical volumes C:
 >and D:, where the file system on C: stores the filenames as 16 bit UTF-16,
 >and the file system on D: uses narrow characters.
That happens all the time on Windows. Often the A: drive is a narrow 
character FAT filesystem.
 >Now the behavior of the calls is as follows:
 >
 >CreateFileA( "C:/foo.txt" ); // char -> wchar_t OS conversion
 >CreateFileW( L"C:/foo.txt" ); // no OS conversion
 >CreateFileA( "D:/foo.txt" ); // no OS conversion
 >CreateFileW( L"D:/foo.txt" ); // wchar_t -> char OS conversion
Yes, that's my understanding too.
 >Furthermore, consider a typical scenario where the application has its own
 >"native" character type, app_char_t. In a design that enforces a single
 >"native" character type boost_fs_char_t ("native" is a deceptive term due
 >to the above scenario), there are potentially redundant (and not
 >necessarily preserving) conversions from app_char_t to boost_fs_char_t
 >and then from boost_fs_char_t to the filesystem character type.
Yes. Note that even if a dual scheme is used, that same situation might 
arise:
    if ( fs::exists( "c:foo" ) ) ...
    if ( fs::exists( L"d:foo" ) ) ...
Notice that a narrow character path was given for the wide-character 
filesystem and a wide character path given for the narrow-character file 
system. If the type of the user supplied path is what determines the API to 
use, the O/S may still have to do conversions when there is a mismatch with 
the file system.
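
To make the mismatch concrete, here is a minimal sketch of a dual scheme in which the argument type alone selects the API. Everything in it is hypothetical: exists_api, volume_encoding and os_must_convert are illustrative names, not Boost.Filesystem functions, and no real OS call is made. The overloads only report which entry point (narrow or wide) would be chosen.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a narrow and a wide OS entry point
// (in the spirit of CreateFileA / CreateFileW). They only report
// which one overload resolution picked; nothing touches the OS.
enum class api { narrow, wide };

inline api exists_api(const std::string&)  { return api::narrow; }
inline api exists_api(const std::wstring&) { return api::wide; }

// Hypothetical description of how a volume stores its filenames.
enum class volume_encoding { narrow, wide };

// Returns true when the API chosen by the argument type does not match
// the volume's encoding, i.e. when the OS would have to convert the name.
template <class String>
bool os_must_convert(const String& p, volume_encoding v)
{
    const api chosen = exists_api(p);
    return (chosen == api::narrow) != (v == volume_encoding::narrow);
}
```

With C: assumed wide and D: assumed narrow, os_must_convert(std::string("c:foo"), volume_encoding::wide) and os_must_convert(std::wstring(L"d:foo"), volume_encoding::narrow) both return true, which is the situation described above.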
Do you see any alternative? If the library queried the O/S about the path 
(which I'm not sure is always possible) to see if the filesystem was wide 
or narrow, a conversion would still have to be done if the user supplied 
path used the other char type. That saves nothing and adds the cost of the 
query.
 >In my opinion, the Boost filesystem library should pass the application
 >characters _exactly as-is_ to the underlying OS API, whenever possible. It
 >should not impose its own "native character" ideas upon the user nor upon 
 >the OS.
Your strongest argument IMO is the point about conversions not necessarily 
being value preserving. (I guess we could tell Windows users that they 
should not expect such conversions to work unless supported by the 
applicable codepage. But that seems like spin rather than a real solution.)
The efficiency argument is certainly real, but I don't see it as being 
quite as strong. (It will be important for some users, however. Think of 
very small or embedded systems.)
If the rule is that there is some type (char or wchar_t) associated with 
each path, and the library will always use the native API of that type if 
available, then it seems to me that the arguments in favor of a single path 
class weaken considerably. Sure the library can keep track at runtime of 
whether a particular path is wide or narrow, but it is much more normal in 
C++ to distinguish at compile time. In other words, separate path and wpath 
classes.
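
A rough sketch of what distinguishing at compile time could look like, assuming two unrelated classes. The members and the api_for helper are hypothetical and only illustrate the idea; this is not the actual or proposed Boost.Filesystem interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical narrow-character path: stores the characters as given.
class path {
public:
    typedef char value_type;
    path(const char* s) : s_(s) {}
    const std::string& string() const { return s_; }
private:
    std::string s_;
};

// Hypothetical wide-character path: same shape, wchar_t storage.
class wpath {
public:
    typedef wchar_t value_type;
    wpath(const wchar_t* s) : s_(s) {}
    const std::wstring& string() const { return s_; }
private:
    std::wstring s_;
};

// Overload resolution picks the API family at compile time; no runtime
// "is this path wide or narrow?" flag is needed. The returned label
// stands in for the narrow or wide OS call that would be made.
inline const char* api_for(const path&)  { return "narrow"; }
inline const char* api_for(const wpath&) { return "wide"; }
```

Here api_for(path("c:foo")) selects the narrow overload and api_for(wpath(L"d:foo")) selects the wide one, and in both cases the stored characters are exactly the ones the application supplied.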
In discussion on the C++ committee's library reflector, there wasn't demand 
for a templatized basic_path type. AFAICS, a templatized basic_path type 
could be added later if demand arose.
--Beman