From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2006-03-04 12:21:40


Hi to all,

  While reviewing how boost::filesystem uses errors and exceptions, as a
possible model for the interprocess library, and after reading N1934
"Filesystem Library Proposal for TR2 (Revision 2)", I've become a bit
concerned about the heavy use of std::string/string_type in the library.

The following functions return a string or a path by value:

// observers
const string_type string() const;
const string_type file_string() const;
const string_type directory_string() const;

const external_string_type external_file_string() const;
const external_string_type external_directory_string() const;

string_type root_name() const;
string_type root_directory() const;
basic_path root_path() const;
basic_path relative_path() const;
string_type leaf() const;
basic_path branch_path() const;

I know that syntactically this is nicer and we can have RVO:

string_type str = path.string();

But when iterating through directories, returning a temporary object
that must allocate/copy/deallocate hurts my performance paranoia. Even
with move semantics we have an overhead:

std::vector<std::path> paths;
//fill with paths

std::vector<std::path>::iterator it = paths.begin(), end = paths.end();

for(; it != end; ++it){
    std::path::string_type str = it->root_name();//temporary created
    str += "append some data";
    std::cout << str;
}

Wouldn't it be better (although uglier) to take a reference to a string
that will be filled?

void fill_root_name(string_type &root_name) const;
...

////////////////////////////////////////////////////////////////////

std::vector<std::path> paths;
//fill with paths

std::path::string_type root_name;

root_name.reserve(PATH_LENGTH);

std::vector<std::path>::iterator it = paths.begin(), end = paths.end();

for(; it != end; ++it){
    it->fill_root_name(root_name);//reuses root_name's buffer
    root_name += "append some data";
    std::cout << root_name;
}

This way we only allocate memory if our string doesn't have enough
room. We can also reserve capacity beforehand to speed up the code.
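
To make the buffer-reuse idea concrete, here is a minimal self-contained
sketch. It does not use the real library: toy_path, its "root name is
everything up to the first colon" rule, and count_buffer_growths are all
made up for illustration. It only shows that a fill-style function lets
one reserved string serve the whole loop.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for basic_path; NOT the real boost::filesystem class.
struct toy_path {
    std::string text;

    // Return-by-value style, like the N1934 observers: builds a new string
    // on every call.
    std::string root_name() const {
        std::string::size_type colon = text.find(':');
        return colon == std::string::npos ? std::string()
                                          : text.substr(0, colon + 1);
    }

    // Fill style: overwrites a caller-owned string, reusing its capacity.
    void fill_root_name(std::string& out) const {
        std::string::size_type colon = text.find(':');
        out.assign(text, 0, colon == std::string::npos ? 0 : colon + 1);
    }
};

// Visits every path with one reused buffer and counts how many times the
// buffer's capacity changed (that is, how often it had to reallocate).
std::size_t count_buffer_growths(const std::vector<toy_path>& paths,
                                 std::size_t reserve_hint) {
    std::string buf;
    buf.reserve(reserve_hint);
    std::size_t growths = 0;
    std::size_t cap = buf.capacity();
    for (std::vector<toy_path>::const_iterator it = paths.begin();
         it != paths.end(); ++it) {
        it->fill_root_name(buf);
        buf += " append some data";
        if (buf.capacity() != cap) {
            ++growths;
            cap = buf.capacity();
        }
    }
    return growths;
}
```

With a reserve hint larger than the longest result, I would expect the
count to stay at zero on the usual standard library implementations,
whereas the root_name() version builds a fresh string per element.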

Apart from this, I see that path::iterator has a string member.
Dereferencing returns a reference to that member, but an iterator is
supposed to be a "lightweight" pointer-like abstraction, which is
copied by value between functions. A string member, in my opinion, turns
an iterator into a heavy class (how heavy depends on the string length,
but a small string optimization of 16 bytes is not going to help much).
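
As a rough illustration of the size difference, the sketch below compares
two made-up iterator layouts. Neither is the real path::iterator; the
struct names and members are assumptions chosen only to show what a
cached string member adds to every copy.

```cpp
#include <cstddef>
#include <string>

// Hypothetical layout that caches the current element in a string member,
// similar in spirit to what is described above.
struct caching_iter {
    const std::string* source;
    std::string::size_type pos;
    std::string element;   // copied every time the iterator is copied
};

// Hypothetical pointer-like layout: only a position into the source string.
struct offset_iter {
    const std::string* source;
    std::string::size_type pos;
};

// Extra bytes carried by each iterator copy. This does not count the heap
// allocation a copy may need when `element` does not fit in whatever small
// string buffer the implementation provides.
inline std::size_t extra_bytes_per_copy() {
    return sizeof(caching_iter) - sizeof(offset_iter);
}
```

The exact number depends on the standard library's std::string layout, so
this is only meant to show the direction of the cost, not to measure it.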

Now that filesystem is proposed for the standard I would like to ask
boosters (and Beman, of course) if they find these performance concerns
serious enough.

Regards,

Ion