From: Sebastian Redl (sebastian.redl_at_[hidden])
Date: 2008-03-08 10:55:28


Phil Endecott wrote:
> OK, the code is here:
> http://svn.chezphil.org/libpbe/trunk/include/charset/
>
> and there are some very basic docs here:
> http://svn.chezphil.org/libpbe/trunk/doc/charsets/
> (Have a look at intro.txt for the feature list.)
>
Another conceptual problem in your traits. Take a look at UTF-8's
skip_forward_char:

  template <typename char8_ptr_t>
  static void skip_forward_char(char8_ptr_t& i) {
    do {
      ++i;
    } while (!char_start_byte(*i)); // Maybe hint this?
  }

And this loop:

  for (iterator it = cnt.begin(); it != cnt.end(); skip_forward_char(it)) {
  }

This invokes undefined behaviour for any non-empty container. Consider
the case where it is just before end(), i.e. ++it == cnt.end(). Then
skip_forward_char() will indeed do ++it, and then do *it, thus
dereferencing the past-the-end iterator. Boom. (If the last character
is multi-byte, the do-while simply gets there a few increments later.)
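
To make the failure visible without actually triggering it, here is a
minimal sketch that replays the same loop on indices and reports the
point where the original would evaluate *it with it == end(). The names
is_start_byte, skip_reads_past_end and loop_reads_past_end are
hypothetical and not part of libpbe; is_start_byte is only an assumption
about what char_start_byte tests (any byte that is not a 10xxxxxx
continuation byte).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed stand-in for char_start_byte: true unless b is a UTF-8
// continuation byte (bit pattern 10xxxxxx).
inline bool is_start_byte(unsigned char b) { return (b & 0xC0) != 0x80; }

// Replays the unbounded skip on an index instead of an iterator.
// Returns true at the point where the original would evaluate *i with
// i == end(); the out-of-range read itself is never performed here.
inline bool skip_reads_past_end(const std::vector<unsigned char>& v,
                                std::size_t& i) {
  assert(i < v.size());  // the for loop only steps while it != end()
  do {
    ++i;
    if (i == v.size()) return true;  // original: *i on past-the-end
  } while (!is_start_byte(v[i]));
  return false;
}

// Runs the whole for loop; true if any step would read past the end.
inline bool loop_reads_past_end(const std::vector<unsigned char>& v) {
  for (std::size_t i = 0; i != v.size(); ) {
    if (skip_reads_past_end(v, i)) return true;
  }
  return false;
}
```

Under these assumptions loop_reads_past_end returns true for every
non-empty input, because the only way for the index to reach v.size()
is from inside the skip.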

Compare with filter_iterator, which is constructed with the end of the
underlying range for exactly this reason. skip_forward_char *must* take
the end iterator, too, and stop when reaching it. This, in turn, makes
the charset adapter iterator that much more complicated.
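
A bounded variant could look roughly like the sketch below. It is not
the libpbe code: the two-argument signature, the free-function form, the
char_start_byte stand-in and the count_chars helper are all assumptions
made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed stand-in for the trait's char_start_byte: true unless b is a
// UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool char_start_byte(unsigned char b) { return (b & 0xC0) != 0x80; }

// Bounded skip: i is compared against end before every dereference, so
// the past-the-end iterator is never read.
// Precondition: i != end.
template <typename char8_ptr_t>
void skip_forward_char(char8_ptr_t& i, char8_ptr_t end) {
  assert(i != end);
  do {
    ++i;
  } while (i != end && !char_start_byte(static_cast<unsigned char>(*i)));
}

// Hypothetical helper: counts characters by stepping with the bounded
// skip, using the same loop shape as above.
inline std::size_t count_chars(const std::vector<unsigned char>& cnt) {
  typedef std::vector<unsigned char>::const_iterator iterator;
  std::size_t n = 0;
  for (iterator it = cnt.begin(); it != cnt.end();
       skip_forward_char(it, cnt.end())) {
    ++n;
  }
  return n;
}
```

The cost is the one noted above: whatever wraps this (the charset
adapter iterator) has to carry the end iterator alongside the current
position. A truncated trailing sequence is counted as one character
here rather than diagnosed.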

Sebastian Redl