Subject: Re: [boost] RFC: edit_distance / edit_alignment library
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2013-07-24 06:36:40


Erik Erlandson wrote:
> I am grappling with how best to represent the returned "edit
> script".

"Best" obviously depends on what the caller is going to do with it.
A good design for the interface should emerge once some experience
has been gained with using the algorithm in actual applications.
Attempting to design a general-purpose interface too soon can be
a mistake.

Having said that, the most generic way to return the edit script
would be to template the algorithm on an output-handling class:

template <typename ITER1, typename ITER2, typename OUTPUT>
void diff(ITER1 begin1, ITER1 end1, ITER2 begin2, ITER2 end2, OUTPUT& output)
{
  // ... (algorithm elided) ...
  // Eventually it calls the output object's methods, passing iterator
  // ranges, something like this:
  output.from_1(i,j); // "deletion", present in 1 but not in 2
  output.from_2(p,q); // "insertion", present in 2 but not in 1
  output.from_both(w,x, y,z); // common subsequence present in both
}

Used as follows:

#include <iostream>
#include <string>

// Write edit script to cout in the style of the diff program
class diff_output
{
public:
  // Range [a,b) is present in sequence 1 only
  template <typename ITER>
  void from_1(ITER a, ITER b)
  {
    std::cout << "< " << std::string(a,b) << "\n";
  }

  // Range [a,b) is present in sequence 2 only
  template <typename ITER>
  void from_2(ITER a, ITER b)
  {
    std::cout << "> " << std::string(a,b) << "\n";
  }

  // Common subsequences are not printed
  template <typename ITER1, typename ITER2>
  void from_both(ITER1, ITER1, ITER2, ITER2)
  {
  }
};
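
To make the calling convention concrete, here is a rough, self-contained
sketch. Everything in it is an assumption for illustration only: the
namespace sketch, the stream_output handler (a variant of the class above
that writes to any std::ostream), the render helper, and above all the
algorithm, which is a plain quadratic LCS table over random-access
iterators standing in for whatever the real library would use.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, for illustration only

// Placeholder algorithm (an assumption, not the proposed library):
// builds a quadratic LCS table, then walks it and reports each run
// of elements to the output handler.
template <typename ITER1, typename ITER2, typename OUTPUT>
void diff(ITER1 begin1, ITER1 end1, ITER2 begin2, ITER2 end2, OUTPUT& output)
{
  const std::size_t n = static_cast<std::size_t>(std::distance(begin1, end1));
  const std::size_t m = static_cast<std::size_t>(std::distance(begin2, end2));

  // lcs[i][j] = length of the longest common subsequence of the
  // suffixes starting at begin1+i and begin2+j.
  std::vector<std::vector<std::size_t> > lcs(n + 1, std::vector<std::size_t>(m + 1, 0));
  for (std::size_t i = n; i-- > 0; )
    for (std::size_t j = m; j-- > 0; )
      lcs[i][j] = (begin1[i] == begin2[j]) ? lcs[i + 1][j + 1] + 1
                                           : std::max(lcs[i + 1][j], lcs[i][j + 1]);

  std::size_t i = 0, j = 0;
  while (i < n || j < m) {
    const std::size_t i0 = i, j0 = j;

    // Run of elements common to both sequences.
    while (i < n && j < m && begin1[i] == begin2[j]) { ++i; ++j; }
    if (i > i0) {
      output.from_both(begin1 + i0, begin1 + i, begin2 + j0, begin2 + j);
      continue;
    }

    // Run present in sequence 1 only ("deletion").
    while (i < n && (j == m || (!(begin1[i] == begin2[j]) &&
                                lcs[i + 1][j] >= lcs[i][j + 1]))) ++i;
    if (i > i0) {
      output.from_1(begin1 + i0, begin1 + i);
      continue;
    }

    // Run present in sequence 2 only ("insertion").
    while (j < m && (i == n || (!(begin1[i] == begin2[j]) &&
                                lcs[i + 1][j] < lcs[i][j + 1]))) ++j;
    output.from_2(begin2 + j0, begin2 + j);
  }
}

// Hypothetical handler: like diff_output, but writes to any ostream.
class stream_output
{
public:
  explicit stream_output(std::ostream& os) : os_(os) {}

  template <typename ITER>
  void from_1(ITER a, ITER b) { os_ << "< " << std::string(a, b) << "\n"; }

  template <typename ITER>
  void from_2(ITER a, ITER b) { os_ << "> " << std::string(a, b) << "\n"; }

  template <typename ITER1, typename ITER2>
  void from_both(ITER1, ITER1, ITER2, ITER2) {}  // common text not printed

private:
  std::ostream& os_;
};

// Hypothetical convenience wrapper: diff two strings, return the text.
inline std::string render(const std::string& a, const std::string& b)
{
  std::ostringstream os;
  stream_output out(os);
  diff(a.begin(), a.end(), b.begin(), b.end(), out);
  return os.str();
}

}  // namespace sketch
```

With this sketch, sketch::render("abcdef", "abxdefg") yields the three
lines "< c", "> x" and "> g". The point is only that diff knows nothing
about how its results are consumed; a different handler can be swapped
in without touching the algorithm.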

An important feature of this design is that the algorithm itself doesn't
store the edit script; it only hands each piece to the output object.
If the caller wants to store the script, they can supply an output object
that does that.
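
For instance, a storing handler might look something like the following
minimal sketch. The names edit_op, kind_t, storing_output and script() are
all made up here, and copying each range into a std::string is just one
possible choice; a real caller might prefer to keep the iterators or
offsets instead.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one step of the edit script.
struct edit_op
{
  enum class kind_t { deletion, insertion, common };
  kind_t kind;
  std::string text;  // copy of the elements in the reported range
};

// Hypothetical output handler that stores the edit script in a vector
// instead of printing it.
class storing_output
{
public:
  template <typename ITER>
  void from_1(ITER a, ITER b)
  {
    ops_.push_back(edit_op{edit_op::kind_t::deletion, std::string(a, b)});
  }

  template <typename ITER>
  void from_2(ITER a, ITER b)
  {
    ops_.push_back(edit_op{edit_op::kind_t::insertion, std::string(a, b)});
  }

  // Only the copy from sequence 1 is kept; the range from sequence 2
  // holds the same elements.
  template <typename ITER1, typename ITER2>
  void from_both(ITER1 a, ITER1 b, ITER2, ITER2)
  {
    ops_.push_back(edit_op{edit_op::kind_t::common, std::string(a, b)});
  }

  const std::vector<edit_op>& script() const { return ops_; }

private:
  std::vector<edit_op> ops_;
};
```

The caller would pass a storing_output to diff and read script()
afterwards, so callers that only want to print or count the edits never
pay for the storage.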

Regards, Phil.