From: llwaeva_at_[hidden]
Date: 2006-07-28 02:23:32
Hi there,
  I am using regex_replace to find and replace a pattern. The code is shown below:
  string src = "xxx%R__xy\r\n%\r\nRyyyy% A__%%A\r\n%Rzzz%C%Appp_%C0123\r\n%Rooo";
  re = "(%R)|(%A)|(%C)";
  format = "(?1#R)(?2#A)(?3#C)";
  cout << "SOURCE:" << endl << src << endl << endl;
  regex_replace( src.begin(), src.begin(), src.end(), re, format, format_all );
  cout << "OUTPUT:" << endl << src << endl << endl; 
1) For replacing %R with #R, %A with #A and %C with #C, and saving the output back to the
source string, the above code does a good job. The output is
xxx#R__xy
%
Ryyyy% A__%#A
#Rzzz#C#Appp_#C0123
#Rooo
NOTE that %%A is replaced with %#A
2) If I change the format so that the replacement text is longer than the matched text, e.g.
  re = "(%R)|(%A)|(%C)";
  format = "(?1#RRR)(?2#AAA)(?3#CCC)";
  
  regex_replace raises an error. I think the error comes from the original string not being long
enough to store the output string. The problem can be solved by the following code
  string src = "xxx%R__xy\r\n%\r\nRyyyy% A__%%A\r\n%Rzzz%C%Appp_%C0123\r\n%Rooo";
  string output=src;
  re = "(%R)|(%A)|(%C)";
  format = "(?1#RRR)(?2#AAA)(?3#CCC)";
  cout << "SOURCE:" << endl << src << endl << endl;
  regex_replace( output.begin(), src.begin(), src.end(), re, format, format_all);
  cout << "OUTPUT:" << endl << output << endl << endl; 
 But I do want the output stored in the original string. Reassigning
the source string solves the problem, i.e. src = output, but for a large
input string (in my case, >5M), this way is not that good. I am looking
for a better approach.
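
One possible way around the copy in item 2, offered as a sketch rather than a definitive answer: the iterator overload of regex_replace writes through the output iterator without resizing the destination, so a destination that starts at src.begin() or output.begin() only has room for src.size() characters. Appending through std::back_inserter lets the destination grow, and swapping the result into src exchanges the buffers instead of copying the characters. The sketch uses std::regex from the C++11 standard library as a stand-in so that it compiles without Boost; boost::regex_replace has an overload of the same shape. Because the conditional format (?1...) is a Boost extension, the sketch assumes a single capture group and the $1 placeholder instead. The function name expand_in_place and the reserve size are hypothetical choices.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Replace every %R, %A, %C in `src` with #RRR, #AAA, #CCC and store the
// result back in `src`. std::regex is a stand-in for Boost.Regex here.
void expand_in_place(std::string& src)
{
    static const std::regex re("%([RAC])");
    const std::string format = "#$1$1$1";  // $1 is the captured letter

    std::string output;
    output.reserve(src.size() * 2);  // rough guess to limit reallocations

    // back_inserter appends to `output`, so the string grows as needed
    // instead of being overrun like a fixed-size destination.
    std::regex_replace(std::back_inserter(output),
                       src.begin(), src.end(), re, format);

    src.swap(output);  // exchange buffers rather than copying characters
}
```

With this sketch, a src of "xxx%R" holds "xxx#RRR" after the call.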
BTW, if the output string is shorter than the original string,
the output carries some extra characters, e.g.
  string src = "xxx%Ry%Az%Ce";
  string output=src;
  re = "(%R)|(%A)|(%C)";
  format = "(?1R)(?2A)(?3C)";
 
 The output is "xxxRyAzCe%Ce", where the trailing %Ce are extra characters
from the source string. How can I kill the extra characters?
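
A possible explanation for the extra characters, with a hedged sketch: output starts as a copy of src, regex_replace overwrites only the first part of it, and the tail of the copy is left in place. The iterator overload returns the output iterator one past the last character written, so the leftover tail can be erased from that position. As above, std::regex stands in for Boost.Regex and a single capture group with $1 replaces the conditional format; strip_markers is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Strip the '%' from %R, %A, %C. The result is never longer than the input,
// which is the only reason a pre-sized destination is safe here.
std::string strip_markers(const std::string& src)
{
    static const std::regex re("%([RAC])");
    const std::string format = "$1";

    std::string output = src;  // destination pre-sized to the input length

    // The iterator overload returns the position one past the last
    // character it wrote.
    std::string::iterator written_end =
        std::regex_replace(output.begin(), src.begin(), src.end(), re, format);

    // Everything from there on is leftover from the initial copy of src.
    output.erase(written_end, output.end());
    return output;
}
```

Writing through std::back_inserter into an empty string avoids the leftover tail altogether, since nothing is there to begin with.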
3) Finally, I want to modify the search condition to make sure that only %X rather than %%X (X can
be R, A or C) is replaced. I tried the following regular expression
  re = "([^%]%R)|([^%]%A)|([^%]%C)";
but it doesn't work properly for my problem, e.g. for src = "xxx%R %%R" and
the format "(?1#R)(?2#A)(?3#C)", the regular expression kills the 'x' before %R, i.e.
the output is
"xx#R %%R" rather than "xxx#R %%R".
Please help!
Thanks in advance.
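
Regarding item 3, the 'x' disappears because [^%] is part of the match, so the character it consumes is replaced along with %R. A minimal sketch of one alternative, assuming that "not preceded by %" is the intended rule: match only %X and check the character before each match by hand, which emulates a negative lookbehind. Boost.Regex's Perl syntax documents a lookbehind assertion, (?<!%), that may express this directly; the sketch does not rely on it and uses std::regex so that it compiles without Boost. The name replace_unescaped is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Replace %R, %A, %C with #R, #A, #C unless the '%' is itself preceded by '%'.
std::string replace_unescaped(const std::string& src)
{
    static const std::regex re("%([RAC])");

    std::string output;
    output.reserve(src.size());
    std::size_t copied = 0;  // index in src up to which text has been emitted

    for (std::sregex_iterator it(src.begin(), src.end(), re), end; it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m[0].first - src.begin());

        // Emulate a negative lookbehind: skip matches that follow a '%'.
        if (pos > 0 && src[pos - 1] == '%')
            continue;

        output.append(src, copied, pos - copied);  // text before the match
        output += '#';
        output += m.str(1);                        // the captured letter
        copied = pos + static_cast<std::size_t>(m.length(0));
    }
    output.append(src, copied, std::string::npos);  // remaining tail
    return output;
}
```

For src = "xxx%R %%R" this sketch returns "xxx#R %%R", and adjacent markers such as "%C%A" are both replaced because no neighbouring character is consumed by the match.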