From: John Maddock (jm_at_[hidden])
Date: 2003-03-28 07:01:17
> I am writing a multithreaded Apache log parser that uses the Boost
> 1_29_0 regex split function to separate elements in the entry. Each
> thread parses a separate log file. The code seems to be working
> correctly on a 1-CPU system, but when I use a 14-CPU Sun server, I
> see massive locking (LCK column of prstat -amLvu username), and
> performance suffers horribly (as measured by the lines processed per
> second). I spent a lot of time checking to see where the locking was
> occurring. I went so far as to compile the code with Sun's Forte 6u2
> and use their analysis tools to identify the problem area. I've
> compiled all code (including Boost) with both gcc 3.2.2 and Forte to
> create 64-bit binaries, if that makes any difference.
>
> If I read the Forte analysis tools correctly, the place I'm seeing
> all the locking is the call to malloc in the void *operator
> new(unsigned long), which is called by
> boost::re_detail::match_results_base and _priv_match_data. Those are
> in turn called by query_match_aux, which is called by reg_grep2.
> Assuming I'm reading it right...
>
> At this point it seems like the issue is either with the library or
> my usage of it. Has anyone seen this before? Any pointers on what I
> may be doing wrong and how to fix it would be appreciated.
The locking is occurring in your runtime library (inside malloc) rather than
in boost.regex as such. You have two choices:
1) Give the match_results instance that you are using a custom allocator
that draws from thread-specific memory pools.
2) Wait for the next release (probably still a couple of months away), which
will use much less dynamic memory allocation (almost none at all in
recursive mode).
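For option 1, a minimal sketch of what a thread-specific pool allocator might
look like follows. It is not code from boost.regex: ThreadArena, local_arena,
ThreadLocalAllocator and parse_worker are hypothetical names, the 64 KB pool
size is arbitrary, and it is written against the C++11 allocator model, so an
older compiler or library would need the full set of allocator typedefs and
rebind as well. It also assumes that memory is always freed on the thread
that allocated it. The idea is that each thread bumps a pointer through its
own buffer, so the common path never reaches the process-wide malloc lock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <memory>
#include <new>
#include <vector>

// Hypothetical per-thread bump arena. Only its owning thread touches it,
// so no locking is needed.
class ThreadArena {
public:
    static const std::size_t kSize = 64 * 1024;  // arbitrary pool size

    // Returns nullptr when the pool is exhausted; the caller falls back to malloc.
    void* allocate(std::size_t bytes, std::size_t align) {
        if (bytes == 0) bytes = 1;
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > kSize || bytes > kSize - start) return nullptr;
        used_ = start + bytes;
        ++live_;
        return buffer_.get() + start;
    }

    bool owns(const void* p) const {
        const void* lo = buffer_.get();
        const void* hi = buffer_.get() + kSize;
        std::less<const void*> before;
        return !before(p, lo) && before(p, hi);
    }

    // Space is reclaimed only when every block handed out has been returned.
    void release() {
        assert(live_ > 0);
        if (--live_ == 0) used_ = 0;
    }

private:
    std::unique_ptr<unsigned char[]> buffer_{new unsigned char[kSize]};
    std::size_t used_ = 0;
    std::size_t live_ = 0;
};

inline ThreadArena& local_arena() {
    thread_local ThreadArena arena;  // one arena per thread
    return arena;
}

// Minimal C++11 allocator over the calling thread's arena.
// Assumption: deallocate() runs on the same thread as the matching allocate().
template <class T>
struct ThreadLocalAllocator {
    typedef T value_type;

    ThreadLocalAllocator() = default;
    template <class U>
    ThreadLocalAllocator(const ThreadLocalAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        std::size_t bytes = n * sizeof(T);
        if (void* p = local_arena().allocate(bytes, alignof(T)))
            return static_cast<T*>(p);
        void* q = std::malloc(bytes ? bytes : 1);  // pool full: shared heap
        if (!q) throw std::bad_alloc();
        return static_cast<T*>(q);
    }

    void deallocate(T* p, std::size_t) noexcept {
        if (local_arena().owns(p)) local_arena().release();
        else std::free(p);
    }
};

template <class T, class U>
bool operator==(const ThreadLocalAllocator<T>&, const ThreadLocalAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const ThreadLocalAllocator<T>&, const ThreadLocalAllocator<U>&) { return false; }

// Hypothetical worker: a std::vector stands in for the per-thread match data.
std::size_t parse_worker(int lines) {
    std::vector<int, ThreadLocalAllocator<int> > fields;
    for (int i = 0; i < lines; ++i) fields.push_back(i);
    return fields.size();
}
```

An allocator of this shape would be supplied as the allocator template
argument of match_results in place of the default. The pool only recycles
its space once everything allocated from it has been freed, which suits
short-lived per-line match data but not long-lived results.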
John.