From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2006-11-21 21:05:54
Sorry, the first message got sent out too early...
> David Abrahams wrote:
> 
> > > Yes, and Slex is the other one
> > 
> > Not to mention XPressive?
> 
> Xpressive is not really usable as a lexer, and Eric is aware of that. 
> I have a Wave lexer implemented with Xpressive here on my hard disk, 
> and it functions well; it is only three orders of magnitude slower 
> than, for instance, the re2c-based one. The main reasons are:
> 
> - no optimization across the different regexes used for token 
> representation (no internal NFA/DFA generation)
> - no way to tell which alternative matched when using regexes 
> containing alternatives
> 
> The first rules out using separate regexes, one for each token; the 
> second prevents us from using one giant regex with alternatives...
> 
> Both are probably just natural restrictions stemming from the fact 
> that Xpressive is a regex library, not a lexer generator.
> The same issues would probably occur if we were trying to use 
> Boost.Regex for this task.
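
To make the first point in the quoted text concrete, here is a minimal sketch of the "one separate regex per token" approach. It uses std::regex as a stand-in, not Xpressive and not the actual Wave lexer code; the token ids, the Rule struct and the tokenize function are hypothetical names chosen for illustration. Every pattern is tried independently at every input position, so nothing is shared between the patterns the way a generated DFA would share it.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical token ids, for illustration only.
enum class TokenId { Identifier, Integer, Plus, Whitespace };

struct Token {
    TokenId id;
    std::string text;
};

// One independent regex per token type (assumed layout, not Wave's).
struct Rule {
    TokenId id;
    std::regex pattern;
};

// Try every rule at the current position and keep the longest match.
// Each pattern runs on its own, so the work per token grows with the
// number of rules; there is no combined automaton.
std::vector<Token> tokenize(const std::string& input) {
    static const std::vector<Rule> rules = {
        {TokenId::Identifier, std::regex("[A-Za-z_][A-Za-z0-9_]*")},
        {TokenId::Integer,    std::regex("[0-9]+")},
        {TokenId::Plus,       std::regex("\\+")},
        {TokenId::Whitespace, std::regex("[ \\t\\n]+")},
    };

    std::vector<Token> tokens;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        const Rule* best = nullptr;
        std::ptrdiff_t best_len = 0;
        for (const Rule& rule : rules) {
            std::smatch m;
            // match_continuous anchors the match at 'pos'.
            if (std::regex_search(pos, input.cend(), m, rule.pattern,
                                  std::regex_constants::match_continuous) &&
                m.length(0) > best_len) {
                best = &rule;
                best_len = m.length(0);
            }
        }
        if (!best) break;  // unrecognized input: stop in this sketch
        tokens.push_back({best->id, std::string(pos, pos + best_len)});
        pos += best_len;
    }
    return tokens;
}
```

With only four rules the overhead is negligible, but a full C++ token set needs many more patterns, and each of them is run for each token.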
FYI, I found my old timings of the different lexer types:
Timing results for the different lexer types included with Wave:
               Re2C             Slex              Xlex
===============================================================================
All C++ tokens, lexer gets instantiated for every C++ token
-------------------------------------------------------------------------------
1000 times     1.63[s]          2.08[s]           1047.60[s]
                                                   751.57[s] (hoisted regex_match struct)
===============================================================================
Regards Hartmut