From: Andreas Huber (ahd6974-spamgroupstrap_at_[hidden])
Date: 2005-03-12 06:28:46


Jonathan Turkanis wrote:
>> http://article.gmane.org/gmane.comp.lib.boost.devel/119246
>
> Sorry I missed this. I think I missed the whole message.
>
>> Let me know whether that's convincing or not.
>
> Okay. Let me see if I understand you correctly:
>
> Andreas Huber wrote:
>> ... What I want to say here
>> is that it is a bad idea to use this library for parsing, as it is so
>> general that performance tradeoffs are inevitable.
>
> This seems to be a restatement of the performance vs. scalability
> part of the rationale, correct?

Yes.

>> For parsing, one
>> is much better off with a parser framework, which is optimized for
>> the task.
>
>> The IMO important observation is the following: Parsers
>> like Spirit only need to return when they have finished processing
>> all "events" (characters). An FSM however must return after
>> processing each event. can exploit this fact and store state
>> with recursive calls, which is of course much faster (some stuff can
>> even be inlined) than any super-clever double-dispatch scheme.
>> There is often a hard limit for stack space which - I think - is the
>> reason why FSMs do make sense in parsers nevertheless.
>
> Here are you talking about any FSM framework or about your library in
> particular?

I'm talking about any FSM framework.

> At any rate, I don't understand why recursion is more
> efficient.

Recursion can be implemented with simple (non-virtual) function calls.
Moreover, if recursion depth is known at compile time, you can inline
such calls. I believe double-dispatch needs at least one indirection
(virtual call, function pointer, whatever). I might be wrong.

> In the case I mentioned -- non-blocking filters -- there is a strong
> resemblence with FSMs because the filter may be interrupted at any
> point during its processing if input is temporarily unavailable or if
> output buffers are full. Filter writers must categorize the various
> points at which the filtering algorithm might be interupted and save
> this information so that when processing begins again the algorithm
> can continue correctly. These categories correspond to states.
> Furthermore, there is often auxilliary information that must be
> stored; e.g., if a filter was interupted after it had consumed a
> character but before it could pass it downstream, the character must
> be stored together with the state. This auxilliary information varies
> from state to state; it would therefore seem to correspond to
> state-local storage.

Ok.

>> However, all
>> the FSM-employing parsers I know are of the dynamic variant (e.g.
>> regex) and these FSMs are not directly designed by humans but
>> computed by an algorithm.
>
> I don't see the relevance of this.

It's a real-world data point that supports (only weakly, admittedly) my
view that you not normally implement parsers-FSMs manually but you use a
tool to do that for you.

>> Last but not least I think it would be
>> tedious for a human to implement a parser on the state- level
>
> It's also tedious to write a non-blocking filter by hand; being able
> to use FSMs could help bring some order to the chaos.

I can't really comment on this, I'm not knowledgeable about these
things.

Regards,

-- 
Andreas Huber
When replying by private email, please remove the words spam and trap
from the address shown in the header.