Subject: [Boost-commit] svn:boost r54082 - in trunk/libs/spirit/doc: . karma lex
From: hartmut.kaiser_at_[hidden]
Date: 2009-06-18 22:35:09
Author: hkaiser
Date: 2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
New Revision: 54082
URL: http://svn.boost.org/trac/boost/changeset/54082
Log:
Some work on Lexer docs
Text files modified: 
   trunk/libs/spirit/doc/karma/quick_reference.qbk      |    32 ++++----                                
   trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk      |    56 --------------                          
   trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk |   150 ++++++++++++++++++++++++++++++++++++++++
   trunk/libs/spirit/doc/spirit2.qbk                    |     1                                         
   4 files changed, 169 insertions(+), 70 deletions(-)
Modified: trunk/libs/spirit/doc/karma/quick_reference.qbk
==============================================================================
--- trunk/libs/spirit/doc/karma/quick_reference.qbk	(original)
+++ trunk/libs/spirit/doc/karma/quick_reference.qbk	2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -150,44 +150,44 @@
 of the attribute of `a`, and `B` is the type of the attribute of `b`, then the 
 type of the attribute of `a >> b` will be `tuple<A, B>`.
 
-[table /Spirit.Karma/ compound generator attribute types
+[table Spirit.Karma compound generator attribute types
     [[Expression]           [Attribute]]
-    
-    [[sequence (`<<`)]      
+
+    [[sequence (`<<`)]
 [``a: A, b: B --> (a << b): tuple<A, B>
 a: A, b: Unused --> (a << b): A
 a: Unused, b: B --> (a << b): B
 a: Unused, b: Unused --> (a << b): Unused
 a: A, b: A --> (a << b): vector<A>``
 ]]
-                             
-    [[alternative (`|`)]    
+
+    [[alternative (`|`)]
 [``a: A, b: B --> (a | b): variant<A, B>
 a: A, b: Unused --> (a | b): variant<Unused, A>
 a: Unused, b: B --> (a | b): variant<Unused, B>
 a: Unused, b: Unused --> (a | b): Unused``
 a: A, b: A --> (a | b): A`]]
-                             
-    [[kleene (`*`)]         
+
+    [[kleene (`*`)]
 [``a: A --> *a: vector<A>
 a: Unused --> a: Unused``]]
-                             
-    [[plus (`+`)]           
+
+    [[plus (`+`)]
 [``a: A --> +a: vector<A>
 a: Unused --> a: Unused``]]
-                             
-    [[list (`%`)]           
+
+    [[list (`%`)]
 [``a: A, b: B --> (a % b): vector<A>
 a: Unused, b: B --> (a % b): Unused``]]
-                             
-    [[repetition]          
+
+    [[repetition]
 [``a: A --> repeat(...,...)[a]: vector<A>
 a: Unused --> repeat(...,...)[a]: Unused``]]
-                             
-    [[optional (`-`)]       
+
+    [[optional (`-`)]
 [``a: A --> -a: optional<A>
 a: Unused --> -a: Unused``]]
-                             
+
     [[and predicate (`&`)]  [`a: A --> &a: Unused`]]
     [[not predicate (`!`)]  [`a: A --> !a: Unused`]]
 ]
Modified: trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk
==============================================================================
--- trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk	(original)
+++ trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk	2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -60,60 +60,8 @@
 associated with a token definition gets executed after the recognition of a
 matching input sequence. The code above uses function objects constructed using 
 __phoenix2__, but it is possible to insert any C++ function or function object 
-as long as it exposes the interface:
-
-    void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id, Context& ctx);
-
-[variablelist where:
-    [[`Iterator& start`]    [This is a the iterator pointing to the begin of the 
-                             matched range in the underlying input sequence. The 
-                             type of the iterator is the same as specified while
-                             defining the type of the `lexertl_lexer<...>` 
-                             (its first template parameter). The semantic action 
-                             is allowed to change the value of this iterator
-                             influencing, the matched input sequence.]]
-    [[`Iterator& end`]      [This is a the iterator pointing to the end of the 
-                             matched range in the underlying input sequence. The 
-                             type of the iterator is the same as specified while
-                             defining the type of the `lexertl_lexer<...>` 
-                             (its first template parameter). The semantic action 
-                             is allowed to change the value of this iterator
-                             influencing, the matched input sequence.]]
-    [[`pass_flag& matched`] [This value is pre/initialized to `pass_normal`.
-                             If the semantic action sets it to `pass_fail` the 
-                             behaves as if the token has not been matched in 
-                             the first place. If the semantic action sets this
-                             to `pass_ignore` the lexer ignores the current
-                             token and tries to match a next token from the
-                             input.]]
-    [[`Idtype& id`]         [This is the token id of the type Idtype (most of 
-                             the time this will be a `std::size_t`) for the 
-                             matched token. The semantic action is allowed to 
-                             change the value of this token id, influencing the 
-                             if of the created token.]]
-    [[`Context& ctx`]       [This is a reference to a lexer specific, 
-                             unspecified type, providing the context for the
-                             current lexer state. It can be used to access
-                             different internal data items and is needed for
-                             lexer state control from inside a semantic 
-                             action.]]
-]
-
-When using a C++ function as the semantic action the following prototypes are 
-allowed as well:
-
-    void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id);
-    void f (Iterator& start, Iterator& end, pass_flag& matched);
-    void f (Iterator& start, Iterator& end);
-    void f ();
-
-Even if it is possible to write your own function object implementations (i.e. 
-using Boost.Lambda or Boost.Bind), the preferred way of defining lexer semantic 
-actions is to use __phoenix2__. In this case you can access the four parameters 
-described in the table above by using the predefined __spirit__ placeholders: 
-`_1` for the iterator range, `_2` for the token id, `_3` for the reference 
-to the boolean value signaling the outcome of the semantic action, and `_4` for 
-the reference to the internal lexer context. 
+as long as it exposes the proper interface. For more details please refer 
+to the section __sec_lex_semactions__. 
 
 [heading Associating Token Definitions with the Lexer]
 
Modified: trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk
==============================================================================
--- trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk	(original)
+++ trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk	2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -7,4 +7,154 @@
 ===============================================================================/]
 
 [section:lexer_semantic_actions Lexer Semantic Actions]
+
+The main task of a lexer is normally to recognize tokens in the input. 
+Traditionally, this has been complemented by the possibility to execute 
+arbitrary code whenever a certain token has been detected. __lex__ has been
+designed to support this mode of operation as well. We borrow from the concept
+of semantic actions for parsers (__qi__) and generators (__karma__). Lexer 
+semantic actions may be attached to any token definition. These are C++ 
+functions or function objects that are called whenever a token definition 
+successfully recognizes a portion of the input. Given a token definition
+`D` and a C++ function `f`, you can make the lexer call `f` whenever it 
+matches the corresponding input by attaching `f`:
+
+    D[f]
+
+The expression above links `f` to the token definition `D`. The required 
+prototype of `f` is:
+
+    void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id, Context& ctx);
+
+[variablelist where:
+    [[`Iterator& start`]    [This is the iterator pointing to the beginning 
+                             of the matched range in the underlying input 
+                             sequence. The type of the iterator is the same 
+                             as the one specified while defining the type of 
+                             the `lexertl_lexer<...>` (its first template 
+                             parameter). The semantic action is allowed to 
+                             change the value of this iterator, influencing 
+                             the matched input sequence.]]
+    [[`Iterator& end`]      [This is the iterator pointing to the end of the 
+                             matched range in the underlying input sequence. 
+                             The type of the iterator is the same as the one 
+                             specified while defining the type of the 
+                             `lexertl_lexer<...>` (its first template 
+                             parameter). The semantic action is allowed to 
+                             change the value of this iterator, influencing 
+                             the matched input sequence.]]
+    [[`pass_flag& matched`] [This value is pre-initialized to `pass_normal`.
+                             If the semantic action sets it to `pass_fail`, 
+                             the lexer behaves as if the token had not been 
+                             matched in the first place. If the semantic 
+                             action sets it to `pass_ignore`, the lexer 
+                             ignores the current token and tries to match 
+                             the next token from the input.]]
+    [[`Idtype& id`]         [This is the token id of type `Idtype` (most of 
+                             the time this will be a `std::size_t`) for the 
+                             matched token. The semantic action is allowed to 
+                             change the value of this token id, influencing 
+                             the id of the created token.]]
+    [[`Context& ctx`]       [This is a reference to a lexer-specific, 
+                             unspecified type, providing the context for the
+                             current lexer state. It can be used to access
+                             different internal data items and is needed for
+                             lexer state control from inside a semantic 
+                             action.]]
+]
+
+When using a C++ function as the semantic action the following prototypes are 
+allowed as well:
+
+    void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id);
+    void f (Iterator& start, Iterator& end, pass_flag& matched);
+    void f (Iterator& start, Iterator& end);
+    void f ();
+
+[heading The Context of a Lexer Semantic Action]
+
+The last parameter passed to any lexer semantic action is a reference to an 
+unspecified type (see the `Context` type in the table above). This type is 
+unspecified because it depends on, and is implemented by, the token type 
+returned by the lexer. Nevertheless, any context type is expected to expose a 
+couple of functions allowing the semantic action to influence the behavior of 
+the lexer. The following table gives an overview and a short description of 
+the available functionality.
+
+[table Functions exposed by any context passed to a lexer semantic action
+    [[Name]   [Description]]
+    [[`Iterator const& get_eoi() const`]
+     [The function `get_eoi()` may be used to access the end iterator of 
+      the input stream the lexer has been initialized with.]
+    ]
+    [[`Iterator const& less(Iterator const& it, int n)`]
+     [The function `less()` returns an iterator positioned at the nth input 
+      character beyond the current start iterator (i.e. by passing the return 
+      value to the parameter `end` it is possible to return all but the 
+      first n characters of the current token back to the input stream).]
+    ]
+    [[`void more()`]
+     [The function `more()` tells the lexer that the next time it matches a 
+      rule, the corresponding token should be appended onto the current token 
+      value rather than replacing it.]
+    ]
+    [[`bool lookahead(std::size_t id)`]
+     [The function `lookahead()` can be used, for instance, to implement 
+      lookahead for lexer engines not supporting constructs like flex' `a/b` 
+      (match `a`, but only when followed by `b`). It invokes the lexer on the 
+      input following the current token without actually moving forward in the 
+      input stream. The function returns whether the lexer was able to match a 
+      token with the given token id `id`.]
+    ]
+    [[`std::size_t get_state() const` and `void set_state(std::size_t state)`]
+     [The functions `get_state()` and `set_state()` may be used to introspect
+      and change the current lexer state.]
+    ]
+]
+
+[heading Lexer Semantic Actions Using Phoenix]
+
+While it is possible to write your own function object implementations (e.g. 
+using Boost.Lambda or Boost.Bind), the preferred way of defining lexer semantic 
+actions is to use __phoenix2__. In this case you can access the parameters 
+described above by using the predefined __spirit__ placeholders: 
+
+[table Predefined Phoenix placeholders for lexer semantic actions
+    [[Placeholder]    [Description]]
+    [[`_start`]       [Refers to the iterator pointing to the beginning of the 
+                       matched input sequence. Any modifications to this 
+                       iterator value will be reflected in the generated 
+                       token.]]
+    [[`_end`]         [Refers to the iterator pointing past the end of the 
+                       matched input sequence. Any modifications to this 
+                       iterator value will be reflected in the generated 
+                       token.]]
+    [[`_pass`]        [References the value signaling the outcome of the 
+                       semantic action. This is pre-initialized to 
+                       `lex::pass_flags::pass_normal`. If this is set to
+                       `lex::pass_flags::pass_fail`, the lexer will behave as 
+                       if no token had been matched; if it is set to 
+                       `lex::pass_flags::pass_ignore`, the lexer will ignore 
+                       the current match and proceed trying to match tokens 
+                       from the input.]]
+    [[`_tokenid`]     [Refers to the token id of the token to be generated. Any
+                       modifications to this value will be reflected in the
+                       generated token.]]
+    [[`_state`]       [Refers to the lexer state the input has been matched in.
+                       Any modifications to this value will be reflected in the
+                       lexer itself (the next match will start in the new 
+                       state). The currently generated token is not affected 
+                       by changes to this variable.]]
+    [[`_eoi`]         [References the end iterator of the overall lexer input.
+                       This value cannot be changed.]]
+]
+
+[heading Support Functions Callable from Semantic Actions]
+
+[table Support functions
+    [[Plain function] [Phoenix function] [Description]]
+
+[[`ctx.more()`]      [`lex::more()`]      [Tells the lexer that the next 
+                      time it matches a rule, the corresponding token should 
+                      be appended onto the current token value rather than 
+                      replacing it.]]
+[[`ctx.less()`]      [`lex::less()`]      [Returns an iterator positioned at 
+                      the nth input character beyond the current start 
+                      iterator, allowing all but the first n characters of 
+                      the current token to be returned to the input stream.]]
+[[`ctx.lookahead()`] [`lex::lookahead()`] [Invokes the lexer on the input 
+                      following the current token without moving forward in 
+                      the input stream; returns whether a token with the 
+                      given token id could be matched.]]
+
+]
+
 [endsect]
Modified: trunk/libs/spirit/doc/spirit2.qbk
==============================================================================
--- trunk/libs/spirit/doc/spirit2.qbk	(original)
+++ trunk/libs/spirit/doc/spirit2.qbk	2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -80,6 +80,7 @@
 [def __sec_lex_primitives__     [link spirit.lex.abstracts.lexer_primitives Lexer Primitives]]
 [def __sec_lex_tokenvalues__    [link spirit.lex.abstracts.lexer_primitives.lexer_token_values About Tokens and Token Values]]
 [def __sec_lex_attributes__     [link spirit.lex.abstracts.lexer_attributes Lexer Attributes]]
+[def __sec_lex_semactions__     [link spirit.lex.abstracts.lexer_semantic_actions Lexer Semantic Actions]]
 
 [def __sec_ref_lex_token__      [link spirit.lex.reference.concepts.token Token Reference]]
 [def __sec_ref_lex_token_def__  [link spirit.lex.reference.concepts.tokendef TokenDef Reference]]