Subject: [Boost-commit] svn:boost r52169 - trunk/libs/spirit/doc/reference/lex
From: jamin.hanson_at_[hidden]
Date: 2009-04-04 09:56:39
Author: ben_hanson
Date: 2009-04-04 09:56:39 EDT (Sat, 04 Apr 2009)
New Revision: 52169
URL: http://svn.boost.org/trac/boost/changeset/52169
Log:
regex table
Text files modified: 
   trunk/libs/spirit/doc/reference/lex/lexer.qbk |    68 ++++++++++++++++++++++++++++++++++++++++
   1 files changed, 68 insertions(+), 0 deletions(-)
Modified: trunk/libs/spirit/doc/reference/lex/lexer.qbk
==============================================================================
--- trunk/libs/spirit/doc/reference/lex/lexer.qbk	(original)
+++ trunk/libs/spirit/doc/reference/lex/lexer.qbk	2009-04-04 09:56:39 EDT (Sat, 04 Apr 2009)
@@ -7,4 +7,72 @@
 ===============================================================================/]
 
 [section Lexer]
+[table Regular expressions support
+    [[Expression]   [Meaning]]
+    [[`x`]          [Match any character `x`]]
+    [[`.`]          [Match any character except newline (or optionally *any* character)]]
+    [[`[xyz]`]      [A character class; in this case matches `x`, `y` or `z`]]
+    [[`[abj-oZ]`]   [A character class with a range in it; matches `a`, `b`, any
+                     letter from `j` through `o`, or a `Z`]]
+    [[`[^A-Z]`]     [A negated character class i.e. any character but those in 
+                     the class. In this case, any character except an uppercase 
+                     letter]]
+    [[`r*`]         [Zero or more r's (greedy), where r is any regular expression]]
+    [[`r*?`]        [Zero or more r's (abstemious), where r is any regular expression]]
+    [[`r+`]         [One or more r's (greedy)]]
+    [[`r+?`]        [One or more r's (abstemious)]]
+    [[`r?`]         [Zero or one r's (greedy), i.e. optional]]
+    [[`r??`]        [Zero or one r's (abstemious), i.e. optional]]
+    [[`r{2,5}`]     [Anywhere between two and five r's (greedy)]]
+    [[`r{2,5}?`]    [Anywhere between two and five r's (abstemious)]]
+    [[`r{2,}`]      [Two or more r's (greedy)]]
+    [[`r{2,}?`]     [Two or more r's (abstemious)]]
+    [[`r{4}`]       [Exactly four r's]]
+    [[`{NAME}`]     [The macro `NAME` (see below)]]
+    [[`"[xyz]\"foo"`]  [The literal string `[xyz]\"foo`]]
+    [[`\X`]         [If X is `a`, `b`, `e`, `n`, `r`, `f`, `t`, `v` then the 
+                     ANSI-C interpretation of `\X`. Otherwise a literal `X` 
+                     (used to escape operators such as `*`)]]
+    [[`\0`]         [A NUL character (ASCII code 0)]]
+    [[`\123`]       [The character with octal value 123]]
+    [[`\x2a`]       [The character with hexadecimal value 2a]]
+    [[`\cX`]        [A named control character `X`.]]
+    [[`\d`]         [A shortcut for `[0-9]`]]
+    [[`\D`]         [A shortcut for `[^0-9]`]]
+    [[`\s`]         [A shortcut for `[\x20\t\n\r\f\v]`]]
+    [[`\S`]         [A shortcut for `[^\x20\t\n\r\f\v]`]]
+    [[`\w`]         [A shortcut for `[a-zA-Z0-9_]`]]
+    [[`\W`]         [A shortcut for `[^a-zA-Z0-9_]`]]
+    [[`(r)`]        [Match an `r`; parentheses are used to override precedence 
+                     (see below)]]
+    [[`(?r-s:pattern)`] [Apply option 'r' and omit option 's' while interpreting pattern.
+Options may be zero or more of the characters 'i' or 's'.
+'i' means case-insensitive. '-i' means case-sensitive.
+'s' alters the meaning of the '.' syntax to match any single character whatsoever.
+'-s' alters the meaning of '.' to match any character except '`\n`'.]]
+    [[`rs`]         [The regular expression `r` followed by the regular 
+                     expression `s` (a sequence)]]
+    [[`r|s`]        [Either an `r` or an `s`]]
+    [[`^r`]         [An `r` but only at the beginning of a line (i.e. when just 
+                     starting to scan, or right after a newline has been 
+                     scanned)]]
+    [[`r$`]         [An `r` but only at the end of a line (i.e. just before a 
+                     newline)]]
+]
+
+[note POSIX character classes are not currently supported, due to performance issues
+when creating them in wide character mode.]
+[br][h1 Regular Expression Precedence]
+
+* `r*` has the highest precedence (`+`, `?`, `{n,m}` have the same precedence as `*`)
+* `rs` (a sequence) has the next highest precedence
+* `r|s` has the lowest precedence
+
+[h1 Macros]
+
+Regular expressions can be given a name and referred to in rules using the
+syntax `{NAME}` where `NAME` is the name you have given to the macro.  A macro
+name can be at most 30 characters long and must start with a `_` or a letter.
+Subsequent characters can be `_`, `-`, a letter or a decimal digit.
+Use the `rules::add_macro()` method to define a macro.
 [endsect]
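
The macro naming rules added in the Macros section can be sketched as a small standalone check. This is only an illustration: `is_valid_macro_name` is a hypothetical helper, not part of Boost.Spirit or of the lexer library behind `rules::add_macro()`, and it assumes "letter" means an ASCII letter in the default "C" locale.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not a Boost.Spirit API): checks a macro name against
// the rules stated in the Macros section of lexer.qbk.
bool is_valid_macro_name(const std::string& name)
{
    // At most 30 characters; an empty name has no valid first character.
    if (name.empty() || name.size() > 30)
        return false;

    const auto is_letter = [](unsigned char c) { return std::isalpha(c) != 0; };
    const auto is_digit = [](unsigned char c) { return std::isdigit(c) != 0; };

    // First character: '_' or a letter.
    const unsigned char first = static_cast<unsigned char>(name[0]);
    if (first != '_' && !is_letter(first))
        return false;

    // Subsequent characters: '_', '-', a letter or a decimal digit.
    for (std::string::size_type i = 1; i < name.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (c != '_' && c != '-' && !is_letter(c) && !is_digit(c))
            return false;
    }
    return true;
}
```

Under these assumptions, names such as `DIGIT` or `_my-macro1` pass, while `1abc`, `-abc`, or any name longer than 30 characters is rejected. How the real library reports an invalid name is not stated in this commit.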