<html>
<head>
<title>Directives</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>

<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  <tr> 
    <td width="10"> 
    </td>
    <td width="85%"> 
      <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Directives</b></font>
    </td>
    <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  </tr>
</table>
<br>
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="rule.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="20"><a href="scanner.html"><img src="theme/r_arr.gif" border="0"></a></td>
   </tr>
</table>
<p>Parser directives have the form: <b>directive[expression]</b></p>
<p>A directive modifies the behavior of its enclosed expression, essentially 'decorating' 
  it. The framework pre-defines a few directives. Clients of the framework are 
  free to define their own directives as needed. Information on how this is done 
  will be provided later. For now, we shall deal only with predefined directives.</p>
<h2>lexeme_d</h2>
<p>Turns off white space skipping. By default, the parser skips white space 
  and anything else that the scanner passed into the parser's parse member function 
  is configured to treat as white space, which may include comments.</p>
<p>Situations where we want to work at the character level instead of the phrase 
  level call for a special construct. Rules can be directed to work at the character 
  level by enclosing the pertinent parts of the grammar inside the <tt>lexeme_d</tt> 
  directive. For example, let us complete the example presented in the <a href="introduction.html">Introduction</a>. 
  There, we skipped the definition of the <tt>integer</tt> rule. Although its 
  definition is quite obvious, here's how it is actually defined in the context 
  of the framework:</p>
<pre><code><font color="#000000"><span class=identifier>    </span><span class=identifier>integer </span><span class=special>= </span><span class=identifier>lexeme_d</span><span class=special>[ </span><span class=special>!(</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'+'</span><span class=special>) </span><span class=special>| </span><span class=literal>'-'</span><span class=special>) </span><span class=special>&gt;&gt; </span><span class=special>+</span><span class=identifier>digit </span><span class=special>];</span></font></code></pre>
<p>The <tt>lexeme_d</tt> directive forces the parser to work on the character 
  level. Without it, the <tt>integer</tt> rule would accept embedded white space, 
  so that an input such as <span class="quotes">&quot;1 2 345&quot;</span> 
  would be erroneously parsed as <span class="quotes">&quot;12345&quot;</span>.</p>
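<p>The difference between the two levels can be illustrated without the framework. 
  The following is a minimal, library-independent sketch; <tt>match_integer</tt> 
  and its <tt>skip_between</tt> flag are hypothetical names invented for this illustration 
  and are not part of Spirit. It only models the idea of skipping white space 
  before every character versus requiring the characters to be adjacent.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Spirit. Matches an optional sign followed
// by one or more digits at the start of `in` and returns the number of
// characters consumed (0 means no match).
// skip_between == true  : white space is skipped before every character,
//                         as at the phrase level.
// skip_between == false : characters must be adjacent, as inside lexeme_d.
std::size_t match_integer(const std::string& in, bool skip_between)
{
    std::size_t pos = 0;
    auto skip = [&] {
        while (skip_between && pos < in.size()
               && std::isspace(static_cast<unsigned char>(in[pos])))
            ++pos;
    };

    skip();
    if (pos < in.size() && (in[pos] == '+' || in[pos] == '-'))
        ++pos;

    std::size_t end = 0;    // position just past the last digit matched
    for (;;) {
        skip();
        if (pos >= in.size()
            || !std::isdigit(static_cast<unsigned char>(in[pos])))
            break;
        end = ++pos;
    }
    return end;             // stays 0 if no digit was found
}

void check_examples()
{
    assert(match_integer("1 2 345", true) == 7);   // read as 12345
    assert(match_integer("1 2 345", false) == 1);  // stops after "1"
    assert(match_integer("-42", false) == 3);
}
```

<p>With skipping enabled the whole of <span class="quotes">&quot;1 2 345&quot;</span> 
  is consumed as one integer; with skipping disabled the match ends after the 
  first digit.</p>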
<h2>nocase_d</h2>
<p>There are times when we want to inhibit case sensitivity. The <tt>nocase_d</tt> 
  directive converts all characters from the input to lower-case.</p>
<table width="80%" border="0" align="center">
  <tr> 
    <td class="note_box"><img src="theme/alert.gif" width="16" height="16"><b> 
      nocase_d behavior</b> <br>
      <br>
      It is important to note that only the input is converted to lower case. 
      Any parser enclosed inside the <tt>nocase_d</tt> directive that expects 
      any upper case characters will fail to parse anything. Example: <tt>nocase_d[<span class="quotes">'X'</span>]</tt> 
      will never succeed because it expects an upper case <tt class="quotes">'X'</tt> 
      that the <tt>nocase_d</tt> directive will never supply.</td>
  </tr>
</table>
<p>For example, Pascal ignores the case of letters in identifiers and keywords. 
  Thus the Pascal identifiers Id, ID and id are identical. Without the <tt>nocase_d</tt> 
  directive, it would be awkward to define a rule that recognizes this:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"id"</span><span class=special>) </span><span class=special>| </span><span class=string>"Id" </span><span class=special>| </span><span class=string>"iD" </span><span class=special>| </span><span class=string>"ID"</span><span class=special>;</span></font></code></pre>
<p>Now, try doing that with the case insensitive Pascal keyword <span class="quotes">&quot;BEGIN&quot;</span>. 
  The <tt>nocase_d</tt> directive makes this simple:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>nocase_d</span><span class=special>[</span><span class=string>"begin"</span><span class=special>];</span></font></code></pre>
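<p>The behavior described above, where only the input is converted to lower 
  case, can be modeled with a short library-independent sketch. The function 
  <tt>match_nocase</tt> is a hypothetical name used only for this illustration; 
  it is not how the framework implements the directive.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Spirit. Matches `literal` at the start of
// `in` after lower-casing each input character. The literal itself is left
// untouched, so a literal containing upper case letters can never match.
bool match_nocase(const std::string& in, const std::string& literal)
{
    if (in.size() < literal.size())
        return false;
    for (std::size_t i = 0; i < literal.size(); ++i) {
        char lowered = static_cast<char>(
            std::tolower(static_cast<unsigned char>(in[i])));
        if (lowered != literal[i])
            return false;
    }
    return true;
}

void check_examples()
{
    assert(match_nocase("BEGIN", "begin"));   // any casing of the input matches
    assert(match_nocase("Begin", "begin"));
    assert(!match_nocase("X", "X"));          // upper case literal never matches
    assert(!match_nocase("x", "X"));
}
```

<p>The last two checks mirror the warning about upper case literals: the input 
  is lowered to <tt class="quotes">'x'</tt> before the comparison, so a literal 
  <tt class="quotes">'X'</tt> cannot succeed.</p>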
<table width="80%" border="0" align="center">
  <tr> 
    <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <b>Primitive 
      arguments</b> <br>
      <br>
      The astute reader will notice that we did not explicitly wrap <span class="quotes">&quot;begin&quot;</span> 
      inside an <tt>str_p</tt>. Whenever appropriate, directives should be able 
      to allow primitive types such as <tt>char</tt>, <tt>int</tt>, <tt>wchar_t</tt>, 
      <tt>char const<span class="operators">*</span></tt>, <tt>wchar_t const<span class="operators">*</span></tt> 
      and so on. Examples: <tt><br>
      <br>
      <span class=identifier><code>nocase_d</code></span><code><span class=special>[</span><span class=string>"hello"</span><span class=special>] 
      </span><span class=comment>// is equivalent to nocase_d[str_p("hello")]</span></code></tt><code><br>
      <span class=identifier>nocase_d</span><span class=special>[</span><span class=literal>'x'</span><span class=special>] 
      </span><span class=comment>// is equivalent to nocase_d[ch_p('x')]</span></code></td>
  </tr>
</table>
<h2>longest_d</h2>
<p>Alternatives in the Spirit parser compiler are short-circuited (see <a href="operators.html">Operators</a>). 
  Sometimes, this is not what is desired. The <tt>longest_d</tt> directive instructs 
  the parser not to short-circuit alternatives enclosed inside this directive, 
  but instead makes the parser try all possible alternatives and choose the one 
  matching the longest portion of the input stream.</p>
<p>Consider the parsing of integers and real numbers:</p>
<pre><code><font color="#000000"><span class=comment>    </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>real </span><span class=special>| </span><span class=identifier>integer</span><span class=special>;</span></font></code></pre>
<p>A number can be a real or an integer. This grammar is ambiguous: an input such as <span class="quotes">&quot;1234&quot;</span> 
  can match both real and integer. Recall though that alternatives 
  are short-circuited. Thus, for inputs such as the above, the real alternative always 
  wins. If we swap the alternatives:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>real</span><span class=special>;</span></font></code></pre>
<p>We still have a problem. Now, an input <span class="quotes">&quot;123.456&quot;</span> 
  will be only partially matched by integer, up to the decimal point. This is not what 
  we want. The solution here is either to fix the ambiguity by factoring out the 
  common prefixes of real and integer or, if that is neither possible nor desired, 
  to use the <tt>longest_d</tt> directive:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>longest_d</span><span class=special>[ </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>real </span><span class=special>];</span></font></code></pre>
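<p>To make the difference concrete, here is a minimal, library-independent sketch 
  of the two selection strategies. All names are hypothetical and invented for 
  this illustration. It also assumes a simplified <tt>real</tt> that requires 
  a decimal point and a fraction, which is narrower than the real parser discussed 
  above.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Counts consecutive decimal digits in `in` starting at `pos`.
static std::size_t count_digits(const std::string& in, std::size_t pos)
{
    std::size_t n = 0;
    while (pos + n < in.size()
           && std::isdigit(static_cast<unsigned char>(in[pos + n])))
        ++n;
    return n;
}

// Each matcher returns the length matched from the start of `in`, or -1.
long match_integer(const std::string& in)
{
    std::size_t n = count_digits(in, 0);
    return n ? static_cast<long>(n) : -1;
}

// Simplified real for this sketch: digits '.' digits.
long match_real(const std::string& in)
{
    std::size_t whole = count_digits(in, 0);
    if (whole == 0 || whole >= in.size() || in[whole] != '.')
        return -1;
    std::size_t frac = count_digits(in, whole + 1);
    return frac ? static_cast<long>(whole + 1 + frac) : -1;
}

// Models  number = integer | real;  the first successful alternative wins.
long number_short_circuit(const std::string& in)
{
    long len = match_integer(in);
    return len >= 0 ? len : match_real(in);
}

// Models  number = longest_d[ integer | real ];  every alternative is tried
// and the longest match is kept.
long number_longest(const std::string& in)
{
    return std::max(match_integer(in), match_real(in));
}

void check_examples()
{
    assert(number_short_circuit("123.456") == 3);  // stops at the '.'
    assert(number_longest("123.456") == 7);        // real wins, whole input
    assert(number_longest("1234") == 4);           // integer wins
}
```

<p>The price of the longest-match strategy is that every alternative is attempted 
  even after one has already succeeded.</p>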
<h2>shortest_d</h2>
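<p>Opposite of the <tt>longest_d</tt> directive: the parser tries all the enclosed 
  alternatives and chooses the one matching the shortest portion of the input stream.</p>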
<table width="80%" border="0" align="center">
  <tr> 
    <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <b>Multiple 
      alternatives</b> <br>
      <br>
      The <tt>longest_d</tt> and <tt>shortest_d</tt> directives can accept two 
      or more alternatives. Examples:<br>
      <br>
      <font color="#000000"><span class=identifier><code>longest_d</code></span><code><span class=special>[ 
      </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b 
      </span><span class=special>| </span><span class=identifier>c </span><span class=special>]; 
      </span><span class=identifier><br>
      shortest_d</span><span class=special>[ </span><span class=identifier>a </span><span class=special>| 
      </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>c 
      </span><span class=special>| </span><span class=identifier>d </span><span class=special>];</span></code></font></td>
  </tr>
</table>
<h2>limit_d</h2>
<p>Ensures that the result of a parser is constrained to a given min..max range 
  (inclusive). If not, then the parser fails and returns a no-match.</p>
<p><b>Usage:</b></p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>, </span><span class=identifier>max</span><span class=special>)[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre>
<p>This directive is particularly useful in conjunction with parsers that parse 
  specific scalar values (for example, <a href="numerics.html">numeric parsers</a>). 
  Here's a practical example. Although the numeric parsers can be configured to 
  accept only a limited number of digits (say, 0..2), there is no way to limit 
  the result to a range (say, -1.0..1.0). This design is deliberate. Building range 
  limits into the numeric parsers would 
  have undermined Spirit's design rule that <i><span class="quotes">&quot;the 
  client should not pay for features that she does not use&quot;</span></i>. The 
  min and max values would have to be stored in the numeric parser itself, whether 
  used or not. We could get by with static constants configured by non-type 
  template parameters, but that is not acceptable because that way we can only 
  accommodate integers. What about real numbers or user-defined numbers such as 
  big-ints?</p>
<p><b>Example</b>, parse time of the form <b>HH:MM:SS</b>:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>uint_parser</span><span class=special>&lt;</span><span class=keyword>int</span><span class=special>, </span><span class=number>10</span><span class=special>, </span><span class=number>2</span><span class=special>, </span><span class=number>2</span><span class=special>&gt; </span><span class=identifier>uint2_p</span><span class=special>;

    </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>lexeme_d
        </span><span class=special>[
                </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>23u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=special>&gt;&gt; </span><span class=literal>':'    </span><span class=comment>//  Hours 00..23
            </span><span class=special>&gt;&gt;  </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>59u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=special>&gt;&gt; </span><span class=literal>':'    </span><span class=comment>//  Minutes 00..59
            </span><span class=special>&gt;&gt;  </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>59u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>]           </span><span class=comment>//  Seconds 00..59
        </span><span class=special>];</span></font></code>
</pre>
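<p>The effect of the three range checks can be sketched without the framework. 
  The code below is a hand-written approximation under stated assumptions: 
  <tt>limited_uint2</tt>, <tt>expect_char</tt> and <tt>is_valid_time</tt> are 
  hypothetical names, and requiring the whole input to be consumed is a choice 
  made for this sketch only.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Spirit. Reads exactly two decimal digits
// at `pos` and succeeds only if their value lies in [min, max] (inclusive).
// On success `pos` is advanced past the digits; on failure it is unchanged.
bool limited_uint2(const std::string& in, std::size_t& pos,
                   unsigned min, unsigned max)
{
    assert(min <= max);
    if (pos + 2 > in.size()
        || !std::isdigit(static_cast<unsigned char>(in[pos]))
        || !std::isdigit(static_cast<unsigned char>(in[pos + 1])))
        return false;
    unsigned value = static_cast<unsigned>(in[pos] - '0') * 10u
                   + static_cast<unsigned>(in[pos + 1] - '0');
    if (value < min || value > max)
        return false;   // value outside the range: report a no-match
    pos += 2;
    return true;
}

// Consumes the single character `c` at `pos`, if present.
bool expect_char(const std::string& in, std::size_t& pos, char c)
{
    if (pos < in.size() && in[pos] == c) {
        ++pos;
        return true;
    }
    return false;
}

// Accepts HH:MM:SS with hours 00..23 and minutes and seconds 00..59.
bool is_valid_time(const std::string& in)
{
    std::size_t pos = 0;
    return limited_uint2(in, pos, 0u, 23u) && expect_char(in, pos, ':')
        && limited_uint2(in, pos, 0u, 59u) && expect_char(in, pos, ':')
        && limited_uint2(in, pos, 0u, 59u)
        && pos == in.size();
}

void check_examples()
{
    assert(is_valid_time("23:59:59"));
    assert(!is_valid_time("24:00:00"));   // hours out of range
    assert(!is_valid_time("12:60:00"));   // minutes out of range
}
```

<p>Note that the digits are parsed first and the range is checked afterwards, 
  so the digit parser itself carries no min or max state.</p>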
<h2>min_limit_d</h2>
<p>Sometimes, it is useful to leave the maximum limit unconstrained. This 
  allows for an interval that is unbounded in one direction. The directive <tt>min_limit_d</tt> 
  ensures that the result of a parser is not less than the minimum. If it is, the 
  parser fails and returns a no-match.</p>
<p><b>Usage:</b></p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>)[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre>
<p><b>Example</b>, ensure that a year is not less than 1900:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=number>1900u</span><span class=special>)[</span><span class=identifier>uint_p</span><span class=special>]</span></font></code></pre>
<h2>max_limit_d</h2>
<p>Opposite of <tt>min_limit_d</tt>: ensures that the result of a parser is not 
  greater than the maximum. Thus, <tt>limit_d(min, max)[p]</tt> is equivalent to:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>)[</span><span class=identifier>max_limit_d</span><span class=special>(</span><span class=identifier>max</span><span class=special>)[</span><span class=identifier>p</span><span class=special>]]</span></font></code></pre>
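<p>The equivalence can be checked on plain values. The following sketch uses 
  hypothetical predicate names and relies only on <tt>operator&lt;</tt>, which 
  is one way such checks could work for reals and user-defined number types as 
  well as integers. It is not taken from the framework's implementation.</p>

```cpp
#include <cassert>

// Hypothetical predicates, not part of Spirit. Both bounds are inclusive.
template <typename T>
bool passes_min_limit(const T& value, const T& min)
{
    return !(value < min);
}

template <typename T>
bool passes_max_limit(const T& value, const T& max)
{
    return !(max < value);
}

// A two-sided limit is simply both one-sided limits applied together.
template <typename T>
bool passes_limit(const T& value, const T& min, const T& max)
{
    return passes_min_limit(value, min) && passes_max_limit(value, max);
}

void check_examples()
{
    assert(passes_limit(0.5, -1.0, 1.0));
    assert(passes_limit(1.0, -1.0, 1.0));     // bounds are inclusive
    assert(!passes_limit(1.5, -1.0, 1.0));
    assert(!passes_min_limit(1899u, 1900u));
}
```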
<table border="0">
  <tr> 
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="rule.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="20"><a href="scanner.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<br>
<hr size="1">
<p class="copyright">Copyright &copy; 1998-2002 Joel de Guzman<br>
  <br>
  <font size="2">Permission to copy, use, modify, sell and distribute this document 
  is granted provided this copyright notice appears in all copies. This document 
  is provided &quot;as is&quot; without express or implied warranty, and with 
  no claim as to its suitability for any purpose. </font> </p>
<p>&nbsp;</p>
</body>
</html>