<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Confix Parsers</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>

<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  <tr> 
    <td width="10"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>&nbsp;</b></font></td>
    <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Confix Parsers</b></font></td>
    <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  </tr>
</table>
<br>
<table border="0">
  <tr> 
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="character_sets.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="20"><a href="list_parsers.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<p><a name="confix_parser"></a><b>Confix Parsers</b></p>
<p>Confix Parsers recognize a (possibly 
  nested) sequence made up of three independent elements: an opening, an expression 
  and a closing. This utility parser may be used to parse structures
  where the opening may itself occur inside the expression and the 
  whole sequence is matched only once the closing part that corresponds to the first 
  opening has been seen. A simple example is a nested PASCAL comment: </p>
<pre><code class="comment">    { This is a { nested } PASCAL-comment }</code></pre>
<p>which could be parsed through the following rule definition:</p>
<pre><span class=identifier>    </span><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>pascal_comment_rule
        </span><span class=special>=   </span><span class=identifier>confix_nest_p</span><span class=special>(</span><span class=literal>'{'</span><span class=special>, </span><span class=special>*</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=literal>'}'</span><span class=special>)
        </span><span class=special>;</span></pre>
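<p>To make the nesting behaviour concrete, here is a minimal hand-written sketch 
  that does not use Spirit at all. The function <tt>match_nested</tt> is a hypothetical 
  helper, not part of the library. It assumes single-character delimiters that differ 
  from each other and only illustrates the idea that the sequence ends at the closing 
  that balances the first opening:</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Spirit. Returns the length of the nested
// confix sequence at the beginning of `input`, or 0 if `input` does not start
// with `open` or the delimiters never balance. Assumes open != close.
std::size_t match_nested(const std::string& input, char open, char close)
{
    if (input.empty() || input[0] != open)
        return 0;

    std::size_t depth = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == open) {
            ++depth;                  // one more nesting level
        } else if (input[i] == close) {
            if (--depth == 0)
                return i + 1;         // the close that balances the first open
        }
    }
    return 0;                         // input ended before the sequence closed
}
```

<p>Called on the PASCAL comment above with <tt>'{'</tt> and <tt>'}'</tt>, this sketch 
  consumes the whole comment rather than stopping at the first <tt>'}'</tt>.</p>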
<p>If a non-nested Confix Parser is needed, the <tt>confix_p</tt> parser generator 
  should be used for generating the required Confix Parser. If 
  the expression part may contain a valid confix sequence
  (the confix sequence is allowed to be nested), the <tt>confix_nest_p</tt> parser 
  generator template should be used to generate the parser. The 
  three parameters to <tt>confix_p</tt> and <tt>confix_nest_p</tt> can be single 
  characters (as above), strings or, if more complex parsing logic is required, 
  auxiliary parsers, each of which is automatically converted to the corresponding 
  parser type needed for successful parsing.</p>
<p>The generated parser is equivalent to the following rule: </p>
<pre><code>    <span class=identifier>open </span><span class=special>&gt;&gt; (</span><span class=identifier>expr </span><span class=special>- </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
<p>If the expr parser is an <tt>action_parser_category</tt> type parser (a parser 
  with an attached semantic action), it needs special handling. This is the case 
  if the user writes something like:</p>
<pre><code><span class=identifier>    confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, </span><span class=identifier>expr</span><span class=special>[</span><span class=identifier>func</span><span class=special>], </span><span class=identifier>close</span><span class=special>)</span></code></pre>
<p>where <code>expr</code> is the parser matching the expression part of the confix sequence 
  and <code>func</code> is a functor to be called after matching <code>expr</code>. 
  Without special handling, the resulting code would parse the sequence as follows:</p>
<pre><code>    <span class=identifier>open </span><span class=special>&gt;&gt; (</span><span class=identifier>expr</span><span class=special>[</span><span class=identifier>func</span><span class=special>] - </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
<p>which in most cases is not what the user expects. (If this <u>is</u> what you 
  expect, use the <tt>confix_p</tt> or <tt>confix_nest_p</tt> generator 
  function <tt>direct()</tt>, which inhibits the parser refactoring.) To make 
  the confix parser behave as expected:</p> 
<pre><code><span class=identifier>    open </span><span class=special>&gt;&gt; (</span><span class=identifier>expr </span><span class=special>- </span><span class=identifier>close</span><span class=special>)[</span><span class=identifier>func</span><span class=special>] &gt;&gt; </span><span class=identifier>close</span></code></pre>
<p>the actor attached to the <code>expr</code> parser has to be re-attached to 
  the <code>(expr - close)</code> parser construct, which makes the resulting 
  confix parser 'do the right thing'. This refactoring is done with the help of 
  the <a href="refactoring.html">Refactoring Parsers</a>. Special 
  care must also be taken if the expr parser is a <tt>unary_parser_category</tt> type 
  parser, as in </p>
<pre><code><span class=identifier>    confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, *</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=identifier>close</span><span class=special>)</span></code></pre>
<p>which without any refactoring would result in </p>
<pre><code>    <span class=identifier>open </span><span class=special>&gt;&gt; (*</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
<p>and will not give the expected result (<code>*anychar_p</code> will eat up all the input up 
to the end of the input stream, leaving nothing for <code>close</code> to match). So we have to refactor this into:</p>
<pre><code><span class=identifier>    open </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
<p>which gives the correct result. </p>
<p>The case where the expr parser combines the two problems mentioned above 
  (i.e. the expr parser is a unary parser with an attached action) is handled 
  accordingly, so: </p>
<pre><code><span class=identifier>    confix_p</span><span class=special>(</span><span class=identifier>start</span><span class=special>, (*</span><span class=identifier>anychar_p</span><span class=special>)[</span><span class=identifier>func</span><span class=special>], </span><span class=identifier>end</span><span class=special>)</span></code></pre>
<p>will be parsed as expected: </p>
<pre><code>    <span class=identifier>start </span><span class=special>&gt;&gt; (*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>end</span><span class=special>))[</span><span class=identifier>func</span><span class=special>] &gt;&gt; </span><span class=identifier>end</span></code></pre>
<p>This refactoring is also implemented with the help of the <a href="refactoring.html">Refactoring 
  Parsers</a>.</p>
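<p>The effect of the combined refactoring can be sketched by hand, without Spirit. 
  The function <tt>match_confix</tt> and its <tt>Action</tt> parameter are hypothetical 
  names used only for illustration. The sketch assumes non-empty string delimiters 
  and models <code>start &gt;&gt; (*(anychar_p - end))[func] &gt;&gt; end</code>: 
  characters are consumed one at a time only while <code>end</code> does not match, 
  and the action is called once with the whole expression part:</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of  start >> (*(anychar_p - end))[func] >> end.
// Returns the total match length, or 0 if `input` does not match.
// Assumes `start` and `end` are non-empty; `func` is any callable taking
// the matched expression part as a std::string.
template <typename Action>
std::size_t match_confix(const std::string& input,
                         const std::string& start,
                         const std::string& end,
                         Action func)
{
    if (input.compare(0, start.size(), start) != 0)
        return 0;                                  // opening not found

    std::size_t pos = start.size();
    // *(anychar_p - end): take one character at a time, but only while
    // `end` does not match at the current position.
    while (pos < input.size() && input.compare(pos, end.size(), end) != 0)
        ++pos;

    if (pos >= input.size())
        return 0;                                  // input ended before `end`

    // The action sees the whole expression part, exactly once.
    func(input.substr(start.size(), pos - start.size()));
    return pos + end.size();
}
```

<p>An unrefactored <code>*anychar_p</code> would correspond to dropping the 
  <code>end</code> test from the loop, in which case the loop always runs to the 
  end of the input and the closing part can never match.</p>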
<table width="90%" border="0" align="center">
  <tr> 
    <td colspan="2" class="table_title"><b>Summary of Confix Parser refactorings</b></td>
  </tr>
  <tr class="table_title"> 
    <td width="39%"><b>You write it as:</b></td>
    <td width="61%"><b>It is refactored to:</b></td>
  </tr>
  <tr> 
    <td width="39%" class="table_cells"><code>confix_p<span class="special">(</span>start<span class="special">,</span> 
      expr<span class="special">,</span> end<span class="special">)</span></code></td>
    <td width="61%" class="table_cells"> 
      <p><code>start <span class=special>&gt;&gt; (</span>expr <span class=special>-</span> 
        end<span class=special>)</span><font color="#0000FF"> </font><span class=special>&gt;&gt;</span> 
        end</code></p>
    </td>
  </tr>
  <tr> 
    <td width="39%" class="table_cells"><code>confix_p<span class="special">(</span>start<span class="special">,</span> 
      expr<span class="special">[</span>func<span class="special">],</span> end<span class="special">)</span></code></td>
    <td width="61%" class="table_cells"> 
      <p><code>start <span class=special>&gt;&gt; (</span>expr <span class=special>-</span> 
        end<span class="special">)[</span>func<span class="special">] <font color="#0000FF" class="special">&gt;&gt;</font></span> 
        end</code></p>
    </td>
  </tr>
  <tr> 
    <td width="39%" class="table_cells" height="9"><code>confix_p<span class="special">(</span>start<span class="special">, 
      *</span>expr<span class="special">,</span> end<span class="special">)</span></code></td>
    <td width="61%" class="table_cells" height="9"> 
      <p><code>start <font color="#0000FF"><span class="special">&gt;&gt;</span></font> 
        <span class="special"><font color="#0000FF" class="special">*</font>(</span>expr 
        <font color="#0000FF" class="special">-</font> end<span class="special">) 
        <font color="#0000FF" class="special">&gt;&gt;</font></span> end</code></p>
    </td>
  </tr>
  <tr> 
    <td width="39%" class="table_cells"><code>confix_p<span class="special">(</span>start<span class="special">, 
      (*</span>expr<span class="special">)[</span>func<span class="special">],</span> 
      end<span class="special">)</span></code></td>
    <td width="61%" class="table_cells"> 
      <p><code>start <font color="#0000FF"><span class="special">&gt;&gt;</span></font><span class="special"> 
        (<font color="#0000FF" class="special">*</font>(</span>expr <font color="#0000FF" class="special">-</font> 
        end<span class="special">))[</span>func<span class="special">] <font color="#0000FF" class="special">&gt;&gt;</font></span> 
        end</code></p>
    </td>
  </tr>
</table>
<p><a name="comment_parsers"></a><b>Comment Parsers</b></p>
<p>The Comment Parser generator templates <tt>comment_p</tt> and <tt>comment_nest_p</tt> 
  are helpers that generate a correct <a href="#confix_parser">Confix Parser</a> 
  from auxiliary parameters. The generated parser matches comment constructs of the following form: 
</p>
<pre><code>    StartCommentToken <span class="special">&gt;&gt;</span> Comment text <span class="special">&gt;&gt;</span> EndCommentToken</code></pre>
<p>The following types are supported as parameters: parsers, single 
  characters and strings (see as_parser). There are two different predefined 
  comment parser generators, <tt>comment_p</tt> and <tt>comment_nest_p</tt>, each of which 
  may be used to create comment parsers in two different ways. If they 
  are used with one parameter, a comment starting with the given first parser 
  parameter and running up to the end of the line is matched. For instance, the following 
  parser matches C++ style comments:</p>
  
<pre><code><span class=identifier>    comment_p</span><span class=special>(</span><span class=string>"//"</span><span class=special>)</span></code></pre>
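<p>As a rough illustration of the one-parameter form, here is a hand-written sketch 
  that does not use Spirit. The function <tt>match_line_comment</tt> is a hypothetical 
  helper. Whether the line terminator itself counts as part of the match, and whether 
  a comment on the last line without a terminator is accepted, are assumptions made 
  by this sketch rather than statements about the library:</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the one-parameter comment form.
// Returns the number of characters matched at the beginning of `input`,
// or 0 if `input` does not begin with `start`.
std::size_t match_line_comment(const std::string& input, const std::string& start)
{
    if (start.empty() || input.compare(0, start.size(), start) != 0)
        return 0;

    // Everything after the start token belongs to the comment,
    // up to the end of the current line.
    std::size_t eol = input.find('\n', start.size());

    // Assumption of this sketch: the newline is included in the match, and a
    // comment that runs to the end of the input is accepted as well.
    return eol == std::string::npos ? input.size() : eol + 1;
}
```

<p>Note that the sketch only matches a comment that begins at the very start of its 
  input, in the same way that a parser only matches at the current scan position.</p>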
<p>If they are used with two parameters, a comment starting with the first parser 
  parameter and running up to the second parser parameter is matched. For instance, a C style 
  comment parser could be constructed as:</p>
<pre><code>    <span class=identifier>comment_p</span><span class=special>(</span><span class=string>"/*"</span><span class=special>, </span><span class=string>"*/"</span><span class=special>)</span></code></pre>
<p>The <tt>comment_p</tt> parser generator generates parsers that match 
  non-nested comments (as in C/C++). Sometimes it is necessary to parse 
  nested comments, as allowed for instance in Pascal. Such nested comments can be 
  parsed by parsers generated by the <tt>comment_nest_p</tt> generator 
  template functor. The following example shows a parser that can be used for 
  parsing the two different (nestable) Pascal comment styles:</p>
<pre><code>    <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>pascal_comment
        </span><span class=special>=   </span><span class=identifier>comment_nest_p</span><span class=special>(</span><span class=string>"(*"</span><span class=special>, </span><span class=string>"*)"</span><span class=special>)
        |   </span><span class=identifier>comment_nest_p</span><span class=special>(</span><span class=literal>'{'</span><span class=special>, </span><span class=literal>'}'</span><span class=special>)
        ;</span></code></pre>
<p>Please note that a comment is parsed implicitly as if the whole <tt>comment_p(...)</tt> 
  statement were embedded in a <tt>lexeme_d[]</tt> directive, i.e. no token skipping 
  occurs while a comment is being parsed, even if you have defined a skip parser 
  for your whole parsing process.<br>
</p>

<table border="0">
  <tr> 
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="character_sets.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="20"><a href="list_parsers.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<br>
<hr size="1">
<p class="copyright">Copyright &copy; 2001-2002 Hartmut Kaiser<br>
  <br>
  <font size="2">Permission to copy, use, modify, sell and distribute this document 
  is granted provided this copyright notice appears in all copies. This document 
  is provided &quot;as is&quot; without express or implied warranty, and with 
  no claim as to its suitability for any purpose. </font> </p>
</body>
</html>