<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" > <head><meta content="text/html;charset=&quot;utf-8&quot;" http-equiv="Content-type"/><link href="faldoc.css" rel="stylesheet" type="text/css"/><title> - Class Parser</title></head><body class="faldoc"><ul class="navi_top"><li class="top"><a href="index.html">Top: Table of contents</a></li> <li class="up"><a href="parser_genparser.html">Up: Module parser.genparser</a></li> <li class="prev"><a href="parser_genparser_PState.html">Previous: Class PState</a></li> <li class="next"><a href="parser_genparser_context.html">Next: Module parser.genparser.context</a></li> <li class="clear"></li> </ul><div id="page_body"><h1><span class="toc_number">15.5.2</span>Class Parser</h1><p class="brief">Main parsing class. </p> <pre class="prototype">Class Parser</pre> <p>A generic parser is a parsing system based on token recognition and callback invocation. This class provides general token recognition and invokes the appropriate callbacks. </p> <p>In strict terms, this class acts more like a <b>lexer</b> that drives a callback-driven parser provided by the user. </p> <p>The <b>Parser</b> class works with a set of named states (instances of the class <a href="parser_genparser_PState.html">PState</a>), each corresponding to a ruleset used in a different parsing step. </p> <p>Rules, that is, user-provided callbacks, can return a string representing a state name; the corresponding state is then pushed on a stack, and can be popped later by returning the special state "#pop". </p> <p>Each <a href="parser_genparser_PState.html">PState</a> provides a set of four elements: <ul><li>separators: a string containing characters considered "blanks", which have no grammatical meaning other than separating tokens and keywords. 
For example, " \t" would recognize spaces and tab characters as separators. </li><li>tokens: A list of <b>Rule</b> instances which match recognized tokens. Tokens are recognized when it can be demonstrated that no other token can be matched. For example, if the parser has "+" and "+=" tokens, the proper action (either the one associated with + or +=) will be called once the ambiguity has been resolved. </li><li>keywords: A list of <b>Rule</b> instances which are recognized only if surrounded by separators or tokens. For example, a keyword "fun" would be recognized only if preceded and followed by any of the characters in the <b>separators</b> string, or by any recognized token. </li><li>onElement: a callback which gets called when a sequence found between separators and/or tokens is not listed in the keywords. </li></ul></p> <p>For example, the following parser recognizes and saves strings by pushing a different state when they are found: </p> <pre>
load parser

// base state
base = PState(
   separators|" \t\n\r",
   tokens| [ Rule( '"', onQuote ) ],
   onElement|ignoring )

// state when parsing a string
pstring = PState(
   separators|"",   // spaces are relevant
   tokens|[
      Rule( "\\\"", onEscapeQuote ),
      Rule( "\"", onStringClosed ) ],
   onElement| addToString )

content = ""

function onQuote( qt )
   global content
   // starting string mode
   content = ""
   return "string"
end

function ignoring( item )
   &gt; "Ignoring ", item
end

function addToString( data )
   global content
   content += data
end

function onEscapeQuote( qt )
   global content
   content += "\""
end

function onStringClosed( qt )
   global content
   &gt; "Found string: &gt;&gt;", content, "&lt;&lt;"
   return "#pop"
end

parser = Parser()
parser.addState( "base", base )
parser.addState( "string", pstring )

stream = StringStream( 'unrecognized text, "an \"interesting\" string", other text')
parser.parse( stream, "base" )
</pre><p>The parser ensures that all the tokens, keywords and unrecognized elements are notified via callbacks in the same order in 
which they are found in the parsed file. </p> <p>Besides returning a state name, rule callbacks may also return a new instance of <a href="parser_genparser_PState.html">PState</a>; this pushes the state on the stack and makes it the currently active state. This makes it possible to create new parser states on the fly, in case some of the parsing conditions are not known in advance. </p> <p>It is also possible to modify existing states, for example by adding new keywords or tokens as they become defined while parsing the source code. </p> <table class="members"> <tbody><tr class="member_type"><td class="member_type" colspan="2">Properties</td></tr> <tr><td colspan="2"><a href="#row">row</a></td></tr> <tr><td colspan="2"><a href="#states">states</a></td></tr> <tr><td colspan="2"><a href="#trace">trace</a></td></tr> </tbody> <tbody><tr class="member_type"><td class="member_type" colspan="2">Methods</td></tr> <tr><td colspan="2"><a href="#initParser">initParser</a></td></tr> <tr><td><a href="#parse">parse</a></td><td>Performs complete parsing. </td></tr> <tr><td colspan="2"><a href="#parseLine">parseLine</a></td></tr> <tr><td colspan="2"><a href="#parseStream">parseStream</a></td></tr> <tr><td><a href="#popState">popState</a></td><td>Pops current state. </td></tr> <tr><td><a href="#pushState">pushState</a></td><td>Pushes a given state making it the active state. </td></tr> <tr><td><a href="#terminate">terminate</a></td><td>Requests parsing termination. 
</td></tr> <tr><td colspan="2"><a href="#traceMsg">traceMsg</a></td></tr> <tr><td colspan="2"><a href="#traceRule">traceRule</a></td></tr> <tr><td colspan="2"><a href="#traceStates">traceStates</a></td></tr> <tr><td colspan="2"><a href="#traceText">traceText</a></td></tr> </tbody> </table> <h2>Properties</h2><h3><a name="row">row</a></h3><h3><a name="states">states</a></h3><h3><a name="trace">trace</a></h3><h2>Methods</h2><h3><a name="initParser">initParser</a></h3><pre class="prototype">initParser( initState )</pre> <table class="prototype"> <tbody></tbody> </table> <h3><a name="parse">parse</a></h3><p class="brief">Performs complete parsing. </p> <pre class="prototype">parse( stream, initState, string, [initRow] )</pre> <table class="prototype"> <tbody><tr class="param"><td class="name">stream</td><td class="content"> An input stream from which the input data is read. </td></tr> <tr class="param"><td class="name">initState</td><td class="content"> The name of the initial state. </td></tr> <tr class="optparam"><td class="name">initRow</td><td class="content"> If given, sets the initial row to this number. </td></tr> <tr class="raise"><td class="name">Raise</td><td class="content"><table> <tbody><tr><td class="name"><b>ParseError</b></td><td class="content"> if the parser has not been correctly initialized. </td></tr> </tbody> </table> </td></tr> </tbody> </table> <h3><a name="parseLine">parseLine</a></h3><pre class="prototype">parseLine( line, ctx )</pre> <table class="prototype"> <tbody></tbody> </table> <h3><a name="parseStream">parseStream</a></h3><pre class="prototype">parseStream( stream, initState )</pre> <table class="prototype"> <tbody></tbody> </table> <h3><a name="popState">popState</a></h3><p class="brief">Pops current state. 
</p> <pre class="prototype">popState()</pre> <table class="prototype"> <tbody><tr class="raise"><td class="name">Raise</td><td class="content"><table> <tbody><tr><td class="name"><b>ParseError</b></td><td class="content"> if the state backlist is empty. </td></tr> </tbody> </table> </td></tr> </tbody> </table> <h3><a name="pushState">pushState</a></h3><p class="brief">Pushes a given state making it the active state. </p> <pre class="prototype">pushState( name )</pre> <table class="prototype"> <tbody><tr class="param"><td class="name">name</td><td class="content"> The name of the state that should be pushed. </td></tr> <tr class="raise"><td class="name">Raise</td><td class="content"><table> <tbody><tr><td class="name"><b>ParseError</b></td><td class="content"> if the state name is not known. </td></tr> </tbody> </table> </td></tr> </tbody> </table> <h3><a name="terminate">terminate</a></h3><p class="brief">Requests parsing termination. </p> <pre class="prototype">terminate()</pre> <p>A rule callback may call this method via <b>sender.terminate()</b> to request an immediate exit from the parsing sequence. </p> <p>A rule may also stop the parser by returning the special "#quit" state. 
</p> <h3><a name="traceMsg">traceMsg</a></h3><pre class="prototype">traceMsg( msg )</pre> <table class="prototype"> <tbody></tbody> </table> <h3><a name="traceRule">traceRule</a></h3><pre class="prototype">traceRule( r, context )</pre> <table class="prototype"> <tbody></tbody> </table> <h3><a name="traceStates">traceStates</a></h3><pre class="prototype">traceStates()</pre> <h3><a name="traceText">traceText</a></h3><pre class="prototype">traceText( line )</pre> <table class="prototype"> <tbody></tbody> </table> </div><ul class="navi_bottom"><li class="top"><a href="index.html">Top: Table of contents</a></li> <li class="up"><a href="parser_genparser.html">Up: Module parser.genparser</a></li> <li class="prev"><a href="parser_genparser_PState.html">Previous: Class PState</a></li> <li class="next"><a href="parser_genparser_context.html">Next: Module parser.genparser.context</a></li> <li class="clear"></li> </ul><div class="signature">Made with <a href="http://www.falconpl.org">faldoc 3.0</a></div></body></html>