<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <link rel="stylesheet" href="style.css" type="text/css"> <meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type"> <link rel="Start" href="index.html"> <link rel="previous" href="Gc.html"> <link rel="next" href="Graphics.html"> <link rel="Up" href="index.html"> <link title="Index of types" rel=Appendix href="index_types.html"> <link title="Index of exceptions" rel=Appendix href="index_exceptions.html"> <link title="Index of values" rel=Appendix href="index_values.html"> <link title="Index of modules" rel=Appendix href="index_modules.html"> <link title="Index of module types" rel=Appendix href="index_module_types.html"> <link title="Arg" rel="Chapter" href="Arg.html"> <link title="Arith_status" rel="Chapter" href="Arith_status.html"> <link title="Array" rel="Chapter" href="Array.html"> <link title="ArrayLabels" rel="Chapter" href="ArrayLabels.html"> <link title="Big_int" rel="Chapter" href="Big_int.html"> <link title="Bigarray" rel="Chapter" href="Bigarray.html"> <link title="Buffer" rel="Chapter" href="Buffer.html"> <link title="Callback" rel="Chapter" href="Callback.html"> <link title="CamlinternalLazy" rel="Chapter" href="CamlinternalLazy.html"> <link title="CamlinternalMod" rel="Chapter" href="CamlinternalMod.html"> <link title="CamlinternalOO" rel="Chapter" href="CamlinternalOO.html"> <link title="Char" rel="Chapter" href="Char.html"> <link title="Complex" rel="Chapter" href="Complex.html"> <link title="Condition" rel="Chapter" href="Condition.html"> <link title="Dbm" rel="Chapter" href="Dbm.html"> <link title="Digest" rel="Chapter" href="Digest.html"> <link title="Dynlink" rel="Chapter" href="Dynlink.html"> <link title="Event" rel="Chapter" href="Event.html"> <link title="Filename" rel="Chapter" href="Filename.html"> <link title="Format" rel="Chapter" href="Format.html"> <link title="Gc" rel="Chapter" href="Gc.html"> <link title="Genlex" rel="Chapter" 
href="Genlex.html"> <link title="Graphics" rel="Chapter" href="Graphics.html"> <link title="GraphicsX11" rel="Chapter" href="GraphicsX11.html"> <link title="Hashtbl" rel="Chapter" href="Hashtbl.html"> <link title="Int32" rel="Chapter" href="Int32.html"> <link title="Int64" rel="Chapter" href="Int64.html"> <link title="Lazy" rel="Chapter" href="Lazy.html"> <link title="Lexing" rel="Chapter" href="Lexing.html"> <link title="List" rel="Chapter" href="List.html"> <link title="ListLabels" rel="Chapter" href="ListLabels.html"> <link title="Map" rel="Chapter" href="Map.html"> <link title="Marshal" rel="Chapter" href="Marshal.html"> <link title="MoreLabels" rel="Chapter" href="MoreLabels.html"> <link title="Mutex" rel="Chapter" href="Mutex.html"> <link title="Nativeint" rel="Chapter" href="Nativeint.html"> <link title="Num" rel="Chapter" href="Num.html"> <link title="Obj" rel="Chapter" href="Obj.html"> <link title="Oo" rel="Chapter" href="Oo.html"> <link title="Parsing" rel="Chapter" href="Parsing.html"> <link title="Pervasives" rel="Chapter" href="Pervasives.html"> <link title="Printexc" rel="Chapter" href="Printexc.html"> <link title="Printf" rel="Chapter" href="Printf.html"> <link title="Queue" rel="Chapter" href="Queue.html"> <link title="Random" rel="Chapter" href="Random.html"> <link title="Scanf" rel="Chapter" href="Scanf.html"> <link title="Set" rel="Chapter" href="Set.html"> <link title="Sort" rel="Chapter" href="Sort.html"> <link title="Stack" rel="Chapter" href="Stack.html"> <link title="StdLabels" rel="Chapter" href="StdLabels.html"> <link title="Str" rel="Chapter" href="Str.html"> <link title="Stream" rel="Chapter" href="Stream.html"> <link title="String" rel="Chapter" href="String.html"> <link title="StringLabels" rel="Chapter" href="StringLabels.html"> <link title="Sys" rel="Chapter" href="Sys.html"> <link title="Thread" rel="Chapter" href="Thread.html"> <link title="ThreadUnix" rel="Chapter" href="ThreadUnix.html"> <link title="Tk" rel="Chapter" 
href="Tk.html"> <link title="Unix" rel="Chapter" href="Unix.html"> <link title="UnixLabels" rel="Chapter" href="UnixLabels.html"> <link title="Weak" rel="Chapter" href="Weak.html"><title>Genlex</title> </head> <body> <div class="navbar"><a href="Gc.html">Previous</a> <a href="index.html">Up</a> <a href="Graphics.html">Next</a> </div> <center><h1>Module <a href="type_Genlex.html">Genlex</a></h1></center> <br> <pre><span class="keyword">module</span> Genlex: <code class="code"><span class="keyword">sig</span></code> <a href="Genlex.html">..</a> <code class="code"><span class="keyword">end</span></code></pre>A generic lexical analyzer. <p> This module implements a simple "standard" lexical analyzer, presented as a function from character streams to token streams. It implements roughly the lexical conventions of Caml, but is parameterized by the set of keywords of your language. <p> Example: a lexer suitable for a desk calculator is obtained by <pre></pre><code class="code"> <span class="keyword">let</span> lexer = make_lexer [<span class="string">"+"</span>;<span class="string">"-"</span>;<span class="string">"*"</span>;<span class="string">"/"</span>;<span class="string">"let"</span>;<span class="string">"="</span>; <span class="string">"("</span>; <span class="string">")"</span>] </code><pre></pre> <p> The associated parser would be a function from <code class="code">token stream</code> to, for instance, <code class="code">int</code>, and would have rules such as: <p> <pre></pre><code class="code"> <span class="keyword">let</span> parse_expr = <span class="keyword">parser</span><br> [< <span class="keywordsign">'</span><span class="constructor">Int</span> n >] <span class="keywordsign">-></span> n<br> <span class="keywordsign">|</span> [< <span class="keywordsign">'</span><span class="constructor">Kwd</span> <span class="string">"("</span>; n = parse_expr; <span class="keywordsign">'</span><span class="constructor">Kwd</span> <span class="string">")"</span> >] 
<span class="keywordsign">-></span> n<br> <span class="keywordsign">|</span> [< n1 = parse_expr; n2 = parse_remainder n1 >] <span class="keywordsign">-></span> n2<br> <span class="keyword">and</span> parse_remainder n1 = <span class="keyword">parser</span><br> [< <span class="keywordsign">'</span><span class="constructor">Kwd</span> <span class="string">"+"</span>; n2 = parse_expr >] <span class="keywordsign">-></span> n1+n2<br> <span class="keywordsign">|</span> ...<br> </code><pre></pre><br> <hr width="100%"> <br><code><span id="TYPEtoken"><span class="keyword">type</span> <code class="type"></code>token</span> = </code><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Kwd</span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Ident</span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Int</span> <span class="keyword">of</span> <code class="type">int</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Float</span> <span class="keyword">of</span> <code class="type">float</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">String</span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > 
<code><span class="constructor">Char</span> <span class="keyword">of</span> <code class="type">char</code></code></td> </tr></table> <div class="info"> The type of tokens. The lexical classes are: <code class="code"><span class="constructor">Int</span></code> and <code class="code"><span class="constructor">Float</span></code> for integer and floating-point numbers; <code class="code"><span class="constructor">String</span></code> for string literals, enclosed in double quotes; <code class="code"><span class="constructor">Char</span></code> for character literals, enclosed in single quotes; <code class="code"><span class="constructor">Ident</span></code> for identifiers (either sequences of letters, digits, underscores and quotes, or sequences of "operator characters" such as <code class="code">+</code>, <code class="code">*</code>, etc.); and <code class="code"><span class="constructor">Kwd</span></code> for keywords (either identifiers or single "special characters" such as <code class="code">(</code>, <code class="code">}</code>, etc.).<br> </div> <pre><span id="VALmake_lexer"><span class="keyword">val</span> make_lexer</span> : <code class="type">string list -> char <a href="Stream.html#TYPEt">Stream.t</a> -> <a href="Genlex.html#TYPEtoken">token</a> <a href="Stream.html#TYPEt">Stream.t</a></code></pre><div class="info"> Construct the lexer function. The first argument is the list of keywords. An identifier <code class="code">s</code> is returned as <code class="code"><span class="constructor">Kwd</span> s</code> if <code class="code">s</code> belongs to this list, and as <code class="code"><span class="constructor">Ident</span> s</code> otherwise. A special character <code class="code">s</code> is returned as <code class="code"><span class="constructor">Kwd</span> s</code> if <code class="code">s</code> belongs to this list, and causes a lexical error (exception <code class="code"><span class="constructor">Parse_error</span></code>) otherwise. 
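<p> The keyword and identifier rules above can be illustrated with a minimal sketch. It reuses the desk-calculator keyword list from the module introduction; the helper <code class="code">tokens_of_string</code> is a hypothetical name introduced only for this illustration and is not part of <code class="code">Genlex</code>. The sketch assumes a standard library that provides <code class="code">Genlex</code> and <code class="code">Stream</code>. </p>
<pre><code class="code">(* Assumption: Genlex and Stream are available, as in the standard library
   documented here. Releases from OCaml 5.0 on no longer ship them; there the
   camlp-streams package is needed instead. *)
open Genlex

(* Same keyword list as the desk-calculator lexer in the introduction. *)
let lexer = make_lexer ["+"; "-"; "*"; "/"; "let"; "="; "("; ")"]

(* Hypothetical helper: lex a whole string and return its tokens as a list. *)
let tokens_of_string s =
  let toks = lexer (Stream.of_string s) in
  let rec loop acc =
    match Stream.peek toks with
    | None -&gt; List.rev acc
    | Some t -&gt; Stream.junk toks; loop (t :: acc)
  in
  loop []
</code></pre>
<p> With this keyword list, <code class="code">tokens_of_string "let x = (1 + 2.5)"</code> should yield <code class="code">[Kwd "let"; Ident "x"; Kwd "="; Kwd "("; Int 1; Kwd "+"; Float 2.5; Kwd ")"]</code>: <code class="code">let</code> is in the list and becomes a <code class="code">Kwd</code>, whereas <code class="code">x</code> is not and comes back as an <code class="code">Ident</code>. </p>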
Blanks and newlines are skipped. Comments delimited by <code class="code">(*</code> and <code class="code">*)</code> are skipped as well, and can be nested.<br> </div> </body></html>