<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> <head> <meta name="robots" content="index,nofollow"> <title>StandardMLGotchas - MLton Standard ML Compiler (SML Compiler)</title> <link rel="stylesheet" type="text/css" charset="iso-8859-1" media="all" href="common.css"> <link rel="stylesheet" type="text/css" charset="iso-8859-1" media="screen" href="screen.css"> <link rel="stylesheet" type="text/css" charset="iso-8859-1" media="print" href="print.css"> <link rel="Start" href="Home"> </head> <body lang="en" dir="ltr"> <script src="http://www.google-analytics.com/urchin.js" type="text/javascript"> </script> <script type="text/javascript"> _uacct = "UA-833377-1"; urchinTracker(); </script> <table bgcolor = lightblue cellspacing = 0 style = "border: 0px;" width = 100%> <tr> <td style = " border: 0px; color: darkblue; font-size: 150%; text-align: left;"> <a class = mltona href="Home">MLton MLTONWIKIVERSION</a> <td style = " border: 0px; font-size: 150%; text-align: center; width: 50%;"> StandardMLGotchas <td style = " border: 0px; text-align: right;"> <table cellspacing = 0 style = "border: 0px"> <tr style = "vertical-align: middle;"> </table> <tr style = "background-color: white;"> <td colspan = 3 style = " border: 0px; font-size:70%; text-align: right;"> <a href = "Home">Home</a> <a href = "TitleIndex">Index</a> </table> <div id="content" lang="en" dir="ltr"> This page contains brief explanations of some recurring sources of confusion and problems that SML newbies encounter. <p> Many confusions about the syntax of SML seem to arise from the use of an interactive REPL (Read-Eval Print Loop) while trying to learn the basics of the language. While writing your first SML programs, you should keep the source code of your programs in a form that is accepted by an SML compiler as a whole. </p> <h4 id="head-d685d6ac284370652bba2e2a7d86c4399c393136">The {{{and}}} keyword</h4> <p> It is a common mistake to misuse the <tt>and</tt> keyword or to not know how to introduce mutually recursive definitions. The purpose of the <tt>and</tt> keyword is to introduce mutually recursive definitions of functions and datatypes. For example, </p> <pre class=code> <B><FONT COLOR="#A020F0">fun</FONT></B> isEven <B><FONT COLOR="#5F9EA0">0w0</FONT></B> = true | isEven <B><FONT COLOR="#5F9EA0">0w1</FONT></B> = false | isEven n = isOdd (n-<B><FONT COLOR="#5F9EA0">0w1</FONT></B>) <B><FONT COLOR="#A020F0">and</FONT></B> isOdd <B><FONT COLOR="#5F9EA0">0w0</FONT></B> = false | isOdd <B><FONT COLOR="#5F9EA0">0w1</FONT></B> = true | isOdd n = isEven (n-<B><FONT COLOR="#5F9EA0">0w1</FONT></B>) </PRE> <p> </p> <p> and </p> <pre class=code> <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> decl </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">VAL</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> id * pat * expr <I><FONT COLOR="#B22222">(* | ... *)</FONT></I> </FONT></B><B><FONT COLOR="#A020F0">and</FONT></B><B><FONT COLOR="#228B22"> expr </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">LET</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> decl * expr <I><FONT COLOR="#B22222">(* | ... *)</FONT></I> </FONT></B></PRE> <p> </p> <p> You can also use <tt>and</tt> as a shorthand in a couple of other places, but it is not necessary. </p> <h4 id="head-b73586d63b2830f6d817e9b9150d4be4bdb9b049">Constructed patterns</h4> <p> It is a common mistake to forget to parenthesize constructed patterns in <tt>fun</tt> bindings. Consider the following invalid definition: </p> <pre class=code> <B><FONT COLOR="#A020F0">fun</FONT></B> length nil = <B><FONT COLOR="#5F9EA0">0</FONT></B> | length h :: t = <B><FONT COLOR="#5F9EA0">1</FONT></B> + length t </PRE> <p> </p> <p> The pattern <tt>h :: t</tt> needs to be parenthesized: </p> <pre class=code> <B><FONT COLOR="#A020F0">fun</FONT></B> length nil = <B><FONT COLOR="#5F9EA0">0</FONT></B> | length (h :: t) = <B><FONT COLOR="#5F9EA0">1</FONT></B> + length t </PRE> <p> </p> <p> The parentheses are needed, because a <tt>fun</tt> definition may have multiple consecutive constructed patterns through currying. </p> <p> The same applies to nonfix constructors. For example, the parentheses in </p> <pre class=code> <B><FONT COLOR="#A020F0">fun</FONT></B> valOf NONE = <B><FONT COLOR="#A020F0">raise</FONT></B> Option | valOf (SOME x) = x </PRE> <p> </p> <p> are required. However, the outermost constructed pattern in a <tt>fn</tt> or <tt>case</tt> expression need not be parenthesized, because in those cases there is always just one constructed pattern. So, both </p> <pre class=code> <B><FONT COLOR="#A020F0">val</FONT></B> valOf = <B><FONT COLOR="#A020F0">fn</FONT></B> NONE => <B><FONT COLOR="#A020F0">raise</FONT></B> Option | SOME x => x </PRE> <p> </p> <p> and </p> <pre class=code> <B><FONT COLOR="#A020F0">fun</FONT></B> valOf x = <B><FONT COLOR="#A020F0">case</FONT></B> x <B><FONT COLOR="#A020F0">of</FONT></B> NONE => <B><FONT COLOR="#A020F0">raise</FONT></B> Option | SOME x => x </PRE> <p> </p> <p> are fine. </p> <h4 id="head-8ab1923fa5e92fee24951419e65f282dd12cf824">Declarations and expressions</h4> <p> It is a common mistake to confuse expressions and declarations. Normally an SML source file should only contain declarations. The following are declarations: </p> <pre class=code> <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> dt </FONT></B>=<B><FONT COLOR="#228B22"> </FONT></B>... <B><FONT COLOR="#A020F0">fun</FONT></B> f ... = ... <B><FONT COLOR="#0000FF">functor</FONT></B> Fn (...) = ... <B><FONT COLOR="#A020F0">infix</FONT></B> ... <B><FONT COLOR="#A020F0">infixr</FONT></B> ... <B><FONT COLOR="#0000FF">local</FONT></B> ... <B><FONT COLOR="#0000FF">in</FONT></B> ... <B><FONT COLOR="#0000FF">end</FONT></B> <B><FONT COLOR="#A020F0">nonfix</FONT></B> ... <B><FONT COLOR="#0000FF">open</FONT></B> ... <B><FONT COLOR="#0000FF">signature</FONT></B> SIG = ... <B><FONT COLOR="#0000FF">structure</FONT></B> Struct = ... <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> t </FONT></B>=<B><FONT COLOR="#228B22"> </FONT></B>... <B><FONT COLOR="#A020F0">val</FONT></B> v = ... </PRE> <p> </p> <p> Note that </p> <pre class=code> <B><FONT COLOR="#A020F0">let</FONT></B> ... <B><FONT COLOR="#A020F0">in</FONT></B> ... <B><FONT COLOR="#A020F0">end</FONT></B> </PRE> <p> </p> <p> isn't a declaration. </p> <p> To specify a side-effecting computation in a source file, you can write: </p> <pre class=code> <B><FONT COLOR="#A020F0">val</FONT></B> () = ... </PRE> <p> </p> <h4 id="head-72f187fcc8b266cbc48f0060c51c75d2448f865c">Equality types</h4> <p> SML has a fairly intricate built-in notion of equality. See <a href="EqualityType">EqualityType</a> and <a href="EqualityTypeVariable">EqualityTypeVariable</a> for a thorough discussion. </p> <h4 id="head-04f8ae45c52f8b12ec8a7fe6650fd9d29cf11660">Nested cases</h4> <p> It is a common mistake to write nested case expressions without the necessary parentheses. See <a href="UnresolvedBugs">UnresolvedBugs</a> for a discussion. </p> <h4 id="head-4d2e5cf4124c665b4ee3f17586873d386091b28e">(op *)</h4> <p> It used to be a common mistake to parenthesize <tt>op *</tt> as <tt>(op *)</tt>. Before SML'97, <tt>*)</tt> was considered a comment terminator in SML and caused a syntax error. At the time of writing, <a href="SMLNJ">SML/NJ</a> still rejects the code. An extra space may be used for portability: <tt>(op * )</tt>. However, parenthesizing <tt>op</tt> is redundant, even though it is a widely used convention. </p> <h4 id="head-cb2f163ccc20cb7388203cf1edbc6860b44ea0f0">Overloading</h4> <p> A number of standard operators (<tt>+</tt>, <tt>-</tt>, <tt>~</tt>, <tt>*</tt>, <tt><</tt>, <tt>></tt>, ...) and numeric constants are overloaded for some of the numeric types (<tt>int</tt>, <tt>real</tt>, <tt>word</tt>). It is a common surprise that definitions using overloaded operators such as </p> <pre class=code> <B><FONT COLOR="#A020F0">fun</FONT></B> min (x, y) = <B><FONT COLOR="#A020F0">if</FONT></B> y < x <B><FONT COLOR="#A020F0">then</FONT></B> y <B><FONT COLOR="#A020F0">else</FONT></B> x </PRE> <p> </p> <p> are not overloaded themselves. SML doesn't really support (user-defined) overloading or other forms of ad hoc polymorphism. In cases such as the above where the context doesn't resolve the overloading, expressions using overloaded operators or constants get assigned a default type. The above definition gets the type </p> <pre class=code> <B><FONT COLOR="#A020F0">val</FONT></B> min : int * int -> int </PRE> <p> </p> <p> See <a href="Overloading">Overloading</a> and <a href="TypeIndexedValues">TypeIndexedValues</a> for further discussion. </p> <h4 id="head-1158dc6391ca1c282b0274da233e5b47974ab44c">Semicolons</h4> <p> It is a common mistake to use redundant semicolons in SML code. This is probably caused by the fact that in an SML REPL, a semicolon (and enter) is used to signal the REPL that it should evaluate the preceding chunk of code as a unit. In SML source files, semicolons are really needed in only two places. Namely, in expressions of the form </p> <pre class=code> (exp ; ... ; exp) </PRE> <p> </p> <p> and </p> <pre class=code> <B><FONT COLOR="#A020F0">let</FONT></B> ... <B><FONT COLOR="#A020F0">in</FONT></B> exp ; ... ; exp <B><FONT COLOR="#A020F0">end</FONT></B> </PRE> <p> </p> <p> Note that semicolons act as expression (or declaration) separators rather than as terminators. </p> <h4 id="head-72e54dc34abda1fd4474891b84fd2cdee65e658e">Stale bindings</h4> <h4 id="head-80eea6d8744ff759ee13086143a80641754b67b7">Unresolved records</h4> <h4 id="head-403e61388aa56bd37081a03b6455c2a4b2991e43">Value restriction</h4> <p> See <a href="ValueRestriction">ValueRestriction</a>. </p> <h4 id="head-2ae43ee231b8601d841d31e22732569161aeb94e">Type Variable Scope</h4> <p> See <a href="TypeVariableScope">TypeVariableScope</a>. </p> </div> <p> <hr> Last edited on 2007-08-23 04:25:03 by <span title="c-71-57-91-146.hsd1.il.comcast.net"><a href="MatthewFluet">MatthewFluet</a></span>. </body></html>