Sophie

Sophie

distrib > Fedora > 15 > i386 > by-pkgid > d07d7ab417d79053e7e0155c99e1a1c8 > files > 2457

mlton-20100608-3.fc15.i686.rpm

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta name="robots" content="index,nofollow">



<title>StaticSum - MLton Standard ML Compiler (SML Compiler)</title>
<link rel="stylesheet" type="text/css" charset="iso-8859-1" media="all" href="common.css">
<link rel="stylesheet" type="text/css" charset="iso-8859-1" media="screen" href="screen.css">
<link rel="stylesheet" type="text/css" charset="iso-8859-1" media="print" href="print.css">


<link rel="Start" href="Home">


</head>

<body lang="en" dir="ltr">

<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-833377-1";
urchinTracker();
</script>
<table bgcolor = lightblue cellspacing = 0 style = "border: 0px;" width = 100%>
  <tr>
    <td style = "
		border: 0px;
		color: darkblue; 
		font-size: 150%;
		text-align: left;">
      <a class = mltona href="Home">MLton MLTONWIKIVERSION</a>
    <td style = "
		border: 0px;
		font-size: 150%;
		text-align: center;
		width: 50%;">
      StaticSum
    <td style = "
		border: 0px;
		text-align: right;">
      <table cellspacing = 0 style = "border: 0px">
        <tr style = "vertical-align: middle;">
      </table>
  <tr style = "background-color: white;">
    <td colspan = 3
	style = "
		border: 0px;
		font-size:70%;
		text-align: right;">
      <a href = "Home">Home</a>
      &nbsp;<a href = "TitleIndex">Index</a>
      &nbsp;
</table>
<div id="content" lang="en" dir="ltr">
While SML makes it impossible to write functions whose types would depend on the values of their arguments, or so called dependently typed functions, it is possible, and arguably commonplace, to write functions whose types depend on the types of their arguments.  Indeed, the types of parametrically polymorphic functions like <tt>map</tt> and <tt>foldl</tt> can be said to depend on the types of their arguments.  What is less commonplace, however, is to write functions whose behavior would depend on the types of their arguments.  Nevertheless, there are several techniques for writing such functions.  <a href="TypeIndexedValues">Type-indexed values</a> and <a href="Fold">fold</a> are two such techniques.  This page presents another such technique dubbed static sums. <h3 id="head-bae05e6c3c767f4336636316b765315a47f6a398">Ordinary Sums</h3>
<p>
Consider the sum type as defined below: 
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> Sum = <B><FONT COLOR="#0000FF">struct</FONT></B>
   <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">INL</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> 'a </FONT></B>|<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">INR</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> 'b
</FONT></B><B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 While a generic sum type such as defined above is very useful, it has a number of limitations.  As an example, we could write the function <tt>out</tt> to extract the value from a sum as follows: 
<pre class=code>
<B><FONT COLOR="#A020F0">fun</FONT></B> out (s : ('a, 'a) Sum.t) : 'a =
    <B><FONT COLOR="#A020F0">case</FONT></B> s
     <B><FONT COLOR="#A020F0">of</FONT></B> Sum.INL a =&gt; a
      | Sum.INR a =&gt; a
</PRE>
 As can be seen from the type of <tt>out</tt>, it is limited in the sense that it requires both variants of the sum to have the same type.  So, <tt>out</tt> cannot be used to extract the value of a sum of two different types, such as the type <tt>(int,&nbsp;real)&nbsp;Sum.t</tt>.  As another example of a limitation, consider the following attempt at a <tt>succ</tt> function: 
<pre class=code>
<B><FONT COLOR="#A020F0">fun</FONT></B> succ (s : (int, real) Sum.t) : ??? =
    <B><FONT COLOR="#A020F0">case</FONT></B> s
     <B><FONT COLOR="#A020F0">of</FONT></B> Sum.INL i =&gt; i + <B><FONT COLOR="#5F9EA0">1</FONT></B>
      | Sum.INR r =&gt; Real.nextAfter (r, Real.posInf)
</PRE>
 The above definition of <tt>succ</tt> cannot be typed, because there is no type for the codomain within SML. 
</p>
<h3 id="head-abe64b6ebfd4251ca72fe68475928c5d6be71820">Static Sums</h3>
<p>
Interestingly, it is possible to define values <tt>inL</tt>, <tt>inR</tt>, and <tt>match</tt> that satisfy the laws 
<pre>match (inL x) (f, g) = f x
match (inR x) (f, g) = g x
</pre>and do not suffer from the same limitions.  The definitions are actually quite trivial: 
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> StaticSum = <B><FONT COLOR="#0000FF">struct</FONT></B>
   <B><FONT COLOR="#A020F0">fun</FONT></B> inL x (f, _) = f x
   <B><FONT COLOR="#A020F0">fun</FONT></B> inR x (_, g) = g x
   <B><FONT COLOR="#A020F0">fun</FONT></B> match x = x
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 
</p>
<p>
Now, given the <tt>succ</tt> function defined as 
<pre class=code>
<B><FONT COLOR="#A020F0">fun</FONT></B> succ s =
    StaticSum.match s
       (<B><FONT COLOR="#A020F0">fn</FONT></B> i =&gt; i + <B><FONT COLOR="#5F9EA0">1</FONT></B>,
        <B><FONT COLOR="#A020F0">fn</FONT></B> r =&gt; Real.nextAfter (r, Real.posInf))
</PRE>
 we get 
<pre class=code>
succ (StaticSum.inL <B><FONT COLOR="#5F9EA0">1</FONT></B>) = <B><FONT COLOR="#5F9EA0">2</FONT></B>
succ (StaticSum.inR Real.maxFinite) = Real.posInf
</PRE>
 
</p>
<p>
To better understand how this works, consider the following signature for static sums: 
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> StaticSum :&gt; <B><FONT COLOR="#0000FF">sig</FONT></B>
   <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> ('dL, 'cL, 'dR, 'cR, 'c) t
   </FONT></B><B><FONT COLOR="#A020F0">val</FONT></B> inL : 'dL -&gt; ('dL, 'cL, 'dR, 'cR, 'cL) t
   <B><FONT COLOR="#A020F0">val</FONT></B> inR : 'dR -&gt; ('dL, 'cL, 'dR, 'cR, 'cR) t
   <B><FONT COLOR="#A020F0">val</FONT></B> match : ('dL, 'cL, 'dR, 'cR, 'c) t -&gt; ('dL -&gt; 'cL) * ('dR -&gt; 'cR) -&gt; 'c
<B><FONT COLOR="#0000FF">end</FONT></B> = <B><FONT COLOR="#0000FF">struct</FONT></B>
   <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> ('dL, 'cL, 'dR, 'cR, 'c) t </FONT></B>=<B><FONT COLOR="#228B22"> ('dL -&gt; 'cL) * ('dR -&gt; 'cR) -&gt; 'c
   </FONT></B><B><FONT COLOR="#0000FF">open</FONT></B> StaticSum
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 Above, <tt>'d</tt> stands for domain and <tt>'c</tt> for codomain.  The key difference between an ordinary sum type, like <tt>(int,&nbsp;real)&nbsp;Sum.t</tt>, and a static sum type, like <tt>(int,&nbsp;real,&nbsp;real,&nbsp;int,&nbsp;real)&nbsp;StaticSum.t</tt>, is that the ordinary sum type says nothing about the type of the result of deconstructing a sum while the static sum type specifies the type. 
</p>
<p>
With the sealed static sum module, we get the type 
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> succ : (int, int, real, real, 'a) StaticSum.t -&gt; 'a
</PRE>
 for the previously defined <tt>succ</tt> function.  The type specifies that <tt>succ</tt> maps a left <tt>int</tt> to an <tt>int</tt> and a right <tt>real</tt> to a <tt>real</tt>.  For example, the type of <tt>StaticSum.inL&nbsp;1</tt> is <tt>(int,&nbsp;'cL,&nbsp;'dR,&nbsp;'cR,&nbsp;'cL)&nbsp;StaticSum.t</tt>.  Unifying this with the argument type of <tt>succ</tt> gives the type <tt>(int,&nbsp;int,&nbsp;real,&nbsp;real,&nbsp;int)&nbsp;StaticSum.t&nbsp;-&gt;&nbsp;int</tt>. 
</p>
<p>
The <tt>out</tt> function is quite useful on its own.  Here is how it can be defined: 
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> StaticSum = <B><FONT COLOR="#0000FF">struct</FONT></B>
   <B><FONT COLOR="#0000FF">open</FONT></B> StaticSum
   <B><FONT COLOR="#A020F0">val</FONT></B> out : ('a, 'a, 'b, 'b, 'c) t -&gt; 'c =
    <B><FONT COLOR="#A020F0">fn</FONT></B> s =&gt; match s (<B><FONT COLOR="#A020F0">fn</FONT></B> x =&gt; x, <B><FONT COLOR="#A020F0">fn</FONT></B> x =&gt; x)
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 
</p>
<p>
Due to the value restriction, lack of first class polymorphism and polymorphic recursion, the usefulness and convenience of static sums is somewhat limited in SML.  So, don't throw away the ordinary sum type just yet.  Static sums can nevertheless be quite useful. 
</p>
<h3 id="head-3bdb7cbf1ec43160b51d72f6f6cf74d016effb4f">Example: Send and Receive with Argument Type Dependent Result Types</h3>
<p>
In some situations it would seem useful to define functions whose result type would depend on some of the arguments.  Traditionally such functions have been thought to be impossible in SML and the solution has been to define multiple functions.  For example, the <a class="external" href="http://mlton.org/basis/socket.html"><img src="moin-www.png" alt="[WWW]" height="11" width="11">Socket structure</a> of the Basis library defines 16 <tt>send</tt> and 16 <tt>recv</tt> functions.  In contrast, the Net structure (<a href = "http://mlton.org/cgi-bin/viewsvn.cgi/mltonlib/trunk/com/sweeks/basic/unstable/net.sig?view=markup"><img src="moin-www.png" alt="[WWW]" height="11" width="11">net.sig</a>) of the Basic library designed by Stephen Weeks defines only a single <tt>send</tt> and a single <tt>receive</tt> and the result types of the functions depend on their arguments.  The implementation (<a href = "http://mlton.org/cgi-bin/viewsvn.cgi/mltonlib/trunk/com/sweeks/basic/unstable/net.sml?view=markup"><img src="moin-www.png" alt="[WWW]" height="11" width="11">net.sml</a>) uses static sums (with a slighly different signature: <a href = "http://mlton.org/cgi-bin/viewsvn.cgi/mltonlib/trunk/com/sweeks/basic/unstable/static-sum.sig?view=markup"><img src="moin-www.png" alt="[WWW]" height="11" width="11">static-sum.sig</a>). 
</p>
<h3 id="head-a2ce759ad335eb8fd953a3a53cf16230241d83f0">Example: Picking Monad Results</h3>
<p>
Suppose that we need to write a parser that accepts a pair of integers and returns their sum given a monadic parsing combinator library.  A part of the signature of such library could look like this 
<pre class=code>
<B><FONT COLOR="#0000FF">signature</FONT></B> PARSING = <B><FONT COLOR="#0000FF">sig</FONT></B>
   <B><FONT COLOR="#0000FF">include</FONT></B> MONAD
   <B><FONT COLOR="#A020F0">val</FONT></B> int : int t
   <B><FONT COLOR="#A020F0">val</FONT></B> lparen : unit t
   <B><FONT COLOR="#A020F0">val</FONT></B> rparen : unit t
   <B><FONT COLOR="#A020F0">val</FONT></B> comma : unit t
   <I><FONT COLOR="#B22222">(* ... *)</FONT></I>
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 where the <tt>MONAD</tt> signature could be defined as 
<pre class=code>
<B><FONT COLOR="#0000FF">signature</FONT></B> MONAD = <B><FONT COLOR="#0000FF">sig</FONT></B>
   <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> 'a t
   </FONT></B><B><FONT COLOR="#A020F0">val</FONT></B> return : 'a -&gt; 'a t
   <B><FONT COLOR="#A020F0">val</FONT></B> &gt;&gt;= : 'a t * ('a -&gt; 'b t) -&gt; 'b t
<B><FONT COLOR="#0000FF">end</FONT></B>
<B><FONT COLOR="#A020F0">infix</FONT></B> &gt;&gt;=
</PRE>
 
</p>
<p>
The straightforward, but tedious, way to write the desired parser is: 
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> p = lparen &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt;
        int    &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> x =&gt;
        comma  &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt;
        int    &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> y =&gt;
        rparen &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt;
        return (x + y))))))
</PRE>
 In Haskell, the parser could be written using the <tt>do</tt> notation considerably less verbosely as: 
<pre class=code>
p = <B><FONT COLOR="#A020F0">do</FONT></B> { lparen ; x &lt;- int ; comma ; y &lt;- int ; rparen ; return $ x + y }
</PRE>
 SML doesn't provide a <tt>do</tt> notation, so we need another solution. 
</p>
<p>
Suppose we would have a "pick" notation for monads that would allows us to write the parser as 
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> p = `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (<B><FONT COLOR="#A020F0">fn</FONT></B> x &amp; y =&gt; x + y)
</PRE>
 using four auxiliary combinators: <tt>`</tt>, <tt>\</tt>, <tt>^</tt>, and <tt>@</tt>. Roughly speaking 
</p>

    <ul>

    <li>
<p>
 <tt>`p</tt> means that the result of <tt>p</tt> is dropped, 
</p>
</li>
    <li>
<p>
 <tt>\p</tt> means that the result of <tt>p</tt> is taken, 
</p>
</li>
    <li>
<p>
 <tt>p&nbsp;^&nbsp;q</tt> means that results of <tt>p</tt> and <tt>q</tt> are taken as a  product, and 
</p>
</li>
    <li>
<p>
 <tt>p&nbsp;@&nbsp;a</tt> means that the results of <tt>p</tt> are passed to the  function <tt>a</tt> and that result is returned. 
</p>
</li>

    </ul>


<p>
The difficulty is in implementing the concatenation combinator <tt>^</tt>. The type of the result of the concatenation depends on the types of the arguments. 
</p>
<p>
Using static sums and the <a href="ProductType">product type</a>, the pick notation for monads can be implemented as follows: 
<pre class=code>
<B><FONT COLOR="#0000FF">functor</FONT></B> MkMonadPick (<B><FONT COLOR="#0000FF">include</FONT></B> MONAD) = <B><FONT COLOR="#0000FF">let</FONT></B>
   <B><FONT COLOR="#0000FF">open</FONT></B> StaticSum
<B><FONT COLOR="#0000FF">in</FONT></B>
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">fun</FONT></B> `a = inL (a &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt; return ()))
      <B><FONT COLOR="#A020F0">val</FONT></B> \ = inR
      <B><FONT COLOR="#A020F0">fun</FONT></B> a @ f = out a &gt;&gt;= (return o f)
      <B><FONT COLOR="#A020F0">fun</FONT></B> a ^ b =
          (match b o match a)
             (<B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt;
                 (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inL (a &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt; b)),
                  <B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inR (a &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt; b))),
              <B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt;
                 (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inR (a &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt; b &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt; return a))),
                  <B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inR (a &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt; b &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; return (a &amp; b))))))
   <B><FONT COLOR="#0000FF">end</FONT></B>
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 The above implementation is inefficient, however.  It uses many more bind operations, <tt>&gt;&gt;=</tt>, than necessary.  That can be solved with an additional level of abstraction: 
<pre class=code>
<B><FONT COLOR="#0000FF">functor</FONT></B> MkMonadPick (<B><FONT COLOR="#0000FF">include</FONT></B> MONAD) = <B><FONT COLOR="#0000FF">let</FONT></B>
   <B><FONT COLOR="#0000FF">open</FONT></B> StaticSum
<B><FONT COLOR="#0000FF">in</FONT></B>
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">fun</FONT></B> `a = inL (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; a &gt;&gt;= (<B><FONT COLOR="#A020F0">fn</FONT></B> _ =&gt; b ()))
      <B><FONT COLOR="#A020F0">fun</FONT></B> \a = inR (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; a &gt;&gt;= b)
      <B><FONT COLOR="#A020F0">fun</FONT></B> a @ f = out a (return o f)
      <B><FONT COLOR="#A020F0">fun</FONT></B> a ^ b =
          (match b o match a)
             (<B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt; (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inL (<B><FONT COLOR="#A020F0">fn</FONT></B> c =&gt; a (<B><FONT COLOR="#A020F0">fn</FONT></B> () =&gt; b c)),
                       <B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inR (<B><FONT COLOR="#A020F0">fn</FONT></B> c =&gt; a (<B><FONT COLOR="#A020F0">fn</FONT></B> () =&gt; b c))),
              <B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt; (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inR (<B><FONT COLOR="#A020F0">fn</FONT></B> c =&gt; a (<B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt; b (<B><FONT COLOR="#A020F0">fn</FONT></B> () =&gt; c a))),
                       <B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; inR (<B><FONT COLOR="#A020F0">fn</FONT></B> c =&gt; a (<B><FONT COLOR="#A020F0">fn</FONT></B> a =&gt; b (<B><FONT COLOR="#A020F0">fn</FONT></B> b =&gt; c (a &amp; b))))))
   <B><FONT COLOR="#0000FF">end</FONT></B>
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 
</p>
<p>
After instantiating and opening either of the above monad pick implementations, the previously given definition of <tt>p</tt> can be compiled and results in a parser whose result is of type <tt>int</tt>.  Here is a functor to test the theory: 
<pre class=code>
<B><FONT COLOR="#0000FF">functor</FONT></B> Test (Arg : PARSING) = <B><FONT COLOR="#0000FF">struct</FONT></B>
   <B><FONT COLOR="#A020F0">local</FONT></B>
      <B><FONT COLOR="#0000FF">structure</FONT></B> Pick = MkMonadPick (Arg)
      <B><FONT COLOR="#0000FF">open</FONT></B> Pick Arg
   <B><FONT COLOR="#A020F0">in</FONT></B>
      <B><FONT COLOR="#A020F0">val</FONT></B> p : int t =
          `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (<B><FONT COLOR="#A020F0">fn</FONT></B> x &amp; y =&gt; x + y)
   <B><FONT COLOR="#A020F0">end</FONT></B>
<B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
 
</p>
<h2 id="head-a4bc8bf5caf54b18cea9f58e83dd4acb488deb17">Also see</h2>
<p>
There are a number of related techniques.  Here are some of them. 
</p>

    <ul>

    <li>
<p>
 <a href="Fold">Fold</a> 
</p>
</li>
    <li>
<p>
 <a href="TypeIndexedValues">TypeIndexedValues</a> 
</p>
</li>
</ul>

</div>



<p>
<hr>
Last edited on 2009-06-10 19:23:13 by <span title="fenrir.uchicago.edu"><a href="MatthewFluet">MatthewFluet</a></span>.
</body></html>