<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> <head> <meta name="robots" content="index,nofollow"> <title>PrintfGentle - MLton Standard ML Compiler (SML Compiler)</title> <link rel="stylesheet" type="text/css" charset="iso-8859-1" media="all" href="common.css"> <link rel="stylesheet" type="text/css" charset="iso-8859-1" media="screen" href="screen.css"> <link rel="stylesheet" type="text/css" charset="iso-8859-1" media="print" href="print.css"> <link rel="Start" href="Home"> </head> <body lang="en" dir="ltr"> <div id="content" lang="en" dir="ltr"> This page provides a gentle introduction and derivation of <a href="Printf">Printf</a>, with sections and arrangement more suitable to a talk. <h2 id="head-2473e96bc614a911821242119918a241a41836d6">Introduction</h2> <p> SML does not have <tt>printf</tt>. Could we define it ourselves? 
</p>
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf (<B><FONT COLOR="#BC8F8F">"here's an int %d and a real %f.\n"</FONT></B>, <B><FONT COLOR="#5F9EA0">13</FONT></B>, <B><FONT COLOR="#5F9EA0">17.0</FONT></B>)
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf (<B><FONT COLOR="#BC8F8F">"here's three values (%d, %f, %f).\n"</FONT></B>, <B><FONT COLOR="#5F9EA0">13</FONT></B>, <B><FONT COLOR="#5F9EA0">17.0</FONT></B>, <B><FONT COLOR="#5F9EA0">19.0</FONT></B>)
</PRE>
<p> </p>
<p> What could the type of <tt>printf</tt> be? </p>
<p> No single type fits both calls, because SML functions take a fixed number of arguments. Strictly speaking, they take exactly one argument, but if that argument is a tuple, it has a fixed number of components. </p>
<h2 id="head-f9635a75e0047cc95e31d23e2e028a16d33c9a3e">From tupling to currying</h2>
<p> What about currying to get around the typing problem? </p>
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf <B><FONT COLOR="#BC8F8F">"here's an int %d and a real %f.\n"</FONT></B> <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf <B><FONT COLOR="#BC8F8F">"here's three values (%d, %f, %f).\n"</FONT></B> <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B> <B><FONT COLOR="#5F9EA0">19.0</FONT></B>
</PRE>
<p> </p>
<p> That fails for a similar reason: we would need two different types for <tt>printf</tt>. </p>
<pre>val printf: string -> int -> real -> unit
val printf: string -> int -> real -> real -> unit
</pre><p> This can't work, because <tt>printf</tt> can only have one type. SML doesn't support programmer-defined overloading. </p>
<h2 id="head-a4afe537360c86052d42c20840d22ff6d349ed93">Overloading and dependent types</h2>
<p> Even without worrying about the number of arguments, there is another problem. The type of <tt>printf</tt> depends on the format string. 
</p>
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf <B><FONT COLOR="#BC8F8F">"here's an int %d and a real %f.\n"</FONT></B> <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf <B><FONT COLOR="#BC8F8F">"here's a real %f and an int %d.\n"</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B> <B><FONT COLOR="#5F9EA0">13</FONT></B>
</PRE>
<p> </p>
<p> Now we need </p>
<pre>val printf: string -> int -> real -> unit
val printf: string -> real -> int -> unit
</pre><p> Again, this can't possibly work, because SML doesn't have overloading, and types can't depend on values. </p>
<h2 id="head-3e2c45a19c32be195d2d157a383b8ec3e1bb8ca1">Idea: express type information in the format string</h2>
<p> If we express type information in the format string, then different uses of <tt>printf</tt> can have different types. </p>
<pre class=code>
<B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> 'a t <I><FONT COLOR="#B22222">(* the type of format strings *)</FONT></I>
</FONT></B><B><FONT COLOR="#A020F0">val</FONT></B> printf: 'a t -> 'a
<B><FONT COLOR="#A020F0">infix</FONT></B> D F
<B><FONT COLOR="#A020F0">val</FONT></B> fs1: (int -> real -> unit) t = <B><FONT COLOR="#BC8F8F">"here's an int "</FONT></B>D<B><FONT COLOR="#BC8F8F">" and a real "</FONT></B>F<B><FONT COLOR="#BC8F8F">".\n"</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> fs2: (int -> real -> real -> unit) t =
   <B><FONT COLOR="#BC8F8F">"here's three values ("</FONT></B>D<B><FONT COLOR="#BC8F8F">", "</FONT></B>F<B><FONT COLOR="#BC8F8F">", "</FONT></B>F<B><FONT COLOR="#BC8F8F">").\n"</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf fs1 <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> () = printf fs2 <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B> <B><FONT COLOR="#5F9EA0">19.0</FONT></B>
</PRE>
<p> </p>
<p> Now, our two calls to <tt>printf</tt> type check, because the format string specializes <tt>printf</tt> to the appropriate type. </p>
<h2 id="head-222947ccdb8e9be636e5f01eb51ad36e369acdd6">The types of format characters</h2>
<p> What should the types of the format characters <tt>D</tt> and <tt>F</tt> be? Each format character requires an additional argument of the appropriate type to be supplied to <tt>printf</tt>. </p>
<p> Idea: guess the final type that <tt>printf</tt> will need for the format string, and verify that guess with each format character. </p>
<pre class=code>
<B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t <I><FONT COLOR="#B22222">(* 'a = rest of type to verify, 'b = final type *)</FONT></I>
</FONT></B><B><FONT COLOR="#A020F0">val</FONT></B> ` : string -> ('a, 'a) t <I><FONT COLOR="#B22222">(* guess the type, which must be verified *)</FONT></I>
<B><FONT COLOR="#A020F0">val</FONT></B> D: (int -> 'a, 'b) t * string -> ('a, 'b) t <I><FONT COLOR="#B22222">(* consume an int *)</FONT></I>
<B><FONT COLOR="#A020F0">val</FONT></B> F: (real -> 'a, 'b) t * string -> ('a, 'b) t <I><FONT COLOR="#B22222">(* consume a real *)</FONT></I>
<B><FONT COLOR="#A020F0">val</FONT></B> printf: (unit, 'a) t -> 'a
</PRE>
<p> </p>
<p> Don't worry. In the end, type inference will guess and verify for us. </p>
<h2 id="head-6dd34f47d86bf18133dbe46190db3b9f050b3ee6">Understanding guess and verify</h2>
<p> Now, let's build up a format string and a specialized <tt>printf</tt>. </p>
<pre class=code>
<B><FONT COLOR="#A020F0">infix</FONT></B> D F
<B><FONT COLOR="#A020F0">val</FONT></B> f0 = `<B><FONT COLOR="#BC8F8F">"here's an int "</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> f1 = f0 D <B><FONT COLOR="#BC8F8F">" and a real "</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> f2 = f1 F <B><FONT COLOR="#BC8F8F">".\n"</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> p = printf f2
</PRE>
<p> </p>
<p> These definitions yield the following types. 
</p>
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> f0: (int -> real -> unit, int -> real -> unit) t
<B><FONT COLOR="#A020F0">val</FONT></B> f1: (real -> unit, int -> real -> unit) t
<B><FONT COLOR="#A020F0">val</FONT></B> f2: (unit, int -> real -> unit) t
<B><FONT COLOR="#A020F0">val</FONT></B> p: int -> real -> unit
</PRE>
<p> </p>
<p> So, <tt>p</tt> is a specialized <tt>printf</tt> function. We could use it as follows: </p>
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> () = p <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B>
<B><FONT COLOR="#A020F0">val</FONT></B> () = p <B><FONT COLOR="#5F9EA0">14</FONT></B> <B><FONT COLOR="#5F9EA0">19.0</FONT></B>
</PRE>
<p> </p>
<h2 id="head-0de24a80c2f8581bf891b8ea4722f1e6f78df373">Type checking this using a functor</h2>
<pre class=code>
<B><FONT COLOR="#0000FF">signature</FONT></B> PRINTF =
   <B><FONT COLOR="#0000FF">sig</FONT></B>
      <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t
</FONT></B>      <B><FONT COLOR="#A020F0">val</FONT></B> ` : string -> ('a, 'a) t
      <B><FONT COLOR="#A020F0">val</FONT></B> D: (int -> 'a, 'b) t * string -> ('a, 'b) t
      <B><FONT COLOR="#A020F0">val</FONT></B> F: (real -> 'a, 'b) t * string -> ('a, 'b) t
      <B><FONT COLOR="#A020F0">val</FONT></B> printf: (unit, 'a) t -> 'a
   <B><FONT COLOR="#0000FF">end</FONT></B>

<B><FONT COLOR="#0000FF">functor</FONT></B> Test (P: PRINTF) =
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#0000FF">open</FONT></B> P
      <B><FONT COLOR="#A020F0">infix</FONT></B> D F
      <B><FONT COLOR="#A020F0">val</FONT></B> () =
         printf (`<B><FONT COLOR="#BC8F8F">"here's an int "</FONT></B>D<B><FONT COLOR="#BC8F8F">" and a real "</FONT></B>F<B><FONT COLOR="#BC8F8F">".\n"</FONT></B>) <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B>
      <B><FONT COLOR="#A020F0">val</FONT></B> () =
         printf (`<B><FONT COLOR="#BC8F8F">"here's three values ("</FONT></B>D<B><FONT COLOR="#BC8F8F">", "</FONT></B>F<B><FONT COLOR="#BC8F8F">", "</FONT></B>F<B><FONT COLOR="#BC8F8F">").\n"</FONT></B>) <B><FONT COLOR="#5F9EA0">13</FONT></B> <B><FONT COLOR="#5F9EA0">17.0</FONT></B> <B><FONT COLOR="#5F9EA0">19.0</FONT></B>
   <B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
<p> </p>
<h2 id="head-e8524d50c5bfba29773f4e572863f8d9f6f75372">Implementing Printf</h2>
<p> Think of a format character as a formatter transformer. It takes the formatter for the part of the format string before it and transforms it into a new formatter that first does the left-hand bit, then does its bit, then continues on with the rest of the format string. </p>
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> Printf: PRINTF =
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">T</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> (unit -> 'a) -> 'b
</FONT></B>      <B><FONT COLOR="#A020F0">fun</FONT></B> printf (T f) = f (<B><FONT COLOR="#A020F0">fn</FONT></B> () => ())
      <B><FONT COLOR="#A020F0">fun</FONT></B> ` s = T (<B><FONT COLOR="#A020F0">fn</FONT></B> a => (print s; a ()))
      <B><FONT COLOR="#A020F0">fun</FONT></B> D (T f, s) =
         T (<B><FONT COLOR="#A020F0">fn</FONT></B> g => f (<B><FONT COLOR="#A020F0">fn</FONT></B> () => <B><FONT COLOR="#A020F0">fn</FONT></B> i =>
                       (print (Int.toString i); print s; g ())))
      <B><FONT COLOR="#A020F0">fun</FONT></B> F (T f, s) =
         T (<B><FONT COLOR="#A020F0">fn</FONT></B> g => f (<B><FONT COLOR="#A020F0">fn</FONT></B> () => <B><FONT COLOR="#A020F0">fn</FONT></B> i =>
                       (print (Real.toString i); print s; g ())))
   <B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
<p> </p>
<h2 id="head-3056e3449c9eeb687ff4979d488318a55b86ad94">Testing printf</h2>
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> Z = Test (Printf)
</PRE>
<p> </p>
<h2 id="head-1ce26dacd016f9087315053e78d0fdf62930a8b3">User-definable formats</h2>
<p> The definition of the format 
characters is pretty much the same. Within the <tt>Printf</tt> structure we can define a format character generator. </p>
<pre class=code>
<B><FONT COLOR="#A020F0">val</FONT></B> newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t =
   <B><FONT COLOR="#A020F0">fn</FONT></B> toString => <B><FONT COLOR="#A020F0">fn</FONT></B> (T f, s) =>
   T (<B><FONT COLOR="#A020F0">fn</FONT></B> th => f (<B><FONT COLOR="#A020F0">fn</FONT></B> () => <B><FONT COLOR="#A020F0">fn</FONT></B> a => (print (toString a); print s ; th ())))
<B><FONT COLOR="#A020F0">val</FONT></B> D = <B><FONT COLOR="#A020F0">fn</FONT></B> z => newFormat Int.toString z
<B><FONT COLOR="#A020F0">val</FONT></B> F = <B><FONT COLOR="#A020F0">fn</FONT></B> z => newFormat Real.toString z
</PRE>
<p> </p>
<h2 id="head-647c5b70b462e300382ee62466a109f85f30bf29">A core Printf</h2>
<p> We can now have a very small <tt>PRINTF</tt> signature, and define all the format characters externally to the core module. </p>
<pre class=code>
<B><FONT COLOR="#0000FF">signature</FONT></B> PRINTF =
   <B><FONT COLOR="#0000FF">sig</FONT></B>
      <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t
</FONT></B>      <B><FONT COLOR="#A020F0">val</FONT></B> ` : string -> ('a, 'a) t
      <B><FONT COLOR="#A020F0">val</FONT></B> newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t
      <B><FONT COLOR="#A020F0">val</FONT></B> printf: (unit, 'a) t -> 'a
   <B><FONT COLOR="#0000FF">end</FONT></B>

<B><FONT COLOR="#0000FF">structure</FONT></B> Printf: PRINTF =
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">T</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> (unit -> 'a) -> 'b
</FONT></B>      <B><FONT COLOR="#A020F0">fun</FONT></B> printf (T f) = f (<B><FONT COLOR="#A020F0">fn</FONT></B> () => ())
      <B><FONT COLOR="#A020F0">fun</FONT></B> ` s = T (<B><FONT COLOR="#A020F0">fn</FONT></B> a => (print s; a ()))
      <B><FONT COLOR="#A020F0">fun</FONT></B> newFormat toString (T f, s) =
         T (<B><FONT COLOR="#A020F0">fn</FONT></B> th =>
            f (<B><FONT COLOR="#A020F0">fn</FONT></B> () => <B><FONT COLOR="#A020F0">fn</FONT></B> a =>
               (print (toString a)
                ; print s
                ; th ())))
   <B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
<p> </p>
<h2 id="head-11e864f783f740962ac1010af348633b44a7adc1">Extending to fprintf</h2>
<p> One can implement <tt>fprintf</tt> by threading the outstream through all the transformers. </p>
<pre class=code>
<B><FONT COLOR="#0000FF">signature</FONT></B> PRINTF =
   <B><FONT COLOR="#0000FF">sig</FONT></B>
      <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t
</FONT></B>      <B><FONT COLOR="#A020F0">val</FONT></B> ` : string -> ('a, 'a) t
      <B><FONT COLOR="#A020F0">val</FONT></B> fprintf: (unit, 'a) t * TextIO.outstream -> 'a
      <B><FONT COLOR="#A020F0">val</FONT></B> newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t
      <B><FONT COLOR="#A020F0">val</FONT></B> printf: (unit, 'a) t -> 'a
   <B><FONT COLOR="#0000FF">end</FONT></B>

<B><FONT COLOR="#0000FF">structure</FONT></B> Printf: PRINTF =
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">type</FONT></B><B><FONT COLOR="#228B22"> out </FONT></B>=<B><FONT COLOR="#228B22"> TextIO.outstream
</FONT></B>      <B><FONT COLOR="#A020F0">val</FONT></B> output = TextIO.output
      <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">T</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> (out -> 'a) -> out -> 'b
</FONT></B>      <B><FONT COLOR="#A020F0">fun</FONT></B> fprintf (T f, out) = f (<B><FONT COLOR="#A020F0">fn</FONT></B> _ => ()) out
      <B><FONT COLOR="#A020F0">fun</FONT></B> printf t = fprintf (t, TextIO.stdOut)
      <B><FONT COLOR="#A020F0">fun</FONT></B> ` s = T (<B><FONT COLOR="#A020F0">fn</FONT></B> a => <B><FONT COLOR="#A020F0">fn</FONT></B> out => (output (out, s); a out))
      <B><FONT COLOR="#A020F0">fun</FONT></B> newFormat toString (T f, s) =
         T (<B><FONT COLOR="#A020F0">fn</FONT></B> g =>
            f (<B><FONT COLOR="#A020F0">fn</FONT></B> out => <B><FONT COLOR="#A020F0">fn</FONT></B> a =>
               (output (out, toString a)
                ; output (out, s)
                ; g out)))
   <B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
<p> </p>
<h2 id="head-70440046a3dc2e079f23ee1c57dfa76669b732aa">Notes</h2>
<ul>
<li> <p> Lesson: instead of using dependent types for a function, express the dependency in the type of the argument. </p> </li>
<li class="gap"> <p> If <tt>printf</tt> is partially applied, it will do the printing then and there. Perhaps this could be fixed with some kind of terminator. </p>
<p> A syntactic or argument terminator is not necessary. A formatter can either be eager (as above) or lazy (as below). A lazy formatter accumulates enough state to print the entire string. Both lazy structures below match the core <tt>PRINTF</tt> signature, the one without <tt>fprintf</tt>. The simplest lazy formatter concatenates the strings as they become available: </p>
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> PrintfLazyConcat: PRINTF =
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">T</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> (string -> 'a) -> string -> 'b
</FONT></B>      <B><FONT COLOR="#A020F0">fun</FONT></B> printf (T f) = f print <B><FONT COLOR="#BC8F8F">""</FONT></B>
      <B><FONT COLOR="#A020F0">fun</FONT></B> ` s = T (<B><FONT COLOR="#A020F0">fn</FONT></B> th => <B><FONT COLOR="#A020F0">fn</FONT></B> s' => th (s' ^ s))
      <B><FONT COLOR="#A020F0">fun</FONT></B> newFormat toString (T f, s) =
         T (<B><FONT COLOR="#A020F0">fn</FONT></B> th => f (<B><FONT COLOR="#A020F0">fn</FONT></B> s' => <B><FONT COLOR="#A020F0">fn</FONT></B> a => th (s' ^ toString a ^ s)))
   <B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
<p> It is somewhat more efficient to accumulate the strings as a list: </p>
<pre class=code>
<B><FONT COLOR="#0000FF">structure</FONT></B> PrintfLazyList: PRINTF =
   <B><FONT COLOR="#0000FF">struct</FONT></B>
      <B><FONT COLOR="#A020F0">datatype</FONT></B><B><FONT COLOR="#228B22"> ('a, 'b) t </FONT></B>=<B><FONT COLOR="#228B22"> <FONT COLOR="#B8860B">T</FONT> <B><FONT COLOR="#A020F0">of</FONT></B> (string list -> 'a) -> string list -> 'b
</FONT></B>      <B><FONT COLOR="#A020F0">fun</FONT></B> printf (T f) = f (List.app print o List.rev) []
      <B><FONT COLOR="#A020F0">fun</FONT></B> ` s = T (<B><FONT COLOR="#A020F0">fn</FONT></B> th => <B><FONT COLOR="#A020F0">fn</FONT></B> ss => th (s::ss))
      <B><FONT COLOR="#A020F0">fun</FONT></B> newFormat toString (T f, s) =
         T (<B><FONT COLOR="#A020F0">fn</FONT></B> th => f (<B><FONT COLOR="#A020F0">fn</FONT></B> ss => <B><FONT COLOR="#A020F0">fn</FONT></B> a => th (s::toString a::ss)))
   <B><FONT COLOR="#0000FF">end</FONT></B>
</PRE>
<p> </p> </li>
</ul>
<h2 id="head-a4bc8bf5caf54b18cea9f58e83dd4acb488deb17">Also see</h2>
<ul>
<li> <p> <a href="Printf">Printf</a> </p> </li>
<li> <p> <a href = "References#Danvy98"> Functional Unparsing</a> </p> </li>
</ul>
</div>
<p> <hr> Last edited on 2007-07-08 20:54:24 by <span title="c-71-57-91-146.hsd1.il.comcast.net"><a href="MatthewFluet">MatthewFluet</a></span>. </body></html>