Sophie

Sophie

distrib > Fedora > 15 > i386 > by-pkgid > d07d7ab417d79053e7e0155c99e1a1c8 > files > 2491

mlton-20100608-3.fc15.i686.rpm

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta name="robots" content="index,nofollow">



<title>Unicode - MLton Standard ML Compiler (SML Compiler)</title>
<link rel="stylesheet" type="text/css" charset="iso-8859-1" media="all" href="common.css">
<link rel="stylesheet" type="text/css" charset="iso-8859-1" media="screen" href="screen.css">
<link rel="stylesheet" type="text/css" charset="iso-8859-1" media="print" href="print.css">


<link rel="Start" href="Home">


</head>

<body lang="en" dir="ltr">

<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-833377-1";
urchinTracker();
</script>
<table bgcolor = lightblue cellspacing = 0 style = "border: 0px;" width = 100%>
  <tr>
    <td style = "
		border: 0px;
		color: darkblue; 
		font-size: 150%;
		text-align: left;">
      <a class = mltona href="Home">MLton MLTONWIKIVERSION</a>
    <td style = "
		border: 0px;
		font-size: 150%;
		text-align: center;
		width: 50%;">
      Unicode
    <td style = "
		border: 0px;
		text-align: right;">
      <table cellspacing = 0 style = "border: 0px">
        <tr style = "vertical-align: middle;">
      </table>
  <tr style = "background-color: white;">
    <td colspan = 3
	style = "
		border: 0px;
		font-size:70%;
		text-align: right;">
      <a href = "Home">Home</a>
      &nbsp;<a href = "TitleIndex">Index</a>
      &nbsp;
</table>
<div id="content" lang="en" dir="ltr">
The current release of MLton does not support Unicode.  We are working on adding support. 
    <ul>

    <li>
<p>
 <tt>WideChar</tt> structure. 
</p>
</li>
    <li>
<p>
 UTF-8 encoded source files. 
</p>
</li>

    </ul>


<p>
There is no real support for Unicode in the <a href="DefinitionOfStandardML">Definition</a>; there are only a few throw-away sentences along the lines of "ASCII must be a subset of the character set in programs". 
</p>
<p>
Neither is there real support for Unicode in the <a href="BasisLibrary">Basis Library</a>. The general consensus (which includes the opinions of the editors of the Basis Library) is that the <tt>WideChar</tt> structure is insufficient for the purposes of Unicode.  There is no <tt>LargeChar</tt> structure, which in itself is a deficiency, since a programmer can not program against the largest supported character size. 
</p>
<p>
MLton has some preliminary support for 16 and 32 bit characters and strings.  It is even possible to include arbitrary Unicode characters in 32-bit strings using a <tt>\Uxxxxxxxx</tt> escape sequence.  (This longer escape sequence is a minor extension over the Definition which only allows <tt>\uxxxx</tt>.)  This is by no means completely satisfactory in terms of support for Unicode, but it is what is currently available. 
</p>
<p>
There are periodic flurries of questions and discussion about Unicode in MLton/SML.  In December 2004, there was a discussion that led to some seemingly sound design decisions.  The discussion started at: 
</p>

            <ul>

   <a href="http://mlton.org/pipermail/mlton/2004-December/026396.html"><img src="moin-www.png" alt="[WWW]" height="11" width="11">http://mlton.org/pipermail/mlton/2004-December/026396.html</a> 
            </ul>


<p>
There is a good summary of points at: 
</p>

            <ul>

   <a href="http://mlton.org/pipermail/mlton/2004-December/026440.html"><img src="moin-www.png" alt="[WWW]" height="11" width="11">http://mlton.org/pipermail/mlton/2004-December/026440.html</a> 
            </ul>


<p>
In November 2005, there was a followup discussion and the beginning of some coding. 
</p>

        <ul>

  <a href="http://mlton.org/pipermail/mlton/2005-November/028300.html"><img src="moin-www.png" alt="[WWW]" height="11" width="11">http://mlton.org/pipermail/mlton/2005-November/028300.html</a> 
        </ul>


<p>
We are optimistic that support will appear in the next MLton release. 
</p>
<h2 id="head-a4bc8bf5caf54b18cea9f58e83dd4acb488deb17">Also see</h2>
<p>
The <a href="fxp">fxp</a> XML parser has some support for dealing with Unicode documents. 
</p>
</div>



<p>
<hr>
Last edited on 2007-08-15 22:07:35 by <span title="fenrir.uchicago.edu"><a href="MatthewFluet">MatthewFluet</a></span>.
</body></html>