<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta name="generator" content= "HTML Tidy for Linux/x86 (vers 1 September 2005), see www.w3.org" /> <meta http-equiv="Content-Type" content= "text/html; charset=us-ascii" /> <title>docbook2X: utf8trans</title> <link rel="stylesheet" href="docbook2X.css" type="text/css" /> <link rev="made" href="mailto:stevecheng@users.sourceforge.net" /> <meta name="generator" content="DocBook XSL Stylesheets V1.68.1" /> <link rel="start" href="docbook2X.html" title= "docbook2X: Documentation Table of Contents" /> <link rel="up" href="charsets.html" title= "docbook2X: Character set conversion" /> <link rel="prev" href="charsets.html" title= "docbook2X: Character set conversion" /> <link rel="next" href="faq.html" title="docbook2X: FAQ" /> </head> <body> <div class="navheader"> <table width="100%" summary="Navigation header"> <tr> <th colspan="3" align="center"><span><strong class= "command">utf8trans</strong></span></th> </tr> <tr> <td width="20%" align="left"><a accesskey="p" href= "charsets.html">&lt;&lt; Previous</a> </td> <th width="60%" align="center">Character set conversion</th> <td width="20%" align="right"> <a accesskey="n" href= "faq.html">Next &gt;&gt;</a></td> </tr> </table> <hr /></div> <div class="refentry" lang="en" xml:lang="en"><a id="utf8trans" name="utf8trans"></a> <div class="titlepage"></div> <a id="id2538852" class="indexterm" name="id2538852"></a><a id= "id2538859" class="indexterm" name="id2538859"></a><a id= "id2538866" class="indexterm" name="id2538866"></a><a id= "id2538873" class="indexterm" name="id2538873"></a><a id= "id2538883" class="indexterm" name="id2538883"></a><a id= "id2538890" class="indexterm" name="id2538890"></a> <div class="refnamediv"> <h2>Name</h2> <p><span><strong class="command">utf8trans</strong></span> - Transliterate UTF-8 characters according to a table</p> </div> <div
class="refsynopsisdiv"> <h2>Synopsis</h2> <div class="cmdsynopsis"> <p><code class="command">utf8trans</code> <em class= "replaceable"><code>charmap</code></em> [<em class= "replaceable"><code>file</code></em>...]</p> </div> </div> <div class="refsect1" lang="en" xml:lang="en"><a id="id2538961" name="id2538961"></a> <h2>Description</h2> <a id="id2538967" class="indexterm" name="id2538967"></a> <p><span><strong class="command">utf8trans</strong></span> transliterates characters in the specified files (or standard input, if no files are specified) and writes the output to standard output. All input and output is in the UTF-8 encoding.</p> <p>This program is usually used to render characters in Unicode text files as markup escapes or ASCII transliterations. (It is not intended for general charset conversions.) It provides functionality similar to the character maps in XSLT 2.0 (Extensible Stylesheet Language Transformations, version 2.0).</p> </div> <div class="refsect1" lang="en" xml:lang="en"><a id="id2539001" name="id2539001"></a> <h2>Options</h2> <div class="variablelist"> <dl> <dt><span class="term"><code class="option">-m</code>,</span> <span class="term"><code class="option">--modify</code></span></dt> <dd> <p>Modifies the given files in-place with their transliterated output, instead of sending it to standard output.</p> <p>This option is useful for efficient transliteration of many files at once.</p> </dd> <dt><span class="term"><code class= "option">--help</code></span></dt> <dd> <p>Show brief usage information and exit.</p> </dd> <dt><span class="term"><code class= "option">--version</code></span></dt> <dd> <p>Show version and exit.</p> </dd> </dl> </div> </div> <div class="refsect1" lang="en" xml:lang="en"><a id="id2539071" name="id2539071"></a> <h2>Usage</h2> <p>The translation is done according to the rules in the “<span class="quote">character map</span>”, which is read from the file <em class="replaceable"><code>charmap</code></em>. 
It has the following format:</p> <div class="orderedlist"> <ol type="1"> <li> <p>Each line represents a translation entry, except for blank lines and comment lines, which are ignored.</p> </li> <li> <p>Any amount of whitespace (space or tab) may precede the start of an entry.</p> </li> <li> <p>Comment lines begin with <code class="literal">#</code>. Everything else on the line is ignored.</p> </li> <li> <p>Each entry consists of the Unicode codepoint of the character to translate, in hexadecimal, followed by <span class= "emphasis"><em>one</em></span> space or tab, followed by the translation string, up to the end of the line.</p> </li> <li> <p>The translation string is taken literally, including any leading and trailing spaces (except the delimiter between the codepoint and the translation string), and all types of characters. The newline at the end is not included.</p> </li> </ol> </div> <p>This format is intentionally restrictive, to keep <span><strong class="command">utf8trans</strong></span> simple. If an XML-based format is desired, the <code class= "filename">xmlcharmap2utf8trans</code> script that comes with the docbook2X distribution converts character maps in XSLT 2.0 format to the <span><strong class= "command">utf8trans</strong></span> format.</p> </div> <div class="refsect1" lang="en" xml:lang="en"><a id="id2539164" name="id2539164"></a> <h2>Limitations</h2> <div class="itemizedlist"> <ul> <li> <p><span><strong class="command">utf8trans</strong></span> does not work with binary files, because malformed UTF-8 sequences in the input are substituted with U+FFFD characters. However, null characters in the input are handled correctly. 
This limitation may be removed in the future.</p> </li> <li> <p>There is no way to include a newline or null in the substitution string.</p> </li> </ul> </div> </div> </div> <div class="navfooter"> <hr /> <table width="100%" summary="Navigation footer"> <tr> <td width="40%" align="left"><a accesskey="p" href= "charsets.html">&lt;&lt; Previous</a> </td> <td width="20%" align="center"><a accesskey="u" href= "charsets.html">Up</a></td> <td width="40%" align="right"> <a accesskey="n" href= "faq.html">Next &gt;&gt;</a></td> </tr> <tr> <td width="40%" align="left" valign="top">Character set conversion </td> <td width="20%" align="center"><a accesskey="h" href= "docbook2X.html">Table of Contents</a></td> <td width="40%" align="right" valign="top"> FAQ</td> </tr> </table> </div> <p class="footer-homepage"><a href= "http://docbook2x.sourceforge.net/" title= "docbook2X: Home page">docbook2X home page</a></p> </body> </html>