<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet href="../../make-menu.xsl" type="text/xsl"?><html>
   <head>
      <this-is section="changes" page="intro92" subpage="parsing92"/>
      <!--
           Generated at 2011-12-09T20:47:22.916Z--><title>Saxonica: XSLT and XQuery Processing: XML Parsing and Serialization</title>
      <meta name="coverage" content="Worldwide"/>
      <meta name="copyright" content="Copyright Saxonica Ltd"/>
      <meta name="title"
            content="Saxonica: XSLT and XQuery Processing: XML Parsing and Serialization"/>
      <meta name="robots" content="noindex,nofollow"/>
      <link rel="stylesheet" href="../../saxondocs.css" type="text/css"/>
   </head>
   <body class="main">
      <h1>XML Parsing and Serialization</h1>
      <p class="subhead">Parsing</p>
      <p>If a <code>SAXSource</code> containing an <code>XMLReader</code> is supplied to Saxon, Saxon now
respects the <code>ErrorHandler</code> associated with the <code>XMLReader</code> rather than replacing
it with its own.</p>
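      <p>As a rough illustration of the case described above, the following sketch uses only standard JAXP and SAX classes to attach
an application <code>ErrorHandler</code> to an <code>XMLReader</code> and wrap it in a <code>SAXSource</code>. The class and method names
(<code>ReaderWithHandler</code>, <code>buildSource</code>) are hypothetical and chosen for this example only.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper, for illustration only: it is not part of Saxon.
class ReaderWithHandler {

    // Wraps the XML text in a SAXSource whose XMLReader carries the
    // caller's ErrorHandler.
    static SAXSource buildSource(String xml, ErrorHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // The handler is registered on the reader itself, which is the
        // situation the paragraph above refers to.
        reader.setErrorHandler(handler);
        return new SAXSource(reader, new InputSource(new StringReader(xml)));
    }
}
```

      <p>The resulting <code>SAXSource</code> can then be passed as the source of a JAXP transformation in the usual way; parse errors
should be reported to the supplied handler.</p>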
      <p class="subhead">Serialization</p>
      <p>Basic support for HTML 5 has been added. If the serialization method is "html" and the version is "5.0", the
document type declaration <code>&lt;!DOCTYPE HTML&gt;</code> is output regardless of the <code>doctype-system</code> and <code>doctype-public</code>
properties.</p>
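      <p>One way to request this from Java is through the standard JAXP output properties. The sketch below only builds the property set;
the class and method names (<code>Html5Output</code>, <code>html5Properties</code>) are hypothetical.</p>

```java
import java.util.Properties;
import javax.xml.transform.OutputKeys;

// Hypothetical helper, for illustration only.
class Html5Output {

    // Returns serialization properties selecting method "html", version "5.0".
    static Properties html5Properties() {
        Properties props = new Properties();
        props.setProperty(OutputKeys.METHOD, "html");
        props.setProperty(OutputKeys.VERSION, "5.0");
        return props;
    }
}
```

      <p>The properties can be applied with <code>transformer.setOutputProperties(Html5Output.html5Properties())</code> on a JAXP
<code>Transformer</code>.</p>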
      <p>A new serialization option <code>saxon:recognize-binary</code> has been added, for use with the <code>text</code> output method
only. If it is set to <code>yes</code>, the processing instructions <code>&lt;?hex XXXX?&gt;</code> and <code>&lt;?b64 XXXX?&gt;</code> are
recognized. The value <code>XXXX</code> is taken as the hexBinary or base64 representation of a character string, encoded using the encoding in use by
the serializer, and this character string is output without checking that it contains only valid XML characters. This
enables non-XML characters, notably binary zero, to be output. For example, <code>&lt;?hex 0c?&gt;</code> outputs an ASCII form feed.
Also recognized are <code>&lt;?hex.EEEE XXXX?&gt;</code> and <code>&lt;?b64.EEEE XXXX?&gt;</code>, where EEEE is the name of the encoding
of the hexBinary or base64 data: for example <code>hex.ascii</code> or <code>b64.utf8</code>.</p>
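      <p>To make the interpretation of the processing instruction value concrete, the sketch below decodes a hexBinary or base64 value
into a character string using a named encoding. It is an illustration of the rule described above, not Saxon's own code, and the names
<code>BinaryPiSketch</code>, <code>decodeHex</code> and <code>decodeBase64</code> are hypothetical.</p>

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical illustration of how a hex or b64 value maps to characters.
// This is not Saxon's implementation.
class BinaryPiSketch {

    // Decodes hexBinary digits to bytes, then bytes to characters
    // using the named encoding.
    static String decodeHex(String hex, String encoding) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i != bytes.length; i++) {
            // Each pair of hex digits becomes one byte.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, Charset.forName(encoding));
    }

    // Decodes base64 text to bytes, then bytes to characters
    // using the named encoding.
    static String decodeBase64(String b64, String encoding) {
        return new String(Base64.getDecoder().decode(b64), Charset.forName(encoding));
    }
}
```

      <p>With this sketch, <code>decodeHex("0c", "US-ASCII")</code> yields a one-character string containing a form feed, matching the
example given above.</p>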
      <p>A new UTF-8 writer, contributed by Tatu Saloranta, is used in place of the standard Java UTF-8 writer. This speeds up
serialization by around 20%; for a transformation that copies its input to its output, the overall improvement is about 10%.</p>
      <table width="100%">
         <tr>
            <td>
               <p align="right"><a class="nav" href="models92.xml">Next</a></p>
            </td>
         </tr>
      </table>
   </body>
</html>