<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <link rel="stylesheet" href="style.css" type="text/css"> <meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type"> <link rel="Start" href="index.html"> <link rel="next" href="Pxp_document.html"> <link rel="Up" href="index.html"> <link title="Index of types" rel=Appendix href="index_types.html"> <link title="Index of exceptions" rel=Appendix href="index_exceptions.html"> <link title="Index of values" rel=Appendix href="index_values.html"> <link title="Index of class methods" rel=Appendix href="index_methods.html"> <link title="Index of classes" rel=Appendix href="index_classes.html"> <link title="Index of class types" rel=Appendix href="index_class_types.html"> <link title="Index of modules" rel=Appendix href="index_modules.html"> <link title="Index of module types" rel=Appendix href="index_module_types.html"> <link title="Pxp_types" rel="Chapter" href="Pxp_types.html"> <link title="Pxp_document" rel="Chapter" href="Pxp_document.html"> <link title="Pxp_dtd" rel="Chapter" href="Pxp_dtd.html"> <link title="Pxp_tree_parser" rel="Chapter" href="Pxp_tree_parser.html"> <link title="Pxp_core_types" rel="Chapter" href="Pxp_core_types.html"> <link title="Pxp_ev_parser" rel="Chapter" href="Pxp_ev_parser.html"> <link title="Pxp_event" rel="Chapter" href="Pxp_event.html"> <link title="Pxp_dtd_parser" rel="Chapter" href="Pxp_dtd_parser.html"> <link title="Pxp_codewriter" rel="Chapter" href="Pxp_codewriter.html"> <link title="Pxp_marshal" rel="Chapter" href="Pxp_marshal.html"> <link title="Pxp_yacc" rel="Chapter" href="Pxp_yacc.html"> <link title="Pxp_reader" rel="Chapter" href="Pxp_reader.html"> <link title="Intro_trees" rel="Chapter" href="Intro_trees.html"> <link title="Intro_extensions" rel="Chapter" href="Intro_extensions.html"> <link title="Intro_namespaces" rel="Chapter" href="Intro_namespaces.html"> <link title="Intro_events" rel="Chapter" href="Intro_events.html"> <link 
title="Intro_resolution" rel="Chapter" href="Intro_resolution.html"> <link title="Intro_getting_started" rel="Chapter" href="Intro_getting_started.html"> <link title="Intro_advanced" rel="Chapter" href="Intro_advanced.html"> <link title="Intro_preprocessor" rel="Chapter" href="Intro_preprocessor.html"> <link title="Example_readme" rel="Chapter" href="Example_readme.html"><link title="Configuration" rel="Section" href="#2_Configuration"> <link title="Sources" rel="Section" href="#2_Sources"> <link title="Entities" rel="Section" href="#2_Entities"> <link title="Event parsing" rel="Section" href="#2_Eventparsing"> <title>PXP Reference : Pxp_types</title> </head> <body> <div class="navbar"> <a class="up" href="index.html" title="Index">Up</a> <a class="post" href="Pxp_document.html" title="Pxp_document">Next</a> </div> <h1>Module <a href="type_Pxp_types.html">Pxp_types</a></h1> <pre><span class="keyword">module</span> Pxp_types: <code class="code"><span class="keyword">sig</span></code> <a href="Pxp_types.html">..</a> <code class="code"><span class="keyword">end</span></code></pre>Type definitions used throughout PXP<br> <hr width="100%"> <br> This module re-exports all the types listed in <a href="Pxp_core_types.S.html"><code class="code"><span class="constructor">Pxp_core_types</span>.<span class="constructor">S</span></code></a> (and finally defined in <a href="Pxp_core_types.I.html"><code class="code"><span class="constructor">Pxp_core_types</span>.<span class="constructor">I</span></code></a>), so the user only has to <code class="code"><span class="keyword">open</span> <span class="constructor">Pxp_types</span></code> to get all relevant type definitions. 
The re-exported definitions are shown here in the indented grey block:<br> <br> <br> <pre><span class="keyword">include</span> <a href="Pxp_core_types.S.html">Pxp_core_types.S</a></pre> <div class="included-module-type"> <pre><span class="keyword">module</span> <a href="Pxp_core_types.S.StringMap.html">StringMap</a>: <code class="type">Map.S</code><code class="type"> with type key = string</code></pre><div class="info"> For maps with string keys </div> <br> <span id="2_Identifiers"><h2>Identifiers</h2></span><br> <pre><span id="TYPEext_id"><span class="keyword">type</span> <code class="type"></code>ext_id</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEext_id">Pxp_core_types.A.ext_id</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELText_id.System"><span class="constructor">System</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELText_id.Public"><span class="constructor">Public</span></span> <span class="keyword">of</span> <code class="type">(string * string)</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELText_id.Anonymous"><span class="constructor">Anonymous</span></span></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELText_id.Private"><span class="constructor">Private</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEprivate_id">private_id</a></code></code></td> </tr></table> <div class="info"> External identifiers are names for documents. 
A <code class="code"><span class="constructor">System</span></code> identifier is a URL. PXP (without extensions) supports only file URLs of the form <code class="code">file:///directory/directory/.../file</code>. Percent encoding (% plus two hex digits) is supported in file URLs. A public identifier can be looked up in a catalog to find a local copy of the file; this kind of identifier is mostly used for well-known documents (e.g. standardized ones). A public identifier can be accompanied by a system identifier (<code class="code"><span class="constructor">Public</span>(pubid,sysid)</code>); the system identifier may be the empty string. The value <code class="code"><span class="constructor">Anonymous</span></code> should not be used to identify a real document; it is intended as a placeholder for an ID that is not yet known. <code class="code"><span class="constructor">Private</span></code> identifiers are used internally by PXP. Unlike system or public IDs, they have no textual counterpart. 
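<p> As a rough illustration of the four cases, the following sketch builds a few identifiers and prints a description of each. It is self-contained: the type definitions are a local copy of the ones shown above (with <code class="code">private_id</code> replaced by an <code class="code">int</code> stand-in, since the real type is opaque), and <code class="code">describe</code> is a hypothetical helper, not a PXP function. In a program linked against PXP, <code class="code">open Pxp_types</code> would supply the real types instead.</p>
<pre class="codepre">
```ocaml
(* Local copy of the ext_id type, so that this sketch compiles without PXP.
   With PXP installed, [open Pxp_types] would replace these definitions. *)
type private_id = int  (* assumption: stand-in only; the real type is opaque *)

type ext_id =
  | System of string
  | Public of (string * string)
  | Anonymous
  | Private of private_id

(* Hypothetical helper (not part of PXP): a printable summary of an ID. *)
let describe (id : ext_id) : string =
  match id with
  | System url -> "system: " ^ url
  | Public (pubid, "") -> "public: " ^ pubid ^ " (no system ID)"
  | Public (pubid, sysid) -> "public: " ^ pubid ^ " at " ^ sysid
  | Anonymous -> "anonymous placeholder"
  | Private _ -> "private (no textual form)"

let () =
  List.iter
    (fun id -> print_endline (describe id))
    [ (* a file URL; the space in the file name is percent-encoded *)
      System "file:///tmp/my%20file.xml";
      (* a public ID whose accompanying system ID is the empty string *)
      Public ("-//W3C//DTD XHTML 1.0 Strict//EN", "");
      (* a placeholder for an ID that is not yet known *)
      Anonymous ]
```
</pre>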
<p> The identifiers are encoded as UTF-8 strings.<br> </div> <pre><span id="TYPEprivate_id"><span class="keyword">type</span> <code class="type"></code>private_id</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEprivate_id">Pxp_core_types.A.private_id</a></code> </pre> <div class="info"> A private ID is an opaque identifier<br> </div> <pre><span id="VALallocate_private_id"><span class="keyword">val</span> allocate_private_id</span> : <code class="type">unit -> <a href="Pxp_core_types.S.html#TYPEprivate_id">private_id</a></code></pre><div class="info"> Get a new unique private ID<br> </div> <pre><span id="TYPEresolver_id"><span class="keyword">type</span> <code class="type"></code>resolver_id</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEresolver_id">Pxp_core_types.A.resolver_id</a></code> = {</pre><table class="typetable"> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTresolver_id.rid_private">rid_private</span> :<code class="type"><a href="Pxp_core_types.S.html#TYPEprivate_id">private_id</a> option</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTresolver_id.rid_public">rid_public</span> :<code class="type">string option</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTresolver_id.rid_system">rid_system</span> :<code class="type">string option</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTresolver_id.rid_system_base">rid_system_base</span> :<code class="type">string option</code>;</code></td> </tr></table> } <div class="info"> A resolver ID is a version of external identifiers used during resolving (i.e. the process of mapping the identifier to a real resource). 
The same entity can have several names during resolving: one private ID, one public ID, and one system ID. For resolving system IDs, the base URL is also remembered (usually the system ID of the opener of the entity).<br> </div> <pre><span id="VALresolver_id_of_ext_id"><span class="keyword">val</span> resolver_id_of_ext_id</span> : <code class="type"><a href="Pxp_core_types.S.html#TYPEext_id">ext_id</a> -> <a href="Pxp_core_types.S.html#TYPEresolver_id">resolver_id</a></code></pre><div class="info"> The standard way of converting an ext_id into a resolver ID. A <code class="code"><span class="constructor">System</span></code> ID is turned into a <code class="code">resolver_id</code> where only <code class="code">rid_system</code> is set. A <code class="code"><span class="constructor">Public</span></code> ID is turned into a <code class="code">resolver_id</code> where both <code class="code">rid_public</code> and <code class="code">rid_system</code> are set. A <code class="code"><span class="constructor">Private</span></code> ID is turned into a <code class="code">resolver_id</code> where only <code class="code">rid_private</code> is set. 
An <code class="code"><span class="constructor">Anonymous</span></code> ID is turned into a <code class="code">resolver_id</code> without any value (all components are None).<br> </div> <pre><span id="TYPEdtd_id"><span class="keyword">type</span> <code class="type"></code>dtd_id</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEdtd_id">Pxp_core_types.A.dtd_id</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTdtd_id.External"><span class="constructor">External</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEext_id">ext_id</a></code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >DTD is completely external</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTdtd_id.Derived"><span class="constructor">Derived</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEext_id">ext_id</a></code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >DTD is derived from an external DTD</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTdtd_id.Internal"><span class="constructor">Internal</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >DTD is completely internal</td><td class="typefieldcomment" 
align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> Identifier for DTDs<br> </div> <br> <span id="2_ContentmodelsinDTDs"><h2>Content models (in DTDs)</h2></span><br> <pre><span id="TYPEcontent_model_type"><span class="keyword">type</span> <code class="type"></code>content_model_type</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEcontent_model_type">Pxp_core_types.A.content_model_type</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTcontent_model_type.Unspecified"><span class="constructor">Unspecified</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >A specification of the model has not yet been found</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTcontent_model_type.Empty"><span class="constructor">Empty</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Nothing is allowed as content</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTcontent_model_type.Any"><span class="constructor">Any</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Everything is allowed as content</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span 
class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTcontent_model_type.Mixed"><span class="constructor">Mixed</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEmixed_spec">mixed_spec</a> list</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >The contents consist of elements and <code class="code"><span class="constructor">PCDATA</span></code> in arbitrary order. What is allowed in particular is given as <code class="code">mixed_spec</code>.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTcontent_model_type.Regexp"><span class="constructor">Regexp</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEregexp_spec">regexp_spec</a></code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >The contents are elements following this regular expression</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> Element declaration in a DTD<br> </div> <pre><span id="TYPEmixed_spec"><span class="keyword">type</span> <code class="type"></code>mixed_spec</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEmixed_spec">Pxp_core_types.A.mixed_spec</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTmixed_spec.MPCDATA"><span class="constructor">MPCDATA</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" 
align="left" valign="top" ><code class="code"><span class="constructor">PCDATA</span></code> children are allowed</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTmixed_spec.MChild"><span class="constructor">MChild</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >This kind of Element is allowed</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> Children of an element in "mixed"-style declaration<br> </div> <pre><span id="TYPEregexp_spec"><span class="keyword">type</span> <code class="type"></code>regexp_spec</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEregexp_spec">Pxp_core_types.A.regexp_spec</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTregexp_spec.Optional"><span class="constructor">Optional</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEregexp_spec">regexp_spec</a></code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >subexpression?</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTregexp_spec.Repeated"><span class="constructor">Repeated</span></span> <span class="keyword">of</span> <code class="type"><a 
href="Pxp_core_types.S.html#TYPEregexp_spec">regexp_spec</a></code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >subexpression*</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTregexp_spec.Repeated1"><span class="constructor">Repeated1</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEregexp_spec">regexp_spec</a></code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >subexpression+</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTregexp_spec.Alt"><span class="constructor">Alt</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEregexp_spec">regexp_spec</a> list</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >subexpr1 | subexpr2 | ... 
| subexprN</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTregexp_spec.Seq"><span class="constructor">Seq</span></span> <span class="keyword">of</span> <code class="type"><a href="Pxp_core_types.S.html#TYPEregexp_spec">regexp_spec</a> list</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >subexpr1 , subexpr2 , ... , subexprN</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTregexp_spec.Child"><span class="constructor">Child</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >This kind of Element is allowed here</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> Children of an element in a regexp-style declaration<br> </div> <pre><span id="TYPEatt_type"><span class="keyword">type</span> <code class="type"></code>att_type</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEatt_type">Pxp_core_types.A.att_type</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_cdata"><span class="constructor">A_cdata</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">CDATA</span></code></td><td 
class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_id"><span class="constructor">A_id</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">ID</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_idref"><span class="constructor">A_idref</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">IDREF</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_idrefs"><span class="constructor">A_idrefs</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">IDREFS</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_entity"><span class="constructor">A_entity</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span 
class="constructor">ENTITY</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_entities"><span class="constructor">A_entities</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">ENTITIES</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_nmtoken"><span class="constructor">A_nmtoken</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">NMTOKEN</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_nmtokens"><span class="constructor">A_nmtokens</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">NMTOKENS</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_notation"><span class="constructor">A_notation</span></span> <span class="keyword">of</span> <code class="type">string list</code></code></td> <td class="typefieldcomment" 
align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">NOTATION</span></code> (name1 | name2 | ... | nameN)</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_type.A_enum"><span class="constructor">A_enum</span></span> <span class="keyword">of</span> <code class="type">string list</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >(name1 | name2 | ... | nameN)</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> Attribute declaration in a DTD<br> </div> <pre><span id="TYPEatt_default"><span class="keyword">type</span> <code class="type"></code>att_default</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEatt_default">Pxp_core_types.A.att_default</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_default.D_required"><span class="constructor">D_required</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="keywordsign">#</span><span class="constructor">REQUIRED</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_default.D_implied"><span class="constructor">D_implied</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" 
><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="keywordsign">#</span><span class="constructor">IMPLIED</span></code></td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_default.D_default"><span class="constructor">D_default</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >a value default -- the value is already expanded</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_default.D_fixed"><span class="constructor">D_fixed</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code"><span class="constructor">FIXED</span></code> value default -- the value is already expanded</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> Default value of an attribute<br> </div> <br> <span id="2_Attributevalue"><h2>Attribute value</h2></span><br> <pre><span id="TYPEatt_value"><span class="keyword">type</span> <code class="type"></code>att_value</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEatt_value">Pxp_core_types.A.att_value</a></code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_value.Value"><span 
class="constructor">Value</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_value.Valuelist"><span class="constructor">Valuelist</span></span> <span class="keyword">of</span> <code class="type">string list</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTatt_value.Implied_value"><span class="constructor">Implied_value</span></span></code></td> </tr></table> <div class="info"> Enumerates the possible values of an attribute:<ul> <li><code class="code"><span class="constructor">Value</span> s</code>: The attribute is declared as a non-list type or is undeclared, and it is either defined with value <code class="code"><span class="string">"s"</span></code> or missing but has the default value <code class="code"><span class="string">"s"</span></code>.</li> <li><code class="code"><span class="constructor">Valuelist</span> [s1;...;sk]</code>: The attribute is declared as a list type, and it is either defined with value <code class="code"><span class="string">"s1 ... sk"</span></code> (space-separated words) or missing but has the default value <code class="code"><span class="string">"s1 ... 
sk"</span></code>.</li> <li><code class="code"><span class="constructor">Implied_value</span></code>: The attribute is declared without default value, and there is no definition for the attribute.</li> </ul> <br> </div> <br> <span id="2_Warnings"><h2>Warnings</h2></span><br> <pre><span id="TYPEcollect_warnings"><span class="keyword">class type</span> <a href="Pxp_core_types.S.collect_warnings-c.html">collect_warnings</a></span> = <code class="code"><span class="keyword">object</span></code> <a href="Pxp_core_types.S.collect_warnings-c.html">..</a> <code class="code"><span class="keyword">end</span></code></pre><div class="info"> This object is sometimes used for outputting user warnings </div> <pre><span name="TYPEdrop_warnings"><span class="keyword">class</span> <a href="Pxp_core_types.S.drop_warnings-c.html">drop_warnings</a></span> : <code class="type"></code><code class="type"><a href="Pxp_core_types.S.collect_warnings-c.html">collect_warnings</a></code></pre><div class="info"> Drop any warnings </div> <pre><span id="TYPEwarning"><span class="keyword">type</span> <code class="type"></code>warning</span> = <code class="type">[ `W_XML_version_not_supported of string<br> | `W_code_point_cannot_be_represented of int<br> | `W_element_mentioned_but_not_declared of string<br> | `W_entity_declared_twice of string<br> | `W_multiple_ATTLIST_declarations of string<br> | `W_multiple_attribute_declarations of string * string<br> | `W_name_is_reserved_for_extensions of string ]</code> </pre> <div class="info"> Kinds of warnings<br> </div> <pre><span id="TYPEsymbolic_warnings"><span class="keyword">class type</span> <a href="Pxp_core_types.S.symbolic_warnings-c.html">symbolic_warnings</a></span> = <code class="code"><span class="keyword">object</span></code> <a href="Pxp_core_types.S.symbolic_warnings-c.html">..</a> <code class="code"><span class="keyword">end</span></code></pre><div class="info"> This object is sometimes used for outputting user warnings </div> <pre><span 
id="VALstring_of_warning"><span class="keyword">val</span> string_of_warning</span> : <code class="type"><a href="Pxp_core_types.S.html#TYPEwarning">warning</a> -> string</code></pre><div class="info"> Turn the warning into a human-readable message<br> </div> <pre><span id="VALwarn"><span class="keyword">val</span> warn</span> : <code class="type"><a href="Pxp_core_types.S.symbolic_warnings-c.html">symbolic_warnings</a> option -><br> <a href="Pxp_core_types.S.collect_warnings-c.html">collect_warnings</a> -> <a href="Pxp_core_types.S.html#TYPEwarning">warning</a> -> unit</code></pre><div class="info"> Send a warning to the <code class="code">symbolic_warnings</code> object, and then to the <code class="code">collect_warnings</code> object.<br> </div> <br> <span id="2_Encoding"><h2>Encoding</h2></span><br> <pre><span id="TYPEencoding"><span class="keyword">type</span> <code class="type"></code>encoding</span> = <code class="type">Netconversion.encoding</code> </pre> <div class="info"> For the representation of external resources (files etc.) 
we accept all encodings for character sets which are defined in Netconversion (package netstring).<br> </div> <pre><span id="TYPErep_encoding"><span class="keyword">type</span> <code class="type"></code>rep_encoding</span> = <code class="type">[ `Enc_cp1006<br> | `Enc_cp437<br> | `Enc_cp737<br> | `Enc_cp775<br> | `Enc_cp850<br> | `Enc_cp852<br> | `Enc_cp855<br> | `Enc_cp856<br> | `Enc_cp857<br> | `Enc_cp860<br> | `Enc_cp861<br> | `Enc_cp862<br> | `Enc_cp863<br> | `Enc_cp864<br> | `Enc_cp865<br> | `Enc_cp866<br> | `Enc_cp869<br> | `Enc_cp874<br> | `Enc_iso88591<br> | `Enc_iso885910<br> | `Enc_iso885913<br> | `Enc_iso885914<br> | `Enc_iso885915<br> | `Enc_iso885916<br> | `Enc_iso88592<br> | `Enc_iso88593<br> | `Enc_iso88594<br> | `Enc_iso88595<br> | `Enc_iso88596<br> | `Enc_iso88597<br> | `Enc_iso88598<br> | `Enc_iso88599<br> | `Enc_koi8r<br> | `Enc_macroman<br> | `Enc_usascii<br> | `Enc_utf8<br> | `Enc_windows1250<br> | `Enc_windows1251<br> | `Enc_windows1252<br> | `Enc_windows1253<br> | `Enc_windows1254<br> | `Enc_windows1255<br> | `Enc_windows1256<br> | `Enc_windows1257<br> | `Enc_windows1258 ]</code> </pre> <div class="info"> The subset of <code class="code">encoding</code> that may be used for the internal representation of strings. 
The common property of these encodings is that they are ASCII-compatible - the PXP code relies on that.<br> </div> <br> <span id="2_Exceptions"><h2>Exceptions</h2></span><br> <pre><span id="EXCEPTIONValidation_error"><span class="keyword">exception</span> Validation_error</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> Violation of a validity constraint<br> </div> <pre><span id="EXCEPTIONWF_error"><span class="keyword">exception</span> WF_error</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> Violation of a well-formedness constraint<br> </div> <pre><span id="EXCEPTIONNamespace_error"><span class="keyword">exception</span> Namespace_error</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> Violation of a namespace constraint<br> </div> <pre><span id="EXCEPTIONError"><span class="keyword">exception</span> Error</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> Other error<br> </div> <pre><span id="EXCEPTIONCharacter_not_supported"><span class="keyword">exception</span> Character_not_supported</span></pre> <pre><span id="EXCEPTIONAt"><span class="keyword">exception</span> At</span> <span class="keyword">of</span> <code class="type">(string * exn)</code></pre> <div class="info"> The string describes where the exception occurred. The exn value can itself be <code class="code"><span class="constructor">At</span>(_,_)</code> (for example, when an entity within an entity causes the error).<br> </div> <pre><span id="EXCEPTIONUndeclared"><span class="keyword">exception</span> Undeclared</span></pre> <div class="info"> Indicates that no declaration is available and that, because of this, every kind of usage is allowed. 
(Raised by some DTD methods.)<br> </div> <pre><span id="EXCEPTIONMethod_not_applicable"><span class="keyword">exception</span> Method_not_applicable</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> Indicates that a method has been called that is not applicable for the class. The argument is the name of the method.<br> </div> <pre><span id="EXCEPTIONNamespace_method_not_applicable"><span class="keyword">exception</span> Namespace_method_not_applicable</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> Indicates that the called method is a namespace method but that the object does not support namespaces. The argument is the name of the method.<br> </div> <pre><span id="EXCEPTIONNot_competent"><span class="keyword">exception</span> Not_competent</span></pre> <div class="info"> The resolver cannot open this kind of entity ID<br> </div> <pre><span id="EXCEPTIONNot_resolvable"><span class="keyword">exception</span> Not_resolvable</span> <span class="keyword">of</span> <code class="type">exn</code></pre> <div class="info"> While opening the entity, the nested exception occurred<br> </div> <pre><span id="EXCEPTIONNamespace_not_managed"><span class="keyword">exception</span> Namespace_not_managed</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> A namespace URI is used but not declared in the namespace manager. The string argument is the URI in question.<br> </div> <pre><span id="EXCEPTIONNamespace_prefix_not_managed"><span class="keyword">exception</span> Namespace_prefix_not_managed</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> A namespace prefix is used but not declared in the namespace manager. 
The string argument is the prefix in question.<br> </div> <pre><span id="EXCEPTIONNamespace_not_in_scope"><span class="keyword">exception</span> Namespace_not_in_scope</span> <span class="keyword">of</span> <code class="type">string</code></pre> <div class="info"> The namespace scope does not know the URI<br> </div> <pre><span id="VALstring_of_exn"><span class="keyword">val</span> string_of_exn</span> : <code class="type">exn -> string</code></pre><div class="info"> Converts a PXP exception into a readable string<br> </div> <br> <span id="2_Outputdestination"><h2>Output destination</h2></span><br> <pre><span id="TYPEoutput_stream"><span class="keyword">type</span> <code class="type"></code>output_stream</span> = <code class="type">[ `Out_buffer of Buffer.t<br> | `Out_channel of Pervasives.out_channel<br> | `Out_function of string -> int -> int -> unit<br> | `Out_netchannel of Netchannels.out_obj_channel ]</code> </pre> <div class="info"> Designates an output destination for several printers:<ul> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Out_buffer</span> b</code>: Output to buffer <code class="code">b</code></li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Out_channel</span> ch</code>: Output to channel <code class="code">ch</code></li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Out_function</span> f</code>: Output to function <code class="code">f</code>. 
The function <code class="code">f</code> is used like <code class="code"><span class="constructor">Pervasives</span>.output_string</code>.</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Out_netchannel</span> n</code>: Output to the ocamlnet channel <code class="code">n</code></li> </ul> <br> </div> <pre><span id="VALwrite"><span class="keyword">val</span> write</span> : <code class="type"><a href="Pxp_core_types.S.html#TYPEoutput_stream">output_stream</a> -> string -> int -> int -> unit</code></pre><div class="info"> <code class="code">write os s pos len</code>: Writes the portion of the string <code class="code">s</code> that starts at <code class="code">pos</code> and has length <code class="code">len</code> to the buffer/channel/stream<br> </div> <br> <span id="2_Pools"><h2>Pools</h2></span><br> <pre><span id="TYPEpool"><span class="keyword">type</span> <code class="type"></code>pool</span> = <code class="type"><a href="Pxp_core_types.A.html#TYPEpool">Pxp_core_types.A.pool</a></code> </pre> <div class="info"> A pool designates a way to increase string sharing<br> </div> <pre><span id="VALmake_probabilistic_pool"><span class="keyword">val</span> make_probabilistic_pool</span> : <code class="type">?fraction:float -> int -> <a href="Pxp_core_types.S.html#TYPEpool">pool</a></code></pre><div class="info"> A probabilistic string pool tries to map strings to pool strings in order to make it more likely that equal strings are stored in the same memory block. The int argument is the size of the pool, i.e. the number of entries of the pool. However, not all entries of the pool are used; the <code class="code">fraction</code> argument (default: 0.3) determines the fraction of the entries that are actually used. 
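<p> A minimal usage sketch, assuming the PXP library is installed and linked (building it through a findlib package named <code class="code">pxp</code> is an assumption about the local setup). It uses only <code class="code">make_probabilistic_pool</code> and <code class="code">pool_string</code> from this module. Since the pool is probabilistic, sharing is not guaranteed, so the sketch prints the outcome instead of asserting it.
<pre><code class="code">open Pxp_types

(* A pool with 1000 entries; ~fraction repeats the documented default. *)
let pool = make_probabilistic_pool ~fraction:0.3 1000

(* Two separately allocated strings with the same contents. *)
let s1 = pool_string pool (String.concat "" ["par"; "a"])
let s2 = pool_string pool (String.concat "" ["pa"; "ra"])

let () =
  (* The contents are equal in any case. *)
  assert (s1 = s2);
  (* Physical equality holds only if the first string made it into the pool. *)
  Printf.printf "same memory block: %b\n" (s1 == s2)
</code></pre>
<p>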
The higher the fraction is, the more strings can be managed at the same time; the lower the fraction is, the more likely it is that a new string can be added to the pool.<br> </div> <pre><span id="VALpool_string"><span class="keyword">val</span> pool_string</span> : <code class="type"><a href="Pxp_core_types.S.html#TYPEpool">pool</a> -> string -> string</code></pre><div class="info"> Tries to find the passed string in the pool; if the string is in the pool, the pool string is returned. Otherwise, the function tries to add the passed string to the pool, and the passed string is returned.<br> </div> </div> <br> <p> <p> <p> <br> <br> <span id="2_Configuration"><h2>Configuration</h2></span><br> <pre><code><span id="TYPEconfig"><span class="keyword">type</span> <code class="type"></code>config</span> = {</code></pre><table class="typetable"> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.warner">warner</span> :<code class="type">collect_warnings</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >An object that collects warnings.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.swarner">swarner</span> :<code class="type">symbolic_warnings option</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Another object getting warnings expressed as polymorphic variants. This is especially useful to turn warnings into errors. 
If defined, the <code class="code">swarner</code> gets the warning first before it is sent to the classic <code class="code">warner</code>.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_pinstr_nodes">enable_pinstr_nodes</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >If <code class="code"><span class="keyword">true</span></code>, processing instructions (PIs) are represented by nodes of their own in the document tree. If not enabled, PIs are attached to their surrounding elements, and the exact location within the element is lost. <p> For example, if the XML text is <code class="code">&lt;s&gt;&lt;?x?&gt;foo&lt;?y?&gt;&lt;/s&gt;</code>, the parser normally produces only an element object for <code class="code">s</code>, and attaches the PIs <code class="code">x</code> and <code class="code">y</code> to it (without order), and the details of <code class="code">x</code> and <code class="code">y</code> can only be retrieved with the <code class="code">pinstr</code> method of the surrounding element. The only subnode is the data node for "foo". If <code class="code">enable_pinstr_nodes</code> is true, the node for element <code class="code">s</code> will contain two additional subnodes of type <code class="code"><span class="constructor">T_pinstr</span></code>, one as left sibling of "foo", and one as right sibling. Any code processing such a tree must be prepared for processing instructions that occur as normal tree members and are no longer attached to the surrounding nodes. 
<p> The event-based parser reacts on the <code class="code">enable_pinstr_nodes</code> mode by emitting <code class="code"><span class="constructor">E_pinstr</span></code> events exactly at the locations where the PI's occur in the text.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_comment_nodes">enable_comment_nodes</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >When enabled, comments are represented as nodes with type <code class="code"><span class="constructor">T_comment</span></code>. If not enabled, comments are ignored. <p> Event-based parser: This flag controls whether E_comment events are generated.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_super_root_node">enable_super_root_node</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >The <code class="code">enable_super_root_node</code> changes the layout of the document tree: The top-most node is no longer the top-most element of the document (i.e. the element root), but a special node called the super root node (<code class="code"><span class="constructor">T_super_root</span></code>). The top-most element is then a child of the super root node. The super root node can have further children, namely comment nodes and processing instructions that are placed before or after the top-most element in the XML text. 
However, the exact behaviour depends on whether the other special modes in the configuration are also enabled:<ul> <li>If <code class="code">enable_pinstr_nodes</code> is also true, processing instruction nodes (<code class="code"><span class="constructor">T_pinstr</span></code>) can occur as children of the super root node when processing instructions occur before or after the root element. If <code class="code">enable_pinstr_nodes</code> is false, these instructions are simply attached to the super root node as they would be attached to ordinary elements within the tree. Note that processing instructions in the DTD part of the XML text are not meant here (i.e. instructions between the square brackets, or in an external DTD). These instructions are always attached to the DTD object (see <a href="Pxp_dtd.dtd-c.html"><code class="code"><span class="constructor">Pxp_dtd</span>.dtd</code></a>).</li> <li>If <code class="code">enable_comment_nodes</code> is also true, comment nodes can occur as children of the super root node when comments occur before or after the root element. If <code class="code">enable_comment_nodes</code> is false, comments are ignored.</li> </ul> </td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.drop_ignorable_whitespace">drop_ignorable_whitespace</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Ignorable whitespace is whitespace between XML nodes where the DTD does not specify that <code class="code"><span class="keywordsign">#</span><span class="constructor">PCDATA</span></code> must be parsed. 
For example, if the DTD contains <code class="code"> &lt;!<span class="constructor">ELEMENT</span> a (b,c)&gt;<br> &lt;!<span class="constructor">ELEMENT</span> b (<span class="keywordsign">#</span><span class="constructor">PCDATA</span>)*&gt;<br> &lt;!<span class="constructor">ELEMENT</span> c <span class="constructor">EMPTY</span>&gt;<br> </code> the XML text <code class="code">&lt;a&gt;&lt;b&gt; &lt;/b&gt; &lt;c&gt;&lt;/c&gt;&lt;/a&gt;</code> is legal. There are two spaces:<ul> <li>Between <code class="code">&lt;b&gt;</code> and <code class="code">&lt;/b&gt;</code>. Because <code class="code">b</code> is declared with <code class="code"><span class="keywordsign">#</span><span class="constructor">PCDATA</span></code>, this space character is not ignorable, and the parser will create a data node containing the character</li> <li>Between <code class="code">&lt;/b&gt;</code> and <code class="code">&lt;c&gt;</code>. Because the declaration of <code class="code">a</code> does not contain the keyword <code class="code"><span class="keywordsign">#</span><span class="constructor">PCDATA</span></code>, character data is not expected at this position. However, XML allows whitespace here in order to improve the readability of the XML text. Such whitespace is called "ignorable whitespace". If <code class="code">drop_ignorable_whitespace</code> is true, the parser will not create a data node containing the character. Otherwise, the parser does create such a data node.</li> </ul> Note that <code class="code">c</code> is declared as <code class="code"><span class="constructor">EMPTY</span></code>. XML does not allow space characters between <code class="code">&lt;c&gt;</code> and <code class="code">&lt;/c&gt;</code>, so the question of whether such characters are to be ignored does not arise - they are simply illegal and will lead to a parsing error. <p> In the well-formed mode, the parser treats every whitespace character occurring in an element as non-ignorable. <p> Event-based parser: ignored. 
(Maybe there will be a stream filter with the same effect if I find time to program it.)</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.encoding">encoding</span> :<code class="type">rep_encoding</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Specifies the encoding used for the <b>internal</b> representation of any character data.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.recognize_standalone_declaration">recognize_standalone_declaration</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Whether the <code class="code">standalone</code> declaration is recognized or not. This option does not have an effect on well-formedness parsing: in this case such declarations are never recognized. <p> Recognizing the <code class="code">standalone</code> declaration means that the value of the declaration is scanned and passed to the DTD, and that the standalone-check is performed. <p> This means: If a document is flagged <code class="code">standalone=<span class="keywordsign">'</span>yes'</code> some additional constraints apply. The idea is that a parser without access to any external document subsets can still parse the document, and will still return the same values as the parser with such access. 
For example, if the DTD is external and if there are attributes with default values, it is checked that there is no element instance where these attributes are omitted - the parser would return the default value but this requires access to the external DTD subset. <p> Event-based parser: The option has an effect if the <code class="code"><span class="keywordsign">`</span><span class="constructor">Parse_xml_decl</span></code> entry flag is set. In this case, it is passed to the DTD whether there is a standalone declaration, ... and the rest is unclear.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.store_element_positions">store_element_positions</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Whether the file name, the line and the column of the beginning of elements are stored in the element nodes. This option may be useful to generate error messages. <p> Positions are only stored for:<ul> <li>Elements</li> <li>Processing instructions if <code class="code"><span class="constructor">T_pinstr</span></code> nodes are created for them (see <code class="code">enable_pinstr_nodes</code>)</li> </ul> For all other node types, no position is stored. <p> You can access positions by the method <code class="code">position</code> of nodes. 
<p> Event-based parser: If true, the <code class="code"><span class="constructor">E_position</span></code> events will be generated.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.idref_pass">idref_pass</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Whether the parser does a second pass and checks that all <code class="code"><span class="constructor">IDREF</span></code> and <code class="code"><span class="constructor">IDREFS</span></code> attributes contain valid references. This option works only if an ID index is available. To create an ID index, pass an index object as <code class="code">id_index</code> argument to the parsing functions (such as <a href="Pxp_tree_parser.html#VALparse_document_entity"><code class="code"><span class="constructor">Pxp_tree_parser</span>.parse_document_entity</code></a>). <p> "Second pass" does not mean that the XML text is again parsed; only the existing document tree is traversed, and the check on bad <code class="code"><span class="constructor">IDREF</span></code>/<code class="code"><span class="constructor">IDREFS</span></code> attributes is performed for every node. <p> Event-based parser: this option is ignored.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.validate_by_dfa">validate_by_dfa</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >If true, and if DFAs are available for validation, the DFAs will actually be used for validation. 
If false, or if no DFAs are available, the standard backtracking algorithm will be used. <p> DFAs are only available if <code class="code">accept_only_deterministic_models</code> is true (because in this case, it is relatively cheap to construct the DFAs). DFAs are a data structure which ensures that validation can always be performed in linear time. <p> I strongly recommend using DFAs; however, there are examples for which validation by backtracking is faster. <p> Event-based parser: this option is ignored.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.accept_only_deterministic_models">accept_only_deterministic_models</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Whether only deterministic content models are accepted in DTDs. <p> Event-based parser: this option is ignored.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.disable_content_validation">disable_content_validation</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >When set to true, content validation is disabled; however, other validation checks remain activated. This option is intended to save time when a validated document is parsed and it can be assumed that it is valid. <p> Do not forget to set <code class="code">accept_only_deterministic_models</code> to false to save maximum time (or DFAs will be computed which is rather expensive). 
<p> Event-based parser: this option is ignored.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.name_pool">name_pool</span> :<code class="type">Pxp_core_types.I.pool</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_name_pool_for_element_types">enable_name_pool_for_element_types</span> :<code class="type">bool</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_name_pool_for_attribute_names">enable_name_pool_for_attribute_names</span> :<code class="type">bool</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_name_pool_for_attribute_values">enable_name_pool_for_attribute_values</span> :<code class="type">bool</code>;</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_name_pool_for_pinstr_targets">enable_name_pool_for_pinstr_targets</span> :<code class="type">bool</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >The name pool maps strings to pool strings such that strings with the same value share the same block of memory. Enabling the name pool saves memory, but makes the parser slower. 
<p> Event-based parser: As far as I remember, some of the pool options are honoured, but not all.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.enable_namespace_processing">enable_namespace_processing</span> :<code class="type"><a href="Pxp_dtd.namespace_manager-c.html">Pxp_dtd.namespace_manager</a> option</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Setting this option to a <code class="code">namespace_manager</code> enables namespace processing. This works only if the namespace-aware implementation <code class="code">namespace_element_impl</code> of element nodes is used in the spec; otherwise you will get error messages complaining about missing methods. <p> Note that PXP uses a technique called "prefix normalization" to implement namespaces on top of the plain document model. This means that the namespace prefixes of elements and attributes are changed to unique prefixes if they are ambiguous, and that these "normprefixes" are actually stored in the document tree. Furthermore, the normprefixes are used for validation. (See <a href="Intro_namespaces.html"><code class="code"><span class="constructor">Intro_namespaces</span></code></a> for details.) 
<p> Event-based parser: If a namespace manager is set, the events <code class="code"><span class="constructor">E_ns_start_tag</span></code> and <code class="code"><span class="constructor">E_ns_end_tag</span></code> are generated instead of <code class="code"><span class="constructor">E_start_tag</span></code> and <code class="code"><span class="constructor">E_end_tag</span></code>, respectively.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.escape_contents">escape_contents</span> :<code class="type">(Pxp_lexer_types.token -> Pxp_entity_manager.entity_manager -> string) option</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><b>Experimental feature.</b> If defined, the <code class="code">escape_contents</code> function is called whenever the tokens "{", "{{", "}", or "}}" are found in the context of character data contents. The first argument is the token. The second argument is the entity manager; it can be used to access the lexing buffer directly. The result of the function is the string to substitute for the token. <p> "{" is the token <code class="code"><span class="constructor">Lcurly</span></code>, "{{" is the token <code class="code"><span class="constructor">LLcurly</span></code>, "}" is the token <code class="code"><span class="constructor">Rcurly</span></code>, and "}}" is the token <code class="code"><span class="constructor">RRcurly</span></code>. 
<p> Event-based parser: this option works.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.escape_attributes">escape_attributes</span> :<code class="type">(Pxp_lexer_types.token -> int -> Pxp_entity_manager.entity_manager -> string)<br> option</code>;</code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><b>Experimental feature.</b> If defined, the <code class="code">escape_attributes</code> function is called whenever the tokens "{", "{{", "}", or "}}" are found inside attribute values. The function takes three arguments: The token (<code class="code"><span class="constructor">Lcurly</span></code>, <code class="code"><span class="constructor">LLcurly</span></code>, <code class="code"><span class="constructor">Rcurly</span></code> or <code class="code"><span class="constructor">RRcurly</span></code>), the position in the attribute value, and the entity manager. The result of the function is the string substituted for the token. <p> Example: The attribute is "a{b{{c", and the function is called as follows:<ul> <li><code class="code">escape_attributes <span class="constructor">Lcurly</span> 1 mng</code> - result is "42" (or an arbitrary string, but in this example it is "42")</li> <li><code class="code">escape_attributes <span class="constructor">LLcurly</span> 4 mng</code> - result is "foo"</li> </ul> The resulting attribute value is then "a42bfooc". <p> See also <code class="code">escape_contents</code>. 
<p> Event-based parser: this option works.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code> </code></td> <td align="left" valign="top" > <code><span id="TYPEELTconfig.debugging_mode">debugging_mode</span> :<code class="type">bool</code>;</code></td> </tr></table> } <pre><span id="VALdefault_config"><span class="keyword">val</span> default_config</span> : <code class="type"><a href="Pxp_types.html#TYPEconfig">config</a></code></pre><div class="info"> Default configuration. This is a recommended set of options that works generally:<ul> <li>Warnings are thrown away</li> <li>Error messages will contain line numbers</li> <li>Neither T_super_root nor T_pinstr nor T_comment nodes are generated</li> <li>The internal encoding is ISO-8859-1</li> <li>The standalone declaration is checked</li> <li>Element positions are stored</li> <li>The IDREF pass is left out</li> <li>If available, DFAs are used for validation</li> <li>Only deterministic content models are accepted</li> <li>Namespace processing is turned off</li> </ul> <br> </div> <pre><span id="VALdefault_namespace_config"><span class="keyword">val</span> default_namespace_config</span> : <code class="type"><a href="Pxp_types.html#TYPEconfig">config</a></code></pre><div class="info"> <b>Deprecated.</b> Same as <code class="code">default_config</code>, but namespace processing is turned on. Note however, that a globally defined namespace manager is used. Because of this, this <code class="code">config</code> should no longer be used. 
Instead, do <code class="code"> <span class="keyword">let</span> m = <span class="constructor">Pxp_dtd</span>.create_namespace_manager() <span class="keyword">in</span><br> <span class="keyword">let</span> namespace_config =<br> { default_config <span class="keyword">with</span><br> enable_namespace_processing = <span class="constructor">Some</span> m<br> }<br> </code> and take control of the scope of <code class="code">m</code>.<br> </div> <br> <span id="2_Sources"><h2>Sources</h2></span><br> <br> Sources specify where the XML text to parse comes from. The type <code class="code">source</code> is often not used directly, but sources are constructed with the help of the functions <code class="code">from_channel</code>, <code class="code">from_obj_channel</code>, <code class="code">from_file</code>, and <code class="code">from_string</code> (see below). <b>Note that you can usually view the type <code class="code">source</code> as an opaque type.</b> There is no need to understand why it enumerates these three cases, or to use them directly. Just create sources with one of the <code class="code">from_*</code> functions. <p> The type <code class="code">source</code> is an abstraction on top of <code class="code">resolver</code> (defined in module <a href="Pxp_reader.html"><code class="code"><span class="constructor">Pxp_reader</span></code></a>). The <code class="code">resolver</code> is a configurable object that knows how to access files that are<ul> <li>identified by an XML ID (a <code class="code"><span class="constructor">PUBLIC</span></code> or <code class="code"><span class="constructor">SYSTEM</span></code> name)</li> <li>named relative to another file</li> <li>referred to by the special PXP IDs <code class="code"><span class="constructor">Private</span></code> and <code class="code"><span class="constructor">Anonymous</span></code>.</li> </ul> Furthermore, the <code class="code">resolver</code> knows a lot about the character encoding of the files. 
See <a href="Pxp_reader.html"><code class="code"><span class="constructor">Pxp_reader</span></code></a> for details. <p> A <code class="code">source</code> is a resolver that is applied to a certain ID that should be initially opened.<br> <pre><span id="TYPEsource"><span class="keyword">type</span> <code class="type"></code>source</span> = <code class="type">Pxp_dtd.source</code> = </pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTsource.Entity"><span class="constructor">Entity</span></span> <span class="keyword">of</span> <code class="type">((<a href="Pxp_dtd.dtd-c.html">Pxp_dtd.dtd</a> -> Pxp_entity.entity) * <a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a>)</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTsource.ExtID"><span class="constructor">ExtID</span></span> <span class="keyword">of</span> <code class="type">(Pxp_core_types.I.ext_id * <a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a>)</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTsource.XExtID"><span class="constructor">XExtID</span></span> <span class="keyword">of</span> <code class="type">(Pxp_core_types.I.ext_id * string option * <a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a>)</code></code></td> </tr></table> <div class="info"> The three basic flavours of sources:<ul> <li><code class="code"><span class="constructor">Entity</span>(m,r)</code> is a very low-level way of denoting a source. After the parser has created the DTD object <code class="code">d</code>, it calls <code class="code"> e = m d </code> and uses the entity object <code class="code">e</code> together with the resolver <code class="code">r</code>. 
This kind of <code class="code">source</code> is intended to implement customized versions of the entity classes. Use it only if there is a strong need to do so.</li> <li><code class="code"><span class="constructor">ExtID</span>(xid,r)</code> is the normal way of denoting a source. The external entity referred to by the ID <code class="code">xid</code> is opened by using the resolver <code class="code">r</code>.</li> <li><code class="code"><span class="constructor">XExtID</span>(xid,sys_base,r)</code> is an extension of <code class="code"><span class="constructor">ExtID</span></code>. The additional parameter <code class="code">sys_base</code> is the base URI to assume if <code class="code">xid</code> is a relative URI (i.e. a <code class="code"><span class="constructor">SYSTEM</span></code> ID).</li> </ul> <br> </div> <pre><span id="VALfrom_channel"><span class="keyword">val</span> from_channel</span> : <code class="type">?alt:<a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a> list -><br> ?system_id:string -><br> ?fixenc:encoding -><br> ?id:ext_id -><br> ?system_encoding:encoding -> Pervasives.in_channel -> <a href="Pxp_types.html#TYPEsource">source</a></code></pre><div class="info"> This function creates a source that reads the XML text from the passed <code class="code">in_channel</code>. By default, this <code class="code">source</code> is not able to read XML text from any other location (you cannot read from files etc.). The optional arguments allow you to modify this behaviour. 
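<p> A minimal sketch of how the optional arguments might be combined, assuming the PXP library is installed. The helper name <code class="code">source_of_channel</code> and its <code class="code">base</code> parameter are hypothetical and only serve as illustration; <code class="code">from_channel</code>, <code class="code">alt</code>, <code class="code">system_id</code>, and <code class="code">Pxp_reader.resolve_as_file</code> are the names documented on this page.

```ocaml
(* Sketch only: needs the PXP library; source_of_channel is a hypothetical helper. *)
open Pxp_types

(* Build a one-shot source from an already opened channel.
   [base] is an assumed name for the URL the channel content stands for. *)
let source_of_channel ?base (ch : in_channel) : source =
  match base with
  | None ->
      (* Default behaviour: only the channel itself can be read. *)
      from_channel ch
  | Some url ->
      (* Allow further entities to be opened as files, and give the
         channel a SYSTEM ID so relative SYSTEM IDs have a base. *)
      from_channel
        ~alt:[ new Pxp_reader.resolve_as_file () ]
        ~system_id:url
        ch
```

<p> Without <code class="code">base</code>, the resulting source can read only from the channel; with it, entities referenced by absolute or relative <code class="code">SYSTEM</code> file names should become resolvable.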
<p> Keep the following in mind:<ul> <li>Because this source reads from a channel, it can only be used once.</li> <li>The channel will be closed by the parser when the end of the channel is reached, or when the parser stops for another reason.</li> <li>Unless the <code class="code">alt</code> argument specifies something else, you cannot refer to entities by <code class="code"><span class="constructor">SYSTEM</span></code> or <code class="code"><span class="constructor">PUBLIC</span></code> names (error "no input method available").</li> <li>Even if you pass an <code class="code">alt</code> method that can handle <code class="code"><span class="constructor">SYSTEM</span></code>, it is not immediately possible to open <code class="code"><span class="constructor">SYSTEM</span></code> entities that are defined by a URL relative to the entity that is accessed over the <code class="code">in_channel</code>. You must first pass the <code class="code">system_id</code> argument, so the parser knows the base name relative to which other <code class="code"><span class="constructor">SYSTEM</span></code> entities can be resolved.</li> <li>For more instructions on how to construct sources and resolvers, see <a href="Intro_resolution.html"><code class="code"><span class="constructor">Intro_resolution</span></code></a>.</li> </ul> <b>Arguments:</b><ul> <li><code class="code">alt</code>: A list of further resolvers that are used to open further entities referenced in the initially opened entity. 
For example, you can pass <code class="code"><span class="keyword">new</span> <span class="constructor">Pxp_reader</span>.resolve_as_file()</code> to enable resolving of file names found in <code class="code"><span class="constructor">SYSTEM</span></code> IDs.</li> <li><code class="code">system_id</code>: By default, the XML text found in the <code class="code">in_channel</code> does not have any visible ID (to be exact, the <code class="code">in_channel</code> has a private ID, but this is hidden). Because of this, it is not possible to open a second file by using a relative <code class="code"><span class="constructor">SYSTEM</span></code> ID. The parameter <code class="code">system_id</code> assigns the channel a <code class="code"><span class="constructor">SYSTEM</span></code> ID that is only used to resolve further relative <code class="code"><span class="constructor">SYSTEM</span></code> IDs. - This parameter must be encoded as UTF-8 string.</li> <li><code class="code">fixenc</code>: By default, the character encoding of the XML text is determined by looking at the XML declaration. Setting <code class="code">fixenc</code> forces a certain character encoding. Useful if you can assume that the XML text has been recoded by the transmission media.</li> </ul> <b>Deprecated arguments:</b><ul> <li><code class="code">id</code>: This parameter assigns the channel an arbitrary ID (like <code class="code">system_id</code>, but <code class="code"><span class="constructor">PUBLIC</span></code>, anonymous, and private IDs are also possible - although not reasonable). Furthermore, setting <code class="code">id</code> also enables resolving of file names. <code class="code">id</code> has higher precedence than <code class="code">system_id</code>.</li> <li><code class="code">system_encoding</code>: (Only useful together with <code class="code">id</code>.) The character encoding used for file names. 
(UTF-8 by default.)</li> </ul> <br> </div> <pre><span id="VALfrom_obj_channel"><span class="keyword">val</span> from_obj_channel</span> : <code class="type">?alt:<a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a> list -><br> ?system_id:string -><br> ?fixenc:encoding -><br> ?id:ext_id -><br> ?system_encoding:encoding -> Netchannels.in_obj_channel -> <a href="Pxp_types.html#TYPEsource">source</a></code></pre><div class="info"> Similar to <code class="code">from_channel</code>, but reads from an Ocamlnet netchannel instead.<br> </div> <pre><span id="VALfrom_string"><span class="keyword">val</span> from_string</span> : <code class="type">?alt:<a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a> list -><br> ?system_id:string -> ?fixenc:encoding -> string -> <a href="Pxp_types.html#TYPEsource">source</a></code></pre><div class="info"> Similar to <code class="code">from_channel</code>, but reads from a string. <p> Of course, it is possible to parse this source several times, unlike the channel-based sources.<br> </div> <pre><span id="VALfrom_file"><span class="keyword">val</span> from_file</span> : <code class="type">?alt:<a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a> list -><br> ?system_encoding:encoding -> ?enc:encoding -> string -> <a href="Pxp_types.html#TYPEsource">source</a></code></pre><div class="info"> This source reads initially from the file whose name is passed as string argument. The filename must be UTF-8-encoded (so it can be correctly rewritten into a URL). <p> This source can open further files by default, and relative URLs work immediately. <p> <b>Arguments:</b><ul> <li><code class="code">alt</code>: A list of further resolvers, especially useful to open non-<code class="code"><span class="constructor">SYSTEM</span></code> IDs, and non-file entities.</li> <li><code class="code">system_encoding</code>: The character encoding the system uses to represent filenames. 
By default, UTF-8 is assumed.</li> <li><code class="code">enc</code>: The character encoding of the string argument. As mentioned, this is UTF-8 by default.</li> </ul> <br> </div> <br> <b>Examples.</b> <p> <ul> <li>The source <code class="code"> from_file <span class="string">"/tmp/file.xml"</span> </code> reads from this file, which is assumed to have the ID <code class="code"><span class="constructor">SYSTEM</span> <span class="string">"file://localhost/tmp/file.xml"</span></code>. It is no problem when other files are included by either absolute <code class="code"><span class="constructor">SYSTEM</span></code> file name, or by a relative <code class="code"><span class="constructor">SYSTEM</span></code>.</li> <li>The source <code class="code"> <span class="keyword">let</span> ch = open_in <span class="string">"/tmp/file.xml"</span> <span class="keyword">in</span><br> from_channel<br> ~alt:[ <span class="keyword">new</span> <span class="constructor">Pxp_reader</span>.resolve_as_file() ] <br> ~system_id:<span class="string">"file://localhost/tmp/file.xml"</span> ch</code> does roughly the same, but uses a channel for the initially opened entity. Because of the <code class="code">alt</code> argument, it is possible to reference other entities by absolute <code class="code"><span class="constructor">SYSTEM</span></code> name. 
The <code class="code">system_id</code> assignment makes <code class="code"><span class="constructor">SYSTEM</span></code> names relative to the initially used entity resolvable.</li> <li>The source <code class="code"> <span class="keyword">let</span> cat = <span class="keyword">new</span> <span class="constructor">Pxp_reader</span>.lookup_id_as_file<br> [ <span class="constructor">Public</span>(<span class="string">"My Public ID"</span>,<span class="string">""</span>),<span class="string">"/usr/share/xml/public.xml"</span> ] <span class="keyword">in</span><br> from_file ~alt:[cat] <span class="string">"/tmp/file.xml"</span></code> maps the <code class="code"><span class="constructor">PUBLIC</span></code> ID "My Public ID" to the shown file, i.e. this file is parsed when this <code class="code"><span class="constructor">PUBLIC</span></code> ID occurs in the XML text. (<code class="code"><span class="constructor">PUBLIC</span></code> names cannot be resolved unless they are mapped.)</li> </ul> <br> <pre><span id="VALopen_source"><span class="keyword">val</span> open_source</span> : <code class="type"><a href="Pxp_types.html#TYPEconfig">config</a> -><br> <a href="Pxp_types.html#TYPEsource">source</a> -><br> bool -> <a href="Pxp_dtd.dtd-c.html">Pxp_dtd.dtd</a> -> <a href="Pxp_reader.resolver-c.html">Pxp_reader.resolver</a> * Pxp_entity.entity</code></pre><div class="info"> Returns the resolver and the entity for a source. 
The boolean arg determines whether a document entity (true) or a normal external entity (false) will be returned.<br> </div> <br> <span id="2_Entities"><h2>Entities</h2></span><br> <br> See <a href="Pxp_dtd.Entity.html"><code class="code"><span class="constructor">Pxp_dtd</span>.<span class="constructor">Entity</span></code></a> for functions dealing with entities.<br> <pre><span id="TYPEentity_id"><span class="keyword">type</span> <code class="type"></code>entity_id</span> = <code class="type">Pxp_lexer_types.entity_id</code> </pre> <div class="info"> An <code class="code">entity_id</code> is an identifier for an entity, or a fake identifier.<br> </div> <pre><span id="TYPEentity"><span class="keyword">type</span> <code class="type"></code>entity</span> = <code class="type">Pxp_entity.entity</code> </pre> <div class="info"> The representation of entities<br> </div> <br> <span id="2_Eventparsing"><h2>Event parsing</h2></span><br> <pre><span id="TYPEentry"><span class="keyword">type</span> <code class="type"></code>entry</span> = <code class="type">[ `Entry_content of [ `Dummy ] list<br> | `Entry_declarations of [ `Extend_dtd_fully | `Val_mode_dtd ] list<br> | `Entry_document of<br> [ `Extend_dtd_fully | `Parse_xml_decl | `Val_mode_dtd ] list<br> | `Entry_element_content of [ `Dummy ] list<br> | `Entry_expr of [ `Dummy ] list ]</code> </pre> <div class="info"> Entry points for the parser (used to call <code class="code">process_entity</code>):<ul> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_document</span></code>: The parser reads a complete document that must have a DOCTYPE and may have a DTD.</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_declarations</span></code>: The parser reads the external subset of a DTD</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_element_content</span></code>: The parser reads an entity containing 
contents, but there must be one top element, i.e. "misc* element misc*". At the beginning, there can be an XML declaration as for external entities.</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_content</span></code>: The parser reads an entity containing contents, but without the restriction of having a top element. At the beginning, there can be an XML declaration as for external entities.</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_expr</span></code>: The parser reads a single element, a single processing instruction or a single comment, or whitespace, whatever is found. In contrast to the other entry points, the expression need not be a complete entity, but can start and end in the middle of an entity.</li> </ul> More entry points might be defined in the future. <p> The entry points have a list of flags. Note that <code class="code"><span class="keywordsign">`</span><span class="constructor">Dummy</span></code> is ignored and only present because O'Caml does not allow empty variants. For <code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_document</span></code> and <code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_declarations</span></code>, the flags determine the kind of DTD object that is generated. <p> <b>Without flags</b>, the DTD object is configured for well-formedness mode:<ul> <li>Elements, attributes, and notations found in the XML text are not added to the DTD; entity declarations are added, however. Additionally, the DTD is configured such that it does not complain about missing elements, attributes, and notations (<code class="code">dtd<span class="keywordsign">#</span>arbitrary_allowed</code>).</li> </ul> <b>The flags</b> affecting the DTD have the following meaning. 
Keep in mind that the event parser can only conduct some validation checks because it does not represent the XML nodes as a tree. <p> <ul> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Extend_dtd_fully</span></code>: Elements, attributes, and notations are added to the DTD. The DTD mode <code class="code">dtd<span class="keywordsign">#</span>arbitrary_allowed</code> is enabled. If the resulting event stream is validated later, this mode has the effect that the actually declared elements, attributes, and notations are validated as declared. Also, non-declared elements, attributes, and notations are <b>not rejected</b>, but handled as in well-formed mode.</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Val_mode_dtd</span></code>: The DTD object is set up for validation, i.e. all declarations are added to the DTD, and <code class="code">dtd<span class="keywordsign">#</span>arbitrary_allowed</code> is disabled. Furthermore, some validation checks are already done for the DTD (e.g. whether the root element is declared). If the resulting event stream is validated later, all validation checks are conducted (except for the XML declaration - see the next flag - this check must be separately enabled).</li> <li><code class="code"><span class="keywordsign">`</span><span class="constructor">Parse_xml_decl</span></code>: By default, the XML declaration <code class="code"><?xml version=<span class="string">"1.0"</span> encoding=<span class="string">"..."</span> standalone=<span class="string">"..."</span><span class="keywordsign">?></span></code> is ignored except for the encoding attribute. 
This flag causes the XML declaration to be parsed completely.</li> </ul> <br> </div> <pre><code><span id="TYPEevent"><span class="keyword">type</span> <code class="type"></code>event</span> = </code></pre><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_start_doc"><span class="constructor">E_start_doc</span></span> <span class="keyword">of</span> <code class="type">(string * <a href="Pxp_dtd.dtd-c.html">Pxp_dtd.dtd</a>)</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Starts a document. The string is the XML version ("1.0")</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_end_doc"><span class="constructor">E_end_doc</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Ends a document. 
The string is the literal name of the root element (without any normalization or transformation)</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_start_tag"><span class="constructor">E_start_tag</span></span> <span class="keyword">of</span> <code class="type">(string * (string * string) list * <a href="Pxp_dtd.namespace_scope-c.html">Pxp_dtd.namespace_scope</a> option *<br> Pxp_lexer_types.entity_id)</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code">(name, attlist, scope_opt, entid)</code>: Starts an element <code class="code">name</code> with an attribute list <code class="code">attlist</code>. <code class="code">scope_opt</code> is the scope object in namespace mode, otherwise <code class="code"><span class="constructor">None</span></code>. 
<code class="code">entid</code> identifies the entity where the start tag occurs.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_end_tag"><span class="constructor">E_end_tag</span></span> <span class="keyword">of</span> <code class="type">(string * Pxp_lexer_types.entity_id)</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code">(name,entid)</code>: Ends the element <code class="code">name</code> in entity <code class="code">entid</code>.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_char_data"><span class="constructor">E_char_data</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Character data</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_pinstr"><span class="constructor">E_pinstr</span></span> <span class="keyword">of</span> <code class="type">(string * string * Pxp_lexer_types.entity_id)</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >A processing instruction <code class="code"><?target value<span class="keywordsign">?></span></code></td><td class="typefieldcomment" align="left" valign="bottom" 
><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_comment"><span class="constructor">E_comment</span></span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >A comment node. The string does not include the delimiters</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_start_super"><span class="constructor">E_start_super</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Starts the super root</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_end_super"><span class="constructor">E_end_super</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Ends the super root</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_position"><span class="constructor">E_position</span></span> <span class="keyword">of</span> <code class="type">(string * int * int)</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code 
class="code">(entity,line,pos)</code>: Describes that the next event, which is either <code class="code"><span class="constructor">E_start_tag</span></code>, <code class="code"><span class="constructor">E_pinstr</span></code>, or <code class="code"><span class="constructor">E_comment</span></code>, is located in <code class="code">entity</code> at <code class="code">line</code> and character position <code class="code">pos</code>.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_error"><span class="constructor">E_error</span></span> <span class="keyword">of</span> <code class="type">exn</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >May occur as last event in a stream to describe an error</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span id="TYPEELTevent.E_end_of_stream"><span class="constructor">E_end_of_stream</span></span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >If the text can be parsed without error, this event is the last event of the stream</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> The type of XML events. In event mode, the parser emits a stream of these events. 
The parser already checks that certain structural properties are met:<ul> <li>Start and end tags (including those of the super root) are properly nested</li> <li>Start and end tags of elements are in the same entity</li> </ul> If a whole document is parsed (entry <code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_document</span></code>), the events of the text are surrounded by <code class="code"><span class="constructor">E_start_doc</span></code> and <code class="code"><span class="constructor">E_end_doc</span></code>, i.e. the overall structure is: <p> <ul> <li><code class="code"><span class="constructor">E_start_doc</span></code></li> <li>Now the elements (or the super root)</li> <li><code class="code"><span class="constructor">E_end_doc</span></code></li> <li><code class="code"><span class="constructor">E_error</span></code> or <code class="code"><span class="constructor">E_end_of_stream</span></code></li> </ul> For the entries <code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_content</span></code> and <code class="code"><span class="keywordsign">`</span><span class="constructor">Entry_expr</span></code> the document events are left out. The final <code class="code"><span class="constructor">E_error</span></code> or <code class="code"><span class="constructor">E_end_of_stream</span></code> event is nevertheless emitted.<br> </div> </body></html>