<!DOCTYPE html>

<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="generator" content="Docutils 0.8.1: http://docutils.sourceforge.net/">
<title>Genshi: Markup Streams</title>
<link rel="stylesheet" href="common/style/edgewall.css" type="text/css">
</head>
<body>
<div class="document" id="markup-streams">
    <div id="navigation">
      <span class="projinfo">Genshi 0.7</span>
      <a href="index.html">Documentation Index</a>
    </div>
<h1 class="title">Markup Streams</h1>
<p>A stream is the common representation of markup as a <em>stream of events</em>.</p>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="auto-toc simple">
<li><a class="reference internal" href="#basics" id="id3">1   Basics</a></li>
<li><a class="reference internal" href="#filtering" id="id4">2   Filtering</a></li>
<li><a class="reference internal" href="#serialization" id="id5">3   Serialization</a><ul class="auto-toc">
<li><a class="reference internal" href="#id1" id="id6">3.1   Serialization Methods</a></li>
<li><a class="reference internal" href="#serialization-options" id="id7">3.2   Serialization Options</a></li>
</ul>
</li>
<li><a class="reference internal" href="#using-xpath" id="id8">4   Using XPath</a></li>
<li><a class="reference internal" href="#id2" id="id9">5   Event Kinds</a><ul class="auto-toc">
<li><a class="reference internal" href="#start" id="id10">5.1   START</a></li>
<li><a class="reference internal" href="#end" id="id11">5.2   END</a></li>
<li><a class="reference internal" href="#text" id="id12">5.3   TEXT</a></li>
<li><a class="reference internal" href="#start-ns" id="id13">5.4   START_NS</a></li>
<li><a class="reference internal" href="#end-ns" id="id14">5.5   END_NS</a></li>
<li><a class="reference internal" href="#doctype" id="id15">5.6   DOCTYPE</a></li>
<li><a class="reference internal" href="#comment" id="id16">5.7   COMMENT</a></li>
<li><a class="reference internal" href="#pi" id="id17">5.8   PI</a></li>
<li><a class="reference internal" href="#start-cdata" id="id18">5.9   START_CDATA</a></li>
<li><a class="reference internal" href="#end-cdata" id="id19">5.10   END_CDATA</a></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="basics">
<h1>1   Basics</h1>
<p>A stream can be obtained in a number of ways. It can be:</p>
<ul class="simple">
<li>the result of parsing XML or HTML text, or</li>
<li>the result of selecting a subset of another stream using XPath, or</li>
<li>programmatically generated.</li>
</ul>
<p>For example, the functions <tt class="docutils literal">XML()</tt> and <tt class="docutils literal">HTML()</tt> can be used to convert
literal XML or HTML text to a markup stream:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">genshi</span> <span class="kn">import</span> <span class="n">XML</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">stream</span> <span class="o">=</span> <span class="n">XML</span><span class="p">(</span><span class="s">'&lt;p class="intro"&gt;Some text and '</span>
<span class="gp">... </span>             <span class="s">'&lt;a href="http://example.org/"&gt;a link&lt;/a&gt;.'</span>
<span class="gp">... </span>             <span class="s">'&lt;br/&gt;&lt;/p&gt;'</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">stream</span>
<span class="go">&lt;genshi.core.Stream object at ...&gt;</span>
</pre></div>
<p>The stream is the result of parsing the text into events. Each event is a tuple
of the form <tt class="docutils literal">(kind, data, pos)</tt>, where:</p>
<ul class="simple">
<li><tt class="docutils literal">kind</tt> defines what kind of event it is (such as the start of an element,
text, a comment, etc).</li>
<li><tt class="docutils literal">data</tt> is the actual data associated with the event. How this looks depends
on the event kind (see <a class="reference internal" href="#event-kinds">event kinds</a>).</li>
<li><tt class="docutils literal">pos</tt> is a <tt class="docutils literal">(filename, lineno, column)</tt> tuple that describes where the
event “comes from”.</li>
</ul>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="k">for</span> <span class="n">kind</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">pos</span> <span class="ow">in</span> <span class="n">stream</span><span class="p">:</span>
<span class="gp">... </span>    <span class="k">print</span><span class="p">(</span><span class="s">'</span><span class="si">%s</span><span class="s"> </span><span class="si">%r</span><span class="s"> </span><span class="si">%r</span><span class="s">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">kind</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">pos</span><span class="p">))</span>
<span class="gp">...</span>
<span class="go">START (QName('p'), Attrs([(QName('class'), u'intro')])) (None, 1, 0)</span>
<span class="go">TEXT u'Some text and ' (None, 1, 17)</span>
<span class="go">START (QName('a'), Attrs([(QName('href'), u'http://example.org/')])) (None, 1, 31)</span>
<span class="go">TEXT u'a link' (None, 1, 61)</span>
<span class="go">END QName('a') (None, 1, 67)</span>
<span class="go">TEXT u'.' (None, 1, 71)</span>
<span class="go">START (QName('br'), Attrs()) (None, 1, 72)</span>
<span class="go">END QName('br') (None, 1, 77)</span>
<span class="go">END QName('p') (None, 1, 77)</span>
</pre></div>
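<p>As a minimal sketch of consuming such events, the following collects the text
content of a stream. It relies only on the <tt class="docutils literal">(kind, data, pos)</tt> tuple shape
described above. The events are written out by hand as plain tuples, with strings
standing in for the kind constants and for <tt class="docutils literal">QName</tt> and <tt class="docutils literal">Attrs</tt>, so it runs
without Genshi installed and is not the library's own code:</p>

```python
# Stand-ins for the Genshi event kind constants (assumption for illustration).
START, END, TEXT = 'START', 'END', 'TEXT'

# A hand-written event list mirroring part of the output shown above;
# QName and Attrs are simplified to plain strings and lists here.
events = [
    (START, ('p', [('class', 'intro')]), (None, 1, 0)),
    (TEXT, 'Some text and ', (None, 1, 17)),
    (START, ('a', [('href', 'http://example.org/')]), (None, 1, 31)),
    (TEXT, 'a link', (None, 1, 61)),
    (END, 'a', (None, 1, 67)),
    (TEXT, '.', (None, 1, 71)),
    (END, 'p', (None, 1, 77)),
]


def text_of(stream):
    """Concatenate the data of all TEXT events in the stream."""
    return ''.join(data for kind, data, pos in stream if kind == TEXT)


print(text_of(events))
```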
</div>
<div class="section" id="filtering">
<h1>2   Filtering</h1>
<p>One important feature of markup streams is that you can apply <em>filters</em> to the
stream, either filters that come with Genshi, or your own custom filters.</p>
<p>A filter is simply a callable that accepts the stream as parameter, and returns
the filtered stream:</p>
<div class="highlight"><pre><span class="k">def</span> <span class="nf">noop</span><span class="p">(</span><span class="n">stream</span><span class="p">):</span>
    <span class="sd">"""A filter that doesn't actually do anything with the stream."""</span>
    <span class="k">for</span> <span class="n">kind</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">pos</span> <span class="ow">in</span> <span class="n">stream</span><span class="p">:</span>
        <span class="k">yield</span> <span class="n">kind</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">pos</span>
</pre></div>
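<p>A filter that actually changes something follows the same pattern. The sketch
below upper-cases the data of text events and passes everything else through. The
name <tt class="docutils literal">upper_text</tt> is hypothetical, and a plain string stands in for the
<tt class="docutils literal">TEXT</tt> kind constant so that the example runs without Genshi:</p>

```python
TEXT = 'TEXT'  # stand-in for the Genshi TEXT kind constant (assumption)


def upper_text(stream):
    """A filter that upper-cases the data of TEXT events."""
    for kind, data, pos in stream:
        if kind == TEXT:
            # Yield a new event tuple rather than modifying the old one.
            yield kind, data.upper(), pos
        else:
            yield kind, data, pos


# Hand-written events; tag names and attributes are simplified.
events = [
    ('START', ('em', []), (None, 1, 0)),
    (TEXT, 'hello', (None, 1, 4)),
    ('END', 'em', (None, 1, 9)),
]
result = list(upper_text(events))
```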
<p>Filters can be applied in a number of ways. The simplest is to just call the
filter directly:</p>
<div class="highlight"><pre><span class="n">stream</span> <span class="o">=</span> <span class="n">noop</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span>
</pre></div>
<p>The <tt class="docutils literal">Stream</tt> class also provides a <tt class="docutils literal">filter()</tt> method, which takes an
arbitrary number of filter callables and applies them all:</p>
<div class="highlight"><pre><span class="n">stream</span> <span class="o">=</span> <span class="n">stream</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">noop</span><span class="p">)</span>
</pre></div>
<p>Finally, filters can also be applied using the <em>bitwise or</em> operator (<tt class="docutils literal">|</tt>),
which allows a syntax similar to pipes on Unix shells:</p>
<div class="highlight"><pre><span class="n">stream</span> <span class="o">=</span> <span class="n">stream</span> <span class="o">|</span> <span class="n">noop</span>
</pre></div>
<p>One example of a filter included with Genshi is the <tt class="docutils literal">HTMLSanitizer</tt> in
<tt class="docutils literal">genshi.filters</tt>. It processes a stream of HTML markup, and strips out any
potentially dangerous constructs, such as JavaScript event handlers.
<tt class="docutils literal">HTMLSanitizer</tt> is not a function, but rather a class that implements
<tt class="docutils literal">__call__</tt>, which means instances of the class are callable:</p>
<div class="highlight"><pre><span class="n">stream</span> <span class="o">=</span> <span class="n">stream</span> <span class="o">|</span> <span class="n">HTMLSanitizer</span><span class="p">()</span>
</pre></div>
<p>Both the <tt class="docutils literal">filter()</tt> method and the pipe operator allow easy chaining of
filters:</p>
<div class="highlight"><pre><span class="kn">from</span> <span class="nn">genshi.filters</span> <span class="kn">import</span> <span class="n">HTMLSanitizer</span>
<span class="n">stream</span> <span class="o">=</span> <span class="n">stream</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">noop</span><span class="p">,</span> <span class="n">HTMLSanitizer</span><span class="p">())</span>
</pre></div>
<p>That is equivalent to:</p>
<div class="highlight"><pre><span class="n">stream</span> <span class="o">=</span> <span class="n">stream</span> <span class="o">|</span> <span class="n">noop</span> <span class="o">|</span> <span class="n">HTMLSanitizer</span><span class="p">()</span>
</pre></div>
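<p>One way such an equivalence can be achieved is by implementing <tt class="docutils literal">__or__</tt> on the
stream class and building <tt class="docutils literal">filter()</tt> on top of it. The following is a sketch
under that assumption: <tt class="docutils literal">MiniStream</tt>, <tt class="docutils literal">drop_comments</tt> and <tt class="docutils literal">strip_text</tt> are
hypothetical names, and the class is a simplified stand-in rather than Genshi's
<tt class="docutils literal">Stream</tt>:</p>

```python
class MiniStream(object):
    """Hypothetical stand-in for a stream class; not Genshi's Stream."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        # "stream | f" calls f(stream) and wraps the result again,
        # so that further filters can be chained onto it.
        return MiniStream(function(self))

    def filter(self, *filters):
        """Apply each filter in turn, left to right."""
        stream = self
        for function in filters:
            stream = stream | function
        return stream


def drop_comments(stream):
    """Filter that removes COMMENT events."""
    for kind, data, pos in stream:
        if kind != 'COMMENT':
            yield kind, data, pos


def strip_text(stream):
    """Filter that strips surrounding whitespace from TEXT events."""
    for kind, data, pos in stream:
        if kind == 'TEXT':
            data = data.strip()
        yield kind, data, pos


events = [('COMMENT', 'x', None), ('TEXT', '  hi  ', None)]
piped = list(MiniStream(events) | drop_comments | strip_text)
chained = list(MiniStream(events).filter(drop_comments, strip_text))
```

<p>Both spellings apply the filters in the same left-to-right order, so
<tt class="docutils literal">piped</tt> and <tt class="docutils literal">chained</tt> hold the same events.</p>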
<p>For more information about the built-in filters, see <a class="reference external" href="filters.html">Stream Filters</a>.</p>
</div>
<div class="section" id="serialization">
<h1>3   Serialization</h1>
<p>Serialization means producing some kind of textual output from a stream of
events, which you'll need when you want to transmit or store the results of
generating or otherwise processing markup.</p>
<p>The <tt class="docutils literal">Stream</tt> class provides two methods for serialization: <tt class="docutils literal">serialize()</tt>
and <tt class="docutils literal">render()</tt>. The former is a generator that yields chunks of <tt class="docutils literal">Markup</tt>
objects (which are basically unicode strings that are considered safe for
output on the web). The latter returns a single string, by default UTF-8
encoded.</p>
<p>Here's the output from <tt class="docutils literal">serialize()</tt>:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="k">for</span> <span class="n">output</span> <span class="ow">in</span> <span class="n">stream</span><span class="o">.</span><span class="n">serialize</span><span class="p">():</span>
<span class="gp">... </span>    <span class="k">print</span><span class="p">(</span><span class="nb">repr</span><span class="p">(</span><span class="n">output</span><span class="p">))</span>
<span class="gp">...</span>
<span class="go">&lt;Markup u'&lt;p class="intro"&gt;'&gt;</span>
<span class="go">&lt;Markup u'Some text and '&gt;</span>
<span class="go">&lt;Markup u'&lt;a href="http://example.org/"&gt;'&gt;</span>
<span class="go">&lt;Markup u'a link'&gt;</span>
<span class="go">&lt;Markup u'&lt;/a&gt;'&gt;</span>
<span class="go">&lt;Markup u'.'&gt;</span>
<span class="go">&lt;Markup u'&lt;br/&gt;'&gt;</span>
<span class="go">&lt;Markup u'&lt;/p&gt;'&gt;</span>
</pre></div>
<p>And here's the output from <tt class="docutils literal">render()</tt>:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">stream</span><span class="o">.</span><span class="n">render</span><span class="p">())</span>
<span class="go">&lt;p class="intro"&gt;Some text and &lt;a href="http://example.org/"&gt;a link&lt;/a&gt;.&lt;br/&gt;&lt;/p&gt;</span>
</pre></div>
<p>Both methods can be passed a <tt class="docutils literal">method</tt> parameter that determines how exactly
the events are serialized to text. This parameter can be either a string or a
custom serializer class:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">stream</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="s">'html'</span><span class="p">))</span>
<span class="go">&lt;p class="intro"&gt;Some text and &lt;a href="http://example.org/"&gt;a link&lt;/a&gt;.&lt;br&gt;&lt;/p&gt;</span>
</pre></div>
<p>Note how the <tt class="docutils literal">&lt;br&gt;</tt> element isn't closed, which is the right thing to do for
HTML. See <a class="reference internal" href="#serialization-methods">serialization methods</a> for more details.</p>
<p>In addition, the <tt class="docutils literal">render()</tt> method takes an <tt class="docutils literal">encoding</tt> parameter, which
defaults to “UTF-8”. If set to <tt class="docutils literal">None</tt>, the result will be a unicode string.</p>
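<p>The relationship between the two methods can be sketched with a toy serializer:
one generator that yields a chunk per event, and one function that joins the
chunks and optionally encodes them. This is an illustration only. The functions
below are hypothetical stand-ins, operate on plain tuples, and skip escaping and
everything else a real serializer has to do:</p>

```python
def serialize(events):
    """Yield one chunk of text per event (toy output, no escaping)."""
    for kind, data, pos in events:
        if kind == 'START':
            name, attrs = data
            attr_text = ''.join(' %s="%s"' % (k, v) for k, v in attrs)
            yield '<%s%s>' % (name, attr_text)
        elif kind == 'END':
            yield '</%s>' % data
        elif kind == 'TEXT':
            yield data


def render(events, encoding='utf-8'):
    """Join all chunks; encode the result unless encoding is None."""
    text = ''.join(serialize(events))
    if encoding is None:
        return text
    return text.encode(encoding)


events = [
    ('START', ('p', [('class', 'intro')]), None),
    ('TEXT', u'caf\u00e9', None),
    ('END', 'p', None),
]
chunks = list(serialize(events))
as_text = render(events, encoding=None)
as_bytes = render(events)
```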
<p>The different serializer classes in <tt class="docutils literal">genshi.output</tt> can also be used
directly:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">genshi.filters</span> <span class="kn">import</span> <span class="n">HTMLSanitizer</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">genshi.output</span> <span class="kn">import</span> <span class="n">TextSerializer</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="s">''</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">TextSerializer</span><span class="p">()(</span><span class="n">HTMLSanitizer</span><span class="p">()(</span><span class="n">stream</span><span class="p">))))</span>
<span class="go">Some text and a link.</span>
</pre></div>
<p>The pipe operator allows a nicer syntax:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">stream</span> <span class="o">|</span> <span class="n">HTMLSanitizer</span><span class="p">()</span> <span class="o">|</span> <span class="n">TextSerializer</span><span class="p">())</span>
<span class="go">Some text and a link.</span>
</pre></div>
<div class="section" id="id1">
<span id="serialization-methods"></span><h2>3.1   Serialization Methods</h2>
<p>Genshi supports different serialization methods for creating a text
representation of a markup stream.</p>
<dl class="docutils">
<dt><tt class="docutils literal">xml</tt></dt>
<dd>The <tt class="docutils literal">XMLSerializer</tt> is the default serialization method and results in
proper XML output including namespace support, the XML declaration, CDATA
sections, and so on. It is generally not suitable for serving HTML or
XHTML web pages (unless you want to use true XHTML 1.1), for which the
<tt class="docutils literal">xhtml</tt> and <tt class="docutils literal">html</tt> serializers described below should be preferred.</dd>
<dt><tt class="docutils literal">xhtml</tt></dt>
<dd><p class="first">The <tt class="docutils literal">XHTMLSerializer</tt> is a specialization of the generic <tt class="docutils literal">XMLSerializer</tt>
that understands the peculiarities of producing XML-compliant output that can
also be parsed without problems by the HTML parsers found in modern web
browsers. Thus, the output of this serializer should be usable whether sent
as "text/html" or "application/xhtml+xml" (although there are a lot of
subtle issues to pay attention to when switching between the two, in
particular with respect to differences in the DOM and CSS).</p>
<p>For example, instead of rendering a script tag as <tt class="docutils literal">&lt;script/&gt;</tt> (which
confuses the HTML parser in many browsers), it will produce
<tt class="docutils literal"><span class="pre">&lt;script&gt;&lt;/script&gt;</span></tt>. Also, it will normalize any boolean attribute values
that are minimized in HTML, so that for example <tt class="docutils literal">&lt;hr <span class="pre">noshade="1"/&gt;</span></tt>
becomes <tt class="docutils literal">&lt;hr <span class="pre">noshade="noshade"</span> /&gt;</tt>.</p>
<p class="last">This serializer supports the use of namespaces for compound documents, for
example to use inline SVG inside an XHTML document.</p>
</dd>
<dt><tt class="docutils literal">html</tt></dt>
<dd>The <tt class="docutils literal">HTMLSerializer</tt> produces proper HTML markup. The main differences
compared to <tt class="docutils literal">xhtml</tt> serialization are that boolean attributes are
minimized, empty tags are not self-closing (so it's <tt class="docutils literal">&lt;br&gt;</tt> instead of
<tt class="docutils literal">&lt;br /&gt;</tt>), and that the contents of <tt class="docutils literal">&lt;script&gt;</tt> and <tt class="docutils literal">&lt;style&gt;</tt> elements
are not escaped.</dd>
<dt><tt class="docutils literal">text</tt></dt>
<dd>The <tt class="docutils literal">TextSerializer</tt> produces plain text from markup streams. This is
useful primarily for <a class="reference external" href="text-templates.html">text templates</a>, but can also be used to produce
plain text output from markup templates or other sources.</dd>
</dl>
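<p>The difference between the <tt class="docutils literal">xhtml</tt> and <tt class="docutils literal">html</tt> methods for elements such as
<tt class="docutils literal">&lt;br&gt;</tt> and <tt class="docutils literal">&lt;hr&gt;</tt> can be illustrated with a small sketch. The function
<tt class="docutils literal">void_tag</tt> is hypothetical and only mimics the two behaviours described above
(tag closing and boolean attributes). Using <tt class="docutils literal">True</tt> to mark a boolean attribute
is a convention of this sketch, not of Genshi:</p>

```python
def void_tag(name, attrs, method):
    """Render an element without content for 'html' or 'xhtml'.

    `attrs` is a list of (name, value) pairs; a value of True marks a
    boolean attribute (a convention of this sketch only).
    """
    if method not in ('html', 'xhtml'):
        raise ValueError('unsupported method: %r' % (method,))
    parts = []
    for key, value in attrs:
        if value is True and method == 'html':
            parts.append(' %s' % key)              # minimized form
        elif value is True:
            parts.append(' %s="%s"' % (key, key))  # expanded form
        else:
            parts.append(' %s="%s"' % (key, value))
    closing = '>' if method == 'html' else ' />'
    return '<%s%s%s' % (name, ''.join(parts), closing)


print(void_tag('hr', [('noshade', True)], 'xhtml'))
print(void_tag('hr', [('noshade', True)], 'html'))
```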
</div>
<div class="section" id="serialization-options">
<h2>3.2   Serialization Options</h2>
<p>Both <tt class="docutils literal">serialize()</tt> and <tt class="docutils literal">render()</tt> support additional keyword arguments that
are passed through to the initializer of the serializer class. The following
options are supported by the built-in serializers:</p>
<dl class="docutils">
<dt><tt class="docutils literal">strip_whitespace</tt></dt>
<dd><p class="first">Whether the serializer should remove trailing spaces and empty lines.
Defaults to <tt class="docutils literal">True</tt>.</p>
<p class="last">(This option is not available for serialization to plain text.)</p>
</dd>
<dt><tt class="docutils literal">doctype</tt></dt>
<dd><p class="first">A <tt class="docutils literal">(name, pubid, sysid)</tt> tuple defining the name, public identifier, and
system identifier of a <tt class="docutils literal">DOCTYPE</tt> declaration to prepend to the generated
output. If provided, this declaration will override any <tt class="docutils literal">DOCTYPE</tt>
declaration in the stream.</p>
<p>The parameter can also be specified as a string to refer to commonly used
doctypes:</p>
<table border="1" class="docutils">
<colgroup>
<col width="40%">
<col width="60%">
</colgroup>
<thead valign="bottom">
<tr><th class="head">Shorthand</th>
<th class="head">DOCTYPE</th>
</tr>
</thead>
<tbody valign="top">
<tr><td><tt class="docutils literal">html</tt> or
<tt class="docutils literal"><span class="pre">html-strict</span></tt></td>
<td>HTML 4.01 Strict</td>
</tr>
<tr><td><tt class="docutils literal"><span class="pre">html-transitional</span></tt></td>
<td>HTML 4.01 Transitional</td>
</tr>
<tr><td><tt class="docutils literal"><span class="pre">html-frameset</span></tt></td>
<td>HTML 4.01 Frameset</td>
</tr>
<tr><td><tt class="docutils literal">html5</tt></td>
<td>DOCTYPE proposed for the work-in-progress
HTML5 standard</td>
</tr>
<tr><td><tt class="docutils literal">xhtml</tt> or
<tt class="docutils literal"><span class="pre">xhtml-strict</span></tt></td>
<td>XHTML 1.0 Strict</td>
</tr>
<tr><td><tt class="docutils literal"><span class="pre">xhtml-transitional</span></tt></td>
<td>XHTML 1.0 Transitional</td>
</tr>
<tr><td><tt class="docutils literal"><span class="pre">xhtml-frameset</span></tt></td>
<td>XHTML 1.0 Frameset</td>
</tr>
<tr><td><tt class="docutils literal">xhtml11</tt></td>
<td>XHTML 1.1</td>
</tr>
<tr><td><tt class="docutils literal">svg</tt> or <tt class="docutils literal"><span class="pre">svg-full</span></tt></td>
<td>SVG 1.1</td>
</tr>
<tr><td><tt class="docutils literal"><span class="pre">svg-basic</span></tt></td>
<td>SVG 1.1 Basic</td>
</tr>
<tr><td><tt class="docutils literal"><span class="pre">svg-tiny</span></tt></td>
<td>SVG 1.1 Tiny</td>
</tr>
</tbody>
</table>
<p class="last">(This option is not available for serialization to plain text.)</p>
</dd>
<dt><tt class="docutils literal">namespace_prefixes</tt></dt>
<dd><p class="first">The namespace prefixes to use for namespaces that are not bound to a prefix
in the stream itself.</p>
<p class="last">(This option is not available for serialization to HTML or plain text.)</p>
</dd>
<dt><tt class="docutils literal">drop_xml_decl</tt></dt>
<dd><p class="first">Whether to remove the XML declaration (the <tt class="docutils literal"><span class="pre">&lt;?xml</span> <span class="pre">?&gt;</span></tt> part at the
beginning of a document) when serializing. This defaults to <tt class="docutils literal">True</tt> as an
XML declaration throws some older browsers into "Quirks" rendering mode.</p>
<p class="last">(This option is only available for serialization to XHTML.)</p>
</dd>
<dt><tt class="docutils literal">strip_markup</tt></dt>
<dd><p class="first">Whether the text serializer should detect and remove any tags or entity
encoded characters in the text.</p>
<p class="last">(This option is only available for serialization to plain text.)</p>
</dd>
</dl>
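<p>To make the two accepted forms of the <tt class="docutils literal">doctype</tt> option concrete, here is a
sketch of how a shorthand string or a <tt class="docutils literal">(name, pubid, sysid)</tt> tuple could be
turned into a declaration. The names <tt class="docutils literal">DOCTYPES</tt> and <tt class="docutils literal">doctype_declaration</tt>
are hypothetical, only three of the shorthands from the table are included, and
the code is not taken from Genshi:</p>

```python
# A few shorthands mapped to (name, pubid, sysid) tuples. Only three of
# the shorthands from the table are included in this sketch.
DOCTYPES = {
    'html': ('html', '-//W3C//DTD HTML 4.01//EN',
             'http://www.w3.org/TR/html4/strict.dtd'),
    'html5': ('html', None, None),
    'xhtml': ('html', '-//W3C//DTD XHTML 1.0 Strict//EN',
              'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'),
}


def doctype_declaration(doctype):
    """Build a DOCTYPE declaration from a shorthand or a tuple."""
    if isinstance(doctype, str):
        doctype = DOCTYPES[doctype]
    name, pubid, sysid = doctype
    parts = ['<!DOCTYPE %s' % name]
    if pubid:
        parts.append(' PUBLIC "%s"' % pubid)
    elif sysid:
        parts.append(' SYSTEM')
    if sysid:
        parts.append(' "%s"' % sysid)
    parts.append('>')
    return ''.join(parts)


print(doctype_declaration('html5'))
print(doctype_declaration('xhtml'))
```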
</div>
</div>
<div class="section" id="using-xpath">
<h1>4   Using XPath</h1>
<p>XPath can be used to extract a specific subset of the stream via the
<tt class="docutils literal">select()</tt> method:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="n">substream</span> <span class="o">=</span> <span class="n">stream</span><span class="o">.</span><span class="n">select</span><span class="p">(</span><span class="s">'a'</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">substream</span>
<span class="go">&lt;genshi.core.Stream object at ...&gt;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">substream</span><span class="p">)</span>
<span class="go">&lt;a href="http://example.org/"&gt;a link&lt;/a&gt;</span>
</pre></div>
<p>Often, streams cannot be reused: in the above example, the sub-stream is based
on a generator. Once it has been serialized, it will have been fully consumed,
and cannot be rendered again. To work around this, you can wrap such a stream
in a <tt class="docutils literal">list</tt>:</p>
<div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">genshi</span> <span class="kn">import</span> <span class="n">Stream</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">substream</span> <span class="o">=</span> <span class="n">Stream</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">stream</span><span class="o">.</span><span class="n">select</span><span class="p">(</span><span class="s">'a'</span><span class="p">)))</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">substream</span>
<span class="go">&lt;genshi.core.Stream object at ...&gt;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">substream</span><span class="p">)</span>
<span class="go">&lt;a href="http://example.org/"&gt;a link&lt;/a&gt;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">substream</span><span class="o">.</span><span class="n">select</span><span class="p">(</span><span class="s">'@href'</span><span class="p">))</span>
<span class="go">http://example.org/</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">substream</span><span class="o">.</span><span class="n">select</span><span class="p">(</span><span class="s">'text()'</span><span class="p">))</span>
<span class="go">a link</span>
</pre></div>
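<p>The underlying behaviour is that of any Python generator, and can be shown
without Genshi. In this sketch, <tt class="docutils literal">select_text</tt> is a hypothetical stand-in for a
selection that is produced lazily:</p>

```python
def select_text(events):
    """Generator yielding only TEXT events (stand-in for a selection)."""
    for kind, data, pos in events:
        if kind == 'TEXT':
            yield kind, data, pos


events = [
    ('START', ('a', []), None),
    ('TEXT', 'a link', None),
    ('END', 'a', None),
]

lazy = select_text(events)
first_pass = list(lazy)    # consumes the generator
second_pass = list(lazy)   # nothing is left to yield

buffered = list(select_text(events))  # events are now held in memory
again_1 = list(buffered)
again_2 = list(buffered)              # a list can be iterated repeatedly
```

<p>Buffering trades memory for reusability, so it is best reserved for
sub-streams that really are processed more than once.</p>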
<p>See <a class="reference external" href="xpath.html">Using XPath in Genshi</a> for more information about the XPath support in
Genshi.</p>
</div>
<div class="section" id="id2">
<span id="event-kinds"></span><h1>5   Event Kinds</h1>
<p>Every event in a stream is of one of several <em>kinds</em>, which also determines
what the <tt class="docutils literal">data</tt> item of the event tuple looks like. The different kinds of
events are documented below.</p>
<div class="note">
<p class="first admonition-title">Note</p>
<p class="last">The <tt class="docutils literal">data</tt> item is generally immutable. If the data is to be
modified when processing a stream, it must be replaced by a new tuple.
Effectively, this means the entire event tuple is immutable.</p>
</div>
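<p>In practice this means a filter builds a new <tt class="docutils literal">data</tt> value and yields a new event
tuple instead of changing the one it received. The sketch below does this for
the <tt class="docutils literal">(tagname, attrs)</tt> data of start events. It uses plain tuples and a string
stand-in for the <tt class="docutils literal">START</tt> constant, and <tt class="docutils literal">add_class</tt> is a hypothetical name:</p>

```python
START = 'START'  # stand-in for the Genshi START kind constant (assumption)


def add_class(stream, tagname, value):
    """Add a class attribute to every START event for `tagname`."""
    for kind, data, pos in stream:
        if kind == START and data[0] == tagname:
            name, attrs = data
            # Build new tuples; the incoming event is left untouched.
            data = (name, attrs + (('class', value),))
        yield kind, data, pos


original = (START, ('p', ()), (None, 1, 0))
changed = next(add_class([original], 'p', 'intro'))
```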
<div class="section" id="start">
<h2>5.1   START</h2>
<p>The opening tag of an element.</p>
<p>For this kind of event, the <tt class="docutils literal">data</tt> item is a tuple of the form
<tt class="docutils literal">(tagname, attrs)</tt>, where <tt class="docutils literal">tagname</tt> is a <tt class="docutils literal">QName</tt> instance describing the
qualified name of the tag, and <tt class="docutils literal">attrs</tt> is an <tt class="docutils literal">Attrs</tt> instance containing
the attribute names and values associated with the tag (excluding namespace
declarations):</p>
<div class="highlight"><pre><span class="n">START</span><span class="p">,</span> <span class="p">(</span><span class="n">QName</span><span class="p">(</span><span class="s">'p'</span><span class="p">),</span> <span class="n">Attrs</span><span class="p">([(</span><span class="n">QName</span><span class="p">(</span><span class="s">'class'</span><span class="p">),</span> <span class="s">u'intro'</span><span class="p">)])),</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="end">
<h2>5.2   END</h2>
<p>The closing tag of an element.</p>
<p>The <tt class="docutils literal">data</tt> item of end events consists of just a <tt class="docutils literal">QName</tt> instance
describing the qualified name of the tag:</p>
<div class="highlight"><pre><span class="n">END</span><span class="p">,</span> <span class="n">QName</span><span class="p">(</span><span class="s">'p'</span><span class="p">),</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="text">
<h2>5.3   TEXT</h2>
<p>Character data outside of elements and comments.</p>
<p>For text events, the <tt class="docutils literal">data</tt> item should be a unicode object:</p>
<div class="highlight"><pre><span class="n">TEXT</span><span class="p">,</span> <span class="s">u'Hello, world!'</span><span class="p">,</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="start-ns">
<h2>5.4   START_NS</h2>
<p>The start of a namespace mapping, binding a namespace prefix to a URI.</p>
<p>The <tt class="docutils literal">data</tt> item of this kind of event is a tuple of the form
<tt class="docutils literal">(prefix, uri)</tt>, where <tt class="docutils literal">prefix</tt> is the namespace prefix and <tt class="docutils literal">uri</tt> is the
full URI to which the prefix is bound. Both should be unicode objects. If the
namespace is not bound to any prefix, the <tt class="docutils literal">prefix</tt> item is an empty string:</p>
<div class="highlight"><pre><span class="n">START_NS</span><span class="p">,</span> <span class="p">(</span><span class="s">u'svg'</span><span class="p">,</span> <span class="s">u'http://www.w3.org/2000/svg'</span><span class="p">),</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="end-ns">
<h2>5.5   END_NS</h2>
<p>The end of a namespace mapping.</p>
<p>The <tt class="docutils literal">data</tt> item of such events consists of only the namespace prefix (a
unicode object):</p>
<div class="highlight"><pre><span class="n">END_NS</span><span class="p">,</span> <span class="s">u'svg'</span><span class="p">,</span> <span class="n">pos</span>
</pre></div>
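<p>Because the two event kinds bracket the region in which a prefix is bound, a
filter can keep track of the active bindings while it iterates. The following
sketch assumes the data shapes described above and uses plain tuples. The name
<tt class="docutils literal">active_namespaces</tt> is hypothetical:</p>

```python
def active_namespaces(stream):
    """Yield (event, bindings) pairs; bindings maps prefix to URI."""
    stacks = {}  # prefix -> list of URIs, innermost binding last
    for event in stream:
        kind, data, pos = event
        if kind == 'START_NS':
            prefix, uri = data
            stacks.setdefault(prefix, []).append(uri)
        elif kind == 'END_NS':
            # For END_NS the data item is just the prefix.
            stacks[data].pop()
            if not stacks[data]:
                del stacks[data]
        # Snapshot of the innermost binding for each prefix.
        yield event, dict((p, uris[-1]) for p, uris in stacks.items())


SVG = 'http://www.w3.org/2000/svg'
events = [
    ('START_NS', ('svg', SVG), None),
    ('START', ('rect', []), None),
    ('END', 'rect', None),
    ('END_NS', 'svg', None),
]
snapshots = [bindings for event, bindings in active_namespaces(events)]
```

<p>The binding is visible from the <tt class="docutils literal">START_NS</tt> event onwards and is gone again in
the snapshot paired with the matching <tt class="docutils literal">END_NS</tt> event.</p>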
</div>
<div class="section" id="doctype">
<h2>5.6   DOCTYPE</h2>
<p>A document type declaration.</p>
<p>For this type of event, the <tt class="docutils literal">data</tt> item is a tuple of the form
<tt class="docutils literal">(name, pubid, sysid)</tt>, where <tt class="docutils literal">name</tt> is the name of the root element,
<tt class="docutils literal">pubid</tt> is the public identifier of the DTD (or <tt class="docutils literal">None</tt>), and <tt class="docutils literal">sysid</tt> is
the system identifier of the DTD (or <tt class="docutils literal">None</tt>):</p>
<div class="highlight"><pre><span class="n">DOCTYPE</span><span class="p">,</span> <span class="p">(</span><span class="s">u'html'</span><span class="p">,</span> <span class="s">u'-//W3C//DTD XHTML 1.0 Transitional//EN'</span><span class="p">,</span> \
          <span class="s">u'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'</span><span class="p">),</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="comment">
<h2>5.7   COMMENT</h2>
<p>A comment.</p>
<p>For such events, the <tt class="docutils literal">data</tt> item is a unicode object containing all character
data between the comment delimiters:</p>
<div class="highlight"><pre><span class="n">COMMENT</span><span class="p">,</span> <span class="s">u'Commented out'</span><span class="p">,</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="pi">
<h2>5.8   PI</h2>
<p>A processing instruction.</p>
<p>The <tt class="docutils literal">data</tt> item is a tuple of the form <tt class="docutils literal">(target, data)</tt> for processing
instructions, where <tt class="docutils literal">target</tt> is the target of the PI (used to identify the
application by which the instruction should be processed), and <tt class="docutils literal">data</tt> is text
following the target (excluding the terminating question mark):</p>
<div class="highlight"><pre><span class="n">PI</span><span class="p">,</span> <span class="p">(</span><span class="s">u'php'</span><span class="p">,</span> <span class="s">u'echo "Yo" '</span><span class="p">),</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="start-cdata">
<h2>5.9   START_CDATA</h2>
<p>Marks the beginning of a <tt class="docutils literal">CDATA</tt> section.</p>
<p>The <tt class="docutils literal">data</tt> item for such events is always <tt class="docutils literal">None</tt>:</p>
<div class="highlight"><pre><span class="n">START_CDATA</span><span class="p">,</span> <span class="bp">None</span><span class="p">,</span> <span class="n">pos</span>
</pre></div>
</div>
<div class="section" id="end-cdata">
<h2>5.10   END_CDATA</h2>
<p>Marks the end of a <tt class="docutils literal">CDATA</tt> section.</p>
<p>The <tt class="docutils literal">data</tt> item for such events is always <tt class="docutils literal">None</tt>:</p>
<div class="highlight"><pre><span class="n">END_CDATA</span><span class="p">,</span> <span class="bp">None</span><span class="p">,</span> <span class="n">pos</span>
</pre></div>
</div>
</div>
    <div id="footer">
      Visit the Genshi open source project at
      <a href="http://genshi.edgewall.org/">http://genshi.edgewall.org/</a>
    </div>
  </div>
</body>
</html>