



========================
 The Docutils Publisher
========================

:Author: David Goodger
:Date: $Date: 2009-11-30 09:10:35 +0100 (Mon, 30 Nov 2009) $
:Revision: $Revision: 6204 $
:Copyright: This document has been placed in the public domain.

.. contents::

The ``docutils.core.Publisher`` class is the core of Docutils,
managing all the processing and relationships between components.  See
`PEP 258`_ for an overview of Docutils components.

The ``docutils.core.publish_*`` convenience functions are the normal
entry points for using Docutils as a library.

See `Inside A Docutils Command-Line Front-End Tool`_ for an overview
of a typical Docutils front-end tool, including how the Publisher
class is used.

.. _PEP 258: ../peps/pep-0258.html
.. _Inside A Docutils Command-Line Front-End Tool: ./cmdline-tool.html

Publisher Convenience Functions
===============================

Each of these functions sets up a ``docutils.core.Publisher`` object,
then calls its ``publish`` method.  ``docutils.core.Publisher.publish``
handles everything else.  There are several convenience functions in
the ``docutils.core`` module:

:_`publish_cmdline`: for command-line front-end tools.  There are
  several examples in the ``tools/`` directory.  A detailed analysis
  of one such tool is in `Inside A Docutils Command-Line Front-End
  Tool`_.

:_`publish_file`: for programmatic use with file-like I/O.  In
  addition to writing the encoded output to a file, also returns the
  encoded output as a string.

:_`publish_string`: for programmatic use with string I/O.  Returns
  the encoded output as a string.

:_`publish_parts`: for programmatic use with string input; returns a
  dictionary of document parts.  Dictionary keys are the names of
  parts, and values are Unicode strings; encoding is up to the client.
  Useful when only portions of the processed document are desired.
  See `publish_parts Details`_ below.

  There are usage examples in the `docutils/`_ module.

:_`publish_doctree`: for programmatic use with string input; returns a
  Docutils document tree data structure (doctree).  The doctree can be
  modified, pickled & unpickled, etc., and then reprocessed with
  `publish_from_doctree`_.

:_`publish_from_doctree`: for programmatic use to render from an
  existing document tree data structure (doctree); returns the encoded
  output as a string.

:_`publish_programmatically`: for custom programmatic use.  This
  function implements common code and is used by ``publish_file``,
  ``publish_string``, and ``publish_parts``.  It returns a 2-tuple:
  the encoded string output and the Publisher object.

.. _Inside A Docutils Command-Line Front-End Tool: ./cmdline-tool.html
.. _docutils/: ../../docutils/
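
As a minimal sketch of how the string-based functions listed above fit
together (assuming Docutils is installed; the sample source text and
the variable names are illustrative only), the following renders a
short document in one step with ``publish_string``, then in two steps
with ``publish_doctree`` and ``publish_from_doctree``::

    from docutils import nodes
    from docutils.core import (publish_doctree, publish_from_doctree,
                               publish_string)

    # Illustrative input; any reStructuredText string would do.
    source = "Hello, *Docutils*!\n"

    # Ask for unencoded (text) output instead of an encoded byte string.
    overrides = {'output_encoding': 'unicode'}

    # One step: parse the source and render it with the HTML writer.
    html = publish_string(source, writer_name='html',
                          settings_overrides=overrides)

    # Two steps: parse to a doctree first, then render that doctree.
    doctree = publish_doctree(source)
    html_from_doctree = publish_from_doctree(doctree, writer_name='html',
                                             settings_overrides=overrides)

The two-step form is the one to reach for when the doctree needs to be
inspected or modified between parsing and writing.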


To pass application-specific setting defaults to the Publisher
convenience functions, use the ``settings_overrides`` parameter.  Pass
a dictionary of setting names & values, like this::

    overrides = {'input_encoding': 'ascii',
                 'output_encoding': 'latin-1'}
    output = publish_string(..., settings_overrides=overrides)

Settings from command-line options override configuration file
settings, and they override application defaults.  For details, see
`Docutils Runtime Settings`_.  See `Docutils Configuration Files`_ for
details about individual settings.

.. _Docutils Runtime Settings: ./runtime-settings.html
.. _Docutils Configuration Files: ../user/tools.html


The default output encoding of Docutils is UTF-8.  If your input text
contains any non-ASCII characters, you may have to do a bit more setup.
Docutils may introduce some non-ASCII text if you use
`auto-symbol footnotes`_ or the `"contents" directive`_.
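
As a hedged illustration (assuming Docutils is installed; the sample
text is made up, and the special ``'unicode'`` value for
``output_encoding`` comes from the Docutils configuration
documentation, not from this document), the output encoding can be
chosen per call through ``settings_overrides``::

    from docutils.core import publish_string

    source = "Café au lait\n"   # a text string with one non-ASCII character

    # Encoded output: the result is a byte string in the requested encoding.
    latin1 = publish_string(
        source, settings_overrides={'output_encoding': 'latin-1'})

    # Unencoded output: 'unicode' asks for a text string instead.
    text = publish_string(
        source, settings_overrides={'output_encoding': 'unicode'})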

.. _auto-symbol footnotes:
.. _"contents" directive:

``publish_parts`` Details
=========================

The ``docutils.core.publish_parts`` convenience function returns a
dictionary of document parts.  Dictionary keys are the names of parts,
and values are Unicode strings.

Each Writer component may publish a different set of document parts,
described below.  Not all writers implement all parts.

Parts Provided By All Writers
-----------------------------

    ``parts['encoding']`` contains the output encoding setting.

    ``parts['version']`` contains the version of Docutils used.

    ``parts['whole']`` contains the entire formatted document.
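
A short sketch of reading these common parts (assuming Docutils is
installed; the input string and the choice of the HTML writer are
illustrative only)::

    import docutils
    from docutils.core import publish_parts

    parts = publish_parts("A *short* paragraph.\n", writer_name='html')

    encoding = parts['encoding']   # the output encoding setting
    version = parts['version']     # the version of Docutils in use
    whole = parts['whole']         # the entire formatted document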

.. _HTML writer:

Parts Provided By the HTML Writer
---------------------------------

    ``parts['body']`` is equivalent to ``parts['fragment']``.  It is
    *not* equivalent to ``parts['html_body']``.

    ``parts['body_prefix']`` contains::

        <div class="document" ...>

    and, if applicable::

        <div class="header">

    ``parts['body_pre_docinfo']`` contains (as applicable)::

        <h1 class="title">...</h1>
        <h2 class="subtitle" id="...">...</h2>

    ``parts['body_suffix']`` contains::

        </div>

    (the end-tag for ``<div class="document">``), the footer division
    if applicable::

        <div class="footer">



    ``parts['docinfo']`` contains the document bibliographic data, the
    docinfo field list rendered as a table.

    ``parts['footer']`` contains the document footer content, meant to
    appear at the bottom of a web page, or repeated at the bottom of
    every printed page.

    ``parts['fragment']`` contains the document body (*not* the HTML
    ``<body>``).  In other words, it contains the entire document,
    less the document title, subtitle, docinfo, header, and footer.

    ``parts['head']`` contains ``<meta ... />`` tags and the document
    ``<title>...</title>``.

    ``parts['head_prefix']`` contains the XML declaration, the DOCTYPE
    declaration, the ``<html ...>`` start tag and the ``<head>`` start
    tag.

    ``parts['header']`` contains the document header content, meant to
    appear at the top of a web page, or repeated at the top of every
    printed page.

    ``parts['html_body']`` contains the HTML ``<body>`` content, less
    the ``<body>`` and ``</body>`` tags themselves.

    ``parts['html_head']`` contains the HTML ``<head>`` content, less
    the stylesheet link and the ``<head>`` and ``</head>`` tags
    themselves.  Since ``publish_parts`` returns Unicode strings and
    does not know about the output encoding, the "Content-Type" meta
    tag's "charset" value is left unresolved, as "%s"::

        <meta http-equiv="Content-Type" content="text/html; charset=%s" />

    The interpolation should be done by client code.

    ``parts['html_prolog']`` contains the XML declaration and the
    doctype declaration.  The XML declaration's "encoding" attribute's
    value is left unresolved, as "%s"::

        <?xml version="1.0" encoding="%s" ?>

    The interpolation should be done by client code.

    ``parts['html_subtitle']`` contains the document subtitle,
    including the enclosing ``<h2 class="subtitle">`` & ``</h2>``
    tags.

    ``parts['html_title']`` contains the document title, including the
    enclosing ``<h1 class="title">`` & ``</h1>`` tags.

    ``parts['meta']`` contains all ``<meta ... />`` tags.

    ``parts['stylesheet']`` contains the embedded stylesheet or
    stylesheet link.

    ``parts['subtitle']`` contains the document subtitle text and any
    inline markup.  It does not include the enclosing ``<h2>`` &
    ``</h2>`` tags.

    ``parts['title']`` contains the document title text and any inline
    markup.  It does not include the enclosing ``<h1>`` & ``</h1>``
    tags.
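
The HTML parts described above might be used as in the following
sketch.  It assumes Docutils is installed.  The sample source, the
``fill_encoding`` helper, and the variable names are hypothetical, and
the two templates are copied from the descriptions above and not taken
from a live run::

    from docutils.core import publish_parts

    source = """\
    Sample Title
    ============

    Body text with *emphasis*.
    """.replace("\n    ", "\n")   # strip the indentation used in this listing

    parts = publish_parts(source, writer_name='html')

    title = parts['title']         # title text, without the <h1> tags
    fragment = parts['fragment']   # body, without title, header, or footer


    def fill_encoding(template, encoding):
        """Interpolate the output encoding into a "%s" placeholder."""
        return template % encoding


    meta_template = ('<meta http-equiv="Content-Type" '
                     'content="text/html; charset=%s" />')
    xml_template = '<?xml version="1.0" encoding="%s" ?>'

    meta_tag = fill_encoding(meta_template, 'utf-8')
    xml_declaration = fill_encoding(xml_template, 'utf-8')

Plain ``%`` interpolation fails if a part contains other literal ``%``
characters; in that case a targeted ``str.replace('%s', encoding)``
may be a safer choice for client code.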

Parts Provided by the PEP/HTML Writer
-------------------------------------

The PEP/HTML writer provides the same parts as the `HTML writer`_,
plus the following:

    ``parts['pepnum']`` contains

Parts Provided by the S5/HTML Writer
------------------------------------

The S5/HTML writer provides the same parts as the `HTML writer`_.

Parts Provided by the LaTeX2e Writer
------------------------------------

    ``parts['abstract']`` contains the formatted content of the
    'abstract' docinfo field.

    ``parts['body']`` contains the document's content. In other words, it
    contains the entire document, except the document title, subtitle, and
    docinfo.

    This part can be included into another LaTeX document body using the
    ``\input{}`` command.

    ``parts['body_prefix']`` contains the LaTeX ``\begin{document}``.

    ``parts['body_pre_docinfo']`` contains the title (and possibly subtitle)
    setup and the ``\maketitle`` command.

    With ``--use-latex-docinfo``, it also contains the 'author',
    'organization', 'contact', 'address' and 'date' docinfo items.

    ``parts['body_suffix']`` contains the LaTeX ``\end{document}``.

    ``parts['dedication']`` contains the formatted content of the
    'dedication' docinfo field.

    ``parts['docinfo']`` contains the document bibliographic data, the
    docinfo field list rendered as a table.

    With ``--use-latex-docinfo``, the 'author', 'organization', 'contact',
    'address' and 'date' information is moved to the title metadata
    (included in ``parts['body_pre_docinfo']``).

    'dedication' and 'abstract' are always moved to separate parts.

    ``parts['fallbacks']`` contains fallback definitions for
    Docutils-specific commands and environments.

    ``parts['head_prefix']`` contains the declaration of
    documentclass and document options.

    ``parts['latex_preamble']`` contains the argument of the
    ``--latex-preamble`` option.

    ``parts['pdfsetup']`` contains the PDF properties
    ("hyperref" package setup).

    ``parts['requirements']`` contains required packages and setup
    before the stylesheet inclusion.

    ``parts['stylesheet']`` contains the embedded stylesheet(s) or
    stylesheet loading command(s).

    ``parts['subtitle']`` contains the document subtitle text and any
    inline markup.

    ``parts['title']`` contains the document title text and any inline