<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!--Rendered using the Haskell Html Library v0.2-->
<HTML
><HEAD
><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"
><TITLE
>Text.XML.HXT.Parser.HtmlParser</TITLE
><LINK HREF="haddock.css" REL="stylesheet" TYPE="text/css"
><SCRIPT SRC="haddock.js" TYPE="text/javascript"
></SCRIPT
></HEAD
><BODY
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="topbar"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD
><IMG SRC="haskell_icon.gif" WIDTH="16" HEIGHT="16" ALT=" "
></TD
><TD CLASS="title"
>hxt-7.1: </TD
><TD CLASS="topbut"
><A HREF="index.html"
>Contents</A
></TD
><TD CLASS="topbut"
><A HREF="doc-index.html"
>Index</A
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="modulebar"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD
><FONT SIZE="6"
>Text.XML.HXT.Parser.HtmlParser</FONT
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
>Description</TD
></TR
><TR
><TD CLASS="doc"
><P
>HTML Parser
</P
><P
>Version : $Id: HtmlParser.hs,v 1.4 2006/11/12 14:53:00 hxml Exp $
</P
><P
>This parser tries to interpret everything as HTML;
 no errors are emitted during parsing. If something looks
 odd, warning messages are inserted into the document tree.
</P
><P
>This module contains state filters for easy parsing and error handling;
 the real work is done in <TT
><A HREF="Text-XML-HXT-Parser.html#t%3AHtmlParsec"
>HtmlParsec</A
></TT
>
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
>Synopsis</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="body"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="decl"
><A HREF="#v%3AgetHtmlDoc"
>getHtmlDoc</A
> :: <A HREF="Text-XML-HXT-DOM-XmlState.html#t%3AXmlStateFilter"
>XmlStateFilter</A
> state</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3AparseHtmlDoc"
>parseHtmlDoc</A
> :: <A HREF="Text-XML-HXT-DOM-XmlState.html#t%3AXmlStateFilter"
>XmlStateFilter</A
> a</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3ArunHtmlParser"
>runHtmlParser</A
> :: <A HREF="Text-XML-HXT-DOM-XmlState.html#t%3AXmlStateFilter"
>XmlStateFilter</A
> a</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
>module <A HREF="Text-XML-HXT-Parser-HtmlParsec.html"
>Text.XML.HXT.Parser.HtmlParsec</A
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
>Documentation</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v%3AgetHtmlDoc"
></A
><B
>getHtmlDoc</B
> :: <A HREF="Text-XML-HXT-DOM-XmlState.html#t%3AXmlStateFilter"
>XmlStateFilter</A
> state</TD
></TR
><TR
><TD CLASS="doc"
><P
>Reads a document and parses it with <TT
><A HREF="Text-XML-HXT-Parser-HtmlParser.html#v%3AparseHtmlDoc"
>parseHtmlDoc</A
></TT
>. This is the main entry point of this module.
</P
><P
>The input tree must be a root tree, as in <TT>Text.XML.HXT.Parser.MainFunctions.getXmlDoc</TT>. The content is read with <TT
><A HREF="Text-XML-HXT-Parser-XmlInput.html#v%3AgetXmlContents"
>getXmlContents</A
></TT
>,
 is parsed with <TT
><A HREF="Text-XML-HXT-Parser-HtmlParser.html#v%3AparseHtmlDoc"
>parseHtmlDoc</A
></TT
> and canonicalized (character references are substituted in content and attributes,
 but comments are preserved).
</P
><P
>See also: <TT
><A HREF="Text-XML-HXT-Parser-DTDProcessing.html#v%3AgetWellformedDoc"
>getWellformedDoc</A
></TT
>
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v%3AparseHtmlDoc"
></A
><B
>parseHtmlDoc</B
> :: <A HREF="Text-XML-HXT-DOM-XmlState.html#t%3AXmlStateFilter"
>XmlStateFilter</A
> a</TD
></TR
><TR
><TD CLASS="doc"
><P
>The HTML parsing filter
</P
><P
>The input is parsed with <TT
><A HREF="Text-XML-HXT-Parser-HtmlParser.html#v%3ArunHtmlParser"
>runHtmlParser</A
></TT
>; everything is interpreted as HTML.
 If errors occur, the parser tries to take some meaningful action and continues
 parsing. Afterwards the entity references defined for XHTML are resolved;
 any unresolved reference is transformed into plain text.
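</P
><P
>The entity resolution step can be illustrated with a minimal, self-contained sketch.
 It does not use the HXT API: the <TT>Piece</TT> type, the three-entry
 <TT>xhtmlEntities</TT> table and the <TT>resolve</TT> function are hypothetical
 names chosen for illustration only, and the real XHTML entity set is much larger.
</P
><PRE
>module Main where

import qualified Data.Map as Map

-- Hypothetical token type: a run of text or a named entity reference.
data Piece = Text String | EntityRef String
  deriving (Eq, Show)

-- Tiny stand-in for the XHTML entity table (assumption: three entries only).
xhtmlEntities :: Map.Map String Char
xhtmlEntities = Map.fromList
  [ ("nbsp", '\160'), ("copy", '\169'), ("eacute", '\233') ]

-- Known references become their character; unknown ones fall back to
-- their literal source text. '\38' is the ampersand character.
resolve :: Piece -> Piece
resolve (EntityRef name) =
  case Map.lookup name xhtmlEntities of
    Just c  -> Text [c]
    Nothing -> Text ("\38" ++ name ++ ";")
resolve piece = piece

main :: IO ()
main = mapM_ (print . resolve)
  [ Text "caf", EntityRef "eacute", EntityRef "bogus" ]
</PRE
><P
>In this sketch the known reference <TT>eacute</TT> becomes a one-character text piece,
 while the unknown reference <TT>bogus</TT> is kept as plain text in its original
 written form.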
</P
><P
>Error messages produced
 during parsing and entity resolution are added as warning nodes to the resulting tree.
</P
><P
>The warnings are issued if the first parameter, noWarnings, is set to True;
 afterwards all warning nodes are removed from the resulting tree.
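</P
><P
>The idea of collecting warning nodes and then removing them from the result can be
 sketched as follows. This is a simplified model under stated assumptions, not the
 HXT implementation: <TT>Node</TT> and <TT>extractWarnings</TT> are hypothetical names,
 and a flat list of nodes stands in for the document tree.
</P
><PRE
>module Main where

-- Hypothetical node type: ordinary content or an embedded warning.
data Node = Content String | Warning String
  deriving (Eq, Show)

-- Split a flat node list into warning messages and the remaining nodes,
-- preserving the original order of both.
extractWarnings :: [Node] -> ([String], [Node])
extractWarnings = foldr step ([], [])
  where
    step (Warning msg) (msgs, rest) = (msg : msgs, rest)
    step node          (msgs, rest) = (msgs, node : rest)

main :: IO ()
main = do
  let parsed = [ Content "p", Warning "unexpected closing tag", Content "b" ]
      (msgs, cleaned) = extractWarnings parsed
  mapM_ putStrLn msgs   -- report the collected warnings
  print cleaned         -- the nodes with all warnings removed
</PRE
><P
>Keeping warnings inside the parse result lets the parser continue after a problem;
 the caller decides later whether to report them before they are stripped.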
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v%3ArunHtmlParser"
></A
><B
>runHtmlParser</B
> :: <A HREF="Text-XML-HXT-DOM-XmlState.html#t%3AXmlStateFilter"
>XmlStateFilter</A
> a</TD
></TR
><TR
><TD CLASS="doc"
>The pure HTML parser, usually called via <TT
><A HREF="Text-XML-HXT-Parser-HtmlParser.html#v%3AparseHtmlDoc"
>parseHtmlDoc</A
></TT
>.
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
>module <A HREF="Text-XML-HXT-Parser-HtmlParsec.html"
>Text.XML.HXT.Parser.HtmlParsec</A
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="botbar"
>Produced by <A HREF="http://www.haskell.org/haddock/"
>Haddock</A
> version 0.8</TD
></TR
></TABLE
></BODY
></HTML
>