<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Changes in version 3.0 — feedparser 5.1.3 documentation</title> <link rel="stylesheet" href="_static/default.css" type="text/css" /> <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="_static/feedparser.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '', VERSION: '5.1.3', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="_static/jquery.js"></script> <script type="text/javascript" src="_static/underscore.js"></script> <script type="text/javascript" src="_static/doctools.js"></script> <link rel="top" title="feedparser 5.1.3 documentation" href="index.html" /> <link rel="up" title="Revision history" href="history.html" /> <link rel="next" title="Changes in version 2.7.x" href="changes-27.html" /> <link rel="prev" title="Changes in version 3.0.1" href="changes-301.html" /> </head> <body> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="changes-27.html" title="Changes in version 2.7.x" accesskey="N">next</a> |</li> <li class="right" > <a href="changes-301.html" title="Changes in version 3.0.1" accesskey="P">previous</a> |</li> <li><a href="index.html">feedparser 5.1.3 documentation</a> »</li> <li><a href="history.html" accesskey="U">Revision history</a> »</li> </ul> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="changes-in-version-3-0"> <h1>Changes in version 3.0<a class="headerlink" href="#changes-in-version-3-0" title="Permalink to this 
headline">¶</a></h1> <p><strong class="program">Universal Feed Parser</strong> 3.0 was released on June 21, 2004.</p> <ul class="simple"> <li>don’t try <tt class="docutils literal"><span class="pre">iso-8859-1</span></tt> (can’t distinguish between <tt class="docutils literal"><span class="pre">iso-8859-1</span></tt> and <tt class="docutils literal"><span class="pre">windows-1252</span></tt> anyway, and most incorrectly marked feeds are <tt class="docutils literal"><span class="pre">windows-1252</span></tt>)</li> <li>fixed regression that could cause the same encoding to be tried twice (even if it failed the first time)</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0fc3 was released on June 18, 2004.</p> <ul class="simple"> <li>fixed bug in <tt class="docutils literal"><span class="pre">_changeEncodingDeclaration</span></tt> that failed to parse UTF-16 encoded feeds</li> <li>made <tt class="docutils literal"><span class="pre">source</span></tt> into a FeedParserDict</li> <li>duplicate admin:generatorAgent/@rdf:resource in <tt class="docutils literal"><span class="pre">generator_detail.url</span></tt></li> <li>added support for image</li> <li>refactored <tt class="docutils literal"><span class="pre">parse()</span></tt> fallback logic to try other encodings if SAX parsing fails (previously it would only try other encodings if re-encoding failed)</li> <li>remove <tt class="docutils literal"><span class="pre">unichr</span></tt> madness in normalize_attrs now that we’re properly tracking encoding in and out of BaseHTMLProcessor</li> <li>set <tt class="docutils literal"><span class="pre">feed.language</span></tt> from root-level xml:lang</li> <li>set <tt class="docutils literal"><span class="pre">entry.id</span></tt> from rdf:about</li> <li>send <tt class="docutils literal"><span class="pre">Accept</span></tt> header</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0fc2 was released on May 10, 2004.</p> <ul class="simple"> 
<li>added and passed Sam’s amp tests</li> <li>added and passed my blink tag tests</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0fc1 was released on April 23, 2004.</p> <ul class="simple"> <li>made <tt class="docutils literal"><span class="pre">results.entries[0].links[0]</span></tt> and <tt class="docutils literal"><span class="pre">results.entries[0].enclosures[0]</span></tt> into FeedParserDict</li> <li>fixed typo that could cause the same encoding to be tried twice (even if it failed the first time)</li> <li>fixed DOCTYPE stripping when DOCTYPE contained entity declarations</li> <li>better textinput and image tracking in illformed <abbr title="Rich Site Summary">RSS</abbr> 1.0 feeds</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b23 was released on April 21, 2004.</p> <ul class="simple"> <li>fixed <tt class="docutils literal"><span class="pre">UnicodeDecodeError</span></tt> for feeds that contain high-bit characters in attributes in embedded <abbr title="HyperText Markup Language">HTML</abbr> in description (thanks Thijs van de Vossen)</li> <li>moved <tt class="docutils literal"><span class="pre">guid</span></tt>, <tt class="docutils literal"><span class="pre">date</span></tt>, and <tt class="docutils literal"><span class="pre">date_parsed</span></tt> to mapped keys in FeedParserDict</li> <li>tweaked FeedParserDict.has_key to return <tt class="docutils literal"><span class="pre">True</span></tt> if asking about a mapped key</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b22 was released on April 19, 2004.</p> <ul class="simple"> <li>changed <tt class="docutils literal"><span class="pre">channel</span></tt> to <tt class="docutils literal"><span class="pre">feed</span></tt>, <tt class="docutils literal"><span class="pre">item</span></tt> to <tt class="docutils literal"><span class="pre">entries</span></tt> in <tt class="docutils literal"><span class="pre">results</span></tt> dict</li> 
<li>changed <tt class="docutils literal"><span class="pre">results</span></tt> dict to allow getting values with <tt class="docutils literal"><span class="pre">results.key</span></tt> as well as <tt class="docutils literal"><span class="pre">results[key]</span></tt></li> <li>work around embedded illformed <abbr title="HyperText Markup Language">HTML</abbr> with half a DOCTYPE</li> <li>work around malformed <tt class="docutils literal"><span class="pre">Content-Type</span></tt> header</li> <li>if character encoding is wrong, try several common ones before falling back to regexes (if this works, <tt class="docutils literal"><span class="pre">bozo_exception</span></tt> is set to <tt class="docutils literal"><span class="pre">CharacterEncodingOverride</span></tt>)</li> <li>fixed character encoding issues in BaseHTMLProcessor by tracking encoding and converting from Unicode to raw strings before feeding data to sgmllib.SGMLParser</li> <li>convert each value in results to Unicode (if possible), even if using regex-based parsing</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b21 was released on April 14, 2004.</p> <ul class="simple"> <li>added Hot RSS support</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b20 was released on April 7, 2004.</p> <ul class="simple"> <li>added <abbr title="Channel Definition Format">CDF</abbr> support</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b19 was released on March 15, 2004.</p> <ul class="simple"> <li>fixed bug exploding author information when author name was in parentheses</li> <li>removed ultra-problematic <tt class="file docutils literal"><span class="pre">mxTidy</span></tt> support</li> <li>patch to work around crash in PyXML/expat when encountering invalid entities (MarkMoraes)</li> <li>support for textinput/textInput</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b18 was released on February 17, 2004.</p> <ul class="simple"> 
<li>always map description to <tt class="docutils literal"><span class="pre">summary_detail</span></tt> (Andrei)</li> <li>use <tt class="file docutils literal"><span class="pre">libxml2</span></tt> (if available)</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b17 was released on February 13, 2004.</p> <ul class="simple"> <li>determine character encoding as per <a class="reference external" href="http://www.ietf.org/rfc/rfc3023.txt">RFC 3023</a></li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b16 was released on February 12, 2004.</p> <ul class="simple"> <li>fixed support for <abbr title="Rich Site Summary">RSS</abbr> 0.90 (broken in b15)</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b15 was released on February 11, 2004.</p> <ul class="simple"> <li>fixed bug resolving relative links in wfw:commentRSS</li> <li>fixed bug capturing author and contributor <abbr title="Uniform Resource Identifier">URI</abbr></li> <li>fixed bug resolving relative links in author and contributor <abbr title="Uniform Resource Identifier">URI</abbr></li> <li>fixed bug resolving relative links in generator <abbr title="Uniform Resource Identifier">URI</abbr></li> <li>added support for recognizing <abbr title="Rich Site Summary">RSS</abbr> 1.0</li> <li>passed Simon Fell’s namespace tests, and included them permanently in the test suite with his permission</li> <li>fixed namespace handling under <strong class="program">Python</strong> 2.1</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b14 was released on February 8, 2004.</p> <ul class="simple"> <li>fixed CDATA handling in non-wellformed feeds under <strong class="program">Python</strong> 2.1</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b13 was released on February 8, 2004.</p> <ul class="simple"> <li>better handling of empty <abbr title="HyperText Markup Language">HTML</abbr> tags (br, hr, img, etc.) 
in embedded markup, in either <abbr title="HyperText Markup Language">HTML</abbr> or <abbr title="Extensible HyperText Markup Language">XHTML</abbr> form (&lt;br&gt;, &lt;br/&gt;, &lt;br /&gt;)</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b12 was released on February 6, 2004.</p> <ul class="simple"> <li>fiddled with <tt class="docutils literal"><span class="pre">decodeEntities</span></tt> (still not right)</li> <li>added support for Atom 0.2 subtitle</li> <li>added support for Atom content model in copyright</li> <li>better sanitizing of dangerous <abbr title="HyperText Markup Language">HTML</abbr> elements with end tags (script, frameset)</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b11 was released on February 2, 2004.</p> <ul class="simple"> <li>added rights to list of elements that can contain dangerous markup</li> <li>fiddled with <tt class="docutils literal"><span class="pre">decodeEntities</span></tt> (not right)</li> <li>liberalized date parsing even further</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b10 was released on January 31, 2004.</p> <ul class="simple"> <li>incorporated ISO-8601 date parsing routines from <tt class="file docutils literal"><span class="pre">xml.util.iso8601</span></tt></li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b9 was released on January 29, 2004.</p> <ul class="simple"> <li>fixed check for presence of <tt class="docutils literal"><span class="pre">dict</span></tt> function</li> <li>added support for summary</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b8 was released on January 28, 2004.</p> <ul class="simple"> <li>added support for contributor</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b7 was released on January 28, 2004.</p> <ul class="simple"> <li>support Atom-style author element in <tt class="docutils literal"><span class="pre">author_detail</span></tt> (dictionary of <tt 
class="docutils literal"><span class="pre">name</span></tt>, <tt class="docutils literal"><span class="pre">url</span></tt>, <tt class="docutils literal"><span class="pre">email</span></tt>)</li> <li>map <tt class="docutils literal"><span class="pre">author</span></tt> to <tt class="docutils literal"><span class="pre">author_detail</span></tt> if <tt class="docutils literal"><span class="pre">author</span></tt> contains name + email address</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b6 was released on January 27, 2004.</p> <ul class="simple"> <li>added feed type and version detection, <tt class="docutils literal"><span class="pre">result['version']</span></tt> will be one of <tt class="docutils literal"><span class="pre">SUPPORTED_VERSIONS.keys()</span></tt> or empty string if unrecognized</li> <li>added support for creativeCommons:license and cc:license</li> <li>added support for full Atom content model in title, tagline, info, copyright, summary</li> <li>fixed bug with gzip encoding (not always telling server we support it when we do)</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b5 was released on January 26, 2004.</p> <ul class="simple"> <li>fixed bug parsing multiple links at feed level</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b4 was released on January 26, 2004.</p> <ul class="simple"> <li>fixed xml:lang inheritance</li> <li>fixed multiple bugs tracking xml:base <abbr title="Uniform Resource Identifier">URI</abbr>, one for documents that don’t define one explicitly and one for documents that define an outer and an inner xml:base that goes out of scope before the end of the document</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b3 was released on January 23, 2004.</p> <ul class="simple"> <li>parse entire feed with real <abbr title="Extensible Markup Language">XML</abbr> parser (if available)</li> <li>added several new supported namespaces</li> 
<li>fixed bug tracking naked markup in description</li> <li>added support for enclosure</li> <li>added support for source</li> <li>re-added support for cloud which got dropped somehow</li> <li>added support for expirationDate</li> </ul> <p><strong class="program">Universal Feed Parser</strong> 3.0b2 and 3.0b1 have been lost in the mists of time.</p> </div> </div> </div> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h4>Previous topic</h4> <p class="topless"><a href="changes-301.html" title="previous chapter">Changes in version 3.0.1</a></p> <h4>Next topic</h4> <p class="topless"><a href="changes-27.html" title="next chapter">Changes in version 2.7.x</a></p> <h3>This Page</h3> <ul class="this-page-menu"> <li><a href="_sources/changes-30.txt" rel="nofollow">Show Source</a></li> </ul> <div id="searchbox" style="display: none"> <h3>Quick search</h3> <form class="search" action="search.html" method="get"> <input type="text" name="q" /> <input type="submit" value="Go" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> <p class="searchtip" style="font-size: 90%"> Enter search terms or a module, class or function name. </p> </div> <script type="text/javascript">$('#searchbox').show(0);</script> </div> </div> <div class="clearer"></div> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="genindex.html" title="General Index" >index</a></li> <li class="right" > <a href="changes-27.html" title="Changes in version 2.7.x" >next</a> |</li> <li class="right" > <a href="changes-301.html" title="Changes in version 3.0.1" >previous</a> |</li> <li><a href="index.html">feedparser 5.1.3 documentation</a> »</li> <li><a href="history.html" >Revision history</a> »</li> </ul> </div> <div class="footer"> © Copyright 2004-2008 Mark Pilgrim, 2010-2012 Kurt McKee. Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3. 
</div> </body> </html>