<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Microformats — feedparser 5.1.3 documentation</title> <link rel="stylesheet" href="_static/default.css" type="text/css" /> <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="_static/feedparser.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '', VERSION: '5.1.3', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="_static/jquery.js"></script> <script type="text/javascript" src="_static/underscore.js"></script> <script type="text/javascript" src="_static/doctools.js"></script> <link rel="top" title="feedparser 5.1.3 documentation" href="index.html" /> <link rel="next" title="Reference" href="reference.html" /> <link rel="prev" title="Changes in earlier versions" href="changes-early.html" /> </head> <body> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="reference.html" title="Reference" accesskey="N">next</a> |</li> <li class="right" > <a href="changes-early.html" title="Changes in earlier versions" accesskey="P">previous</a> |</li> <li><a href="index.html">feedparser 5.1.3 documentation</a> »</li> </ul> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="microformats"> <span id="advanced-microformats"></span><h1>Microformats<a class="headerlink" href="#microformats" title="Permalink to this headline">¶</a></h1> <p>An emerging trend in feed syndication is the inclusion of <a class="reference external" 
href="http://microformats.org/">microformats</a>. In addition to the semantics defined by individual feed formats, publishers can add further semantics using rel and class attributes in embedded <abbr title="HyperText Markup Language">HTML</abbr> content.</p> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">To parse microformats, <strong class="program">Universal Feed Parser</strong> relies on a third-party library called <a class="reference external" href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a>, which is distributed separately. If Beautiful Soup is not installed, <strong class="program">Universal Feed Parser</strong> silently skips microformat parsing.</p> </div> <p>The following elements are parsed for microformats:</p> <ul class="simple"> <li><a class="reference internal" href="reference-entry-summary_detail.html#reference-entry-summary-detail-value"><em>entries[i].summary_detail.value</em></a></li> <li><a class="reference internal" href="reference-entry-content.html#reference-entry-content-value"><em>entries[i].content[j].value</em></a></li> </ul> <div class="section" id="rel-enclosure"> <span id="advanced-microformats-relenclosure"></span><h2>rel=enclosure<a class="headerlink" href="#rel-enclosure" title="Permalink to this headline">¶</a></h2> <p>The <a class="reference external" href="http://microformats.org/wiki/rel-enclosure">rel=enclosure</a> microformat provides a way for embedded <abbr title="HyperText Markup Language">HTML</abbr> content to specify that a certain link should be treated as an <a class="reference internal" href="reference-entry-enclosures.html#reference-entry-enclosures"><em>enclosure</em></a>. 
<strong class="program">Universal Feed Parser</strong> looks for links within embedded markup that meet any of the following conditions:</p> <ul class="simple"> <li>rel attribute contains enclosure (note: rel attributes can contain a list of space-separated values)</li> <li>type attribute starts with audio/</li> <li>type attribute starts with video/</li> <li>type attribute starts with application/ but does not end with xml</li> <li>href attribute ends with one of the following file extensions: <tt class="file docutils literal"><span class="pre">.7z</span></tt>, <tt class="file docutils literal"><span class="pre">.avi</span></tt>, <tt class="file docutils literal"><span class="pre">.bin</span></tt>, <tt class="file docutils literal"><span class="pre">.bz2</span></tt>, <tt class="file docutils literal"><span class="pre">.deb</span></tt>, <tt class="file docutils literal"><span class="pre">.dmg</span></tt>, <tt class="file docutils literal"><span class="pre">.exe</span></tt>, <tt class="file docutils literal"><span class="pre">.gz</span></tt>, <tt class="file docutils literal"><span class="pre">.hqx</span></tt>, <tt class="file docutils literal"><span class="pre">.img</span></tt>, <tt class="file docutils literal"><span class="pre">.iso</span></tt>, <tt class="file docutils literal"><span class="pre">.jar</span></tt>, <tt class="file docutils literal"><span class="pre">.m4a</span></tt>, <tt class="file docutils literal"><span class="pre">.m4v</span></tt>, <tt class="file docutils literal"><span class="pre">.mp2</span></tt>, <tt class="file docutils literal"><span class="pre">.mp3</span></tt>, <tt class="file docutils literal"><span class="pre">.mp4</span></tt>, <tt class="file docutils literal"><span class="pre">.msi</span></tt>, <tt class="file docutils literal"><span class="pre">.ogg</span></tt>, <tt class="file docutils literal"><span class="pre">.ogm</span></tt>, <tt class="file docutils 
literal"><span class="pre">.rar</span></tt>, <tt class="file docutils literal"><span class="pre">.rpm</span></tt>, <tt class="file docutils literal"><span class="pre">.sit</span></tt>, <tt class="file docutils literal"><span class="pre">.sitx</span></tt>, <tt class="file docutils literal"><span class="pre">.tar</span></tt>, <tt class="file docutils literal"><span class="pre">.tbz2</span></tt>, <tt class="file docutils literal"><span class="pre">.tgz</span></tt>, <tt class="file docutils literal"><span class="pre">.wma</span></tt>, <tt class="file docutils literal"><span class="pre">.wmv</span></tt>, <tt class="file docutils literal"><span class="pre">.z</span></tt>, <tt class="file docutils literal"><span class="pre">.zip</span></tt></li> </ul> <p>When <strong class="program">Universal Feed Parser</strong> finds a link that satisfies any of these conditions, it adds it to <a class="reference internal" href="reference-entry-enclosures.html#reference-entry-enclosures"><em>entries[i].enclosures</em></a>.</p> <p class="rubric">Parsing embedded enclosures</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">feedparser</span> <span class="gp">>>> </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">'http://feedparser.org/docs/examples/rel-enclosure.xml'</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">enclosures</span> <span class="go">[{u'href': u'http://example.com/movie.mp4', 'title': u'awesome movie'}]</span> </pre></div> </div> </div> <div class="section" id="rel-tag"> <span id="advanced-microformats-reltag"></span><h2>rel=tag<a class="headerlink" 
href="#rel-tag" title="Permalink to this headline">¶</a></h2> <p>The <a class="reference external" href="http://microformats.org/wiki/rel-tag">rel=tag</a> microformat allows you to define <a class="reference internal" href="reference-entry-tags.html#reference-entry-tags"><em>tags</em></a> within embedded <abbr title="HyperText Markup Language">HTML</abbr> content. <strong class="program">Universal Feed Parser</strong> looks for these attribute values in embedded markup and maps them to <a class="reference internal" href="reference-entry-tags.html#reference-entry-tags"><em>entries[i].tags</em></a>.</p> <p class="rubric">Parsing embedded tags</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">feedparser</span> <span class="gp">>>> </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">'http://feedparser.org/docs/examples/rel-tag.xml'</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">tags</span> <span class="go">[{'term': u'tech', 'scheme': u'http://del.icio.us/tag/', 'label': u'Technology'}]</span> </pre></div> </div> </div> <div class="section" id="xfn"> <span id="advanced-microformats-xfn"></span><h2><abbr title="XHTML Friends Network">XFN</abbr><a class="headerlink" href="#xfn" title="Permalink to this headline">¶</a></h2> <p>The <a class="reference external" href="http://microformats.org/wiki/XFN">XFN</a> microformat allows you to define human relationships between <abbr title="Uniform Resource Identifier">URI</abbr>s. 
For example, you could link from your weblog to your spouse’s weblog with the <tt class="docutils literal"><span class="pre">rel="spouse"</span></tt> relation. It is intended primarily for “blogrolls” or other static lists of links, but the relations can occur anywhere in <abbr title="HyperText Markup Language">HTML</abbr> content. If found, <strong class="program">Universal Feed Parser</strong> will return the <abbr title="XHTML Friends Network">XFN</abbr> information in <a class="reference internal" href="reference-entry-xfn.html#reference-entry-xfn"><em>entries[i].xfn</em></a>.</p> <p><strong class="program">Universal Feed Parser</strong> supports all of the relationships listed in the <a class="reference external" href="http://gmpg.org/xfn/11">XFN 1.1 profile</a>, as well as the following variations:</p> <ul class="simple"> <li><tt class="docutils literal"><span class="pre">coworker</span></tt> in addition to <tt class="docutils literal"><span class="pre">co-worker</span></tt></li> <li><tt class="docutils literal"><span class="pre">coresident</span></tt> in addition to <tt class="docutils literal"><span class="pre">co-resident</span></tt></li> <li><tt class="docutils literal"><span class="pre">relative</span></tt> in addition to <tt class="docutils literal"><span class="pre">kin</span></tt></li> <li><tt class="docutils literal"><span class="pre">brother</span></tt> and <tt class="docutils literal"><span class="pre">sister</span></tt> in addition to <tt class="docutils literal"><span class="pre">sibling</span></tt></li> <li><tt class="docutils literal"><span class="pre">husband</span></tt> and <tt class="docutils literal"><span class="pre">wife</span></tt> in addition to <tt class="docutils literal"><span class="pre">spouse</span></tt></li> </ul> <p class="rubric">Parsing <abbr title="XHTML Friends Network">XFN</abbr> relationships</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span 
class="nn">feedparser</span> <span class="gp">>>> </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">'http://feedparser.org/docs/examples/xfn.xml'</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">person</span> <span class="o">=</span> <span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">xfn</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="gp">>>> </span><span class="n">person</span><span class="o">.</span><span class="n">name</span> <span class="go">u'John Doe'</span> <span class="gp">>>> </span><span class="n">person</span><span class="o">.</span><span class="n">href</span> <span class="go">u'http://example.com/johndoe'</span> <span class="gp">>>> </span><span class="n">person</span><span class="o">.</span><span class="n">relationships</span> <span class="go">[u'coworker', u'friend']</span> </pre></div> </div> </div> <div class="section" id="hcard"> <span id="advanced-microformats-hcard"></span><h2>hCard<a class="headerlink" href="#hcard" title="Permalink to this headline">¶</a></h2> <p>The <a class="reference external" href="http://microformats.org/wiki/hcard">hCard</a> microformat allows you to embed address book information within <abbr title="HyperText Markup Language">HTML</abbr> content. 
If <strong class="program">Universal Feed Parser</strong> finds an hCard within supported elements, it converts it into an RFC 2426-compliant vCard and returns it in <a class="reference internal" href="reference-entry-vcard.html#reference-entry-vcard"><em>entries[i].vcard</em></a>.</p> <p class="rubric">Converting embedded hCard markup into a vCard</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">feedparser</span> <span class="gp">>>> </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">'http://feedparser.org/docs/examples/hcard.xml'</span><span class="p">)</span> <span class="gp">>>> </span><span class="k">print</span> <span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">vcard</span> <span class="go">BEGIN:vCard</span> <span class="go">VERSION:3.0</span> <span class="go">FN:Frank Dawson</span> <span class="go">N:Dawson;Frank</span> <span class="go">ADR;TYPE=work,postal,parcel:;;6544 Battleford Drive;Raleigh;NC;27613-3502;U</span> <span class="go">.S.A.</span> <span class="go">TEL;TYPE=WORK,VOICE,MSG:+1-919-676-9515</span> <span class="go">TEL;TYPE=WORK,FAX:+1-919-676-9564</span> <span class="go">EMAIL;TYPE=internet,pref:Frank_Dawson at Lotus.com</span> <span class="go">EMAIL;TYPE=internet:fdawson at earthlink.net</span> <span class="go">ORG:Lotus Development Corporation</span> <span class="go">URL:http://home.earthlink.net/~fdawson</span> <span class="go">END:vCard</span> <span class="go">BEGIN:vCard</span> <span class="go">VERSION:3.0</span> <span class="go">FN:Tim Howes</span> <span class="go">N:Howes;Tim</span> <span class="go">ADR;TYPE=work:;;501 E. 
Middlefield Rd.;Mountain View;CA;94043;U.S.A.</span> <span class="go">TEL;TYPE=WORK,VOICE,MSG:+1-415-937-3419</span> <span class="go">TEL;TYPE=WORK,FAX:+1-415-528-4164</span> <span class="go">EMAIL;TYPE=internet:howes at netscape.com</span> <span class="go">ORG:Netscape Communications Corp.</span> <span class="go">END:vCard</span> </pre></div> </div> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">The number of microformats is growing, and <strong class="program">Universal Feed Parser</strong> does not parse all of them. However, both the rel and class attributes survive <a class="reference internal" href="html-sanitization.html#advanced-sanitization"><em>HTML sanitizing</em></a>, so applications built on <strong class="program">Universal Feed Parser</strong> that wish to parse additional microformat content are free to do so.</p> </div> <div class="admonition-see-also admonition seealso"> <p class="first admonition-title">See also</p> <ul class="last simple"> <li><a class="reference external" href="http://microformats.org/">Microformats.org</a></li> <li><a class="reference external" href="http://microformats.org/wiki/rel-enclosure">rel=enclosure specification</a></li> <li><a class="reference external" href="http://microformats.org/wiki/rel-tag">rel=tag specification</a></li> <li><a class="reference external" href="http://microformats.org/wiki/XFN">XFN specification</a></li> <li><a class="reference external" href="http://microformats.org/wiki/hcard">hCard specification</a></li> </ul> </div> </div> </div> </div> </div> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h3><a href="index.html">Table Of Contents</a></h3> <ul> <li><a class="reference internal" href="#">Microformats</a><ul> <li><a class="reference internal" href="#rel-enclosure">rel=enclosure</a></li> <li><a class="reference internal" href="#rel-tag">rel=tag</a></li> <li><a class="reference internal" href="#xfn"><abbr title="XHTML Friends 
Network">XFN</abbr></a></li> <li><a class="reference internal" href="#hcard">hCard</a></li> </ul> </li> </ul> <h4>Previous topic</h4> <p class="topless"><a href="changes-early.html" title="previous chapter">Changes in earlier versions</a></p> <h4>Next topic</h4> <p class="topless"><a href="reference.html" title="next chapter">Reference</a></p> <h3>This Page</h3> <ul class="this-page-menu"> <li><a href="_sources/microformats.txt" rel="nofollow">Show Source</a></li> </ul> <div id="searchbox" style="display: none"> <h3>Quick search</h3> <form class="search" action="search.html" method="get"> <input type="text" name="q" /> <input type="submit" value="Go" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> <p class="searchtip" style="font-size: 90%"> Enter search terms or a module, class or function name. </p> </div> <script type="text/javascript">$('#searchbox').show(0);</script> </div> </div> <div class="clearer"></div> </div> <div class="footer"> © Copyright 2004-2008 Mark Pilgrim, 2010-2012 Kurt McKee. Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3. </div> </body> </html>