<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Microformats &mdash; feedparser 5.1.3 documentation</title>
    
    <link rel="stylesheet" href="_static/default.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="_static/feedparser.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '',
        VERSION:     '5.1.3',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="top" title="feedparser 5.1.3 documentation" href="index.html" />
    <link rel="next" title="Reference" href="reference.html" />
    <link rel="prev" title="Changes in earlier versions" href="changes-early.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="reference.html" title="Reference"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="changes-early.html" title="Changes in earlier versions"
             accesskey="P">previous</a> |</li>
        <li><a href="index.html">feedparser 5.1.3 documentation</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="microformats">
<span id="advanced-microformats"></span><h1>Microformats<a class="headerlink" href="#microformats" title="Permalink to this headline">¶</a></h1>
<p>An emerging trend in feed syndication is the inclusion of <a class="reference external" href="http://microformats.org/">microformats</a>.
Besides the semantics defined by individual feed formats, publishers can add
additional semantics using rel and class attributes in embedded
<abbr title="HyperText Markup Language">HTML</abbr> content.</p>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">To parse microformats, <strong class="program">Universal Feed Parser</strong> relies on a
third-party library called <a class="reference external" href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a>, which is distributed
separately.  If Beautiful Soup is not installed,
<strong class="program">Universal Feed Parser</strong> will silently skip microformats parsing.</p>
</div>
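<p>Because the skip is silent, an application may want to check for the library before relying on microformat results. The following is a minimal sketch written for Python 3, not part of <strong class="program">Universal Feed Parser</strong> itself. The helper name <tt class="docutils literal"><span class="pre">beautiful_soup_available</span></tt> is hypothetical, and the default module name <tt class="docutils literal"><span class="pre">BeautifulSoup</span></tt> is an assumption about how the library is installed.</p>

```python
import importlib.util


def beautiful_soup_available(module_name="BeautifulSoup"):
    """Return True if a top-level module with this name can be imported.

    The default module name is an assumption of this sketch; adjust it
    to match the Beautiful Soup release installed on your system.
    """
    return importlib.util.find_spec(module_name) is not None


if not beautiful_soup_available():
    print("Beautiful Soup not found; microformats will not be parsed")
```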
<p>The following elements are parsed for microformats:</p>
<ul class="simple">
<li><a class="reference internal" href="reference-entry-summary_detail.html#reference-entry-summary-detail-value"><em>entries[i].summary_detail.value</em></a></li>
<li><a class="reference internal" href="reference-entry-content.html#reference-entry-content-value"><em>entries[i].content[j].value</em></a></li>
</ul>
<div class="section" id="rel-enclosure">
<span id="advanced-microformats-relenclosure"></span><h2>rel=enclosure<a class="headerlink" href="#rel-enclosure" title="Permalink to this headline">¶</a></h2>
<p>The <a class="reference external" href="http://microformats.org/wiki/rel-enclosure">rel=enclosure</a> microformat provides a way for embedded
<abbr title="HyperText Markup Language">HTML</abbr> content to specify that a certain link
should be treated as an <a class="reference internal" href="reference-entry-enclosures.html#reference-entry-enclosures"><em>enclosure</em></a>.
<strong class="program">Universal Feed Parser</strong> looks for links within embedded markup that
meet any of the following conditions:</p>
<ul class="simple">
<li>rel attribute contains enclosure (note: rel attributes can contain a list of space-separated values)</li>
<li>type attribute starts with audio/</li>
<li>type attribute starts with video/</li>
<li>type attribute starts with application/ but does not end with xml</li>
<li>href attribute ends with one of the following file extensions:
<tt class="file docutils literal"><span class="pre">.7z</span></tt>,
<tt class="file docutils literal"><span class="pre">.avi</span></tt>,
<tt class="file docutils literal"><span class="pre">.bin</span></tt>,
<tt class="file docutils literal"><span class="pre">.bz2</span></tt>,
<tt class="file docutils literal"><span class="pre">.deb</span></tt>,
<tt class="file docutils literal"><span class="pre">.dmg</span></tt>,
<tt class="file docutils literal"><span class="pre">.exe</span></tt>,
<tt class="file docutils literal"><span class="pre">.gz</span></tt>,
<tt class="file docutils literal"><span class="pre">.hqx</span></tt>,
<tt class="file docutils literal"><span class="pre">.img</span></tt>,
<tt class="file docutils literal"><span class="pre">.iso</span></tt>,
<tt class="file docutils literal"><span class="pre">.jar</span></tt>,
<tt class="file docutils literal"><span class="pre">.m4a</span></tt>,
<tt class="file docutils literal"><span class="pre">.m4v</span></tt>,
<tt class="file docutils literal"><span class="pre">.mp2</span></tt>,
<tt class="file docutils literal"><span class="pre">.mp3</span></tt>,
<tt class="file docutils literal"><span class="pre">.mp4</span></tt>,
<tt class="file docutils literal"><span class="pre">.msi</span></tt>,
<tt class="file docutils literal"><span class="pre">.ogg</span></tt>,
<tt class="file docutils literal"><span class="pre">.ogm</span></tt>,
<tt class="file docutils literal"><span class="pre">.rar</span></tt>,
<tt class="file docutils literal"><span class="pre">.rpm</span></tt>,
<tt class="file docutils literal"><span class="pre">.sit</span></tt>,
<tt class="file docutils literal"><span class="pre">.sitx</span></tt>,
<tt class="file docutils literal"><span class="pre">.tar</span></tt>,
<tt class="file docutils literal"><span class="pre">.tbz2</span></tt>,
<tt class="file docutils literal"><span class="pre">.tgz</span></tt>,
<tt class="file docutils literal"><span class="pre">.wma</span></tt>,
<tt class="file docutils literal"><span class="pre">.wmv</span></tt>,
<tt class="file docutils literal"><span class="pre">.z</span></tt>,
<tt class="file docutils literal"><span class="pre">.zip</span></tt></li>
</ul>
<p>When <strong class="program">Universal Feed Parser</strong> finds a link that satisfies any of these
conditions, it adds it to <a class="reference internal" href="reference-entry-enclosures.html#reference-entry-enclosures"><em>entries[i].enclosures</em></a>.</p>
<p class="rubric">Parsing embedded enclosures</p>
<div class="highlight-python"><div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">feedparser</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">&#39;http://feedparser.org/docs/examples/rel-enclosure.xml&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">enclosures</span>
<span class="go">[{u&#39;href&#39;: u&#39;http://example.com/movie.mp4&#39;, &#39;title&#39;: u&#39;awesome movie&#39;}]</span>
</pre></div>
</div>
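<p>The conditions listed in this section can be sketched as a single predicate over a link's attributes. This is an illustration written for Python 3, not the library's own code. The name <tt class="docutils literal"><span class="pre">looks_like_enclosure</span></tt> is hypothetical, and the case-insensitive matching is an assumption of the sketch.</p>

```python
# File extensions listed in this section. Matching is case-insensitive
# here, which is an assumption of this sketch.
ENCLOSURE_EXTENSIONS = (
    ".7z", ".avi", ".bin", ".bz2", ".deb", ".dmg", ".exe", ".gz", ".hqx",
    ".img", ".iso", ".jar", ".m4a", ".m4v", ".mp2", ".mp3", ".mp4", ".msi",
    ".ogg", ".ogm", ".rar", ".rpm", ".sit", ".sitx", ".tar", ".tbz2",
    ".tgz", ".wma", ".wmv", ".z", ".zip",
)


def looks_like_enclosure(attrs):
    """Return True if a link's attributes meet any of the listed conditions.

    attrs is a plain dict mapping attribute names to string values.
    """
    rel_values = attrs.get("rel", "").lower().split()
    link_type = attrs.get("type", "").lower()
    href = attrs.get("href", "").lower()

    # rel may hold several space-separated values
    if "enclosure" in rel_values:
        return True
    if link_type.startswith(("audio/", "video/")):
        return True
    if link_type.startswith("application/") and not link_type.endswith("xml"):
        return True
    # str.endswith accepts a tuple of candidate suffixes
    return href.endswith(ENCLOSURE_EXTENSIONS)
```

<p>Any one condition is enough, so a link with no rel or type attribute still qualifies when its href ends in a listed extension.</p>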
</div>
<div class="section" id="rel-tag">
<span id="advanced-microformats-reltag"></span><h2>rel=tag<a class="headerlink" href="#rel-tag" title="Permalink to this headline">¶</a></h2>
<p>The <a class="reference external" href="http://microformats.org/wiki/rel-tag">rel=tag</a> microformat allows you to define
<a class="reference internal" href="reference-entry-tags.html#reference-entry-tags"><em>tags</em></a> within embedded
<abbr title="HyperText Markup Language">HTML</abbr> content.
<strong class="program">Universal Feed Parser</strong> looks for these attribute values in embedded
markup and maps them to <a class="reference internal" href="reference-entry-tags.html#reference-entry-tags"><em>entries[i].tags</em></a>.</p>
<p class="rubric">Parsing embedded tags</p>
<div class="highlight-python"><div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">feedparser</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">&#39;http://feedparser.org/docs/examples/rel-tag.xml&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">tags</span>
<span class="go">[{&#39;term&#39;: u&#39;tech&#39;, &#39;scheme&#39;: u&#39;http://del.icio.us/tag/&#39;, &#39;label&#39;: u&#39;Technology&#39;}]</span>
</pre></div>
</div>
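<p>Under the rel=tag convention, the tag is the last path segment of the link target, and the rest of the URL acts as the tag space. The sketch below, written for Python 3, shows one way to derive a term and a scheme from an href. The function name <tt class="docutils literal"><span class="pre">split_tag_url</span></tt> is hypothetical, and the code is not claimed to match the library's exact logic.</p>

```python
from urllib.parse import urlparse, urlunparse


def split_tag_url(href):
    """Split a rel=tag link target into (term, scheme)."""
    parts = urlparse(href)
    segments = parts.path.split("/")
    term = segments.pop()
    # A trailing slash leaves an empty last segment; step back one.
    if not term and segments:
        term = segments.pop()
    scheme = urlunparse(
        (parts.scheme, parts.netloc, "/".join(segments), "", "", "")
    )
    if not scheme.endswith("/"):
        scheme += "/"
    return term, scheme


# An assumed link target, chosen to match the scheme in the example above.
term, scheme = split_tag_url("http://del.icio.us/tag/tech")
```

<p>The label is not part of the URL; it comes from the link text.</p>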
</div>
<div class="section" id="xfn">
<span id="advanced-microformats-xfn"></span><h2><abbr title="XHTML Friends Network">XFN</abbr><a class="headerlink" href="#xfn" title="Permalink to this headline">¶</a></h2>
<p>The <a class="reference external" href="http://microformats.org/wiki/XFN">XFN</a> microformat allows you to define human relationships between
<abbr title="Uniform Resource Identifier">URI</abbr>s.  For example, you could link from
your weblog to your spouse&#8217;s weblog with the <tt class="docutils literal"><span class="pre">rel=&quot;spouse&quot;</span></tt> relation.  It is
intended primarily for &#8220;blogrolls&#8221; or other static lists of links, but the
relations can occur anywhere in <abbr title="HyperText Markup Language">HTML</abbr>
content.  If found, <strong class="program">Universal Feed Parser</strong> will return the
<abbr title="XHTML Friends Network">XFN</abbr> information in <a class="reference internal" href="reference-entry-xfn.html#reference-entry-xfn"><em>entries[i].xfn</em></a>.</p>
<p><strong class="program">Universal Feed Parser</strong> supports all of the relationships listed in
the <a class="reference external" href="http://gmpg.org/xfn/11">XFN 1.1 profile</a>, as well as the following variations:</p>
<ul class="simple">
<li><tt class="docutils literal"><span class="pre">coworker</span></tt> in addition to <tt class="docutils literal"><span class="pre">co-worker</span></tt></li>
<li><tt class="docutils literal"><span class="pre">coresident</span></tt> in addition to <tt class="docutils literal"><span class="pre">co-resident</span></tt></li>
<li><tt class="docutils literal"><span class="pre">relative</span></tt> in addition to <tt class="docutils literal"><span class="pre">kin</span></tt></li>
<li><tt class="docutils literal"><span class="pre">brother</span></tt> and <tt class="docutils literal"><span class="pre">sister</span></tt> in addition to <tt class="docutils literal"><span class="pre">sibling</span></tt></li>
<li><tt class="docutils literal"><span class="pre">husband</span></tt> and <tt class="docutils literal"><span class="pre">wife</span></tt> in addition to <tt class="docutils literal"><span class="pre">spouse</span></tt></li>
</ul>
<p class="rubric">Parsing <abbr title="XHTML Friends Network">XFN</abbr> relationships</p>
<div class="highlight-python"><div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">feedparser</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">&#39;http://feedparser.org/docs/examples/xfn.xml&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">person</span> <span class="o">=</span> <span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">xfn</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">person</span><span class="o">.</span><span class="n">name</span>
<span class="go">u&#39;John Doe&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">person</span><span class="o">.</span><span class="n">href</span>
<span class="go">u&#39;http://example.com/johndoe&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">person</span><span class="o">.</span><span class="n">relationships</span>
<span class="go">[u&#39;coworker&#39;, u&#39;friend&#39;]</span>
</pre></div>
</div>
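<p>A link can carry several relationships at once, along with rel values that have nothing to do with <abbr title="XHTML Friends Network">XFN</abbr>. The following sketch, written for Python 3, filters a rel attribute down to the XFN 1.1 values plus the variations listed in this section. The names are hypothetical and the code only illustrates the idea.</p>

```python
# Relationship values defined by the XFN 1.1 profile.
XFN_1_1 = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}

# Variations listed in this section.
XFN_VARIATIONS = {
    "coworker", "coresident", "relative", "brother", "sister",
    "husband", "wife",
}


def xfn_relationships(rel):
    """Return the recognized XFN values from a rel attribute, in order."""
    known = XFN_1_1 | XFN_VARIATIONS
    return [value for value in rel.lower().split() if value in known]
```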
</div>
<div class="section" id="hcard">
<span id="advanced-microformats-hcard"></span><h2>hCard<a class="headerlink" href="#hcard" title="Permalink to this headline">¶</a></h2>
<p>The <a class="reference external" href="http://microformats.org/wiki/hcard">hCard</a> microformat allows you to embed address book information within
<abbr title="HyperText Markup Language">HTML</abbr> content.  If
<strong class="program">Universal Feed Parser</strong> finds an hCard within supported elements, it
converts it into an RFC 2426-compliant vCard and returns it in
<a class="reference internal" href="reference-entry-vcard.html#reference-entry-vcard"><em>entries[i].vcard</em></a>.</p>
<p class="rubric">Converting embedded hCard markup into a vCard</p>
<div class="highlight-python"><div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">feedparser</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">d</span> <span class="o">=</span> <span class="n">feedparser</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">&#39;http://feedparser.org/docs/examples/hcard.xml&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span> <span class="n">d</span><span class="o">.</span><span class="n">entries</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">vcard</span>
<span class="go">BEGIN:vCard</span>
<span class="go">VERSION:3.0</span>
<span class="go">FN:Frank Dawson</span>
<span class="go">N:Dawson;Frank</span>
<span class="go">ADR;TYPE=work,postal,parcel:;;6544 Battleford Drive;Raleigh;NC;27613-3502;U</span>
<span class="go">.S.A.</span>
<span class="go">TEL;TYPE=WORK,VOICE,MSG:+1-919-676-9515</span>
<span class="go">TEL;TYPE=WORK,FAX:+1-919-676-9564</span>
<span class="go">EMAIL;TYPE=internet,pref:Frank_Dawson@Lotus.com</span>
<span class="go">EMAIL;TYPE=internet:fdawson@earthlink.net</span>
<span class="go">ORG:Lotus Development Corporation</span>
<span class="go">URL:http://home.earthlink.net/~fdawson</span>
<span class="go">END:vCard</span>
<span class="go">BEGIN:vCard</span>
<span class="go">VERSION:3.0</span>
<span class="go">FN:Tim Howes</span>
<span class="go">N:Howes;Tim</span>
<span class="go">ADR;TYPE=work:;;501 E. Middlefield Rd.;Mountain View;CA;94043;U.S.A.</span>
<span class="go">TEL;TYPE=WORK,VOICE,MSG:+1-415-937-3419</span>
<span class="go">TEL;TYPE=WORK,FAX:+1-415-528-4164</span>
<span class="go">EMAIL;TYPE=internet:howes@netscape.com</span>
<span class="go">ORG:Netscape Communications Corp.</span>
<span class="go">END:vCard</span>
</pre></div>
</div>
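<p>As the output above shows, several hCards in one entry come back as a single string of concatenated vCards. If an application needs them one at a time, it can split on the BEGIN and END markers. This is a minimal sketch; <tt class="docutils literal"><span class="pre">split_vcards</span></tt> is a hypothetical helper, not a library function.</p>

```python
def split_vcards(text):
    """Split a string of concatenated vCards into one string per card."""
    cards = []
    current = []
    for line in text.splitlines():
        marker = line.strip().upper()
        if marker == "BEGIN:VCARD":
            # Start collecting a new card.
            current = [line]
        elif current:
            current.append(line)
            if marker == "END:VCARD":
                cards.append("\n".join(current))
                current = []
    return cards
```

<p>Lines outside a BEGIN/END pair are ignored, so stray text between cards does not end up in the result.</p>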
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">There is a growing number of microformats, and
<strong class="program">Universal Feed Parser</strong> does not parse all of them.  However, both the
rel and class attributes survive <a class="reference internal" href="html-sanitization.html#advanced-sanitization"><em>HTML sanitizing</em></a>,
so applications built on <strong class="program">Universal Feed Parser</strong> that wish to parse
additional microformat content are free to do so.</p>
</div>
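<p>Since class attributes are preserved, an application can scan the sanitized value of an element such as <em>entries[i].content[j].value</em> for microformats that <strong class="program">Universal Feed Parser</strong> does not handle. The sketch below is written for Python 3 and uses only the standard library's <tt class="docutils literal"><span class="pre">html.parser</span></tt> module. The names <tt class="docutils literal"><span class="pre">ClassTextCollector</span></tt> and <tt class="docutils literal"><span class="pre">texts_with_class</span></tt> are hypothetical, and the code does not handle an element with the same class nested inside another match.</p>

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}


class ClassTextCollector(HTMLParser):
    """Collect the text of elements that carry a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.results = []
        self._depth = 0  # nesting depth inside a matching element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
        elif self.class_name in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def texts_with_class(html, class_name):
    """Return the text content of every element with the given class."""
    parser = ClassTextCollector(class_name)
    parser.feed(html)
    parser.close()
    return parser.results
```

<p>For example, calling <tt class="docutils literal"><span class="pre">texts_with_class(value,</span> <span class="pre">"summary")</span></tt> on embedded markup would return the text of each element whose class list includes <tt class="docutils literal"><span class="pre">summary</span></tt>.</p>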
<div class="admonition-see-also admonition seealso">
<p class="first admonition-title">See also</p>
<ul class="last simple">
<li><a class="reference external" href="http://microformats.org/">Microformats.org</a></li>
<li><a class="reference external" href="http://microformats.org/wiki/rel-enclosure">rel=enclosure specification</a></li>
<li><a class="reference external" href="http://microformats.org/wiki/rel-tag">rel=tag specification</a></li>
<li><a class="reference external" href="http://microformats.org/wiki/XFN">XFN specification</a></li>
<li><a class="reference external" href="http://microformats.org/wiki/hcard">hCard specification</a></li>
</ul>
</div>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Microformats</a><ul>
<li><a class="reference internal" href="#rel-enclosure">rel=enclosure</a></li>
<li><a class="reference internal" href="#rel-tag">rel=tag</a></li>
<li><a class="reference internal" href="#xfn"><abbr title="XHTML Friends Network">XFN</abbr></a></li>
<li><a class="reference internal" href="#hcard">hCard</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="changes-early.html"
                        title="previous chapter">Changes in earlier versions</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="reference.html"
                        title="next chapter">Reference</a></p>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="_sources/microformats.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="reference.html" title="Reference"
             >next</a> |</li>
        <li class="right" >
          <a href="changes-early.html" title="Changes in earlier versions"
             >previous</a> |</li>
        <li><a href="index.html">feedparser 5.1.3 documentation</a> &raquo;</li> 
      </ul>
    </div>
    <div class="footer">
        &copy; Copyright 2004-2008 Mark Pilgrim, 2010-2012 Kurt McKee.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
    </div>
  </body>
</html>