<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Mirror Mirror on the Wall — Stem 1.1.0 documentation</title> <link rel="stylesheet" href="../_static/haiku.css" type="text/css" /> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="../_static/print.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../', VERSION: '1.1.0', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../_static/jquery.js"></script> <script type="text/javascript" src="../_static/underscore.js"></script> <script type="text/javascript" src="../_static/doctools.js"></script> <script type="text/javascript" src="../_static/theme_extras.js"></script> <link rel="shortcut icon" href="../_static/favicon.png"/> <link rel="top" title="Stem 1.1.0 documentation" href="../index.html" /> <link rel="up" title="Contents" href="../contents.html" /> <link rel="next" title="Double Double Toil and Trouble" href="double_double_toil_and_trouble.html" /> <link rel="prev" title="East of the Sun & West of the Moon" href="east_of_the_sun.html" /> </head> <body> <div class="header"><img class="rightlogo" src="../_static/logo.png" alt="Logo"/><h1 class="heading"><a href="../index.html"> <span>Stem Docs</span></a></h1> <h2 class="heading"><span>Mirror Mirror on the Wall</span></h2> </div> <div class="topnav"> <p> <ul id="navbar"> <li><a href="../index.html">Home</a></li> <li><a href="../tutorials.html">Tutorials</a> <ul> <li><a href="the_little_relay_that_could.html">Hello World</a></li> <li><a href="to_russia_with_love.html">Client Usage</a></li> <li><a href="tortoise_and_the_hare.html">Event Listening</a></li> <li><a href="#">Tor Descriptors</a></li> <li><a 
href="east_of_the_sun.html">Utilities</a></li> <li><a href="double_double_toil_and_trouble.html">Examples</a></li> </ul> </li> <li><a href="../api.html">API</a> <ul> <li><a href="../api/control.html">stem.control</a></li> <li><a href="../api/connection.html">stem.connection</a></li> <li><a href="../api/socket.html">stem.socket</a></li> <li><a href="../api/process.html">stem.process</a></li> <li><a href="../api/response.html">stem.response</a></li> <li><a href="../api/exit_policy.html">stem.exit_policy</a></li> <li><a href="../api/version.html">stem.version</a></li> <li><a href="../api.html#descriptors">Descriptors</a></li> <li><a href="../api.html#utilities">Utilities</a></li> </ul> </li> <li><a href="https://trac.torproject.org/projects/tor/wiki/doc/stem">Development</a> <ul> <li><a href="../faq.html">FAQ</a></li> <li><a href="../change_log.html">Change Log</a></li> <li><a href="https://trac.torproject.org/projects/tor/wiki/doc/stem/bugs">Bug Tracker</a></li> <li><a href="../download.html">Download</a></li> </ul> </li> </ul> </p> </div> <div class="content"> <div class="section" id="mirror-mirror-on-the-wall"> <h1>Mirror Mirror on the Wall<a class="headerlink" href="#mirror-mirror-on-the-wall" title="Permalink to this headline">¶</a></h1> <p>The following is an overview of <strong>Tor descriptors</strong>. 
If you’re already familiar with what they are and where to get them then you may want to skip to the end.</p> <ul class="simple"> <li><a class="reference internal" href="#what-is-a-descriptor"><em>What is a descriptor?</em></a></li> <li><a class="reference internal" href="#where-can-i-get-the-current-descriptors"><em>Where can I get the current descriptors?</em></a></li> <li><a class="reference internal" href="#can-i-get-descriptors-from-tor"><em>Can I get descriptors from Tor?</em></a></li> <li><a class="reference internal" href="#where-can-i-get-past-descriptors"><em>Where can I get past descriptors?</em></a></li> <li><a class="reference internal" href="#putting-it-together"><em>Putting it together...</em></a></li> </ul> <div class="section" id="what-is-a-descriptor"> <span id="id1"></span><h2>What is a descriptor?<a class="headerlink" href="#what-is-a-descriptor" title="Permalink to this headline">¶</a></h2> <p>Tor is made up of two parts: the application and a distributed network of a few thousand volunteer relays. Information about these relays is public, and made up of documents called <strong>descriptors</strong>.</p> <p>There are several different kinds of descriptors, the most common ones being...</p> <table border="1" class="docutils"> <colgroup> <col width="19%" /> <col width="81%" /> </colgroup> <thead valign="bottom"> <tr class="row-odd"><th class="head">Descriptor Type</th> <th class="head">Description</th> </tr> </thead> <tbody valign="top"> <tr class="row-even"><td><a class="reference external" href="../api/descriptor/server_descriptor.html">Server Descriptor</a></td> <td>Information that relays publish about themselves. Tor clients once downloaded this information, but now they use microdescriptors instead.</td> </tr> <tr class="row-odd"><td><a class="reference external" href="../api/descriptor/extrainfo_descriptor.html">ExtraInfo Descriptor</a></td> <td>Relay information that tor clients do not need in order to function. 
This is self-published, like server descriptors, but not downloaded by default.</td> </tr>
<tr class="row-even"><td><a class="reference external" href="../api/descriptor/microdescriptor.html">Microdescriptor</a></td> <td>Minimalistic document that just includes the information necessary for tor clients to work.</td> </tr>
<tr class="row-odd"><td><a class="reference external" href="../api/descriptor/networkstatus.html">Network Status Document</a></td> <td>Though tor relays are decentralized, the directories that track the overall network are not. These central points are called <strong>directory authorities</strong>, and every hour they publish a document called a <strong>consensus</strong> (also known as a network status document). The consensus in turn is made up of <strong>router status entries</strong>.</td> </tr>
<tr class="row-even"><td><a class="reference external" href="../api/descriptor/router_status_entry.html">Router Status Entry</a></td> <td>Relay information provided by the directory authorities including flags, heuristics used for relay selection, etc.</td> </tr>
</tbody> </table> </div>
<div class="section" id="where-can-i-get-the-current-descriptors"> <span id="id2"></span><h2>Where can I get the current descriptors?<a class="headerlink" href="#where-can-i-get-the-current-descriptors" title="Permalink to this headline">¶</a></h2>
<p>To work, tor needs up-to-date information about the relays within the network. As such, getting current descriptors is easy: <em>just download them like tor does</em>.</p>
<p>The <a class="reference external" href="../api/descriptor/remote.html">stem.descriptor.remote</a> module downloads descriptors from the tor directory authorities and mirrors. <strong>Please show some restraint when doing this</strong>!
This adds load to the network, and hence an irresponsible script can make tor worse for everyone.</p>
<p>Listing the current relays in the tor network is as easy as...</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">stem.descriptor.remote</span> <span class="kn">import</span> <span class="n">DescriptorDownloader</span>

<span class="n">downloader</span> <span class="o">=</span> <span class="n">DescriptorDownloader</span><span class="p">()</span>

<span class="k">try</span><span class="p">:</span>
  <span class="k">for</span> <span class="n">desc</span> <span class="ow">in</span> <span class="n">downloader</span><span class="o">.</span><span class="n">get_consensus</span><span class="p">()</span><span class="o">.</span><span class="n">run</span><span class="p">():</span>
    <span class="k">print</span> <span class="s">"found relay </span><span class="si">%s</span><span class="s"> (</span><span class="si">%s</span><span class="s">)"</span> <span class="o">%</span> <span class="p">(</span><span class="n">desc</span><span class="o">.</span><span class="n">nickname</span><span class="p">,</span> <span class="n">desc</span><span class="o">.</span><span class="n">fingerprint</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">exc</span><span class="p">:</span>
  <span class="k">print</span> <span class="s">"Unable to retrieve the consensus: </span><span class="si">%s</span><span class="s">"</span> <span class="o">%</span> <span class="n">exc</span>
</pre></div> </div> </div>
<div class="section" id="can-i-get-descriptors-from-tor"> <span id="id3"></span><h2>Can I get descriptors from Tor?<a class="headerlink" href="#can-i-get-descriptors-from-tor" title="Permalink to this headline">¶</a></h2>
<p>If you already have tor running on your system then it is already getting descriptors on your behalf.
Reusing these is a great way to keep from burdening the rest of the tor network.</p>
<p>Tor only gets the descriptors that it needs by default, so if you’re scripting against tor you may want to set some of the following in your <a class="reference external" href="https://www.torproject.org/docs/faq.html.en#torrc">torrc</a>. Keep in mind that these add a small burden to the network, so don’t set them in a widely distributed application. And, of course, please consider <a class="reference external" href="https://www.torproject.org/docs/tor-doc-relay.html.en">running tor as a relay</a> so you give back to the network!</p>
<div class="highlight-python"><pre># Descriptors have a range of time during which they're valid. To get the
# most recent descriptor information, regardless of whether tor needs it,
# set the following.
FetchDirInfoEarly 1
FetchDirInfoExtraEarly 1

# If you aren't actively using tor as a client then tor will eventually stop
# downloading descriptor information that it doesn't need. To prevent this
# from happening set...
FetchUselessDescriptors 1

# Tor no longer downloads server descriptors by default, opting for
# microdescriptors instead. If you want tor to download server descriptors
# then set...
UseMicrodescriptors 0

# Tor doesn't need extrainfo descriptors to work. If you want tor to download
# them anyway then set...
DownloadExtraInfo 1</pre> </div>
<p>Now that tor is happily chugging along, up-to-date descriptors are available through tor’s control socket...</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">stem.control</span> <span class="kn">import</span> <span class="n">Controller</span>

<span class="k">with</span> <span class="n">Controller</span><span class="o">.</span><span class="n">from_port</span><span class="p">(</span><span class="n">port</span> <span class="o">=</span> <span class="mi">9051</span><span class="p">)</span> <span class="k">as</span> <span class="n">controller</span><span class="p">:</span>
  <span class="n">controller</span><span class="o">.</span><span class="n">authenticate</span><span class="p">()</span>

  <span class="k">for</span> <span class="n">desc</span> <span class="ow">in</span> <span class="n">controller</span><span class="o">.</span><span class="n">get_network_statuses</span><span class="p">():</span>
    <span class="k">print</span> <span class="s">"found relay </span><span class="si">%s</span><span class="s"> (</span><span class="si">%s</span><span class="s">)"</span> <span class="o">%</span> <span class="p">(</span><span class="n">desc</span><span class="o">.</span><span class="n">nickname</span><span class="p">,</span> <span class="n">desc</span><span class="o">.</span><span class="n">fingerprint</span><span class="p">)</span>
</pre></div> </div>
<p>...
or by reading directly from tor’s data directory...</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">stem.descriptor</span> <span class="kn">import</span> <span class="n">parse_file</span>

<span class="k">for</span> <span class="n">desc</span> <span class="ow">in</span> <span class="n">parse_file</span><span class="p">(</span><span class="nb">open</span><span class="p">(</span><span class="s">"/home/atagar/.tor/cached-consensus"</span><span class="p">)):</span>
  <span class="k">print</span> <span class="s">"found relay </span><span class="si">%s</span><span class="s"> (</span><span class="si">%s</span><span class="s">)"</span> <span class="o">%</span> <span class="p">(</span><span class="n">desc</span><span class="o">.</span><span class="n">nickname</span><span class="p">,</span> <span class="n">desc</span><span class="o">.</span><span class="n">fingerprint</span><span class="p">)</span>
</pre></div> </div> </div>
<div class="section" id="where-can-i-get-past-descriptors"> <span id="id4"></span><h2>Where can I get past descriptors?<a class="headerlink" href="#where-can-i-get-past-descriptors" title="Permalink to this headline">¶</a></h2>
<p>Descriptor archives are available on <a class="reference external" href="https://metrics.torproject.org/data.html">Tor’s metrics site</a>.
These archives can be read with the <a class="reference external" href="../api/descriptor/reader.html">DescriptorReader</a>...</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">stem.descriptor.reader</span> <span class="kn">import</span> <span class="n">DescriptorReader</span>

<span class="k">with</span> <span class="n">DescriptorReader</span><span class="p">([</span><span class="s">"/home/atagar/server-descriptors-2013-03.tar"</span><span class="p">])</span> <span class="k">as</span> <span class="n">reader</span><span class="p">:</span>
  <span class="k">for</span> <span class="n">desc</span> <span class="ow">in</span> <span class="n">reader</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">"found relay </span><span class="si">%s</span><span class="s"> (</span><span class="si">%s</span><span class="s">)"</span> <span class="o">%</span> <span class="p">(</span><span class="n">desc</span><span class="o">.</span><span class="n">nickname</span><span class="p">,</span> <span class="n">desc</span><span class="o">.</span><span class="n">fingerprint</span><span class="p">)</span>
</pre></div> </div> </div>
<div class="section" id="putting-it-together"> <span id="id5"></span><h2>Putting it together...<a class="headerlink" href="#putting-it-together" title="Permalink to this headline">¶</a></h2>
<p>As discussed above there are three methods for reading descriptors...</p>
<ul class="simple"> <li>With the <a class="reference internal" href="../api/control.html#stem.control.Controller" title="stem.control.Controller"><tt class="xref py py-class docutils literal"><span class="pre">Controller</span></tt></a> via methods like <a class="reference internal" href="../api/control.html#stem.control.Controller.get_server_descriptors" title="stem.control.Controller.get_server_descriptors"><tt class="xref py py-func docutils literal"><span class="pre">get_server_descriptors()</span></tt></a> and <a
class="reference internal" href="../api/control.html#stem.control.Controller.get_network_statuses" title="stem.control.Controller.get_network_statuses"><tt class="xref py py-func docutils literal"><span class="pre">get_network_statuses()</span></tt></a>.</li>
<li>By reading the file directly with <a class="reference internal" href="../api/descriptor/descriptor.html#stem.descriptor.__init__.parse_file" title="stem.descriptor.__init__.parse_file"><tt class="xref py py-func docutils literal"><span class="pre">parse_file()</span></tt></a>.</li>
<li>Reading with the <a class="reference external" href="../api/descriptor/reader.html">DescriptorReader</a>. This is best if you want to read everything from a directory or archive.</li> </ul>
<p>Now let’s say you want to figure out who the <em>biggest</em> exit relays are. You could use any of the methods above, but for this example we’ll use <a class="reference external" href="../api/descriptor/remote.html">stem.descriptor.remote</a>...</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">sys</span>

<span class="kn">from</span> <span class="nn">stem.descriptor.remote</span> <span class="kn">import</span> <span class="n">DescriptorDownloader</span>
<span class="kn">from</span> <span class="nn">stem.util</span> <span class="kn">import</span> <span class="n">str_tools</span>

<span class="c"># provides a mapping of observed bandwidth to the relay nicknames</span>
<span class="k">def</span> <span class="nf">get_bw_to_relay</span><span class="p">():</span>
  <span class="n">bw_to_relay</span> <span class="o">=</span> <span class="p">{}</span>

  <span class="n">downloader</span> <span class="o">=</span> <span class="n">DescriptorDownloader</span><span class="p">()</span>

  <span class="k">try</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">desc</span> <span class="ow">in</span> <span class="n">downloader</span><span class="o">.</span><span class="n">get_server_descriptors</span><span class="p">()</span><span class="o">.</span><span class="n">run</span><span class="p">():</span>
      <span class="k">if</span> <span class="n">desc</span><span class="o">.</span><span class="n">exit_policy</span><span class="o">.</span><span class="n">is_exiting_allowed</span><span class="p">():</span>
        <span class="n">bw_to_relay</span><span class="o">.</span><span class="n">setdefault</span><span class="p">(</span><span class="n">desc</span><span class="o">.</span><span class="n">observed_bandwidth</span><span class="p">,</span> <span class="p">[])</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">desc</span><span class="o">.</span><span class="n">nickname</span><span class="p">)</span>
  <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">exc</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">"Unable to retrieve the server descriptors: </span><span class="si">%s</span><span class="s">"</span> <span class="o">%</span> <span class="n">exc</span>

  <span class="k">return</span> <span class="n">bw_to_relay</span>

<span class="c"># prints the top fifteen relays</span>

<span class="n">bw_to_relay</span> <span class="o">=</span> <span class="n">get_bw_to_relay</span><span class="p">()</span>
<span class="n">count</span> <span class="o">=</span> <span class="mi">1</span>

<span class="k">for</span> <span class="n">bw_value</span> <span class="ow">in</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">bw_to_relay</span><span class="o">.</span><span class="n">keys</span><span class="p">(),</span> <span class="n">reverse</span> <span class="o">=</span> <span class="bp">True</span><span class="p">):</span>
  <span class="k">for</span> <span class="n">nickname</span> <span class="ow">in</span> <span class="n">bw_to_relay</span><span class="p">[</span><span class="n">bw_value</span><span class="p">]:</span>
    <span class="k">print</span> <span class="s">"</span><span class="si">%i</span><span class="s">. </span><span class="si">%s</span><span class="s"> (</span><span class="si">%s</span><span class="s">/s)"</span> <span class="o">%</span> <span class="p">(</span><span class="n">count</span><span class="p">,</span> <span class="n">nickname</span><span class="p">,</span> <span class="n">str_tools</span><span class="o">.</span><span class="n">get_size_label</span><span class="p">(</span><span class="n">bw_value</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>
    <span class="n">count</span> <span class="o">+=</span> <span class="mi">1</span>

    <span class="k">if</span> <span class="n">count</span> <span class="o">&gt;</span> <span class="mi">15</span><span class="p">:</span>
      <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">()</span>
</pre></div> </div>
<div class="highlight-python"><pre>% python example.py
1. herngaard (40.95 MB/s)
2. chaoscomputerclub19 (40.43 MB/s)
3. chaoscomputerclub18 (40.02 MB/s)
4. chaoscomputerclub20 (38.98 MB/s)
5. wannabe (38.63 MB/s)
6. dorrisdeebrown (38.48 MB/s)
7. manning2 (38.20 MB/s)
8. chaoscomputerclub21 (36.90 MB/s)
9. TorLand1 (36.22 MB/s)
10. bolobolo1 (35.93 MB/s)
11. manning1 (35.39 MB/s)
12. gorz (34.10 MB/s)
13. ndnr1 (25.36 MB/s)
14. politkovskaja2 (24.93 MB/s)
15. wau (24.72 MB/s)</pre> </div> </div> </div> </div>
<div class="bottomnav"> </div>
<div class="footer"> </div>
</body> </html>