<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Content Filtering &mdash; Bazaar 2.6.0 documentation</title>
    
    <link rel="stylesheet" href="_static/default.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    './',
        VERSION:     '2.6.0',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="shortcut icon" href="_static/bzr.ico"/>

    <link rel="top" title="Bazaar 2.6.0 documentation" href="index.html" />
    <link rel="up" title="Implementation notes" href="implementation-notes.html" />
    <link rel="next" title="LCA Tree Merging" href="lca_tree_merging.html" />
    <link rel="prev" title="Computing last_modified values" href="last-modified.html" />
<link rel="stylesheet" href="_static/bzr-doc.css" type="text/css" />
 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="lca_tree_merging.html" title="LCA Tree Merging"
             accesskey="N">next</a></li>
        <li class="right" >
          <a href="last-modified.html" title="Computing last_modified values"
             accesskey="P">previous</a> |</li>
<li><a href="http://bazaar.canonical.com/">
    <img src="_static/bzr icon 16.png" /> Home</a>&nbsp;|&nbsp;</li>
<li><a href="http://doc.bazaar.canonical.com/en/">Documentation</a>&nbsp;|&nbsp;</li>

        <li><a href="index.html">Developer Document Catalog (2.6.0)</a> &raquo;</li>

          <li><a href="implementation-notes.html" accesskey="U">Implementation notes</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="content-filtering">
<h1>Content Filtering<a class="headerlink" href="#content-filtering" title="Permalink to this headline">¶</a></h1>
<p>Content filtering is the feature by which Bazaar can do line-ending
conversion or keyword expansion so that the files that appear in the
working tree are not precisely the same as the files stored in the
repository.</p>
<p>This document describes the implementation; see the user guide for how to
use it.</p>
<p>We distinguish between the <em>canonical form</em>, which is stored in the
repository, and the <em>convenient form</em>, which is stored in the working tree.
The convenient form will, for example, use OS-local newline conventions or
have keywords expanded; the canonical form will not.  We use these
names rather than, e.g., &#8220;filtered&#8221; and &#8220;unfiltered&#8221; because filters are
applied both when reading and when writing, so those names might cause
confusion.</p>
<p>Content filtering is only active on working trees that support it, which
is format 2a and later.</p>
<p>Content filtering is configured by rules that match file patterns.</p>
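<p>As a rough illustration of pattern-based rules, the sketch below looks up
filter preferences for a path with Python&#8217;s <tt class="docutils literal"><span class="pre">fnmatch</span></tt> module.  The rule
table, the preference names and the <tt class="docutils literal"><span class="pre">preferences_for</span></tt> function are all
hypothetical; they do not reflect Bazaar&#8217;s actual rules format or API, which
is described in the user guide.</p>

```python
import fnmatch

# Hypothetical rule table: (file pattern, filter preferences).
RULES = [
    ("*.txt", {"eol": "native"}),
    ("*.png", {"eol": "exact"}),
]


def preferences_for(path, rules=RULES):
    """Return the preferences of the first rule whose pattern matches path."""
    for pattern, prefs in rules:
        # fnmatchcase avoids platform-dependent case folding.
        if fnmatch.fnmatchcase(path, pattern):
            return prefs
    # No rule matched: no filtering preferences apply.
    return {}
```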
<div class="section" id="filters">
<h2>Filters<a class="headerlink" href="#filters" title="Permalink to this headline">¶</a></h2>
<p>Filters come in pairs: a read filter (reading convenient-&gt;canonical) and
a write filter (writing canonical-&gt;convenient).  There is no requirement that
they be symmetric or that they be deterministic from the input, though in
general both these properties will hold.  Filters are allowed to change the
size of the content, and filters such as line-ending conversion commonly will.</p>
<p>Filters are fed a sequence of byte chunks (so that they don&#8217;t have to
hold the whole file in memory).  There is no guarantee that the chunks
will be aligned with line endings.  Write filters are passed a context
object through which they can obtain some information, e.g. which
file they&#8217;re working on.  (See the <tt class="docutils literal"><span class="pre">bzrlib.filters</span></tt> docstring.)</p>
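<p>A minimal sketch of such a pair, assuming LF-only canonical content and
CRLF convenient content.  The function names are hypothetical and are not
taken from <tt class="docutils literal"><span class="pre">bzrlib.filters</span></tt>, and the context object is omitted.  Because
chunks need not be aligned with line endings, the read filter carries a
trailing carriage return over to the next chunk.</p>

```python
def read_filter_crlf_to_lf(chunks):
    """Read filter sketch: convenient (CRLF) to canonical (LF)."""
    pending_cr = False
    for chunk in chunks:
        if pending_cr:
            # Re-attach the carriage return held back from the last chunk.
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # The matching newline may be at the start of the next chunk.
            chunk = chunk[:-1]
            pending_cr = True
        yield chunk.replace(b"\r\n", b"\n")
    if pending_cr:
        # A lone trailing carriage return was never part of a CRLF pair.
        yield b"\r"


def write_filter_lf_to_crlf(chunks):
    """Write filter sketch: canonical (LF) to convenient (CRLF)."""
    for chunk in chunks:
        yield chunk.replace(b"\n", b"\r\n")
```

<p>This pair is not symmetric for every input: a working file containing a
bare LF is stored unchanged, but is written back out with CRLF.</p>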
<p>These are at the moment strictly <em>content</em> filters: they can&#8217;t make
changes to the tree like changing the execute bit, file types, or
adding/removing entries.</p>
</div>
<div class="section" id="conventions">
<h2>Conventions<a class="headerlink" href="#conventions" title="Permalink to this headline">¶</a></h2>
<p>bzrlib interfaces that aren&#8217;t explicitly specified to deal with the
convenient form should return the canonical form.  Whenever we have the
SHA1 hash of a file, it&#8217;s the hash of the canonical form.</p>
</div>
<div class="section" id="dirstate-interactions">
<h2>Dirstate interactions<a class="headerlink" href="#dirstate-interactions" title="Permalink to this headline">¶</a></h2>
<p>The dirstate file should store, in the column for the working copy, the cached
hash and size of the canonical form, and the packed stat fingerprint for
which that cache is valid.  This implies that the stored size will
in general be different from the size in the packed stat.  (However, the
current implementation may not always do this correctly - see
&lt;<a class="reference external" href="https://bugs.launchpad.net/bzr/+bug/418439">https://bugs.launchpad.net/bzr/+bug/418439</a>&gt;.)</p>
<p>The dirstate is given a SHA1Provider instance by its tree.  This class
can calculate the (canonical) hash and size given a filename.  This
provides a hook by which the working tree can make sure that when the
dirstate needs to get the hash of the file, it takes the filters into
account.</p>
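<p>The idea can be sketched as follows.  This is an illustration only: the
class name, its constructor and the <tt class="docutils literal"><span class="pre">canonical_sha1_and_size</span></tt> method are
assumptions made for this example, not the interface of bzrlib&#8217;s
<tt class="docutils literal"><span class="pre">SHA1Provider</span></tt>.  It hashes the output of a read filter, so both the hash and
the size describe the canonical form rather than the bytes on disk.</p>

```python
import hashlib


class FilteringSHA1Provider(object):
    """Hypothetical provider that hashes the canonical form of a file."""

    def __init__(self, read_filter, chunk_size=65536):
        # read_filter: callable taking an iterable of byte chunks
        # (convenient form) and yielding canonical byte chunks.
        self._read_filter = read_filter
        self._chunk_size = chunk_size

    def _file_chunks(self, path):
        # Read the on-disk (convenient) content in fixed-size chunks.
        with open(path, "rb") as f:
            while True:
                chunk = f.read(self._chunk_size)
                if not chunk:
                    break
                yield chunk

    def canonical_sha1_and_size(self, path):
        """Return (hex SHA1, size), both computed over the canonical form."""
        digest = hashlib.sha1()
        size = 0
        for chunk in self._read_filter(self._file_chunks(path)):
            digest.update(chunk)
            size += len(chunk)
        return digest.hexdigest(), size
```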
</div>
<div class="section" id="user-interface">
<h2>User interface<a class="headerlink" href="#user-interface" title="Permalink to this headline">¶</a></h2>
<p>Most commands that deal with the text of files present the
canonical form.  Some have options to choose between the two forms.</p>
</div>
<div class="section" id="performance-considerations">
<h2>Performance considerations<a class="headerlink" href="#performance-considerations" title="Permalink to this headline">¶</a></h2>
<p>Content filters can have serious performance implications.  For example,
getting the size of (the canonical form of) a file is easy and fast when
there are no content filters: we simply stat it.  However, when there
are filters that might change the size of the file, determining the
length of the canonical form requires reading in and filtering the whole
file.</p>
<p>Formats from 1.14 onwards support content filtering, so having fast
paths for the case where content filtering is not possible is not
generally worthwhile.  In fact, they&#8217;re probably harmful, because they add
extra edge cases to both test coverage and performance.</p>
<p>We need things to be fast even when filters are in use, and can then
possibly do a bit less work when there are no filters configured.</p>
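<p>The cost difference can be sketched like this.  The function and its
<tt class="docutils literal"><span class="pre">read_filter</span></tt> parameter are hypothetical names chosen for this example.
With no filter configured a single stat call is enough, while with a filter
the whole file has to be read and filtered.</p>

```python
import os


def canonical_size(path, read_filter=None, chunk_size=65536):
    """Return the size in bytes of the canonical form of the file at path."""
    if read_filter is None:
        # No filters configured: canonical and convenient forms are identical.
        return os.stat(path).st_size

    def file_chunks():
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                yield chunk

    # Filters may change the length, so every byte has to be processed.
    return sum(len(chunk) for chunk in read_filter(file_chunks()))
```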
</div>
<div class="section" id="future-ideas-and-open-issues">
<h2>Future ideas and open issues<a class="headerlink" href="#future-ideas-and-open-issues" title="Permalink to this headline">¶</a></h2>
<ul>
<li><p class="first">We might benefit from having filters declare some of their properties
statically, for example that they&#8217;re deterministic or can round-trip
or won&#8217;t change the length of the file.  However, common cases like
crlf conversion are not guaranteed to round-trip and may change the
length, so perhaps adding separate cases will just complicate the code
and tests.  So overall this does not seem worthwhile.</p>
</li>
<li><p class="first">In a future workingtree format, it might be better not to separately
store the working-copy hash and size, but rather just a stat fingerprint
at which point it was known to have the same canonical form as the
basis tree.</p>
</li>
<li><p class="first">It may be worthwhile to have a virtual Tree-like object that does
filtering, so there&#8217;s a clean separation of filtering from the on-disk
state and the meaning of any object is clear.  This would have some
risk of bugs where either code holds the wrong object, or their state
becomes inconsistent.</p>
<p>This would be useful in allowing you to get a filtered view of a
historical tree, e.g. to export it or diff it.  At the moment export
needs to have its own code to do the filtering.</p>
<p>The convenient-form tree would talk to disk, and the canonical-form
tree would sit on top of that and be used by most other bzr code.</p>
<p>If we do this, we&#8217;d need to handle the fact that the on-disk tree,
which generally deals with all of the IO and generally works entirely
in convenient form, would also need to be told the canonical hash to
store in the dirstate.  This can perhaps be handled by the
SHA1Provider or a similar hook.</p>
</li>
<li><p class="first">Content filtering at the moment is a bit specific to on-disk trees:
for instance <tt class="docutils literal"><span class="pre">SHA1Provider</span></tt> goes directly to disk, but it seems like
this is not necessary.</p>
</li>
</ul>
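<p>The filtered view of a historical tree mentioned above might look roughly
like the sketch below.  Everything here is hypothetical: the
<tt class="docutils literal"><span class="pre">ConvenientView</span></tt> class, its <tt class="docutils literal"><span class="pre">get_file_bytes</span></tt> method and the
<tt class="docutils literal"><span class="pre">export</span></tt> helper are invented for illustration, and a plain mapping stands
in for a real tree.  The point is that the caller needs no filtering code of
its own.</p>

```python
class ConvenientView(object):
    """Hypothetical read-only view giving canonical content in convenient form."""

    def __init__(self, canonical_texts, write_filter):
        # canonical_texts: mapping of path to canonical bytes, standing in
        # for a historical tree.
        # write_filter: callable turning canonical chunks into convenient chunks.
        self._texts = canonical_texts
        self._write_filter = write_filter

    def get_file_bytes(self, path):
        """Return the convenient form of the file at path."""
        return b"".join(self._write_filter([self._texts[path]]))


def export(view, paths):
    """Collect convenient-form content without any filtering logic of its own."""
    return dict((path, view.get_file_bytes(path)) for path in paths)
```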
</div>
<div class="section" id="see-also">
<h2>See also<a class="headerlink" href="#see-also" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li><a class="reference external" href="http://wiki.bazaar.canonical.com/LineEndings">http://wiki.bazaar.canonical.com/LineEndings</a></li>
<li><a class="reference external" href="http://wiki.bazaar.canonical.com/LineEndings/Roadmap">http://wiki.bazaar.canonical.com/LineEndings/Roadmap</a></li>
<li><a class="reference external" href="index.html">Developer Documentation</a></li>
<li><tt class="docutils literal"><span class="pre">bzrlib.filters</span></tt></li>
</ul>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Content Filtering</a><ul>
<li><a class="reference internal" href="#filters">Filters</a></li>
<li><a class="reference internal" href="#conventions">Conventions</a></li>
<li><a class="reference internal" href="#dirstate-interactions">Dirstate interactions</a></li>
<li><a class="reference internal" href="#user-interface">User interface</a></li>
<li><a class="reference internal" href="#performance-considerations">Performance considerations</a></li>
<li><a class="reference internal" href="#future-ideas-and-open-issues">Future ideas and open issues</a></li>
<li><a class="reference internal" href="#see-also">See also</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="last-modified.html"
                        title="previous chapter">Computing last_modified values</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="lca_tree_merging.html"
                        title="next chapter">LCA Tree Merging</a></p>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="_sources/content-filtering.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="footer">
        &copy; Copyright 2009-2011 Canonical Ltd.
      Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.2.3.
    </div>
  </body>
</html>