<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="X-UA-Compatible" content="IE=Edge" />
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Container format &#8212; Bazaar 2.7.0 documentation</title>
    <link rel="stylesheet" href="_static/classic.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript" id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <script type="text/javascript" src="_static/language_data.js"></script>
    
    <link rel="shortcut icon" href="_static/bzr.ico"/>

    <link rel="search" title="Search" href="search.html" />
    <link rel="next" title="Overview" href="groupcompress-design.html" />
    <link rel="prev" title="Bundles" href="bundles.html" />
<link rel="stylesheet" href="_static/bzr-doc.css" type="text/css" />
 
  </head><body>
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="groupcompress-design.html" title="Overview"
             accesskey="N">next</a></li>
        <li class="right" >
          <a href="bundles.html" title="Bundles"
             accesskey="P">previous</a> |</li>
<li><a href="http://bazaar.canonical.com/">
    <img src="_static/bzr icon 16.png" /> Home</a>&nbsp;|&nbsp;</li>
<li><a href="http://doc.bazaar.canonical.com/en/">Documentation</a>&nbsp;|&nbsp;</li>

        <li class="nav-item nav-item-0"><a href="index.html">Developer Document Catalog (2.7.0)</a> &#187;</li>

          <li class="nav-item nav-item-1"><a href="specifications.html" accesskey="U">Specifications</a> &#187;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body" role="main">
            
  <div class="section" id="container-format">
<h1><a class="toc-backref" href="#id3">Container format</a><a class="headerlink" href="#container-format" title="Permalink to this headline">¶</a></h1>
<div class="section" id="status">
<h2><a class="toc-backref" href="#id4">Status</a><a class="headerlink" href="#status" title="Permalink to this headline">¶</a></h2>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Date:</th><td class="field-body">2007-06-07</td>
</tr>
</tbody>
</table>
<p>This document describes the proposed container format for streaming and
storing collections of data in Bazaar.  Initially this will be used for
streaming revision data for incremental push/pull in the smart server for
0.18, but the intention is that this will be the basis for much more
than just that use case.</p>
<p>In particular, this document currently focuses almost exclusively on the
streaming case, and not the on-disk storage case.  It also does not
discuss the APIs used to manipulate containers and their records.</p>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="simple">
<li><a class="reference internal" href="#container-format" id="id3">Container format</a><ul>
<li><a class="reference internal" href="#status" id="id4">Status</a></li>
<li><a class="reference internal" href="#motivation" id="id5">Motivation</a></li>
<li><a class="reference internal" href="#terminology" id="id6">Terminology</a></li>
<li><a class="reference internal" href="#use-cases" id="id7">Use Cases</a><ul>
<li><a class="reference internal" href="#streaming-data-between-a-smart-server-and-client" id="id8">Streaming data between a smart server and client</a><ul>
<li><a class="reference internal" href="#incremental-push-or-pull" id="id9">Incremental push or pull</a></li>
</ul>
</li>
<li><a class="reference internal" href="#persistent-storage-on-disk" id="id10">Persistent storage on disk</a></li>
<li><a class="reference internal" href="#usable-before-deep-model-changes-to-bazaar" id="id11">Usable before deep model changes to Bazaar</a></li>
<li><a class="reference internal" href="#examples-of-possible-record-content" id="id12">Examples of possible record content</a></li>
</ul>
</li>
<li><a class="reference internal" href="#characteristics" id="id13">Characteristics</a><ul>
<li><a class="reference internal" href="#no-length-prefixing-of-entire-container" id="id14">No length-prefixing of entire container</a></li>
<li><a class="reference internal" href="#structured-as-a-self-contained-series-of-records" id="id15">Structured as a self-contained series of records</a></li>
<li><a class="reference internal" href="#addressing-records" id="id16">Addressing records</a></li>
<li><a class="reference internal" href="#reasonably-cheap-for-small-records" id="id17">Reasonably cheap for small records</a></li>
</ul>
</li>
<li><a class="reference internal" href="#specification" id="id18">Specification</a><ul>
<li><a class="reference internal" href="#record-types" id="id19">Record types</a><ul>
<li><a class="reference internal" href="#end-marker" id="id20">End Marker</a></li>
<li><a class="reference internal" href="#bytes" id="id21">Bytes</a></li>
</ul>
</li>
<li><a class="reference internal" href="#names" id="id22">Names</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
</div>
<div class="section" id="motivation">
<h2><a class="toc-backref" href="#id5">Motivation</a><a class="headerlink" href="#motivation" title="Permalink to this headline">¶</a></h2>
<p>The goal is a low-level file format that is suitable for solving the smart
server latency problem, whose layout and requirements can be extended in
future versions of Bazaar, and which imposes no requirements that the smart
server does not already meet today.</p>
</div>
<div class="section" id="terminology">
<h2><a class="toc-backref" href="#id6">Terminology</a><a class="headerlink" href="#terminology" title="Permalink to this headline">¶</a></h2>
<p>A <strong>container</strong> is a streamable file that contains a series of
<strong>records</strong>.  Records may have <strong>names</strong>, and consist of bytes.</p>
</div>
<div class="section" id="use-cases">
<h2><a class="toc-backref" href="#id7">Use Cases</a><a class="headerlink" href="#use-cases" title="Permalink to this headline">¶</a></h2>
<p>Here’s a brief description of use cases this format is intended to
support.</p>
<div class="section" id="streaming-data-between-a-smart-server-and-client">
<h3><a class="toc-backref" href="#id8">Streaming data between a smart server and client</a><a class="headerlink" href="#streaming-data-between-a-smart-server-and-client" title="Permalink to this headline">¶</a></h3>
<p>It would be nice if we could combine multiple containers into a single
stream by something no more expensive than concatenation (e.g. by omitting
end/start marker pairs).</p>
<p>This doesn’t imply that such a combination necessarily produces a valid
container (e.g. care must be taken to ensure that names are still unique
in the combined container), or even a useful container.  It is simply that
the cost of assembling a new combined container is practically as cheap as
simple concatenation.</p>
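<p>As a rough illustration of this idea, the following Python 3 sketch joins two
already-serialized containers by dropping the End Marker of the first and the
lead-in of the second, both as defined in the Specification section of this
document.  The helper name <code class="docutils literal notranslate"><span class="pre">combine_containers</span></code> is hypothetical and is not
part of Bazaar's API.  The sketch assumes both inputs are complete, well-formed
containers, and it makes no attempt to check that names remain unique.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre>
```python
LEAD_IN = b"Bazaar pack format 1 (introduced in 0.18)\n"
END_MARKER = b"E"


def combine_containers(first, second):
    """Join two serialized containers into one stream (hypothetical helper)."""
    if not first.startswith(LEAD_IN) or not second.startswith(LEAD_IN):
        raise ValueError("missing container lead-in")
    # Assumption: both inputs are complete, so their final byte is the
    # End Marker rather than the last byte of a record body.
    if not first.endswith(END_MARKER) or not second.endswith(END_MARKER):
        raise ValueError("missing end marker")
    # Drop the end marker of the first container and the lead-in of the
    # second; everything in between is copied unchanged.
    return first[:-len(END_MARKER)] + second[len(LEAD_IN):]
```
</pre></div>
</div>
<p>Only the bytes at the boundary are touched, so the cost stays close to that
of plain concatenation.</p>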
<div class="section" id="incremental-push-or-pull">
<h4><a class="toc-backref" href="#id9">Incremental push or pull</a><a class="headerlink" href="#incremental-push-or-pull" title="Permalink to this headline">¶</a></h4>
<p>Consider the use case of incremental push/pull, which is currently (0.16)
very slow on high-latency links due to the large number of round trips.
What we’d like is something like the following.</p>
<p>A client will make a request meaning “give me the knit contents for these
revision IDs” (how the client determines which revision IDs it needs is
unimportant here).  In response, the server streams a single container of:</p>
<blockquote>
<div><ul class="simple">
<li>one record per file-id:revision-id knit gzip contents and graph data,</li>
<li>one record per inventory:revision-id knit gzip contents and graph
data,</li>
<li>one record per revision knit gzip contents,</li>
<li>one record per revision signature,</li>
<li>end marker record.</li>
</ul>
</div></blockquote>
<p>in that order.</p>
</div>
</div>
<div class="section" id="persistent-storage-on-disk">
<h3><a class="toc-backref" href="#id10">Persistent storage on disk</a><a class="headerlink" href="#persistent-storage-on-disk" title="Permalink to this headline">¶</a></h3>
<p>We want a storage format that allows lock-free writes, which suggests a
format built on two rules: <em>rename into place</em>, and <em>do not modify after writing</em>.</p>
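<p>One common way to get this behaviour is sketched below in Python 3: the data
is written to a temporary file in the target directory and then renamed over
the final path.  This is only an illustration of the general technique, not a
description of how Bazaar writes its files.  The function name
<code class="docutils literal notranslate"><span class="pre">write_container_file</span></code> and the temporary-file prefix are assumptions made
for the example.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre>
```python
import os
import tempfile


def write_container_file(path, data):
    """Write data to a temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-pack-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # After this call the file is never modified again.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```
</pre></div>
</div>
<p>Because the final name only ever refers to a fully written file, readers do
not need to take a lock to avoid seeing a partial write.</p>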
</div>
<div class="section" id="usable-before-deep-model-changes-to-bazaar">
<h3><a class="toc-backref" href="#id11">Usable before deep model changes to Bazaar</a><a class="headerlink" href="#usable-before-deep-model-changes-to-bazaar" title="Permalink to this headline">¶</a></h3>
<p>We want a format we can use and refine sooner rather than later.  So it
should be usable before the anticipated model changes for Bazaar “1.0”
land, while not conflicting with those changes either.</p>
<p>Specifically, we’d like to have this format in Bazaar 0.18.</p>
</div>
<div class="section" id="examples-of-possible-record-content">
<h3><a class="toc-backref" href="#id12">Examples of possible record content</a><a class="headerlink" href="#examples-of-possible-record-content" title="Permalink to this headline">¶</a></h3>
<blockquote>
<div><ul class="simple">
<li>full texts of file versions</li>
<li>deltas of full texts</li>
<li>revisions</li>
<li>inventories</li>
<li>inventory as tree items e.g. the inventory data for 20 files</li>
<li>revision signatures</li>
<li>per-file graph data</li>
<li>annotation cache</li>
</ul>
</div></blockquote>
</div>
</div>
<div class="section" id="characteristics">
<h2><a class="toc-backref" href="#id13">Characteristics</a><a class="headerlink" href="#characteristics" title="Permalink to this headline">¶</a></h2>
<p>Some key aspects of the described format are discussed in this section.</p>
<div class="section" id="no-length-prefixing-of-entire-container">
<h3><a class="toc-backref" href="#id14">No length-prefixing of entire container</a><a class="headerlink" href="#no-length-prefixing-of-entire-container" title="Permalink to this headline">¶</a></h3>
<p>The overall container is not length-prefixed.  Instead, an End Marker
tells readers when they have read the entire container.  Because the
writer never needs the total container size up front, this is compatible
with the goal of allowing single-pass writing.</p>
</div>
<div class="section" id="structured-as-a-self-contained-series-of-records">
<h3><a class="toc-backref" href="#id15">Structured as a self-contained series of records</a><a class="headerlink" href="#structured-as-a-self-contained-series-of-records" title="Permalink to this headline">¶</a></h3>
<p>The container contains a series of <em>records</em>.  Each record is
self-delimiting.  Record markers are lightweight.  The overhead in terms
of bytes and processing for records in this container vs. the raw contents
of those records is minimal.</p>
</div>
<div class="section" id="addressing-records">
<h3><a class="toc-backref" href="#id16">Addressing records</a><a class="headerlink" href="#addressing-records" title="Permalink to this headline">¶</a></h3>
<p>There is a requirement that each object can be given an arbitrary name.
Some version control systems address all content by the SHA-1 digest of
that content, but this scheme is unsatisfactory for Bazaar’s revision
objects.  We can still allow addressing by SHA-1 digest for those content
types where it makes sense.</p>
<p>Some proposed object names:</p>
<blockquote>
<div><ul class="simple">
<li>to name a revision: “<code class="docutils literal notranslate"><span class="pre">revision:</span></code><em>revision-id</em>”.  e.g.,
<cite>revision:pqm&#64;pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5</cite>.</li>
<li>to name an inventory delta: “<code class="docutils literal notranslate"><span class="pre">inventory.delta:</span></code><em>revision-id</em>”.  e.g.,
<cite>inventory.delta:pqm&#64;pqm.ubuntu.com-20070531210833-8ptk86ocu822hjd5</cite>.</li>
</ul>
</div></blockquote>
<p>It seems likely that we may want to have multiple names for an object.
This format allows that (by allowing multiple <code class="docutils literal notranslate"><span class="pre">name</span></code> headers in a Bytes
record).</p>
<p>Although records are in principle addressable by name, this specification
alone doesn’t provide for efficient access to a particular record given
its name.  It is intended that separate indexes will be maintained to
provide this.</p>
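<p>To make the idea of a separate index concrete, here is a minimal Python 3
sketch that scans a serialized container once and maps every name to the byte
offset of its record.  It relies on the record layout given in the
Specification section of this document.  The function
<code class="docutils literal notranslate"><span class="pre">build_name_index</span></code> and the use of an in-memory dictionary are assumptions
for illustration; this document does not define an index format.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre>
```python
LEAD_IN = b"Bazaar pack format 1 (introduced in 0.18)\n"


def build_name_index(data):
    """Map each record name to the byte offset of its kind marker."""
    if not data.startswith(LEAD_IN):
        raise ValueError("missing container lead-in")
    index = {}
    pos = len(LEAD_IN)
    while True:
        record_start = pos
        kind = data[pos:pos + 1]
        pos += 1
        if kind == b"E":
            return index
        if kind != b"B":
            raise ValueError("unknown or missing kind marker %r" % (kind,))
        # The first header line is the decimal content length.
        line_end = data.index(b"\n", pos)
        length = int(data[pos:line_end])
        pos = line_end + 1
        # The remaining header lines are names, up to the empty line.
        while True:
            line_end = data.index(b"\n", pos)
            name = data[pos:line_end]
            pos = line_end + 1
            if not name:
                break
            index[name.decode("utf-8")] = record_start
        # Skip the body without inspecting it.
        pos += length
```
</pre></div>
</div>
<p>A record with several names appears in the dictionary once per name, each
entry pointing at the same offset.  Records without names are skipped by the
scan and would need some other form of addressing.</p>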
<p>It is acceptable to have records with no explicit name, if their expected
use does not require one.  For example:</p>
<blockquote>
<div><ul class="simple">
<li>a record’s content could be self-describing in the context of a
particular container, or</li>
<li>a record could be accessed via an index based on SHA-1, or</li>
<li>when streaming, the first record could be treated specially.</li>
</ul>
</div></blockquote>
</div>
<div class="section" id="reasonably-cheap-for-small-records">
<h3><a class="toc-backref" href="#id17">Reasonably cheap for small records</a><a class="headerlink" href="#reasonably-cheap-for-small-records" title="Permalink to this headline">¶</a></h3>
<p>The overhead for storing fairly short records (tens of bytes, rather than
thousands or millions) is minimal.  The minimum overhead is 3 bytes plus
the length of the decimal representation of the <em>length</em> value (for a
record with no name).</p>
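<p>The arithmetic can be checked with a short Python 3 sketch that counts the
framing bytes of a Bytes record as laid out in the Specification section: one
kind marker, the decimal length and its newline, one line per name, and the
end-of-headers byte.  The helper <code class="docutils literal notranslate"><span class="pre">bytes_record_overhead</span></code> is a hypothetical
name used only for this example.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre>
```python
def bytes_record_overhead(content_length, names=()):
    """Framing bytes added around a record body (illustrative helper)."""
    overhead = 1                                    # kind marker "B"
    overhead += len(str(content_length)) + 1        # decimal length plus newline
    for name in names:
        overhead += len(name.encode("utf-8")) + 1   # each name plus newline
    overhead += 1                                   # end-of-headers newline
    return overhead


for size in (5, 40, 1234):
    print(size, bytes_record_overhead(size))
```
</pre></div>
</div>
<p>For unnamed records with bodies of 5, 40 and 1234 bytes this gives 4, 5 and
7 bytes of overhead respectively, which matches the stated minimum of 3 bytes
plus the number of decimal digits in the length.</p>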
</div>
</div>
<div class="section" id="specification">
<h2><a class="toc-backref" href="#id18">Specification</a><a class="headerlink" href="#specification" title="Permalink to this headline">¶</a></h2>
<p>This describes just a basic layer for storing a simple series of
“records”.  This layer has no intrinsic understanding of the contents of
those records.</p>
<p>The format is:</p>
<blockquote>
<div><ul class="simple">
<li>a <strong>container lead-in</strong>, “<code class="docutils literal notranslate"><span class="pre">Bazaar</span> <span class="pre">pack</span> <span class="pre">format</span> <span class="pre">1</span> <span class="pre">(introduced</span> <span class="pre">in</span> <span class="pre">0.18)\n</span></code>”,</li>
<li>followed by one or more <strong>records</strong>.</li>
</ul>
</div></blockquote>
<p>A record is:</p>
<blockquote>
<div><ul class="simple">
<li>a 1 byte <strong>kind marker</strong>.</li>
<li>0 or more bytes of record content, depending on the record type.</li>
</ul>
</div></blockquote>
<div class="section" id="record-types">
<h3><a class="toc-backref" href="#id19">Record types</a><a class="headerlink" href="#record-types" title="Permalink to this headline">¶</a></h3>
<div class="section" id="end-marker">
<h4><a class="toc-backref" href="#id20">End Marker</a><a class="headerlink" href="#end-marker" title="Permalink to this headline">¶</a></h4>
<p>An <strong>End Marker</strong> record:</p>
<blockquote>
<div><ul class="simple">
<li>has a kind marker of “<code class="docutils literal notranslate"><span class="pre">E</span></code>”,</li>
<li>no content bytes.</li>
</ul>
</div></blockquote>
<p>End Marker records signal the end of a container.</p>
</div>
<div class="section" id="bytes">
<h4><a class="toc-backref" href="#id21">Bytes</a><a class="headerlink" href="#bytes" title="Permalink to this headline">¶</a></h4>
<p>A <strong>Bytes</strong> record:</p>
<blockquote>
<div><ul>
<li><p class="first">has a kind marker of “<code class="docutils literal notranslate"><span class="pre">B</span></code>”,</p>
</li>
<li><p class="first">followed by a mandatory <strong>content length</strong> <a class="footnote-reference" href="#id2" id="id1">[1]</a>:
“<em>number</em><code class="docutils literal notranslate"><span class="pre">\n</span></code>”, where <em>number</em> is in decimal, e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="mi">1234</span>
</pre></div>
</div>
</li>
<li><p class="first">followed by zero or more optional <strong>names</strong>:
“<em>name</em><code class="docutils literal notranslate"><span class="pre">\n</span></code>”, e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">revision</span><span class="p">:</span><span class="n">pqm</span><span class="nd">@pqm</span><span class="o">.</span><span class="n">ubuntu</span><span class="o">.</span><span class="n">com</span><span class="o">-</span><span class="mi">20070531210833</span><span class="o">-</span><span class="mi">8</span><span class="n">ptk86ocu822hjd5</span>
</pre></div>
</div>
</li>
<li><p class="first">followed by an <strong>end of headers</strong> byte: “<code class="docutils literal notranslate"><span class="pre">\n</span></code>”,</p>
</li>
<li><p class="first">followed by some <strong>bytes</strong>, exactly as many as specified by the
<strong>content length</strong> header.</p>
</li>
</ul>
</div></blockquote>
<p>So a Bytes record is a series of lines encoding the length and names (if
any) followed by a body.</p>
<p>For example, this is a possible Bytes record (including the kind marker):</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">B26</span>
<span class="n">example</span><span class="o">-</span><span class="n">name1</span>
<span class="n">example</span><span class="o">-</span><span class="n">name2</span>

<span class="n">abcdefghijklmnopqrstuvwxyz</span>
</pre></div>
</div>
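<p>The layout above can be reproduced with a few lines of Python 3.  This is a
sketch of the wire format only, not Bazaar's own implementation; the function
names <code class="docutils literal notranslate"><span class="pre">serialize_bytes_record</span></code> and <code class="docutils literal notranslate"><span class="pre">serialize_container</span></code> are hypothetical.
It assumes names have already been checked against the rules in the Names
section.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre>
```python
LEAD_IN = b"Bazaar pack format 1 (introduced in 0.18)\n"


def serialize_bytes_record(body, names=()):
    """Encode one Bytes record: kind marker, length, names, blank line, body."""
    lines = [b"B" + str(len(body)).encode("ascii")]
    lines.extend(name.encode("utf-8") for name in names)
    # Joining with newlines and appending two more terminates the last
    # header line and adds the end-of-headers byte.
    return b"\n".join(lines) + b"\n\n" + body


def serialize_container(records):
    """Lead-in, then every (names, body) record, then the End Marker."""
    parts = [LEAD_IN]
    parts.extend(serialize_bytes_record(body, names) for names, body in records)
    parts.append(b"E")
    return b"".join(parts)


record = serialize_bytes_record(b"abcdefghijklmnopqrstuvwxyz",
                                ["example-name1", "example-name2"])
print(record)
```
</pre></div>
</div>
<p>The <code class="docutils literal notranslate"><span class="pre">record</span></code> value built at the end is byte-for-byte the example record
shown above.  Note that the body length must be known before the header is
written, which is the buffering cost described in the footnote.</p>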
</div>
</div>
<div class="section" id="names">
<h3><a class="toc-backref" href="#id22">Names</a><a class="headerlink" href="#names" title="Permalink to this headline">¶</a></h3>
<p>Names should be UTF-8 encoded strings, with no whitespace.  Names should
be unique within a single container, but no guarantee of uniqueness
outside of the container is made by this layer.  Names need to be at least
one character long.</p>
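<p>A writer might enforce these rules with a check along the following lines.
This Python 3 sketch is an assumption about how such validation could look:
the helper <code class="docutils literal notranslate"><span class="pre">check_names</span></code> is hypothetical, it treats the "should" rules as
hard errors, and it interprets "whitespace" as anything Python's
<code class="docutils literal notranslate"><span class="pre">str.isspace</span></code> reports, which this document does not specify.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre>
```python
def check_names(names):
    """Raise ValueError if any name breaks the rules (hypothetical helper)."""
    seen = set()
    for name in names:
        if not name:
            raise ValueError("names must be at least one character long")
        if any(ch.isspace() for ch in name):
            raise ValueError("name contains whitespace: %r" % (name,))
        # Raises UnicodeEncodeError (a ValueError subclass) if the text
        # cannot be represented as UTF-8.
        name.encode("utf-8")
        if name in seen:
            raise ValueError("duplicate name in container: %r" % (name,))
        seen.add(name)
```
</pre></div>
</div>
<p>Passing the names of every record in a container to one call checks
uniqueness across the whole container, not only within a single record.</p>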
<table class="docutils footnote" frame="void" id="id2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id1">[1]</a></td><td>This requires that the writer of a record knows the full length of
the record up front, which typically means it will need to buffer an
entire record in memory.  For the first version of this format this is
considered to be acceptable.</td></tr>
</tbody>
</table>
</div>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="footer" role="contentinfo">
        &#169; Copyright 2009-2011 Canonical Ltd.
      Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.8.4.
    </div>
  </body>
</html>