  <div class="section" id="garbage-collection">
<h1>Garbage Collection<a class="headerlink" href="#garbage-collection" title="Permalink to this headline">¶</a></h1>
<p>Garbage collection removes data that is no longer referenced from a repository.</p>
<p>Generally this involves locking the repository, scanning all its branches,
and then generating a new repository that holds only the referenced data.</p>
<div class="section" id="least-work-we-can-hope-to-perform">
<h2>Least work we can hope to perform<a class="headerlink" href="#least-work-we-can-hope-to-perform" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li>Read all branches to get the initial references: branch tips and tags.</li>
<li>Read through the revision graph to find unreferenced revisions. A cheap HEADS
list might help here by allowing comparison of the initial references to the
HEADS - any unreferenced head is garbage.</li>
<li>Walk out via inventory deltas to get the full set of texts and signatures to preserve.</li>
<li>Copy the preserved data to a new repository.</li>
<li>Swap the new repository into the place of the original (bait and switch).</li>
<li>Remove the old repository.</li>
</ul>
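<p>The reference-gathering and graph-walking steps can be sketched as a small mark phase.
This is only an illustration under stated assumptions, not the repository's actual API:
the revision graph is assumed to be a plain dict mapping each revision id to its parent ids,
<code>texts_by_revision</code> is a hypothetical stand-in for walking inventory deltas,
and all function names are invented for the example. Locking, copying and the final swap are left out.</p>

```python
from collections import deque


def reachable_revisions(parents, roots):
    """Return every revision reachable from roots by following parent links."""
    seen = set()
    # Ignore references to revisions that are absent from the graph.
    queue = deque(r for r in roots if r in parents)
    while queue:
        rev = queue.popleft()
        if rev in seen:
            continue
        seen.add(rev)
        queue.extend(p for p in parents[rev] if p in parents and p not in seen)
    return seen


def find_heads(parents):
    """Heads are revisions that no other revision names as a parent."""
    referenced_as_parent = {p for ps in parents.values() for p in ps}
    return set(parents) - referenced_as_parent


def plan_collection(parents, texts_by_revision, tips, tags):
    """Work out what to keep and what is garbage (hypothetical helper)."""
    roots = set(tips) | set(tags)
    # Any head that no branch tip or tag refers to is garbage.
    garbage_heads = find_heads(parents) - roots
    keep_revs = reachable_revisions(parents, roots)
    keep_texts = set()
    for rev in keep_revs:
        keep_texts.update(texts_by_revision.get(rev, ()))
    garbage_revs = set(parents) - keep_revs
    return keep_revs, keep_texts, garbage_revs, garbage_heads


if __name__ == "__main__":
    # Toy graph: c and e are heads, but only c is a branch tip.
    parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d"]}
    texts = {"a": {"t1"}, "b": {"t1", "t2"}, "c": {"t3"},
             "d": {"t4"}, "e": {"t4", "t5"}}
    print(plan_collection(parents, texts, tips=["c"], tags=["b"]))
```

<p>Only the revisions and texts in the keep sets would then be copied to the new repository.</p>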
<p>One way to reduce this work would be to keep a grouped set of &#8216;known garbage
free&#8217; data (&#8216;ancient history&#8217;) which can be preserved in its entirety if its HEADS
are all referenced, and whose HEADS list is deliberately cheap to read (e.g. at the
top of some index).</p>
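<p>A minimal sketch of that shortcut, assuming each group exposes its HEADS as a simple list
and that the set of referenced revisions is already known. The names
<code>group_is_fully_live</code> and <code>partition_groups</code> are hypothetical
and exist only for this illustration.</p>

```python
def group_is_fully_live(group_heads, referenced):
    """True when every head of the group is referenced, so the group can be kept whole."""
    return set(group_heads).issubset(referenced)


def partition_groups(groups, referenced):
    """Split groups into those kept in total and those needing a full scan.

    groups maps a group name to its (cheaply readable) list of heads.
    """
    keep_whole, need_scan = [], []
    for name, heads in sorted(groups.items()):
        if group_is_fully_live(heads, referenced):
            keep_whole.append(name)
        else:
            need_scan.append(name)
    return keep_whole, need_scan


if __name__ == "__main__":
    referenced = {"a", "b", "c"}
    groups = {"ancient": ["b"], "recent": ["c", "e"]}
    print(partition_groups(groups, referenced))
```

<p>Only the groups in the second list would need the full graph walk; the others are copied unchanged.</p>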
<p>Another possibility is to null out the garbage data in place, which would not save any space.</p>

