Sophie

Sophie

distrib > Mageia > 4 > x86_64 > by-pkgid > 4726f970c4b56b9a0ebb9a03a0b6522e > files > 120

python-tables-doc-3.0.0-4.mga4.noarch.rpm




<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Hints for SQL users &mdash; PyTables 3.0.0 documentation</title>
    
    <link rel="stylesheet" href="../_static/cloud.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="../" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '3.0.0',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <script type="text/javascript" src="../_static/jquery.cookie.js"></script>
    <script type="text/javascript" src="../_static/toggle_sections.js"></script>
    <script type="text/javascript" src="../_static/toggle_sidebar.js"></script>
    <link rel="shortcut icon" href="../_static/favicon.ico"/>
    <link rel="top" title="PyTables 3.0.0 documentation" href="../index.html" />
    <link rel="up" title="PyTables Cookbook" href="index.html" />
    <link rel="next" title="How to integrate PyTables in your application by using py2exe" href="py2exe_howto.html" />
    <link rel="prev" title="PyTables Cookbook" href="index.html" /> 
  </head>
  <body>
    <div class="relbar-top">
        
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> &nbsp; &nbsp;</li>
        <li class="right" >
          <a href="../np-modindex.html" title="Python Module Index"
             >modules</a> &nbsp; &nbsp;</li>
        <li class="right" >
          <a href="py2exe_howto.html" title="How to integrate PyTables in your application by using py2exe"
             accesskey="N">next</a> &nbsp; &nbsp;</li>
        <li class="right" >
          <a href="index.html" title="PyTables Cookbook"
             accesskey="P">previous</a> &nbsp; &nbsp;</li>
    <li><a href="../index.html">PyTables 3.0.0 documentation</a> &raquo;</li>

          <li><a href="index.html" accesskey="U">PyTables Cookbook</a> &raquo;</li> 
      </ul>
    </div>
    </div>
  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="hints-for-sql-users">
<h1>Hints for SQL users<a class="headerlink" href="#hints-for-sql-users" title="Permalink to this headline">¶</a></h1>
<p>This page is intended to be <strong>a guide to new PyTables for users who are used
to writing SQL code</strong> to access their relational databases.
It will cover the most usual SQL statements.
If you are missing a particular statement or usage example, you can ask at the
<a class="reference external" href="https://lists.sourceforge.net/lists/listinfo/pytables-users">PyTables users&#8217; list</a> <a class="footnote-reference" href="#id1" id="id2">[1]</a> for it.
If you know some examples yourself, you can also write them here!</p>
<p>This page is under development: you can come back frequently to check for new
examples.
Also, this is no replacement for the <a class="reference external" href="http://www.pytables.org/docs/manual">User&#8217;s Guide</a> <a class="footnote-reference" href="#id3" id="id4">[2]</a>;
if you don&#8217;t read the manual, you&#8217;ll be missing lots of features not available
in relational databases!</p>
<p>Examples in Python assume that you have imported the PyTables package like
this:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">tables</span>
</pre></div>
</div>
<div class="section" id="creating-a-new-database">
<h2>Creating a new database<a class="headerlink" href="#creating-a-new-database" title="Permalink to this headline">¶</a></h2>
<p>RDBMs happen to have several syntaxes for creating a database.
A usual syntax is:</p>
<div class="highlight-python"><pre>CREATE DATABASE database_name</pre>
</div>
<p>In PyTables, each database goes to a different <a class="reference external" href="http://www.hdfgroup.org/HDF5">HDF5</a> <a class="footnote-reference" href="#id6" id="id7">[3]</a> file (much like
<a class="reference external" href="http://www.sqlite.org">SQLite</a> <a class="footnote-reference" href="#id9" id="id10">[4]</a> or MS Access).
To create a new <a class="reference external" href="http://www.hdfgroup.org/HDF5">HDF5</a> <a class="footnote-reference" href="#id6" id="id8">[3]</a> file, you use the <tt class="xref py py-func docutils literal"><span class="pre">tables.openFile()</span></tt> function with
the <cite>&#8216;w&#8217;</cite> mode (which deletes the database if it already exists), like this:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">h5f</span> <span class="o">=</span> <span class="n">tables</span><span class="o">.</span><span class="n">openFile</span><span class="p">(</span><span class="s">&#39;database_name.h5&#39;</span><span class="p">,</span> <span class="s">&#39;w&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>In this way you get the <cite>h5f</cite> PyTables <em>file handleé (an instance of the
:class:`tables.File` class), which is a concept similar to a *database
connection</em>, and a new <tt class="file docutils literal"><span class="pre">database_name.h5</span></tt> file is created in the current
directory (you can use full paths here).
You can close the handle (like you close the connection) with:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">h5f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
<p>This is important for PyTables to dump pending changes to the database.
In case you forget to do it, PyTables closes all open database handles for
you when you exit your program or interactive session, but it is always safer
to close your files explicitly.
If you want to use the database after closing it, you just call
<tt class="xref py py-func docutils literal"><span class="pre">openFile()</span></tt> again, but using the <cite>&#8216;r+&#8217;</cite> or <cite>&#8216;r&#8217;</cite> modes, depending on
whether you do or don&#8217;t need to modify the database, respectively.</p>
<p>You may use several PyTables databases simultaneously in a program, so you
must be explicit on which database you want to act upon (by using its handle).</p>
<div class="section" id="a-note-on-concurrency-under-pytables">
<h3>A note on concurrency under PyTables<a class="headerlink" href="#a-note-on-concurrency-under-pytables" title="Permalink to this headline">¶</a></h3>
<p>Unlike most RDBMs, PyTables is not intended to serve concurrent accesses to a
database.
It has no protections whatsoever against corruption for different (or even the
same) programs accessing the same database.
Opening several handles to the same database in read-only mode is safe, though.</p>
</div>
</div>
<div class="section" id="creating-a-table">
<h2>Creating a table<a class="headerlink" href="#creating-a-table" title="Permalink to this headline">¶</a></h2>
<p>PyTables supports some other <em>datasets</em> besides tables, and they&#8217;re not
arranged in a flat namespace, but rather into a <em>hierarchicalé one (see an
introduction to the _ref:`object tree &lt;ObjectTreeSection&gt;`);
however, due to the nature of these recipes, we&#8217;ll limit ourselves to tables
in the *root group</em>.
The basic syntax for table creation under SQL is:</p>
<div class="highlight-python"><pre>CREATE TABLE table_name (
    column_name1 column_type1,
    column_name2 column_type2,
    ...
    column_nameN column_typeN
)</pre>
</div>
<div class="section" id="table-descriptions">
<h3>Table descriptions<a class="headerlink" href="#table-descriptions" title="Permalink to this headline">¶</a></h3>
<p>In PyTables, one first <em>describes</em> the structure of a table.
PyTables allows you to <em>reuse a description</em> for creating several tables with
the same structure, just by using the description object (<cite>description_name</cite>
below) or getting it from a created table.
This is specially useful for creating temporary tables holding query results.</p>
<p>You can create a table description using a dictionary:</p>
<div class="highlight-python"><pre>description_name = {
    'column_name1': colum_type1,
    'column_name2': colum_type2,
    'column_name3': colum_type3,
    ...
    'column_nameN': colum_typeN
}</pre>
</div>
<p>or a subclass of <a class="reference internal" href="../usersguide/libref/declarative_classes.html#tables.IsDescription" title="tables.IsDescription"><tt class="xref py py-class docutils literal"><span class="pre">tables.IsDescription</span></tt></a>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">class</span> <span class="nc">description_name</span><span class="p">(</span><span class="n">tables</span><span class="o">.</span><span class="n">IsDescription</span><span class="p">):</span>
    <span class="n">column_name1</span> <span class="o">=</span> <span class="n">colum_type1</span>
    <span class="n">column_name2</span> <span class="o">=</span> <span class="n">colum_type2</span>
    <span class="n">column_name3</span> <span class="o">=</span> <span class="n">colum_type3</span>
    <span class="o">...</span>
    <span class="n">column_nameN</span> <span class="o">=</span> <span class="n">colum_typeN</span>
</pre></div>
</div>
<p>Please note that dictionaries are the only way of describing structures with
names which cannot be Python identifiers.
Also, if an explicit order is desired for colums, it must be specified through
the column type declarations (see below), since dictionariy keys and class
attributes aren&#8217;t ordered.
Otherwise, columns are ordered in alphabetic increasing order.
It is important to note that PyTables doesn&#8217;t have a concept of primary or
foreign keys, so relationships between tables are left to the user.</p>
</div>
<div class="section" id="column-type-declarations">
<h3>Column type declarations<a class="headerlink" href="#column-type-declarations" title="Permalink to this headline">¶</a></h3>
<p>PyTables supports lots of types (including nested and multidimensional
columns).
Non-nested columns are declared through instances of <a class="reference internal" href="../usersguide/libref/declarative_classes.html#tables.Col" title="tables.Col"><tt class="xref py py-class docutils literal"><span class="pre">tables.Col</span></tt></a>
subclasses (which you can also reuse).
These are some correspondences with SQL:</p>
<table border="1" class="docutils">
<colgroup>
<col width="43%" />
<col width="57%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">SQL type declaration</th>
<th class="head">PyTables type declaration</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>INTEGER(digits)</td>
<td>tables.IntCol(itemsize)</td>
</tr>
<tr class="row-odd"><td>REAL</td>
<td>tables.FloatCol()</td>
</tr>
<tr class="row-even"><td>VARCHAR(length)</td>
<td>tables.StringCol(itemsize)</td>
</tr>
<tr class="row-odd"><td>DATE</td>
<td>tables.Time32Col()</td>
</tr>
<tr class="row-even"><td>TIMESTAMP</td>
<td>tables.Time64Col()</td>
</tr>
</tbody>
</table>
<p>See a complete description of <a class="reference internal" href="../usersguide/datatypes.html#datatypes"><em>PyTables types</em></a>.
Note that some types admit different <em>item sizes</em>, which are specified in
bytes.
For types with a limited set of supported item sizes, you may also use specific
subclasses which are named after the type and its <em>precision</em>, e.g. <cite>Int32Col</cite>
for 4-byte (32 bit) item size.</p>
<p>Cells in a PyTables&#8217; table always have a value of the cell type, so there is
no <cite>NULL</cite>.
Instead, cells take a <em>default value</em> (zero or empty) which can be changed in
the type declaration, like this: <cite>col_name = StringCol(10, dflt=&#8217;nothing&#8217;)</cite>
(<cite>col_name</cite> takes the value <cite>&#8216;nothing&#8217;</cite> if unset).
The declaration also allows you to set <em>column order</em> via the <cite>pos</cite> argument,
like this:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">class</span> <span class="nc">ParticleDescription</span><span class="p">(</span><span class="n">tables</span><span class="o">.</span><span class="n">IsDescription</span><span class="p">):</span>
    <span class="n">name</span> <span class="o">=</span> <span class="n">tables</span><span class="o">.</span><span class="n">StringCol</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">pos</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">tables</span><span class="o">.</span><span class="n">FloatCol</span><span class="p">(</span><span class="n">pos</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
    <span class="n">y</span> <span class="o">=</span> <span class="n">tables</span><span class="o">.</span><span class="n">FloatCol</span><span class="p">(</span><span class="n">pos</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
    <span class="n">temperature</span> <span class="o">=</span> <span class="n">tables</span><span class="o">.</span><span class="n">FloatCol</span><span class="p">(</span><span class="n">pos</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
</pre></div>
</div>
<p>== Using a description ==</p>
<p>Once you have a table description <cite>description_name</cite> and a writeable file
handle <cite>h5f</cite>, creating a table with that description is as easy as:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span> <span class="o">=</span> <span class="n">h5f</span><span class="o">.</span><span class="n">createTable</span><span class="p">(</span><span class="s">&#39;/&#39;</span><span class="p">,</span> <span class="s">&#39;table_name&#39;</span><span class="p">,</span> <span class="n">description_name</span><span class="p">)</span>
</pre></div>
</div>
<p>PyTables is very object-oriented, and database is usually done through
methods of <a class="reference internal" href="../usersguide/libref/file_class.html#tables.File" title="tables.File"><tt class="xref py py-class docutils literal"><span class="pre">tables.File</span></tt></a>.
The first argument indicates the <em>path</em> where the table will be created,
i.e. the root path (HDF5 uses Unix-like paths).
The <tt class="xref py py-meth docutils literal"><span class="pre">tables.File.createTable()</span></tt> method has many options e.g. for setting
a table title or compression properties. What you get back is an instance of
<a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table" title="tables.Table"><tt class="xref py py-class docutils literal"><span class="pre">tables.Table</span></tt></a>, a handle for accessing the data in that table.</p>
<p>As with files, table handles can also be closed with <cite>tbl.close()</cite>.
If you want to acces an already created table, you can use:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span> <span class="o">=</span> <span class="n">h5f</span><span class="o">.</span><span class="n">getNode</span><span class="p">(</span><span class="s">&#39;/&#39;</span><span class="p">,</span> <span class="s">&#39;table_name&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>(PyTables uses the concept of <em>node</em> for datasets -tables and others- and
groups in the object tree) or, using <em>natural naming</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span> <span class="o">=</span> <span class="n">h5f</span><span class="o">.</span><span class="n">root</span><span class="o">.</span><span class="n">table_name</span>
</pre></div>
</div>
<p>Once you have created a table, you can access (and reuse) its description by
accessing the <cite>description</cite> attribute of its handle.</p>
</div>
</div>
<div class="section" id="creating-an-index">
<h2>Creating an index<a class="headerlink" href="#creating-an-index" title="Permalink to this headline">¶</a></h2>
<p>RDBMs use to allow named indexes on any set of columns (or all of them) in a
table, using a syntax like:</p>
<div class="highlight-python"><pre>CREATE INDEX index_name
ON table_name (column_name1, column_name2, column_name3...)</pre>
</div>
<p>and</p>
<blockquote>
<div>DROP INDEX index_name</div></blockquote>
<p>Indexing is supported in the commercial version of PyTables (PyTablesPro).
However, indexes don&#8217;t have names and they are bound to single columns.
Following the object-oriented philosophy of PyTables, index creation is a
method (<tt class="xref py py-meth docutils literal"><span class="pre">tables.Column.createIndex()</span></tt>) of a <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Column" title="tables.Column"><tt class="xref py py-class docutils literal"><span class="pre">tables.Column</span></tt></a> object
of a table, which you can access trough its <cite>cols</cite> accessor.</p>
<dl class="docutils">
<dt>::</dt>
<dd>tbl.cols.colum_name.createIndex()</dd>
</dl>
<p>For dropping an index on a column:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span><span class="o">.</span><span class="n">cols</span><span class="o">.</span><span class="n">colum_name</span><span class="o">.</span><span class="n">removeIndex</span><span class="p">()</span>
</pre></div>
</div>
</div>
<div class="section" id="altering-a-table">
<h2>Altering a table<a class="headerlink" href="#altering-a-table" title="Permalink to this headline">¶</a></h2>
<p>The first case of table alteration is renaming:</p>
<div class="highlight-python"><pre>ALTER TABLE old_name RENAME TO new_name</pre>
</div>
<p>This is accomplished in !PyTables with:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">h5f</span><span class="o">.</span><span class="n">renameNode</span><span class="p">(</span><span class="s">&#39;/&#39;</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">&#39;old_name&#39;</span><span class="p">,</span> <span class="n">newname</span><span class="o">=</span><span class="s">&#39;new_name&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>or through the table handle:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span><span class="o">.</span><span class="n">rename</span><span class="p">(</span><span class="s">&#39;new_name&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>A handle to a table is still usable after renaming.
The second alteration, namely column addition, is currently not supported in
PyTables.</p>
</div>
<div class="section" id="dropping-a-table">
<h2>Dropping a table<a class="headerlink" href="#dropping-a-table" title="Permalink to this headline">¶</a></h2>
<p>In SQL you can remove a table using:</p>
<div class="highlight-python"><pre>DROP TABLE table_name</pre>
</div>
<p>In PyTables, tables are removed as other nodes, using the
<tt class="xref py py-meth docutils literal"><span class="pre">tables.File.removeNode()</span></tt> method:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">h5f</span><span class="o">.</span><span class="n">removeNode</span><span class="p">(</span><span class="s">&#39;/&#39;</span><span class="p">,</span> <span class="s">&#39;table_name&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>or through the table handle:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span><span class="o">.</span><span class="n">remove</span><span class="p">()</span>
</pre></div>
</div>
<p>When you remove a table, its associated indexes are automatically removed.</p>
</div>
<div class="section" id="inserting-data">
<h2>Inserting data<a class="headerlink" href="#inserting-data" title="Permalink to this headline">¶</a></h2>
<p>In SQL you can insert data one row at a time (fetching from a selection will
be covered later) using a syntax like:</p>
<div class="highlight-python"><pre>INSERT INTO table_name (column_name1, column_name2...)
VALUES (value1, value2...)</pre>
</div>
<p>In PyTables, rows in a table form a <em>sequence</em>, so data isn&#8217;t <em>inserted</em> into
a set, but rather <em>appended</em> to the end of the sequence.
This also implies that identical rows may exist in a table (but they have a
different <em>row number</em>).
There are two ways of appending rows: one at a time or in a block.
The first one is conceptually similar to the SQL case:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span><span class="o">.</span><span class="n">row</span><span class="p">[</span><span class="s">&#39;column_name1&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">value1</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">row</span><span class="p">[</span><span class="s">&#39;column_name2&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">value2</span>
<span class="o">...</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">row</span><span class="o">.</span><span class="n">append</span><span class="p">()</span>
</pre></div>
</div>
<p>The <cite>tbl.row</cite> accessor represents a <em>new row</em> in the table.
You just set the values you want to set (the others take the default value
from their column declarations - see above) and the effectively append the
new row.
This code is usually enclosed in some kind of loop, like:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">row</span> <span class="o">=</span> <span class="n">tbl</span><span class="o">.</span><span class="n">row</span>
<span class="k">while</span> <span class="n">some_condition</span><span class="p">:</span>
    <span class="n">row</span><span class="p">[</span><span class="s">&#39;column_name1&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">value1</span>
    <span class="o">...</span>
    <span class="n">row</span><span class="o">.</span><span class="n">append</span><span class="p">()</span>
</pre></div>
</div>
<p>For appending a block of rows in a single shot, <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.append" title="tables.Table.append"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.append()</span></tt></a>
is more adequate.
You just pass a <a class="reference external" href="http://www.numpy.org">NumPy</a> <a class="footnote-reference" href="#id11" id="id12">[5]</a> record array or Python sequence with elements which
match the expected columns.
For example, given the <cite>tbl</cite> handle for a table with the <cite>ParticleDescription</cite>
structure described above:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">(</span><span class="s">&#39;foo&#39;</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">150.0</span><span class="p">),</span>
    <span class="p">(</span><span class="s">&#39;bar&#39;</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">100.0</span><span class="p">),</span>
    <span class="p">(</span><span class="s">&#39;foo&#39;</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span>  <span class="mf">25.0</span><span class="p">)</span>
<span class="p">]</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">rows</span><span class="p">)</span>

<span class="c"># Using a NumPy container.</span>
<span class="kn">import</span> <span class="nn">numpy</span>
<span class="n">rows</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">rec</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">rows</span><span class="p">)</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">rows</span><span class="p">)</span>
</pre></div>
</div>
<div class="section" id="a-note-on-transactions">
<h3>A note on transactions<a class="headerlink" href="#a-note-on-transactions" title="Permalink to this headline">¶</a></h3>
<p>PyTables doesn&#8217;t support transactions nor checkpointing or rolling back (there
is undo support for operations performed on the object tree, but this is
unrelated).
Changes to the database are optimised for maximum performance and reasonable
memory requirements, which means that you can&#8217;t tell whether e.g.
<cite>tbl.append()</cite> has actually committed all, some or no data to disk when it ends.</p>
<p>However, you can <em>force</em> PyTables to commit changes to disk using the <cite>flush()</cite>
method of table and file handles:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span><span class="o">.</span><span class="n">flush</span><span class="p">()</span>  <span class="c"># flush data in the table</span>
<span class="n">h5f</span><span class="o">.</span><span class="n">flush</span><span class="p">()</span>  <span class="c"># flush all pending data</span>
</pre></div>
</div>
<p>Closing a table or a database actually flushes it, but it is recommended that
you explicitly flush frequently (specially with tables).</p>
</div>
</div>
<div class="section" id="updating-data">
<h2>Updating data<a class="headerlink" href="#updating-data" title="Permalink to this headline">¶</a></h2>
<p>We&#8217;re now looking for alternatives to the SQL <cite>UPDATE</cite> statement:</p>
<div class="highlight-python"><pre>UPDATE table_name
SET column_name1 = expression1, column_name2 = expression2...
[WHERE condition]</pre>
</div>
<p>There are different ways of approaching this, depending on your needs.
If you aren&#8217;t using a condition, then the <cite>SET</cite> clause updates all rows,
something you can do in PyTables by iterating over the table:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">tbl</span><span class="p">:</span>
    <span class="n">row</span><span class="p">[</span><span class="s">&#39;column_name1&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">expression1</span>
    <span class="n">row</span><span class="p">[</span><span class="s">&#39;column_name2&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">expression2</span>
    <span class="o">...</span>
    <span class="n">row</span><span class="o">.</span><span class="n">update</span><span class="p">()</span>
</pre></div>
</div>
<p>Don&#8217;t forget to call <cite>update()</cite> or no value will be changed!
Also, since the used iterator allows you to read values from the current row,
you can implement a simple <em>conditional update</em>, like this:</p>
<div class="highlight-python"><pre>for row in tbl:
    if condition on row['column_name1'], row['column_name2']...:
        row['column_name1'] = expression1
        row['column_name2'] = expression2
        ...
        row.update()</pre>
</div>
<p>There are substantially more efficient ways of locating rows fulfilling a
condition.
Given the main PyTables usage scenarios, querying and modifying data are
quite decoupled operations, so we will have a look at querying later and
assume that you already know the set of rows you want to update.</p>
<p>If the set happens to be a slice of the table, you may use the
:<cite>meth:`tables.Table.modifyRows</cite> method or its equivalent
<a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.__setitem__" title="tables.Table.__setitem__"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.__setitem__()</span></tt></a> notation:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">(</span><span class="s">&#39;foo&#39;</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">150.0</span><span class="p">),</span>
    <span class="p">(</span><span class="s">&#39;bar&#39;</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">100.0</span><span class="p">),</span>
    <span class="p">(</span><span class="s">&#39;foo&#39;</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span>  <span class="mf">25.0</span><span class="p">)</span>
<span class="p">]</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">modifyRows</span><span class="p">(</span><span class="n">start</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="mi">13</span><span class="p">,</span> <span class="n">step</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">rows</span><span class="o">=</span><span class="n">rows</span><span class="p">)</span>
<span class="n">tbl</span><span class="p">[</span><span class="mi">6</span><span class="p">:</span><span class="mi">13</span><span class="p">:</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">rows</span>  <span class="c"># this is the same</span>
</pre></div>
</div>
<p>If you just want to update some columns in the slice, use the
<tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.modifyColumns()</span></tt> or <tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.modifyColumn()</span></tt>
methods:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">cols</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">[</span><span class="mf">150.0</span><span class="p">,</span> <span class="mf">100.0</span><span class="p">,</span> <span class="mf">25.0</span><span class="p">]</span>
<span class="p">]</span>
<span class="c"># These are all equivalent.</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">modifyColumns</span><span class="p">(</span><span class="n">start</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="mi">13</span><span class="p">,</span> <span class="n">step</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="n">cols</span><span class="p">,</span> <span class="n">names</span><span class="o">=</span><span class="p">[</span><span class="s">&#39;temperature&#39;</span><span class="p">])</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">modifyColumn</span><span class="p">(</span><span class="n">start</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="mi">13</span><span class="p">,</span> <span class="n">step</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">column</span><span class="o">=</span><span class="n">cols</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">colname</span><span class="o">=</span><span class="s">&#39;temperature&#39;</span><span class="p">)</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">cols</span><span class="o">.</span><span class="n">temperature</span><span class="p">[</span><span class="mi">6</span><span class="p">:</span><span class="mi">13</span><span class="p">:</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">cols</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</pre></div>
</div>
<p>The last line shows an example of using the <cite>cols</cite> accessor to get to the
desired <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Column" title="tables.Column"><tt class="xref py py-class docutils literal"><span class="pre">tables.Column</span></tt></a> of the table using natural naming and apply
<cite>setitem</cite> on it.</p>
<p>If the set happens to be an array of sparse coordinates, you can also use
PyTables&#8217; extended slice notation:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">(</span><span class="s">&#39;foo&#39;</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">150.0</span><span class="p">),</span>
    <span class="p">(</span><span class="s">&#39;bar&#39;</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">100.0</span><span class="p">),</span>
    <span class="p">(</span><span class="s">&#39;foo&#39;</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span>  <span class="mf">25.0</span><span class="p">)</span>
<span class="p">]</span>
<span class="n">rownos</span> <span class="o">=</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">735</span><span class="p">,</span> <span class="mi">371913476</span><span class="p">]</span>
<span class="n">tbl</span><span class="p">[</span><span class="n">rownos</span><span class="p">]</span> <span class="o">=</span> <span class="n">rows</span>
</pre></div>
</div>
<p>instead of the traditional:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">for</span> <span class="n">row_id</span><span class="p">,</span> <span class="n">datum</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">rownos</span><span class="p">,</span> <span class="n">rows</span><span class="p">):</span>
     <span class="n">tbl</span><span class="p">[</span><span class="n">row_id</span><span class="p">]</span> <span class="o">=</span> <span class="n">datum</span>
</pre></div>
</div>
<p>Since you are modifying table data in all cases, you should also remember to
<cite>flush()</cite> the table when you&#8217;re done.</p>
</div>
<div class="section" id="deleting-data">
<h2>Deleting data<a class="headerlink" href="#deleting-data" title="Permalink to this headline">¶</a></h2>
<p>Rows are deleted from a table with the following SQL syntax:</p>
<div class="highlight-python"><pre>DELETE FROM table_name
[WHERE condition]</pre>
</div>
<p><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.removeRows()</span></tt> is the method used for deleting rows in
PyTables.
However, it is very simple (only contiguous blocks of rows can be deleted) and
quite inefficient, and one should consider whether <em>dumping filtered data from
one table into another</em> isn&#8217;t a much more convenient approach.
This is a far more optimized operation under PyTables which will be covered
later.</p>
<p>Anyway, using <cite>removeRows()</cite> is quite straightforward:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">tbl</span><span class="o">.</span><span class="n">removeRows</span><span class="p">(</span><span class="mi">12</span><span class="p">)</span>  <span class="c"># delete one single row (12)</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">removeRows</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span> <span class="mi">20</span><span class="p">)</span>  <span class="c"># delete all rows from 12 to 19 (included)</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">removeRows</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">tbl</span><span class="o">.</span><span class="n">nrows</span><span class="p">)</span>  <span class="c"># delete all rows unconditionally</span>
<span class="n">tbl</span><span class="o">.</span><span class="n">removeRows</span><span class="p">(</span><span class="o">-</span><span class="mi">4</span><span class="p">,</span> <span class="n">tbl</span><span class="o">.</span><span class="n">nrows</span><span class="p">)</span>  <span class="c"># delete the last 4 rows</span>
</pre></div>
</div>
</div>
<div class="section" id="reading-data">
<h2>Reading data<a class="headerlink" href="#reading-data" title="Permalink to this headline">¶</a></h2>
<p>The most basic syntax in SQL for reading rows in a table without using a
condition is:</p>
<div class="highlight-python"><pre>SELECT (column_name1, column_name2... | *) FROM table_name</pre>
</div>
<p>Which reads all rows (though maybe not all columns) from a table.
In PyTables there are two ways of retrieving data: <em>iteratively</em> or <em>at once</em>.
You&#8217;ll notice some similarities with how we appended and updated data above,
since this dichotomy is widespread here.</p>
<p>For a clearer separation with conditional queries (covered further below),
and since the concept of <em>row number</em> doesn&#8217;t exist in relational databases,
we&#8217;ll be including here the cases where you want to read a <strong>known</strong> <em>slice</em>
or <em>sequence</em> of rows, besides the case of reading <em>all</em> rows.</p>
<div class="section" id="iterating-over-rows">
<h3>Iterating over rows<a class="headerlink" href="#iterating-over-rows" title="Permalink to this headline">¶</a></h3>
<p>This is similar to using the <cite>fetchone()</cite> method of a DB <cite>cursor</cite> in a
<a class="reference external" href="http://www.python.org/dev/peps/pep-0249">Python DBAPI</a> <a class="footnote-reference" href="#id14" id="id15">[6]</a>-compliant package, i.e. you <em>iterate</em> over the list of wanted
rows, getting one <em>row handle</em> at a time.
In this case, the handle is an instance of the <tt class="xref py py-class docutils literal"><span class="pre">tables.Row</span></tt> class,
which allows access to individual columns as items acessed by key (so there
is no special way of selecting columns: you just use the ones you want
whenever you want).</p>
<p>This way of reading rows is recommended when you want to perform operations
on individual rows in a simple manner, and specially if you want to process
a lot of rows in the table (i.e. when loading them all at once would take too
much memory).
Iterators are also handy for using with the <cite>itertools</cite> Python module for
grouping, sorting and other operations.</p>
<p>For iterating over <em>all</em> rows, use plain iteration or the
<a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.iterrows" title="tables.Table.iterrows"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.iterrows()</span></tt></a> method:</p>
<div class="highlight-python"><pre>for row in tbl:  # or tbl.iterrows()
    do something with row['column_name1'], row['column_name2']...</pre>
</div>
<p>For iterating over a <em>slice</em> of rows, use the
<tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.iterrows|Table.iterrows()</span></tt> method:</p>
<div class="highlight-python"><pre>for row in tbl.iterrows(start=6, stop=13, step=3):
    do something with row['column_name1'], row['column_name2']...</pre>
</div>
<p>For iterating over a <em>sequence</em> of rows, use the
<a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.itersequence" title="tables.Table.itersequence"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.itersequence()</span></tt></a> method:</p>
<div class="highlight-python"><pre>for row in tbl.itersequence([6, 7, 9, 11]):
    do something with row['column_name1'], row['column_name2']...</pre>
</div>
</div>
<div class="section" id="reading-rows-at-once">
<h3>Reading rows at once<a class="headerlink" href="#reading-rows-at-once" title="Permalink to this headline">¶</a></h3>
<p>In contrast with iteration, you can fetch all desired rows into a single
<em>container</em> in memory (usually an efficient <a class="reference external" href="http://www.numpy.org">NumPy</a> <a class="footnote-reference" href="#id11" id="id13">[5]</a> record-array) in a single
operation, like the <cite>fetchall()</cite> or <cite>fetchmany()</cite> methods of a DBAPI <cite>cursor</cite>.
This is specially useful when you want to transfer the read data to another
component in your program, avoiding loops to construct your own containers.
However, you should be careful about the amount of data you are fetching into
memory, since it can be quite large (and even exceed its physical capacity).</p>
<p>You can choose between the <cite>Table.read*()</cite> methods or the
<a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.__getitem__" title="tables.Table.__getitem__"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.__getitem__()</span></tt></a> syntax for this kind of reads.
The <cite>read*()</cite> methods offer you the chance to choose a single column to read
via their <cite>field</cite> argument (which isn&#8217;t still as powerful as the SQL <cite>SELECT</cite>
column spec).</p>
<p>For reading <em>all</em> rows, use <cite>[:]</cite> or the <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.read" title="tables.Table.read"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.read()</span></tt></a> method:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="n">tbl</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">rows</span> <span class="o">=</span> <span class="n">tbl</span><span class="p">[:]</span>  <span class="c"># equivalent</span>
</pre></div>
</div>
<p>For reading a <em>slice</em> of rows, use <cite>[slice]</cite> or the
<tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.read|Table.read()</span></tt> method:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="n">tbl</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">start</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="mi">13</span><span class="p">,</span> <span class="n">step</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
<span class="n">rows</span> <span class="o">=</span> <span class="n">tbl</span><span class="p">[</span><span class="mi">6</span><span class="p">:</span><span class="mi">13</span><span class="p">:</span><span class="mi">3</span><span class="p">]</span>  <span class="c"># equivalent</span>
</pre></div>
</div>
<p>For reading a <em>sequence</em> of rows, use the <tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.readCoordinates()</span></tt>
method:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="n">tbl</span><span class="o">.</span><span class="n">readCoordinates</span><span class="p">([</span><span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">11</span><span class="p">])</span>
</pre></div>
</div>
<p>Please note that you can add a <cite>field=&#8217;column_name&#8217;</cite> argument to <cite>read*()</cite>
methods in order to get only the given column instead of them all.</p>
</div>
</div>
<div class="section" id="selecting-data">
<h2>Selecting data<a class="headerlink" href="#selecting-data" title="Permalink to this headline">¶</a></h2>
<p>When you want to read a subset of rows which match a given condition from a
table you use a syntax like this in SQL:</p>
<div class="highlight-python"><pre>SELECT column_specification FROM table_name
WHERE condition</pre>
</div>
<p>The <cite>condition</cite> is an expression yielding a boolean value based on a
combination of column names and constants with functions and operators.
If the condition holds true for a given row, the <cite>column_specification</cite> is
applied on it and the resulting row is added to the result.</p>
<p>In PyTables, you may filter rows using two approaches: the first one is
achieved through standard Python comparisons (similar to what we used for
conditional update), like this:</p>
<div class="highlight-python"><pre>for row in tbl:
    if condition on row['column_name1'], row['column_name2']...:
        do something with row</pre>
</div>
<p>This is easy for newcomers, but not very efficient. That&#8217;s why PyTables offers
another approach: <strong>in-kernel</strong> searches, which are much more efficient than
standard searches, and can take advantage of indexing (under PyTables &gt;= 2.3).</p>
<p>In-kernel searches are used through the <em>where methods</em> in <cite>Table</cite>, which are
passed a <em>condition string</em> describing the condition in a Python-like syntax.
For instance, with the <cite>ParticleDescription</cite> we defined above, we may specify
a condition for selecting particles at most 1 unit apart from the origin with
a temperature under 100 with a condition string like this one:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="s">&#39;(sqrt(x**2 + y**2) &lt;= 1) &amp; (temperature &lt; 100)&#39;</span>
</pre></div>
</div>
<p>Where <cite>x</cite>, <cite>y</cite> and <cite>temperature</cite> are the names of columns in the table.
The operators and functions you may use in a condition string are described
in the <a class="reference internal" href="../usersguide/condition_syntax.html#condition-syntax"><em>appendix on condition syntax</em></a> in the
<a class="reference external" href="http://www.pytables.org/docs/manual">User&#8217;s Guide</a> <a class="footnote-reference" href="#id3" id="id5">[2]</a>.</p>
<div class="section" id="iterating-over-selected-rows">
<h3>Iterating over selected rows<a class="headerlink" href="#iterating-over-selected-rows" title="Permalink to this headline">¶</a></h3>
<p>You can iterate over the rows in a table which fulfill a condition (a la DBAPI
<cite>fetchone()</cite>) by using the <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.where" title="tables.Table.where"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.where()</span></tt></a> method, which is very
similar to the <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.iterrows" title="tables.Table.iterrows"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.iterrows()</span></tt></a> one discussed above, and which
can be used in the same circumstances (i.e. performing operations on individual
rows or having results exceeding available memory).</p>
<p>Here is an example of using <cite>where()</cite> with the previous example condition:</p>
<div class="highlight-python"><pre>for row in tbl.where('(sqrt(x**2 + y**2) &lt;= 1) &amp; (temperature &lt; 100)'):
    do something with row['name'], row['x']...</pre>
</div>
</div>
<div class="section" id="reading-seleted-rows-at-once">
<h3>Reading seleted rows at once<a class="headerlink" href="#reading-seleted-rows-at-once" title="Permalink to this headline">¶</a></h3>
<p>Like the aforementioned <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.read" title="tables.Table.read"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.read()</span></tt></a>,
<tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.readWhere()</span></tt> gets all the rows fulfilling the given condition
and packs them in a single container (a la DBAPI <cite>fetchmany()</cite>).
The same warning applies: be careful on how many rows you expect to retrieve,
or you may run out of memory!</p>
<p>Here is an example of using <cite>readWhere()</cite> with the previous example condition:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">rows</span> <span class="o">=</span> <span class="n">tbl</span><span class="o">.</span><span class="n">readWhere</span><span class="p">(</span><span class="s">&#39;(sqrt(x**2 + y**2) &lt;= 1) &amp; (temperature &lt; 100)&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>Please note that both <a class="reference internal" href="../usersguide/libref/structured_storage.html#tables.Table.where" title="tables.Table.where"><tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.where()</span></tt></a> and
<tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.readWhere()</span></tt> can also take slicing arguments.</p>
</div>
<div class="section" id="getting-the-coordinates-of-selected-rows">
<h3>Getting the coordinates of selected rows<a class="headerlink" href="#getting-the-coordinates-of-selected-rows" title="Permalink to this headline">¶</a></h3>
<p>There is yet another method for querying tables:
<tt class="xref py py-meth docutils literal"><span class="pre">tables.Table.getWhereList()</span></tt>.
It returns just a sequence of the numbers of the rows which fulfill the given
condition.
You may pass that sequence to :meth:tables.Table.readCoordinates`, e.g. to
retrieve data from a different table where rows with the same number as the
queried one refer to the same first-class object or entity.</p>
</div>
<div class="section" id="a-note-on-table-joins">
<h3>A note on table joins<a class="headerlink" href="#a-note-on-table-joins" title="Permalink to this headline">¶</a></h3>
<p>You may have noticed that queries in PyTables only cover one table.
In fact, there is no way of directly performing a join between two tables in
PyTables (remember that it&#8217;s not a relational database).
You may however work around this limitation depending on your case:</p>
<ul>
<li><p class="first">If one table is an <em>extension</em> of another (i.e. it contains additional
columns for the same entities), your best bet is to arrange rows of the
same entity so that they are placed in the same positions in both tables.
For instance, if <cite>tbl1</cite> and <cite>tbl2</cite> follow this rule, you may do something
like this to emulate a natural join:</p>
<div class="highlight-python"><pre> for row1 in tbl1.where('condition'):
     row2 = tbl2[row1.nrow]
     if condition on row2['column_name1'], row2['column_name2']...:
         do something with row1 and row2...

(Note that `row1` is a `Row` instance and `row2` is a record of the current
flavor.)</pre>
</div>
</li>
<li><p class="first">If rows in both tables are linked by a common value (e.g. acting as an
identifier), you&#8217;ll need to split your condition in one for the first table
and one for the second table, and then nest your queries, placing the most
restrictive one first. For instance:</p>
<div class="highlight-python"><pre>SELECT clients.name, bills.item_id FROM clients, bills
WHERE clients.id = bills.client_id and clients.age &gt; 50 and bills.price &gt; 200</pre>
</div>
<p>could be written as:</p>
<div class="highlight-python"><pre>for client in clients.where('age &gt; 50'):
    # Note that the following query is different for each client.
    for bill in bills.where('(client_id == %r) &amp; (price &gt; 200)' % client['id']):
        do something with client['name'] and bill['item_id']</pre>
</div>
<p>In this example, indexing the <cite>client_id</cite> column of <cite>bills</cite> could speed up
the inner query quite a lot.
Also, you could avoid parsing the inner condition each time by using
<em>condition variables</em>:</p>
<div class="highlight-python"><pre>for client in clients.where('age &gt; 50'):
    for bill in bills.where('(client_id == cid) &amp; (price &gt; 200)', {'cid': client['id']}):
        do something with client['name'] and bill['item_id']</pre>
</div>
</li>
</ul>
</div>
</div>
<div class="section" id="summary-of-row-selection-methods">
<h2>Summary of row selection methods<a class="headerlink" href="#summary-of-row-selection-methods" title="Permalink to this headline">¶</a></h2>
<table border="1" class="docutils">
<colgroup>
<col width="20%" />
<col width="16%" />
<col width="19%" />
<col width="21%" />
<col width="23%" />
</colgroup>
<tbody valign="top">
<tr class="row-odd"><td>&nbsp;</td>
<td><strong>All rows</strong></td>
<td><strong>Range of rows</strong></td>
<td><strong>Sequence of rows</strong></td>
<td><strong>Condition</strong></td>
</tr>
<tr class="row-even"><td><strong>Iterative access</strong></td>
<td><tt class="docutils literal"><span class="pre">__iter__()</span></tt>,
<tt class="docutils literal"><span class="pre">iterrows()</span></tt></td>
<td><tt class="docutils literal"><span class="pre">iterrows(range)</span></tt></td>
<td><tt class="docutils literal"><span class="pre">itersequence()</span></tt></td>
<td><tt class="docutils literal"><span class="pre">where(condition)</span></tt></td>
</tr>
<tr class="row-odd"><td><strong>Block access</strong></td>
<td><tt class="docutils literal"><span class="pre">[:]</span></tt>,
<tt class="docutils literal"><span class="pre">read()</span></tt></td>
<td><tt class="docutils literal"><span class="pre">[range]</span></tt>,
<tt class="docutils literal"><span class="pre">read(range)</span></tt></td>
<td><tt class="docutils literal"><span class="pre">readCoordinates()</span></tt></td>
<td><tt class="docutils literal"><span class="pre">readWhere(condition)</span></tt></td>
</tr>
</tbody>
</table>
</div>
<div class="section" id="sorting-the-results-of-a-selection">
<h2>Sorting the results of a selection<a class="headerlink" href="#sorting-the-results-of-a-selection" title="Permalink to this headline">¶</a></h2>
<p><em>Do you feel like writing this section? Your contribution is welcome!</em></p>
</div>
<div class="section" id="grouping-the-results-of-a-selection">
<h2>Grouping the results of a selection<a class="headerlink" href="#grouping-the-results-of-a-selection" title="Permalink to this headline">¶</a></h2>
<p>By making use of the <tt class="xref py py-func docutils literal"><span class="pre">itertools.groupby()</span></tt> utility, you can group results
by field:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">group</span> <span class="o">=</span> <span class="p">{}</span> <span class="c"># dictionary to put results grouped by &#39;pressure&#39;</span>
<span class="k">def</span> <span class="nf">pressure_selector</span><span class="p">(</span><span class="n">row</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">row</span><span class="p">[</span><span class="s">&#39;pressure&#39;</span><span class="p">]</span>
<span class="k">for</span> <span class="n">pressure</span><span class="p">,</span> <span class="n">rows_grouped_by_pressure</span> <span class="ow">in</span> <span class="n">itertools</span><span class="o">.</span><span class="n">groupby</span><span class="p">(</span><span class="n">mytable</span><span class="p">,</span> <span class="n">pressure_selector</span><span class="p">):</span>
    <span class="n">group</span><span class="p">[</span><span class="n">pressure</span><span class="p">]</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">((</span><span class="n">r</span><span class="p">[</span><span class="s">&#39;energy&#39;</span><span class="p">]</span> <span class="o">+</span> <span class="n">r</span><span class="p">[</span><span class="s">&#39;ADCcount&#39;</span><span class="p">]</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">rows_grouped_by_pressure</span><span class="p">))</span>
</pre></div>
</div>
<p>However, <tt class="xref py py-func docutils literal"><span class="pre">itertools.groupby()</span></tt> assumes the incoming array is sorted by the
grouping field.
If not, there are multiple groups with the same grouper returned.
In the example, mytable thus has to be sorted on pressure, or the last line
should be changed to:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">group</span><span class="p">[</span><span class="n">pressure</span><span class="p">]</span> <span class="o">+=</span> <span class="nb">sum</span><span class="p">((</span><span class="n">r</span><span class="p">[</span><span class="s">&#39;energy&#39;</span><span class="p">]</span> <span class="o">+</span> <span class="n">r</span><span class="p">[</span><span class="s">&#39;ADCcount&#39;</span><span class="p">]</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">rows_grouped_by_pressure</span><span class="p">))</span>
</pre></div>
</div>
<hr class="docutils" />
<table class="docutils footnote" frame="void" id="id1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id2">[1]</a></td><td><a class="reference external" href="https://lists.sourceforge.net/lists/listinfo/pytables-users">https://lists.sourceforge.net/lists/listinfo/pytables-users</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id3" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[2]</td><td><em>(<a class="fn-backref" href="#id4">1</a>, <a class="fn-backref" href="#id5">2</a>)</em> <a class="reference external" href="http://www.pytables.org/docs/manual">http://www.pytables.org/docs/manual</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id6" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[3]</td><td><em>(<a class="fn-backref" href="#id7">1</a>, <a class="fn-backref" href="#id8">2</a>)</em> <a class="reference external" href="http://www.hdfgroup.org/HDF5">http://www.hdfgroup.org/HDF5</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id9" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id10">[4]</a></td><td><a class="reference external" href="http://www.sqlite.org">http://www.sqlite.org</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id11" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[5]</td><td><em>(<a class="fn-backref" href="#id12">1</a>, <a class="fn-backref" href="#id13">2</a>)</em> <a class="reference external" href="http://www.numpy.org">http://www.numpy.org</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id14" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id15">[6]</a></td><td><a class="reference external" href="http://www.python.org/dev/peps/pep-0249">http://www.python.org/dev/peps/pep-0249</a></td></tr>
</tbody>
</table>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
        <p class="logo"><a href="../index.html">
          <img class="logo" src="../_static/logo-pytables-small.png" alt="Logo"/>
        </a></p>
  <h3><a href="../index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Hints for SQL users</a><ul>
<li><a class="reference internal" href="#creating-a-new-database">Creating a new database</a><ul>
<li><a class="reference internal" href="#a-note-on-concurrency-under-pytables">A note on concurrency under PyTables</a></li>
</ul>
</li>
<li><a class="reference internal" href="#creating-a-table">Creating a table</a><ul>
<li><a class="reference internal" href="#table-descriptions">Table descriptions</a></li>
<li><a class="reference internal" href="#column-type-declarations">Column type declarations</a></li>
</ul>
</li>
<li><a class="reference internal" href="#creating-an-index">Creating an index</a></li>
<li><a class="reference internal" href="#altering-a-table">Altering a table</a></li>
<li><a class="reference internal" href="#dropping-a-table">Dropping a table</a></li>
<li><a class="reference internal" href="#inserting-data">Inserting data</a><ul>
<li><a class="reference internal" href="#a-note-on-transactions">A note on transactions</a></li>
</ul>
</li>
<li><a class="reference internal" href="#updating-data">Updating data</a></li>
<li><a class="reference internal" href="#deleting-data">Deleting data</a></li>
<li><a class="reference internal" href="#reading-data">Reading data</a><ul>
<li><a class="reference internal" href="#iterating-over-rows">Iterating over rows</a></li>
<li><a class="reference internal" href="#reading-rows-at-once">Reading rows at once</a></li>
</ul>
</li>
<li><a class="reference internal" href="#selecting-data">Selecting data</a><ul>
<li><a class="reference internal" href="#iterating-over-selected-rows">Iterating over selected rows</a></li>
<li><a class="reference internal" href="#reading-seleted-rows-at-once">Reading seleted rows at once</a></li>
<li><a class="reference internal" href="#getting-the-coordinates-of-selected-rows">Getting the coordinates of selected rows</a></li>
<li><a class="reference internal" href="#a-note-on-table-joins">A note on table joins</a></li>
</ul>
</li>
<li><a class="reference internal" href="#summary-of-row-selection-methods">Summary of row selection methods</a></li>
<li><a class="reference internal" href="#sorting-the-results-of-a-selection">Sorting the results of a selection</a></li>
<li><a class="reference internal" href="#grouping-the-results-of-a-selection">Grouping the results of a selection</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="index.html"
                        title="previous chapter">PyTables Cookbook</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="py2exe_howto.html"
                        title="next chapter">How to integrate PyTables in your application by using py2exe</a></p>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="../_sources/cookbook/hints_for_sql_users.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="../search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="relbar-bottom">
        
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> &nbsp; &nbsp;</li>
        <li class="right" >
          <a href="../np-modindex.html" title="Python Module Index"
             >modules</a> &nbsp; &nbsp;</li>
        <li class="right" >
          <a href="py2exe_howto.html" title="How to integrate PyTables in your application by using py2exe"
             >next</a> &nbsp; &nbsp;</li>
        <li class="right" >
          <a href="index.html" title="PyTables Cookbook"
             >previous</a> &nbsp; &nbsp;</li>
    <li><a href="../index.html">PyTables 3.0.0 documentation</a> &raquo;</li>

          <li><a href="index.html" >PyTables Cookbook</a> &raquo;</li> 
      </ul>
    </div>
    </div>

    <div class="footer">
        &copy; Copyright 2011-2013, PyTables maintainers.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
    </div>
    <!-- cloud_sptheme 1.3 -->
  </body>
</html>