Sophie

Sophie

distrib > Fedora > 18 > i386 > by-pkgid > d0983343df85ecf7d844c2cfc3a0597a > files > 494

python-whoosh-2.5.1-1.fc18.noarch.rpm



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>The default query language &mdash; Whoosh 2.5.1 documentation</title>
    
    <link rel="stylesheet" href="_static/default.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '',
        VERSION:     '2.5.1',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="top" title="Whoosh 2.5.1 documentation" href="index.html" />
    <link rel="next" title="Indexing and parsing dates/times" href="dates.html" />
    <link rel="prev" title="Parsing user queries" href="parsing.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="dates.html" title="Indexing and parsing dates/times"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="parsing.html" title="Parsing user queries"
             accesskey="P">previous</a> |</li>
        <li><a href="index.html">Whoosh 2.5.1 documentation</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="the-default-query-language">
<h1>The default query language<a class="headerlink" href="#the-default-query-language" title="Permalink to this headline">¶</a></h1>
<div class="section" id="overview">
<h2>Overview<a class="headerlink" href="#overview" title="Permalink to this headline">¶</a></h2>
<p>A query consists of <em>terms</em> and <em>operators</em>. There are two types of terms: single
terms and <em>phrases</em>. Multiple terms can be combined with operators such as
<em>AND</em> and <em>OR</em>.</p>
<p>Whoosh supports indexing text in different <em>fields</em>. You must specify the
<em>default field</em> when you create the <a class="reference internal" href="api/qparser.html#whoosh.qparser.QueryParser" title="whoosh.qparser.QueryParser"><tt class="xref py py-class docutils literal"><span class="pre">whoosh.qparser.QueryParser</span></tt></a> object.
This is the field in which any terms the user does not explicitly specify a field
for will be searched.</p>
<p>Whoosh&#8217;s query parser is capable of parsing different and/or additional syntax
through the use of plug-ins. See <a class="reference internal" href="parsing.html"><em>Parsing user queries</em></a>.</p>
</div>
<div class="section" id="individual-terms-and-phrases">
<h2>Individual terms and phrases<a class="headerlink" href="#individual-terms-and-phrases" title="Permalink to this headline">¶</a></h2>
<p>Find documents containing the term <tt class="docutils literal"><span class="pre">render</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>render
</pre></div>
</div>
<p>Find documents containing the phrase <tt class="docutils literal"><span class="pre">all</span> <span class="pre">was</span> <span class="pre">well</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>&quot;all was well&quot;
</pre></div>
</div>
<p>Note that a field must store Position information for phrase searching to work in
that field.</p>
<p>Normally when you specify a phrase, the maximum difference in position between
each word in the phrase is 1 (that is, the words must be right next to each
other in the document). For example, the following matches if a document has
<tt class="docutils literal"><span class="pre">library</span></tt> within 5 words after <tt class="docutils literal"><span class="pre">whoosh</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>&quot;whoosh library&quot;~5
</pre></div>
</div>
</div>
<div class="section" id="boolean-operators">
<h2>Boolean operators<a class="headerlink" href="#boolean-operators" title="Permalink to this headline">¶</a></h2>
<p>Find documents containing <tt class="docutils literal"><span class="pre">render</span></tt> <em>and</em> <tt class="docutils literal"><span class="pre">shading</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>render AND shading
</pre></div>
</div>
<p>Note that AND is the default relation between terms, so this is the same as:</p>
<div class="highlight-none"><div class="highlight"><pre>render shading
</pre></div>
</div>
<p>Find documents containing <tt class="docutils literal"><span class="pre">render</span></tt>, <em>and</em> also either <tt class="docutils literal"><span class="pre">shading</span></tt> <em>or</em>
<tt class="docutils literal"><span class="pre">modeling</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>render AND shading OR modeling
</pre></div>
</div>
<p>Find documents containing <tt class="docutils literal"><span class="pre">render</span></tt> but <em>not</em> modeling:</p>
<div class="highlight-none"><div class="highlight"><pre>render NOT modeling
</pre></div>
</div>
<p>Find documents containing <tt class="docutils literal"><span class="pre">alpha</span></tt> but not either <tt class="docutils literal"><span class="pre">beta</span></tt> or <tt class="docutils literal"><span class="pre">gamma</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>alpha NOT (beta OR gamma)
</pre></div>
</div>
<p>Note that when no boolean operator is specified between terms, the parser will
insert one, by default AND. So this query:</p>
<div class="highlight-none"><div class="highlight"><pre>render shading modeling
</pre></div>
</div>
<p>is equivalent (by default) to:</p>
<div class="highlight-none"><div class="highlight"><pre>render AND shading AND modeling
</pre></div>
</div>
<p>See <a class="reference internal" href="parsing.html"><em>customizing the default parser</em></a> for information on how to
change the default operator to OR.</p>
<p>Group operators together with parentheses. For example to find documents that
contain both <tt class="docutils literal"><span class="pre">render</span></tt> and <tt class="docutils literal"><span class="pre">shading</span></tt>, or contain <tt class="docutils literal"><span class="pre">modeling</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>(render AND shading) OR modeling
</pre></div>
</div>
</div>
<div class="section" id="fields">
<h2>Fields<a class="headerlink" href="#fields" title="Permalink to this headline">¶</a></h2>
<p>Find the term <tt class="docutils literal"><span class="pre">ivan</span></tt> in the <tt class="docutils literal"><span class="pre">name</span></tt> field:</p>
<div class="highlight-none"><div class="highlight"><pre>name:ivan
</pre></div>
</div>
<p>The <tt class="docutils literal"><span class="pre">field:</span></tt> prefix only sets the field for the term it directly precedes, so
the query:</p>
<div class="highlight-none"><div class="highlight"><pre>title:open sesame
</pre></div>
</div>
<p>Will search for <tt class="docutils literal"><span class="pre">open</span></tt> in the <tt class="docutils literal"><span class="pre">title</span></tt> field and <tt class="docutils literal"><span class="pre">sesame</span></tt> in the <em>default</em>
field.</p>
<p>To apply a field prefix to multiple terms, group them with parentheses:</p>
<div class="highlight-none"><div class="highlight"><pre>title:(open sesame)
</pre></div>
</div>
<p>This is the same as:</p>
<div class="highlight-none"><div class="highlight"><pre>title:open title:sesame
</pre></div>
</div>
<p>Of course you can specify a field for phrases too:</p>
<div class="highlight-none"><div class="highlight"><pre>title:&quot;open sesame&quot;
</pre></div>
</div>
</div>
<div class="section" id="inexact-terms">
<h2>Inexact terms<a class="headerlink" href="#inexact-terms" title="Permalink to this headline">¶</a></h2>
<p>Use &#8220;globs&#8221; (wildcard expressions using <tt class="docutils literal"><span class="pre">?</span></tt> to represent a single character
and <tt class="docutils literal"><span class="pre">*</span></tt> to represent any number of characters) to match terms:</p>
<div class="highlight-none"><div class="highlight"><pre>te?t test* *b?g*
</pre></div>
</div>
<p>Note that a wildcard starting with <tt class="docutils literal"><span class="pre">?</span></tt> or <tt class="docutils literal"><span class="pre">*</span></tt> is very slow. Note also that
these wildcards only match <em>individual terms</em>. For example, the query:</p>
<div class="highlight-none"><div class="highlight"><pre>my*life
</pre></div>
</div>
<p>will <strong>not</strong> match an indexed phrase like:</p>
<div class="highlight-none"><div class="highlight"><pre>my so called life
</pre></div>
</div>
<p>because those are four separate terms.</p>
</div>
<div class="section" id="ranges">
<h2>Ranges<a class="headerlink" href="#ranges" title="Permalink to this headline">¶</a></h2>
<p>You can match a range of terms. For example, the following query will match
documents containing terms in the lexical range from <tt class="docutils literal"><span class="pre">apple</span></tt> to <tt class="docutils literal"><span class="pre">bear</span></tt>
<em>inclusive</em>. For example, it will match documents containing <tt class="docutils literal"><span class="pre">azores</span></tt> and
<tt class="docutils literal"><span class="pre">be</span></tt> but not <tt class="docutils literal"><span class="pre">blur</span></tt>:</p>
<div class="highlight-none"><div class="highlight"><pre>[apple TO bear]
</pre></div>
</div>
<p>This is very useful when you&#8217;ve stored, for example, dates in a lexically sorted
format (i.e. YYYYMMDD):</p>
<div class="highlight-none"><div class="highlight"><pre>date:[20050101 TO 20090715]
</pre></div>
</div>
<p>The range is normally <em>inclusive</em> (that is, the range will match all terms
between the start and end term, <em>as well as</em> the start and end terms
themselves). You can specify that one or both ends of the range are <em>exclusive</em>
by using the <tt class="docutils literal"><span class="pre">{</span></tt> and/or <tt class="docutils literal"><span class="pre">}</span></tt> characters:</p>
<div class="highlight-none"><div class="highlight"><pre>[0000 TO 0025}
{prefix TO suffix}
</pre></div>
</div>
<p>You can also specify <em>open-ended</em> ranges by leaving out the start or end term:</p>
<div class="highlight-none"><div class="highlight"><pre>[0025 TO]
{TO suffix}
</pre></div>
</div>
</div>
<div class="section" id="boosting-query-elements">
<h2>Boosting query elements<a class="headerlink" href="#boosting-query-elements" title="Permalink to this headline">¶</a></h2>
<p>You can specify that certain parts of a query are more important for calculating
the score of a matched document than others. For example, to specify that
<tt class="docutils literal"><span class="pre">ninja</span></tt> is twice as important as other words, and <tt class="docutils literal"><span class="pre">bear</span></tt> is half as
important:</p>
<div class="highlight-none"><div class="highlight"><pre>ninja^2 cowboy bear^0.5
</pre></div>
</div>
<p>You can apply a boost to several terms using grouping parentheses:</p>
<div class="highlight-none"><div class="highlight"><pre>(open sesame)^2.5 roc
</pre></div>
</div>
</div>
<div class="section" id="making-a-term-from-literal-text">
<h2>Making a term from literal text<a class="headerlink" href="#making-a-term-from-literal-text" title="Permalink to this headline">¶</a></h2>
<p>If you need to include characters in a term that are normally treated specially
by the parser, such as spaces, colons, or brackets, you can enclose the term
in single quotes:</p>
<div class="highlight-none"><div class="highlight"><pre>path:&#39;MacHD:My Documents&#39;
&#39;term with spaces&#39;
title:&#39;function()&#39;
</pre></div>
</div>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">The default query language</a><ul>
<li><a class="reference internal" href="#overview">Overview</a></li>
<li><a class="reference internal" href="#individual-terms-and-phrases">Individual terms and phrases</a></li>
<li><a class="reference internal" href="#boolean-operators">Boolean operators</a></li>
<li><a class="reference internal" href="#fields">Fields</a></li>
<li><a class="reference internal" href="#inexact-terms">Inexact terms</a></li>
<li><a class="reference internal" href="#ranges">Ranges</a></li>
<li><a class="reference internal" href="#boosting-query-elements">Boosting query elements</a></li>
<li><a class="reference internal" href="#making-a-term-from-literal-text">Making a term from literal text</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="parsing.html"
                        title="previous chapter">Parsing user queries</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="dates.html"
                        title="next chapter">Indexing and parsing dates/times</a></p>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="_sources/querylang.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="dates.html" title="Indexing and parsing dates/times"
             >next</a> |</li>
        <li class="right" >
          <a href="parsing.html" title="Parsing user queries"
             >previous</a> |</li>
        <li><a href="index.html">Whoosh 2.5.1 documentation</a> &raquo;</li> 
      </ul>
    </div>
    <div class="footer">
        &copy; Copyright 2007-2012 Matt Chaput.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
    </div>
  </body>
</html>