<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>8.11.9. snippet_html — groonga v3.0.5 documentation</title> <link rel="stylesheet" href="../../_static/groonga.css" type="text/css" /> <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../../', VERSION: '3.0.5', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../../_static/jquery.js"></script> <script type="text/javascript" src="../../_static/underscore.js"></script> <script type="text/javascript" src="../../_static/doctools.js"></script> <link rel="shortcut icon" href="../../_static/favicon.ico"/> <link rel="top" title="groonga v3.0.5 documentation" href="../../index.html" /> <link rel="up" title="8.11. Function" href="../function.html" /> <link rel="next" title="8.11.10. sub_filter" href="sub_filter.html" /> <link rel="prev" title="8.11.8. rand" href="rand.html" /> </head> <body> <div class="header"> <h1 class="title"> <a id="top-link" href="../../index.html"> <span class="project">groonga</span> <span class="separator">-</span> <span class="description">An open-source fulltext search engine and column store.</span> </a> </h1> <div class="other-language-links"> <ul> <li><a href="../../../../ja/html/reference/functions/snippet_html.html"><img src="../../_static/jp.png" alt="日本語">日本語版はこちら</a></li> </ul> </div> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../../genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="sub_filter.html" title="8.11.10. sub_filter" accesskey="N">next</a> |</li> <li class="right" > <a href="rand.html" title="8.11.8. 
rand" accesskey="P">previous</a> |</li> <li><a href="../../index.html">groonga v3.0.5 documentation</a> »</li> <li><a href="../../reference.html" >8. Reference manual</a> »</li> <li><a href="../function.html" accesskey="U">8.11. Function</a> »</li> </ul> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="snippet-html"> <h1>8.11.9. snippet_html<a class="headerlink" href="#snippet-html" title="Permalink to this headline">¶</a></h1> <div class="admonition caution"> <p class="first admonition-title">Caution</p> <p class="last">This feature is experimental. The API will be changed.</p> </div> <div class="section" id="summary"> <h2>8.11.9.1. Summary<a class="headerlink" href="#summary" title="Permalink to this headline">¶</a></h2> <p><tt class="docutils literal"><span class="pre">snippet_html</span></tt> extracts snippets of target text around search keywords (<tt class="docutils literal"><span class="pre">KWIC</span></tt>: <tt class="docutils literal"><span class="pre">KeyWord</span> <span class="pre">In</span> <span class="pre">Context</span></tt>). The snippets are prepared for embedding in HTML. Special characters such as <tt class="docutils literal"><span class="pre">&lt;</span></tt> and <tt class="docutils literal"><span class="pre">&gt;</span></tt> are escaped as <tt class="docutils literal"><span class="pre">&amp;lt;</span></tt> and <tt class="docutils literal"><span class="pre">&amp;gt;</span></tt>. Each keyword is surrounded with <tt class="docutils literal"><span class="pre">&lt;span</span> <span class="pre">class="keyword"&gt;</span></tt> and <tt class="docutils literal"><span class="pre">&lt;/span&gt;</span></tt>. 
For example, a snippet of <tt class="docutils literal"><span class="pre">I</span> <span class="pre">am</span> <span class="pre">a</span> <span class="pre">groonga</span> <span class="pre">user.</span> <span class="pre">&lt;3</span></tt> for the keyword <tt class="docutils literal"><span class="pre">groonga</span></tt> is <tt class="docutils literal"><span class="pre">I</span> <span class="pre">am</span> <span class="pre">a</span> <span class="pre">&lt;span</span> <span class="pre">class="keyword"&gt;groonga&lt;/span&gt;</span> <span class="pre">user.</span> <span class="pre">&amp;lt;3</span></tt>.</p> </div> <div class="section" id="syntax"> <h2>8.11.9.2. Syntax<a class="headerlink" href="#syntax" title="Permalink to this headline">¶</a></h2> <p><tt class="docutils literal"><span class="pre">snippet_html</span></tt> has only one parameter:</p> <div class="highlight-none"><div class="highlight"><pre>snippet_html(column) </pre></div> </div> <p><tt class="docutils literal"><span class="pre">snippet_html</span></tt> has many parameters internally, but they can't be specified for now. You will be able to customize those parameters soon.</p> </div> <div class="section" id="usage"> <h2>8.11.9.3. 
Usage<a class="headerlink" href="#usage" title="Permalink to this headline">¶</a></h2> <p>Here are a schema definition and sample data to show usage.</p> <p>Execution example:</p> <div class="highlight-none"><div class="highlight"><pre>table_create Documents TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Documents content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Terms TABLE_PAT_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenBigram
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms documents_content_index COLUMN_INDEX|WITH_POSITION Documents content
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table Documents
[
["content"],
["Groonga is a fast and accurate full text search engine based on inverted index. One of the characteristics of groonga is that a newly registered document instantly appears in search results. Also, groonga allows updates without read locks. These characteristics result in superior performance on real-time applications."],
["Groonga is also a column-oriented database management system (DBMS). Compared with well-known row-oriented systems, such as MySQL and PostgreSQL, column-oriented systems are more suited for aggregate queries. Due to this advantage, groonga can cover weakness of row-oriented systems."]
]
# [[0, 1337566253.89858, 0.000355720520019531], 2]
</pre></div> </div> <p><tt class="docutils literal"><span class="pre">snippet_html</span></tt> can be used only in <tt class="docutils literal"><span class="pre">--output_columns</span></tt> of <a class="reference internal" href="../commands/select.html"><em>select</em></a>.</p> <p>You need to specify the <tt class="docutils literal"><span class="pre">--command_version</span> <span class="pre">2</span></tt> argument explicitly because function calls in <tt class="docutils literal"><span class="pre">--output_columns</span></tt> are an experimental feature in groonga 2.0.9. 
It will be enabled by default soon.</p> <p>You also need to specify <tt class="docutils literal"><span class="pre">--query</span></tt> and/or <tt class="docutils literal"><span class="pre">--filter</span></tt>. Keywords are extracted from the <tt class="docutils literal"><span class="pre">--query</span></tt> and <tt class="docutils literal"><span class="pre">--filter</span></tt> arguments.</p> <p>The following example uses <tt class="docutils literal"><span class="pre">--query</span> <span class="pre">"fast</span> <span class="pre">performance"</span></tt>. In this case, <tt class="docutils literal"><span class="pre">fast</span></tt> and <tt class="docutils literal"><span class="pre">performance</span></tt> are used as keywords.</p> <p>Execution example:</p> <div class="highlight-none"><div class="highlight"><pre>select Documents --output_columns "snippet_html(content)" --command_version 2 --match_columns content --query "fast performance"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "snippet_html",
#           "null"
#         ]
#       ],
#       [
#         [
#           "Groonga is a &lt;span class=\"keyword\"&gt;fast&lt;/span&gt; and accurate full text search engine based on inverted index. One of the characteristics of groonga is that a newly registered document instantly appears in search results. Also, gro",
#           "onga allows updates without read locks. These characteristics result in superior &lt;span class=\"keyword\"&gt;performance&lt;/span&gt; on real-time applications."
#         ]
#       ]
#     ]
#   ]
# ]
</pre></div> </div> <p><tt class="docutils literal"><span class="pre">--query</span> <span class="pre">"fast</span> <span class="pre">performance"</span></tt> matches only the first record's content. 
<tt class="docutils literal"><span class="pre">snippet_html(content)</span></tt> extracts two text parts that include the keywords <tt class="docutils literal"><span class="pre">fast</span></tt> or <tt class="docutils literal"><span class="pre">performance</span></tt> and surrounds the keywords with <tt class="docutils literal"><span class="pre">&lt;span</span> <span class="pre">class="keyword"&gt;</span></tt> and <tt class="docutils literal"><span class="pre">&lt;/span&gt;</span></tt>.</p> <p>The max number of text parts is 3. If there are 4 or more text parts that include the keywords, only the first 3 parts are used.</p> <p>The max size of a text part is 200 bytes. The unit is bytes, not characters. The size doesn't include the inserted <tt class="docutils literal"><span class="pre">&lt;span</span> <span class="pre">class="keyword"&gt;</span></tt> and <tt class="docutils literal"><span class="pre">&lt;/span&gt;</span></tt>.</p> <p>Neither the max number of text parts nor the max size of a text part is customizable.</p> <p>You can specify a string literal instead of a column.</p> <p>Execution example:</p> <div class="highlight-none"><div class="highlight"><pre>select Documents --output_columns 'snippet_html("Groonga is very fast fulltext search engine.")' --command_version 2 --match_columns content --query "fast performance"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "snippet_html",
#           "null"
#         ]
#       ],
#       [
#         [
#           "Groonga is very &lt;span class=\"keyword\"&gt;fast&lt;/span&gt; fulltext search engine."
#         ]
#       ]
#     ]
#   ]
# ]
</pre></div> </div> </div> <div class="section" id="return-value"> <h2>8.11.9.4. Return value<a class="headerlink" href="#return-value" title="Permalink to this headline">¶</a></h2> <p><tt class="docutils literal"><span class="pre">snippet_html</span></tt> returns an array of strings. 
Each element of the array is a snippet:</p> <div class="highlight-none"><div class="highlight"><pre>[SNIPPET1, SNIPPET2, SNIPPET3] </pre></div> </div> <p>A snippet includes one or more keywords. The max size of a snippet, excluding <tt class="docutils literal"><span class="pre">&lt;span</span> <span class="pre">class="keyword"&gt;</span></tt> and <tt class="docutils literal"><span class="pre">&lt;/span&gt;</span></tt>, is 200 bytes. The unit is bytes, not the number of characters.</p> <p>The array contains between 0 and 3 elements. The max size of 3 will be customizable soon.</p> </div> <div class="section" id="todo"> <h2>8.11.9.5. TODO<a class="headerlink" href="#todo" title="Permalink to this headline">¶</a></h2> <ul class="simple"> <li>Make the max number of text parts customizable.</li> <li>Make the max size of a text part customizable.</li> <li>Make keywords customizable.</li> <li>Make the tag that surrounds a keyword customizable.</li> <li>Make normalization customizable.</li> <li>Support options by object literal.</li> </ul> </div> <div class="section" id="see-also"> <h2>8.11.9.6. See also<a class="headerlink" href="#see-also" title="Permalink to this headline">¶</a></h2> <ul class="simple"> <li><a class="reference internal" href="../commands/select.html"><em>select</em></a></li> </ul> </div> </div> </div> </div> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h3><a href="../../index.html">Table Of Contents</a></h3> <ul> <li><a class="reference internal" href="#">8.11.9. snippet_html</a><ul> <li><a class="reference internal" href="#summary">8.11.9.1. Summary</a></li> <li><a class="reference internal" href="#syntax">8.11.9.2. Syntax</a></li> <li><a class="reference internal" href="#usage">8.11.9.3. Usage</a></li> <li><a class="reference internal" href="#return-value">8.11.9.4. Return value</a></li> <li><a class="reference internal" href="#todo">8.11.9.5. TODO</a></li> <li><a class="reference internal" href="#see-also">8.11.9.6. 
See also</a></li> </ul> </li> </ul> <h4>Previous topic</h4> <p class="topless"><a href="rand.html" title="previous chapter">8.11.8. rand</a></p> <h4>Next topic</h4> <p class="topless"><a href="sub_filter.html" title="next chapter">8.11.10. sub_filter</a></p> <h3>This Page</h3> <ul class="this-page-menu"> <li><a href="../../_sources/reference/functions/snippet_html.txt" rel="nofollow">Show Source</a></li> </ul> </div> </div> <div class="clearer"></div> </div> <div class="footer"> © Copyright 2009-2013, Brazil, Inc. </div> </body> </html>