<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>poterminology &mdash; Translate Toolkit 1.9.0 documentation</title>
    
    <link rel="stylesheet" href="../_static/basic.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="../_static/bootstrap.css" type="text/css" />
    <link rel="stylesheet" href="../_static/bootstrap-sphinx.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '1.9.0',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <script type="text/javascript" src="../_static/bootstrap.js"></script>
    <script type="text/javascript" src="../_static/bootstrap-sphinx.js"></script>
    <link rel="top" title="Translate Toolkit 1.9.0 documentation" href="../index.html" />
    <link rel="up" title="Converters" href="index.html" />
    <link rel="next" title="Stopword file format" href="poterminology_stopword_file.html" />
    <link rel="prev" title="tmserver" href="tmserver.html" /> 
  </head>
  <body>
  <div id="navbar" class="navbar navbar-fixed-top">
    <div class="navbar-inner">
      <div class="container-fluid">
        <a class="brand" href="../index.html">Translate Toolkit</a>
        <span class="navbar-text pull-left"><b>1.9.0</b></span>
          <ul class="nav">
            <li class="divider-vertical"></li>
            
              <li class="dropdown">
  <a href="#" class="dropdown-toggle" data-toggle="dropdown">Site <b class="caret"></b></a>
  <ul class="dropdown-menu globaltoc"><ul class="simple">
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../features.html">Features</a></li>
<li class="toctree-l1"><a class="reference internal" href="../installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="index.html">Converters</a></li>
<li class="toctree-l1"><a class="reference internal" href="index.html#tools">Tools</a></li>
<li class="toctree-l1"><a class="reference internal" href="index.html#scripts">Scripts</a></li>
<li class="toctree-l1"><a class="reference internal" href="../guides/index.html">Use Cases</a></li>
<li class="toctree-l1"><a class="reference internal" href="../formats/index.html">Supported formats</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../styleguide.html">Translate Styleguide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../styleguide.html#documentation">Documentation</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/building.html">Building</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/contributing.html">Contributing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/developers.html">Translate Toolkit Developers Guide</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../api/index.html">API</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../changelog.html">Important Changes</a></li>
<li class="toctree-l1"><a class="reference internal" href="../history.html">History of the Translate Toolkit</a></li>
<li class="toctree-l1"><a class="reference internal" href="../license.html">License</a></li>
</ul>
</ul>
</li>
              
<li class="dropdown">
  <a href="#" class="dropdown-toggle" data-toggle="dropdown">Page <b class="caret"></b></a>
  <ul class="dropdown-menu localtoc"><ul>
<li><a class="reference internal" href="#">poterminology</a><ul>
<li><a class="reference internal" href="#usage">Usage</a></li>
<li><a class="reference internal" href="#examples">Examples</a><ul>
<li><a class="reference internal" href="#reduced-terminology-glossaries">Reduced terminology glossaries</a></li>
</ul>
</li>
<li><a class="reference internal" href="#reducing-output-terminology-with-thresholding-options">Reducing output terminology with thresholding options</a><ul>
<li><a class="reference internal" href="#inputs-needed">&#8211;inputs-needed</a></li>
<li><a class="reference internal" href="#locs-needed">&#8211;locs-needed</a></li>
<li><a class="reference internal" href="#fullmsg-needed-substr-needed">&#8211;fullmsg-needed &amp; &#8211;substr-needed</a></li>
</ul>
</li>
<li><a class="reference internal" href="#stop-word-files">Stop word files</a></li>
<li><a class="reference internal" href="#issues">Issues</a></li>
<li><a class="reference internal" href="#on-single-files">On single files</a></li>
</ul>
</li>
</ul>
</ul>
</li>
            
            
              
  <li><a href="tmserver.html"
         title="previous chapter">&laquo; tmserver</a></li>
  <li><a href="poterminology_stopword_file.html"
         title="next chapter">Stopword file format &raquo;</a></li>
            
            
              
            
          </ul>
          
            
<form class="navbar-search pull-right" action="../search.html" method="get">
  <input type="text" name="q" placeholder="Search" />
  <input type="hidden" name="check_keywords" value="yes" />
  <input type="hidden" name="area" value="default" />
</form>
          
          </ul>
        </div>
      </div>
    </div>
  </div>

<div class="container content">
   
  <div class="section" id="poterminology">
<span id="id1"></span><h1>poterminology<a class="headerlink" href="#poterminology" title="Permalink to this headline">¶</a></h1>
<p>poterminology takes Gettext PO/POT files and extracts potential terminology.</p>
<p>This is useful as a first step before translating a new project (or an existing
project into a new target language) as it allows you to define key terminology
for consistency in translations.  The resulting terminology PO files can be
used by Pootle to provide suggestions while translating.</p>
<p>Generally, all the input files should have the same source language, and either
be POT files (with no translations) or PO files with translations to the same
target language.</p>
<p>The more separate PO files you use to generate terminology, the better your
results will be, but poterminology can be used with just a single input file.</p>
<p>Read more about <a class="reference external" href="http://en.wikipedia.org/wiki/Terminology_extraction">terminology extraction</a>.</p>
<div class="section" id="usage">
<span id="poterminology-usage"></span><h2>Usage<a class="headerlink" href="#usage" title="Permalink to this headline">¶</a></h2>
<div class="highlight-python"><pre>poterminology [options] &lt;input&gt; &lt;terminology&gt;</pre>
</div>
<p>Where:</p>
<table border="1" class="docutils">
<colgroup>
<col width="27%" />
<col width="73%" />
</colgroup>
<tbody valign="top">
<tr class="row-odd"><td>&lt;input&gt;</td>
<td>translations to be examined for terminology</td>
</tr>
<tr class="row-even"><td>&lt;terminology&gt;</td>
<td>extracted potential terminology</td>
</tr>
</tbody>
</table>
<p>Options:</p>
<table class="docutils option-list" frame="void" rules="none">
<col class="option" />
<col class="description" />
<tbody valign="top">
<tr><td class="option-group">
<kbd><span class="option">--version</span></kbd></td>
<td>show program&#8217;s version number and exit</td></tr>
<tr><td class="option-group">
<kbd><span class="option">-h</span>, <span class="option">--help</span></kbd></td>
<td>show this help message and exit</td></tr>
<tr><td class="option-group">
<kbd><span class="option">--manpage</span></kbd></td>
<td>output a manpage based on the help</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--progress=<var>PROGRESS</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>show progress as: <a class="reference internal" href="option_progress.html"><em>dots, none, bar, names, verbose</em></a></td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--errorlevel=<var>ERRORLEVEL</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>show errorlevel as: <a class="reference internal" href="option_errorlevel.html"><em>none, message, exception,
traceback</em></a></td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-i <var>INPUT</var></span>, <span class="option">--input=<var>INPUT</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>read from INPUT in pot, po formats</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-x <var>EXCLUDE</var></span>, <span class="option">--exclude=<var>EXCLUDE</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>exclude names matching EXCLUDE from input paths</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-o <var>OUTPUT</var></span>, <span class="option">--output=<var>OUTPUT</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>write to OUTPUT in po, pot formats</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-u <var>UPDATEFILE</var></span>, <span class="option">--update=<var>UPDATEFILE</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>update terminology in UPDATEFILE</td></tr>
<tr><td class="option-group">
<kbd><span class="option">--psyco=<var>MODE</var></span></kbd></td>
<td>use psyco to speed up the operation, modes: <a class="reference internal" href="option_psyco.html"><em>none,
full, profile</em></a></td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-S <var>STOPFILE</var></span>, <span class="option">--stopword-list=<var>STOPFILE</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>read stopword (term exclusion) list from STOPFILE (default site-packages/translate/share/stoplist-en)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-F</span>, <span class="option">--fold-titlecase</span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>fold &#8220;Title Case&#8221; to lowercase (default)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-C</span>, <span class="option">--preserve-case</span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>preserve all uppercase/lowercase</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-I</span>, <span class="option">--ignore-case</span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>make all terms lowercase</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--accelerator=<var>ACCELERATORS</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>ignore the given accelerator characters when matching (accelerator characters probably require quoting)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">-t <var>LENGTH</var></span>, <span class="option">--term-words=<var>LENGTH</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>generate terms of up to LENGTH words (default 3)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--inputs-needed=<var>MIN</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>omit terms appearing in less than MIN input files (default 2, or 1 if only one input file)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--fullmsg-needed=<var>MIN</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>omit full message terms appearing in less than MIN different messages (default 1)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--substr-needed=<var>MIN</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>omit substring-only terms appearing in less than MIN different messages (default 2)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--locs-needed=<var>MIN</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>omit terms appearing in less than MIN different original program locations (default 2)</td></tr>
<tr><td class="option-group">
<kbd><span class="option">--sort=<var>ORDER</var></span></kbd></td>
<td>output sort order(s): frequency, dictionary, length (default is all orders in the above priority)</td></tr>
<tr><td class="option-group" colspan="2">
<kbd><span class="option">--source-language=<var>LANG</var></span></kbd></td>
</tr>
<tr><td>&nbsp;</td><td>the source language code (default &#8216;en&#8217;)</td></tr>
<tr><td class="option-group">
<kbd><span class="option">-v</span>, <span class="option">--invert</span></kbd></td>
<td>invert the source and target languages for terminology</td></tr>
</tbody>
</table>
</div>
<div class="section" id="examples">
<span id="poterminology-examples"></span><h2>Examples<a class="headerlink" href="#examples" title="Permalink to this headline">¶</a></h2>
<p>You want to generate a terminology file for Pootle that will be used to provide
suggestions for translating Pootle itself:</p>
<div class="highlight-python"><pre>poterminology Pootle/po/pootle/templates/*.pot .</pre>
</div>
<p>This results in a <tt class="docutils literal"><span class="pre">./pootle-terminology.pot</span></tt> output file with 23 terms (from
&#8220;file&#8221; to &#8220;does not exist&#8221;) &#8211; without any translations.</p>
<p>The default output file can be added to a Pootle project to provide
<a class="reference external" href="http://pootle.readthedocs.org/en/latest/features/terminology.html#terminology" title="(in Pootle v2.5.0)"><em class="xref std std-ref">terminology matching</em></a> suggestions for that project;
alternatively, a special Terminology project can be used, and it will provide
terminology suggestions for all projects that do not have a
pootle-terminology.po file.</p>
<p>Generating a terminology file containing automatically extracted translations
is possible as well, by using PO files with translations for the input files:</p>
<div class="highlight-python"><pre>poterminology Pootle/po/pootle/fi/*.po --output fi/pootle-terminology.po --sort dictionary</pre>
</div>
<p>Using PO files with Finnish translations, you get an output file that contains
the same 23 terms, with translations of eight terms &#8211; one (&#8220;login&#8221;) is fuzzy
due to slightly different translations in jToolkit and Pootle.  The file is
sorted in alphabetical order (by source term, not translated term), which can
be useful when comparing different terminology files.</p>
<p>Even though there is no translation of Pootle into Kinyarwanda, you can use the
Gnome UI terminology PO file as a source for translations; in order to extract
only the terms common to jToolkit and Pootle this command includes the POT
output from the first step above (which is redundant) and requires terms to
appear in three different input sources:</p>
<div class="highlight-python"><pre>poterminology Pootle/po/pootle/templates/*.pot pootle-terminology.pot \
  Pootle/po/terminology/rw/gnome/rw.po --inputs-needed=3 -o terminology/rw.po</pre>
</div>
<p>Of the 23 terms, 16 have Kinyarwanda translations extracted from the Gnome UI
terminology.</p>
<p>For a language like Spanish, with both Pootle translations and Gnome
terminology available, 18 translations (2 fuzzy) are generated by the following
command, which initializes the terminology file from the POT output from the
first step, and then uses <tt class="docutils literal"><span class="pre">--update</span></tt> to specify that the glossary-es.po file
is to be used both for input and output:</p>
<div class="highlight-python"><pre>cp pootle-terminology.pot glossary-es.po
poterminology --inputs-needed=3 --update glossary-es.po \
  Pootle/po/pootle/es/*.po Pootle/po/terminology/es/gnome/es.po</pre>
</div>
<div class="section" id="reduced-terminology-glossaries">
<span id="poterminology-reduced-terminology-glossaries"></span><h3>Reduced terminology glossaries<a class="headerlink" href="#reduced-terminology-glossaries" title="Permalink to this headline">¶</a></h3>
<p>If you want to generate a terminology file containing only single words,  not
phrases, you can use <tt class="docutils literal"><span class="pre">-t</span></tt>/<tt class="docutils literal"><span class="pre">--term-words</span></tt> to control this.  If your
input files are very large and/or you have a lot of input files, and you are
finding that poterminology is taking too much time and memory to run, reducing
the phrase size from the default value of 3 can be helpful.</p>
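<p>To see why a smaller phrase size reduces the work, here is a minimal sketch of
phrase generation.  It is an illustration only, not poterminology&#8217;s own code;
the helper name <tt class="docutils literal"><span class="pre">candidate_phrases</span></tt> is hypothetical.  Every run of up to
<tt class="docutils literal"><span class="pre">max_words</span></tt> consecutive words becomes a candidate, so the number of candidates
grows with the limit:</p>

```python
def candidate_phrases(words, max_words=3):
    """Return every run of 1..max_words consecutive words as one phrase."""
    phrases = []
    for start in range(len(words)):
        # The last usable end index is capped by both the limit and the message length.
        last_end = min(start + max_words, len(words))
        for end in range(start + 1, last_end + 1):
            phrases.append(" ".join(words[start:end]))
    return phrases


message = "file does not exist".split()
print(len(candidate_phrases(message, max_words=3)))  # candidates with the default limit
print(len(candidate_phrases(message, max_words=1)))  # single words only
```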
<p>For example, running poterminology on the subversion trunk with the default
phrase size can take quite some time and may not even complete on a
small-memory system, but with <tt class="docutils literal"><span class="pre">--term-words=1</span></tt> the initial number of terms
is reduced by half, and the thresholding process can complete:</p>
<div class="highlight-python"><pre>poterminology --progress=none -t 1 translate

1297 terms from 64039 units in 216 files
254 terms after thresholding
254 terms after subphrase reduction</pre>
</div>
<p>The first line of output indicates the number of input files and translation
units (messages), with the number of unique terms present after removing C and
Python format specifiers (e.g. %d), XML/HTML &lt;elements&gt; and &amp;entities; and
performing stoplist elimination.</p>
<p>The second line gives the number of terms remaining after applying threshold
filtering (discussed in more detail below) to eliminate terms that are not
sufficiently &#8220;common&#8221; in the input files.</p>
<p>The third line gives the number of terms remaining after eliminating subphrases
that did not occur independently.  In this case, since the term-words limit is
1, there are no subphrases and so the number is the same as on the second line.</p>
<p>However, in the first example above (generating terminology for Pootle itself),
the term &#8220;not exist&#8221; passes the stoplist and threshold filters, but every
occurrence of this term lies inside the longer term &#8220;does not exist&#8221;, which also
passes the stoplist and threshold filters.  Given this duplication, the shorter
phrase is eliminated in favor of the longer one, resulting in 23 terms (out of
25 that pass the threshold filters).</p>
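<p>The idea of subphrase reduction can be sketched as follows.  This is a
simplified model under stated assumptions, not the tool&#8217;s implementation: terms
are tracked per message rather than per position, and the names
<tt class="docutils literal"><span class="pre">is_subphrase</span></tt> and <tt class="docutils literal"><span class="pre">reduce_subphrases</span></tt> are hypothetical.  A shorter term is
dropped when a longer term contains it and covers all of its occurrences:</p>

```python
def is_subphrase(short, long):
    """True if the words of short appear consecutively inside the longer phrase."""
    short_words, long_words = short.split(), long.split()
    if len(short_words) == len(long_words):
        return False
    width = len(short_words)
    return any(
        long_words[i:i + width] == short_words
        for i in range(len(long_words) - width + 1)
    )


def reduce_subphrases(occurrences):
    """occurrences maps each term to the set of message ids that contain it."""
    kept = {}
    for term, messages in occurrences.items():
        shadowed = any(
            is_subphrase(term, other) and messages.issubset(other_messages)
            for other, other_messages in occurrences.items()
        )
        if not shadowed:
            kept[term] = messages
    return kept


occurrences = {
    "file": {1, 2, 3},
    "not exist": {2, 3},        # never occurs outside "does not exist"
    "does not exist": {2, 3},
}
print(sorted(reduce_subphrases(occurrences)))
```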
</div>
</div>
<div class="section" id="reducing-output-terminology-with-thresholding-options">
<span id="poterminology-reducing-output-terminology-with-thresholding-options"></span><h2>Reducing output terminology with thresholding options<a class="headerlink" href="#reducing-output-terminology-with-thresholding-options" title="Permalink to this headline">¶</a></h2>
<p>Depending on the size and number of the source files, and the desired scope of
the output terminology file, there are several thresholding filters that can be
adjusted to allow fewer or more terms in the output file.  We have seen above
how one (<tt class="docutils literal"><span class="pre">--inputs-needed</span></tt>) can be used to require that terms be present
in multiple input files, but there are also other thresholds that can be
adjusted to control the size of the output terminology file.</p>
<div class="section" id="inputs-needed">
<h3>&#8211;inputs-needed<a class="headerlink" href="#inputs-needed" title="Permalink to this headline">¶</a></h3>
<p>This is the most flexible and powerful thresholding control.  The default value
is 2, unless only one input file (not counting an <tt class="docutils literal"><span class="pre">--update</span></tt> argument) is
provided, in which case the threshold is 1 to avoid filtering out all terms and
generating an empty output terminology file.</p>
<p>By copying input files and providing them multiple times as inputs, you can
even achieve &#8220;weighted&#8221; thresholding, so that for example, all terms in one
original input file will pass thresholding, while other files may be filtered.
A simple version of this technique was used above to incorporate translations
from the Gnome terminology PO files without having them affect the terms that
passed the threshold filters.</p>
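<p>The effect of supplying a file more than once can be modelled with a short
sketch.  It assumes, for illustration only, that each input is reduced to a set
of terms and that a term passes when it is seen in at least the required number
of inputs; <tt class="docutils literal"><span class="pre">terms_passing</span></tt> and the file names are hypothetical:</p>

```python
from collections import defaultdict


def terms_passing(inputs, inputs_needed=2):
    """inputs maps an input file name to the set of terms found in that file."""
    seen_in = defaultdict(set)
    for name, terms in inputs.items():
        for term in terms:
            seen_in[term].add(name)
    return {term for term, names in seen_in.items() if len(names) >= inputs_needed}


core = {"file", "login"}
inputs = {
    "core.pot": core,
    "core-copy.pot": core,            # the same file supplied a second time
    "extra.po": {"file", "printer"},  # only counted once
}
print(sorted(terms_passing(inputs, inputs_needed=2)))
```

<p>Every term of the duplicated file reaches the threshold of 2 on its own, while
&#8220;printer&#8221;, found only in the extra file, is filtered out.</p>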
</div>
<div class="section" id="locs-needed">
<h3>&#8211;locs-needed<a class="headerlink" href="#locs-needed" title="Permalink to this headline">¶</a></h3>
<p>Rather than requiring that a term appear in multiple input PO or POT files,
this option requires that it be present in multiple source code files, as
evidenced by location comments in the PO/POT sources.</p>
<p>This threshold can be helpful in eliminating over-specialized terminology that
you don&#8217;t want when multiple PO/POT files are generated from the same sources
(via included header or library files).</p>
<p>Note that some PO/POT files have function names rather than source file names
in the location comments; in this case the threshold will be on multiple
functions, which may need to be set higher to be effective.</p>
<p>Not all PO/POT files contain proper location comments.  If your input files
don&#8217;t have (good) location comments and the output terminology file is reduced
to zero or very few entries by thresholding, you may need to override the
default value for this threshold and set it to 0, which disables this check.</p>
<p>The setting of the <tt class="docutils literal"><span class="pre">--locs-needed</span></tt> option has another effect, which is
that location comments in the output terminology file will be limited to twice
that number; a location comment indicating the number of additional locations
not specified will be added instead of the omitted locations.</p>
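<p>The truncation rule described above can be sketched like this.  The function
name <tt class="docutils literal"><span class="pre">limit_locations</span></tt> and the wording of the summary entry are assumptions
made for the example; the real output format is not shown here, and the sketch
does not model what the tool does when the threshold is 0:</p>

```python
def limit_locations(locations, locs_needed=2):
    """Keep at most twice locs_needed locations and summarise the rest."""
    limit = 2 * locs_needed
    shown = locations[:limit]
    omitted = len(locations) - len(shown)
    if omitted > 0:
        # Hypothetical wording for the summary comment.
        shown = shown + ["(%d more locations)" % omitted]
    return shown


locations = ["a.c:10", "b.c:22", "c.c:5", "d.c:71", "e.c:3", "f.c:48"]
print(limit_locations(locations, locs_needed=2))
```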
</div>
<div class="section" id="fullmsg-needed-substr-needed">
<h3>&#8211;fullmsg-needed &amp; &#8211;substr-needed<a class="headerlink" href="#fullmsg-needed-substr-needed" title="Permalink to this headline">¶</a></h3>
<p>These two thresholds specify the number of different translation units
(messages) in which a term must appear; they both work in the same way, but the
first one applies to terms which appear as complete translation units in one or
more of the source files (full message terms), and the second one to all other
terms (substring terms).  Note that translations are extracted only for full
message terms; poterminology cannot identify the corresponding substring in a
translation.</p>
<p>If you are working with a single input file without useful location comments,
increasing these thresholds may be the only way to effectively reduce the
output terminology.  Generally, you should increase the <tt class="docutils literal"><span class="pre">--substr-needed</span></tt>
threshold first, as the full message terms are more likely to be useful
terminology.</p>
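<p>As a rough illustration of how the two thresholds divide the work, the sketch
below classifies a term as a full message term when it equals some complete
message, and as a substring term otherwise, then applies the matching minimum.
It uses plain substring matching on lowercase text and a hypothetical helper
name, so treat it as a model of the rule rather than the tool&#8217;s behaviour:</p>

```python
def passes_message_threshold(term, messages, fullmsg_needed=1, substr_needed=2):
    """Apply the full-message or substring minimum to one candidate term."""
    distinct = set(messages)
    count = sum(1 for message in distinct if term in message)
    is_full_message = term in distinct
    needed = fullmsg_needed if is_full_message else substr_needed
    return count >= needed


messages = ["cancel", "cancel upload", "upload file", "file"]
print(passes_message_threshold("cancel", messages))                   # full message term
print(passes_message_threshold("upload", messages))                   # substring term, 2 messages
print(passes_message_threshold("upload", messages, substr_needed=3))  # stricter substring minimum
```

<p>Raising <tt class="docutils literal"><span class="pre">substr_needed</span></tt> removes &#8220;upload&#8221; while leaving the full message
terms untouched, which is why that threshold is the one to increase first.</p>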
</div>
</div>
<div class="section" id="stop-word-files">
<span id="poterminology-stop-word-files"></span><h2>Stop word files<a class="headerlink" href="#stop-word-files" title="Permalink to this headline">¶</a></h2>
<p>Much of the power of poterminology in generating useful terminology files is
due to the default stop word file that it uses.  This file contains words and
regular expressions that poterminology will ignore when generating terms, so
that the output terminology doesn&#8217;t have tons of useless entries like &#8220;the 16&#8221;
or &#8220;Z&#8221;.</p>
<p>In most cases, the default stop word list will work well, but you may want to
replace it with your own version, or possibly just supplement or override
certain entries.  The default <a class="reference internal" href="poterminology_stopword_file.html"><em>poterminology stopword file</em></a> contains comments that describe the syntax and
operation of these files.</p>
<p>If you want to completely replace the stopword list (for example, if your
source language is French rather than English) you could do it with a command
like this:</p>
<div class="highlight-python"><pre>poterminology --stopword-list=stoplist-fr logiciel/ -o glossaire.po</pre>
</div>
<p>If you merely want to modify the standard stopword list with your own additions
and overrides, you must explicitly specify the default list first:</p>
<div class="highlight-python"><pre>poterminology -S /usr/lib/python2.5/site-packages/translate/share/stoplist-en \
  -S my-stoplist po/ -o terminology.po</pre>
</div>
<p>You can use poterminology <tt class="docutils literal"><span class="pre">--help</span></tt> to see the default stopword list
pathname, which may differ from the one shown above.</p>
<p>Note that if you are using multiple stopword list files, as in the above, they
will all be subject to the same case mapping (fold &#8220;Title Case&#8221; to lower case
by default) &#8211; if you specify a different case mapping in the second file it
will override the mapping for all the stopword list files.</p>
</div>
<div class="section" id="issues">
<span id="poterminology-issues"></span><h2>Issues<a class="headerlink" href="#issues" title="Permalink to this headline">¶</a></h2>
<p>When using poterminology on Windows systems, file globbing for input is not
supported (unless you have a version of Python built with cygwin, which is not
common).  On Windows, a command like <tt class="docutils literal"><span class="pre">poterminology</span> <span class="pre">-o</span> <span class="pre">test.po</span> <span class="pre">podir/*.po</span></tt>
will fail with an error &#8220;No such file or directory: &#8216;podir\*.po&#8217;&#8221; instead of
expanding the podir/*.po glob expression.  (This problem affects all Translate
Toolkit command-line tools, not just poterminology.)  You can work around this
problem by making sure that the directory does not contain any files (or
subdirectories) that you do not want to use for input, and just giving the
directory name as the argument, e.g. <tt class="docutils literal"><span class="pre">poterminology</span> <span class="pre">-o</span> <span class="pre">test.po</span> <span class="pre">podir</span></tt> for the
case above.</p>
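<p>If the directory cannot be cleaned up, another possible workaround is to
expand the pattern in Python and pass the resulting file names to the command.
The sketch below only builds the argument list; <tt class="docutils literal"><span class="pre">build_command</span></tt> is a
hypothetical helper, and it assumes that <tt class="docutils literal"><span class="pre">poterminology</span></tt> can be found on the
search path:</p>

```python
import glob


def build_command(pattern, output):
    """Expand pattern in Python and return an argument list for poterminology."""
    inputs = sorted(glob.glob(pattern))
    if not inputs:
        raise ValueError("no input files match %r" % pattern)
    return ["poterminology", "-o", output] + inputs


# To run the command (not executed here):
#   import subprocess
#   subprocess.check_call(build_command("podir/*.po", "test.po"))
```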
<p>When using terminology files generated by poterminology as input, a plethora of
translator comments marked with (poterminology) may be generated, with the
number of these increasing on each iteration.  You may wish to run
<a class="reference internal" href="pocommentclean.html"><em>pocommentclean</em></a> (or a slightly modified version of it which only removes
(poterminology) comments) on the input and/or output files, especially since
translator comments are displayed as tooltips by Pootle (thankfully, they are
truncated at a few dozen characters).</p>
<p>Default threshold settings may eliminate all output terms; in this case,
poterminology should suggest threshold option settings that would allow output
to be generated (this enhancement is tracked as <a class="reference external" href="http://bugs.locamotion.org/show_bug.cgi?id=582">bug 582</a>).</p>
<p>While poterminology ignores XML/HTML entities and elements and %-style format
strings (for C and Python), it does not ignore all types of &#8220;variables&#8221; that
may occur, particularly in OpenOffice.org, Mozilla, or Gnome localization
files.  These other types should be ignored as well (this enhancement is
tracked as <a class="reference external" href="http://bugs.locamotion.org/show_bug.cgi?id=598">bug 598</a>).</p>
<p>Terms containing only words that are ignored individually, but not excluded
from phrases (e.g. &#8220;you are you&#8221;) may be generated by poterminology, but aren&#8217;t
generally useful.  Adding a new threshold option <tt class="docutils literal"><span class="pre">--nonstop-needed</span></tt> could
allow these to be suppressed (this enhancement is tracked as <a class="reference external" href="http://bugs.locamotion.org/show_bug.cgi?id=1102">bug 1102</a>).</p>
<p>Pootle ignores parenthetical comments in source text when performing
terminology matching; this allows for terms like &#8220;scan (verb)&#8221; and &#8220;scan
(noun)&#8221; to both be provided as suggestions for a message containing &#8220;scan.&#8221;
poterminology does not provide any special handling for these, but it could use
them to provide better handling of different translations for a single term.
This would be an improvement over the current approach, which marks the term
fuzzy and includes all variants, with location information in {} braces in the
automatically extracted translation.</p>
<p>Currently, message context information (PO msgctxt) is not used in any way;
this could provide an additional source of information for distinguishing
variants of the same term.</p>
<p>A single execution of poterminology can only perform automatic translation
extraction for a single target language &#8211; having the ability to handle all
target languages in one run would allow a single command to generate all
terminology for an entire project.  Additionally, this could provide even more
information for identifying variant terms by comparing the number of target
languages that have variant translations.</p>
</div>
<div class="section" id="on-single-files">
<span id="poterminology-on-single-files"></span><h2>On single files<a class="headerlink" href="#on-single-files" title="Permalink to this headline">¶</a></h2>
<p>If poterminology yields 0 terms from single files, try the following:</p>
<div class="highlight-python"><pre>poterminology --locs-needed=0 --inputs-needed=0 --substr-needed=5 -i yourfile.po -o yourfile_term.po</pre>
</div>
<p>...where <tt class="docutils literal"><span class="pre">--substr-needed</span></tt> is the minimum number of different messages in
which a substring-only term must appear to be considered.</p>
</div>
</div>


</div>
<hr />

<footer class="footer">
  <div class="container">
    <p class="pull-right"><a href="#">Back to top ↑</a></p>
    <ul class="unstyled muted">
      <li><small>
        &copy; 2012, Translate.org.za.<br/>
      </small></li>
      <li><small>
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
      </small></li>
    </ul>
  </div>
</footer>
  </body>
</html>