<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Creating a terminology list from your existing translations &mdash; Translate Toolkit 1.9.0 documentation</title>
    
    <link rel="stylesheet" href="../_static/basic.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="../_static/bootstrap.css" type="text/css" />
    <link rel="stylesheet" href="../_static/bootstrap-sphinx.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '1.9.0',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <script type="text/javascript" src="../_static/bootstrap.js"></script>
    <script type="text/javascript" src="../_static/bootstrap-sphinx.js"></script>
    <link rel="top" title="Translate Toolkit 1.9.0 documentation" href="../index.html" />
    <link rel="up" title="Use Cases" href="index.html" />
    <link rel="next" title="Running the tools on Microsoft Windows" href="running_the_tools_on_microsoft_windows.html" />
    <link rel="prev" title="Checking for inconsistencies in your translations" href="checking_for_inconsistencies.html" /> 
  </head>
  <body>
  <div id="navbar" class="navbar navbar-fixed-top">
    <div class="navbar-inner">
      <div class="container-fluid">
        <a class="brand" href="../index.html">Translate Toolkit</a>
        <span class="navbar-text pull-left"><b>1.9.0</b></span>
          <ul class="nav">
            <li class="divider-vertical"></li>
            
              <li class="dropdown">
  <a href="#" class="dropdown-toggle" data-toggle="dropdown">Site <b class="caret"></b></a>
  <ul class="dropdown-menu globaltoc"><ul class="simple">
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../features.html">Features</a></li>
<li class="toctree-l1"><a class="reference internal" href="../installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="../commands/index.html">Converters</a></li>
<li class="toctree-l1"><a class="reference internal" href="../commands/index.html#tools">Tools</a></li>
<li class="toctree-l1"><a class="reference internal" href="../commands/index.html#scripts">Scripts</a></li>
<li class="toctree-l1"><a class="reference internal" href="index.html">Use Cases</a></li>
<li class="toctree-l1"><a class="reference internal" href="../formats/index.html">Supported formats</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../styleguide.html">Translate Styleguide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../styleguide.html#documentation">Documentation</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/building.html">Building</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/contributing.html">Contributing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../development/developers.html">Translate Toolkit Developers Guide</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../api/index.html">API</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../changelog.html">Important Changes</a></li>
<li class="toctree-l1"><a class="reference internal" href="../history.html">History of the Translate Toolkit</a></li>
<li class="toctree-l1"><a class="reference internal" href="../license.html">License</a></li>
</ul>
</ul>
</li>
              
<li class="dropdown">
  <a href="#" class="dropdown-toggle" data-toggle="dropdown">Page <b class="caret"></b></a>
  <ul class="dropdown-menu localtoc"><ul>
<li><a class="reference internal" href="#">Creating a terminology list from your existing translations</a><ul>
<li><a class="reference internal" href="#quick-overview">Quick Overview</a></li>
<li><a class="reference internal" href="#get-short-phrases-from-the-current-translations">Get short phrases from the current translations</a></li>
<li><a class="reference internal" href="#remove-any-translations-with-issues">Remove any translations with issues</a></li>
<li><a class="reference internal" href="#create-a-compendium">Create a compendium</a></li>
<li><a class="reference internal" href="#split-the-file">Split the file</a></li>
<li><a class="reference internal" href="#dealing-with-the-fuzzies">Dealing with the fuzzies</a></li>
<li><a class="reference internal" href="#put-it-back-together-again">Put it back together again</a></li>
<li><a class="reference internal" href="#create-other-formats">Create other formats</a></li>
</ul>
</li>
<li><a class="reference internal" href="#the-work-has-only-just-begun">The work has only just begun</a></li>
</ul>
</ul>
</li>
            
            
              
  <li><a href="checking_for_inconsistencies.html"
         title="previous chapter">&laquo; Checking for inconsistencies in your translations</a></li>
  <li><a href="running_the_tools_on_microsoft_windows.html"
         title="next chapter">Running the tools on Microsoft Windows &raquo;</a></li>
            
            
              
            
          </ul>
          
            
<form class="navbar-search pull-right" action="../search.html" method="get">
  <input type="text" name="q" placeholder="Search" />
  <input type="hidden" name="check_keywords" value="yes" />
  <input type="hidden" name="area" value="default" />
</form>
          
          </ul>
        </div>
      </div>
    </div>
  </div>

<div class="container content">
   
  <div class="section" id="creating-a-terminology-list-from-your-existing-translations">
<span id="id1"></span><h1>Creating a terminology list from your existing translations<a class="headerlink" href="#creating-a-terminology-list-from-your-existing-translations" title="Permalink to this headline">¶</a></h1>
<p>If you did not create a terminology list when you started your translation
project, or if you have inherited some old translations, you probably now want to
create a terminology list.</p>
<p>A terminology list or glossary is a list of words and phrases with their
expected translations.  It is useful for ensuring that your translations are
consistent across your project.</p>
<p>Your existing translations already contain an embedded list of valid
translations.  This example will help you to extract the terms.  It is only the
first step: you will need to review the terms and must not regard this as a
complete list.  And of course you will want to take your corrections and feed
them back into the original translations.</p>
<div class="section" id="quick-overview">
<span id="creating-a-terminology-list-from-your-existing-translations-quick-overview"></span><h2>Quick Overview<a class="headerlink" href="#quick-overview" title="Permalink to this headline">¶</a></h2>
<p>This describes a multi-stage process for extracting terminology from
translation files.  It is provided for historical interest and completeness,
but you will probably find that using <a class="reference internal" href="../commands/poterminology.html"><em>poterminology</em></a> is easier
and will give better results than following this process.</p>
<ul class="simple">
<li>Filter out phrases of more than N words</li>
<li>Remove obviously erroneous phrases such as numbers and punctuation</li>
<li>Create a single PO compendium</li>
<li>Extract and review items that are fuzzy and drop untranslated items</li>
<li>Create a new PO file and convert it into CSV and TMX formats</li>
</ul>
</div>
<div class="section" id="get-short-phrases-from-the-current-translations">
<span id="creating-a-terminology-list-from-your-existing-translations-get-short-phrases-from-the-current-translations"></span><h2>Get short phrases from the current translations<a class="headerlink" href="#get-short-phrases-from-the-current-translations" title="Permalink to this headline">¶</a></h2>
<p>We will not be able to identify terminology within bodies of text, so we are
only going to extract short bits of text, i.e. ones that are between 1 and 3
words long.</p>
<div class="highlight-python"><pre>pogrep --header --search=msgid -e '^\w+(\s+\w+){0,2}$' zulu zulu-short</pre>
</div>
<p>We use <tt class="docutils literal"><span class="pre">--header</span></tt> to ensure that the PO files have a header entry (which
is important for encoding).  We search only in the msgid, and the regular
expression matches a string with between 1 and 3 words in it.  We search
through the folder <em>zulu</em> and write the result to
<em>zulu-short</em>.</p>
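<p>To see what the expression accepts, here is a minimal Python sketch.  It
assumes that pogrep interprets the expression the way Python's <tt class="docutils literal"><span class="pre">re</span></tt> module
does; the helper name and the sample strings are made up for illustration.</p>

```python
import re

# Same pattern as in the pogrep command above: one word, optionally
# followed by up to two more whitespace-separated words.
SHORT_PHRASE = re.compile(r"^\w+(\s+\w+){0,2}$")


def is_short_phrase(msgid):
    """Return True if msgid consists of one to three words (hypothetical helper)."""
    return SHORT_PHRASE.match(msgid) is not None


# Illustrative sample strings, not taken from any real PO file.
samples = ["File", "Save As", "Open new file", "Open a new file", "Save as...", "_File"]
results = {text: is_short_phrase(text) for text in samples}
```

<p>Note that a string with punctuation, such as a trailing ellipsis, is rejected,
while a leading underscore accelerator is accepted because <tt class="docutils literal"><span class="pre">\w</span></tt> matches the
underscore.</p>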
</div>
<div class="section" id="remove-any-translations-with-issues">
<span id="creating-a-terminology-list-from-your-existing-translations-remove-any-translations-with-issues"></span><h2>Remove any translations with issues<a class="headerlink" href="#remove-any-translations-with-issues" title="Permalink to this headline">¶</a></h2>
<p>You can, for instance, remove all entries that consist of only a single
character.  This is useful for eliminating all those spurious accelerator keys.</p>
<div class="highlight-python"><pre>pogrep --header --search=msgid -v -e "^.$" zulu-short zulu-short-clean</pre>
</div>
<p>We use the <tt class="docutils literal"><span class="pre">-v</span></tt> option to invert the search.  Our <em>cleaner</em> potential
glossary words are now in <em>zulu-short-clean</em>.  What you can eliminate is
limited only by your ability to build regular expressions, but you could eliminate:</p>
<ul class="simple">
<li>Entries with only numbers</li>
<li>Entries that only contain punctuation</li>
</ul>
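<p>The patterns below are one possible way to express those two cases, sketched
in Python so they can be tried out before being passed to pogrep.  They are
assumptions rather than part of the toolkit, and the helper name and sample
strings are hypothetical.</p>

```python
import re

# Candidate patterns for entries that are unlikely to be terminology.
SINGLE_CHARACTER = re.compile(r"^.$")      # the pattern used with -v above
ONLY_DIGITS = re.compile(r"^\d+$")         # entries with only numbers
ONLY_PUNCTUATION = re.compile(r"^[\W_]+$") # entries with only punctuation


def is_noise(msgid):
    """Return True if msgid matches any of the candidate patterns."""
    patterns = (SINGLE_CHARACTER, ONLY_DIGITS, ONLY_PUNCTUATION)
    return any(pattern.match(msgid) for pattern in patterns)


# Illustrative sample strings.
samples = ["F", "2008", "...", "File", "Page 2"]
noise = [text for text in samples if is_noise(text)]
```

<p>Each pattern could then be tried in its own inverted pogrep run, in the same
way as the single-character example.</p>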
</div>
<div class="section" id="create-a-compendium">
<span id="creating-a-terminology-list-from-your-existing-translations-create-a-compendium"></span><h2>Create a compendium<a class="headerlink" href="#create-a-compendium" title="Permalink to this headline">¶</a></h2>
<p>Now that we have our words we want to create a single file of all terminology.
Thus we create a PO compendium:</p>
<div class="highlight-python"><pre>~/path/to/pocompendium -i -su zulu-gnome-glossary.po -d zulu-short-clean</pre>
</div>
<p>You can use various methods but our bash script is quite good.  Here we ignore
case, <tt class="docutils literal"><span class="pre">-i</span></tt>, and ignore the underscore (_) accelerator key, <tt class="docutils literal"><span class="pre">-su</span></tt>,
outputting the results in <em>zulu-gnome-glossary.po</em>.</p>
<p>We now have a single file containing all glossary terms and the clean up and
review can begin.</p>
</div>
<div class="section" id="split-the-file">
<span id="creating-a-terminology-list-from-your-existing-translations-split-the-file"></span><h2>Split the file<a class="headerlink" href="#split-the-file" title="Permalink to this headline">¶</a></h2>
<p>We want to split the file into translated, untranslated and fuzzy entries:</p>
<div class="highlight-python"><pre>~/path/to/posplit ./zulu-gnome-glossary.po</pre>
</div>
<p>This will create three files:</p>
<ul class="simple">
<li>zulu-gnome-glossary-translated.po &#8211; all fully translated entries</li>
<li>zulu-gnome-glossary-untranslated.po &#8211; messages with no translation</li>
<li>zulu-gnome-glossary-fuzzy.po &#8211; words that need investigation</li>
</ul>
<div class="highlight-python"><pre>rm zulu-gnome-glossary-untranslated.po</pre>
</div>
<p>We discard <tt class="docutils literal"><span class="pre">zulu-gnome-glossary-untranslated.po</span></tt> since untranslated entries are of
no use to us.</p>
</div>
<div class="section" id="dealing-with-the-fuzzies">
<span id="creating-a-terminology-list-from-your-existing-translations-dealing-with-the-fuzzies"></span><h2>Dealing with the fuzzies<a class="headerlink" href="#dealing-with-the-fuzzies" title="Permalink to this headline">¶</a></h2>
<p>The fuzzies come in two kinds: those that are simply wrong or need updating,
and those where there was more than one translation for a given term.  So if
someone had translated &#8216;File&#8217; differently across the translations we&#8217;d have an
entry that was marked fuzzy with the two options displayed.</p>
<div class="highlight-python"><pre>pofilter -t compendiumconflicts zulu-gnome-glossary-fuzzy.po zulu-gnome-glossary-conflicts.po</pre>
</div>
<p>These compendium conflicts are what we are interested in, so we use pofilter to
separate them from the other fuzzies.</p>
<div class="highlight-python"><pre>rm zulu-gnome-glossary-fuzzy.po</pre>
</div>
<p>We discard the other fuzzies as they were probably wrong in the first place.
You could review these but it is not recommended.</p>
<p>Now edit <tt class="docutils literal"><span class="pre">zulu-gnome-glossary-conflicts.po</span></tt> to resolve the conflicts.  You
can edit them however you like but we usually follow the format:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">option1</span><span class="p">,</span> <span class="n">option2</span><span class="p">,</span> <span class="n">option3</span>
</pre></div>
</div>
<p>You can get them into that layout by doing the following:</p>
<div class="highlight-python"><pre>sed '/#, fuzzy/d; /\"#-#-#-#-# /d; /# (pofilter) compendiumconflicts:/d; s/\\n"$/, "/' zulu-gnome-glossary-conflicts.po &gt; tmp.po
msgcat tmp.po &gt; zulu-gnome-glossary-conflicts.po</pre>
</div>
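<p>To make the idea of a conflict concrete, here is a minimal Python sketch that
is independent of the toolkit.  It groups hypothetical (source, target) pairs
and joins competing translations in the comma-separated layout described
above; the function name and the sample data are placeholders.</p>

```python
from collections import OrderedDict


def merge_options(pairs):
    """Map each source term to its distinct translations, in first-seen order."""
    merged = OrderedDict()
    for source, target in pairs:
        options = merged.setdefault(source, [])
        if target not in options:
            options.append(target)
    return merged


# Placeholder data: "File" has been translated in two different ways.
pairs = [
    ("File", "option1"),
    ("Edit", "option3"),
    ("File", "option2"),
    ("File", "option1"),
]

merged = merge_options(pairs)
# A term is in conflict when it has more than one distinct translation.
conflicts = [source for source, options in merged.items() if len(options) > 1]
layout = {source: ", ".join(options) for source, options in merged.items()}
```
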
<p>Of course, if a word is clearly wrong, misspelled, etc., then you can eliminate
it.  Often you will find the &#8220;problem&#8221; relates to the part of speech of the
source word and that indeed there are two options depending on the context.</p>
<p>You now have a cleaned fuzzy file and we are ready to proceed.</p>
</div>
<div class="section" id="put-it-back-together-again">
<span id="creating-a-terminology-list-from-your-existing-translations-put-it-back-together-again"></span><h2>Put it back together again<a class="headerlink" href="#put-it-back-together-again" title="Permalink to this headline">¶</a></h2>
<div class="highlight-python"><pre>msgcat zulu-gnome-glossary-translated.po zulu-gnome-glossary-conflicts.po &gt; zulu-gnome-glossary.po</pre>
</div>
<p>We now have a single file <tt class="docutils literal"><span class="pre">zulu-gnome-glossary.po</span></tt> which contains our
glossary texts.</p>
</div>
<div class="section" id="create-other-formats">
<span id="creating-a-terminology-list-from-your-existing-translations-create-other-formats"></span><h2>Create other formats<a class="headerlink" href="#create-other-formats" title="Permalink to this headline">¶</a></h2>
<p>It is probably good to make your terminology available in other formats.  You
can create CSV and TMX files from your PO.</p>
<div class="highlight-python"><pre>po2csv zulu-gnome-glossary.po zulu-gnome-glossary.csv
po2tmx -l zu zulu-gnome-glossary.po zulu-gnome-glossary.tmx</pre>
</div>
<p>For the terminology to be usable by Trados or Wordfast translators it needs to
be in one of the following formats:</p>
<ul class="simple">
<li>Trados &#8211; comma delimited file <tt class="docutils literal"><span class="pre">source,target</span></tt></li>
<li>Wordfast &#8211; tab delimited file <tt class="docutils literal"><span class="pre">source[tab]target</span></tt></li>
</ul>
<p>In that format they are now available to almost all localisers in the world.</p>
<p>FIXME need scripts to generate these formats.</p>
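<p>A minimal Python sketch of such a conversion follows.  It assumes that the CSV
text has three columns (location, source, target), possibly with a header row,
which should be checked against the file your version of po2csv actually
writes.  The function names and the sample rows are hypothetical.</p>

```python
import csv
import io


def glossary_rows(csv_text):
    """Yield (source, target) pairs from three-column CSV text.

    Assumes each row is location, source, target.  Rows with an empty
    source or target, and a literal header row, are skipped.
    """
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue
        _location, source, target = row
        if not source or not target:
            continue
        if (source, target) == ("source", "target"):
            continue
        yield source, target


def write_delimited(pairs, delimiter):
    """Return the pairs as delimited text, one source/target pair per line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerows(pairs)
    return out.getvalue()


# Made-up sample input standing in for the contents of a CSV file.
sample = (
    '"location","source","target"\n'
    '"menu.c:10","File","ifayela"\n'
    '"menu.c:12","Edit",""\n'
)

pairs = list(glossary_rows(sample))
comma_text = write_delimited(pairs, ",")   # source,target
tab_text = write_delimited(pairs, "\t")    # source[tab]target
```

<p>In practice you would read the text from <tt class="docutils literal"><span class="pre">zulu-gnome-glossary.csv</span></tt> and write
the two results to files of your choice.</p>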
</div>
</div>
<div class="section" id="the-work-has-only-just-begun">
<span id="creating-a-terminology-list-from-your-existing-translations-the-work-has-only-just-begun"></span><h1>The work has only just begun<a class="headerlink" href="#the-work-has-only-just-begun" title="Permalink to this headline">¶</a></h1>
<p>The lists you have just created are useful in their own right, but you will
most likely want to keep growing, cleaning and improving them.</p>
<p>You should as a first step review what you have created and fix spelling and
other errors or disambiguate terms as needed.</p>
<p>But congratulations: a terminology list or glossary is one of your most
important assets for creating good and consistent translations, and it acts as a
valuable resource for both new and experienced translators when they need
prompting as to how to translate a term.</p>
</div>


</div>
<hr>

<footer class="footer">
  <div class="container">
    <p class="pull-right"><a href="#">Back to top ↑</a></p>
    <ul class="unstyled muted">
      <li><small>
        &copy; 2012, Translate.org.za.<br/>
      </small></li>
      <li><small>
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
      </small></li>
    </ul>
  </div>
</footer>
  </body>
</html>