  <sect1>
	<title>Linguistic Classes Example Code</title>

    <para>
    </para>
    <sect2>
      <title>Adding basic information to an EST_Item</title>
      <para>
 
 An item such as 
 <graphic fileref="../arch_doc/eq01.gif" format="gif"></graphic> 
 is constructed as follows. (Note that
 the attributes are in capitals by linguistic convention only:
 attribute names are case sensitive and can be upper or lower
 case.)
      </para>
      <programlisting arch='c'>  EST_Item p;
  
  p.set("POS", "Noun");
  p.set("NAME", "example");
  p.set("FOCUS", "+");
  p.set("DURATION", 2.76);
  p.set("STRESS", 2);      </programlisting>
      <para>
Feature values are stored in the
 <classname>EST_Val</classname> class, a union that can
 hold ints, floats, EST_Strings, void pointers, and
 <classname>EST_Features</classname>. Because C++ supports function
 overloading, the same <function>set()</function> function can be
 used for all of these. 
      </para>
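      <para>
 The overloading idea can be illustrated without the library. The
 following is a minimal sketch only: <classname>Val</classname>,
 <classname>Features</classname>, <function>type_of()</function> and
 <function>make_example()</function> are hypothetical stand-ins written
 with the standard library, not the EST classes, and only three of the
 value types are modelled.
      </para>
      <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for EST_Val: remembers which type was stored.
struct Val {
    enum Type { Int, Float, String };
    Type type;
    int i;
    double f;
    std::string s;
    Val() : type(Int), i(0), f(0.0) {}
};

// Hypothetical stand-in for the feature set of an item.
class Features {
    std::map<std::string, Val> vals;
public:
    // One set() per value type; the compiler picks the matching overload.
    void set(const std::string &name, int v) {
        Val x; x.type = Val::Int; x.i = v; vals[name] = x;
    }
    void set(const std::string &name, double v) {
        Val x; x.type = Val::Float; x.f = v; vals[name] = x;
    }
    void set(const std::string &name, const char *v) {
        assert(v != 0);
        Val x; x.type = Val::String; x.s = v; vals[name] = x;
    }
    // Type tag of a stored feature; the feature must exist.
    Val::Type type_of(const std::string &name) const {
        std::map<std::string, Val>::const_iterator it = vals.find(name);
        assert(it != vals.end());
        return it->second.type;
    }
};

// Usage: the same call syntax stores three different types.
Features make_example() {
    Features p;
    p.set("POS", "Noun");      // const char * overload
    p.set("DURATION", 2.76);   // double overload
    p.set("STRESS", 2);        // int overload
    return p;
}
```
]]></programlisting>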
    </sect2>
    <sect2>
      <title>Accessing basic information in an Item</title>
      <para>
 
 When accessing the features, the type must be
 specified. This is done most easily by using one of a series of
 functions whose return type is encoded by a capital letter:
 </para>
 <formalpara><title><function>F()</function></title><para> return value as a 
 float</para></formalpara>
 <formalpara><title><function>I()</function></title><para> return value as an
	    integer</para></formalpara>
 <formalpara><title><function>S()</function></title><para> return value as an
       EST_String</para></formalpara>
 <formalpara><title><function>A()</function></title><para> return value as an
       EST_Features</para></formalpara>
 <para>
      </para>
      <programlisting arch='c'>  cout &lt;&lt; "Part of speech for p is " &lt;&lt; p.S("POS") &lt;&lt; endl;
  cout &lt;&lt; "Duration for p is " &lt;&lt; p.F("DURATION") &lt;&lt; endl;
  cout &lt;&lt; "Stress value for p is " &lt;&lt; p.I("STRESS") &lt;&lt; endl;      </programlisting>
      <para>
</para>
 <SIDEBAR>
 <TITLE>Output</TITLE>
 <screen>
 "Noun"
 2.76
 2
 </screen>
 </SIDEBAR>
 <para>
 An optional default value can be given if a result is always desired:
      </para>
      <programlisting arch='c'>  cout &lt;&lt; "Part of speech for p is " 
      &lt;&lt; p.S("POS") &lt;&lt; endl;
  cout &lt;&lt; "Syntactic Category for p is " 
      &lt;&lt; p.S("CAT", "Noun") &lt;&lt; endl; // <lineannotation>no error: the default is returned</lineannotation>      </programlisting>
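      <para>
 The difference between the two forms can be modelled as follows. This
 is a sketch under assumptions: <classname>Item</classname> here is a
 hypothetical string-only stand-in rather than
 <classname>EST_Item</classname>, and it assumes that asking for a
 missing feature without a default is reported as an error (modelled
 with an exception).
      </para>
      <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an item that stores string features only.
class Item {
    std::map<std::string, std::string> feats;
public:
    void set(const std::string &name, const std::string &value) {
        assert(!name.empty());
        feats[name] = value;
    }
    // Strict accessor: a missing feature is reported as an error.
    std::string S(const std::string &name) const {
        std::map<std::string, std::string>::const_iterator it = feats.find(name);
        if (it == feats.end())
            throw std::runtime_error("no feature named " + name);
        return it->second;
    }
    // Lenient accessor: a missing feature yields the caller's default.
    std::string S(const std::string &name, const std::string &def) const {
        std::map<std::string, std::string>::const_iterator it = feats.find(name);
        return it == feats.end() ? def : it->second;
    }
};

// Usage: "CAT" was never set, so only the two-argument form succeeds.
std::string category_or_default() {
    Item p;
    p.set("POS", "Noun");
    return p.S("CAT", "unknown");
}
```
]]></programlisting>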
    </sect2>
    <sect2>
      <title>Nested feature structures in items</title>
      <para>
 
 Nested feature structures such as <xref linkend="eq11"> 
 <example ID="eq11">
   <title>Example eq11</title>
 <graphic fileref="../arch_doc/eq05.gif" format="gif"></graphic>
 </example>
 can be created in a number of ways:
      </para>
      <programlisting arch='c'>  
  p.set("NAME", "d");
  p.set("VOICE", "+");
  p.set("CONTINUANT", "-");
  p.set("SONORANT", "-");

  EST_Features f;  
  p.set("PLACE OF ARTICULATION", f); // <lineannotation>copy in empty feature set here</lineannotation>
  
  p.A("PLACE OF ARTICULATION").set("CORONAL", "+");
  p.A("PLACE OF ARTICULATION").set("ANTERIOR", "+");      </programlisting>
      <para>
or by filling the values in an EST_Features object and
 copying it in:
      </para>
      <programlisting arch='c'>  EST_Features f2;
  
  f2.set("CORONAL", "+");
  f2.set("ANTERIOR", "+");
  
  p.set("PLACE OF ARTICULATION", f2);      </programlisting>
      <para>
Nested features can be accessed by multiple calls to the
 accessing commands:
      </para>
      <programlisting arch='c'>  cout &lt;&lt; "Anterior value is: " &lt;&lt; p.A("PLACE OF ARTICULATION").S("ANTERIOR");
  cout &lt;&lt; "Coronal value is: " &lt;&lt; p.A("PLACE OF ARTICULATION").S("CORONAL");      </programlisting>
      <para>
The first command is <function>A()</function> because PLACE OF ARTICULATION is a
 feature structure, and the second command is
 <function>S()</function> because it returns a string (the
 value of ANTERIOR or CORONAL). A shorthand is provided to
 extract the value in a single statement:
      </para>
      <programlisting arch='c'>  cout &lt;&lt; "Anterior value is: " &lt;&lt; p.S("PLACE OF ARTICULATION.ANTERIOR");
  cout &lt;&lt; "Coronal value is: " &lt;&lt; p.S("PLACE OF ARTICULATION.CORONAL");      </programlisting>
      <para>
Again, because the last value to be returned is a string,
 <function>S()</function> must be used. This shorthand can also be used
 to set the features:
      </para>
      <programlisting arch='c'>  
  p.set("PLACE OF ARTICULATION.CORONAL", "+");
  p.set("PLACE OF ARTICULATION.ANTERIOR", "+");      </programlisting>
      <para>
This is the easiest and most commonly used method.
      </para>
      <sect3>
        <title>Utility functions for items</title>
        <para>
 The presence of an attribute can be checked using
 <function>f_present()</function>, which returns true if the
  attribute is in the item:
        </para>
        <programlisting arch='c'>  cout &lt;&lt; "This is true: " &lt;&lt; p.f_present("PLACE OF ARTICULATION");
  cout &lt;&lt; "This is false: " &lt;&lt; p.f_present("MANNER");        </programlisting>
        <para>
An attribute can be removed with <function>f_remove()</function>:
        </para>
        <programlisting arch='c'>  p.f_remove("PLACE OF ARTICULATION");        </programlisting>
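        <para>
 One way to picture how a dotted feature name, a presence check and a
 removal might work on nested structures is sketched here. All names
 (<classname>Node</classname>, <function>descend()</function>,
 <function>set_path()</function>, <function>get_path()</function>,
 <function>has_feature()</function>, <function>drop_feature()</function>)
 are hypothetical, only string values are modelled, and this is not
 the library's implementation.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical nested feature structure: string leaves plus named
// sub-structures.
struct Node {
    std::map<std::string, std::string> leaves;
    std::map<std::string, std::shared_ptr<Node> > subs;
};

// Follow one sub-structure per dotted component. Returns the structure
// that owns the final component (stored in leaf), or 0 if a level is
// missing and create is false.
Node *descend(Node &root, const std::string &path, bool create, std::string &leaf) {
    Node *cur = &root;
    std::string::size_type start = 0, dot;
    while ((dot = path.find('.', start)) != std::string::npos) {
        std::string name = path.substr(start, dot - start);
        std::map<std::string, std::shared_ptr<Node> >::iterator it = cur->subs.find(name);
        if (it == cur->subs.end()) {
            if (!create) return 0;
            it = cur->subs.insert(std::make_pair(name, std::make_shared<Node>())).first;
        }
        cur = it->second.get();
        start = dot + 1;
    }
    leaf = path.substr(start);
    return cur;
}

void set_path(Node &root, const std::string &path, const std::string &value) {
    std::string leaf;
    Node *owner = descend(root, path, true, leaf);
    assert(owner != 0);   // cannot fail when create is true
    owner->leaves[leaf] = value;
}

std::string get_path(Node &root, const std::string &path, const std::string &def) {
    std::string leaf;
    Node *owner = descend(root, path, false, leaf);
    if (owner == 0) return def;
    std::map<std::string, std::string>::const_iterator it = owner->leaves.find(leaf);
    return it == owner->leaves.end() ? def : it->second;
}

// Top-level presence check and removal, covering leaves and sub-structures.
bool has_feature(const Node &root, const std::string &name) {
    return root.leaves.count(name) != 0 || root.subs.count(name) != 0;
}

void drop_feature(Node &root, const std::string &name) {
    root.leaves.erase(name);
    root.subs.erase(name);
}
```
]]></programlisting>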
      </sect3>
      <sect3>
        <title>Building a linear list relation</title>
        <para>
  <!--  *** UPDATE *** -->      
 	
 	It is standard to store the phones for an utterance as a linear list
 	in an EST_Relation object. Each phone is represented by one
 	EST_Item, whereas the complete list is stored as an
 	EST_Relation.
 	</para><para>
 	The easiest way to build a linear list is by using
 	<function>EST_Relation.append()</function>, which, when called
 	without arguments, makes a new empty EST_Item, adds it onto
 	the end of the relation and returns a pointer to it. The
 	information relevant to that phone can then be added to the
 	returned item.
        </para>
        <programlisting arch='c'>  EST_Relation phones;
  EST_Item *a;
  
  a = phones.append();
  
  a-&gt;set("NAME", "f");
  a-&gt;set("TYPE", "consonant");
  
  a = phones.append();
  
  a-&gt;set("NAME", "o");
  a-&gt;set("TYPE", "vowel");
  
  a = phones.append();
  
  a-&gt;set("NAME", "r");
  a-&gt;set("TYPE", "consonant");        </programlisting>
        <para>
Note that the -&gt; operator is used because <literal>a</literal> is a
 pointer to an EST_Item here. The same pointer variable can be used multiple
 times because every time <function>append()</function> is
 called it allocates a new item and returns a pointer to it.
 </para><para>
 If you already have an EST_Item pointer and want to add it to a
 relation, you can give it as an argument to
 <function>append()</function>, but this is generally
 inadvisable as it involves some unnecessary copying, and
 you have to allocate the memory for each new EST_Item
 yourself every time (if you don't, you will overwrite the
 previous one):
        </para>
        <programlisting arch='c'>  a = new EST_Item;
  a-&gt;set("NAME", "m");
  a-&gt;set("TYPE", "consonant");
  
  phones.append(a);
  
  a = new EST_Item;
  a-&gt;set("NAME", "ei");
  a-&gt;set("TYPE", "vowel");        </programlisting>
        <para>
Items can be prepended in exactly the same way:
        </para>
        <programlisting arch='c'>  a = phones.prepend();
  
  a-&gt;set("NAME", "n");
  a-&gt;set("TYPE", "consonant");
  
  a = phones.prepend();
  
  a-&gt;set("NAME", "i");
  a-&gt;set("TYPE", "vowel");
	        </programlisting>
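        <para>
 The reason a single pointer variable can be reused is easier to see
 in a small model. The sketch below is an assumption-laden stand-in:
 <classname>Relation</classname> and <classname>Item</classname> are
 hypothetical classes built on <classname>std::list</classname>, not the
 EST types, and <function>names()</function> and
 <function>build_phones()</function> exist only for the illustration.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for an item with two features.
struct Item {
    std::string name;
    std::string type;
};

// Hypothetical stand-in for a linear relation.
class Relation {
    // std::list never relocates its elements, so returned pointers stay valid.
    std::list<Item> items;
public:
    // Create an empty item at the end and hand back a pointer to it.
    Item *append()  { items.push_back(Item());  return &items.back(); }
    // Create an empty item at the front and hand back a pointer to it.
    Item *prepend() { items.push_front(Item()); return &items.front(); }
    // Names from head to tail, separated by spaces.
    std::string names() const {
        std::string out;
        for (std::list<Item>::const_iterator it = items.begin(); it != items.end(); ++it) {
            if (!out.empty()) out += ' ';
            out += it->name;
        }
        return out;
    }
};

// Usage: the pointer a is reassigned each time, the items themselves persist.
std::string build_phones() {
    Relation phones;
    Item *a;
    a = phones.append();  a->name = "f"; a->type = "consonant";
    a = phones.append();  a->name = "o"; a->type = "vowel";
    a = phones.append();  a->name = "r"; a->type = "consonant";
    a = phones.prepend(); a->name = "n"; a->type = "consonant";
    a = phones.prepend(); a->name = "i"; a->type = "vowel";
    assert(a->name == "i");   // a now points at the new head item
    return phones.names();    // "i n f o r"
}
```
]]></programlisting>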
      </sect3>
      <sect3>
        <title>Iterating through a linear list relation</title>
        <para>
 Iteration in lists is performed with
 <function>next()</function> and <function>prev()</function>, using
 an EST_Item pointer as the iteration variable.
        </para>
        <programlisting arch='c'>  EST_Item *s;

  for (s = phones.head(); s != 0; s = next(s))
    cout &lt;&lt; s-&gt;S("NAME") &lt;&lt; endl;        </programlisting>
        <para>
</para>
 <SIDEBAR>
 <TITLE>Output</TITLE>
 <screen>
 name:i    type:vowel
 name:n    type:consonant
 name:f    type:consonant
 name:o    type:vowel
 name:r    type:consonant
 name:m    type:consonant
 </screen>
 </SIDEBAR>
 <para>
        </para>
        <programlisting arch='c'>  for (s = phones.tail(); s != 0; s = prev(s))
    cout &lt;&lt; s-&gt;S("NAME") &lt;&lt; endl;        </programlisting>
        <para>
</para>
 <SIDEBAR>
 <TITLE>Output</TITLE>
 <screen>
 name:m    type:consonant
 name:r    type:consonant
 name:o    type:vowel
 name:f    type:consonant
 name:n    type:consonant
 name:i    type:vowel
 </screen>
 </SIDEBAR>
<para> 	
 <function>head()</function> and <function>tail()</function>
 return EST_Item pointers to the start and end of the list.
 <function>next()</function> and <function>prev()</function>
 return the next or previous item in the list, and return
 <literal>0</literal> when the end or start of the list is
 reached. Hence checking for <literal>0</literal> is a useful
 termination condition for the iteration. Because a null pointer
 tests as false in C and C++, the loop can be written more briefly:
        </para>
        <programlisting arch='c'>  for (s = phones.head(); s; s = next(s))
    cout &lt;&lt; s-&gt;S("NAME") &lt;&lt; endl;        </programlisting>
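        <para>
 The null-terminated iteration pattern can be tried out on a plain
 doubly linked list. This is only a sketch: <classname>Item</classname>,
 <function>join()</function>, <function>next_item()</function> and
 <function>prev_item()</function> are hypothetical names chosen for the
 example and are not part of the library.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <string>

// Hypothetical list item with explicit neighbour links.
struct Item {
    std::string name;
    Item *n;   // following item, or 0 at the tail
    Item *p;   // preceding item, or 0 at the head
    Item(const std::string &nm) : name(nm), n(0), p(0) {}
};

// Make b the item that follows a.
void join(Item &a, Item &b) { a.n = &b; b.p = &a; }

// Free functions in the style used above; both tolerate a null argument.
Item *next_item(Item *s) { return s ? s->n : 0; }
Item *prev_item(Item *s) { return s ? s->p : 0; }

// Head-to-tail walk; the loop stops when next_item() returns 0.
std::string forwards(Item *head) {
    std::string out;
    for (Item *s = head; s; s = next_item(s))
        out += s->name;
    return out;
}

// Tail-to-head walk; the loop stops when prev_item() returns 0.
std::string backwards(Item *tail) {
    std::string out;
    for (Item *s = tail; s; s = prev_item(s))
        out += s->name;
    return out;
}

// Usage: build f-o-r and read it in both directions.
std::string both_ways() {
    Item f("f"), o("o"), r("r");
    join(f, o);
    join(o, r);
    assert(next_item(&r) == 0 && prev_item(&f) == 0);
    return forwards(&f) + "/" + backwards(&r);
}
```
]]></programlisting>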
      </sect3>
      <sect3>
        <title>Building a tree relation</title>
        <para>
 
 <!--  *** UPDATE *** -->
 
 	It is standard to store information such as syntax as a tree
 	in an EST_Relation object. Each tree node is represented by one
 	EST_Item, whereas the complete tree is stored as an
 	EST_Relation.
 </para><para>	
 	The easiest way to build a tree is by using
 	<function>append_daughter()</function>, which, when called
 	with only the parent item, makes a new empty EST_Item, adds it as a
 	daughter to that existing item and returns a pointer to it. The
 	information relevant to that node can then be added to the
 	returned item. The root node of the tree must be added
 	directly to the EST_Relation.
        </para>
        <example id='prog01'>
          <title>Example prog01</title>
        <programlisting arch='c'>  EST_Relation tree;
  EST_Item *r, *np, *vp, *n;
  
  r = tree.append();
  r-&gt;set("CAT", "S");
  
  np = append_daughter(r);
  np-&gt;set("CAT", "NP");
  
  n =  append_daughter(np);
  n-&gt;set("CAT", "PRO");
  
  n =  append_daughter(n);
  n-&gt;set("NAME", "John");
  
  vp = append_daughter(r);
  vp-&gt;set("CAT", "VP");
  
  n = append_daughter(vp);
  n-&gt;set("CAT", "VERB");
  n = append_daughter(n);
  n-&gt;set("NAME", "loves");
  
  np = append_daughter(vp);
  np-&gt;set("CAT", "NP");
  
  n = append_daughter(np);
  n-&gt;set("CAT", "DET");
  n = append_daughter(n);
  n-&gt;set("NAME", "the");
  
  n = append_daughter(np);
  n-&gt;set("CAT", "NOUN");
  n = append_daughter(n);
  n-&gt;set("NAME", "woman");
  
  cout &lt;&lt; tree;        </programlisting>
        </example>
        <para>
</para>
 <SIDEBAR>
 <TITLE>Output</TITLE>
 <screen>
 (S 
   (NP 
      (N (John))
   )
   (VP 
      (V (loves)) 
      (NP 
         (DET the) 
         (NOUN woman))
   )
)
</screen>
 </SIDEBAR>
 <para>
 Building trees with recursive functions is more concise and
 would eliminate the need for the large number of
 temporary variables used in the above example.
        </para>
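        <para>
 A recursive builder might look like the following sketch. It is an
 illustration under assumptions: <classname>Node</classname>,
 <function>build()</function> and <function>show()</function> are
 hypothetical, the input is a bracketed string invented for the example,
 and category and word labels are both stored as a plain label rather
 than as CAT and NAME features.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical tree node that owns its daughters.
struct Node {
    std::string label;
    std::vector<std::unique_ptr<Node> > daughters;
    // Create an empty daughter and return a pointer to it.
    Node *append_daughter() {
        daughters.push_back(std::unique_ptr<Node>(new Node));
        return daughters.back().get();
    }
};

static void skip_spaces(const std::string &s, std::size_t &pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// Fill node from s starting at pos, which must be at '('. Grammar:
//   tree := '(' label tree* ')'
// Each nested tree is handled by a recursive call, so no named
// temporaries are needed for the intermediate nodes.
void build(Node &node, const std::string &s, std::size_t &pos) {
    assert(pos < s.size() && s[pos] == '(');
    ++pos;
    skip_spaces(s, pos);
    while (pos < s.size() && s[pos] != '(' && s[pos] != ')' &&
           !std::isspace(static_cast<unsigned char>(s[pos])))
        node.label += s[pos++];
    skip_spaces(s, pos);
    while (pos < s.size() && s[pos] == '(') {
        build(*node.append_daughter(), s, pos);
        skip_spaces(s, pos);
    }
    assert(pos < s.size() && s[pos] == ')');
    ++pos;
}

// Print the tree back in bracketed form, again recursively.
std::string show(const Node &n) {
    std::string out = "(" + n.label;
    for (std::size_t i = 0; i < n.daughters.size(); ++i)
        out += " " + show(*n.daughters[i]);
    return out + ")";
}
```
]]></programlisting>
        <para>
 Calling <function>build()</function> on a string such as
 <literal>(S (NP (PRO (John))) (VP (VERB (loves))))</literal> creates
 every node through the single recursive call in the inner loop.
        </para>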
      </sect3>
      <sect3>
        <title>Iterating through a tree relation</title>
        <para>
 
 Iteration in trees is done with <function>daughter1()</function>,
 <function>daughter2()</function>, <function>daughtern()</function> and
 <function>parent()</function>. Pre-order traversal can be achieved
 iteratively as follows:
        </para>
        <programlisting arch='c'>  n = tree.head();             // <lineannotation>initialise iteration variable to head of tree </lineannotation>
  while (n)
    {
      cout &lt;&lt; *n;           // <lineannotation>visit the current node before moving on, so the root is included </lineannotation>
      if (daughter1(n) != 0) // <lineannotation>if daughter exists, make n its daughter </lineannotation>
        n = daughter1(n);
      else if (next(n) != 0)// <lineannotation>otherwise visit its sisters </lineannotation>
        n = next(n);
      else                    // <lineannotation>if no sisters are left, go back up the tree </lineannotation>
	{                       // <lineannotation>until a sister to a parent is found </lineannotation>
	  bool found = false;
	  for (EST_Item *pp = parent(n); pp != 0; pp = parent(pp))
	    if (next(pp))
	      {
		n = next(pp);
		found = true;
		break;
	      }
	  if (!found)
	    n = 0;              // <lineannotation>nothing is left to visit, so the while loop ends </lineannotation>
	}
    }        </programlisting>
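        <para>
 The same down, across, then up-and-across logic can be checked on a
 toy tree. In this sketch <classname>Node</classname>,
 <function>add_daughter()</function>, <function>preorder_next()</function>
 and <function>preorder()</function> are hypothetical stand-ins with
 explicit parent, first-daughter and next-sister links; they are not
 the library's tree functions.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <string>

// Hypothetical tree node with the links used by the traversal.
struct Node {
    std::string label;
    Node *up, *first, *sib;   // parent, first daughter, next sister
    Node(const std::string &l) : label(l), up(0), first(0), sib(0) {}
};

// Attach child as the last daughter of parent.
void add_daughter(Node &parent, Node &child) {
    child.up = &parent;
    if (!parent.first) { parent.first = &child; return; }
    Node *d = parent.first;
    while (d->sib) d = d->sib;
    d->sib = &child;
}

// Successor of n in pre-order, or 0 when the traversal is finished.
Node *preorder_next(Node *n) {
    assert(n != 0);
    if (n->first) return n->first;      // go down if possible
    for (; n; n = n->up)                // otherwise climb until some
        if (n->sib) return n->sib;      // node on the way has a sister
    return 0;
}

// Labels in pre-order, separated by spaces; the root is visited first.
std::string preorder(Node *root) {
    std::string out;
    for (Node *n = root; n; n = preorder_next(n)) {
        if (!out.empty()) out += ' ';
        out += n->label;
    }
    return out;
}
```
]]></programlisting>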
        <para>
A special set of iterators is available for traversal of the leaf
 (terminal) nodes of a tree:
        </para>
        <example id='prog02'>
          <title>Leaf iteration</title>
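        <programlisting arch='c'>  // <lineannotation>stop one past the last leaf so that the last leaf is visited too</lineannotation>
  for (s = first_leaf(tree.head()); s != next_leaf(last_leaf(tree.head())); 
       s = next_leaf(s))
    cout &lt;&lt; s-&gt;S("NAME") &lt;&lt; endl;        </programlisting>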
        </example>
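        <para>
 How a leaf iterator can find its successor is sketched below on a toy
 tree. The names <classname>Node</classname>,
 <function>add_daughter()</function>, <function>first_leaf_of()</function>,
 <function>next_leaf_of()</function> and <function>leaves()</function>
 are hypothetical, and the sketch assumes the walk starts from the root
 of the whole tree, so the end is signalled by <literal>0</literal>.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <string>

// Hypothetical tree node with parent, first-daughter and next-sister links.
struct Node {
    std::string label;
    Node *up, *first, *sib;
    Node(const std::string &l) : label(l), up(0), first(0), sib(0) {}
};

// Attach child as the last daughter of parent.
void add_daughter(Node &parent, Node &child) {
    child.up = &parent;
    if (!parent.first) { parent.first = &child; return; }
    Node *d = parent.first;
    while (d->sib) d = d->sib;
    d->sib = &child;
}

// Leftmost leaf below n (n itself if it has no daughters).
Node *first_leaf_of(Node *n) {
    assert(n != 0);
    while (n->first) n = n->first;
    return n;
}

// Leaf after n in left-to-right order, or 0 after the last leaf.
Node *next_leaf_of(Node *n) {
    for (; n; n = n->up)
        if (n->sib) return first_leaf_of(n->sib);
    return 0;
}

// All leaf labels below root, left to right, including the last one.
std::string leaves(Node *root) {
    std::string out;
    for (Node *s = first_leaf_of(root); s; s = next_leaf_of(s)) {
        if (!out.empty()) out += ' ';
        out += s->label;
    }
    return out;
}
```
]]></programlisting>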
      </sect3>
      <sect3>
        <title>Building a multi-linear relation</title>
        <para>
        </para>
      </sect3>
      <sect3>
        <title>Iterating through a multi-linear relation</title>
        <para>
        </para>
      </sect3>
      <sect3>
        <title>Relations in Utterances</title>
        <para>
 
 The <classname>EST_Utterance</classname> class is used to store all
 the items and relations relevant to a single utterance. (Here
 utterance is used as a general linguistic entity - it doesn't have to
 relate to a well formed complete linguistic unit such as a sentence or
 phrase). 
 </para><para>
 Instead of storing relations separately, they are stored in
 utterances:
        </para>
        <programlisting arch='c'>  EST_Utterance utt;
  
  utt.create_relation("Word");
  utt.create_relation("Syntax");        </programlisting>
        <para>
EST_Relations can be accessed through the utterance object either
 directly or by use of a temporary EST_Relation pointer:
        </para>
        <programlisting arch='c'>  EST_Relation *word, *syntax;
  
  word = utt.relation("Word");
  syntax = utt.relation("Syntax");        </programlisting>
        <para>
The contents of the relation can be filled by the methods described
 above. 
        </para>
      </sect3>
      <sect3>
        <title>Adding items into multiple relations</title>
        <para>
 A major aspect of this system is that an item can be in two relations
 at once, as shown in <xref linkend="figure02">.
 </para><para>
 The following example, using the syntax relation already created
 in <xref linkend="prog01">,
 shows how to put the terminal nodes of this
 tree into a word relation:
        </para>
        <example id='prog03'>
          <title>Adding existing items to a new relation</title>
        <programlisting arch='c'>  word = utt.relation("Word");
  syntax = utt.relation("Syntax");
  
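  // <lineannotation>stop one past the last leaf so that the last word is appended too</lineannotation>
  for (s = first_leaf(syntax-&gt;head()); s != next_leaf(last_leaf(syntax-&gt;head())); 
       s = next_leaf(s))
    word-&gt;append(s);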
          </programlisting>
        </example>
        <para>

 Thus the terminal nodes in the syntax relation are now stored as a
 linear list in the word relation.
 
 Hence
        </para>
        <programlisting arch='c'>  cout &lt;&lt; *utt.relation("Syntax") &lt;&lt; "\n";        </programlisting>
        <para>
produces
</para>
 <sidebar>
 <title>Output</title>
 <screen>
(S 
   (NP 
      (N (John))
   )
   (VP 
      (V (loves)) 
      (NP 
         (DET the) 
         (NOUN woman))
   )
)
</screen>
</sidebar>
<para>
whereas
        </para>
        <programlisting arch='c'>  cout &lt;&lt; *utt.relation("Word") &lt;&lt; "\n";        </programlisting>
        <para>
produces
</para>
 <sidebar>
 <title>Output</title>
 <screen>
John
loves
the
woman
</screen>
 </sidebar>
 <para>
        </para>
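        <para>
 One way to think about an item being in two relations at once is that
 each relation holds its own view of shared contents. The sketch below
 models that reading; it is an assumption made for illustration, and
 <classname>Content</classname>, <classname>View</classname>,
 <classname>Relation</classname> and
 <function>shared_after_append()</function> are hypothetical names
 rather than library classes.
        </para>
        <programlisting arch='c'><![CDATA[
```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical model: the features live in a Content object...
struct Content { std::string name; };

// ...and each relation holds a lightweight view pointing at the content.
struct View { std::shared_ptr<Content> content; };

struct Relation {
    std::vector<View> items;
    // Append a brand-new item with fresh content.
    View &append() {
        View v;
        v.content = std::make_shared<Content>();
        items.push_back(v);
        return items.back();
    }
    // Append an existing item: a new view onto the same content.
    View &append(const View &existing) {
        assert(existing.content != nullptr);
        View v;
        v.content = existing.content;
        items.push_back(v);
        return items.back();
    }
};

// Usage: a change made through one relation is visible through the other.
bool shared_after_append() {
    Relation syntax, word;
    syntax.append().content->name = "John";
    word.append(syntax.items[0]);
    word.items[0].content->name = "Mary";
    return syntax.items[0].content->name == "Mary";
}
```
]]></programlisting>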
      </sect3>
      <sect3>
        <title>Changing the relation an item is in</title>
        <para>
as_relation, in relation etc
        </para>
      </sect3>
      <sect3>
        <title>Feature functions</title>
        <para>
evaluate functions
setting functions
        </para>
      </sect3>
    </sect2>
  </sect1>