<sect1 id='scfg-make-manual'> <title><command>scfg_train</command> <emphasis>Train the parameters of a stochastic context-free grammar</emphasis></title> <toc depth='1'></toc> <para> </para> <sect2> <title>Synopsis</title> <para> </para> <!-- /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/scfg_train -sgml_synopsis --> <para> <cmdsynopsis><command>scfg_train</command> [options] <arg>-grammar <replaceable>ifile</replaceable></arg> <arg>-corpus <replaceable>ifile</replaceable></arg> <arg>-method <replaceable>string</replaceable> {inout}</arg> <arg>-passes <replaceable>int</replaceable> {50}</arg> <arg>-startpass <replaceable>int</replaceable> {0}</arg> <arg>-spread <replaceable>int</replaceable></arg> <arg>-checkpoint <replaceable>int</replaceable></arg> <arg>-heap <replaceable>int</replaceable> {210000}</arg> <arg>-o <replaceable>ofile</replaceable></arg> </cmdsynopsis> </para> <!-- DONE /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/scfg_train -sgml_synopsis --> <para> scfg_train takes a stochastic context-free grammar (SCFG) and trains its rule probabilities with respect to a given bracketed corpus using the inside-outside algorithm. It is essentially an implementation of Pereira and Schabes (1992). Note that using this program properly may require months of CPU time. </para> </sect2> <sect2> <title>OPTIONS</title> <para> </para> <!-- /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/scfg_train -sgml_options --> <para> <variablelist> <varlistentry><term>-grammar</term> <LISTITEM><PARA> <replaceable>ifile</replaceable> Grammar file, one rule per line. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-corpus</term> <LISTITEM><PARA> <replaceable>ifile</replaceable> Corpus file, one bracketed sentence per line. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-method</term> <LISTITEM><PARA> <replaceable>string</replaceable> Method for training: inout. Default: inout. 
</PARA></LISTITEM> </varlistentry> <varlistentry><term>-passes</term> <LISTITEM><PARA> <replaceable>int</replaceable> Number of training passes. Default: 50. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-startpass</term> <LISTITEM><PARA> <replaceable>int</replaceable> Start at pass N. Default: 0. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-spread</term> <LISTITEM><PARA> <replaceable>int</replaceable> Spread training data over N passes. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-checkpoint</term> <LISTITEM><PARA> <replaceable>int</replaceable> Save the grammar every N passes. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-heap</term> <LISTITEM><PARA> <replaceable>int</replaceable> Set the size of the Lisp heap, needed for large corpora. Default: 210000. </PARA></LISTITEM> </varlistentry> <varlistentry><term>-o</term> <LISTITEM><PARA> <replaceable>ofile</replaceable> Output file for trained grammar. </PARA></LISTITEM> </varlistentry> </variablelist> </para> <!-- DONE /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/scfg_train -sgml_options --> </sect2> </sect1>