<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML> <HEAD> <TITLE>class EST_SCFG_traintest</TITLE> <META NAME="GENERATOR" CONTENT="DOC++ 3.4.6"> </HEAD> <body bgcolor="#ffffff" link="#0000ff" vlink="#dd0000" text="#000088" alink="9000ff"> <A HREF = "http://www.cstr.ed.ac.uk/"> <IMG align=left BORDER=0 SRC = "cstr.gif"></A> <A HREF="http://www.cstr.ed.ac.uk/projects/speech_tools.html"> <IMG BORDER=0 ALIGN=right SRC="est.jpg" width=150 height=93></A> <br> <br clear=left> <p align=right> In file ../include/EST_SCFG.h:<TABLE BORDER=0><TR> <TD VALIGN=TOP><H2>class <A HREF="#DOC.DOCU">EST_SCFG_traintest</A></H2></TD></H2></TD></TR></TABLE> <BLOCKQUOTE>A class for training (and testing) SCFGs, implemented as an extension of <!1><A HREF="EST_SCFG.html">EST_SCFG</A>.</BLOCKQUOTE> <HR> <H2>Inheritance:</H2> <APPLET CODE="ClassGraph.class" WIDTH=600 HEIGHT=65> <param name=classes value="CEST_SCFG,MEST_SCFG.html,CEST_SCFG_traintest,MEST_SCFG_traintest.html"> <param name=before value="M,M"> <param name=after value="Md_,M"> <param name=indent value="0,1"> <param name=arrowdir value="down"> </APPLET> <HR> <DL> <P><TABLE> <DT><H3>Public Methods</H3><DD><TR> <TD VALIGN=TOP><A HREF="#DOC.106.17"><IMG ALT="[more]" BORDER=0 SRC=icon1.gif></A>void </TD><TD><B>test_corpus</B> ()<BR> <I>Test the current grammar against the current corpus and print a summary.</I> </TD></TR><TR> <TD VALIGN=TOP><A HREF="#DOC.106.18"><IMG ALT="[more]" BORDER=0 SRC=icon1.gif></A>void </TD><TD><B>test_crossbrackets</B> ()<BR> <I>Test the current grammar against the current corpus.</I> </TD></TR><TR> <TD VALIGN=TOP><A HREF="#DOC.106.19"><IMG ALT="[more]" BORDER=0 SRC=icon1.gif></A>void </TD><TD><B>load_corpus</B> (const <!1><A HREF="EST_String.html">EST_String</A> &<!1><A HREF="EST_TokenStream.html#DOC.10.7.9">filename</A>)<BR> <I>Load a corpus from the given file.</I> </TD></TR><TR> <TD VALIGN=TOP><A HREF="#DOC.106.20"><IMG ALT="[more]" BORDER=0 SRC=icon1.gif></A>void </TD><TD><B>train_inout</B> (int passes, int 
startpass, int checkpoint, int spread, const <!1><A HREF="EST_String.html">EST_String</A> &outfile)<BR> <I>Train a grammar using the loaded corpus.</I> </TD></TR></TABLE></P> </DL> <HR><H3>Inherited from <A HREF="EST_SCFG.html">EST_SCFG</A>:</H3> <DL> <P><DL> <DT><H3>Public Methods</H3><DD><DT> <P> <B>Constructor and initialisation functions </B> <P><DL> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif> <B><A HREF="#DOC.105.1.1">EST_SCFG</A></B>(LISP <!1><A HREF="EST_SCFG.html#DOC.105.2.3">rules</A>) <DD><I>Initialize from a set of rules</I> </DL></P> <DT> <P> <B>utility functions </B> <P><DL> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>void <B><A HREF="#DOC.105.2.1">set_rules</A></B>(LISP <!1><A HREF="EST_SCFG.html#DOC.105.2.3">rules</A>) <DD><I>Set (or reset) rules from external source after construction</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>LISP <B><A HREF="#DOC.105.2.2">get_rules</A></B>() <DD><I>Return rules as LISP list</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>SCFGRuleList <B><A HREF="#DOC.105.2.3">rules</A></B> <DD><I>The rules themselves</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>void <B><A HREF="#DOC.105.2.4">find_terms_nonterms</A></B>(EST_StrList &nt, EST_StrList &<!1><A HREF="EST_Wave.html#DOC.81.4.5">t</A>, LISP <!1><A HREF="EST_SCFG.html#DOC.105.2.3">rules</A>) <DD><I>Find the terminals and nonterminals in the given grammar, adding them to the appropriate given string lists</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif><!1><A HREF="EST_String.html">EST_String</A> <B><A HREF="#DOC.105.2.5">nonterminal</A></B>(int <!1><A HREF="XML_Parser.html#DOC.190.3.9">p</A>) const <DD><I>Convert nonterminal index to string form</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif><!1><A HREF="EST_String.html">EST_String</A> <B><A HREF="#DOC.105.2.6">terminal</A></B>(int m) const <DD><I>Convert terminal index to string form</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>int <B><A HREF="#DOC.105.2.7">nonterminal</A></B>(const <!1><A 
HREF="EST_String.html">EST_String</A> &<!1><A HREF="XML_Parser.html#DOC.190.3.9">p</A>) const <DD><I>Convert nonterminal string to index</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>int <B><A HREF="#DOC.105.2.8">terminal</A></B>(const <!1><A HREF="EST_String.html">EST_String</A> &m) const <DD><I>Convert terminal string to index</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>int <B><A HREF="#DOC.105.2.9">num_nonterminals</A></B>() const <DD><I>Number of nonterminals</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>int <B><A HREF="#DOC.105.2.10">num_terminals</A></B>() const <DD><I>Number of terminals</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>double <B><A HREF="#DOC.105.2.11">prob_B</A></B>(int <!1><A HREF="XML_Parser.html#DOC.190.3.9">p</A>, int q, int r) const <DD><I>The rule probability of given binary rule</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>double <B><A HREF="#DOC.105.2.12">prob_U</A></B>(int <!1><A HREF="XML_Parser.html#DOC.190.3.9">p</A>, int m) const <DD><I>The rule probability of given unary rule</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif>void <B><A HREF="#DOC.105.2.13">set_rule_prob_cache</A></B>() <DD><I>(re-)set rule probability caches</I> </DL></P> <DT> <P> <B>file i/o functions </B> <P><DL> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif><!1><A HREF="EST_read_status.html">EST_read_status</A> <B><A HREF="#DOC.105.3.1">load</A></B>(const <!1><A HREF="EST_String.html">EST_String</A> &<!1><A HREF="EST_TokenStream.html#DOC.10.7.9">filename</A>) <DD><I>Load grammar from named file</I> <DT> <IMG ALT="[more]" BORDER=0 SRC=icon1.gif><!1><A HREF="EST_write_status.html">EST_write_status</A> <B><A HREF="#DOC.105.3.2">save</A></B>(const <!1><A HREF="EST_String.html">EST_String</A> &<!1><A HREF="EST_TokenStream.html#DOC.10.7.9">filename</A>) <DD><I>Save current grammar to named file</I> </DL></P> </DL></P> </DL> <A NAME="DOC.DOCU"></A> <HR> <H2>Documentation</H2> <BLOCKQUOTE>A class for training (and testing) SCFGs, implemented as an extension of 
<!1><A HREF="EST_SCFG.html">EST_SCFG</A>. <P>This offers an implementation of Pereira and Schabes, &quot;Inside-Outside reestimation from partially bracketed corpora,&quot; ACL 1992. <P>An SCFG may be trained from a corpus, optionally containing brackets, over a series of passes, with the grammar probabilities reestimated after each pass. This class extends the <!1><A HREF="EST_SCFG.html">EST_SCFG</A> class, adding support for a bracketed corpus and various indexes for efficient use of the grammar.</BLOCKQUOTE> <DL> <A NAME="test_corpus"></A> <A NAME="DOC.106.17"></A> <DT><IMG ALT="o" BORDER=0 SRC=icon2.gif><TT><B>void test_corpus()</B></TT> <DD>Test the current grammar against the current corpus and print a summary. <P>Only the cross entropy measure is given. <DL><DT><DD></DL><P> <A NAME="test_crossbrackets"></A> <A NAME="DOC.106.18"></A> <DT><IMG ALT="o" BORDER=0 SRC=icon2.gif><TT><B>void test_crossbrackets()</B></TT> <DD>Test the current grammar against the current corpus. <P>The summary includes the percentage cross-bracketing accuracy and the percentage of fully correct parses. <DL><DT><DD></DL><P> <A NAME="load_corpus"></A> <A NAME="DOC.106.19"></A> <DT><IMG ALT="o" BORDER=0 SRC=icon2.gif><TT><B>void load_corpus(const <!1><A HREF="EST_String.html">EST_String</A> &<!1><A HREF="EST_TokenStream.html#DOC.10.7.9">filename</A>)</B></TT> <DD>Load a corpus from the given file. <P>Each sentence in the corpus should be enclosed in parentheses. Additional parentheses may be used to denote phrasing within a sentence. The corpus is read using the LISP reader, so LISP conventions apply; notably, single quotes should appear within double quotes. <DL><DT><DD></DL><P> <A NAME="train_inout"></A> <A NAME="DOC.106.20"></A> <DT><IMG ALT="o" BORDER=0 SRC=icon2.gif><TT><B>void train_inout(int passes, int startpass, int checkpoint, int spread, const <!1><A HREF="EST_String.html">EST_String</A> &outfile)</B></TT> <DD>Train a grammar using the loaded corpus. 
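<P>The calls documented on this page can be combined as in the following minimal sketch. It is an illustration only, not code taken from the library: the file names and numeric arguments are hypothetical, and it assumes that EST_SCFG_traintest can be default-constructed, which this page does not state.
<PRE>
#include "EST_SCFG.h"

int main()
{
    // Assumption: a default constructor exists (none is listed on this page).
    EST_SCFG_traintest grammar;

    // Hypothetical file names, for illustration only.
    grammar.load("initial.gram");       // inherited from EST_SCFG; status ignored for brevity
    grammar.load_corpus("train.data");  // bracketed corpus, read with the LISP reader

    // passes=20, startpass=0, checkpoint=5, spread=100, outfile="trained.gram"
    grammar.train_inout(20, 0, 5, 100, "trained.gram");

    grammar.test_corpus();              // prints the cross entropy summary
    grammar.test_crossbrackets();       // prints the bracketing accuracy summary
    return 0;
}
</PRE>
<P>Here spread is set to 100 so that, going by the parameter description, the whole corpus is used on each pass.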
<P> <DL><DT><DT><B>Parameters:</B><DD><B>passes</B> - the number of training passes desired. <BR><B>startpass</B> - the pass from which to <!1><A HREF="EST_Track.html#DOC.71.7.9">start</A> <BR><B>checkpoint</B> - <!1><A HREF="EST_SCFG.html#DOC.105.3.2">save</A> the <!1><A HREF="EST_SCFG_Chart.html#DOC.110.1">grammar</A> every <!1><A HREF="EST_SCFG_traintest.html#DOC.106.3">n</A> passes <BR><B>spread</B> - the percentage of the corpus to use on each pass; the selection cycles through the corpus from one pass to the next.<BR><DD></DL><P></DL> <HR><DL><DT><B>This class has no child classes.</B></DL> <DL><DT><DD></DL><P><P><I><A HREF="index.html">Alphabetic index</A></I> <I><A HREF="HIER.html">HTML hierarchy of classes</A> or <A HREF="HIERjava.html">Java</A></I></P><HR> <A HREF = "http://www.ed.ac.uk/"> <IMG align=right BORDER=0 SRC = "edcrest.gif"></A> <P Align=left><I>This page is part of the <A HREF="http://www.cstr.ed.ac.uk/projects/speech_tools.html"> Edinburgh Speech Tools Library</A> documentation <br> Copyright <A HREF="http://www.ed.ac.uk"> University of Edinburgh</A> 1997 <br> Contact: <A HREF="mailto:speech_tools@cstr.ed.ac.uk"> speech_tools@cstr.ed.ac.uk </a> </P> <br clear=right>