  <sect1 id='ch-track-manual'>
	<title><command>ch_track</command> <emphasis>Track file manipulation</emphasis></title>

    <toc depth='1'></toc>
    <para>
    </para>
    <sect2>
      <title>Synopsis</title>
      <para>
      </para>
        <!-- /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/ch_track -sgml_synopsis -->
        <para>
<cmdsynopsis><command>ch_track</command>[input file] -o [output file] [options]<arg>-h </arg>
<arg>-itype <replaceable>string</replaceable></arg>
<arg>-ctype <replaceable>string</replaceable></arg>
<arg>-s <replaceable>float</replaceable></arg>
<arg>-c <replaceable>string</replaceable></arg>
<arg>-start <replaceable>float</replaceable></arg>
<arg>-end <replaceable>float</replaceable></arg>
<arg>-from <replaceable>int</replaceable></arg>
<arg>-to <replaceable>int</replaceable></arg>
<arg>-otype <replaceable>string</replaceable> " {ascii}"</arg>
<arg>-S <replaceable>float</replaceable></arg>
<arg>-o <replaceable>ofile</replaceable></arg>
<arg>-info </arg>
<arg>-track_names <replaceable>string</replaceable></arg>
<arg>-diff </arg>
<arg>-delta <replaceable>int</replaceable></arg>
<arg>-sm <replaceable>float</replaceable></arg>
<arg>-smtype <replaceable>string</replaceable></arg>
<arg>-style <replaceable>string</replaceable></arg>
<arg>-t <replaceable>float</replaceable></arg>
<arg>-neg <replaceable>string</replaceable></arg>
<arg>-pos <replaceable>string</replaceable></arg>
<arg>-pc <replaceable>string</replaceable></arg>
</cmdsynopsis>
        </para>
        <!-- DONE /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/ch_track -sgml_synopsis -->
      <para>

ch_track is used to manipulate the format of a track
file. Operations include:
<itemizedlist>
<listitem><para>file format conversion</para></listitem>
<listitem><para>smoothing</para></listitem>
<listitem><para>changing the frame spacing of a track (resampling)</para></listitem>
<listitem><para>producing differentiated and delta tracks</para></listitem>
<listitem><para>using a threshold to convert a track file to a label file</para></listitem>
<listitem><para>making multiple input files into a single multi-channel output file</para></listitem>
<listitem><para>extracting a single channel from a multi-channel track</para></listitem>
<listitem><para>extracting a time-delimited portion of the track</para></listitem>
</itemizedlist>
      </para>
    </sect2>
    <sect2>
      <title>Options</title>
      <para>
      </para>
        <!-- /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/ch_track -sgml_options -->
        <para>
<variablelist>
<varlistentry><term>-h</term>
<LISTITEM><PARA>

Options help 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-itype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Input file type (optional). If no type is 
specified, the type is automatically derived from 
the file's header. Supported types 
are: none, esps, est, est_binary, htk, htk_fbank, htk_mfcc, htk_user, htk_discrete, ssff, xmg, xgraph, ema, ema_swapped, ascii 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-ctype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Contour type: F0, track 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-s</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Frame spacing of input in seconds, for unheadered input file 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-c</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Select a subset of channels (starts from 0). 
Tracks can have multiple channels. This option 
specifies a list of numbers, referring to the channel 
numbers which are to be used for processing. 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-start</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Extract track starting at this time, 
specified in seconds 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-end</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Extract track ending at this time, 
specified in seconds 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-from</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Extract track starting at this frame position 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-to</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Extract track ending at this frame position 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-otype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>
 " {ascii}"
Output file type, if unspecified ascii is 
assumed, types are: none, esps, est, est_binary, htk, htk_fbank, htk_mfcc, htk_user, htk_discrete, ssff, xmg, xgraph, ema, ema_swapped, ascii, label 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-S</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Frame spacing of output in seconds. If this is 
different from the internal spacing, the contour is 
resampled at this spacing 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-o</term>
<LISTITEM><PARA>
<replaceable>ofile</replaceable>

Output filename, defaults to stdout 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-info</term>
<LISTITEM><PARA>

Print information about file and header. 
This option gives useful information such as file 
length, file type, channel names. No output is produced 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-track_names</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

File containing new names for output channels 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-diff</term>
<LISTITEM><PARA>

Differentiate contour. This performs simple 
numerical differentiation on the contour by 
subtracting the amplitude of the current frame 
from the amplitude of the next. Although quick, 
this technique is crude and not recommended, as the 
derivative is estimated from only one point 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-delta</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Make delta coefficients (a better form of differentiation). 
The argument to this option is the regression length 
of the delta calculation and can be between 2 and 4 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-sm</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Length of smoothing window in seconds. Various types of 
smoothing are available for tracks. This option specifies the 
length of the smoothing window, which affects the degree of 
smoothing, i.e. a longer window means more smoothing 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-smtype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Smooth type, median or mean 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-style</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Convert track to other form. Currently only one form 
"label" is supported. This uses a specified cut off to 
make a label file, with two labels, one for above the 
cut off (-pos) and one for below (-neg) 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-t</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Threshold for track to label conversion 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-neg</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Name of negative label in track to label conversion 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-pos</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Name of positive label in track to label conversion 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-pc</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

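Combine the given tracks in parallel. If the argument 
is longest, shorter tracks are padded to the length of the longest; 
if it is first, tracks are padded or cut to match the first input track 
</PARA></LISTITEM>
</varlistentry>
</variablelist>
Available track file formats: 
<itemizedlist>
<listitem><para>none: unknown track file type</para></listitem>
<listitem><para>esps: entropic sps file</para></listitem>
<listitem><para>est: Edinburgh Speech Tools track file</para></listitem>
<listitem><para>est_binary: Edinburgh Speech Tools track file</para></listitem>
<listitem><para>htk: htk file</para></listitem>
<listitem><para>htk_fbank: htk file (as FBANK)</para></listitem>
<listitem><para>htk_mfcc: htk file (as MFCC)</para></listitem>
<listitem><para>htk_user: htk file (as USER)</para></listitem>
<listitem><para>htk_discrete: htk file (as DISCRETE)</para></listitem>
<listitem><para>ssff: Macquarie University's Simple Signal File Format</para></listitem>
<listitem><para>xmg: xmg file viewer</para></listitem>
<listitem><para>xgraph: xgraph display program format</para></listitem>
<listitem><para>ema: ema</para></listitem>
<listitem><para>ema_swapped: ema, swapped</para></listitem>
<listitem><para>ascii: ascii decimal numbers</para></listitem>
</itemizedlist>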
        </para>
        <!-- DONE /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/ch_track -sgml_options -->
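      <para>
The track to label conversion selected by -style label, -t, -pos and -neg
can be pictured with the following Python sketch. It is only an illustration
of the idea described above, not the code ch_track runs: the function name
<literal>track_to_labels</literal>, the placement of frame times, the
handling of values equal to the threshold and the choice of segment end
times are all assumptions.
      </para>
<programlisting><![CDATA[
```python
from typing import List, Tuple


def track_to_labels(values: List[float], spacing: float, threshold: float,
                    pos: str = "pos", neg: str = "neg") -> List[Tuple[float, str]]:
    """Collapse a single-channel track into (end_time, label) segments.

    Assumptions (not taken from ch_track itself): frame i lies at time
    (i + 1) * spacing, a value equal to the threshold counts as "below",
    and each segment is stamped with the time of its last frame.
    """
    segments: List[Tuple[float, str]] = []
    for i, value in enumerate(values):
        label = pos if value > threshold else neg
        end_time = round((i + 1) * spacing, 6)
        if segments and segments[-1][1] == label:
            # Same label as the previous frame: extend the current segment.
            segments[-1] = (end_time, label)
        else:
            # Label changed (or first frame): start a new segment.
            segments.append((end_time, label))
    return segments
```
]]></programlisting>
      <para>
Applied to an F0-like contour with a 0.01 second frame spacing and a threshold
of 50, the sketch yields alternating runs of the two label names, one run per
stretch of frames above or below the cut off.
      </para>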
    </sect2>
    <sect2>
      <title>Making multiple tracks into a single track</title>
      <para>
If multiple input files are specified, by default they are concatenated into 
the output file.
</para><para>
<screen>
$ ch_track kdt_010.tr kdt_011.tr kdt_012.tr kdt_013.tr -o out.tr
</screen>
</para>
<para>
In the above example, four input files are concatenated into
a single output file. Multi-channel tracks can be
concatenated provided they all have the same number of channels.
</para><para>
Multiple input files can be made into a multi-channel output file by 
using the -pc option:
</para><para>
<screen>
$ ch_track kdt_010.tr kdt_011.tr kdt_012.tr kdt_013.tr -pc longest -o out.tr
</screen>
</para>
<para>
The argument to -pc can be either longest, in which case the output
track is the length of the longest input file, or first, in which case it
is the length of the first input file.
      </para>
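      <para>
The difference between the two -pc modes can be sketched in Python as
follows. This is a hedged illustration only: the name
<literal>combine_parallel</literal> is hypothetical, tracks are modelled as
lists of frames, and padding with 0.0 is an assumption, since the value
ch_track uses for padding is not stated here.
      </para>
<programlisting><![CDATA[
```python
from typing import List

Track = List[List[float]]  # one inner list of channel values per frame


def combine_parallel(tracks: List[Track], mode: str = "longest",
                     pad: float = 0.0) -> Track:
    """Place the channels of several tracks side by side.

    mode "longest": output has as many frames as the longest input,
                    shorter inputs are padded (pad value is an assumption).
    mode "first":   output has as many frames as the first input,
                    other inputs are padded or cut to fit.
    """
    if not tracks:
        return []
    if mode == "longest":
        length = max(len(t) for t in tracks)
    elif mode == "first":
        length = len(tracks[0])
    else:
        raise ValueError("mode must be 'longest' or 'first'")

    combined: Track = []
    for i in range(length):
        frame: List[float] = []
        for t in tracks:
            width = len(t[0]) if t else 0
            # Use the real frame where it exists, otherwise pad.
            frame.extend(t[i] if i < len(t) else [pad] * width)
        combined.append(frame)
    return combined
```
]]></programlisting>
      <para>
With a three-frame and a two-frame single-channel input, longest gives three
two-channel frames (the last one padded), while first with the shorter track
given first gives two frames and drops the extra frame of the longer track.
      </para>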
    </sect2>
    <sect2>
      <title>Extracting channels from multi-channel tracks</title>
      <para>
The -c option is used to specify channels which should be extracted
from the input.  If the input is a 4 channel track,
</para><para>
<screen>
$ ch_track kdt_m.tr -o a.tr -c "0 2"
</screen>
</para>
<para>
will extract the 0th and 2nd channels (counting starts from 0). The
argument to -c can be either a single number or a list of numbers
(wrapped in quotes).
      </para>
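      <para>
As a rough picture of what channel selection does to the data, the Python
sketch below parses a quoted channel list such as "0 2" and keeps only those
columns of every frame. The function name
<literal>select_channels</literal> and the list-of-frames representation are
assumptions made for illustration.
      </para>
<programlisting><![CDATA[
```python
from typing import List


def select_channels(track: List[List[float]], spec: str) -> List[List[float]]:
    """Keep only the channels named in a whitespace-separated list.

    Channel numbers start from 0, as with the -c option.
    """
    channels = [int(token) for token in spec.split()]
    width = len(track[0]) if track else 0
    for c in channels:
        if c < 0 or c >= width:
            raise IndexError("channel %d not present in a %d channel track" % (c, width))
    # Build each output frame from the requested columns, in the order given.
    return [[frame[c] for c in channels] for frame in track]
```
]]></programlisting>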
    </sect2>
    <sect2>
      <title>Extracting a single region from a track</title>
      <para>
There are several ways of extracting a region of a track. The
simplest way is to use the -start, -end, -from and -to options to
delimit a sub-portion of the input track. For example,
</para><para>
<screen>
$ ch_track kdt_010.tr -o small.tr -start 1.45 -end 1.768
</screen>
</para>
<para>
extracts a subtrack starting at 1.45 seconds and extending to 1.768 seconds.
Alternatively,
</para><para>
<screen>
$ ch_track kdt_010.tr -o small.tr -from 50 -to 100
</screen>
</para>
<para>
extracts a subtrack starting at frame 50 and extending to frame
100. Times and frames can be mixed in sub-track extraction. The
output track will have the same number of channels as the input track.
      </para>
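      <para>
For a track with a fixed frame spacing, a time range maps onto a frame range
roughly as in the Python sketch below. The mapping is an assumption made for
illustration (frame i is taken to lie at time i times the spacing, and both
ends are inclusive); the exact rounding ch_track applies may differ. The name
<literal>frame_range</literal> is hypothetical.
      </para>
<programlisting><![CDATA[
```python
import math
from typing import Tuple


def frame_range(start: float, end: float, spacing: float,
                n_frames: int) -> Tuple[int, int]:
    """Return (first, last) frame indices lying inside [start, end] seconds.

    Assumes frame i sits at time i * spacing. A small epsilon guards
    against floating-point error when a time falls exactly on a frame.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    eps = 1e-9
    first = max(0, math.ceil(start / spacing - eps))
    last = min(n_frames - 1, math.floor(end / spacing + eps))
    return first, last
```
]]></programlisting>
      <para>
Under these assumptions, a 0.01 second spacing turns the range 1.45 to 1.768
seconds into frames 145 to 176.
      </para>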
    </sect2>
    <sect2>
      <title>Adding headers and format conversion</title>
      <para>
It is usually a good idea for all track files to have headers as this
way different files can be handled safely. ch_track provides a means
of adding headers to unheadered files. These files are assumed to
be ascii floats with one channel per line.
The following adds a header to an ascii file.
</para>
<para>
<screen>
$ ch_track kdt_010.atr -o kdt_010.h5.tr -otype est -s 0.01
</screen>
</para>
<para>
ch_track can change the frame shift of a fixed frame file, or convert
a variable frame shift file into a fixed frame shift.  At present this
is done with a very crude resampling technique and hence the output
file may suffer from aliasing distortion.</para><para>
To change to a frame spacing of 0.02 seconds:
</para><para>
<screen>
$ ch_track kdt_010.tr -o kdt_010.tr2 -S 0.02
</screen>
      </para>
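      <para>
To see why crude resampling can introduce aliasing, consider the Python
sketch below, which picks the nearest input frame for each output frame
without any low-pass filtering. Nearest-frame picking is an assumption chosen
for illustration and is not claimed to be the method ch_track uses; the name
<literal>resample_nearest</literal> is hypothetical.
      </para>
<programlisting><![CDATA[
```python
from typing import List


def resample_nearest(values: List[float], in_spacing: float,
                     out_spacing: float) -> List[float]:
    """Resample a single-channel, fixed-spacing track by nearest frame.

    Assumes frame i sits at time i * spacing. No filtering is applied,
    so detail faster than the new frame rate is not removed first.
    """
    if in_spacing <= 0 or out_spacing <= 0:
        raise ValueError("frame spacings must be positive")
    duration = len(values) * in_spacing
    n_out = int(duration / out_spacing + 1e-9)
    out: List[float] = []
    for j in range(n_out):
        # Index of the input frame closest in time to output frame j.
        i = min(len(values) - 1, int(round(j * out_spacing / in_spacing)))
        out.append(values[i])
    return out
```
]]></programlisting>
      <para>
Going from a 0.01 to a 0.02 second spacing this way simply keeps every other
frame, so any variation between the kept frames is lost or folded into the
output.
      </para>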
    </sect2>
  </sect1>