Sophie: festival-speechtools-devel-1.2.96-18.fc14 i686

  <sect1 id='spectgen-manual'>
	<title><command>spectgen</command> <emphasis>Make spectrograms</emphasis></title>

    <toc depth='1'></toc>
    <para>
    </para>
    <sect2>
      <title>Synopsis</title>
      <para>
      </para>
        <!-- /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/spectgen -sgml_synopsis -->
        <para>
<cmdsynopsis><command>spectgen</command>[input file] -o [output file]<arg>-h </arg>
<arg>-itype <replaceable>string</replaceable></arg>
<arg>-n <replaceable>int</replaceable></arg>
<arg>-f <replaceable>int</replaceable></arg>
<arg>-ibo <replaceable>string</replaceable></arg>
<arg>-iswap </arg>
<arg>-istype <replaceable>string</replaceable></arg>
<arg>-c <replaceable>string</replaceable></arg>
<arg>-start <replaceable>float</replaceable></arg>
<arg>-end <replaceable>float</replaceable></arg>
<arg>-from <replaceable>int</replaceable></arg>
<arg>-to <replaceable>int</replaceable></arg>
<arg>-otype <replaceable>string</replaceable> " {ascii}"</arg>
<arg>-S <replaceable>float</replaceable></arg>
<arg>-o <replaceable>ofile</replaceable></arg>
<arg>-shift <replaceable>float</replaceable></arg>
<arg>-length <replaceable>float</replaceable></arg>
<arg>-sr <replaceable>float</replaceable></arg>
<arg>-slow </arg>
<arg>-w <replaceable>float</replaceable></arg>
<arg>-b <replaceable>float</replaceable></arg>
<arg>-raw </arg>
<arg>-order <replaceable>int</replaceable></arg>
</cmdsynopsis>
        </para>
        <!-- DONE /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/spectgen -sgml_synopsis -->
      <para>

spectgen creates spectrograms, which are 3D plots of amplitude
against time and frequency. It takes a waveform and produces a
track in which each channel represents one frequency bin.
By default spectgen produces a "wide-band" spectrogram, that is, one
with high time resolution and low frequency resolution. "Narrow-band"
spectrograms can be produced by using the -shift and -length options.
Typical values for -shift and -length are:
      </para>
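      <para>
The trade-off between wide-band and narrow-band analysis can be
illustrated with a little arithmetic. The following Python sketch is
not part of spectgen. It assumes that the transform size equals the
frame length (no zero padding). The function name, the sample rate,
the frame lengths and the shifts are illustrative choices made for
this example, not recommended settings.
      </para>
<programlisting><![CDATA[
def analysis_grid(sample_rate, frame_length_s, shift_s, duration_s):
    """Return (samples per frame, frequency bin spacing in Hz, number of frames).

    Assumption: the transform size equals the frame length (no zero padding).
    """
    frame_samples = round(frame_length_s * sample_rate)
    bin_spacing_hz = sample_rate / frame_samples
    n_frames = round(duration_s / shift_s)
    return frame_samples, bin_spacing_hz, n_frames


# Illustrative settings for one second of signal sampled at 16 kHz.
short_frames = analysis_grid(16000, 0.005, 0.001, 1.0)  # short frame, small shift
long_frames = analysis_grid(16000, 0.040, 0.010, 1.0)   # long frame, large shift

print("short frame:", short_frames)
print("long frame: ", long_frames)
]]></programlisting>
      <para>
Under these assumptions the short frame gives frequency bins 200 Hz
apart with 1000 frames per second of signal, while the long frame
gives bins 25 Hz apart with only 100 frames per second: finer
frequency detail is bought with coarser time detail.
      </para>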
    </sect2>
    <sect2>
      <title>Options</title>
      <para>
      </para>
        <!-- /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/spectgen -sgml_options -->
        <para>
<variablelist>
<varlistentry><term>-h</term>
<LISTITEM><PARA>

Options help 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-itype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Input file type (optional). Setting this to raw 
indicates that the input file has no header. Other 
file types can also be specified, but this is rarely 
needed because the type of every supported headered 
format is determined automatically from the 
file's header. Samples in an unheadered input file 
are assumed to be shorts (16 bit). 
Supported types are: 
nist, est, esps, snd, riff, aiff, audlab, raw, ascii 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-n</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Number of channels in an unheadered input file 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-f</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Sample rate in Hertz for an unheadered input file 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-ibo</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Input byte order in an unheadered input file. 
Possibilities are: MSB, LSB, native or nonnative. 
Suns, HP, SGI Mips and M68000 are MSB (big endian); 
Intel, Alpha, DEC Mips and Vax are LSB (little 
endian). 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-iswap</term>
<LISTITEM><PARA>

Swap bytes. (For use on an unheadered input file) 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-istype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Sample type in an unheadered input file: 
short, mulaw, byte, ascii 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-c</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>

Select a single channel (starts from 0). 
Waveforms can have multiple channels. This option 
extracts a single channel for processing and 
discards the rest. 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-start</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Extract sub-wave starting at this time, specified in 
seconds 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-end</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Extract sub-wave ending at this time, specified in 
seconds 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-from</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Extract sub-wave starting at this sample point 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-to</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

Extract sub-wave ending at this sample point 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-otype</term>
<LISTITEM><PARA>
<replaceable>string</replaceable>
 " {ascii}"
Output file type. If unspecified, ascii is 
assumed. Supported types are: none, esps, est, est_binary, htk, htk_fbank, htk_mfcc, htk_user, htk_discrete, ssff, xmg, xgraph, ema, ema_swapped, ascii, label 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-S</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Frame spacing of output in seconds. If this is 
different from the internal spacing, the contour is 
resampled at this spacing 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-o</term>
<LISTITEM><PARA>
<replaceable>ofile</replaceable>

Output filename, defaults to stdout 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-shift</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

Frame spacing in seconds for fixed frame analysis. This 
doesn't have to be the same as the output file spacing: the 
-S option can be used to resample the track before saving. 
Default: 0.001 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-length</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

input frame length in milliseconds 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-sr</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

range in which output values should lie 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-slow</term>
<LISTITEM><PARA>

slow FFT code 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-w</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

white cut off (0.0 to 1.0) 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-b</term>
<LISTITEM><PARA>
<replaceable>float</replaceable>

black cut off (0.0 to 1.0) 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-raw</term>
<LISTITEM><PARA>

Don't perform any scaling 
</PARA></LISTITEM>
</varlistentry>

<varlistentry><term>-order</term>
<LISTITEM><PARA>
<replaceable>int</replaceable>

cepstral order </PARA></LISTITEM>
</varlistentry>
</variablelist>
        </para>
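        <para>
The options for unheadered input (-itype raw, -ibo, -istype, -n and
-c) describe how a headerless file is to be interpreted. The Python
sketch below shows what such an interpretation might look like for
16 bit samples. It is not spectgen's own code: the function names are
hypothetical, and it assumes that the channels of a multi-channel
file are interleaved sample by sample, which the text above does not
state. The second function shows how a time given in seconds (as for
-start and -end) corresponds to a sample index (as for -from and -to)
at a given sample rate.
        </para>
<programlisting><![CDATA[
import struct


def decode_raw_shorts(data, byte_order="LSB", num_channels=1):
    """Decode headerless 16 bit samples into one list per channel.

    byte_order is "MSB" (big endian) or "LSB" (little endian).
    Assumption: channels are interleaved sample by sample.
    """
    prefix = {"MSB": ">", "LSB": "<"}[byte_order]
    count = len(data) // 2  # two bytes per short; ignore a trailing odd byte
    samples = struct.unpack(prefix + str(count) + "h", data[:count * 2])
    # Sample i of channel c sits at index i * num_channels + c.
    return [list(samples[c::num_channels]) for c in range(num_channels)]


def seconds_to_sample(time_s, sample_rate):
    """Convert a time in seconds to the nearest sample index."""
    return round(time_s * sample_rate)


# Two interleaved channels, little endian: channel 0 is [1, 3], channel 1 is [-2, -4].
stereo = struct.pack("<4h", 1, -2, 3, -4)
channels = decode_raw_shorts(stereo, byte_order="LSB", num_channels=2)
print(channels[0])  # the channel that an option such as "-c 0" would keep

# Reading big endian data with the wrong byte order swaps the bytes of each sample.
big_endian = struct.pack(">2h", 256, 1)
print(decode_raw_shorts(big_endian, byte_order="MSB"))
print(decode_raw_shorts(big_endian, byte_order="LSB"))

print(seconds_to_sample(0.5, 16000))
]]></programlisting>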
        <!-- DONE /amd/projects/festival/versions/v_mpiro/speech_tools_linux/bin/spectgen -sgml_options -->
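        <para>
As a usage illustration, the options above can be assembled into a
command line from a script. The Python sketch below only builds and
prints the command; it does not run spectgen. The helper name, the
file names and the numeric values are hypothetical, and only options
listed in this section are used, with -shift in seconds and -length
in the unit documented above.
        </para>
<programlisting><![CDATA[
import shlex


def spectgen_command(input_file, output_file, shift=None, length=None, otype=None):
    """Build an argument list for spectgen from a few documented options."""
    args = ["spectgen", input_file, "-o", output_file]
    if shift is not None:
        args += ["-shift", str(shift)]    # frame spacing in seconds
    if length is not None:
        args += ["-length", str(length)]  # frame length, unit as documented above
    if otype is not None:
        args += ["-otype", otype]         # one of the listed output file types
    return args


# Hypothetical file names and values, chosen only for illustration.
cmd = spectgen_command("utt.wav", "utt.sgram", shift=0.005, length=20, otype="est")
print(shlex.join(cmd))
]]></programlisting>
        <para>
Keeping the command as a list of arguments rather than a single
string means it can be handed to subprocess.run without shell quoting
problems if the file names contain spaces.
        </para>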
    </sect2>
  </sect1>