Sophie: fsa-tcl-0.51-5.mga6 armv5tl

fsa-tcl-0.51-5.mga6.armv5tl.rpm

Version 0.5:
- Word length in programs using automata increased to 120.
- Option `clean' provided in Makefile.
- Option `-v' provided in all programs (gives version details).
- Sorting of arcs on frequency in optimization phase of automaton creation.
- Merging two nodes that share the same arc.
- This file added.
Version 0.6:
- Option -v corrected.
- fr.acc file added to distribution.
- man pages provided.
- Compilation options shown in -v in all programs.
- Option -X provided in fsa_build (makes an index a tergo for word category
  guessing).
- New program - fsa_guess - added; it predicts word categories based on
  word endings.
- New program - fsa_hash - added; it is used for perfect hashing.
- Option -i added to programs using automata; it specifies input files.
- Option -l added to programs using automata; it provides information
  on language specific features, such as which characters form words,
  and on case conversions.
- New module - text_io - provided that processes text files (many words
  in line, punctuation, etc.), and gives grep-like output.
Version 0.7:
- In one_word_io, replacements are now separated by a comma and a space
  (was: space only); this makes it possible to have a two-word
  replacement for one word - in other words: now run-on words can be
  corrected.
- New compile option RUNON_WORDS added; if turned on, fsa_spell checks
  for run-on words, i.e. it checks whether inserting a space somewhere
  inside the word results in two correct words.
- New compile option CHCLASS added; if turned on, a dedicated file
  specifies equivalent sequences of characters, so that e.g. `rz' and
  `z' with a dot above (\.z in TeX) may be only one edit distance unit
  apart from each other.
- Emacs interface for spelling correction added; it is an adaptation of
  ispell.el.
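
The run-on check that RUNON_WORDS enables in version 0.7 can be sketched
as follows. This is only an illustration of the idea, not the code of
fsa_spell: the dictionary here is a std::set instead of an automaton, and
the function name split_run_on is made up for the example.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the fsa package: returns every way of
// splitting `word` into two non-empty parts that are both in `dict`.
std::vector<std::pair<std::string, std::string>>
split_run_on(const std::string& word, const std::set<std::string>& dict) {
    std::vector<std::pair<std::string, std::string>> result;
    // Try a space at every inner position of the word.
    for (std::string::size_type i = 1; i < word.size(); ++i) {
        std::string left = word.substr(0, i);
        std::string right = word.substr(i);
        if (dict.count(left) != 0 && dict.count(right) != 0) {
            result.emplace_back(left, right);
        }
    }
    return result;
}
```

Each returned pair corresponds to one two-word replacement, which is why
the replacement separator had to become a comma and a space.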
Version 0.8:
- New program fsa_morph performs morphological analysis (but not generation).
- Improved INSTALL guidelines.
- README more up to date, obsolete data removed, better file list.
- fsa_guess now guesses lexemes as well (with GUESS_LEXEMES).
- awk scripts for data preparation.
Version 0.9:
- Corrected a bug that caused segmentation violations when using
  dictionaries of different sizes, and thus prevented users from using
  personal dictionaries.
- fsa_guess now recognizes prefixes with GUESS_PREFIX option.
- New options -g and -p for fsa_guess to simulate compile options.
- Words and lines can now be of arbitrary length.
- Binary search in leaf vectors of the register - this does speed up
  processing considerably.
- New compile option for creating an index a tergo: GENERALIZE; it gives
  smallest automata sizes.
- New compile option STATISTICS prints... wait for it... some statistics
  in fsa_build.
Version 0.10:
- Corrected a bug in fsa_build that showed up when using the PRUNE_ARCS
  option while compiling an index a tergo.
- Corrected a bug in fsa_guess that prevented the proper use of -g option.
  Now -g and -p are independent.
- Introduced a limit on the number of analyses in fsa_guess.
- Introduced a limit on the depth of search for suffixes.
- Corrected a bug in fsa_build man page.
- Changed definitions of node and arc_node classes, so that the automaton
  requires less memory than before (by a quarter).
Version 0.11:
- Corrected a bug in statistics.
- Option -r added to the function usage() in fsa_spell.
- Removed random inline in fsa.h.
- Updated #ifdefs so that all #ifdef NUMBERS are enclosed in #ifdef FLEXIBLE.
- Updated Makefile so that it contains a description of NUMBERS.
- Corrected a bug in fsa_build that appeared while reading long input lines.
- Updated description of -v option for all programs.
- Corrected the effect of GENERALIZE option.
- Introduced -m option in fsa_guess (prediction of mmorph descriptions
  of words based on inflected forms). mmorph is a morphology program
  available from ISSCO, Geneva, http://www.issco.unige.ch/
  or http://issco-www.unige.ch/
- fsa_build is now faster.
- Corrected a bug in PRUNE_ARCS option application.
Version 0.12:
- Added a new program: fsa_ubuild.
Version 0.13:
- Corrected a bug in fsa_ubuild that excluded some words from the automaton;
  the bug was in the function already_there().
- Added new program: fsa_visual.
- Added an entry for version 0.12 in this file.
Version 0.14:
- Corrected a bug in Makefile (introduced in 0.12) - there was no rule
  for making buildu_fsa.o.
- Changed declarations in fsa.h to simplify their use.
- Added perl scripts (awk scripts translated with a2p) for portability.
- Corrected a bug in fsa_hash: -N did not work correctly.
- fsa_visual uses manhattan edges.
- Introduced a new compile option STOPBIT that changes the format
  version, and makes automata smaller (by nearly 20% for large
  automata).
- Included more information on data preparation in README, and on
  compile options in INSTALL.
- Compiled the package on Solaris using g++ 2.6.0 to improve
  portability (thanks Sabine).
Version 0.15:
- Corrected a bug in list.empty_list - a memory leak that could be a nuisance
  with fsa_prefix operating on large data.
- Corrected perl scripts.
- Added new script: morph_infix.{awk,pl}. It prepares data for an automaton
  to be used with fsa_morph for languages that have prefixes and infixes
  (like German).
- Added new compile option: MORPH_INFIX, and two new runtime options for
  fsa_morph: -I and -P. They make it possible to use data prepared with
  morph_infix.{awk,pl}.
- Added new compile option: POOR_MORPH that enables -A option in
  fsa_morph. That option enables morphological analysis giving only
  categories, and no base form.
- Added new script: morph_prefix.{awk,pl}. It prepares data for an automaton
  to be used with fsa_morph for languages that have prefixes (like Polish).
Version 0.16:
- Corrected a memory leak bug in fsa_morph. Now fsa_morph works two orders 
  of magnitude faster.
- Corrected manual pages (format errors).
Version 0.17:
- Added new compile option for fsa_build and fsa_ubuild:
  DESCENDING. If on, makes resulting automata smaller, but slower.
- Improved morph_infix.{awk,pl}.
- New option -F added to fsa_build and fsa_ubuild. It sets the filler
  character.
- New scripts added: prep_ati.{awk,pl}. They prepare data with coded infixes
  and prefixes for guessing lexemes and categories using fsa_guess.
- New scripts added: prep_atp.{awk,pl}. They prepare data with coded prefixes
  for guessing lexemes and categories using fsa_guess.
- fsa_hash now works correctly with the STOPBIT option.
- Corrected another bug in fsa_hash, which probably lingered there
  from the beginning, and which made fsa_hash unusable for more than
  256 words.
Version 0.18:
- Added new compile option MORE_COMPR that tries to get more compression
  when using fsa_build or fsa_ubuild compiled with NEXTBIT.
Version 0.19:
- Added new compile option TAILS that enables compression of tails
  (last transitions) of states.
- Now MORE_COMPR also tries to squeeze some bytes without NEXTBIT.
- Corrected a bug in Makefile introduced in 0.18 (one comment too
  many).
- Enriched documentation on options in INSTALL.
- Corrected a bug in fsa_visual that showed up with variable size
  arcs, i.e. NEXTBIT or TAILS.
- Added a check on whether the -O option should be used in fsa_visual;
  it is now supplied when necessary even if the user does not give it.
- Added LOOSING_RPM compile option to circumvent a bug in g++ or
  stdlibc++ found in new rpms (I have to use SuSE now, and I got
  reports of the same bugs appearing on Red Hat, but no problems on
  Debian). This does not solve all the problems - if they appear,
  switch optimization off (remove -O2 from compile options).
- Added a small program fsa_dump. It is not in Makefile, as it is not
  tested yet. The source is in dump.cc. The program lists the contents
  of an automaton as transitions.
- Added scripts: de_morph_data.{awk,pl} and de_morph_infix.{awk,pl}
  that produce the 3 column format out of data for fsa_build.
- Added scripts: demorph.{awk,pl} that produce the 3 column format from
  the output of fsa_morph.
Version 0.20
- Moved mark_inner() to nnode.cc, as it can be used without A_TERGO option.
- Added information on listing the contents of an automaton.
- Fixed display of statistics for NEXTBIT and TAILS.
- Corrected placement of conditionals so that compilation without
  FLEXIBLE is possible. But do use FLEXIBLE!
- Added a Tcl script - an interface for fsa_guess as a tool for
  acquisition of descriptions for a morphological dictionary.
- Added additional information to -v option of all programs.
- MORE_COMPR is now *much* faster; actually, it became usable.
- Added a new perl script chkmorph.pl that removes those predictions
  made by fsa_guess that cannot produce the required flectional form.
- Added sortatt.pl perl script that sorts words on their
  categories/features; it is used by the tcl/tk interface, and it is
  especially useful when comparing output of two descriptions.
- Added gendata.pl - a perl script that generates data for guessing
  morphological descriptions in mmorph format of unknown words.
Version 0.21
- Corrected some bugs in gendata.pl.
- Added new compile option - WEIGHTED.
- Corrected a bug in chkmorph.pl.
- Corrected a bug in fsa_ubuild (thanks to Christen Blom-Dahl).
- Totally rewrote GENERALIZE. I hope it provides better results.
- Added new script sortondesc.pl that sorts morphological descriptions
  of words so that the most probable come first. A description is
  judged to be more probable when it appears in more words.
- Corrected a horrible bug in fsa_spell that manifested itself when
  the edit distance was set to 0. The program gave arbitrary results.
- Tcl/Tk interface for lexical acquisition is now much more powerful.
- Added a new script putinplace.pl that should put descriptions chosen
  with the Tcl/Tk tool in their appropriate places.
Version 0.22
- Corrected conditional compilation so that it is now possible to
  compile without MORE_COMPR.
- Added guided correction (right mouse button on description) to the
  dictionary acquisition tool. The interface is improved.
- Added statistics to the dictionary acquisition interface.
Version 0.23
- In the Tcl/Tk tool, corrected output from mmorph matching so that if
  all values of a feature are generated, nothing comes out, and when
  no features are generated, the feature name is deleted from the
  output.
- In the Tcl/Tk tool, corrected deleting features using the right mouse
  button menu.
- Corrected the script chkmorph.pl so that no phony item appears at
  the end (there is no dangling comma at the end).
- Added a new option to ignore the filler character in morphology.
- Corrected building a weighted guessing automaton. It still needs my
  attention.
Version 0.24
- Corrected dropping one hypothesis in sortondesc.pl script.
- Corrected a bug in fsa_build that made pointer size calculation invalid
  (thanks to Gertjan van Noord).
- Corrected a bug in fsa_spell for distances greater than 1 (thanks to
  Jiri Andel).
Version 0.25
- Included perl and tcl scripts in installation in Makefile.
- Corrected a bug in fsa_hash: null pointers were followed in word->number
  conversion (thanks to Martin Povolny).
Version 0.26
- Included perl and tcl scripts deleted by mistake from 0.25.
- Corrected Makefile so that it does not delete perl and tcl scripts
  in make realclean.
Version 0.27
- Corrected a bug in Undo operation in Tcl/Tk interface.
- Moved customization of tclmacq to Makefile.
- Adapted tclmacq to new version of Tcl/Tk.
Version 0.28
- Corrected a bug in tclmacq (Tcl/Tk interface for dictionary
  acquisition). Sorting was done before (and not after) expansion of
  alternatives, which resulted in apparently random order.
- Added some include directives needed in the most recent compilers
  (thanks to Dawid Weiss).
- Corrected setting the FILLER character in builds_fsa.cc (thanks to
  Dawid Weiss).
- Corrected usage info for dump.cc (thanks to Dawid Weiss).
Version 0.29
- Corrected a bug in simplify.pl (it produced duplicates).
Version 0.30
- Corrected a bug in fsa_morph. When one entry was a prefix of another
  entry, the words were the same, but one annotation was shorter than
  the other one, the longer entry was not printed (thanks to Gertjan
  van Noord).
Version 0.31
- Corrected the use of one variable so that the package compiles with
  the old set of options (thanks to Michael Daum).
Version 0.32
- The package now compiles under g++ 3.1.1.
Version 0.33
- jguess is again produced (thanks to Leonoor van der Beek).
- Corrected fsa_hash so that words not in the dictionary return -1
  and not a slash (thanks to Vinay Middha).
- Added a file TROUBLESHOOTING describing the most common problems people have
  while trying to install and use the package. As a bonus, I included some
  solutions as well.
- Added possibility of morphological analysis of words without tags, i.e.
  stemming or lemmatization (thanks to Gertjan van Noord). Just remove
  the last annotation separator (+) and anything that follows it from
  the output of a script preparing morphological data.
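
The word-to-number mapping of fsa_hash, with -1 for words that are not
in the dictionary, can be illustrated with the usual counting technique
for acyclic automata. The sketch below is an assumption-laden toy: it
uses a plain trie instead of the package's minimized automaton, and the
names Node, insert and word_to_number are hypothetical.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical trie node; the real package uses a minimized automaton.
struct Node {
    bool is_final = false;
    int words_below = 0;  // number of stored words reachable from this node
    std::map<char, std::unique_ptr<Node>> next;  // ordered by key
};

// Inserts a word and keeps the per-node word counts up to date.
// Returns false if the word was already present.
bool insert(Node& root, const std::string& word) {
    std::vector<Node*> path{&root};
    Node* node = &root;
    for (char c : word) {
        std::unique_ptr<Node>& child = node->next[c];
        if (!child) child = std::make_unique<Node>();
        node = child.get();
        path.push_back(node);
    }
    if (node->is_final) return false;
    node->is_final = true;
    for (Node* n : path) ++n->words_below;
    return true;
}

// Returns the 0-based rank of `word` among the stored words, in the
// order of the map keys, or -1 if the word is not stored.
int word_to_number(const Node& root, const std::string& word) {
    const Node* node = &root;
    int number = 0;
    for (char c : word) {
        if (node->is_final) ++number;  // a proper prefix that is a word
        auto it = node->next.find(c);
        if (it == node->next.end()) return -1;
        // Skip all words that start with a smaller character here.
        for (auto sib = node->next.begin(); sib != it; ++sib)
            number += sib->second->words_below;
        node = it->second.get();
    }
    return node->is_final ? number : -1;
}
```

Because every stored word gets a distinct number between 0 and the
number of words minus one, the mapping is a perfect hash; returning -1
keeps unknown words distinguishable from any valid number.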
Version 0.34
- States can have up to 255 (was: 127) outgoing transitions when
  compiled with STOPBIT (thanks to Gertjan van Noord).
- Closed memory leaks in handling of lists (thanks to Martin Povolny).
Version 0.35
- Corrected a bug introduced in the previous version (deleting the
  wrong thing).
Version 0.36
- Corrected a bug in dynamic growth of strings read from input in
  programs that use automata, i.e. not fsa_build nor fsa_ubuild
  (thanks to Gertjan van Noord).
Version 0.37
- Replaced recursion with iteration in some programs, e.g. fsa_hash.
  fsa_hash is now about 3.5 percent faster.
Version 0.38
- Introduced a "-a" runtime option to list the contents of the whole
  dictionary. The updated glibc++ version I have now treats reading an
  empty line as an error, so there is no way to learn if an empty line
  was indeed read.
- Introduced a new compile option DUMP_ALL to suppress printing the
  leading space in fsa_prefix.
- Corrected some type errors and vestiges of previous versions when
  using DEBUG compile option (thanks to Nikolay Ketsaris).
- Corrected dump.cc to print non-ASCII characters.
Version 0.39
- fsa_spell now compiles also without CHCLASS (thanks to Nikolay Ketsaris).
- Added ios::binary in 3 places for the benefit of those who have the
  misfortune of being forced to use the virus distribution system from
  M$.
- Corrected exit code in fsa_prefix when -a is used (thanks to Marcin
  Miłkowski).
- Corrected a bug in fsa_build and fsa_ubuild when -O was
  used. Certain states were compressed "too much", i.e. comparison of
  transitions did not work in part_cmp_nodes due to a modification
  introduced several versions ago.
- Changed the script ie1 to make it immediately useful for debugging
  should anything unpredictable happen.
- Removed the outrageously outdated file ToDo.
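
The ios::binary change in version 0.39 matters on platforms where text
mode translates line endings or treats some bytes specially, which would
corrupt an automaton file. A minimal sketch of opening files in binary
mode follows; the helper names are made up for the example and do not
come from the package.

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: writes raw bytes without any text-mode translation.
bool write_bytes(const std::string& path, const std::vector<char>& data) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();
    return static_cast<bool>(out);
}

// Hypothetical helper: reads a whole file as raw bytes.
std::vector<char> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::vector<char> data;
    char c;
    while (in.get(c)) data.push_back(c);
    return data;
}
```

On systems that do not distinguish text and binary files the flag has no
effect, so adding it is harmless there.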
Version 0.40
- Corrected a bug in the initialization of the H_matrix in fsa_spell
  (thanks to Guillaume Rousse).
- Corrected a bug in ie1 (return value for fgrep).
- Changed the way parameters are passed to most functions in programs
  that use automata (passing first arc instead of the parent arc).
  This might have introduced some new errors...
- Added a parameter to fsa_spell to force the search for replacements
  (thanks to Guillaume Rousse).
- Added two new compile options. The first one -- SPARSE -- changes
  the way the automaton is represented. If the option is used, then
  most transitions of the automaton are stored as a sparse
  matrix. Only annotations (e.g. in morphological dictionaries) are
  still stored as lists of transitions. The new representation is
  faster for most tasks, but it takes longer to produce, and it is
  larger. The option SLOW_SPARSE makes sure that we try to fill in
  every hole in the sparse matrix, but it results in *VERY* slow
  construction, and the results are practically the same.
Version 0.41
- Corrected a bug in fsa_ubuild that caused the FILLER not to be set (thanks
  to Marco Baroni).
Version 0.42
- Corrected a compile error in mmorph.cc when MORPH_INFIX was undefined.
- Corrected an error in fsa_prefix that gave infinite loops while
  listing words with certain prefixes.
- Corrected a bug that resulted from new glibc++ I/O behaviour (thanks
  to Gertjan van Noord).
- Changed the licence so that the package is freer than it used to be.
Version 0.43
- Corrected a bug in fsa_morph that never got a chance to manifest
  itself there because of the way C++ initializes variables, but it
  was a bug anyway (thanks to Jirka Mikulasek).
Version 0.44
- Corrected a bug in fsa_morph that was introduced in version 0.40 and
  resulted in inability to process infixes (thanks to Marcin Milkowski).
- Corrected a bug in fsa_guess that was introduced in version 0.40 and
  resulted in inability to process infixes (thanks to Marcin Milkowski).
Version 0.45
- Corrected a bug in counting transitions with the next flag set that
  resulted in incorrect pointer size in fsa_build/fsa_ubuild (thanks
  to Marcin Milkowski and Dawid Weiss).
Version 0.46
- Corrected a bug in pruning arcs (transitions) for guessing automata in
  fsa_build/fsa_ubuild (thanks to Marcin Milkowski).
Version 0.47
- Corrected a bug in pruning arcs (transitions) for guessing automata in
  fsa_build/fsa_ubuild (thanks to Marcin Milkowski).
- fsa_ubuild now calls the same functions as fsa_build when building
  guessing automata.
Version 0.48
- Corrected a few bugs in memory management in fsa_build and fsa_ubuild when
  pruning arcs (transitions) for guessing automata (thanks to Marcin
  Milkowski).
- Updated README to include LANG=C in examples invoking sort, so that the way
  to make sort behave correctly is shown.
Version 0.49
- Corrected a bug in construction of guessing automata, in which the
  same annotation appeared twice. 
- Corrected a bug in construction of guessing automata, in which no
  annotation separator character was used in certain paths.
- Added an include directive to make the package compile with gcc 4.3 (thanks
  to Milos Jakubicek).
Version 0.50
- Added perl scripts prep_gen.pl, prep_genp.pl, and prep_geni.pl for
  preparation of data for morphological generation (synthesis).
- Added comments to morph_data.pl, morph_prefix.pl, and morph_infix.pl.
- Added man pages fsa_synth.1 and fsa_synth.5.
- Added files synth.h, synth.cc, and synth_main.cc that compile to
  give fsa_synth - a program for morphological generation. Also
  adjusted Makefile.
- Added new compile option UTF8. The package worked with unicode even
  without it, except for case conversion, fsa_accent, and
  fsa_spell. It may be needed by fsa_synth when regular expressions
  are used, and they contain characters that take more than one byte
  in UTF8. While fsa_accent now works with unicode (with some
  limitations), fsa_spell still does not work with UTF8.
Version 0.51
- Corrected handling of long lines in programs that use automata. Stroustrup
  just forgot to mention that one needs to call clear() for an input stream
  when the entire line did not fit the buffer length (thanks to Gertjan van
  Noord).
- Introduced detection of non-hash dictionaries with fsa_hash.