- Name: tokenizer
- Version: 5.4.1
- Release: 1mdk
- Epoch:
- Group: Sciences/Computer science
- License: GPL
- Url: http://atoll.inria.fr/~lclement/tokenizer-main.html
- Summary: Text segmenter
- Architecture: i586
- Size: 73338
- Distribution: Mandrakelinux
- Vendor: Mandrakesoft
- Packager: Guillaume Rousse <guillomovitch@mandrake.org>
Description:
Tokenizer segments a text into tokens, then into word-forms. The tokens
match regular expressions, and the word-forms match lexical entries compiled
with lexed. A word-form may be a concatenation of several tokens, as in a
compound name. Ambiguity between simple and compound words is represented as
a directed acyclic graph (DAG).
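
The two-stage segmentation described above can be illustrated with a minimal
Python sketch. It is not the tokenizer package's own code or interface: the
token pattern, the function names, and the use of a plain set of token tuples
in place of a lexicon compiled with lexed are all assumptions made for the
illustration.

```python
import re

# Assumed token pattern (not the package's own): a run of word characters,
# or a single punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Stage 1: split the text into tokens that match the regular expression."""
    return TOKEN_RE.findall(text)


def build_dag(tokens, lexicon):
    """Stage 2: return word-form edges (start, end, form) over token positions.

    `lexicon` is a hypothetical stand-in for lexed-compiled entries: a set of
    token tuples, one per compound word-form. Every edge runs from a lower
    position to a higher one, so the resulting graph cannot contain a cycle.
    """
    edges = []
    max_len = max((len(entry) for entry in lexicon), default=1)
    for start in range(len(tokens)):
        # Each single token is always available as a simple word-form.
        edges.append((start, start + 1, tokens[start]))
        # Try longer spans that might match a compound entry.
        for end in range(start + 2, min(start + max_len, len(tokens)) + 1):
            candidate = tuple(tokens[start:end])
            if candidate in lexicon:
                edges.append((start, end, " ".join(candidate)))
    return edges


if __name__ == "__main__":
    lexicon = {("pomme", "de", "terre")}
    tokens = tokenize("une pomme de terre cuite")
    for edge in build_dag(tokens, lexicon):
        print(edge)
```

In this sketch the compound "pomme de terre" appears as one edge spanning
positions 1 to 4, alongside the three single-token edges covering the same
span, so both readings stay available to a later processing step.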
- OptFlags: -O2 -fomit-frame-pointer -pipe -march=i586 -mtune=pentiumpro
- Cookie: n2.mandrakesoft.com 1101393863
- Buildhost: n2.mandrakesoft.com