Sophie

kytea-0.4.2-2.mga3.src.rpm

Description:

General toolkit for analyzing text, with a focus on Japanese, Chinese
and other languages requiring word or morpheme segmentation.

KyTea is able to perform the following types of processing:
- Word Segmentation: it can separate an unsegmented text stream into
appropriate units (words or morphemes).
- Tagging: it can estimate the tags for words such as POS (part of
speech) tags.
- Pronunciation: it can estimate the pronunciation of unknown
words.
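
As a rough illustration of what the three processing types above produce together, the sketch below parses one line of tagged output into tokens. It assumes a layout where words are separated by spaces and each word carries its tags after slashes (surface/POS/pronunciation). The sample line is hand-written for illustration and is not actual KyTea output; the `Token` type and `parse_tagged_line` helper are hypothetical names, not part of KyTea. Escaped separator characters inside a word are not handled.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    surface: str     # the segmented word or morpheme
    tags: List[str]  # tags in the order they appear, e.g. [POS, pronunciation]


def parse_tagged_line(line: str, word_sep: str = " ", tag_sep: str = "/") -> List[Token]:
    """Split one line of 'surface/tag1/tag2 ...' text into Token records.

    Assumption: separators are never escaped inside a word or tag.
    """
    tokens = []
    for chunk in line.strip().split(word_sep):
        if not chunk:
            continue  # skip empty pieces from blank lines or repeated separators
        surface, *tags = chunk.split(tag_sep)
        tokens.append(Token(surface, tags))
    return tokens


# Hand-written sample in the assumed layout (not real tool output).
sample = "これ/代名詞/これ は/助詞/は テスト/名詞/てすと です/助動詞/です"
tokens = parse_tagged_line(sample)

for token in tokens:
    print(token.surface, token.tags)
```

The separators are parameters so the same helper can be reused if the analyzer is configured to emit different word or tag boundaries.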

KyTea comes with a default model. If you have your own annotated
text, it also provides a tool for training your own model.
