FASTR Version 2.03/99-04-01

*
* FASt Term Recognizer
*
* fastr/README
* Version 2.03/99-04-01
*
* Copyright (C) 1998 Christian Jacquemin, LIMSI-CNRS
* BP 133, 91403 ORSAY, FRANCE
* tel +33 (0)1 69 85 80 22 / fax -- 80 88
* mailto:jacquemin@limsi.fr
* http://www.limsi.fr/Individu/jacquemi/
* ftp://ftp.limsi.fr/pub/jacquemi/
*
* This file contains a general presentation of Fastr
*

Overview
********

Fastr is a parser for term and variant recognition. Fastr takes as
input a corpus and a list of terms, and outputs the indexed corpus in
which terms and variants are recognized.

Fastr can be used in two modes:
  o controlled indexing: the input consists of a corpus and a list of
    terms,
  o free indexing: the input consists of a corpus only; the list of
    terms is automatically acquired from the corpus.

Fastr uses the following resources:
  o the corpus and the list of terms are tagged by the TreeTagger:
    http://www.ims.uni-stuttgart.de/Tools/DecisionTreeTagger.html
  o if available, a list of morphological families and a list of
    semantic links are used to calculate morphological and semantic
    variation.

The formalism of Fastr is close to PATR-II.

Pointers to the User Manual
***************************

The Fastr distribution includes its own manual pages. They can be
viewed with
  "nroff -man Fastr.1 | less",
  "nroff -man Fastrconf.1 | less",
  "nroff -man Fastrlang.1 | less", or
  "nroff -man Fastrdata.1 | less".

See the online publications at http://www.limsi.fr/Individu/jacquemi/
for more information and examples.

Installation
************

Fastr is currently available for the following languages:
  o French,
  o English,
and for the following operating systems:
  o Linux,
  o Solaris.

To install Fastr you MUST do the following:

  1. get the distribution of Fastr (XXXX is the name of the operating
     system [linux|sunos|solaris]):
       fastr-XXXX.tar.gz

  2. get the linguistic resources and configuration file corresponding
     to the desired language(s) (XX is the name of the language
     [en|fr]):
       fastr-language-XX.tar.gz

  3. execute the install-Fastr script.

  4. install the TreeTagger, or use another tagger and adapt the tag
     transcription file lib/TAGS-TreeTagger-XX (XX is the name of the
     language [en|fr]).

  5. test your installation for free or controlled indexing through
     the following commands (XX is the name of the language [en|fr]):
       Fastr-controlled-indexing-XX text-XX.txt terms-XX.txt
       Fastr-free-indexing-XX corpus-XX.txt

Improvement
***********

You can enrich your system by adding linguistic resources:
morphological and semantic families. Scripts are provided for creating
the following resources:

type of links | language | source database  | script
--------------+----------+------------------+------------------
semantic      | English  | WordNet 1.6      | WordNetPreProc.sh
--------------+----------+------------------+------------------
morphological | English  | CELEX            | CelexPreProc.sh
--------------+----------+------------------+------------------
semantic      | French   | Microsoft Word97 | WordNetPreProc.sh
--------------+----------+------------------+------------------

More Information
****************

Author: Christian Jacquemin <jacquemin@limsi.fr>
        LIMSI-CNRS
        BP 133, 91403 ORSAY, FRANCE
        mailto:jacquemin@limsi.fr
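
Example: Installation Commands
******************************

The installation steps listed under Installation can be sketched as a
small dry-run helper that only prints the commands for a given
operating system and language, so they can be reviewed before being
run by hand. This is a minimal sketch, not part of the Fastr
distribution: the function name fastr_install_plan is hypothetical,
unpacking with "gzip -dc ... | tar xf -" is one common way to extract a
.tar.gz archive, and invoking the install script as "./install-Fastr"
assumes it sits in the current directory after unpacking. Step 4
(installing the TreeTagger) is left out because it is done separately.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands for installation
# steps 1-3 and 5. fastr_install_plan is a hypothetical helper name.
fastr_install_plan() {
    os=$1      # operating system: linux, sunos, or solaris
    lang=$2    # language: en or fr

    # Reject values outside the sets named in the installation steps.
    case "$os" in
        linux|sunos|solaris) ;;
        *) echo "unsupported operating system: $os" >&2; return 1 ;;
    esac
    case "$lang" in
        en|fr) ;;
        *) echo "unsupported language: $lang" >&2; return 1 ;;
    esac

    # Steps 1-2: unpack the distribution and the language resources.
    echo "gzip -dc fastr-$os.tar.gz | tar xf -"
    echo "gzip -dc fastr-language-$lang.tar.gz | tar xf -"

    # Step 3: run the install script (location is an assumption).
    echo "./install-Fastr"

    # Step 5: test controlled and free indexing.
    echo "Fastr-controlled-indexing-$lang text-$lang.txt terms-$lang.txt"
    echo "Fastr-free-indexing-$lang corpus-$lang.txt"
}

# Print the plan for a Linux installation with English resources.
fastr_install_plan linux en
```

Printing the commands instead of running them keeps the sketch
harmless on a machine where the archives have not been downloaded yet.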