<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"> <html> <head> <title> ht://Dig: htfuzzy </title> </head> <body bgcolor="#eef7ff"> <h1> htfuzzy </h1> <p> ht://Dig Copyright © 1995-2001 <a href="THANKS.html">The ht://Dig Group</a><br> Please see the file <a href="COPYING">COPYING</a> for license information. </p> <hr size="4" noshade> <dl> <dd> <h2> Synopsis </h2> </dd> <dd> htfuzzy [-c <em>configfile</em>][-v] <em>algorithm</em> ... </dd> </dl> <dl> <dd> <h2> Description </h2> </dd> <dd> Htfuzzy creates indexes for different "fuzzy" search algorithms. These indexes can then be used by the <a href="htsearch.html" target="_top">htsearch</a> program. </dd> </dl> <dl> <dd> <h2> Options </h2> </dd> <dd> <dl compact> <dt> -c <em>configfile</em> </dt> <dd> Use the specified configuration file instead of the default. </dd> <dt> -v </dt> <dd> Verbose mode. Used once will provide progress feedback, used more than once will overflow even the biggest buffers. :-) </dd> </dl> </dd> </dl> <dl> <dd> <h2> Algorithms </h2> </dd> <dd> Indexes for the following search algorithms can currently be created: <dl> <dt> <strong>soundex</strong> </dt> <dd> Creates a slightly modified soundex key database. Differences with the standard soundex algorithm are: <ul> <li> Keys are 6 digits. </li> <li> The first letter is also encoded. </li> </ul> </dd> <dt> <strong>metaphone</strong> </dt> <dd> Creates a metaphone key database. This algorithm is more specific to English, but will get fewer "weird" matches than the soundex algorithm. </dd> <dt> <strong>accents</strong> </dt> <dd> Creates an accents key database. This algorithm will map all accented letters to their unaccented counterparts, so that a search for the unaccented word will yield all variations of this word with accents. </dd> <dt> <strong>endings</strong> </dt> <dd> Creates two databases which can be used to match common word endings. The creation of these databases requires a list of affix rules and a dictionary which uses those affix rules. The format of the affix rules and dictionary files are the ones used by the <a href="http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html"> ispell</a> program. Included with the distribution are the affix rules for English and a fairly small English dictionary. Other languages can be supported by getting the appropriate affix rules and dictionaries. These are available for many languages; check the ispell distribution for more details. </dd> <dt> <strong>synonyms</strong> </dt> <dd> Creates a database of synonyms for words. It reads a text database of synonyms and creates a database that htsearch can then use. Each line of the text database consists of words where the first word will have the other words on that line as synonyms. </dd> </dl> </dd> </dl> <dl> <dd> <h2> Files </h2> </dd> <dd> <dl> <dt> CONFIG_DIR/htdig.conf </dt> <dd> The default configuration file. </dd> </dl> </dd> </dl> <dl> <dd> <h2> See Also </h2> </dd> <dd> <a href="htdig.html">htdig</a>, <a href="htmerge.html">htmerge</a>, <a href="htsearch.html" target="_top">htsearch</a>, <a href="attrs.html">Configuration file format</a>, and <a href="http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html"> ispell</a>. </dd> </dl> <hr size="4" noshade> Last modified: $Date: 2001/02/15 17:05:33 $ </body> </html>