<HTML ><HEAD ><TITLE >Fuzzy search</TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.73 "><LINK REL="HOME" TITLE="mnoGoSearch 3.2 reference manual" HREF="index.html"><LINK REL="UP" TITLE="Searching documents" HREF="msearch-doingsearch.html"><LINK REL="PREVIOUS" TITLE="Search results cache " HREF="msearch-srcache.html"><LINK REL="NEXT" TITLE="Miscellaneous" HREF="msearch-misc.html"><LINK REL="STYLESHEET" TYPE="text/css" HREF="mnogo.css"><META NAME="Description" CONTENT="mnoGoSearch - Full Featured Web site Open Source Search Engine Software over the Internet and Intranet Web Sites Based on SQL Database. It is a Free search software covered by GNU license."><META NAME="Keywords" CONTENT="shareware, freeware, download, internet, unix, utilities, search engine, text retrieval, knowledge retrieval, text search, information retrieval, database search, mining, intranet, webserver, index, spider, filesearch, meta, free, open source, full-text, udmsearch, website, find, opensource, search, searching, software, udmsearch, engine, indexing, system, web, ftp, http, cgi, php, SQL, MySQL, database, php3, FreeBSD, Linux, Unix, mnoGoSearch, MacOS X, Mac OS X, Windows, 2000, NT, 95, 98, GNU, GPL, url, grabbing"></HEAD ><BODY CLASS="sect1" BGCOLOR="#EEEEEE" TEXT="#000000" LINK="#000080" VLINK="#800080" ALINK="#FF0000" ><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="3" ALIGN="center" >mnoGoSearch 3.2 reference manual: Full-featured search engine software</TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="bottom" ><A HREF="msearch-srcache.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="80%" ALIGN="center" VALIGN="bottom" >Chapter 8. 
Searching documents</TD ><TD WIDTH="10%" ALIGN="right" VALIGN="bottom" ><A HREF="msearch-misc.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV >
<DIV CLASS="sect1" ><H1 CLASS="sect1" ><A NAME="fuzzy" >Fuzzy search</A ></H1 >
<DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="ispell" >Ispell <A NAME="AEN3657" ></A ></A ></H2 >
<P >When mnoGoSearch is used with ispell support, all words are normalized, which makes it possible to find different grammatical forms of the same word. During indexing, all words are stored in the database as is. During the search, all forms of the given keyword are selected and taken into account. For example, the search front end will also try to find the word "test" if "testing" or "tests" is given in the search query. </P >
<DIV CLASS="sect3" ><H3 CLASS="sect3" ><A NAME="typesispellfiles" >Two types of ispell files</A ></H3 >
<P >mnoGoSearch understands two types of ispell files: affixes and dictionaries. An ispell affix file contains the rules for deriving word forms and has approximately the following format: <PRE CLASS="programlisting" >Flag V:
       E       &gt; -E, IVE      # As in create &gt; creative
       [^E]    &gt; IVE          # As in prevent &gt; preventive
Flag *N:
       E       &gt; -E, ION      # As in create &gt; creation
       Y       &gt; -Y, ICATION  # As in multiply &gt; multiplication
       [^EY]   &gt; EN           # As in fall &gt; fallen</PRE > </P >
<P >An ispell dictionary file contains the words themselves and has the following format: <PRE CLASS="programlisting" >wop/S
word/DGJMS
wordage/S
wordbook
wordily
wordless/P</PRE > </P ></DIV >
<DIV CLASS="sect3" ><H3 CLASS="sect3" ><A NAME="using-ispell" >Using Ispell</A ></H3 >
<P >To make mnoGoSearch support ispell, you must specify the Affix and Spell commands in the <TT CLASS="filename" >search.htm</TT > file. The commands have the following format: <PRE CLASS="programlisting" >Affix [lang] [charset] [ispell affixes file name]
Spell [lang] [charset] [ispell dictionary filename]</PRE > </P >
<P >The first parameter of both commands is a two-letter language abbreviation. The second is the charset of the ispell files. 
The third is the file name. File names are relative to the mnoGoSearch <TT CLASS="literal" >/etc</TT > directory. Absolute paths can also be specified.</P >
<DIV CLASS="note" ><BLOCKQUOTE CLASS="note" ><P ><B >Note: </B >Several languages can be loaded simultaneously, e.g.: <PRE CLASS="programlisting" >Affix en iso-8859-1 en.aff
Spell en iso-8859-1 en.dict
Affix de iso-8859-1 de.aff
Spell de iso-8859-1 de.dict</PRE > </P >
<P >This loads support for both English and German.</P ></BLOCKQUOTE ></DIV >
<P >If you use <TT CLASS="literal" >searchd</TT >, add the same commands to <TT CLASS="filename" >searchd.conf</TT >.</P >
<P >When mnoGoSearch is used with ispell support, it is recommended to use <TT CLASS="literal" >searchd</TT >, especially when several languages are loaded. Otherwise the startup time of <TT CLASS="filename" >search.cgi</TT > increases.</P ></DIV >
<DIV CLASS="sect3" ><H3 CLASS="sect3" ><A NAME="addwords-dict" >Customizing dictionary</A ></H3 >
<P >Your site may contain rare words that are not in the ispell dictionaries. You may list such words in a plain text file with the following format (one word per line): <PRE CLASS="programlisting" >rare.dict:
----------
webmaster
intranet
.......
www
http
---------</PRE > </P >
<P >You may also use ispell flags in this file (for ispell flags, refer to the ISpell documentation). This saves you from listing the same word with different endings, for example "webmaster" and "webmasters", in the rare words file. You may pick a word with the same inflection rules from an existing ispell dictionary and simply copy its flags. 
For example, the English dictionary has this line:</P >
<P > <TT CLASS="literal" >postmaster/MS</TT > </P >
<P >So webmaster with the MS flags will probably be OK:</P >
<P > <TT CLASS="literal" >webmaster/MS</TT > </P >
<P >Then copy this file to the /etc directory of mnoGoSearch and add it with a Spell command in the ISpell tab of mnoGoSearch.</P >
<P >During the next reindexing of all documents, the new words will be treated as correctly spelled. Only the really incorrect words will remain.</P ></DIV ></DIV >
<DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="synonyms" >Synonyms <A NAME="AEN3697" ></A ></A ></H2 >
<P >Starting with mnoGoSearch version 3.2, synonym-based fuzzy search is supported.</P >
<P >Synonym files are installed into the <TT CLASS="filename" >etc/synonym</TT > subdirectory of the mnoGoSearch installation.</P >
<P ><A NAME="AEN3703" ></A > To enable synonyms, add commands like <TT CLASS="literal" >Synonym &lt;filename&gt;</TT > to the <TT CLASS="filename" >search.htm</TT > search template, e.g.: <PRE CLASS="programlisting" >Synonym synonym/english.syn
Synonym synonym/russian.syn</PRE > </P >
<P >File names are relative to the <TT CLASS="filename" >etc</TT > directory of the mnoGoSearch installation, or absolute if they begin with /.</P >
<P >If you use <TT CLASS="literal" >searchd</TT >, add the same commands to <TT CLASS="filename" >searchd.conf</TT >.</P >
<P >Please feel free to send us your own synonym lists at <TT CLASS="email" >&lt;<A HREF="mailto:devel@mnogosearch.org" >devel@mnogosearch.org</A >&gt;</TT >. You may use the English synonyms file as an example. 
At the beginning of the list, please specify the following two commands: <PRE CLASS="programlisting" >Language: en
Charset: us-ascii</PRE > </P >
<P ></P ><UL ><LI ><P > <TT CLASS="varname" >Language</TT > - standard (ISO 639) two-letter language abbreviation.</P ></LI ><LI ><P > <TT CLASS="varname" >Charset</TT > - any charset supported by mnoGoSearch (see <A HREF="msearch-international.html#charset" >the Section called <I >Character sets <A NAME="AEN2429" ></A ></I > in Chapter 7</A >).</P ></LI ></UL ></DIV ></DIV >
<DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="msearch-srcache.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="msearch-misc.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >Search results cache <A NAME="AEN3638" ></A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="msearch-doingsearch.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >Miscellaneous</TD ></TR ></TABLE ></DIV ></BODY ></HTML >