<HTML
><HEAD
><TITLE
>Searching documents</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.73
"><LINK
REL="HOME"
TITLE="mnoGoSearch 3.2 reference manual"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="Segmenters for chinese and japanese languages
"
HREF="msearch-cjk.html"><LINK
REL="NEXT"
TITLE="How to write search result templates

"
HREF="msearch-templates.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="mnogo.css"><META
NAME="Description"
CONTENT="mnoGoSearch - Full Featured Web site Open Source Search Engine Software over the Internet and Intranet Web Sites Based on SQL Database. It is a Free search software covered by GNU license."><META
NAME="Keywords"
CONTENT="shareware, freeware, download, internet, unix, utilities, search engine, text retrieval, knowledge retrieval, text search, information retrieval, database search, mining, intranet, webserver, index, spider, filesearch, meta, free, open source, full-text, udmsearch, website, find, opensource, search, searching, software, udmsearch, engine, indexing, system, web, ftp, http, cgi, php, SQL, MySQL, database, php3, FreeBSD, Linux, Unix, mnoGoSearch, MacOS X, Mac OS X, Windows, 2000, NT, 95, 98, GNU, GPL, url, grabbing"></HEAD
><BODY
CLASS="chapter"
BGCOLOR="#EEEEEE"
TEXT="#000000"
LINK="#000080"
VLINK="#800080"
ALINK="#FF0000"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>mnoGoSearch 3.2 reference manual: Full-featured search engine software</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="msearch-cjk.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="msearch-templates.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="chapter"
><H1
><A
NAME="doingsearch"
>Chapter 8. Searching documents</A
></H1
><DIV
CLASS="TOC"
><DL
><DT
><B
>Table of Contents</B
></DT
><DT
><A
HREF="msearch-doingsearch.html#search"
>Using search front-ends</A
></DT
><DT
><A
HREF="msearch-templates.html"
>How to write search result templates
<A
NAME="AEN3083"
></A
></A
></DT
><DT
><A
HREF="msearch-html.html"
>Designing search.html</A
></DT
><DT
><A
HREF="msearch-rel.html"
>Relevancy
<A
NAME="AEN3548"
></A
></A
></DT
><DT
><A
HREF="msearch-track.html"
>Search queries tracking
<A
NAME="AEN3618"
></A
></A
></DT
><DT
><A
HREF="msearch-srcache.html"
>Search results cache
<A
NAME="AEN3638"
></A
></A
></DT
><DT
><A
HREF="msearch-fuzzy.html"
>Fuzzy search</A
></DT
></DL
></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="search"
>Using search front-ends</A
></H1
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-perform"
>Performing search</A
></H2
><P
>Open your preferred front-end in a Web browser:
		<PRE
CLASS="programlisting"
>&#13;http://your.web.server/path/to/search.cgi
or
http://your.web.server/path/to/search.php3
or
http://your.web.server/path/to/search.pl
</PRE
>
</P
><P
>To find something, just type the words you want to
find and press the SUBMIT button. For example, "<TT
CLASS="userinput"
><B
>mysql
odbc</B
></TT
>". Do not
type the quotes " in the query; they are written here only to separate the
query from the surrounding text. mnoGoSearch will find all documents that
contain the word "mysql" and/or the word "odbc". The best documents, those with higher
weights, will be displayed first.</P
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-params"
>Search parameters
<A
NAME="AEN2947"
></A
></A
></H2
><P
>mnoGoSearch front-ends support the following
parameters, passed in the CGI query string. You may use them in an HTML form on
the search page.</P
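><P
>As an illustration, a query string carrying these parameters can be assembled with a short script. The following Python sketch is not part of mnoGoSearch: the function name build_search_url is hypothetical, only the q, ps, np and m parameters are used, and clamping the page size on the client side is an assumption made for the example.</P
><PRE
CLASS="programlisting"
>
```python
from urllib.parse import urlencode

MAX_PS = 100  # maximum page size documented for the "ps" parameter


def build_search_url(base, query, page_size=20, page=0, mode="all"):
    """Return a front-end URL carrying the q, ps, np and m parameters."""
    if mode not in ("all", "any", "bool"):
        raise ValueError("unsupported search mode: " + mode)
    # Keep the page size between 1 and the documented maximum
    # (the lower bound of 1 is an assumption of this sketch).
    page_size = max(1, min(page_size, MAX_PS))
    params = {"q": query, "ps": page_size, "np": page, "m": mode}
    # urlencode() escapes the values, e.g. a space becomes "+".
    return base + "?" + urlencode(params)


print(build_search_url("http://your.web.server/path/to/search.cgi",
                       "mysql odbc"))
```
</PRE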
><DIV
CLASS="table"
><A
NAME="AEN2950"
></A
><P
><B
>Table 8-1. Available search parameters</B
></P
><TABLE
BORDER="1"
CLASS="CALSTABLE"
><TBODY
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>q</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>text parameter with search query</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>ps</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>page size:
the number of search results displayed on one page, 20 by default. The maximum
page size is 100. This limit keeps very large page
sizes from overloading the server; it can be changed through the MAX_PS
definition in <TT
CLASS="filename"
>search.c</TT
>.

</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>np</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>page number, 0 by default (first page)</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>m</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>search mode. Currently the values "all", "any" and "bool" are supported.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>wm</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>word match. You
may use this parameter to choose the word match type. The values
"wrd", "beg", "end" and "sub" mean, respectively, whole
word, word beginning, word ending and word substring match.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>t</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>tag
limit. Limits the search to documents with the given tag. This
parameter has the same effect as the -t indexer option.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>cat</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Category
limit. Take a look into <A
HREF="msearch-subsections.html#categories"
>the Section called <I
>Categories
<A
NAME="AEN2372"
></A
></I
> in Chapter 6</A
> for details.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>ul</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>URL limit: a URL
substring that limits the search to a subsection of the database. It
supports the SQL LIKE wildcards % and _. This parameter has the same
effect as the -u indexer option. If a relative URL is specified,
<TT
CLASS="filename"
>search.cgi</TT
> inserts % signs before and after the "ul"
value when compiled with SQL support. This lets you write a plain URL substring
in the HTML form to limit the search, for example &#60;OPTION
VALUE="/manual/"&#62; instead of VALUE="%/manual/%". When a full URL with
a scheme is specified, <TT
CLASS="filename"
>search.cgi</TT
> adds a % sign only
after the value. For example, for &#60;OPTION
VALUE="http://localhost/"&#62; <TT
CLASS="filename"
>search.cgi</TT
> will
pass <TT
CLASS="literal"
>http://localhost/%</TT
> to the SQL LIKE
comparison.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>wf</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Weight
factors. This parameter changes the weights of different document sections at
search time. It should be passed as a hex number; see
the explanation below.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>g</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Language
limit. A language abbreviation that limits search results by the url.lang
field.</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>tmplt</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Template filename (without path).
Use it to specify a template file other than the standard <TT
CLASS="filename"
>search.htm</TT
>.
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>type</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Content-Type limit.
A content type that limits search results by the url.content_type field.
For cache mode storage this must be an exact match. For SQL modes it may be an SQL LIKE-style pattern.
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>sp</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Word forms limit.
=1 to search for all forms of the entered words.
=0 to search only for the words as entered. The default value is 1.
</TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>sy</TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
>Synonyms limit.
=1 to add synonyms of the entered words.
=0 to not use synonyms. The default value is 1.
</TD
></TR
></TBODY
></TABLE
></DIV
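><P
>The way the "ul" value is turned into an SQL LIKE pattern, as described in the table, can be sketched in Python. This is only an illustration, not the code of search.cgi: the function name ul_like_pattern is hypothetical, and treating any value that contains "://" as a full URL with a scheme is an assumption.</P
><PRE
CLASS="programlisting"
>
```python
def ul_like_pattern(ul):
    """Return the SQL LIKE pattern described for the "ul" parameter."""
    # Assumption of this sketch: a value containing "://" is a full URL
    # with a scheme, so a % sign is added only after it.
    if "://" in ul:
        return ul + "%"
    # A relative URL gets % signs before and after the value.
    return "%" + ul + "%"


print(ul_like_pattern("/manual/"))           # %/manual/%
print(ul_like_pattern("http://localhost/"))  # http://localhost/%
```
</PRE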
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-changeweight"
>Changing the weights of document parts at search time</A
></H2
><P
>You can pass the "wf" HTML form variable
to <TT
CLASS="filename"
>search.cgi</TT
>. The "wf" variable holds weight
factors for specific document parts. Currently the
body, title, keywords, description and url parts, crosswords, and user-defined
META and HTTP headers are supported. See the
"Section" part of <TT
CLASS="filename"
>indexer.conf-dist</TT
>.</P
><P
>To use this feature, it is recommended to
assign different section IDs to different document parts with the "Section"
<TT
CLASS="filename"
>indexer.conf</TT
> command. Currently up to 256
different sections are supported.</P
><P
>Imagine that we have these default sections in <TT
CLASS="filename"
>indexer.conf</TT
>:
		<PRE
CLASS="programlisting"
>&#13;  Section body        1  256
  Section title       2  128
  Section keywords    3  128
  Section description 4  128
</PRE
>
		</P
><P
>The "wf" value is a string of hex digits, ABCD. Each
digit is a factor for the corresponding section's weight. The rightmost
digit corresponds to section 1. For the section configuration
given above:</P
><P
CLASS="literallayout"
><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;D&nbsp;is&nbsp;a&nbsp;factor&nbsp;for&nbsp;section&nbsp;1&nbsp;(body)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;C&nbsp;is&nbsp;a&nbsp;factor&nbsp;for&nbsp;section&nbsp;2&nbsp;(title)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;B&nbsp;is&nbsp;a&nbsp;factor&nbsp;for&nbsp;section&nbsp;3&nbsp;(keywords)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;A&nbsp;is&nbsp;a&nbsp;factor&nbsp;for&nbsp;section&nbsp;4&nbsp;(description)<br>
</P
><P
>Examples:</P
><P
CLASS="literallayout"
><br>
&nbsp;&nbsp;&nbsp;wf=0001&nbsp;will&nbsp;search&nbsp;through&nbsp;body&nbsp;only.<br>
<br>
&nbsp;&nbsp;&nbsp;wf=1110&nbsp;will&nbsp;search&nbsp;through&nbsp;title,&nbsp;keywords&nbsp;and&nbsp;description&nbsp;but&nbsp;not&nbsp;<br>
&nbsp;&nbsp;&nbsp;through&nbsp;the&nbsp;body.<br>
<br>
&nbsp;&nbsp;&nbsp;wf=F421&nbsp;will&nbsp;search&nbsp;through:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Description&nbsp;with&nbsp;factor&nbsp;15&nbsp;&nbsp;(F&nbsp;hex)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Keywords&nbsp;with&nbsp;factor&nbsp;4<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Title&nbsp;with&nbsp;factor&nbsp;&nbsp;2<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Body&nbsp;with&nbsp;factor&nbsp;1<br>
</P
><P
>By default, if the "wf" variable is omitted from the
query, all section factors are 1, which means all sections have the same
weight.</P
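><P
>The digit layout of "wf" can be checked with a few lines of Python. This is a minimal sketch for the four-section configuration shown above; the helper names make_wf and parse_wf are hypothetical, and a section left out of the mapping is given factor 0 here, which excludes it from the search.</P
><PRE
CLASS="programlisting"
>
```python
def make_wf(factors, sections=4):
    """Build a "wf" string from a {section ID: factor} mapping."""
    digits = []
    # The leftmost digit belongs to the highest section ID,
    # the rightmost digit to section 1.
    for section_id in range(sections, 0, -1):
        factor = factors.get(section_id, 0)
        if factor not in range(16):
            raise ValueError("a factor must fit in one hex digit (0..15)")
        digits.append(format(factor, "X"))
    return "".join(digits)


def parse_wf(wf):
    """Return a {section ID: factor} mapping for a "wf" string."""
    return {i + 1: int(d, 16) for i, d in enumerate(reversed(wf))}


print(make_wf({1: 1}))                     # 0001
print(make_wf({2: 1, 3: 1, 4: 1}))         # 1110
print(make_wf({1: 1, 2: 2, 3: 4, 4: 15}))  # F421
print(parse_wf("F421"))
```
</PRE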
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-scriptname"
>Using front-end with an shtml page</A
></H2
><P
>When using a dynamic shtml page containing SSI
that calls <TT
CLASS="filename"
>search.cgi</TT
>,
i.e. <TT
CLASS="filename"
>search.cgi</TT
> is not called directly as a CGI
program, it is necessary to override Apache's SCRIPT_NAME environment
variable so that all the links on search pages lead to the dynamic
page and not to <TT
CLASS="filename"
>search.cgi</TT
>.</P
><P
>For example, when an shtml page contains the line
<TT
CLASS="literal"
>&#60;!--#include virtual="search.cgi" --&#62;</TT
>,
the SCRIPT_NAME variable will still point to
<TT
CLASS="filename"
>search.cgi</TT
>, not to the shtml page.</P
><P
>To override the SCRIPT_NAME variable we implemented
the UDMSEARCH_SELF variable, which you may set in Apache's
<TT
CLASS="filename"
>httpd.conf</TT
> file.
<TT
CLASS="filename"
>search.cgi</TT
> checks the UDMSEARCH_SELF variable
first and then SCRIPT_NAME. Here is an example of setting the UDMSEARCH_SELF
environment variable with the <B
CLASS="command"
>SetEnv/PassEnv</B
> commands in Apache's
<TT
CLASS="filename"
>httpd.conf</TT
>:</P
><P
>&#13;			<PRE
CLASS="programlisting"
>&#13;SetEnv UDMSEARCH_SELF /path/to/search.cgi
PassEnv UDMSEARCH_SELF
</PRE
>
		</P
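><P
>The lookup order described above (UDMSEARCH_SELF first, then SCRIPT_NAME) amounts to the following Python sketch. It only illustrates the order and is not taken from search.cgi; the function name self_url and the example paths are hypothetical.</P
><PRE
CLASS="programlisting"
>
```python
import os


def self_url(environ=None):
    """Return the URL that links on the result page should point to."""
    env = os.environ if environ is None else environ
    # UDMSEARCH_SELF is checked first; SCRIPT_NAME is the fallback.
    return env.get("UDMSEARCH_SELF") or env.get("SCRIPT_NAME", "")


print(self_url({"SCRIPT_NAME": "/cgi-bin/search.cgi"}))
print(self_url({"SCRIPT_NAME": "/cgi-bin/search.cgi",
                "UDMSEARCH_SELF": "/search.shtml"}))
```
</PRE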
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-templates"
>Using several templates</A
></H2
><P
>It is often necessary to use several templates
with the same <TT
CLASS="filename"
>search.cgi</TT
>. There are
several ways to do it. They are given here in the order in which
<TT
CLASS="filename"
>search.cgi</TT
> looks for the template name.</P
><P
></P
><OL
TYPE="1"
><LI
><P
>&#13;					<TT
CLASS="filename"
>search.cgi</TT
> checks the environment variable UDMSEARCH_TEMPLATE, so you can put the path to the desired search template into this variable.</P
></LI
><LI
><P
>&#13;					<TT
CLASS="filename"
>search.cgi</TT
> also supports Apache internal redirects. It checks the REDIRECT_STATUS and REDIRECT_URL environment variables. To enable this method, add these lines to Apache's <TT
CLASS="filename"
>srm.conf</TT
>: 

				<PRE
CLASS="programlisting"
>&#13;AddType text/html .zhtml
AddHandler zhtml .zhtml
Action zhtml /cgi-bin/search.cgi
</PRE
>
				</P
><P
>Put
<TT
CLASS="filename"
>search.cgi</TT
> into your
<TT
CLASS="filename"
>/cgi-bin/</TT
> directory. Then put an HTML template into
your site directory structure under any name with the .zhtml extension,
for example template.zhtml. Now you may open the search page:
<TT
CLASS="literal"
>http://www.site.com/path/to/template.zhtml</TT
>. You may,
of course, use any unused extension instead of .zhtml.</P
></LI
><LI
><P
>If the above two ways fail,
search.cgi opens a template with the same name as the script
being executed, taken from the SCRIPT_NAME environment
variable. <TT
CLASS="filename"
>search.cgi</TT
> will open a template
<TT
CLASS="filename"
>ETC/search.htm</TT
>, <TT
CLASS="filename"
>search1.cgi</TT
>
will open <TT
CLASS="filename"
>ETC/search1.htm</TT
> and so on, where ETC is
the mnoGoSearch /etc directory (usually
<TT
CLASS="filename"
>/usr/local/mnoGoSearch/etc</TT
>). So, you can use the
same <TT
CLASS="filename"
>search.cgi</TT
> with different templates without
having to recompile it. Just create one or several hard or symbolic
links for <TT
CLASS="filename"
>search.cgi</TT
> or copy it, and put the
corresponding search templates into the /etc directory of the mnoGoSearch
installation.</P
><P
>See also the "Making multi-language search pages" section.</P
></LI
></OL
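><P
>The three steps listed above can be summarised in a small Python sketch. It is an illustration under stated assumptions, not the actual logic of search.cgi: the function name template_path is hypothetical, and reading the template file path from PATH_TRANSLATED in the redirect case is an assumption (the text only says that REDIRECT_STATUS and REDIRECT_URL are checked).</P
><PRE
CLASS="programlisting"
>
```python
import os.path

ETC = "/usr/local/mnoGoSearch/etc"  # usual location named in the text


def template_path(env):
    """Pick a template file following the three steps listed above."""
    # 1. An explicit path in UDMSEARCH_TEMPLATE is used as is.
    if env.get("UDMSEARCH_TEMPLATE"):
        return env["UDMSEARCH_TEMPLATE"]
    # 2. Apache internal redirect. Taking the file path from
    #    PATH_TRANSLATED is an assumption of this sketch.
    if env.get("REDIRECT_STATUS") and env.get("REDIRECT_URL"):
        if env.get("PATH_TRANSLATED"):
            return env["PATH_TRANSLATED"]
    # 3. Fall back to ETC/name.htm, where name comes from SCRIPT_NAME.
    script = os.path.basename(env.get("SCRIPT_NAME", "search.cgi"))
    name = os.path.splitext(script)[0]
    return ETC + "/" + name + ".htm"


print(template_path({"SCRIPT_NAME": "/cgi-bin/search1.cgi"}))
```
</PRE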
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-bool"
>Advanced boolean search
<A
NAME="AEN3064"
></A
></A
></H2
><P
>For more advanced searches you may use the
query language. Select the "bool" search mode in the search
form.</P
><P
>mnoGoSearch understands the following boolean operators:</P
><P
><TT
CLASS="userinput"
><B
>&#38;</B
></TT
> - logical AND. For example, "mysql &#38;
odbc". mnoGoSearch will find any URLs that contain both "mysql" and
"odbc". You can also use <TT
CLASS="userinput"
><B
>+</B
></TT
> for this operator.</P
><P
><TT
CLASS="userinput"
><B
>|</B
></TT
> - logical OR. For example,
"mysql|odbc". mnoGoSearch will find any URLs that contain the word
"mysql" or the word "odbc".</P
><P
><TT
CLASS="userinput"
><B
>~</B
></TT
> - logical NOT. For example, "mysql &#38;
~odbc". mnoGoSearch will find URLs that contain the word "mysql" and do
not contain the word "odbc". Note that ~ only excludes
the given word from the results. The query "~odbc" alone will find nothing! You can also
use <TT
CLASS="userinput"
><B
>-</B
></TT
> for this operator.</P
><P
><TT
CLASS="userinput"
><B
>()</B
></TT
> - grouping, used to compose more complex
queries. For example, "(mysql | msql) &#38; ~postgres". The query language
is simple and powerful at the same time. Just treat a query as an ordinary
boolean expression.</P
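><P
>To make the semantics of the operators concrete, here is a small Python sketch that evaluates a boolean query against the set of words of a single document. It is not the mnoGoSearch parser: the function names are hypothetical, giving AND a higher precedence than OR is an assumption, and the sketch does not reproduce the rule that a purely negative query such as "~odbc" finds nothing.</P
><PRE
CLASS="programlisting"
>
```python
import re

AMP = chr(38)               # the ampersand character (logical AND)
AND_OPS = {AMP, "+"}
NOT_OPS = {"~", "-"}
SPECIAL = "()|~+-" + AMP


def tokenize(query):
    """Split a query into operators, parentheses and lowercase words."""
    special = re.escape(SPECIAL)
    pattern = "[" + special + "]|[^\\s" + special + "]+"
    return [t.lower() for t in re.findall(pattern, query)]


def matches(query, words):
    """Return True if the word set satisfies the boolean query."""
    tokens = tokenize(query) + [None]   # None marks the end of input
    pos = 0

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while tokens[pos] == "|":
            advance()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while tokens[pos] in AND_OPS:
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        tok = advance()
        if tok in NOT_OPS:
            return not parse_not()
        if tok == "(":
            value = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if tok is None or tok in SPECIAL:
            raise ValueError("word expected")
        return tok in words

    result = parse_or()
    if tokens[pos] is not None:
        raise ValueError("unexpected token: " + tokens[pos])
    return result


doc = {"mysql", "odbc", "driver"}
print(matches("(mysql | msql) " + AMP + " ~postgres", doc))  # True
print(matches("mysql " + AMP + " ~odbc", doc))               # False
```
</PRE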
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="search-exp"
>How search handles expired documents</A
></H2
><P
>Expired documents are still searchable with their old content.</P
></DIV
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="msearch-cjk.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="msearch-templates.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Segmenters for chinese and japanese languages</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
>&nbsp;</TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>How to write search result templates
<A
NAME="AEN3083"
></A
></TD
></TR
></TABLE
></DIV
></BODY
></HTML
>