<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>Mime</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REL="HOME"
TITLE="mnoGoSearch 3.3.9 reference manual"
HREF="index.html"><LINK
REL="UP"
TITLE="mnoGoSearch command reference"
HREF="msearch-cmdref.html"><LINK
REL="PREVIOUS"
TITLE="MaxWordLength"
HREF="msearch-cmdref-maxwordlength.html"><LINK
REL="NEXT"
TITLE="MinCoordFactor"
HREF="msearch-cmdref-mincoordfactor.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="mnogo.css"><META
NAME="Description"
CONTENT="mnoGoSearch - Full Featured Web site Open Source Search Engine Software over the Internet and Intranet Web Sites Based on SQL Database. It is a Free search software covered by GNU license."><META
NAME="Keywords"
CONTENT="shareware, freeware, download, internet, unix, utilities, search engine, text retrieval, knowledge retrieval, text search, information retrieval, database search, mining, intranet, webserver, index, spider, filesearch, meta, free, open source, full-text, udmsearch, website, find, opensource, search, searching, software, udmsearch, engine, indexing, system, web, ftp, http, cgi, php, SQL, MySQL, database, php3, FreeBSD, Linux, Unix, mnoGoSearch, MacOS X, Mac OS X, Windows, 2000, NT, 95, 98, GNU, GPL, url, grabbing"></HEAD
><BODY
CLASS="refentry"
BGCOLOR="#EEEEEE"
TEXT="#000000"
LINK="#000080"
VLINK="#800080"
ALINK="#FF0000"
><!--#include virtual="body-before.html"--><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
><SPAN
CLASS="application"
>mnoGoSearch</SPAN
> 3.3.9 reference manual: Full-featured search engine software</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="msearch-cmdref-maxwordlength.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="msearch-cmdref-mincoordfactor.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><H1
><A
NAME="cmdref-mime"
></A
>Mime</H1
><DIV
CLASS="refnamediv"
><A
NAME="AEN10620"
></A
><H2
>Name</H2
><B
CLASS="command"
>Mime</B
>&nbsp;--&nbsp;defines an external parser for a given MIME type<P
><B
></B
><TT
CLASS="filename"
>indexer.conf</TT
></P
></DIV
><DIV
CLASS="refsynopsisdiv"
><A
NAME="AEN10626"
></A
><H2
>Synopsis</H2
><P
><B
CLASS="command"
>Mime</B
>  {from_mime} {to_mime} {command line} [source]</P
></DIV
><DIV
CLASS="refsect1"
><A
NAME="AEN10633"
></A
><H2
>Description</H2
><P
><B
CLASS="command"
>Mime</B
> enables parsing of documents with MIME types
    other than <TT
CLASS="literal"
>text/plain</TT
>,
    <TT
CLASS="literal"
>text/html</TT
> or <TT
CLASS="literal"
>text/xml</TT
>, which
    have built-in parsers.
    </P
><P
>    Documents with other MIME types can be processed
    with the help of
    <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
><A
HREF="msearch-parsers.html"
>external parsers</A
></I
></SPAN
> -
    external programs that convert documents of arbitrary types
    to one of the above types, which are natively supported by <SPAN
CLASS="application"
>mnoGoSearch</SPAN
>.
    </P
><P
>The <CODE
CLASS="parameter"
>from_mime</CODE
> and
    <CODE
CLASS="parameter"
>to_mime</CODE
> parameters are standard MIME types.
    </P
><P
>    <CODE
CLASS="parameter"
>to_mime</CODE
> should be one of the natively supported types (listed above)
    and can optionally include a <CODE
CLASS="option"
>charset=</CODE
> part.
    If the <CODE
CLASS="option"
>charset=</CODE
> part is omitted,
    the parser output is considered to be in 
    <B
CLASS="command"
><A
HREF="msearch-cmdref-localcharset.html"
>LocalCharset</A
></B
>.
    </P
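><DIV
CLASS="informalexample"
><P
>As a brief illustration of the two cases, the following sketch reuses
    parsers from the Examples section of this page. The first line declares
    the charset of the parser output; its <CODE
CLASS="parameter"
>to_mime</CODE
> value is quoted there, presumably because it contains a space.
    The second line omits the <CODE
CLASS="option"
>charset=</CODE
> part, so the output is treated as being in
    <B
CLASS="command"
>LocalCharset</B
>.</P
><PRE
CLASS="programlisting"
># Output of catdoc is declared to be in cp1251
Mime application/msword        "text/plain; charset=cp1251"  "catdoc $1"

# No charset= part: output of xls2csv is taken to be in LocalCharset
Mime application/vnd.ms-excel  text/plain                    "xls2csv $1"
</PRE
></DIV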
><P
>By default, when executing a parser, <SPAN
CLASS="application"
>indexer</SPAN
> sends data
    to its <TT
CLASS="filename"
>STDIN</TT
> and reads results from its <TT
CLASS="filename"
>STDOUT</TT
>.
    </P
><P
>Some parsers cannot operate on <TT
CLASS="filename"
>STDIN</TT
> and need a file.
    The <CODE
CLASS="parameter"
>command line</CODE
> parameter can contain a <CODE
CLASS="varname"
>$1</CODE
>
    reference, which stands for a temporary file name.
    If <CODE
CLASS="varname"
>$1</CODE
> is specified, <SPAN
CLASS="application"
>indexer</SPAN
> creates a temporary
    file, writes the input data to it, and substitutes the temporary
    file name in the parser command line for the <CODE
CLASS="varname"
>$1</CODE
> reference.
    </P
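><DIV
CLASS="informalexample"
><P
>The two ways of passing input can be contrasted in a short sketch
    that uses parsers from the Examples section of this page. In both lines the
    parser output is still read from <TT
CLASS="filename"
>STDOUT</TT
>; only the way the input
    reaches the parser differs. The trailing <TT
CLASS="literal"
>-</TT
> is a
    <SPAN
CLASS="application"
>pdftotext</SPAN
> convention for writing the result to standard output,
    not a <SPAN
CLASS="application"
>mnoGoSearch</SPAN
> feature.</P
><PRE
CLASS="programlisting"
># Filter-style parser: the document arrives on STDIN
Mime text/x-postscript  text/plain  "ps2ascii"

# File-based parser: indexer replaces $1 with the temporary file name;
# the trailing "-" makes pdftotext write the text to STDOUT
Mime application/pdf    text/plain  "pdftotext $1 -"
</PRE
></DIV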
><P
>The <CODE
CLASS="parameter"
>command line</CODE
> parameter can also use variables,
    for example <CODE
CLASS="varname"
>${URL}</CODE
> or <CODE
CLASS="varname"
>${Content-Type}</CODE
>.
    All available variables are listed in the <KBD
CLASS="userinput"
>indexer -v6</KBD
> output,
    on the lines that have the "<TT
CLASS="literal"
>Response.</TT
>" prefix.
    </P
><P
>    The fourth parameter <CODE
CLASS="parameter"
>source</CODE
> is optional.
    It specifies what kind of data is sent to the parser.
    By default, <SPAN
CLASS="application"
>indexer</SPAN
> sends raw document content.
    With the help of the <CODE
CLASS="parameter"
>source</CODE
> parameter, you
    can mix document content with other kinds of data,
    for example, its <ACRONYM
CLASS="acronym"
>URL</ACRONYM
> or some <ACRONYM
CLASS="acronym"
>HTTP</ACRONYM
> header,
    using the same notation as in the <CODE
CLASS="parameter"
>command line</CODE
> parameter.
    Raw content is available as <CODE
CLASS="varname"
>${HTTP.Content}</CODE
>.
    <DIV
CLASS="note"
><BLOCKQUOTE
CLASS="note"
><P
><B
>Note: </B
>
      To make <CODE
CLASS="varname"
>${HTTP.Content}</CODE
> available, use the <TT
CLASS="literal"
>Section HTTP.Content 0 0</TT
>
      command.
      </P
></BLOCKQUOTE
></DIV
>
    </P
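><DIV
CLASS="informalexample"
><P
>A minimal sketch of the <CODE
CLASS="parameter"
>source</CODE
> parameter follows. It assumes a hypothetical
    parser named <TT
CLASS="literal"
>urlparser</TT
> that expects the document <ACRONYM
CLASS="acronym"
>URL</ACRONYM
> before the
    content on its <TT
CLASS="filename"
>STDIN</TT
>. Both the parser name and the
    <TT
CLASS="literal"
>application/x-example</TT
> type are placeholders chosen for illustration
    and are not part of <SPAN
CLASS="application"
>mnoGoSearch</SPAN
>.</P
><PRE
CLASS="programlisting"
># Make ${HTTP.Content} available to the source parameter
Section HTTP.Content 0 0

# urlparser (hypothetical) is sent the URL, a space, then the raw content
Mime application/x-example  text/plain  "urlparser"  "${URL} ${HTTP.Content}"
</PRE
></DIV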
></DIV
><DIV
CLASS="refsect1"
><A
NAME="AEN10682"
></A
><H2
>Examples</H2
><DIV
CLASS="informalexample"
><P
></P
><A
NAME="AEN10684"
></A
><PRE
CLASS="programlisting"
>Mime application/msword      "text/plain; charset=cp1251"  "catdoc $1"
Mime application/x-troff-man  text/plain                    "deroff"
Mime text/x-postscript        text/plain                    "ps2ascii"
Mime application/pdf          text/plain                    "pdftotext $1 -"
Mime application/vnd.ms-excel text/plain                    "xls2csv $1"
Mime "text/rtf*"              text/html                     "rthc --use-stdout $1 2&#62;/dev/null"

# A parser example with variables in its command line
Mime application/mytype       text/html    "myparser -u ${URL} -t ${Content-Type} $1"

# Mixing content with URL and HTTP headers
Section HTTP.Content 0 0
Mime application/mytype2      text/html    "myparser2"   "${URL} # ${Content-Type} # ${HTTP.Content}"
      </PRE
><P
></P
></DIV
></DIV
><DIV
CLASS="refsect1"
><A
NAME="AEN10686"
></A
><H2
>See also</H2
><P
>    <A
HREF="msearch-cmdref-addtype.html"
>AddType</A
>,
    <A
HREF="msearch-cmdref-defaultcontenttype.html"
>DefaultContentType</A
>,
    <A
HREF="msearch-cmdref-useremotecontenttype.html"
>UseRemoteContentType</A
>.
  </P
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="msearch-cmdref-maxwordlength.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="msearch-cmdref-mincoordfactor.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>MaxWordLength</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="msearch-cmdref.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>MinCoordFactor</TD
></TR
></TABLE
></DIV
><!--#include virtual="body-after.html"--></BODY
></HTML
>