<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <HTML ><HEAD ><TITLE >Mime</TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK REL="HOME" TITLE="mnoGoSearch 3.3.9 reference manual" HREF="index.html"><LINK REL="UP" TITLE="mnoGoSearch command reference" HREF="msearch-cmdref.html"><LINK REL="PREVIOUS" TITLE="MaxWordLength" HREF="msearch-cmdref-maxwordlength.html"><LINK REL="NEXT" TITLE="MinCoordFactor" HREF="msearch-cmdref-mincoordfactor.html"><LINK REL="STYLESHEET" TYPE="text/css" HREF="mnogo.css"><META NAME="Description" CONTENT="mnoGoSearch - Full Featured Web site Open Source Search Engine Software over the Internet and Intranet Web Sites Based on SQL Database. It is a Free search software covered by GNU license."><META NAME="Keywords" CONTENT="shareware, freeware, download, internet, unix, utilities, search engine, text retrieval, knowledge retrieval, text search, information retrieval, database search, mining, intranet, webserver, index, spider, filesearch, meta, free, open source, full-text, udmsearch, website, find, opensource, search, searching, software, udmsearch, engine, indexing, system, web, ftp, http, cgi, php, SQL, MySQL, database, php3, FreeBSD, Linux, Unix, mnoGoSearch, MacOS X, Mac OS X, Windows, 2000, NT, 95, 98, GNU, GPL, url, grabbing"></HEAD ><BODY CLASS="refentry" BGCOLOR="#EEEEEE" TEXT="#000000" LINK="#000080" VLINK="#800080" ALINK="#FF0000" ><!--#include virtual="body-before.html"--><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="3" ALIGN="center" ><SPAN CLASS="application" >mnoGoSearch</SPAN > 3.3.9 reference manual: Full-featured search engine software</TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="bottom" ><A HREF="msearch-cmdref-maxwordlength.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="80%" ALIGN="center" VALIGN="bottom" ></TD ><TD WIDTH="10%" 
ALIGN="right" VALIGN="bottom" ><A HREF="msearch-cmdref-mincoordfactor.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV ><H1 ><A NAME="cmdref-mime" ></A >Mime</H1 ><DIV CLASS="refnamediv" ><A NAME="AEN10620" ></A ><H2 >Name</H2 ><B CLASS="command" >Mime</B > -- defines an external parser for a given MIME type<P ><B ></B ><TT CLASS="filename" >indexer.conf</TT ></P ></DIV ><DIV CLASS="refsynopsisdiv" ><A NAME="AEN10626" ></A ><H2 >Synopsis</H2 ><P ><B CLASS="command" >Mime</B > {from_mime} {to_mime} {command line} [source]</P ></DIV ><DIV CLASS="refsect1" ><A NAME="AEN10633" ></A ><H2 >Description</H2 ><P ><B CLASS="command" >Mime</B > enables parsing of documents with MIME types other than <TT CLASS="literal" >text/plain</TT >, <TT CLASS="literal" >text/html</TT > or <TT CLASS="literal" >text/xml</TT >, which have built-in parsers. </P ><P > Documents with other MIME types can be processed with the help of <SPAN CLASS="emphasis" ><I CLASS="emphasis" ><A HREF="msearch-parsers.html" >external parsers</A ></I ></SPAN >: external programs that convert documents of arbitrary types to one of the types listed above, which are natively supported by <SPAN CLASS="application" >mnoGoSearch</SPAN >. </P ><P >The <CODE CLASS="parameter" >from_mime</CODE > and <CODE CLASS="parameter" >to_mime</CODE > parameters are standard MIME types. </P ><P > <CODE CLASS="parameter" >to_mime</CODE > should be one of the natively supported types (listed above) and can optionally include a <CODE CLASS="option" >charset=</CODE > part. If the <CODE CLASS="option" >charset=</CODE > part is omitted, the parser output is assumed to be in <B CLASS="command" ><A HREF="msearch-cmdref-localcharset.html" >LocalCharset</A ></B >. </P ><P >By default, when executing a parser, <SPAN CLASS="application" >indexer</SPAN > sends data to the parser's <TT CLASS="filename" >STDIN</TT > and reads results from the parser's <TT CLASS="filename" >STDOUT</TT >. 
</P ><P >Some parsers cannot operate on <TT CLASS="filename" >STDIN</TT > and need a file. The <CODE CLASS="parameter" >command line</CODE > parameter can contain a <CODE CLASS="varname" >$1</CODE > reference, which stands for a temporary file name. If <CODE CLASS="varname" >$1</CODE > is specified, <SPAN CLASS="application" >indexer</SPAN > creates a temporary file, writes the input data to it, and substitutes the temporary file name for the <CODE CLASS="varname" >$1</CODE > reference in the parser command line. </P ><P >The <CODE CLASS="parameter" >command line</CODE > can also use variables, for example <CODE CLASS="varname" >${URL}</CODE > or <CODE CLASS="varname" >${Content-Type}</CODE >. For the list of all available variables, see the <KBD CLASS="userinput" >indexer -v6</KBD > output, in the lines having the "<TT CLASS="literal" >Response.</TT >" prefix. </P ><P > The fourth parameter, <CODE CLASS="parameter" >source</CODE >, is optional. It specifies what kind of data is sent to the parser. By default, <SPAN CLASS="application" >indexer</SPAN > sends the raw document content. With the help of the <CODE CLASS="parameter" >source</CODE > parameter you can mix document content with other kinds of data, for example, its <ACRONYM CLASS="acronym" >URL</ACRONYM > or some <ACRONYM CLASS="acronym" >HTTP</ACRONYM > header, using the same variable notation as the <CODE CLASS="parameter" >command line</CODE > parameter. Raw content is available as <CODE CLASS="varname" >${HTTP.Content}</CODE >. <DIV CLASS="note" ><BLOCKQUOTE CLASS="note" ><P ><B >Note: </B > To make <CODE CLASS="varname" >${HTTP.Content}</CODE > available, use the <TT CLASS="literal" >Section HTTP.Content 0 0</TT > command. 
</P ></BLOCKQUOTE ></DIV > </P ></DIV ><DIV CLASS="refsect1" ><A NAME="AEN10682" ></A ><H2 >Examples</H2 ><DIV CLASS="informalexample" ><P ></P ><A NAME="AEN10684" ></A ><PRE CLASS="programlisting" >
Mime application/msword       "text/plain; charset=cp1251" "catdoc $1"
Mime application/x-troff-man  text/plain                   "deroff"
Mime text/x-postscript        text/plain                   "ps2ascii"
Mime application/pdf          text/plain                   "pdftotext $1 -"
Mime application/vnd.ms-excel text/plain                   "xls2csv $1"
Mime "text/rtf*"              text/html                    "rthc --use-stdout $1 2>/dev/null"

# A parser example with variables in its command line
Mime application/mytype       text/html                    "myparser -u ${URL} -t ${Content-Type} $1"

# Mixing content with URL and HTTP headers
Section HTTP.Content 0 0
Mime application/mytype2      text/html                    "myparser2" "${URL} # ${Content-Type} # ${HTTP.Content}"
</PRE ><P ></P ></DIV ></DIV ><DIV CLASS="refsect1" ><A NAME="AEN10686" ></A ><H2 >See also</H2 ><P > <A HREF="msearch-cmdref-addtype.html" >AddType</A >, <A HREF="msearch-cmdref-defaultcontenttype.html" >DefaultContentType</A >, <A HREF="msearch-cmdref-useremotecontenttype.html" >UseRemoteContentType</A >. </P ></DIV ><DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="msearch-cmdref-maxwordlength.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="msearch-cmdref-mincoordfactor.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >MaxWordLength</TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="msearch-cmdref.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >MinCoordFactor</TD ></TR ></TABLE ></DIV ><!--#include virtual="body-after.html"--></BODY ></HTML >