<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <HTML ><HEAD ><TITLE >mnoGoSearch performance issues </TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK REL="HOME" TITLE="mnoGoSearch 3.3.9 reference manual" HREF="index.html"><LINK REL="UP" TITLE="Storing mnoGoSearch data " HREF="msearch-howstore.html"><LINK REL="PREVIOUS" TITLE="Cache mode storage" HREF="msearch-cachemode.html"><LINK REL="NEXT" TITLE="Oracle notes " HREF="msearch-oracle.html"><LINK REL="STYLESHEET" TYPE="text/css" HREF="mnogo.css"><META NAME="Description" CONTENT="mnoGoSearch - Full Featured Web site Open Source Search Engine Software over the Internet and Intranet Web Sites Based on SQL Database. It is a Free search software covered by GNU license."><META NAME="Keywords" CONTENT="shareware, freeware, download, internet, unix, utilities, search engine, text retrieval, knowledge retrieval, text search, information retrieval, database search, mining, intranet, webserver, index, spider, filesearch, meta, free, open source, full-text, udmsearch, website, find, opensource, search, searching, software, udmsearch, engine, indexing, system, web, ftp, http, cgi, php, SQL, MySQL, database, php3, FreeBSD, Linux, Unix, mnoGoSearch, MacOS X, Mac OS X, Windows, 2000, NT, 95, 98, GNU, GPL, url, grabbing"></HEAD ><BODY CLASS="sect1" BGCOLOR="#EEEEEE" TEXT="#000000" LINK="#000080" VLINK="#800080" ALINK="#FF0000" ><!--#include virtual="body-before.html"--><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="3" ALIGN="center" ><SPAN CLASS="application" >mnoGoSearch</SPAN > 3.3.9 reference manual: Full-featured search engine software</TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="bottom" ><A HREF="msearch-cachemode.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="80%" ALIGN="center" VALIGN="bottom" >Chapter 7. Storing <SPAN CLASS="application" >mnoGoSearch</SPAN > data</TD ><TD WIDTH="10%" ALIGN="right" VALIGN="bottom" ><A HREF="msearch-oracle.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV ><DIV CLASS="sect1" ><H1 CLASS="sect1" ><A NAME="perf" >mnoGoSearch performance issues <A NAME="AEN3527" ></A ></A ></H1 ><DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="perf-mysql" >MySQL performance</A ></H2 ><P >MySQL users may declare mnoGoSearch tables with the <CODE CLASS="option" >DELAY_KEY_WRITE=1</CODE > option. This will make the updating of indexes faster, as these are not logged to disk until the file is closed. <CODE CLASS="option" >DELAY_KEY_WRITE</CODE > excludes updating indexes on disk completely. </P ><P >With it, indexes are processed only in memory and written onto disk as a last resort, by the <B CLASS="command" >FLUSH TABLES </B > command or at mysqld shutdown. This can take even minutes and impatient user can <TT CLASS="literal" >kill -9 mysql server</TT > and break index files with this. Another downside is that you should run <TT CLASS="literal" >myisamchk</TT > on these tables before you start mysqld to ensure that they are okay if something killed mysqld in the middle.</P ><P >Because of it, we didn't include this table option into the default tables structure. However, as the key information can always be generated from the data, you should not lose anything by using <CODE CLASS="option" >DELAY_KEY_WRITE</CODE >. So, use this option at your own risk.</P ></DIV ><DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="perf-optimization" >Post-indexing optimization</A ></H2 ><P >This article was supplied by Randy Winch <CODE CLASS="email" ><<A HREF="mailto:gumby@cafes.net" >gumby@cafes.net</A >></CODE > </P ><P >I have some performance numbers that some of you might find interesting. I'm using RH 6.2 with the 2.2.14-6.1.1 kernel update (allows files larger than 2 gig) and mysql 2.23.18-alpha. I have just indexed most of our site using mnoGoSearch 3.0.18: <PRE CLASS="programlisting" > mnoGoSearch statistics Status Expired Total ----------------------------- 200 821178 2052579 OK 301 797 29891 Moved Permanently 302 3 3 Moved Temporarily 304 0 7 Not Modified 400 0 99 Bad Request 403 0 7 Forbidden 404 30690 100115 Not found 500 0 1 Internal Server Error 503 0 1 Service Unavailable ----------------------------- Total 852668 2182703 </PRE > </P ><P >I optimize the data by dumping it into a file using <CODE CLASS="option" >SELECT * INTO OUTFILE</CODE >, sort it using the system sort routine into word (CRC) order and then reloading it into the database using the procedure described in the <A HREF="http://www.mysql.com/documentation/mysql/commented/manual.php?section=Insert_speed" TARGET="_top" >mysql online manual</A >.</P ><P >The performance is wonderful. My favorite test is searching for "John Smith". The optimized database version takes about 13 seconds. The raw version takes about 73 seconds.</P ><PRE CLASS="programlisting" > Search results: john : 620241 smith : 177096 Displaying documents 1-20 of total 128656 found </PRE ><P >You may also take <A HREF="http://mnogosearch.org/download.html" TARGET="_top" >optimize.sh</A > script from our site. It was written by Joe Frost <CODE CLASS="email" ><<A HREF="mailto:joe_frost@omnis-software.com" >joe_frost@omnis-software.com</A >></CODE > and implements Randy's idea.</P ></DIV ></DIV ><DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="msearch-cachemode.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="msearch-oracle.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >Cache mode storage</TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="msearch-howstore.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >Oracle notes <A NAME="AEN3556" ></A ></TD ></TR ></TABLE ></DIV ><!--#include virtual="body-after.html"--></BODY ></HTML >