<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html><head> <link rel=stylesheet type="text/css" href="anlgdocs.css"> <LINK REL="SHORTCUT ICON" HREF="favicon.ico"> <title>Readme for analog -- FAQ</title> </head> <body> [ <a href="Readme.html">Top</a> | <a href="Readme.html">Up</a> | <a href="errors.html">Prev</a> | <a href="mailing.html">Next</a> | <a href="map.html">Map</a> | <a href="indx.html">Index</a> ] <h1><img src="analogo.gif" alt=""> Analog 5.21: Frequently asked questions</h1> <hr size=2 noshade> <!-- Highest numbered question: 203 --> This list is divided into six sections: <ol type="A"> <li><a href="#startfaqcont">Getting Started</a> <li><a href="#configfaqcont">Basic Configuration</a> <li><a href="#underfaqcont">Understanding the Output</a> <li><a href="#advfaqcont">Advanced Usage</a> <li><a href="#formfaqcont">Form Interface</a> <li><a href="#designfaqcont">Design Decisions</a> </ol> <h3><a name="faqcontents">List of Questions</a></h3> <ol type="A"> <li><a name="startfaqcont" href="#startfaq">Getting Started</a> <br>See also <cite><a href="start.html">Starting to use analog</a></cite>. <ol> <li><a href="#faq100">Analog doesn't have a <kbd>setup.exe</kbd>.</a> <li><a href="#faq101">Analog just flashes up a DOS window and then quits.</a> <li><a href="#faq185">When I try and edit <kbd>analog.cfg</kbd>, Windows asks me which program I want to use to open that file.</a> <li><a href="#faq102">When I try and compile analog, it gives me an error (e.g. 
on SunOS 5).</a> <li><a href="#faq103">Analog didn't write the logfile when I ran it.</a> <li><a href="#faq105">Analog won't read extended logfiles generated by IIS.</a> <br><i>or</i> <a href="#faq105">What does "time without date" mean?</a> <li><a href="#faq106">What does "Logfile with ambiguous dates" mean?</a> <li><a href="#faq199">I tried to analyse a logfile, but I just got "Large number of corrupt lines."</a> <li><a href="#faq107">What does this error message mean?</a> <li><a href="#faq108">I tried to run analog from my browser, but it didn't work.</a> </ol> <br><li><a name="configfaqcont" href="#configfaq">Basic Configuration</a> <ol> <li><a href="#faq193">I want to analyse several logfiles together.</a> <li><a href="#faq110">I want to make several different statistics pages. Do I have to install several copies of analog?</a> <li><a href="#faq111">My <kbd>analog.cfg</kbd> included lots of <kbd>CONFIGFILE</kbd> commands, but only one output page was produced.</a> <li><a href="#faq112">Why does the Daily Report only show the last six weeks?</a> <li><a href="#faq113">Why do the time reports all list 0 requests?</a> <li><a href="#faq114">How do I get the Request Report to list files with fewer than 20 requests?</a> <li><a href="#faq115">How do I ignore accesses from my site?</a> <li><a href="#faq116">How do I ignore internal referrers in the Referrer Report?</a> <li><a href="#faq117">How do I get information on just my pages, not everybody's?</a> <li><a href="#faq118">How do I list subdirectories not just top-level directories in the Directory Report?</a> <li><a href="#faq119">How do I list minor browser versions in the Browser Summary?</a> <li><a href="#faq120">I used the command "<kbd>DIREXCLUDE /mydir/</kbd>", but files in that directory were still listed.</a> <li><a href="#faq121">I used the command "<kbd>FILEEXCLUDE /cgi-bin/script.pl</kbd>", but that file was still listed in the Request Report.</a> <li><a href="#faq122">I used the command "<kbd>IMAGEDIR 
C:\analog\images\</kbd>", but I only got broken images.</a> <li><a href="#faq196">I want to put several output pages in the same directory, but the pie charts overwrite each other.</a> <li><a href="#faq183">I want a configuration file with all of the possible configuration commands in it.</a> <li><a href="#faq191">I want to see your configuration file.</a> <li><a href="#faq123">Does the order of the commands matter in the configuration file?</a> <li><a href="#faq124">Why are my browser and referrer reports empty?</a> <li><a href="#faq125">Why isn't the Referrer Report sorted properly?</a> <li><a href="#faq127">I want to list (<i>or</i> not to list) referrers with their search arguments in the Referrer Report.</a> <li><a href="#faq141">Why are my click-thru's (<i>or</i> CGI scripts) not listed in the Request Report?</a> <li><a href="#faq184">I can't find <kbd>/script.pl?q=1</kbd> in the Request Report.</a> <li><a href="#faq126">Why can't I have <kbd>P</kbd> in the <kbd>REQCOLS</kbd>, <kbd>REQSORTBY</kbd> or <kbd>REQFLOOR</kbd>?</a> <li><a href="#faq128">Can I find out which files each referrer pointed to?</a> <br><i>or</i> <a href="#faq128">Can I find out which files each host has read?</a> <br><i>or</i> <a href="#faq128">Can I find out which hosts have read each file?</a> <br><i>or</i> <a href="#faq128">Can I find out the number of hosts visiting on each day?</a> <br><i>or <a href="#faq128">lots of similar questions.</a></i> <li><a href="#faq188">Can <kbd>SETTINGS ON</kbd> produce a configuration file instead of an English list of settings?</a> <li><a href="#faq130">I get the message "logfiles overlap" even though the two logfiles contain completely separate requests.</a> <li><a href="#faq131">Can I count the individual visitors, or visits, to my site?</a> <br><i>or</i> <a href="#faq131">Can I see how long visitors spend on my site?</a> <li><a href="#faq133">Can I change the way dates are formatted in the output?</a> <br><i>or</i> <a href="#faq133">Can I change 
some of the phrases in the output?</a> <li><a href="#faq132">Can I change the background colour of my output?</a> <li><a href="#faq190">How can I make the output prettier?</a> </ol> <br><li><a name="underfaqcont" href="#underfaq">Understanding the Output</a> <br>See also <cite><a href="meaning.html">What the results mean</a></cite>. <ol> <li><a href="#faq134">How do I find out the number of hits from your data?</a> <li><a href="#faq135">Why are there so many referrers from my own site?</a> <li><a href="#faq136">The analysis covers exactly a week, but the figures for the last seven days don't agree with the totals.</a> <li><a href="#faq137">I only have 240 requests in total. Why does analog think there are 840 requests per week?</a> <li><a href="#faq201">The pie charts don't agree with the figures in the tables.</a> <li><a href="#faq138">Why doesn't analog agree with the counter on my page?</a> <li><a href="#faq139">Why doesn't analog agree with grepping the logfile?</a> <li><a href="#faq192">Why doesn't analog agree with my other logfile analysis program?</a> <li><a href="#faq140">Why do I only get "unresolved numerical addresses" in the Domain Report?</a> <li><a href="#faq142">Why are directories listed in the Request Report?</a> <li><a href="#faq143">When someone reads one of my PDF files, it scores dozens of hits.</a> <li><a href="#faq203">Kilobytes should be 1000 bytes, not 1024 bytes.</a> <li><a href="#faq144">The Organisation Report doesn't identify organisations correctly.</a> <li><a href="#faq145">"Organization" isn't spelled correctly.</a> </ol> <br><li><a name="advfaqcont" href="#advfaq">Advanced Usage</a> <ol> <li><a href="#faq146">How can I do such-and-such with a command line option?</a> <li><a href="#faq147">I want a list of all command line arguments.</a> <li><a href="#faq148">How do I list all numerical subdomains to depth 2 in the Domain Report?</a> <li><a href="#faq189">I want to do a <kbd>HOSTEXCLUDE</kbd> on some IP addresses. 
Can I use a range like 131.111.20.1-127?</a> <li><a href="#faq181">I want to be able to count requests with status code 301 and 302 as successes, so that they appear in the Request Report.</a> <li><a href="#faq182">I want to report on a field analog doesn't know about.</a> <li><a href="#faq197">Can analog analyse Squid proxy logfiles?</a> <li><a href="#faq149">Can analog analyse FTP logfiles?</a> <li><a href="#faq187">Can analog analyse other logfiles, such as mail logs, or the syslog?</a> <li><a href="#faq150">How can I run analog automatically every day?</a> <li><a href="#faq200">How can I automatically email the results to myself or someone else?</a> <li><a href="#faq151">I'm setting up IIS. Which logfile format should I use?</a> <li><a href="#faq152">I host lots of virtual domains. How should I set up analog?</a> <li><a href="#faq153">Can I make several output pages with just one run of analog?</a> <li><a href="#faq154">I ran out of memory when trying to run analog. What can I do?</a> <li><a href="#faq155">You're processing 20,000,000 requests in under 10 minutes. Why is mine much slower?</a> <br><i>or</i> <a href="#faq155">Analog appears to stall.</a> <li><a href="#faq156">How do I make a link on my page that runs analog?</a> <li><a href="#faq157">Do I have to save all my old logfiles?</a> <br><i>or</i> <a href="#faq157">Can analog make statistics from old reports instead of reading the whole logfile again?</a> <li><a href="#faq158">Can analog write to a database or spreadsheet?</a> </ol> <br><li><a name="formfaqcont" href="#formfaq">Form Interface</a> <br>See also <cite><a href="form.html#trouble">Form troubleshooting</a></cite>. 
<ol> <li><a href="#faq159">I couldn't make the form run.</a> <li><a href="#faq160">How can I specify different logfiles from the form interface?</a> <li><a href="#faq162">My browser showed me anlgform.pl, rather than running it.</a> <li><a href="#faq163">Why does the form interface give "Document Returned no Data"?</a> <li><a href="#faq164">The images don't appear when running analog from the form interface.</a> <li><a href="#faq165">Why do I get some reports that weren't requested on the form?</a> <li><a href="#faq166">How do I make a link to <kbd>anlgform.pl</kbd> without using <kbd>anlgform.html</kbd>?</a> <li><a href="#faq179">Is there a form interface not using Perl (e.g. ASP or .exe)?</a> </ol> <br><li><a name="designfaqcont" href="#designfaq">Design Decisions</a> <ol> <li><a href="#faq167">Why doesn't the <kbd>HEADERFILE</kbd> replace the whole <kbd>&lt;head&gt;</kbd> of the output file?</a> <li><a href="#faq168">Why not use HTML tables?</a> <li><a href="#faq194">Why don't you just use one image, and scale it with the width and height attributes?</a> <li><a href="#faq169">Why are you still using HTML 2.0?</a> <li><a href="#faq195">Why not automatically spot robots by whether they request <kbd>/robots.txt</kbd>?</a> <li><a href="#faq171">Why not just do DNS resolution of the hosts that actually make it into the Host Report?</a> <li><a href="#faq172">Couldn't you do the DNS lookups faster with threads?</a> <li><a href="#faq173">Why doesn't analog analyse the error_log?</a> <li><a href="#faq174">My server lists local names in the logfile. 
Can you put a common suffix on them automatically?</a> <li><a href="#faq175">Can you extrapolate from the current month's partial data to produce a prediction for the whole month, based on the rate so far?</a> <li><a href="#faq176">Can you extend the Domain Report to say which US states people visited from?</a> <li><a href="#faq198">Please distinguish between the different BSDs in the Operating System Report.</a> <li><a href="#faq177">Why not use language codes instead of country codes for the names of the language files?</a> <li><a href="#faq186">Why doesn't analog produce statistics on "visits"?</a> <li><a href="#faq178">Why don't you sell analog?</a> </ol> </ol> <hr size=1 noshade> <h3><a name="startfaq">A. Getting Started</a></h3> Most questions in this category are answered in the section entitled <cite><a href="start.html">Starting to use analog</a></cite>. If you can't get analog running you should look there. <ol> <li><b><a name="faq100">Analog</a> doesn't have a <kbd>setup.exe</kbd>.</b> <br>No, and it doesn't need one. It's already ready to run! See <cite><a href="startpc.html">Starting to use analog under Windows</a></cite>. <li><b><a name="faq101">Analog</a> just flashes up a DOS window and then quits.</b> <br>This is the correct behaviour. It should have created an output page called <kbd>Report.html</kbd>. See <cite><a href="startpc.html">Starting to use analog under Windows</a></cite>. <li><b><a name="faq185">When</a> I try and edit <kbd>analog.cfg</kbd>, Windows asks me which program I want to use to open that file.</b> <br>Use Notepad, or any other plain text editor. <li><b><a name="faq102">When</a> I try and compile analog, it gives me an error (e.g. on SunOS 5).</b> <br>Maybe you need to edit the Makefile. There are some platform-specific notes in the section <cite><a href="startux.html">Starting to use analog on other platforms</a></cite>, and in the Makefile itself. 
<li><b><a name="faq103">Analog</a> didn't write the logfile when I ran it.</b> <br>Analog doesn't write the logfiles. Your web server writes the logfiles, and analog just reads them. See <cite><a href="start.html">Starting to use analog</a></cite>. <li><b><a name="faq105">Analog</a> won't read extended logfiles generated by IIS.</b> <br><i>or</i> <b>What does "time without date" mean?</b> <br>By default, IIS writes the date only at the top of the logfile, not on every line. But it doesn't write a new date if the date changes during the logfile, so analog can't tell which date later entries in the log occurred on. More details, and what to do about it, are in the section on <cite><a href="logfile.html#dateonly">Choosing a logfile</a></cite>. <li><b><a name="faq106">What</a> does "Logfile with ambiguous dates" mean?</b> <br>See the section on <cite><a href="errors.html#warnsF">Errors and warnings</a></cite>. <li><b><a name="faq199">I</a> tried to analyse a logfile, but I just got "Large number of corrupt lines."</b> <br>There are lots of possible reasons for this. You can find them described in the section on <cite><a href="logfile.html#corruptlines">Choosing a logfile</a></cite>. <li><b><a name="faq107">What</a> does this error message mean?</b> <br>Again, see the section on <cite><a href="errors.html">Errors and warnings</a></cite>. <li><b><a name="faq108">I</a> tried to run analog from my browser, but it didn't work.</b> <br>Analog should not be run as a CGI program, or even put in the folder with your CGI programs, for security reasons. You should use the special <a href="form.html">CGI program</a> instead. </ol> <h3><a name="configfaq">B. Basic Configuration</a></h3> Analog has lots of configuration commands, all of which are in the section on <cite><a href="custom.html">Customising analog</a></cite>. Here are some of the most common questions. If your question isn't answered here, you could also try looking in the <a href="indx.html">index</a>. 
<ol> <li><b><a name="faq193">I</a> want to analyse several logfiles together.</b> <br>Just use several <kbd>LOGFILE</kbd> commands, or wildcards in the logfile name. <li><b><a name="faq110">I</a> want to make several different statistics pages. Do I have to install several copies of analog?</b> <br>No. Just install it once, and run it with different <a href="syntax.html#CONFIGFILE">configuration files</a>. (You do have to run it once per output page though.) <li><b><a name="faq111">My</a> <kbd>analog.cfg</kbd> included lots of <kbd>CONFIGFILE</kbd> commands, but only one output page was produced.</b> <br>Analog can only produce one output page per run. To produce several reports, you have to run it several times. <li><b><a name="faq112">Why</a> does the Daily Report only show the last six weeks?</b> <br>This is controlled by the <kbd><a href="timereps.html#ROWS">FULLDAYROWS</a></kbd> command. <li><b><a name="faq113">Why</a> do the time reports all list 0 requests?</b> <br>They probably only list 0 requests for pages. Maybe you need to use <kbd><a href="include.html#PAGEINCLUDE">PAGEINCLUDE</a></kbd> to count more files as pages. <li><b><a name="faq114">How</a> do I get the Request Report to list files with fewer than 20 requests?</b> <br>Use the <kbd><a href="othreps.html#FLOOR">REQFLOOR</a></kbd> command, e.g., <kbd>REQFLOOR 10r</kbd> to list down to 10 requests. Also, if you want to list all the files not just pages, you may need to use the command <kbd>REQINCLUDE *</kbd> <li><b><a name="faq115">How</a> do I ignore accesses from my site?</b> <br>Use the <kbd><a href="include.html">HOSTEXCLUDE</a></kbd> command. <li><b><a name="faq116">How</a> do I ignore internal referrers in the Referrer Report?</b> <br>Use the <kbd><a href="include.html">REFREPEXCLUDE</a></kbd> command. <li><b><a name="faq117">How</a> do I get information on just my pages, not everybody's?</b> <br>Use the <kbd><a href="include.html">FILEINCLUDE</a></kbd> command. 
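<br>As a rough illustration only, assuming your own pages all live under a hypothetical <kbd>/~jim/</kbd> directory (substitute whatever path applies to you), the command might look like this: <pre># hypothetical path: replace /~jim/ with your own directory
FILEINCLUDE /~jim/*</pre>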
<li><b><a name="faq118">How</a> do I list subdirectories not just top-level directories in the Directory Report?</b> <br><kbd>SUBDIR */*</kbd> <li><b><a name="faq119">How</a> do I list minor browser versions in the Browser Summary?</b> <br>Use <kbd>SUBBROW */*.*</kbd> <li><b><a name="faq120">I</a> used the command "<kbd>DIREXCLUDE /mydir/</kbd>", but files in that directory were still listed.</b> <br><kbd>DIREXCLUDE</kbd> only affects the Directory Report, not the other reports. You want "<kbd>FILEEXCLUDE /mydir/*</kbd>" instead. <li><b><a name="faq121">I</a> used the command "<kbd>FILEEXCLUDE /cgi-bin/script.pl</kbd>", but that file was still listed in the Request Report.</b> <br>If the file has search arguments, you have to be a bit careful with <kbd>FILEEXCLUDE</kbd>. This is described in the section about <a href="args.html#unintuitive">search arguments</a>. <li><b><a name="faq122">I</a> used the command "<kbd>IMAGEDIR C:\analog\images\</kbd>", but I only got broken images.</b> <br>The <kbd>IMAGEDIR</kbd> command has to be a URL, not a directory on your disk. (It's just inserted into the <kbd>&lt;img&gt;</kbd> tags in the HTML output: have a look at the output and you'll see.) Also this means that the images have to be put in the part of your filespace that has your web files. <li><b><a name="faq196">I</a> want to put several output pages in the same directory, but the pie charts overwrite each other.</b> <br>You have to set the <a href="othreps.html#CHARTDIR"><kbd>CHARTDIR</kbd> and <kbd>LOCALCHARTDIR</kbd></a> to be different for each output. (You can still have all the charts in the same directory if the <kbd>CHARTDIR</kbd> and <kbd>LOCALCHARTDIR</kbd> don't end with slashes.) <li><b><a name="faq183">I</a> want a configuration file with all of the possible configuration commands in it.</b> <br>One is already distributed with the program, in the <kbd>examples</kbd> folder. 
<li><b><a name="faq191">I</a> want to see your configuration file.</b> <br>This is also included in the <kbd>examples</kbd> folder in the distribution. <li><b><a name="faq123">Does</a> the order of the commands matter in the configuration file?</b> <br>Only occasionally. If you have two of one command, the later one will generally override the earlier one. Apart from that, commands can come in any order, except that <kbd><a href="logfmt.html">LOGFORMAT</a></kbd> and <kbd><a href="output.html#TIMEOFFSET">LOGTIMEOFFSET</a></kbd> commands must come before the <kbd>LOGFILE</kbd> to which they refer. <li><b><a name="faq124">Why</a> are my browser and referrer reports empty?</b> <br>Maybe your logfile doesn't contain any browser and referrer information? <li><b><a name="faq125">Why</a> isn't the Referrer Report sorted properly?</b> <br>It is sorted properly. But <a href="args.html">search arguments</a> are also listed under the file they belong to, and this interrupts the ordering. If you set the <kbd><a href="hierreps.html#ARGSFLOOR">REFARGSFLOOR</a></kbd> high enough you won't see the search arguments. Or you can include the <a href="othreps.html#othCOLS"><kbd>N</kbd> column</a> to make the ordering more obvious. <li><b><a name="faq127">I</a> want to list (<i>or</i> not to list) referrers with their search arguments in the Referrer Report.</b> <br>To see the search arguments you may need to set the <kbd><a href="hierreps.html#ARGSFLOOR">REFARGSFLOOR</a></kbd> lower. To avoid seeing them, you could set the <kbd>REFARGSFLOOR</kbd> higher, or alternatively use the <kbd><a href="args.html#ARGSINCLUDE">REFARGSEXCLUDE</a></kbd> command to ignore them either for all files or just for particular files. <li><b><a name="faq141">Why</a> are my click-thru's (<i>or</i> CGI scripts) not listed in the Request Report?</b> <br>If they cause a redirection to another page, they will be listed in the Redirection Report, rather than the Request Report. 
<li><b><a name="faq184">I</a> can't find <kbd>/script.pl?q=1</kbd> in the Request Report.</b> <br>If it causes a redirection, it will be in the Redirection Report not the Request Report. But also, you may need to set the <kbd><a href="hierreps.html#ARGSFLOOR">REQARGSFLOOR</a></kbd> or <kbd><a href="hierreps.html#ARGSFLOOR">REDIRARGSFLOOR</a></kbd> lower to actually see it. <li><b><a name="faq126">Why</a> can't I have <kbd>P</kbd> in the <kbd>REQCOLS</kbd>, <kbd>REQSORTBY</kbd> or <kbd>REQFLOOR</kbd>?</b> <br>The number of page requests doesn't make sense in the Request Report because it's either the same as the number of requests (if the file is a page) or zero (if it isn't). If you want to list only pages in this report, use <kbd>REQINCLUDE pages</kbd> instead. <li><b><a name="faq128">Can</a> I find out which files each referrer pointed to?</b> <br><i>or</i> <b>Can I find out which files each host has read?</b> <br><i>or</i> <b>Can I find out which hosts have read each file?</b> <br><i>or</i> <b>Can I find out the number of hosts visiting on each day?</b> <br><i>or <b>lots of similar questions.</b></i> <br>There are lots of questions like this. They all want analog to cross-reference two sorts of item (e.g. files and referrers in the first example above, or hosts and dates in the last). Granted, these would be useful. But it is fundamental to analog's speed and minimal memory requirement that it only records statistics for each type of item individually, and doesn't record enough information to cross-reference them afterwards. <br>What you can do is to restrict the analysis to just requests from certain referrers (for example) with the <kbd><a href="include.html">REFINCLUDE</a></kbd> command, or to a particular time period with <a href="include.html#FROMTO"><kbd>FROM</kbd> and <kbd>TO</kbd></a>. This is usually good enough. <li><b><a name="faq188">Can</a> <kbd>SETTINGS ON</kbd> produce a configuration file instead of an English list of settings?</b> <br>No. 
But it does tell you which configuration files it read, so you can just get the commands out of them. Or if you want a list of all configuration commands, there is one in the <kbd>examples</kbd> directory. <li><b><a name="faq130">I</a> get the message "logfiles overlap" even though the two logfiles contain completely separate requests.</b> <br>This message is based only on the dates of the files, not the contents. If you're sure there is no problem, you can turn it off with the command <kbd><a href="debug.html">WARNINGS -L</a></kbd>. <li><b><a name="faq131">Can</a> I count the individual visitors, or visits, to my site?</b> <br><i>or</i> <b>Can I see how long visitors spend on my site?</b> <br>No, it's not technically possible, and don't believe any program which tells you it is. See the section on <cite><a href="webworks.html">How the web works</a></cite> for details. <li><b><a name="faq133">Can</a> I change the way dates are formatted in the output?</b> <br><i>or</i> <b>Can I change some of the phrases in the output?</b> <br>Yes, by editing the <a href="output.html#LANGUAGE">language file</a>. <li><b><a name="faq132">Can</a> I change the background colour of my output?</b> <br>Yes. The correct way to do this is to write a style sheet, and then use the <kbd><a href="output.html#STYLESHEET">STYLESHEET</a></kbd> command. <li><b><a name="faq190">How</a> can I make the output prettier?</b> <br>There are some programs on the <a href="helpers.html">helper applications page</a> to do this. </ol> <h3><a name="underfaq">C. Understanding the Output</a></h3> Most of the questions in this category are answered in the section on <cite><a href="meaning.html">What the results mean</a></cite>, which I really recommend you read if you want to understand what analog is telling you. <ol> <li><b><a name="faq134">How</a> do I find out the number of hits from your data?</b> <br>I don't use the word <i>hits</i>, because people use it in different ways, so it's misleading. 
I use <i>requests</i> for the number of transfers of any type of file (text, graphics, ...), and <i>page requests</i> for the number of transfers of HTML pages. See the section on <cite><a href="defns.html">Analog's definitions</a></cite> for more information. <li><b><a name="faq135">Why</a> are there so many referrers from my own site?</b> <br>These come from all the internal links on your site, and all the graphics on your pages. See the section on <cite><a href="webworks.html">How the web works</a></cite> for more information. If you don't want to see them, you can use <kbd><a href="include.html#outputexcludes">REFREPEXCLUDE</a></kbd> to exclude them. <li><b><a name="faq136">The</a> analysis covers exactly a week, but the figures for the last seven days don't agree with the totals.</b> <br>The figures in parentheses are for the seven days <i>before the time the program was run</i>, unless there is a <kbd>TO</kbd> command. They are <i>never</i> for the seven days before the end of the logfile. (Although if you know that the logfile only contains entries up to a certain time, you may want to include a <kbd>TO</kbd> command for that time to get the last seven days' data right.) <li><b><a name="faq137">I</a> only have 240 requests in total. Why does analog think there are 840 requests per week?</b> <br>If you have 240 requests in two days, that's a rate of 840 requests per week. Just like if you drove 28 miles in 20 minutes, you'd have driven at 84 miles per hour. <li><b><a name="faq201">The</a> pie charts don't agree with the figures in the tables.</b> <br>Possibly you are looking at out-of-date images. Make sure to reload the images as well as the text. Also, if you are running analog several times, make sure to use <a href="othreps.html#CHARTDIR"><kbd>CHARTDIR</kbd> and <kbd>LOCALCHARTDIR</kbd></a> to stop the images for the different runs overwriting each other. 
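<br>A minimal sketch of how two runs might share one image directory without clashing. The paths and the <kbd>site1-</kbd> and <kbd>site2-</kbd> prefixes are assumptions made up for illustration, not defaults, so check them against your own web server layout. In the configuration file for the first run: <pre># hypothetical URL prefix and matching local path for run 1
CHARTDIR /images/site1-
LOCALCHARTDIR /usr/local/apache/htdocs/images/site1-</pre> and in the configuration file for the second run: <pre># hypothetical URL prefix and matching local path for run 2
CHARTDIR /images/site2-
LOCALCHARTDIR /usr/local/apache/htdocs/images/site2-</pre> Because neither setting ends with a slash, the last part should act as a filename prefix, so both sets of charts can sit in the same directory.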
<li><b><a name="faq138">Why</a> doesn't analog agree with the counter on my page?</b> <br>There are lots of possible reasons. Do they both start from the same date? Are you just looking at requests for that one page with analog, not for all your other pages and graphics? Also, analog will record all requests to that page; if it's a graphic, your counter will only measure requests from people on graphical browsers that reached that place on the page. <li><b><a name="faq139">Why</a> doesn't analog agree with grepping the logfile?</b> <br>Have you understood <a href="defns.html">what analog includes</a> in its counts? In particular, most reports only list "successful" requests (HTTP status codes 200-299 &amp; 304). A naïve grep would count failures too. <li><b><a name="faq192">Why</a> doesn't analog agree with my other logfile analysis program?</b> <br>Small differences can be put down to different parsing. But if you are seeing large differences, you have to understand what analog counts, and what the other program counts. For example, some programs count HTTP status codes 301 &amp; 302 as successes, whereas I think that to do so gives <a href="#faq181">extremely misleading results</a>. <li><b><a name="faq140">Why</a> do I only get "unresolved numerical addresses" in the Domain Report?</b> <br>Your server only records the numerical IP address of the hosts that contact you, not their names. Read the section about <cite><a href="dns.html">DNS lookups</a></cite>, or turn DNS resolution on in your server. <li><b><a name="faq142">Why</a> are directories listed in the Request Report?</b> <br>They are not directories, they are pages with the same name as the directory. For example, I have both a directory called <kbd>/analog/</kbd> and a page called <kbd>/analog/</kbd> (which happens to be the same as <kbd>/analog/index.html</kbd>). 
<li><b><a name="faq143">When</a> someone reads one of my PDF files, it scores dozens of hits.</b> <br>PDF files are often downloaded and read one page at a time, and each page will then count as a separate request. Although this is not ideal, it's much less clear what to do about it. Analog has no way of knowing how many pages constituted a single download in the reader's mind. As usual, we can only reliably report how many requests there were at the server, not guess what users did with the file later. <li><b><a name="faq203">Kilobytes</a> should be 1000 bytes, not 1024 bytes.</b> <br>Personally I think that whatever 1024 bytes should have been called originally, it's stupid to try and change established usage now. But we don't need to argue about it. Analog's kilobytes are 1024 bytes, but if you prefer to call them kibibytes, you can do so by <a href="output.html#LANGUAGE">editing your language file</a>. <li><b><a name="faq144">The</a> Organisation Report doesn't identify organisations correctly.</b> <br>The rules I use are described in the section on <cite><a href="domfile.html#orgrules">The domains file</a></cite>. I admit they aren't perfect, but this is because in domains in which organisations aren't all at the same level in the domain hierarchy, there is no way to identify them perfectly without long lists. <li><b><a name="faq145">"Organization"</a> isn't spelled correctly.</b> <br>Yes it is. If you want American spellings, you have to specify <pre>LANGUAGE US-ENGLISH</pre> in your configuration file. </ol> <h3><a name="advfaq">D. Advanced Usage</a></h3> <ol> <li><b><a name="faq146">How</a> can I do such-and-such with a command line option?</b> <br>Use the <kbd><a href="syntax.html#plusC">+C</a></kbd> option to put any configuration command on the command line. <li><b><a name="faq147">I</a> want a list of all command line arguments.</b> <br>There is a list in the <a href="indx.html#clargs">index</a>. 
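<br>As an example of the most general of these, the <kbd>+C</kbd> option passes a single configuration command on the command line. The sketch below reuses the <kbd>REQFLOOR 10r</kbd> setting from the Basic Configuration section; the quoting shown assumes a Unix-style shell and may need adjusting elsewhere: <pre>analog +C"REQFLOOR 10r"</pre>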
<li><b><a name="faq148">How</a> do I list all numerical subdomains to depth 2 in the Domain Report?</b> <br><kbd>SUBDOMAIN *.*</kbd> deliberately only lists the top-level numerical subdomains to avoid cluttering the output. <kbd>SUBDOMAIN *.*.*</kbd> will work but will list everything else to depth 3. So the best solution is <pre>SUBDOMAIN 1*.*,2*.*,3*.*,...</pre> <li><b><a name="faq189">I</a> want to do a <kbd>HOSTEXCLUDE</kbd> on some IP addresses. Can I use a range like 131.111.20.1-127?</b> <br>No, but you can use wildcards or regular expressions, which allows you to specify most ranges very quickly. <li><b><a name="faq181">I</a> want to be able to count requests with status code 301 and 302 as successes, so that they appear in the Request Report.</b> <br>No, you really don't, because that would lead to double counting when a request for <kbd>/dir</kbd> (code 301) is redirected to <kbd>/dir/</kbd> (code 200). For CGI scripts etc. look in the Redirection Report instead of the Request Report. <li><b><a name="faq182">I</a> want to report on a field analog doesn't know about.</b> <br>Use the following kludge. Write a <kbd><a href="logfmt.html">LOGFORMAT</a></kbd> to declare the field to be a virtual host or a user (whichever you aren't already using). Then edit your language file so that the right text is output. <li><b><a name="faq197">Can</a> analog analyse Squid proxy logfiles?</b> <br>It can analyse Squid's common log format, although Squid uses some extra HTTP status codes which will be rejected as corrupt by analog. But really you want to know different statistics from a proxy log, such as percentage of requests retrieved from cache, and you might be better to use Squid's native format and a tool specifically designed to analyse it such as <a href="http://cord.de/tools/squid/calamaris/">Calamaris</a>. <li><b><a name="faq149">Can</a> analog analyse FTP logfiles?</b> <br>Yes. 
If you are using the xferlog format, then there is a configuration file to help you in the <kbd>examples</kbd> directory. Otherwise you will have to write your own <kbd><a href="logfmt.html">LOGFORMAT</a></kbd>. (You probably won't be able to read anything other than the lines corresponding to file transfers.) <li><b><a name="faq187">Can</a> analog analyse other logfiles, such as mail logs, or the syslog?</b> <br>Yes and no. For mail logs, there is a program on the <a href="helpers.html">helper applications page</a> to help you. For other logs, you can get some results out by writing your own <kbd><a href="logfmt.html">LOGFORMAT</a></kbd>. But analog does make some assumptions about the sort of information it expects on a logfile line, and the further these assumptions are from being met, the harder it will be! <li><b><a name="faq150">How</a> can I run analog automatically every day?</b> <br>This depends on your particular machine. On Unix, you need to run analog as a cron job (see <kbd>man cron</kbd>). This is my cron command to run it at 1:50am every day: <br><kbd>50 1 * * * $HOME/bin/analog</kbd> <br>On Windows NT you can do the same with the <kbd>at</kbd> command. (It's probably easiest to put it in a batch job; also only an administrator can run <kbd>at</kbd>.) On Windows 98, it should be possible with the Task Scheduler, although I haven't tried it. On Windows 95 it's not possible as far as I know. <br>On Mac, there are programs called <a href="http://hyperarchive.lcs.mit.edu/HyperArchive/Archive/cfg/chris-cron-10a7.hqx">Cron</a> or <a href="http://hyperarchive.lcs.mit.edu/HyperArchive/Archive/app/time/">CronoTask</a> to do this. <li><b><a name="faq200">How</a> can I automatically email the results to myself or someone else?</b> <br>Again, this depends on your operating system. 
On Unix, it's easy: <pre>analog +a +O- | mailx -s"Subject" someuser@somewhere.com</pre> I don't know about other operating systems, but at the worst, you can write the output to a temporary file, and then mail that file. <li><b><a name="faq151">I'm</a> setting up IIS. Which logfile format should I use?</b> <br>The W3C format is probably best. You can turn fields on and off in this format. And it contains all the possible fields which can be logged, which the other formats do not. However, it is important to turn the date field on (it's off by default), not just to log the date once at the top: see the section on <a href="logfile.html#dateonly">problems with logfile formats</a> for why. <li><b><a name="faq152">I</a> host lots of virtual domains. How should I set up analog?</b> <br>There's a <a href="../how-to/index.html">How-To</a> which discusses this issue. There's also a file in the <kbd>examples</kbd> directory. <li><b><a name="faq153">Can</a> I make several output pages with just one run of analog?</b> <br>Not at the moment. I want to do this in a future version, but it will require some considerable work. However, depending on which options you want to vary, you may be able to avoid having to read the logfile several times by using <a href="cache.html">cache files</a>. (This is likely to be faster, but more complicated.) <li><b><a name="faq154">I</a> ran out of memory when trying to run analog. What can I do?</b> <br>See the section on <a href="lowmem.html">Coping with low memory</a>. <li><b><a name="faq155">You're</a> processing 20,000,000 requests in under 10 minutes. Why is mine much slower?</b> <br><i>or</i> <b>Analog appears to stall.</b> <br>If you have <a href="dns.html">DNS lookups</a> on, they are very slow. Otherwise, it probably depends on the speed of your computer and disks, and what other programs are running at the same time. 
You can use the <kbd><a href="debug.html#PROGRESSFREQ">PROGRESSFREQ</a></kbd> command to see if it's really stalled or whether it's just being slow. If you are running out of memory, you might find analog's <kbd><a href="lowmem.html">LOWMEM</a></kbd> commands helpful. <li><b><a name="faq156">How</a> do I make a link on my page that runs analog?</b> <br>Link to the <a href="form.html">anlgform</a> program, with the desired options. But be careful about the load on your server. <li><b><a name="faq157">Do</a> I have to save all my old logfiles?</b> <br><i>or</i> <b>Can analog make statistics from old reports instead of reading the whole logfile again?</b> <br>These questions are answered in the section about <cite><a href="cache.html">Cache files</a></cite>. <li><b><a name="faq158">Can</a> analog write to a database or spreadsheet?</b> <br>Use the <a href="compout.html">computer-readable output style</a>, which can export to CSV. Or if what you really want to do is to run analog again without re-reading the logfiles, read the section about <cite><a href="cache.html">Cache files</a></cite>. </ol> <h3><a name="formfaq">E. Form Interface</a></h3> There is also a section on <a href="form.html#trouble">troubleshooting</a> in the documentation about the form interface. <ol> <li><b><a name="faq159">I</a> couldn't make the form run.</b> <br>Have you made analog work without the form? Have you run <kbd>anlgform.pl</kbd> from the command line as explained in the section on <a href="form.html#trouble">troubleshooting</a>? <li><b><a name="faq160">How</a> can I specify different logfiles from the form interface?</b> <br>Just add a new field to the form with <kbd>name=LOGFILE</kbd> <li><b><a name="faq162">My</a> browser showed me anlgform.pl, rather than running it.</b> <br>You have to tell the server to execute the CGI program, not just send it out like it would for a normal file. Often this is done by putting it in a special <kbd>/cgi-bin/</kbd> directory. 
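<br>As a rough sketch only, assuming your server is Apache (other servers are configured quite differently) and using purely hypothetical filesystem paths, either of the following fragments in the server configuration tells it to execute such scripts rather than send them out:
<pre># Treat everything under /cgi-bin/ as a CGI program (example path only)
ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"

# Alternatively, let .pl files in one ordinary directory run as CGI
&lt;Directory "/usr/local/apache/htdocs/stats"&gt;
    Options +ExecCGI
    AddHandler cgi-script .pl
&lt;/Directory&gt;</pre>
Allowing CGI execution has security implications, so read the <a href="form.html#security">security notes</a> for the form interface before enabling it.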
<li><b><a name="faq163">Why</a> does the form interface give "Document Returned no Data"?</b> <br>If the error only appears after a delay, then the server is probably giving up before the analog process has finished running. Increase the timeout interval on the server. <li><b><a name="faq164">The</a> images don't appear when running analog from the form interface.</b> <br>For the bar charts, you probably need to set the <kbd><a href="output.html#IMAGEDIR">IMAGEDIR</a></kbd>, because if the images are in your <kbd>/cgi-bin/</kbd> directory, the server will normally try to execute them instead of just sending them out. Pie charts don't appear unless you <a href="form.html#formcharts">configure them specially</a>. <li><b><a name="faq165">Why</a> do I get some reports that weren't requested on the form?</b> <br>If a report is neither included nor excluded on the form, the system default will be used. This will depend on your configuration files and on compile-time settings. <li><b><a name="faq166">How</a> do I make a link to <kbd>anlgform.pl</kbd> without using <kbd>anlgform.html</kbd>?</b> <br><kbd>anlgform.pl</kbd> accepts the <kbd>GET</kbd> or <kbd>POST</kbd> methods of form submission. So you can make a link with the arguments passed after a question mark in the usual <kbd>GET</kbd> way. <li><b><a name="faq179">Is</a> there a form interface not using Perl (e.g. ASP or .exe)?</b> <br>There is a Windows executable version of the Perl script on the <a href="helpers.html">analog helpers page</a>. At the time of writing, I don't know of any ASP version of the anlgform program, but if someone writes one, I'll put it on the <a href="helpers.html">analog helpers page</a> too. <strong>Warning:</strong> Potential authors <em>must</em> understand CGI security issues in general, and the <a href="form.html#security">extra issues</a> about what the analog form interface must disallow, or they <em>will</em> open security holes on their system. </ol> <h3><a name="designfaq">F. 
Design Decisions</a></h3> or "Why didn't you do it this way?" <ol> <li><b><a name="faq167">Why</a> doesn't the <kbd>HEADERFILE</kbd> replace the whole <kbd>&lt;head&gt;</kbd> of the output file?</b> <br>Because you almost never get valid HTML that way. Use a <a href="output.html#STYLESHEET">style sheet</a> instead. <li><b><a name="faq168">Why</a> not use HTML tables?</b> <br>Most non-graphical browsers don't do a good job with tables. Also tables aren't available in HTML 2.0, which is the sort of HTML analog writes. <li><b><a name="faq194">Why</a> don't you just use one image, and scale it with the width and height attributes?</b> <br>Again, it doesn't work in HTML 2.0. But also, it doesn't work with the <kbd><a href="timereps.html#BARSTYLE">BARSTYLE</a></kbd> command. <li><b><a name="faq169">Why</a> are you still using HTML 2.0?</b> <br>It seems to be impossible to make my bar charts in HTML 4.0. <li><b><a name="faq195">Why</a> not automatically spot robots by whether they request <kbd>/robots.txt</kbd>?</b> <br>It's not reliable. Not all robots request <kbd>/robots.txt</kbd>, and not everything that requests <kbd>/robots.txt</kbd> is a robot. (Consider a webmaster checking his own <kbd>/robots.txt</kbd>, for example.) <li><b><a name="faq171">Why</a> not just do DNS resolution of the hosts that actually make it into the Host Report?</b> <br>There is one theoretical and one practical problem. Theoretically, the problem is that which hosts do make it into the Host Report can change when the DNS lookups have been done. And practically, this wouldn't help identify the busiest countries or organisations, which is usually what you really want to know. However, there is a Perl script on the <a href="helpers.html">helper applications page</a> to do this. 
<li><b><a name="faq172">Couldn't</a> you do the DNS lookups faster with threads?</b> <br>The problem is, the standard commands for DNS lookups are not thread-safe on many platforms, so it would involve a lot of platform-specific code. Again, there are programs for specific platforms on the <a href="helpers.html">helper applications page</a>. <li><b><a name="faq173">Why</a> doesn't analog analyse the error_log?</b> <br>The error log is intended for humans rather than computers to read. So there is no consistent format: even different versions of the same server have different formats. And there is not much need to analyse it because analog's various failure reports are good enough for almost all purposes. <li><b><a name="faq174">My</a> server lists local names in the logfile. Can you put a common suffix on them automatically?</b> <br>This wouldn't be a good idea by default, because things like "unknown" would get the suffix. You can always add them using <kbd><a href="alias.html#useraliases">HOSTALIAS</a></kbd>. (There is an example to accomplish this using regular expressions in the <a href="alias.html#aliasregexp">section about aliases</a>.) <li><b><a name="faq175">Can</a> you extrapolate from the current month's partial data to produce a prediction for the whole month, based on the rate so far?</b> <br>No. There are too many problems in trying to produce anything sensible, especially near the beginning of the month. Different days of the week and different times of day cause lots of problems. I would prefer to produce accurate raw data than suspect derived data. <li><b><a name="faq176">Can</a> you extend the Domain Report to say which US states people visited from?</b> <br>No. Some programs pretend to do this, but you can really only tell which state the computer the person was using is in, and for ISPs and other large organisations that may be quite different from where the user actually was. 
<li><b><a name="faq198">Please</a> distinguish between the different BSDs in the Operating System Report.</b> <br>Sorry, I know they're different operating systems, but I don't want to introduce any finer granularity. <li><b><a name="faq177">Why</a> not use language codes instead of country codes for the names of the language files?</b> <br>People are more familiar with the country codes, and not all of my languages have language codes anyway. Anyway, the filenames are normally invisible to the user. <li><b><a name="faq186">Why</a> doesn't analog produce statistics on "visits"?</b> <br>See <cite><a href="webworks.html">How the Web Works</a></cite>. <li><b><a name="faq178">Why</a> don't you sell analog?</b> <br>I didn't write analog for the money, and I'm happy just to see people use it. Also, because it is open source, lots of people send me ideas and code to include in future versions. How do you think I got all those languages? (Of course, if you want to send me money, or gifts in kind, or even just postcards...). </ol> <hr size=2 noshade> Go to the <a href="http://www.analog.cx/">analog home page</a>. <p> <address>Stephen Turner <br>20 February 2002</address> <p><em>Need help with analog? <a href="mailing.html">Use the analog-help mailing list</a>.</em> <p> [ <a href="Readme.html">Top</a> | <a href="Readme.html">Up</a> | <a href="errors.html">Prev</a> | <a href="mailing.html">Next</a> | <a href="map.html">Map</a> | <a href="indx.html">Index</a> ] </body> </html>