##AWFFULL_LANG=sv_SE.UTF-8 ##AWFFULL_LANGUAGE=sv_SE.UTF-8:sv /* # AWFFull - A Webalizer Fork, Full o' features # # sample.conf # Sample configuration file # # Copyright 1997-2000 by Bradford L. Barrett (brad@mrunix.net) # Copyright (C) 2004- 2008 by Stephen McInerney # (spm@stedee.id.au) # # This file is part of AWFFull. # # AWFFull is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # AWFFull is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with AWFFull. If not, see <http://www.gnu.org/licenses/>. # */ # Sample AWFFull configuration file # # This is a sample configuration file for AWFFull (v3.8.1) # Lines starting with pound signs '#' are comment lines and are # ignored. Blank lines are skipped as well. Other lines are considered # as configuration lines, and have the form "ConfigOption Value" where # ConfigOption is a valid configuration keyword, and Value is the value # to assign that configuration option. Invalid keyword/values are # ignored, with appropriate warnings being displayed. There must be # at least one space or tab between the keyword and its value. # # AWFFull will look for a 'default' configuration file # "/etc/awffull/awffull.conf", and if found, use that. # the '-c config.file' option can also be used to specify an alternate # configuration file. Or multiple configuration files, with multiple -c's. # LogFile defines the web server log file to use. If not specified # here or on on the command line, input will default to STDIN. If # the log filename ends in '.gz' (ie: a gzip compressed file), it will # be decompressed on the fly as it is being read. #LogFile /var/lib/httpd/logs/access_log LogFile /var/log/httpd/access_log # LogType defines the log type being processed. Normally, AWFFull # expects a CLF or Combined web server log as input. Using this option, # you can process ftp logs as well (xferlog as produced by wu-ftp and # others), or Squid native logs. # Values can be 'auto' 'clf', 'combined', 'ftp', 'domino' or 'squid', with # 'auto' the default. # The 'auto' value means that AWFFull will try and work out what log format # you are sending to it. If no joy, AWFFull will immediately exit. #LogType auto LogType combined # OutputDir is where you want to put the output files. This should # should be a full path name, however relative ones might work as well. # If no output directory is specified, the current directory will be used. #OutputDir . OutputDir /var/lib/awffull # HistoryName allows you to specify the name of the history file produced # by AWFFull. The history file keeps the data for up to 12 months # worth of logs, used for generating the main HTML page (index.html). # The default is a file named "awffull.hist", stored in the specified # output directory. If you specify just the filename (without a path), # it will be kept in the specified output directory. Otherwise, the path # is relative to the output directory, unless absolute (leading /). #HistoryName awffull.hist HistoryName awffull.hist # Incremental processing allows multiple partial log files to be used # instead of one huge one. Useful for large sites that have to rotate # their log files more than once a month. AWFFull will save its # internal state before exiting, and restore it the next time run, in # order to continue processing where it left off. This mode also causes # AWFFull to scan for and ignore duplicate records (records already # processed by a previous run). See the README file for additional # information. The value may be 'yes' or 'no', with a default of 'no'. # The file 'awffull.current' is used to store the current state data, # and is located in the output directory of the program (unless changed # with the IncrementalName option below). Please read at least the section # on Incremental processing in the README file before you enable this option. #Incremental no # IncrementalName allows you to specify the filename for saving the # incremental data in. It is similar to the HistoryName option where the # name is relative to the specified output directory, unless an absolute # filename is specified. The default is a file named "awffull.current" # kept in the normal output directory. If you don't specify "Incremental" # as 'yes' then this option has no meaning. #IncrementalName awffull.current # ReportTitle is the text to display as the title. The hostname # (unless blank) is appended to the end of this string (separated with # a space) to generate the final full title string. # Default is (for English) "Usage Statistics for". #ReportTitle Usage Statistics for # HostName defines the hostname for the report. This is used in # the title, and is prepended to the URL table items. This allows # clicking on URL's in the report to go to the proper location in # the event you are running the report on a 'virtual' web server, # or for a server different than the one the report resides on. # If not specified here, or on the command line, awffull will # try to get the hostname via a uname system call. If that fails, # it will default to "localhost". #HostName localhost # HTMLExtension allows you to specify the filename extension to use # for generated HTML pages. Normally, this defaults to "html", but # can be changed for sites who need it (like for PHP embedded pages). #HTMLExtension html # PageType lets you tell AWFFull what types of URL's you # consider a 'page'. Most people consider html and cgi documents # as pages, while not images and audio files. If no types are # specified, defaults will be used ('htm', 'html', 'cgi' and HTMLExtension # if different for web logs, 'txt' for ftp logs). # Putting the more likely page types first in the list should increase the # speed of a run. # Do Not Use Wildcards Here. It will not work. PageType htm PageType html PageType php #PageType pl #PageType cfm #PageType pdf #PageType txt #PageType cgi # NotPageType is the direct and incompatible opposite of PageType. # You can use one set or the other, but not both. # PageType specifies what *is* a Page, NotPageType specifies what # *isn't*, and hence by implication, everything else is a page. # Neither method is more or lessor correct than the other. It's more # what is more accurate for *your* site. # Do not add the "." or use any wildcards. As a general rule. # There are some assumed internal optimisations that may otherwise # break. # Those who understand pcre's would do well to examine the source # of parser.c if they wish to extract greater flexibility from the # below. #NotPageType gif #NotPageType css #NotPageType js #NotPageType jpg #NotPageType ico #NotPageType png # CSSFilename is used to set the name of the CSS file to use in conjunction # with the generated html. An existing file is *not* overwritten, so feel free # to make you own changes to the default file. #CSSFilename awffull.css CSSFilename awffull.css # UseHTTPS should be used if the analysis is being run on a # secure server, and links to urls should use 'https://' instead # of the default 'http://'. If you need this, set it to 'yes'. # Default is 'no'. This only changes the behaviour of the 'Top # URL's' table. #UseHTTPS no # HTMLPre defines HTML code to insert at the very beginning of the # file. Default is the DOCTYPE line shown below. Max line length # is 80 characters, so use multiple HTMLPre lines if you need more. #HTMLPre <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> # HTMLHead defines HTML code to insert within the <HEAD></HEAD> # block, immediately after the <TITLE> line. Maximum line length # is 80 characters, so use multiple lines if needed. #HTMLHead <META NAME="author" CONTENT="AWFFull"> # HTMLBody defined the HTML code to be inserted, starting with the # <BODY> tag. If not specified, the default is shown below. If # used, you MUST include your own <BODY> tag as the first line. # Maximum line length is 80 char, use multiple lines if needed. #HTMLBody <BODY BGCOLOR="#E8E8E8" TEXT="#000000" LINK="#0000FF" VLINK="#FF0000"> # HTMLPost defines the HTML code to insert immediately before the # first <HR> on the document, which is just after the title and # "summary period"-"Generated on:" lines. If anything, this should # be used to clean up in case an image was inserted with HTMLBody. # As with HTMLHead, you can define as many of these as you want and # they will be inserted in the output stream in order of appearance. # Max string size is 80 characters. Use multiple lines if you need to. #HTMLPost <BR CLEAR="all"> # HTMLTail defines the HTML code to insert at the bottom of each # HTML document, usually to include a link back to your home # page or insert a small graphic. It is inserted as a table # data element (ie: <TD> your code here </TD>) and is right # aligned with the page. Max string size is 80 characters. #HTMLTail <IMG SRC="yourlogo.png" ALT="Company XYZ!"> # HTMLEnd defines the HTML code to add at the very end of the # generated files. It defaults to what is shown below. If # used, you MUST specify the </BODY> and </HTML> closing tags # as the last lines. Max string length is 80 characters. #HTMLEnd </BODY></HTML> # TimeMe allows you to force the display of timing information # at the end of processing. A value of 'yes' will force the # timing information to be displayed. A value of 'no' has no # effect. #TimeMe no # GMTTime allows reports to show GMT (UTC) time instead of local # time. Default is to display the time the report was generated # in the timezone of the local machine, such as EDT or PST. This # keyword allows you to have times displayed in UTC instead. Use # only if you really have a good reason, since it will probably # screw up the reporting periods by however many hours your local # time zone is off of GMT. #GMTTime no # FoldSeqErr forces AWFFull to ignore sequence errors. # This is useful for Netscape and other web servers that cache # the writing of log records and do not guarantee that they # will be in chronological order. The use of the FoldSeqErr # option will cause out of sequence log records to be treated # as if they had the same time stamp as the last valid record. # Default is to ignore out of sequence log records. #FoldSeqErr no # VisitTimeout allows you to set the default timeout for a visit # (sometimes called a 'session'). The default is 30 minutes, # which should be fine for most sites. # Visits are determined by looking at the time of the current # request, and the time of the last request from the site. If # the time difference is greater than the VisitTimeout value, it # is considered a new visit, and visit totals are incremented. # Value is the number of seconds to timeout (default=1800=30min) #VisitTimeout 1800 # IgnoreHist shouldn't be used in a config file, but it is here # just because it might be useful in certain situations. If the # history file is ignored, the main "index.html" file will only # report on the current log files contents. Useful only when you # want to reproduce the reports from scratch. USE WITH CAUTION! # Valid values are "yes" or "no". Default is "no". #IgnoreHist no # TrackPartialRequests is used to track 206 codes. This gives two # additional columns in the Top URLs tables. # The first to "Hits" counts the number of partial requests # The second to "Volume" counts the volume in partial requests # This option is more of use to those with lots of PDF's. #TrackPartialRequests no # CountryGraph allows the usage by country graph to be disabled. # Values can be 'yes' or 'no', default is 'yes'. #CountryGraph yes # GeoIP enables or disables the use of the GeoIP capability for more # accurate detection of countries. Default is 'no'. # NOTE! Do not enable GeoIP if you analyse files that have had the IP Address # translated to a Fully Qualified Host Name. # Use either raw IP Addresses and GeoIP, or Names and disable GeoIP. # ie. Don't use GeoIP AND DNSHistory. #GeoIP no GeoIP yes # GeoIPDatabase is the location of the GeoIP database file. Default is # '/usr/local/share/GeoIP/GeoIP.dat', which is where a default GeoIP # install will put it. Note that the database is updated monthly. # For the details see: http://www.maxmind.com/app/geoip_country #GeoIPDatabase /usr/local/share/GeoIP/GeoIP.dat GeoIPDatabase /usr/share/GeoIP/GeoIP.dat # FlagsLocation will enable the display of country flags in the country # table. The path is that for a webserver, not file system. Can be # relative or complete. The trailing slash is not necessary. #FlagsLocation flags FlagsLocation /flags # DailyGraph and DailyStats allows the daily statistics graph # and statistics table to be disabled (not displayed). Values # may be "yes" or "no". Default is "yes". #DailyGraph yes #DailyStats yes # HourlyGraph and HourlyStats allows the hourly statistics graph # and statistics table to be disabled (not displayed). Values # may be "yes" or "no". Default is "yes". #HourlyGraph yes #HourlyStats yes # TopURLsbyHITsGraph - Display a pie chart of the top URLs by HITS #TopURLsbyHitsGraph yes #TopURLsbyVolGraph yes # TopExitPagesGraph - Display Top Exit Pages Pie Chart # no for don't display # hits for by hits # visits for by visits #TopExitPagesGraph visits # TopEntryPagesGraph - Display Top Entry Pages Pie Chart # no for don't display # hits for by hits # visits for by visits #TopEntryPagesGraph visits # TopSitesbyPagesGraph - Display a pie chart of the Top Sites by Page Impressions #TopSitesbyPagesGraph yes # TopSitesbyVolGraph - Display a pie chart of the Top Sites by Page Impressions #TopSitesbyVolGraph yes # TopAgentsGraph - Display a pie chart of the Top User Agents (by pages) #TopAgentsGraph yes # GraphLegend allows the color coded legends to be turned on or off # in the graphs. The default is for them to be displayed. This only # toggles the color coded legends, the other legends are not changed. # If you think they are hideous and ugly, say 'no' here :) #GraphLegend yes # GraphLines allows you to have index lines drawn behind the graphs. # Anything other than "no" will enable the lines. #GraphLines yes # YearlySubtotals will display the subtotal for a given year in the main # page. This is in addition to the Grand Total of all years. #YearlySubtotals no # The "Top" options below define the number of entries for each table. # Defaults are Sites=30, URL's=30, Referrers=30 and Agents=15, and # Countries=30. TopKSites and TopKURLs (by KByte tables) both default # to 10, as do the top entry/exit tables (TopEntry/TopExit). The top # search strings and user names default to 20. Tables may be disabled # by using zero (0) for the value. # Top404Errors, displays a table of error requests, and the corresponding # referring URL. #TopSites 30 #TopKSites 10 #TopURLs 30 #TopKURLs 10 #TopReferrers 30 #TopAgents 15 #TopCountries 30 #TopEntry 10 #TopExit 10 #TopSearch 20 #TopUsers 20 #Top404Errors 0 # The All* keywords allow the display of all URL's, Sites, Referrers # User Agents, Search Strings and User names. If enabled, a separate # HTML page will be created, and a link will be added to the bottom # of the appropriate "Top" table. There are a couple of conditions # for this to occur.. First, there must be more items than will fit # in the "Top" table (otherwise it would just be duplicating what is # already displayed). Second, the listing will only show those items # that are normally visible, which means it will not show any hidden # items. Grouped entries will be listed first, followed by individual # items. The value for these keywords can be either 'yes' or 'no', # with the default being 'no'. Please be aware that these pages can # be quite large in size, particularly the sites page, and separate # pages are generated for each month, which can consume quite a lot # of disk space depending on the traffic to your site. # All404Errors displays a table of error requests, and the corresponding # referring URL. #AllSites no #AllURLs no #AllReferrers no #AllAgents no #AllSearchStr no #AllUsers no #All404Errors no # AWFFull normally strips the string 'index.' off the end of # URL's in order to consolidate URL totals. For example, the URL # /somedir/index.html is turned into /somedir/ which is really the # same URL. This option allows you to specify additional strings # to treat in the same way. You don't need to specify 'index.' as # it is always scanned for by AWFFull, this option is just to # specify _additional_ strings if needed. If you don't need any, # don't specify any as each string will be scanned for in EVERY # log record... A bunch of them will degrade performance. Also, # the string is scanned for anywhere in the URL, so a string of # 'home' would turn the URL /somedir/homepages/brad/home.html into # just /somedir/ which is probably not what was intended. #IndexAlias home.htm #IndexAlias homepage.htm # The opposite (in a way) of IndexAlias is IgnoreIndexAlias. # This will STOP any URL variable stripping, as well as ignoring the # default "index." setting, or any that you set above. #IgnoreIndexAlias no # The Hide*, Group* and Ignore* and Include* keywords allow you to # change the way Sites, URL's, Referrers, User Agents and User names # are manipulated. The Ignore* keywords will cause AWFFull to # completely ignore records as if they didn't exist (and thus not # counted in the main site totals). The Hide* keywords will prevent # things from being displayed in the 'Top' tables, but will still be # counted in the main totals. The Group* keywords allow grouping # similar objects as if they were one. Grouped records are displayed # in the 'Top' tables and can optionally be displayed in BOLD and/or # shaded. Groups cannot be hidden, and are not counted in the main # totals. The Group* options do not, by default, hide all the items # that it matches. If you want to hide the records that match (so just # the grouping record is displayed), follow with an identical Hide* # keyword with the same value. (see example below) In addition, # Group* keywords may have an optional label which will be displayed # instead of the keywords value. The label should be separated from # the value by at least one 'white-space' character, such as a space # or tab. # # The value can have either a leading or trailing '*' wildcard # character. If no wildcard is found, a match can occur anywhere # in the string. Given a string "www.yourmama.com", the values "your", # "*mama.com" and "www.your*" will all match. # Your own site should be hidden #HideSite *mrunix.net #HideSite localhost # Your own site gives most referrals #HideReferrer mrunix.net/ # This one hides non-referrers ("-" Direct requests) #HideReferrer Direct Request # Usually you want to hide these HideURL *.gif HideURL *.GIF HideURL *.jpg HideURL *.JPG HideURL *.png HideURL *.PNG HideURL *.ra # Hiding agents is kind of futile #HideAgent RealPlayer # You can also hide based on authenticated user name #HideUser root #HideUser admin # Grouping options #GroupURL /cgi-bin/* CGI Scripts #GroupURL /images/* Images #GroupSite *.aol.com #GroupSite *.compuserve.com #GroupReferrer yahoo.com/ Yahoo! #GroupReferrer excite.com/ Excite #GroupReferrer infoseek.com/ InfoSeek #GroupReferrer webcrawler.com/ WebCrawler #GroupUser root Admin users #GroupUser admin Admin users #GroupUser wheel Admin users # The following is a great way to get an overall total # for browsers, and not display all the detail records. # (You should use MangleAgent to refine further...) # # Simplified browser list for Webalizer. Copy & paste in awffull.conf, # replacing the original list. # # Longer version in http://griho.udl.es/webalizer/groupagent.txt # Full version in http://griho.udl.es/webalizer/webalizer.conf.txt # # Version: 1.1 14/May/2005 # # GroupAndHideAgent is equivalent to the two lines of a GroupAgent, then a HideAgent GroupAndHideAgent Googlebot Spider: Googlebot GroupAndHideAgent msnbot* Spider: MSNBot GroupAndHideAgent AppleWebKit/ Browser: Safari (OSX) GroupAndHideAgent Camino Browser: Camino (OSX) GroupAndHideAgent Epiphany Browser: Epiphany (Gentoo) GroupAndHideAgent Firebird/ Browser: Firebird GroupAndHideAgent Firefox/ Browser: Firefox GroupAndHideAgent Galeon/ Browser: Galeon GroupAndHideAgent Konqueror/ Browser: Konqueror GroupAndHideAgent Netscape6/ Browser: Netscape 6 GroupAndHideAgent Netscape/7 Browser: Netscape 7 GroupAndHideAgent Netscape/8 Browser: Netscape 8 GroupAndHideAgent rv:1. Browser: Mozilla 1.x GroupAndHideAgent Opera Browser: Opera GroupAndHideAgent Mozilla/1 Browser: Netscape v1.xx GroupAndHideAgent Mozilla/2 Browser: Netscape v2.xx GroupAndHideAgent Mozilla/3.04Gold Browser: Netscape 3.04 Gold GroupAndHideAgent Mozilla/3 Browser: Netscape v3.xx GroupAndHideAgent Mozilla/4.03 Browser: Netscape 4.03 GroupAndHideAgent Mozilla/4.04 Browser: Netscape 4.04 GroupAndHideAgent Mozilla/4.05 Browser: Netscape 4.05 GroupAndHideAgent Mozilla/4.06 Browser: Netscape 4.06 GroupAndHideAgent Mozilla/4.08 Browser: Netscape 4.08 GroupAndHideAgent Mozilla/4.5 Browser: Netscape 4.5 GroupAndHideAgent Mozilla/4.61 Browser: Netscape 4.6 (Mac/WinNT) GroupAndHideAgent Mozilla/4.6 Browser: Netscape 4.6 (Win95/Win98) GroupAndHideAgent Mozilla/4.72 Browser: Netscape 4.72 GroupAndHideAgent Mozilla/4.73 Browser: Netscape 4.73 GroupAndHideAgent Mozilla/4.75 Browser: Netscape 4.75 GroupAndHideAgent Mozilla/4.76 Browser: Netscape 4.76 GroupAndHideAgent Mozilla/4.77 Browser: Netscape 4.77 GroupAndHideAgent Mozilla/4.78 Browser: Netscape 4.78 GroupAndHideAgent Mozilla/4.79 Browser: Netscape 4.79 GroupAndHideAgent Mozilla/4.7 Browser: Netscape 4.7 GroupAndHideAgent Mozilla/4.8 Browser: Netscape 4.8 GroupAndHideAgent Mozilla/5.0 Browser: Netscape 4.8 GroupAndHideAgent "compatible; MSIE 6.0" Browser: Internet Explorer 6.0 (Win) GroupAndHideAgent "compatible; MSIE 7.01" Spambot: Pretends to be MSIE 7.01 GroupAndHideAgent "compatible; MSIE 7.0" Browser: Internet Explorer 7.0 (Win) GroupAndHideAgent "compatible; MSIE 5.5" Browser: Internet Explorer 5.5 (Win) GroupAndHideAgent "compatible; MSIE 5.01" Browser: Internet Explorer 5.01 # this 4.0 entry is matching Mozilla/4.0 which applies for every MSIE in the net, leave it commented ##GroupAgent 4.0 Browser: Internet Explorer 4.0 ##HideAgent 4.0 GroupAndHideAgent 4.5 Browser: Internet Explorer 4.5 GroupAndHideAgent 5.0 Browser: Internet Explorer 5.0 GroupAndHideAgent 5.12 Browser: Internet Explorer 5.12 (Mac) GroupAndHideAgent 5.13 Browser: Internet Explorer 5.13 (Mac) GroupAndHideAgent 5.14 Browser: Internet Explorer 5.14 (Mac) GroupAndHideAgent 5.15 Browser: Internet Explorer 5.15 (Mac) GroupAndHideAgent 5.16 Browser: Internet Explorer 5.16 (Mac) GroupAndHideAgent 5.17 Browser: Internet Explorer 5.17 (Mac) GroupAndHideAgent 5.21 Browser: Internet Explorer 5.21 (Mac) GroupAndHideAgent 5.22 Browser: Internet Explorer 5.22 (Mac) GroupAndHideAgent 5.23 Browser: Internet Explorer 5.23 (Mac) GroupAndHideAgent "compatible; MSIE 5.0" Browser: Internet Explorer 5.0 GroupAndHideAgent "compatible; MSIE 4.5" Browser: Internet Explorer 4.5 GroupAndHideAgent 3.0 Browser: Internet Explorer 3.0 (win95) GroupAndHideAgent 3.0B Browser: Internet Explorer 3.0B (win95) GroupAndHideAgent 3.01 Browser: Internet Explorer 3.01 (win95) GroupAndHideAgent 4.01 Browser: Internet Explorer 4.01 # we comment MSIE because many agents use it in their name to disguise as Internet Explorer #####GroupAgent MSIE Browser: Internet Explorer (unknown version) #####HideAgent MSIE # HideAllSites allows forcing individual sites to be hidden in the # report. This is particularly useful when used in conjunction # with the "GroupDomain" feature, but could be useful in other # situations as well, such as when you only want to display grouped # sites (with the GroupSite keywords...). The value for this # keyword can be either 'yes' or 'no', with 'no' the default, # allowing individual sites to be displayed. #HideAllSites no # The GroupDomains keyword allows you to group individual host names # into their respective domains. The value specifies the level of # grouping to perform, and can be thought of as 'the number of dots' # that will be displayed. For example, if a visiting host is named # cust1.tnt.mia.uu.net, a domain grouping of 1 will result in just # "uu.net" being displayed, while a 2 will result in "mia.uu.net". # The default value of zero disable this feature. Domains will only # be grouped if they do not match any existing "GroupSite" records, # which allows overriding this feature with your own if desired. #GroupDomains 0 # The GroupShading allows grouped rows to be shaded in the report. # Useful if you have lots of groups and individual records that # intermingle in the report, and you want to differentiate the group # records a little more. Value can be 'yes' or 'no', with 'yes' # being the default. #GroupShading yes # GroupHighlight allows the group record to be displayed in BOLD. # Can be either 'yes' or 'no' with the default 'yes'. #GroupHighlight yes # Segmenting - segXXX # Segmenting is a bit like the Ignore* and Include* keywords. Where it # differs is in "remembering". Such that, as a "session" moves away from # the original condition, that session is still tracked. # So if you segment on a referral from Google, only sessions that were # referred to the site from Google will be tracked. Even as they access # other pages within the site. # eg. Google -> Site Page 1 -> Site Page 2 -> Site Page 3 # Whereas Ignore/Include would only filter the first interaction. # eg. Google -> Site Page 1 # # By "session" it is meant that the time limitation of a session (typically # 30 minutes timeout) will impact. So in the above example from Google, if # the last step (from Page 2 to Page 3) occurred 31+ minutes after the Page 1 # to Page 2 transition, then this final step would NOT be included. The trail # would be: # Google -> Site Page 1 -> Site Page 2 # # Please do be aware that currently AWFFull uses IP Addresses to determine # the continuation of a given session. This will be most flawed if you have # a user population that sits behind corporate firewalls, or ISP Proxies. # To mention two major problem areas. # # Why do Segmenting? # http://judah.webanalyticsdemystified.com/2007/11/a-few-tips-on-web-analytics-segmentation.html # "Segment analysis will tell you different things about your audience than # you will realize from studying overall population metrics." # "The goal of segmentation is to maximize future value of that segment by # optimizing your marketing mix." # With apologies to Judah for mixing his phrase order around. :-) # Segment by Country # Only track sessions that come from the following countries. # This will be determined by: # 1. Use of AssignToCountry overrides # 2. GeoIP lookups if so configured and enabled # 3. Hostname TLD. eg .au # The third option is generally going to be the worst for accuracy. # We have plenty of Australian IP addresses that are .com or .net etc. # It is strongly advised to enable GeoIP if you wish to use this option. #SegCountry AU #SegCountry US #SegCountry BR # Segment by Referer # Only track sessions that originated from the following referrers. # NOTE!!!! SegReferer only works against the HOST name. Not the full URL. #SegReferer *google.com.au #SegReferer *yahoo.com.au #SegReferer ninemsn # The Ignore* keywords allow you to completely ignore log records based # on hostname, URL, user agent, referrer or user name. I hesitated in # adding these, since the Webalizer was designed to generate _accurate_ # statistics about a web servers performance. By choosing to ignore # records, the accuracy of reports become skewed, negating why I wrote # this program in the first place. However, due to popular demand, here # they are. Use the same as the Hide* keywords, where the value can have # a leading or trailing wildcard '*'. Use at your own risk ;) #IgnoreSite bad.site.net #IgnoreURL /test* #IgnoreReferrer file:/* #IgnoreAgent RealPlayer #IgnoreUser root # The Include* keywords allow you to force the inclusion of log records # based on hostname, URL, user agent, referrer or user name. They take # precedence over the Ignore* keywords. Note: Using Ignore/Include # combinations to selectively process parts of a web site is _extremely # inefficient_!!! Avoid doing so if possible (ie: grep the records to a # separate file if you really want that kind of report). # Example: Only show stats on Joe User's pages... #IgnoreURL * #IncludeURL ~joeuser* # Or based on an authenticated user name #IgnoreUser * #IncludeUser someuser # The MangleAgents allows you to specify how much, if any, AWFFull # should mangle user agent names. This allows several levels of detail # to be produced when reporting user agent statistics. There are six # levels that can be specified, which define different levels of detail # suppression. Level 5 shows only the browser name (MSIE or Mozilla) # and the major version number. Level 4 adds the minor version number # (single decimal place). Level 3 displays the minor version to two # decimal places. Level 2 will add any sub-level designation (such # as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will attempt to also add # the system type if it is specified. The default Level 0 displays the # full user agent field without modification and produces the greatest # amount of detail. User agent names that can't be mangled will be # left unmodified. #MangleAgents 0 # The SearchEngine keywords allow specification of search engines and # their query strings on the URL. These are used to locate and report # what search strings are used to find your site. The first word is # a substring to match in the referrer field that identifies the search # engine, and the second is the URL variable used by that search engine # to define it's search terms. SearchEngine google. q= SearchEngine yahoo. p= SearchEngine msn. q= SearchEngine search.aol query= SearchEngine altavista. q= SearchEngine lycos. query= SearchEngine hotbot. query= SearchEngine alltheweb. query= SearchEngine infoseek. qt= SearchEngine webcrawler searchText= SearchEngine excite search= SearchEngine netscape. query= SearchEngine ask.com q= SearchEngine webwombat. ix= SearchEngine earthlink. q= SearchEngine search.comcast. q= SearchEngine search.mywebsearch. searchfor= SearchEngine reference.com q= SearchEngine mamma.com query= # Last attempt catch all SearchEngine search. q= # AssignToCountry allows a form of override to force given domains # to a specified country. Use the standard 2 letter country codes. # Can also use org, com, net and so on, if more appropriate. # With judicious use of the AllSites, GroupSite and 'whois', this # can fairly easily cover all your majority users with not too much # effort. #AssignToCountry *.bigpond.com au #AssignToCountry *.internode.on.net au #AssignToCountry 203.36.* au #AssignToCountry *.ntli.net uk #AssignToCountry *.btcentralplus.com uk # The Dump* keywords allow the dumping of Sites, URL's, Referrers # User Agents, User names and Search strings to separate tab delimited # text files, suitable for import into most database or spreadsheet # programs. # DumpPath specifies the path to dump the files. If not specified, # it will default to the current output directory. Do not use a # trailing slash ('/'). #DumpPath /var/lib/httpd/logs # The DumpHeader keyword specifies if a header record should be # written to the file. A header record is the first record of the # file, and contains the labels for each field written. Normally, # files that are intended to be imported into a database system # will not need a header record, while spreadsheets usually do. # Value can be either 'yes' or 'no', with 'no' being the default. #DumpHeader no # DumpExtension allow you to specify the dump filename extension # to use. The default is "tab", but some programs are picky about # the filenames they use, so you may change it here (for example, # some people may prefer to use "csv"). #DumpExtension tab # These control the dumping of each individual table. The value # can be either 'yes' or 'no'.. the default is 'no'. #DumpSites no #DumpURLs no #DumpReferrers no #DumpAgents no #DumpUsers no #DumpSearchStr no #DumpEntryPages no #DumpExitPages no #DumpCountries no # This option controls how many years worth of data to display on the # front summary page. In months. # eg: Display the last 5 years: 5 x 12 = 60 # IndexMonths 60 # The following Graph????X or Y options are used to modify the sizes of the # created charts. # The default settings are shown. The defaults are also the minimum settings. # The main chart on the front page. Summary of all Months. #GraphIndexX 512 #GraphIndexY 256 # The Day by Day Summary graph at the start of each Months Summary. #GraphDailyX 512 #GraphDailyY 400 # The Hourly Average graph within each Months Summary. #GraphHourlyX 512 #GraphHourlyY 256 # All pie charts are the same size. #GraphPieX 512 #GraphPieY 300 # The custom bar graph and pie Colors are defined here. # Declare them in the standard hexadecimal way (as HTML, but without the '#') # If none are given, you will get the standard Webalizer colors. #ColorHit 00805c #ColorFile 0000ff #ColorSite ff8000 #ColorKbyte ff0000 #ColorPage 00c0ff #ColorVisit ffff00 #PieColor1 800080 #PieColor2 80ffc0 #PieColor3 ff00ff #PieColor4 ffc480 # End of configuration file... Have a nice day!