<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"> <html> <head> <title> ht://Dig: Recognized META information in HTML documents </title> </head> <body bgcolor="#eef7ff"> <h1> Recognized META information in HTML documents </h1> <p> ht://Dig Copyright © 1995-2001 <a href="THANKS.html">The ht://Dig Group</a><br> Please see the file <a href="COPYING">COPYING</a> for license information. </p> <hr size="4" noshade> <h2> Introduction </h2> <p> Because <a href="index.html">ht://Dig</a> indexes all HTML pages on a system, individual page authors may want to control some aspects of the indexing operation. To this end, ht://Dig recognizes some special &lt;META&gt; tag attributes. The following things can be controlled in this manner: </p> <ul> <li> Do not index the document </li> <li> Notify a user that the document has expired </li> <li> Set keywords for the document </li> </ul> <hr> <h2> General &lt;META&gt; tag use </h2> <p> In HTML, any number of &lt;META&gt; tags can be used between the &lt;HEAD&gt; and &lt;/HEAD&gt; tags of a document. There are three possible attributes in this tag, two of which are recognized by ht://Dig: </p> <dl> <dt> NAME </dt> <dd> Used to name a specific property. </dd> <dt> CONTENT </dt> <dd> Used to supply the value for a named property. 
</dd> </dl> <p> A document could start with something like the following: </p> <blockquote> &lt;HTML&gt;<br> &lt;HEAD&gt;<br> &lt;META NAME="htdig-keywords" CONTENT="phone telephone online electronic directory"&gt;<br> &lt;META NAME="htdig-email" CONTENT="pat.user@nowhere.net"&gt;<br> &lt;TITLE&gt;Some document title&lt;/TITLE&gt;<br> &lt;/HEAD&gt;<br> &lt;BODY&gt; <blockquote> <em>Body of document</em> </blockquote> &lt;/BODY&gt;<br> &lt;/HTML&gt; </blockquote> <hr> <h2> Recognized properties </h2> <p> The following properties are recognized by ht://Dig: </p> <ul> <li> htdig-keywords </li> <li> htdig-noindex </li> <li> htdig-email </li> <li> htdig-notification-date </li> <li> htdig-email-subject </li> <li> robots </li> <li> keywords </li> <li> description </li> </ul> <p> Detailed information about the <em>htdig-email</em>, <em>htdig-notification-date</em>, and <em>htdig-email-subject</em> properties can be found in the <a href="notification.html">Email notification service</a> document. </p> <p> Descriptions of the properties and their values: </p> <dl> <dt> <strong>htdig-keywords</strong> </dt> <dd> The value of this property should be a blank-separated list of keywords, which will get a very high weight when searching. This can be used to get around some problems with common synonyms for words in the document. For example, if a document is a telephone directory, possible keywords could be "telephone phone directory book list". Regardless of what text is actually in the document, it can then be found if these keywords are used in the search. The weight that words in the content string will have in search results is controlled by the <a href="attrs.html#keywords_factor"> keywords_factor</a> attribute in your configuration. </dd> <dt> <strong>htdig-noindex</strong> </dt> <dd> This property has no value associated with it. If it is used, the document will NOT be included in any searches. Example uses of this could be: <ul> <li> A dynamic document, i.e. one whose contents change continually. 
</li> <li> A temporary document that is not yet officially available. </li> <li> A document you just don't want to be found. </li> </ul> </dd> <dt> <strong>htdig-email</strong> </dt> <dd> The value is the email address a notification message should be sent to. Multiple email addresses can be given by separating them with commas. If no email address is given, no notification will be sent.<br> (Please check the <a href="notification.html">Email notification service</a> documentation for more details on this.) </dd> <dt> <strong>htdig-notification-date</strong> </dt> <dd> The value is the date on or after which the notification should be sent. The format is simply <em>month / day / year</em>, or if the <a href="attrs.html#iso_8601">iso_8601</a> attribute is set, <em>year - month - day</em>. Make sure that the year includes the century, so use <em>1995</em> instead of <em>95</em>.<br> If no date is given, no notification will be sent. (Please check the <a href="notification.html">Email notification service</a> documentation for more details on this.) </dd> <dt> <strong>htdig-email-subject</strong> </dt> <dd> The value specifies the subject of the notification message. This is an optional property. (Please check the <a href="notification.html">Email notification service</a> documentation for more details on this.) </dd> <dt> <strong>robots</strong> </dt> <dd> The value specifies restrictions on robots (including ht://Dig) for the current page. These restrictions can be "noindex" to prevent indexing of the document while still allowing the robot to follow links from the page, "nofollow" to allow indexing but prevent links from being followed, or "none" to prevent both. Additionally, ht://Dig supports the values "index", "follow" and "all", which are the opposites of the other values and describe the default behavior. 
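<p> As a minimal sketch, a page that should stay out of the index while still letting ht://Dig follow its links could carry a tag like the following in its &lt;HEAD&gt; section. The tag is illustrative only and uses just the "noindex" value described above: </p> <blockquote> &lt;META NAME="robots" CONTENT="noindex"&gt; </blockquote>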
For more information on META robots tags, check out the <a href="http://info.webcrawler.com/mak/projects/robots/meta-user.html"> HTML Author's Guide to the Robots META tag</a>. </dd> <dt> <strong>keywords</strong> </dt> <dd> The value of this property should be a blank-separated list of keywords, just as for the htdig-keywords property. They are treated as equivalent by htdig. The reason for two different properties is that the keywords property is used by other search engines as well, while the htdig-keywords property can be used for words you want indexed only by htdig. You can get htdig to treat other property names as equivalent to htdig-keywords, or disable the htdig-keywords or keywords properties, by changing the <a href="attrs.html#keywords_meta_tag_names"> keywords_meta_tag_names</a> attribute in your configuration. </dd> <dt> <strong>description</strong> </dt> <dd> The value allows you to specify an alternate excerpt (description) of a page. If the config-file attribute <a href="attrs.html#use_meta_description"> use_meta_description</a> is set, then any documents with descriptions will use them instead of the automatically generated excerpts. The weight that words in the content string will have in search results is controlled by the <a href="attrs.html#meta_description_factor"> meta_description_factor</a> attribute in your configuration. </dd> </dl> <hr size="4" noshade> Last modified: $Date: 2001/02/15 17:05:34 $ </body> </html>