<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <html> <head> <link rel="STYLESHEET" href="lib.css" type='text/css' /> <link rel="SHORTCUT ICON" href="../icons/pyfav.gif" /> <link rel='start' href='../index.html' title='Python Documentation Index' /> <link rel="first" href="lib.html" title='Python Library Reference' /> <link rel='contents' href='contents.html' title="Contents" /> <link rel='index' href='genindex.html' title='Index' /> <link rel='last' href='about.html' title='About this document...' /> <link rel='help' href='about.html' title='About this document...' /> <LINK rel="next" href="re-objects.html"> <LINK rel="prev" href="matching-searching.html"> <LINK rel="parent" href="module-re.html"> <LINK rel="next" href="re-objects.html"> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> <meta name='aesop' content='information' /> <META name="description" content="Module Contents"> <META name="keywords" content="lib"> <META name="resource-type" content="document"> <META name="distribution" content="global"> <title>4.2.3 Module Contents</title> </head> <body> <DIV CLASS="navigation"> <div id='top-navigation-panel'> <table align="center" width="100%" cellpadding="0" cellspacing="2"> <tr> <td class='online-navigation'><a rel="prev" title="4.2.2 Matching vs Searching" href="matching-searching.html"><img src='../icons/previous.png' border='0' height='32' alt='Previous Page' width='32' /></A></td> <td class='online-navigation'><a rel="parent" title="4.2 re " href="module-re.html"><img src='../icons/up.png' border='0' height='32' alt='Up One Level' width='32' /></A></td> <td class='online-navigation'><a rel="next" title="4.2.4 Regular Expression Objects" href="re-objects.html"><img src='../icons/next.png' border='0' height='32' alt='Next Page' width='32' /></A></td> <td align="center" width="100%">Python Library Reference</td> <td class='online-navigation'><a rel="contents" title="Table of Contents" href="contents.html"><img 
src='../icons/contents.png' border='0' height='32' alt='Contents' width='32' /></A></td> <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' border='0' height='32' alt='Module Index' width='32' /></a></td> <td class='online-navigation'><a rel="index" title="Index" href="genindex.html"><img src='../icons/index.png' border='0' height='32' alt='Index' width='32' /></A></td> </tr></table> <div class='online-navigation'> <b class="navlabel">Previous:</b> <a class="sectref" rel="prev" href="matching-searching.html">4.2.2 Matching vs Searching</A> <b class="navlabel">Up:</b> <a class="sectref" rel="parent" href="module-re.html">4.2 re </A> <b class="navlabel">Next:</b> <a class="sectref" rel="next" href="re-objects.html">4.2.4 Regular Expression Objects</A> </div> <hr /></div> </DIV> <!--End of Navigation Panel--> <H2><A NAME="SECTION006230000000000000000"> 4.2.3 Module Contents</A> </H2> <A NAME="Contents_of_Module_re"><!--z--></A> <P> The module defines the following functions and constants, and an exception: <P> <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-832' class="function">compile</tt></b>(</nobr></td> <td><var>pattern</var><big>[</big><var>, flags</var><big>]</big>)</td></tr></table></dt> <dd> Compile a regular expression pattern into a regular expression object, which can be used for matching using its <tt class="function">match()</tt> and <tt class="function">search()</tt> methods, described below. <P> The expression's behaviour can be modified by specifying a <var>flags</var> value. Values can be any of the following variables, combined using bitwise OR (the <code>|</code> operator). 
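<P>
As a minimal sketch of combining flags, two of the flag constants described below can be OR-ed together and passed to <tt class="function">compile()</tt>. The pattern, the sample text and the names <code>flags</code>, <code>prog</code>, <code>text</code>, <code>first</code> and <code>every</code> are made up for illustration:
<P>
<div class="verbatim"><pre>
import re

# Combine two flags with bitwise OR: ignore case, and let "^" match
# at the start of every line rather than only at the start of the string.
flags = re.IGNORECASE | re.MULTILINE
prog = re.compile(r'^spam', flags)

text = 'Spam\neggs\nSPAM'

# search() returns the first match; findall() collects every match.
first = prog.search(text).group(0)
every = prog.findall(text)
</pre></div>
<P>
Under these assumptions <code>first</code> is <code>'Spam'</code> and <code>every</code> is <code>['Spam', 'SPAM']</code>; with neither flag set, the same pattern would find nothing in this text.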
<P>
The sequence
<P>
<div class="verbatim"><pre>
prog = re.compile(pat)
result = prog.match(str)
</pre></div>
<P>
is equivalent to
<P>
<div class="verbatim"><pre>
result = re.match(pat, str)
</pre></div>
<P>
but the version using <tt class="function">compile()</tt> is more efficient when the expression will be used several times in a single program.
</dl>
<P>
<dl><dt><b><tt id='l2h-833'>I</tt></b></dt> <dd> <dt><b><tt id='l2h-848'>IGNORECASE</tt></b></dt><dd> Perform case-insensitive matching; expressions like <tt class="regexp">[A-Z]</tt> will match lowercase letters, too. This is not affected by the current locale. </dd></dl>
<P>
<dl><dt><b><tt id='l2h-834'>L</tt></b></dt> <dd> <dt><b><tt id='l2h-849'>LOCALE</tt></b></dt><dd> Make <tt class="regexp">\w</tt>, <tt class="regexp">\W</tt>, <tt class="regexp">\b</tt>, and <tt class="regexp">\B</tt> dependent on the current locale. </dd></dl>
<P>
<dl><dt><b><tt id='l2h-835'>M</tt></b></dt> <dd> <dt><b><tt id='l2h-850'>MULTILINE</tt></b></dt><dd> When specified, the pattern character "<tt class="character">^</tt>" matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character "<tt class="character">$</tt>" matches at the end of the string and at the end of each line (immediately preceding each newline). By default, "<tt class="character">^</tt>" matches only at the beginning of the string, and "<tt class="character">$</tt>" only at the end of the string and immediately before the newline (if any) at the end of the string. </dd></dl>
<P>
<dl><dt><b><tt id='l2h-836'>S</tt></b></dt> <dd> <dt><b><tt id='l2h-851'>DOTALL</tt></b></dt><dd> Make the "<tt class="character">.</tt>" special character match any character at all, including a newline; without this flag, "<tt class="character">.</tt>" will match anything <i>except</i> a newline.
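<P>
A small sketch of the difference this flag makes, together with <tt>MULTILINE</tt> described above. The sample string and all variable names are made up for illustration:
<P>
<div class="verbatim"><pre>
import re

text = 'first line\nsecond line'

# Without DOTALL, "." stops at the newline.
plain = re.search(r'first.*', text).group(0)

# With DOTALL, "." also matches the newline, so ".*" runs to the end.
dotall = re.search(r'first.*', text, re.DOTALL).group(0)

# By default "^" matches only at the very start of the string...
starts_default = re.compile(r'^\w+').findall(text)

# ...while with MULTILINE it also matches right after each newline.
starts_multiline = re.compile(r'^\w+', re.MULTILINE).findall(text)
</pre></div>
<P>
Here <code>plain</code> is <code>'first line'</code>, while <code>dotall</code> covers both lines; <code>starts_default</code> is <code>['first']</code> and <code>starts_multiline</code> is <code>['first', 'second']</code>.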
</dd></dl>
<P>
<dl><dt><b><tt id='l2h-837'>U</tt></b></dt> <dd> <dt><b><tt id='l2h-852'>UNICODE</tt></b></dt><dd> Make <tt class="regexp">\w</tt>, <tt class="regexp">\W</tt>, <tt class="regexp">\b</tt>, and <tt class="regexp">\B</tt> dependent on the Unicode character properties database. <span class="versionnote">New in version 2.0.</span> </dd></dl>
<P>
<dl><dt><b><tt id='l2h-838'>X</tt></b></dt> <dd> <dt><b><tt id='l2h-853'>VERBOSE</tt></b></dt><dd> This flag lets you write regular expressions that are easier to read. Whitespace within the pattern is ignored, except when in a character class or preceded by an unescaped backslash. When a line contains a "<tt class="character">#</tt>" that is neither in a character class nor preceded by an unescaped backslash, all characters from the leftmost such "<tt class="character">#</tt>" through the end of the line are ignored. </dd></dl>
<P>
<dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-839' class="function">search</tt></b>(</nobr></td> <td><var>pattern, string</var><big>[</big><var>, flags</var><big>]</big>)</td></tr></table></dt> <dd> Scan through <var>string</var> looking for a location where the regular expression <var>pattern</var> produces a match, and return a corresponding <tt class="class">MatchObject</tt> instance. Return <code>None</code> if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string. </dl>
<P>
<dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-840' class="function">match</tt></b>(</nobr></td> <td><var>pattern, string</var><big>[</big><var>, flags</var><big>]</big>)</td></tr></table></dt> <dd> If zero or more characters at the beginning of <var>string</var> match the regular expression <var>pattern</var>, return a corresponding <tt class="class">MatchObject</tt> instance.
Return <code>None</code> if the string does not match the pattern; note that this is different from a zero-length match.
<P>
<span class="note"><b class="label">Note:</b> If you want to locate a match anywhere in <var>string</var>, use <tt class="method">search()</tt> instead.</span> </dl>
<P>
<dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-841' class="function">split</tt></b>(</nobr></td> <td><var>pattern, string</var><big>[</big><var>, maxsplit<code> = 0</code></var><big>]</big>)</td></tr></table></dt> <dd> Split <var>string</var> by the occurrences of <var>pattern</var>. If capturing parentheses are used in <var>pattern</var>, then the text of all groups in the pattern is also returned as part of the resulting list. If <var>maxsplit</var> is nonzero, at most <var>maxsplit</var> splits occur, and the remainder of the string is returned as the final element of the list. (Incompatibility note: in the original Python 1.5 release, <var>maxsplit</var> was ignored. This has been fixed in later releases.)
<P>
<div class="verbatim"><pre>
>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split(r'(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split(r'\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
</pre></div>
<P>
This function combines and extends the functionality of the old <tt class="function">regsub.split()</tt> and <tt class="function">regsub.splitx()</tt>. </dl>
<P>
<dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-842' class="function">findall</tt></b>(</nobr></td> <td><var>pattern, string</var>)</td></tr></table></dt> <dd> Return a list of all non-overlapping matches of <var>pattern</var> in <var>string</var>. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
Empty matches are included in the result unless they touch the beginning of another match. <span class="versionnote">New in version 1.5.2.</span> </dl>
<P>
<dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-843' class="function">finditer</tt></b>(</nobr></td> <td><var>pattern, string</var>)</td></tr></table></dt> <dd> Return an iterator over all non-overlapping matches for the RE <var>pattern</var> in <var>string</var>. For each match, the iterator returns a match object. Empty matches are included in the result unless they touch the beginning of another match. <span class="versionnote">New in version 2.2.</span> </dl>
<P>
<dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-844' class="function">sub</tt></b>(</nobr></td> <td><var>pattern, repl, string</var><big>[</big><var>, count</var><big>]</big>)</td></tr></table></dt> <dd> Return the string obtained by replacing the leftmost non-overlapping occurrences of <var>pattern</var> in <var>string</var> by the replacement <var>repl</var>. If the pattern isn't found, <var>string</var> is returned unchanged. <var>repl</var> can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, "<tt class="samp">\n</tt>" is converted to a single newline character, "<tt class="samp">\r</tt>" is converted to a carriage return, and so forth. Unknown escapes such as "<tt class="samp">\j</tt>" are left alone. Backreferences, such as "<tt class="samp">\6</tt>", are replaced with the substring matched by group 6 in the pattern. For example:
<P>
<div class="verbatim"><pre>
>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'
</pre></div>
<P>
If <var>repl</var> is a function, it is called for every non-overlapping occurrence of <var>pattern</var>.
The function takes a single match object argument, and returns the replacement string. For example:
<P>
<div class="verbatim"><pre>
>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
...
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
</pre></div>
<P>
The pattern may be a string or an RE object; if you need to specify regular expression flags, you must use an RE object, or use embedded modifiers in a pattern; for example, "<tt class="samp">sub("(?i)b+", "x", "bbbb BBBB")</tt>" returns <code>'x x'</code>.
<P>
The optional argument <var>count</var> is the maximum number of pattern occurrences to be replaced; <var>count</var> must be a non-negative integer. If omitted or zero, all occurrences will be replaced. Empty matches for the pattern are replaced only when not adjacent to a previous match, so "<tt class="samp">sub('x*', '-', 'abc')</tt>" returns <code>'-a-b-c-'</code>.
<P>
In addition to character escapes and backreferences as described above, "<tt class="samp">\g<name></tt>" will use the substring matched by the group named "<tt class="samp">name</tt>", as defined by the <tt class="regexp">(?P<name>...)</tt> syntax. "<tt class="samp">\g<number></tt>" uses the corresponding group number; "<tt class="samp">\g<2></tt>" is therefore equivalent to "<tt class="samp">\2</tt>", but isn't ambiguous in a replacement such as "<tt class="samp">\g<2>0</tt>". "<tt class="samp">\20</tt>" would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character "<tt class="character">0</tt>". The backreference "<tt class="samp">\g<0></tt>" substitutes in the entire substring matched by the RE.
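<P>
The named and numbered forms can be tried out with a short sketch. The pattern, the group names <code>word</code> and <code>num</code>, and the sample input <code>'ab-12'</code> are hypothetical, chosen only to make the three cases visible:
<P>
<div class="verbatim"><pre>
import re

# Hypothetical pattern: letters, a dash, then digits, e.g. "ab-12".
pattern = re.compile(r'(?P<word>[a-z]+)-(?P<num>\d+)')

# Named references swap the two parts around.
swapped = pattern.sub(r'\g<num>-\g<word>', 'ab-12')

# "\g<2>0" is group 2 followed by a literal "0";
# a bare "\20" would instead be read as a reference to group 20.
padded = pattern.sub(r'\g<2>0', 'ab-12')

# "\g<0>" stands for the entire match.
bracketed = pattern.sub(r'[\g<0>]', 'ab-12')
</pre></div>
<P>
With this input, <code>swapped</code> is <code>'12-ab'</code>, <code>padded</code> is <code>'120'</code>, and <code>bracketed</code> is <code>'[ab-12]'</code>.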
</dl> <P> <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-845' class="function">subn</tt></b>(</nobr></td> <td><var>pattern, repl, string</var><big>[</big><var>, count</var><big>]</big>)</td></tr></table></dt> <dd> Perform the same operation as <tt class="function">sub()</tt>, but return a tuple <code>(<var>new_string</var>, <var>number_of_subs_made</var>)</code>. </dl> <P> <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> <td><nobr><b><tt id='l2h-846' class="function">escape</tt></b>(</nobr></td> <td><var>string</var>)</td></tr></table></dt> <dd> Return <var>string</var> with all non-alphanumerics backslashed; this is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it. </dl> <P> <dl><dt><b><span class="typelabel">exception</span> <tt id='l2h-847' class="exception">error</tt></b></dt> <dd> Exception raised when a string passed to one of the functions here is not a valid regular expression (for example, it might contain unmatched parentheses) or when some other error occurs during compilation or matching. It is never an error if a string contains no match for a pattern. 
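<P>
The distinction between an invalid pattern and a pattern that merely fails to match can be sketched as follows; the <tt class="function">escape()</tt> function described above is included for comparison. The sample strings and variable names are made up for illustration:
<P>
<div class="verbatim"><pre>
import re

# An unbalanced parenthesis is not a valid regular expression,
# so compiling it raises re.error.
try:
    re.compile('(unclosed')
    compiled_ok = True
except re.error:
    compiled_ok = False

# A valid pattern that finds nothing is not an error: search() returns None.
no_match = re.search('x', 'abc')

# escape() backslashes the metacharacters, so the same text that was
# invalid as a pattern can be matched as a literal string.
literal_hit = re.search(re.escape('(unclosed'), 'an (unclosed paren')
</pre></div>
<P>
In this sketch <code>compiled_ok</code> ends up false, <code>no_match</code> is <code>None</code>, and <code>literal_hit.group(0)</code> is <code>'(unclosed'</code>.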
</dd></dl> <P> <DIV CLASS="navigation"> <hr /> <span class="release-info">Release 2.3.4, documentation updated on
May 20, 2004.</span> </DIV> <!--End of Navigation Panel--> <ADDRESS> See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. </ADDRESS> </BODY> </HTML>