Sophie

Sophie

distrib > Mageia > 4 > x86_64 > by-pkgid > f800694edefe91adea2624f711a41a2d > files > 11022

php-manual-en-5.5.7-1.mga4.noarch.rpm

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
 <head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  <title>Repetition</title>

 </head>
 <body><div class="manualnavbar" style="text-align: center;">
 <div class="prev" style="text-align: left; float: left;"><a href="regexp.reference.subpatterns.html">Subpatterns</a></div>
 <div class="next" style="text-align: right; float: right;"><a href="regexp.reference.back-references.html">Back references</a></div>
 <div class="up"><a href="reference.pcre.pattern.syntax.html">PCRE regex syntax</a></div>
 <div class="home"><a href="index.html">PHP Manual</a></div>
</div><hr /><div id="regexp.reference.repetition" class="section">
  <h2 class="title">Repetition</h2>
  <p class="para">
   Repetition is specified by quantifiers, which can follow any
   of the following items:
   
   <ul class="itemizedlist">
    <li class="listitem"><span class="simpara">a single character, possibly escaped</span></li>
    <li class="listitem"><span class="simpara">the . metacharacter</span></li>
    <li class="listitem"><span class="simpara">a character class</span></li>
    <li class="listitem"><span class="simpara">a back reference (see next section)</span></li>
    <li class="listitem"><span class="simpara">a parenthesized subpattern (unless it is  an  assertion  -
     see below)</span></li>
   </ul>
  </p>
  <p class="para">
   The general repetition quantifier specifies  a  minimum  and
   maximum  number  of  permitted  matches,  by  giving the two
   numbers in curly brackets (braces), separated  by  a  comma.
   The  numbers  must be less than 65536, and the first must be
   less than or equal to the second. For example:
   
   <em>z{2,4}</em>
   
   matches &quot;zz&quot;, &quot;zzz&quot;, or &quot;zzzz&quot;. A closing brace on  its  own
   is not a special character. If the second number is omitted,
   but the comma is present, there is no upper  limit;  if  the
   second number and the comma are both omitted, the quantifier
   specifies an exact number of required matches. Thus
   
   <em>[aeiou]{3,}</em>
   
   matches at least 3 successive vowels,  but  may  match  many
   more, while
   
   <em>\d{8}</em>
   
   matches exactly 8 digits.  An  opening  curly  bracket  that
   appears  in a position where a quantifier is not allowed, or
   one that does not match the syntax of a quantifier, is taken
   as  a literal character. For example, {,6} is not a quantifier,
   but a literal string of four characters.
  </p>
  <p class="para">
   The quantifier {0} is permitted, causing the  expression  to
   behave  as  if the previous item and the quantifier were not
   present.
  </p>
  <p class="para">
   For convenience (and  historical  compatibility)  the  three
   most common quantifiers have single-character abbreviations:
   
   <table class="doctable table">
    <caption><strong>Single-character quantifiers</strong></caption>
    
     <tbody class="tbody">
      <tr>
       <td><em>*</em></td>
       <td>equivalent to <em>{0,}</em></td>
      </tr>

      <tr>
       <td><em>+</em></td>
       <td>equivalent to <em>{1,}</em></td>
      </tr>

      <tr>
       <td><em>?</em></td>
       <td>equivalent to <em>{0,1}</em></td>
      </tr>

     </tbody>
    
   </table>

  </p>
  <p class="para">
   It is possible to construct infinite loops  by  following  a
   subpattern  that  can  match no characters with a quantifier
   that has no upper limit, for example:
   
   <em>(a?)*</em>
  </p>
  <p class="para">
   Earlier versions of Perl and PCRE used to give an  error  at
   compile  time  for such patterns. However, because there are
   cases where this  can  be  useful,  such  patterns  are  now
   accepted,  but  if  any repetition of the subpattern does in
   fact match no characters, the loop is forcibly broken.
  </p>
  <p class="para">
   By default, the quantifiers  are  &quot;greedy&quot;,  that  is,  they
   match  as much as possible (up to the maximum number of permitted
   times), without causing the rest of  the  pattern  to
   fail. The classic example of where this gives problems is in
   trying to match comments in C programs. These appear between
   the  sequences /* and */ and within the sequence, individual
   * and / characters may appear. An attempt to  match  C  comments
   by applying the pattern
   
   <em>/\*.*\*/</em>
   
   to the string
   
   <em>/* first comment */  not comment  /* second comment */</em>
   
   fails, because it matches  the  entire  string  due  to  the
   greediness of the .*  item.
  </p>
  <p class="para">
   However, if a quantifier is followed  by  a  question  mark,
   then it becomes lazy, and instead matches the minimum
   number of times possible, so the pattern
   
   <em>/\*.*?\*/</em>
   
   does the right thing with the C comments. The meaning of the
   various  quantifiers is not otherwise changed, just the preferred
   number of matches.  Do not confuse this use of
   question  mark  with  its  use as a quantifier in its own right.
   Because it has two uses, it can sometimes appear doubled, as
   in
   
   <em>\d??\d</em>
   
   which matches one digit by preference, but can match two  if
   that is the only way the rest of the pattern matches.
  </p>
  <p class="para">
   If the <a href="reference.pcre.pattern.modifiers.html" class="link">PCRE_UNGREEDY</a>  
   option is set (an option which  is  not
   available  in  Perl)  then the quantifiers are not greedy by
   default, but individual ones can be made greedy by following
   them  with  a  question mark. In other words, it inverts the
   default behaviour.
  </p>
  <p class="para">
   Quantifiers followed by <em>+</em> are &quot;possessive&quot;. They eat
   as many characters as possible and don&#039;t return to match the rest of the
   pattern. Thus <em>.*abc</em> matches &quot;aabc&quot; but
   <em>.*+abc</em> doesn&#039;t because <em>.*+</em> eats the
   whole string. Possessive quantifiers can be used to speed up processing.
  </p>
  <p class="para">
   When a parenthesized subpattern is quantified with a minimum
   repeat  count  that is greater than 1 or with a limited maximum,
   more store is required for the  compiled  pattern,  in
   proportion to the size of the minimum or maximum.
  </p>
  <p class="para">
   If a pattern starts with .* or  .{0,}  and  the  <a href="reference.pcre.pattern.modifiers.html" class="link">PCRE_DOTALL</a> 
   option (equivalent to Perl&#039;s /s) is set, thus allowing the .
   to match newlines, then the pattern is implicitly  anchored,
   because whatever follows will be tried against every character
   position in the subject string, so there is no point  in
   retrying  the overall match at any position after the first.
   PCRE treats such a pattern as though it were preceded by \A.
   In  cases where it is known that the subject string contains
   no newlines, it is worth setting <a href="reference.pcre.pattern.modifiers.html" class="link">PCRE_DOTALL</a>  when  the  
   pattern begins with .* in order to
   obtain this optimization, or
   alternatively using ^ to indicate anchoring explicitly.
  </p>
  <p class="para">
   When a capturing subpattern is repeated, the value  captured
   is the substring that matched the final iteration. For example, after
   
   <em>(tweedle[dume]{3}\s*)+</em>
   
   has matched &quot;tweedledum tweedledee&quot; the value  of  the  captured
   substring  is  &quot;tweedledee&quot;.  However,  if  there are
   nested capturing  subpatterns,  the  corresponding  captured
   values  may  have been set in previous iterations. For example,
   after
   
   <em>/(a|(b))+/</em>
   
   matches &quot;aba&quot; the value of the second captured substring  is
   &quot;b&quot;.
  </p>
 </div><hr /><div class="manualnavbar" style="text-align: center;">
 <div class="prev" style="text-align: left; float: left;"><a href="regexp.reference.subpatterns.html">Subpatterns</a></div>
 <div class="next" style="text-align: right; float: right;"><a href="regexp.reference.back-references.html">Back references</a></div>
 <div class="up"><a href="reference.pcre.pattern.syntax.html">PCRE regex syntax</a></div>
 <div class="home"><a href="index.html">PHP Manual</a></div>
</div></body></html>