author: Chong Yidong, 2011-01-28 13:03:30 -0500
committer: Chong Yidong, 2011-01-28 13:03:30 -0500
commit: 65401ee3fefe38cb3a8a350a17f8b0a3a4ccb579
tree: 8a98ca7c39fa21497d2052428deb0d3fbb634aa8
parent: 7427eb9754e8d22568b99621b5e8117dc2bde802
* search.texi (Regexps): Copyedits. Mention character classes (Bug#7809).
-rw-r--r-- doc/emacs/ChangeLog 3
-rw-r--r-- doc/emacs/search.texi 113
2 files changed, 60 insertions, 56 deletions
diff --git a/doc/emacs/ChangeLog b/doc/emacs/ChangeLog
index 4d4d38c2c5c..f3c6afc3fa5 100644
--- a/doc/emacs/ChangeLog
+++ b/doc/emacs/ChangeLog
@@ -1,5 +1,8 @@
 2011-01-28 Chong Yidong <cyd@stupidchicken.com>
 
+	* search.texi (Regexps): Copyedits. Mention character classes
+	(Bug#7809).
+
 	* files.texi (File Aliases): Restore explanatory text from Eli
 	Zaretskii, accidentally removed in 2011-01-08 commit.
 
diff --git a/doc/emacs/search.texi b/doc/emacs/search.texi
index cd63a562d66..e2ecb4a2385 100644
--- a/doc/emacs/search.texi
+++ b/doc/emacs/search.texi
@@ -546,21 +546,20 @@ Search}.
 @cindex syntax of regexps
 
   This manual describes regular expression features that users
-typically want to use. There are additional features that are
-mainly used in Lisp programs; see @ref{Regular Expressions,,,
-elisp, The Emacs Lisp Reference Manual}.
+typically use. @xref{Regular Expressions,,, elisp, The Emacs Lisp
+Reference Manual}, for additional features used mainly in Lisp
+programs.
 
   Regular expressions have a syntax in which a few characters are
 special constructs and the rest are @dfn{ordinary}. An ordinary
-character is a simple regular expression which matches that same
-character and nothing else. The special characters are @samp{$},
-@samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, and
-@samp{\}. The character @samp{]} is special if it ends a character
-alternative (see later). The character @samp{-} is special inside a
-character alternative. Any other character appearing in a regular
-expression is ordinary, unless a @samp{\} precedes it. (When you use
-regular expressions in a Lisp program, each @samp{\} must be doubled,
-see the example near the end of this section.)
+character matches that same character and nothing else. The special
+characters are @samp{$^.*+?[\}. The character @samp{]} is special if
+it ends a character alternative (see later). The character @samp{-}
+is special inside a character alternative. Any other character
+appearing in a regular expression is ordinary, unless a @samp{\}
+precedes it. (When you use regular expressions in a Lisp program,
+each @samp{\} must be doubled, see the example near the end of this
+section.)
 
   For example, @samp{f} is not a special character, so it is ordinary, and
 therefore @samp{f} is a regular expression that matches the string
@@ -570,28 +569,27 @@ only @samp{o}. (When case distinctions are being ignored, these regexps
 also match @samp{F} and @samp{O}, but we consider this a generalization
 of ``the same string,'' rather than an exception.)
 
-  Any two regular expressions @var{a} and @var{b} can be concatenated. The
-result is a regular expression which matches a string if @var{a} matches
-some amount of the beginning of that string and @var{b} matches the rest of
-the string.@refill
-
-  As a simple example, we can concatenate the regular expressions @samp{f}
-and @samp{o} to get the regular expression @samp{fo}, which matches only
-the string @samp{fo}. Still trivial. To do something nontrivial, you
-need to use one of the special characters. Here is a list of them.
+  Any two regular expressions @var{a} and @var{b} can be concatenated.
+The result is a regular expression which matches a string if @var{a}
+matches some amount of the beginning of that string and @var{b}
+matches the rest of the string. For example, concatenating the
+regular expressions @samp{f} and @samp{o} gives the regular expression
+@samp{fo}, which matches only the string @samp{fo}. Still trivial.
+To do something nontrivial, you need to use one of the special
+characters. Here is a list of them.
 
 @table @asis
 @item @kbd{.}@: @r{(Period)}
-is a special character that matches any single character except a newline.
-Using concatenation, we can make regular expressions like @samp{a.b}, which
-matches any three-character string that begins with @samp{a} and ends with
-@samp{b}.@refill
+is a special character that matches any single character except a
+newline. For example, the regular expressions @samp{a.b} matches any
+three-character string that begins with @samp{a} and ends with
+@samp{b}.
 
 @item @kbd{*}
 is not a construct by itself; it is a postfix operator that means to
-match the preceding regular expression repetitively as many times as
-possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
-@samp{o}s).
+match the preceding regular expression repetitively any number of
+times, as many times as possible. Thus, @samp{o*} matches any number
+of @samp{o}s, including no @samp{o}s.
 
 @samp{*} always applies to the @emph{smallest} possible preceding
 expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
@@ -610,22 +608,21 @@ With this choice, the rest of the regexp matches successfully.@refill
 
 @item @kbd{+}
 is a postfix operator, similar to @samp{*} except that it must match
-the preceding expression at least once. So, for example, @samp{ca+r}
-matches the strings @samp{car} and @samp{caaaar} but not the string
-@samp{cr}, whereas @samp{ca*r} matches all three strings.
+the preceding expression at least once. Thus, @samp{ca+r} matches the
+strings @samp{car} and @samp{caaaar} but not the string @samp{cr},
+whereas @samp{ca*r} matches all three strings.
 
 @item @kbd{?}
-is a postfix operator, similar to @samp{*} except that it can match the
-preceding expression either once or not at all. For example,
-@samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.
+is a postfix operator, similar to @samp{*} except that it can match
+the preceding expression either once or not at all. Thus, @samp{ca?r}
+matches @samp{car} or @samp{cr}, and nothing else.
 
 @item @kbd{*?}, @kbd{+?}, @kbd{??}
 @cindex non-greedy regexp matching
-are non-greedy variants of the operators above. The normal operators
-@samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
-much as they can, as long as the overall regexp can still match. With
-a following @samp{?}, they are non-greedy: they will match as little
-as possible.
+are non-@dfn{greedy} variants of the operators above. The normal
+operators @samp{*}, @samp{+}, @samp{?} match as much as they can, as
+long as the overall regexp can still match. With a following
+@samp{?}, they will match as little as possible.
 
 Thus, both @samp{ab*} and @samp{ab*?} can match the string @samp{a}
 and the string @samp{abbbb}; but if you try to match them both against
@@ -641,29 +638,30 @@ a newline, it matches the whole string. Since it @emph{can} match
 starting at the first @samp{a}, it does.
 
 @item @kbd{\@{@var{n}\@}}
-is a postfix operator that specifies repetition @var{n} times---that
-is, the preceding regular expression must match exactly @var{n} times
-in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
-and nothing else.
+is a postfix operator specifying @var{n} repetitions---that is, the
+preceding regular expression must match exactly @var{n} times in a
+row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx} and
+nothing else.
 
 @item @kbd{\@{@var{n},@var{m}\@}}
-is a postfix operator that specifies repetition between @var{n} and
-@var{m} times---that is, the preceding regular expression must match
-at least @var{n} times, but no more than @var{m} times. If @var{m} is
+is a postfix operator specifying between @var{n} and @var{m}
+repetitions---that is, the preceding regular expression must match at
+least @var{n} times, but no more than @var{m} times. If @var{m} is
 omitted, then there is no upper limit, but the preceding regular
 expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
 equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
 @samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.
 
 @item @kbd{[ @dots{} ]}
-is a @dfn{character set}, which begins with @samp{[} and is terminated
-by @samp{]}. In the simplest case, the characters between the two
-brackets are what this set can match.
+is a @dfn{character set}, beginning with @samp{[} and terminated by
+@samp{]}.
 
-Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
-@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
-(including the empty string), from which it follows that @samp{c[ad]*r}
-matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
+In the simplest case, the characters between the two brackets are what
+this set can match. Thus, @samp{[ad]} matches either one @samp{a} or
+one @samp{d}, and @samp{[ad]*} matches any string composed of just
+@samp{a}s and @samp{d}s (including the empty string). It follows that
+@samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr},
+@samp{caddaar}, etc.
 
 You can also include character ranges in a character set, by writing the
 starting and ending characters with a @samp{-} between them. Thus,
@@ -672,9 +670,12 @@ intermixed freely with individual characters, as in @samp{[a-z$%.]},
 which matches any lower-case @acronym{ASCII} letter or @samp{$}, @samp{%} or
 period.
 
-Note that the usual regexp special characters are not special inside a
-character set. A completely different set of special characters exists
-inside character sets: @samp{]}, @samp{-} and @samp{^}.
+You can also include certain special @dfn{character classes} in a
+character set. A @samp{[:} and balancing @samp{:]} enclose a
+character class inside a character alternative. For instance,
+@samp{[[:alnum:]]} matches any letter or digit. @xref{Char Classes,,,
+elisp, The Emacs Lisp Reference Manual}, for a list of character
+classes.
 
 To include a @samp{]} in a character set, you must make it the first
 character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To