-rw-r--r-- doc/lispref/searching.texi 52
1 files changed, 31 insertions, 21 deletions
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 748ab586af9..72ee9233a3c 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -398,17 +398,11 @@ range should not be the starting point of another one; for example,
 The usual regexp special characters are not special inside a
 character alternative. A completely different set of characters is
 special inside character alternatives: @samp{]}, @samp{-} and @samp{^}.
-
-To include a @samp{]} in a character alternative, you must make it the first
-character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To include
-a @samp{-}, write @samp{-} as the last character of the character alternative,
-tho you can also put it first or after a range. Thus, @samp{[]-]} matches both
-@samp{]} and @samp{-}. (As explained below, you cannot use @samp{\]} to
-include a @samp{]} inside a character alternative, since @samp{\} is not
-special there.)
-
-To include @samp{^} in a character alternative, put it anywhere but at
-the beginning.
+To include @samp{]} in a character alternative, put it at the
+beginning. To include @samp{^}, put it anywhere but at the beginning.
+To include @samp{-}, put it at the end. Thus, @samp{[]^-]} matches
+all three of these special characters. You cannot use @samp{\} to
+escape these three characters, since @samp{\} is not special here.
 
 The following aspects of ranges are specific to Emacs, in that POSIX
 allows but does not require this behavior and programs other than
@@ -426,17 +420,33 @@ of its bounds, so that @samp{[a-z]} matches only ASCII letters, even
 outside the C or POSIX locale.
 
 @item
-As a special case, if either bound of a range is a raw 8-bit byte, the
-other bound should be a unibyte character, and the range matches only
-unibyte characters.
+If the lower bound of a range is greater than its upper bound, the
+range is empty and represents no characters. Thus, @samp{[z-a]}
+always fails to match, and @samp{[^z-a]} matches any character,
+including newline. However, a reversed range should always be from
+the letter @samp{z} to the letter @samp{a} to make it clear that it is
+not a typo; for example, @samp{[+-*/]} should be avoided, because it
+matches only @samp{/} rather than the likely-intended four characters.
+@end enumerate
+
+Some kinds of character alternatives are not the best style even
+though they are standardized by POSIX and are portable. They include:
 
+@enumerate
 @item
-If the lower bound of a range is greater than its upper bound, the
-range is empty and represents no characters. Thus, @samp{[b-a]}
-always fails to match, and @samp{[^b-a]} matches any character,
-including newline. However, the lower bound should be at most one
-greater than the upper bound; for example, @samp{[c-a]} should be
-avoided.
+A character alternative can include duplicates. For example,
+@samp{[XYa-yYb-zX]} is less clear than @samp{[XYa-z]}.
+
+@item
+A range can denote just one, two, or three characters. For example,
+@samp{[(-(]} is less clear than @samp{[(]}, @samp{[*-+]} is less clear
+than @samp{[*+]}, and @samp{[*-,]} is less clear than @samp{[*+,]}.
+
+@item
+A @samp{-} also appear at the beginning of a character alternative, or
+as the upper bound of a range. For example, although @samp{[-a-z]} is
+valid, @samp{[a-z-]} is better style; and although @samp{[!--/]} is
+valid, @samp{[!-,/-]} is clearer.
 @end enumerate
 
 A character alternative can also specify named character classes
@@ -452,7 +462,7 @@ of a range.
 @cindex @samp{^} in regexp
 @samp{[^} begins a @dfn{complemented character alternative}. This
 matches any character except the ones specified. Thus,
-@samp{[^a-z0-9A-Z]} matches all characters @emph{except} letters and
+@samp{[^a-z0-9A-Z]} matches all characters @emph{except} ASCII letters and
 digits.
 
 @samp{^} is not special in a character alternative unless it is the first
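
The behavior described by the added text can be tried out in an Emacs Lisp session. The following is only a minimal sketch and is not part of the commit: `my-alt-match-p` is a hypothetical helper introduced here for illustration, built on the standard `string-match-p` function. The results noted in the comments follow the statements made in the patched documentation.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs or of
;; this patch.  Returns t if REGEXP matches somewhere in STRING.
(defun my-alt-match-p (regexp string)
  "Return t if REGEXP matches somewhere in STRING, else nil."
  (and (string-match-p regexp string) t))

;; `]' first, `^' not first, `-' last: all three are matched literally.
(my-alt-match-p "[]^-]" "]")    ; t
(my-alt-match-p "[]^-]" "^")    ; t
(my-alt-match-p "[]^-]" "-")    ; t

;; A reversed range is empty, so `[z-a]' never matches ...
(my-alt-match-p "[z-a]" "m")    ; nil
;; ... and its complement matches any character, including newline.
(my-alt-match-p "[^z-a]" "\n")  ; t

;; `+-*' is a reversed (empty) range, so only `/' is left.
(my-alt-match-p "[+-*/]" "+")   ; nil
(my-alt-match-p "[+-*/]" "/")   ; t
```

The last two forms show why the new text recommends writing a deliberately empty range as `[z-a]`: a pattern such as `[+-*/]` looks like a list of four operators but silently matches only one of them.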