| author | Luc Teirlinck | 2004-01-12 04:21:01 +0000 |
|---|---|---|
| committer | Luc Teirlinck | 2004-01-12 04:21:01 +0000 |
| commit | bcb6b6b8b1a7bc1f724bb0b5ba3306c760d97c35 (patch) | |
| tree | ead1dedec551510a0a0937c398f78af12db6c3eb | |
| parent | 90c3aa59349b4a902282badc8d31144a143c1973 (diff) | |
Various small changes in addition to the following.
(Regexp Example): Adapt to new value of `sentence-end'.
(Regexp Functions): The PAREN argument to `regexp-opt' can be `words'.
(Search and Replace): Add usage note for `perform-replace'.
(Entire Match Data): Mention INTEGERS and REUSE arguments to `match-data'.
(Standard Regexps): Update for new values of `paragraph-start' and
`sentence-end'.
| -rw-r--r-- | lispref/searching.texi | 169 |
1 files changed, 110 insertions, 59 deletions
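
As a rough illustration of two of the documented changes listed in the commit message (the `words` value of PAREN in `regexp-opt` and the INTEGERS argument of `match-data`), the following Emacs Lisp sketch may help. The function names `demo-words-regexp-hits` and `demo-match-data-integers` and the sample strings are assumptions made up for this example; they are not part of the commit.

```elisp
;; Hypothetical helper names; only `regexp-opt', `string-match',
;; `re-search-forward' and `match-data' come from Emacs itself.

(defun demo-words-regexp-hits ()
  "Return where a `words' regexp matches in two sample strings."
  (let ((re (regexp-opt '("car" "cdr") 'words)))
    (list (string-match re "the car stops")  ; "car" stands alone as a word
          (string-match re "scarce"))))      ; "car" is buried inside a word

(defun demo-match-data-integers ()
  "Return the first four elements of (match-data t) after a buffer search."
  (with-temp-buffer
    (insert "foo  bar")
    (goto-char (point-min))
    (re-search-forward "o+\\([ \t]+\\)b")
    (let ((md (match-data t)))
      ;; Later Emacs versions may append the buffer itself to the list,
      ;; so only the leading position elements are inspected here.
      (list (nth 0 md) (nth 1 md) (nth 2 md) (nth 3 md)))))
```

With `words`, the first function should report a match only in the string where "car" is a separate word. The second function should return plain integers rather than markers, even though the search ran in a buffer.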
diff --git a/lispref/searching.texi b/lispref/searching.texi
index ab5abecc7d0..94edaae6734 100644
--- a/lispref/searching.texi
+++ b/lispref/searching.texi
| @@ -90,7 +90,8 @@ If @var{repeat} is supplied (it must be a positive number), then the | |||
| 90 | search is repeated that many times (each time starting at the end of the | 90 | search is repeated that many times (each time starting at the end of the |
| 91 | previous time's match). If these successive searches succeed, the | 91 | previous time's match). If these successive searches succeed, the |
| 92 | function succeeds, moving point and returning its new value. Otherwise | 92 | function succeeds, moving point and returning its new value. Otherwise |
| 93 | the search fails, leaving point where it started. | 93 | the search fails, with results depending on the value of |
| 94 | @var{noerror}, as described above. | ||
| 94 | @end deffn | 95 | @end deffn |
| 95 | 96 | ||
| 96 | @deffn Command search-backward string &optional limit noerror repeat | 97 | @deffn Command search-backward string &optional limit noerror repeat |
| @@ -143,7 +144,7 @@ If @var{noerror} is @code{nil}, then @code{word-search-forward} signals | |||
| 143 | an error if the search fails. If @var{noerror} is @code{t}, then it | 144 | an error if the search fails. If @var{noerror} is @code{t}, then it |
| 144 | returns @code{nil} instead of signaling an error. If @var{noerror} is | 145 | returns @code{nil} instead of signaling an error. If @var{noerror} is |
| 145 | neither @code{nil} nor @code{t}, it moves point to @var{limit} (or the | 146 | neither @code{nil} nor @code{t}, it moves point to @var{limit} (or the |
| 146 | end of the buffer) and returns @code{nil}. | 147 | end of the accessible portion of the buffer) and returns @code{nil}. |
| 147 | 148 | ||
| 148 | If @var{repeat} is non-@code{nil}, then the search is repeated that many | 149 | If @var{repeat} is non-@code{nil}, then the search is repeated that many |
| 149 | times. Point is positioned at the end of the last match. | 150 | times. Point is positioned at the end of the last match. |
| @@ -168,8 +169,8 @@ regexps; the following section says how to search for them. | |||
| 168 | 169 | ||
| 169 | @menu | 170 | @menu |
| 170 | * Syntax of Regexps:: Rules for writing regular expressions. | 171 | * Syntax of Regexps:: Rules for writing regular expressions. |
| 171 | * Regexp Functions:: Functions for operating on regular expressions. | ||
| 172 | * Regexp Example:: Illustrates regular expression syntax. | 172 | * Regexp Example:: Illustrates regular expression syntax. |
| 173 | * Regexp Functions:: Functions for operating on regular expressions. | ||
| 173 | @end menu | 174 | @end menu |
| 174 | 175 | ||
| 175 | @node Syntax of Regexps | 176 | @node Syntax of Regexps |
| @@ -293,10 +294,10 @@ matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc. | |||
| 293 | 294 | ||
| 294 | You can also include character ranges in a character alternative, by | 295 | You can also include character ranges in a character alternative, by |
| 295 | writing the starting and ending characters with a @samp{-} between them. | 296 | writing the starting and ending characters with a @samp{-} between them. |
| 296 | Thus, @samp{[a-z]} matches any lower-case @acronym{ASCII} letter. Ranges may be | 297 | Thus, @samp{[a-z]} matches any lower-case @acronym{ASCII} letter. |
| 297 | intermixed freely with individual characters, as in @samp{[a-z$%.]}, | 298 | Ranges may be intermixed freely with individual characters, as in |
| 298 | which matches any lower case @acronym{ASCII} letter or @samp{$}, @samp{%} or | 299 | @samp{[a-z$%.]}, which matches any lower case @acronym{ASCII} letter |
| 299 | period. | 300 | or @samp{$}, @samp{%} or period. |
| 300 | 301 | ||
| 301 | Note that the usual regexp special characters are not special inside a | 302 | Note that the usual regexp special characters are not special inside a |
| 302 | character alternative. A completely different set of characters is | 303 | character alternative. A completely different set of characters is |
| @@ -358,10 +359,11 @@ the handling of regexps in programs such as @code{grep}. | |||
| 358 | 359 | ||
| 359 | @item @samp{^} | 360 | @item @samp{^} |
| 360 | @cindex beginning of line in regexp | 361 | @cindex beginning of line in regexp |
| 361 | is a special character that matches the empty string, but only at the | 362 | When matching a buffer, @samp{^} matches the empty string, but only at the |
| 362 | beginning of a line in the text being matched. Otherwise it fails to | 363 | beginning of a line in the text being matched (or the beginning of the |
| 363 | match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at | 364 | accessible portion of the buffer). Otherwise it fails to match |
| 364 | the beginning of a line. | 365 | anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at the |
| 366 | beginning of a line. | ||
| 365 | 367 | ||
| 366 | When matching a string instead of a buffer, @samp{^} matches at the | 368 | When matching a string instead of a buffer, @samp{^} matches at the |
| 367 | beginning of the string or after a newline character. | 369 | beginning of the string or after a newline character. |
| @@ -372,8 +374,9 @@ beginning of the regular expression, or after @samp{\(} or @samp{\|}. | |||
| 372 | @item @samp{$} | 374 | @item @samp{$} |
| 373 | @cindex @samp{$} in regexp | 375 | @cindex @samp{$} in regexp |
| 374 | @cindex end of line in regexp | 376 | @cindex end of line in regexp |
| 375 | is similar to @samp{^} but matches only at the end of a line. Thus, | 377 | is similar to @samp{^} but matches only at the end of a line (or the |
| 376 | @samp{x+$} matches a string of one @samp{x} or more at the end of a line. | 378 | end of the accessible portion of the buffer). Thus, @samp{x+$} |
| 379 | matches a string of one @samp{x} or more at the end of a line. | ||
| 377 | 380 | ||
| 378 | When matching a string instead of a buffer, @samp{$} matches at the end | 381 | When matching a string instead of a buffer, @samp{$} matches at the end |
| 379 | of the string or before a newline character. | 382 | of the string or before a newline character. |
| @@ -542,7 +545,7 @@ purposes of an ordinary group (controlling the nesting of other | |||
| 542 | operators), but it does not get a number, so you cannot refer back to | 545 | operators), but it does not get a number, so you cannot refer back to |
| 543 | its value with @samp{\@var{digit}}. | 546 | its value with @samp{\@var{digit}}. |
| 544 | 547 | ||
| 545 | Shy groups are particulary useful for mechanically-constructed regular | 548 | Shy groups are particularly useful for mechanically-constructed regular |
| 546 | expressions because they can be added automatically without altering the | 549 | expressions because they can be added automatically without altering the |
| 547 | numbering of any ordinary, non-shy groups. | 550 | numbering of any ordinary, non-shy groups. |
| 548 | 551 | ||
| @@ -567,6 +570,10 @@ composed of two identical halves. The @samp{\(.*\)} matches the first | |||
| 567 | half, which may be anything, but the @samp{\1} that follows must match | 570 | half, which may be anything, but the @samp{\1} that follows must match |
| 568 | the same exact text. | 571 | the same exact text. |
| 569 | 572 | ||
| 573 | If a @samp{\( @dots{} \)} construct matches more than once (which can | ||
| 574 | happen, for instance, if it is followed by @samp{*}), only the last | ||
| 575 | match is recorded. | ||
| 576 | |||
| 570 | If a particular grouping construct in the regular expression was never | 577 | If a particular grouping construct in the regular expression was never |
| 571 | matched---for instance, if it appears inside of an alternative that | 578 | matched---for instance, if it appears inside of an alternative that |
| 572 | wasn't used, or inside of a repetition that repeated zero times---then | 579 | wasn't used, or inside of a repetition that repeated zero times---then |
| @@ -611,7 +618,9 @@ matches any character whose category is not @var{c}. | |||
| 611 | 618 | ||
| 612 | The following regular expression constructs match the empty string---that is, | 619 | The following regular expression constructs match the empty string---that is, |
| 613 | they don't use up any characters---but whether they match depends on the | 620 | they don't use up any characters---but whether they match depends on the |
| 614 | context. | 621 | context. For all, the beginning and end of the accessible portion of |
| 622 | the buffer are treated as if they were the actual beginning and end of | ||
| 623 | the buffer. | ||
| 615 | 624 | ||
| 616 | @table @samp | 625 | @table @samp |
| 617 | @item \` | 626 | @item \` |
| @@ -636,25 +645,25 @@ end of a word. Thus, @samp{\bfoo\b} matches any occurrence of | |||
| 636 | @samp{foo} as a separate word. @samp{\bballs?\b} matches | 645 | @samp{foo} as a separate word. @samp{\bballs?\b} matches |
| 637 | @samp{ball} or @samp{balls} as a separate word.@refill | 646 | @samp{ball} or @samp{balls} as a separate word.@refill |
| 638 | 647 | ||
| 639 | @samp{\b} matches at the beginning or end of the buffer | 648 | @samp{\b} matches at the beginning or end of the buffer (or string) |
| 640 | regardless of what text appears next to it. | 649 | regardless of what text appears next to it. |
| 641 | 650 | ||
| 642 | @item \B | 651 | @item \B |
| 643 | @cindex @samp{\B} in regexp | 652 | @cindex @samp{\B} in regexp |
| 644 | matches the empty string, but @emph{not} at the beginning or | 653 | matches the empty string, but @emph{not} at the beginning or |
| 645 | end of a word. | 654 | end of a word, nor at the beginning or end of the buffer (or string). |
| 646 | 655 | ||
| 647 | @item \< | 656 | @item \< |
| 648 | @cindex @samp{\<} in regexp | 657 | @cindex @samp{\<} in regexp |
| 649 | matches the empty string, but only at the beginning of a word. | 658 | matches the empty string, but only at the beginning of a word. |
| 650 | @samp{\<} matches at the beginning of the buffer only if a | 659 | @samp{\<} matches at the beginning of the buffer (or string) only if a |
| 651 | word-constituent character follows. | 660 | word-constituent character follows. |
| 652 | 661 | ||
| 653 | @item \> | 662 | @item \> |
| 654 | @cindex @samp{\>} in regexp | 663 | @cindex @samp{\>} in regexp |
| 655 | matches the empty string, but only at the end of a word. @samp{\>} | 664 | matches the empty string, but only at the end of a word. @samp{\>} |
| 656 | matches at the end of the buffer only if the contents end with a | 665 | matches at the end of the buffer (or string) only if the contents end |
| 657 | word-constituent character. | 666 | with a word-constituent character. |
| 658 | @end table | 667 | @end table |
| 659 | 668 | ||
| 660 | @kindex invalid-regexp | 669 | @kindex invalid-regexp |
| @@ -668,9 +677,11 @@ an @code{invalid-regexp} error is signaled. | |||
| 668 | @comment node-name, next, previous, up | 677 | @comment node-name, next, previous, up |
| 669 | @subsection Complex Regexp Example | 678 | @subsection Complex Regexp Example |
| 670 | 679 | ||
| 671 | Here is a complicated regexp, used by Emacs to recognize the end of a | 680 | Here is a complicated regexp which was formerly used by Emacs to |
| 672 | sentence together with any whitespace that follows. It is the value of | 681 | recognize the end of a sentence together with any whitespace that |
| 673 | the variable @code{sentence-end}. | 682 | follows. It was used as the variable @code{sentence-end}. (Its value |
| 683 | nowadays contains alternatives for @samp{.}, @samp{?} and @samp{!} in | ||
| 684 | other character sets.) | ||
| 674 | 685 | ||
| 675 | First, we show the regexp as a string in Lisp syntax to distinguish | 686 | First, we show the regexp as a string in Lisp syntax to distinguish |
| 676 | spaces from tab characters. The string constant begins and ends with a | 687 | spaces from tab characters. The string constant begins and ends with a |
| @@ -679,17 +690,16 @@ string, @samp{\\} for a backslash as part of the string, @samp{\t} for a | |||
| 679 | tab and @samp{\n} for a newline. | 690 | tab and @samp{\n} for a newline. |
| 680 | 691 | ||
| 681 | @example | 692 | @example |
| 682 | "[.?!][]\"')@}]*\\($\\| $\\|\t\\| \\)[ \t\n]*" | 693 | "[.?!][]\"')@}]*\\($\\| $\\|\t\\|@ @ \\)[ \t\n]*" |
| 683 | @end example | 694 | @end example |
| 684 | 695 | ||
| 685 | @noindent | 696 | @noindent |
| 686 | In contrast, if you evaluate the variable @code{sentence-end}, you | 697 | In contrast, if you evaluate this string, you will see the following: |
| 687 | will see the following: | ||
| 688 | 698 | ||
| 689 | @example | 699 | @example |
| 690 | @group | 700 | @group |
| 691 | sentence-end | 701 | "[.?!][]\"')@}]*\\($\\| $\\|\t\\|@ @ \\)[ \t\n]*" |
| 692 | @result{} "[.?!][]\"')@}]*\\($\\| $\\| \\| \\)[ | 702 | @result{} "[.?!][]\"')@}]*\\($\\| $\\| \\|@ @ \\)[ |
| 693 | ]*" | 703 | ]*" |
| 694 | @end group | 704 | @end group |
| 695 | @end example | 705 | @end example |
| @@ -704,7 +714,10 @@ deciphered as follows: | |||
| 704 | @item [.?!] | 714 | @item [.?!] |
| 705 | The first part of the pattern is a character alternative that matches | 715 | The first part of the pattern is a character alternative that matches |
| 706 | any one of three characters: period, question mark, and exclamation | 716 | any one of three characters: period, question mark, and exclamation |
| 707 | mark. The match must begin with one of these three characters. | 717 | mark. The match must begin with one of these three characters. (This |
| 718 | is the one point where the new value of @code{sentence-end} differs | ||
| 719 | from the old. The new value also lists sentence ending | ||
| 720 | non-@acronym{ASCII} characters.) | ||
| 708 | 721 | ||
| 709 | @item []\"')@}]* | 722 | @item []\"')@}]* |
| 710 | The second part of the pattern matches any closing braces and quotation | 723 | The second part of the pattern matches any closing braces and quotation |
| @@ -764,13 +777,14 @@ whitespace: | |||
| 764 | 777 | ||
| 765 | @defun regexp-opt strings &optional paren | 778 | @defun regexp-opt strings &optional paren |
| 766 | This function returns an efficient regular expression that will match | 779 | This function returns an efficient regular expression that will match |
| 767 | any of the strings @var{strings}. This is useful when you need to make | 780 | any of the strings in the list @var{strings}. This is useful when you |
| 768 | matching or searching as fast as possible---for example, for Font Lock | 781 | need to make matching or searching as fast as possible---for example, |
| 769 | mode. | 782 | for Font Lock mode. |
| 770 | 783 | ||
| 771 | If the optional argument @var{paren} is non-@code{nil}, then the | 784 | If the optional argument @var{paren} is non-@code{nil}, then the |
| 772 | returned regular expression is always enclosed by at least one | 785 | returned regular expression is always enclosed by at least one |
| 773 | parentheses-grouping construct. | 786 | parentheses-grouping construct. If @var{paren} is @code{words}, then |
| 787 | that construct is additionally surrounded by @samp{\<} and @samp{\>}. | ||
| 774 | 788 | ||
| 775 | This simplified definition of @code{regexp-opt} produces a | 789 | This simplified definition of @code{regexp-opt} produces a |
| 776 | regular expression which is equivalent to the actual value | 790 | regular expression which is equivalent to the actual value |
| @@ -788,7 +802,8 @@ regular expression which is equivalent to the actual value | |||
| 788 | 802 | ||
| 789 | @defun regexp-opt-depth regexp | 803 | @defun regexp-opt-depth regexp |
| 790 | This function returns the total number of grouping constructs | 804 | This function returns the total number of grouping constructs |
| 791 | (parenthesized expressions) in @var{regexp}. | 805 | (parenthesized expressions) in @var{regexp}. (This does not include |
| 806 | shy groups.) | ||
| 792 | @end defun | 807 | @end defun |
| 793 | 808 | ||
| 794 | @node Regexp Search | 809 | @node Regexp Search |
| @@ -830,7 +845,7 @@ error is signaled. If @var{noerror} is @code{t}, | |||
| 830 | @code{re-search-forward} does nothing and returns @code{nil}. If | 845 | @code{re-search-forward} does nothing and returns @code{nil}. If |
| 831 | @var{noerror} is neither @code{nil} nor @code{t}, then | 846 | @var{noerror} is neither @code{nil} nor @code{t}, then |
| 832 | @code{re-search-forward} moves point to @var{limit} (or the end of the | 847 | @code{re-search-forward} moves point to @var{limit} (or the end of the |
| 833 | buffer) and returns @code{nil}. | 848 | accessible portion of the buffer) and returns @code{nil}. |
| 834 | 849 | ||
| 835 | In the following example, point is initially before the @samp{T}. | 850 | In the following example, point is initially before the @samp{T}. |
| 836 | Evaluating the search call moves point to the end of that line (between | 851 | Evaluating the search call moves point to the end of that line (between |
| @@ -866,9 +881,10 @@ simple mirror images. @code{re-search-forward} finds the match whose | |||
| 866 | beginning is as close as possible to the starting point. If | 881 | beginning is as close as possible to the starting point. If |
| 867 | @code{re-search-backward} were a perfect mirror image, it would find the | 882 | @code{re-search-backward} were a perfect mirror image, it would find the |
| 868 | match whose end is as close as possible. However, in fact it finds the | 883 | match whose end is as close as possible. However, in fact it finds the |
| 869 | match whose beginning is as close as possible. The reason for this is that | 884 | match whose beginning is as close as possible (and yet ends before the |
| 870 | matching a regular expression at a given spot always works from | 885 | starting point). The reason for this is that matching a regular |
| 871 | beginning to end, and starts at a specified beginning position. | 886 | expression at a given spot always works from beginning to end, and |
| 887 | starts at a specified beginning position. | ||
| 872 | 888 | ||
| 873 | A true mirror-image of @code{re-search-forward} would require a special | 889 | A true mirror-image of @code{re-search-forward} would require a special |
| 874 | feature for matching regular expressions from end to beginning. It's | 890 | feature for matching regular expressions from end to beginning. It's |
| @@ -1069,7 +1085,8 @@ This function is the guts of @code{query-replace} and related | |||
| 1069 | commands. It searches for occurrences of @var{from-string} in the | 1085 | commands. It searches for occurrences of @var{from-string} in the |
| 1070 | text between positions @var{start} and @var{end} and replaces some or | 1086 | text between positions @var{start} and @var{end} and replaces some or |
| 1071 | all of them. If @var{start} is @code{nil} (or omitted), point is used | 1087 | all of them. If @var{start} is @code{nil} (or omitted), point is used |
| 1072 | instead, and the buffer's end is used for @var{end}. | 1088 | instead, and the end of the buffer's accessible portion is used for |
| 1089 | @var{end}. | ||
| 1073 | 1090 | ||
| 1074 | If @var{query-flag} is @code{nil}, it replaces all | 1091 | If @var{query-flag} is @code{nil}, it replaces all |
| 1075 | occurrences; otherwise, it asks the user what to do about each one. | 1092 | occurrences; otherwise, it asks the user what to do about each one. |
| @@ -1090,7 +1107,7 @@ get the replacement text. This function is called with two arguments: | |||
| 1090 | 1107 | ||
| 1091 | If @var{repeat-count} is non-@code{nil}, it should be an integer. Then | 1108 | If @var{repeat-count} is non-@code{nil}, it should be an integer. Then |
| 1092 | it specifies how many times to use each of the strings in the | 1109 | it specifies how many times to use each of the strings in the |
| 1093 | @var{replacements} list before advancing cyclicly to the next one. | 1110 | @var{replacements} list before advancing cyclically to the next one. |
| 1094 | 1111 | ||
| 1095 | If @var{from-string} contains upper-case letters, then | 1112 | If @var{from-string} contains upper-case letters, then |
| 1096 | @code{perform-replace} binds @code{case-fold-search} to @code{nil}, and | 1113 | @code{perform-replace} binds @code{case-fold-search} to @code{nil}, and |
| @@ -1099,6 +1116,22 @@ it uses the @code{replacements} without altering the case of them. | |||
| 1099 | Normally, the keymap @code{query-replace-map} defines the possible user | 1116 | Normally, the keymap @code{query-replace-map} defines the possible user |
| 1100 | responses for queries. The argument @var{map}, if non-@code{nil}, is a | 1117 | responses for queries. The argument @var{map}, if non-@code{nil}, is a |
| 1101 | keymap to use instead of @code{query-replace-map}. | 1118 | keymap to use instead of @code{query-replace-map}. |
| 1119 | |||
| 1120 | @strong{Usage note:} Do not use this function in your own programs | ||
| 1121 | unless you want to do something very similar to what | ||
| 1122 | @code{query-replace} does, including setting the mark and possibly | ||
| 1123 | querying the user. For most purposes a simple loop like, for | ||
| 1124 | instance: | ||
| 1125 | |||
| 1126 | @example | ||
| 1127 | (while (re-search-forward "foo[ \t]+bar" nil t) | ||
| 1128 | (replace-match "foobar")) | ||
| 1129 | @end example | ||
| 1130 | |||
| 1131 | @noindent | ||
| 1132 | is preferable. It runs faster and avoids side effects, such as | ||
| 1133 | setting the mark. @xref{Replacing Match,, Replacing the Text that | ||
| 1134 | Matched}, for a description of @code{replace-match}. | ||
| 1102 | @end defun | 1135 | @end defun |
| 1103 | 1136 | ||
| 1104 | @defvar query-replace-map | 1137 | @defvar query-replace-map |
| @@ -1205,9 +1238,11 @@ was matched by the last search. It replaces that text with | |||
| 1205 | @var{replacement}. | 1238 | @var{replacement}. |
| 1206 | 1239 | ||
| 1207 | If you did the last search in a buffer, you should specify @code{nil} | 1240 | If you did the last search in a buffer, you should specify @code{nil} |
| 1208 | for @var{string}. Then @code{replace-match} does the replacement by | 1241 | for @var{string} and make sure that the current buffer when you call |
| 1209 | editing the buffer; it leaves point at the end of the replacement text, | 1242 | @code{replace-match} is the one in which you did the searching or |
| 1210 | and returns @code{t}. | 1243 | matching. Then @code{replace-match} does the replacement by editing |
| 1244 | the buffer; it leaves point at the end of the replacement text, and | ||
| 1245 | returns @code{t}. | ||
| 1211 | 1246 | ||
| 1212 | If you did the search in a string, pass the same string as @var{string}. | 1247 | If you did the search in a string, pass the same string as @var{string}. |
| 1213 | Then @code{replace-match} does the replacement by constructing and | 1248 | Then @code{replace-match} does the replacement by constructing and |
| @@ -1239,6 +1274,7 @@ part of one of the following sequences: | |||
| 1239 | @samp{\@var{n}}, where @var{n} is a digit, stands for the text that | 1274 | @samp{\@var{n}}, where @var{n} is a digit, stands for the text that |
| 1240 | matched the @var{n}th subexpression in the original regexp. | 1275 | matched the @var{n}th subexpression in the original regexp. |
| 1241 | Subexpressions are those expressions grouped inside @samp{\(@dots{}\)}. | 1276 | Subexpressions are those expressions grouped inside @samp{\(@dots{}\)}. |
| 1277 | If the @var{n}th subexpression never matched, an empty string is substituted. | ||
| 1242 | 1278 | ||
| 1243 | @item @samp{\\} | 1279 | @item @samp{\\} |
| 1244 | @cindex @samp{\} in replacement | 1280 | @cindex @samp{\} in replacement |
| @@ -1396,7 +1432,7 @@ character of the buffer counts as 1.) | |||
| 1396 | The functions @code{match-data} and @code{set-match-data} read or | 1432 | The functions @code{match-data} and @code{set-match-data} read or |
| 1397 | write the entire match data, all at once. | 1433 | write the entire match data, all at once. |
| 1398 | 1434 | ||
| 1399 | @defun match-data | 1435 | @defun match-data &optional integers reuse |
| 1400 | This function returns a newly constructed list containing all the | 1436 | This function returns a newly constructed list containing all the |
| 1401 | information on what text the last search matched. Element zero is the | 1437 | information on what text the last search matched. Element zero is the |
| 1402 | position of the beginning of the match for the whole expression; element | 1438 | position of the beginning of the match for the whole expression; element |
| @@ -1420,8 +1456,20 @@ number {\mathsurround=0pt $2n+1$} | |||
| 1420 | corresponds to @code{(match-end @var{n})}. | 1456 | corresponds to @code{(match-end @var{n})}. |
| 1421 | 1457 | ||
| 1422 | All the elements are markers or @code{nil} if matching was done on a | 1458 | All the elements are markers or @code{nil} if matching was done on a |
| 1423 | buffer, and all are integers or @code{nil} if matching was done on a | 1459 | buffer and all are integers or @code{nil} if matching was done on a |
| 1424 | string with @code{string-match}. | 1460 | string with @code{string-match}. If @var{integers} is |
| 1461 | non-@code{nil}, then all elements are integers or @code{nil}, even if | ||
| 1462 | matching was done on a buffer. Also, @code{match-beginning} and | ||
| 1463 | @code{match-end} always return integers or @code{nil}. | ||
| 1464 | |||
| 1465 | If @var{reuse} is non-@code{nil}, it should be a list. In that case, | ||
| 1466 | @code{match-data} stores the match data in @var{reuse}. That is, | ||
| 1467 | @var{reuse} is destructively modified. @var{reuse} does not need to | ||
| 1468 | have the right length. If it is not long enough to contain the match | ||
| 1469 | data, it is extended. If it is too long, the length of @var{reuse} | ||
| 1470 | stays the same, but the elements that were not used are set to | ||
| 1471 | @code{nil}. The purpose of this feature is to avoid producing too | ||
| 1472 | much garbage, that would later have to be collected. | ||
| 1425 | 1473 | ||
| 1426 | As always, there must be no possibility of intervening searches between | 1474 | As always, there must be no possibility of intervening searches between |
| 1427 | the call to a search function and the call to @code{match-data} that is | 1475 | the call to a search function and the call to @code{match-data} that is |
| @@ -1474,7 +1522,8 @@ that shows the problem that arises if you fail to save the match data: | |||
| 1474 | 1522 | ||
| 1475 | @defmac save-match-data body@dots{} | 1523 | @defmac save-match-data body@dots{} |
| 1476 | This macro executes @var{body}, saving and restoring the match | 1524 | This macro executes @var{body}, saving and restoring the match |
| 1477 | data around it. | 1525 | data around it. The return value is the value of the last form in |
| 1526 | @var{body}. | ||
| 1478 | @end defmac | 1527 | @end defmac |
| 1479 | 1528 | ||
| 1480 | You could use @code{set-match-data} together with @code{match-data} to | 1529 | You could use @code{set-match-data} together with @code{match-data} to |
| @@ -1544,10 +1593,11 @@ for an upper case letter only. But this has nothing to do with the | |||
| 1544 | searching functions used in Lisp code. | 1593 | searching functions used in Lisp code. |
| 1545 | 1594 | ||
| 1546 | @defopt case-replace | 1595 | @defopt case-replace |
| 1547 | This variable determines whether the replacement functions should | 1596 | This variable determines whether the higher level replacement |
| 1548 | preserve case. If the variable is @code{nil}, that means to use the | 1597 | functions should preserve case. If the variable is @code{nil}, that |
| 1549 | replacement text verbatim. A non-@code{nil} value means to convert the | 1598 | means to use the replacement text verbatim. A non-@code{nil} value |
| 1550 | case of the replacement text according to the text being replaced. | 1599 | means to convert the case of the replacement text according to the |
| 1600 | text being replaced. | ||
| 1551 | 1601 | ||
| 1552 | This variable is used by passing it as an argument to the function | 1602 | This variable is used by passing it as an argument to the function |
| 1553 | @code{replace-match}. @xref{Replacing Match}. | 1603 | @code{replace-match}. @xref{Replacing Match}. |
| @@ -1600,22 +1650,23 @@ spaces, tabs, and form feeds (after its left margin). | |||
| 1600 | @defvar paragraph-start | 1650 | @defvar paragraph-start |
| 1601 | This is the regular expression for recognizing the beginning of a line | 1651 | This is the regular expression for recognizing the beginning of a line |
| 1602 | that starts @emph{or} separates paragraphs. The default value is | 1652 | that starts @emph{or} separates paragraphs. The default value is |
| 1603 | @w{@code{"[@ \t\n\f]"}}, which matches a line starting with a space, tab, | 1653 | @w{@code{"\f\\|[ \t]*$"}}, which matches a line containing only |
| 1604 | newline, or form feed (after its left margin). | 1654 | whitespace or starting with a form feed (after its left margin). |
| 1605 | @end defvar | 1655 | @end defvar |
| 1606 | 1656 | ||
| 1607 | @defvar sentence-end | 1657 | @defvar sentence-end |
| 1608 | This is the regular expression describing the end of a sentence. (All | 1658 | This is the regular expression describing the end of a sentence. (All |
| 1609 | paragraph boundaries also end sentences, regardless.) The default value | 1659 | paragraph boundaries also end sentences, regardless.) The (slightly |
| 1610 | is: | 1660 | simplified) default value is: |
| 1611 | 1661 | ||
| 1612 | @example | 1662 | @example |
| 1613 | "[.?!][]\"')@}]*\\($\\| $\\|\t\\| \\)[ \t\n]*" | 1663 | "[.?!][]\"')@}]*\\($\\| $\\|\t\\|@ @ \\)[ \t\n]*" |
| 1614 | @end example | 1664 | @end example |
| 1615 | 1665 | ||
| 1616 | This means a period, question mark or exclamation mark, followed | 1666 | This means a period, question mark or exclamation mark (the actual |
| 1617 | optionally by a closing parenthetical character, followed by tabs, | 1667 | default value also lists their alternatives in other character sets), |
| 1618 | spaces or new lines. | 1668 | followed optionally by a closing parenthetical character, followed by |
| 1669 | tabs, spaces or new lines. | ||
| 1619 | 1670 | ||
| 1620 | For a detailed explanation of this regular expression, see @ref{Regexp | 1671 | For a detailed explanation of this regular expression, see @ref{Regexp |
| 1621 | Example}. | 1672 | Example}. |
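
Two of the sentences added in the hunks above describe behavior that is easy to check by hand: a `\( ... \)` group that matches more than once records only its last match, and `\N` in a replacement stands for the empty string when the Nth subexpression never matched. A minimal sketch, assuming string matching with `string-match`; the function names `demo-last-repetition` and `demo-unmatched-group-replacement` are hypothetical and exist only for this example.

```elisp
;; Hypothetical helper names; `string-match', `match-string' and
;; `replace-match' are the Emacs functions being illustrated.

(defun demo-last-repetition ()
  "Return the text recorded for group 1 after it matched several times."
  (let ((s "abc"))
    (string-match "\\([a-z]\\)*" s)
    (match-string 1 s)))

(defun demo-unmatched-group-replacement ()
  "Replace a match whose optional group 1 took no part in it."
  (let ((s "foo"))
    (string-match "\\(x\\)?foo" s)
    ;; FIXEDCASE is t so the replacement text is used as written.
    (replace-match "[\\1]" t nil s)))
```

The first function should return only the final letter of the sample string, since earlier repetitions of the group are overwritten. The second should return the brackets with nothing between them.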