| author | Luc Teirlinck | 2004-10-09 18:35:38 +0000 |
|---|---|---|
| committer | Luc Teirlinck | 2004-10-09 18:35:38 +0000 |
| commit | 45fa30b2ffeff0b52fab027547d0c48510b3a9c0 (patch) | |
| tree | c0c5a0223a3181eba890843c99c80d6173babb30 | |
| parent | 6cae76c29157856e6392ecc47764fe6eab6f2e1c (diff) | |
(Regexp Example): Update description of how Emacs currently recognizes
the end of a sentence.
(Standard Regexps): Update definition of the variable `sentence-end'.
Add definition of the function `sentence-end'.
| -rw-r--r-- | lispref/searching.texi | 46 |
1 files changed, 24 insertions, 22 deletions
diff --git a/lispref/searching.texi b/lispref/searching.texi
index 93a152fbbe1..ee6cb06b1e1 100644
--- a/lispref/searching.texi
+++ b/lispref/searching.texi
| @@ -1,6 +1,6 @@ | |||
| 1 | @c -*-texinfo-*- | 1 | @c -*-texinfo-*- |
| 2 | @c This is part of the GNU Emacs Lisp Reference Manual. | 2 | @c This is part of the GNU Emacs Lisp Reference Manual. |
| 3 | @c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999 | 3 | @c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999, 2004 |
| 4 | @c Free Software Foundation, Inc. | 4 | @c Free Software Foundation, Inc. |
| 5 | @c See the file elisp.texi for copying conditions. | 5 | @c See the file elisp.texi for copying conditions. |
| 6 | @setfilename ../info/searching | 6 | @setfilename ../info/searching |
| @@ -694,9 +694,9 @@ an @code{invalid-regexp} error is signaled. | |||
| 694 | 694 | ||
| 695 | Here is a complicated regexp which was formerly used by Emacs to | 695 | Here is a complicated regexp which was formerly used by Emacs to |
| 696 | recognize the end of a sentence together with any whitespace that | 696 | recognize the end of a sentence together with any whitespace that |
| 697 | follows. It was used as the variable @code{sentence-end}. (Its value | 697 | follows. (Nowadays Emacs uses a similar but more complex default |
| 698 | nowadays contains alternatives for @samp{.}, @samp{?} and @samp{!} in | 698 | regexp constructed by the function @code{sentence-end}. |
| 699 | other character sets.) | 699 | @xref{Standard Regexps}.) |
| 700 | 700 | ||
| 701 | First, we show the regexp as a string in Lisp syntax to distinguish | 701 | First, we show the regexp as a string in Lisp syntax to distinguish |
| 702 | spaces from tab characters. The string constant begins and ends with a | 702 | spaces from tab characters. The string constant begins and ends with a |
| @@ -730,9 +730,9 @@ deciphered as follows: | |||
| 730 | The first part of the pattern is a character alternative that matches | 730 | The first part of the pattern is a character alternative that matches |
| 731 | any one of three characters: period, question mark, and exclamation | 731 | any one of three characters: period, question mark, and exclamation |
| 732 | mark. The match must begin with one of these three characters. (This | 732 | mark. The match must begin with one of these three characters. (This |
| 733 | is the one point where the new value of @code{sentence-end} differs | 733 | is one point where the new default regexp used by Emacs differs from |
| 734 | from the old. The new value also lists sentence ending | 734 | the old. The new value also allows some non-@acronym{ASCII} |
| 735 | non-@acronym{ASCII} characters.) | 735 | characters that end a sentence without any following whitespace.) |
| 736 | 736 | ||
| 737 | @item []\"')@}]* | 737 | @item []\"')@}]* |
| 738 | The second part of the pattern matches any closing braces and quotation | 738 | The second part of the pattern matches any closing braces and quotation |
| @@ -1698,23 +1698,25 @@ whitespace or starting with a form feed (after its left margin). | |||
| 1698 | @end defvar | 1698 | @end defvar |
| 1699 | 1699 | ||
| 1700 | @defvar sentence-end | 1700 | @defvar sentence-end |
| 1701 | This is the regular expression describing the end of a sentence. (All | 1701 | If non-@code{nil}, the value should be a regular expression describing |
| 1702 | paragraph boundaries also end sentences, regardless.) The (slightly | 1702 | the end of a sentence, including the whitespace following the |
| 1703 | simplified) default value is: | 1703 | sentence. (All paragraph boundaries also end sentences, regardless.) |
| 1704 | 1704 | ||
| 1705 | @example | 1705 | If the value is @code{nil}, the default, then the function |
| 1706 | "[.?!][]\"')@}]*\\($\\| $\\|\t\\|@ @ \\)[ \t\n]*" | 1706 | @code{sentence-end} has to construct the regexp. That is why you |
| 1707 | @end example | 1707 | should always call the function @code{sentence-end} to obtain the |
| 1708 | 1708 | regexp to be used to recognize the end of a sentence. | |
| 1709 | This means a period, question mark or exclamation mark (the actual | ||
| 1710 | default value also lists their alternatives in other character sets), | ||
| 1711 | followed optionally by closing parenthetical characters, followed by | ||
| 1712 | tabs, spaces or new lines. | ||
| 1713 | |||
| 1714 | For a detailed explanation of this regular expression, see @ref{Regexp | ||
| 1715 | Example}. | ||
| 1716 | @end defvar | 1709 | @end defvar |
| 1717 | 1710 | ||
| 1711 | @defun sentence-end | ||
| 1712 | This function returns the value of the variable @code{sentence-end}, | ||
| 1713 | if non-@code{nil}. Otherwise it returns a default value based on the | ||
| 1714 | values of the variables @code{sentence-end-double-space} | ||
| 1715 | (@pxref{Definition of sentence-end-double-space}), | ||
| 1716 | @code{sentence-end-without-period} and | ||
| 1717 | @code{sentence-end-without-space}. | ||
| 1718 | @end defun | ||
| 1719 | |||
| 1718 | @ignore | 1720 | @ignore |
| 1719 | arch-tag: c2573ca2-18aa-4839-93b8-924043ef831f | 1721 | arch-tag: c2573ca2-18aa-4839-93b8-924043ef831f |
| 1720 | @end ignore | 1722 | @end ignore |
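
The new text says to call the function `sentence-end` rather than read the variable directly, because the variable may be `nil`. A minimal Emacs Lisp sketch of that advice follows. The helper name `my-end-of-first-sentence` is hypothetical and not part of this commit. The sketch binds the variable `sentence-end` to `nil` and `sentence-end-double-space` to `t` only to make the behavior predictable, assuming the other sentence-related variables keep their defaults.

```elisp
;; Hypothetical helper, for illustration only (not part of Emacs).
(defun my-end-of-first-sentence (text)
  "Return the index just past the first sentence end in TEXT, or nil.
The match includes the whitespace that follows the sentence."
  ;; Assumption: bind the variable `sentence-end' to nil so that the
  ;; function `sentence-end' has to construct the regexp itself, and
  ;; require two spaces after the sentence-ending punctuation.
  (let ((sentence-end nil)
        (sentence-end-double-space t))
    ;; Call the function, never the variable, to obtain the regexp.
    (when (string-match (sentence-end) text)
      (match-end 0))))

;; Usage: the match covers the period and both following spaces.
(my-end-of-first-sentence "Hello there.  How are you?")
```

Code that reads the variable directly would pass `nil` to `string-match` in the default configuration and signal an error, which is why the sketch always goes through the function.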