| author | Dave Love | 2000-10-13 16:36:35 +0000 |
|---|---|---|
| committer | Dave Love | 2000-10-13 16:36:35 +0000 |
| commit | 6cc089d2ade30c1d8dfc71d2d239b99a056834cd (patch) | |
| tree | 6954f4f2c75e1e65750fc712e76255a7f9b37875 | |
| parent | 40ad3db491f57a70e68c572594bee3d26efefb1a (diff) | |
Non-ASCII in regexp ranges.
| -rw-r--r-- | lispref/searching.texi | 15 |
1 files changed, 11 insertions, 4 deletions
diff --git a/lispref/searching.texi b/lispref/searching.texi
index 0b54fcd2fe8..7274209adb7 100644
--- a/lispref/searching.texi
+++ b/lispref/searching.texi
@@ -311,10 +311,17 @@ matches both @samp{]} and @samp{-}.
 To include @samp{^} in a character alternative, put it anywhere but at
 the beginning.
 
-The beginning and end of a range must be in the same character set
-(@pxref{Character Sets}). Thus, @samp{[a-\x8e0]} is invalid because
-@samp{a} is in the @sc{ascii} character set but the character 0x8e0
-(@samp{a} with grave accent) is in the Emacs character set for Latin-1.
+The beginning and end of a range of multibyte characters must be in the
+same character set (@pxref{Character Sets}). Thus, @samp{[\x8e0-\x97c]}
+is invalid because character 0x8e0 (@samp{a} with grave accent) is in
+the Emacs character set for Latin-1 but the character 0x97c (@samp{u}
+with diaeresis) is in the Emacs character set for Latin-2.
+
+If a range starts with a unibyte character @var{c} and ends with a
+multibyte character @var{c2}, the range is divided into two parts: one
+is @samp{@var{c}..?\377}, the other is @samp{@var{c1}..@var{c2}}, where
+@var{c1} is the first character of the charset to which @var{c2}
+belongs.
 
 You cannot always match all non-@sc{ascii} characters with the regular
 expression @samp{[\200-\377]}. This works when searching a unibyte