| author | Paul Eggert | 2019-04-02 15:00:59 -0700 |
|---|---|---|
| committer | Paul Eggert | 2019-04-02 15:01:34 -0700 |
| commit | f9ff60e0d7288e30cdbd1e43225059f1374441f1 (patch) | |
| tree | 0e7e37a750e55adc0f959ca372369f4aa81cd3c2 | |
| parent | bb669166ba6b33cd1a927c772c87ee2240a10f89 (diff) | |
| download | emacs-f9ff60e0d7288e30cdbd1e43225059f1374441f1.tar.gz emacs-f9ff60e0d7288e30cdbd1e43225059f1374441f1.zip | |
Improve regexp advice again, and unchain ranges
* doc/lispref/searching.texi (Regexp Special):
Mention char classes earlier, in a more-logical place.
Advise sticking to ASCII letters and digits in ranges.
Reword negative advice to make it clearer that it’s negative.
* lisp/files.el (make-auto-save-file-name):
* lisp/gnus/message.el (message-mailer-swallows-blank-line):
* lisp/gnus/nndoc.el (nndoc-lanl-gov-announce-type-p)
(nndoc-generate-lanl-gov-head):
* lisp/org/org-eshell.el (org-eshell-open):
* lisp/org/org.el (org-deadline-time-hour-regexp)
(org-scheduled-time-hour-regexp):
* lisp/progmodes/bat-mode.el (bat-font-lock-keywords):
* lisp/progmodes/bug-reference.el (bug-reference-bug-regexp):
* lisp/textmodes/less-css-mode.el (less-css-font-lock-keywords):
* lisp/vc/vc-cvs.el (vc-cvs-valid-symbolic-tag-name-p):
* lisp/vc/vc-svn.el (vc-svn-valid-symbolic-tag-name-p):
Avoid attempts to chain ranges, as this can be confusing.
For example, instead of [0-9-_.], use [0-9_.-].
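
The unchained forms are meant to be easier to read, not to match differently. A minimal sketch of how one might check that for the example above, assuming a hypothetical helper `my-regexp-matching-chars` (not part of Emacs) and restricting the comparison to ASCII:

```elisp
;; Hypothetical helper, not part of Emacs: return the list of ASCII
;; characters (codes 0-127) that REGEXP matches as a whole
;; one-character string.
(defun my-regexp-matching-chars (regexp)
  (let ((case-fold-search nil)
        (chars '()))
    (dotimes (c 128)
      (when (string-match-p (concat "\\`" regexp "\\'") (string c))
        (push c chars)))
    (nreverse chars)))

;; Compare the chained-looking form with the unchained form from the
;; commit message.  A non-nil result means both match the same ASCII
;; characters.
(equal (my-regexp-matching-chars "[0-9-_.]")
       (my-regexp-matching-chars "[0-9_.-]"))
```

The same comparison could be repeated for each regexp pair touched by this commit; the check covers ASCII only, which is an assumption of the sketch rather than a property of the patch.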
| -rw-r--r-- | doc/lispref/searching.texi | 52 |
| -rw-r--r-- | lisp/files.el | 2 |
| -rw-r--r-- | lisp/gnus/message.el | 2 |
| -rw-r--r-- | lisp/gnus/nndoc.el | 4 |
| -rw-r--r-- | lisp/org/org-eshell.el | 2 |
| -rw-r--r-- | lisp/org/org.el | 4 |
| -rw-r--r-- | lisp/progmodes/bat-mode.el | 2 |
| -rw-r--r-- | lisp/progmodes/bug-reference.el | 2 |
| -rw-r--r-- | lisp/textmodes/less-css-mode.el | 4 |
| -rw-r--r-- | lisp/vc/vc-cvs.el | 2 |
| -rw-r--r-- | lisp/vc/vc-svn.el | 2 |
11 files changed, 45 insertions, 33 deletions
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 72ee9233a3c..8775254dd07 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
| @@ -395,9 +395,18 @@ or @samp{$}, @samp{%} or period. However, the ending character of one | |||
| 395 | range should not be the starting point of another one; for example, | 395 | range should not be the starting point of another one; for example, |
| 396 | @samp{[a-m-z]} should be avoided. | 396 | @samp{[a-m-z]} should be avoided. |
| 397 | 397 | ||
| 398 | A character alternative can also specify named character classes | ||
| 399 | (@pxref{Char Classes}). This is a POSIX feature. For example, | ||
| 400 | @samp{[[:ascii:]]} matches any @acronym{ASCII} character. | ||
| 401 | Using a character class is equivalent to mentioning each of the | ||
| 402 | characters in that class; but the latter is not feasible in practice, | ||
| 403 | since some classes include thousands of different characters. | ||
| 404 | A character class should not appear as the lower or upper bound | ||
| 405 | of a range. | ||
| 406 | |||
| 398 | The usual regexp special characters are not special inside a | 407 | The usual regexp special characters are not special inside a |
| 399 | character alternative. A completely different set of characters is | 408 | character alternative. A completely different set of characters is |
| 400 | special inside character alternatives: @samp{]}, @samp{-} and @samp{^}. | 409 | special: @samp{]}, @samp{-} and @samp{^}. |
| 401 | To include @samp{]} in a character alternative, put it at the | 410 | To include @samp{]} in a character alternative, put it at the |
| 402 | beginning. To include @samp{^}, put it anywhere but at the beginning. | 411 | beginning. To include @samp{^}, put it anywhere but at the beginning. |
| 403 | To include @samp{-}, put it at the end. Thus, @samp{[]^-]} matches | 412 | To include @samp{-}, put it at the end. Thus, @samp{[]^-]} matches |
| @@ -430,33 +439,36 @@ matches only @samp{/} rather than the likely-intended four characters. | |||
| 430 | @end enumerate | 439 | @end enumerate |
| 431 | 440 | ||
| 432 | Some kinds of character alternatives are not the best style even | 441 | Some kinds of character alternatives are not the best style even |
| 433 | though they are standardized by POSIX and are portable. They include: | 442 | though they have a well-defined meaning in Emacs. They include: |
| 434 | 443 | ||
| 435 | @enumerate | 444 | @enumerate |
| 436 | @item | 445 | @item |
| 437 | A character alternative can include duplicates. For example, | 446 | Although a range's bound can be almost any character, it is better |
| 438 | @samp{[XYa-yYb-zX]} is less clear than @samp{[XYa-z]}. | 447 | style to stay within natural sequences of ASCII letters and digits |
| 448 | because most people have not memorized character code tables. | ||
| 449 | For example, @samp{[.-9]} is less clear than @samp{[./0-9]}, | ||
| 450 | and @samp{[`-~]} is less clear than @samp{[`a-z@{|@}~]}. | ||
| 451 | Unicode character escapes can help here; for example, for most programmers | ||
| 452 | @samp{[ก-ฺ฿-๛]} is less clear than @samp{[\u0E01-\u0E3A\u0E3F-\u0E5B]}. | ||
| 439 | 453 | ||
| 440 | @item | 454 | @item |
| 441 | A range can denote just one, two, or three characters. For example, | 455 | Although a character alternative can include duplicates, it is better |
| 442 | @samp{[(-(]} is less clear than @samp{[(]}, @samp{[*-+]} is less clear | 456 | style to avoid them. For example, @samp{[XYa-yYb-zX]} is less clear |
| 443 | than @samp{[*+]}, and @samp{[*-,]} is less clear than @samp{[*+,]}. | 457 | than @samp{[XYa-z]}. |
| 444 | 458 | ||
| 445 | @item | 459 | @item |
| 446 | A @samp{-} also appear at the beginning of a character alternative, or | 460 | Although a range can denote just one, two, or three characters, it |
| 447 | as the upper bound of a range. For example, although @samp{[-a-z]} is | 461 | is simpler to list the characters. For example, |
| 448 | valid, @samp{[a-z-]} is better style; and although @samp{[!--/]} is | 462 | @samp{[a-a0]} is less clear than @samp{[a0]}, @samp{[i-j]} is less clear |
| 449 | valid, @samp{[!-,/-]} is clearer. | 463 | than @samp{[ij]}, and @samp{[i-k]} is less clear than @samp{[ijk]}. |
| 450 | @end enumerate | ||
| 451 | 464 | ||
| 452 | A character alternative can also specify named character classes | 465 | @item |
| 453 | (@pxref{Char Classes}). This is a POSIX feature. For example, | 466 | Although a @samp{-} can appear at the beginning of a character |
| 454 | @samp{[[:ascii:]]} matches any @acronym{ASCII} character. | 467 | alternative or as the upper bound of a range, it is better style to |
| 455 | Using a character class is equivalent to mentioning each of the | 468 | put @samp{-} by itself at the end of a character alternative. For |
| 456 | characters in that class; but the latter is not feasible in practice, | 469 | example, although @samp{[-a-z]} is valid, @samp{[a-z-]} is better |
| 457 | since some classes include thousands of different characters. | 470 | style; and although @samp{[*--]} is valid, @samp{[*+,-]} is clearer. |
| 458 | A character class should not appear as the lower or upper bound | 471 | @end enumerate |
| 459 | of a range. | ||
| 460 | 472 | ||
| 461 | @item @samp{[^ @dots{} ]} | 473 | @item @samp{[^ @dots{} ]} |
| 462 | @cindex @samp{^} in regexp | 474 | @cindex @samp{^} in regexp |
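
The revised manual text pairs each terse range with a more explicit spelling that is supposed to denote the same characters. A small sketch of how those pairs could be compared over ASCII, assuming a hypothetical helper `my-char-alternatives-equivalent-p` that is not part of Emacs:

```elisp
;; Hypothetical helper, not part of Emacs: non-nil if the character
;; alternatives A and B match the same set of ASCII characters
;; (codes 0-127).
(defun my-char-alternatives-equivalent-p (a b)
  (let ((case-fold-search nil)
        (same t))
    (dotimes (c 128)
      (let ((s (string c)))
        ;; Normalize both results to t or nil before comparing.
        (unless (eq (and (string-match-p a s) t)
                    (and (string-match-p b s) t))
          (setq same nil))))
    same))

;; Pairs taken from the new text of the Regexp Special node.
(my-char-alternatives-equivalent-p "[.-9]" "[./0-9]")
(my-char-alternatives-equivalent-p "[`-~]" "[`a-z{|}~]")
(my-char-alternatives-equivalent-p "[*--]" "[*+,-]")
```

Each call is expected to return non-nil if the two spellings agree on ASCII, which is the point of the style advice: the longer form costs nothing in behavior and spares the reader a lookup in a character code table.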
diff --git a/lisp/files.el b/lisp/files.el
index 77a194b085d..1dae57593a0 100644
--- a/lisp/files.el
+++ b/lisp/files.el
| @@ -6316,7 +6316,7 @@ See also `auto-save-file-name-p'." | |||
| 6316 | ;; We do this on all platforms, because even if we are not | 6316 | ;; We do this on all platforms, because even if we are not |
| 6317 | ;; running on DOS/Windows, the current directory may be on a | 6317 | ;; running on DOS/Windows, the current directory may be on a |
| 6318 | ;; mounted VFAT filesystem, such as a USB memory stick. | 6318 | ;; mounted VFAT filesystem, such as a USB memory stick. |
| 6319 | (while (string-match "[^A-Za-z0-9-_.~#+]" buffer-name limit) | 6319 | (while (string-match "[^A-Za-z0-9_.~#+-]" buffer-name limit) |
| 6320 | (let* ((character (aref buffer-name (match-beginning 0))) | 6320 | (let* ((character (aref buffer-name (match-beginning 0))) |
| 6321 | (replacement | 6321 | (replacement |
| 6322 | ;; For multibyte characters, this will produce more than | 6322 | ;; For multibyte characters, this will produce more than |
diff --git a/lisp/gnus/message.el b/lisp/gnus/message.el
index dae4b0dced6..c8b6f0ee685 100644
--- a/lisp/gnus/message.el
+++ b/lisp/gnus/message.el
| @@ -1288,7 +1288,7 @@ called and its result is inserted." | |||
| 1288 | ;; According to RFC 822 and its successors, the field name must | 1288 | ;; According to RFC 822 and its successors, the field name must |
| 1289 | ;; consist of printable US-ASCII characters other than colon, | 1289 | ;; consist of printable US-ASCII characters other than colon, |
| 1290 | ;; i.e., decimal 33-56 and 59-126. | 1290 | ;; i.e., decimal 33-56 and 59-126. |
| 1291 | '(looking-at "[ \t]\\|[][!\"#$%&'()*+,-./0-9;<=>?@A-Z\\^_`a-z{|}~]+:")) | 1291 | '(looking-at "[ \t]\\|[][!\"#$%&'()*+,./0-9;<=>?@A-Z\\^_`a-z{|}~-]+:")) |
| 1292 | "Set this non-nil if the system's mailer runs the header and body together. | 1292 | "Set this non-nil if the system's mailer runs the header and body together. |
| 1293 | \(This problem exists on Sunos 4 when sendmail is run in remote mode.) | 1293 | \(This problem exists on Sunos 4 when sendmail is run in remote mode.) |
| 1294 | The value should be an expression to test whether the problem will | 1294 | The value should be an expression to test whether the problem will |
diff --git a/lisp/gnus/nndoc.el b/lisp/gnus/nndoc.el
index 8f1217b1275..532ba11fa09 100644
--- a/lisp/gnus/nndoc.el
+++ b/lisp/gnus/nndoc.el
| @@ -701,7 +701,7 @@ from the document.") | |||
| 701 | 701 | ||
| 702 | (defun nndoc-lanl-gov-announce-type-p () | 702 | (defun nndoc-lanl-gov-announce-type-p () |
| 703 | (when (let ((case-fold-search nil)) | 703 | (when (let ((case-fold-search nil)) |
| 704 | (re-search-forward "^\\\\\\\\\n\\(Paper\\( (\\*cross-listing\\*)\\)?: [a-zA-Z-\\.]+/[0-9]+\\|arXiv:\\)" nil t)) | 704 | (re-search-forward "^\\\\\\\\\n\\(Paper\\( (\\*cross-listing\\*)\\)?: [a-zA-Z\\.-]+/[0-9]+\\|arXiv:\\)" nil t)) |
| 705 | t)) | 705 | t)) |
| 706 | 706 | ||
| 707 | (defun nndoc-transform-lanl-gov-announce (article) | 707 | (defun nndoc-transform-lanl-gov-announce (article) |
| @@ -732,7 +732,7 @@ from the document.") | |||
| 732 | (save-restriction | 732 | (save-restriction |
| 733 | (narrow-to-region (car entry) (nth 1 entry)) | 733 | (narrow-to-region (car entry) (nth 1 entry)) |
| 734 | (goto-char (point-min)) | 734 | (goto-char (point-min)) |
| 735 | (when (looking-at "^\\(Paper.*: \\|arXiv:\\)\\([0-9a-zA-Z-\\./]+\\)") | 735 | (when (looking-at "^\\(Paper.*: \\|arXiv:\\)\\([0-9a-zA-Z\\./-]+\\)") |
| 736 | (setq subject (concat " (" (match-string 2) ")")) | 736 | (setq subject (concat " (" (match-string 2) ")")) |
| 737 | (when (re-search-forward "^From: \\(.*\\)" nil t) | 737 | (when (re-search-forward "^From: \\(.*\\)" nil t) |
| 738 | (setq from (concat "<" | 738 | (setq from (concat "<" |
diff --git a/lisp/org/org-eshell.el b/lisp/org/org-eshell.el
index bb27d92e12d..2251a1b892f 100644
--- a/lisp/org/org-eshell.el
+++ b/lisp/org/org-eshell.el
| @@ -37,7 +37,7 @@ | |||
| 37 | eshell buffer) or a command line prefixed by a buffer name | 37 | eshell buffer) or a command line prefixed by a buffer name |
| 38 | followed by a colon." | 38 | followed by a colon." |
| 39 | (let* ((buffer-and-command | 39 | (let* ((buffer-and-command |
| 40 | (if (string-match "\\([A-Za-z0-9-+*]+\\):\\(.*\\)" link) | 40 | (if (string-match "\\([A-Za-z0-9+*-]+\\):\\(.*\\)" link) |
| 41 | (list (match-string 1 link) | 41 | (list (match-string 1 link) |
| 42 | (match-string 2 link)) | 42 | (match-string 2 link)) |
| 43 | (list eshell-buffer-name link))) | 43 | (list eshell-buffer-name link))) |
diff --git a/lisp/org/org.el b/lisp/org/org.el
index bf7e305b7a0..ce6dd24a83b 100644
--- a/lisp/org/org.el
+++ b/lisp/org/org.el
| @@ -430,7 +430,7 @@ Matched keyword is in group 1.") | |||
| 430 | 430 | ||
| 431 | (defconst org-deadline-time-hour-regexp | 431 | (defconst org-deadline-time-hour-regexp |
| 432 | (concat "\\<" org-deadline-string | 432 | (concat "\\<" org-deadline-string |
| 433 | " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9-+:hdwmy \t.]*\\)>") | 433 | " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9+:hdwmy \t.-]*\\)>") |
| 434 | "Matches the DEADLINE keyword together with a time-and-hour stamp.") | 434 | "Matches the DEADLINE keyword together with a time-and-hour stamp.") |
| 435 | 435 | ||
| 436 | (defconst org-deadline-line-regexp | 436 | (defconst org-deadline-line-regexp |
| @@ -446,7 +446,7 @@ Matched keyword is in group 1.") | |||
| 446 | 446 | ||
| 447 | (defconst org-scheduled-time-hour-regexp | 447 | (defconst org-scheduled-time-hour-regexp |
| 448 | (concat "\\<" org-scheduled-string | 448 | (concat "\\<" org-scheduled-string |
| 449 | " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9-+:hdwmy \t.]*\\)>") | 449 | " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9+:hdwmy \t.-]*\\)>") |
| 450 | "Matches the SCHEDULED keyword together with a time-and-hour stamp.") | 450 | "Matches the SCHEDULED keyword together with a time-and-hour stamp.") |
| 451 | 451 | ||
| 452 | (defconst org-closed-time-regexp | 452 | (defconst org-closed-time-regexp |
diff --git a/lisp/progmodes/bat-mode.el b/lisp/progmodes/bat-mode.el
index 6c85ff99053..a8b002be59b 100644
--- a/lisp/progmodes/bat-mode.el
+++ b/lisp/progmodes/bat-mode.el
| @@ -78,7 +78,7 @@ | |||
| 78 | "goto" "gtr" "if" "in" "leq" "lss" "neq" "not" "start")) | 78 | "goto" "gtr" "if" "in" "leq" "lss" "neq" "not" "start")) |
| 79 | (UNIX | 79 | (UNIX |
| 80 | '("bash" "cat" "cp" "fgrep" "grep" "ls" "sed" "sh" "mv" "rm"))) | 80 | '("bash" "cat" "cp" "fgrep" "grep" "ls" "sed" "sh" "mv" "rm"))) |
| 81 | `(("\\_<\\(call\\|goto\\)\\_>[ \t]+%?\\([A-Za-z0-9-_\\:.]+\\)%?" | 81 | `(("\\_<\\(call\\|goto\\)\\_>[ \t]+%?\\([A-Za-z0-9_\\:.-]+\\)%?" |
| 82 | (2 font-lock-constant-face t)) | 82 | (2 font-lock-constant-face t)) |
| 83 | ("^:[^:].*" | 83 | ("^:[^:].*" |
| 84 | . 'bat-label-face) | 84 | . 'bat-label-face) |
diff --git a/lisp/progmodes/bug-reference.el b/lisp/progmodes/bug-reference.el
index 8baf74854f6..759db1f5686 100644
--- a/lisp/progmodes/bug-reference.el
+++ b/lisp/progmodes/bug-reference.el
| @@ -69,7 +69,7 @@ so that it is considered safe, see `enable-local-variables'.") | |||
| 69 | (get s 'bug-reference-url-format))))) | 69 | (get s 'bug-reference-url-format))))) |
| 70 | 70 | ||
| 71 | (defcustom bug-reference-bug-regexp | 71 | (defcustom bug-reference-bug-regexp |
| 72 | "\\([Bb]ug ?#?\\|[Pp]atch ?#\\|RFE ?#\\|PR [a-z-+]+/\\)\\([0-9]+\\(?:#[0-9]+\\)?\\)" | 72 | "\\([Bb]ug ?#?\\|[Pp]atch ?#\\|RFE ?#\\|PR [a-z+-]+/\\)\\([0-9]+\\(?:#[0-9]+\\)?\\)" |
| 73 | "Regular expression matching bug references. | 73 | "Regular expression matching bug references. |
| 74 | The second subexpression should match the bug reference (usually a number)." | 74 | The second subexpression should match the bug reference (usually a number)." |
| 75 | :type 'string | 75 | :type 'string |
diff --git a/lisp/textmodes/less-css-mode.el b/lisp/textmodes/less-css-mode.el
index b4c7f28985d..4077789eb12 100644
--- a/lisp/textmodes/less-css-mode.el
+++ b/lisp/textmodes/less-css-mode.el
| @@ -194,10 +194,10 @@ directory by default." | |||
| 194 | ;; - custom faces. | 194 | ;; - custom faces. |
| 195 | (defconst less-css-font-lock-keywords | 195 | (defconst less-css-font-lock-keywords |
| 196 | '(;; Variables | 196 | '(;; Variables |
| 197 | ("@[a-z_-][a-z-_0-9]*" . font-lock-variable-name-face) | 197 | ("@[a-z_-][a-z_0-9-]*" . font-lock-variable-name-face) |
| 198 | ("&" . font-lock-preprocessor-face) | 198 | ("&" . font-lock-preprocessor-face) |
| 199 | ;; Mixins | 199 | ;; Mixins |
| 200 | ("\\(?:[ \t{;]\\|^\\)\\(\\.[a-z_-][a-z-_0-9]*\\)[ \t]*;" . | 200 | ("\\(?:[ \t{;]\\|^\\)\\(\\.[a-z_-][a-z_0-9-]*\\)[ \t]*;" . |
| 201 | (1 font-lock-keyword-face)))) | 201 | (1 font-lock-keyword-face)))) |
| 202 | 202 | ||
| 203 | (defvar less-css-mode-syntax-table | 203 | (defvar less-css-mode-syntax-table |
diff --git a/lisp/vc/vc-cvs.el b/lisp/vc/vc-cvs.el
index 3bbd0ed49b1..626e190c1e8 100644
--- a/lisp/vc/vc-cvs.el
+++ b/lisp/vc/vc-cvs.el
| @@ -1087,7 +1087,7 @@ CVS/Entries should only be accessed through this function." | |||
| 1087 | ;; an uppercase or lowercase letter and can contain uppercase and | 1087 | ;; an uppercase or lowercase letter and can contain uppercase and |
| 1088 | ;; lowercase letters, digits, `-', and `_'. | 1088 | ;; lowercase letters, digits, `-', and `_'. |
| 1089 | (and (string-match "^[a-zA-Z]" tag) | 1089 | (and (string-match "^[a-zA-Z]" tag) |
| 1090 | (not (string-match "[^a-z0-9A-Z-_]" tag)))) | 1090 | (not (string-match "[^a-z0-9A-Z_-]" tag)))) |
| 1091 | 1091 | ||
| 1092 | (defun vc-cvs-valid-revision-number-p (tag) | 1092 | (defun vc-cvs-valid-revision-number-p (tag) |
| 1093 | "Return non-nil if TAG is a valid revision number." | 1093 | "Return non-nil if TAG is a valid revision number." |
diff --git a/lisp/vc/vc-svn.el b/lisp/vc/vc-svn.el
index 618f03eedc5..3c50c8fff64 100644
--- a/lisp/vc/vc-svn.el
+++ b/lisp/vc/vc-svn.el
| @@ -759,7 +759,7 @@ Set file properties accordingly. If FILENAME is non-nil, return its status." | |||
| 759 | ;; an uppercase or lowercase letter and can contain uppercase and | 759 | ;; an uppercase or lowercase letter and can contain uppercase and |
| 760 | ;; lowercase letters, digits, `-', and `_'. | 760 | ;; lowercase letters, digits, `-', and `_'. |
| 761 | (and (string-match "^[a-zA-Z]" tag) | 761 | (and (string-match "^[a-zA-Z]" tag) |
| 762 | (not (string-match "[^a-z0-9A-Z-_]" tag)))) | 762 | (not (string-match "[^a-z0-9A-Z_-]" tag)))) |
| 763 | 763 | ||
| 764 | (defun vc-svn-valid-revision-number-p (tag) | 764 | (defun vc-svn-valid-revision-number-p (tag) |
| 765 | "Return non-nil if TAG is a valid revision number." | 765 | "Return non-nil if TAG is a valid revision number." |