author     Paul Eggert  2019-04-02 15:00:59 -0700
committer  Paul Eggert  2019-04-02 15:01:34 -0700
commit     f9ff60e0d7288e30cdbd1e43225059f1374441f1
tree       0e7e37a750e55adc0f959ca372369f4aa81cd3c2
parent     bb669166ba6b33cd1a927c772c87ee2240a10f89
Improve regexp advice again, and unchain ranges

* doc/lispref/searching.texi (Regexp Special):
Mention char classes earlier, in a more-logical place.
Advise sticking to ASCII letters and digits in ranges.
Reword negative advice to make it clearer that it’s negative.
* lisp/files.el (make-auto-save-file-name):
* lisp/gnus/message.el (message-mailer-swallows-blank-line):
* lisp/gnus/nndoc.el (nndoc-lanl-gov-announce-type-p)
(nndoc-generate-lanl-gov-head):
* lisp/org/org-eshell.el (org-eshell-open):
* lisp/org/org.el (org-deadline-time-hour-regexp)
(org-scheduled-time-hour-regexp):
* lisp/progmodes/bat-mode.el (bat-font-lock-keywords):
* lisp/progmodes/bug-reference.el (bug-reference-bug-regexp):
* lisp/textmodes/less-css-mode.el (less-css-font-lock-keywords):
* lisp/vc/vc-cvs.el (vc-cvs-valid-symbolic-tag-name-p):
* lisp/vc/vc-svn.el (vc-svn-valid-symbolic-tag-name-p):
Avoid attempts to chain ranges, as this can be confusing.
For example, instead of [0-9-_.], use [0-9_.-].

-rw-r--r--  doc/lispref/searching.texi       52
-rw-r--r--  lisp/files.el                     2
-rw-r--r--  lisp/gnus/message.el              2
-rw-r--r--  lisp/gnus/nndoc.el                4
-rw-r--r--  lisp/org/org-eshell.el            2
-rw-r--r--  lisp/org/org.el                   4
-rw-r--r--  lisp/progmodes/bat-mode.el        2
-rw-r--r--  lisp/progmodes/bug-reference.el   2
-rw-r--r--  lisp/textmodes/less-css-mode.el   4
-rw-r--r--  lisp/vc/vc-cvs.el                 2
-rw-r--r--  lisp/vc/vc-svn.el                 2
11 files changed, 45 insertions, 33 deletions
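
The commit message presents the rewrites as a readability change, which suggests that each old and new character alternative is meant to match the same characters. A minimal sketch for checking that over the ASCII range follows. The helper `my-same-ascii-matches-p` is a hypothetical name introduced here for illustration and is not part of the commit; only `string-match-p`, `dotimes`, `dolist` and `message` are standard Emacs Lisp.

```elisp
;; Hypothetical helper for illustration; it is not part of this commit.
(defun my-same-ascii-matches-p (re1 re2)
  "Return non-nil if RE1 and RE2 accept the same single ASCII characters."
  (let ((case-fold-search nil)   ; compare case-sensitively
        (same t))
    (dotimes (c 128)
      (let ((s (string c)))
        ;; Normalize both results to t/nil before comparing.
        (unless (eq (and (string-match-p re1 s) t)
                    (and (string-match-p re2 s) t))
          (setq same nil))))
    same))

;; Pairs of (OLD NEW) taken from the commit message and from the hunks
;; for bug-reference.el and vc-cvs.el.
(dolist (pair '(("[0-9-_.]" "[0-9_.-]")
                ("[a-z-+]" "[a-z+-]")
                ("[^a-z0-9A-Z-_]" "[^a-z0-9A-Z_-]")))
  (message "%s vs %s: %s" (car pair) (cadr pair)
           (if (apply #'my-same-ascii-matches-p pair) "same" "DIFFERENT")))
```

Evaluating this in a scratch buffer or with `emacs --batch` should report each pair as "same" if the rewrite is behavior-preserving, as the commit message implies. The check covers only single ASCII characters, so it is a sanity test rather than a proof.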
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 72ee9233a3c..8775254dd07 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -395,9 +395,18 @@ or @samp{$}, @samp{%} or period. However, the ending character of one
 range should not be the starting point of another one; for example,
 @samp{[a-m-z]} should be avoided.
 
+A character alternative can also specify named character classes
+(@pxref{Char Classes}). This is a POSIX feature. For example,
+@samp{[[:ascii:]]} matches any @acronym{ASCII} character.
+Using a character class is equivalent to mentioning each of the
+characters in that class; but the latter is not feasible in practice,
+since some classes include thousands of different characters.
+A character class should not appear as the lower or upper bound
+of a range.
+
 The usual regexp special characters are not special inside a
 character alternative. A completely different set of characters is
-special inside character alternatives: @samp{]}, @samp{-} and @samp{^}.
+special: @samp{]}, @samp{-} and @samp{^}.
 To include @samp{]} in a character alternative, put it at the
 beginning. To include @samp{^}, put it anywhere but at the beginning.
 To include @samp{-}, put it at the end. Thus, @samp{[]^-]} matches
@@ -430,33 +439,36 @@ matches only @samp{/} rather than the likely-intended four characters.
 @end enumerate
 
 Some kinds of character alternatives are not the best style even
-though they are standardized by POSIX and are portable. They include:
+though they have a well-defined meaning in Emacs. They include:
 
 @enumerate
 @item
-A character alternative can include duplicates. For example,
-@samp{[XYa-yYb-zX]} is less clear than @samp{[XYa-z]}.
+Although a range's bound can be almost any character, it is better
+style to stay within natural sequences of ASCII letters and digits
+because most people have not memorized character code tables.
+For example, @samp{[.-9]} is less clear than @samp{[./0-9]},
+and @samp{[`-~]} is less clear than @samp{[`a-z@{|@}~]}.
+Unicode character escapes can help here; for example, for most programmers
+@samp{[ก-ฺ฿-๛]} is less clear than @samp{[\u0E01-\u0E3A\u0E3F-\u0E5B]}.
 
 @item
-A range can denote just one, two, or three characters. For example,
-@samp{[(-(]} is less clear than @samp{[(]}, @samp{[*-+]} is less clear
-than @samp{[*+]}, and @samp{[*-,]} is less clear than @samp{[*+,]}.
+Although a character alternative can include duplicates, it is better
+style to avoid them. For example, @samp{[XYa-yYb-zX]} is less clear
+than @samp{[XYa-z]}.
 
 @item
-A @samp{-} also appear at the beginning of a character alternative, or
-as the upper bound of a range. For example, although @samp{[-a-z]} is
-valid, @samp{[a-z-]} is better style; and although @samp{[!--/]} is
-valid, @samp{[!-,/-]} is clearer.
-@end enumerate
+Although a range can denote just one, two, or three characters, it
+is simpler to list the characters. For example,
+@samp{[a-a0]} is less clear than @samp{[a0]}, @samp{[i-j]} is less clear
+than @samp{[ij]}, and @samp{[i-k]} is less clear than @samp{[ijk]}.
 
-A character alternative can also specify named character classes
-(@pxref{Char Classes}). This is a POSIX feature. For example,
-@samp{[[:ascii:]]} matches any @acronym{ASCII} character.
-Using a character class is equivalent to mentioning each of the
-characters in that class; but the latter is not feasible in practice,
-since some classes include thousands of different characters.
-A character class should not appear as the lower or upper bound
-of a range.
+@item
+Although a @samp{-} can appear at the beginning of a character
+alternative or as the upper bound of a range, it is better style to
+put @samp{-} by itself at the end of a character alternative. For
+example, although @samp{[-a-z]} is valid, @samp{[a-z-]} is better
+style; and although @samp{[*--]} is valid, @samp{[*+,-]} is clearer.
+@end enumerate
 
 @item @samp{[^ @dots{} ]}
 @cindex @samp{^} in regexp
diff --git a/lisp/files.el b/lisp/files.el
index 77a194b085d..1dae57593a0 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -6316,7 +6316,7 @@ See also `auto-save-file-name-p'."
  ;; We do this on all platforms, because even if we are not
  ;; running on DOS/Windows, the current directory may be on a
  ;; mounted VFAT filesystem, such as a USB memory stick.
- (while (string-match "[^A-Za-z0-9-_.~#+]" buffer-name limit)
+ (while (string-match "[^A-Za-z0-9_.~#+-]" buffer-name limit)
  (let* ((character (aref buffer-name (match-beginning 0)))
  (replacement
  ;; For multibyte characters, this will produce more than
diff --git a/lisp/gnus/message.el b/lisp/gnus/message.el
index dae4b0dced6..c8b6f0ee685 100644
--- a/lisp/gnus/message.el
+++ b/lisp/gnus/message.el
@@ -1288,7 +1288,7 @@ called and its result is inserted."
  ;; According to RFC 822 and its successors, the field name must
  ;; consist of printable US-ASCII characters other than colon,
  ;; i.e., decimal 33-56 and 59-126.
- '(looking-at "[ \t]\\|[][!\"#$%&'()*+,-./0-9;<=>?@A-Z\\^_`a-z{|}~]+:"))
+ '(looking-at "[ \t]\\|[][!\"#$%&'()*+,./0-9;<=>?@A-Z\\^_`a-z{|}~-]+:"))
  "Set this non-nil if the system's mailer runs the header and body together.
 \(This problem exists on Sunos 4 when sendmail is run in remote mode.)
 The value should be an expression to test whether the problem will
diff --git a/lisp/gnus/nndoc.el b/lisp/gnus/nndoc.el
index 8f1217b1275..532ba11fa09 100644
--- a/lisp/gnus/nndoc.el
+++ b/lisp/gnus/nndoc.el
@@ -701,7 +701,7 @@ from the document.")
 
 (defun nndoc-lanl-gov-announce-type-p ()
  (when (let ((case-fold-search nil))
- (re-search-forward "^\\\\\\\\\n\\(Paper\\( (\\*cross-listing\\*)\\)?: [a-zA-Z-\\.]+/[0-9]+\\|arXiv:\\)" nil t))
+ (re-search-forward "^\\\\\\\\\n\\(Paper\\( (\\*cross-listing\\*)\\)?: [a-zA-Z\\.-]+/[0-9]+\\|arXiv:\\)" nil t))
  t))
 
 (defun nndoc-transform-lanl-gov-announce (article)
@@ -732,7 +732,7 @@ from the document.")
  (save-restriction
  (narrow-to-region (car entry) (nth 1 entry))
  (goto-char (point-min))
- (when (looking-at "^\\(Paper.*: \\|arXiv:\\)\\([0-9a-zA-Z-\\./]+\\)")
+ (when (looking-at "^\\(Paper.*: \\|arXiv:\\)\\([0-9a-zA-Z\\./-]+\\)")
  (setq subject (concat " (" (match-string 2) ")"))
  (when (re-search-forward "^From: \\(.*\\)" nil t)
  (setq from (concat "<"
diff --git a/lisp/org/org-eshell.el b/lisp/org/org-eshell.el
index bb27d92e12d..2251a1b892f 100644
--- a/lisp/org/org-eshell.el
+++ b/lisp/org/org-eshell.el
@@ -37,7 +37,7 @@
  eshell buffer) or a command line prefixed by a buffer name
  followed by a colon."
  (let* ((buffer-and-command
- (if (string-match "\\([A-Za-z0-9-+*]+\\):\\(.*\\)" link)
+ (if (string-match "\\([A-Za-z0-9+*-]+\\):\\(.*\\)" link)
  (list (match-string 1 link)
  (match-string 2 link))
  (list eshell-buffer-name link)))
diff --git a/lisp/org/org.el b/lisp/org/org.el
index bf7e305b7a0..ce6dd24a83b 100644
--- a/lisp/org/org.el
+++ b/lisp/org/org.el
@@ -430,7 +430,7 @@ Matched keyword is in group 1.")
 
 (defconst org-deadline-time-hour-regexp
  (concat "\\<" org-deadline-string
- " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9-+:hdwmy \t.]*\\)>")
+ " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9+:hdwmy \t.-]*\\)>")
  "Matches the DEADLINE keyword together with a time-and-hour stamp.")
 
 (defconst org-deadline-line-regexp
@@ -446,7 +446,7 @@ Matched keyword is in group 1.")
 
 (defconst org-scheduled-time-hour-regexp
  (concat "\\<" org-scheduled-string
- " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9-+:hdwmy \t.]*\\)>")
+ " *<\\([^>]+[0-9]\\{1,2\\}:[0-9]\\{2\\}[0-9+:hdwmy \t.-]*\\)>")
  "Matches the SCHEDULED keyword together with a time-and-hour stamp.")
 
 (defconst org-closed-time-regexp
diff --git a/lisp/progmodes/bat-mode.el b/lisp/progmodes/bat-mode.el
index 6c85ff99053..a8b002be59b 100644
--- a/lisp/progmodes/bat-mode.el
+++ b/lisp/progmodes/bat-mode.el
@@ -78,7 +78,7 @@
  "goto" "gtr" "if" "in" "leq" "lss" "neq" "not" "start"))
  (UNIX
  '("bash" "cat" "cp" "fgrep" "grep" "ls" "sed" "sh" "mv" "rm")))
- `(("\\_<\\(call\\|goto\\)\\_>[ \t]+%?\\([A-Za-z0-9-_\\:.]+\\)%?"
+ `(("\\_<\\(call\\|goto\\)\\_>[ \t]+%?\\([A-Za-z0-9_\\:.-]+\\)%?"
  (2 font-lock-constant-face t))
  ("^:[^:].*"
  . 'bat-label-face)
diff --git a/lisp/progmodes/bug-reference.el b/lisp/progmodes/bug-reference.el
index 8baf74854f6..759db1f5686 100644
--- a/lisp/progmodes/bug-reference.el
+++ b/lisp/progmodes/bug-reference.el
@@ -69,7 +69,7 @@ so that it is considered safe, see `enable-local-variables'.")
  (get s 'bug-reference-url-format)))))
 
 (defcustom bug-reference-bug-regexp
- "\\([Bb]ug ?#?\\|[Pp]atch ?#\\|RFE ?#\\|PR [a-z-+]+/\\)\\([0-9]+\\(?:#[0-9]+\\)?\\)"
+ "\\([Bb]ug ?#?\\|[Pp]atch ?#\\|RFE ?#\\|PR [a-z+-]+/\\)\\([0-9]+\\(?:#[0-9]+\\)?\\)"
  "Regular expression matching bug references.
 The second subexpression should match the bug reference (usually a number)."
  :type 'string
diff --git a/lisp/textmodes/less-css-mode.el b/lisp/textmodes/less-css-mode.el
index b4c7f28985d..4077789eb12 100644
--- a/lisp/textmodes/less-css-mode.el
+++ b/lisp/textmodes/less-css-mode.el
@@ -194,10 +194,10 @@ directory by default."
 ;; - custom faces.
 (defconst less-css-font-lock-keywords
  '(;; Variables
- ("@[a-z_-][a-z-_0-9]*" . font-lock-variable-name-face)
+ ("@[a-z_-][a-z_0-9-]*" . font-lock-variable-name-face)
  ("&" . font-lock-preprocessor-face)
  ;; Mixins
- ("\\(?:[ \t{;]\\|^\\)\\(\\.[a-z_-][a-z-_0-9]*\\)[ \t]*;" .
+ ("\\(?:[ \t{;]\\|^\\)\\(\\.[a-z_-][a-z_0-9-]*\\)[ \t]*;" .
  (1 font-lock-keyword-face))))
 
 (defvar less-css-mode-syntax-table
diff --git a/lisp/vc/vc-cvs.el b/lisp/vc/vc-cvs.el
index 3bbd0ed49b1..626e190c1e8 100644
--- a/lisp/vc/vc-cvs.el
+++ b/lisp/vc/vc-cvs.el
@@ -1087,7 +1087,7 @@ CVS/Entries should only be accessed through this function."
  ;; an uppercase or lowercase letter and can contain uppercase and
  ;; lowercase letters, digits, `-', and `_'.
  (and (string-match "^[a-zA-Z]" tag)
- (not (string-match "[^a-z0-9A-Z-_]" tag))))
+ (not (string-match "[^a-z0-9A-Z_-]" tag))))
 
 (defun vc-cvs-valid-revision-number-p (tag)
  "Return non-nil if TAG is a valid revision number."
diff --git a/lisp/vc/vc-svn.el b/lisp/vc/vc-svn.el
index 618f03eedc5..3c50c8fff64 100644
--- a/lisp/vc/vc-svn.el
+++ b/lisp/vc/vc-svn.el
@@ -759,7 +759,7 @@ Set file properties accordingly. If FILENAME is non-nil, return its status."
  ;; an uppercase or lowercase letter and can contain uppercase and
  ;; lowercase letters, digits, `-', and `_'.
  (and (string-match "^[a-zA-Z]" tag)
- (not (string-match "[^a-z0-9A-Z-_]" tag))))
+ (not (string-match "[^a-z0-9A-Z_-]" tag))))
 
 (defun vc-svn-valid-revision-number-p (tag)
  "Return non-nil if TAG is a valid revision number."
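
The style advice added to searching.texi rests on the point that a range such as `[.-9]` forces the reader to recall character codes. A small sketch that spells out what a character alternative matches in ASCII can make that concrete. The function `my-ascii-matches` is a hypothetical name used only for this illustration; it is not part of Emacs or of this commit.

```elisp
;; Hypothetical helper for illustration; it is not part of this commit.
(defun my-ascii-matches (regexp)
  "Return a string of the ASCII characters that REGEXP matches one at a time."
  (let ((case-fold-search nil)   ; keep the comparison case-sensitive
        (chars nil))
    (dotimes (c 128)
      (when (string-match-p regexp (string c))
        (push c chars)))
    ;; CHARS was built in reverse order, so restore ascending order.
    (apply #'string (nreverse chars))))

(my-ascii-matches "[.-9]")    ; => "./0123456789"
(my-ascii-matches "[./0-9]")  ; => "./0123456789"
(my-ascii-matches "[*--]")    ; => "*+,-"
(my-ascii-matches "[*+,-]")   ; => "*+,-"
```

Each pair yields the same set, so the forms the manual recommends differ from the discouraged ones only in how easily a reader can see what is matched.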