| author | Jacob S. Gordon | 2025-05-19 15:05:37 -0400 |
|---|---|---|
| committer | Eli Zaretskii | 2025-06-14 17:07:19 +0300 |
| commit | 5bd9fa084dcf0ce8efaaf9212c24addec48d824f (patch) | |
| tree | e032a62cd687f5a15d24595684bfd1d16647e0ae | |
| parent | 82766b71a45a691e19386422d3a12a3e0321b2e8 (diff) | |
calc: Allow strings with character codes above Latin-1
The current behavior of the functions 'calc-display-strings',
'string', and 'bstring' is to skip any vector containing
integers outside the Latin-1 range (0x00-0xFF). We introduce a
custom variable 'calc-string-maximum-character' to replace this
hard-coded maximum, and to allow vectors containing higher
character codes to be displayed as strings. The default value
of 0xFF preserves the existing behavior.
* lisp/calc/calc.el (calc-string-maximum-character): Add custom
variable 'calc-string-maximum-character'.
* lisp/calc/calccomp.el (math-vector-is-string): Replace hard-coded
maximum with 'calc-string-maximum-character', and the 'natnump'
assertion with 'characterp'. The latter guards against the
maximum being larger than '(max-char)', but not against a maximum of
an invalid type, such as a string.
* test/lisp/calc/calc-tests.el (calc-math-vector-is-string): Add
tests for 'math-vector-is-string' using different values of
'calc-string-maximum-character'.
* doc/misc/calc.texi (Quick Calculator, Strings, Customizing Calc):
Add variable definition for 'calc-string-maximum-character' and
reference thereof when discussing 'calc-display-strings'.
Generalize a comment about string display and availability of 8-bit
fonts.
(Bug#78528)
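
A minimal sketch of the effect described above, assuming an Emacs build that includes this commit. The example vector and the choice of '#x10FFFF' are illustrative only and are not taken from the patch or its tests:

```elisp
;; Illustrative only: load Calc and the file that defines
;; `math-vector-is-string'.
(require 'calc)
(require 'calccomp)

;; With the default maximum (#xFF), a vector containing 955
;; (GREEK SMALL LETTER LAMDA) is not treated as a string.
(let ((calc-string-maximum-character #xFF))
  (math-vector-is-string '(vec 955 120)))       ; => nil

;; Raising the maximum to the Unicode range makes the same vector
;; eligible for string display.
(let ((calc-string-maximum-character #x10FFFF))
  (math-vector-is-string '(vec 955 120)))       ; => t

;; To make the change persistent, set the user option, for example
;; in an init file.
(setopt calc-string-maximum-character #x10FFFF)
```

String display itself still has to be toggled in Calc with 'd "' ('calc-display-strings'). The new variable only controls which vectors qualify.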
| -rw-r--r-- | doc/misc/calc.texi | 56 | ||||
| -rw-r--r-- | etc/NEWS | 12 | ||||
| -rw-r--r-- | lisp/calc/calc.el | 31 | ||||
| -rw-r--r-- | lisp/calc/calccomp.el | 15 | ||||
| -rw-r--r-- | test/lisp/calc/calc-tests.el | 67 |
5 files changed, 164 insertions, 17 deletions
diff --git a/doc/misc/calc.texi b/doc/misc/calc.texi index 61466b55201..eda442ecb38 100644 --- a/doc/misc/calc.texi +++ b/doc/misc/calc.texi | |||
| @@ -10179,7 +10179,7 @@ result @samp{[120]} (because 120 is the ASCII code of the lower-case | |||
| 10179 | is displayed only according to the current mode settings. But | 10179 | is displayed only according to the current mode settings. But |
| 10180 | running Quick Calc again and entering @samp{120} will produce the | 10180 | running Quick Calc again and entering @samp{120} will produce the |
| 10181 | result @samp{120 (16#78, 8#170, x)} which shows the number in its | 10181 | result @samp{120 (16#78, 8#170, x)} which shows the number in its |
| 10182 | decimal, hexadecimal, octal, and ASCII forms. | 10182 | decimal, hexadecimal, octal, and character forms. |
| 10183 | 10183 | ||
| 10184 | Please note that the Quick Calculator is not any faster at loading | 10184 | Please note that the Quick Calculator is not any faster at loading |
| 10185 | or computing the answer than the full Calculator; the name ``quick'' | 10185 | or computing the answer than the full Calculator; the name ``quick'' |
| @@ -10836,11 +10836,11 @@ from 1 to @samp{n}. | |||
| 10836 | @cindex Strings | 10836 | @cindex Strings |
| 10837 | @cindex Character strings | 10837 | @cindex Character strings |
| 10838 | Character strings are not a special data type in the Calculator. | 10838 | Character strings are not a special data type in the Calculator. |
| 10839 | Rather, a string is represented simply as a vector all of whose | 10839 | Rather, a string is represented simply as a vector all of whose elements |
| 10840 | elements are integers in the range 0 to 255 (ASCII codes). You can | 10840 | are integers in the Latin-1 range 0 to 255. You can enter a string at |
| 10841 | enter a string at any time by pressing the @kbd{"} key. Quotation | 10841 | any time by pressing the @kbd{"} key. Quotation marks and backslashes |
| 10842 | marks and backslashes are written @samp{\"} and @samp{\\}, respectively, | 10842 | are written @samp{\"} and @samp{\\}, respectively, inside strings. |
| 10843 | inside strings. Other notations introduced by backslashes are: | 10843 | Other notations introduced by backslashes are: |
| 10844 | 10844 | ||
| 10845 | @example | 10845 | @example |
| 10846 | @group | 10846 | @group |
| @@ -10857,21 +10857,24 @@ inside strings. Other notations introduced by backslashes are: | |||
| 10857 | 10857 | ||
| 10858 | @noindent | 10858 | @noindent |
| 10859 | Finally, a backslash followed by three octal digits produces any | 10859 | Finally, a backslash followed by three octal digits produces any |
| 10860 | character from its ASCII code. | 10860 | character from its code. |
| 10861 | 10861 | ||
| 10862 | @kindex d " | 10862 | @kindex d " |
| 10863 | @pindex calc-display-strings | 10863 | @pindex calc-display-strings |
| 10864 | Strings are normally displayed in vector-of-integers form. The | 10864 | Strings are normally displayed in vector-of-integers form. The |
| 10865 | @w{@kbd{d "}} (@code{calc-display-strings}) command toggles a mode in | 10865 | @w{@kbd{d "}} (@code{calc-display-strings}) command toggles a mode in |
| 10866 | which any vectors of small integers are displayed as quoted strings | 10866 | which any vectors of small integers are displayed as quoted strings |
| 10867 | instead. | 10867 | instead. The display of strings containing higher character codes can |
| 10868 | be enabled by increasing the custom variable | ||
| 10869 | @code{calc-string-maximum-character} (@pxref{Customizing Calc}). | ||
| 10868 | 10870 | ||
| 10869 | The backslash notations shown above are also used for displaying | 10871 | The backslash notations shown above are also used for displaying |
| 10870 | strings. Characters 128 and above are not translated by Calc; unless | 10872 | strings. For ASCII control characters (below 32), and for the |
| 10871 | you have an Emacs modified for 8-bit fonts, these will show up in | 10873 | @code{DEL} character (127), Calc uses the backslash-letter combination |
| 10872 | backslash-octal-digits notation. For characters below 32, and | 10874 | if there is one, or otherwise uses a @samp{\^} sequence. Control |
| 10873 | for character 127, Calc uses the backslash-letter combination if | 10875 | characters above 127 are not translated by Calc, and will show up in |
| 10874 | there is one, or otherwise uses a @samp{\^} sequence. | 10876 | backslash-octal-digits notation. The display of higher character codes |
| 10877 | will depend on your display settings and system font coverage. | ||
| 10875 | 10878 | ||
| 10876 | The only Calc feature that uses strings is @dfn{compositions}; | 10879 | The only Calc feature that uses strings is @dfn{compositions}; |
| 10877 | @pxref{Compositions}. Strings also provide a convenient | 10880 | @pxref{Compositions}. Strings also provide a convenient |
| @@ -35684,6 +35687,33 @@ choose from, or the user can enter their own date. | |||
| 35684 | The default value of @code{calc-gregorian-switch} is @code{nil}. | 35687 | The default value of @code{calc-gregorian-switch} is @code{nil}. |
| 35685 | @end defvar | 35688 | @end defvar |
| 35686 | 35689 | ||
| 35690 | @defvar calc-string-maximum-character | ||
| 35691 | @xref{Strings}.@* | ||
| 35692 | |||
| 35693 | The variable @code{calc-string-maximum-character} is the maximum value | ||
| 35694 | of a vector's elements for @code{calc-display-strings}, @code{string}, | ||
| 35695 | and @code{bstring} to display the vector as a string. This maximum | ||
| 35696 | @emph{must} represent a character, i.e. it's a non-negative integer less | ||
| 35697 | than or equal to @code{(max-char)} or @code{0x3FFFFF}. Any negative | ||
| 35698 | value effectively disables the display of strings, and for values larger | ||
| 35699 | than @code{0x3FFFFF} the display acts as if the maximum were | ||
| 35700 | @code{0x3FFFFF}. Some natural choices (and their resulting ranges) are: | ||
| 35701 | |||
| 35702 | @itemize | ||
| 35703 | @item | ||
| 35704 | @code{0x7F} or 127 (ASCII), | ||
| 35705 | @item | ||
| 35706 | @code{0xFF} or 255 (Latin-1, the default), | ||
| 35707 | @item | ||
| 35708 | @code{0x10FFFF} (Unicode), | ||
| 35709 | @item | ||
| 35710 | @code{0x3FFFFF} (Emacs). | ||
| 35711 | @end itemize | ||
| 35712 | |||
| 35713 | The default value of @code{calc-string-maximum-character} is @code{0xFF} | ||
| 35714 | or 255. | ||
| 35715 | @end defvar | ||
| 35716 | |||
| 35687 | @node Reporting Bugs | 35717 | @node Reporting Bugs |
| 35688 | @appendix Reporting Bugs | 35718 | @appendix Reporting Bugs |
| 35689 | 35719 | ||
| @@ -2182,7 +2182,19 @@ modifier, it scrolls by year. | |||
| 2182 | The month and year navigation key bindings 'M-}', 'M-{', 'C-x ]' and | 2182 | The month and year navigation key bindings 'M-}', 'M-{', 'C-x ]' and |
| 2183 | 'C-x [' now have the alternative keys '}', '{', ']' and '['. | 2183 | 'C-x [' now have the alternative keys '}', '{', ']' and '['. |
| 2184 | 2184 | ||
| 2185 | ** Calc | ||
| 2186 | |||
| 2187 | *** New user option 'calc-string-maximum-character'. | ||
| 2188 | |||
| 2189 | Previously, the 'calc-display-strings', 'string', and 'bstring' | ||
| 2190 | functions only considered integer vectors whose elements are all in the | ||
| 2191 | Latin-1 range 0-255. This hard-coded maximum is replaced by | ||
| 2192 | 'calc-string-maximum-character', and setting it to a higher value allows | ||
| 2193 | the display of matching vectors as Unicode strings. The default value | ||
| 2194 | is '0xFF' or '255' to preserve the existing behavior. | ||
| 2195 | |||
| 2185 | 2196 | ||
| 2197 | |||
| 2186 | * New Modes and Packages in Emacs 31.1 | 2198 | * New Modes and Packages in Emacs 31.1 |
| 2187 | 2199 | ||
| 2188 | ** New minor mode 'delete-trailing-whitespace-mode'. | 2200 | ** New minor mode 'delete-trailing-whitespace-mode'. |
diff --git a/lisp/calc/calc.el b/lisp/calc/calc.el index a7bd671998e..a350419b320 100644 --- a/lisp/calc/calc.el +++ b/lisp/calc/calc.el | |||
| @@ -628,6 +628,37 @@ Otherwise, 1 / 0 is changed to uinf (undirected infinity).") | |||
| 628 | (defcalcmodevar calc-display-strings nil | 628 | (defcalcmodevar calc-display-strings nil |
| 629 | "If non-nil, display vectors of byte-sized integers as strings.") | 629 | "If non-nil, display vectors of byte-sized integers as strings.") |
| 630 | 630 | ||
| 631 | (defcustom calc-string-maximum-character #xFF | ||
| 632 | "Maximum value of vector contents to be displayed as a string. | ||
| 633 | |||
| 634 | If a vector consists of characters up to this maximum value, the | ||
| 635 | function `calc-display-strings' will toggle displaying the vector as a | ||
| 636 | string. This maximum value must represent a character (see `characterp'). | ||
| 637 | Some natural choices (and their resulting ranges) are: | ||
| 638 | |||
| 639 | - `0x7F' (`ASCII'), | ||
| 640 | - `0xFF' (`Latin-1', the default), | ||
| 641 | - `0x10FFFF' (`Unicode'), | ||
| 642 | - `0x3FFFFF' (`Emacs'). | ||
| 643 | |||
| 644 | Characters for low control codes are either caret or backslash escaped, | ||
| 645 | while others without a glyph are displayed in backslash-octal notation. | ||
| 646 | The display of strings containing higher character codes will depend on | ||
| 647 | your display settings and system font coverage. | ||
| 648 | |||
| 649 | See the following for further information: | ||
| 650 | |||
| 651 | - info node `(calc)Strings', | ||
| 652 | - info node `(elisp)Text Representations', | ||
| 653 | - info node `(emacs)Text Display'." | ||
| 654 | :version "31.1" | ||
| 655 | :type '(choice (restricted-sexp :tag "Character Code" | ||
| 656 | :match-alternatives (characterp)) | ||
| 657 | (const :tag "ASCII" #x7F) | ||
| 658 | (const :tag "Latin-1" #xFF) | ||
| 659 | (const :tag "Unicode" #x10FFFF) | ||
| 660 | (const :tag "Emacs" #x3FFFFF))) | ||
| 661 | |||
| 631 | (defcalcmodevar calc-matrix-just 'center | 662 | (defcalcmodevar calc-matrix-just 'center |
| 632 | "If nil, vector elements are left-justified. | 663 | "If nil, vector elements are left-justified. |
| 633 | If `right', vector elements are right-justified. | 664 | If `right', vector elements are right-justified. |
diff --git a/lisp/calc/calccomp.el b/lisp/calc/calccomp.el index cc27e6c2025..7e1f8378d80 100644 --- a/lisp/calc/calccomp.el +++ b/lisp/calc/calccomp.el | |||
| @@ -907,13 +907,20 @@ | |||
| 907 | (concat " " math-comp-right-bracket))))) | 907 | (concat " " math-comp-right-bracket))))) |
| 908 | 908 | ||
| 909 | (defun math-vector-is-string (a) | 909 | (defun math-vector-is-string (a) |
| 910 | "Return t if A can be displayed as a string, and nil otherwise. | ||
| 911 | |||
| 912 | Elements of A must either be a character (see `characterp') or a complex | ||
| 913 | number with only a real character part, each with a value less than or | ||
| 914 | equal to the custom variable `calc-string-maximum-character'." | ||
| 910 | (while (and (setq a (cdr a)) | 915 | (while (and (setq a (cdr a)) |
| 911 | (or (and (natnump (car a)) | 916 | (or (and (characterp (car a)) |
| 912 | (<= (car a) 255)) | 917 | (<= (car a) |
| 918 | calc-string-maximum-character)) | ||
| 913 | (and (eq (car-safe (car a)) 'cplx) | 919 | (and (eq (car-safe (car a)) 'cplx) |
| 914 | (natnump (nth 1 (car a))) | 920 | (characterp (nth 1 (car a))) |
| 915 | (eq (nth 2 (car a)) 0) | 921 | (eq (nth 2 (car a)) 0) |
| 916 | (<= (nth 1 (car a)) 255))))) | 922 | (<= (nth 1 (car a)) |
| 923 | calc-string-maximum-character))))) | ||
| 917 | (null a)) | 924 | (null a)) |
| 918 | 925 | ||
| 919 | (defconst math-vector-to-string-chars '( ( ?\" . "\\\"" ) | 926 | (defconst math-vector-to-string-chars '( ( ?\" . "\\\"" ) |
diff --git a/test/lisp/calc/calc-tests.el b/test/lisp/calc/calc-tests.el index 42eb6077b04..2fd6a6be45e 100644 --- a/test/lisp/calc/calc-tests.el +++ b/test/lisp/calc/calc-tests.el | |||
| @@ -879,5 +879,72 @@ An existing calc stack is reused, otherwise a new one is created." | |||
| 879 | (should-error (math-read-preprocess-string nil)) | 879 | (should-error (math-read-preprocess-string nil)) |
| 880 | (should-error (math-read-preprocess-string 42))) | 880 | (should-error (math-read-preprocess-string 42))) |
| 881 | 881 | ||
| 882 | (ert-deftest calc-math-vector-is-string () | ||
| 883 | "Test `math-vector-is-string' with varying `calc-string-maximum-character'. | ||
| 884 | |||
| 885 | All tests operate on both an integer vector and the corresponding | ||
| 886 | complex vector. The sets covered are: | ||
| 887 | |||
| 888 | 1. `calc-string-maximum-character' is a valid character. The last case | ||
| 889 | with `0x3FFFFF' is borderline, as integers above it will not make it | ||
| 890 | past the `characterp' test. | ||
| 891 | 2. `calc-string-maximum-character' is negative, so the test always fails. | ||
| 892 | 3. `calc-string-maximum-character' is above `(max-char)', so only the | ||
| 893 | first `characterp' test is active. | ||
| 894 | 4. `calc-string-maximum-character' has an invalid type, which triggers | ||
| 895 | an error in the comparison." | ||
| 896 | (cl-flet* ((make-vec (lambda (contents) (append (list 'vec) contents))) | ||
| 897 | (make-cplx (lambda (x) (list 'cplx x 0))) | ||
| 898 | (make-cplx-vec (lambda (contents) | ||
| 899 | (make-vec (mapcar #'make-cplx contents))))) | ||
| 900 | ;; 1: calc-string-maximum-character is a valid character | ||
| 901 | (dolist (maxchar '(#x7F #xFF #x10FFFF #x3FFFFD #x3FFFFF)) | ||
| 902 | (let* ((calc-string-maximum-character maxchar) | ||
| 903 | (small-chars (number-sequence (- maxchar 2) maxchar)) | ||
| 904 | (large-chars (number-sequence maxchar (+ maxchar 2))) | ||
| 905 | (small-real-vec (make-vec small-chars)) | ||
| 906 | (large-real-vec (make-vec large-chars)) | ||
| 907 | (small-cplx-vec (make-cplx-vec small-chars)) | ||
| 908 | (large-cplx-vec (make-cplx-vec large-chars))) | ||
| 909 | (should (math-vector-is-string small-real-vec)) | ||
| 910 | (should-not (math-vector-is-string large-real-vec)) | ||
| 911 | (should (math-vector-is-string small-cplx-vec)) | ||
| 912 | (should-not (math-vector-is-string large-cplx-vec)))) | ||
| 913 | ;; 2: calc-string-maximum-character is negative | ||
| 914 | (let* ((maxchar -1) | ||
| 915 | (calc-string-maximum-character maxchar) | ||
| 916 | (valid-contents (number-sequence 0 2)) | ||
| 917 | (invalid-contents (number-sequence (- maxchar 2) maxchar)) | ||
| 918 | (valid-real-vec (make-vec valid-contents)) | ||
| 919 | (invalid-real-vec (make-vec invalid-contents)) | ||
| 920 | (valid-cplx-vec (make-cplx-vec valid-contents)) | ||
| 921 | (invalid-cplx-vec (make-cplx-vec invalid-contents))) | ||
| 922 | (should-not (math-vector-is-string valid-real-vec)) | ||
| 923 | (should-not (math-vector-is-string invalid-real-vec)) | ||
| 924 | (should-not (math-vector-is-string valid-cplx-vec)) | ||
| 925 | (should-not (math-vector-is-string invalid-cplx-vec))) | ||
| 926 | ;; 3: calc-string-maximum-character is larger than (max-char) | ||
| 927 | (let* ((maxchar (+ (max-char) 3)) | ||
| 928 | (calc-string-maximum-character maxchar) | ||
| 929 | (valid-chars (number-sequence (- (max-char) 2) (max-char))) | ||
| 930 | (invalid-chars (number-sequence (1+ (max-char)) maxchar)) | ||
| 931 | (valid-real-vec (make-vec valid-chars)) | ||
| 932 | (invalid-real-vec (make-vec invalid-chars)) | ||
| 933 | (valid-cplx-vec (make-cplx-vec valid-chars)) | ||
| 934 | (invalid-cplx-vec (make-cplx-vec invalid-chars))) | ||
| 935 | (should (math-vector-is-string valid-real-vec)) | ||
| 936 | (should-not (math-vector-is-string invalid-real-vec)) | ||
| 937 | (should (math-vector-is-string valid-cplx-vec)) | ||
| 938 | (should-not (math-vector-is-string invalid-cplx-vec))) | ||
| 939 | ;; 4: calc-string-maximum-character has the wrong type | ||
| 940 | (let* ((calc-string-maximum-character "wrong type") | ||
| 941 | (contents (number-sequence 0 2)) | ||
| 942 | (real-vec (make-vec contents)) | ||
| 943 | (cplx-vec (make-cplx-vec contents))) | ||
| 944 | (should-error (math-vector-is-string real-vec) | ||
| 945 | :type 'wrong-type-argument) | ||
| 946 | (should-error (math-vector-is-string cplx-vec) | ||
| 947 | :type 'wrong-type-argument)))) | ||
| 948 | |||
| 882 | (provide 'calc-tests) | 949 | (provide 'calc-tests) |
| 883 | ;;; calc-tests.el ends here | 950 | ;;; calc-tests.el ends here |