%o and %x can now format signed integers

Optionally treat integers as signed numbers with %o and %x format
specifiers, instead of treating them as a machine-dependent two’s
complement representation.  This option is more machine-independent,
allows formats like "#x%x" to be useful for reading later, and is
better insulated against future changes involving bignums.
Setting the new variable ‘binary-as-unsigned’ to nil enables the
new behavior (Bug#32252).

This is a simplified version of the change proposed in:
https://lists.gnu.org/r/emacs-devel/2018-07/msg00763.html
I simplified that proposal by omitting bitwidth modifiers, as
I could not find any example uses in the Emacs source code that
needed them, and doing them correctly would have been quite a bit
more work for apparently little benefit.

* doc/lispref/strings.texi (Formatting Strings): Document that %x
and %o format negative integers in a platform-dependent way.
Also, document how to format numbers so that the same values can
be read back in.
* etc/NEWS: Document the change.
* src/editfns.c (styled_format): Treat integers as signed numbers
even with %o and %x, if binary-as-unsigned is nil.  Support the +
and space flags with %o and %x, since they’re about signs.
(syms_of_editfns): New variable binary-as-unsigned.
* test/src/editfns-tests.el (read-large-integer): Test that
maximal integers can be read after printing with all integer
formats, if binary-as-unsigned is nil.
author: Paul Eggert 2018-07-26 00:34:10 -0700
committer: Paul Eggert 2018-07-26 00:39:17 -0700
commit: 4a56ca5bbfabbb9c581828cd91648346e6b03844 (patch)
tree: 90b804ea4ec22a8b7be181f0b505b57c40a85c27
parent: 19f5f7b19b0dcdae87476a3fd51c41f840b2b80f (diff)
download: emacs-4a56ca5bbfabbb9c581828cd91648346e6b03844.tar.gz
emacs-4a56ca5bbfabbb9c581828cd91648346e6b03844.zip
4 files changed, 68 insertions, 11 deletions
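The difference between the two behaviors can be sketched outside Emacs in plain C. This is only an illustration of the sign-and-magnitude idea described in the commit message, not the code in styled_format: the helpers format_hex and hex_round_trips are hypothetical names, long long stands in for EMACS_INT, and strtoll stands in for the Lisp reader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not Emacs code: write VALUE in hexadecimal to BUF.
   If AS_UNSIGNED, reinterpret the bits as an unsigned number (the old,
   width-dependent behavior); otherwise print a sign and the magnitude.
   Returns the number of characters written, as snprintf does.  */
int
format_hex (char *buf, size_t size, long long value, bool as_unsigned)
{
  assert (buf != NULL && size > 0);
  if (as_unsigned)
    return snprintf (buf, size, "%llx", (unsigned long long) value);

  bool negative = value < 0;
  /* Negate in unsigned arithmetic so the most negative value
     cannot overflow.  */
  unsigned long long magnitude = (unsigned long long) value;
  if (negative)
    magnitude = 0ULL - magnitude;
  return snprintf (buf, size, "%s%llx", negative ? "-" : "", magnitude);
}

/* Return true if VALUE survives signed formatting followed by parsing.
   strtoll is an assumed stand-in for `read'; it accepts a leading '-'.  */
bool
hex_round_trips (long long value)
{
  char buf[32];
  format_hex (buf, sizeof buf, value, false);
  return strtoll (buf, NULL, 16) == value;
}
```

Under these assumptions, format_hex writes "-ff" for -255 when as_unsigned is false, and the result parses back to -255. With as_unsigned true, the output for -1 depends on the integer width (sixteen f digits where unsigned long long is 64 bits wide), which is the machine dependence the commit aims to avoid.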
diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 2fff3c7c75c..3558f17301d 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -922,7 +922,8 @@ Functions}).  Thus, strings are enclosed in @samp{"} characters, and
 @item %o
 @cindex integer to octal
 Replace the specification with the base-eight representation of an
-unsigned integer.  The object can also be a nonnegative floating-point
+integer.  Negative integers are formatted in a platform-dependent
+way.  The object can also be a nonnegative floating-point
 number that is formatted as an integer, dropping any fraction, if the
 integer does not exceed machine limits.
@@ -935,7 +936,8 @@ formatted as an integer, dropping any fraction.
 @itemx %X
 @cindex integer to hexadecimal
 Replace the specification with the base-sixteen representation of an
-unsigned integer.  @samp{%x} uses lower case and @samp{%X} uses upper
+integer.  Negative integers are formatted in a platform-dependent
+way.  @samp{%x} uses lower case and @samp{%X} uses upper
 case.  The object can also be a nonnegative floating-point number that
 is formatted as an integer, dropping any fraction, if the integer does
 not exceed machine limits.
@@ -1108,6 +1110,17 @@ shows only the first three characters of the representation for
 precision is what the local library functions of the @code{printf}
 family produce.
+@cindex formatting numbers for rereading later
+  If you plan to use @code{read} later on the formatted string to
+retrieve a copy of the formatted value, use a specification that lets
+@code{read} reconstruct the value.  To format numbers in this
+reversible way you can use @samp{%s} and @samp{%S}, to format just
+integers you can also use @samp{%d}, and to format just nonnegative
+integers you can also use @samp{#x%x} and @samp{#o%o}.  Other formats
+may be problematic; for example, @samp{%d} and @samp{%g} can mishandle
+NaNs and can lose precision and type, and @samp{#x%x} and @samp{#o%o}
+can mishandle negative integers.  @xref{Input Functions}.
 @node Case Conversion
 @section Case Conversion in Lisp
 @cindex upper case
diff --git a/etc/NEWS b/etc/NEWS
index 995ceb67b78..089fc4053b1 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -812,6 +812,15 @@ between two strings.
 ** 'print-quoted' now defaults to t, so if you want to see
 (quote x) instead of 'x you will have to bind it to nil where applicable.
+++
+** Numbers formatted via %o or %x may now be formatted as signed integers.
+This avoids problems in calls like (read (format "#x%x" -1)), and is
+more compatible with bignums, a planned feature.  To get this
+behavior, set the experimental variable binary-as-unsigned to nil,
+and if the new behavior breaks your code please email
+32252@debbugs.gnu.org.  Because %o and %x can now format signed
+integers, they now support the + and space flags.
 ** To avoid confusion caused by "smart quotes", the reader signals an
 error when reading Lisp symbols which begin with one of the following
 quotation characters: ‘’‛“”‟〞＂＇.  A symbol beginning with such a
diff --git a/src/editfns.c b/src/editfns.c
index 1d6040da3f7..df257219e8f 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -4196,8 +4196,8 @@ contain either numbered or unnumbered %-sequences but not both, except
 that %% can be mixed with numbered %-sequences.
 The + flag character inserts a + before any nonnegative number, while a
-space inserts a space before any nonnegative number; these flags only
-affect %d, %e, %f, and %g sequences, and the + flag takes precedence.
+space inserts a space before any nonnegative number; these flags
+affect only numeric %-sequences, and the + flag takes precedence.
 The - and 0 flags affect the width specifier, as described below.
 The # flag means to use an alternate display form for %o, %x, %X, %e,
@@ -4736,10 +4736,22 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                }
              else
                {
-                  /* Don't sign-extend for octal or hex printing.  */
                  uprintmax_t x;
+                  bool negative;
                  if (INTEGERP (arg))
-                    x = XUINT (arg);
+                    {
+                      if (binary_as_unsigned)
+                        {
+                          x = XUINT (arg);
+                          negative = false;
+                        }
+                      else
+                        {
+                          EMACS_INT i = XINT (arg);
+                          negative = i < 0;
+                          x = negative ? -i : i;
+                        }
+                    }
                  else
                    {
                      double d = XFLOAT_DATA (arg);
@@ -4747,8 +4759,13 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
                      if (! (0 <= d && d < uprintmax + 1))
                        xsignal1 (Qoverflow_error, arg);
                      x = d;
+                      negative = false;
                    }
-                  sprintf_bytes = sprintf (sprintf_buf, convspec, prec, x);
+                  sprintf_buf[0] = negative ? '-' : plus_flag ? '+' : ' ';
+                  bool signedp = negative | plus_flag | space_flag;
+                  sprintf_bytes = sprintf (sprintf_buf + signedp,
+                                           convspec, prec, x);
+                  sprintf_bytes += signedp;
                }
              /* Now the length of the formatted item is known, except it omits
@@ -5558,6 +5575,22 @@ functions if all the text being accessed has this property.  */);
  DEFVAR_LISP ("operating-system-release", Voperating_system_release,
               doc: /* The release of the operating system Emacs is running on.  */);
+  DEFVAR_BOOL ("binary-as-unsigned",
+               binary_as_unsigned,
+               doc: /* Non-nil means `format' %x and %o treat integers as unsigned.
+This has machine-dependent results.  Nil means to treat integers as
+signed, which is portable; for example, if N is a negative integer,
+(read (format "#x%x" N)) returns N only when this variable is nil.
+This variable is experimental; email 32252@debbugs.gnu.org if you need
+it to be non-nil.  */);
+  /* For now, default to true if bignums exist, false in traditional Emacs.  */
+#ifdef lisp_h_FIXNUMP
+  binary_as_unsigned = true;
+#else
+  binary_as_unsigned = false;
+#endif
  defsubr (&Spropertize);
  defsubr (&Schar_equal);
  defsubr (&Sgoto_char);
diff --git a/test/src/editfns-tests.el b/test/src/editfns-tests.el
index c828000bb4f..2951270dbf7 100644
--- a/test/src/editfns-tests.el
+++ b/test/src/editfns-tests.el
@@ -165,10 +165,12 @@
                :type 'overflow-error)
  (should-error (read (substring (format "%d" most-negative-fixnum) 1))
                :type 'overflow-error)
-  (should-error (read (format "#x%x" most-negative-fixnum))
-                :type 'overflow-error)
-  (should-error (read (format "#o%o" most-negative-fixnum))
-                :type 'overflow-error)
+  (let ((binary-as-unsigned nil))
+    (dolist (fmt '("%d" "%s" "#o%o" "#x%x"))
+      (dolist (val (list most-negative-fixnum (1+ most-negative-fixnum)
+                         -1 0 1
+                         (1- most-positive-fixnum) most-positive-fixnum))
+        (should (eq val (read (format fmt val)))))))
  (should-error (read (format "#32rG%x" most-positive-fixnum))
                :type 'overflow-error))