Improve character name escapes

* doc/lispref/nonascii.texi (Character Properties):
Avoid duplication of Unicode names.
Reformat examples to fit in narrow pages.
* doc/lispref/objects.texi (General Escape Syntax):
Simplify and better-organize explanation of \N{...} escapes.
* src/character.h (CHAR_SURROGATE_PAIR_P): Remove; unused.
(char_surrogate_p): New inline function.
* src/lread.c: Do not include string.h; no longer needed.
(invalid_character_name, check_scalar_value): Remove; the ideas
behind these functions are now bundled into character_name_to_code.
(character_name_to_code): Remove undocumented support for
"CJK IDEOGRAPH-XXXX" names, as "U+XXXX" suffices.
Reject monstrosities like "\N{U+-0}" and null bytes in \N escapes.
Reject floating point in \N escapes instead of returning garbage.
Use AUTO_STRING_WITH_LEN to lessen pressure on the garbage collector.
* test/src/lread-tests.el (lread-char-number, lread-char-name)
(lread-string-char-number, lread-string-char-name):
Test runtime behavior, not compile-time, as the test framework
is not set up to test compile-time.
(lread-char-surrogate-1, lread-char-surrogate-2)
(lread-char-surrogate-3, lread-char-surrogate-4)
(lread-string-char-number-2, lread-string-char-number-3):
New tests.
(lread-string-char-number-1): Rename from lread-string-char-number.
author: Paul Eggert 2016-04-21 19:26:34 -0700
committer: Paul Eggert 2016-04-21 19:29:41 -0700
commit: bd1c7ca67e7429e07f78d4ff49163fd7a67a6765 (patch)
tree: 941d5cf573be2a4588468b3a315c0c6cb47e2c97 /src/character.h
parent: e7cb38edc946ff60c1c878b30b068376d6ef56d2 (diff)
1 files changed, 6 insertions, 7 deletions
diff --git a/src/character.h b/src/character.h
index bc3e1557844..586f330fba9 100644
--- a/src/character.h
+++ b/src/character.h
@@ -612,14 +612,13 @@ sanitize_char_width (EMACS_INT width)
   : (c) <= 0xE01EF ? (c) - 0xE0100 + 17        \
   : 0)
 
-/* If C is a high surrogate, return 1.  If C is a low surrogate,
-   return 2.  Otherwise, return 0.  */
+/* Return true if C is a surrogate.  */
 
-#define CHAR_SURROGATE_PAIR_P(c)        \
-  ((c) < 0xD800 ? 0                     \
-   : (c) <= 0xDBFF ? 1                  \
-   : (c) <= 0xDFFF ? 2                  \
-   : 0)
+INLINE bool
+char_surrogate_p (int c)
+{
+  return 0xD800 <= c && c <= 0xDFFF;
+}
 
 /* Data type for Unicode general category.
