author Eli Zaretskii 2019-03-09 12:41:48 +0200
committer Eli Zaretskii 2019-03-09 12:41:48 +0200
commit fddb915d234515af81dce30982a8dd22568b4e84
tree 7fc4d497bd317df930e6f492a6bddbf0ba1e5b96 /admin/notes
parent 4e082ce3941a9c1fcaae509897761d3e24e08625
Import Unicode 12.0 data files

* admin/unidata/copyright.html:
* admin/unidata/UnicodeData.txt:
* admin/unidata/SpecialCasing.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/BidiBrackets.txt: New versions from Unicode 12.0.
* admin/unidata/unidata-gen.el (unidata-gen-file):
* admin/unidata/blocks.awk (name2alias): Adapt to changes in new
data files.
* admin/notes/unicode: Update and improve instructions for importing
a new Unicode Standard.
* lisp/international/characters.el (char-width-table): Update lists
of characters according to Unicode 12.0.
* lisp/international/fontset.el (script-representative-chars): Add
characters from new scripts to 'script-representative-chars'.
(otf-script-alist): Update according to data on the MS site.
* lisp/international/mule-cmds.el (ucs-names): Update unused ranges
of codepoints according to Unicode 12.0.
* test/lisp/international/ucs-normalize-tests.el
(ucs-normalize-tests--failing-lines-part1)
(ucs-normalize-tests--failing-lines-part2): Update for the new
NormalizationTest.txt file.
* test/manual/BidiCharacterTest.txt: Update with the new version
from Unicode 12.0.

Diffstat (limited to 'admin/notes')
-rw-r--r-- admin/notes/unicode | 23
1 files changed, 18 insertions, 5 deletions
diff --git a/admin/notes/unicode b/admin/notes/unicode
index bbee3e9de7f..4d6aa6e9a9e 100644
--- a/admin/notes/unicode
+++ b/admin/notes/unicode
@@ -11,15 +11,20 @@ Emacs uses the following files from the Unicode Character Database
 
   . UnicodeData.txt
   . Blocks.txt
-  . BidiMirroring.txt
   . BidiBrackets.txt
+  . BidiCharacterTest.txt
+  . BidiMirroring.txt
   . IVD_Sequences.txt
   . NormalizationTest.txt
   . SpecialCasing.txt
-  . BidiCharacterTest.txt
 
 First, the first 7 files need to be copied into admin/unidata/, and
-then Emacs should be rebuilt for them to take effect. Rebuilding
+the file https://www.unicode.org/copyright.html should be copied over
+copyright.html in admin/unidata (that file might need trailing
+whitespace removed before it can be committed to the Emacs
+repository).
+
+Then Emacs should be rebuilt for them to take effect. Rebuilding
 Emacs updates several derived files elsewhere in the Emacs source
 tree, mainly in lisp/international/.
 
@@ -28,7 +33,10 @@ files, pay attention to any warning or error messages. In particular,
 admin/unidata/unidata-gen.el will complain if UnicodeData.txt defines
 new bidirectional attributes of characters, because unidata-gen.el,
 bidi.c and dispextern.h need to be updated in that case; failure to do
-so will cause aborts in redisplay.
+so will cause aborts in redisplay. unidata-gen.el will also complain
+if the format of the Unicode Copyright notice in copyright.html
+changed in significant ways; in that case, update the regular
+expression in unidata-gen-file used to extract the copyright string.
 
 Next, review the changes in UnicodeData.txt vs the previous version
 used by Emacs. Any changes, be it introduction of new scripts or
@@ -40,7 +48,12 @@ and see if any changes in admin/unidata/blocks.awk are required.
 
 The setting of char-width-table around line 1200 of characters.el
 should be checked against the latest version of the Unicode file
-EastAsianWidth.txt, and any discrepancies fixed.
+EastAsianWidth.txt, and any discrepancies fixed: double-width
+characters are those marked with W or F in that file. Zero-width
+characters are not taken from EastAsianWidth.txt, they are those whose
+Unicode General Category property is one of Mn, Me, or Cf, and also
+Hangul jungseong and jongseong characters (a.k.a. "Jamo medial vowels"
+and "Jamo final consonants").
 
 Any new scripts added by UnicodeData.txt will also need updates to
 script-representative-chars defined in fontset.el, and also the list
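
The width rule added in the last hunk can be illustrated with a short
Python sketch. This is not how Emacs builds char-width-table; it only
restates the rule from the note. The function name approx_char_width
and the Hangul Jamo ranges below are assumptions made for this
illustration, and the results depend on the Unicode version bundled
with the Python interpreter, not on the Unicode 12.0 files imported by
this commit.

```python
import unicodedata

# Assumed code point ranges for Hangul jungseong (medial vowels) and
# jongseong (final consonants): the Hangul Jamo block from U+1160 and
# the Hangul Jamo Extended-B block.  These ranges are an assumption of
# this sketch and are not taken from the Emacs sources.
HANGUL_MEDIAL_FINAL = ((0x1160, 0x11FF), (0xD7B0, 0xD7FF))


def approx_char_width(ch):
    """Return 0, 1 or 2 following the rule described in the note."""
    cp = ord(ch)
    # Zero width: General Category Mn, Me or Cf.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # Zero width: Hangul jungseong and jongseong characters.
    if any(lo <= cp <= hi for lo, hi in HANGUL_MEDIAL_FINAL):
        return 0
    # Double width: East Asian Width property W (wide) or F (fullwidth).
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else is treated as single width in this sketch.
    return 1
```

Comparing the output of such a function against the ranges listed in
characters.el could help spot discrepancies, but the authoritative
check remains the one described in the note: the latest
EastAsianWidth.txt and UnicodeData.txt.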