| field | value | date |
|---|---|---|
| author | YAMAMOTO Mitsuharu | 2019-04-27 18:33:39 +0900 |
| committer | YAMAMOTO Mitsuharu | 2019-04-27 18:33:39 +0900 |
| commit | 886bedb36c7b959b7e6fc8ce8e0c04e144b0ae28 | |
| tree | b5770d9fc10a704ad8aeb3474c6940121252c770 /admin/notes/unicode | |
| parent | 015a6e1df2772bd43680df5cbeaffccf98a881da | |
| parent | 8dc00b2f1e6523c634df3e24379afbe712a32b27 | |
Merge branch 'master' into harfbuzz
Diffstat (limited to 'admin/notes/unicode')
| -rw-r--r-- | admin/notes/unicode | 55 |
1 files changed, 26 insertions, 29 deletions
diff --git a/admin/notes/unicode b/admin/notes/unicode index 40f93fc216f..4d6aa6e9a9e 100644 --- a/admin/notes/unicode +++ b/admin/notes/unicode | |||
| @@ -1,6 +1,6 @@ | |||
| 1 | -*-mode: text; coding: utf-8;-*- | 1 | -*-mode: text; coding: utf-8;-*- |
| 2 | 2 | ||
| 3 | Copyright (C) 2002-2018 Free Software Foundation, Inc. | 3 | Copyright (C) 2002-2019 Free Software Foundation, Inc. |
| 4 | See the end of the file for license conditions. | 4 | See the end of the file for license conditions. |
| 5 | 5 | ||
| 6 | Importing a new Unicode Standard version into Emacs | 6 | Importing a new Unicode Standard version into Emacs |
| @@ -11,15 +11,20 @@ Emacs uses the following files from the Unicode Character Database | |||
| 11 | 11 | ||
| 12 | . UnicodeData.txt | 12 | . UnicodeData.txt |
| 13 | . Blocks.txt | 13 | . Blocks.txt |
| 14 | . BidiMirroring.txt | ||
| 15 | . BidiBrackets.txt | 14 | . BidiBrackets.txt |
| 15 | . BidiCharacterTest.txt | ||
| 16 | . BidiMirroring.txt | ||
| 16 | . IVD_Sequences.txt | 17 | . IVD_Sequences.txt |
| 17 | . NormalizationTest.txt | 18 | . NormalizationTest.txt |
| 18 | . SpecialCasing.txt | 19 | . SpecialCasing.txt |
| 19 | . BidiCharacterTest.txt | ||
| 20 | 20 | ||
| 21 | First, the first 7 files need to be copied into admin/unidata/, and | 21 | First, the first 7 files need to be copied into admin/unidata/, and |
| 22 | then Emacs should be rebuilt for them to take effect. Rebuilding | 22 | the file https://www.unicode.org/copyright.html should be copied over |
| 23 | copyright.html in admin/unidata (that file might need trailing | ||
| 24 | whitespace removed before it can be committed to the Emacs | ||
| 25 | repository). | ||
| 26 | |||
| 27 | Then Emacs should be rebuilt for them to take effect. Rebuilding | ||
| 23 | Emacs updates several derived files elsewhere in the Emacs source | 28 | Emacs updates several derived files elsewhere in the Emacs source |
| 24 | tree, mainly in lisp/international/. | 29 | tree, mainly in lisp/international/. |
| 25 | 30 | ||
| @@ -28,7 +33,10 @@ files, pay attention to any warning or error messages. In particular, | |||
| 28 | admin/unidata/unidata-gen.el will complain if UnicodeData.txt defines | 33 | admin/unidata/unidata-gen.el will complain if UnicodeData.txt defines |
| 29 | new bidirectional attributes of characters, because unidata-gen.el, | 34 | new bidirectional attributes of characters, because unidata-gen.el, |
| 30 | bidi.c and dispextern.h need to be updated in that case; failure to do | 35 | bidi.c and dispextern.h need to be updated in that case; failure to do |
| 31 | so will cause aborts in redisplay. | 36 | so will cause aborts in redisplay. unidata-gen.el will also complain |
| 37 | if the format of the Unicode Copyright notice in copyright.html | ||
| 38 | changed in significant ways; in that case, update the regular | ||
| 39 | expression in unidata-gen-file used to extract the copyright string. | ||
| 32 | 40 | ||
| 33 | Next, review the changes in UnicodeData.txt vs the previous version | 41 | Next, review the changes in UnicodeData.txt vs the previous version |
| 34 | used by Emacs. Any changes, be it introduction of new scripts or | 42 | used by Emacs. Any changes, be it introduction of new scripts or |
| @@ -40,7 +48,12 @@ and see if any changes in admin/unidata/blocks.awk are required. | |||
| 40 | 48 | ||
| 41 | The setting of char-width-table around line 1200 of characters.el | 49 | The setting of char-width-table around line 1200 of characters.el |
| 42 | should be checked against the latest version of the Unicode file | 50 | should be checked against the latest version of the Unicode file |
| 43 | EastAsianWidth.txt, and any discrepancies fixed. | 51 | EastAsianWidth.txt, and any discrepancies fixed: double-width |
| 52 | characters are those marked with W or F in that file. Zero-width | ||
| 53 | characters are not taken from EastAsianWidth.txt, they are those whose | ||
| 54 | Unicode General Category property is one of Mn, Me, or Cf, and also | ||
| 55 | Hangul jungseong and jongseong characters (a.k.a. "Jamo medial vowels" | ||
| 56 | and "Jamo final consonants"). | ||
| 44 | 57 | ||
| 45 | Any new scripts added by UnicodeData.txt will also need updates to | 58 | Any new scripts added by UnicodeData.txt will also need updates to |
| 46 | script-representative-chars defined in fontset.el, and also the list | 59 | script-representative-chars defined in fontset.el, and also the list |
| @@ -230,37 +243,21 @@ nontrivial changes to the build process. | |||
| 230 | 243 | ||
| 231 | admin/charsets/mapfiles/cns2ucsdkw.txt | 244 | admin/charsets/mapfiles/cns2ucsdkw.txt |
| 232 | 245 | ||
| 233 | * iso-2022-7bit | 246 | * iso-2022-jp |
| 234 | 247 | ||
| 235 | Each of these files contains just one CJK charset, but Emacs | 248 | This contains just one CJK charset, but Emacs currently has no |
| 236 | currently has no easy way to specify set-charset-priority on a | 249 | easy way to specify set-charset-priority on a per-file basis, so |
| 237 | per-file basis, so converting any of these files to UTF-8 might | 250 | converting this file to UTF-8 might change the file's appearance |
| 238 | change the file's appearance when viewed by an Emacs that is | 251 | when viewed by an Emacs that is operating in some other language |
| 239 | operating in some other language environment. | 252 | environment. |
| 240 | 253 | ||
| 241 | etc/tutorials/TUTORIAL.ja | 254 | etc/tutorials/TUTORIAL.ja |
| 242 | lisp/international/ja-dic-cnv.el | ||
| 243 | lisp/international/ja-dic-utl.el | ||
| 244 | lisp/international/kinsoku.el | ||
| 245 | lisp/international/kkc.el | ||
| 246 | lisp/international/titdic-cnv.el | ||
| 247 | lisp/language/japan-util.el | ||
| 248 | lisp/language/japanese.el | ||
| 249 | lisp/leim/quail/cyril-jis.el | ||
| 250 | lisp/leim/quail/hanja-jis.el | ||
| 251 | lisp/leim/quail/japanese.el | ||
| 252 | lisp/leim/quail/py-punct.el | ||
| 253 | lisp/leim/quail/pypunct-b5.el | ||
| 254 | |||
| 255 | This file contains just Chinese characters, and has same problem. | ||
| 256 | Also, it contains characters that cannot be encoded in UTF-8. | ||
| 257 | |||
| 258 | lisp/international/titdic-cnv.el | ||
| 259 | 255 | ||
| 260 | * utf-8-emacs | 256 | * utf-8-emacs |
| 261 | 257 | ||
| 262 | These files contain characters that cannot be encoded in UTF-8. | 258 | These files contain characters that cannot be encoded in UTF-8. |
| 263 | 259 | ||
| 260 | lisp/international/titdic-cnv.el | ||
| 264 | lisp/language/ethio-util.el | 261 | lisp/language/ethio-util.el |
| 265 | lisp/language/ethiopic.el | 262 | lisp/language/ethiopic.el |
| 266 | lisp/language/ind-util.el | 263 | lisp/language/ind-util.el |
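
The char-width-table rule added in the hunk above (double-width means W or F in EastAsianWidth.txt; zero-width means General Category Mn, Me, or Cf) can be illustrated with a small Python sketch. This is not how Emacs or characters.el performs the check. The function names and the sample excerpt are hypothetical, and the zero-width helper covers only the General Category part of the rule, not the Hangul jungseong and jongseong characters.

```python
import unicodedata


def parse_wide_ranges(lines):
    """Return (start, end) code point ranges marked W or F.

    Assumes the usual UCD line layout "CODE;WIDTH # comment" or
    "START..END;WIDTH # comment"; whitespace around ';' is tolerated.
    """
    ranges = []
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue  # blank or comment-only line
        fields = [field.strip() for field in data.split(";")]
        if len(fields) < 2:
            continue
        codepoints, width = fields[0], fields[1]
        if width not in ("W", "F"):
            continue
        if ".." in codepoints:
            lo, hi = codepoints.split("..")
        else:
            lo = hi = codepoints
        ranges.append((int(lo, 16), int(hi, 16)))
    return ranges


def is_zero_width_by_category(char):
    """True if the General Category is Mn, Me, or Cf.

    Hangul jungseong/jongseong, which the note also lists as
    zero-width, are deliberately not handled in this sketch.
    """
    return unicodedata.category(char) in ("Mn", "Me", "Cf")


# Hypothetical excerpt in the EastAsianWidth.txt layout, for illustration only.
SAMPLE = """\
# EastAsianWidth sample (hypothetical excerpt)
0020;Na          # Zs         SPACE
1100..115F;W     # Lo    [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER
FF01..FF03;F     # Po     [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN
3000;F           # Zs         IDEOGRAPHIC SPACE
""".splitlines()

if __name__ == "__main__":
    for start, end in parse_wide_ranges(SAMPLE):
        print(f"{start:04X}..{end:04X}")
```

The ranges returned by a helper like this could then be compared by hand against the entries in char-width-table; any automated comparison would need access to the table inside a running Emacs, which is outside the scope of this sketch.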