author Kenichi Handa 2005-09-15 02:55:22 +0000
committer Kenichi Handa 2005-09-15 02:55:22 +0000
commit ce9b56fe13c59f2c718c03d6ea7f9dc3a0619e42
tree bd25e247d301294533fb001d7185cfd72d97bae7
parent 503ac8a45f01bbd5f63359d629b528322c830b8c
Fix the paragraph describing the limitation of
UTF-8/16/7.
-rw-r--r-- etc/ChangeLog 5
-rw-r--r-- etc/PROBLEMS 20
2 files changed, 15 insertions, 10 deletions
diff --git a/etc/ChangeLog b/etc/ChangeLog
index bcd49751247..316ec3e4cd1 100644
--- a/etc/ChangeLog
+++ b/etc/ChangeLog
@@ -1,3 +1,8 @@
+2005-09-15 Kenichi Handa <handa@m17n.org>
+
+	* PROBLEMS: Fix the paragraph describing the limitation of
+	UTF-8/16/7.
+
 2005-09-14 Romain Francoise <romain@orebokech.com>
 
 	* NEWS: Add entry for write-region-inhibit-fsync.
diff --git a/etc/PROBLEMS b/etc/PROBLEMS
index ae9a42bde6d..3b9dc6b17ff 100644
--- a/etc/PROBLEMS
+++ b/etc/PROBLEMS
@@ -841,9 +841,16 @@ mule-unicode-0100-24ff:-gnu-unifont-*-iso10646-1
 
 ** The UTF-8/16/7 coding systems don't encode CJK (Far Eastern) characters.
 
-Emacs by default only supports the parts of the Unicode BMP whose code
-points are in the ranges 0000-33ff and e000-ffff. This excludes: most
-of CJK, Yi and Hangul, as well as everything outside the BMP.
+Emacs directly supports the Unicode BMP whose code points are in the
+ranges 0000-33ff and e000-ffff, and indirectly supports the parts of
+CJK characters belonging to these legacy charsets:
+
+ GB2312, Big5, JISX0208, JISX0212, JISX0213-1, JISX0213-2, KSC5601
+
+The latter support is done in Utf-Translate-Cjk mode (turned on by
+default). Which Unicode CJK characters are decoded into which Emacs
+charset is decided by the current language environment. For instance,
+in Chinese-GB, most of them are decoded into chinese-gb2312.
 
 If you read UTF-8 data with code points outside these ranges, the
 characters appear in the buffer as raw bytes of the original UTF-8
@@ -853,13 +860,6 @@ If you read such characters from UTF-16 or UTF-7 data, they are
 substituted with the Unicode `replacement character', and you lose
 information.
 
-To edit such UTF data, turn on Utf-Translate-Cjk mode, which makes
-many common CJK characters available for encoding and decoding and can
-be extended by updating the tables it uses. This also allows you to
-save as UTF buffers containing characters decoded by the chinese-,
-japanese- and korean- coding systems, e.g. cut and pasted from
-elsewhere.
-
 ** Mule-UCS loads very slowly.
 
 Changes to Emacs internals interact badly with Mule-UCS's `un-define'