Correct Unicode stuff.

author: Dave Love 2003-05-29 18:15:21 +0000
committer: Dave Love 2003-05-29 18:15:21 +0000
commit: fc1bfc2a53ee010adceb644a636f10a084e8197c
tree: d5e43869c550358b9ed960a72baad4d50e8ed36e /etc/PROBLEMS
parent: 074468698d68e98cd9b66f4f329e1526228dad05
1 files changed, 25 insertions, 16 deletions
diff --git a/etc/PROBLEMS b/etc/PROBLEMS
index 1574b16a444..2a385ed6313 100644
--- a/etc/PROBLEMS
+++ b/etc/PROBLEMS
@@ -15,30 +15,39 @@ problems with the unexec code and its interaction with libSystem.B.
 * Characters from the mule-unicode charsets aren't displayed under X.
 
 XFree86 4 contains many fonts in iso10646-1 encoding which have
-minimal character repertoires (whereas the encoding is meant to be a
-reasonable indication of the repertoire).  Emacs may choose one of
-these to display characters from the mule-unicode charsets and then
-typically won't be able to find the glyphs to display many characters.
-(Check with C-u C-x = .)  To avoid this, you may need to use a fontset
-which sets the font for the mule-unicode sets explicitly.  E.g. to use
-GNU unifont, include in the fontset spec:
+minimal character repertoires (whereas the encoding part of the font
+name is meant to be a reasonable indication of the repertoire
+according to the XLFD spec).  Emacs may choose one of these to display
+characters from the mule-unicode charsets and then typically won't be
+able to find the glyphs to display many characters.  (Check with C-u
+C-x = .)  To avoid this, you may need to use a fontset which sets the
+font for the mule-unicode sets explicitly.  E.g. to use GNU unifont,
+include in the fontset spec:
 
 mule-unicode-2500-33ff:-gnu-unifont-*-iso10646-1,\
 mule-unicode-e000-ffff:-gnu-unifont-*-iso10646-1,\
 mule-unicode-0100-24ff:-gnu-unifont-*-iso10646-1
 
-* Encoding some characters as Unicode (UTF-8/16) is rejected by Emacs.
+* The UTF-8/16/7 coding systems don't encode CJK (Far Eastern) characters.
 
-Emacs currently, by default, only supports the parts of the BMP whose
-codepoints are in the ranges 0000-33ff and e000-ffff.  This excludes
-CJK, Yi, Music, Maths, Private Use Area, Gothic, and Old Italic.
+Emacs by default only supports the parts of the Unicode BMP whose code
+points are in the ranges 0000-33ff and e000-ffff.  This excludes: most
+of CJK, Yi and Hangul, as well as everything outside the BMP.
 
-If you try to save a file containing characters with code points
-outside this range, Emacs will suggest other compatible coding
-systems.
+If you read UTF-8 data with code points outside these ranges, the
+characters appear in the buffer as raw bytes of the original UTF-8
+(composed into a single quasi-character) and they will be written back
+correctly as UTF-8, assuming you don't break the composed sequences.
+If you read such characters from UTF-16 or UTF-7 data, they are
+substituted with the Unicode `replacement character', and you lose
+information.
 
-By turning Utf-Translate-Cjk mode on, many more CJK characters are
-included in the support.
+To edit such UTF data, turn on Utf-Translate-Cjk mode, which makes
+many common CJK characters available for encoding and decoding and can
+be extended by updating the tables it uses.  This also allows you to
+save as UTF buffers containing characters decoded by the chinese-,
+japanese- and korean- coding systems, e.g. cut and pasted from
+elsewhere.
 
 * Problems with file dialogs in Emacs built with Open Motif.
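
The code point ranges quoted in the new PROBLEMS text (0000-33ff and
e000-ffff) can be checked mechanically.  The following is a minimal
sketch in Python, not Emacs code: the names SUPPORTED_RANGES,
in_default_repertoire and unsupported_chars are hypothetical, and the
script only tests membership in the two quoted ranges.  It does not
model Utf-Translate-Cjk mode or any other Emacs behaviour.

```python
# Ranges quoted in the PROBLEMS entry: 0000-33ff and e000-ffff (inclusive).
# These constants are taken from the text above; the names are illustrative.
SUPPORTED_RANGES = ((0x0000, 0x33FF), (0xE000, 0xFFFF))


def in_default_repertoire(ch):
    """Return True if the single character ch lies in one of the quoted ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SUPPORTED_RANGES)


def unsupported_chars(text):
    """Return the distinct characters of text outside the quoted ranges,
    in order of first appearance."""
    seen = []
    for ch in text:
        if not in_default_repertoire(ch) and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    sample = "box \u2500, CJK \u4e2d, Hangul \uac00, Gothic \U00010330"
    for ch in unsupported_chars(sample):
        print("U+%04X is outside the quoted ranges" % ord(ch))
```

Running such a check over a file before opening it as UTF-16 or UTF-7
would show which characters fall outside the quoted ranges and so, per
the entry above, may be replaced on reading.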