author Kenichi Handa 2009-06-17 01:14:36 +0000
committer Kenichi Handa 2009-06-17 01:14:36 +0000
commit 3af970a06ef84118eea62944da16ba37b4bb41d9
tree 1d107a868a65061627a5ca64d85fadc4d68aae2f
parent 7f1faf1cc202ec9ee543bd9c6b35d89e162fbe5b
(Charsets): Update the description for the new charset.
(list-character-sets): New findex.
-rw-r--r-- doc/emacs/mule.texi 56
1 files changed, 37 insertions, 19 deletions
diff --git a/doc/emacs/mule.texi b/doc/emacs/mule.texi
index 9302ef2f988..a663d206536 100644
--- a/doc/emacs/mule.texi
+++ b/doc/emacs/mule.texi
@@ -1620,30 +1620,48 @@ Use @kbd{C-x 8 C-h} to list all the available @kbd{C-x 8} translations.
 @section Charsets
 @cindex charsets
 
- Emacs groups all supported characters into disjoint @dfn{charsets}.
-Each character code belongs to one and only one charset. For
-historical reasons, Emacs typically divides an 8-bit character code
-for an extended version of @acronym{ASCII} into two charsets:
-@acronym{ASCII}, which covers the codes 0 through 127, plus another
-charset which covers the ``right-hand part'' (the codes 128 and up).
-For instance, the characters of Latin-1 include the Emacs charset
-@code{ascii} plus the Emacs charset @code{latin-iso8859-1}.
-
- Emacs characters belonging to different charsets may look the same,
-but they are still different characters. For example, the letter
-@samp{o} with acute accent in charset @code{latin-iso8859-1}, used for
-Latin-1, is different from the letter @samp{o} with acute accent in
-charset @code{latin-iso8859-2}, used for Latin-2.
+ Emacs defines most of popular character sets (e.g. ascii,
+iso-8859-1, cp1250, big5, unicode) as @dfn{charsets} and a few of its
+own charsets (e.g. emacs, unicode-bmp, eight-bit). All supported
+characters belong to one or more charsets. Usually you don't have to
+take care of ``charset'', but knowing about it may help understanding
+the behavior of Emacs in some cases.
+
+ One example is a font selection. In each language environment,
+charsets have different priorities. Emacs, at first, tries to use a
+font that matches with charsets of higher priority. For instance, in
+Japanese language environment, the charset @code{japanese-jisx0208}
+has the highest priority (@xref{describe-language-environment}). So,
+Emacs tries to use a font whose @code{registry} property is
+``JISX0208.1983-0'' for characters belonging to that charset.
+
+ Another example is a use of @code{charset} text property. When
+Emacs reads a file encoded in a coding systems that uses escape
+sequences to switch charsets (e.g. iso-2022-int-1), the buffer text
+keep the information of the original charset by @code{charset} text
+property. By using this information, Emacs can write the file with
+the same byte sequence as the original.
 
 @findex list-charset-chars
 @cindex characters in a certain charset
 @findex describe-character-set
  There are two commands for obtaining information about Emacs
-charsets. The command @kbd{M-x list-charset-chars} prompts for a name
-of a character set, and displays all the characters in that character
-set. The command @kbd{M-x describe-character-set} prompts for a
-charset name and displays information about that charset, including
-its internal representation within Emacs.
+charsets. The command @kbd{M-x list-charset-chars} prompts for a
+charset name, and displays all the characters in that character set.
+The command @kbd{M-x describe-character-set} prompts for a charset
+name and displays information about that charset, including its
+internal representation within Emacs.
+
+@findex list-character-sets
+ To display a list of all the supported charsets, type @kbd{M-x
+list-character-sets}. The list gives the names of charsets and
+additional information to identity each charset (see ISO/IEC's this
+page <http://www.itscj.ipsj.or.jp/ISO-IR/> for the detail). In the
+list, charsets are categorized into two; the normal charsets are
+listed first, and the supplementary charsets are listed last. A
+charset in the latter category is used for defining another charset
+(as a parent or a subset), or was used only in Emacs of the older
+versions.
 
  To find out which charset a character in the buffer belongs to,
 put point before it and type @kbd{C-u C-x =}.