author Kenichi Handa 2005-04-01 00:29:51 +0000
committer Kenichi Handa 2005-04-01 00:29:51 +0000
commit 6fa886202fdca940dd16e9f0b863347c4f565e8a
tree d007919400bd9acc188ec3ea308b3463f0eb30f1
parent 9b06ffa3dc182766ec67ee4fe06c2f7141602bc2
(Coding System Basics): Describe roundtrip identity of coding
systems.
Diffstat (limited to 'lispref')
-rw-r--r-- lispref/nonascii.texi 22
1 files changed, 22 insertions, 0 deletions
diff --git a/lispref/nonascii.texi b/lispref/nonascii.texi
index 70e77e0a837..91a47ea50f9 100644
--- a/lispref/nonascii.texi
+++ b/lispref/nonascii.texi
@@ -628,6 +628,28 @@ characters; for example, there are three coding systems for the Cyrillic
 conversion, but some of them leave the choice unspecified---to be chosen
 heuristically for each file, based on the data.
 
+In general, a coding system doesn't guarantee roundtrip identity:
+decoding a byte sequence and then encoding the result in the same coding
+system can produce a different byte sequence. But there are several
+coding systems that do guarantee that the result will be the same as
+what you originally decoded. They are:
+
+@quotation
+chinese-big5 chinese-iso-8bit cyrillic-iso-8bit emacs-mule
+greek-iso-8bit hebrew-iso-8bit iso-latin-1 iso-latin-2 iso-latin-3
+iso-latin-4 iso-latin-5 iso-latin-8 iso-latin-9 iso-safe
+japanese-iso-8bit japanese-shift-jis korean-iso-8bit raw-text
+@end quotation
+
+Likewise, a coding system doesn't guarantee roundtrip identity in the
+other direction: encoding buffer text in a coding system and then
+decoding the result with the same coding system can produce different
+buffer text. For instance, when you encode Latin-2 characters with
+@code{utf-8} and decode them back with the same coding system, you'll get
+Unicode characters (of charset @code{mule-unicode-0100-24ff}), and when
+you encode Unicode characters with @code{iso-latin-2} and decode them back
+with the same coding system, you'll get Latin-2 characters.
+
 @cindex end of line conversion
   @dfn{End of line conversion} handles three different conventions used
 on various systems for representing end of line in files. The Unix