* basic.texi (Inserting Text): Document ucs-insert.

* mule.texi (International Chars): Define "multibyte". Note that internal representation is unicode-based. Simplify definition of raw bytes. Mention ucs-insert.
(Enabling Multibyte): Remove obsolete discussion. Copyedits.
(Language Environments): Add language environments new to Emacs 23.
(Multibyte Conversion): Node deleted.
(Coding Systems): Remove obsolete unify-8859-on-decoding-mode. Don't mention obsolete emacs-mule coding system.
(Output Coding): Copyedits.

* emacs.texi (Top): Update node listing.
author: Chong Yidong 2009-05-06 03:55:12 +0000
committer: Chong Yidong 2009-05-06 03:55:12 +0000
commit: ad36c4224c80d2ca0caf26a8fe9a96cdce43d64b (patch)
tree: 79bf0100a27e4766a619b80b4e9abae4d691329e
parent: 5996e1b74c382855c5af0ccf737aeef3ad5f4626 (diff)
download: emacs-ad36c4224c80d2ca0caf26a8fe9a96cdce43d64b.tar.gz
emacs-ad36c4224c80d2ca0caf26a8fe9a96cdce43d64b.zip
4 files changed, 162 insertions, 207 deletions
diff --git a/doc/emacs/ChangeLog b/doc/emacs/ChangeLog
index fc2c277972c..29be8c714d3 100644
--- a/doc/emacs/ChangeLog
+++ b/doc/emacs/ChangeLog
@@ -1,3 +1,19 @@
+2009-05-06  Chong Yidong  <cyd@stupidchicken.com>
+        * basic.texi (Inserting Text): Document ucs-insert.
+        * mule.texi (International Chars): Define "multibyte".  Note that
+        internal representation is unicode-based.  Simplify definition of raw
+        bytes.  Mention ucs-insert.
+        (Enabling Multibyte): Remove obsolete discussion.  Copyedits.
+        (Language Environments): Add language environments new to Emacs 23.
+        (Multibyte Conversion): Node deleted.
+        (Coding Systems): Remove obsolete unify-8859-on-decoding-mode.  Don't
+        mention obsolete emacs-mule coding system.
+        (Output Coding): Copyedits.
+        * emacs.texi (Top): Update node listing.
 2009-05-05  Per Starbäck  <per@starback.se>  (tiny change)
        * trouble.texi (Lossage): Use new binding of view-emacs-problems.
diff --git a/doc/emacs/basic.texi b/doc/emacs/basic.texi
index 710a093f495..72ab17c33ac 100644
--- a/doc/emacs/basic.texi
+++ b/doc/emacs/basic.texi
@@ -64,9 +64,11 @@ key; other keys act as editing commands and do not insert themselves.
 For instance, @kbd{DEL} runs the command @code{delete-backward-char}
 by default (some modes bind it to a different command); it does not
 insert a literal @samp{DEL} character (@acronym{ASCII} character code
-127).  To insert a non-graphic character, first @dfn{quote} it by
+127).
-typing @kbd{C-q} (@code{quoted-insert}).  There are two ways to use
-@kbd{C-q}:
+  To insert a non-graphic character, or a character that your keyboard
+does not support, first @dfn{quote} it by typing @kbd{C-q}
+(@code{quoted-insert}).  There are two ways to use @kbd{C-q}:
 @itemize @bullet
 @item
@@ -87,32 +89,24 @@ Overwrite mode, to give you a convenient way to insert a digit instead
 of overwriting with it.
 @end itemize
-@cindex 8-bit character codes
-@noindent
-If you specify a code in the octal range 0200 through 0377, @kbd{C-q}
-assumes that you intend to use some ISO 8859-@var{n} character set,
-and converts the specified code to the corresponding Emacs character
-code.  Your choice of language environment determines which of the ISO
-8859 character sets to use (@pxref{Language Environments}).  This
-feature is disabled if multibyte characters are disabled
-(@pxref{Enabling Multibyte}).
 @vindex read-quoted-char-radix
+@noindent
 To use decimal or hexadecimal instead of octal, set the variable
-@code{read-quoted-char-radix} to 10 or 16.  If the radix is greater than
+@code{read-quoted-char-radix} to 10 or 16.  If the radix is greater
-10, some letters starting with @kbd{a} serve as part of a character
+than 10, some letters starting with @kbd{a} serve as part of a
-code, just like digits.
+character code, just like digits.
-A numeric argument tells @kbd{C-q} how many copies of the quoted
+  A numeric argument tells @kbd{C-q} how many copies of the quoted
 character to insert (@pxref{Arguments}).
-@findex newline
+@findex ucs-insert
-@findex self-insert
+@cindex Unicode
-  Customization information: @key{DEL} in most modes runs the command
+  Instead of @kbd{C-q}, you can use @kbd{C-x 8 @key{RET}}
-@code{delete-backward-char}; @key{RET} runs the command
+(@code{ucs-insert}) to insert a character based on its Unicode name or
-@code{newline}, and self-inserting printing characters run the command
+code-point.  This command prompts for a character to insert, using
-@code{self-insert}, which inserts whatever character you typed.  Some
+the minibuffer; you can specify the character using either (i) the
-major modes rebind @key{DEL} to other commands.
+character's name in the Unicode standard, or (ii) the character's
+code-point in the Unicode standard.
 @node Moving Point
 @section Changing the Location of Point
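Note on the basic.texi hunk above: the new paragraph says that `ucs-insert` accepts either a character's Unicode name or its code point. The Python sketch below is not Emacs code. It only illustrates those two ways of naming the same character, using the standard `unicodedata` module. The helper name `char_from_spec` and the "try hexadecimal first, then fall back to the name" rule are assumptions made for illustration, not a description of how `ucs-insert` parses its minibuffer input.

```python
import unicodedata


def char_from_spec(spec: str) -> str:
    """Return the character named by SPEC (hypothetical helper).

    SPEC is read as a hexadecimal code point when it parses as one,
    and otherwise as a Unicode character name.
    """
    text = spec.strip()
    try:
        # Code-point form, e.g. "e9" for U+00E9.
        return chr(int(text, 16))
    except ValueError:
        # Name form, e.g. "LATIN SMALL LETTER E WITH ACUTE".
        # unicodedata.lookup raises KeyError for an unknown name.
        return unicodedata.lookup(text)


if __name__ == "__main__":
    # Both specifications name the same character.
    print(char_from_spec("e9"))
    print(char_from_spec("LATIN SMALL LETTER E WITH ACUTE"))
```

Trying the numeric form first keeps the sketch short. A real implementation would need a rule for inputs that are valid under both readings.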
diff --git a/doc/emacs/emacs.texi b/doc/emacs/emacs.texi
index 4fb083ad22b..717e2b78c3e 100644
--- a/doc/emacs/emacs.texi
+++ b/doc/emacs/emacs.texi
@@ -507,7 +507,6 @@ International Character Set Support
 * Language Environments::   Setting things up for the language you use.
 * Input Methods::           Entering text characters not on your keyboard.
 * Select Input Method::     Specifying your choice of input methods.
-* Multibyte Conversion::    How single-byte characters convert to multibyte.
 * Coding Systems::          Character set conversion when you read and
                              write files, and so on.
 * Recognize Coding::        How Emacs figures out which conversion to use.
diff --git a/doc/emacs/mule.texi b/doc/emacs/mule.texi
index a622722f1c6..aa25ed371de 100644
--- a/doc/emacs/mule.texi
+++ b/doc/emacs/mule.texi
@@ -89,7 +89,6 @@ to make sure Emacs interprets keyboard input correctly; see
 * Language Environments::   Setting things up for the language you use.
 * Input Methods::           Entering text characters not on your keyboard.
 * Select Input Method::     Specifying your choice of input methods.
-* Multibyte Conversion::    How single-byte characters convert to multibyte.
 * Coding Systems::          Character set conversion when you read and
                              write files, and so on.
 * Recognize Coding::        How Emacs figures out which conversion to use.
@@ -115,14 +114,17 @@ to make sure Emacs interprets keyboard input correctly; see
  The users of international character sets and scripts have
 established many more-or-less standard coding systems for storing
-files.  Emacs internally uses a single multibyte character encoding,
+files.  These coding systems are typically @dfn{multibyte}, meaning
-so that it can intermix characters from all these scripts in a single
+that sequences of two or more bytes are used to represent individual
-buffer or string.  This encoding represents each non-@acronym{ASCII}
+non-@acronym{ASCII} characters.
-character as a sequence of bytes in the range 0200 through 0377.
-Emacs translates between the multibyte character encoding and various
+@cindex Unicode
-other coding systems when reading and writing files, when exchanging
+  Internally, Emacs uses its own multibyte character encoding, which
-data with subprocesses, and (in some cases) in the @kbd{C-q} command
+is a superset of the @dfn{Unicode} standard.  This internal encoding
-(@pxref{Multibyte Conversion}).
+allows characters from almost every known script to be intermixed in a
+single buffer or string.  Emacs translates between the multibyte
+character encoding and various other coding systems when reading and
+writing files, and when exchanging data with subprocesses.
 @kindex C-h h
 @findex view-hello-file
@@ -134,10 +136,14 @@ This illustrates various scripts.  If some characters can't be
 displayed on your terminal, they appear as @samp{?} or as hollow boxes
 (@pxref{Undisplayable Characters}).
-  Keyboards, even in the countries where these character sets are used,
+  Keyboards, even in the countries where these character sets are
-generally don't have keys for all the characters in them.  So Emacs
+used, generally don't have keys for all the characters in them.  You
-supports various @dfn{input methods}, typically one for each script or
+can insert characters that your keyboard does not support, using
-language, to make it convenient to type them.
+@kbd{C-q} (@code{quoted-insert}) or @kbd{C-x 8 @key{RET}}
+(@code{ucs-insert}).  @xref{Inserting Text}.  Emacs also supports
+various @dfn{input methods}, typically one for each script or
+language, which make it easier to type characters in the script.
+@xref{Input Methods}.
 @kindex C-x RET
  The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
@@ -165,12 +171,12 @@ system encodes the character safely and with a single byte
 (@pxref{Coding Systems}).  If the character's encoding is longer than
 one byte, Emacs shows @samp{file ...}.
-  However, if the character displayed is in the range 0200 through
+  As a special case, if the character lies in the range 128 (0200
-0377 octal, it may actually stand for an invalid UTF-8 byte read from
+octal) through 159 (0237 octal), it stands for a ``raw'' byte that
-a file.  In Emacs, that byte is represented as a sequence of 8-bit
+does not correspond to any specific displayable character.  Such a
-characters, but all of them together display as the original invalid
+``character'' lies within the @code{eight-bit-control} character set,
-byte, in octal code.  In this case, @kbd{C-x =} shows @samp{part of
+and is displayed as an escaped octal character code.  In this case,
-display ...} instead of @samp{file}.
+@kbd{C-x =} shows @samp{part of display ...} instead of @samp{file}.
 @cindex character set of character at point
 @cindex font of character at point
@@ -235,74 +241,62 @@ There are text properties here:
 @node Enabling Multibyte
 @section Enabling Multibyte Characters
-  By default, Emacs starts in multibyte mode, because that allows you to
+  By default, Emacs starts in multibyte mode: it stores the contents
-use all the supported languages and scripts without limitations.
+of buffers and strings using an internal encoding that represents
+non-@acronym{ASCII} characters using multi-byte sequences.  Multibyte
+mode allows you to use all the supported languages and scripts without
+limitations.
 @cindex turn multibyte support on or off
-  You can enable or disable multibyte character support, either for
+  Under very special circumstances, you may want to disable multibyte
-Emacs as a whole, or for a single buffer.  When multibyte characters
+character support, either for Emacs as a whole, or for a single
-are disabled in a buffer, we call that @dfn{unibyte mode}.  Then each
+buffer.  When multibyte characters are disabled in a buffer, we call
-byte in that buffer represents a character, even codes 0200 through
+that @dfn{unibyte mode}.  In unibyte mode, each character in the
-0377.
+buffer has a character code ranging from 0 through 255 (0377 octal); 0
+through 127 (0177 octal) represent @acronym{ASCII} characters, and 128
-  The old features for supporting the European character sets, ISO
+(0200 octal) through 255 (0377 octal) represent non-@acronym{ASCII}
-Latin-1 and ISO Latin-2, work in unibyte mode as they did in Emacs 19
+characters.
-and also work for the other ISO 8859 character sets.  However, there
-is no need to turn off multibyte character support to use ISO Latin;
-the Emacs multibyte character set includes all the characters in these
-character sets, and Emacs can translate automatically to and from the
-ISO codes.
  To edit a particular file in unibyte representation, visit it using
-@code{find-file-literally}.  @xref{Visiting}.  To convert a buffer in
+@code{find-file-literally}.  @xref{Visiting}.  You can convert a
-multibyte representation into a single-byte representation of the same
+multibyte buffer to unibyte by saving it to a file, killing the
-characters, the easiest way is to save the contents in a file, kill the
+buffer, and visiting the file again with @code{find-file-literally}.
-buffer, and find the file again with @code{find-file-literally}.  You
+Alternatively, you can use @kbd{C-x @key{RET} c}
-can also use @kbd{C-x @key{RET} c}
+(@code{universal-coding-system-argument}) and specify @samp{raw-text}
-(@code{universal-coding-system-argument}) and specify @samp{raw-text} as
+as the coding system with which to visit or save a file.  @xref{Text
-the coding system with which to find or save a file.  @xref{Text
+Coding}.  Unlike @code{find-file-literally}, finding a file as
-Coding}.  Finding a file as @samp{raw-text} doesn't disable format
+@samp{raw-text} doesn't disable format conversion, uncompression, or
-conversion, uncompression and auto mode selection as
+auto mode selection.
-@code{find-file-literally} does.
 @vindex enable-multibyte-characters
 @vindex default-enable-multibyte-characters
+@cindex environment variables, and non-@acronym{ASCII} characters
  To turn off multibyte character support by default, start Emacs with
 the @samp{--unibyte} option (@pxref{Initial Options}), or set the
 environment variable @env{EMACS_UNIBYTE}.  You can also customize
 @code{enable-multibyte-characters} or, equivalently, directly set the
 variable @code{default-enable-multibyte-characters} to @code{nil} in
 your init file to have basically the same effect as @samp{--unibyte}.
+With @samp{--unibyte}, multibyte strings are not created during
-@findex toggle-enable-multibyte-characters
+initialization from the values of environment variables,
-  To convert a unibyte session to a multibyte session, set
+@file{/etc/passwd} entries etc., even if those contain
-@code{default-enable-multibyte-characters} to @code{t}.  Buffers which
+non-@acronym{ASCII} characters.
-were created in the unibyte session before you turn on multibyte support
-will stay unibyte.  You can turn on multibyte support in a specific
-buffer by invoking the command @code{toggle-enable-multibyte-characters}
-in that buffer.
 @cindex Lisp files, and multibyte operation
 @cindex multibyte operation, and Lisp files
 @cindex unibyte operation, and Lisp files
 @cindex init file, and non-@acronym{ASCII} characters
-@cindex environment variables, and non-@acronym{ASCII} characters
-  With @samp{--unibyte}, multibyte strings are not created during
-initialization from the values of environment variables,
-@file{/etc/passwd} entries etc.@: that contain non-@acronym{ASCII} 8-bit
-characters.
  Emacs normally loads Lisp files as multibyte, regardless of whether
-you used @samp{--unibyte}.  This includes the Emacs initialization file,
+you used @samp{--unibyte}.  This includes the Emacs initialization
-@file{.emacs}, and the initialization files of Emacs packages such as
+file, @file{.emacs}, and the initialization files of Emacs packages
-Gnus.  However, you can specify unibyte loading for a particular Lisp
+such as Gnus.  However, you can specify unibyte loading for a
-file, by putting @w{@samp{-*-unibyte: t;-*-}} in a comment on the first
+particular Lisp file, by putting @w{@samp{-*-unibyte: t;-*-}} in a
-line (@pxref{File Variables}).  Then that file is always loaded as
+comment on the first line (@pxref{File Variables}).  Then that file is
-unibyte text, even if you did not start Emacs with @samp{--unibyte}.
+always loaded as unibyte text.  The motivation for these conventions
-The motivation for these conventions is that it is more reliable to
+is that it is more reliable to always load any particular Lisp file in
-always load any particular Lisp file in the same way.  However, you can
+the same way.  However, you can load a Lisp file as unibyte, on any
-load a Lisp file as unibyte, on any one occasion, by typing @kbd{C-x
+one occasion, by typing @kbd{C-x @key{RET} c raw-text @key{RET}}
-@key{RET} c raw-text @key{RET}} immediately before loading it.
+immediately before loading it.
  The mode line indicates whether multibyte character support is
 enabled in the current buffer.  If it is, there are two or more
@@ -312,6 +306,14 @@ convention (colon, backslash, etc.).  When multibyte characters
 are not enabled, nothing precedes the colon except a single dash.
 @xref{Mode Line}, for more details about this.
+@findex toggle-enable-multibyte-characters
+  To convert a unibyte session to a multibyte session, set
+@code{default-enable-multibyte-characters} to @code{t}.  Buffers which
+were created in the unibyte session before you turn on multibyte
+support will stay unibyte.  You can turn on multibyte support in a
+specific buffer by invoking the command
+@code{toggle-enable-multibyte-characters} in that buffer.
 @node Language Environments
 @section Language Environments
 @cindex language environments
@@ -319,43 +321,41 @@ are not enabled, nothing precedes the colon except a single dash.
  All supported character sets are supported in Emacs buffers whenever
 multibyte characters are enabled; there is no need to select a
 particular language in order to display its characters in an Emacs
-buffer.  However, it is important to select a @dfn{language environment}
+buffer.  However, it is important to select a @dfn{language
-in order to set various defaults.  The language environment really
+environment} in order to set various defaults.  Roughly speaking, the
-represents a choice of preferred script (more or less) rather than a
+language environment represents a choice of preferred script rather
-choice of language.
+than a choice of language.
  The language environment controls which coding systems to recognize
 when reading text (@pxref{Recognize Coding}).  This applies to files,
-incoming mail, netnews, and any other text you read into Emacs.  It may
+incoming mail, and any other text you read into Emacs.  It may also
-also specify the default coding system to use when you create a file.
+specify the default coding system to use when you create a file.  Each
-Each language environment also specifies a default input method.
+language environment also specifies a default input method.
 @findex set-language-environment
 @vindex current-language-environment
-  To select a language environment, you can customize the variable
+  To select a language environment, customize the variable
 @code{current-language-environment} or use the command @kbd{M-x
 set-language-environment}.  It makes no difference which buffer is
-current when you use this command, because the effects apply globally to
+current when you use this command, because the effects apply globally
-the Emacs session.  The supported language environments include:
+to the Emacs session.  The supported language environments include:
 @cindex Euro sign
 @cindex UTF-8
 @quotation
-ASCII, Belarusian, Brazilian Portuguese, Bulgarian, Chinese-BIG5,
+ASCII, Belarusian, Bengali, Brazilian Portuguese, Bulgarian,
-Chinese-CNS, Chinese-EUC-TW, Chinese-GB, Croatian, Cyrillic-ALT,
+Chinese-BIG5, Chinese-CNS, Chinese-EUC-TW, Chinese-GB, Chinese-GBK,
-Cyrillic-ISO, Cyrillic-KOI8, Czech, Devanagari, Dutch, English,
+Chinese-GB18030, Croatian, Cyrillic-ALT, Cyrillic-ISO, Cyrillic-KOI8,
-Esperanto, Ethiopic, French, Georgian, German, Greek, Hebrew, IPA,
+Czech, Devanagari, Dutch, English, Esperanto, Ethiopic, French,
-Italian, Japanese, Kannada, Korean, Lao, Latin-1, Latin-2, Latin-3,
+Georgian, German, Greek, Gujarati, Hebrew, IPA, Italian, Japanese,
-Latin-4, Latin-5, Latin-6, Latin-7, Latin-8 (Celtic), Latin-9 (updated
+Kannada, Khmer, Korean, Lao, Latin-1, Latin-2, Latin-3, Latin-4,
-Latin-1 with the Euro sign), Latvian, Lithuanian, Malayalam, Polish,
+Latin-5, Latin-6, Latin-7, Latin-8 (Celtic), Latin-9 (updated Latin-1
-Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tajik, Tamil,
+with the Euro sign), Latvian, Lithuanian, Malayalam, Oriya, Polish,
-Thai, Tibetan, Turkish, UTF-8 (for a setup which prefers Unicode
+Punjabi, Romanian, Russian, Sinhala, Slovak, Slovenian, Spanish,
-characters and files encoded in UTF-8), Ukrainian, Vietnamese, Welsh,
+Swedish, TaiViet, Tajik, Tamil, Telugu, Thai, Tibetan, Turkish, UTF-8
-and Windows-1255 (for a setup which prefers Cyrillic characters and
+(for a setup which prefers Unicode characters and files encoded in
-files encoded in Windows-1255).
+UTF-8), Ukrainian, Vietnamese, Welsh, and Windows-1255 (for a setup
-@tex
+which prefers Cyrillic characters and files encoded in Windows-1255).
-\hbadness=10000\par  % just avoid underfull hbox warning
-@end tex
 @end quotation
 @cindex fonts for various scripts
@@ -657,34 +657,6 @@ character.
 list-input-methods}.  The list gives information about each input
 method, including the string that stands for it in the mode line.
-@node Multibyte Conversion
-@section Unibyte and Multibyte Non-@acronym{ASCII} characters
-  When multibyte characters are enabled, character codes 0240 (octal)
-through 0377 (octal) are not really legitimate in the buffer.  The valid
-non-@acronym{ASCII} printing characters have codes that start from 0400.
-  If you type a self-inserting character in the range 0240 through
-0377, or if you use @kbd{C-q} to insert one, Emacs assumes you
-intended to use one of the ISO Latin-@var{n} character sets, and
-converts it to the Emacs code representing that Latin-@var{n}
-character.  You select @emph{which} ISO Latin character set to use
-through your choice of language environment
-@iftex
-(see above).
-@end iftex
-@ifnottex
-(@pxref{Language Environments}).
-@end ifnottex
-If you do not specify a choice, the default is Latin-1.
-  If you insert a character in the range 0200 through 0237, which
-forms the @code{eight-bit-control} character set, it is inserted
-literally.  You should normally avoid doing this since buffers
-containing such characters have to be written out in either the
-@code{emacs-mule} or @code{raw-text} coding system, which is usually
-not what you want.
 @node Coding Systems
 @section Coding Systems
 @cindex coding systems
@@ -698,11 +670,11 @@ possible in reading or writing files, in sending or receiving from the
 terminal, and in exchanging data with subprocesses.
  Emacs assigns a name to each coding system.  Most coding systems are
-used for one language, and the name of the coding system starts with the
+used for one language, and the name of the coding system starts with
-language name.  Some coding systems are used for several languages;
+the language name.  Some coding systems are used for several
-their names usually start with @samp{iso}.  There are also special
+languages; their names usually start with @samp{iso}.  There are also
-coding systems @code{no-conversion}, @code{raw-text} and
+special coding systems, such as @code{no-conversion}, @code{raw-text},
-@code{emacs-mule} which do not convert printing characters at all.
+and @code{emacs-internal}.
 @cindex international files from DOS/Windows systems
  A special class of coding systems, collectively known as
@@ -814,37 +786,21 @@ the @kbd{M-x find-file-literally} command.  This uses
 @code{no-conversion}, and also suppresses other Emacs features that
 might convert the file contents before you see them.  @xref{Visiting}.
-  The coding system @code{emacs-mule} means that the file contains
+  The coding system @code{emacs-internal} (or @code{utf-8-emacs},
-non-@acronym{ASCII} characters stored with the internal Emacs encoding.  It
+which is equivalent) means that the file contains non-@acronym{ASCII}
-handles end-of-line conversion based on the data encountered, and has
+characters stored with the internal Emacs encoding.  This coding
-the usual three variants to specify the kind of end-of-line conversion.
+system handles end-of-line conversion based on the data encountered,
+and has the usual three variants to specify the kind of end-of-line
-@findex unify-8859-on-decoding-mode
+conversion.
-@anchor{Character Translation} 
-  The @dfn{character translation} feature can modify the effect of
-various coding systems, by changing the internal Emacs codes that
-decoding produces.  For instance, the command
-@code{unify-8859-on-decoding-mode} enables a mode that ``unifies'' the
-Latin alphabets when decoding text.  This works by converting all
-non-@acronym{ASCII} Latin-@var{n} characters to either Latin-1 or
-Unicode characters.  This way it is easier to use various
-Latin-@var{n} alphabets together.  (In a future Emacs version we hope
-to move towards full Unicode support and complete unification of
-character sets.)
-@vindex enable-character-translation
-  If you set the variable @code{enable-character-translation} to
-@code{nil}, that disables all character translation (including
-@code{unify-8859-on-decoding-mode}).
 @node Recognize Coding
 @section Recognizing Coding Systems
-  Emacs tries to recognize which coding system to use for a given text
+  Whenever Emacs reads a given piece of text, it tries to recognize
-as an integral part of reading that text.  (This applies to files
+which coding system to use.  This applies to files being read, output
-being read, output from subprocesses, text from X selections, etc.)
+from subprocesses, text from X selections, etc.  Emacs can select the
-Emacs can select the right coding system automatically most of the
+right coding system automatically most of the time---once you have
-time---once you have specified your preferences.
+specified your preferences.
  Some coding systems can be recognized or distinguished by which byte
 sequences appear in the data.  However, there are coding systems that
@@ -948,19 +904,17 @@ pattern, are decoded correctly.  One of the builtin
 @code{auto-coding-functions} detects the encoding for XML files.
 @vindex rmail-decode-mime-charset
+@vindex rmail-file-coding-system
  When you get new mail in Rmail, each message is translated
 automatically from the coding system it is written in, as if it were a
 separate file.  This uses the priority list of coding systems that you
 have specified.  If a MIME message specifies a character set, Rmail
 obeys that specification, unless @code{rmail-decode-mime-charset} is
-@code{nil}.
+@code{nil}.  For reading and saving Rmail files themselves, Emacs uses
+the coding system specified by the variable
-@vindex rmail-file-coding-system
+@code{rmail-file-coding-system}.  The default value is @code{nil},
-  For reading and saving Rmail files themselves, Emacs uses the coding
+which means that Rmail files are not translated (they are read and
-system specified by the variable @code{rmail-file-coding-system}.  The
+written in the Emacs internal character code).
-default value is @code{nil}, which means that Rmail files are not
-translated (they are read and written in the Emacs internal character
-code).
 @node Specify Coding
 @section Specifying a File's Coding System
@@ -984,13 +938,6 @@ use of the Latin-1 coding system, as well as C mode.  When you specify
 the coding explicitly in the file, that overrides
 @code{file-coding-system-alist}.
-  If you add the character @samp{!} at the end of the coding system
-name in @code{coding}, it disables any character translation
-(@pxref{Character Translation}) while decoding the file.  This is
-useful when you need to make sure that the character codes in the
-Emacs buffer will not vary due to changes in user settings; for
-instance, for the sake of strings in Emacs Lisp source files.
 @node Output Coding
 @section Choosing Coding Systems for Output
@@ -1004,22 +951,21 @@ different coding system for further file output from the buffer using
  You can insert any character Emacs supports into any Emacs buffer,
 but most coding systems can only handle a subset of these characters.
-Therefore, you can insert characters that cannot be encoded with the
+Therefore, it's possible that the characters you insert cannot be
-coding system that will be used to save the buffer.  For example, you
+encoded with the coding system that will be used to save the buffer.
-could start with an @acronym{ASCII} file and insert a few Latin-1
+For example, you could visit a text file in Polish, encoded in
-characters into it, or you could edit a text file in Polish encoded in
+@code{iso-8859-2}, and add some Russian words to it.  When you save
-@code{iso-8859-2} and add some Russian words to it.  When you save
 that buffer, Emacs cannot use the current value of
 @code{buffer-file-coding-system}, because the characters you added
 cannot be encoded by that coding system.
  When that happens, Emacs tries the most-preferred coding system (set
 by @kbd{M-x prefer-coding-system} or @kbd{M-x
-set-language-environment}), and if that coding system can safely
+set-language-environment}).  If that coding system can safely encode
-encode all of the characters in the buffer, Emacs uses it, and stores
+all of the characters in the buffer, Emacs uses it, and stores its
-its value in @code{buffer-file-coding-system}.  Otherwise, Emacs
+value in @code{buffer-file-coding-system}.  Otherwise, Emacs displays
-displays a list of coding systems suitable for encoding the buffer's
+a list of coding systems suitable for encoding the buffer's contents,
-contents, and asks you to choose one of those coding systems.
+and asks you to choose one of those coding systems.
  If you insert the unsuitable characters in a mail message, Emacs
 behaves a bit differently.  It additionally checks whether the
@@ -1248,9 +1194,9 @@ interactively.
  If @code{file-name-coding-system} is @code{nil}, Emacs uses a
 default coding system determined by the selected language environment.
-In the default language environment, any non-@acronym{ASCII}
+In the default language environment, non-@acronym{ASCII} characters in
-characters in file names are not encoded specially; they appear in the
+file names are not encoded specially; they appear in the file system
-file system using the internal Emacs representation.
+using the internal Emacs representation.
  @strong{Warning:} if you change @code{file-name-coding-system} (or the
 language environment) in the middle of an Emacs session, problems can
@@ -1317,7 +1263,7 @@ You can do this by putting
 @end lisp
 @noindent
-in your @file{~/.emacs} file.
+in your init file.
  There is a similarity between using a coding system translation for
 keyboard input, and using an input method: both define sequences of
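Note on the Output Coding hunk above: the revised example is a Polish file encoded in `iso-8859-2` that gains some Russian words and can then no longer be saved in that coding system. The Python sketch below is not Emacs code. It uses Python's standard codecs to reproduce the same failure and to walk a preference list until a codec can encode the whole text. The function name `first_safe_codec`, the sample strings, and the preference order are assumptions made for illustration. How Emacs itself chooses is described only by the prose in the patch.

```python
from typing import Optional, Sequence


def first_safe_codec(text: str, preferred: Sequence[str]) -> Optional[str]:
    """Return the first codec in PREFERRED that can encode all of TEXT.

    Returns None when no listed codec is safe, which corresponds to the
    case where the user has to be asked to choose.
    """
    for name in preferred:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # This codec cannot represent some character.
        return name
    return None


if __name__ == "__main__":
    polish = "zażółć gęślą jaźń"   # representable in iso-8859-2
    mixed = polish + " привет"     # Cyrillic is not in iso-8859-2

    print(first_safe_codec(polish, ["iso-8859-2", "utf-8"]))
    print(first_safe_codec(mixed, ["iso-8859-2", "koi8-r", "utf-8"]))
```

The second call skips `iso-8859-2` (no Cyrillic) and `koi8-r` (no Polish letters) before settling on a codec that covers both scripts.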