1 files changed, 163 insertions, 153 deletions
diff --git a/man/mule.texi b/man/mule.texi
index d127563efa6..940f0354d9e 100644
--- a/man/mule.texi
+++ b/man/mule.texi
@@ -42,7 +42,7 @@ have been merged from the modified version of Emacs known as MULE (for
 ``MULti-lingual Enhancement to GNU Emacs'')
  Emacs also supports various encodings of these characters used by
-internationalized software, such as word processors, mailers, etc.
+other internationalized software, such as word processors and mailers.
 @menu
 * International Intro::     Basic concepts of multibyte characters.
@@ -80,16 +80,31 @@ cases) in the @kbd{C-q} command (@pxref{Multibyte Conversion}).
 @kindex C-h h
 @findex view-hello-file
 @cindex undisplayable characters
-@cindex ?
+@cindex @samp{?} in display
-@cindex ??
  The command @kbd{C-h h} (@code{view-hello-file}) displays the file
 @file{etc/HELLO}, which shows how to say ``hello'' in many languages.
-This illustrates various scripts.  If the font you're using doesn't have
+This illustrates various scripts.  If some characters can't be
-characters for all those different languages, you will see some hollow
+displayed on your terminal, they appear as @samp{?} or as hollow boxes
-boxes instead of characters; see @ref{Fontsets}.  On non-windowing
+(@pxref{Undisplayable Characters}).
-displays, @samp{?} is displayed in place of the hollow box.  More than
-one @samp{?} is displayed for undisplayable characters that are wider
+  Keyboards, even in the countries where these character sets are used,
-than one column.
+generally don't have keys for all the characters in them.  So Emacs
+supports various @dfn{input methods}, typically one for each script or
+language, to make it convenient to type them.
+@kindex C-x RET
+  The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
+to multibyte characters, coding systems, and input methods.
+@ignore
+@c This is commented out because it doesn't fit here, or anywhere.
+@c This manual does not discuss "character sets" as they
+@c are used in Mule, and it makes no sense to mention these commands
+@c except as part of a larger discussion of the topic.
+@c But it is not clear that topic is worth mentioning here,
+@c since that is more of an implementation concept
+@c than a user-level concept.  And when we switch to Unicode,
+@c character sets in the current sense may not even exist.
 @findex list-charset-chars
 @cindex characters in a certain charset
@@ -101,15 +116,7 @@ character set, and displays all the characters in that character set.
  The command @kbd{M-x describe-character-set} prompts for a character
 set name and displays information about that character set, including
 its internal representation within Emacs.
+@end ignore
-  Keyboards, even in the countries where these character sets are used,
-generally don't have keys for all the characters in them.  So Emacs
-supports various @dfn{input methods}, typically one for each script or
-language, to make it convenient to type them.
-@kindex C-x RET
-  The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
-to multibyte characters, coding systems, and input methods.
 @node Enabling Multibyte
 @section Enabling Multibyte Characters
@@ -153,16 +160,22 @@ have basically the same effect as @samp{--unibyte}.
 @cindex unibyte operation, and Lisp files
 @cindex init file, and non-ASCII characters
 @cindex environment variables, and non-ASCII characters
-  Multibyte strings are not created during initialization from the
+  With @samp{--unibyte}, multibyte strings are not created during
-values of environment variables, @file{/etc/passwd} entries etc.@: that
+initialization from the values of environment variables,
-contain non-ASCII 8-bit characters.  However, Lisp files, when they are
+@file{/etc/passwd} entries etc.@: that contain non-ASCII 8-bit
-loaded for running, and in particular the initialization file
+characters.
-@file{.emacs}, are normally read as multibyte---even with
-@samp{--unibyte}.  To avoid multibyte strings being generated by
+  Emacs normally loads Lisp files as multibyte, regardless of whether
-non-ASCII characters in Lisp files, put @samp{-*-unibyte: t;-*-} in a
+you used @samp{--unibyte}.  This includes the Emacs initialization
-comment on the first line, or specify the coding system @samp{raw-text}
+file, @file{.emacs}, and the initialization files of Emacs packages
-with @kbd{C-x @key{RET} c}.  Do the same for initialization files for
+such as Gnus.  However, you can specify unibyte loading for a
-packages like Gnus.
+particular Lisp file, by putting @samp{-*-unibyte: t;-*-} in a comment
+on the first line.  Then that file is always loaded as unibyte text,
+even if you did not start Emacs with @samp{--unibyte}.  The motivation
+for these conventions is that it is more reliable to always load any
+particular Lisp file in the same way.  However, you can load a Lisp
+file as unibyte, on any one occasion, by typing @kbd{C-x @key{RET} c
+raw-text @key{RET}} immediately before loading it.
  The mode line indicates whether multibyte character support is enabled
 in the current buffer.  If it is, there are two or more characters (most
@@ -206,13 +219,12 @@ sign), Polish, Romanian, Slovak, Slovenian, Thai, Tibetan, Turkish,
 Dutch, Spanish, and Vietnamese.
 @end quotation
-@cindex fonts, for displaying different languages
+@cindex fonts for various scripts
-  To be able to display the script(s) used by your language environment
+  To display the script(s) used by your language environment on a
-on a windowed display, you need to have a suitable font installed.  If
+graphical display, you need to have a suitable font.  If some of the
-some of the characters appear as empty boxes, download and install the
+characters appear as empty boxes, you should install the GNU Intlfonts
-GNU Intlfonts distribution, which includes fonts for all supported
+package, which includes fonts for all supported scripts.
-scripts.  @xref{Fontsets}, for more details about setting up your
+@xref{Fontsets}, for more details about setting up your fonts.
-fonts.
 @findex set-locale-environment
 @vindex locale-language-names
@@ -220,31 +232,21 @@ fonts.
 @cindex locales
  Some operating systems let you specify the language you are using by
 setting the locale environment variables @env{LC_ALL}, @env{LC_CTYPE},
-and @env{LANG}; the first of these which is nonempty specifies your
+or @env{LANG}.@footnote{If more than one of these is set, the first
-locale.  Emacs handles this during startup by invoking the
+one that is nonempty specifies your locale for this purpose.}  Emacs
-@code{set-locale-environment} function, which matches your locale
+handles this during startup by matching your locale against entries in
-against entries in the value of the variable
+the value of the variables @code{locale-charset-language-names} and
 @code{locale-language-names} and selects the corresponding language
-environment if a match is found.  But if your locale also matches an
+environment if a match is found.  (The former variable overrides the
-entry in the variable @code{locale-charset-language-names}, this entry
+latter.)  It also adjusts the display table and terminal coding
-is preferred if its character set disagrees.  For example, suppose the
+system, the locale coding system, and the preferred coding system as
-locale @samp{en_GB.ISO8859-15} matches @code{"Latin-1"} in
+needed for the locale.
-@code{locale-language-names} and @code{"Latin-9"} in
-@code{locale-charset-language-names}; since these two language
+  If you modify the @env{LC_ALL}, @env{LC_CTYPE}, or @env{LANG}
-environments' character sets disagree, Emacs uses @code{"Latin-9"}.
+environment variables while running Emacs, you may want to invoke the
+@code{set-locale-environment} function afterwards to readjust the
-  If all goes well, the @code{set-locale-environment} function selects
+language environment from the new locale.
-the language environment, since language is part of locale.  It also
-adjusts the display table and terminal coding system, the locale coding
-system, and the preferred coding system as needed for the locale.
-  Since the @code{set-locale-environment} function is automatically
-invoked during startup, you normally do not need to invoke it yourself.
-However, if you modify the @env{LC_ALL}, @env{LC_CTYPE}, or @env{LANG}
-environment variables, you may want to invoke the
-@code{set-locale-environment} function afterwards.
-@findex set-locale-environment
 @vindex locale-preferred-coding-systems
  The @code{set-locale-environment} function normally uses the preferred
 coding system established by the language environment to decode system
@@ -255,10 +257,10 @@ matches @code{japanese-shift-jis} in
 @code{locale-preferred-coding-systems}, Emacs uses that encoding even
 though it might normally use @code{japanese-iso-8bit}.
-  The environment chosen from the locale when Emacs starts is
+  You can override the language environment chosen at startup with
-overidden by any explicit use of the command
+explicit use of the command @code{set-language-environment}, or with
-@code{set-language-environment} or customization of
+customization of @code{current-language-environment} in your init
-@code{current-language-environment} in your init file.
+file.
 @kindex C-h L
 @findex describe-language-environment
@@ -369,8 +371,10 @@ characters to type next is displayed in the echo area (but not when you
 are in the minibuffer).
 @cindex Leim package
-Input methods are implemented in the separate Leim package, which must
+  Input methods are implemented in the separate Leim package: they are
-be installed with Emacs.
+available only if the system administrator used Leim when building
+Emacs.  If Emacs was built without Leim, you will find that no input
+methods are defined.
 @node Select Input Method
 @section Selecting an Input Method
@@ -443,11 +447,12 @@ method, including the string that stands for it in the mode line.
 through 0377 (octal) are not really legitimate in the buffer.  The valid
 non-ASCII printing characters have codes that start from 0400.
-  If you type a self-inserting character in the range 0240
+  If you type a self-inserting character in the range 0240 through
-through 0377, Emacs assumes you intended to use one of the ISO
+0377, or if you use @kbd{C-q} to insert one, Emacs assumes you
-Latin-@var{n} character sets, and converts it to the Emacs code
+intended to use one of the ISO Latin-@var{n} character sets, and
-representing that Latin-@var{n} character.  You select @emph{which} ISO
+converts it to the Emacs code representing that Latin-@var{n}
-Latin character set to use through your choice of language environment
+character.  You select @emph{which} ISO Latin character set to use
+through your choice of language environment
 @iftex
 (see above).
 @end iftex
@@ -456,13 +461,12 @@ Latin character set to use through your choice of language environment
 @end ifinfo
 If you do not specify a choice, the default is Latin-1.
-  The same thing happens when you use @kbd{C-q} to enter an octal code
+  If you insert a character in the range 0200 through 0237, which
-in this range.  If you enter a code in the range 0200 through 0237,
+forms the @code{eight-bit-control} character set, it is inserted
-which forms the @code{eight-bit-control} character set, it is inserted
 literally.  You should normally avoid doing this since buffers
 containing such characters have to be written out in either the
-@code{emacs-mule} or @code{raw-text} coding system, which is usually not
+@code{emacs-mule} or @code{raw-text} coding system, which is usually
-what you want.
+not what you want.
 @node Coding Systems
 @section Coding Systems
@@ -652,24 +656,24 @@ to non-@code{nil}.
 @cindex escape sequences in files
  By default, the automatic detection of coding system is sensitive to
 escape sequences.  If Emacs sees a sequence of characters that begin
-with an @key{ESC} character, and the sequence is valid as an ISO-2022
+with an escape character, and the sequence is valid as an ISO-2022
-code, the code is determined as one of ISO-2022 encoding, and the file
+code, that tells Emacs to use one of the ISO-2022 encodings to decode
-is decoded by the corresponding coding system
+the file.
-(e.g. @code{iso-2022-7bit}).
-  However, there may be cases that you want to read escape sequences in
+  However, there may be cases where you want to read escape sequences
-a file as is.  In such a case, you can set th variable
+in a file as is.  In such a case, you can set the variable
 @code{inhibit-iso-escape-detection} to non-@code{nil}.  Then the code
-detection will ignore any escape sequences, and so no file is detected
+detection ignores any escape sequences, and never uses an ISO-2022
-as being encoded in some of ISO-2022 encoding.  The result is that all
+encoding.  The result is that all escape sequences become visible in
-escape sequences become visible in a buffer.
+the buffer.
  The default value of @code{inhibit-iso-escape-detection} is
-@code{nil}, and it is strongly recommended not to change it.  That's
+@code{nil}.  We recommend that you not change it permanently, only for
-because many Emacs Lisp source files that contain non-ASCII characters
+one specific operation.  That's because many Emacs Lisp source files
-are encoded in the coding system @code{iso-2022-7bit} in the Emacs
+that contain non-ASCII characters are encoded in the coding system
-distribution, and they won't be decoded correctly when you visit those
+@code{iso-2022-7bit} in the Emacs distribution, and they won't be
-files if you suppress the escape sequence detection.
+decoded correctly when you visit those files if you suppress the
+escape sequence detection.
 @vindex coding
  You can specify the coding system for a particular file using the
@@ -700,33 +704,34 @@ a different coding system, you can specify a different coding system for
 the buffer using @code{set-buffer-file-coding-system} (@pxref{Specify
 Coding}).
-  While editing a file, you will sometimes insert characters which
+  You can insert any possible character into any Emacs buffer, but
-cannot be encoded with the coding system stored in
+most coding systems can only handle some of the possible characters.
-@code{buffer-file-coding-system}.  For example, suppose you start with
+This means that you can insert characters that cannot be encoded with
-an ASCII file and insert a few Latin-1 characters into it.  Or you could
+the coding system that will be used to save the buffer.  For example,
-edit a text file in Polish encoded in @code{iso-8859-2} and add to it
+you could start with an ASCII file and insert a few Latin-1 characters
-translations of several Polish words into Russian.  When you save the
+into it, or you could edit a text file in Polish encoded in
-buffer, Emacs can no longer use the previous value of the buffer's
+@code{iso-8859-2} and add to it translations of several Polish words
-coding system, because the characters you added cannot be encoded by
+into Russian.  When you save the buffer, Emacs cannot use the current
-that coding system.
+value of @code{buffer-file-coding-system}, because the characters you
+added cannot be encoded by that coding system.
  When that happens, Emacs tries the most-preferred coding system (set
 by @kbd{M-x prefer-coding-system} or @kbd{M-x
-set-language-environment}), and if that coding system can safely encode
+set-language-environment}), and if that coding system can safely
-all of the characters in the buffer, Emacs uses it, and stores its value
+encode all of the characters in the buffer, Emacs uses it, and stores
-in @code{buffer-file-coding-system}.  Otherwise, Emacs pops up a window
+its value in @code{buffer-file-coding-system}.  Otherwise, Emacs
-with a list of coding systems suitable for encoding the buffer, and
+displays a list of coding systems suitable for encoding the buffer's
-prompts you to choose one of those coding systems.
+contents, and asks you to choose one of those coding systems.
-  If you insert characters which cannot be encoded by the buffer's
+  If you insert the unsuitable characters in a mail message, Emacs
-coding system while editing a mail message, Emacs behaves a bit
+behaves a bit differently.  It additionally checks whether the
-differently.  It additionally checks whether the most-preferred coding
+most-preferred coding system is recommended for use in MIME messages;
-system is recommended for use in MIME messages; if it isn't, Emacs tells
+if it isn't, Emacs tells you that the most-preferred coding system is
-you that the most-preferred coding system is not recommended and prompts
+not recommended and prompts you for another coding system.  This is so
-you for another coding system.  This is so you won't inadvertently send
+you won't inadvertently send a message encoded in a way that your
-a message encoded in a way that your recipient's mail software will have
+recipient's mail software will have difficulty decoding.  (If you do
-difficulty decoding.  (If you do want to use the most-preferred coding
+want to use the most-preferred coding system, you can type its name to
-system, you can type its name to Emacs prompt anyway.)
+the Emacs prompt anyway.)
 @vindex sendmail-coding-system
  When you send a message with Mail mode (@pxref{Sending Mail}), Emacs has
@@ -916,13 +921,14 @@ name, or it may get an error.  If such a problem happens, use @kbd{C-x
 C-w} to specify a new file name for that buffer.
 @vindex locale-coding-system
-  The variable @code{locale-coding-system} specifies a coding system to
+  The variable @code{locale-coding-system} specifies a coding system
-use when encoding and decoding system strings such as system error
+to use when encoding and decoding system strings such as system error
-messages and @code{format-time-string} formats and time stamps.  This
+messages and @code{format-time-string} formats and time stamps.  You
-coding system should be compatible with the underlying system's coding
+should choose a coding system that is compatible with the underlying
-system, which is normally specified by the first environment variable in
+system's text representation, which is normally specified by one of
-the list @env{LC_ALL}, @env{LC_CTYPE}, @env{LANG} whose value is
+the environment variables @env{LC_ALL}, @env{LC_CTYPE}, and
-nonempty.
+@env{LANG}.  (The first one whose value is nonempty is the one that
+determines the text representation.)
 @node Fontsets
 @section Fontsets
@@ -941,7 +947,7 @@ specifying its name, anywhere that you could use a single font.  Of
 course, Emacs fontsets can use only the fonts that the X server
 supports; if certain characters appear on the screen as hollow boxes,
 this means that the fontset in use for them has no font for those
-characters.@footnote{The installation instructions have information on
+characters.@footnote{The Emacs installation instructions have information on
 additional font support.}
  Emacs creates two fontsets automatically: the @dfn{standard fontset}
@@ -1099,23 +1105,27 @@ call this function explicitly to create a fontset.
 @node Undisplayable Characters
 @section Undisplayable Characters
-Your terminal may not be able to display some non-@sc{ascii} characters.
+  Your terminal may be unable to display some non-@sc{ascii}
-Most non-windowing terminals can only use a single character set,
+characters.  Most non-windowing terminals can only use a single
-specified by the variable @code{default-terminal-coding-system}
+character set (use the variable @code{default-terminal-coding-system}
-(@pxref{Specify Coding}) and characters which can't be encoded in it are
+(@pxref{Specify Coding}) to tell Emacs which one); characters which
-displayed as @samp{?} by default.  Windowing terminals may not have the
+can't be encoded in that coding system are displayed as @samp{?} by
-necessary font available to display a given character and display a
+default.
-hollow box instead.  You can change the default behavior.
+  Windowing terminals can display a broader range of characters, but
+you may not have fonts installed for all of them; characters that have
+no font appear as a hollow box.
-If you use Latin-1 characters but your terminal can't display Latin-1,
+  If you use Latin-1 characters but your terminal can't display
-you can arrange to display mnemonic @sc{ascii} sequences instead, e.g.@:
+Latin-1, you can arrange to display mnemonic @sc{ascii} sequences
-@samp{"o} for o-umlaut.  Load the library @file{iso-ascii} to do this.
+instead, e.g.@: @samp{"o} for o-umlaut.  Load the library
+@file{iso-ascii} to do this.
-If your terminal can display Latin-1, you can display characters from
+  If your terminal can display Latin-1, you can display characters
-other European character sets using a mixture of equivalent Latin-1
+from other European character sets using a mixture of equivalent
-characters and @sc{ascii} mnemonics.  Use the Custom option
+Latin-1 characters and @sc{ascii} mnemonics.  Use the Custom option
-@code{latin1-display} to enable this.  The mnemonic @sc{ascii} sequences
+@code{latin1-display} to enable this.  The mnemonic @sc{ascii}
-mostly correspond to those of the prefix input methods.
+sequences mostly correspond to those of the prefix input methods.
 @node Single-Byte Character Support
 @section Single-byte Character Set Support
@@ -1172,18 +1182,18 @@ characters:
 @findex set-keyboard-coding-system
 @vindex keyboard-coding-system
 If your keyboard can generate character codes 128 and up, representing
-non-ASCII characters, use the command @code{M-x
+non-ASCII characters, you can type those character codes directly.
-set-keyboard-coding-system} or the Custom option
-@code{keyboard-coding-system} to specify this in the same way as for
+On a windowing terminal, you should not need to do anything special to
-multibyte usage (@pxref{Specify Coding}).
+use these keys; they should simply work.  On a text-only terminal, you
+should use the command @code{M-x set-keyboard-coding-system} or the
-It is not necessary to do this under a window system which can
+Custom option @code{keyboard-coding-system} to specify which coding
-distinguish 8-bit characters and Meta keys.  If you do this on a normal
+system your keyboard uses (@pxref{Specify Coding}).  Enabling this
-terminal, you will probably need to use @kbd{ESC} to type Meta
+feature will probably require you to use @kbd{ESC} to type Meta
-characters.@footnote{In some cases, such as the Linux console and
+characters; however, on a Linux console or in @code{xterm}, you can
-@code{xterm}, you can arrange for Meta to be converted to @kbd{ESC} and
+arrange for Meta to be converted to @kbd{ESC} and still be able to type
-still be able type 8-bit characters present directly on the keyboard or
+8-bit characters present directly on the keyboard or using
-using @kbd{Compose} or @kbd{AltGr} keys.}  @xref{User Input}.
+@kbd{Compose} or @kbd{AltGr} keys.  @xref{User Input}.
 @item
 You can use an input method for the selected language environment.
@@ -1205,7 +1215,7 @@ and in any other context where a key sequence is allowed.
 library is loaded, the @key{ALT} modifier key, if you have one, serves
 the same purpose as @kbd{C-x 8}; use @key{ALT} together with an accent
 character to modify the following letter.  In addition, if you have keys
-for the Latin-1 ``dead accent characters'', they too are defined to
+for the Latin-1 ``dead accent characters,'' they too are defined to
 compose with the following character, once @code{iso-transl} is loaded.
 Use @kbd{C-x 8 C-h} to list the available translations as mnemonic
 command names.
@@ -1215,9 +1225,9 @@ command names.
 @cindex ISO Accents mode
 @findex iso-accents-mode
 @cindex Latin-1, Latin-2 and Latin-3 input mode
-For Latin-1, Latin-2 and Latin-3, @kbd{M-x iso-accents-mode} installs a
+For Latin-1, Latin-2 and Latin-3, @kbd{M-x iso-accents-mode} installs
-minor mode which provides a facility like the @code{latin-1-prefix}
+a minor mode which works much like the @code{latin-1-prefix} input
-input method but independent of the Leim package.  This mode is
+method but does not depend on having the input methods installed.  This
-buffer-local.  It can be customized for various languages with @kbd{M-x
+mode is buffer-local.  It can be customized for various languages with
-iso-accents-customize}.
+@kbd{M-x iso-accents-customize}.
 @end itemize