(Coding Systems): Move char translation stuff here.

(Specify Coding, Output Coding): New nodes, out of Recognize Coding.
(Recognize Coding): Substantial local rewrites.
(International): Update menu.
author: Richard M. Stallman 2006-07-03 15:48:23 +0000
committer: Richard M. Stallman 2006-07-03 15:48:23 +0000
commit: 50148a91b426fb6be9009d968f64b0e6d645e799 (patch)
tree: 204edd374edd45b246e087ea692bc8dc8704e9fe
parent: 43d6731323bf64459bc5a4d9aaa18cceca9c7eb1 (diff)
1 files changed, 79 insertions, 65 deletions
diff --git a/man/mule.texi b/man/mule.texi
index 8220a5097d1..15ec08ce9b0 100644
--- a/man/mule.texi
+++ b/man/mule.texi
@@ -91,6 +91,8 @@ to make sure Emacs interprets keyboard input correctly; see
 * Coding Systems::          Character set conversion when you read and
                              write files, and so on.
 * Recognize Coding::        How Emacs figures out which conversion to use.
+* Specify Coding::          Specifying a file's coding system explicitly.
+* Output Coding::           Choosing coding systems for output.
 * Text Coding::             Choosing conversion to use for file text.
 * Communication Coding::    Coding systems for interprocess communication.
 * File Name Coding::        Coding systems for file @emph{names}.
@@ -718,6 +720,23 @@ non-@acronym{ASCII} characters stored with the internal Emacs encoding.  It
 handles end-of-line conversion based on the data encountered, and has
 the usual three variants to specify the kind of end-of-line conversion.
+@findex unify-8859-on-decoding-mode
+  The @dfn{character translation} feature can modify the effect of
+various coding systems, by changing the internal Emacs codes that
+decoding produces.  For instance, the command
+@code{unify-8859-on-decoding-mode} enables a mode that ``unifies'' the
+Latin alphabets when decoding text.  This works by converting all
+non-@acronym{ASCII} Latin-@var{n} characters to either Latin-1 or
+Unicode characters.  This way it is easier to use various
+Latin-@var{n} alphabets together.  (In a future Emacs version we hope
+to move towards full Unicode support and complete unification of
+character sets.)
+@vindex enable-character-translation
+  If you set the variable @code{enable-character-translation} to
+@code{nil}, that disables all character translation (including
+@code{unify-8859-on-decoding-mode}).
 @node Recognize Coding
 @section Recognizing Coding Systems
@@ -812,26 +831,6 @@ coding system @code{iso-2022-7bit}, and they won't be
 decoded correctly when you visit those files if you suppress the
 escape sequence detection.
-@vindex coding
-  You can specify the coding system for a particular file using the
-@w{@samp{-*-@dots{}-*-}} construct at the beginning of a file, or a
-local variables list at the end (@pxref{File Variables}).  You do this
-by defining a value for the ``variable'' named @code{coding}.  Emacs
-does not really have a variable @code{coding}; instead of setting a
-variable, this uses the specified coding system for the file.  For
-example, @samp{-*-mode: C; coding: latin-1;-*-} specifies use of the
-Latin-1 coding system, as well as C mode.  When you specify the coding
-explicitly in the file, that overrides
-@code{file-coding-system-alist}.
-  If you add the character @samp{!} at the end of the coding system
-name, it disables any character translation while decoding the file.
-For instance, it effectively cancels the effect of
-@code{unify-8859-on-decoding-mode}.  This is useful when you need to
-make sure that the character codes in the Emacs buffer will not
-according to user settings; for instance, for the sake of strings in
-Emacs Lisp source files.
 @vindex auto-coding-alist
 @vindex auto-coding-regexp-alist
 @vindex auto-coding-functions
@@ -848,6 +847,24 @@ RMAIL files, whose names in general don't match any particular
 pattern, are decoded correctly.  One of the builtin
 @code{auto-coding-functions} detects the encoding for XML files.
+@vindex rmail-decode-mime-charset
+  When you get new mail in Rmail, each message is translated
+automatically from the coding system it is written in, as if it were a
+separate file.  This uses the priority list of coding systems that you
+have specified.  If a MIME message specifies a character set, Rmail
+obeys that specification, unless @code{rmail-decode-mime-charset} is
+@code{nil}.
+@vindex rmail-file-coding-system
+  For reading and saving Rmail files themselves, Emacs uses the coding
+system specified by the variable @code{rmail-file-coding-system}.  The
+default value is @code{nil}, which means that Rmail files are not
+translated (they are read and written in the Emacs internal character
+code).
+@node Specify Coding
+@section Specifying a File's Coding System
  If Emacs recognizes the encoding of a file incorrectly, you can
 reread the file using the correct coding system by typing @kbd{C-x
 @key{RET} r @var{coding-system} @key{RET}}.  To see what coding system
@@ -855,33 +872,45 @@ Emacs actually used to decode the file, look at the coding system
 mnemonic letter near the left edge of the mode line (@pxref{Mode
 Line}), or type @kbd{C-h C @key{RET}}.
-@findex unify-8859-on-decoding-mode
-  The command @code{unify-8859-on-decoding-mode} enables a mode that
-``unifies'' the Latin alphabets when decoding text.  This works by
-converting all non-@acronym{ASCII} Latin-@var{n} characters to either
-Latin-1 or Unicode characters.  This way it is easier to use various
-Latin-@var{n} alphabets together.  In a future Emacs version we hope
-to move towards full Unicode support and complete unification of
-character sets.
+@vindex coding
+  You can specify the coding system for a particular file in the file
+itself, using the @w{@samp{-*-@dots{}-*-}} construct at the beginning,
+or a local variables list at the end (@pxref{File Variables}).  You do
+this by defining a value for the ``variable'' named @code{coding}.
+Emacs does not really have a variable @code{coding}; instead of
+setting a variable, this uses the specified coding system for the
+file.  For example, @samp{-*-mode: C; coding: latin-1;-*-} specifies
+use of the Latin-1 coding system, as well as C mode.  When you specify
+the coding explicitly in the file, that overrides
+@code{file-coding-system-alist}.
+  If you add the character @samp{!} at the end of the coding system
+name in @code{coding}, it disables any character translation while
+decoding the file.  For instance, it effectively cancels the effect of
+@code{unify-8859-on-decoding-mode}.  This is useful when you need to
+make sure that the character codes in the Emacs buffer will not vary
+due to changes in user settings; for instance, for the sake of strings
+in Emacs Lisp source files.
+@node Output Coding
+@section Choosing Coding Systems for Output
 @vindex buffer-file-coding-system
  Once Emacs has chosen a coding system for a buffer, it stores that
-coding system in @code{buffer-file-coding-system} and uses that coding
-system, by default, for operations that write from this buffer into a
-file.  This includes the commands @code{save-buffer} and
-@code{write-region}.  If you want to write files from this buffer using
-a different coding system, you can specify a different coding system for
-the buffer using @code{set-buffer-file-coding-system} (@pxref{Text
-Coding}).
+coding system in @code{buffer-file-coding-system}.  That makes it the
+default for operations that write from this buffer into a file, such
+as @code{save-buffer} and @code{write-region}.  You can specify a
+different coding system for further file output from the buffer using
+@code{set-buffer-file-coding-system} (@pxref{Text Coding}).
-  You can insert any possible character into any Emacs buffer, but
-most coding systems can only handle some of the possible characters.
-This means that it is possible for you to insert characters that
-cannot be encoded with the coding system that will be used to save the
-buffer.  For example, you could start with an @acronym{ASCII} file and insert a
-few Latin-1 characters into it, or you could edit a text file in
-Polish encoded in @code{iso-8859-2} and add some Russian words to it.
-When you save the buffer, Emacs cannot use the current value of
+  You can insert any character Emacs supports into any Emacs buffer,
+but most coding systems can only handle a subset of these characters.
+Therefore, you can insert characters that cannot be encoded with the
+coding system that will be used to save the buffer.  For example, you
+could start with an @acronym{ASCII} file and insert a few Latin-1
+characters into it, or you could edit a text file in Polish encoded in
+@code{iso-8859-2} and add some Russian words to it.  When you save
+that buffer, Emacs cannot use the current value of
 @code{buffer-file-coding-system}, because the characters you added
 cannot be encoded by that coding system.
@@ -896,12 +925,12 @@ contents, and asks you to choose one of those coding systems.
  If you insert the unsuitable characters in a mail message, Emacs
 behaves a bit differently.  It additionally checks whether the
 most-preferred coding system is recommended for use in MIME messages;
-if not, Emacs tells you that the most-preferred coding system is
-not recommended and prompts you for another coding system.  This is so
-you won't inadvertently send a message encoded in a way that your
-recipient's mail software will have difficulty decoding.  (If you do
-want to use the most-preferred coding system, you can still type its
-name in response to the question.)
+if not, Emacs tells you that the most-preferred coding system is not
+recommended and prompts you for another coding system.  This is so you
+won't inadvertently send a message encoded in a way that your
+recipient's mail software will have difficulty decoding.  (You can
+still use an unsuitable coding system if you type its name in response
+to the question.)
 @vindex sendmail-coding-system
  When you send a message with Mail mode (@pxref{Sending Mail}), Emacs has
@@ -914,21 +943,6 @@ new files, which is controlled by your choice of language environment,
 if that is non-@code{nil}.  If all of these three values are @code{nil},
 Emacs encodes outgoing mail using the Latin-1 coding system.
-@vindex rmail-decode-mime-charset
-  When you get new mail in Rmail, each message is translated
-automatically from the coding system it is written in, as if it were a
-separate file.  This uses the priority list of coding systems that you
-have specified.  If a MIME message specifies a character set, Rmail
-obeys that specification, unless @code{rmail-decode-mime-charset} is
-@code{nil}.
-@vindex rmail-file-coding-system
-  For reading and saving Rmail files themselves, Emacs uses the coding
-system specified by the variable @code{rmail-file-coding-system}.  The
-default value is @code{nil}, which means that Rmail files are not
-translated (they are read and written in the Emacs internal character
-code).
 @node Text Coding
 @section Specifying a Coding System for File Text
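
Note on the Output Coding text in this patch: the situation it describes (Polish text in iso-8859-2 that gains some Russian words and can then no longer be saved in that coding system) can be illustrated outside Emacs. The following is only a rough sketch in Python, not how Emacs itself implements the check; the helper name encodable_with, the sample strings, and the candidate list are all assumptions made for the illustration.

```python
def encodable_with(text, candidates):
    """Return the candidate codec names that can encode every character of text.

    Hypothetical helper for illustration only; it mimics the idea of
    listing the coding systems that are safe for a buffer's contents.
    """
    usable = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no representation in this codec.
            continue
        usable.append(name)
    return usable


# Sample data (assumed, not from the manual).
polish = "Zażółć gęślą jaźń"
mixed = polish + " и русские слова"
candidates = ["ascii", "iso-8859-1", "iso-8859-2", "iso-8859-5", "utf-8"]

print(encodable_with(polish, candidates))
print(encodable_with(mixed, candidates))
```

With the Polish-only string, iso-8859-2 remains among the usable candidates; once Cyrillic text is added, neither iso-8859-2 nor iso-8859-5 can represent the whole string, which mirrors why Emacs has to ask for a different coding system when saving such a buffer.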