(International): Add an overview of Mule features, with pointers to
detailed description.
(Enabling Multibyte): Describe how to switch a unibyte session to multibyte. Mention that by default, all sessions are multibyte.
(Coding Systems): Make it clear that cpNNN are coding systems, and should be used as such.
(Recognize Coding): Explain that Emacs decodes text as part of reading it. Mention revert-buffer as a means to redecode a file.
author: Eli Zaretskii 2001-05-06 11:27:54 +0000
committer: Eli Zaretskii 2001-05-06 11:27:54 +0000
commit: 8561e53a1cfd859f26508676ddf9533875dea3c0 (patch)
tree: 945068cef06d9873809b7e74a4dd407c21561063
parent: 80561aaa69dfaf61bcbae2ea6f7befbb9551979c (diff)
1 files changed, 66 insertions, 5 deletions
diff --git a/man/mule.texi b/man/mule.texi
index a80c0af7a81..e490d091b38 100644
--- a/man/mule.texi
+++ b/man/mule.texi
@@ -44,6 +44,42 @@ have been merged from the modified version of Emacs known as MULE (for
  Emacs also supports various encodings of these characters used by
 other internationalized software, such as word processors and mailers.
+  Emacs allows editing text with international characters by supporting
+all the related activities:
+@itemize @bullet
+@item
+You can visit files with non-ASCII characters, save non-ASCII text, and
+pass non-ASCII text between Emacs and programs it invokes (such as
+compilers, spell-checkers, and mailers).  Setting your language
+environment (@pxref{Language Environments}) takes care of setting up the
+coding systems and other options for a specific language or culture.
+Alternatively, you can specify how Emacs should encode or decode text
+for each command; see @ref{Specify Coding}.
+@item
+You can display non-ASCII characters encoded by the various scripts.
+This works by using appropriate fonts on X and similar graphics
+displays (@pxref{Defining Fontsets}), and by sending special codes to
+text-only displays (@pxref{Specify Coding}).  If some characters are
+displayed incorrectly, refer to @ref{Undisplayable Characters}, which
+describes possible problems and explains how to solve them.
+@item
+You can insert non-ASCII characters or search for them.  To do that,
+you can specify an input method (@pxref{Select Input Method}) suitable
+for your language, or use the default input method set up when you set
+your language environment.  (Emacs input methods are part of the Leim
+package, which must be installed for you to be able to use them.)  If
+your keyboard can produce non-ASCII characters, you can select an
+appropriate keyboard coding system (@pxref{Specify Coding}), and Emacs
+will accept those characters.  Latin-1 characters can also be input by
+using the @kbd{C-x 8} prefix, see @ref{Single-Byte Character Support,
+C-x 8}.
+@end itemize
+  The rest of this chapter describes these issues in detail.
 @menu
 * International Intro::     Basic concepts of multibyte characters.
 * Enabling Multibyte::      Controlling whether to use multibyte characters.
@@ -121,6 +157,7 @@ its internal representation within Emacs.
 @node Enabling Multibyte
 @section Enabling Multibyte Characters
+@cindex turn multibyte support on or off
  You can enable or disable multibyte character support, either for
 Emacs as a whole, or for a single buffer.  When multibyte characters are
 disabled in a buffer, then each byte in that buffer represents a
@@ -134,6 +171,9 @@ use ISO Latin; the Emacs multibyte character set includes all the
 characters in these character sets, and Emacs can translate
 automatically to and from the ISO codes.
+  By default, Emacs starts in multibyte mode, because that allows you to
+use all the supported languages and scripts without limitations.
  To edit a particular file in unibyte representation, visit it using
 @code{find-file-literally}.  @xref{Visiting}.  To convert a buffer in
 multibyte representation into a single-byte representation of the same
@@ -152,8 +192,16 @@ conversion, uncompression and auto mode selection as
 the @samp{--unibyte} option (@pxref{Initial Options}), or set the
 environment variable @env{EMACS_UNIBYTE}.  You can also customize
 @code{enable-multibyte-characters} or, equivalently, directly set the
-variable @code{default-enable-multibyte-characters} in your init file to
-have basically the same effect as @samp{--unibyte}.
+variable @code{default-enable-multibyte-characters} to @code{nil} in
+your init file to have basically the same effect as @samp{--unibyte}.
+@findex toggle-enable-multibyte-characters
+  To convert a unibyte session to a multibyte session, set
+@code{default-enable-multibyte-characters} to @code{t}.  Buffers which
+were created in the unibyte session before you turn on multibyte support
+will stay unibyte.  You can turn on multibyte support in a specific
+buffer by invoking the command @code{toggle-enable-multibyte-characters}
+in that buffer.
 @cindex Lisp files, and multibyte operation
 @cindex multibyte operation, and Lisp files
@@ -527,10 +575,15 @@ their names usually start with @samp{iso}.  There are also special
 coding systems @code{no-conversion}, @code{raw-text} and
 @code{emacs-mule} which do not convert printing characters at all.
+@cindex international files from DOS/Windows systems
  A special class of coding systems, collectively known as
 @dfn{codepages}, is designed to support text encoded by MS-Windows and
 MS-DOS software.  To use any of these systems, you need to create it
-with @kbd{M-x codepage-setup}.  @xref{MS-DOS and MULE}.
+with @kbd{M-x codepage-setup}.  @xref{MS-DOS and MULE}.  After
+creating the coding system for the codepage, you can use it as any
+other coding system.  For example, to visit a file encoded in codepage
+850, type @kbd{C-x @key{RET} c cp850 @key{RET} C-x C-f @var{filename}
+@key{RET}}.
  In addition to converting various representations of non-ASCII
 characters, a coding system can perform end-of-line conversion.  Emacs
@@ -630,8 +683,11 @@ the usual three variants to specify the kind of end-of-line conversion.
 @node Recognize Coding
 @section Recognizing Coding Systems
-  Most of the time, Emacs can recognize which coding system to use for
-any given file---once you have specified your preferences.
+  Emacs tries to recognize which coding system to use for a given text
+as an integral part of reading that text.  (This applies to files
+being read, output from subprocesses, text from X selections, etc.)
+Emacs can select the right coding system automatically most of the
+time---once you have specified your preferences.
  Some coding systems can be recognized or distinguished by which byte
 sequences appear in the data.  However, there are coding systems that
@@ -737,6 +793,11 @@ feature for tar and archive files, to prevent Emacs from being confused
 by a @samp{-*-coding:-*-} tag in a member of the archive and thinking it
 applies to the archive file as a whole.
+  If Emacs recognizes the encoding of a file incorrectly, you can
+reread the file using the correct coding system by typing @kbd{C-x
+@key{RET} c @var{coding-system} @key{RET} M-x revert-buffer
+@key{RET}}.
 @vindex buffer-file-coding-system
  Once Emacs has chosen a coding system for a buffer, it stores that
 coding system in @code{buffer-file-coding-system} and uses that coding
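
As a rough illustration of the (Recognize Coding) change, the following Python sketch (not Emacs code) shows why rereading a file with an explicit coding system helps: the same bytes yield different text depending on the codec used to decode them. The helper names `read_text` and `demo` are hypothetical, and Python's `cp850` and `latin-1` codecs are assumed here as stand-ins for the Emacs coding systems of the same names.

```python
import os
import tempfile


def read_text(path, coding_system):
    """Read a file's raw bytes and decode them with the named codec."""
    with open(path, "rb") as handle:
        return handle.read().decode(coding_system)


def demo():
    # Bytes as a program using codepage 850 would store the word "cafe"
    # with an acute accent on the final letter (U+00E9).
    raw = "caf\u00e9".encode("cp850")
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(raw)
        path = handle.name
    try:
        wrong = read_text(path, "latin-1")  # plausible but wrong guess
        right = read_text(path, "cp850")    # reread with the intended codec
    finally:
        os.remove(path)
    return raw, wrong, right


if __name__ == "__main__":
    raw, wrong, right = demo()
    print(repr(raw), repr(wrong), repr(right))
```

In this analogy the first read plays the role of an incorrect automatic guess, and the second read corresponds to typing `C-x RET c cp850 RET M-x revert-buffer RET`: the bytes on disk are unchanged, and only the decoding step is redone.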
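
The unibyte and multibyte distinction described in (Enabling Multibyte) can be pictured with a small Python sketch. It is only an analogy: UTF-8 is assumed here as a stand-in multibyte encoding and is not the representation Emacs used internally at the time, and `byte_and_char_counts` is a hypothetical helper.

```python
def byte_and_char_counts(text, encoding="utf-8"):
    """Return (number of bytes, number of characters) for text."""
    encoded = text.encode(encoding)
    return len(encoded), len(text)


if __name__ == "__main__":
    word = "na\u00efve"  # five characters, one of them non-ASCII
    # In a single-byte encoding, every character occupies exactly one byte.
    print(byte_and_char_counts(word, "latin-1"))
    # In a multibyte encoding, the non-ASCII character needs more than one.
    print(byte_and_char_counts(word, "utf-8"))
```

Treating each byte as one character, as a unibyte buffer does, is therefore only safe when the text really is in a single-byte encoding.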