author: Eli Zaretskii, 2001-05-06 11:27:54 +0000
committer: Eli Zaretskii, 2001-05-06 11:27:54 +0000
commit: 8561e53a1cfd859f26508676ddf9533875dea3c0
tree: 945068cef06d9873809b7e74a4dd407c21561063
parent: 80561aaa69dfaf61bcbae2ea6f7befbb9551979c

(International): Add an overview of Mule features, with pointers to
detailed description.
(Enabling Multibyte): Describe how to switch a unibyte session to
multibyte. Mention that by default, all sessions are multibyte.
(Coding Systems): Make it clear that cpNNN are coding systems, and
should be used as such.
(Recognize Coding): Explain that Emacs decodes text as part of reading
it. Mention revert-buffer as a means to redecode a file.

-rw-r--r-- man/mule.texi 71
1 file changed, 66 insertions, 5 deletions

diff --git a/man/mule.texi b/man/mule.texi
index a80c0af7a81..e490d091b38 100644
--- a/man/mule.texi
+++ b/man/mule.texi
@@ -44,6 +44,42 @@ have been merged from the modified version of Emacs known as MULE (for
  Emacs also supports various encodings of these characters used by
 other internationalized software, such as word processors and mailers.
 
+ Emacs allows editing text with international characters by supporting
+all the related activities:
+
+@itemize @bullet
+@item
+You can visit files with non-ASCII characters, save non-ASCII text, and
+pass non-ASCII text between Emacs and programs it invokes (such as
+compilers, spell-checkers, and mailers). Setting your language
+environment (@pxref{Language Environments}) takes care of setting up the
+coding systems and other options for a specific language or culture.
+Alternatively, you can specify how Emacs should encode or decode text
+for each command; see @ref{Specify Coding}.
+
+@item
+You can display non-ASCII characters encoded by the various scripts.
+This works by using appropriate fonts on X and similar graphics
+displays (@pxref{Defining Fontsets}), and by sending special codes to
+text-only displays (@pxref{Specify Coding}). If some characters are
+displayed incorrectly, refer to @ref{Undisplayable Characters}, which
+describes possible problems and explains how to solve them.
+
+@item
+You can insert non-ASCII characters or search for them. To do that,
+you can specify an input method (@pxref{Select Input Method}) suitable
+for your language, or use the default input method set up when you set
+your language environment. (Emacs input methods are part of the Leim
+package, which must be installed for you to be able to use them.) If
+your keyboard can produce non-ASCII characters, you can select an
+appropriate keyboard coding system (@pxref{Specify Coding}), and Emacs
+will accept those characters. Latin-1 characters can also be input by
+using the @kbd{C-x 8} prefix, see @ref{Single-Byte Character Support,
+C-x 8}.
+@end itemize
+
+ The rest of this chapter describes these issues in detail.
+
 @menu
 * International Intro:: Basic concepts of multibyte characters.
 * Enabling Multibyte:: Controlling whether to use multibyte characters.
@@ -121,6 +157,7 @@ its internal representation within Emacs.
 @node Enabling Multibyte
 @section Enabling Multibyte Characters
 
+@cindex turn multibyte support on or off
  You can enable or disable multibyte character support, either for
 Emacs as a whole, or for a single buffer. When multibyte characters are
 disabled in a buffer, then each byte in that buffer represents a
@@ -134,6 +171,9 @@ use ISO Latin; the Emacs multibyte character set includes all the
 characters in these character sets, and Emacs can translate
 automatically to and from the ISO codes.
 
+ By default, Emacs starts in multibyte mode, because that allows you to
+use all the supported languages and scripts without limitations.
+
  To edit a particular file in unibyte representation, visit it using
 @code{find-file-literally}. @xref{Visiting}. To convert a buffer in
 multibyte representation into a single-byte representation of the same
@@ -152,8 +192,16 @@ conversion, uncompression and auto mode selection as
 the @samp{--unibyte} option (@pxref{Initial Options}), or set the
 environment variable @env{EMACS_UNIBYTE}. You can also customize
 @code{enable-multibyte-characters} or, equivalently, directly set the
-variable @code{default-enable-multibyte-characters} in your init file to
-have basically the same effect as @samp{--unibyte}.
+variable @code{default-enable-multibyte-characters} to @code{nil} in
+your init file to have basically the same effect as @samp{--unibyte}.
+
+@findex toggle-enable-multibyte-characters
+ To convert a unibyte session to a multibyte session, set
+@code{default-enable-multibyte-characters} to @code{t}. Buffers which
+were created in the unibyte session before you turn on multibyte support
+will stay unibyte. You can turn on multibyte support in a specific
+buffer by invoking the command @code{toggle-enable-multibyte-characters}
+in that buffer.
 
 @cindex Lisp files, and multibyte operation
 @cindex multibyte operation, and Lisp files
@@ -527,10 +575,15 @@ their names usually start with @samp{iso}. There are also special
 coding systems @code{no-conversion}, @code{raw-text} and
 @code{emacs-mule} which do not convert printing characters at all.
 
+@cindex international files from DOS/Windows systems
  A special class of coding systems, collectively known as
 @dfn{codepages}, is designed to support text encoded by MS-Windows and
 MS-DOS software. To use any of these systems, you need to create it
-with @kbd{M-x codepage-setup}. @xref{MS-DOS and MULE}.
+with @kbd{M-x codepage-setup}. @xref{MS-DOS and MULE}. After
+creating the coding system for the codepage, you can use it as any
+other coding system. For example, to visit a file encoded in codepage
+850, type @kbd{C-x @key{RET} c cp850 @key{RET} C-x C-f @var{filename}
+@key{RET}}.
 
  In addition to converting various representations of non-ASCII
 characters, a coding system can perform end-of-line conversion. Emacs
@@ -630,8 +683,11 @@ the usual three variants to specify the kind of end-of-line conversion.
 @node Recognize Coding
 @section Recognizing Coding Systems
 
- Most of the time, Emacs can recognize which coding system to use for
-any given file---once you have specified your preferences.
+ Emacs tries to recognize which coding system to use for a given text
+as an integral part of reading that text. (This applies to files
+being read, output from subprocesses, text from X selections, etc.)
+Emacs can select the right coding system automatically most of the
+time---once you have specified your preferences.
 
  Some coding systems can be recognized or distinguished by which byte
 sequences appear in the data. However, there are coding systems that
@@ -737,6 +793,11 @@ feature for tar and archive files, to prevent Emacs from being confused
 by a @samp{-*-coding:-*-} tag in a member of the archive and thinking it
 applies to the archive file as a whole.
 
+ If Emacs recognizes the encoding of a file incorrectly, you can
+reread the file using the correct coding system by typing @kbd{C-x
+@key{RET} c @var{coding-system} @key{RET} M-x revert-buffer
+@key{RET}}.
+
 @vindex buffer-file-coding-system
  Once Emacs has chosen a coding system for a buffer, it stores that
 coding system in @code{buffer-file-coding-system} and uses that coding
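
Note (not part of the commit): two of the procedures this patch documents can be sketched in Emacs Lisp. This is only an illustration, assuming an Emacs of the same era as the patch. The variable `default-enable-multibyte-characters` is the one named in the patch text and was made obsolete in later Emacs releases. The helper name `my-revert-with-coding-system` is hypothetical. It binds `coding-system-for-read` around `revert-buffer`, which is roughly what the documented `C-x RET c CODING-SYSTEM RET M-x revert-buffer RET` sequence achieves.

```elisp
;; Sketch only; assumes an Emacs contemporary with this patch (2001).

;; Per the patch text: switch a unibyte session to multibyte.
;; Buffers created earlier in the session stay unibyte.
(setq default-enable-multibyte-characters t)

;; Hypothetical helper (name is an assumption, not from the patch):
;; re-read the visited file, decoding it with CODING-SYSTEM.
(defun my-revert-with-coding-system (coding-system)
  "Revert the current buffer, decoding its file with CODING-SYSTEM."
  (interactive "zCoding system: ")
  ;; `coding-system-for-read' overrides automatic detection while bound.
  (let ((coding-system-for-read coding-system))
    ;; Arguments: IGNORE-AUTO = t, NOCONFIRM = t.
    (revert-buffer t t)))
```

For instance, after creating the codepage coding system as the patch describes, `M-x my-revert-with-coding-system RET cp850 RET` would re-read the current file as codepage 850.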