| author | Eli Zaretskii | 2001-05-06 11:27:54 +0000 |
|---|---|---|
| committer | Eli Zaretskii | 2001-05-06 11:27:54 +0000 |
| commit | 8561e53a1cfd859f26508676ddf9533875dea3c0 (patch) | |
| tree | 945068cef06d9873809b7e74a4dd407c21561063 | |
| parent | 80561aaa69dfaf61bcbae2ea6f7befbb9551979c (diff) | |
(International): Add an overview of Mule features, with pointers to
detailed description.
(Enabling Multibyte): Describe how to switch a unibyte session to multibyte.
Mention that by default, all sessions are multibyte.
(Coding Systems): Make it clear that cpNNN are coding systems, and should
be used as such.
(Recognize Coding): Explain that Emacs decodes text as part of reading
it. Mention revert-buffer as a means to redecode a file.
| -rw-r--r-- | man/mule.texi | 71 |
1 files changed, 66 insertions, 5 deletions
diff --git a/man/mule.texi b/man/mule.texi index a80c0af7a81..e490d091b38 100644 --- a/man/mule.texi +++ b/man/mule.texi | |||
| @@ -44,6 +44,42 @@ have been merged from the modified version of Emacs known as MULE (for | |||
| 44 | Emacs also supports various encodings of these characters used by | 44 | Emacs also supports various encodings of these characters used by |
| 45 | other internationalized software, such as word processors and mailers. | 45 | other internationalized software, such as word processors and mailers. |
| 46 | 46 | ||
| 47 | Emacs allows editing text with international characters by supporting | ||
| 48 | all the related activities: | ||
| 49 | |||
| 50 | @itemize @bullet | ||
| 51 | @item | ||
| 52 | You can visit files with non-ASCII characters, save non-ASCII text, and | ||
| 53 | pass non-ASCII text between Emacs and programs it invokes (such as | ||
| 54 | compilers, spell-checkers, and mailers). Setting your language | ||
| 55 | environment (@pxref{Language Environments}) takes care of setting up the | ||
| 56 | coding systems and other options for a specific language or culture. | ||
| 57 | Alternatively, you can specify how Emacs should encode or decode text | ||
| 58 | for each command; see @ref{Specify Coding}. | ||
| 59 | |||
| 60 | @item | ||
| 61 | You can display non-ASCII characters encoded by the various scripts. | ||
| 62 | This works by using appropriate fonts on X and similar graphics | ||
| 63 | displays (@pxref{Defining Fontsets}), and by sending special codes to | ||
| 64 | text-only displays (@pxref{Specify Coding}). If some characters are | ||
| 65 | displayed incorrectly, refer to @ref{Undisplayable Characters}, which | ||
| 66 | describes possible problems and explains how to solve them. | ||
| 67 | |||
| 68 | @item | ||
| 69 | You can insert non-ASCII characters or search for them. To do that, | ||
| 70 | you can specify an input method (@pxref{Select Input Method}) suitable | ||
| 71 | for your language, or use the default input method set up when you set | ||
| 72 | your language environment. (Emacs input methods are part of the Leim | ||
| 73 | package, which must be installed for you to be able to use them.) If | ||
| 74 | your keyboard can produce non-ASCII characters, you can select an | ||
| 75 | appropriate keyboard coding system (@pxref{Specify Coding}), and Emacs | ||
| 76 | will accept those characters. Latin-1 characters can also be input by | ||
| 77 | using the @kbd{C-x 8} prefix, see @ref{Single-Byte Character Support, | ||
| 78 | C-x 8}. | ||
| 79 | @end itemize | ||
| 80 | |||
| 81 | The rest of this chapter describes these issues in detail. | ||
| 82 | |||
| 47 | @menu | 83 | @menu |
| 48 | * International Intro:: Basic concepts of multibyte characters. | 84 | * International Intro:: Basic concepts of multibyte characters. |
| 49 | * Enabling Multibyte:: Controlling whether to use multibyte characters. | 85 | * Enabling Multibyte:: Controlling whether to use multibyte characters. |
| @@ -121,6 +157,7 @@ its internal representation within Emacs. | |||
| 121 | @node Enabling Multibyte | 157 | @node Enabling Multibyte |
| 122 | @section Enabling Multibyte Characters | 158 | @section Enabling Multibyte Characters |
| 123 | 159 | ||
| 160 | @cindex turn multibyte support on or off | ||
| 124 | You can enable or disable multibyte character support, either for | 161 | You can enable or disable multibyte character support, either for |
| 125 | Emacs as a whole, or for a single buffer. When multibyte characters are | 162 | Emacs as a whole, or for a single buffer. When multibyte characters are |
| 126 | disabled in a buffer, then each byte in that buffer represents a | 163 | disabled in a buffer, then each byte in that buffer represents a |
| @@ -134,6 +171,9 @@ use ISO Latin; the Emacs multibyte character set includes all the | |||
| 134 | characters in these character sets, and Emacs can translate | 171 | characters in these character sets, and Emacs can translate |
| 135 | automatically to and from the ISO codes. | 172 | automatically to and from the ISO codes. |
| 136 | 173 | ||
| 174 | By default, Emacs starts in multibyte mode, because that allows you to | ||
| 175 | use all the supported languages and scripts without limitations. | ||
| 176 | |||
| 137 | To edit a particular file in unibyte representation, visit it using | 177 | To edit a particular file in unibyte representation, visit it using |
| 138 | @code{find-file-literally}. @xref{Visiting}. To convert a buffer in | 178 | @code{find-file-literally}. @xref{Visiting}. To convert a buffer in |
| 139 | multibyte representation into a single-byte representation of the same | 179 | multibyte representation into a single-byte representation of the same |
| @@ -152,8 +192,16 @@ conversion, uncompression and auto mode selection as | |||
| 152 | the @samp{--unibyte} option (@pxref{Initial Options}), or set the | 192 | the @samp{--unibyte} option (@pxref{Initial Options}), or set the |
| 153 | environment variable @env{EMACS_UNIBYTE}. You can also customize | 193 | environment variable @env{EMACS_UNIBYTE}. You can also customize |
| 154 | @code{enable-multibyte-characters} or, equivalently, directly set the | 194 | @code{enable-multibyte-characters} or, equivalently, directly set the |
| 155 | variable @code{default-enable-multibyte-characters} in your init file to | 195 | variable @code{default-enable-multibyte-characters} to @code{nil} in |
| 156 | have basically the same effect as @samp{--unibyte}. | 196 | your init file to have basically the same effect as @samp{--unibyte}. |
| 197 | |||
| 198 | @findex toggle-enable-multibyte-characters | ||
| 199 | To convert a unibyte session to a multibyte session, set | ||
| 200 | @code{default-enable-multibyte-characters} to @code{t}. Buffers which | ||
| 201 | were created in the unibyte session before you turn on multibyte support | ||
| 202 | will stay unibyte. You can turn on multibyte support in a specific | ||
| 203 | buffer by invoking the command @code{toggle-enable-multibyte-characters} | ||
| 204 | in that buffer. | ||
| 157 | 205 | ||
| 158 | @cindex Lisp files, and multibyte operation | 206 | @cindex Lisp files, and multibyte operation |
| 159 | @cindex multibyte operation, and Lisp files | 207 | @cindex multibyte operation, and Lisp files |
| @@ -527,10 +575,15 @@ their names usually start with @samp{iso}. There are also special | |||
| 527 | coding systems @code{no-conversion}, @code{raw-text} and | 575 | coding systems @code{no-conversion}, @code{raw-text} and |
| 528 | @code{emacs-mule} which do not convert printing characters at all. | 576 | @code{emacs-mule} which do not convert printing characters at all. |
| 529 | 577 | ||
| 578 | @cindex international files from DOS/Windows systems | ||
| 530 | A special class of coding systems, collectively known as | 579 | A special class of coding systems, collectively known as |
| 531 | @dfn{codepages}, is designed to support text encoded by MS-Windows and | 580 | @dfn{codepages}, is designed to support text encoded by MS-Windows and |
| 532 | MS-DOS software. To use any of these systems, you need to create it | 581 | MS-DOS software. To use any of these systems, you need to create it |
| 533 | with @kbd{M-x codepage-setup}. @xref{MS-DOS and MULE}. | 582 | with @kbd{M-x codepage-setup}. @xref{MS-DOS and MULE}. After |
| 583 | creating the coding system for the codepage, you can use it as any | ||
| 584 | other coding system. For example, to visit a file encoded in codepage | ||
| 585 | 850, type @kbd{C-x @key{RET} c cp850 @key{RET} C-x C-f @var{filename} | ||
| 586 | @key{RET}}. | ||
| 534 | 587 | ||
| 535 | In addition to converting various representations of non-ASCII | 588 | In addition to converting various representations of non-ASCII |
| 536 | characters, a coding system can perform end-of-line conversion. Emacs | 589 | characters, a coding system can perform end-of-line conversion. Emacs |
| @@ -630,8 +683,11 @@ the usual three variants to specify the kind of end-of-line conversion. | |||
| 630 | @node Recognize Coding | 683 | @node Recognize Coding |
| 631 | @section Recognizing Coding Systems | 684 | @section Recognizing Coding Systems |
| 632 | 685 | ||
| 633 | Most of the time, Emacs can recognize which coding system to use for | 686 | Emacs tries to recognize which coding system to use for a given text |
| 634 | any given file---once you have specified your preferences. | 687 | as an integral part of reading that text. (This applies to files |
| 688 | being read, output from subprocesses, text from X selections, etc.) | ||
| 689 | Emacs can select the right coding system automatically most of the | ||
| 690 | time---once you have specified your preferences. | ||
| 635 | 691 | ||
| 636 | Some coding systems can be recognized or distinguished by which byte | 692 | Some coding systems can be recognized or distinguished by which byte |
| 637 | sequences appear in the data. However, there are coding systems that | 693 | sequences appear in the data. However, there are coding systems that |
| @@ -737,6 +793,11 @@ feature for tar and archive files, to prevent Emacs from being confused | |||
| 737 | by a @samp{-*-coding:-*-} tag in a member of the archive and thinking it | 793 | by a @samp{-*-coding:-*-} tag in a member of the archive and thinking it |
| 738 | applies to the archive file as a whole. | 794 | applies to the archive file as a whole. |
| 739 | 795 | ||
| 796 | If Emacs recognizes the encoding of a file incorrectly, you can | ||
| 797 | reread the file using the correct coding system by typing @kbd{C-x | ||
| 798 | @key{RET} c @var{coding-system} @key{RET} M-x revert-buffer | ||
| 799 | @key{RET}}. | ||
| 800 | |||
| 740 | @vindex buffer-file-coding-system | 801 | @vindex buffer-file-coding-system |
| 741 | Once Emacs has chosen a coding system for a buffer, it stores that | 802 | Once Emacs has chosen a coding system for a buffer, it stores that |
| 742 | coding system in @code{buffer-file-coding-system} and uses that coding | 803 | coding system in @code{buffer-file-coding-system} and uses that coding |
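
The (Recognize Coding) hunks above say that Emacs decodes text as part of reading it, and that a misrecognized file can be reread with the correct coding system. The sketch below illustrates that idea outside Emacs, as an analogy only: it is not how Emacs implements decoding. The sample text and the choice of Latin-1 as the wrong guess are assumptions made for illustration, and `cp850` and `latin-1` here are Python codec names.

```python
# Bytes as a DOS program using codepage 850 might have written them.
# The sample word is a made-up example, not taken from the commit.
raw = "café".encode("cp850")

# Decoding with the wrong coding system does not fail; it silently
# produces a different character for the byte that encodes "é".
wrong = raw.decode("latin-1")

# Decoding the same bytes again with the right coding system recovers
# the text, which is what rereading the file with an explicit coding
# system achieves.
right = raw.decode("cp850")

print(repr(wrong))
print(repr(right))
```

The bytes never change between the two attempts. Only the interpretation does, which is why the remedy is to reread the file rather than to edit the misdecoded buffer.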