author Eli Zaretskii 2021-11-12 10:53:52 +0200
committer Eli Zaretskii 2021-11-12 10:53:52 +0200
commit 0d0125daaeb77af5aa6091059ff6d0c1ce9f6cff
tree 3e499aaac6f4af23f08fd5931d45a8f50609c91c
parent a6905e90cc3358a21726646c4ee9154e80fc96d6
Improve documentation of 'decode-coding-region'
* src/coding.c (Fdecode_coding_region): Doc fix.
* doc/lispref/nonascii.texi (Coding System Basics)
(Explicit Encoding): Explain the significance of using 'undecided'
in 'decode-coding-*' functions.
-rw-r--r-- doc/lispref/nonascii.texi 26
-rw-r--r-- src/coding.c 9
2 files changed, 27 insertions, 8 deletions
diff --git a/doc/lispref/nonascii.texi b/doc/lispref/nonascii.texi
index 6980920a7b9..24117b50014 100644
--- a/doc/lispref/nonascii.texi
+++ b/doc/lispref/nonascii.texi
@@ -1048,9 +1048,9 @@ Alternativnyj, and KOI8.
   Every coding system specifies a particular set of character code
 conversions, but the coding system @code{undecided} is special: it
 leaves the choice unspecified, to be chosen heuristically for each
-file, based on the file's data. The coding system @code{prefer-utf-8}
-is like @code{undecided}, but it prefers to choose @code{utf-8} when
-possible.
+file or string, based on the file's or string's data, when they are
+decoded or encoded. The coding system @code{prefer-utf-8} is like
+@code{undecided}, but it prefers to choose @code{utf-8} when possible.
 
   In general, a coding system doesn't guarantee roundtrip identity:
 decoding a byte sequence using a coding system, then encoding the
@@ -1921,9 +1921,24 @@ length of the decoded text. If that buffer is a unibyte buffer
 the decoded text (@pxref{Text Representations}) is inserted into the
 buffer as individual bytes.
 
+@cindex @code{charset}, text property on buffer text
 This command puts a @code{charset} text property on the decoded text.
 The value of the property states the character set used to decode the
 original text.
+
+@cindex undecided coding-system, when decoding
+This command detects the encoding of the text if necessary. If
+@var{coding-system} is @code{undecided}, the command detects the
+encoding of the text based on the byte sequences it finds in the text,
+and also detects the type of end-of-line convention used by the text
+(@pxref{Lisp and Coding Systems, eol type}). If @var{coding-system}
+is @code{undecided-@var{eol-type}}, where @var{eol-type} is
+@code{unix}, @code{dos}, or @code{mac}, then the command detects only
+the encoding of the text. Any @var{coding-system} that doesn't
+specify @var{eol-type}, as in @code{utf-8}, causes the command to
+detect the end-of-line convention; specify the encoding completely, as
+in @code{utf-8-unix}, if the EOL convention used by the text is known
+in advance, to prevent any automatic detection.
 @end deffn
 
 @defun decode-coding-string string coding-system &optional nocopy buffer
@@ -1936,13 +1951,16 @@ trivial. To make explicit decoding useful, the contents of
 values, but a multibyte string is also acceptable (assuming it
 contains 8-bit bytes in their multibyte form).
 
+This function detects the encoding of the string if needed, like
+@code{decode-coding-region} does.
+
 If optional argument @var{buffer} specifies a buffer, the decoded text
 is inserted in that buffer after point (point does not move). In this
 case, the return value is the length of the decoded text. If that
 buffer is a unibyte buffer, the internal representation of the decoded
 text is inserted into it as individual bytes.
 
-@cindex @code{charset}, text property
+@cindex @code{charset}, text property on strings
 This function puts a @code{charset} text property on the decoded text.
 The value of the property states the character set used to decode the
 original text:
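The detection rules that this change adds to nonascii.texi can be summarized as a small decision table. The sketch below is only an illustration in Python, not Emacs code: `detection_requested` and `EOL_SUFFIXES` are hypothetical names, and the function looks at the coding-system name alone, assuming (as the new text describes) that only `undecided` requests encoding detection and that a `-unix`, `-dos`, or `-mac` suffix suppresses end-of-line detection.

```python
# Hypothetical helper, not part of Emacs: models which kinds of detection
# a coding-system name requests, following the rules in the new text.
EOL_SUFFIXES = ("-unix", "-dos", "-mac")


def detection_requested(coding_system):
    """Return (detect_encoding, detect_eol) for a coding-system name."""
    base = coding_system
    has_eol_type = False
    for suffix in EOL_SUFFIXES:
        if coding_system.endswith(suffix):
            # Strip the eol-type suffix to get the base coding system.
            base = coding_system[: -len(suffix)]
            has_eol_type = True
            break
    # Only "undecided" asks for the encoding itself to be detected;
    # any name without an eol-type suffix asks for EOL detection.
    return (base == "undecided", not has_eol_type)


for name in ("undecided", "undecided-dos", "utf-8", "utf-8-unix"):
    print(name, detection_requested(name))
```

Under these assumptions, only a fully specified name such as `utf-8-unix` requests no automatic detection at all, which matches the advice in the added paragraph.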
diff --git a/src/coding.c b/src/coding.c
index 7030a53869a..02dccf5bdb0 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9455,11 +9455,12 @@ code_convert_region (Lisp_Object start, Lisp_Object end,
 DEFUN ("decode-coding-region", Fdecode_coding_region, Sdecode_coding_region,
        3, 4, "r\nzCoding system: ",
        doc: /* Decode the current region from the specified coding system.
+Interactively, prompt for the coding system to decode the region.
 
-What's meant by \"decoding\" is transforming bytes into text
-(characters). If, for instance, you have a region that contains data
-that represents the two bytes #xc2 #xa9, after calling this function
-with the utf-8 coding system, the region will contain the single
+\"Decoding\" means transforming bytes into readable text (characters).
+If, for instance, you have a region that contains data that represents
+the two bytes #xc2 #xa9, after calling this function with the utf-8
+coding system, the region will contain the single
 character ?\\N{COPYRIGHT SIGN}.
 
 When called from a program, takes four arguments:
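
The docstring's example (the two bytes #xc2 #xa9 decoding to a single COPYRIGHT SIGN under utf-8) can be checked outside Emacs. The sketch below uses Python's built-in UTF-8 decoder purely as a stand-in; it does not call or model `decode-coding-region`, and the variable names are illustrative.

```python
import unicodedata

# The two bytes from the docstring example.
raw = bytes([0xC2, 0xA9])

# Decode them as UTF-8: two bytes become one character.
text = raw.decode("utf-8")

# Look up the Unicode name of the resulting character.
name = unicodedata.name(text)

print(len(raw), len(text), hex(ord(text)), name)
```

The two input bytes yield one character, U+00A9, whose Unicode name is COPYRIGHT SIGN, as the docstring states.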