Improve documentation of 'decode-coding-region'

* src/coding.c (Fdecode_coding_region): Doc fix.
* doc/lispref/nonascii.texi (Coding System Basics)
(Explicit Encoding): Explain the significance of using 'undecided'
in 'decode-coding-*' functions.
author: Eli Zaretskii 2021-11-12 10:53:52 +0200
committer: Eli Zaretskii 2021-11-12 10:53:52 +0200
commit: 0d0125daaeb77af5aa6091059ff6d0c1ce9f6cff (patch)
tree: 3e499aaac6f4af23f08fd5931d45a8f50609c91c
parent: a6905e90cc3358a21726646c4ee9154e80fc96d6 (diff)
download: emacs-0d0125daaeb77af5aa6091059ff6d0c1ce9f6cff.tar.gz
emacs-0d0125daaeb77af5aa6091059ff6d0c1ce9f6cff.zip
2 files changed, 27 insertions, 8 deletions
diff --git a/doc/lispref/nonascii.texi b/doc/lispref/nonascii.texi
index 6980920a7b9..24117b50014 100644
--- a/doc/lispref/nonascii.texi
+++ b/doc/lispref/nonascii.texi
@@ -1048,9 +1048,9 @@ Alternativnyj, and KOI8.
  Every coding system specifies a particular set of character code
 conversions, but the coding system @code{undecided} is special: it
 leaves the choice unspecified, to be chosen heuristically for each
-file, based on the file's data.  The coding system @code{prefer-utf-8}
-is like @code{undecided}, but it prefers to choose @code{utf-8} when
-possible.
+file or string, based on the file's or string's data, when they are
+decoded or encoded.  The coding system @code{prefer-utf-8} is like
+@code{undecided}, but it prefers to choose @code{utf-8} when possible.
  In general, a coding system doesn't guarantee roundtrip identity:
 decoding a byte sequence using a coding system, then encoding the
@@ -1921,9 +1921,24 @@ length of the decoded text.  If that buffer is a unibyte buffer
 the decoded text (@pxref{Text Representations}) is inserted into the
 buffer as individual bytes.
+@cindex @code{charset}, text property on buffer text
 This command puts a @code{charset} text property on the decoded text.
 The value of the property states the character set used to decode the
 original text.
+@cindex undecided coding-system, when decoding
+This command detects the encoding of the text if necessary.  If
+@var{coding-system} is @code{undecided}, the command detects the
+encoding of the text based on the byte sequences it finds in the text,
+and also detects the type of end-of-line convention used by the text
+(@pxref{Lisp and Coding Systems, eol type}).  If @var{coding-system}
+is @code{undecided-@var{eol-type}}, where @var{eol-type} is
+@code{unix}, @code{dos}, or @code{mac}, then the command detects only
+the encoding of the text.  Any @var{coding-system} that doesn't
+specify @var{eol-type}, as in @code{utf-8}, causes the command to
+detect the end-of-line convention; specify the encoding completely, as
+in @code{utf-8-unix}, if the EOL convention used by the text is known
+in advance, to prevent any automatic detection.
 @end deffn
 @defun decode-coding-string string coding-system &optional nocopy buffer
@@ -1936,13 +1951,16 @@ trivial.  To make explicit decoding useful, the contents of
 values, but a multibyte string is also acceptable (assuming it
 contains 8-bit bytes in their multibyte form).
+This function detects the encoding of the string if needed, like
+@code{decode-coding-region} does.
 If optional argument @var{buffer} specifies a buffer, the decoded text
 is inserted in that buffer after point (point does not move).  In this
 case, the return value is the length of the decoded text.  If that
 buffer is a unibyte buffer, the internal representation of the decoded
 text is inserted into it as individual bytes.
-@cindex @code{charset}, text property
+@cindex @code{charset}, text property on strings
 This function puts a @code{charset} text property on the decoded text.
 The value of the property states the character set used to decode the
 original text:
diff --git a/src/coding.c b/src/coding.c
index 7030a53869a..02dccf5bdb0 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9455,11 +9455,12 @@ code_convert_region (Lisp_Object start, Lisp_Object end,
 DEFUN ("decode-coding-region", Fdecode_coding_region, Sdecode_coding_region,
       3, 4, "r\nzCoding system: ",
       doc: /* Decode the current region from the specified coding system.
+Interactively, prompt for the coding system to decode the region.
-What's meant by \"decoding\" is transforming bytes into text
-(characters).  If, for instance, you have a region that contains data
-that represents the two bytes #xc2 #xa9, after calling this function
-with the utf-8 coding system, the region will contain the single
+\"Decoding\" means transforming bytes into readable text (characters).
+If, for instance, you have a region that contains data that represents
+the two bytes #xc2 #xa9, after calling this function with the utf-8
+coding system, the region will contain the single
 character ?\\N{COPYRIGHT SIGN}.
 When called from a program, takes four arguments:
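
The behavior documented by this change can be illustrated with a short Emacs Lisp sketch. It is not part of the commit: the variable names are hypothetical, it uses `decode-coding-string` rather than `decode-coding-region` so that no buffer setup is needed, and it assumes a default configuration in which `inhibit-eol-conversion` is nil.

```elisp
;; Illustration only, not part of the patch.  All `my-demo-*' names are
;; hypothetical.  Assumes `inhibit-eol-conversion' is nil (the default).

;; The two bytes #xc2 #xa9 are the UTF-8 encoding of COPYRIGHT SIGN,
;; so decoding them yields a one-character string.
(defvar my-demo-copyright
  (decode-coding-string "\xc2\xa9" 'utf-8-unix))

;; `utf-8' does not specify an EOL type, so the end-of-line convention
;; is detected; CRLF line endings are converted to plain newlines.
(defvar my-demo-eol-detected
  (decode-coding-string "one\r\ntwo\r\n" 'utf-8))

;; `utf-8-unix' specifies the encoding completely, so no detection
;; takes place and the carriage returns stay in the decoded text.
(defvar my-demo-eol-fixed
  (decode-coding-string "one\r\ntwo\r\n" 'utf-8-unix))
```

Passing `undecided` instead would additionally ask Emacs to guess the encoding from the bytes; the outcome then depends on the detection priorities in effect, so no particular result is shown for that case.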