author: Chong Yidong, 2009-02-22 00:22:46 +0000
committer: Chong Yidong, 2009-02-22 00:22:46 +0000
commit: 8f88eb24b234cb3ae794a4780e76f99498d0a154
tree: b352d6c3da7d7596eacdd01b37f58507607e3d8e
parent: 5fbf8b28ee3bc4c1921eeaf2a33d64bd1888f024
(Creating Strings): Copyedits. Remove obsolete Emacs 20 usage of `concat'.
(Case Conversion): Copyedits.
-rw-r--r-- doc/lispref/strings.texi 145
1 files changed, 67 insertions, 78 deletions
diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 0b53a7ae593..5dd5e802b89 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -61,15 +61,13 @@ concerned with these two representations.
  Sometimes key sequences are represented as unibyte strings. When a
 unibyte string is a key sequence, string elements in the range 128 to
 255 represent meta characters (which are large integers) rather than
-character codes in the range 128 to 255.
-
- Strings cannot hold characters that have the hyper, super or alt
-modifiers; they can hold @acronym{ASCII} control characters, but no other
-control characters. They do not distinguish case in @acronym{ASCII} control
-characters. If you want to store such characters in a sequence, such as
-a key sequence, you must use a vector instead of a string.
-@xref{Character Type}, for more information about the representation of meta
-and other modifiers for keyboard input characters.
+character codes in the range 128 to 255. Strings cannot hold
+characters that have the hyper, super or alt modifiers; they can hold
+@acronym{ASCII} control characters, but no other control characters.
+They do not distinguish case in @acronym{ASCII} control characters.
+If you want to store such characters in a sequence, such as a key
+sequence, you must use a vector instead of a string. @xref{Character
+Type}, for more information about keyboard input characters.
 
  Strings are useful for holding regular expressions. You can also
 match regular expressions against strings with @code{string-match}
@@ -155,11 +153,11 @@ index @var{start} up to (but excluding) the character at the index
 @end example
 
 @noindent
-Here the index for @samp{a} is 0, the index for @samp{b} is 1, and the
-index for @samp{c} is 2. Thus, three letters, @samp{abc}, are copied
-from the string @code{"abcdefg"}. The index 3 marks the character
-position up to which the substring is copied. The character whose index
-is 3 is actually the fourth character in the string.
+In the above example, the index for @samp{a} is 0, the index for
+@samp{b} is 1, and the index for @samp{c} is 2. The index 3---which
+is the the fourth character in the string---marks the character
+position up to which the substring is copied. Thus, @samp{abc} is
+copied from the string @code{"abcdefg"}.
 
 A negative number counts from the end of the string, so that @minus{}1
 signifies the index of the last character of the string. For example:
@@ -256,16 +254,9 @@ returns an empty string.
 @end example
 
 @noindent
-The @code{concat} function always constructs a new string that is
-not @code{eq} to any existing string, except when the result is empty
-(since empty strings are canonicalized to save space).
-
-In Emacs versions before 21, when an argument was an integer (not a
-sequence of integers), it was converted to a string of digits making up
-the decimal printed representation of the integer. This obsolete usage
-no longer works. The proper way to convert an integer to its decimal
-printed form is with @code{format} (@pxref{Formatting Strings}) or
-@code{number-to-string} (@pxref{String Conversion}).
+This function always constructs a new string that is not @code{eq} to
+any existing string, except when the result is the empty string (to
+save space, Emacs makes only one empty multibyte string).
 
 For information about other concatenation functions, see the
 description of @code{mapconcat} in @ref{Mapping Functions},
@@ -276,20 +267,19 @@ combine-and-quote-strings}.
 @end defun
 
 @defun split-string string &optional separators omit-nulls
-This function splits @var{string} into substrings at matches for the
-regular expression @var{separators}. Each match for @var{separators}
-defines a splitting point; the substrings between the splitting points
-are made into a list, which is the value returned by
-@code{split-string}.
+This function splits @var{string} into substrings based on the regular
+expression @var{separators} (@pxref{Regular Expressions}). Each match
+for @var{separators} defines a splitting point; the substrings between
+splitting points are made into a list, which is returned.
 
-If @var{omit-nulls} is @code{nil}, the result contains null strings
-whenever there are two consecutive matches for @var{separators}, or a
-match is adjacent to the beginning or end of @var{string}. If
-@var{omit-nulls} is @code{t}, these null strings are omitted from the
-result.
+If @var{omit-nulls} is @code{nil} (or omitted), the result contains
+null strings whenever there are two consecutive matches for
+@var{separators}, or a match is adjacent to the beginning or end of
+@var{string}. If @var{omit-nulls} is @code{t}, these null strings are
+omitted from the result.
 
-If @var{separators} is @code{nil} (or omitted),
-the default is the value of @code{split-string-default-separators}.
+If @var{separators} is @code{nil} (or omitted), the default is the
+value of @code{split-string-default-separators}.
 
 As a special case, when @var{separators} is @code{nil} (or omitted),
 null strings are always omitted from the result. Thus:
@@ -441,9 +431,9 @@ For technical reasons, a unibyte and a multibyte string are
 @code{equal} if and only if they contain the same sequence of
 character codes and all these codes are either in the range 0 through
 127 (@acronym{ASCII}) or 160 through 255 (@code{eight-bit-graphic}).
-However, when a unibyte string gets converted to a multibyte string,
-all characters with codes in the range 160 through 255 get converted
-to characters with higher codes, whereas @acronym{ASCII} characters
+However, when a unibyte string is converted to a multibyte string, all
+characters with codes in the range 160 through 255 are converted to
+characters with higher codes, whereas @acronym{ASCII} characters
 remain unchanged. Thus, a unibyte string and its conversion to
 multibyte are only @code{equal} if the string is all @acronym{ASCII}.
 Character codes 160 through 255 are not entirely proper in multibyte
@@ -549,7 +539,7 @@ be a list of strings or symbols rather than an actual alist.
 @xref{Association Lists}.
 @end defun
 
- See also the @code{compare-buffer-substrings} function in
+ See also the function @code{compare-buffer-substrings} in
 @ref{Comparing Text}, for a way to compare text in buffers. The
 function @code{string-match}, which matches a regular expression
 against a string, can be used for a kind of string comparison; see
@@ -560,14 +550,14 @@ against a string, can be used for a kind of string comparison; see
 @section Conversion of Characters and Strings
 @cindex conversion of strings
 
- This section describes functions for conversions between characters,
-strings and integers. @code{format} (@pxref{Formatting Strings})
-and @code{prin1-to-string}
-(@pxref{Output Functions}) can also convert Lisp objects into strings.
-@code{read-from-string} (@pxref{Input Functions}) can ``convert'' a
-string representation of a Lisp object into an object. The functions
-@code{string-make-multibyte} and @code{string-make-unibyte} convert the
-text representation of a string (@pxref{Converting Representations}).
+ This section describes functions for converting between characters,
+strings and integers. @code{format} (@pxref{Formatting Strings}) and
+@code{prin1-to-string} (@pxref{Output Functions}) can also convert
+Lisp objects into strings. @code{read-from-string} (@pxref{Input
+Functions}) can ``convert'' a string representation of a Lisp object
+into an object. The functions @code{string-make-multibyte} and
+@code{string-make-unibyte} convert the text representation of a string
+(@pxref{Converting Representations}).
 
  @xref{Documentation}, for functions that produce textual descriptions
 of text characters and general input events
@@ -689,10 +679,10 @@ Functions}.
 @cindex formatting strings
 @cindex strings, formatting them
 
- @dfn{Formatting} means constructing a string by substitution of
-computed values at various places in a constant string. This constant string
-controls how the other values are printed, as well as where they appear;
-it is called a @dfn{format string}.
+ @dfn{Formatting} means constructing a string by substituting
+computed values at various places in a constant string. This constant
+string controls how the other values are printed, as well as where
+they appear; it is called a @dfn{format string}.
 
  Formatting is often useful for computing messages to be displayed. In
 fact, the functions @code{message} and @code{error} provide the same
@@ -936,15 +926,15 @@ arguments.
 @acronym{ASCII} codes 88 and 120 respectively.
 
 @defun downcase string-or-char
-This function converts a character or a string to lower case.
+This function converts @var{string-or-char}, which should be either a
+character or a string, to lower case.
 
-When the argument to @code{downcase} is a string, the function creates
-and returns a new string in which each letter in the argument that is
-upper case is converted to lower case. When the argument to
-@code{downcase} is a character, @code{downcase} returns the
-corresponding lower case character. This value is an integer. If the
-original character is lower case, or is not a letter, then the value
-equals the original character.
+When @var{string-or-char} is a string, this function returns a new
+string in which each letter in the argument that is upper case is
+converted to lower case. When @var{string-or-char} is a character,
+this function returns the corresponding lower case character (an
+integer); if the original character is lower case, or is not a letter,
+the return value is equal to the original character.
 
 @example
 (downcase "The cat in the hat")
@@ -956,16 +946,15 @@ equals the original character.
 @end defun
 
 @defun upcase string-or-char
-This function converts a character or a string to upper case.
-
-When the argument to @code{upcase} is a string, the function creates
-and returns a new string in which each letter in the argument that is
-lower case is converted to upper case.
+This function converts @var{string-or-char}, which should be either a
+character or a string, to upper case.
 
-When the argument to @code{upcase} is a character, @code{upcase}
-returns the corresponding upper case character. This value is an integer.
-If the original character is upper case, or is not a letter, then the
-value returned equals the original character.
+When @var{string-or-char} is a string, this function returns a new
+string in which each letter in the argument that is lower case is
+converted to upper case. When @var{string-or-char} is a character,
+this function returns the corresponding upper case character (an an
+integer); if the original character is upper case, or is not a letter,
+the return value is equal to the original character.
 
 @example
 (upcase "The cat in the hat")
@@ -979,9 +968,9 @@ value returned equals the original character.
 @defun capitalize string-or-char
 @cindex capitalization
 This function capitalizes strings or characters. If
-@var{string-or-char} is a string, the function creates and returns a new
-string, whose contents are a copy of @var{string-or-char} in which each
-word has been capitalized. This means that the first character of each
+@var{string-or-char} is a string, the function returns a new string
+whose contents are a copy of @var{string-or-char} in which each word
+has been capitalized. This means that the first character of each
 word is converted to upper case, and the rest are converted to lower
 case.
 
@@ -989,8 +978,8 @@ The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
 table (@pxref{Syntax Class Table}).
 
-When the argument to @code{capitalize} is a character, @code{capitalize}
-has the same result as @code{upcase}.
+When @var{string-or-char} is a character, this function does the same
+thing as @code{upcase}.
 
 @example
 @group
@@ -1084,13 +1073,13 @@ equivalent). (For ordinary @acronym{ASCII}, this would map @samp{a} into
 @samp{A} and @samp{A} into @samp{a}, and likewise for each set of
 equivalent characters.)
 
- When you construct a case table, you can provide @code{nil} for
+ When constructing a case table, you can provide @code{nil} for
 @var{canonicalize}; then Emacs fills in this slot from the lower case
 and upper case mappings. You can also provide @code{nil} for
 @var{equivalences}; then Emacs fills in this slot from
 @var{canonicalize}. In a case table that is actually in use, those
-components are non-@code{nil}. Do not try to specify @var{equivalences}
-without also specifying @var{canonicalize}.
+components are non-@code{nil}. Do not try to specify
+@var{equivalences} without also specifying @var{canonicalize}.
 
  Here are the functions for working with case tables:
 
@@ -1125,7 +1114,7 @@ of an abnormal exit via @code{throw} or error (@pxref{Nonlocal
 Exits}).
 @end defmac
 
- Some language environments may modify the case conversions of
+ Some language environments modify the case conversions of
 @acronym{ASCII} characters; for example, in the Turkish language
 environment, the @acronym{ASCII} character @samp{I} is downcased into
 a Turkish ``dotless i''. This can interfere with code that requires