| author | Luc Teirlinck | 2003-10-27 15:54:13 +0000 |
|---|---|---|
| committer | Luc Teirlinck | 2003-10-27 15:54:13 +0000 |
| commit | f67b6c12766d0ba5768cf31344ad112b54e5d694 (patch) | |
| tree | 8f14faac06321313568d8dcbc50cd94886862850 | |
| parent | a2f4def12b0e9936219573505e6405dbbcac8a65 (diff) | |
(Creating Strings): Argument START to `substring' cannot be `nil'.
Expand description of `substring-no-properties'. Correct description
of `split-string', especially with respect to empty matches. Prevent
very bad line break in definition of `split-string-default-separators'.
(Text Comparison): `string=' and `string<' also accept symbols as
arguments.
(String Conversion): More completely describe argument BASE in
`string-to-number'.
(Formatting Strings): `%s' and `%S' in `format' do require a corresponding
object. Clarify behavior of numeric prefix after `%' in `format'.
(Case Conversion): The argument to `upcase-initials' can be a character.
| -rw-r--r-- | lispref/strings.texi | 144 |
1 files changed, 97 insertions, 47 deletions
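
The behaviors named in the commit message can be spot-checked from a Lisp evaluation buffer. The forms below are a minimal sketch, not part of the commit: the sample strings are taken from the manual text in the diff or chosen for illustration, and the results in the comments are what a recent GNU Emacs is expected to return, which may differ in detail from the 2003 sources this patch applies to.

```elisp
;; Illustrative checks only; not part of the patch.  Expected values
;; assume a recent GNU Emacs rather than the 2003 tree.

;; (Creating Strings)
(substring "abcdefg" -3 nil)              ; nil END means "to the end" => "efg"
(substring-no-properties
 (propertize "abc" 'face 'bold))          ; => "abc", with no text properties
(split-string "  two words ")             ; => ("two" "words")
(split-string "  two words "
              split-string-default-separators) ; => ("" "two" "words" "")
(split-string "Soup is good food" "o")    ; => ("S" "up is g" "" "d f" "" "d")
(split-string "Soup is good food" "o+")   ; => ("S" "up is g" "d f" "d")

;; (Text Comparison): symbols are compared by their print names.
(string= 'abc "abc")                      ; => t
(string< 'abc 'abd)                       ; => t

;; (String Conversion): BASE selects the radix for integers.
(string-to-number "ff" 16)                ; => 255
(string-to-number "101" 2)                ; => 5

;; (Formatting Strings): numeric prefix and the missing-object case.
(format "%06d" 123)                       ; => "000123"
(format "%-6d|" 123)                      ; => "123   |"
(condition-case nil
    (format "%s")                         ; no corresponding object
  (error 'signaled))                      ; => signaled

;; (Case Conversion): a character argument is accepted.
(upcase-initials ?a)                      ; => 65, that is, ?A
(upcase-initials "The CAT in the hAt")    ; => "The CAT In The HAt"
```

Each form can be evaluated on its own with `C-x C-e` or in `M-x ielm`. The `condition-case` wrapper is used only so that the `format` error shows up as a value instead of entering the debugger.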
diff --git a/lispref/strings.texi b/lispref/strings.texi
index 79aeb976f1e..b0106f9a73b 100644
--- a/lispref/strings.texi
+++ b/lispref/strings.texi
| @@ -172,7 +172,7 @@ In this example, the index for @samp{e} is @minus{}3, the index for | |||
| 172 | @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1. | 172 | @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1. |
| 173 | Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded. | 173 | Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded. |
| 174 | 174 | ||
| 175 | When @code{nil} is used as an index, it stands for the length of the | 175 | When @code{nil} is used for @var{end}, it stands for the length of the |
| 176 | string. Thus, | 176 | string. Thus, |
| 177 | 177 | ||
| 178 | @example | 178 | @example |
| @@ -208,10 +208,11 @@ For example: | |||
| 208 | @result{} [b (c)] | 208 | @result{} [b (c)] |
| 209 | @end example | 209 | @end example |
| 210 | 210 | ||
| 211 | A @code{wrong-type-argument} error is signaled if either @var{start} or | 211 | A @code{wrong-type-argument} error is signaled if @var{start} is not |
| 212 | @var{end} is not an integer or @code{nil}. An @code{args-out-of-range} | 212 | an integer or if @var{end} is neither an integer nor @code{nil}. An |
| 213 | error is signaled if @var{start} indicates a character following | 213 | @code{args-out-of-range} error is signaled if @var{start} indicates a |
| 214 | @var{end}, or if either integer is out of range for @var{string}. | 214 | character following @var{end}, or if either integer is out of range |
| 215 | for @var{string}. | ||
| 215 | 216 | ||
| 216 | Contrast this function with @code{buffer-substring} (@pxref{Buffer | 217 | Contrast this function with @code{buffer-substring} (@pxref{Buffer |
| 217 | Contents}), which returns a string containing a portion of the text in | 218 | Contents}), which returns a string containing a portion of the text in |
| @@ -219,9 +220,12 @@ the current buffer. The beginning of a string is at index 0, but the | |||
| 219 | beginning of a buffer is at index 1. | 220 | beginning of a buffer is at index 1. |
| 220 | @end defun | 221 | @end defun |
| 221 | 222 | ||
| 222 | @defun substring-no-properties string start &optional end | 223 | @defun substring-no-properties string &optional start end |
| 223 | This works like @code{substring} but discards all text properties | 224 | This works like @code{substring} but discards all text properties from |
| 224 | from the value. | 225 | the value. Also, @var{start} may be omitted or @code{nil}, which is |
| 226 | equivalent to 0. Thus, @w{@code{(substring-no-properties | ||
| 227 | @var{string})}} returns a copy of @var{string}, with all text | ||
| 228 | properties removed. | ||
| 225 | @end defun | 229 | @end defun |
| 226 | 230 | ||
| 227 | @defun concat &rest sequences | 231 | @defun concat &rest sequences |
| @@ -264,7 +268,7 @@ description of @code{mapconcat} in @ref{Mapping Functions}, | |||
| 264 | Lists}. | 268 | Lists}. |
| 265 | @end defun | 269 | @end defun |
| 266 | 270 | ||
| 267 | @defun split-string string separators omit-nulls | 271 | @defun split-string string &optional separators omit-nulls |
| 268 | This function splits @var{string} into substrings at matches for the | 272 | This function splits @var{string} into substrings at matches for the |
| 269 | regular expression @var{separators}. Each match for @var{separators} | 273 | regular expression @var{separators}. Each match for @var{separators} |
| 270 | defines a splitting point; the substrings between the splitting points | 274 | defines a splitting point; the substrings between the splitting points |
| @@ -285,7 +289,7 @@ null strings are always omitted from the result. Thus: | |||
| 285 | 289 | ||
| 286 | @example | 290 | @example |
| 287 | (split-string " two words ") | 291 | (split-string " two words ") |
| 288 | @result{} ("two" "words") | 292 | @result{} ("two" "words") |
| 289 | @end example | 293 | @end example |
| 290 | 294 | ||
| 291 | The result is not @samp{("" "two" "words" "")}, which would rarely be | 295 | The result is not @samp{("" "two" "words" "")}, which would rarely be |
| @@ -294,33 +298,62 @@ useful. If you need such a result, use an explict value for | |||
| 294 | 298 | ||
| 295 | @example | 299 | @example |
| 296 | (split-string " two words " split-string-default-separators) | 300 | (split-string " two words " split-string-default-separators) |
| 297 | @result{} ("" "two" "words" "") | 301 | @result{} ("" "two" "words" "") |
| 298 | @end example | 302 | @end example |
| 299 | 303 | ||
| 300 | More examples: | 304 | More examples: |
| 301 | 305 | ||
| 302 | @example | 306 | @example |
| 303 | (split-string "Soup is good food" "o") | 307 | (split-string "Soup is good food" "o") |
| 304 | @result{} ("S" "up is g" "" "d f" "" "d") | 308 | @result{} ("S" "up is g" "" "d f" "" "d") |
| 305 | (split-string "Soup is good food" "o" t) | 309 | (split-string "Soup is good food" "o" t) |
| 306 | @result{} ("S" "up is g" "d f" "d") | 310 | @result{} ("S" "up is g" "d f" "d") |
| 307 | (split-string "Soup is good food" "o+") | 311 | (split-string "Soup is good food" "o+") |
| 308 | @result{} ("S" "up is g" "d f" "d") | 312 | @result{} ("S" "up is g" "d f" "d") |
| 309 | @end example | 313 | @end example |
| 310 | 314 | ||
| 311 | Empty matches do count, when not adjacent to another match: | 315 | Empty matches do count, except that @code{split-string} will not look |
| 316 | for a final empty match when it already reached the end of the string | ||
| 317 | using a non-empty match or when @var{string} is empty: | ||
| 312 | 318 | ||
| 313 | @example | 319 | @example |
| 314 | (split-string "Soup is good food" "o*") | 320 | (split-string "aooob" "o*") |
| 315 | @result{}("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") | 321 | @result{} ("" "a" "" "b" "") |
| 316 | (split-string "Nice doggy!" "") | 322 | (split-string "ooaboo" "o*") |
| 317 | @result{}("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") | 323 | @result{} ("" "" "a" "b" "") |
| 324 | (split-string "" "") | ||
| 325 | @result{} ("") | ||
| 326 | @end example | ||
| 327 | |||
| 328 | However, when @var{separators} can match the empty string, | ||
| 329 | @var{omit-nulls} is usually @code{t}, so that the subtleties in the | ||
| 330 | three previous examples are rarely relevant: | ||
| 331 | |||
| 332 | @example | ||
| 333 | (split-string "Soup is good food" "o*" t) | ||
| 334 | @result{} ("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") | ||
| 335 | (split-string "Nice doggy!" "" t) | ||
| 336 | @result{} ("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") | ||
| 337 | (split-string "" "" t) | ||
| 338 | @result{} nil | ||
| 339 | @end example | ||
| 340 | |||
| 341 | Somewhat odd, but predictable, behavior can occur for certain | ||
| 342 | ``non-greedy'' values of @var{separators} that can prefer empty | ||
| 343 | matches over non-empty matches. Again, such values rarely occur in | ||
| 344 | practice: | ||
| 345 | |||
| 346 | @example | ||
| 347 | (split-string "ooo" "o*" t) | ||
| 348 | @result{} nil | ||
| 349 | (split-string "ooo" "\\|o+" t) | ||
| 350 | @result{} ("o" "o" "o") | ||
| 318 | @end example | 351 | @end example |
| 319 | @end defun | 352 | @end defun |
| 320 | 353 | ||
| 321 | @defvar split-string-default-separators | 354 | @defvar split-string-default-separators |
| 322 | The default value of @var{separators} for @code{split-string}, initially | 355 | The default value of @var{separators} for @code{split-string}, initially |
| 323 | @samp{"[ \f\t\n\r\v]+"}. | 356 | @w{@samp{"[ \f\t\n\r\v]+"}}. |
| 324 | @end defvar | 357 | @end defvar |
| 325 | 358 | ||
| 326 | @node Modifying Strings | 359 | @node Modifying Strings |
| @@ -367,7 +400,8 @@ in case if @code{case-fold-search} is non-@code{nil}. | |||
| 367 | 400 | ||
| 368 | @defun string= string1 string2 | 401 | @defun string= string1 string2 |
| 369 | This function returns @code{t} if the characters of the two strings | 402 | This function returns @code{t} if the characters of the two strings |
| 370 | match exactly. | 403 | match exactly. Symbols are also allowed as arguments, in which case |
| 404 | their print names are used. | ||
| 371 | Case is always significant, regardless of @code{case-fold-search}. | 405 | Case is always significant, regardless of @code{case-fold-search}. |
| 372 | 406 | ||
| 373 | @example | 407 | @example |
| @@ -441,6 +475,9 @@ no characters is less than any other string. | |||
| 441 | @result{} nil | 475 | @result{} nil |
| 442 | @end group | 476 | @end group |
| 443 | @end example | 477 | @end example |
| 478 | |||
| 479 | Symbols are also allowed as arguments, in which case their print names | ||
| 480 | are used. | ||
| 444 | @end defun | 481 | @end defun |
| 445 | 482 | ||
| 446 | @defun string-lessp string1 string2 | 483 | @defun string-lessp string1 string2 |
| @@ -545,8 +582,10 @@ negative. | |||
| 545 | @example | 582 | @example |
| 546 | (number-to-string 256) | 583 | (number-to-string 256) |
| 547 | @result{} "256" | 584 | @result{} "256" |
| 585 | @group | ||
| 548 | (number-to-string -23) | 586 | (number-to-string -23) |
| 549 | @result{} "-23" | 587 | @result{} "-23" |
| 588 | @end group | ||
| 550 | (number-to-string -23.5) | 589 | (number-to-string -23.5) |
| 551 | @result{} "-23.5" | 590 | @result{} "-23.5" |
| 552 | @end example | 591 | @end example |
| @@ -560,20 +599,22 @@ See also the function @code{format} in @ref{Formatting Strings}. | |||
| 560 | @defun string-to-number string &optional base | 599 | @defun string-to-number string &optional base |
| 561 | @cindex string to number | 600 | @cindex string to number |
| 562 | This function returns the numeric value of the characters in | 601 | This function returns the numeric value of the characters in |
| 563 | @var{string}. If @var{base} is non-@code{nil}, integers are converted | 602 | @var{string}. If @var{base} is non-@code{nil}, it must be an integer |
| 564 | in that base. If @var{base} is @code{nil}, then base ten is used. | 603 | between 2 and 16 (inclusive), and integers are converted in that base. |
| 565 | Floating point conversion always uses base ten; we have not implemented | 604 | If @var{base} is @code{nil}, then base ten is used. Floating point |
| 566 | other radices for floating point numbers, because that would be much | 605 | conversion only works in base ten; we have not implemented other |
| 567 | more work and does not seem useful. If @var{string} looks like an | 606 | radices for floating point numbers, because that would be much more |
| 568 | integer but its value is too large to fit into a Lisp integer, | 607 | work and does not seem useful. If @var{string} looks like an integer |
| 608 | but its value is too large to fit into a Lisp integer, | ||
| 569 | @code{string-to-number} returns a floating point result. | 609 | @code{string-to-number} returns a floating point result. |
| 570 | 610 | ||
| 571 | The parsing skips spaces and tabs at the beginning of @var{string}, then | 611 | The parsing skips spaces and tabs at the beginning of @var{string}, |
| 572 | reads as much of @var{string} as it can interpret as a number. (On some | 612 | then reads as much of @var{string} as it can interpret as a number in |
| 573 | systems it ignores other whitespace at the beginning, not just spaces | 613 | the given base. (On some systems it ignores other whitespace at the |
| 574 | and tabs.) If the first character after the ignored whitespace is | 614 | beginning, not just spaces and tabs.) If the first character after |
| 575 | neither a digit, nor a plus or minus sign, nor the leading dot of a | 615 | the ignored whitespace is neither a digit in the given base, nor a |
| 576 | floating point number, this function returns 0. | 616 | plus or minus sign, nor the leading dot of a floating point number, |
| 617 | this function returns 0. | ||
| 577 | 618 | ||
| 578 | @example | 619 | @example |
| 579 | (string-to-number "256") | 620 | (string-to-number "256") |
| @@ -675,16 +716,12 @@ Starting in Emacs 21, if the object is a string, its text properties are | |||
| 675 | copied into the output. The text properties of the @samp{%s} itself | 716 | copied into the output. The text properties of the @samp{%s} itself |
| 676 | are also copied, but those of the object take priority. | 717 | are also copied, but those of the object take priority. |
| 677 | 718 | ||
| 678 | If there is no corresponding object, the empty string is used. | ||
| 679 | |||
| 680 | @item %S | 719 | @item %S |
| 681 | Replace the specification with the printed representation of the object, | 720 | Replace the specification with the printed representation of the object, |
| 682 | made with quoting (that is, using @code{prin1}---@pxref{Output | 721 | made with quoting (that is, using @code{prin1}---@pxref{Output |
| 683 | Functions}). Thus, strings are enclosed in @samp{"} characters, and | 722 | Functions}). Thus, strings are enclosed in @samp{"} characters, and |
| 684 | @samp{\} characters appear where necessary before special characters. | 723 | @samp{\} characters appear where necessary before special characters. |
| 685 | 724 | ||
| 686 | If there is no corresponding object, the empty string is used. | ||
| 687 | |||
| 688 | @item %o | 725 | @item %o |
| 689 | @cindex integer to octal | 726 | @cindex integer to octal |
| 690 | Replace the specification with the base-eight representation of an | 727 | Replace the specification with the base-eight representation of an |
| @@ -747,12 +784,17 @@ operation} error. | |||
| 747 | @cindex padding | 784 | @cindex padding |
| 748 | All the specification characters allow an optional numeric prefix | 785 | All the specification characters allow an optional numeric prefix |
| 749 | between the @samp{%} and the character. The optional numeric prefix | 786 | between the @samp{%} and the character. The optional numeric prefix |
| 750 | defines the minimum width for the object. If the printed representation | 787 | defines the minimum width for the object. If the printed |
| 751 | of the object contains fewer characters than this, then it is padded. | 788 | representation of the object contains fewer characters than this, then |
| 752 | The padding is on the left if the prefix is positive (or starts with | 789 | it is padded. The padding is on the left if the prefix is positive |
| 753 | zero) and on the right if the prefix is negative. The padding character | 790 | (or starts with zero) and on the right if the prefix is negative. The |
| 754 | is normally a space, but if the numeric prefix starts with a zero, zeros | 791 | padding character is normally a space, but if the numeric prefix |
| 755 | are used for padding. Here are some examples of padding: | 792 | starts with a zero, zeros are used for padding. Some of these |
| 793 | conventions are ignored for specification characters for which they do | ||
| 794 | not make sense. That is, %s, %S and %c accept a numeric prefix | ||
| 795 | starting with 0, but still pad with @emph{spaces} on the left. Also, | ||
| 796 | %% accepts a numeric prefix, but ignores it. Here are some examples | ||
| 797 | of padding: | ||
| 756 | 798 | ||
| 757 | @example | 799 | @example |
| 758 | (format "%06d is padded on the left with zeros" 123) | 800 | (format "%06d is padded on the left with zeros" 123) |
| @@ -872,11 +914,15 @@ When the argument to @code{capitalize} is a character, @code{capitalize} | |||
| 872 | has the same result as @code{upcase}. | 914 | has the same result as @code{upcase}. |
| 873 | 915 | ||
| 874 | @example | 916 | @example |
| 917 | @group | ||
| 875 | (capitalize "The cat in the hat") | 918 | (capitalize "The cat in the hat") |
| 876 | @result{} "The Cat In The Hat" | 919 | @result{} "The Cat In The Hat" |
| 920 | @end group | ||
| 877 | 921 | ||
| 922 | @group | ||
| 878 | (capitalize "THE 77TH-HATTED CAT") | 923 | (capitalize "THE 77TH-HATTED CAT") |
| 879 | @result{} "The 77th-Hatted Cat" | 924 | @result{} "The 77th-Hatted Cat" |
| 925 | @end group | ||
| 880 | 926 | ||
| 881 | @group | 927 | @group |
| 882 | (capitalize ?x) | 928 | (capitalize ?x) |
| @@ -885,16 +931,20 @@ has the same result as @code{upcase}. | |||
| 885 | @end example | 931 | @end example |
| 886 | @end defun | 932 | @end defun |
| 887 | 933 | ||
| 888 | @defun upcase-initials string | 934 | @defun upcase-initials string-or-char |
| 889 | This function capitalizes the initials of the words in @var{string}, | 935 | If @var{string-or-char} is a string, this function capitalizes the |
| 890 | without altering any letters other than the initials. It returns a new | 936 | initials of the words in @var{string-or-char}, without altering any |
| 891 | string whose contents are a copy of @var{string}, in which each word has | 937 | letters other than the initials. It returns a new string whose |
| 938 | contents are a copy of @var{string-or-char}, in which each word has | ||
| 892 | had its initial letter converted to upper case. | 939 | had its initial letter converted to upper case. |
| 893 | 940 | ||
| 894 | The definition of a word is any sequence of consecutive characters that | 941 | The definition of a word is any sequence of consecutive characters that |
| 895 | are assigned to the word constituent syntax class in the current syntax | 942 | are assigned to the word constituent syntax class in the current syntax |
| 896 | table (@pxref{Syntax Class Table}). | 943 | table (@pxref{Syntax Class Table}). |
| 897 | 944 | ||
| 945 | When the argument to @code{upcase-initials} is a character, | ||
| 946 | @code{upcase-initials} has the same result as @code{upcase}. | ||
| 947 | |||
| 898 | @example | 948 | @example |
| 899 | @group | 949 | @group |
| 900 | (upcase-initials "The CAT in the hAt") | 950 | (upcase-initials "The CAT in the hAt") |