author Kenichi Handa 2008-10-22 05:23:47 +0000
committer Kenichi Handa 2008-10-22 05:23:47 +0000
commit 714b2198bf98f68e3f721675c4df8cefb7d0b268
tree 38419404dfe6bf1bf13b5a2e8991b9b7e49c49e8 /src
parent 67a9bee7b897452b1515e491a91bd75261dabbe0
(word_boundary_p): Check scripts instead of charset.
Handle nil value in word-separating-categories and word-combining-categories.
(syms_of_category): Fix docstrings of word-separating-categories and word-combining-categories.
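
The nil handling described above can be illustrated with a minimal, standalone sketch. It is not the Emacs implementation: `struct rule`, `has_category`, `side_matches` and `word_boundary_sketch` are hypothetical names, a category set is modelled as a plain string of category mnemonics, and 0 stands in for nil. The sketch also assumes that the different-script branch (not visible in this diff) defaults to reporting a boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Lisp cons (CAT1 . CAT2); 0 plays the role of nil. */
struct rule { char cat1; char cat2; };

/* Hypothetical stand-in for CATEGORY_MEMBER: the set is a string of mnemonics. */
static bool has_category(const char *set, char cat)
{
    return cat != 0 && strchr(set, cat) != NULL;
}

/* One side of a rule matches if it is nil (a wildcard) or names a category
   that the character's category set contains. */
static bool side_matches(char cat, const char *set)
{
    return cat == 0 || has_category(set, cat);
}

/* Same script: default is "no boundary"; a separating rule flips it.
   Different scripts (assumed): default is "boundary"; a combining rule flips it. */
static bool word_boundary_sketch(bool same_script,
                                 const char *set1, const char *set2,
                                 const struct rule *separating, size_t n_sep,
                                 const struct rule *combining, size_t n_comb)
{
    const struct rule *rules = same_script ? separating : combining;
    size_t n = same_script ? n_sep : n_comb;
    bool default_result = !same_script;

    for (size_t i = 0; i < n; i++)
        if (side_matches(rules[i].cat1, set1)
            && side_matches(rules[i].cat2, set2))
            return !default_result;
    return default_result;
}
```

With a combining rule of `{ 'C', 'H' }`, a character whose set is `"C"` followed by one whose set is `"H"` yields no boundary across scripts. A rule of `{ 0, 'H' }` does the same for any first character.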
Diffstat (limited to 'src')
-rw-r--r-- src/category.c | 35
1 files changed, 19 insertions, 16 deletions
diff --git a/src/category.c b/src/category.c
index fca39ecb4e6..d5776fa4556 100644
--- a/src/category.c
+++ b/src/category.c
@@ -397,7 +397,8 @@ word_boundary_p (c1, c2)
   Lisp_Object tail;
   int default_result;
 
-  if (CHAR_CHARSET (c1) == CHAR_CHARSET (c2))
+  if (EQ (CHAR_TABLE_REF (Vchar_script_table, c1),
+          CHAR_TABLE_REF (Vchar_script_table, c2)))
     {
       tail = Vword_separating_categories;
       default_result = 0;
@@ -420,10 +421,12 @@ word_boundary_p (c1, c2)
       Lisp_Object elt = XCAR (tail);
 
       if (CONSP (elt)
-          && CATEGORYP (XCAR (elt))
-          && CATEGORYP (XCDR (elt))
-          && CATEGORY_MEMBER (XFASTINT (XCAR (elt)), category_set1)
-          && CATEGORY_MEMBER (XFASTINT (XCDR (elt)), category_set2))
+          && (NILP (XCAR (elt))
+              || (CATEGORYP (XCAR (elt))
+                  && CATEGORY_MEMBER (XFASTINT (XCAR (elt)), category_set1)))
+          && (NILP (XCDR (elt))
+              || (CATEGORYP (XCDR (elt))
+                  && CATEGORY_MEMBER (XFASTINT (XCDR (elt)), category_set2))))
         return !default_result;
     }
   return default_result;
@@ -468,35 +471,35 @@ syms_of_category ()
 
 Emacs treats a sequence of word constituent characters as a single
 word (i.e. finds no word boundary between them) only if they belong to
-the same charset. But, exceptions are allowed in the following cases.
+the same script. But, exceptions are allowed in the following cases.
 
-\(1) The case that characters are in different charsets is controlled
+\(1) The case that characters are in different scripts is controlled
 by the variable `word-combining-categories'.
 
-Emacs finds no word boundary between characters of different charsets
+Emacs finds no word boundary between characters of different scripts
 if they have categories matching some element of this list.
 
 More precisely, if an element of this list is a cons of category CAT1
 and CAT2, and a multibyte character C1 which has CAT1 is followed by
 C2 which has CAT2, there's no word boundary between C1 and C2.
 
-For instance, to tell that ASCII characters and Latin-1 characters can
-form a single word, the element `(?l . ?l)' should be in this list
-because both characters have the category `l' (Latin characters).
+For instance, to tell that Han characters followed by Hiragana
+characters can form a single word, the element `(?C . ?H)' should be
+in this list.
 
-\(2) The case that character are in the same charset is controlled by
+\(2) The case that character are in the same script is controlled by
 the variable `word-separating-categories'.
 
-Emacs find a word boundary between characters of the same charset
+Emacs find a word boundary between characters of the same script
 if they have categories matching some element of this list.
 
 More precisely, if an element of this list is a cons of category CAT1
 and CAT2, and a multibyte character C1 which has CAT1 is followed by
 C2 which has CAT2, there's a word boundary between C1 and C2.
 
-For instance, to tell that there's a word boundary between Japanese
-Hiragana and Japanese Kanji (both are in the same charset), the
-element `(?H . ?C) should be in this list. */);
+For instance, to tell that there's a word boundary between Hiragana
+and Katakana (both are in the same script `kana'),
+the element `(?H . ?K) should be in this list. */);
 
   Vword_combining_categories = Qnil;
 