* regex.c (regex_compile): Adjusted for the change of CHAR_STRING.
1999-12-04 Stefan Monnier <monnier@cs.yale.edu>
* regex.c (regex_compile): Recognize *?, +? and ?? as non-greedy
operators and handle them properly.
* regex.h (RE_ALL_GREEDY): New option.
(RE_UNMATCHED_RIGHT_PAREN_ORD): Moved to the end where alphabetic
sorting would put it.
(RE_SYNTAX_AWK, RE_SYNTAX_GREP, RE_SYNTAX_EGREP)
(_RE_SYNTAX_POSIX_COMMON): Use the new option to keep old behavior.
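The difference between a greedy and a non-greedy operator can be sketched in a few lines of C. This is only an illustration of the matching semantics, not code from regex.c: match_angle is a hypothetical helper that hard-codes the patterns "<.*>" and "<.*?>" and assumes `.' does not match newline.

```c
#include <stddef.h>

/* Hypothetical helper, not part of regex.c: return the length of the
   match of "<.*>" (GREEDY != 0) or "<.*?>" (GREEDY == 0) anchored at
   the start of TEXT, or 0 if there is no match.  `.' is assumed to
   match any character except newline.  */
size_t
match_angle (const char *text, int greedy)
{
  size_t i, best = 0;

  if (text[0] != '<')
    return 0;
  for (i = 1; text[i] != '\0' && text[i] != '\n'; i++)
    if (text[i] == '>')
      {
        best = i + 1;           /* a candidate end of match */
        if (!greedy)
          break;                /* non-greedy: stop at the first one */
      }
  return best;                  /* greedy: the last candidate wins */
}
```

On the text "<a><b>" the greedy form consumes all 6 characters, while the non-greedy form stops after "<a>". Per the entry above, syntaxes that set RE_ALL_GREEDY keep the old behavior.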
as arg to DEBUG_POP and DEBUG_PRINT.
* regex.c [emacs] (ISALNUM, ISALPHA, ISPUNCT): Don't depend on locale.
[emacs] (ISASCII): Don't define ISASCII in this case.
(IS_REAL_ASCII): New macro, 2 alternate definitions.
(ISUNIBYTE): Likewise.
[emacs] (ISDIGIT, ISCNTRL, ISXDIGIT, ISGRAPH, ISPRINT):
Don't use ISASCII.
* regex.c: Handle new class names `ascii', `nonascii',
`unibyte', `multibyte'.
(BIT_ASCII, BIT_NONASCII, BIT_UNIBYTE, BIT_MULTIBYTE): New macros.
(IS_CHAR_CLASS): Accept new class names.
(regex_compile, re_match_2_internal): Handle the new classes.
(ISBLANK, ISGRAPH, ISPRINT, ISALNUM, ISALPHA, ISLOWER)
(ISPUNCT, ISSPACE, ISUPPER): New definitions for emacs only.
(ISWORD): New macro.
(re_opcode_t): Add 2 bytes of flag bits to charset and charset_not.
(CHARSET_RANGE_TABLE): Update definition.
(CHARSET_RANGE_TABLE_BITS): New macro.
(print_partial_compiled_pattern): Skip charset's range table.
(struct range_table_work_area): New field `bits'.
(SET_RANGE_TABLE_WORK_AREA_BIT): New macro.
(BIT_ALNUM, BIT_ALPHA, BIT_WORD, BIT_GRAPH, BIT_LOWER, BIT_PRINT)
(BIT_PUNCT, BIT_SPACE, BIT_UPPER): New macros.
(CLEAR_RANGE_TABLE_WORK_USED): Clear field `bits'.
(RANGE_TABLE_WORK_BITS): New macro.
(IS_CHAR_CLASS): Check for "word".
(regex_compile): Set the `bits' field for some character classes.
Handle the `word' class. Store the `bits' field into the range table.
(re_compile_fastmap): Handle flag bits in range table.
(re_match_2_internal): For charset and charset_not,
handle flag bits in the range table.
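One plausible reading of the flag bits is sketched below: a character that the bitmap cannot enumerate is accepted if any class whose bit is set in the charset also contains that character. All names and bit values here are hypothetical (the real BIT_* values are not given in the entries), and the <wctype.h> predicates are locale-dependent stand-ins for whatever class tests the matcher really uses.

```c
#include <wctype.h>

/* Hypothetical flag bits; the actual BIT_* values in regex.c may differ.  */
enum
{
  SKETCH_BIT_ALPHA = 1 << 0,
  SKETCH_BIT_UPPER = 1 << 1,
  SKETCH_BIT_SPACE = 1 << 2
};

/* Return 1 if C belongs to at least one class selected in BITS,
   else 0.  The wctype predicates stand in for the real class tests.  */
int
sketch_class_bits_match (unsigned int bits, wint_t c)
{
  return ((bits & SKETCH_BIT_ALPHA) && iswalpha (c))
         || ((bits & SKETCH_BIT_UPPER) && iswupper (c))
         || ((bits & SKETCH_BIT_SPACE) && iswspace (c));
}
```

Storing one bit per class keeps the compiled charset small: a class such as [:alpha:] needs a single flag rather than a range entry for every alphabetic multibyte character.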
previous change, for charset_not, wordchar, notwordchar,
categoryspec, notcategoryspec.
elements for all possible unibyte chars (except newline).
exact-match characters.
(re_match_2, re_search_2): Adjust startpos or pos by 1
only if acting on a buffer.
nil for re_match_object means a buffer.
(re_match_2_internal <notwordbeg>): Assume POS1 is positive.
(re_match_2_internal): Likewise.
for a repetition operator, don't look beyond end of pattern arg.
Fix the way RANGE is set when handling begbuf.
needed.
before calling SETUP_SYNTAX_TABLE_FOR_OBJECT.
(regex_compile): Special handling for range \177-\377.
(regex_compile): Cast args to TRANSLATE to unsigned char.
(re_search_2): Fix forward scan handling multibyte.
Recognize that nonascii characters are not in the fastmap.
Handle fetching multibyte characters for backward scan.
(re_match_2_internal): Handle multibyte and translation
in exactn and anychar.
(bcmp_translate): Handle multibyte chars for translation.
(TRANSLATE): Don't cast to unsigned char.
(PATFETCH): Use RE_TRANSLATE to translate.
(re_match_2_internal) <wordbeg, wordend>:
Call UPDATE_SYNTAX_TABLE properly with a charpos.
(re_match_2_internal): Use PTR_BYTE_POS and PT_BYTE.
update (fail_stack).size properly.
Define it simply as a number.
(DOUBLE_FAIL_STACK, regex_compile): Set the limit at the size
TYPICAL_FAILURE_SIZE specifies, rather than at twice that much.
(re_max_failures): Double the initial values.
(INIT_FAIL_STACK): Use TYPICAL_FAILURE_SIZE so that INIT_FAILURE_ALLOC
counts in the proper units.
(INIT_FAILURE_ALLOC): Increase to 20.
(FAIL_STACK_GROWTH_FACTOR): New macro.
(GROW_FAIL_STACK): Renamed from DOUBLE_FAIL_STACK.
FAIL_STACK_GROWTH_FACTOR controls what ratio to increase size by.
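The growth policy these entries describe (multiply the size by a fixed factor, but never exceed a hard limit) can be sketched as follows. This is a minimal illustration under stated assumptions, not the GROW_FAIL_STACK macro itself: the struct, the function name, and the factor of 4 are all hypothetical, and the limit is passed in as an element count rather than derived from re_max_failures and TYPICAL_FAILURE_SIZE.

```c
#include <stdlib.h>

#define SKETCH_GROWTH_FACTOR 4  /* assumed value, for illustration only */

/* Hypothetical stand-in for the failure stack.  */
struct sketch_stack
{
  int *elts;                    /* storage, may be NULL when size is 0 */
  size_t size;                  /* allocated element count */
};

/* Grow S by SKETCH_GROWTH_FACTOR, capped at LIMIT elements.
   Return 1 on success, 0 if S is already at the limit or memory
   is exhausted.  */
int
sketch_grow (struct sketch_stack *s, size_t limit)
{
  size_t new_size;
  int *p;

  if (s->size >= limit)
    return 0;
  new_size = s->size * SKETCH_GROWTH_FACTOR;
  if (new_size > limit)
    new_size = limit;           /* never exceed the limit */
  if (new_size == 0)
    new_size = 1;               /* first allocation */
  p = realloc (s->elts, new_size * sizeof *p);
  if (p == NULL)
    return 0;
  s->elts = p;
  s->size = new_size;           /* record the new size */
  return 1;
}
```

Starting from an empty stack with a limit of 10, successive calls yield sizes 1, 4 and 10, after which the function reports failure instead of growing further.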
Use RE_TRANSLATE instead of accessing translate array directly.
(POS_AS_IN_BUFFER): New macro.
(SYNTAX_ENTRY_VIA_PROPERTY): Set to take `syntax-table' text
property into account when doing SYNTAX (c).
(re_compile_fastmap): Disable fastmap if any of wordbound,
notwordbound, wordbeg, wordend, notsyntaxspec, syntaxspec are seen.
(re_search_2): SETUP_SYNTAX_TABLE_FOR_OBJECT at the start.
(re_match_object): New variable.
(re_match_2): SETUP_SYNTAX_TABLE_FOR_OBJECT at the start.
(re_match_2_internal): For any of wordbound, notwordbound, wordbeg,
wordend, notsyntaxspec, syntaxspec, call UPDATE_SYNTAX_TABLE before
doing SYNTAX (c).
[emacs]: Include charset.h and category.h.
[!emacs] (BASE_LEADING_CODE_P, WORD_BOUNDARY_P, CHAR_HEAD_P,
SINGLE_BYTE_CHAR_P, SAME_CHARSET_P, MULTIBYTE_FORM_LENGTH,
STRING_CHAR_AND_LENGTH, GET_CHAR_AFTER_2, GET_CHAR_BEFORE_2):
New dummy macros.
(enum re_opcode_t): New members categoryspec and notcategoryspec.
(STORE_CHARACTER_AND_INCR, EXTRACT_CHARACTER,
CHARSET_LOOKUP_RANGE_TABLE_WITH_COUNT,
CHARSET_LOOKUP_RANGE_TABLE, CHARSET_BITMAP_SIZE,
CHARSET_RANGE_TABLE_EXISTS_P, CHARSET_RANGE_TABLE,
CHARSET_PAST_RANGE_TABLE): New macros.
(TRANSLATE): Cast return value to unsigned char, not char.
(struct range_table_work_area): New structure.
(EXTEND_RANGE_TABLE_WORK_AREA, SET_RANGE_TABLE_WORK_AREA,
FREE_RANGE_TABLE_WORK_AREA, CLEAR_RANGE_TABLE_WORK_USED,
RANGE_TABLE_WORK_USED, RANGE_TABLE_WORK_ELT): New macros.
(FREE_STACK_RETURN): Call FREE_RANGE_TABLE_WORK_AREA.
(regex_compile): Declare `c' and `c1' as int to store multibyte characters.
Declare range_table_work and initialize it.
Initialize bufp->multibyte to 0 if not emacs.
For case '[' and `default', code re-written to handle multibyte characters.
Add code for case 'c' and 'C' to handle category spec.
(re_compile_fastmap): New local variables k, simple_char_max,
and match_any_multibyte_characters.
Use macro CHARSET_BITMAP_SIZE.
Handle multibyte characters in cases charset, charset_not,
wordchar, notwordchar, anychar, syntaxspec, notsyntaxspec,
categoryspec, notcategoryspec.
(STOP_ADDR_VSTRING, POS_ADDR_VSTRING): New macros.
(re_search_2): Code re-written to handle multibyte characters.
(AT_WORD_BOUNDARY): Macro disabled.
(re_match_2_internal): New local variable multibyte. `d' is
incremented while paying attention to multibyte characters if necessary.
For case charset, charset_not, wordbound, notwordbound,
wordbeg, wordend, matchsyntax, and matchnotsyntax, code
re-written to handle multibyte characters.
Add code for case categoryspec and notcategoryspec.
Declare c, c1 as unsigned int, not unsigned char.
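The charset representation these entries imply, a bitmap for small character codes plus a table of ranges for larger (multibyte) codes, can be sketched as below. This is an assumption-laden illustration: the struct layout, the names, and the 256-code bitmap are hypothetical, and the real charset is packed into the compiled pattern and read through the CHARSET_* macros instead.

```c
#include <stddef.h>

/* Hypothetical unpacked charset: a bitmap for codes 0..255 and a
   table of inclusive ranges for everything above.  */
struct sketch_charset
{
  unsigned char bitmap[32];     /* one bit per code 0..255 */
  const unsigned int *ranges;   /* start0, end0, start1, end1, ... */
  size_t n_ranges;              /* number of (start, end) pairs */
};

/* Return 1 if C is a member of CS, else 0.  */
int
sketch_charset_has (const struct sketch_charset *cs, unsigned int c)
{
  size_t i;

  if (c < 256)                  /* small codes: test the bitmap */
    return (cs->bitmap[c / 8] >> (c % 8)) & 1;
  for (i = 0; i < cs->n_ranges; i++)   /* large codes: scan the ranges */
    if (cs->ranges[2 * i] <= c && c <= cs->ranges[2 * i + 1])
      return 1;
  return 0;
}
```

This is also why `c' and `c1' must be wider than unsigned char: a multibyte character code does not fit in 8 bits, so the lookup has to compare full integer values against the range endpoints.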
weak_symbol macro after.
* regex.c [_LIBC] (re_comp, re_exec): Define these, but weakly.
length of exactn as character, and don't use length of bitmap of
charset as bitmap.