| -rw-r--r-- | etc/NEWS | 2 | ||||
| -rw-r--r-- | man/search.texi | 22 | ||||
| -rw-r--r-- | src/ChangeLog | 82 |
3 files changed, 98 insertions, 8 deletions
| @@ -1087,7 +1087,7 @@ what BODY returns. | |||
| 1087 | 1087 | ||
| 1088 | +++ | 1088 | +++ |
| 1089 | ** Regular expressions now support intervals \{n,m\} as well as | 1089 | ** Regular expressions now support intervals \{n,m\} as well as |
| 1090 | Perl's non-greedy *? +? and ?? operators. | 1090 | Perl's shy-groups \(?:...\) and non-greedy *? +? and ?? operators. |
| 1091 | 1091 | ||
| 1092 | +++ | 1092 | +++ |
| 1093 | ** The optional argument BUFFER of function file-local-copy has been | 1093 | ** The optional argument BUFFER of function file-local-copy has been |
diff --git a/man/search.texi b/man/search.texi
index de6cd92849e..96f984b053c 100644
--- a/man/search.texi
+++ b/man/search.texi
| @@ -432,16 +432,16 @@ are non-greedy variants of the operators above. The normal operators | |||
| 432 | as they can, while if you append a @samp{?} after them, it makes them | 432 | as they can, while if you append a @samp{?} after them, it makes them |
| 433 | non-greedy: they will match as little as possible. | 433 | non-greedy: they will match as little as possible. |
| 434 | 434 | ||
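A minimal sketch of the greedy versus non-greedy behaviour described in the context lines above. It uses Python's `re` module purely for illustration, on the assumption that `*` and `*?` behave the same way there as in the Emacs syntax documented here; the sample string is made up.

```python
import re

# Hypothetical sample text with two bracketed items.
text = "<a><b>"

# Greedy: `.*` takes as much as it can, so the match runs to the last `>`.
greedy = re.match(r"<.*>", text).group(0)

# Non-greedy: `.*?` takes as little as it can, so the match stops at the first `>`.
lazy = re.match(r"<.*?>", text).group(0)

print(greedy)  # <a><b>
print(lazy)    # <a>
```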
| 435 | @item \@{n,m\@} | 435 | @item \@{@var{n},@var{m}\@} |
| 436 | is another postfix operator that specifies an interval of iteration: | 436 | is another postfix operator that specifies an interval of iteration: |
| 437 | the preceding regular expression must match between @samp{n} and | 437 | the preceding regular expression must match between @var{n} and |
| 438 | @samp{m} times. If @samp{m} is omitted, then there is no upper bound | 438 | @var{m} times. If @var{m} is omitted, then there is no upper bound |
| 439 | and if @samp{,m} is omitted, then the regular expression must match | 439 | and if ,@var{m} is omitted, then the regular expression must match |
| 440 | exactly @samp{n} times. @* | 440 | exactly @var{n} times. @* |
| 441 | @samp{\@{0,1\@}} is equivalent to @samp{?}. @* | 441 | @samp{\@{0,1\@}} is equivalent to @samp{?}. @* |
| 442 | @samp{\@{0,\@}} is equivalent to @samp{*}. @* | 442 | @samp{\@{0,\@}} is equivalent to @samp{*}. @* |
| 443 | @samp{\@{1,\@}} is equivalent to @samp{+}. @* | 443 | @samp{\@{1,\@}} is equivalent to @samp{+}. @* |
| 444 | @samp{\@{n\@}} is equivalent to @samp{\@{n,n\@}}. | 444 | @samp{\@{@var{n}\@}} is equivalent to @samp{\@{@var{n},@var{n}\@}}. |
| 445 | 445 | ||
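The interval rules in this item can be checked with a short sketch. It is written with Python's `re` module as a stand-in, so the braces are unescaped (`{n,m}`) where the Emacs syntax above writes `\{n,m\}`; the helper `full` and the sample strings are illustrative assumptions, not part of the patch.

```python
import re

def full(pattern, s):
    """Return True when the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

samples = ["", "a", "aa", "aaa", "aaaa"]

# Between n=2 and m=3 repetitions.
bounded = [s for s in samples if full(r"a{2,3}", s)]
# m omitted: no upper bound.
open_ended = [s for s in samples if full(r"a{2,}", s)]
# ",m" omitted: exactly n repetitions.
exact = [s for s in samples if full(r"a{2}", s)]

# The equivalences listed in the manual text, checked on the samples.
equivalent = all(
    full(r"a{0,1}", s) == full(r"a?", s)
    and full(r"a{0,}", s) == full(r"a*", s)
    and full(r"a{1,}", s) == full(r"a+", s)
    and full(r"a{2}", s) == full(r"a{2,2}", s)
    for s in samples
)

print(bounded, open_ended, exact, equivalent)
```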
| 446 | @item [ @dots{} ] | 446 | @item [ @dots{} ] |
| 447 | is a @dfn{character set}, which begins with @samp{[} and is terminated | 447 | is a @dfn{character set}, which begins with @samp{[} and is terminated |
| @@ -560,7 +560,15 @@ To record a matched substring for future reference. | |||
| 560 | This last application is not a consequence of the idea of a | 560 | This last application is not a consequence of the idea of a |
| 561 | parenthetical grouping; it is a separate feature that is assigned as a | 561 | parenthetical grouping; it is a separate feature that is assigned as a |
| 562 | second meaning to the same @samp{\( @dots{} \)} construct. In practice | 562 | second meaning to the same @samp{\( @dots{} \)} construct. In practice |
| 563 | there is no conflict between the two meanings. | 563 | there is almost no conflict between the two meanings. |
| 564 | |||
| 565 | @item \(?: @dots{} \) | ||
| 566 | is another grouping construct (often called ``shy'') that serves the same | ||
| 567 | first two purposes, but not the third: | ||
| 568 | it cannot be referred to later on by number. This is only useful | ||
| 569 | for mechanically constructed regular expressions where grouping | ||
| 570 | constructs need to be introduced implicitly and hence risk changing the | ||
| 571 | numbering of subsequent groups. | ||
| 564 | 572 | ||
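The numbering problem that shy groups avoid can be sketched as follows. The example uses Python's `re` module, where the construct is written `(?:...)` rather than `\(?:...\)`; the fragment, the tail pattern, and the input string are hypothetical and stand in for a mechanically built regular expression.

```python
import re

# Hypothetical machine-generated alternation that must be grouped
# before anything is appended to it.
fragment = r"foo|bar"
# The caller expects this to be group number 1.
tail = r"-(\d+)"

# Wrapping with an ordinary group shifts the caller's group to number 2.
capturing = re.compile("(" + fragment + ")" + tail)
# Wrapping with a shy group leaves the numbering untouched.
shy = re.compile("(?:" + fragment + ")" + tail)

m_cap = capturing.match("bar-42")
m_shy = shy.match("bar-42")

print(capturing.groups, m_cap.group(1), m_cap.group(2))
print(shy.groups, m_shy.group(1))
```

With the ordinary group, code that reads group 1 gets the alternation instead of the digits; with the shy group it still gets the digits.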
| 565 | @item \@var{d} | 573 | @item \@var{d} |
| 566 | matches the same text that matched the @var{d}th occurrence of a | 574 | matches the same text that matched the @var{d}th occurrence of a |
diff --git a/src/ChangeLog b/src/ChangeLog
index 847ac8c7748..3d7584433b1 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
| @@ -1,3 +1,85 @@ | |||
| 1 | 2000-03-08 Stefan Monnier <monnier@cs.yale.edu> | ||
| 2 | |||
| 3 | This is a big redesign of failure-stack and register handling, prompted | ||
| 4 | by bugs revealed when trying to add shy-groups. Overall, what happened | ||
| 5 | is that loops are now structured a little differently, groups can be | ||
| 6 | shy and the code is a little simpler. | ||
| 7 | |||
| 8 | * regex.h: Update the copyright. | ||
| 9 | (RE_SHY_GROUPS): New value. | ||
| 10 | (RE_UNMATCHED_RIGHT_PAREN_ORD): Renumber. | ||
| 11 | (RE_SYNTAX_EMACS): Add RE_SHY_GROUPS. | ||
| 12 | |||
| 13 | * regex.c (enum re_opcode_t): Remove jump_past_alt, maybe_pop_jump, | ||
| 14 | push_dummy_failure and dummy_failure_jump. | ||
| 15 | Add on_failure_jump_(exclusive, loop and smart). | ||
| 16 | Also fix the comment for (start|stop)_memory since they now only take | ||
| 17 | one argument (the second has become unnecessary). | ||
| 18 | (print_partial_compiled_pattern): Adjust for changes in re_opcode_t. | ||
| 19 | (print_compiled_pattern): Use %ld to printf long ints and flush to make | ||
| 20 | debugging a little easier. | ||
| 21 | (union fail_stack_elt): Make the integer unsigned. | ||
| 22 | (struct fail_stack_type): Add a `frame' element. | ||
| 23 | (INIT_FAIL_STACK): Init `frame' as well. | ||
| 24 | (POP_PATTERN_OP): New macro for re_compile_fastmap. | ||
| 25 | (DEBUG_PUSH, DEBUG_POP): Remove. | ||
| 26 | (NUM_REG_ITEMS): Remove. | ||
| 27 | (NUM_NONREG_ITEMS): Adjust. | ||
| 28 | (FAILURE_PAT, FAILURE_STR, NEXT_FAILURE_HANDLE, TOP_FAILURE_HANDLE): | ||
| 29 | New macros for the cycle detection. | ||
| 30 | (ENSURE_FAIL_STACK): New macro for PUSH_FAILURE_(REG|POINT). | ||
| 31 | (PUSH_FAILURE_REG, POP_FAILURE_REG, CHECK_INFINITE_LOOP): New macros. | ||
| 32 | (PUSH_FAILURE_POINT): Don't push registers any more. | ||
| 33 | The pattern address pushed is not the destination of the jump | ||
| 34 | but the source of it instead. | ||
| 35 | (NUM_FAILURE_ITEMS): Remove. | ||
| 36 | (POP_FAILURE_POINT): Adapt to the new stack structure (i.e. pop | ||
| 37 | registers before the actual failure point). | ||
| 38 | Don't hardcode any meaning for str==NULL anymore. | ||
| 39 | (union register_info_type, REG_MATCH_NULL_STRING_P, IS_ACTIVE) | ||
| 40 | (MATCHED_SOMETHING, EVER_MATCHED_SOMETHING, SET_REGS_MATCHED): Remove. | ||
| 41 | (REG_UNSET_VALUE): Use NULL (why not?). | ||
| 42 | (compile_range): Remove declaration since it doesn't exist. | ||
| 43 | (struct compile_stack_elt_t): Remove inner_group_offset. | ||
| 44 | (old_reg(start|end), reg_info, reg_dummy, reg_info_dummy): Remove. | ||
| 45 | (regex_grow_registers): Remove dead code. | ||
| 46 | (FIXUP_ALT_JUMP): New macro. | ||
| 47 | (regex_compile): Add shy-groups. | ||
| 48 | Change loops to use on_failure_jump_smart&jump instead of | ||
| 49 | on_failure_jump&maybe_pop_jump. | ||
| 50 | Change + loops to eliminate the initial (dummy_failure_)jump. | ||
| 51 | Remove c1_base (looks like unused variable to me). | ||
| 52 | Use `jump' instead of `jump_past_alt' and don't bother with | ||
| 53 | push_dummy_failure in alternatives since it is now unnecessary. | ||
| 54 | Use FIXUP_ALT_JUMP. | ||
| 55 | Eliminate a useless `#ifdef emacs' for (re)allocating the stack. | ||
| 56 | (re_compile_fastmap): Remove dead variables i and num_regs. | ||
| 57 | Exit from loop when bufp->can_be_null rather than jumping to `done'. | ||
| 58 | Avoid jumping backwards so as to ensure termination. | ||
| 59 | Use PATTERN_STACK_EMPTY and POP_PATTERN_OP. | ||
| 60 | Improved handling of backreferences. | ||
| 61 | Remove dead code in handling of `anychar'. | ||
| 62 | (skip_noops, mutually_exclusive_p): New functions taken from the | ||
| 63 | handling of `maybe_pop_jump' in re_match_2_internal. | ||
| 64 | Slightly improve mutually_exclusive_p to handle ".+\n". | ||
| 65 | ((lowest|highest)_active_reg, NO_(LOWEST|HIGHEST)_ACTIVE_REG) | ||
| 66 | Remove. | ||
| 67 | (re_match_2_internal): Use %p instead of 0x%x when printf'ing ptrs. | ||
| 68 | Don't SET_REGS_MATCHED anymore. Remove many dead variables. | ||
| 69 | Push register (in `start_memory') on the stack rather than storing it | ||
| 70 | in old_reg(start|end). | ||
| 71 | Remove the cycle detection from `stop_memory', replaced by the use | ||
| 72 | of on_failure_jump_loop for greedy loops. | ||
| 73 | Add code for the new on_failure_jump_<foo>. | ||
| 74 | Remove ad-hoc code in `on_failure_jump' to push more registers | ||
| 75 | in the case of a loop. | ||
| 76 | Take out code from `maybe_pop_jump' into separate functions and | ||
| 77 | adapt it to the semantics of `on_failure_jump_smart'. | ||
| 78 | Remove jump_past_alt, dummy_failure_jump and push_dummy_failure. | ||
| 79 | Remove dummy_failure handling and handling of `failures to jump | ||
| 80 | to on_failure_jump' (this last one was already dead code, it seems). | ||
| 81 | ((group|alt|common_op)_match_null_string_p): Remove. | ||
| 82 | |||
| 1 | 2000-03-08 Dave Love <fx@gnu.org> | 83 | 2000-03-08 Dave Love <fx@gnu.org> |
| 2 | 84 | ||
| 3 | * config.in: Don't depend on __STDC__ for volatile. | 85 | * config.in: Don't depend on __STDC__ for volatile. |