-rw-r--r-- etc/NEWS 2
-rw-r--r-- man/search.texi 22
-rw-r--r-- src/ChangeLog 82
3 files changed, 98 insertions, 8 deletions
diff --git a/etc/NEWS b/etc/NEWS
index eeb04e11410..80f4d96a785 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1087,7 +1087,7 @@ what BODY returns.
 
 +++
 ** Regular expressions now support intervals \{n,m\} as well as
-Perl's non-greedy *? +? and ?? operators.
+Perl's shy-groups \(?:...\) and non-greedy *? +? and ?? operators.
 
 +++
 ** The optional argument BUFFER of function file-local-copy has been
diff --git a/man/search.texi b/man/search.texi
index de6cd92849e..96f984b053c 100644
--- a/man/search.texi
+++ b/man/search.texi
@@ -432,16 +432,16 @@ are non-greedy variants of the operators above. The normal operators
 as they can, while if you append a @samp{?} after them, it makes them
 non-greedy: they will match as little as possible.
 
-@item \@{n,m\@}
+@item \@{@var{n},@var{m}\@}
 is another postfix operator that specifies an interval of iteration:
-the preceding regular expression must match between @samp{n} and
-@samp{m} times. If @samp{m} is omitted, then there is no upper bound
-and if @samp{,m} is omitted, then the regular expression must match
-exactly @samp{n} times. @*
+the preceding regular expression must match between @var{n} and
+@var{m} times. If @var{m} is omitted, then there is no upper bound
+and if @var{,m} is omitted, then the regular expression must match
+exactly @var{n} times. @*
 @samp{\@{0,1\@}} is equivalent to @samp{?}. @*
 @samp{\@{0,\@}} is equivalent to @samp{*}. @*
 @samp{\@{1,\@}} is equivalent to @samp{+}. @*
-@samp{\@{n\@}} is equivalent to @samp{\@{n,n\@}}.
+@samp{\@{@var{n}\@}} is equivalent to @samp{\@{@var{n},@var{n}\@}}.
 
 @item [ @dots{} ]
 is a @dfn{character set}, which begins with @samp{[} and is terminated
@@ -560,7 +560,15 @@ To record a matched substring for future reference.
 This last application is not a consequence of the idea of a
 parenthetical grouping; it is a separate feature that is assigned as a
 second meaning to the same @samp{\( @dots{} \)} construct. In practice
-there is no conflict between the two meanings.
+there is almost no conflict between the two meanings.
+
+@item \(?: @dots{} \)
+is another grouping construct (often called ``shy'') that serves the same
+first two purposes, but not the third:
+it cannot be referred to later on by number. This is only useful
+for mechanically constructed regular expressions where grouping
+constructs need to be introduced implicitly and hence risk changing the
+numbering of subsequent groups.
 
 @item \@var{d}
 matches the same text that matched the @var{d}th occurrence of a
diff --git a/src/ChangeLog b/src/ChangeLog
index 847ac8c7748..3d7584433b1 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,85 @@
+2000-03-08 Stefan Monnier <monnier@cs.yale.edu>
+
+	This is a big redesign of failure-stack and register handling, prompted
+	by bugs revealed when trying to add shy-groups. Overall, what happened
+	is that loops are now structured a little differently, groups can be
+	shy and the code is a little simpler.
+
+	* regex.h: Update the copyright.
+	(RE_SHY_GROUPS): New value.
+	(RE_UNMATCHED_RIGHT_PAREN_ORD): Renumber.
+	(RE_SYNTAX_EMACS): Add RE_SHY_GROUPS.
+
+	* regex.c (enum re_opcode_t): Remove jump_past_alt, maybe_pop_jump,
+	push_dummy_failure and dumy_failure_jump.
+	Add on_failure_jump_(exclusive, loop and smart).
+	Also fix the comment for (start|stop)_memory since they now only take
+	one argument (the second has becomes unnecessary).
+	(print_partial_compiled_pattern): Adjust for changes in re_opcode_t.
+	(print_compiled_pattern): Use %ld to printf long ints and flush to make
+	debugging a little easier.
+	(union fail_stack_elt): Make the integer unsigned.
+	(struct fail_stack_type): Add a `frame' element.
+	(INIT_FAIL_STACK): Init `frame' as well.
+	(POP_PATTERN_OP): New macro for re_compile_fastmap.
+	(DEBUG_PUSH, DEBUG_POP): Remove.
+	(NUM_REG_ITEMS): Remove.
+	(NUM_NONREG_ITEMS): Adjust.
+	(FAILURE_PAT, FAILURE_STR, NEXT_FAILURE_HANDLE, TOP_FAILURE_HANDLE):
+	New macros for the cycle detection.
+	(ENSURE_FAIL_STACK): New macro for PUSH_FAILURE_(REG|POINT).
+	(PUSH_FAILURE_REG, POP_FAILURE_REG, CHECK_INFINITE_LOOP): New macros.
+	(PUSH_FAILURE_POINT): Don't push registers any more.
+	The pattern address pushed is not the destination of the jump
+	but the source of it instead.
+	(NUM_FAILURE_ITEMS): Remove.
+	(POP_FAILURE_POINT): Adapt to the new stack structure (i.e. pop
+	registers before the actual failure point).
+	Don't hardcode any meaning for str==NULL anymore.
+	(union register_info_type, REG_MATCH_NULL_STRING_P, IS_ACTIVE)
+	(MATCHED_SOMETHING, EVER_MATCHED_SOMETHING, SET_REGS_MATCHED): Remove.
+	(REG_UNSET_VALUE): Use NULL (why not?).
+	(compile_range): Remove declaration since it doesn't exist.
+	(struct compile_stack_elt_t): Remove inner_group_offset.
+	(old_reg(start|end), reg_info, reg_dummy, reg_info_dummy): Remove.
+	(regex_grow_registers): Remove dead code.
+	(FIXUP_ALT_JUMP): New macro.
+	(regex_compile): Add shy-groups
+	Change loops to use on_failure_jump_smart&jump instead of
+	on_failure_jump&maybe_pop_jump.
+	Change + loops to eliminate the initial (dummy_failure_)jump.
+	Remove c1_base (looks like unused variable to me).
+	Use `jump' instead of `jump_past_alt' and don't bother with
+	push_dummy_failure in alternatives since it is now unnecessary.
+	Use FIXUP_ALT_JUMP.
+	Eliminate a useless `#ifdef emacs' for (re)allocating the stack.
+	(re_compile_fastmap): Remove dead variables i and num_regs.
+	Exit from loop when bufp->can_be_null rather than jumping to `done'.
+	Avoid jumping backwards so as to ensure termination.
+	Use PATTERN_STACK_EMPTY and POP_PATTERN_OP.
+	Improved handling of backreferences.
+	Remove dead code in handling of `anychar'.
+	(skip_noops, mutually_exclusive_p): New functions taken from the
+	handling of `maybe_pop_jump' in re_match_2_internal.
+	Slightly improve mutually_exclusive_p to handle ".+\n".
+	((lowest|highest)_active_reg, NO_(LOWEST|HIGHEST)_ACTIVE_REG)
+	Remove.
+	(re_match_2_internal): Use %p instead of 0x%x when printf'ing ptrs.
+	Don't SET_REGS_MATCHED anymore. Remove many dead variables.
+	Push register (in `start_memory') on the stack rather than storing it
+	in old_reg(start|end).
+	Remove the cycle detection from `stop_memory', replaced by the use
+	of on_failure_jump_loop for greedy loops.
+	Add code for the new on_failure_jump_<foo>.
+	Remove ad-hoc code in `on_failure_jump' to push more registers
+	in the case of a loop.
+	Take out code from `maybe_pop_jump' into separate functions and
+	adapt it to the semantics of `on_failure_jump_smart'.
+	Remove jump_past_alt, dummy_failure_jump and push_dummy_failure.
+	Remove dummy_failure handling and handling of `failures to jump
+	to on_failure_jump' (this last one was already dead code, it seems).
+	((group|alt|common_op)_match_null_string_p): Remove.
+
 2000-03-08 Dave Love <fx@gnu.org>
 
 	* config.in: Don't depend on __STDC__ for volatile.