author     Alan Mackenzie  2016-03-20 13:19:48 +0000
committer  Alan Mackenzie  2016-03-20 13:19:48 +0000
commit     9dcf5998935c8aaa846d7585b81f0dcfe1935b3d
tree       371e627342a753acc111fa1c774cef559407e18f
parent     565df7265dd73b4812fcb02cd1663fce4dc40be7
Amend parse-partial-sexp correctly to handle two character comment delimiters

Do this by adding a new field to the parser state: the syntax of the last
character scanned, should that be the first char of a (potential) two char
construct, nil otherwise. This should make the parser state complete. Also
document element 9 of the parser state. Also refactor the code a bit.

* src/syntax.c (struct lisp_parse_state): Add a new field.
(SYNTAX_FLAGS_COMSTARTEND_FIRST): New function.
(internalize_parse_state): New function, extracted from scan_sexps_forward.
(back_comment): Call internalize_parse_state.
(forw_comment): Return the syntax of the last character scanned to the
caller when that character might be the first of a two character construct.
(Fforward_comment, scan_lists): New dummy variables, passed to forw_comment.
(scan_sexps_forward): Remove a redundant state parameter. Access all `state'
information via the address parameter `state'. Remove the code which converts
from external to internal form of `state'. Access buffer contents only from
`from' onwards. Reformulate code at the top of the main loop correctly to
recognize comment openers when starting in the middle of one. Call
forw_comment with extra argument (for return of syntax value of possible
first char of a two char construct).
(Fparse_partial_sexp): Document elements 9, 10 of the parser state in the
doc string. Clarify the doc string in general. Call internalize_parse_state.
Take account of the new elements when consing up the output parser state.

* doc/lispref/syntax.texi: (Parser State): Document element 9 and the new
element 10. Minor wording corrections (remove reference to "trivial cases").
(Low Level Parsing): Minor corrections.

* etc/NEWS: Note new element 10, and documentation of element 9 of parser
state.

-rw-r--r--  doc/lispref/syntax.texi   33
-rw-r--r--  etc/NEWS                  12
-rw-r--r--  src/syntax.c             372
3 files changed, 252 insertions, 165 deletions

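Note on the motivation: the problem described in the commit message can be illustrated outside Emacs. The following is a minimal sketch in plain C, not code from this commit. It is a toy scanner for `/* ... */` comments that can be stopped and resumed at any position. The names `struct toy_state`, `toy_scan` and the `pending` field are hypothetical; `pending` plays roughly the role that the new parser state field plays for two character delimiters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified parser state.  */
struct toy_state
{
  bool in_comment;  /* True while inside a comment.  */
  char pending;     /* Last char scanned, if it may be the first char
                       of a two character delimiter; 0 otherwise.  */
};

/* Scan LEN chars of TEXT, continuing from the state in *ST.  */
static void
toy_scan (struct toy_state *st, const char *text, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      char c = text[i];
      if (!st->in_comment && st->pending == '/' && c == '*')
        {
          st->in_comment = true;
          st->pending = 0;      /* The opener has been "used up".  */
        }
      else if (st->in_comment && st->pending == '*' && c == '/')
        {
          st->in_comment = false;
          st->pending = 0;      /* The closer has been "used up".  */
        }
      else if ((!st->in_comment && c == '/')
               || (st->in_comment && c == '*'))
        st->pending = c;        /* Possible first char of a delimiter.  */
      else
        st->pending = 0;
    }
}
```

Scanning "a /" and then "* b" with the same state ends inside a comment. If `pending` is cleared between the two calls, which is roughly the situation a resumed parse was in before this change, the comment opener is missed.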
diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index d5a7eba13fe..f81c1643c21 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -791,10 +791,10 @@ Hooks}).
 @subsection Parser State
 @cindex parser state
 
-  A @dfn{parser state} is a list of ten elements describing the state
-of the syntactic parser, after it parses the text between a specified
-starting point and a specified end point in the buffer. Parsing
-functions such as @code{syntax-ppss}
+  A @dfn{parser state} is a list of (currently) eleven elements
+describing the state of the syntactic parser, after it parses the text
+between a specified starting point and a specified end point in the
+buffer. Parsing functions such as @code{syntax-ppss}
 @ifnottex
 (@pxref{Position Parse})
 @end ifnottex
@@ -851,15 +851,20 @@ position where the string began. When outside of strings and comments,
 this element is @code{nil}.
 
 @item
-Internal data for continuing the parsing. The meaning of this
-data is subject to change; it is used if you pass this list
-as the @var{state} argument to another call.
+The list of the positions of the currently open parentheses, starting
+with the outermost.
+
+@item
+When the last buffer position scanned was the (potential) first
+character of a two character construct (comment delimiter or
+escaped/char-quoted character pair), the @var{syntax-code}
+(@pxref{Syntax Table Internals}) of that position. Otherwise
+@code{nil}.
 @end enumerate
 
   Elements 1, 2, and 6 are ignored in a state which you pass as an
-argument to continue parsing, and elements 8 and 9 are used only in
-trivial cases. Those elements are mainly used internally by the
-parser code.
+argument to continue parsing. Elements 9 and 10 are mainly used
+internally by the parser code.
 
   One additional piece of useful information is available from a
 parser state using this function:
@@ -898,11 +903,11 @@ The depth starts at 0, or at whatever is given in @var{state}.
 
 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
 stops when it comes to any character that starts a sexp. If
-@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
-start of an unnested comment. If @var{stop-comment} is the symbol
+@var{stop-comment} is non-@code{nil}, parsing stops after the start of
+an unnested comment. If @var{stop-comment} is the symbol
 @code{syntax-table}, parsing stops after the start of an unnested
-comment or a string, or the end of an unnested comment or a string,
-whichever comes first.
+comment or a string, or after the end of an unnested comment or a
+string, whichever comes first.
 
 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
 level of parenthesis structure, such as the beginning of a function
diff --git a/etc/NEWS b/etc/NEWS
index d963dee2c63..ea321539426 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -175,6 +175,18 @@ a new window when opening man pages when there's already one, use
          (inhibit-same-window . nil)
          (mode . Man-mode))))
 
++++
+** `parse-partial-sexp' state has a new element. Element 10 is
+non-nil when the last character scanned might be the first character
+of a two character construct, i.e. a comment delimiter or escaped
+character. Its value is the syntax of that last character.
+
++++
+** `parse-partial-sexp''s state, element 9, has now been confirmed as
+permanent and documented, and may be used by Lisp programs. Its value
+is a list of currently open parenthesis positions, starting with the
+outermost parenthesis.
+
 
 * Changes in Emacs 25.2 on Non-Free Operating Systems
 
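Note on the mask used in the syntax.c changes: the new predicate SYNTAX_FLAGS_COMSTARTEND_FIRST tests `flags & 0x50000`. The sketch below is standalone C, not Emacs code. It assumes that the "first char of a comment starter" and "first char of a comment ender" flags sit at bits 16 and 18, which is consistent with the accessors for bits 19 and 20 visible in the diff. The `TOY_` and `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout of the two character comment flags.  */
enum
{
  TOY_COMSTART_FIRST  = 1 << 16,
  TOY_COMSTART_SECOND = 1 << 17,
  TOY_COMEND_FIRST    = 1 << 18,
  TOY_COMEND_SECOND   = 1 << 19
};

/* True if FLAGS marks a char that can be the first char of either a
   two character comment starter or a two character comment ender.  */
static bool
toy_comstartend_first (int flags)
{
  return (flags & 0x50000) != 0;
}
```

Under that assumption, 0x50000 is just `TOY_COMSTART_FIRST | TOY_COMEND_FIRST`, so a single test covers both "might open a comment" and "might close a comment".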
diff --git a/src/syntax.c b/src/syntax.c
index fdcfdfc62a9..ffe0ea5e0d9 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -81,6 +81,11 @@ SYNTAX_FLAGS_COMEND_SECOND (int flags)
   return (flags >> 19) & 1;
 }
 static bool
+SYNTAX_FLAGS_COMSTARTEND_FIRST (int flags)
+{
+  return (flags & 0x50000) != 0;
+}
+static bool
 SYNTAX_FLAGS_PREFIX (int flags)
 {
   return (flags >> 20) & 1;
@@ -153,6 +158,10 @@ struct lisp_parse_state
     ptrdiff_t comstr_start;  /* Position of last comment/string starter. */
     Lisp_Object levelstarts; /* Char numbers of starts-of-expression
 				of levels (starting from outermost). */
+    int prev_syntax; /* Syntax of previous position scanned, when
+                        that position (potentially) holds the first char
+                        of a 2-char construct, i.e. comment delimiter
+                        or Sescape, etc. Smax otherwise. */
   };
 
 /* These variables are a cache for finding the start of a defun.
@@ -176,7 +185,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
 static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
 static void scan_sexps_forward (struct lisp_parse_state *,
                                 ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
-                                bool, Lisp_Object, int);
+                                bool, int);
+static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *);
 static bool in_classes (int, Lisp_Object);
 static void parse_sexp_propertize (ptrdiff_t charpos);
 
@@ -911,10 +921,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	}
       do
 	{
+          internalize_parse_state (Qnil, &state);
 	  scan_sexps_forward (&state,
 			      defun_start, defun_start_byte,
 			      comment_end, TYPE_MINIMUM (EMACS_INT),
-			      0, Qnil, 0);
+			      0, 0);
 	  defun_start = comment_end;
 	  if (!adjusted)
 	    {
@@ -2310,11 +2321,15 @@ in_classes (int c, Lisp_Object iso_classes)
    PREV_SYNTAX is the SYNTAX_WITH_FLAGS of the previous character
    (or 0 If the search cannot start in the middle of a two-character).
 
-   If successful, return true and store the charpos of the comment's end
-   into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR.
-   Else, return false and store the charpos STOP into *CHARPOS_PTR, the
-   corresponding bytepos into *BYTEPOS_PTR and the current nesting
-   (as defined for state.incomment) in *INCOMMENT_PTR.
+   If successful, return true and store the charpos of the comment's
+   end into *CHARPOS_PTR and the corresponding bytepos into
+   *BYTEPOS_PTR. Else, return false and store the charpos STOP into
+   *CHARPOS_PTR, the corresponding bytepos into *BYTEPOS_PTR and the
+   current nesting (as defined for state->incomment) in
+   *INCOMMENT_PTR. Should the last character scanned in an incomplete
+   comment be a possible first character of a two character construct,
+   we store its SYNTAX_WITH_FLAGS into *last_syntax_ptr. Otherwise,
+   we store Smax into *last_syntax_ptr.
 
    The comment end is the last character of the comment rather than the
    character just after the comment.
@@ -2326,7 +2341,7 @@ static bool
 forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	      EMACS_INT nesting, int style, int prev_syntax,
 	      ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr,
-	      EMACS_INT *incomment_ptr)
+	      EMACS_INT *incomment_ptr, int *last_syntax_ptr)
 {
   register int c, c1;
   register enum syntaxcode code;
@@ -2337,7 +2352,8 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
   /* Enter the loop in the middle so that we find
      a 2-char comment ender if we start in the middle of it. */
   syntax = prev_syntax;
-  if (syntax != 0) goto forw_incomment;
+  code = syntax & 0xff;
+  if (syntax != 0 && from < stop) goto forw_incomment;
 
   while (1)
     {
@@ -2346,6 +2362,12 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	  *incomment_ptr = nesting;
 	  *charpos_ptr = from;
 	  *bytepos_ptr = from_byte;
+          *last_syntax_ptr =
+            (code == Sescape || code == Scharquote
+             || SYNTAX_FLAGS_COMEND_FIRST (syntax)
+             || (nesting > 0
+                 && SYNTAX_FLAGS_COMSTART_FIRST (syntax)))
+            ? syntax : Smax ;
 	  return 0;
 	}
       c = FETCH_CHAR_AS_MULTIBYTE (from_byte);
@@ -2386,7 +2408,9 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	       SYNTAX_FLAGS_COMMENT_NESTED (other_syntax))
 	      ? nesting > 0 : nesting < 0))
 	{
-	  if (--nesting <= 0)
+          syntax = Smax;        /* So that "|#" (lisp) can not return
+                                   the syntax of "#" in *last_syntax_ptr. */
+          if (--nesting <= 0)
 	    /* We have encountered a comment end of the same style
 	       as the comment sequence which began this comment section. */
 	    break;
@@ -2408,6 +2432,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	/* We have encountered a nested comment of the same style
 	   as the comment sequence which began this comment section. */
 	{
+          syntax = Smax; /* So that "#|#" isn't also a comment ender. */
 	  INC_BOTH (from, from_byte);
 	  UPDATE_SYNTAX_TABLE_FORWARD (from);
 	  nesting++;
@@ -2415,6 +2440,8 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
     }
   *charpos_ptr = from;
   *bytepos_ptr = from_byte;
+  *last_syntax_ptr = Smax; /* Any syntactic power the last byte had is
+                              used up. */
   return 1;
 }
 
@@ -2436,6 +2463,7 @@ between them, return t; otherwise return nil. */)
   EMACS_INT count1;
   ptrdiff_t out_charpos, out_bytepos;
   EMACS_INT dummy;
+  int dummy2;
 
   CHECK_NUMBER (count);
   count1 = XINT (count);
@@ -2499,7 +2527,7 @@ between them, return t; otherwise return nil. */)
 	    }
 	  /* We're at the start of a comment. */
 	  found = forw_comment (from, from_byte, stop, comnested, comstyle, 0,
-				&out_charpos, &out_bytepos, &dummy);
+				&out_charpos, &out_bytepos, &dummy, &dummy2);
 	  from = out_charpos; from_byte = out_bytepos;
 	  if (!found)
 	    {
@@ -2659,6 +2687,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
   ptrdiff_t from_byte;
   ptrdiff_t out_bytepos, out_charpos;
   EMACS_INT dummy;
+  int dummy2;
   bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol;
 
   if (depth > 0) min_depth = 0;
@@ -2755,7 +2784,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
 	      UPDATE_SYNTAX_TABLE_FORWARD (from);
 	      found = forw_comment (from, from_byte, stop,
 				    comnested, comstyle, 0,
-				    &out_charpos, &out_bytepos, &dummy);
+				    &out_charpos, &out_bytepos, &dummy,
+                                    &dummy2);
 	      from = out_charpos, from_byte = out_bytepos;
 	      if (!found)
 		{
@@ -3119,7 +3149,7 @@ the prefix syntax flag (p). */)
 }
 
 /* Parse forward from FROM / FROM_BYTE to END,
-   assuming that FROM has state OLDSTATE (nil means FROM is start of function),
+   assuming that FROM has state STATE,
    and return a description of the state of the parse at END.
    If STOPBEFORE, stop at the start of an atom.
    If COMMENTSTOP is 1, stop at the start of a comment.
@@ -3127,12 +3157,11 @@ the prefix syntax flag (p). */)
    after the beginning of a string, or after the end of a string. */
 
 static void
-scan_sexps_forward (struct lisp_parse_state *stateptr,
+scan_sexps_forward (struct lisp_parse_state *state,
 		    ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end,
 		    EMACS_INT targetdepth, bool stopbefore,
-		    Lisp_Object oldstate, int commentstop)
+		    int commentstop)
 {
-  struct lisp_parse_state state;
   enum syntaxcode code;
   int c1;
   bool comnested;
@@ -3148,7 +3177,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
   Lisp_Object tem;
   ptrdiff_t prev_from;		/* Keep one character before FROM. */
   ptrdiff_t prev_from_byte;
-  int prev_from_syntax;
+  int prev_from_syntax, prev_prev_from_syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3165,6 +3194,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
 do { prev_from = from; \
      prev_from_byte = from_byte; \
      temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte); \
+     prev_prev_from_syntax = prev_from_syntax; \
      prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \
      INC_BOTH (from, from_byte); \
      if (from < end) \
@@ -3174,88 +3204,38 @@ do { prev_from = from; \
   immediate_quit = 1;
   QUIT;
 
-  if (NILP (oldstate))
-    {
-      depth = 0;
-      state.instring = -1;
-      state.incomment = 0;
-      state.comstyle = 0;	/* comment style a by default. */
-      state.comstr_start = -1;	/* no comment/string seen. */
-    }
-  else
-    {
-      tem = Fcar (oldstate);
-      if (!NILP (tem))
-	depth = XINT (tem);
-      else
-	depth = 0;
-
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      /* Check whether we are inside string_fence-style string: */
-      state.instring = (!NILP (tem)
-			? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
-			: -1);
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.incomment = (!NILP (tem)
-			 ? (INTEGERP (tem) ? XINT (tem) : -1)
-			 : 0);
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      start_quoted = !NILP (tem);
+  depth = state->depth;
+  start_quoted = state->quoted;
+  prev_prev_from_syntax = Smax;
+  prev_from_syntax = state->prev_syntax;
 
-      /* if the eighth element of the list is nil, we are in comment
-	 style a. If it is non-nil, we are in comment style b */
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstyle = (NILP (tem)
-			? 0
-			: (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
-			   ? XINT (tem)
-			   : ST_COMMENT_STYLE));
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstr_start =
-	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      while (!NILP (tem))		/* >= second enclosing sexps. */
-	{
-	  Lisp_Object temhd = Fcar (tem);
-	  if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
-	    curlevel->last = XINT (temhd);
-	  if (++curlevel == endlevel)
-	    curlevel--; /* error ("Nesting too deep for parser"); */
-	  curlevel->prev = -1;
-	  curlevel->last = -1;
-	  tem = Fcdr (tem);
-	}
+  tem = state->levelstarts;
+  while (!NILP (tem))		/* >= second enclosing sexps. */
+    {
+      Lisp_Object temhd = Fcar (tem);
+      if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
+        curlevel->last = XINT (temhd);
+      if (++curlevel == endlevel)
+        curlevel--; /* error ("Nesting too deep for parser"); */
+      curlevel->prev = -1;
+      curlevel->last = -1;
+      tem = Fcdr (tem);
     }
-  state.quoted = 0;
-  mindepth = depth;
-
   curlevel->prev = -1;
   curlevel->last = -1;
 
-  SETUP_SYNTAX_TABLE (prev_from, 1);
-  temp = FETCH_CHAR (prev_from_byte);
-  prev_from_syntax = SYNTAX_WITH_FLAGS (temp);
-  UPDATE_SYNTAX_TABLE_FORWARD (from);
+  state->quoted = 0;
+  mindepth = depth;
+
+  SETUP_SYNTAX_TABLE (from, 1);
 
   /* Enter the loop at a place appropriate for initial state. */
 
-  if (state.incomment)
+  if (state->incomment)
     goto startincomment;
-  if (state.instring >= 0)
+  if (state->instring >= 0)
     {
-      nofence = state.instring != ST_STRING_STYLE;
+      nofence = state->instring != ST_STRING_STYLE;
       if (start_quoted)
 	goto startquotedinstring;
       goto startinstring;
@@ -3266,11 +3246,8 @@ do { prev_from = from; \
   while (from < end)
     {
       int syntax;
-      INC_FROM;
-      code = prev_from_syntax & 0xff;
 
-      if (from < end
-	  && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
 	  && (c1 = FETCH_CHAR (from_byte),
 	      syntax = SYNTAX_WITH_FLAGS (c1),
 	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
@@ -3280,32 +3257,39 @@ do { prev_from = from; \
 	  /* Record the comment style we have entered so that only
 	     the comment-end sequence of the same style actually
 	     terminates the comment section. */
-	  state.comstyle
+	  state->comstyle
 	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
 	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
 		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-	  state.incomment = comnested ? 1 : -1;
-	  state.comstr_start = prev_from;
+	  state->incomment = comnested ? 1 : -1;
+	  state->comstr_start = prev_from;
 	  INC_FROM;
+          prev_from_syntax = Smax; /* the syntax has already been
+                                      "used up". */
 	  code = Scomment;
 	}
-      else if (code == Scomment_fence)
+      else
 	{
-	  /* Record the comment style we have entered so that only
-	     the comment-end sequence of the same style actually
-	     terminates the comment section. */
-	  state.comstyle = ST_COMMENT_STYLE;
-	  state.incomment = -1;
-	  state.comstr_start = prev_from;
-	  code = Scomment;
-	}
-      else if (code == Scomment)
-	{
-	  state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-	  state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-			     1 : -1);
-	  state.comstr_start = prev_from;
-	}
+          INC_FROM;
+          code = prev_from_syntax & 0xff;
+          if (code == Scomment_fence)
+            {
+              /* Record the comment style we have entered so that only
+                 the comment-end sequence of the same style actually
+                 terminates the comment section. */
+              state->comstyle = ST_COMMENT_STYLE;
+              state->incomment = -1;
+              state->comstr_start = prev_from;
+              code = Scomment;
+            }
+          else if (code == Scomment)
+            {
+              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                                  1 : -1);
+              state->comstr_start = prev_from;
+            }
+        }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
 	continue;
@@ -3350,26 +3334,28 @@ do { prev_from = from; \
 
 	case Scomment_fence: /* Can't happen because it's handled above. */
 	case Scomment:
-	  if (commentstop || boundary_stop) goto done;
+          if (commentstop || boundary_stop) goto done;
 	startincomment:
 	  /* The (from == BEGV) test was to enter the loop in the middle so
 	     that we find a 2-char comment ender even if we start in the
 	     middle of it. We don't want to do that if we're just at the
 	     beginning of the comment (think of (*) ... (*)). */
 	  found = forw_comment (from, from_byte, end,
-				state.incomment, state.comstyle,
-				(from == BEGV || from < state.comstr_start + 3)
-				? 0 : prev_from_syntax,
-				&out_charpos, &out_bytepos, &state.incomment);
+				state->incomment, state->comstyle,
+				from == BEGV ? 0 : prev_from_syntax,
+				&out_charpos, &out_bytepos, &state->incomment,
+                                &prev_from_syntax);
 	  from = out_charpos; from_byte = out_bytepos;
-	  /* Beware! prev_from and friends are invalid now.
-	     Luckily, the `done' doesn't use them and the INC_FROM
-	     sets them to a sane value without looking at them. */
+	  /* Beware! prev_from and friends (except prev_from_syntax)
+	     are invalid now. Luckily, the `done' doesn't use them
+	     and the INC_FROM sets them to a sane value without
+	     looking at them. */
 	  if (!found) goto done;
 	  INC_FROM;
-	  state.incomment = 0;
-	  state.comstyle = 0;	/* reset the comment style */
-	  if (boundary_stop) goto done;
+	  state->incomment = 0;
+	  state->comstyle = 0;	/* reset the comment style */
+	  prev_from_syntax = Smax; /* For the comment closer */
+          if (boundary_stop) goto done;
 	  break;
 
 	case Sopen:
@@ -3396,16 +3382,16 @@ do { prev_from = from; \
 
 	case Sstring:
 	case Sstring_fence:
-	  state.comstr_start = from - 1;
+	  state->comstr_start = from - 1;
 	  if (stopbefore) goto stop;  /* this arg means stop at sexp start */
 	  curlevel->last = prev_from;
-	  state.instring = (code == Sstring
+	  state->instring = (code == Sstring
 			    ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte))
 			    : ST_STRING_STYLE);
 	  if (boundary_stop) goto done;
 	startinstring:
 	  {
-	    nofence = state.instring != ST_STRING_STYLE;
+	    nofence = state->instring != ST_STRING_STYLE;
 
 	    while (1)
 	      {
@@ -3419,7 +3405,7 @@ do { prev_from = from; \
 		    /* Check C_CODE here so that if the char has
 		       a syntax-table property which says it is NOT
 		       a string character, it does not end the string. */
-		    if (nofence && c == state.instring && c_code == Sstring)
+		    if (nofence && c == state->instring && c_code == Sstring)
 		      break;
 
 		    switch (c_code)
@@ -3442,7 +3428,7 @@ do { prev_from = from; \
 		  }
 	      }
 	  string_end:
-	  state.instring = -1;
+	  state->instring = -1;
 	  curlevel->prev = curlevel->last;
 	  INC_FROM;
 	  if (boundary_stop) goto done;
@@ -3461,25 +3447,96 @@ do { prev_from = from; \
3461 stop: /* Here if stopping before start of sexp. */ 3447 stop: /* Here if stopping before start of sexp. */
3462 from = prev_from; /* We have just fetched the char that starts it; */ 3448 from = prev_from; /* We have just fetched the char that starts it; */
3463 from_byte = prev_from_byte; 3449 from_byte = prev_from_byte;
3450 prev_from_syntax = prev_prev_from_syntax;
3464 goto done; /* but return the position before it. */ 3451 goto done; /* but return the position before it. */
3465 3452
3466 endquoted: 3453 endquoted:
3467 state.quoted = 1; 3454 state->quoted = 1;
3468 done: 3455 done:
3469 state.depth = depth; 3456 state->depth = depth;
3470 state.mindepth = mindepth; 3457 state->mindepth = mindepth;
3471 state.thislevelstart = curlevel->prev; 3458 state->thislevelstart = curlevel->prev;
3472 state.prevlevelstart 3459 state->prevlevelstart
3473 = (curlevel == levelstart) ? -1 : (curlevel - 1)->last; 3460 = (curlevel == levelstart) ? -1 : (curlevel - 1)->last;
3474 state.location = from; 3461 state->location = from;
3475 state.location_byte = from_byte; 3462 state->location_byte = from_byte;
3476 state.levelstarts = Qnil; 3463 state->levelstarts = Qnil;
3477 while (curlevel > levelstart) 3464 while (curlevel > levelstart)
3478 state.levelstarts = Fcons (make_number ((--curlevel)->last), 3465 state->levelstarts = Fcons (make_number ((--curlevel)->last),
3479 state.levelstarts); 3466 state->levelstarts);
3467 state->prev_syntax = (SYNTAX_FLAGS_COMSTARTEND_FIRST (prev_from_syntax)
3468 || state->quoted) ? prev_from_syntax : Smax;
3480 immediate_quit = 0; 3469 immediate_quit = 0;
3470}
3471
3472/* Convert a (lisp) parse state to the internal form used in
3473 scan_sexps_forward. */
3474static void
3475internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state)
3476{
3477 Lisp_Object tem;
3478
3479 if (NILP (external))
3480 {
3481 state->depth = 0;
3482 state->instring = -1;
3483 state->incomment = 0;
3484 state->quoted = 0;
3485 state->comstyle = 0; /* comment style a by default. */
3486 state->comstr_start = -1; /* no comment/string seen. */
3487 state->levelstarts = Qnil;
3488 state->prev_syntax = Smax;
3489 }
3490 else
3491 {
3492 tem = Fcar (external);
3493 if (!NILP (tem))
3494 state->depth = XINT (tem);
3495 else
3496 state->depth = 0;
3497
3498 external = Fcdr (external);
3499 external = Fcdr (external);
3500 external = Fcdr (external);
3501 tem = Fcar (external);
3502 /* Check whether we are inside string_fence-style string: */
3503 state->instring = (!NILP (tem)
3504 ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
3505 : -1);
3506
3507 external = Fcdr (external);
3508 tem = Fcar (external);
3509 state->incomment = (!NILP (tem)
3510 ? (INTEGERP (tem) ? XINT (tem) : -1)
3511 : 0);
3512
3513 external = Fcdr (external);
3514 tem = Fcar (external);
3515 state->quoted = !NILP (tem);
3481 3516
3482 *stateptr = state; 3517 /* if the eighth element of the list is nil, we are in comment
3518 style a. If it is non-nil, we are in comment style b */
3519 external = Fcdr (external);
3520 external = Fcdr (external);
3521 tem = Fcar (external);
3522 state->comstyle = (NILP (tem)
3523 ? 0
3524 : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
3525 ? XINT (tem)
3526 : ST_COMMENT_STYLE));
3527
3528 external = Fcdr (external);
3529 tem = Fcar (external);
3530 state->comstr_start =
3531 RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
3532 external = Fcdr (external);
3533 tem = Fcar (external);
3534 state->levelstarts = tem;
3535
3536 external = Fcdr (external);
3537 tem = Fcar (external);
3538 state->prev_syntax = NILP (tem) ? Smax : XINT (tem);
3539 }
3483} 3540}
3484 3541
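The element-10 handling in internalize_parse_state above can be modeled in isolation. This is a minimal sketch, not the Emacs code: the Lisp list is replaced by a plain int array, and `NIL_MARK`, `SMAX_MODEL`, `nth_or_nil` and `internalize_prev_syntax` are hypothetical names invented for illustration. It relies on the fact that in Emacs Fcar and Fcdr of nil return nil, so a 10-element state list from before this change reads element 10 as nil and falls back to Smax.

```c
#include <stddef.h>

/* Hypothetical stand-ins: NIL_MARK plays the role of Qnil, and
   SMAX_MODEL plays the role of the Smax sentinel.  */
enum { NIL_MARK = -9999, SMAX_MODEL = 16 };

/* Model of taking Fcar after N Fcdr steps: running off the end of
   the list yields nil rather than an error.  */
static int
nth_or_nil (const int *list, size_t len, size_t n)
{
  return n < len ? list[n] : NIL_MARK;
}

/* Model of the element-10 conversion: nil means there is no pending
   first character of a two character construct.  */
static int
internalize_prev_syntax (const int *list, size_t len)
{
  int tem = nth_or_nil (list, len, 10);
  return tem == NIL_MARK ? SMAX_MODEL : tem;
}
```

Under these assumptions, a state list without element 10 behaves the same as one whose element 10 is nil.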
3485DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0, 3542DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
@@ -3488,6 +3545,7 @@ Parsing stops at TO or when certain criteria are met;
3488 point is set to where parsing stops. 3545 point is set to where parsing stops.
3489If fifth arg OLDSTATE is omitted or nil, 3546If fifth arg OLDSTATE is omitted or nil,
3490 parsing assumes that FROM is the beginning of a function. 3547 parsing assumes that FROM is the beginning of a function.
3548
3491Value is a list of elements describing final state of parsing: 3549Value is a list of elements describing final state of parsing:
3492 0. depth in parens. 3550 0. depth in parens.
3493 1. character address of start of innermost containing list; nil if none. 3551 1. character address of start of innermost containing list; nil if none.
@@ -3501,16 +3559,22 @@ Value is a list of elements describing final state of parsing:
3501 6. the minimum paren-depth encountered during this scan. 3559 6. the minimum paren-depth encountered during this scan.
3502 7. style of comment, if any. 3560 7. style of comment, if any.
3503 8. character address of start of comment or string; nil if not in one. 3561 8. character address of start of comment or string; nil if not in one.
3504 9. Intermediate data for continuation of parsing (subject to change). 3562 9. List of positions of currently open parens, outermost first.
356310. When the last position scanned holds the first character of a
3564 (potential) two character construct, the syntax of that position,
3565 otherwise nil. That construct can be a two character comment
3566 delimiter or an Escaped or Char-quoted character.
356711..... Possible further internal information used by `parse-partial-sexp'.
3568
3505If third arg TARGETDEPTH is non-nil, parsing stops if the depth 3569If third arg TARGETDEPTH is non-nil, parsing stops if the depth
3506in parentheses becomes equal to TARGETDEPTH. 3570in parentheses becomes equal to TARGETDEPTH.
3507Fourth arg STOPBEFORE non-nil means stop when come to 3571Fourth arg STOPBEFORE non-nil means stop when we come to
3508 any character that starts a sexp. 3572 any character that starts a sexp.
3509Fifth arg OLDSTATE is a list like what this function returns. 3573Fifth arg OLDSTATE is a list like what this function returns.
3510 It is used to initialize the state of the parse. Elements number 1, 2, 6 3574 It is used to initialize the state of the parse. Elements number 1, 2, 6
3511 are ignored. 3575 are ignored.
3512Sixth arg COMMENTSTOP non-nil means stop at the start of a comment. 3576Sixth arg COMMENTSTOP non-nil means stop after the start of a comment.
3513 If it is symbol `syntax-table', stop after the start of a comment or a 3577 If it is the symbol `syntax-table', stop after the start of a comment or a
3514 string, or after end of a comment or a string. */) 3578 string, or after end of a comment or a string. */)
3515 (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth, 3579 (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth,
3516 Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop) 3580 Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop)
@@ -3527,15 +3591,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
3527 target = TYPE_MINIMUM (EMACS_INT); /* We won't reach this depth. */ 3591 target = TYPE_MINIMUM (EMACS_INT); /* We won't reach this depth. */
3528 3592
3529 validate_region (&from, &to); 3593 validate_region (&from, &to);
3594 internalize_parse_state (oldstate, &state);
3530 scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)), 3595 scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)),
3531 XINT (to), 3596 XINT (to),
3532 target, !NILP (stopbefore), oldstate, 3597 target, !NILP (stopbefore),
3533 (NILP (commentstop) 3598 (NILP (commentstop)
3534 ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1))); 3599 ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1)));
3535 3600
3536 SET_PT_BOTH (state.location, state.location_byte); 3601 SET_PT_BOTH (state.location, state.location_byte);
3537 3602
3538 return Fcons (make_number (state.depth), 3603 return
3604 Fcons (make_number (state.depth),
3539 Fcons (state.prevlevelstart < 0 3605 Fcons (state.prevlevelstart < 0
3540 ? Qnil : make_number (state.prevlevelstart), 3606 ? Qnil : make_number (state.prevlevelstart),
3541 Fcons (state.thislevelstart < 0 3607 Fcons (state.thislevelstart < 0
@@ -3553,11 +3619,15 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
3553 ? Qsyntax_table 3619 ? Qsyntax_table
3554 : make_number (state.comstyle)) 3620 : make_number (state.comstyle))
3555 : Qnil), 3621 : Qnil),
3556 Fcons (((state.incomment 3622 Fcons (((state.incomment
3557 || (state.instring >= 0)) 3623 || (state.instring >= 0))
3558 ? make_number (state.comstr_start) 3624 ? make_number (state.comstr_start)
3559 : Qnil), 3625 : Qnil),
3560 Fcons (state.levelstarts, Qnil)))))))))); 3626 Fcons (state.levelstarts,
3627 Fcons (state.prev_syntax == Smax
3628 ? Qnil
3629 : make_number (state.prev_syntax),
3630 Qnil)))))))))));
3561} 3631}
3562 3632
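The rule for the new element 10 has two halves: the assignment to state->prev_syntax at `done:` in scan_sexps_forward, and the Smax-to-nil conversion in the return value above. The following is a rough, self-contained sketch of both. The flag bit positions, the `NO_SYNTAX` and `EXT_NIL` constants, and all three function names are assumptions made for illustration. The definition of SYNTAX_FLAGS_COMSTARTEND_FIRST is not part of this hunk, so the predicate here only guesses at its intent.

```c
#include <stdbool.h>

/* Hypothetical constants: NO_SYNTAX models Smax, EXT_NIL models Qnil,
   and the two flag bits are assumed positions, not taken from the diff.  */
enum { NO_SYNTAX = 16, EXT_NIL = -1 };
enum { COMSTART_FIRST_BIT = 1 << 16, COMEND_FIRST_BIT = 1 << 18 };

/* Guess at the predicate: true if the character may be the first
   character of a two character comment delimiter.  */
static bool
comstartend_first (int syntax_with_flags)
{
  return (syntax_with_flags
          & (COMSTART_FIRST_BIT | COMEND_FIRST_BIT)) != 0;
}

/* Mirrors the assignment at `done:': keep the syntax only when the
   last character could begin a two character construct or the scan
   ended just after a quoting character.  */
static int
final_prev_syntax (int prev_from_syntax, bool quoted)
{
  return (comstartend_first (prev_from_syntax) || quoted)
         ? prev_from_syntax : NO_SYNTAX;
}

/* Mirrors the consing of element 10: the sentinel becomes nil.  */
static int
externalize_prev_syntax (int prev_syntax)
{
  return prev_syntax == NO_SYNTAX ? EXT_NIL : prev_syntax;
}
```

In this model, a scan that stops right after the `/` of a possible `/*` reports the syntax of `/` in element 10. A scan that stops after an ordinary character reports nil.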
3563void 3633void