| author | Richard M. Stallman | 2006-12-17 22:02:52 +0000 |
|---|---|---|
| committer | Richard M. Stallman | 2006-12-17 22:02:52 +0000 |
| commit | fe963f844bbb6feaf626e1aa096c12e7681f603d (patch) | |
| tree | d5e342a284e5439518512814dda664fa3d616996 | |
| parent | 07af30248aef20ee769333daf00f37fbeb1bbfce (diff) | |
(Parsing Expressions): Split up node.
(Motion via Parsing, Position Parse, Parser State)
(Low-Level Parsing, Control Parsing): New subnodes.
(Parser State): Document syntax-ppss-toplevel-pos.
| -rw-r--r-- | lispref/ChangeLog | 7 | ||||
| -rw-r--r-- | lispref/syntax.texi | 307 |
2 files changed, 182 insertions, 132 deletions
diff --git a/lispref/ChangeLog b/lispref/ChangeLog index e2bfbf2680e..0e0079c73f3 100644 --- a/lispref/ChangeLog +++ b/lispref/ChangeLog | |||
| @@ -1,5 +1,12 @@ | |||
| 1 | 2006-12-17 Richard Stallman <rms@gnu.org> | 1 | 2006-12-17 Richard Stallman <rms@gnu.org> |
| 2 | 2 | ||
| 3 | * syntax.texi (Parsing Expressions): Split up node. | ||
| 4 | (Motion via Parsing, Position Parse, Parser State) | ||
| 5 | (Low-Level Parsing, Control Parsing): New subnodes. | ||
| 6 | (Parser State): Document syntax-ppss-toplevel-pos. | ||
| 7 | |||
| 8 | * positions.texi (List Motion): Punctuation fix. | ||
| 9 | |||
| 3 | * files.texi (File Name Completion): Document PREDICATE arg | 10 | * files.texi (File Name Completion): Document PREDICATE arg |
| 4 | to file-name-completion. | 11 | to file-name-completion. |
| 5 | 12 | ||
diff --git a/lispref/syntax.texi b/lispref/syntax.texi index 54b0d4a0bc0..4458547f7d2 100644 --- a/lispref/syntax.texi +++ b/lispref/syntax.texi | |||
| @@ -597,26 +597,26 @@ expression prefix syntax class, and characters with the @samp{p} flag. | |||
| 597 | @end defun | 597 | @end defun |
| 598 | 598 | ||
| 599 | @node Parsing Expressions | 599 | @node Parsing Expressions |
| 600 | @section Parsing Balanced Expressions | 600 | @section Parsing Expressions |
| 601 | 601 | ||
| 602 | Here are several functions for parsing and scanning balanced | 602 | This section describes functions for parsing and scanning balanced |
| 603 | expressions, also known as @dfn{sexps}. Basically, a sexp is either a | 603 | expressions, also known as @dfn{sexps}. Basically, a sexp is either a |
| 604 | balanced parenthetical grouping, or a symbol name (a sequence of | 604 | balanced parenthetical grouping, a string, or a symbol name (a |
| 605 | characters whose syntax is either word constituent or symbol | 605 | sequence of characters whose syntax is either word constituent or |
| 606 | constituent). However, characters whose syntax is expression prefix | 606 | symbol constituent). However, characters whose syntax is expression |
| 607 | are treated as part of the sexp if they appear next to it. | 607 | prefix are treated as part of the sexp if they appear next to it. |
| 608 | 608 | ||
| 609 | The syntax table controls the interpretation of characters, so these | 609 | The syntax table controls the interpretation of characters, so these |
| 610 | functions can be used for Lisp expressions when in Lisp mode and for C | 610 | functions can be used for Lisp expressions when in Lisp mode and for C |
| 611 | expressions when in C mode. @xref{List Motion}, for convenient | 611 | expressions when in C mode. @xref{List Motion}, for convenient |
| 612 | higher-level functions for moving over balanced expressions. | 612 | higher-level functions for moving over balanced expressions. |
| 613 | 613 | ||
| 614 | A syntax table only describes how each character changes the state | 614 | A character's syntax controls how it changes the state of the |
| 615 | of the parser, rather than describing the state itself. For example, | 615 | parser, rather than describing the state itself. For example, a |
| 616 | a string delimiter character toggles the parser state between | 616 | string delimiter character toggles the parser state between |
| 617 | ``in-string'' and ``in-code'' but the characters inside the string do | 617 | ``in-string'' and ``in-code,'' but the syntax of characters does not |
| 618 | not have any particular syntax to identify them as such. For example | 618 | directly say whether they are inside a string. For example (note that |
| 619 | (note that 15 is the syntax code for generic string delimiters), | 619 | 15 is the syntax code for generic string delimiters), |
| 620 | 620 | ||
| 621 | @example | 621 | @example |
| 622 | (put-text-property 1 9 'syntax-table '(15 . nil)) | 622 | (put-text-property 1 9 'syntax-table '(15 . nil)) |
| @@ -627,46 +627,128 @@ does not tell Emacs that the first eight chars of the current buffer | |||
| 627 | are a string, but rather that they are all string delimiters. As a | 627 | are a string, but rather that they are all string delimiters. As a |
| 628 | result, Emacs treats them as four consecutive empty string constants. | 628 | result, Emacs treats them as four consecutive empty string constants. |
| 629 | 629 | ||
| 630 | Every time you use the parser, you specify it a starting state as | 630 | @menu |
| 631 | well as a starting position. If you omit the starting state, the | 631 | * Motion via Parsing:: Motion functions that work by parsing. |
| 632 | default is ``top level in parenthesis structure,'' as it would be at | 632 | * Position Parse:: Determining the syntactic state of a position. |
| 633 | the beginning of a function definition. (This is the case for | 633 | * Parser State:: How Emacs represents a syntactic state. |
| 634 | @code{forward-sexp}, which blindly assumes that the starting point is | 634 | * Low-Level Parsing:: Parsing across a specified region. |
| 635 | in such a state.) | 635 | * Control Parsing:: Parameters that affect parsing. |
| 636 | @end menu | ||
| 636 | 637 | ||
| 637 | @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment | 638 | @node Motion via Parsing |
| 638 | This function parses a sexp in the current buffer starting at | 639 | @subsection Motion Commands Based on Parsing |
| 639 | @var{start}, not scanning past @var{limit}. It stops at position | ||
| 640 | @var{limit} or when certain criteria described below are met, and sets | ||
| 641 | point to the location where parsing stops. It returns a value | ||
| 642 | describing the status of the parse at the point where it stops. | ||
| 643 | 640 | ||
| 644 | If @var{state} is @code{nil}, @var{start} is assumed to be at the top | 641 | This section describes simple point-motion functions that operate |
| 645 | level of parenthesis structure, such as the beginning of a function | 642 | based on parsing expressions. |
| 646 | definition. Alternatively, you might wish to resume parsing in the | ||
| 647 | middle of the structure. To do this, you must provide a @var{state} | ||
| 648 | argument that describes the initial status of parsing. | ||
| 649 | 643 | ||
| 650 | @cindex parenthesis depth | 644 | @defun scan-lists from count depth |
| 651 | If the third argument @var{target-depth} is non-@code{nil}, parsing | 645 | This function scans forward @var{count} balanced parenthetical groupings |
| 652 | stops if the depth in parentheses becomes equal to @var{target-depth}. | 646 | from position @var{from}. It returns the position where the scan stops. |
| 653 | The depth starts at 0, or at whatever is given in @var{state}. | 647 | If @var{count} is negative, the scan moves backwards. |
| 654 | 648 | ||
| 655 | If the fourth argument @var{stop-before} is non-@code{nil}, parsing | 649 | If @var{depth} is nonzero, parenthesis depth counting begins from that |
| 656 | stops when it comes to any character that starts a sexp. If | 650 | value. The only candidates for stopping are places where the depth in |
| 657 | @var{stop-comment} is non-@code{nil}, parsing stops when it comes to the | 651 | parentheses becomes zero; @code{scan-lists} counts @var{count} such |
| 658 | start of a comment. If @var{stop-comment} is the symbol | 652 | places and then stops. Thus, a positive value for @var{depth} means go |
| 659 | @code{syntax-table}, parsing stops after the start of a comment or a | 653 | out @var{depth} levels of parenthesis. |
| 660 | string, or the end of a comment or a string, whichever comes first. | 654 | |
| 655 | Scanning ignores comments if @code{parse-sexp-ignore-comments} is | ||
| 656 | non-@code{nil}. | ||
| 657 | |||
| 658 | If the scan reaches the beginning or end of the buffer (or its | ||
| 659 | accessible portion), and the depth is not zero, an error is signaled. | ||
| 660 | If the depth is zero but the count is not used up, @code{nil} is | ||
| 661 | returned. | ||
| 662 | @end defun | ||
| 663 | |||
| 664 | @defun scan-sexps from count | ||
| 665 | This function scans forward @var{count} sexps from position @var{from}. | ||
| 666 | It returns the position where the scan stops. If @var{count} is | ||
| 667 | negative, the scan moves backwards. | ||
| 668 | |||
| 669 | Scanning ignores comments if @code{parse-sexp-ignore-comments} is | ||
| 670 | non-@code{nil}. | ||
| 671 | |||
| 672 | If the scan reaches the beginning or end of (the accessible part of) the | ||
| 673 | buffer while in the middle of a parenthetical grouping, an error is | ||
| 674 | signaled. If it reaches the beginning or end between groupings but | ||
| 675 | before count is used up, @code{nil} is returned. | ||
| 676 | @end defun | ||
| 677 | |||
| 678 | @defun forward-comment count | ||
| 679 | This function moves point forward across @var{count} complete comments | ||
| 680 | (that is, including the starting delimiter and the terminating | ||
| 681 | delimiter if any), plus any whitespace encountered on the way. It | ||
| 682 | moves backward if @var{count} is negative. If it encounters anything | ||
| 683 | other than a comment or whitespace, it stops, leaving point at the | ||
| 684 | place where it stopped. This includes (for instance) finding the end | ||
| 685 | of a comment when moving forward and expecting the beginning of one. | ||
| 686 | The function also stops immediately after moving over the specified | ||
| 687 | number of complete comments. If @var{count} comments are found as | ||
| 688 | expected, with nothing except whitespace between them, it returns | ||
| 689 | @code{t}; otherwise it returns @code{nil}. | ||
| 690 | |||
| 691 | This function cannot tell whether the ``comments'' it traverses are | ||
| 692 | embedded within a string. If they look like comments, it treats them | ||
| 693 | as comments. | ||
| 694 | @end defun | ||
| 695 | |||
| 696 | To move forward over all comments and whitespace following point, use | ||
| 697 | @code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good | ||
| 698 | argument to use, because the number of comments in the buffer cannot | ||
| 699 | exceed that many. | ||
| 700 | |||
| 701 | @node Position Parse | ||
| 702 | @subsection Finding the Parse State for a Position | ||
| 703 | |||
| 704 | For syntactic analysis, such as in indentation, often the useful | ||
| 705 | thing is to compute the syntactic state corresponding to a given buffer | ||
| 706 | position. This function does that conveniently. | ||
| 707 | |||
| 708 | @defun syntax-ppss &optional pos | ||
| 709 | This function returns the parser state (see next section) that the | ||
| 710 | parser would reach at position @var{pos} starting from the beginning | ||
| 711 | of the buffer. This is equivalent to @code{(parse-partial-sexp | ||
| 712 | (point-min) @var{pos})}, except that @code{syntax-ppss} uses a cache | ||
| 713 | to speed up the computation. Due to this optimization, the 2nd value | ||
| 714 | (previous complete subexpression) and 6th value (minimum parenthesis | ||
| 715 | depth) of the returned parser state are not meaningful. | ||
| 716 | @end defun | ||
| 717 | |||
| 718 | @code{syntax-ppss} automatically hooks itself to | ||
| 719 | @code{before-change-functions} to keep its cache consistent. But | ||
| 720 | updating can fail if @code{syntax-ppss} is called while | ||
| 721 | @code{before-change-functions} is temporarily let-bound, or if the | ||
| 722 | buffer is modified without obeying the hook, such as when using | ||
| 723 | @code{inhibit-modification-hooks}. For this reason, it is sometimes | ||
| 724 | necessary to flush the cache manually. | ||
| 725 | |||
| 726 | @defun syntax-ppss-flush-cache beg | ||
| 727 | This function flushes the cache used by @code{syntax-ppss}, starting at | ||
| 728 | position @var{beg}. | ||
| 729 | @end defun | ||
| 730 | |||
| 731 | Major modes can make @code{syntax-ppss} run faster by specifying | ||
| 732 | where it needs to start parsing. | ||
| 661 | 733 | ||
| 662 | @cindex parse state | 734 | @defvar syntax-begin-function |
| 663 | The fifth argument @var{state} is a ten-element list of the same form | 735 | If this is non-@code{nil}, it should be a function that moves to an |
| 664 | as the value of this function, described below. The return value of | 736 | earlier buffer position where the parser state is equivalent to |
| 665 | one call may be used to initialize the state of the parse on another | 737 | @code{nil}---in other words, a position outside of any comment, |
| 666 | call to @code{parse-partial-sexp}. | 738 | string, or parenthesis. @code{syntax-ppss} uses it to further |
| 739 | optimize its computations, when the cache gives no help. | ||
| 740 | @end defvar | ||
| 741 | |||
| 742 | @node Parser State | ||
| 743 | @subsection Parser State | ||
| 744 | @cindex parser state | ||
| 667 | 745 | ||
| 668 | The result is a list of ten elements describing the final state of | 746 | A @dfn{parser state} is a list of ten elements describing the final |
| 669 | the parse: | 747 | state of parsing text syntactically as part of an expression. The |
| 748 | parsing functions in the following sections return a parser state as | ||
| 749 | the value, and in some cases accept one as an argument also, so that | ||
| 750 | you can resume parsing after it stops. Here are the meanings of the | ||
| 751 | elements of the parser state: | ||
| 670 | 752 | ||
| 671 | @enumerate 0 | 753 | @enumerate 0 |
| 672 | @item | 754 | @item |
| @@ -721,82 +803,66 @@ data is subject to change; it is used if you pass this list | |||
| 721 | as the @var{state} argument to another call. | 803 | as the @var{state} argument to another call. |
| 722 | @end enumerate | 804 | @end enumerate |
| 723 | 805 | ||
| 724 | Elements 1, 2, and 6 are ignored in the argument @var{state}. Element | 806 | Elements 1, 2, and 6 are ignored in a state which you pass as an |
| 725 | 8 is used only to set the corresponding element of the return value, | 807 | argument to continue parsing, and elements 8 and 9 are used only in |
| 726 | in certain simple cases. Element 9 is used only to set element 1 of | 808 | trivial cases. Those elements serve primarily to convey information |
| 727 | the return value, in trivial cases where parsing starts and stops | 809 | to the Lisp program which does the parsing. |
| 728 | within the same pair of parentheses. | ||
| 729 | 810 | ||
| 730 | @cindex indenting with parentheses | 811 | One additional piece of useful information is available from a |
| 731 | This function is most often used to compute indentation for languages | 812 | parser state using this function: |
| 732 | that have nested parentheses. | ||
| 733 | @end defun | ||
| 734 | 813 | ||
| 735 | @defun syntax-ppss &optional pos | 814 | @defun syntax-ppss-toplevel-pos state |
| 736 | This function returns the state that the parser would have at position | 815 | This function extracts, from parser state @var{state}, the last |
| 737 | @var{pos}, if it were started with a default start state at the | 816 | position scanned in the parse which was at top level in grammatical |
| 738 | beginning of the buffer. Thus, it is equivalent to | 817 | structure. ``At top level'' means outside of any parentheses, |
| 739 | @code{(parse-partial-sexp (point-min) @var{pos})}, except that | 818 | comments, or strings. |
| 740 | @code{syntax-ppss} uses a cache to speed up the computation. Also, | ||
| 741 | the 2nd value (previous complete subexpression) and 6th value (minimum | ||
| 742 | parenthesis depth) of the returned state are not meaningful. | ||
| 743 | @end defun | ||
| 744 | 819 | ||
| 745 | @defun syntax-ppss-flush-cache beg | 820 | The value is @code{nil} if @var{state} represents a parse which has |
| 746 | This function flushes the cache used by @code{syntax-ppss}, starting at | 821 | arrived at a top level position. |
| 747 | position @var{beg}. | ||
| 748 | |||
| 749 | When @code{syntax-ppss} is called, it automatically hooks itself | ||
| 750 | to @code{before-change-functions} to keep its cache consistent. | ||
| 751 | But this can fail if @code{syntax-ppss} is called while | ||
| 752 | @code{before-change-functions} is temporarily let-bound, or if the | ||
| 753 | buffer is modified without obeying the hook, such as when using | ||
| 754 | @code{inhibit-modification-hooks}. For this reason, it is sometimes | ||
| 755 | necessary to flush the cache manually. | ||
| 756 | @end defun | 822 | @end defun |
| 757 | 823 | ||
| 758 | @defvar syntax-begin-function | 824 | We have provided this access function rather than document how the |
| 759 | If this is non-@code{nil}, it should be a function that moves to an | 825 | data is represented in the state, because we plan to change the |
| 760 | earlier buffer position where the parser state is equivalent to | 826 | representation in the future. |
| 761 | @code{nil}---in other words, a position outside of any comment, | ||
| 762 | string, or parenthesis. @code{syntax-ppss} uses it to supplement its | ||
| 763 | cache. | ||
| 764 | @end defvar | ||
| 765 | |||
| 766 | @defun scan-lists from count depth | ||
| 767 | This function scans forward @var{count} balanced parenthetical groupings | ||
| 768 | from position @var{from}. It returns the position where the scan stops. | ||
| 769 | If @var{count} is negative, the scan moves backwards. | ||
| 770 | 827 | ||
| 771 | If @var{depth} is nonzero, parenthesis depth counting begins from that | 828 | @node Low-Level Parsing |
| 772 | value. The only candidates for stopping are places where the depth in | 829 | @subsection Low-Level Parsing |
| 773 | parentheses becomes zero; @code{scan-lists} counts @var{count} such | ||
| 774 | places and then stops. Thus, a positive value for @var{depth} means go | ||
| 775 | out @var{depth} levels of parenthesis. | ||
| 776 | 830 | ||
| 777 | Scanning ignores comments if @code{parse-sexp-ignore-comments} is | 831 | The most basic way to use the expression parser is to tell it |
| 778 | non-@code{nil}. | 832 | to start at a given position with a certain state, and parse up to |
| 833 | a specified end position. | ||
| 779 | 834 | ||
| 780 | If the scan reaches the beginning or end of the buffer (or its | 835 | @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment |
| 781 | accessible portion), and the depth is not zero, an error is signaled. | 836 | This function parses a sexp in the current buffer starting at |
| 782 | If the depth is zero but the count is not used up, @code{nil} is | 837 | @var{start}, not scanning past @var{limit}. It stops at position |
| 783 | returned. | 838 | @var{limit} or when certain criteria described below are met, and sets |
| 784 | @end defun | 839 | point to the location where parsing stops. It returns a parser state |
| 840 | describing the status of the parse at the point where it stops. | ||
| 785 | 841 | ||
| 786 | @defun scan-sexps from count | 842 | @cindex parenthesis depth |
| 787 | This function scans forward @var{count} sexps from position @var{from}. | 843 | If the third argument @var{target-depth} is non-@code{nil}, parsing |
| 788 | It returns the position where the scan stops. If @var{count} is | 844 | stops if the depth in parentheses becomes equal to @var{target-depth}. |
| 789 | negative, the scan moves backwards. | 845 | The depth starts at 0, or at whatever is given in @var{state}. |
| 790 | 846 | ||
| 791 | Scanning ignores comments if @code{parse-sexp-ignore-comments} is | 847 | If the fourth argument @var{stop-before} is non-@code{nil}, parsing |
| 792 | non-@code{nil}. | 848 | stops when it comes to any character that starts a sexp. If |
| 849 | @var{stop-comment} is non-@code{nil}, parsing stops when it comes to the | ||
| 850 | start of a comment. If @var{stop-comment} is the symbol | ||
| 851 | @code{syntax-table}, parsing stops after the start of a comment or a | ||
| 852 | string, or the end of a comment or a string, whichever comes first. | ||
| 793 | 853 | ||
| 794 | If the scan reaches the beginning or end of (the accessible part of) the | 854 | If @var{state} is @code{nil}, @var{start} is assumed to be at the top |
| 795 | buffer while in the middle of a parenthetical grouping, an error is | 855 | level of parenthesis structure, such as the beginning of a function |
| 796 | signaled. If it reaches the beginning or end between groupings but | 856 | definition. Alternatively, you might wish to resume parsing in the |
| 797 | before count is used up, @code{nil} is returned. | 857 | middle of the structure. To do this, you must provide a @var{state} |
| 858 | argument that describes the initial status of parsing. The value | ||
| 859 | returned by a previous call to @code{parse-partial-sexp} will do | ||
| 860 | nicely. | ||
| 798 | @end defun | 861 | @end defun |
| 799 | 862 | ||
| 863 | @node Control Parsing | ||
| 864 | @subsection Parameters to Control Parsing | ||
| 865 | |||
| 800 | @defvar multibyte-syntax-as-symbol | 866 | @defvar multibyte-syntax-as-symbol |
| 801 | If this variable is non-@code{nil}, @code{scan-sexps} treats all | 867 | If this variable is non-@code{nil}, @code{scan-sexps} treats all |
| 802 | non-@acronym{ASCII} characters as symbol constituents regardless | 868 | non-@acronym{ASCII} characters as symbol constituents regardless |
| @@ -817,29 +883,6 @@ The behavior of @code{parse-partial-sexp} is also affected by | |||
| 817 | You can use @code{forward-comment} to move forward or backward over | 883 | You can use @code{forward-comment} to move forward or backward over |
| 818 | one comment or several comments. | 884 | one comment or several comments. |
| 819 | 885 | ||
| 820 | @defun forward-comment count | ||
| 821 | This function moves point forward across @var{count} complete comments | ||
| 822 | (that is, including the starting delimiter and the terminating | ||
| 823 | delimiter if any), plus any whitespace encountered on the way. It | ||
| 824 | moves backward if @var{count} is negative. If it encounters anything | ||
| 825 | other than a comment or whitespace, it stops, leaving point at the | ||
| 826 | place where it stopped. This includes (for instance) finding the end | ||
| 827 | of a comment when moving forward and expecting the beginning of one. | ||
| 828 | The function also stops immediately after moving over the specified | ||
| 829 | number of complete comments. If @var{count} comments are found as | ||
| 830 | expected, with nothing except whitespace between them, it returns | ||
| 831 | @code{t}; otherwise it returns @code{nil}. | ||
| 832 | |||
| 833 | This function cannot tell whether the ``comments'' it traverses are | ||
| 834 | embedded within a string. If they look like comments, it treats them | ||
| 835 | as comments. | ||
| 836 | @end defun | ||
| 837 | |||
| 838 | To move forward over all comments and whitespace following point, use | ||
| 839 | @code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good | ||
| 840 | argument to use, because the number of comments in the buffer cannot | ||
| 841 | exceed that many. | ||
| 842 | |||
| 843 | @node Standard Syntax Tables | 886 | @node Standard Syntax Tables |
| 844 | @section Some Standard Syntax Tables | 887 | @section Some Standard Syntax Tables |
| 845 | 888 | ||