| author | Eli Zaretskii | 2025-06-05 10:30:44 +0300 |
|---|---|---|
| committer | Eli Zaretskii | 2025-06-05 10:30:44 +0300 |
| commit | bcf005fa774194d434c68cc191566b58c297ca86 (patch) | |
| tree | efe2611242504140a8dfaf1c58e38333e8ea9080 | |
| parent | 1903b0062b1688241d1485b3e460264e124ba757 (diff) | |
Improve documentation of treesit "thing"
* src/treesit.c (syms_of_treesit):
* lisp/treesit.el (treesit-cycle-sexp-type):
(treesit-thing-at, treesit-thing-at-point): Doc fixes.
* doc/lispref/parsing.texi (User-defined Things): Improve
documentation of treesit "thing" and related functions; add
cross-references and indexing.
| -rw-r--r-- | doc/lispref/parsing.texi | 79 | ||||
| -rw-r--r-- | lisp/treesit.el | 31 | ||||
| -rw-r--r-- | src/treesit.c | 22 |
3 files changed, 80 insertions, 52 deletions
diff --git a/doc/lispref/parsing.texi b/doc/lispref/parsing.texi index aa321785460..374eeb28b7a 100644 --- a/doc/lispref/parsing.texi +++ b/doc/lispref/parsing.texi | |||
| @@ -1619,14 +1619,16 @@ documentation about pattern-matching. The documentation can be found at | |||
| 1619 | 1619 | ||
| 1620 | It's often useful to be able to identify and find certain @dfn{things} in | 1620 | It's often useful to be able to identify and find certain @dfn{things} in |
| 1621 | a buffer, like function and class definitions, statements, code blocks, | 1621 | a buffer, like function and class definitions, statements, code blocks, |
| 1622 | strings, comments, etc. Emacs allows users to define what kind of | 1622 | strings, comments, etc., in terms of node types defined by the |
| 1623 | tree-sitter node corresponds to a ``thing''. This enables handy | 1623 | tree-sitter grammar used in the buffer. Emacs allows Lisp programs to |
| 1624 | features like jumping to the next function, marking the code block at | 1624 | define what kinds of tree-sitter nodes corresponds to each ``thing''. |
| 1625 | point, or transposing two function arguments. | 1625 | This enables handy features like jumping to the next function, marking |
| 1626 | the code block at point, transposing two function arguments, etc. | ||
| 1626 | 1627 | ||
| 1627 | The ``things'' feature in Emacs is independent of the pattern matching | 1628 | The ``things'' feature in Emacs is independent of the pattern matching |
| 1628 | feature of tree-sitter, and comparatively less powerful, but more | 1629 | feature of tree-sitter (@pxref{Pattern Matching}), and comparatively |
| 1629 | suitable for navigation and traversing the parse tree. | 1630 | less powerful, but more suitable for navigation and traversing the |
| 1631 | buffer text in terms of the tree-sitter parse tree. | ||
| 1630 | 1632 | ||
| 1631 | @findex treesit-thing-definition | 1633 | @findex treesit-thing-definition |
| 1632 | @findex treesit-thing-defined-p | 1634 | @findex treesit-thing-defined-p |
| @@ -1635,12 +1637,22 @@ predicate of a defined thing with @code{treesit-thing-definition}, and | |||
| 1635 | test if a thing is defined with @code{treesit-thing-defined-p}. | 1637 | test if a thing is defined with @code{treesit-thing-defined-p}. |
| 1636 | 1638 | ||
| 1637 | @defvar treesit-thing-settings | 1639 | @defvar treesit-thing-settings |
| 1638 | This is an alist of thing definitions for each language. The key of | 1640 | This is an alist of thing definitions for each language supported by the |
| 1639 | each entry is a language symbol, and the value is a list of thing | 1641 | grammar used in a buffer; it should be defined by the buffer's major |
| 1640 | definitions of the form @w{@code{(@var{thing} @var{pred})}}, where | 1642 | mode (the default value is @code{nil}). The key of each entry is a |
| 1641 | @var{thing} is a symbol representing the thing, like @code{defun}, | 1643 | language symbol (e.g., @code{c} for C, @code{cpp} for C@t{++}, etc.), |
| 1642 | @code{sexp}, or @code{sentence}; and @var{pred} specifies what kind of | 1644 | and the value is a list of thing definitions of the form |
| 1643 | tree-sitter node is this @var{thing}. | 1645 | @w{@code{(@var{thing} @var{pred})}}, where @var{thing} is a symbol |
| 1646 | representing the thing, and @var{pred} specifies what kinds of | ||
| 1647 | tree-sitter nodes are considered as this @var{thing}. | ||
| 1648 | |||
| 1649 | @cindex @code{sexp}, treesit-defined thing | ||
| 1650 | @cindex @code{list}, treesit-defined thing | ||
| 1651 | The symbol used to define the @var{thing} can be anything meaningful for | ||
| 1652 | the major mode: @code{defun}, @code{defclass}, @code{sentence}, | ||
| 1653 | @code{comment}, @code{string}, etc. To support tree-sitter based | ||
| 1654 | navigation commands (@pxref{List Motion}), the mode should define two | ||
| 1655 | things: @code{list} and @code{sexp}. | ||
| 1644 | 1656 | ||
| 1645 | @var{pred} can be a regexp string that matches the type of the node; it | 1657 | @var{pred} can be a regexp string that matches the type of the node; it |
| 1646 | can be a function that takes a node as the argument and returns a | 1658 | can be a function that takes a node as the argument and returns a |
| @@ -1660,13 +1672,16 @@ meaning that not satisfying @var{pred} qualifies the node. | |||
| 1660 | Finally, @var{pred} can refer to other @var{thing}s defined in this | 1672 | Finally, @var{pred} can refer to other @var{thing}s defined in this |
| 1661 | list. For example, @w{@code{(or sexp sentence)}} defines something | 1673 | list. For example, @w{@code{(or sexp sentence)}} defines something |
| 1662 | that's either a @code{sexp} thing or a @code{sentence} thing, as defined | 1674 | that's either a @code{sexp} thing or a @code{sentence} thing, as defined |
| 1663 | by some other rule in the alist. | 1675 | by some other rules in the alist. |
| 1664 | 1676 | ||
| 1677 | @cindex @code{named}, treesit-defined thing | ||
| 1678 | @cindex @code{anonymous}, treesit-defined thing | ||
| 1665 | There are two pre-defined predicates: @code{named} and @code{anonymous}, | 1679 | There are two pre-defined predicates: @code{named} and @code{anonymous}, |
| 1666 | which qualify, respectively, named and anonymous nodes. They can be | 1680 | which qualify, respectively, named and anonymous nodes of the |
| 1667 | combined with @code{and} to narrow down the match. | 1681 | tree-sitter grammar. They can be combined with @code{and} to narrow |
| 1682 | down the match. | ||
| 1668 | 1683 | ||
| 1669 | Here's an example @code{treesit-thing-settings} for C and C++: | 1684 | Here's an example @code{treesit-thing-settings} for C and C@t{++}: |
| 1670 | 1685 | ||
| 1671 | @example | 1686 | @example |
| 1672 | @group | 1687 | @group |
| @@ -1676,6 +1691,8 @@ Here's an example @code{treesit-thing-settings} for C and C++: | |||
| 1676 | (comment "comment") | 1691 | (comment "comment") |
| 1677 | (string "raw_string_literal") | 1692 | (string "raw_string_literal") |
| 1678 | (text (or comment string))) | 1693 | (text (or comment string))) |
| 1694 | @end group | ||
| 1695 | @group | ||
| 1679 | (cpp | 1696 | (cpp |
| 1680 | (defun ("function_definition" . cpp-ts-mode-defun-valid-p)) | 1697 | (defun ("function_definition" . cpp-ts-mode-defun-valid-p)) |
| 1681 | (defclass "class_specifier") | 1698 | (defclass "class_specifier") |
| @@ -1685,12 +1702,12 @@ Here's an example @code{treesit-thing-settings} for C and C++: | |||
| 1685 | 1702 | ||
| 1686 | @noindent | 1703 | @noindent |
| 1687 | Note that this example is modified for didactic purposes, and isn't | 1704 | Note that this example is modified for didactic purposes, and isn't |
| 1688 | exactly how C and C@t{++} modes define things. | 1705 | exactly how tree-sitter based C and C@t{++} modes define things. |
| 1689 | @end defvar | 1706 | @end defvar |
| 1690 | 1707 | ||
| 1691 | Emacs builtin functions already make use some thing definitions. | 1708 | Emacs builtin functions already make use of some thing definitions. |
| 1692 | Command @code{treesit-forward-sexp} uses the @code{sexp} definition if | 1709 | Command @code{treesit-forward-sexp} uses the @code{sexp} definition if |
| 1693 | major mode defines it; @code{treesit-forward-list}, | 1710 | major mode defines it (@pxref{List Motion}); @code{treesit-forward-list}, |
| 1694 | @code{treesit-down-list}, @code{treesit-up-list}, | 1711 | @code{treesit-down-list}, @code{treesit-up-list}, |
| 1695 | @code{treesit-show-paren-data} use the @code{list} definition (its | 1712 | @code{treesit-show-paren-data} use the @code{list} definition (its |
| 1696 | symbol @code{list} has the symbol property @code{treesit-thing-symbol} | 1713 | symbol @code{list} has the symbol property @code{treesit-thing-symbol} |
| @@ -1699,8 +1716,8 @@ to avoid ambiguity with the function that has the same name); | |||
| 1699 | Defun movement functions like @code{treesit-end-of-defun} uses the | 1716 | Defun movement functions like @code{treesit-end-of-defun} uses the |
| 1700 | @code{defun} definition (@code{defun} definition is overridden by | 1717 | @code{defun} definition (@code{defun} definition is overridden by |
| 1701 | @var{treesit-defun-type-regexp} for backward compatibility). Major | 1718 | @var{treesit-defun-type-regexp} for backward compatibility). Major |
| 1702 | modes can also define @code{comment}, @code{string}, @code{text} | 1719 | modes can also define @code{comment}, @code{string}, and @code{text} |
| 1703 | (generally comments and strings). | 1720 | things (to match comments and strings). |
| 1704 | 1721 | ||
| 1705 | The rest of this section lists a few functions that take advantage of | 1722 | The rest of this section lists a few functions that take advantage of |
| 1706 | the thing definitions. Besides the functions below, some other | 1723 | the thing definitions. Besides the functions below, some other |
| @@ -1709,10 +1726,10 @@ tree-traversing functions like @code{treesit-search-forward}, | |||
| 1709 | @code{treesit-induce-sparse-tree}, etc. @xref{Retrieving Nodes}. | 1726 | @code{treesit-induce-sparse-tree}, etc. @xref{Retrieving Nodes}. |
| 1710 | 1727 | ||
| 1711 | @defun treesit-node-match-p node thing &optional ignore-missing | 1728 | @defun treesit-node-match-p node thing &optional ignore-missing |
| 1712 | This function checks whether @var{node} is a @var{thing}. | 1729 | This function checks whether @var{node} represents a @var{thing}. |
| 1713 | 1730 | ||
| 1714 | If @var{node} is a @var{thing}, return non-@code{nil}, otherwise return | 1731 | If @var{node} represents @var{thing}, return non-@code{nil}, otherwise |
| 1715 | @code{nil}. For convenience, if @code{node} is @code{nil}, this | 1732 | return @code{nil}. For convenience, if @code{node} is @code{nil}, this |
| 1716 | function just returns @code{nil}. | 1733 | function just returns @code{nil}. |
| 1717 | 1734 | ||
| 1718 | The @var{thing} can be either a thing symbol like @code{defun}, or | 1735 | The @var{thing} can be either a thing symbol like @code{defun}, or |
| @@ -1727,8 +1744,9 @@ undefined and just returns @code{nil}; but it still signals the error if | |||
| 1727 | @end defun | 1744 | @end defun |
| 1728 | 1745 | ||
| 1729 | @defun treesit-thing-prev position thing | 1746 | @defun treesit-thing-prev position thing |
| 1730 | This function returns the first node before @var{position} that is the | 1747 | This function returns the first node before @var{position} in the |
| 1731 | specified @var{thing}. If no such node exists, it returns @code{nil}. | 1748 | current buffer that is the specified @var{thing}. If no such node |
| 1749 | exists, it returns @code{nil}. | ||
| 1732 | It's guaranteed that, if a node is returned, the node's end position is | 1750 | It's guaranteed that, if a node is returned, the node's end position is |
| 1733 | less or equal to @var{position}. In other words, this function never | 1751 | less or equal to @var{position}. In other words, this function never |
| 1734 | returns a node that encloses @var{position}. | 1752 | returns a node that encloses @var{position}. |
| @@ -1753,8 +1771,9 @@ function doesn't move point. | |||
| 1753 | 1771 | ||
| 1754 | A positive @var{arg} means moving forward that many instances of | 1772 | A positive @var{arg} means moving forward that many instances of |
| 1755 | @var{thing}; negative @var{arg} means moving backward. If @var{side} is | 1773 | @var{thing}; negative @var{arg} means moving backward. If @var{side} is |
| 1756 | @code{beg}, this function stops at the beginning of @var{thing}; if | 1774 | @code{beg}, this function returns the position of the beginning of |
| 1757 | @code{end}, stop at the end of @var{thing}. | 1775 | @var{thing}; if it's @code{end}, it returns the position at the end of |
| 1776 | @var{thing}. | ||
| 1758 | 1777 | ||
| 1759 | Like in @code{treesit-thing-prev}, @var{thing} can be a thing symbol | 1778 | Like in @code{treesit-thing-prev}, @var{thing} can be a thing symbol |
| 1760 | defined in @code{treesit-thing-settings}, or a predicate. | 1779 | defined in @code{treesit-thing-settings}, or a predicate. |
| @@ -1780,8 +1799,8 @@ less or equal to @var{position}, and it's end position is greater or equal to | |||
| 1780 | @var{position}. | 1799 | @var{position}. |
| 1781 | 1800 | ||
| 1782 | If @var{strict} is non-@code{nil}, this function uses strict comparison, | 1801 | If @var{strict} is non-@code{nil}, this function uses strict comparison, |
| 1783 | i.e., start position must be strictly greater than @var{position}, and end | 1802 | i.e., start position must be strictly smaller than @var{position}, and end |
| 1784 | position must be strictly less than @var{position}. | 1803 | position must be strictly greater than @var{position}. |
| 1785 | 1804 | ||
| 1786 | @var{thing} can be either a thing symbol defined in | 1805 | @var{thing} can be either a thing symbol defined in |
| 1787 | @code{treesit-thing-settings}, or a predicate. | 1806 | @code{treesit-thing-settings}, or a predicate. |
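
The corrected wording for the strict argument of treesit-thing-at can be pictured with a small pure-Lisp model. This is only a sketch of the rule as the Texinfo text states it, not the code in treesit.el; `my-treesit-encloses-p' and its arguments are hypothetical names introduced here. (The docstring in lisp/treesit.el words the non-strict case as "end > POS", so treat the non-strict branch below as following the manual text only.)

```elisp
;; Hypothetical helper, not part of treesit.el: a model of the
;; enclosure rule as described in the corrected Texinfo paragraph.
(defun my-treesit-encloses-p (start end position &optional strict)
  "Return non-nil if the span START..END encloses POSITION.
With STRICT non-nil, require START < POSITION < END; otherwise
require START <= POSITION <= END."
  (if strict
      (and (< start position) (< position end))
    (and (<= start position) (<= position end))))

(my-treesit-encloses-p 10 20 10)     ; non-strict: a boundary counts => t
(my-treesit-encloses-p 10 20 10 t)   ; strict: a boundary is excluded => nil
```

The old text had the two strict comparisons swapped, which would describe a span that can never enclose the position.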
diff --git a/lisp/treesit.el b/lisp/treesit.el index 5df8eb70cbf..45626e77b99 100644 --- a/lisp/treesit.el +++ b/lisp/treesit.el | |||
| @@ -3237,11 +3237,14 @@ The type can be `list' (the default) or `sexp'. | |||
| 3237 | 3237 | ||
| 3238 | The `list' type uses the `list' thing defined in `treesit-thing-settings'. | 3238 | The `list' type uses the `list' thing defined in `treesit-thing-settings'. |
| 3239 | See `treesit-thing-at-point'. With this type commands use syntax tables to | 3239 | See `treesit-thing-at-point'. With this type commands use syntax tables to |
| 3240 | navigate symbols and treesit definition to navigate lists. | 3240 | navigate symbols and treesit definitions to navigate lists. |
| 3241 | 3241 | ||
| 3242 | The `sexp' type uses the `sexp' thing defined in `treesit-thing-settings'. | 3242 | The `sexp' type uses the `sexp' thing defined in `treesit-thing-settings'. |
| 3243 | With this type commands use only the treesit definition of parser nodes, | 3243 | With this type commands use only the treesit definitions of parser nodes, |
| 3244 | without distinction between symbols and lists." | 3244 | without distinction between symbols and lists. Since tree-sitter grammars |
| 3245 | could group node types in arbitrary ways, navigation by `sexp' might not | ||
| | | 3246 | match your expectations, and might produce different results in different |
| 3247 | treesit-based modes." | ||
| 3245 | (interactive "p") | 3248 | (interactive "p") |
| 3246 | (if (not (treesit-thing-defined-p 'list (treesit-language-at (point)))) | 3249 | (if (not (treesit-thing-defined-p 'list (treesit-language-at (point)))) |
| 3247 | (user-error "No `list' thing is defined in `treesit-thing-settings'") | 3250 | (user-error "No `list' thing is defined in `treesit-thing-settings'") |
| @@ -3630,14 +3633,15 @@ predicate as described in `treesit-thing-settings'." | |||
| 3630 | (treesit--thing-sibling pos thing nil)) | 3633 | (treesit--thing-sibling pos thing nil)) |
| 3631 | 3634 | ||
| 3632 | (defun treesit-thing-at (pos thing &optional strict) | 3635 | (defun treesit-thing-at (pos thing &optional strict) |
| 3633 | "Return the smallest THING enclosing POS. | 3636 | "Return the smallest node enclosing POS for THING. |
| 3634 | 3637 | ||
| 3635 | The returned node, if non-nil, must enclose POS, i.e., its start | 3638 | The returned node, if non-nil, must enclose POS, i.e., its |
| 3636 | <= POS, its end > POS. If STRICT is non-nil, the returned node's | 3639 | start <= POS, its end > POS. If STRICT is non-nil, the returned |
| 3637 | start must < POS rather than <= POS. | 3640 | node's start must be < POS rather than <= POS. |
| 3638 | 3641 | ||
| 3639 | THING should be a thing defined in `treesit-thing-settings', or | 3642 | THING should be a thing defined in `treesit-thing-settings' for |
| 3640 | it can be a predicate described in `treesit-thing-settings'." | 3643 | the current buffer's major mode, or it can be a predicate |
| 3644 | described in `treesit-thing-settings'." | ||
| 3641 | (let* ((cursor (treesit-node-at pos)) | 3645 | (let* ((cursor (treesit-node-at pos)) |
| 3642 | (iter-pred (lambda (node) | 3646 | (iter-pred (lambda (node) |
| 3643 | (and (treesit-node-match-p node thing t) | 3647 | (and (treesit-node-match-p node thing t) |
| @@ -3789,13 +3793,14 @@ function is called recursively." | |||
| 3789 | (if (eq counter 0) pos nil))) | 3793 | (if (eq counter 0) pos nil))) |
| 3790 | 3794 | ||
| 3791 | (defun treesit-thing-at-point (thing tactic) | 3795 | (defun treesit-thing-at-point (thing tactic) |
| 3792 | "Return the THING at point, or nil if none is found. | 3796 | "Return the node for THING at point, or nil if no THING is found at point. |
| 3793 | 3797 | ||
| 3794 | THING can be a symbol, a regexp, a predicate function, and more; | 3798 | THING can be a symbol, a regexp, a predicate function, and more; |
| 3795 | see `treesit-thing-settings' for details. | 3799 | for details, see `treesit-thing-settings' as defined by the |
| 3800 | current buffer's major mode. | ||
| 3796 | 3801 | ||
| 3797 | Return the top-level THING if TACTIC is `top-level'; return the | 3802 | Return the top-level node for THING if TACTIC is `top-level'; return |
| 3798 | smallest enclosing THING as POS if TACTIC is `nested'." | 3803 | the smallest node enclosing THING at point if TACTIC is `nested'." |
| 3799 | 3804 | ||
| 3800 | (let ((node (treesit-thing-at (point) thing))) | 3805 | (let ((node (treesit-thing-at (point) thing))) |
| 3801 | (if (eq tactic 'top-level) | 3806 | (if (eq tactic 'top-level) |
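
As an illustration of the two TACTIC values documented above, here is a minimal sketch of a command built on treesit-thing-at-point. The command name `my-treesit-echo-defun' is hypothetical; the sketch assumes an Emacs version that provides treesit-thing-at-point and treesit-thing-defined-p, a buffer with a tree-sitter parser, and a major mode that defines a `defun' thing.

```elisp
;; Hypothetical command; `my-treesit-echo-defun' is not part of Emacs.
;; Assumes the current buffer has a tree-sitter parser and that its
;; major mode defines a `defun' thing in `treesit-thing-settings'.
(require 'treesit)

(defun my-treesit-echo-defun (&optional top-level)
  "Echo the extent of the `defun' node at point.
With prefix argument TOP-LEVEL, use the `top-level' tactic instead
of `nested'."
  (interactive "P")
  (if (not (treesit-thing-defined-p 'defun (treesit-language-at (point))))
      (user-error "No `defun' thing is defined in `treesit-thing-settings'")
    (let ((node (treesit-thing-at-point
                 'defun (if top-level 'top-level 'nested))))
      (if node
          (message "defun spans %d..%d"
                   (treesit-node-start node) (treesit-node-end node))
        (message "No defun at point")))))
```

The guard mirrors the check that treesit-cycle-sexp-type performs for the `list' thing in the hunk above, so the command fails with a readable message in modes that define no `defun' thing.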
diff --git a/src/treesit.c b/src/treesit.c index de74e41c89a..67dd2ee3a7a 100644 --- a/src/treesit.c +++ b/src/treesit.c | |||
| @@ -5193,13 +5193,16 @@ then in the system default locations for dynamic libraries, in that order. */); | |||
| 5193 | doc: | 5193 | doc: |
| 5194 | /* A list defining things. | 5194 | /* A list defining things. |
| 5195 | 5195 | ||
| 5196 | The value should be an alist of (LANGUAGE . DEFINITIONS), where | 5196 | The value should be defined by the major mode, and should be an alist |
| 5197 | LANGUAGE is a language symbol, and DEFINITIONS is a list of | 5197 | of the form (LANGUAGE . DEFINITIONS), where LANGUAGE is a language |
| 5198 | symbol and DEFINITIONS is a list whose elements are of the form | ||
| 5198 | 5199 | ||
| 5199 | (THING PRED) | 5200 | (THING PRED) |
| 5200 | 5201 | ||
| 5201 | THING is a symbol representing the thing, like `defun', `sexp', or | 5202 | THING is a symbol representing the thing, like `defun', `defclass', |
| 5202 | `sentence'; PRED defines what kind of node can be qualified as THING. | 5203 | `sexp', `sentence', `comment', or any other symbol that is meaningful |
| 5204 | for the major mode; PRED defines what kind of node can be qualified | ||
| 5205 | as THING. | ||
| 5203 | 5206 | ||
| 5204 | PRED can be a regexp string that matches the type of the node; it can | 5207 | PRED can be a regexp string that matches the type of the node; it can |
| 5205 | be a predicate function that takes the node as the sole argument and | 5208 | be a predicate function that takes the node as the sole argument and |
| @@ -5207,12 +5210,13 @@ returns t if the node is the thing, and nil otherwise; it can be a | |||
| 5207 | cons (REGEXP . FN), which is a combination of a regexp and a predicate | 5210 | cons (REGEXP . FN), which is a combination of a regexp and a predicate |
| 5208 | function, and the node has to match both to qualify as the thing. | 5211 | function, and the node has to match both to qualify as the thing. |
| 5209 | 5212 | ||
| 5210 | PRED can also be recursively defined. It can be (or PRED...), meaning | 5213 | PRED can also be recursively defined. It can be: |
| 5211 | satisfying anyone of the inner PREDs qualifies the node; or (and | ||
| 5212 | PRED...) meaning satisfying all of the inner PREDs qualifies the node; | ||
| 5213 | or (not PRED), meaning not satisfying the inner PRED qualifies the node. | ||
| 5214 | 5214 | ||
| 5215 | There are two pre-defined predicates, `named' and `anonymous`. They | 5215 | (or PRED...), meaning satisfying any of the inner PREDs qualifies the node; |
| 5216 | (and PRED...) meaning satisfying all of the inner PREDs qualifies the node; | ||
| 5217 | (not PRED), meaning not satisfying the inner PRED qualifies the node. | ||
| 5218 | |||
| 5219 | There are two pre-defined predicates, `named' and `anonymous'. They | ||
| 5216 | match named nodes and anonymous nodes, respectively. | 5220 | match named nodes and anonymous nodes, respectively. |
| 5217 | 5221 | ||
| 5218 | Finally, PRED can refer to other THINGs defined in this list by using | 5222 | Finally, PRED can refer to other THINGs defined in this list by using |
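
The PRED forms described in this docstring can be combined as in the following sketch. Everything named `my-lang' here is hypothetical, and the node type strings are placeholders; a real major mode would use the node types defined by its own grammar.

```elisp
;; Illustrative only: `my-lang' and the node type names are hypothetical.
(defvar my-lang-thing-settings
  '((my-lang
     ;; A regexp matched against the node type.
     (defun "function_definition")
     (comment "comment")
     (string "string_literal")
     ;; Refer to other things defined in this list by their symbols.
     (text (or comment string))
     ;; Combine a pre-defined predicate with a negated regexp.
     (sexp (and named (not "comment")))))
  "Example value for `treesit-thing-settings'.")

(defun my-lang-setup-things ()
  "Install the example thing definitions in the current buffer."
  (setq-local treesit-thing-settings my-lang-thing-settings))
```

A mode would typically call something like `my-lang-setup-things' from its mode function, which matches the revised docstring's statement that the value should be defined by the major mode.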