author: Eli Zaretskii, 2025-06-05 10:30:44 +0300
committer: Eli Zaretskii, 2025-06-05 10:30:44 +0300
commit: bcf005fa774194d434c68cc191566b58c297ca86
tree: efe2611242504140a8dfaf1c58e38333e8ea9080
parent: 1903b0062b1688241d1485b3e460264e124ba757
Improve documentation of treesit "thing"

* src/treesit.c (syms_of_treesit):
* lisp/treesit.el (treesit-cycle-sexp-type):
(treesit-thing-at, treesit-thing-at-point): Doc fixes.
* doc/lispref/parsing.texi (User-defined Things): Improve
documentation of treesit "thing" and related functions; add
cross-references and indexing.

-rw-r--r-- doc/lispref/parsing.texi 79
-rw-r--r-- lisp/treesit.el 31
-rw-r--r-- src/treesit.c 22
3 files changed, 80 insertions, 52 deletions
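
As a rough illustration of the predicate forms described in the documentation this commit touches (regexp, regexp-plus-function cons, `or`/`and`/`not`, the pre-defined `named`, and references to other things), a major mode might set `treesit-thing-settings` along the following lines. This is only a sketch: the names `my-c-like--defun-valid-p` and `my-c-like--setup-things` are hypothetical, and the node-type regexps are assumptions that would have to match the grammar actually in use.

```elisp
;; Sketch only: all `my-c-like-' names are hypothetical.
(defvar treesit-thing-settings)

(defun my-c-like--defun-valid-p (node)
  "Return non-nil if NODE should count as a defun.
Hypothetical check; a real mode would inspect NODE more closely."
  (not (null node)))

(defun my-c-like--setup-things ()
  "Set a buffer-local `treesit-thing-settings' for the `c' language."
  (setq-local treesit-thing-settings
              '((c
                 ;; Cons of regexp and function: both must match.
                 (defun ("function_definition" . my-c-like--defun-valid-p))
                 ;; Plain regexps matched against the node type.
                 (comment "comment")
                 (string "string_literal")
                 ;; Reference to other things defined in this list.
                 (text (or comment string))
                 ;; Pre-defined `named' predicate combined with `and'/`not'.
                 (sexp (and named (not "comment")))))))
```

Because the string predicates are regexps, a pattern such as `"comment"` matches any node type containing that text; anchoring it (for example with `\\`` and `\\'`) is a design choice left to the mode.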
diff --git a/doc/lispref/parsing.texi b/doc/lispref/parsing.texi
index aa321785460..374eeb28b7a 100644
--- a/doc/lispref/parsing.texi
+++ b/doc/lispref/parsing.texi
@@ -1619,14 +1619,16 @@ documentation about pattern-matching. The documentation can be found at
 
 It's often useful to be able to identify and find certain @dfn{things} in
 a buffer, like function and class definitions, statements, code blocks,
-strings, comments, etc. Emacs allows users to define what kind of
-tree-sitter node corresponds to a ``thing''. This enables handy
-features like jumping to the next function, marking the code block at
-point, or transposing two function arguments.
+strings, comments, etc., in terms of node types defined by the
+tree-sitter grammar used in the buffer. Emacs allows Lisp programs to
+define what kinds of tree-sitter nodes corresponds to each ``thing''.
+This enables handy features like jumping to the next function, marking
+the code block at point, transposing two function arguments, etc.
 
 The ``things'' feature in Emacs is independent of the pattern matching
-feature of tree-sitter, and comparatively less powerful, but more
-suitable for navigation and traversing the parse tree.
+feature of tree-sitter (@pxref{Pattern Matching}), and comparatively
+less powerful, but more suitable for navigation and traversing the
+buffer text in terms of the tree-sitter parse tree.
 
 @findex treesit-thing-definition
 @findex treesit-thing-defined-p
@@ -1635,12 +1637,22 @@ predicate of a defined thing with @code{treesit-thing-definition}, and
 test if a thing is defined with @code{treesit-thing-defined-p}.
 
 @defvar treesit-thing-settings
-This is an alist of thing definitions for each language. The key of
-each entry is a language symbol, and the value is a list of thing
-definitions of the form @w{@code{(@var{thing} @var{pred})}}, where
-@var{thing} is a symbol representing the thing, like @code{defun},
-@code{sexp}, or @code{sentence}; and @var{pred} specifies what kind of
-tree-sitter node is this @var{thing}.
+This is an alist of thing definitions for each language supported by the
+grammar used in a buffer; it should be defined by the buffer's major
+mode (the default value is @code{nil}). The key of each entry is a
+language symbol (e.g., @code{c} for C, @code{cpp} for C@t{++}, etc.),
+and the value is a list of thing definitions of the form
+@w{@code{(@var{thing} @var{pred})}}, where @var{thing} is a symbol
+representing the thing, and @var{pred} specifies what kinds of
+tree-sitter nodes are considered as this @var{thing}.
+
+@cindex @code{sexp}, treesit-defined thing
+@cindex @code{list}, treesit-defined thing
+The symbol used to define the @var{thing} can be anything meaningful for
+the major mode: @code{defun}, @code{defclass}, @code{sentence},
+@code{comment}, @code{string}, etc. To support tree-sitter based
+navigation commands (@pxref{List Motion}), the mode should define two
+things: @code{list} and @code{sexp}.
 
 @var{pred} can be a regexp string that matches the type of the node; it
 can be a function that takes a node as the argument and returns a
@@ -1660,13 +1672,16 @@ meaning that not satisfying @var{pred} qualifies the node.
 Finally, @var{pred} can refer to other @var{thing}s defined in this
 list. For example, @w{@code{(or sexp sentence)}} defines something
 that's either a @code{sexp} thing or a @code{sentence} thing, as defined
-by some other rule in the alist.
+by some other rules in the alist.
 
+@cindex @code{named}, treesit-defined thing
+@cindex @code{anonymous}, treesit-defined thing
 There are two pre-defined predicates: @code{named} and @code{anonymous},
-which qualify, respectively, named and anonymous nodes. They can be
-combined with @code{and} to narrow down the match.
+which qualify, respectively, named and anonymous nodes of the
+tree-sitter grammar. They can be combined with @code{and} to narrow
+down the match.
 
-Here's an example @code{treesit-thing-settings} for C and C++:
+Here's an example @code{treesit-thing-settings} for C and C@t{++}:
 
 @example
 @group
@@ -1676,6 +1691,8 @@ Here's an example @code{treesit-thing-settings} for C and C++:
   (comment "comment")
   (string "raw_string_literal")
   (text (or comment string)))
+@end group
+@group
  (cpp
   (defun ("function_definition" . cpp-ts-mode-defun-valid-p))
   (defclass "class_specifier")
@@ -1685,12 +1702,12 @@ Here's an example @code{treesit-thing-settings} for C and C++:
 
 @noindent
 Note that this example is modified for didactic purposes, and isn't
-exactly how C and C@t{++} modes define things.
+exactly how tree-sitter based C and C@t{++} modes define things.
 @end defvar
 
-Emacs builtin functions already make use some thing definitions.
+Emacs builtin functions already make use of some thing definitions.
 Command @code{treesit-forward-sexp} uses the @code{sexp} definition if
-major mode defines it; @code{treesit-forward-list},
+major mode defines it (@pxref{List Motion}); @code{treesit-forward-list},
 @code{treesit-down-list}, @code{treesit-up-list},
 @code{treesit-show-paren-data} use the @code{list} definition (its
 symbol @code{list} has the symbol property @code{treesit-thing-symbol}
@@ -1699,8 +1716,8 @@ to avoid ambiguity with the function that has the same name);
 Defun movement functions like @code{treesit-end-of-defun} uses the
 @code{defun} definition (@code{defun} definition is overridden by
 @var{treesit-defun-type-regexp} for backward compatibility). Major
-modes can also define @code{comment}, @code{string}, @code{text}
-(generally comments and strings).
+modes can also define @code{comment}, @code{string}, and @code{text}
+things (to match comments and strings).
 
 The rest of this section lists a few functions that take advantage of
 the thing definitions. Besides the functions below, some other
@@ -1709,10 +1726,10 @@ tree-traversing functions like @code{treesit-search-forward},
 @code{treesit-induce-sparse-tree}, etc. @xref{Retrieving Nodes}.
 
 @defun treesit-node-match-p node thing &optional ignore-missing
-This function checks whether @var{node} is a @var{thing}.
+This function checks whether @var{node} represents a @var{thing}.
 
-If @var{node} is a @var{thing}, return non-@code{nil}, otherwise return
-@code{nil}. For convenience, if @code{node} is @code{nil}, this
+If @var{node} represents @var{thing}, return non-@code{nil}, otherwise
+return @code{nil}. For convenience, if @code{node} is @code{nil}, this
 function just returns @code{nil}.
 
 The @var{thing} can be either a thing symbol like @code{defun}, or
@@ -1727,8 +1744,9 @@ undefined and just returns @code{nil}; but it still signals the error if
 @end defun
 
 @defun treesit-thing-prev position thing
-This function returns the first node before @var{position} that is the
-specified @var{thing}. If no such node exists, it returns @code{nil}.
+This function returns the first node before @var{position} in the
+current buffer that is the specified @var{thing}. If no such node
+exists, it returns @code{nil}.
 It's guaranteed that, if a node is returned, the node's end position is
 less or equal to @var{position}. In other words, this function never
 returns a node that encloses @var{position}.
@@ -1753,8 +1771,9 @@ function doesn't move point.
 
 A positive @var{arg} means moving forward that many instances of
 @var{thing}; negative @var{arg} means moving backward. If @var{side} is
-@code{beg}, this function stops at the beginning of @var{thing}; if
-@code{end}, stop at the end of @var{thing}.
+@code{beg}, this function returns the position of the beginning of
+@var{thing}; if it's @code{end}, it returns the position at the end of
+@var{thing}.
 
 Like in @code{treesit-thing-prev}, @var{thing} can be a thing symbol
 defined in @code{treesit-thing-settings}, or a predicate.
@@ -1780,8 +1799,8 @@ less or equal to @var{position}, and it's end position is greater or equal to
 @var{position}.
 
 If @var{strict} is non-@code{nil}, this function uses strict comparison,
-i.e., start position must be strictly greater than @var{position}, and end
-position must be strictly less than @var{position}.
+i.e., start position must be strictly smaller than @var{position}, and end
+position must be strictly greater than @var{position}.
 
 @var{thing} can be either a thing symbol defined in
 @code{treesit-thing-settings}, or a predicate.
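
The functions documented in the hunks above could be exercised roughly as follows. This is a minimal sketch, assuming Emacs 30 or later built with tree-sitter support and an installed C grammar; the helper name `my-thing-demo` and the thing definitions it installs are hypothetical. When those assumptions do not hold, the helper simply returns nil.

```elisp
(require 'treesit nil t)

(defun my-thing-demo ()
  "Return the node type of the `defun' thing around a sample position.
Hypothetical helper; return nil if the treesit thing functions or
the C grammar are not available."
  (when (and (fboundp 'treesit-ready-p)
             (fboundp 'treesit-thing-at)
             (treesit-ready-p 'c t))
    (with-temp-buffer
      (insert "int main (void) { return 0; }\n")
      (treesit-parser-create 'c)
      ;; A major mode would normally provide these definitions.
      (setq-local treesit-thing-settings
                  '((c (defun "function_definition")
                       (comment "comment"))))
      (let* ((pos (+ (point-min) 5))
             ;; Only ask for the thing if it is defined for `c'.
             (node (and (treesit-thing-defined-p 'defun 'c)
                        (treesit-thing-at pos 'defun))))
        ;; `treesit-node-match-p' returns nil when NODE is nil.
        (when (treesit-node-match-p node 'defun)
          (treesit-node-type node))))))
```

Guarding on `treesit-ready-p` keeps the sketch harmless in builds without tree-sitter; a real mode would more likely signal an error or fall back to a non-tree-sitter mode.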
diff --git a/lisp/treesit.el b/lisp/treesit.el
index 5df8eb70cbf..45626e77b99 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -3237,11 +3237,14 @@ The type can be `list' (the default) or `sexp'.
 
 The `list' type uses the `list' thing defined in `treesit-thing-settings'.
 See `treesit-thing-at-point'. With this type commands use syntax tables to
-navigate symbols and treesit definition to navigate lists.
+navigate symbols and treesit definitions to navigate lists.
 
 The `sexp' type uses the `sexp' thing defined in `treesit-thing-settings'.
-With this type commands use only the treesit definition of parser nodes,
-without distinction between symbols and lists."
+With this type commands use only the treesit definitions of parser nodes,
+without distinction between symbols and lists. Since tree-sitter grammars
+could group node types in arbitrary ways, navigation by `sexp' might not
+match your expectations, and might produce different results in differnt
+treesit-based modes."
   (interactive "p")
   (if (not (treesit-thing-defined-p 'list (treesit-language-at (point))))
       (user-error "No `list' thing is defined in `treesit-thing-settings'")
@@ -3630,14 +3633,15 @@ predicate as described in `treesit-thing-settings'."
   (treesit--thing-sibling pos thing nil))
 
 (defun treesit-thing-at (pos thing &optional strict)
-  "Return the smallest THING enclosing POS.
+  "Return the smallest node enclosing POS for THING.
 
-The returned node, if non-nil, must enclose POS, i.e., its start
-<= POS, its end > POS. If STRICT is non-nil, the returned node's
-start must < POS rather than <= POS.
+The returned node, if non-nil, must enclose POS, i.e., its
+start <= POS, its end > POS. If STRICT is non-nil, the returned
+node's start must be < POS rather than <= POS.
 
-THING should be a thing defined in `treesit-thing-settings', or
-it can be a predicate described in `treesit-thing-settings'."
+THING should be a thing defined in `treesit-thing-settings' for
+the current buffer's major mode, or it can be a predicate
+described in `treesit-thing-settings'."
   (let* ((cursor (treesit-node-at pos))
          (iter-pred (lambda (node)
                       (and (treesit-node-match-p node thing t)
@@ -3789,13 +3793,14 @@ function is called recursively."
     (if (eq counter 0) pos nil)))
 
 (defun treesit-thing-at-point (thing tactic)
-  "Return the THING at point, or nil if none is found.
+  "Return the node for THING at point, or nil if no THING is found at point.
 
 THING can be a symbol, a regexp, a predicate function, and more;
-see `treesit-thing-settings' for details.
+for details, see `treesit-thing-settings' as defined by the
+current buffer's major mode.
 
-Return the top-level THING if TACTIC is `top-level'; return the
-smallest enclosing THING as POS if TACTIC is `nested'."
+Return the top-level node for THING if TACTIC is `top-level'; return
+the smallest node enclosing THING at point if TACTIC is `nested'."
 
   (let ((node (treesit-thing-at (point) thing)))
     (if (eq tactic 'top-level)
diff --git a/src/treesit.c b/src/treesit.c
index de74e41c89a..67dd2ee3a7a 100644
--- a/src/treesit.c
+++ b/src/treesit.c
@@ -5193,13 +5193,16 @@ then in the system default locations for dynamic libraries, in that order. */);
 	       doc:
 	       /* A list defining things.
 
-The value should be an alist of (LANGUAGE . DEFINITIONS), where
-LANGUAGE is a language symbol, and DEFINITIONS is a list of
+The value should be defined by the major mode, and should be an alist
+of the form (LANGUAGE . DEFINITIONS), where LANGUAGE is a language
+symbol and DEFINITIONS is a list whose elements are of the form
 
     (THING PRED)
 
-THING is a symbol representing the thing, like `defun', `sexp', or
-`sentence'; PRED defines what kind of node can be qualified as THING.
+THING is a symbol representing the thing, like `defun', `defclass',
+`sexp', `sentence', `comment', or any other symbol that is meaningful
+for the major mode; PRED defines what kind of node can be qualified
+as THING.
 
 PRED can be a regexp string that matches the type of the node; it can
 be a predicate function that takes the node as the sole argument and
@@ -5207,12 +5210,13 @@ returns t if the node is the thing, and nil otherwise; it can be a
 cons (REGEXP . FN), which is a combination of a regexp and a predicate
 function, and the node has to match both to qualify as the thing.
 
-PRED can also be recursively defined. It can be (or PRED...), meaning
-satisfying anyone of the inner PREDs qualifies the node; or (and
-PRED...) meaning satisfying all of the inner PREDs qualifies the node;
-or (not PRED), meaning not satisfying the inner PRED qualifies the node.
+PRED can also be recursively defined. It can be:
 
-There are two pre-defined predicates, `named' and `anonymous`. They
+  (or PRED...), meaning satisfying any of the inner PREDs qualifies the node;
+  (and PRED...) meaning satisfying all of the inner PREDs qualifies the node;
+  (not PRED), meaning not satisfying the inner PRED qualifies the node.
+
+There are two pre-defined predicates, `named' and `anonymous'. They
 match named nodes and anonymous nodes, respectively.
 
 Finally, PRED can refer to other THINGs defined in this list by using