| author | Mattias Engdegård | 2019-09-25 14:29:50 -0700 |
|---|---|---|
| committer | Paul Eggert | 2019-09-25 14:29:50 -0700 |
| commit | 07367e5b95fe31f3d4e994b42b081075501b9b60 (patch) | |
| tree | 7d26251a300462083d971aa3aa9880cc23c423a1 /doc/lispref | |
| parent | 2ed71227c626c6cfdc684948644ccf3d9eaeb15b (diff) | |
Add rx extension mechanism
Add a built-in set of extension macros: `rx-define', `rx-let' and
`rx-let-eval'.
* lisp/emacs-lisp/rx.el (rx-constituents, rx-to-string): Doc updates.
(rx--builtin-symbols, rx--builtin-names, rx--local-definitions)
(rx--lookup-def, rx--substitute, rx--expand-template)
(rx--make-binding, rx--make-named-binding, rx--extend-local-defs)
(rx-let-eval, rx-let, rx-define): New.
(rx--translate-symbol, rx--translate-form): Use extensions if any.
(rx): Use local definitions.
* test/lisp/emacs-lisp/rx-tests.el (rx-let, rx-define)
(rx-to-string-define, rx-let-define, rx-let-eval): New.
* etc/NEWS (Changes in Specialized Modes and Packages):
* doc/lispref/searching.texi (Rx Notation, Rx Functions, Extending Rx):
Add node about rx extensions.
Diffstat (limited to 'doc/lispref')
| -rw-r--r-- | doc/lispref/searching.texi | 157 |
1 files changed, 157 insertions, 0 deletions
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi index 2d94e5659de..a4b65334126 100644 --- a/doc/lispref/searching.texi +++ b/doc/lispref/searching.texi | |||
| @@ -1037,6 +1037,7 @@ customisation. | |||
| 1037 | @menu | 1037 | @menu |
| 1038 | * Rx Constructs:: Constructs valid in rx forms. | 1038 | * Rx Constructs:: Constructs valid in rx forms. |
| 1039 | * Rx Functions:: Functions and macros that use rx forms. | 1039 | * Rx Functions:: Functions and macros that use rx forms. |
| 1040 | * Extending Rx:: How to define your own rx forms. | ||
| 1040 | @end menu | 1041 | @end menu |
| 1041 | 1042 | ||
| 1042 | @node Rx Constructs | 1043 | @node Rx Constructs |
| @@ -1524,6 +1525,162 @@ must be string literals. | |||
| 1524 | 1525 | ||
| 1525 | The @code{pcase} macro can use @code{rx} expressions as patterns | 1526 | The @code{pcase} macro can use @code{rx} expressions as patterns |
| 1526 | directly; @pxref{rx in pcase}. | 1527 | directly; @pxref{rx in pcase}. |
| 1528 | |||
| 1529 | For mechanisms to add user-defined extensions to the @code{rx} | ||
| 1530 | notation, @pxref{Extending Rx}. | ||
| 1531 | |||
| 1532 | @node Extending Rx | ||
| 1533 | @subsubsection Defining new @code{rx} forms | ||
| 1534 | |||
| 1535 | The @code{rx} notation can be extended by defining new symbols and | ||
| 1536 | parametrised forms in terms of other @code{rx} expressions. This is | ||
| 1537 | handy for sharing parts between several regexps, and for making | ||
| 1538 | complex ones easier to build and understand by putting them together | ||
| 1539 | from smaller pieces. | ||
| 1540 | |||
| 1541 | For example, you could define @code{name} to mean | ||
| 1542 | @code{(one-or-more letter)}, and @code{(quoted @var{x})} to mean | ||
| 1543 | @code{(seq ?' @var{x} ?')} for any @var{x}. These forms could then be | ||
| 1544 | used in @code{rx} expressions like any other: @code{(rx (quoted name))} | ||
| 1545 | would match a nonempty sequence of letters inside single quotes. | ||
| 1546 | |||
| 1547 | The Lisp macros below provide different ways of binding names to | ||
| 1548 | definitions. Common to all of them are the following rules: | ||
| 1549 | |||
| 1550 | @itemize | ||
| 1551 | @item | ||
| 1552 | Built-in @code{rx} forms, like @code{digit} and @code{group}, cannot | ||
| 1553 | be redefined. | ||
| 1554 | |||
| 1555 | @item | ||
| 1556 | The definitions live in a name space of their own, separate from that | ||
| 1557 | of Lisp variables. There is thus no need to attach a suffix like | ||
| 1558 | @code{-regexp} to names; they cannot collide with anything else. | ||
| 1559 | |||
| 1560 | @item | ||
| 1561 | Definitions cannot refer to themselves recursively, directly or | ||
| 1562 | indirectly. If you find yourself needing this, you want a parser, not | ||
| 1563 | a regular expression. | ||
| 1564 | |||
| 1565 | @item | ||
| 1566 | Definitions are only ever expanded in calls to @code{rx} or | ||
| 1567 | @code{rx-to-string}, not merely by their presence in definition | ||
| 1568 | macros. This means that the order of definitions doesn't matter, even | ||
| 1569 | when they refer to each other, and that syntax errors only show up | ||
| 1570 | when they are used, not when they are defined. | ||
| 1571 | |||
| 1572 | @item | ||
| 1573 | User-defined forms are allowed wherever arbitrary @code{rx} | ||
| 1574 | expressions are expected; for example, in the body of a | ||
| 1575 | @code{zero-or-one} form, but not inside @code{any} or @code{category} | ||
| 1576 | forms. | ||
| 1577 | @end itemize | ||
| 1578 | |||
| 1579 | @defmac rx-define name [arglist] rx-form | ||
| 1580 | Define @var{name} globally in all subsequent calls to @code{rx} and | ||
| 1581 | @code{rx-to-string}. If @var{arglist} is absent, then @var{name} is | ||
| 1582 | defined as a plain symbol to be replaced with @var{rx-form}. Example: | ||
| 1583 | |||
| 1584 | @example | ||
| 1585 | @group | ||
| 1586 | (rx-define haskell-comment (seq "--" (zero-or-more nonl))) | ||
| 1587 | (rx haskell-comment) | ||
| 1588 | @result{} "--.*" | ||
| 1589 | @end group | ||
| 1590 | @end example | ||
| 1591 | |||
| 1592 | If @var{arglist} is present, it must be a list of zero or more | ||
| 1593 | argument names, and @var{name} is then defined as a parametrised form. | ||
| 1594 | When used in an @code{rx} expression as @code{(@var{name} @var{arg}@dots{})}, | ||
| 1595 | each @var{arg} will replace the corresponding argument name inside | ||
| 1596 | @var{rx-form}. | ||
| 1597 | |||
| 1598 | @var{arglist} may end in @code{&rest} and one final argument name, | ||
| 1599 | denoting a rest parameter. The rest parameter will expand to all | ||
| 1600 | extra actual argument values not matched by any other parameter in | ||
| 1601 | @var{arglist}, spliced into @var{rx-form} where it occurs. Example: | ||
| 1602 | |||
| 1603 | @example | ||
| 1604 | @group | ||
| 1605 | (rx-define moan (x y &rest r) (seq x (one-or-more y) r "!")) | ||
| 1606 | (rx (moan "MOO" "A" "MEE" "OW")) | ||
| 1607 | @result{} "MOOA+MEEOW!" | ||
| 1608 | @end group | ||
| 1609 | @end example | ||
| 1610 | |||
| 1611 | Since the definition is global, it is recommended to give @var{name} a | ||
| 1612 | package prefix to avoid name clashes with definitions elsewhere, as is | ||
| 1613 | usual when naming non-local variables and functions. | ||
| 1614 | @end defmac | ||
| 1615 | |||
| 1616 | @defmac rx-let (bindings@dots{}) body@dots{} | ||
| 1617 | Make the @code{rx} definitions in @var{bindings} available locally for | ||
| 1618 | @code{rx} macro invocations in @var{body}, which is then evaluated. | ||
| 1619 | |||
| 1620 | Each element of @var{bindings} is of the form | ||
| 1621 | @w{@code{(@var{name} [@var{arglist}] @var{rx-form})}}, where the parts | ||
| 1622 | have the same meaning as in @code{rx-define} above. Example: | ||
| 1623 | |||
| 1624 | @example | ||
| 1625 | @group | ||
| 1626 | (rx-let ((comma-separated (item) (seq item (0+ "," item))) | ||
| 1627 | (number (1+ digit)) | ||
| 1628 | (numbers (comma-separated number))) | ||
| 1629 | (re-search-forward (rx "(" numbers ")"))) | ||
| 1630 | @end group | ||
| 1631 | @end example | ||
| 1632 | |||
| 1633 | The definitions are only available during the macro-expansion of | ||
| 1634 | @var{body}, and are thus not present during execution of compiled | ||
| 1635 | code. | ||
| 1636 | |||
| 1637 | @code{rx-let} can be used not only inside a function, but also at top | ||
| 1638 | level to enclose global variable and function definitions that need | ||
| 1639 | to share a common set of @code{rx} forms. Since the names are local | ||
| 1640 | inside @var{body}, there is no need for any package prefixes. | ||
| 1641 | Example: | ||
| 1642 | |||
| 1643 | @example | ||
| 1644 | @group | ||
| 1645 | (rx-let ((phone-number (seq (opt ?+) (1+ (any digit ?-))))) | ||
| 1646 | (defun find-next-phone-number () | ||
| 1647 | (re-search-forward (rx phone-number))) | ||
| 1648 | (defun phone-number-p (string) | ||
| 1649 | (string-match-p (rx bos phone-number eos) string))) | ||
| 1650 | @end group | ||
| 1651 | @end example | ||
| 1652 | |||
| 1653 | The scope of the @code{rx-let} bindings is lexical, which means that | ||
| 1654 | they are not visible outside @var{body} itself, even in functions | ||
| 1655 | called from @var{body}. | ||
| 1656 | @end defmac | ||
| 1657 | |||
| 1658 | @defmac rx-let-eval bindings body@dots{} | ||
| 1659 | Evaluate @var{bindings} to a list of bindings as in @code{rx-let}, | ||
| 1660 | and evaluate @var{body} with those bindings in effect for calls | ||
| 1661 | to @code{rx-to-string}. | ||
| 1662 | |||
| 1663 | This macro is similar to @code{rx-let}, except that the @var{bindings} | ||
| 1664 | argument is evaluated (and thus needs to be quoted if it is a list | ||
| 1665 | literal), and the definitions are substituted at run time, which is | ||
| 1666 | required for @code{rx-to-string} to work. Example: | ||
| 1667 | |||
| 1668 | @example | ||
| 1669 | @group | ||
| 1670 | (rx-let-eval | ||
| 1671 | '((ponder (x) (seq "Where have all the " x " gone?"))) | ||
| 1672 | (looking-at (rx-to-string | ||
| 1673 | '(ponder (or "flowers" "young girls" | ||
| 1674 | "left socks"))))) | ||
| 1675 | @end group | ||
| 1676 | @end example | ||
| 1677 | |||
| 1678 | Another difference from @code{rx-let} is that the @var{bindings} are | ||
| 1679 | dynamically scoped, and thus also available in functions called from | ||
| 1680 | @var{body}. However, they are not visible inside functions defined in | ||
| 1681 | @var{body}. | ||
| 1682 | @end defmac | ||
| 1683 | |||
| 1527 | @end ifnottex | 1684 | @end ifnottex |
| 1528 | 1685 | ||
| 1529 | @node Regexp Functions | 1686 | @node Regexp Functions |
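
The three macros documented in the patch above can be tried side by side. The following is a minimal Emacs Lisp sketch, not part of the commit, and it assumes Emacs 27 or later, where `rx-define`, `rx-let` and `rx-let-eval` are available. Every name starting with `my-doc-`, as well as the local forms `csv` and `rep`, is hypothetical and chosen only for illustration.

```elisp
;; Sketch only; assumes Emacs 27 or later.  All `my-doc-' names and the
;; local forms `csv' and `rep' are hypothetical.
(require 'rx)

;; Global definitions: a plain symbol and a parametrised form.
(rx-define my-doc-name (one-or-more letter))
(rx-define my-doc-quoted (x) (seq ?' x ?'))

(defun my-doc-quoted-name-p (string)
  "Return non-nil if STRING is a nonempty run of letters in single quotes."
  (string-match-p (rx bos (my-doc-quoted my-doc-name) eos) string))

;; Local definitions: substituted while the body is macro-expanded.
(rx-let ((csv (item) (seq item (0+ "," item))))
  (defun my-doc-number-list-p (string)
    "Return non-nil if STRING is a comma-separated list of digit runs."
    (string-match-p (rx bos (csv (1+ digit)) eos) string)))

;; Run-time definitions: the bindings are data, consumed by `rx-to-string'.
(defun my-doc-repeat-regexp (word)
  "Return a regexp matching the string WORD repeated one or more times."
  (rx-let-eval '((rep (x) (1+ x)))
    (rx-to-string `(seq bos (rep ,word) eos) t)))
```

The global definitions carry a package-style prefix, as the patch recommends for `rx-define`, while the `rx-let` and `rx-let-eval` names stay short because they are not visible outside their bodies.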