path: root/test/src/regex-tests.el
Commit message | Author | Age | Files | Lines
* Rename src/regex.c to src/regex-emacs.c. | Paul Eggert | 2018-08-05 | 1 | -686/+0
    This is in preparation for using Gnulib regex for etags, to avoid
    collisions in include directives.

    * src/regex-emacs.c: Rename from src/regex.c.
    * src/regex-emacs.h: Rename from src/regex.h.  All uses changed.
    * test/src/regex-emacs-tests.el: Rename from test/src/regex-tests.el.
* Raise limit of regexp repetition (Bug#24914) | Noam Postavsky | 2018-01-26 | 1 | -0/+6
    * src/regex.h (RE_DUP_MAX): Raise limit to 2^16-1.
    * etc/NEWS: Announce it.
    * doc/lispref/searching.texi (Regexp Backslash): Document it.
    * test/src/regex-tests.el (regex-repeat-limit): Test it.

    * src/regex.h (reg_errcode_t): Add REG_ESIZEBR code.
    * src/regex.c (re_error_msgid): Add corresponding entry.
    (GET_INTERVAL_COUNT): Return it instead of the more generic REG_EBADBR
    when encountering a repetition greater than RE_DUP_MAX.

    * lisp/isearch.el (isearch-search): Don't convert errors starting with
    "Invalid" into "incomplete".  Such errors are not incomplete, in the
    sense that they cannot be corrected by appending more characters to
    the end of the regexp.  The affected error messages are:

    - REG_BADPAT "Invalid regular expression"
      - \\(?X:\\) where X is not a legal group number
      - \\_X where X is not < or >
    - REG_ECOLLATE "Invalid collation character"
      - There is no code to throw this.
    - REG_ECTYPE "Invalid character class name"
      - [[:foo:] where foo is not a valid class name
    - REG_ESUBREG "Invalid back reference"
      - \N where N is referenced before matching group N
    - REG_BADBR "Invalid content of \\{\\}"
      - \\{N,M\\} where N < 0, M < N, M or N larger than max
      - \\{NX where X is not a digit or backslash
      - \\{N\\X where X is not a }
    - REG_ERANGE "Invalid range end"
      - There is no code to throw this.
    - REG_BADRPT "Invalid preceding regular expression"
      - We never throw this.  It would usually indicate a "*" with no
        preceding regexp text, but Emacs allows that to match a literal
        "*".
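    The new limit can be probed from Lisp roughly as follows.  This is a
    minimal sketch, assuming an Emacs build that includes this change;
    `my-repeat-limit-probe' is a hypothetical helper, not part of Emacs
    or of regex-tests.el.  On such a build it is expected to return
    (0 nil invalid-regexp); older builds with a lower RE_DUP_MAX would
    signal an error on the first form instead.

```elisp
;; Sketch only: `my-repeat-limit-probe' is a hypothetical name.
(defun my-repeat-limit-probe ()
  "Probe the regexp repetition limit described in Bug#24914."
  (let ((limit (1- (expt 2 16))))       ; 65535, the raised RE_DUP_MAX
    (list
     ;; A count equal to the limit compiles and matches LIMIT copies of x.
     (string-match-p (format "\\`x\\{%d\\}" limit)
                     (make-string limit ?x))
     ;; One x too few: the regexp still compiles but does not match.
     (string-match-p (format "\\`x\\{%d\\}" limit)
                     (make-string (1- limit) ?x))
     ;; One past the limit is rejected when the regexp is compiled.
     (condition-case nil
         (string-match-p (format "\\`x\\{%d\\}" (1+ limit)) "x")
       (invalid-regexp 'invalid-regexp)))))
```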
* Update copyright year to 2018 | Paul Eggert | 2018-01-01 | 1 | -1/+1
    Run admin/update-copyright.
* ; Replace non-ascii quote characters in doc strings etc | Glenn Morris | 2017-12-20 | 1 | -2/+2
* Prefer HTTPS to FTP and HTTP in documentation | Paul Eggert | 2017-09-13 | 1 | -1/+1
    Most of this change is to boilerplate commentary such as license URLs.
    This change was prompted by ftp://ftp.gnu.org's going-away party,
    planned for November.  Change these FTP URLs to https://ftp.gnu.org
    instead.  Make similar changes for URLs to other organizations moving
    away from FTP.  Also, change HTTP to HTTPS for URLs to gnu.org and
    fsf.org when this works, as this will further help defend against
    man-in-the-middle attacks (for this part I omitted the MS-DOS and
    MS-Windows sources and the test tarballs to keep the workload down).
    HTTPS is not fully working to lists.gnu.org so I left those URLs alone
    for now.
* Quieten compilation of some test files | Glenn Morris | 2017-05-31 | 1 | -1/+1
    * test/lisp/dired-tests.el (dired-test-bug25609): Mark unused args.
    * test/src/data-tests.el (binding-test-set-constant-t)
    (binding-test-set-constant-nil, binding-test-set-constant-keyword)
    (binding-test-set-constant-nil): Silence compiler.
    * test/src/regex-tests.el (regex-tests-BOOST): Escape char literal.
* Add support for Unicode whitespace in [:blank:] | Philipp Stephani | 2017-01-06 | 1 | -1/+1
    See Bug#25366.

    * src/character.c (blankp): New function for checking Unicode
    horizontal whitespace.
    * src/regex.c (ISBLANK): Use 'blankp' for non-ASCII horizontal
    whitespace.
    (BIT_BLANK): New bit for range table.
    (re_wctype_to_bit, execute_charset): Use it.
    * test/lisp/subr-tests.el (subr-tests--string-match-p--blank): Add
    unit test for [:blank:] character class.
    * test/src/regex-tests.el (test): Adapt unit test.
    * doc/lispref/searching.texi (Char Classes): Document new Unicode
    behavior for [:blank:].
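    The behaviour described here can be checked with a small predicate.
    This is a hedged sketch, not the unit test from subr-tests.el;
    `my-blank-p' is a hypothetical helper.  It assumes a build that
    includes this change, where a non-ASCII horizontal space such as
    U+3000 IDEOGRAPHIC SPACE is expected to match [[:blank:]] just like
    an ASCII space or tab.

```elisp
;; Sketch only: `my-blank-p' is a hypothetical name.
(defun my-blank-p (char)
  "Return t if CHAR on its own matches the [[:blank:]] class."
  (and (string-match-p "\\`[[:blank:]]\\'" (string char)) t))

(list (my-blank-p ?\s)      ; ASCII space
      (my-blank-p ?\t)      ; ASCII tab
      (my-blank-p #x3000)   ; IDEOGRAPHIC SPACE, non-ASCII horizontal space
      (my-blank-p ?\n))     ; newline is not horizontal whitespace
```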
* Update copyright year to 2017 in master | Paul Eggert | 2017-01-01 | 1 | -1/+1
    Run admin/update-copyright in the master branch.  This fixes files
    that were not already fixed in the emacs-25 branch before it was
    merged here.
* Split regex character class test into smaller chunks | Michal Nazarewicz | 2016-09-09 | 1 | -44/+46
    With one test for all character classes, it is not always trivial to
    determine which class is failing.  This happens when the failure is
    caused by ‘(should (equal (point) (point-max)))’ not being met.

    With per-character-class tests, it is immediately obvious which test
    causes issues, and tests for all classes are run even if some of them
    fail.

    * test/src/regex-tests.el (regex-character-classes): Delete and split
    into…
    (regex-tests-alnum-character-class,
    regex-tests-alpha-character-class,
    regex-tests-ascii-character-class,
    regex-tests-blank-character-class,
    regex-tests-cntrl-character-class,
    regex-tests-digit-character-class,
    regex-tests-graph-character-class,
    regex-tests-lower-character-class,
    regex-tests-multibyte-character-class,
    regex-tests-nonascii-character-class,
    regex-tests-print-character-class,
    regex-tests-punct-character-class,
    regex-tests-space-character-class,
    regex-tests-unibyte-character-class,
    regex-tests-upper-character-class,
    regex-tests-word-character-class,
    regex-tests-xdigit-character-class): …new tests.
* Spelling and minor grammar fixes | Paul Eggert | 2016-08-05 | 1 | -5/+5
    * test/file-organization.org: Rename from test/file-organisation.org.
* Fix accessing regex-resources in out-of-tree test runs in regex-tests | Michal Nazarewicz | 2016-08-03 | 1 | -1/+6
    [82a487d: Fix reading of regex-resources in regex-tests] attempted to
    fix regex-tests failing when run from the source tree (i.e. via make)
    by hard-coding the path to the regex-resources directory relative to
    the test directory.

    This fixed runs from the tree but broke the test when run using other
    methods.  Fix by trying ‘load-file-name’ or ‘buffer-file-name’,
    whichever is set.

    * test/src/regex-tests.el (regex-tests--resources-dir): New variable
    storing path to the regex-resources directory.
    (regex-tests-generic-line): Use aforementioned variable.
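    The lookup described above can be sketched as follows.  This is an
    illustration under stated assumptions, not the code from
    regex-tests.el: `my-resources-dir' is a hypothetical helper, and the
    fallback to `default-directory' when neither variable is set is an
    addition made only so the sketch can be evaluated anywhere.

```elisp
;; Sketch only: `my-resources-dir' is a hypothetical name.
(defun my-resources-dir (&optional subdir)
  "Return the directory SUBDIR located next to the current file.
SUBDIR defaults to \"regex-resources\"."
  ;; `load-file-name' is set while a file is being loaded (e.g. by make);
  ;; `buffer-file-name' is set when evaluating from a visited buffer.
  (let ((this-file (or load-file-name buffer-file-name)))
    (file-name-as-directory
     (expand-file-name (or subdir "regex-resources")
                       (if this-file
                           (file-name-directory this-file)
                         ;; Assumed fallback, not in the original commit.
                         default-directory)))))
```

    Resolving the path relative to whichever variable is set avoids both
    the (wrong-type-argument stringp nil) failure in batch runs and the
    hard-coded path that broke out-of-tree runs.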
* Fix unused lexical variable | Michal Nazarewicz | 2016-08-02 | 1 | -3/+2
    This fixes the following warning:

        In toplevel form:
        src/regex-tests.el:416:1:Warning: Unused lexical variable ‘newline’

    * test/src/regex-tests.el (regex-tests-BOOST): Remove unused lexical
    variable.
* Split regex glibc test cases into separate tests | Michal Nazarewicz | 2016-08-02 | 1 | -6/+18
    * test/src/regex-tests.el (regex-tests): Remove and split into
    multiple test cases.
    (regex-tests-glbic-BOOST, regex-tests-glibc-PCRE,
    regex-tests-glibc-PTESTS, regex-tests-glibc-TESTS): New test cases
    split from ‘regex-tests’.
* Don’t (require 'cl) | Michal Nazarewicz | 2016-08-02 | 1 | -4/+3
    * test/src/regex-test.el: Don’t (require 'cl).
    (regex-tests-PCRE): s/loop/cl-loop/
* Fix reading of regex-resources in regex-tests | Michal Nazarewicz | 2016-08-02 | 1 | -6/+5
    * test/src/regex-tests.el (regex-tests-generic-line): Referring to
    ‘buffer-file-name’ does not work when running the test from the
    command line, i.e. via make, which results in
    (wrong-type-argument stringp nil) failures.  Replace it with a
    hard-coded path.
    (regex-tests-BOOST, regex-tests-PCRE, regex-tests-PTESTS-whitelist,
    regex-tests-TESTS-whitelist): ‘regex-tests-generic-line’ now includes
    the ‘regex-resources’ path component so the tests don’t need to
    specify it explicitly.
* Added driver for the regex tests | Dima Kogan | 2016-08-02 | 1 | -0/+572
    * test/src/regex-tests.el (regex-tests): Test executing glibc test
    cases.

    [mina86@mina86.com: merged test with existing file]
* Fix ‘[[:cc:]]*literal’ regex failing to match ‘literal’ (bug#24020) | Michal Nazarewicz | 2016-07-25 | 1 | -0/+92
    The regex engine tries to optimise Kleene star by avoiding
    backtracking when it can detect that the star’s operand cannot match
    what follows it in the pattern.

    For example, when ‘[[:alpha:]]*1’ tries to match ‘foo’, the engine
    will test the longest match for ‘[[:alpha:]]*’, namely ‘foo’, which
    is the entire string.  The literal digit one still present in the
    pattern will, however, not match the remaining empty string.

    Normally, backtracking would be performed, trying a shorter match for
    the character class (namely ‘fo’, leaving ‘o’ in the string), but
    since the engine knows that whatever would be put back into the
    string cannot possibly match the literal digit one, no backtracking
    is attempted.

    In regexes of the form ‘[[:CC:]]*X’, the optimisation can be applied
    if the character class CC does not match character X.  In the above
    example, this holds because digit one is not in the alpha character
    class.

    This test is performed by the mutually_exclusive_p function, but it
    did not check the class bits of a charset opcode.  This resulted in
    an assumption that character classes do not match multibyte
    characters.  For example, it would incorrectly conclude that
    [[:alpha:]] doesn’t match ‘ż’.

    This, in turn, led to the aforementioned Kleene star optimisation
    being incorrectly applied in patterns such as ‘[[:graph:]]*☠’, which
    should match ‘☠’ but doesn’t.  This can be tested by executing
    (string-match-p "[[:graph:]]*☠" "☠"), which should return 0 but
    instead yields nil.

    This issue affects any class which matches multibyte characters, i.e.
    if ‘[[:cc:]]’ matches a multibyte character X then ‘[[:cc:]]*X’ will
    fail to match ‘X’.

    * src/regex.c (execute_charset): A new function for executing the
    charset and charset_not opcodes.  It performs a check on the
    character, taking into consideration the existing bitmap, range table
    and class bits.  It also advances the pointer in the regex bytecode
    past the parsed opcode.
    (CHARSET_LOOKUP_RANGE_TABLE_RAW, CHARSET_LOOKUP_RANGE_TABLE): Removed.
    Code now included in execute_charset.
    (mutually_exclusive_p, re_match_2_internal): Changed to take
    advantage of the execute_charset function.
    * test/src/regex-tests.el: New file with tests for the character
    class matching.
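    The cases discussed in this message can be collected into one form.
    This is a minimal sketch, assuming a build that contains the fix;
    `my-bug24020-checks' is a hypothetical helper and not one of the
    tests added to regex-tests.el.  On a fixed build the three forms are
    expected to evaluate to nil, 0 and 0 respectively; before the fix the
    last two would have yielded nil.

```elisp
;; Sketch only: `my-bug24020-checks' is a hypothetical name.
(defun my-bug24020-checks ()
  "Return the results of the matches discussed in bug#24020."
  (list
   ;; The optimisation is valid here: digit one is not in [:alpha:], so
   ;; the match fails with or without backtracking.
   (string-match-p "[[:alpha:]]*1" "foo")
   ;; ‘ż’ is in [:alpha:], so the star must give it back to the literal.
   (string-match-p "[[:alpha:]]*ż" "ż")
   ;; The example from the commit message: ‘☠’ is in [:graph:].
   (string-match-p "[[:graph:]]*☠" "☠")))
```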