aboutsummaryrefslogtreecommitdiffstats
path: root/src/regex.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* ; Don't use non-ASCII quotes in commentsEli Zaretskii2017-07-301-4/+4
| | | | | | * src/regex.h: * src/regex.c (re_wctype_parse): Don't use non-ASCII quotes in comments.
* Convert hex digits more systematicallyPaul Eggert2017-07-051-3/+1
| | | | | | | | | | | | | | | | | This makes the code a bit smaller and presumably faster, as it substitutes a single lookup for conditional jumps. * src/character.c (hexdigit): New constant. (syms_of_character) [HEXDIGIT_IS_CONST]: Initialize it. * src/character.h (HEXDIGIT_CONST, HEXDIGIT_IS_CONST): New macros. (hexdigit): New decl. (char_hexdigit): New inline function. * src/charset.c: Do not include c-ctype.h. * src/charset.c (read_hex): * src/editfns.c (styled_format): * src/image.c (xbm_scan): * src/lread.c (read_escape): * src/regex.c (ISXDIGIT) [emacs]: Use char_hexdigit insted of doing it by hand.
* Pacify GCC 7 with --enable-gcc-warningsPaul Eggert2017-05-161-1/+1
| | | | | * src/regex.c (regex_compile): Swap labels, so that the FALLTHROUGH immediately precedes the case label.
* Merge with gnulib, pacifying GCC 7Paul Eggert2017-05-161-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | This incorporates: 2017-05-16 manywarnings: update for GCC 7 2017-05-15 sys_select: Avoid "was expanded before it was required" * configure.ac (nw): Suppress GCC 7’s new -Wduplicated-branches and -Wformat-overflow=2 options, due to too many false alarms. * doc/misc/texinfo.tex, lib/strftime.c, m4/manywarnings.m4: Copy from gnulib. * m4/gnulib-comp.m4: Regenerate. * src/coding.c (decode_coding_iso_2022): Fix bug uncovered by -Wimplicit-fallthrough. * src/conf_post.h (FALLTHROUGH): New macro. Use it to mark all switch cases that fall through. * src/editfns.c (styled_format): Use !, not ~, on bool. * src/gtkutil.c (xg_check_special_colors): When using sprintf, don’t trust Gtk to output colors in [0, 1] range. (xg_update_scrollbar_pos): Avoid use of possibly-uninitialized bool; this bug was actually caught by Clang. * src/search.c (boyer_moore): Tell GCC that CHAR_BASE, if nonzero, must be a non-ASCII character. * src/xterm.c (x_draw_glyphless_glyph_string_foreground): Tell GCC that glyph->u.glyphless.ch must be a character.
* Use 'char *FOO' instead of 'char* FOO'Paul Eggert2017-02-181-13/+13
|
* Remove immediate_quit.Paul Eggert2017-02-011-6/+4
| | | | | | | | | The old code that sets and clears immediate_quit was ineffective except when Emacs is running in terminal mode, and has problematic race conditions anyway, so remove it. This will introduce some hangs when Emacs runs in terminal mode, and these hangs should be fixed in followup patches. * src/keyboard.c (immediate_quit): Remove. All uses removed.
* Replace QUIT with maybe_quitPaul Eggert2017-01-251-5/+2
| | | | | | | | There’s no longer need to have QUIT stand for a slug of C statements. Use the more-obvious function-call syntax instead. Also, use true and false when setting immediate_quit. These changes should not affect the generated machine code. * src/lisp.h (QUIT): Remove. All uses replaced by maybe_quit.
* Use expanded stack during regex matchesNoam Postavsky2017-01-081-30/+43
| | | | | | | | | | | | | | | | | | | | | | | While the stack is increased in main(), to allow the regex stack allocation to use alloca we also need to modify regex.c to actually take advantage of the increased stack, and not limit stack allocations to SAFE_ALLOCA bytes. * src/regex.c (MATCH_MAY_ALLOCATE): Remove obsolete comment about allocations in signal handlers which no longer happens and correct description about when and why MATCH_MAY_ALLOCATE should be defined. (emacs_re_safe_alloca): New variable. (REGEX_USE_SAFE_ALLOCA): Use it as the limit of stack allocation instead of MAX_ALLOCA. (emacs_re_max_failures): Rename from `re_max_failures' to avoid confusion with glibc's `re_max_failures'. * src/emacs.c (main): Increase the amount of fixed 'extra' bytes we add to the stack. Instead of changing emacs_re_max_failures based on the new stack size, just change emacs_re_safe_alloca; emacs_re_max_failures remains constant regardless, since if we run out stack space SAFE_ALLOCA will fall back to heap allocation. Co-authored-by: Eli Zaretskii <eliz@gnu.org>
* Fix computation of regex stack limitNoam Postavsky2017-01-081-9/+6
| | | | | | | | | | | The regex stack limit was being computed as the number of stack entries, whereas it was being compared with the current size as measured in bytes. This could cause indefinite looping when nearing the stack limit if re_max_failures happened not to be a multiple of sizeof fail_stack_elt_t (Bug #24751). * src/regex.c (GROW_FAIL_STACK): Compute both current stack size and limit as numbers of stack entries.
* Add support for Unicode whitespace in [:blank:]Philipp Stephani2017-01-061-4/+8
| | | | | | | | | | | | | | | | See Bug#25366. * src/character.c (blankp): New function for checking Unicode horizontal whitespace. * src/regex.c (ISBLANK): Use 'blankp' for non-ASCII horizontal whitespace. (BIT_BLANK): New bit for range table. (re_wctype_to_bit, execute_charset): Use it. * test/lisp/subr-tests.el (subr-tests--string-match-p--blank): Add unit test for [:blank:] character class. * test/src/regex-tests.el (test): Adapt unit test. * doc/lispref/searching.texi (Char Classes): Document new Unicode behavior for [:blank:].
* Merge from origin/emacs-25Paul Eggert2017-01-011-1/+1
|\ | | | | | | | | 2e2a806 Fix copyright years by hand 5badc81 Update copyright year to 2017
| * Update copyright year to 2017Paul Eggert2016-12-311-1/+1
| | | | | | | | Run admin/update-copyright.
* | Documentation and commentary improvementsEli Zaretskii2016-12-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | * src/lisp.h: * src/regex.c: * src/xgselect.c (xg_select): Improve commentary and formatting. * doc/lispref/objects.texi (Thread Type, Mutex Type) (Condition Variable Type): New subsections. (Type Predicates): Add thread-related predicates. * doc/lispref/objects.texi (Editing Types): * doc/lispref/elisp.texi (Top): Update higher-level menus.
* | Fix compilation problems.Eli Zaretskii2016-12-051-14/+0
| |
* | Merge branch 'concurrency'Eli Zaretskii2016-12-041-7/+14
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts (resolved): configure.ac src/Makefile.in src/alloc.c src/bytecode.c src/emacs.c src/eval.c src/lisp.h src/process.c src/regex.c src/regex.h
| * \ merge from trunkKen Raeburn2015-11-011-95/+104
| |\ \
| * \ \ merge from trunkTom Tromey2013-08-191-5/+4
| |\ \ \
| * \ \ \ Merge from trunkTom Tromey2013-07-121-14/+19
| |\ \ \ \
| * \ \ \ \ merge from trunk; clean up some issuesTom Tromey2013-06-031-206/+192
| |\ \ \ \ \
| * \ \ \ \ \ merge from trunkTom Tromey2013-01-051-4/+2
| |\ \ \ \ \ \
| * \ \ \ \ \ \ merge from trunkTom Tromey2012-12-171-6/+4
| |\ \ \ \ \ \ \
| * | | | | | | | This introduces a thread-state object and moves various C globalsTom Tromey2012-08-151-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | there. It also introduces #defines for these globals to avoid a monster patch. The #defines mean that this patch also has to rename a few fields whose names clash with the defines. There is currently just a single "thread"; so this patch does not impact Emacs behavior in any significant way.
* | | | | | | | | Don't access pointers to freed storage in regex.cPaul Eggert2016-11-261-36/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove __BOUNDED_POINTERS__ code, which does not work with -fcheck-pointer-bound and which has undefined behavior anyway. Problem found when trying to port to gcc -fcheck-pointer-bounds. (This code was removed from glibc and gnulib regex.c many years ago.) * src/regex.c (ELSE_EXTEND_BUFFER_HIGH_BOUND): Remove. (EXTEND_BUFFER): Use a more-portable approach that avoids undefined behavior due to inspecting pointers to freed storage.
* | | | | | | | | Merge from origin/emacs-25Paul Eggert2016-11-041-73/+0
|\ \ \ \ \ \ \ \ \ | | |_|_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dbb3410 python.el: Fix detection of native completion in Python 3 (bu... 91c97b6 * Makefile.in (install-arch-indep): Skip etc/refcards/emacsve... 9c1cb8d * lisp/subr.el (set-transient-map): Exit for unbound events (... 9c247d2 Update category-table for Chinese characters 43986d1 Inhibit buffer relocation during regex searches fee4cef Revert fixes to allocation of regex matching
| * | | | | | | | Revert fixes to allocation of regex matchingNoam Postavsky2016-10-251-73/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fix was not complete, and completing it was proving too complicated. - Revert "* src/regex.c (re_search_2): Make new code safe for -Wjump-misses-init." This reverts commit c2a17924a57483d14692c8913edbe8ad24b5ffbb. - Revert "Port to GCC 6.2.1 + --enable-gcc-warnings" This reverts commit f6134bbda259c115c06d4a9a3ab5c39340a15949. - Revert "Fix handling of allocation in regex matching" This reverts commit ad66b3fadb7ae22a4cbb82bb1507c39ceadf3897. - Revert "Fix handling of buffer relocation in regex.c functions" This reverts commit ee04aedc723b035eedaf975422d4eb242894121b.
* | | | | | | | | * src/regex.c (re_search_2): Use UNINIT, not IF_LINT.Paul Eggert2016-10-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This finishes the merge of the recent emacs-25 regex changes into master.
* | | | | | | | | Merge from origin/emacs-25Paul Eggert2016-10-231-2/+77
|\ \ \ \ \ \ \ \ \ | |/ / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 50fa7d6 ;* src/w32heap.c: Fix typo and wording of the comments. 6f1325e electric-quote mode no longer worries about coding c2a1792 * src/regex.c (re_search_2): Make new code safe for -Wjump-mi... f6134bb Port to GCC 6.2.1 + --enable-gcc-warnings b2ba630 Explain how to debug emacsclient lisp errors 9da53e2 Let describe-function work for lambda again 5c2da93 Fix kill-line's docstring ad66b3f Fix handling of allocation in regex matching 5a26c9b * lisp/electric.el (electric-quote-mode): Improve doc (Bug#24... 3877c91 vc-region-history: Search just on lines intersecting the region 8988327 Fix documentation of 'alist-get' b6998ea * src/regex.h (re_match_object): Improve commentary. # Conflicts: # etc/NEWS # lisp/help-fns.el
| * | | | | | | | * src/regex.c (re_search_2): Make new code safe for -Wjump-misses-init.Paul Eggert2016-10-231-3/+6
| | | | | | | | |
| * | | | | | | | Port to GCC 6.2.1 + --enable-gcc-warningsPaul Eggert2016-10-221-17/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * src/regex.c (ENSURE_FAIL_STACK, re_search_2): Redo recent regex changes to avoid complaints from GCC 6.2.1 when Emacs is configured with --enable-gcc-warnings. Also, work around GCC bug 78081, which was uncovered by this new code.
| * | | | | | | | Fix handling of allocation in regex matchingNoam Postavsky2016-10-211-3/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `re_match_2_internal' uses pointers to the lisp objects that it searches. Since it may call malloc when growing the "fail stack", these pointers may be invalidated while searching, resulting in memory curruption (Bug #24358). To fix this, we check the pointer that the lisp object (as specified by re_match_object) points to before and after growing the stack, and update existing pointers accordingly. * src/regex.c (STR_BASE_PTR): New macro. (ENSURE_FAIL_STACK, re_search_2): Use it to convert pointers into offsets before possible malloc call, and back into pointers again afterwards. (POS_AS_IN_BUFFER): Add explanatory comment about punning trick. * src/search.c (search_buffer): Instead of storing search location as pointers, store them as pointers and recompute the corresponding address for each call to `re_search_2'. (string_match_1, fast_string_match_internal, fast_looking_at): * src/dired.c (directory_files_internal): Set `re_match_object' to Qnil after calling `re_search' or `re_match_2'. * src/regex.h (re_match_object): Mention new usage in commentary.
* | | | | | | | | Limit <config.h>’s includesPaul Eggert2016-09-301-14/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This follows up on recent problems with the fact that config.h includes stdlib.h etc.; some files need to include stdlib.h later. config.h generally should limit itself to includes that are universally safe; outside of MS-Windows, only stdbool.h makes the cut among the files currently included. So, move the other includes to just the files that need them (Bug#24506). * configure.ac (config_opsysfile): Remove, as this generic hook is no longer needed. * lib-src/etags.c, src/unexmacosx.c, src/w32.c, src/w32notify.c: * src/w32proc.c (_GNU_SOURCE): Remove, as it’s OK for config.h to do this now. * src/conf_post.h: Include <ms-w32.h>, instead of the generic config_opsysfile, for simplicity as this old way of configuring is now done only for the MS-Windows port. Do not include <ms-w32.h> if DEFER_MS_W32_H, for the benefit of the few files that want its effects later. Do not include <alloca.h>, <string.h>, or <stdlib.h>. Other files modified to include these headers as needed, or to not include headers that are no longer needed. * src/lisp.h: Include <alloca.h> and <string.h> here, since some of the inline functions need them. * src/regex.c: Include <alloca.h> if not emacs. (If emacs, we can rely on SAFE_ALLOCA.) There is no longer any need to worry about HAVE_ALLOCA_H. * src/unexmacosx.c: Rely on config.h not including stdlib.h. * src/w32.c, src/w32notify.c, src/w32proc.c (DEFER_MS_W32_H): Define before including <config.h> first, and include <ms-w32.h> after the troublesome headers.
* | | | | | | | | Remove dead loop iterations in regex.cMichal Nazarewicz2016-09-091-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RE_CHAR_TO_MULTIBYTE(c) yields c for ASCII characters and a byte8 character for c ≥ 0x80. Furthermore, CHAR_BYTE8_P(c) is true only for byte8 characters. This means that c = RE_CHAR_TO_MULTIBYTE (ch); if (! CHAR_BYTE8_P (c) && re_iswctype (c, cc)) is equivalent to: c = c; if (! false && re_iswctype (c, cc)) for 0 ⪬ c < 0x80, and c = BYTE8_TO_CHAR (c); if (! true && re_iswctype (c, cc)) for 0x80 ⪬ c < 0x100. In other words, the loop never executes for c ≥ 0x80 and RE_CHAR_TO_MULTIBYTE call is unnecessary for c < 0x80. * src/regex.c (regex_compile): Simplyfy a for loop by eliminating dead iterations and unnecessary macro calls.
* | | | | | | | | Replace decimalnump with alphanumericpMichal Nazarewicz2016-09-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | decimalnump was used in regex.c only in ISALNUM macro which ored it with alphabeticp. Because both of those functions require Unicode general category lookup, this resulted in unnecessary lookups (if alphabeticp return false decimalp had to perform another lookup). Drop decimalnump in favour of alphanumericp which combines decimelnump with alphabeticp. * src/character.c (decimalnump): Remove in favour of… (alphanumericp): …new function. * src/regex.c (ISALNUM): Use alphanumericp.
* | | | | | | | | Remove inaccurate comment in regex.cMichal Nazarewicz2016-09-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * src/regex.c (regex_compile): Remove comment indicating that wctype of some character classes may be negative. All wctypes are in fact non-negative.
* | | | | | | | | Spelling and minor grammar fixesPaul Eggert2016-08-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * test/file-organization.org: Rename from test/file-organisation.org.
* | | | | | | | | Port regex changes to strict ISO CPaul Eggert2016-08-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * src/regex.c (regex_compile): Redo casts.
* | | | | | | | | Remove unused STREQ macroMichal Nazarewicz2016-08-021-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the following compilation warning: regex.c:516:0: warning: macro "STREQ" is not used [-Wunused-macros] #define STREQ(s1, s2) ((strcmp (s1, s2) == 0)) ^ * src/regex.c (STREQ): Remove unused macro. It should have been removed in a [4538a5e: Refactor regex character class parsing in [:name:]] commit but was mistakenly left out.
* | | | | | | | | Hardcode regex syntax to remove dead code handling different syntaxMichal Nazarewicz2016-08-021-9/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Emacs only ever uses its own regex syntax so support for other syntaxes is never used. Hardcode the syntax so that the compilar can detect such dead code and remove it from compiled code. The only exception is RE_NO_POSIX_BACKTRACKING which can be separatelly specified. Handle this separatelly with a function argument (replacing now unnecessary syntax argument). With this patchset, size of Emacs binary on x86_64 machine is reduced by around 60 kB: new-sizes:-rwx------ 3 mpn eng 30254720 Jul 27 23:31 src/emacs old-sizes:-rwx------ 3 mpn eng 30314828 Jul 27 23:29 src/emacs * src/regex.h (re_pattern_buffer): Don’t define syntax field #ifdef emacs. (re_compile_pattern): Replace syntax with posix_backtracking argument. * src/regex.c (print_compiled_pattern): Don’t print syntax #ifdef emacs. (regex_compile): #ifdef emacs, replace syntax argument with posix_backtracking which is now used instead of testing for RE_NO_POSIX_BACKTRACKING syntax. (re_match_2_internal): Don’t access bufp->syntax #ifndef emacs. (re_compile_pattern): Replace syntax with posix_backtracking argument. * src/search.c (compile_pattern_1): Pass boolean posix_backtracking instead of syntax to re_compile_pattern.
* | | | | | | | | Get rid of re_set_whitespace_regexpMichal Nazarewicz2016-08-021-15/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * src/regex.h (re_set_whitespace_regexp): Delete. (re_compile_pattern): Add whitespace_regexp argument #ifdef emacs. * src/regex.c (re_set_whitespace_regexp, whitespace_regexp): Delete. (regex_compile): Add whitespace_regexp argument #ifdef emacs and wrap whitespace_regexp-related code in an #ifdef emacs so it’s compiled out unless building Emacs. (re_compile_pattern): Pass whitespace_regexp argument to regex_compile * src/search.c (compile_pattern_1): Don’t use re_set_whitespace_regexp, pass the regex as argument to re_compile_pattern instead.
* | | | | | | | | Get rid of re_set_syntaxMichal Nazarewicz2016-08-021-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a global variable for storing regex syntax, pass it to re_compile_pattern. This is only enabled when compiling Emacs (i.e. ‘#ifdef emacs’). * src/regex.h (re_set_syntax): Declare only #ifndef emacs. (re_compile_pattern): Now takes syntax argument #ifdef emacs. * src/regex.c (re_syntax_options): Define only #ifndef emacs. (re_compile_pattern): Use the new syntax argument #ifdef emacs. * src/search.c (compile_pattern_1): Don’t use re_set_syntax and instead pass syntax to re_compile_pattern directly.
* | | | | | | | | Remove dead opcodes in regex bytecodeMichal Nazarewicz2016-08-021-27/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no way to specify before_dot and after_dot opcodes in a regex so code handling those ends up being dead. Remove it. * src/regex.c (print_partial_compiled_pattern, regex_compile, analyze_first, re_match_2_internal): Remove handling and references to before_dot and after_dot opcodes.
* | | | | | | | | Refactor regex character class parsing in [:name:]Michal Nazarewicz2016-08-021-156/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | re_wctype function is used in three separate places and in all of those places almost exact code extracting the name from [:name:] surrounds it. Furthermore, re_wctype requires a NUL-terminated string, so the name of the character class is copied to a temporary buffer. The code duplication and unnecessary memory copying can be avoided by pushing the responsibility of parsing the whole [:name:] sequence to the function. Furthermore, since now the function has access to the length of the character class name (since it’s doing the parsing), it can take advantage of that information in skipping some string comparisons and using a constant-length memcmp instead of strcmp which needs to take care of NUL bytes. * src/regex.c (re_wctype): Delete function. Replace it with: (re_wctype_parse): New function which parses a whole [:name:] string and returns a RECC_* constant or -1 if the string is not of [:name:] format. (regex_compile): Use re_wctype_parse. * src/syntax.c (skip_chars): Use re_wctype_parse.
* | | | | | | | | Fix ‘is multibyte’ test regex.c’s mutually_exclusive_p (bug#24020)Michal Nazarewicz2016-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * src/regex.c (mutually_exclusive_p): Fix how whether character is unibyte is tested when calling execute_charset function. This bug has been introduced by [6dc6b00: Fix ‘[[:cc:]]*literal’ regex failing to match ‘literal’] which dropped a call to IS_REAL_ASCII (c) macro. Reinstitute it.
* | | | | | | | | Fix ‘[[:cc:]]*literal’ regex failing to match ‘literal’ (bug#24020)Michal Nazarewicz2016-07-251-116/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The regex engine tries to optimise Kleene star by avoiding backtracking when it can detect that star’s operand cannot match what follows it in the pattern. For example, when ‘[[:alpha:]]*1’ tries to match a ‘foo’, the engine will test the longest match for ‘[[:alpha:]]*’, namely ’foo’ which is the entire string. Literal digit one still present in the pattern will however not match the remaining empty string. Normally, backtracking would be performed trying a shorter match for the character class (namely ‘fo’ leaving ‘o’ in the string), but since the engine knows whatever would be put back into the string cannot possibly match literal digit one so no backtracking will be attempted. In the regexes of the form ‘[[:CC:]]*X’, the optimisation can be applied if the character class CC does not match character X. In the above example, this holds because digit one is not in alpha character class. This test is performed by mutually_exclusive_p function but it did not check class bits of a charset opcode. This resulted in an assumption that character classes do not match multibyte characters. For example, it would incorrectly conclude that [[:alpha:]] doesn’t match ‘ż’. This, in turn, led to the aforementioned Kleene star optimisation being incorrectly applied in patterns such as ‘[[:graph:]]*☠’ (which should match ‘☠’ but doesn’t as can be tested by executing (string-match-p "[[:graph:]]*☠" "☠") which should return 0 but instead yields nil. This issue affects any class witch matches multibyte characters, i.e. if ‘[[:cc:]]’ matches a multibyte character X then ‘[[:cc:]]*X’ will fail to match ‘X’. * src/regex.c (executing_charset): A new function for executing the charset and charset_not opcodes. It performs check on the character taking into consideration existing bitmap, range table and class bits. It also advances the pointer in the regex bytecode past the parsed opcode. (CHARSET_LOOKUP_RANGE_TABLE_RAW, CHARSET_LOOKUP_RANGE_TABLE): Removed. Code now included in executing_charset. (mutually_exclusive_p, re_match_2_internal): Changed to take advantage of executing_charset function. * test/src/regex-tests.el: New file with tests for the character class matching.
* | | | | | | | | Replace IF_LINT by NONVOLATILE and UNINITPaul Eggert2016-06-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inspired by a suggestion from RMS in: http://bugs.gnu.org/23640#58 * .dir-locals.el (c-mode): Adjust to macro changes. * src/conf_post.h (NONVOLATILE, UNINIT): New macros (Bug#23640). (IF_LINT): Remove. All uses replaced by the new macros.
* | | | | | | | | Omit IF_LINT code that no longer seems neededPaul Eggert2016-05-301-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nowadays GCC is smarter, or the Emacs code has mutated, or both, and now is as good a time as any to remove uses of IF_LINT that now seem to be unnecessary. * lib-src/emacsclient.c (set_local_socket): * lib-src/movemail.c (main) [MAIL_USE_MAILLOCK && HAVE_TOUCHLOCK]: * src/buffer.c (fix_start_end_in_overlays, fix_overlays_before): * src/casefiddle.c (casify_region): * src/charset.c (load_charset_map): * src/coding.c (decode_coding_object, encode_coding_object): * src/data.c (Fmake_variable_buffer_local, Fmake_local_variable) (cons_to_unsigned, cons_to_signed): * src/frame.c (make_frame, x_set_frame_parameters): * src/keyboard.c (read_event_from_main_queue): * src/regex.c (regex_compile): * src/syntax.c (back_comment): * src/window.c (Frecenter): * src/xfaces.c (Fx_list_fonts): Remove IF_LINT that no longer seems necessary. * src/image.c (png_load_body, jpeg_load_body): Simplify use of IF_LINT. * src/keyboard.c (read_char): Use IF_LINT (volatile) rather than a pragma dance to pacify GCC -Wclobbered. * src/xdisp.c (x_produce_glyphs): Rewrite to avoid need for IF_LINT. * src/xterm.c (x_connection_closed): Now _Noreturn, which should mean we do not need IF_LINT any more. (x_io_error_quitter): Now _Noreturn. Put an 'assume (false)’ at the end, to forestall warnings from older compilers.
* | | | | | | | | * src/regex.c (IF_LINT): Remove; it’s in conf_post.hKen Brown2016-05-301-7/+0
| | | | | | | | |
* | | | | | | | | Port redirect-debugging-output to non-GNU/LinuxPaul Eggert2016-04-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem reported by Kylie McClain for musl in: http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01592.html * etc/DEBUG, etc/NEWS: Mention this. * src/callproc.c (child_setup) [!MSDOS]: * src/dispnew.c (init_display): * src/emacs.c (main, Fdaemon_initialized): * src/minibuf.c (read_minibuf_noninteractive): * src/regex.c (xmalloc, xrealloc): Prefer symbolic names like STDERR_FILENO to magic numbers like 2, to make file-descriptor manipulation easier to follow. * src/emacs.c (relocate_fd) [!WINDOWSNT]: Remove; no longer needed now that we make sure stdin, stdout and stderr are open. All uses removed. (main): Make sure standard FDs are OK. Prefer symbolic names like EXIT_FAILURE to magic numbers like 1. Use bool for boolean. * src/lisp.h (init_standard_fds): New decl. * src/print.c (WITH_REDIRECT_DEBUGGING_OUTPUT) [GNU_LINUX]: Remove; no longer needed. (Fredirect_debugging_output): Define on all platforms, not just GNU/Linux. Redirect file descriptor, not stream, so that the code works even if stderr is not an lvalue. Report an error if the file arg is neither a string nor nil. (syms_of_print): Always define redirect-debugging-output. * src/sysdep.c (force_open, init_standard_fds): New functions.
* | | | | | | | | --enable-gcc-warnings now uses -Wjump-misses-initPaul Eggert2016-02-261-16/+17
|/ / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When configuring with --enable-gcc-warnings, also enable -Wjump-misses-init, as it’s confusing to use a goto to skip over an initialization. Fix the few places in the code that run afoul of this warning. * configure.ac (WERROR_CFLAGS): Add -Wjump-misses-init. * src/doc.c (Fsubstitute_command_keys): * src/image.c (svg_load_image): * src/regex.c (re_match_2_internal): * src/xdisp.c (redisplay_internal, redisplay_window): Don’t jump over initialization.
* | | | | | | | Fix "[:upper:]" for non-ASCII charactersEli Zaretskii2016-02-201-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * src/regex.c (re_match_2_internal): Support [:upper:] and [:lower:] for non-ASCII characters. (Bug#18150)