emacs/src/search.c, branch scratch/posix-spawn

Speed up match-substitute-replacement

2020-12-04T17:39:13+00:00

* lisp/subr.el (match-substitute-replacement): Use match-data--translate.
* src/search.c (Fmatch_data__translate): Remove string restriction.
* test/lisp/subr-tests.el (subr-match-substitute-replacement): New test.

Fix replace-regexp-in-string substring match data translation

2020-11-26T13:20:13+00:00

For certain patterns, re-matching the same regexp on the matched
substring does not produce correctly translated match data
(bug#15107 and bug#44861).

Using a new builtin function also improves performance since the
number of calls to string-match is halved.

Reported by Kevin Ryde and Shigeru Fukaya.

* lisp/subr.el (replace-regexp-in-string): Translate the match data
using match-data--translate instead of trusting a call to string-match
on the matched string to do the job.
* test/lisp/subr-tests.el (subr-replace-regexp-in-string):
Add test cases.
* src/search.c (Fmatch_data__translate): New internal function.
(syms_of_search): Register it as a subroutine.

Fix a misleading comment in Freplace_match

2020-10-18T08:18:57+00:00

* src/search.c (Freplace_match): Fix a misleading comment
(bug#42424).

Prefer Fvector to make_uninit_vector

2020-08-15T18:19:51+00:00

Fvector is less error-prone than make_uninit_vector, as it
avoids the possibility of a GC crash due to an uninitialized
vector.  So prefer Fvector to make_uninit_vector when this is
easy (and when there's no significant performance difference).
Inspired by a suggestion by Pip Cet in:
https://lists.gnu.org/r/emacs-devel/2020-08/msg00313.html
* src/ccl.c (Fregister_ccl_program):
* src/ccl.c (Fregister_ccl_program):
* src/charset.c (Fdefine_charset_internal):
* src/font.c (Fquery_font, Ffont_info, syms_of_font):
* src/fontset.c (font_def_new, Fset_fontset_font):
* src/ftfont.c (ftfont_shape_by_flt):
* src/hbfont.c (hbfont_shape):
* src/macfont.m (macfont_shape):
* src/search.c (Fnewline_cache_check):
* src/xfaces.c (Fx_family_fonts):
* src/xfns.c (Fx_window_property_attributes):
Prefer Fvector to make_uninit_vector when either is easy.
* src/fontset.c (font_def_new): Now a function with one less
arg instead of a do-while macro, and renamed from FONT_DEF_NEW.
All uses changed.

Fix GC bugs related to uninitialized vectors

2020-08-15T18:19:51+00:00

Avoid problems if GC occurs while initializing a vector.
Problem with Fdelete reported by Pip Cet in:
https://lists.gnu.org/r/emacs-devel/2020-08/msg00313.html
I looked for similar problems elsewhere and found quite a few.
* src/coding.c (make_subsidiaries):
* src/composite.c (syms_of_composite):
* src/font.c (build_style_table, Ffont_get_glyphs):
* src/nsselect.m (clean_local_selection_data):
* src/nsxwidget.m (js_to_lisp):
* src/syntax.c (init_syntax_once):
* src/window.c (Fcurrent_window_configuration):
* src/xselect.c (selection_data_to_lisp_data)
(clean_local_selection_data):
Use make_nil_vector instead of make_uninit_vector.
* src/fns.c (Fdelete):
* src/xwidget.c (webkit_js_to_lisp):
Use allocate_nil_vector instead of allocate_vector.
* src/search.c (Fnewline_cache_check):
Use make_vector instead of make_uninit_vector.

Simplify use of __lsan_ignore_object

2020-08-04T02:08:58+00:00

* configure.ac: Use AC_CHECK_FUNCS_ONCE for __lsan_ignore_object.
* src/buffer.c, src/data.c, src/emacs-module.c, src/regex-emacs.c:
* src/search.c: Use __lsan_ignore_object unconditionally, and don’t
include sanitizer/lsan_interface.h.
* src/lisp.h (__lsan_ignore_object): Provide a dummy in the
typical case where leak sanitization is not available.

Use a more precise check for '__lsan_ignore_object'

2020-08-01T15:12:30+00:00

* configure.ac: Add check for __lsan_ignore_object.

* src/buffer.c (enlarge_buffer_text):
* src/data.c (make_blv):
* src/emacs-module.c (Fmodule_load, initialize_environment):
* src/regex-emacs.c (regex_compile):
* src/search.c (newline_cache_on_off): Use new configuration macro.

Suppress leak sanitizer in a few more places

2020-08-01T15:01:00+00:00

* src/regex-emacs.c (regex_compile):
src/search.c (newline_cache_on_off): Suppress leak sanitizer.

Improve accuracy of apropos commands that search doc strings

2020-05-03T13:53:53+00:00

It is conceptually wrong for apropos commands that search doc
strings to look for matches of several words only on the same
line, because division of doc strings between lines is
ephemeral.
* lisp/apropos.el (apropos-parse-pattern): Accept an optional
argument MULTILINE-P, and if that is non-nil, produce regexps that
match words in the list even if they are separated by line
boundaries.
(apropos-value, apropos-local-value, apropos-documentation): Use
the new optional argument in apropos commands that search
multiline text, such as doc strings.

* src/search.c (Fposix_looking_at, Fposix_string_match)
(Fposix_search_backward, Fposix_search_forward): Make sure Posix
appears in the doc strings near REGEXP, for better matches.

Prefer more inline functions in character.h

2020-04-17T16:17:35+00:00

* src/buffer.h (fetch_char_advance, fetch_char_advance_no_check)
(buf_next_char_len, next_char_len, buf_prev_char_len)
(prev_char_len, inc_both, dec_both): New inline functions,
replacing the old character.h macros FETCH_CHAR_ADVANCE,
FETCH_CHAR_ADVANCE_NO_CHECK, BUF_INC_POS, INC_POS, BUF_DEC_POS,
DEC_POS, INC_BOTH, DEC_BOTH respectively.  All callers changed.
These new functions all assume buffer primitives and so need
to be here rather than in character.h.
* src/casefiddle.c (make_char_unibyte): New static function,
replacing the old MAKE_CHAR_UNIBYTE macro.  All callers changed.
(do_casify_unibyte_string): Use SINGLE_BYTE_CHAR_P instead
of open-coding it.
* src/ccl.c (GET_TRANSLATION_TABLE): New static function,
replacing the old macro of the same name.
* src/character.c (string_char): Omit 2nd arg.  3rd arg can no
longer be NULL.  All callers changed.
* src/character.h (SINGLE_BYTE_CHAR_P): Move up.
(MAKE_CHAR_UNIBYTE, MAKE_CHAR_MULTIBYTE, PREV_CHAR_BOUNDARY)
(STRING_CHAR_AND_LENGTH, STRING_CHAR_ADVANCE)
(FETCH_STRING_CHAR_ADVANCE)
(FETCH_STRING_CHAR_AS_MULTIBYTE_ADVANCE)
(FETCH_STRING_CHAR_ADVANCE_NO_CHECK, FETCH_CHAR_ADVANCE)
(FETCH_CHAR_ADVANCE_NO_CHECK, INC_POS, DEC_POS, INC_BOTH)
(DEC_BOTH, BUF_INC_POS, BUF_DEC_POS): Remove.
(make_char_multibyte): New static function, replacing
the old macro MAKE_CHAR_MULTIBYTE.  All callers changed.
(CHAR_STRING_ADVANCE): Remove; all callers changed to use
CHAR_STRING.
(NEXT_CHAR_BOUNDARY): Remove; it was unused.
(raw_prev_char_len): New inline function, replacing the
old PREV_CHAR_BOUNDARY macro.  All callers changed.
(string_char_and_length): New inline function, replacing the
old STRING_CHAR_AND_LENGTH macro.  All callers changed.
(STRING_CHAR): Rewrite in terms of string_char_and_length.
(string_char_advance): New inline function, replacing the old
STRING_CHAR_ADVANCE macro.  All callers changed.
(fetch_string_char_advance): New inline function, replacing the
old FETCH_STRING_CHAR_ADVANCE macro.  All callers changed.
(fetch_string_char_as_multibyte_advance): New inline function,
replacing the old FETCH_STRING_CHAR_AS_MULTIBYTE_ADVANCE macro.
All callers changed.
(fetch_string_char_advance_no_check): New inline function,
replacing the old FETCH_STRING_CHAR_ADVANCE_NO_CHECK macro.  All
callers changed.
* src/regex-emacs.c (HEAD_ADDR_VSTRING): Remove; no longer used.
* src/syntax.c (scan_lists): Use dec_bytepos instead of
open-coding it.
* src/xdisp.c (string_char_and_length): Rename from
string_char_and_length to avoid name conflict with new function in
character.h.  All callers changed.