| author | Yuan Fu | 2023-06-28 17:05:29 -0700 |
|---|---|---|
| committer | Yuan Fu | 2023-06-29 11:15:30 -0700 |
| commit | 1d2ba6b363b2e41ca40c74f679f80363e04a54ed (patch) | |
| tree | 75ba41eb240e2347d28ccc8e8d8092d85e4a1234 /admin/notes | |
| parent | 02b6be892fa1a30b42c3df21319dddd2f445175e (diff) | |
; * admin/notes/tree-sitter/treesit_record_change: Update.
Diffstat (limited to 'admin/notes')
| -rw-r--r-- | admin/notes/tree-sitter/treesit_record_change | 180 |
1 files changed, 174 insertions, 6 deletions
diff --git a/admin/notes/tree-sitter/treesit_record_change b/admin/notes/tree-sitter/treesit_record_change
index 0dc6491e2d1..e80df4adfa7 100644
--- a/admin/notes/tree-sitter/treesit_record_change
+++ b/admin/notes/tree-sitter/treesit_record_change
| @@ -3,10 +3,10 @@ NOTES ON TREESIT_RECORD_CHANGE | |||
| 3 | It is vital that Emacs informs tree-sitter of every change made to the | 3 | It is vital that Emacs informs tree-sitter of every change made to the |
| 4 | buffer, lest tree-sitter's parse tree would be corrupted/out of sync. | 4 | buffer, lest tree-sitter's parse tree would be corrupted/out of sync. |
| 5 | 5 | ||
| 6 | All buffer changes in Emacs are made through functions in insdel.c | 6 | Almost all buffer changes in Emacs are made through functions in |
| 7 | (and casefiddle.c), I augmented functions in those files with calls to | 7 | insdel.c (see below for exceptions), I augmented functions in insdel.c |
| 8 | treesit_record_change. Below is a manifest of all the relevant | 8 | with calls to treesit_record_change. Below is a manifest of all the |
| 9 | functions in insdel.c as of Emacs 29: | 9 | relevant functions in insdel.c as of Emacs 29: |
| 10 | 10 | ||
| 11 | Function Calls | 11 | Function Calls |
| 12 | ---------------------------------------------------------------------- | 12 | ---------------------------------------------------------------------- |
| @@ -43,8 +43,176 @@ insert_from_buffer but not insert_from_buffer_1. I also left a | |||
| 43 | reminder comment. | 43 | reminder comment. |
| 44 | 44 | ||
| 45 | 45 | ||
| 46 | As for casefiddle.c, do_casify_unibyte_region and | 46 | EXCEPTIONS |
| 47 | |||
| 48 | |||
| 49 | There are a couple of functions that replaces characters in-place | ||
| 50 | rather than insert/delete. They are in casefiddle.c and editfns.c. | ||
| 51 | |||
| 52 | In casefiddle.c, do_casify_unibyte_region and | ||
| 47 | do_casify_multibyte_region modifies buffer, but they are static | 53 | do_casify_multibyte_region modifies buffer, but they are static |
| 48 | functions and are called by casify_region, which calls | 54 | functions and are called by casify_region, which calls |
| 49 | treesit_record_change. Other higher-level functions calls | 55 | treesit_record_change. Other higher-level functions calls |
| 50 | casify_region to do the work. \ No newline at end of file | 56 | casify_region to do the work. |
| 57 | |||
| 58 | In editfns.c, subst-char-in-region and translate-region-internal might | ||
| 59 | replace characters in-place, I made them to call | ||
| 60 | treesit_record_change. transpose-regions uses memcpy to move text | ||
| 61 | around, it calls treesit_record_change too. | ||
| 62 | |||
| 63 | I found these exceptions by grepping for signal_after_change and | ||
| 64 | checking each caller manually. Below is all the result as of Emacs 29 | ||
| 65 | and some comment for each one. Readers can use | ||
| 66 | |||
| 67 | (highlight-regexp "^[^[:space:]]+?\\.c:[[:digit:]]+:[^z-a]+?$" 'highlight) | ||
| 68 | |||
| 69 | to make things easier to read. | ||
| 70 | |||
| 71 | grep [...] --color=auto -i --directories=skip -nH --null -e signal_after_change *.c | ||
| 72 | |||
| 73 | callproc.c:789: calling prepare_to_modify_buffer and signal_after_change. | ||
| 74 | callproc.c:793: is one call to signal_after_change in each of the | ||
| 75 | callproc.c:800: signal_after_change hasn't. A continue statement | ||
| 76 | callproc.c:804: again, and this time signal_after_change gets called, | ||
| 77 | |||
| 78 | Not code. | ||
| 79 | |||
| 80 | callproc.c:820: signal_after_change (PT - nread, 0, nread); | ||
| 81 | callproc.c:863: signal_after_change (PT - process_coding.produced_char, | ||
| 82 | |||
| 83 | Both are called in call-process. I don’t think we’ll ever use | ||
| 84 | tree-sitter in call-process’s stdio buffer, right? I didn’t check | ||
| 85 | line-by-line, but it seems to only use insert_1_both and del_range_2. | ||
| 86 | |||
| 87 | casefiddle.c:558: signal_after_change (start, end - start - added, end - start); | ||
| 88 | |||
| 89 | Called in casify-region, calls treesit_record_change. | ||
| 90 | |||
| 91 | decompress.c:195: signal_after_change (data->orig, data->start - data->orig, | ||
| 92 | |||
| 93 | Called in unwind_decompress, uses del_range_2, insdel function. | ||
| 94 | |||
| 95 | decompress.c:334: signal_after_change (istart, iend - istart, unwind_data.nbytes); | ||
| 96 | |||
| 97 | Called in zlib-decompress-region, uses del_range_2, insdel function. | ||
| 98 | |||
| 99 | editfns.c:2139: signal_after_change (BEGV, size_a, ZV - BEGV); | ||
| 100 | |||
| 101 | Called in replace-buffer-contents, which calls del_range and | ||
| 102 | Finsert_buffer_substring, both are ok. | ||
| 103 | |||
| 104 | editfns.c:2416: signal_after_change (changed, | ||
| 105 | |||
| 106 | Called in subst-char-in-region, which either calls replace_range (a | ||
| 107 | insdel function) or modifies buffer content by itself (need to call | ||
| 108 | treesit_record_change). | ||
| 109 | |||
| 110 | editfns.c:2544: /* Reload as signal_after_change in last iteration may GC. */ | ||
| 111 | |||
| 112 | Not code. | ||
| 113 | |||
| 114 | editfns.c:2604: signal_after_change (pos, 1, 1); | ||
| 115 | |||
| 116 | Called in translate-region-internal, which has three cases: | ||
| 117 | |||
| 118 | if (nc != oc && nc >= 0) { | ||
| 119 | if (len != str_len) { | ||
| 120 | replace_range() | ||
| 121 | } else { | ||
| 122 | while (str_len-- > 0) | ||
| 123 | *p++ = *str++; | ||
| 124 | } | ||
| 125 | } | ||
| 126 | else if (nc < 0) { | ||
| 127 | replace_range() | ||
| 128 | } | ||
| 129 | |||
| 130 | replace_range is ok, but in the case where it manually modifies buffer | ||
| 131 | content, it needs to call treesit_record_change. | ||
| 132 | |||
| 133 | editfns.c:4779: signal_after_change (start1, end2 - start1, end2 - start1); | ||
| 134 | |||
| 135 | Called in transpose-regions. It just uses memcpy’s and doesn’t use | ||
| 136 | insdel functions; needs to call treesit_record_change. | ||
| 137 | |||
| 138 | fileio.c:4825: signal_after_change (PT, 0, inserted); | ||
| 139 | |||
| 140 | Called in insert_file_contents. Uses insert_1_both (very first in the | ||
| 141 | function); del_range_1 and del_range_byte (the optimized way to | ||
| 142 | implement replace when decoding isn’t needed); del_range_byte and | ||
| 143 | insert_from_buffer (the optimized way used when decoding is needed); | ||
| 144 | decode_coding_gap or insert_from_gap_1 (I’m not sure the condition for | ||
| 145 | this, but anyway it’s safe). The function also calls memcpy and | ||
| 146 | memmove, but they are irrelevant: memcpy is used for decoding, and | ||
| 147 | memmove is moving stuff inside the gap for decode_coding_gap. | ||
| 148 | |||
| 149 | I’d love someone to verify this function, since it’s so complicated | ||
| 150 | and large, but from what I can tell it’s safe. | ||
| 151 | |||
| 152 | fns.c:3998: signal_after_change (XFIXNAT (beg), 0, inserted_chars); | ||
| 153 | |||
| 154 | Called in base64-decode-region, uses insert_1_both and del_range_both, | ||
| 155 | safe. | ||
| 156 | |||
| 157 | insdel.c:681: signal_after_change (opoint, 0, len); | ||
| 158 | insdel.c:696: signal_after_change (opoint, 0, len); | ||
| 159 | insdel.c:741: signal_after_change (opoint, 0, len); | ||
| 160 | insdel.c:757: signal_after_change (opoint, 0, len); | ||
| 161 | insdel.c:976: signal_after_change (opoint, 0, PT - opoint); | ||
| 162 | insdel.c:996: signal_after_change (opoint, 0, PT - opoint); | ||
| 163 | insdel.c:1187: signal_after_change (opoint, 0, PT - opoint); | ||
| 164 | insdel.c:1412: signal_after_change. */ | ||
| 165 | insdel.c:1585: signal_after_change (from, nchars_del, GPT - from); | ||
| 166 | insdel.c:1600: prepare_to_modify_buffer and never call signal_after_change. | ||
| 167 | insdel.c:1603: region once. Apart from signal_after_change, any caller of this | ||
| 168 | insdel.c:1747: signal_after_change (from, to - from, 0); | ||
| 169 | insdel.c:1789: signal_after_change (from, to - from, 0); | ||
| 170 | insdel.c:1833: signal_after_change (from, to - from, 0); | ||
| 171 | insdel.c:2223:signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins) | ||
| 172 | insdel.c:2396: signal_after_change (begpos, endpos - begpos - change, endpos - begpos); | ||
| 173 | |||
| 174 | I’ve checked all insdel functions. We can assume insdel functions are | ||
| 175 | all safe. | ||
| 176 | |||
| 177 | json.c:790: signal_after_change (PT, 0, inserted); | ||
| 178 | |||
| 179 | Called in json-insert, calls either decode_coding_gap or | ||
| 180 | insert_from_gap_1, both are safe. Calls memmove but it’s for | ||
| 181 | decode_coding_gap. | ||
| 182 | |||
| 183 | keymap.c:2873: /* Insert calls signal_after_change which may GC. */ | ||
| 184 | |||
| 185 | Not code. | ||
| 186 | |||
| 187 | print.c:219: signal_after_change (PT - print_buffer.pos, 0, print_buffer.pos); | ||
| 188 | |||
| 189 | Called in print_finish, calls copy_text and insert_1_both, safe. | ||
| 190 | |||
| 191 | process.c:6365: process buffer is changed in the signal_after_change above. | ||
| 192 | search.c:2763: (see signal_before_change and signal_after_change). Try to error | ||
| 193 | |||
| 194 | Not code. | ||
| 195 | |||
| 196 | search.c:2777: signal_after_change (sub_start, sub_end - sub_start, SCHARS (newtext)); | ||
| 197 | |||
| 198 | Called in replace_match. Calls replace_range, upcase-region, | ||
| 199 | upcase-initials-region (both calls casify_region in the end), safe. | ||
| 200 | Calls memcpy but it’s for string manipulation. | ||
| 201 | |||
| 202 | textprop.c:1261: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 203 | textprop.c:1272: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 204 | textprop.c:1283: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 205 | textprop.c:1458: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 206 | textprop.c:1652: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 207 | textprop.c:1661: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 208 | textprop.c:1672: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 209 | textprop.c:1750: before changes are made and signal_after_change when we are done. | ||
| 210 | textprop.c:1752: and call signal_after_change before returning if MODIFIED. */ | ||
| 211 | textprop.c:1764: signal_after_change (XFIXNUM (start), | ||
| 212 | textprop.c:1778: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 213 | textprop.c:1791: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start), | ||
| 214 | textprop.c:1810: signal_after_change (XFIXNUM (start), | ||
| 215 | |||
| 216 | We don’t care about text property changes. | ||
| 217 | |||
| 218 | Grep finished with 51 matches found at Wed Jun 28 15:12:23 | ||
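
The note added by this commit separates code that edits the buffer through insdel functions (already covered) from code that overwrites characters in place (which has to report the change itself). The following is a minimal sketch of that second pattern, not code from Emacs: `toy_record_change`, `toy_subst_char_in_place`, and `struct toy_change` are hypothetical names, and the argument layout of the stand-in (start, old end, new end) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the parser notification.  This is NOT Emacs's
   treesit_record_change; the name and argument layout are assumptions
   made for illustration only.  */
struct toy_change { ptrdiff_t start, old_end, new_end; int count; };

static struct toy_change last_change;

static void
toy_record_change (ptrdiff_t start, ptrdiff_t old_end, ptrdiff_t new_end)
{
  last_change.start = start;
  last_change.old_end = old_end;
  last_change.new_end = new_end;
  last_change.count++;
}

/* Replace every FROM byte in BUF[0..LEN) with TO, writing directly
   into the buffer rather than going through an insert/delete helper.
   Because no helper reports the edit for us, this function must
   report the changed span itself.  Returns the number of bytes
   replaced.  */
static ptrdiff_t
toy_subst_char_in_place (char *buf, ptrdiff_t len, char from, char to)
{
  ptrdiff_t first = -1, last = -1, replaced = 0;

  assert (buf != NULL && len >= 0);
  for (ptrdiff_t i = 0; i < len; i++)
    if (buf[i] == from)
      {
        buf[i] = to;
        if (first < 0)
          first = i;
        last = i;
        replaced++;
      }

  /* Same-length replacement: the old and new end coincide.  */
  if (replaced > 0)
    toy_record_change (first, last + 1, last + 1);
  return replaced;
}
```

Calling `toy_subst_char_in_place` on the five bytes of "a-b-c" with '-' and '_' rewrites the buffer to "a_b_c" and reports one change covering the span from the first to the last replaced byte. Reporting a single enclosing span rather than one notification per byte is a design choice of this sketch; the note itself does not say which granularity the real callers use.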