author Kenichi Handa 2000-05-20 00:23:52 +0000
committer Kenichi Handa 2000-05-20 00:23:52 +0000
commit c9671f819a6ee0949eb3d7601592e84b4608c0bd
tree 1701e8e869709303793e3b37cee22124a068f052 /src
parent 14333e317b8c9d1611d88f610e1b60791825084c
*** empty log message ***
Diffstat (limited to 'src')
-rw-r--r-- src/ChangeLog 289
1 files changed, 289 insertions, 0 deletions
diff --git a/src/ChangeLog b/src/ChangeLog
index f349c3b9177..9bf6b61c2d9 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,292 @@
1 2000-05-20 Kenichi Handa <handa@etl.go.jp>
2
3 The following changes are to handle 8-bit characters in a
4 multibyte buffer/string without running into the byte combining
5 problem. Two new charsets, eight-bit-control (for 0x80..0x9F) and
6 eight-bit-graphic (for 0xA0..0xFF), are introduced.
7
8 * Makefile.in (fns.o): Depend on charset.h.
9
10 * alloc.c (Fmake_byte_code): If BYTECODE-STRING is multibyte,
11 convert it to unibyte.
12 (make_string): Use parse_str_as_multibyte, not chars_in_text.
13
14 * buffer.c (advance_to_char_boundary): Don't use DEC_POS to find an
15 apparent char boundary.
16 (Fset_buffer_multibyte): Convert 8-bit characters in the range
17 0x80..0x9F to/from multibyte form.
18
19 * bytecode.c (Fbyte_code): If arg BYTESTR is multibyte, convert it
20 to unibyte.
21
22 * callproc.c (Fcall_process): Always encode an argument string if
23 it is multibyte. Setup src_multibyte and dst_multibyte members of
24 process_coding properly.
25
26 * category.c (Fmodify_category_entry): Use SPLIT_CHAR, not
27 SPLIT_NON_ASCII_CHAR.
28
29 * ccl.c (CCL_WRITE_CHAR): Be sure to write single byte characters
30 as is.
31 (CCL_MAKE_CHAR): Use MAKE_CHAR, not MAKE_NON_ASCII_CHAR.
32
33 * charset.c (Qeight_bit_control, Qeight_bit_graphic): New
34 variables.
35 (SPLIT_CHARACTER_SEQ): This macro deleted.
36 (SPLIT_MULTIBYTE_SEQ): Assume that multibyte sequence at STR is
37 valid.
38 (CHAR_COMPONENTS_VALID_P): Handle new charsets; eight-bit-control
39 and eight-bit-graphic.
40 (char_to_string): Likewise. Signal an error for too large
41 character code.
42 (char_printable_p): Return 0 for 8-bit characters.
43 (update_charset_table): Update iso_charset_table only when a final
44 character is non-negative.
45 (find_charset_in_text): Renamed from find_charset_in_str.
46 Arguments and return value changed. Callers changed.
47 (Fdefine_charset): Args ISO-FINAL-CHAR and ISO-GRAPHIC-PLANE can
48 be -1 if CHARSET is used only internally.
49 (Fmake_char_internal): Handle new charsets; eight-bit-control and
50 eight-bit-graphic.
51 (Fcharset_after): Simplified.
52 (char_valid_p): Use SPLIT_CHAR, not SPLIT_NON_ASCII_CHAR.
53 (char_bytes): Return 2 for chars of the range 0xA0..0xFF.
54 (multibyte_chars_in_text): Simplified by assuming there's no
55 invalid multibyte sequence.
56 (parse_str_as_multibyte, str_as_multibyte, str_to_multibyte,
57 str_as_unibyte): New functions.
58 (Fstring): Simplified by assuming that byte combining never
59 happens.
60 (init_charset_once): Initialization for
61 LEADING_CODE_8_BIT_CONTROL.
62 (syms_of_charset): Intern and staticpro Qeight_bit_control and
63 Qeight_bit_graphic. Include them in Vcharset_list. Make charsets
64 eight-bit-control and eight-bit-graphic.
65
66 * charset.h (LEADING_CODE_8_BIT_CONTROL, CHARSET_8_BIT_CONTROL,
67 CHARSET_8_BIT_GRAPHIC): New macros.
68 (SINGLE_BYTE_CHAR_P): Make it faster by using casting.
69 (CHARSET_ISO_GRAPHIC_PLANE): Use XINT instead of XFASTINT.
70 (CHARSET_REVERSE_CHARSET): Likewise.
71 (CHARSET_VALID_P): Handle new charsets; eight-bit-control and
72 eight-bit-graphic.
73 (BYTES_BY_CHAR_HEAD, WIDTH_BY_CHAR_HEAD): Optimize for ASCII.
74 (CHAR_CHARSET, MAKE_CHAR, SPLIT_CHAR, CHAR_BYTES): Likewise.
75 (PARSE_MULTIBYTE_SEQ) [BYTE_COMBINING_DEBUG]: Abort if we
76 encounter an invalid multibyte sequence.
77 (PARSE_MULTIBYTE_SEQ) [not BYTE_COMBINING_DEBUG]: Assume multibyte
78 sequence is always valid.
79 (MAKE_NON_ASCII_CHAR, SPLIT_NON_ASCII_CHAR): These macros deleted.
80 (UNIBYTE_STR_AS_MULTIBYTE_P, MULTIBYTE_STR_AS_UNIBYTE_P): New
81 macros.
82 (CHAR_STRING): For 8-bit characters, call char_to_string.
83 (INC_POS) [not BYTE_COMBINING_DEBUG]: Faster version. Assume
84 multibyte sequence is always valid.
85 (BUF_INC_POS) [not BYTE_COMBINING_DEBUG]: Likewise.
86 (parse_str_as_multibyte, str_as_multibyte, str_to_multibyte,
87 str_as_unibyte): Extern them.
88 (BCOPY_SHORT): Fix a bug.
89 (CHAR_LEN): This macro deleted. Callers changed to use
90 CHAR_BYTES.
91 (FETCH_STRING_CHAR_ADVANCE): Check multibyteness of STRING.
92 (FETCH_STRING_CHAR_ADVANCE_NO_CHECK): New macro.
93 (FETCH_CHAR_ADVANCE): Check multibyteness of the current buffer.
94
95 * coding.c (ONE_MORE_BYTE, TWO_MORE_BYTES): Set coding->result to
96 CODING_FINISH_INSUFFICIENT_SRC if there's not enough source.
97 (ONE_MORE_CHAR, EMIT_CHAR, EMIT_ONE_BYTE, EMIT_TWO_BYTE,
98 EMIT_BYTES): New macros.
99 (THREE_MORE_BYTES, DECODE_CHARACTER_ASCII,
100 DECODE_CHARACTER_DIMENSION1, DECODE_CHARACTER_DIMENSION2): These
101 macros deleted.
102 (CHECK_CODE_RANGE_A0_FF): This macro deleted.
103 (detect_coding_emacs_mule): Use UNIBYTE_STR_AS_MULTIBYTE_P to
104 check the validity of multibyte sequence.
105 (decode_coding_emacs_mule): New function.
106 (encode_coding_emacs_mule): New macro.
107 (detect_coding_iso2022): Use ONE_MORE_BYTE to fetch a byte from
108 the source.
109 (DECODE_ISO_CHARACTER): Just return a character code.
110 (DECODE_COMPOSITION_START): Set coding->result instead of result.
111 (decode_coding_iso2022, decode_coding_sjis_big5, decode_eol): Use
112 EMIT_CHAR to produce decoded characters. Exit the loop only by
113 macros ONE_MORE_BYTE or EMIT_CHAR. Don't handle the case of last
114 block here.
115 (ENCODE_ISO_CHARACTER): Don't translate character here. Produce
116 only position codes for an invalid character.
117 (encode_designation_at_bol): Return new destination pointer. 5th
118 arg DSTP is changed to DST.
119 (encode_coding_iso2022, decode_coding_sjis_big5): Get a character
120 from the source by ONE_MORE_CHAR. Don't handle the case of last
121 block here.
122 (DECODE_SJIS_BIG5_CHARACTER, ENCODE_SJIS_BIG5_CHARACTER): These
123 macros deleted.
124 (detect_coding_sjis, detect_coding_big5, detect_coding_utf_8,
125 detect_coding_utf_16, detect_coding_ccl): Use ONE_MORE_BYTE and
126 TWO_MORE_BYTES to fetch a byte from the source.
127 (encode_eol): Pay attention to coding->src_multibyte.
128 (detect_coding, detect_eol): Preserve members src_multibyte and
129 dst_multibyte.
130 (DECODING_BUFFER_MAG): Return 2 even for coding_type_raw_text.
131 (encoding_buffer_size): Set magnification to 3 for all coding
132 systems that require encoding.
133 (ccl_coding_driver): For decoding, be sure that the result is
134 valid multibyte sequence.
135 (decode_coding): Initialize coding->errors and coding->result.
136 For emacs-mule, call decode_coding_emacs_mule. For no-conversion
137 and raw-text, always call decode_eol. Handle the case of last
138 block here. If not coding->dst_multibyte, convert the resulting
139 sequence to unibyte.
140 (encode_coding): Initialize coding->errors and coding->result.
141 For emacs-mule, call encode_coding_emacs_mule. For no-conversion
142 and raw-text, always call encode_eol. Handle the case of last
143 block here.
144 (shrink_decoding_region, shrink_encoding_region): Detect more
145 strictly the cases in which we can't skip data.
146 (code_convert_region): Setup src_multibyte and dst_multibyte
147 members of coding. For decoding, if the buffer is multibyte,
148 convert the source sequence to unibyte in advance. For encoding,
149 if the buffer is multibyte, convert the resulting sequence to
150 multibyte afterward.
151 (run_pre_post_conversion_on_str): New function.
152 (code_convert_string): Deleted and divided into the following two.
153 (decode_coding_string, encode_coding_string): New functions.
154 (code_convert_string1, code_convert_string_norecord): Call one of
155 above.
156 (Fdecode_sjis_char, Fdecode_big5_char): Use MAKE_CHAR instead of
157 MAKE_NON_ASCII_CHAR.
158 (Fset_terminal_coding_system_internal,
159 Fset_safe_terminal_coding_system_internal): Setup src_multibyte
160 and dst_multibyte members.
161 (init_coding_once): Initialize iso_code_class with new enum
162 ISO_control_0 and ISO_control_1.
163
164 * coding.h (enum iso_code_class_type): Member ISO_control_code is
165 divided into ISO_control_0 and ISO_control_1.
166 (struct coding_system): New members src_multibyte, dst_multibyte,
167 errors, and result. Delete member fake_multibyte.
168 (CODING_REQUIRE_DECODING): Return 1 if coding->dst_multibyte is
169 nonzero.
170 (CODING_REQUIRE_ENCODING): Return 1 if coding->src_multibyte is
171 nonzero.
172
173 * data.c (Faref): Use SPLIT_CHAR instead of SPLIT_NON_ASCII_CHAR.
174 (Faset): Likewise.
175
176 * editfns.c (Fformat): Be sure to convert 8-bit characters to
177 multibyte form.
178 (Ftranspose_region) [BYTE_COMBINING_DEBUG]: Abort if byte
179 combining occurs.
180 (Ftranspose_region): Delete codes for handling byte combining.
181
182 * fileio.c (Finsert_file_contents): Setup src_multibyte and
183 dst_multibyte members of coding. On handling REPLACE on unibyte
184 buffer, convert the result of decode_coding to unibyte. On
185 inserting into a multibyte buffer, always call code_convert_region.
186 (e_write): Setup coding->src_multibyte according to the
187 multibyteness of the source (buffer or string).
188
189 * fns.c (concat): Handle 8-bit characters correctly.
190 (Fstring_as_unibyte): Be sure to make all 8-bit characters in
191 unibyte in the result.
192 (Fstring_as_multibyte): Be sure to make all 8-bit characters in
193 valid multibyte form in the result.
194 (map_char_table): Use MAKE_CHAR instead of MAKE_NON_ASCII_CHAR.
195 (Fbase64_encode_region, Fbase64_encode_string): If base64_encode_1
196 returns -1, signal an error.
197 (base64_encode_1): New arg MULTIBYTE. Get each character by
198 CHAR_STRING_AND_LENGTH if MULTIBYTE is nonzero. If a multibyte
199 character is found, return -1.
200 (Fbase64_decode_region): Delete codes for handling byte-combining.
201 Treat each decoded byte as a unibyte character.
202 (Fbase64_decode_string): Return unibyte string.
203 (Fcompare_strings, concat, string_byte_to_char): Use
204 FETCH_STRING_CHAR_ADVANCE_NO_CHECK instead of
205 FETCH_STRING_CHAR_ADVANCE.
206 (Fstring_lessp): Use FETCH_STRING_CHAR_ADVANCE unconditionally.
207 (mapcar1): If SEQ is string, always use FETCH_STRING_CHAR_ADVANCE.
208
209 * fontset.c (fontset_ref): Use SPLIT_CHAR instead of
210 SPLIT_NON_ASCII_CHAR.
211 (fontset_ref_via_base, fontset_set): Likewise.
212
213 * insdel.c (adjust_markers_for_record_delete): Deleted.
214 (adjust_markers_for_insert): Argument changed. Caller changed.
215 (adjust_markers_for_replace): Likewise.
216 (ADJUST_CHAR_POS, combine_bytes, byte_combining_error,
217 CHECK_BYTE_COMBINING_FOR_INSERT): Deleted.
218 (copy_text): Delete unused local variable c_save. For converting
219 to multibyte, be sure to make all 8-bit characters in valid
220 multibyte form.
221 (count_size_as_multibyte): Handle 8-bit characters correctly.
222 (insert_1_both, insert_from_string_1, insert_from_buffer_1,
223 adjust_after_replace, replace_range, del_range_2)
224 [BYTE_COMBINING_DEBUG]: Abort if byte combining occurs.
225 (insert_1_both, insert_from_string_1, insert_from_buffer_1,
226 adjust_after_replace, replace_range, del_range_2): Delete codes for
227 handling byte combining.
228 (adjust_before_replace): Deleted.
229
230 * keymap.c (Fsingle_key_description): Use SPLIT_CHAR instead of
231 SPLIT_NON_ASCII_CHAR.
232 (describe_vector): Use MAKE_CHAR instead of MAKE_NON_ASCII_CHAR.
233 (Faccessible_keymaps): Use FETCH_STRING_CHAR_ADVANCE
234 unconditionally.
235 (Fkey_description): Likewise.
236
237 * lread.c (read1): On reading multibyte string, be sure to make
238 all 8-bit characters in valid multibyte form.
239 (readchar): Use FETCH_STRING_CHAR_ADVANCE unconditionally.
240
241 * print.c (print_object): Use FETCH_STRING_CHAR_ADVANCE
242 unconditionally.
243
244 * process.c (Fstart_process): GCPRO current_dir before calling
245 Ffind_operation_coding_system. Encode arguments here.
246 (create_process): Don't encode arguments here. Setup
247 src_multibyte and dst_multibyte members of struct coding.
248 (read_process_output): Setup src_multibyte and dst_multibyte
249 members of struct coding. If the output is to multibyte buffer,
250 always decode the output of the process. Adjust the
251 representation of 8-bit characters to the multibyteness of the
252 output.
253 (send_process): Setup coding->src_multibyte according to the
254 multibyteness of the source.
255
256 * search.c (wordify): Use FETCH_STRING_CHAR_ADVANCE
257 unconditionally.
258 (Freplace_match): Use FETCH_STRING_CHAR_ADVANCE and
259 FETCH_STRING_CHAR_ADVANCE_NO_CHECK appropriately.
260
261 * term.c (produce_special_glyphs): Use CHAR_BYTES instead of
262 CHAR_LEN.
263
264 * w16select.c (Fw16_set_clipboard_data): Setup members
265 src_multibyte and dst_multibyte of coding. Adjusted for the
266 change for find_charset_in_str.
267 (Fw16_get_clipboard_data): Likewise.
268
269 * w32fns.c (w32_to_x_font): Setup members src_multibyte and
270 dst_multibyte of coding.
271 (x_to_w32_font): Likewise.
272
273 * w32select.c (Fw32_set_clipboard_data): Setup members
274 src_multibyte and dst_multibyte of coding. Adjusted for the
275 change for find_charset_in_str.
276 (Fw32_get_clipboard_data): Likewise.
277
278 * xdisp.c (get_next_display_element): Handle 8-bit characters
279 correctly.
280 (next_element_from_display_vector): Use CHAR_BYTES instead of
281 CHAR_LEN.
282 (disp_char_vector): Use SPLIT_CHAR instead of
283 SPLIT_NON_ASCII_CHAR.
284
285 * xselect.c (selection_data_to_lisp_data): Setup members
286 src_multibyte and dst_multibyte of coding. Adjusted for the
287 change for find_charset_in_str.
288 (lisp_data_to_selection_data): Likewise.
289
290 2000-05-19 Gerd Moellmann <gerd@gnu.org>
291
292 * buffer.c (Fbury_buffer): Avoid trouble from burying a killed