author Eli Zaretskii 2010-03-30 05:13:07 -0400
committer Eli Zaretskii 2010-03-30 05:13:07 -0400
commit a7b028203499410a9f4bebe8220be8f3a9ce598b
tree 085ffc8adee85e9d1e55eace47a2bf13da81098c /src
parent 82fc79808b345b43efc20c064baec63cc14df8a5
parent 4d6ea387e7527aa73d7664e98bfd2e2c24b1628d
Initial support for bidirectional editing.

Makefile.in (obj): Include bidi.o.
(bidi.o): New target.

makefile.w32-in (OBJ1): Add $(BLD)/bidi.$(O).
($(BLD)/bidi.$(O)): New target.

bidi.c: New file.

buffer.h (struct buffer): New members bidi_display_reordering and bidi_paragraph_direction.

buffer.c (init_buffer_once): Initialize bidi_display_reordering and bidi_paragraph_direction.
(syms_of_buffer): Declare Lisp variables bidi-display-reordering and bidi-paragraph-direction.
(Fbuffer_swap_text): Swap the values of bidi_display_reordering and bidi_paragraph_direction.

dispextern.h (BIDI_MAXLEVEL, BIDI_AT_BASE_LEVEL): New macros.
(bidi_type_t, bidi_dir_t): New types.
(bidi_saved_info, bidi_stack, bidi_it): New structures.
(struct it): New members bidi_p, bidi_it, paragraph_embedding, prev_stop, base_level_stop, and eol_pos.
(bidi_init_it, bidi_get_next_char_visually): New prototypes.
(IT_STACK_SIZE): Enlarge to 5.
(struct glyph_row): New member reversed_p.
<string_buffer_position>: Update prototype.
(PRODUCE_GLYPHS): Set the reversed_p flag in the iterator's glyph_row if bidi_it.paragraph_dir == R2L.
(struct glyph): New members resolved_level and bidi_type.

dispnew.c (direct_output_forward_char): Give up if we need bidi processing or buffer's direction is right-to-left.
(prepare_desired_row): Preserve the reversed_p flag.
(row_equal_p): Compare the reversed_p attributes as well.

xdisp.c (init_iterator): Initialize it->bidi_p. Call bidi_init_it and set it->paragraph_embedding from the current buffer's value of bidi_paragraph_direction.
(reseat_1): Initialize bidi_it.first_elt.
(set_iterator_to_next, next_element_from_buffer): Use the value of paragraph_embedding to determine the paragraph direction.
(set_iterator_to_next): Under bidi reordering, call bidi_get_next_char_visually. Call bidi_paragraph_init if the new_paragraph flag is set in the bidi iterator.
(next_element_from_buffer): If bidi_it.first_elt is set, initialize paragraph direction and find the first character to display in the visual order. If reseated to a middle of a line, prime the bidi iterator starting at the line's beginning. Handle the situation where we overstepped stop_charpos due to non-linearity of the bidi iteration. Likewise for when we back up beyond the previous stop_charpos. When moving across stop_charpos, record it in prev_stop.
(display_line): Set row->end and it->start for the next row to the next character in logical order. Always extend reversed_p rows to the end of line, even if they end at ZV. Copy the reversed_p flag to the next glyph row. Keep calling set_cursor_from_row for bidi-reordered rows even if we already have a possible candidate for cursor position. Set row_end after all the row's glyphs have been produced, by looping over the glyphs. Record the position after EOL in it->eol_pos, and use it to set end_pos of the last row produced for a continued line.
<Qright_to_left, Qleft_to_right>: New variables.
(syms_of_xdisp): Initialize and staticpro them.
(string_buffer_position_lim): New function.
(string_buffer_position): Most of code moved to string_buffer_position_lim. Last argument and return value are now EMACS_INT; all callers changed.
(set_cursor_from_row): Rewritten to support bidirectional text and reversed glyph rows.
(text_outside_line_unchanged_p, try_window_id): Disable optimizations if we are reordering bidirectional text and the paragraph direction can be affected by the change.
(append_glyph, append_composite_glyph)
(produce_image_glyph, append_stretch_glyph): Set the resolved_level and bidi_type members of each glyph.
(append_glyph): If the glyph row is reversed, prepend the glyph rather than appending it.
(handle_stop_backwards): New function.
(reseat_1, pop_it, push_it): Set prev_stop and base_level_stop.
(reseat): call handle_stop_backwards to recompute prev_stop and base_level_stop for the new position.
(handle_invisible_prop): Under bidi iteration, skip invisible text using bidi_get_next_char_visually. If we are `reseat'ed, init the paragraph direction. Update IT->prev_stop after skipping invisible text.
(move_it_in_display_line_to): New variables prev_method and prev_pos. Compare for strict equality in BUFFER_POS_REACHED_P.
(try_cursor_movement): Examine all the candidate rows that occlude point, to return the best match. If rows are bidi-reordered and point moved backwards, back up to the row that is not a continuation line, and start looking for a suitable row from there.

term.c (append_glyph): Reverse glyphs by pre-pending them, rather than appending, if the glyph_row's reversed_p flag is set. Set the resolved_level and bidi_type members of each glyph.

.gdbinit (pbiditype): New command.
(pgx): Use it to display bidi level and type of the glyph.
(pitx): Display some bidi information about the iterator.
(prowlims, pmtxrows): New commands.

files.el: Make bidi-display-reordering safe variable for boolean values.

mule.texi (International): Mention support of bidirectional editing.
(Bidirectional Editing): New section.

HELLO: Reorder Arabic and Hebrew into logical order, and insert RLM before the opening paren, to make the display more reasonable. Add setting for bidi-display-reordering in the local variables section.

NEWS: Mention initial support for bidirectional editing.

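The commit message says that append_glyph prepends a glyph instead of appending it when the glyph row's reversed_p flag is set. The following is only a minimal sketch of that idea, not the code from xdisp.c or term.c: it assumes a toy row that stores plain characters in a fixed-size array, and the names struct toy_row, toy_append_glyph, toy_fill and ROW_CAPACITY are hypothetical, introduced here for illustration.

```c
#include <assert.h>
#include <string.h>

#define ROW_CAPACITY 16		/* hypothetical fixed row capacity */

/* Toy stand-in for a glyph row; real Emacs glyph rows hold
   struct glyph objects, not characters.  */
struct toy_row
{
  char glyphs[ROW_CAPACITY + 1];	/* NUL-terminated for easy inspection */
  int used;				/* number of glyphs stored so far */
  int reversed_p;			/* non-zero for a right-to-left row */
};

/* Add glyph G to ROW.  In a reversed row the existing glyphs are
   shifted right by one slot and G is stored at the start, so the
   glyph produced last ends up leftmost.  Return 1 on success, 0 if
   the row is full.  */
static int
toy_append_glyph (struct toy_row *row, char g)
{
  if (row->used >= ROW_CAPACITY)
    return 0;

  if (row->reversed_p)
    {
      memmove (row->glyphs + 1, row->glyphs, row->used);
      row->glyphs[0] = g;
    }
  else
    row->glyphs[row->used] = g;

  row->used++;
  row->glyphs[row->used] = '\0';
  return 1;
}

/* Reset ROW and feed it the characters of TEXT in the order in
   which they are produced.  */
static void
toy_fill (struct toy_row *row, int reversed_p, const char *text)
{
  memset (row, 0, sizeof *row);
  row->reversed_p = reversed_p;
  for (; *text; text++)
    toy_append_glyph (row, *text);
}
```

Filling a row with "abc" leaves the glyphs as "abc" when reversed_p is 0 and as "cba" when it is 1. Shifting on every insertion keeps the sketch short; it says nothing about how the real functions manage glyph storage.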
Diffstat (limited to 'src')
-rw-r--r-- src/.gdbinit 60
-rw-r--r-- src/ChangeLog 108
-rw-r--r-- src/Makefile.in 3
-rw-r--r-- src/bidi.c 2028
-rw-r--r-- src/buffer.c 39
-rw-r--r-- src/buffer.h 10
-rw-r--r-- src/dispextern.h 163
-rw-r--r-- src/dispnew.c 10
-rw-r--r-- src/makefile.w32-in 9
-rw-r--r-- src/term.c 32
-rw-r--r-- src/window.h 5
-rw-r--r-- src/xdisp.c 1228
12 files changed, 3484 insertions, 211 deletions
diff --git a/src/.gdbinit b/src/.gdbinit
index e8a64f5dfe4..b959baae8f3 100644
--- a/src/.gdbinit
+++ b/src/.gdbinit
@@ -271,6 +271,9 @@ define pitx
     end
   end
   printf "\n"
+  if ($it->bidi_p)
+    printf "BIDI: base_stop=%d prev_stop=%d level=%d\n", $it->base_level_stop, $it->prev_stop, $it->bidi_it.resolved_level
+  end
   if ($it->region_beg_charpos >= 0)
     printf "reg=%d-%d ", $it->region_beg_charpos, $it->region_end_charpos
   end
@@ -447,6 +450,36 @@ document pwin
 Pretty print window structure w.
 end
 
+define pbiditype
+  if ($arg0 == 0)
+    printf "UNDEF"
+  end
+  if ($arg0 == 1)
+    printf "L"
+  end
+  if ($arg0 == 2)
+    printf "R"
+  end
+  if ($arg0 == 3)
+    printf "EN"
+  end
+  if ($arg0 == 4)
+    printf "AN"
+  end
+  if ($arg0 == 5)
+    printf "BN"
+  end
+  if ($arg0 == 6)
+    printf "B"
+  end
+  if ($arg0 < 0 || $arg0 > 6)
+    printf "%d??", $arg0
+  end
+end
+document pbiditype
+Print textual description of bidi type given as first argument.
+end
+
 define pgx
   set $g = $arg0
   # CHAR_GLYPH
@@ -475,6 +508,11 @@ define pgx
   else
     printf " pos=%d", $g->charpos
   end
+  # For characters, print their resolved level and bidi type
+  if ($g->type == 0)
+    printf " blev=%d,btyp=", $g->resolved_level
+    pbiditype $g->bidi_type
+  end
   printf " w=%d a+d=%d+%d", $g->pixel_width, $g->ascent, $g->descent
   # If not DEFAULT_FACE_ID
   if ($g->face_id != 0)
@@ -575,6 +613,28 @@ document pgrowit
 Pretty print all glyphs in it->glyph_row.
 end
 
+define prowlims
+  printf "start=%d,end=%d,reversed=%d,cont=%d,at_zv=%d\n", $arg0->start.pos.charpos, $arg0->end.pos.charpos, $arg0->reversed_p, $arg0->continued_p, $arg0->ends_at_zv_p
+end
+document prowlims
+Print important attributes of a glyph_row structure.
+Takes one argument, a pointer to a glyph_row structure.
+end
+
+define pmtxrows
+  set $mtx = $arg0
+  set $gl = $mtx->rows
+  set $glend = $mtx->rows + $mtx->nrows
+  while ($gl < $glend)
+    prowlims $gl
+    set $gl = $gl + 1
+  end
+end
+document pmtxrows
+Print data about glyph rows in a glyph matrix.
+Takes one argument, a pointer to a glyph_matrix structure.
+end
+
 define xtype
   xgettype $
   output $type
diff --git a/src/ChangeLog b/src/ChangeLog
index 8fd89e9fd0c..1e013964db6 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,111 @@
+2010-03-30  Eli Zaretskii  <eliz@gnu.org>
+
+	Initial support for bidirectional editing.
+
+	* Makefile.in (obj): Include bidi.o.
+	(bidi.o): New target.
+
+	* makefile.w32-in (OBJ1): Add $(BLD)/bidi.$(O).
+	($(BLD)/bidi.$(O)): New target.
+
+	* bidi.c: New file.
+
+	* buffer.h (struct buffer): New members bidi_display_reordering
+	and bidi_paragraph_direction.
+
+	* buffer.c (init_buffer_once): Initialize bidi_display_reordering
+	and bidi_paragraph_direction.
+	(syms_of_buffer): Declare Lisp variables bidi-display-reordering
+	and bidi-paragraph-direction.
+	(Fbuffer_swap_text): Swap the values of
+	bidi_display_reordering and bidi_paragraph_direction.
+
+	* dispextern.h (BIDI_MAXLEVEL, BIDI_AT_BASE_LEVEL): New macros.
+	(bidi_type_t, bidi_dir_t): New types.
+	(bidi_saved_info, bidi_stack, bidi_it): New structures.
+	(struct it): New members bidi_p, bidi_it, paragraph_embedding,
+	prev_stop, base_level_stop, and eol_pos.
+	(bidi_init_it, bidi_get_next_char_visually): New prototypes.
+	(IT_STACK_SIZE): Enlarge to 5.
+	(struct glyph_row): New member reversed_p.
+	<string_buffer_position>: Update prototype.
+	(PRODUCE_GLYPHS): Set the reversed_p flag in the iterator's
+	glyph_row if bidi_it.paragraph_dir == R2L.
+	(struct glyph): New members resolved_level and bidi_type.
+
+	* dispnew.c (direct_output_forward_char): Give up if we need bidi
+	processing or buffer's direction is right-to-left.
+	(prepare_desired_row): Preserve the reversed_p flag.
+	(row_equal_p): Compare the reversed_p attributes as well.
+
+	* xdisp.c (init_iterator): Initialize it->bidi_p. Call
+	bidi_init_it and set it->paragraph_embedding from the current
+	buffer's value of bidi_paragraph_direction.
+	(reseat_1): Initialize bidi_it.first_elt.
+	(set_iterator_to_next, next_element_from_buffer): Use the value of
+	paragraph_embedding to determine the paragraph direction.
+	(set_iterator_to_next): Under bidi reordering, call
+	bidi_get_next_char_visually. Call bidi_paragraph_init if the
+	new_paragraph flag is set in the bidi iterator.
+	(next_element_from_buffer): If bidi_it.first_elt is set,
+	initialize paragraph direction and find the first character to
+	display in the visual order. If reseated to a middle of a line,
+	prime the bidi iterator starting at the line's beginning. Handle
+	the situation where we overstepped stop_charpos due to
+	non-linearity of the bidi iteration. Likewise for when we back up
+	beyond the previous stop_charpos. When moving across stop_charpos,
+	record it in prev_stop.
+	(display_line): Set row->end and it->start for the next row to the
+	next character in logical order. Always extend reversed_p rows to
+	the end of line, even if they end at ZV. Copy the reversed_p flag
+	to the next glyph row. Keep calling set_cursor_from_row for
+	bidi-reordered rows even if we already have a possible candidate
+	for cursor position. Set row_end after all the row's glyphs have
+	been produced, by looping over the glyphs. Record the position
+	after EOL in it->eol_pos, and use it to set end_pos of the last
+	row produced for a continued line.
+	<Qright_to_left, Qleft_to_right>: New variables.
+	(syms_of_xdisp): Initialize and staticpro them.
+	(string_buffer_position_lim): New function.
+	(string_buffer_position): Most of code moved to
+	string_buffer_position_lim. Last argument and return value are
+	now EMACS_INT; all callers changed.
+	(set_cursor_from_row): Rewritten to support bidirectional text and
+	reversed glyph rows.
+	(text_outside_line_unchanged_p, try_window_id): Disable
+	optimizations if we are reordering bidirectional text and the
+	paragraph direction can be affected by the change.
+	(append_glyph, append_composite_glyph)
+	(produce_image_glyph, append_stretch_glyph): Set the
+	resolved_level and bidi_type members of each glyph.
+	(append_glyph): If the glyph row is reversed, prepend the glyph
+	rather than appending it.
+	(handle_stop_backwards): New function.
+	(reseat_1, pop_it, push_it): Set prev_stop and base_level_stop.
+	(reseat): call handle_stop_backwards to recompute prev_stop and
+	base_level_stop for the new position.
+	(handle_invisible_prop): Under bidi iteration, skip invisible text
+	using bidi_get_next_char_visually. If we are `reseat'ed, init the
+	paragraph direction. Update IT->prev_stop after skipping
+	invisible text.
+	(move_it_in_display_line_to): New variables prev_method
+	and prev_pos. Compare for strict equality in
+	BUFFER_POS_REACHED_P.
+	(try_cursor_movement): Examine all the candidate rows that occlude
+	point, to return the best match. If rows are bidi-reordered
+	and point moved backwards, back up to the row that is not a
+	continuation line, and start looking for a suitable row from
+	there.
+
+	* term.c (append_glyph): Reverse glyphs by pre-pending them,
+	rather than appending, if the glyph_row's reversed_p flag is set.
+	Set the resolved_level and bidi_type members of each glyph.
+
+	* .gdbinit (pbiditype): New command.
+	(pgx): Use it to display bidi level and type of the glyph.
+	(pitx): Display some bidi information about the iterator.
+	(prowlims, pmtxrows): New commands.
+
 2010-03-30  Dan Nicolaescu  <dann@ics.uci.edu>
 
 	Remove all uses of C_DEBUG_SWITCH and LIBS_DEBUG.
diff --git a/src/Makefile.in b/src/Makefile.in
index 1730158f732..ed44cd7700f 100644
--- a/src/Makefile.in
+++ b/src/Makefile.in
@@ -496,7 +496,7 @@ FONT_DRIVERS = xfont.o
 /* lastfile must follow all files
    whose initialized data areas should be dumped as pure by dump-emacs. */
 obj= dispnew.o frame.o scroll.o xdisp.o menu.o $(XMENU_OBJ) window.o \
-	charset.o coding.o category.o ccl.o character.o chartab.o \
+	charset.o coding.o category.o ccl.o character.o chartab.o bidi.o \
 	cm.o term.o terminal.o xfaces.o $(XOBJ) $(GTK_OBJ) $(DBUS_OBJ) \
 	emacs.o keyboard.o macros.o keymap.o sysdep.o \
 	buffer.o filelock.o insdel.o marker.o \
@@ -982,6 +982,7 @@ doc.o: buildobj.h
 
 atimer.o: atimer.c atimer.h syssignal.h systime.h lisp.h blockinput.h \
    $(config_h)
+bidi.o: bidi.c buffer.h character.h dispextern.h lisp.h $(config_h)
 buffer.o: buffer.c buffer.h region-cache.h commands.h window.h \
    $(INTERVALS_H) blockinput.h atimer.h systime.h character.h \
    indent.h keyboard.h coding.h keymap.h frame.h lisp.h $(config_h)
diff --git a/src/bidi.c b/src/bidi.c
new file mode 100644
index 00000000000..8089bf811a6
--- /dev/null
+++ b/src/bidi.c
@@ -0,0 +1,2028 @@
+/* Low-level bidirectional buffer-scanning functions for GNU Emacs.
+   Copyright (C) 2000, 2001, 2004, 2005, 2009 Free Software Foundation, Inc.
+
+This file is part of GNU Emacs.
+
+GNU Emacs is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Emacs is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+
+You should have received a copy of the GNU General Public License
+along with GNU Emacs; see the file COPYING. If not, write to
+the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
+Boston, MA 02110-1301, USA. */
+
+/* Written by Eli Zaretskii <eliz@gnu.org>.
+
+   A sequential implementation of the Unicode Bidirectional algorithm,
+   as per UAX#9, a part of the Unicode Standard.
+
+   Unlike the reference and most other implementations, this one is
+   designed to be called once for every character in the buffer.
+
+   The main entry point is bidi_get_next_char_visually. Each time it
+   is called, it finds the next character in the visual order, and
+   returns its information in a special structure. The caller is then
+   expected to process this character for display or any other
+   purposes, and call bidi_get_next_char_visually for the next
+   character. See the comments in bidi_get_next_char_visually for
+   more details about its algorithm that finds the next visual-order
+   character by resolving their levels on the fly.
+
+   If you want to understand the code, you will have to read it
+   together with the relevant portions of UAX#9. The comments include
+   references to UAX#9 rules, for that very reason.
+
+   A note about references to UAX#9 rules: if the reference says
+   something like "X9/Retaining", it means that you need to refer to
+   rule X9 and to its modifications described in the "Implementation
+   Notes" section of UAX#9, under "Retaining Format Codes". */
+
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <stdio.h>
+
+#ifdef HAVE_STRING_H
+#include <string.h>
+#endif
+
+#include <setjmp.h>
+
+#include "lisp.h"
+#include "buffer.h"
+#include "character.h"
+#include "dispextern.h"
+
+static int bidi_initialized = 0;
+
+static Lisp_Object bidi_type_table;
+
+/* FIXME: Remove these when bidi_explicit_dir_char uses a lookup table. */
+#define LRM_CHAR 0x200E
+#define RLM_CHAR 0x200F
+#define LRE_CHAR 0x202A
+#define RLE_CHAR 0x202B
+#define PDF_CHAR 0x202C
+#define LRO_CHAR 0x202D
+#define RLO_CHAR 0x202E
+
+#define BIDI_EOB -1
+#define BIDI_BOB -2 /* FIXME: Is this needed? */
+
+/* Local data structures. (Look in dispextern.h for the rest.) */
+
+/* What we need to know about the current paragraph. */
+struct bidi_paragraph_info {
+  int start_bytepos; /* byte position where it begins */
+  int end_bytepos; /* byte position where it ends */
+  int embedding_level; /* its basic embedding level */
+  bidi_dir_t base_dir; /* its base direction */
+};
+
+/* Data type for describing the bidirectional character categories. */
+typedef enum {
+  UNKNOWN_BC,
+  NEUTRAL,
+  WEAK,
+  STRONG
+} bidi_category_t;
+
+int bidi_ignore_explicit_marks_for_paragraph_level = 1;
+
+static Lisp_Object fallback_paragraph_start_re, fallback_paragraph_separate_re;
+static Lisp_Object Qparagraph_start, Qparagraph_separate;
+
+static void
+bidi_initialize ()
+{
+  /* FIXME: This should come from the Unicode Database. */
+  struct {
+    int from, to;
+    bidi_type_t type;
+  } bidi_type[] =
+    { { 0x0000, 0x0008, WEAK_BN },
+      { 0x0009, 0x0000, NEUTRAL_S },
+      { 0x000A, 0x0000, NEUTRAL_B },
+      { 0x000B, 0x0000, NEUTRAL_S },
+      { 0x000C, 0x0000, NEUTRAL_WS },
+      { 0x000D, 0x0000, NEUTRAL_B },
+      { 0x000E, 0x001B, WEAK_BN },
+      { 0x001C, 0x001E, NEUTRAL_B },
+      { 0x001F, 0x0000, NEUTRAL_S },
+      { 0x0020, 0x0000, NEUTRAL_WS },
+      { 0x0021, 0x0022, NEUTRAL_ON },
+      { 0x0023, 0x0025, WEAK_ET },
+      { 0x0026, 0x002A, NEUTRAL_ON },
+      { 0x002B, 0x0000, WEAK_ES },
+      { 0x002C, 0x0000, WEAK_CS },
+      { 0x002D, 0x0000, WEAK_ES },
+      { 0x002E, 0x002F, WEAK_CS },
+      { 0x0030, 0x0039, WEAK_EN },
+      { 0x003A, 0x0000, WEAK_CS },
+      { 0x003B, 0x0040, NEUTRAL_ON },
+      { 0x005B, 0x0060, NEUTRAL_ON },
+      { 0x007B, 0x007E, NEUTRAL_ON },
+      { 0x007F, 0x0084, WEAK_BN },
+      { 0x0085, 0x0000, NEUTRAL_B },
+      { 0x0086, 0x009F, WEAK_BN },
+      { 0x00A0, 0x0000, WEAK_CS },
+      { 0x00A1, 0x0000, NEUTRAL_ON },
+      { 0x00A2, 0x00A5, WEAK_ET },
+      { 0x00A6, 0x00A9, NEUTRAL_ON },
+      { 0x00AB, 0x00AC, NEUTRAL_ON },
+      { 0x00AD, 0x0000, WEAK_BN },
+      { 0x00AE, 0x00Af, NEUTRAL_ON },
+      { 0x00B0, 0x00B1, WEAK_ET },
+      { 0x00B2, 0x00B3, WEAK_EN },
+      { 0x00B4, 0x0000, NEUTRAL_ON },
+      { 0x00B6, 0x00B8, NEUTRAL_ON },
+      { 0x00B9, 0x0000, WEAK_EN },
+      { 0x00BB, 0x00BF, NEUTRAL_ON },
+      { 0x00D7, 0x0000, NEUTRAL_ON },
+      { 0x00F7, 0x0000, NEUTRAL_ON },
+      { 0x02B9, 0x02BA, NEUTRAL_ON },
+      { 0x02C2, 0x02CF, NEUTRAL_ON },
+      { 0x02D2, 0x02DF, NEUTRAL_ON },
+      { 0x02E5, 0x02ED, NEUTRAL_ON },
+      { 0x0300, 0x036F, WEAK_NSM },
+      { 0x0374, 0x0375, NEUTRAL_ON },
+      { 0x037E, 0x0385, NEUTRAL_ON },
+      { 0x0387, 0x0000, NEUTRAL_ON },
+      { 0x03F6, 0x0000, NEUTRAL_ON },
+      { 0x0483, 0x0489, WEAK_NSM },
+      { 0x058A, 0x0000, NEUTRAL_ON },
+      { 0x0591, 0x05BD, WEAK_NSM },
+      { 0x05BE, 0x0000, STRONG_R },
+      { 0x05BF, 0x0000, WEAK_NSM },
+      { 0x05C0, 0x0000, STRONG_R },
+      { 0x05C1, 0x05C2, WEAK_NSM },
+      { 0x05C3, 0x0000, STRONG_R },
+      { 0x05C4, 0x05C5, WEAK_NSM },
+      { 0x05C6, 0x0000, STRONG_R },
+      { 0x05C7, 0x0000, WEAK_NSM },
+      { 0x05D0, 0x05F4, STRONG_R },
+      { 0x060C, 0x0000, WEAK_CS },
+      { 0x061B, 0x064A, STRONG_AL },
+      { 0x064B, 0x0655, WEAK_NSM },
+      { 0x0660, 0x0669, WEAK_AN },
+      { 0x066A, 0x0000, WEAK_ET },
+      { 0x066B, 0x066C, WEAK_AN },
+      { 0x066D, 0x066F, STRONG_AL },
+      { 0x0670, 0x0000, WEAK_NSM },
+      { 0x0671, 0x06D5, STRONG_AL },
+      { 0x06D6, 0x06DC, WEAK_NSM },
+      { 0x06DD, 0x0000, STRONG_AL },
+      { 0x06DE, 0x06E4, WEAK_NSM },
+      { 0x06E5, 0x06E6, STRONG_AL },
+      { 0x06E7, 0x06E8, WEAK_NSM },
+      { 0x06E9, 0x0000, NEUTRAL_ON },
+      { 0x06EA, 0x06ED, WEAK_NSM },
+      { 0x06F0, 0x06F9, WEAK_EN },
+      { 0x06FA, 0x070D, STRONG_AL },
+      { 0x070F, 0x0000, WEAK_BN },
+      { 0x0710, 0x0000, STRONG_AL },
+      { 0x0711, 0x0000, WEAK_NSM },
+      { 0x0712, 0x072C, STRONG_AL },
+      { 0x0730, 0x074A, WEAK_NSM },
+      { 0x0780, 0x07A5, STRONG_AL },
+      { 0x07A6, 0x07B0, WEAK_NSM },
+      { 0x07B1, 0x0000, STRONG_AL },
+      { 0x0901, 0x0902, WEAK_NSM },
+      { 0x093C, 0x0000, WEAK_NSM },
+      { 0x0941, 0x0948, WEAK_NSM },
+      { 0x094D, 0x0000, WEAK_NSM },
+      { 0x0951, 0x0954, WEAK_NSM },
+      { 0x0962, 0x0963, WEAK_NSM },
+      { 0x0981, 0x0000, WEAK_NSM },
+      { 0x09BC, 0x0000, WEAK_NSM },
+      { 0x09C1, 0x09C4, WEAK_NSM },
+      { 0x09CD, 0x0000, WEAK_NSM },
+      { 0x09E2, 0x09E3, WEAK_NSM },
+      { 0x09F2, 0x09F3, WEAK_ET },
+      { 0x0A02, 0x0000, WEAK_NSM },
+      { 0x0A3C, 0x0000, WEAK_NSM },
+      { 0x0A41, 0x0A4D, WEAK_NSM },
+      { 0x0A70, 0x0A71, WEAK_NSM },
+      { 0x0A81, 0x0A82, WEAK_NSM },
+      { 0x0ABC, 0x0000, WEAK_NSM },
+      { 0x0AC1, 0x0AC8, WEAK_NSM },
+      { 0x0ACD, 0x0000, WEAK_NSM },
+      { 0x0B01, 0x0000, WEAK_NSM },
+      { 0x0B3C, 0x0000, WEAK_NSM },
+      { 0x0B3F, 0x0000, WEAK_NSM },
+      { 0x0B41, 0x0B43, WEAK_NSM },
+      { 0x0B4D, 0x0B56, WEAK_NSM },
+      { 0x0B82, 0x0000, WEAK_NSM },
+      { 0x0BC0, 0x0000, WEAK_NSM },
+      { 0x0BCD, 0x0000, WEAK_NSM },
+      { 0x0C3E, 0x0C40, WEAK_NSM },
+      { 0x0C46, 0x0C56, WEAK_NSM },
+      { 0x0CBF, 0x0000, WEAK_NSM },
+      { 0x0CC6, 0x0000, WEAK_NSM },
+      { 0x0CCC, 0x0CCD, WEAK_NSM },
+      { 0x0D41, 0x0D43, WEAK_NSM },
+      { 0x0D4D, 0x0000, WEAK_NSM },
+      { 0x0DCA, 0x0000, WEAK_NSM },
+      { 0x0DD2, 0x0DD6, WEAK_NSM },
+      { 0x0E31, 0x0000, WEAK_NSM },
+      { 0x0E34, 0x0E3A, WEAK_NSM },
+      { 0x0E3F, 0x0000, WEAK_ET },
+      { 0x0E47, 0x0E4E, WEAK_NSM },
+      { 0x0EB1, 0x0000, WEAK_NSM },
+      { 0x0EB4, 0x0EBC, WEAK_NSM },
+      { 0x0EC8, 0x0ECD, WEAK_NSM },
+      { 0x0F18, 0x0F19, WEAK_NSM },
+      { 0x0F35, 0x0000, WEAK_NSM },
+      { 0x0F37, 0x0000, WEAK_NSM },
+      { 0x0F39, 0x0000, WEAK_NSM },
+      { 0x0F3A, 0x0F3D, NEUTRAL_ON },
+      { 0x0F71, 0x0F7E, WEAK_NSM },
+      { 0x0F80, 0x0F84, WEAK_NSM },
+      { 0x0F86, 0x0F87, WEAK_NSM },
+      { 0x0F90, 0x0FBC, WEAK_NSM },
+      { 0x0FC6, 0x0000, WEAK_NSM },
+      { 0x102D, 0x1030, WEAK_NSM },
+      { 0x1032, 0x1037, WEAK_NSM },
+      { 0x1039, 0x0000, WEAK_NSM },
+      { 0x1058, 0x1059, WEAK_NSM },
+      { 0x1680, 0x0000, NEUTRAL_WS },
+      { 0x169B, 0x169C, NEUTRAL_ON },
+      { 0x1712, 0x1714, WEAK_NSM },
+      { 0x1732, 0x1734, WEAK_NSM },
+      { 0x1752, 0x1753, WEAK_NSM },
+      { 0x1772, 0x1773, WEAK_NSM },
+      { 0x17B7, 0x17BD, WEAK_NSM },
+      { 0x17C6, 0x0000, WEAK_NSM },
+      { 0x17C9, 0x17D3, WEAK_NSM },
+      { 0x17DB, 0x0000, WEAK_ET },
+      { 0x1800, 0x180A, NEUTRAL_ON },
+      { 0x180B, 0x180D, WEAK_NSM },
+      { 0x180E, 0x0000, WEAK_BN },
+      { 0x18A9, 0x0000, WEAK_NSM },
+      { 0x1FBD, 0x0000, NEUTRAL_ON },
+      { 0x1FBF, 0x1FC1, NEUTRAL_ON },
+      { 0x1FCD, 0x1FCF, NEUTRAL_ON },
+      { 0x1FDD, 0x1FDF, NEUTRAL_ON },
+      { 0x1FED, 0x1FEF, NEUTRAL_ON },
+      { 0x1FFD, 0x1FFE, NEUTRAL_ON },
+      { 0x2000, 0x200A, NEUTRAL_WS },
+      { 0x200B, 0x200D, WEAK_BN },
+      { 0x200F, 0x0000, STRONG_R },
+      { 0x2010, 0x2027, NEUTRAL_ON },
+      { 0x2028, 0x0000, NEUTRAL_WS },
+      { 0x2029, 0x0000, NEUTRAL_B },
+      { 0x202A, 0x0000, LRE },
+      { 0x202B, 0x0000, RLE },
+      { 0x202C, 0x0000, PDF },
+      { 0x202D, 0x0000, LRO },
+      { 0x202E, 0x0000, RLO },
+      { 0x202F, 0x0000, NEUTRAL_WS },
+      { 0x2030, 0x2034, WEAK_ET },
+      { 0x2035, 0x2057, NEUTRAL_ON },
+      { 0x205F, 0x0000, NEUTRAL_WS },
+      { 0x2060, 0x206F, WEAK_BN },
+      { 0x2070, 0x0000, WEAK_EN },
+      { 0x2074, 0x2079, WEAK_EN },
+      { 0x207A, 0x207B, WEAK_ET },
+      { 0x207C, 0x207E, NEUTRAL_ON },
+      { 0x2080, 0x2089, WEAK_EN },
+      { 0x208A, 0x208B, WEAK_ET },
+      { 0x208C, 0x208E, NEUTRAL_ON },
+      { 0x20A0, 0x20B1, WEAK_ET },
+      { 0x20D0, 0x20EA, WEAK_NSM },
+      { 0x2100, 0x2101, NEUTRAL_ON },
+      { 0x2103, 0x2106, NEUTRAL_ON },
+      { 0x2108, 0x2109, NEUTRAL_ON },
+      { 0x2114, 0x0000, NEUTRAL_ON },
+      { 0x2116, 0x2118, NEUTRAL_ON },
+      { 0x211E, 0x2123, NEUTRAL_ON },
+      { 0x2125, 0x0000, NEUTRAL_ON },
+      { 0x2127, 0x0000, NEUTRAL_ON },
+      { 0x2129, 0x0000, NEUTRAL_ON },
+      { 0x212E, 0x0000, WEAK_ET },
+      { 0x2132, 0x0000, NEUTRAL_ON },
+      { 0x213A, 0x0000, NEUTRAL_ON },
+      { 0x2140, 0x2144, NEUTRAL_ON },
+      { 0x214A, 0x215F, NEUTRAL_ON },
+      { 0x2190, 0x2211, NEUTRAL_ON },
+      { 0x2212, 0x2213, WEAK_ET },
+      { 0x2214, 0x2335, NEUTRAL_ON },
+      { 0x237B, 0x2394, NEUTRAL_ON },
+      { 0x2396, 0x244A, NEUTRAL_ON },
+      { 0x2460, 0x249B, WEAK_EN },
+      { 0x24EA, 0x0000, WEAK_EN },
+      { 0x24EB, 0x2FFB, NEUTRAL_ON },
+      { 0x3000, 0x0000, NEUTRAL_WS },
+      { 0x3001, 0x3004, NEUTRAL_ON },
+      { 0x3008, 0x3020, NEUTRAL_ON },
+      { 0x302A, 0x302F, WEAK_NSM },
+      { 0x3030, 0x0000, NEUTRAL_ON },
+      { 0x3036, 0x3037, NEUTRAL_ON },
+      { 0x303D, 0x303F, NEUTRAL_ON },
+      { 0x3099, 0x309A, WEAK_NSM },
+      { 0x309B, 0x309C, NEUTRAL_ON },
+      { 0x30A0, 0x0000, NEUTRAL_ON },
+      { 0x30FB, 0x0000, NEUTRAL_ON },
+      { 0x3251, 0x325F, NEUTRAL_ON },
+      { 0x32B1, 0x32BF, NEUTRAL_ON },
+      { 0xA490, 0xA4C6, NEUTRAL_ON },
+      { 0xFB1D, 0x0000, STRONG_R },
+      { 0xFB1E, 0x0000, WEAK_NSM },
+      { 0xFB1F, 0xFB28, STRONG_R },
+      { 0xFB29, 0x0000, WEAK_ET },
+      { 0xFB2A, 0xFB4F, STRONG_R },
+      { 0xFB50, 0xFD3D, STRONG_AL },
+      { 0xFD3E, 0xFD3F, NEUTRAL_ON },
+      { 0xFD50, 0xFDFC, STRONG_AL },
+      { 0xFE00, 0xFE23, WEAK_NSM },
+      { 0xFE30, 0xFE4F, NEUTRAL_ON },
+      { 0xFE50, 0x0000, WEAK_CS },
+      { 0xFE51, 0x0000, NEUTRAL_ON },
+      { 0xFE52, 0x0000, WEAK_CS },
+      { 0xFE54, 0x0000, NEUTRAL_ON },
+      { 0xFE55, 0x0000, WEAK_CS },
+      { 0xFE56, 0xFE5E, NEUTRAL_ON },
+      { 0xFE5F, 0x0000, WEAK_ET },
+      { 0xFE60, 0xFE61, NEUTRAL_ON },
+      { 0xFE62, 0xFE63, WEAK_ET },
+      { 0xFE64, 0xFE68, NEUTRAL_ON },
+      { 0xFE69, 0xFE6A, WEAK_ET },
+      { 0xFE6B, 0x0000, NEUTRAL_ON },
+      { 0xFE70, 0xFEFC, STRONG_AL },
+      { 0xFEFF, 0x0000, WEAK_BN },
+      { 0xFF01, 0xFF02, NEUTRAL_ON },
+      { 0xFF03, 0xFF05, WEAK_ET },
+      { 0xFF06, 0xFF0A, NEUTRAL_ON },
+      { 0xFF0B, 0x0000, WEAK_ET },
+      { 0xFF0C, 0x0000, WEAK_CS },
+      { 0xFF0D, 0x0000, WEAK_ET },
+      { 0xFF0E, 0x0000, WEAK_CS },
+      { 0xFF0F, 0x0000, WEAK_ES },
+      { 0xFF10, 0xFF19, WEAK_EN },
+      { 0xFF1A, 0x0000, WEAK_CS },
+      { 0xFF1B, 0xFF20, NEUTRAL_ON },
+      { 0xFF3B, 0xFF40, NEUTRAL_ON },
+      { 0xFF5B, 0xFF65, NEUTRAL_ON },
+      { 0xFFE0, 0xFFE1, WEAK_ET },
+      { 0xFFE2, 0xFFE4, NEUTRAL_ON },
+      { 0xFFE5, 0xFFE6, WEAK_ET },
+      { 0xFFE8, 0xFFEE, NEUTRAL_ON },
+      { 0xFFF9, 0xFFFB, WEAK_BN },
+      { 0xFFFC, 0xFFFD, NEUTRAL_ON },
+      { 0x1D167, 0x1D169, WEAK_NSM },
+      { 0x1D173, 0x1D17A, WEAK_BN },
+      { 0x1D17B, 0x1D182, WEAK_NSM },
+      { 0x1D185, 0x1D18B, WEAK_NSM },
+      { 0x1D1AA, 0x1D1AD, WEAK_NSM },
+      { 0x1D7CE, 0x1D7FF, WEAK_EN },
+      { 0xE0001, 0xE007F, WEAK_BN } };
+  int i;
+
+  bidi_type_table = Fmake_char_table (Qnil, make_number (STRONG_L));
+  staticpro (&bidi_type_table);
+
+  for (i = 0; i < sizeof bidi_type / sizeof bidi_type[0]; i++)
+    char_table_set_range (bidi_type_table, bidi_type[i].from,
+                          bidi_type[i].to ? bidi_type[i].to : bidi_type[i].from,
+                          make_number (bidi_type[i].type));
+
+  fallback_paragraph_start_re =
+    XSYMBOL (Fintern_soft (build_string ("paragraph-start"), Qnil))->value;
+  if (!STRINGP (fallback_paragraph_start_re))
+    fallback_paragraph_start_re = build_string ("\f\\|[ \t]*$");
+  staticpro (&fallback_paragraph_start_re);
+  Qparagraph_start = intern ("paragraph-start");
+  staticpro (&Qparagraph_start);
+  fallback_paragraph_separate_re =
+    XSYMBOL (Fintern_soft (build_string ("paragraph-separate"), Qnil))->value;
+  if (!STRINGP (fallback_paragraph_separate_re))
+    fallback_paragraph_separate_re = build_string ("[ \t\f]*$");
+  staticpro (&fallback_paragraph_separate_re);
+  Qparagraph_separate = intern ("paragraph-separate");
+  staticpro (&Qparagraph_separate);
+  bidi_initialized = 1;
+}
+
+/* Return the bidi type of a character CH, subject to the current
+   directional OVERRIDE. */
+bidi_type_t
+bidi_get_type (int ch, bidi_dir_t override)
+{
+  bidi_type_t default_type;
+
+  if (ch == BIDI_EOB)
+    return NEUTRAL_B;
+  if (ch < 0 || ch > MAX_CHAR)
+    abort ();
+
+  default_type = (bidi_type_t) XINT (CHAR_TABLE_REF (bidi_type_table, ch));
+
+  if (override == NEUTRAL_DIR)
+    return default_type;
+
+  switch (default_type)
+    {
+      /* Although UAX#9 does not tell, it doesn't make sense to
+         override NEUTRAL_B and LRM/RLM characters. */
+      case NEUTRAL_B:
+      case LRE:
+      case LRO:
+      case RLE:
+      case RLO:
+      case PDF:
+        return default_type;
+      default:
+        switch (ch)
+          {
+            case LRM_CHAR:
+            case RLM_CHAR:
+              return default_type;
+            default:
+              if (override == L2R) /* X6 */
+                return STRONG_L;
+              else if (override == R2L)
+                return STRONG_R;
+              else
+                abort (); /* can't happen: handled above */
+          }
+    }
+}
+
+void
+bidi_check_type (bidi_type_t type)
+{
+  if (type < UNKNOWN_BT || type > NEUTRAL_ON)
+    abort ();
+}
+
+/* Given a bidi TYPE of a character, return its category. */
+bidi_category_t
+bidi_get_category (bidi_type_t type)
+{
+  switch (type)
+    {
+      case UNKNOWN_BT:
+        return UNKNOWN_BC;
+      case STRONG_L:
+      case STRONG_R:
+      case STRONG_AL:
+      case LRE:
+      case LRO:
+      case RLE:
+      case RLO:
+        return STRONG;
+      case PDF: /* ??? really?? */
+      case WEAK_EN:
+      case WEAK_ES:
+      case WEAK_ET:
+      case WEAK_AN:
+      case WEAK_CS:
+      case WEAK_NSM:
+      case WEAK_BN:
+        return WEAK;
+      case NEUTRAL_B:
+      case NEUTRAL_S:
+      case NEUTRAL_WS:
+      case NEUTRAL_ON:
+        return NEUTRAL;
+      default:
+        abort ();
+    }
+}
+
502/* Return the mirrored character of C, if any.
503
504 Note: The conditions in UAX#9 clause L4 must be tested by the
505 caller. */
506/* FIXME: exceedingly temporary! Should consult the Unicode database
507 of character properties. */
508int
509bidi_mirror_char (int c)
510{
511 static const char mirrored_pairs[] = "()<>[]{}";
512 const char *p = c > 0 && c < 128 ? strchr (mirrored_pairs, c) : NULL;
513
514 if (p)
515 {
516 size_t i = p - mirrored_pairs;
517
518 return mirrored_pairs [(i ^ 1)];
519 }
520 return c;
521}
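The lookup in bidi_mirror_char relies on each bracket and its partner sitting at adjacent even/odd indices of one string, so flipping the lowest index bit swaps them. A minimal stand-alone sketch of that trick follows; the name mirror_ascii is hypothetical and not part of the commit, and like the original it covers only the four ASCII pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone version of the pairing trick: partners sit
   at indices 2k and 2k+1, so XOR-ing the index with 1 swaps them.  */
static int
mirror_ascii (int c)
{
  static const char pairs[] = "()<>[]{}";
  const char *p;

  /* Exclude 0: strchr would otherwise match the terminating NUL.  */
  if (c <= 0 || c >= 128)
    return c;
  p = strchr (pairs, c);
  if (p == NULL)
    return c;                   /* no mirrored counterpart known */
  assert ((size_t) (p - pairs) < sizeof pairs - 1);
  return pairs[(size_t) (p - pairs) ^ 1];
}
```

Characters without a known partner are returned unchanged, which matches the fall-through behavior of the function above.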
522
523/* Copy the bidi iterator from FROM to TO. To save cycles, this only
524 copies the part of the level stack that is actually in use. */
525static inline void
526bidi_copy_it (struct bidi_it *to, struct bidi_it *from)
527{
528 int i;
529
530 /* Copy everything except the level stack and beyond. */
531 memcpy (to, from, ((size_t)&((struct bidi_it *)0)->level_stack[0]));
532
533 /* Copy the active part of the level stack. */
534 to->level_stack[0] = from->level_stack[0]; /* level zero is always in use */
535 for (i = 1; i <= from->stack_idx; i++)
536 to->level_stack[i] = from->level_stack[i];
537}
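The partial copy in bidi_copy_it depends on the level stack being the last member of the structure: everything before it is copied in one memcpy, then only the stack slots in use. The sketch below shows the same idea with the standard offsetof macro in place of the hand-written null-pointer expression. The structure demo_it, its members, and DEMO_MAXLEVEL are hypothetical stand-ins, not the real struct bidi_it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAXLEVEL 64        /* hypothetical stack depth */

struct demo_level { int level; int override; };

/* Hypothetical cut-down iterator: scalar state first, stack last.  */
struct demo_it
{
  long charpos;
  int stack_idx;
  struct demo_level level_stack[DEMO_MAXLEVEL];
};

static void
demo_copy_it (struct demo_it *to, const struct demo_it *from)
{
  int i;

  assert (from->stack_idx >= 0 && from->stack_idx < DEMO_MAXLEVEL);
  /* Copy every member that precedes the level stack.  */
  memcpy (to, from, offsetof (struct demo_it, level_stack));
  /* Copy only the slots in use; slot 0 always is.  */
  for (i = 0; i <= from->stack_idx; i++)
    to->level_stack[i] = from->level_stack[i];
}
```

Slots above stack_idx in the destination are left untouched, so the saving grows with the unused depth of the stack.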
538
539/* Caching the bidi iterator states. */
540
541static struct bidi_it bidi_cache[1000]; /* FIXME: make this dynamically allocated! */
542static int bidi_cache_idx;
543static int bidi_cache_last_idx;
544
545static inline void
546bidi_cache_reset (void)
547{
548 bidi_cache_idx = 0;
549 bidi_cache_last_idx = -1;
550}
551
552static inline void
553bidi_cache_fetch_state (int idx, struct bidi_it *bidi_it)
554{
555 int current_scan_dir = bidi_it->scan_dir;
556
557 if (idx < 0 || idx >= bidi_cache_idx)
558 abort ();
559
560 bidi_copy_it (bidi_it, &bidi_cache[idx]);
561 bidi_it->scan_dir = current_scan_dir;
562 bidi_cache_last_idx = idx;
563}
564
565/* Find a cached state with a given CHARPOS and resolved embedding
566 level less than or equal to LEVEL. If LEVEL is -1, disregard the
567 resolved levels in cached states. DIR, if non-zero, means search
568 in that direction from the last cache hit. */
569static inline int
570bidi_cache_search (int charpos, int level, int dir)
571{
572 int i, i_start;
573
574 if (bidi_cache_idx)
575 {
576 if (charpos < bidi_cache[bidi_cache_last_idx].charpos)
577 dir = -1;
578 else if (charpos > bidi_cache[bidi_cache_last_idx].charpos)
579 dir = 1;
580 if (dir)
581 i_start = bidi_cache_last_idx;
582 else
583 {
584 dir = -1;
585 i_start = bidi_cache_idx - 1;
586 }
587
588 if (dir < 0)
589 {
590 /* Linear search for now; FIXME! */
591 for (i = i_start; i >= 0; i--)
592 if (bidi_cache[i].charpos == charpos
593 && (level == -1 || bidi_cache[i].resolved_level <= level))
594 return i;
595 }
596 else
597 {
598 for (i = i_start; i < bidi_cache_idx; i++)
599 if (bidi_cache[i].charpos == charpos
600 && (level == -1 || bidi_cache[i].resolved_level <= level))
601 return i;
602 }
603 }
604
605 return -1;
606}
607
608/* Find a cached state where the resolved level changes to a value
609 that is lower than LEVEL, and return its cache slot index. DIR is
610 the direction to search, starting with the last used cache slot.
611 BEFORE, if non-zero, means return the index of the slot that is
612 ``before'' the level change in the search direction. That is,
613 given the cached levels like this:
614
615 1122333442211
616  AB       C
617
618 and assuming we are at the position cached at the slot marked with
619 C, searching backwards (DIR = -1) for LEVEL = 2 will return the
620 index of slot B or A, depending on whether BEFORE is, respectively,
621 non-zero or zero. */
622static int
623bidi_cache_find_level_change (int level, int dir, int before)
624{
625 if (bidi_cache_idx)
626 {
627 int i = dir ? bidi_cache_last_idx : bidi_cache_idx - 1;
628 int incr = before ? 1 : 0;
629
630 if (!dir)
631 dir = -1;
632 else if (!incr)
633 i += dir;
634
635 if (dir < 0)
636 {
637 while (i >= incr)
638 {
639 if (bidi_cache[i - incr].resolved_level >= 0
640 && bidi_cache[i - incr].resolved_level < level)
641 return i;
642 i--;
643 }
644 }
645 else
646 {
647 while (i < bidi_cache_idx - incr)
648 {
649 if (bidi_cache[i + incr].resolved_level >= 0
650 && bidi_cache[i + incr].resolved_level < level)
651 return i;
652 i++;
653 }
654 }
655 }
656
657 return -1;
658}
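The search described in the comment above can be tried on a plain array of levels. The sketch below is a hypothetical stand-alone variant, not the committed function: the array and its length replace the global cache, START replaces bidi_cache_last_idx, and DIR is restricted to -1 or 1.

```c
#include <assert.h>

/* Hypothetical stand-alone variant: LEVELS[0..N-1] plays the role of
   the cached resolved levels, and START that of the last used cache
   slot.  DIR must be -1 or 1 here.  */
static int
find_level_change (const int *levels, int n, int start,
                   int level, int dir, int before)
{
  int incr = before ? 1 : 0;
  int i = start;

  assert (dir == 1 || dir == -1);
  assert (start >= 0 && start < n);
  if (!incr)
    i += dir;                   /* skip the starting slot itself */

  if (dir < 0)
    {
      for (; i >= incr; i--)
        if (levels[i - incr] >= 0 && levels[i - incr] < level)
          return i;
    }
  else
    {
      for (; i < n - incr; i++)
        if (levels[i + incr] >= 0 && levels[i + incr] < level)
          return i;
    }
  return -1;
}
```

With the levels 1122333442211 from the comment and a start slot inside the trailing run of 2s, a backward search for LEVEL = 2 stops at the last slot holding level 1 when BEFORE is zero, and at the slot just after it when BEFORE is non-zero.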
659
660static inline void
661bidi_cache_iterator_state (struct bidi_it *bidi_it, int resolved)
662{
663 int idx;
664
665 /* We should never cache on backward scans. */
666 if (bidi_it->scan_dir == -1)
667 abort ();
668 idx = bidi_cache_search (bidi_it->charpos, -1, 1);
669
670 if (idx < 0)
671 {
672 idx = bidi_cache_idx;
673 /* Don't overrun the cache limit. */
674 if (idx > sizeof (bidi_cache) / sizeof (bidi_cache[0]) - 1)
675 abort ();
676 /* Don't violate cache integrity: character positions should
677 correspond to cache positions 1:1. */
678 if (idx > 0 && bidi_it->charpos != bidi_cache[idx - 1].charpos + 1)
679 abort ();
680 bidi_copy_it (&bidi_cache[idx], bidi_it);
681 if (!resolved)
682 bidi_cache[idx].resolved_level = -1;
683 bidi_cache[idx].new_paragraph = 0;
684 }
685 else
686 {
687 /* Copy only the members which could have changed, to avoid
688 costly copying of the entire struct. */
689 bidi_cache[idx].type = bidi_it->type;
690 bidi_check_type (bidi_it->type);
691 bidi_cache[idx].type_after_w1 = bidi_it->type_after_w1;
692 bidi_check_type (bidi_it->type_after_w1);
693 if (resolved)
694 bidi_cache[idx].resolved_level = bidi_it->resolved_level;
695 else
696 bidi_cache[idx].resolved_level = -1;
697 bidi_cache[idx].invalid_levels = bidi_it->invalid_levels;
698 bidi_cache[idx].invalid_rl_levels = bidi_it->invalid_rl_levels;
699 bidi_cache[idx].next_for_neutral = bidi_it->next_for_neutral;
700 bidi_cache[idx].next_for_ws = bidi_it->next_for_ws;
701 bidi_cache[idx].ignore_bn_limit = bidi_it->ignore_bn_limit;
702 }
703
704 bidi_cache_last_idx = idx;
705 if (idx >= bidi_cache_idx)
706 bidi_cache_idx = idx + 1;
707}
708
709static inline bidi_type_t
710bidi_cache_find (int charpos, int level, struct bidi_it *bidi_it)
711{
712 int i = bidi_cache_search (charpos, level, bidi_it->scan_dir);
713
714 if (i >= 0)
715 {
716 bidi_dir_t current_scan_dir = bidi_it->scan_dir;
717
718 bidi_copy_it (bidi_it, &bidi_cache[i]);
719 bidi_cache_last_idx = i;
720 /* Don't let the scan direction from the cached state override
721 the current scan direction. */
722 bidi_it->scan_dir = current_scan_dir;
723 return bidi_it->type;
724 }
725
726 return UNKNOWN_BT;
727}
728
729static inline int
730bidi_peek_at_next_level (struct bidi_it *bidi_it)
731{
732 if (bidi_cache_idx == 0 || bidi_cache_last_idx == -1)
733 abort ();
734 return bidi_cache[bidi_cache_last_idx + bidi_it->scan_dir].resolved_level;
735}
736
737/* Check if buffer position CHARPOS/BYTEPOS is the end of a paragraph.
738 Value is the non-negative length of the paragraph separator
739 following the buffer position, -1 if position is at the beginning
740 of a new paragraph, or -2 if position is neither at beginning nor
741 at end of a paragraph. */
742EMACS_INT
743bidi_at_paragraph_end (EMACS_INT charpos, EMACS_INT bytepos)
744{
745 Lisp_Object sep_re = Fbuffer_local_value (Qparagraph_separate,
746 Fcurrent_buffer ());
747 Lisp_Object start_re = Fbuffer_local_value (Qparagraph_start,
748 Fcurrent_buffer ());
749 EMACS_INT val;
750
751 if (!STRINGP (sep_re))
752 sep_re = fallback_paragraph_separate_re;
753 if (!STRINGP (start_re))
754 start_re = fallback_paragraph_start_re;
755
756 val = fast_looking_at (sep_re, charpos, bytepos, ZV, ZV_BYTE, Qnil);
757 if (val < 0)
758 {
759 if (fast_looking_at (start_re, charpos, bytepos, ZV, ZV_BYTE, Qnil) >= 0)
760 val = -1;
761 else
762 val = -2;
763 }
764
765 return val;
766}
767
768/* Determine the start-of-run (sor) directional type given the two
769 embedding levels on either side of the run boundary. Also, update
770 the saved info about previously seen characters, since that info is
771 generally valid for a single level run. */
772static inline void
773bidi_set_sor_type (struct bidi_it *bidi_it, int level_before, int level_after)
774{
775 int higher_level = level_before > level_after ? level_before : level_after;
776
777 /* The prev_was_pdf gork is required for when we have several PDFs
778 in a row. In that case, we want to compute the sor type for the
779 next level run only once: when we see the first PDF. That's
780 because the sor type depends only on the higher of the two levels
781 that we find on the two sides of the level boundary (see UAX#9,
782 clause X10), and so we don't need to know the final embedding
783 level to which we descend after processing all the PDFs. */
784 if (!bidi_it->prev_was_pdf || level_before < level_after)
785 /* FIXME: should the default sor direction be user selectable? */
786 bidi_it->sor = (higher_level & 1) != 0 ? R2L : L2R;
787 if (level_before > level_after)
788 bidi_it->prev_was_pdf = 1;
789
790 bidi_it->prev.type = UNKNOWN_BT;
791 bidi_it->last_strong.type = bidi_it->last_strong.type_after_w1 =
792 bidi_it->last_strong.orig_type = UNKNOWN_BT;
793 bidi_it->prev_for_neutral.type = bidi_it->sor == R2L ? STRONG_R : STRONG_L;
794 bidi_it->prev_for_neutral.charpos = bidi_it->charpos;
795 bidi_it->prev_for_neutral.bytepos = bidi_it->bytepos;
796 bidi_it->next_for_neutral.type = bidi_it->next_for_neutral.type_after_w1 =
797 bidi_it->next_for_neutral.orig_type = UNKNOWN_BT;
798 bidi_it->ignore_bn_limit = 0; /* meaning it's unknown */
799}
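The sor computation in bidi_set_sor_type reduces to the parity of the higher of the two levels at the run boundary (UAX#9, clause X10): odd means right-to-left, even means left-to-right. A minimal sketch, using a hypothetical demo_dir enumeration rather than the real bidi_dir_t:

```c
#include <assert.h>

enum demo_dir { DEMO_L2R, DEMO_R2L };   /* hypothetical stand-in */

/* Direction of the start-of-run, from the levels on both sides of
   the boundary: odd higher level gives R2L, even gives L2R.  */
static enum demo_dir
sor_direction (int level_before, int level_after)
{
  int higher = level_before > level_after ? level_before : level_after;

  assert (level_before >= 0 && level_after >= 0);
  return (higher & 1) ? DEMO_R2L : DEMO_L2R;
}
```

This sketch leaves out the prev_was_pdf bookkeeping, which only decides when the value is recomputed, not what it is.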
800
801static void
802bidi_line_init (struct bidi_it *bidi_it)
803{
804 bidi_it->scan_dir = 1; /* FIXME: do we need to have control on this? */
805 bidi_it->resolved_level = bidi_it->level_stack[0].level;
806 bidi_it->level_stack[0].override = NEUTRAL_DIR; /* X1 */
807 bidi_it->invalid_levels = 0;
808 bidi_it->invalid_rl_levels = -1;
809 bidi_it->next_en_pos = -1;
810 bidi_it->next_for_ws.type = UNKNOWN_BT;
811 bidi_set_sor_type (bidi_it,
812 bidi_it->paragraph_dir == R2L ? 1 : 0,
813 bidi_it->level_stack[0].level); /* X10 */
814
815 bidi_cache_reset ();
816}
817
818/* Find the beginning of this paragraph by looking back in the buffer.
819 Value is the byte position of the paragraph's beginning. */
820static EMACS_INT
821bidi_find_paragraph_start (EMACS_INT pos, EMACS_INT pos_byte)
822{
823 Lisp_Object re = Fbuffer_local_value (Qparagraph_start, Fcurrent_buffer ());
824 EMACS_INT limit = ZV, limit_byte = ZV_BYTE;
825
826 if (!STRINGP (re))
827 re = fallback_paragraph_start_re;
828 while (pos_byte > BEGV_BYTE
829 && fast_looking_at (re, pos, pos_byte, limit, limit_byte, Qnil) < 0)
830 {
831 pos = find_next_newline_no_quit (pos - 1, -1);
832 pos_byte = CHAR_TO_BYTE (pos);
833 }
834 return pos_byte;
835}
836
837/* Determine the direction, a.k.a. base embedding level, of the
838 paragraph we are about to iterate through. If DIR is either L2R or
839 R2L, just use that. Otherwise, determine the paragraph direction
840 from the first strong character of the paragraph.
841
842 Note that this gives the paragraph separator the same direction as
843 the preceding paragraph, even though Emacs generally views the
844 separator as not belonging to any paragraph. */
845void
846bidi_paragraph_init (bidi_dir_t dir, struct bidi_it *bidi_it)
847{
848 EMACS_INT bytepos = bidi_it->bytepos;
849
850 /* Special case for an empty buffer. */
851 if (bytepos == BEGV_BYTE && bytepos == ZV_BYTE)
852 dir = L2R;
853 /* We should never be called at EOB or before BEGV. */
854 else if (bytepos >= ZV_BYTE || bytepos < BEGV_BYTE)
855 abort ();
856
857 if (dir == L2R)
858 {
859 bidi_it->paragraph_dir = L2R;
860 bidi_it->new_paragraph = 0;
861 }
862 else if (dir == R2L)
863 {
864 bidi_it->paragraph_dir = R2L;
865 bidi_it->new_paragraph = 0;
866 }
867 else if (dir == NEUTRAL_DIR) /* P2 */
868 {
869 int ch, ch_len;
870 EMACS_INT pos;
871 bidi_type_t type;
872 EMACS_INT sep_len;
873
874 /* If we are inside a paragraph separator, we are just waiting
875 for the separator to be exhausted; use the previous paragraph
876 direction. But don't do that if we have been just reseated,
877 because we need to reinitialize below in that case. */
878 if (!bidi_it->first_elt
879 && bidi_it->charpos < bidi_it->separator_limit)
880 return;
881
882 /* If we are on a newline, get past it to where the next
883 paragraph might start. But don't do that at BEGV since then
884 we are potentially in a new paragraph that doesn't yet
885 exist. */
886 pos = bidi_it->charpos;
887 if (bytepos > BEGV_BYTE && FETCH_CHAR (bytepos) == '\n')
888 {
889 bytepos++;
890 pos++;
891 }
892
893 /* We are either at the beginning of a paragraph or in the
894 middle of it. Find where this paragraph starts. */
895 bytepos = bidi_find_paragraph_start (pos, bytepos);
896
897 /* We should always be at the beginning of a new line at this
898 point. */
899 if (!(bytepos == BEGV_BYTE || FETCH_CHAR (bytepos - 1) == '\n'))
900 abort ();
901
902 bidi_it->separator_limit = -1;
903 bidi_it->new_paragraph = 0;
904 ch = FETCH_CHAR (bytepos);
905 ch_len = CHAR_BYTES (ch);
906 pos = BYTE_TO_CHAR (bytepos);
907 type = bidi_get_type (ch, NEUTRAL_DIR);
908
909 for (pos++, bytepos += ch_len;
910 /* NOTE: UAX#9 says to search only for L, AL, or R types of
911 characters, and ignore RLE, RLO, LRE, and LRO. However,
912 I'm not sure it makes sense to omit those 4; should try
913 with and without that to see the effect. */
914 (bidi_get_category (type) != STRONG)
915 || (bidi_ignore_explicit_marks_for_paragraph_level
916 && (type == RLE || type == RLO
917 || type == LRE || type == LRO));
918 type = bidi_get_type (ch, NEUTRAL_DIR))
919 {
920 if (type == NEUTRAL_B && bidi_at_paragraph_end (pos, bytepos) >= -1)
921 break;
922 if (bytepos >= ZV_BYTE)
923 {
924 /* Pretend there's a paragraph separator at end of buffer. */
925 type = NEUTRAL_B;
926 break;
927 }
928 FETCH_CHAR_ADVANCE (ch, pos, bytepos);
929 }
930 if (type == STRONG_R || type == STRONG_AL) /* P3 */
931 bidi_it->paragraph_dir = R2L;
932 else if (type == STRONG_L)
933 bidi_it->paragraph_dir = L2R;
934 }
935 else
936 abort ();
937
938 /* Contrary to UAX#9 clause P3, we only default the paragraph
939 direction to L2R if we have no previous usable paragraph
940 direction. */
941 if (bidi_it->paragraph_dir == NEUTRAL_DIR)
942 bidi_it->paragraph_dir = L2R; /* P3 and ``higher protocols'' */
943 if (bidi_it->paragraph_dir == R2L)
944 bidi_it->level_stack[0].level = 1;
945 else
946 bidi_it->level_stack[0].level = 0;
947
948 bidi_line_init (bidi_it);
949}
950
951/* Do whatever UAX#9 clause X8 says should be done at paragraph's
952 end. */
953static inline void
954bidi_set_paragraph_end (struct bidi_it *bidi_it)
955{
956 bidi_it->invalid_levels = 0;
957 bidi_it->invalid_rl_levels = -1;
958 bidi_it->stack_idx = 0;
959 bidi_it->resolved_level = bidi_it->level_stack[0].level;
960}
961
962/* Initialize the bidi iterator from buffer position CHARPOS. */
963void
964bidi_init_it (EMACS_INT charpos, EMACS_INT bytepos, struct bidi_it *bidi_it)
965{
966 if (! bidi_initialized)
967 bidi_initialize ();
968 bidi_it->charpos = charpos;
969 bidi_it->bytepos = bytepos;
970 bidi_it->first_elt = 1;
971 bidi_set_paragraph_end (bidi_it);
972 bidi_it->new_paragraph = 1;
973 bidi_it->separator_limit = -1;
974 bidi_it->type = NEUTRAL_B;
975 bidi_it->type_after_w1 = UNKNOWN_BT;
976 bidi_it->orig_type = UNKNOWN_BT;
977 bidi_it->prev_was_pdf = 0;
978 bidi_it->prev.type = bidi_it->prev.type_after_w1 = UNKNOWN_BT;
979 bidi_it->last_strong.type = bidi_it->last_strong.type_after_w1 =
980 bidi_it->last_strong.orig_type = UNKNOWN_BT;
981 bidi_it->next_for_neutral.charpos = -1;
982 bidi_it->next_for_neutral.type =
983 bidi_it->next_for_neutral.type_after_w1 =
984 bidi_it->next_for_neutral.orig_type = UNKNOWN_BT;
985 bidi_it->prev_for_neutral.charpos = -1;
986 bidi_it->prev_for_neutral.type =
987 bidi_it->prev_for_neutral.type_after_w1 =
988 bidi_it->prev_for_neutral.orig_type = UNKNOWN_BT;
989 bidi_it->sor = L2R; /* FIXME: should it be user-selectable? */
990}
991
992/* Push the current embedding level and override status; reset the
993 current level to LEVEL and the current override status to OVERRIDE. */
994static inline void
995bidi_push_embedding_level (struct bidi_it *bidi_it,
996 int level, bidi_dir_t override)
997{
998 bidi_it->stack_idx++;
999 if (bidi_it->stack_idx >= BIDI_MAXLEVEL)
1000 abort ();
1001 bidi_it->level_stack[bidi_it->stack_idx].level = level;
1002 bidi_it->level_stack[bidi_it->stack_idx].override = override;
1003}
1004
1005/* Pop the embedding level and directional override status from the
1006 stack, and return the new level. */
1007static inline int
1008bidi_pop_embedding_level (struct bidi_it *bidi_it)
1009{
1010 /* UAX#9 says to ignore invalid PDFs. */
1011 if (bidi_it->stack_idx > 0)
1012 bidi_it->stack_idx--;
1013 return bidi_it->level_stack[bidi_it->stack_idx].level;
1014}
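The push and pop helpers treat slot 0 as the paragraph's base level, which is never popped; a pop at the bottom of the stack is silently ignored, as UAX#9 requires for unmatched PDFs. A minimal sketch with a hypothetical demo_stack type and depth follows; the real code keeps the stack inside struct bidi_it and also stores the override status.

```c
#include <assert.h>

#define DEMO_MAXLEVEL 64        /* hypothetical stack depth */

struct demo_stack
{
  int idx;                      /* index of the top slot in use */
  int level[DEMO_MAXLEVEL];     /* level[0] is the base level */
};

static void
demo_push (struct demo_stack *s, int level)
{
  assert (s->idx + 1 < DEMO_MAXLEVEL);
  s->level[++s->idx] = level;
}

/* Pop one level and return the level now in effect.  Popping at the
   bottom leaves the base level in place.  */
static int
demo_pop (struct demo_stack *s)
{
  if (s->idx > 0)
    s->idx--;
  return s->level[s->idx];
}
```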
1015
1016/* Record in SAVED_INFO the information about the current character. */
1017static inline void
1018bidi_remember_char (struct bidi_saved_info *saved_info,
1019 struct bidi_it *bidi_it)
1020{
1021 saved_info->charpos = bidi_it->charpos;
1022 saved_info->bytepos = bidi_it->bytepos;
1023 saved_info->type = bidi_it->type;
1024 bidi_check_type (bidi_it->type);
1025 saved_info->type_after_w1 = bidi_it->type_after_w1;
1026 bidi_check_type (bidi_it->type_after_w1);
1027 saved_info->orig_type = bidi_it->orig_type;
1028 bidi_check_type (bidi_it->orig_type);
1029}
1030
1031/* Resolve the type of a neutral character according to the type of
1032 surrounding strong text and the current embedding level. */
1033static inline bidi_type_t
1034bidi_resolve_neutral_1 (bidi_type_t prev_type, bidi_type_t next_type, int lev)
1035{
1036 /* N1: European and Arabic numbers are treated as though they were R. */
1037 if (next_type == WEAK_EN || next_type == WEAK_AN)
1038 next_type = STRONG_R;
1039 if (prev_type == WEAK_EN || prev_type == WEAK_AN)
1040 prev_type = STRONG_R;
1041
1042 if (next_type == prev_type) /* N1 */
1043 return next_type;
1044 else if ((lev & 1) == 0) /* N2 */
1045 return STRONG_L;
1046 else
1047 return STRONG_R;
1048}
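Rules N1 and N2 as coded in bidi_resolve_neutral_1 can be exercised in isolation. In the sketch below the enumeration demo_type and the function name are hypothetical; only the four types that matter to the rule are modeled.

```c
/* Hypothetical reduced set of types: strong L and R, plus the two
   number types that N1 treats as R.  */
enum demo_type { D_L, D_R, D_EN, D_AN };

static enum demo_type
resolve_neutral (enum demo_type prev, enum demo_type next, int level)
{
  /* N1: numbers count as R for the purpose of neutral resolution.  */
  if (next == D_EN || next == D_AN)
    next = D_R;
  if (prev == D_EN || prev == D_AN)
    prev = D_R;

  if (next == prev)             /* N1: same direction on both sides */
    return next;
  /* N2: otherwise take the embedding direction from level parity.  */
  return (level & 1) == 0 ? D_L : D_R;
}
```

A neutral between an R character and a European number therefore resolves to R even at an even embedding level, while one between L and R falls back to the embedding direction.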
1049
1050static inline int
1051bidi_explicit_dir_char (int c)
1052{
1053 /* FIXME: this should be replaced with a lookup table with suitable
1054 bits set, like standard C ctype macros do. */
1055 return (c == LRE_CHAR || c == LRO_CHAR
1056 || c == RLE_CHAR || c == RLO_CHAR || c == PDF_CHAR);
1057}
1058
1059/* A helper function for bidi_resolve_explicit. It advances to the
1060 next character in logical order and determines the new embedding
1061 level and directional override, but does not take into account
1062 empty embeddings. */
1063static int
1064bidi_resolve_explicit_1 (struct bidi_it *bidi_it)
1065{
1066 int curchar;
1067 bidi_type_t type;
1068 int current_level;
1069 int new_level;
1070 bidi_dir_t override;
1071
1072 if (bidi_it->bytepos < BEGV_BYTE /* after reseat to BEGV? */
1073 || bidi_it->first_elt)
1074 {
1075 bidi_it->first_elt = 0;
1076 if (bidi_it->charpos < BEGV)
1077 bidi_it->charpos = BEGV;
1078 bidi_it->bytepos = CHAR_TO_BYTE (bidi_it->charpos);
1079 }
1080 else if (bidi_it->bytepos < ZV_BYTE) /* don't move at ZV */
1081 {
1082 bidi_it->charpos++;
1083 if (bidi_it->ch_len == 0)
1084 abort ();
1085 bidi_it->bytepos += bidi_it->ch_len;
1086 }
1087
1088 current_level = bidi_it->level_stack[bidi_it->stack_idx].level; /* X1 */
1089 override = bidi_it->level_stack[bidi_it->stack_idx].override;
1090 new_level = current_level;
1091
1092 /* in case it is a unibyte character (not yet implemented) */
1093 /* _fetch_multibyte_char_len = 1; */
1094 if (bidi_it->bytepos >= ZV_BYTE)
1095 {
1096 curchar = BIDI_EOB;
1097 bidi_it->ch_len = 1;
1098 }
1099 else
1100 {
1101 curchar = FETCH_CHAR (bidi_it->bytepos);
1102 bidi_it->ch_len = CHAR_BYTES (curchar);
1103 }
1104 bidi_it->ch = curchar;
1105
1106 /* Don't apply directional override here, as all the types we handle
1107 below will not be affected by the override anyway, and we need
1108 the original type unaltered. The override will be applied in
1109 bidi_resolve_weak. */
1110 type = bidi_get_type (curchar, NEUTRAL_DIR);
1111 bidi_it->orig_type = type;
1112 bidi_check_type (bidi_it->orig_type);
1113
1114 if (type != PDF)
1115 bidi_it->prev_was_pdf = 0;
1116
1117 bidi_it->type_after_w1 = UNKNOWN_BT;
1118
1119 switch (type)
1120 {
1121 case RLE: /* X2 */
1122 case RLO: /* X4 */
1123 bidi_it->type_after_w1 = type;
1124 bidi_check_type (bidi_it->type_after_w1);
1125 type = WEAK_BN; /* X9/Retaining */
1126 if (bidi_it->ignore_bn_limit <= 0)
1127 {
1128 if (current_level <= BIDI_MAXLEVEL - 4)
1129 {
1130 /* Compute the least odd embedding level greater than
1131 the current level. */
1132 new_level = ((current_level + 1) & ~1) + 1;
1133 if (bidi_it->type_after_w1 == RLE)
1134 override = NEUTRAL_DIR;
1135 else
1136 override = R2L;
1137 if (current_level == BIDI_MAXLEVEL - 4)
1138 bidi_it->invalid_rl_levels = 0;
1139 bidi_push_embedding_level (bidi_it, new_level, override);
1140 }
1141 else
1142 {
1143 bidi_it->invalid_levels++;
1144 /* See the commentary about invalid_rl_levels below. */
1145 if (bidi_it->invalid_rl_levels < 0)
1146 bidi_it->invalid_rl_levels = 0;
1147 bidi_it->invalid_rl_levels++;
1148 }
1149 }
1150 else if (bidi_it->prev.type_after_w1 == WEAK_EN /* W5/Retaining */
1151 || bidi_it->next_en_pos > bidi_it->charpos)
1152 type = WEAK_EN;
1153 break;
1154 case LRE: /* X3 */
1155 case LRO: /* X5 */
1156 bidi_it->type_after_w1 = type;
1157 bidi_check_type (bidi_it->type_after_w1);
1158 type = WEAK_BN; /* X9/Retaining */
1159 if (bidi_it->ignore_bn_limit <= 0)
1160 {
1161 if (current_level <= BIDI_MAXLEVEL - 5)
1162 {
1163 /* Compute the least even embedding level greater than
1164 the current level. */
1165 new_level = ((current_level + 2) & ~1);
1166 if (bidi_it->type_after_w1 == LRE)
1167 override = NEUTRAL_DIR;
1168 else
1169 override = L2R;
1170 bidi_push_embedding_level (bidi_it, new_level, override);
1171 }
1172 else
1173 {
1174 bidi_it->invalid_levels++;
1175 /* invalid_rl_levels counts invalid levels encountered
1176 while the embedding level was already too high for
1177 LRE/LRO, but not for RLE/RLO. That is because
1178 there may be exactly one PDF which we should not
1179 ignore even though invalid_levels is non-zero.
1180 invalid_rl_levels helps to identify which PDF
1181 that is. */
1182 if (bidi_it->invalid_rl_levels >= 0)
1183 bidi_it->invalid_rl_levels++;
1184 }
1185 }
1186 else if (bidi_it->prev.type_after_w1 == WEAK_EN /* W5/Retaining */
1187 || bidi_it->next_en_pos > bidi_it->charpos)
1188 type = WEAK_EN;
1189 break;
1190 case PDF: /* X7 */
1191 bidi_it->type_after_w1 = type;
1192 bidi_check_type (bidi_it->type_after_w1);
1193 type = WEAK_BN; /* X9/Retaining */
1194 if (bidi_it->ignore_bn_limit <= 0)
1195 {
1196 if (!bidi_it->invalid_rl_levels)
1197 {
1198 new_level = bidi_pop_embedding_level (bidi_it);
1199 bidi_it->invalid_rl_levels = -1;
1200 if (bidi_it->invalid_levels)
1201 bidi_it->invalid_levels--;
1202 /* else nothing: UAX#9 says to ignore invalid PDFs */
1203 }
1204 if (!bidi_it->invalid_levels)
1205 new_level = bidi_pop_embedding_level (bidi_it);
1206 else
1207 {
1208 bidi_it->invalid_levels--;
1209 bidi_it->invalid_rl_levels--;
1210 }
1211 }
1212 else if (bidi_it->prev.type_after_w1 == WEAK_EN /* W5/Retaining */
1213 || bidi_it->next_en_pos > bidi_it->charpos)
1214 type = WEAK_EN;
1215 break;
1216 default:
1217 /* Nothing. */
1218 break;
1219 }
1220
1221 bidi_it->type = type;
1222 bidi_check_type (bidi_it->type);
1223
1224 return new_level;
1225}
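The two bit expressions used above for RLE/RLO and LRE/LRO compute, respectively, the least odd and the least even level strictly greater than the current one. A small sketch that isolates the arithmetic; the function names are hypothetical.

```c
#include <assert.h>

/* Least odd level greater than LEVEL (used for RLE/RLO, X2 and X4):
   round LEVEL + 1 down to even, then add 1.  */
static int
next_odd_level (int level)
{
  assert (level >= 0);
  return ((level + 1) & ~1) + 1;
}

/* Least even level greater than LEVEL (used for LRE/LRO, X3 and X5):
   round LEVEL + 2 down to even.  */
static int
next_even_level (int level)
{
  assert (level >= 0);
  return (level + 2) & ~1;
}
```

From level 0 an RLE moves to level 1 and an LRE to level 2; from level 1 they move to 3 and 2, so the step is one or two levels depending on the parity of the starting level.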
1226
1227/* Given an iterator state in BIDI_IT, advance one character position
1228 in the buffer to the next character (in the logical order), resolve
1229 any explicit embeddings and directional overrides, and return the
1230 embedding level of the character after resolving explicit
1231 directives and ignoring empty embeddings. */
1232static int
1233bidi_resolve_explicit (struct bidi_it *bidi_it)
1234{
1235 int prev_level = bidi_it->level_stack[bidi_it->stack_idx].level;
1236 int new_level = bidi_resolve_explicit_1 (bidi_it);
1237
1238 if (prev_level < new_level
1239 && bidi_it->type == WEAK_BN
1240 && bidi_it->ignore_bn_limit == 0 /* only if not already known */
1241 && bidi_it->ch != BIDI_EOB /* not already at EOB */
1242 && bidi_explicit_dir_char (FETCH_CHAR (bidi_it->bytepos
1243 + bidi_it->ch_len)))
1244 {
1245 /* Avoid pushing and popping embedding levels if the level run
1246 is empty, as this breaks level runs where it shouldn't.
1247 UAX#9 removes all the explicit embedding and override codes,
1248 so empty embeddings disappear without a trace. We need to
1249 behave as if we did the same. */
1250 struct bidi_it saved_it;
1251 int level = prev_level;
1252
1253 bidi_copy_it (&saved_it, bidi_it);
1254
1255 while (bidi_explicit_dir_char (FETCH_CHAR (bidi_it->bytepos
1256 + bidi_it->ch_len)))
1257 {
1258 level = bidi_resolve_explicit_1 (bidi_it);
1259 }
1260
1261 if (level == prev_level) /* empty embedding */
1262 saved_it.ignore_bn_limit = bidi_it->charpos + 1;
1263 else /* this embedding is non-empty */
1264 saved_it.ignore_bn_limit = -1;
1265
1266 bidi_copy_it (bidi_it, &saved_it);
1267 if (bidi_it->ignore_bn_limit > 0)
1268 {
1269 /* We pushed a level, but we shouldn't have. Undo that. */
1270 if (!bidi_it->invalid_rl_levels)
1271 {
1272 new_level = bidi_pop_embedding_level (bidi_it);
1273 bidi_it->invalid_rl_levels = -1;
1274 if (bidi_it->invalid_levels)
1275 bidi_it->invalid_levels--;
1276 }
1277 if (!bidi_it->invalid_levels)
1278 new_level = bidi_pop_embedding_level (bidi_it);
1279 else
1280 {
1281 bidi_it->invalid_levels--;
1282 bidi_it->invalid_rl_levels--;
1283 }
1284 }
1285 }
1286
1287 if (bidi_it->type == NEUTRAL_B) /* X8 */
1288 {
1289 bidi_set_paragraph_end (bidi_it);
1290 /* This is needed by bidi_resolve_weak below, and in L1. */
1291 bidi_it->type_after_w1 = bidi_it->type;
1292 bidi_check_type (bidi_it->type_after_w1);
1293 }
1294
1295 return new_level;
1296}
1297
1298/* Advance in the buffer, resolve weak types and return the type of
1299 the next character after weak type resolution. */
1300bidi_type_t
1301bidi_resolve_weak (struct bidi_it *bidi_it)
1302{
1303 bidi_type_t type;
1304 bidi_dir_t override;
1305 int prev_level = bidi_it->level_stack[bidi_it->stack_idx].level;
1306 int new_level = bidi_resolve_explicit (bidi_it);
1307 int next_char;
1308 bidi_type_t type_of_next;
1309 struct bidi_it saved_it;
1310
1311 type = bidi_it->type;
1312 override = bidi_it->level_stack[bidi_it->stack_idx].override;
1313
1314 if (type == UNKNOWN_BT
1315 || type == LRE
1316 || type == LRO
1317 || type == RLE
1318 || type == RLO
1319 || type == PDF)
1320 abort ();
1321
1322 if (new_level != prev_level
1323 || bidi_it->type == NEUTRAL_B)
1324 {
1325 /* We've got a new embedding level run, compute the directional
1326 type of sor and initialize per-run variables (UAX#9, clause
1327 X10). */
1328 bidi_set_sor_type (bidi_it, prev_level, new_level);
1329 }
1330 else if (type == NEUTRAL_S || type == NEUTRAL_WS
1331 || type == WEAK_BN || type == STRONG_AL)
1332 bidi_it->type_after_w1 = type; /* needed in L1 */
1333 bidi_check_type (bidi_it->type_after_w1);
1334
1335 /* Level and directional override status are already recorded in
1336 bidi_it, and do not need any change; see X6. */
1337 if (override == R2L) /* X6 */
1338 type = STRONG_R;
1339 else if (override == L2R)
1340 type = STRONG_L;
1341 else
1342 {
1343 if (type == WEAK_NSM) /* W1 */
1344 {
1345 /* Note that we don't need to consider the case where the
1346 prev character has its type overridden by an RLO or LRO:
1347 such characters are outside the current level run, and
1348 thus not relevant to this NSM. Thus, NSM gets the
1349 orig_type of the previous character. */
1350 if (bidi_it->prev.type != UNKNOWN_BT)
1351 type = bidi_it->prev.orig_type;
1352 else if (bidi_it->sor == R2L)
1353 type = STRONG_R;
1354 else if (bidi_it->sor == L2R)
1355 type = STRONG_L;
1356 else /* shouldn't happen! */
1357 abort ();
1358 }
1359 if (type == WEAK_EN /* W2 */
1360 && bidi_it->last_strong.type_after_w1 == STRONG_AL)
1361 type = WEAK_AN;
1362 else if (type == STRONG_AL) /* W3 */
1363 type = STRONG_R;
1364 else if ((type == WEAK_ES /* W4 */
1365 && bidi_it->prev.type_after_w1 == WEAK_EN
1366 && bidi_it->prev.orig_type == WEAK_EN)
1367 || (type == WEAK_CS
1368 && ((bidi_it->prev.type_after_w1 == WEAK_EN
1369 && bidi_it->prev.orig_type == WEAK_EN)
1370 || bidi_it->prev.type_after_w1 == WEAK_AN)))
1371 {
1372 next_char =
1373 bidi_it->bytepos + bidi_it->ch_len >= ZV_BYTE
1374 ? BIDI_EOB : FETCH_CHAR (bidi_it->bytepos + bidi_it->ch_len);
1375 type_of_next = bidi_get_type (next_char, override);
1376
1377 if (type_of_next == WEAK_BN
1378 || bidi_explicit_dir_char (next_char))
1379 {
1380 bidi_copy_it (&saved_it, bidi_it);
1381 while (bidi_resolve_explicit (bidi_it) == new_level
1382 && bidi_it->type == WEAK_BN)
1383 ;
1384 type_of_next = bidi_it->type;
1385 bidi_copy_it (bidi_it, &saved_it);
1386 }
1387
1388 /* If the next character is EN, but the last strong-type
1389 character is AL, that next EN will be changed to AN when
1390 we process it in W2 above. So in that case, this ES
1391 should not be changed into EN. */
1392 if (type == WEAK_ES
1393 && type_of_next == WEAK_EN
1394 && bidi_it->last_strong.type_after_w1 != STRONG_AL)
1395 type = WEAK_EN;
1396 else if (type == WEAK_CS)
1397 {
1398 if (bidi_it->prev.type_after_w1 == WEAK_AN
1399 && (type_of_next == WEAK_AN
1400 /* If the next character is EN, but the last
1401 strong-type character is AL, that EN will later be
1402 changed to AN when we process it in W2 above.
1403 So in that case, this CS sits between two ANs
1404 and should be changed into AN. */
1405 || (type_of_next == WEAK_EN
1406 && bidi_it->last_strong.type_after_w1 == STRONG_AL)))
1407 type = WEAK_AN;
1408 else if (bidi_it->prev.type_after_w1 == WEAK_EN
1409 && type_of_next == WEAK_EN
1410 && bidi_it->last_strong.type_after_w1 != STRONG_AL)
1411 type = WEAK_EN;
1412 }
1413 }
1414 else if (type == WEAK_ET /* W5: ET with EN before or after it */
1415 || type == WEAK_BN) /* W5/Retaining */
1416 {
1417 if (bidi_it->prev.type_after_w1 == WEAK_EN /* ET/BN w/EN before it */
1418 || bidi_it->next_en_pos > bidi_it->charpos)
1419 type = WEAK_EN;
1420 else /* W5: ET/BN with EN after it. */
1421 {
1422 EMACS_INT en_pos = bidi_it->charpos + 1;
1423
1424 next_char =
1425 bidi_it->bytepos + bidi_it->ch_len >= ZV_BYTE
1426 ? BIDI_EOB : FETCH_CHAR (bidi_it->bytepos + bidi_it->ch_len);
1427 type_of_next = bidi_get_type (next_char, override);
1428
1429 if (type_of_next == WEAK_ET
1430 || type_of_next == WEAK_BN
1431 || bidi_explicit_dir_char (next_char))
1432 {
1433 bidi_copy_it (&saved_it, bidi_it);
1434 while (bidi_resolve_explicit (bidi_it) == new_level
1435 && (bidi_it->type == WEAK_BN
1436 || bidi_it->type == WEAK_ET))
1437 ;
1438 type_of_next = bidi_it->type;
1439 en_pos = bidi_it->charpos;
1440 bidi_copy_it (bidi_it, &saved_it);
1441 }
1442 if (type_of_next == WEAK_EN)
1443 {
1444 /* If the last strong character is AL, the EN we've
1445 found will become AN when we get to it (W2). */
1446 if (bidi_it->last_strong.type_after_w1 != STRONG_AL)
1447 {
1448 type = WEAK_EN;
1449 /* Remember this EN position, to speed up processing
1450 of the next ETs. */
1451 bidi_it->next_en_pos = en_pos;
1452 }
1453 else if (type == WEAK_BN)
1454 type = NEUTRAL_ON; /* W6/Retaining */
1455 }
1456 }
1457 }
1458 }
1459
1460 if (type == WEAK_ES || type == WEAK_ET || type == WEAK_CS /* W6 */
1461 || (type == WEAK_BN
1462 && (bidi_it->prev.type_after_w1 == WEAK_CS /* W6/Retaining */
1463 || bidi_it->prev.type_after_w1 == WEAK_ES
1464 || bidi_it->prev.type_after_w1 == WEAK_ET)))
1465 type = NEUTRAL_ON;
1466
1467 /* Store the type we've got so far, before we clobber it with strong
1468 types in W7 and while resolving neutral types. But leave alone
1469 the original types that were recorded above, because we will need
1470 them for the L1 clause. */
1471 if (bidi_it->type_after_w1 == UNKNOWN_BT)
1472 bidi_it->type_after_w1 = type;
1473 bidi_check_type (bidi_it->type_after_w1);
1474
1475 if (type == WEAK_EN) /* W7 */
1476 {
1477 if ((bidi_it->last_strong.type_after_w1 == STRONG_L)
1478 || (bidi_it->last_strong.type == UNKNOWN_BT && bidi_it->sor == L2R))
1479 type = STRONG_L;
1480 }
1481
1482 bidi_it->type = type;
1483 bidi_check_type (bidi_it->type);
1484 return type;
1485}
1486
1487bidi_type_t
1488bidi_resolve_neutral (struct bidi_it *bidi_it)
1489{
1490 int prev_level = bidi_it->level_stack[bidi_it->stack_idx].level;
1491 bidi_type_t type = bidi_resolve_weak (bidi_it);
1492 int current_level = bidi_it->level_stack[bidi_it->stack_idx].level;
1493
1494 if (!(type == STRONG_R
1495 || type == STRONG_L
1496 || type == WEAK_BN
1497 || type == WEAK_EN
1498 || type == WEAK_AN
1499 || type == NEUTRAL_B
1500 || type == NEUTRAL_S
1501 || type == NEUTRAL_WS
1502 || type == NEUTRAL_ON))
1503 abort ();
1504
1505 if (bidi_get_category (type) == NEUTRAL
1506 || (type == WEAK_BN && prev_level == current_level))
1507 {
1508 if (bidi_it->next_for_neutral.type != UNKNOWN_BT)
1509 type = bidi_resolve_neutral_1 (bidi_it->prev_for_neutral.type,
1510 bidi_it->next_for_neutral.type,
1511 current_level);
1512 else
1513 {
1514 /* Arrrgh!! The UAX#9 algorithm is too deeply entrenched in
1515 the assumption of batch-style processing; see clauses W4,
1516 W5, and especially N1, which require looking far forward
1517 (as well as back) in the buffer. May the fleas of a
1518 thousand camels infest the armpits of those who design
1519 supposedly general-purpose algorithms by looking at their
1520 own implementations, and fail to consider other possible
1521 implementations! */
1522 struct bidi_it saved_it;
1523 bidi_type_t next_type;
1524
1525 if (bidi_it->scan_dir == -1)
1526 abort ();
1527
1528 bidi_copy_it (&saved_it, bidi_it);
1529 /* Scan the text forward until we find the first non-neutral
1530 character, and then use that to resolve the neutral we
1531 are dealing with now. We also cache the scanned iterator
1532 states, to salvage some of the effort later. */
1533 bidi_cache_iterator_state (bidi_it, 0);
1534 do {
1535 /* Record the info about the previous character, so that
1536 it will be cached below with this state. */
1537 if (bidi_it->type_after_w1 != WEAK_BN /* W1/Retaining */
1538 && bidi_it->type != WEAK_BN)
1539 bidi_remember_char (&bidi_it->prev, bidi_it);
1540 type = bidi_resolve_weak (bidi_it);
1541 /* Paragraph separators have their levels fully resolved
1542 at this point, so cache them as resolved. */
1543 bidi_cache_iterator_state (bidi_it, type == NEUTRAL_B);
1544 /* FIXME: implement L1 here, by testing for a newline and
1545 resetting the level for any sequence of whitespace
1546 characters adjacent to it. */
1547 } while (!(type == NEUTRAL_B
1548 || (type != WEAK_BN
1549 && bidi_get_category (type) != NEUTRAL)
1550 /* This is all per level run, so stop when we
1551 reach the end of this level run. */
1552 || bidi_it->level_stack[bidi_it->stack_idx].level !=
1553 current_level));
1554
1555 bidi_remember_char (&saved_it.next_for_neutral, bidi_it);
1556
1557 switch (type)
1558 {
1559 case STRONG_L:
1560 case STRONG_R:
1561 case STRONG_AL:
1562 next_type = type;
1563 break;
1564 case WEAK_EN:
1565 case WEAK_AN:
1566 /* N1: ``European and Arabic numbers are treated as
1567 though they were R.'' */
1568 next_type = STRONG_R;
1569 saved_it.next_for_neutral.type = STRONG_R;
1570 break;
1571 case WEAK_BN:
1572 if (!bidi_explicit_dir_char (bidi_it->ch))
1573 abort (); /* can't happen: BNs are skipped */
1574 /* FALLTHROUGH */
1575 case NEUTRAL_B:
1576 /* Marched all the way to the end of this level run.
1577 We need to use the eor type, whose information is
1578 stored by bidi_set_sor_type in the prev_for_neutral
1579 member. */
1580 if (saved_it.type != WEAK_BN
1581 || bidi_get_category (bidi_it->prev.type_after_w1) == NEUTRAL)
1582 {
1583 next_type = bidi_it->prev_for_neutral.type;
1584 saved_it.next_for_neutral.type = next_type;
1585 bidi_check_type (next_type);
1586 }
1587 else
1588 {
1589 /* This is a BN which does not adjoin neutrals.
1590 Leave its type alone. */
1591 bidi_copy_it (bidi_it, &saved_it);
1592 return bidi_it->type;
1593 }
1594 break;
1595 default:
1596 abort ();
1597 }
1598 type = bidi_resolve_neutral_1 (saved_it.prev_for_neutral.type,
1599 next_type, current_level);
1600 saved_it.type = type;
1601 bidi_check_type (type);
1602 bidi_copy_it (bidi_it, &saved_it);
1603 }
1604 }
1605 return type;
1606}
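
The neutral-resolution step that bidi_resolve_neutral delegates to bidi_resolve_neutral_1 follows UAX#9 clauses N1 and N2. The following is a minimal sketch of those two clauses only, not the body of bidi_resolve_neutral_1, which is not shown here. The type names and the function sketch_resolve_neutral are hypothetical and exist only for this illustration.

```c
/* Hypothetical, simplified type set used only by this sketch. */
typedef enum { T_L, T_R, T_EN, T_AN, T_NEUTRAL } sketch_type;

/* N1: European and Arabic numbers bounding a run of neutrals are
   treated as though they were R.  */
static sketch_type
sketch_as_strong (sketch_type t)
{
  return (t == T_EN || t == T_AN) ? T_R : t;
}

/* Resolve a neutral that sits between PREV and NEXT, at embedding
   level LEVEL.  */
static sketch_type
sketch_resolve_neutral (sketch_type prev, sketch_type next, int level)
{
  prev = sketch_as_strong (prev);
  next = sketch_as_strong (next);

  /* N1: both neighbors agree on a strong direction.  */
  if (prev == next && (prev == T_L || prev == T_R))
    return prev;

  /* N2: otherwise take the embedding direction, odd levels being
     right-to-left and even levels left-to-right.  */
  return (level & 1) ? T_R : T_L;
}
```

For instance, a neutral between an R character and a European number resolves to R at any level, while a neutral between L and R takes the direction of the embedding level.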
1607
1608/* Given an iterator state in BIDI_IT, advance one character position
1609 in the buffer to the next character (in the logical order), resolve
1610 the bidi type of that next character, and return that type. */
1611bidi_type_t
1612bidi_type_of_next_char (struct bidi_it *bidi_it)
1613{
1614 bidi_type_t type;
1615
1616 /* This should always be called during a forward scan. */
1617 if (bidi_it->scan_dir != 1)
1618 abort ();
1619
1620 /* Reset the limit until which to ignore BNs if we step out of the
1621 area where we found only empty levels. */
1622 if ((bidi_it->ignore_bn_limit > 0
1623 && bidi_it->ignore_bn_limit <= bidi_it->charpos)
1624 || (bidi_it->ignore_bn_limit == -1
1625 && !bidi_explicit_dir_char (bidi_it->ch)))
1626 bidi_it->ignore_bn_limit = 0;
1627
1628 type = bidi_resolve_neutral (bidi_it);
1629
1630 return type;
1631}
1632
1633/* Given an iterator state BIDI_IT, advance one character position in
1634 the buffer to the next character (in the logical order), resolve
1635 the embedding and implicit levels of that next character, and
1636 return the resulting level. */
1637int
1638bidi_level_of_next_char (struct bidi_it *bidi_it)
1639{
1640 bidi_type_t type;
1641 int level, prev_level = -1;
1642 struct bidi_saved_info next_for_neutral;
1643
1644 if (bidi_it->scan_dir == 1)
1645 {
1646 /* There's no sense in trying to advance if we hit end of text. */
1647 if (bidi_it->ch == BIDI_EOB)
1648 return bidi_it->resolved_level;
1649
1650 /* Record the info about the previous character. */
1651 if (bidi_it->type_after_w1 != WEAK_BN /* W1/Retaining */
1652 && bidi_it->type != WEAK_BN)
1653 bidi_remember_char (&bidi_it->prev, bidi_it);
1654 if (bidi_it->type_after_w1 == STRONG_R
1655 || bidi_it->type_after_w1 == STRONG_L
1656 || bidi_it->type_after_w1 == STRONG_AL)
1657 bidi_remember_char (&bidi_it->last_strong, bidi_it);
1658 /* FIXME: it sounds like we don't need both prev and
1659 prev_for_neutral members, but I'm leaving them both for now. */
1660 if (bidi_it->type == STRONG_R || bidi_it->type == STRONG_L
1661 || bidi_it->type == WEAK_EN || bidi_it->type == WEAK_AN)
1662 bidi_remember_char (&bidi_it->prev_for_neutral, bidi_it);
1663
1664 /* If we overstepped the characters used for resolving neutrals
1665 and whitespace, invalidate their info in the iterator. */
1666 if (bidi_it->charpos >= bidi_it->next_for_neutral.charpos)
1667 bidi_it->next_for_neutral.type = UNKNOWN_BT;
1668 if (bidi_it->next_en_pos >= 0
1669 && bidi_it->charpos >= bidi_it->next_en_pos)
1670 bidi_it->next_en_pos = -1;
1671 if (bidi_it->next_for_ws.type != UNKNOWN_BT
1672 && bidi_it->charpos >= bidi_it->next_for_ws.charpos)
1673 bidi_it->next_for_ws.type = UNKNOWN_BT;
1674
1675 /* This must be taken before we fill the iterator with the info
1676 about the next char. If we scan backwards, the iterator
1677 state must be already cached, so there's no need to know the
1678 embedding level of the previous character, since we will be
1679 returning to our caller shortly. */
1680 prev_level = bidi_it->level_stack[bidi_it->stack_idx].level;
1681 }
1682 next_for_neutral = bidi_it->next_for_neutral;
1683
1684 /* Perhaps it is already cached. */
1685 type = bidi_cache_find (bidi_it->charpos + bidi_it->scan_dir, -1, bidi_it);
1686 if (type != UNKNOWN_BT)
1687 {
1688 /* Don't lose the information for resolving neutrals! The
1689 cached states could have been cached before their
1690 next_for_neutral member was computed. If we are on our way
1691 forward, we can simply take the info from the previous
1692 state. */
1693 if (bidi_it->scan_dir == 1
1694 && bidi_it->next_for_neutral.type == UNKNOWN_BT)
1695 bidi_it->next_for_neutral = next_for_neutral;
1696
1697 /* If resolved_level is -1, it means this state was cached
1698 before it was completely resolved, so we cannot return
1699 it. */
1700 if (bidi_it->resolved_level != -1)
1701 return bidi_it->resolved_level;
1702 }
1703 if (bidi_it->scan_dir == -1)
1704 /* If we are going backwards, the iterator state is already cached
1705 from previous scans, and should be fully resolved. */
1706 abort ();
1707
1708 if (type == UNKNOWN_BT)
1709 type = bidi_type_of_next_char (bidi_it);
1710
1711 if (type == NEUTRAL_B)
1712 return bidi_it->resolved_level;
1713
1714 level = bidi_it->level_stack[bidi_it->stack_idx].level;
1715 if ((bidi_get_category (type) == NEUTRAL /* && type != NEUTRAL_B */)
1716 || (type == WEAK_BN && prev_level == level))
1717 {
1718 if (bidi_it->next_for_neutral.type == UNKNOWN_BT)
1719 abort ();
1720
1721 /* If the cached state shows a neutral character, it was not
1722 resolved by bidi_resolve_neutral, so do it now. */
1723 type = bidi_resolve_neutral_1 (bidi_it->prev_for_neutral.type,
1724 bidi_it->next_for_neutral.type,
1725 level);
1726 }
1727
1728 if (!(type == STRONG_R
1729 || type == STRONG_L
1730 || type == WEAK_BN
1731 || type == WEAK_EN
1732 || type == WEAK_AN))
1733 abort ();
1734 bidi_it->type = type;
1735 bidi_check_type (bidi_it->type);
1736
1737 /* For L1 below, we need to know, for each WS character, whether
1738 it belongs to a sequence of WS characters preceding a newline
1739 or a TAB or a paragraph separator. */
1740 if (bidi_it->orig_type == NEUTRAL_WS
1741 && bidi_it->next_for_ws.type == UNKNOWN_BT)
1742 {
1743 int ch;
1744 int clen = bidi_it->ch_len;
1745 EMACS_INT bpos = bidi_it->bytepos;
1746 EMACS_INT cpos = bidi_it->charpos;
1747 bidi_type_t chtype;
1748
1749 do {
1750 /*_fetch_multibyte_char_len = 1;*/
1751 ch = bpos + clen >= ZV_BYTE ? BIDI_EOB : FETCH_CHAR (bpos + clen);
1752 bpos += clen;
1753 cpos++;
1754 clen = (ch == BIDI_EOB ? 1 : CHAR_BYTES (ch));
1755 if (ch == '\n' || ch == BIDI_EOB /* || ch == LINESEP_CHAR */)
1756 chtype = NEUTRAL_B;
1757 else
1758 chtype = bidi_get_type (ch, NEUTRAL_DIR);
1759 } while (chtype == NEUTRAL_WS || chtype == WEAK_BN
1760 || bidi_explicit_dir_char (ch)); /* L1/Retaining */
1761 bidi_it->next_for_ws.type = chtype;
1762 bidi_check_type (bidi_it->next_for_ws.type);
1763 bidi_it->next_for_ws.charpos = cpos;
1764 bidi_it->next_for_ws.bytepos = bpos;
1765 }
1766
1767 /* Resolve implicit levels, with a twist: PDFs get the embedding
1768 level of the embedding they terminate. See below for the
1769 reason. */
1770 if (bidi_it->orig_type == PDF
1771 /* Don't do this if this formatting code didn't change the
1772 embedding level due to invalid or empty embeddings. */
1773 && prev_level != level)
1774 {
1775 /* Don't look in UAX#9 for the reason for this: it's our own
1776 private quirk. The reason is that we want the formatting
1777 codes to be delivered so that they bracket the text of their
1778 embedding. For example, given the text
1779
1780 {RLO}teST{PDF}
1781
1782 we want it to be displayed as
1783
1784 {RLO}STet{PDF}
1785
1786 not as
1787
1788 STet{RLO}{PDF}
1789
1790 which will result because we bump up the embedding level as
1791 soon as we see the RLO and pop it as soon as we see the PDF,
1792 so RLO itself has the same embedding level as "teST", and
1793 thus would be normally delivered last, just before the PDF.
1794 The switch below fiddles with the level of PDF so that this
1795 ugly side effect does not happen.
1796
1797 (This is, of course, only important if the formatting codes
1798 are actually displayed, but Emacs does need to display them
1799 if the user wants to.) */
1800 level = prev_level;
1801 }
1802 else if (bidi_it->orig_type == NEUTRAL_B /* L1 */
1803 || bidi_it->orig_type == NEUTRAL_S
1804 || bidi_it->ch == '\n' || bidi_it->ch == BIDI_EOB
1805 /* || bidi_it->ch == LINESEP_CHAR */
1806 || (bidi_it->orig_type == NEUTRAL_WS
1807 && (bidi_it->next_for_ws.type == NEUTRAL_B
1808 || bidi_it->next_for_ws.type == NEUTRAL_S)))
1809 level = bidi_it->level_stack[0].level;
1810 else if ((level & 1) == 0) /* I1 */
1811 {
1812 if (type == STRONG_R)
1813 level++;
1814 else if (type == WEAK_EN || type == WEAK_AN)
1815 level += 2;
1816 }
1817 else /* I2 */
1818 {
1819 if (type == STRONG_L || type == WEAK_EN || type == WEAK_AN)
1820 level++;
1821 }
1822
1823 bidi_it->resolved_level = level;
1824 return level;
1825}
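
The I1 and I2 branches at the end of bidi_level_of_next_char can be isolated as a pure function of the embedding level and the resolved type. This is a sketch under that assumption: it leaves out the PDF quirk and the L1 whitespace reset handled above, and the names sketch_type and sketch_implicit_level are hypothetical.

```c
/* Hypothetical subset of resolved types relevant to I1/I2.  */
typedef enum { S_L, S_R, S_EN, S_AN } sketch_type;

/* Return the resolved level for a character of resolved TYPE whose
   embedding level is LEVEL.  */
static int
sketch_implicit_level (int level, sketch_type type)
{
  if ((level & 1) == 0)		/* I1: even (left-to-right) level */
    {
      if (type == S_R)
	level++;
      else if (type == S_EN || type == S_AN)
	level += 2;
    }
  else				/* I2: odd (right-to-left) level */
    {
      if (type == S_L || type == S_EN || type == S_AN)
	level++;
    }
  return level;
}
```

Numbers always end up at an even level, so they read left to right; in left-to-right text that puts them two levels above their surroundings, which is one way the resolved level can jump by more than one.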
1826
1827/* Move to the other edge of a level given by LEVEL. If END_FLAG is
1828 non-zero, we are at the end of a level, and we need to prepare to
1829 resume the scan of the lower level.
1830
1831 If this level's other edge is cached, we simply jump to it, filling
1832 the iterator structure with the iterator state on the other edge.
1833 Otherwise, we walk the buffer until we come back to the same level
1834 as LEVEL.
1835
1836 Note: we are not talking here about a ``level run'' in the UAX#9
1837 sense of the term, but rather about a ``level'' which includes
1838 all the levels higher than it. In other words, given the levels
1839 like this:
1840
1841 11111112222222333333334443343222222111111112223322111
1842 A B C
1843
1844 and assuming we are at point A scanning left to right, this
1845 function moves to point C, whereas the UAX#9 ``level 2 run'' ends
1846 at point B. */
1847static void
1848bidi_find_other_level_edge (struct bidi_it *bidi_it, int level, int end_flag)
1849{
1850 int dir = end_flag ? -bidi_it->scan_dir : bidi_it->scan_dir;
1851 int idx;
1852
1853 /* Try the cache first. */
1854 if ((idx = bidi_cache_find_level_change (level, dir, end_flag)) >= 0)
1855 bidi_cache_fetch_state (idx, bidi_it);
1856 else
1857 {
1858 int new_level;
1859
1860 if (end_flag)
1861 abort (); /* if we are at end of level, its edges must be cached */
1862
1863 bidi_cache_iterator_state (bidi_it, 1);
1864 do {
1865 new_level = bidi_level_of_next_char (bidi_it);
1866 bidi_cache_iterator_state (bidi_it, 1);
1867 } while (new_level >= level);
1868 }
1869}
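
The difference between a UAX#9 ``level run'' and the ``level'' that bidi_find_other_level_edge walks can be shown on a plain string of level digits. This is a sketch assuming single-digit levels stored as characters and a forward scan; it ignores the cache and the iterator state entirely, and both function names are hypothetical.

```c
#include <stddef.h>

/* Return the index of the first position at or after START whose
   level differs from the level at START, i.e. the end of the UAX#9
   level run that begins at START.  */
static size_t
sketch_level_run_end (const char *levels, size_t start)
{
  size_t i = start;

  while (levels[i] != '\0' && levels[i] == levels[start])
    i++;
  return i;
}

/* Return the index of the first position at or after START whose
   level is below LEVEL.  All higher levels nested inside are skipped
   over, which mirrors the `new_level >= level' loop condition.  */
static size_t
sketch_other_level_edge (const char *levels, size_t start, char level)
{
  size_t i = start;

  while (levels[i] != '\0' && levels[i] >= level)
    i++;
  return i;
}
```

With the levels "1122343221" and a start at index 2, the level run ends at index 4, whereas the other edge of level 2 is at index 9.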
1870
1871void
1872bidi_get_next_char_visually (struct bidi_it *bidi_it)
1873{
1874 int old_level, new_level, next_level;
1875 struct bidi_it sentinel;
1876
1877 if (bidi_it->scan_dir == 0)
1878 {
1879 bidi_it->scan_dir = 1; /* default to logical order */
1880 }
1881
1882 /* If we just passed a newline, initialize for the next line. */
1883 if (!bidi_it->first_elt && bidi_it->orig_type == NEUTRAL_B)
1884 bidi_line_init (bidi_it);
1885
1886 /* Prepare the sentinel iterator state. */
1887 if (bidi_cache_idx == 0)
1888 {
1889 bidi_copy_it (&sentinel, bidi_it);
1890 if (bidi_it->first_elt)
1891 {
1892 sentinel.charpos--; /* cached charpos needs to be monotonic */
1893 sentinel.bytepos--;
1894 sentinel.ch = '\n'; /* doesn't matter, but why not? */
1895 sentinel.ch_len = 1;
1896 }
1897 }
1898
1899 old_level = bidi_it->resolved_level;
1900 new_level = bidi_level_of_next_char (bidi_it);
1901
1902 /* Reordering of resolved levels (clause L2) is implemented by
1903 jumping to the other edge of the level and flipping direction of
1904 scanning the text whenever we find a level change. */
1905 if (new_level != old_level)
1906 {
1907 int ascending = new_level > old_level;
1908 int level_to_search = ascending ? old_level + 1 : old_level;
1909 int incr = ascending ? 1 : -1;
1910 int expected_next_level = old_level + incr;
1911
1912 /* If we don't have anything cached yet, we need to cache the
1913 sentinel state, since we'll need it to record where to jump
1914 when the last non-base level is exhausted. */
1915 if (bidi_cache_idx == 0)
1916 bidi_cache_iterator_state (&sentinel, 1);
1917 /* Jump (or walk) to the other edge of this level. */
1918 bidi_find_other_level_edge (bidi_it, level_to_search, !ascending);
1919 /* Switch scan direction and peek at the next character in the
1920 new direction. */
1921 bidi_it->scan_dir = -bidi_it->scan_dir;
1922
1923 /* The following loop handles the case where the resolved level
1924 jumps by more than one. This is typical for numbers inside a
1925 run of text with left-to-right embedding direction, but can
1926 also happen in other situations. In those cases the decision
1927 where to continue after a level change, and in what direction,
1928 is tricky. For example, given a text like below:
1929
1930 abcdefgh
1931 11336622
1932
1933 (where the numbers below the text show the resolved levels),
1934 the result of reordering according to UAX#9 should be this:
1935
1936 efdcghba
1937
1938 This is implemented by the loop below which flips direction
1939 and jumps to the other edge of the level each time it finds
1940 the new level not to be the expected one. The expected level
1941 is always one more or one less than the previous one. */
1942 next_level = bidi_peek_at_next_level (bidi_it);
1943 while (next_level != expected_next_level)
1944 {
1945 expected_next_level += incr;
1946 level_to_search += incr;
1947 bidi_find_other_level_edge (bidi_it, level_to_search, !ascending);
1948 bidi_it->scan_dir = -bidi_it->scan_dir;
1949 next_level = bidi_peek_at_next_level (bidi_it);
1950 }
1951
1952 /* Finally, deliver the next character in the new direction. */
1953 next_level = bidi_level_of_next_char (bidi_it);
1954 }
1955
1956 /* Take note when we have just processed the newline that precedes
1957 the end of the paragraph. The next time we are about to be
1958 called, set_iterator_to_next will automatically reinit the
1959 paragraph direction, if needed. We do this at the newline before
1960 the paragraph separator, because the next character might not be
1961 the first character of the next paragraph, due to the bidi
1962 reordering, whereas we _must_ know the paragraph base direction
1963 _before_ we process the paragraph's text, since the base
1964 direction affects the reordering. */
1965 if (bidi_it->scan_dir == 1
1966 && bidi_it->orig_type == NEUTRAL_B
1967 && bidi_it->bytepos < ZV_BYTE)
1968 {
1969 EMACS_INT sep_len =
1970 bidi_at_paragraph_end (bidi_it->charpos + 1,
1971 bidi_it->bytepos + bidi_it->ch_len);
1972 if (sep_len >= 0)
1973 {
1974 bidi_it->new_paragraph = 1;
1975 /* Record the buffer position of the last character of the
1976 paragraph separator. */
1977 bidi_it->separator_limit = bidi_it->charpos + 1 + sep_len;
1978 }
1979 }
1980
1981 if (bidi_it->scan_dir == 1 && bidi_cache_idx)
1982 {
1983 /* If we are at paragraph's base embedding level and beyond the
1984 last cached position, the cache's job is done and we can
1985 discard it. */
1986 if (bidi_it->resolved_level == bidi_it->level_stack[0].level
1987 && bidi_it->charpos > bidi_cache[bidi_cache_idx - 1].charpos)
1988 bidi_cache_reset ();
1989 /* But as long as we are caching during forward scan, we must
1990 cache each state, or else the cache integrity will be
1991 compromised: it assumes cached states correspond to buffer
1992 positions 1:1. */
1993 else
1994 bidi_cache_iterator_state (bidi_it, 1);
1995 }
1996}
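
The reordering example in the comment above (text abcdefgh with resolved levels 11336622) can be checked against the batch formulation of clause L2 in UAX#9: from the highest level down to the lowest odd level, reverse every maximal run of characters at that level or higher. The sketch below implements that batch rule, not the incremental edge-jumping iterator used by bidi_get_next_char_visually; the function names are hypothetical.

```c
#include <stddef.h>

/* Reverse the characters S[FROM..TO), TO being exclusive.  */
static void
sketch_reverse (char *s, size_t from, size_t to)
{
  while (from + 1 < to)
    {
      char tmp = s[from];
      s[from++] = s[--to];
      s[to] = tmp;
    }
}

/* Reorder TEXT (N characters) in place according to LEVELS, which
   holds the resolved level of each character in logical order.  The
   levels are indexed by position and need not be permuted, because
   each reversal stays inside a run whose positions all belong to
   every lower threshold as well.  */
static void
sketch_reorder_line (char *text, const int *levels, size_t n)
{
  int highest = 0, lowest_odd = -1, lvl;
  size_t i;

  for (i = 0; i < n; i++)
    {
      if (levels[i] > highest)
	highest = levels[i];
      if ((levels[i] & 1) && (lowest_odd < 0 || levels[i] < lowest_odd))
	lowest_odd = levels[i];
    }
  if (lowest_odd < 0)
    return;			/* no right-to-left level, nothing to do */

  for (lvl = highest; lvl >= lowest_odd; lvl--)
    for (i = 0; i < n; )
      {
	if (levels[i] >= lvl)
	  {
	    size_t start = i;

	    while (i < n && levels[i] >= lvl)
	      i++;
	    sketch_reverse (text, start, i);
	  }
	else
	  i++;
      }
}
```

Applied to "abcdefgh" with levels {1, 1, 3, 3, 6, 6, 2, 2}, this yields "efdcghba", the visual order given in the comment.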
1997
1998/* This is meant to be called from within the debugger, whenever you
1999 wish to examine the cache contents. */
2000void
2001bidi_dump_cached_states (void)
2002{
2003 int i;
2004 int ndigits = 1;
2005
2006 if (bidi_cache_idx == 0)
2007 {
2008 fprintf (stderr, "The cache is empty.\n");
2009 return;
2010 }
2011 fprintf (stderr, "Total of %d state%s in cache:\n",
2012 bidi_cache_idx, bidi_cache_idx == 1 ? "" : "s");
2013
2014 for (i = bidi_cache[bidi_cache_idx - 1].charpos; i > 0; i /= 10)
2015 ndigits++;
2016 fputs ("ch ", stderr);
2017 for (i = 0; i < bidi_cache_idx; i++)
2018 fprintf (stderr, "%*c", ndigits, bidi_cache[i].ch);
2019 fputs ("\n", stderr);
2020 fputs ("lvl ", stderr);
2021 for (i = 0; i < bidi_cache_idx; i++)
2022 fprintf (stderr, "%*d", ndigits, bidi_cache[i].resolved_level);
2023 fputs ("\n", stderr);
2024 fputs ("pos ", stderr);
2025 for (i = 0; i < bidi_cache_idx; i++)
2026 fprintf (stderr, "%*d", ndigits, bidi_cache[i].charpos);
2027 fputs ("\n", stderr);
2028}
diff --git a/src/buffer.c b/src/buffer.c
index a0acad309af..0c6e57d45be 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -2279,6 +2279,8 @@ DEFUN ("buffer-swap-text", Fbuffer_swap_text, Sbuffer_swap_text,
2279 swapfield (undo_list, Lisp_Object); 2279 swapfield (undo_list, Lisp_Object);
2280 swapfield (mark, Lisp_Object); 2280 swapfield (mark, Lisp_Object);
2281 swapfield (enable_multibyte_characters, Lisp_Object); 2281 swapfield (enable_multibyte_characters, Lisp_Object);
2282 swapfield (bidi_display_reordering, Lisp_Object);
2283 swapfield (bidi_paragraph_direction, Lisp_Object);
2282 /* FIXME: Not sure what we should do with these *_marker fields. 2284 /* FIXME: Not sure what we should do with these *_marker fields.
2283 Hopefully they're just nil anyway. */ 2285 Hopefully they're just nil anyway. */
2284 swapfield (pt_marker, Lisp_Object); 2286 swapfield (pt_marker, Lisp_Object);
@@ -5206,7 +5208,9 @@ init_buffer_once ()
5206 buffer_defaults.truncate_lines = Qnil; 5208 buffer_defaults.truncate_lines = Qnil;
5207 buffer_defaults.word_wrap = Qnil; 5209 buffer_defaults.word_wrap = Qnil;
5208 buffer_defaults.ctl_arrow = Qt; 5210 buffer_defaults.ctl_arrow = Qt;
5211 buffer_defaults.bidi_display_reordering = Qnil;
5209 buffer_defaults.direction_reversed = Qnil; 5212 buffer_defaults.direction_reversed = Qnil;
5213 buffer_defaults.bidi_paragraph_direction = Qnil;
5210 buffer_defaults.cursor_type = Qt; 5214 buffer_defaults.cursor_type = Qt;
5211 buffer_defaults.extra_line_spacing = Qnil; 5215 buffer_defaults.extra_line_spacing = Qnil;
5212 buffer_defaults.cursor_in_non_selected_windows = Qt; 5216 buffer_defaults.cursor_in_non_selected_windows = Qt;
@@ -5291,7 +5295,9 @@ init_buffer_once ()
5291 XSETFASTINT (buffer_local_flags.syntax_table, idx); ++idx; 5295 XSETFASTINT (buffer_local_flags.syntax_table, idx); ++idx;
5292 XSETFASTINT (buffer_local_flags.cache_long_line_scans, idx); ++idx; 5296 XSETFASTINT (buffer_local_flags.cache_long_line_scans, idx); ++idx;
5293 XSETFASTINT (buffer_local_flags.category_table, idx); ++idx; 5297 XSETFASTINT (buffer_local_flags.category_table, idx); ++idx;
5298 XSETFASTINT (buffer_local_flags.bidi_display_reordering, idx); ++idx;
5294 XSETFASTINT (buffer_local_flags.direction_reversed, idx); ++idx; 5299 XSETFASTINT (buffer_local_flags.direction_reversed, idx); ++idx;
5300 XSETFASTINT (buffer_local_flags.bidi_paragraph_direction, idx); ++idx;
5295 XSETFASTINT (buffer_local_flags.buffer_file_coding_system, idx); 5301 XSETFASTINT (buffer_local_flags.buffer_file_coding_system, idx);
5296 /* Make this one a permanent local. */ 5302 /* Make this one a permanent local. */
5297 buffer_permanent_local_flags[idx++] = 1; 5303 buffer_permanent_local_flags[idx++] = 1;
@@ -5548,11 +5554,6 @@ This is the same as (default-value 'abbrev-mode). */);
5548 doc: /* Default value of `ctl-arrow' for buffers that do not override it. 5554 doc: /* Default value of `ctl-arrow' for buffers that do not override it.
5549This is the same as (default-value 'ctl-arrow). */); 5555This is the same as (default-value 'ctl-arrow). */);
5550 5556
5551 DEFVAR_LISP_NOPRO ("default-direction-reversed",
5552 &buffer_defaults.direction_reversed,
5553 doc: /* Default value of `direction-reversed' for buffers that do not override it.
5554This is the same as (default-value 'direction-reversed). */);
5555
5556 DEFVAR_LISP_NOPRO ("default-enable-multibyte-characters", 5557 DEFVAR_LISP_NOPRO ("default-enable-multibyte-characters",
5557 &buffer_defaults.enable_multibyte_characters, 5558 &buffer_defaults.enable_multibyte_characters,
5558 doc: /* *Default value of `enable-multibyte-characters' for buffers not overriding it. 5559 doc: /* *Default value of `enable-multibyte-characters' for buffers not overriding it.
@@ -5809,11 +5810,29 @@ The variable `coding-system-for-write', if non-nil, overrides this variable.
5809 5810
5810This variable is never applied to a way of decoding a file while reading it. */); 5811This variable is never applied to a way of decoding a file while reading it. */);
5811 5812
5812 DEFVAR_PER_BUFFER ("direction-reversed", &current_buffer->direction_reversed, 5813 DEFVAR_PER_BUFFER ("direction-reversed",
5813 Qnil, 5814 &current_buffer->direction_reversed, Qnil,
5814 doc: /* *Non-nil means lines in the buffer are displayed right to left. */); 5815 doc: /* Non-nil means set beginning of lines at the right edge of the window.
5815 5816See also the variable `bidi-display-reordering'. */);
5816 DEFVAR_PER_BUFFER ("truncate-lines", &current_buffer->truncate_lines, Qnil, 5817
5818 DEFVAR_PER_BUFFER ("bidi-display-reordering",
5819 &current_buffer->bidi_display_reordering, Qnil,
5820 doc: /* Non-nil means reorder bidirectional text for display in the visual order.
5821See also the variable `direction-reversed'. */);
5822
5823 DEFVAR_PER_BUFFER ("bidi-paragraph-direction",
5824 &current_buffer->bidi_paragraph_direction, Qnil,
5825 doc: /* *If non-nil, forces directionality of text paragraphs in the buffer.
5826
5827If this is nil (the default), the direction of each paragraph is
5828determined by the first strong directional character of its text.
5829The values of `right-to-left' and `left-to-right' override that.
5830Any other value is treated as nil.
5831
5832This variable has no effect unless the buffer's value of
5833\`bidi-display-reordering' is non-nil. */);
5834
5835 DEFVAR_PER_BUFFER ("truncate-lines", &current_buffer->truncate_lines, Qnil,
5817 doc: /* *Non-nil means do not display continuation lines. 5836 doc: /* *Non-nil means do not display continuation lines.
5818Instead, give each line of text just one screen line. 5837Instead, give each line of text just one screen line.
5819 5838
diff --git a/src/buffer.h b/src/buffer.h
index 5217c6d7298..40f03daca90 100644
--- a/src/buffer.h
+++ b/src/buffer.h
@@ -662,8 +662,16 @@ struct buffer
662 Lisp_Object word_wrap; 662 Lisp_Object word_wrap;
663 /* Non-nil means display ctl chars with uparrow. */ 663 /* Non-nil means display ctl chars with uparrow. */
664 Lisp_Object ctl_arrow; 664 Lisp_Object ctl_arrow;
665 /* Non-nil means display text from right to left. */ 665 /* Non-nil means reorder bidirectional text for display in the
666 visual order. */
667 Lisp_Object bidi_display_reordering;
668 /* Non-nil means set beginning of lines at the right edge of
669 windows. */
666 Lisp_Object direction_reversed; 670 Lisp_Object direction_reversed;
671 /* If non-nil, specifies which direction of text to force in all the
672 paragraphs of the buffer. Nil means determine paragraph
673 direction dynamically for each paragraph. */
674 Lisp_Object bidi_paragraph_direction;
667 /* Non-nil means do selective display; 675 /* Non-nil means do selective display;
668 see doc string in syms_of_buffer (buffer.c) for details. */ 676 see doc string in syms_of_buffer (buffer.c) for details. */
669 Lisp_Object selective_display; 677 Lisp_Object selective_display;
diff --git a/src/dispextern.h b/src/dispextern.h
index 22d44fc9083..5083199c529 100644
--- a/src/dispextern.h
+++ b/src/dispextern.h
@@ -370,6 +370,16 @@ struct glyph
370 /* Non-zero means don't display cursor here. */ 370 /* Non-zero means don't display cursor here. */
371 unsigned avoid_cursor_p : 1; 371 unsigned avoid_cursor_p : 1;
372 372
373 /* Resolved bidirectional level of this character [0..63]. */
374 unsigned resolved_level : 5;
375
376 /* Resolved bidirectional type of this character, see enum
377 bidi_type_t below. Note that according to UAX#9, only some
378 values (STRONG_L, STRONG_R, WEAK_AN, WEAK_EN, WEAK_BN, and
379 NEUTRAL_B) can appear in the resolved type, so we only reserve
380 space for those that can. */
381 unsigned bidi_type : 3;
382
373#define FACE_ID_BITS 20 383#define FACE_ID_BITS 20
374 384
375 /* Face of the glyph. This is a realized face ID, 385 /* Face of the glyph. This is a realized face ID,
@@ -739,14 +749,18 @@ struct glyph_row
739 /* First position in this row. This is the text position, including 749 /* First position in this row. This is the text position, including
740 overlay position information etc, where the display of this row 750 overlay position information etc, where the display of this row
741 started, and can thus be less than the position of the first glyph 751 started, and can thus be less than the position of the first glyph
742 (e.g. due to invisible text or horizontal scrolling). */ 752 (e.g. due to invisible text or horizontal scrolling). BIDI Note:
753 This is the smallest character position in the row, but not
754 necessarily the character that is the leftmost on the display. */
743 struct display_pos start; 755 struct display_pos start;
744 756
745 /* Text position at the end of this row. This is the position after 757 /* Text position at the end of this row. This is the position after
746 the last glyph on this row. It can be greater than the last 758 the last glyph on this row. It can be greater than the last
747 glyph position + 1, due to truncation, invisible text etc. In an 759 glyph position + 1, due to truncation, invisible text etc. In an
748 up-to-date display, this should always be equal to the start 760 up-to-date display, this should always be equal to the start
749 position of the next row. */ 761 position of the next row. BIDI Note: this is the character whose
762 buffer position is the largest, but not necessarily the rightmost
763 one on the display. */
750 struct display_pos end; 764 struct display_pos end;
751 765
752 /* Non-zero means the overlay arrow bitmap is on this line. 766 /* Non-zero means the overlay arrow bitmap is on this line.
@@ -872,6 +886,10 @@ struct glyph_row
872 the bottom line of the window, but not end of the buffer. */ 886 the bottom line of the window, but not end of the buffer. */
873 unsigned indicate_bottom_line_p : 1; 887 unsigned indicate_bottom_line_p : 1;
874 888
889 /* Non-zero means the row was reversed to display text in a
890 right-to-left paragraph. */
891 unsigned reversed_p : 1;
892
875 /* Continuation lines width at the start of the row. */ 893 /* Continuation lines width at the start of the row. */
876 int continuation_lines_width; 894 int continuation_lines_width;
877 895
@@ -924,12 +942,18 @@ struct glyph_row *matrix_row P_ ((struct glyph_matrix *, int));
924 (MATRIX_ROW ((MATRIX), (ROW))->used[TEXT_AREA]) 942 (MATRIX_ROW ((MATRIX), (ROW))->used[TEXT_AREA])
925 943
926/* Return the character/ byte position at which the display of ROW 944/* Return the character/ byte position at which the display of ROW
927 starts. */ 945 starts. BIDI Note: this is the smallest character/byte position
946 among characters in ROW, i.e. the first logical-order character
947 displayed by ROW, which is not necessarily the smallest horizontal
948 position. */
928 949
929#define MATRIX_ROW_START_CHARPOS(ROW) ((ROW)->start.pos.charpos) 950#define MATRIX_ROW_START_CHARPOS(ROW) ((ROW)->start.pos.charpos)
930#define MATRIX_ROW_START_BYTEPOS(ROW) ((ROW)->start.pos.bytepos) 951#define MATRIX_ROW_START_BYTEPOS(ROW) ((ROW)->start.pos.bytepos)
931 952
932/* Return the character/ byte position at which ROW ends. */ 953/* Return the character/ byte position at which ROW ends. BIDI Note:
954 this is the largest character/byte position among characters in
955 ROW, i.e. the last logical-order character displayed by ROW, which
956 is not necessarily the largest horizontal position. */
933 957
934#define MATRIX_ROW_END_CHARPOS(ROW) ((ROW)->end.pos.charpos) 958#define MATRIX_ROW_END_CHARPOS(ROW) ((ROW)->end.pos.charpos)
935#define MATRIX_ROW_END_BYTEPOS(ROW) ((ROW)->end.pos.bytepos) 959#define MATRIX_ROW_END_BYTEPOS(ROW) ((ROW)->end.pos.bytepos)
@@ -1702,7 +1726,93 @@ struct face_cache
1702 1726
1703extern int face_change_count; 1727extern int face_change_count;
1704 1728
1729/* For reordering of bidirectional text. */
1730#define BIDI_MAXLEVEL 64
1731
1732/* Data type for describing the bidirectional character types. The
1733 types with values 0 through 7 must come first, because they are the only values
1734 valid in the `bidi_type' member of `struct glyph'; we reserve only
1735 3 bits for it, so values larger than 7 cannot be stored there. */
1736typedef enum {
1737 UNKNOWN_BT = 0,
1738 STRONG_L, /* strong left-to-right */
1739 STRONG_R, /* strong right-to-left */
1740 WEAK_EN, /* european number */
1741 WEAK_AN, /* arabic number */
1742 WEAK_BN, /* boundary neutral */
1743 NEUTRAL_B, /* paragraph separator */
1744 STRONG_AL, /* arabic right-to-left letter */
1745 LRE, /* left-to-right embedding */
1746 LRO, /* left-to-right override */
1747 RLE, /* right-to-left embedding */
1748 RLO, /* right-to-left override */
1749 PDF, /* pop directional format */
1750 WEAK_ES, /* european number separator */
1751 WEAK_ET, /* european number terminator */
1752 WEAK_CS, /* common separator */
1753 WEAK_NSM, /* non-spacing mark */
1754 NEUTRAL_S, /* segment separator */
1755 NEUTRAL_WS, /* whitespace */
1756 NEUTRAL_ON /* other neutrals */
1757} bidi_type_t;
1758
1759/* The basic directionality data type. */
1760typedef enum { NEUTRAL_DIR, L2R, R2L } bidi_dir_t;
1761
1762/* Data type for storing information about characters we need to
1763 remember. */
1764struct bidi_saved_info {
1765 int bytepos, charpos; /* character's buffer position */
1766 bidi_type_t type; /* character's resolved bidi type */
1767 bidi_type_t type_after_w1; /* original type of the character, after W1 */
1768 bidi_type_t orig_type; /* type as we found it in the buffer */
1769};
1770
1771/* Data type for keeping track of saved embedding levels and override
1772 status information. */
1773struct bidi_stack {
1774 int level;
1775 bidi_dir_t override;
1776};
1777
1778/* Data type for iterating over bidi text. */
1779struct bidi_it {
1780 EMACS_INT bytepos; /* iterator's position in buffer */
1781 EMACS_INT charpos;
1782 int ch; /* character itself */
1783 int ch_len; /* length of its multibyte sequence */
1784 bidi_type_t type; /* bidi type of this character, after
1785 resolving weak and neutral types */
1786 bidi_type_t type_after_w1; /* original type, after overrides and W1 */
1787 bidi_type_t orig_type; /* original type, as found in the buffer */
1788 int resolved_level; /* final resolved level of this character */
1789 int invalid_levels; /* how many PDFs to ignore */
1790 int invalid_rl_levels; /* how many PDFs from RLE/RLO to ignore */
1791 int prev_was_pdf; /* if non-zero, previous char was PDF */
1792 struct bidi_saved_info prev; /* info about previous character */
1793 struct bidi_saved_info last_strong; /* last-seen strong directional char */
1794 struct bidi_saved_info next_for_neutral; /* surrounding characters for... */
1795 struct bidi_saved_info prev_for_neutral; /* ...resolving neutrals */
1796 struct bidi_saved_info next_for_ws; /* character after sequence of ws */
1797 EMACS_INT next_en_pos; /* position of next EN char for ET */
1798 EMACS_INT ignore_bn_limit; /* position until which to ignore BNs */
1799 bidi_dir_t sor; /* direction of start-of-run in effect */
1800 int scan_dir; /* direction of text scan */
1801 int stack_idx; /* index of current data on the stack */
1802 /* Note: Everything from here on is not copied/saved when the bidi
1803 iterator state is saved, pushed, or popped. So only put here
1804 stuff that is not part of the bidi iterator's state! */
1805 struct bidi_stack level_stack[BIDI_MAXLEVEL]; /* stack of embedding levels */
1806 int first_elt; /* if non-zero, examine current char first */
1807 bidi_dir_t paragraph_dir; /* current paragraph direction */
1808 int new_paragraph; /* if non-zero, we expect a new paragraph */
1809 EMACS_INT separator_limit; /* where paragraph separator should end */
1810};
1705 1811
1812/* Value is non-zero when the bidi iterator is at base paragraph
1813 embedding level. */
1814#define BIDI_AT_BASE_LEVEL(BIDI_IT) \
1815 ((BIDI_IT).resolved_level == (BIDI_IT).level_stack[0].level)
1706 1816
1707 1817
1708/*********************************************************************** 1818/***********************************************************************
@@ -1854,7 +1964,7 @@ enum it_method {
1854 NUM_IT_METHODS 1964 NUM_IT_METHODS
1855}; 1965};
1856 1966
1857#define IT_STACK_SIZE 4 1967#define IT_STACK_SIZE 5
1858 1968
1859/* Iterator for composition (both for static and automatic). */ 1969/* Iterator for composition (both for static and automatic). */
1860struct composition_it 1970struct composition_it
@@ -1902,6 +2012,14 @@ struct it
1902 text, overlay strings, end of text etc., which see. */ 2012 text, overlay strings, end of text etc., which see. */
1903 EMACS_INT stop_charpos; 2013 EMACS_INT stop_charpos;
1904 2014
2015 /* Previous stop position, i.e. the last one before the current
2016 iterator position in `current'. */
2017 EMACS_INT prev_stop;
2018
2019 /* Last stop position iterated across whose bidi embedding level is
2020 equal to the current paragraph's base embedding level. */
2021 EMACS_INT base_level_stop;
2022
1905 /* Maximum string or buffer position + 1. ZV when iterating over 2023 /* Maximum string or buffer position + 1. ZV when iterating over
1906 current_buffer. */ 2024 current_buffer. */
1907 EMACS_INT end_charpos; 2025 EMACS_INT end_charpos;
@@ -2008,6 +2126,8 @@ struct it
2008 int string_nchars; 2126 int string_nchars;
2009 EMACS_INT end_charpos; 2127 EMACS_INT end_charpos;
2010 EMACS_INT stop_charpos; 2128 EMACS_INT stop_charpos;
2129 EMACS_INT prev_stop;
2130 EMACS_INT base_level_stop;
2011 struct composition_it cmp_it; 2131 struct composition_it cmp_it;
2012 int face_id; 2132 int face_id;
2013 2133
@@ -2207,6 +2327,14 @@ struct it
2207 incremented/reset by display_line, move_it_to etc. */ 2327 incremented/reset by display_line, move_it_to etc. */
2208 int continuation_lines_width; 2328 int continuation_lines_width;
2209 2329
2330 /* Buffer position that ends the buffer text line being iterated.
2331 This is normally the position after the newline at EOL. If this
2332 is the last line of the buffer and it doesn't have a newline,
2333 value is ZV/ZV_BYTE. Set and used only if IT->bidi_p, for
2334 setting the end position of glyph rows produced for continuation
2335 lines, see display_line. */
2336 struct text_pos eol_pos;
2337
2210 /* Current y-position. Automatically incremented by the height of 2338 /* Current y-position. Automatically incremented by the height of
2211 glyph_row in move_it_to and display_line. */ 2339 glyph_row in move_it_to and display_line. */
2212 int current_y; 2340 int current_y;
@@ -2233,6 +2361,14 @@ struct it
2233 2361
2234 /* Face of the right fringe glyph. */ 2362 /* Face of the right fringe glyph. */
2235 unsigned right_user_fringe_face_id : FACE_ID_BITS; 2363 unsigned right_user_fringe_face_id : FACE_ID_BITS;
2364
2365 /* Non-zero means we need to reorder bidirectional text for display
2366 in the visual order. */
2367 int bidi_p;
2368
2369 /* For iterating over bidirectional text. */
2370 struct bidi_it bidi_it;
2371 bidi_dir_t paragraph_embedding;
2236}; 2372};
2237 2373
2238 2374
@@ -2263,6 +2399,13 @@ struct it
2263#define PRODUCE_GLYPHS(IT) \ 2399#define PRODUCE_GLYPHS(IT) \
2264 do { \ 2400 do { \
2265 extern int inhibit_free_realized_faces; \ 2401 extern int inhibit_free_realized_faces; \
2402 if ((IT)->glyph_row != NULL && (IT)->bidi_p) \
2403 { \
2404 if ((IT)->bidi_it.paragraph_dir == R2L) \
2405 (IT)->glyph_row->reversed_p = 1; \
2406 else \
2407 (IT)->glyph_row->reversed_p = 0; \
2408 } \
2266 if (FRAME_RIF ((IT)->f) != NULL) \ 2409 if (FRAME_RIF ((IT)->f) != NULL) \
2267 FRAME_RIF ((IT)->f)->produce_glyphs ((IT)); \ 2410 FRAME_RIF ((IT)->f)->produce_glyphs ((IT)); \
2268 else \ 2411 else \
@@ -2704,12 +2847,20 @@ extern EMACS_INT tool_bar_button_relief;
2704 Function Prototypes 2847 Function Prototypes
2705 ***********************************************************************/ 2848 ***********************************************************************/
2706 2849
2850/* Defined in bidi.c */
2851
2852extern void bidi_init_it P_ ((EMACS_INT, EMACS_INT, struct bidi_it *));
2853extern void bidi_get_next_char_visually P_ ((struct bidi_it *));
2854extern void bidi_paragraph_init P_ ((bidi_dir_t, struct bidi_it *));
2855extern int bidi_mirror_char P_ ((int));
2856
2707/* Defined in xdisp.c */ 2857/* Defined in xdisp.c */
2708 2858
2709struct glyph_row *row_containing_pos P_ ((struct window *, int, 2859struct glyph_row *row_containing_pos P_ ((struct window *, int,
2710 struct glyph_row *, 2860 struct glyph_row *,
2711 struct glyph_row *, int)); 2861 struct glyph_row *, int));
2712int string_buffer_position P_ ((struct window *, Lisp_Object, int)); 2862EMACS_INT string_buffer_position P_ ((struct window *, Lisp_Object,
2863 EMACS_INT));
2713int line_bottom_y P_ ((struct it *)); 2864int line_bottom_y P_ ((struct it *));
2714int display_prop_intangible_p P_ ((Lisp_Object)); 2865int display_prop_intangible_p P_ ((Lisp_Object));
2715void resize_echo_area_exactly P_ ((void)); 2866void resize_echo_area_exactly P_ ((void));
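Note on the 3-bit `bidi_type' constraint in the dispextern.h hunks above: a minimal sketch, not code from this commit. The names `demo_glyph', `bidi_type_fits_p' and `demo_store_bidi_type' are hypothetical, and only the leading enumerators are repeated. It shows why the types that may reach a glyph have to occupy values 0 through 7, using the same mask test that append_glyph in term.c applies before storing the type.

```c
#include <assert.h>

/* Leading part of bidi_type_t, in the same order as in the hunk above.  */
typedef enum {
  UNKNOWN_BT = 0,
  STRONG_L,
  STRONG_R,
  WEAK_EN,
  WEAK_AN,
  WEAK_BN,
  NEUTRAL_B,
  STRONG_AL,   /* value 7: the last one that fits in 3 bits */
  LRE          /* value 8: needs a fourth bit */
} bidi_type_t;

/* Hypothetical stand-in for the bidi_type member of struct glyph.  */
struct demo_glyph {
  unsigned bidi_type : 3;
};

/* Non-zero if TYPE survives being stored in a 3-bit field.  */
int
bidi_type_fits_p (bidi_type_t type)
{
  return (type & 7) == type;
}

/* Store TYPE in G and return 1, or return 0 and leave G alone if the
   value would be truncated.  (The real code calls abort instead.)  */
int
demo_store_bidi_type (struct demo_glyph *g, bidi_type_t type)
{
  if (!bidi_type_fits_p (type))
    return 0;
  g->bidi_type = type;
  return 1;
}
```

Under these assumptions, storing STRONG_AL succeeds, while LRE is rejected rather than silently wrapping around to UNKNOWN_BT.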
diff --git a/src/dispnew.c b/src/dispnew.c
index d32ce48cce6..fd470491f78 100644
--- a/src/dispnew.c
+++ b/src/dispnew.c
@@ -1388,8 +1388,11 @@ prepare_desired_row (row)
1388{ 1388{
1389 if (!row->enabled_p) 1389 if (!row->enabled_p)
1390 { 1390 {
1391 unsigned rp = row->reversed_p;
1392
1391 clear_glyph_row (row); 1393 clear_glyph_row (row);
1392 row->enabled_p = 1; 1394 row->enabled_p = 1;
1395 row->reversed_p = rp;
1393 } 1396 }
1394} 1397}
1395 1398
@@ -1540,6 +1543,7 @@ row_equal_p (w, a, b, mouse_face_p)
1540 || a->overlapped_p != b->overlapped_p 1543 || a->overlapped_p != b->overlapped_p
1541 || (MATRIX_ROW_CONTINUATION_LINE_P (a) 1544 || (MATRIX_ROW_CONTINUATION_LINE_P (a)
1542 != MATRIX_ROW_CONTINUATION_LINE_P (b)) 1545 != MATRIX_ROW_CONTINUATION_LINE_P (b))
1546 || a->reversed_p != b->reversed_p
1543 /* Different partially visible characters on left margin. */ 1547 /* Different partially visible characters on left margin. */
1544 || a->x != b->x 1548 || a->x != b->x
1545 /* Different height. */ 1549 /* Different height. */
@@ -3500,6 +3504,8 @@ direct_output_for_insert (g)
3500 || !display_completed 3504 || !display_completed
3501 /* Give up if buffer appears in two places. */ 3505 /* Give up if buffer appears in two places. */
3502 || buffer_shared > 1 3506 || buffer_shared > 1
3507 /* Give up if we need to reorder bidirectional text. */
3508 || !NILP (current_buffer->bidi_display_reordering)
3503 /* Give up if currently displaying a message instead of the 3509 /* Give up if currently displaying a message instead of the
3504 minibuffer contents. */ 3510 minibuffer contents. */
3505 || (EQ (selected_window, minibuf_window) 3511 || (EQ (selected_window, minibuf_window)
@@ -3776,6 +3782,10 @@ direct_output_forward_char (n)
3776 if (!display_completed || cursor_in_echo_area) 3782 if (!display_completed || cursor_in_echo_area)
3777 return 0; 3783 return 0;
3778 3784
3785 /* Give up if we need to reorder bidirectional text. */
3786 if (!NILP (XBUFFER (w->buffer)->bidi_display_reordering))
3787 return 0;
3788
3779 /* Give up if the buffer's direction is reversed. */ 3789 /* Give up if the buffer's direction is reversed. */
3780 if (!NILP (XBUFFER (w->buffer)->direction_reversed)) 3790 if (!NILP (XBUFFER (w->buffer)->direction_reversed))
3781 return 0; 3791 return 0;
diff --git a/src/makefile.w32-in b/src/makefile.w32-in
index 156eddd6092..edb3f3f711b 100644
--- a/src/makefile.w32-in
+++ b/src/makefile.w32-in
@@ -115,6 +115,7 @@ OBJ1 = $(BLD)/alloc.$(O) \
115 $(BLD)/vm-limit.$(O) \ 115 $(BLD)/vm-limit.$(O) \
116 $(BLD)/region-cache.$(O) \ 116 $(BLD)/region-cache.$(O) \
117 $(BLD)/strftime.$(O) \ 117 $(BLD)/strftime.$(O) \
118 $(BLD)/bidi.$(O) \
118 $(BLD)/charset.$(O) \ 119 $(BLD)/charset.$(O) \
119 $(BLD)/character.$(O) \ 120 $(BLD)/character.$(O) \
120 $(BLD)/chartab.$(O) \ 121 $(BLD)/chartab.$(O) \
@@ -338,6 +339,14 @@ $(BLD)/atimer.$(O) : \
338 $(SRC)/syssignal.h \ 339 $(SRC)/syssignal.h \
339 $(SRC)/systime.h 340 $(SRC)/systime.h
340 341
342$(BLD)/bidi.$(O) : \
343 $(SRC)/bidi.c \
344 $(CONFIG_H) \
345 $(SRC)/lisp.h \
346 $(SRC)/buffer.h \
347 $(SRC)/character.h \
348 $(SRC)/dispextern.h
349
341$(BLD)/buffer.$(O) : \ 350$(BLD)/buffer.$(O) : \
342 $(SRC)/buffer.c \ 351 $(SRC)/buffer.c \
343 $(CONFIG_H) \ 352 $(CONFIG_H) \
diff --git a/src/term.c b/src/term.c
index 89b39767f56..718a20d4164 100644
--- a/src/term.c
+++ b/src/term.c
@@ -1540,6 +1540,26 @@ append_glyph (it)
1540 + it->glyph_row->used[it->area]); 1540 + it->glyph_row->used[it->area]);
1541 end = it->glyph_row->glyphs[1 + it->area]; 1541 end = it->glyph_row->glyphs[1 + it->area];
1542 1542
1543 /* If the glyph row is reversed, we need to prepend the glyph rather
1544 than append it. */
1545 if (it->glyph_row->reversed_p && it->area == TEXT_AREA)
1546 {
1547 struct glyph *g;
1548 int move_by = it->pixel_width;
1549
1550 /* Make room for the new glyphs. */
1551 if (move_by > end - glyph) /* don't overstep end of this area */
1552 move_by = end - glyph;
1553 for (g = glyph - 1; g >= it->glyph_row->glyphs[it->area]; g--)
1554 g[move_by] = *g;
1555 glyph = it->glyph_row->glyphs[it->area];
1556 end = glyph + move_by;
1557 }
1558
1559 /* BIDI Note: we put the glyphs of a "multi-pixel" character left to
1560 right, even in the REVERSED_P case, since (a) all of its u.ch are
1561 identical, and (b) the PADDING_P flag needs to be set for the
1562 leftmost one, because we write to the terminal left-to-right. */
1543 for (i = 0; 1563 for (i = 0;
1544 i < it->pixel_width && glyph < end; 1564 i < it->pixel_width && glyph < end;
1545 ++i) 1565 ++i)
@@ -1551,6 +1571,18 @@ append_glyph (it)
1551 glyph->padding_p = i > 0; 1571 glyph->padding_p = i > 0;
1552 glyph->charpos = CHARPOS (it->position); 1572 glyph->charpos = CHARPOS (it->position);
1553 glyph->object = it->object; 1573 glyph->object = it->object;
1574 if (it->bidi_p)
1575 {
1576 glyph->resolved_level = it->bidi_it.resolved_level;
1577 if ((it->bidi_it.type & 7) != it->bidi_it.type)
1578 abort ();
1579 glyph->bidi_type = it->bidi_it.type;
1580 }
1581 else
1582 {
1583 glyph->resolved_level = 0;
1584 glyph->bidi_type = UNKNOWN_BT;
1585 }
1554 1586
1555 ++it->glyph_row->used[it->area]; 1587 ++it->glyph_row->used[it->area];
1556 ++glyph; 1588 ++glyph;
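Note on the reversed-row case in append_glyph above: a minimal sketch of the prepend step, using a plain int array in place of the glyph row. `demo_prepend' and its parameters are hypothetical. As in the hunk, the shift distance is clamped to the free room, existing elements are moved starting from the last one so that none is overwritten before it is copied, and the new elements are then written left to right at the front.

```c
#include <assert.h>

/* Insert COUNT copies of VALUE at the front of ROW, which holds *USED
   elements out of CAPACITY.  Return how many copies were inserted;
   this is fewer than COUNT if the row runs out of room.  */
int
demo_prepend (int *row, int *used, int capacity, int value, int count)
{
  int room = capacity - *used;
  int i;

  /* Don't overstep the end of the row.  */
  if (count > room)
    count = room;

  /* Make room: shift existing elements right, last element first.  */
  for (i = *used - 1; i >= 0; i--)
    row[i + count] = row[i];

  /* Fill the freed slots left to right.  */
  for (i = 0; i < count; i++)
    row[i] = value;

  *used += count;
  return count;
}
```

Each prepend moves every element already in the row, which is the cost of keeping the row in visual order while the iterator delivers characters one at a time.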
diff --git a/src/window.h b/src/window.h
index 05c1eb18c89..17332f0af20 100644
--- a/src/window.h
+++ b/src/window.h
@@ -117,7 +117,10 @@ struct window
117 /* The buffer displayed in this window */ 117 /* The buffer displayed in this window */
118 /* Of the fields vchild, hchild and buffer, only one is non-nil. */ 118 /* Of the fields vchild, hchild and buffer, only one is non-nil. */
119 Lisp_Object buffer; 119 Lisp_Object buffer;
120 /* A marker pointing to where in the text to start displaying */ 120 /* A marker pointing to where in the text to start displaying.
121 BIDI Note: This is the _logical-order_ start, i.e. the smallest
122 buffer position visible in the window, not necessarily the
123 character displayed in the top left corner of the window. */
121 Lisp_Object start; 124 Lisp_Object start;
122 /* A marker pointing to where in the text point is in this window, 125 /* A marker pointing to where in the text point is in this window,
123 used only when the window is not selected. 126 used only when the window is not selected.
diff --git a/src/xdisp.c b/src/xdisp.c
index 9ece458e77e..ed2db08905d 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -249,6 +249,7 @@ Lisp_Object Qfontified;
249Lisp_Object Qgrow_only; 249Lisp_Object Qgrow_only;
250Lisp_Object Qinhibit_eval_during_redisplay; 250Lisp_Object Qinhibit_eval_during_redisplay;
251Lisp_Object Qbuffer_position, Qposition, Qobject; 251Lisp_Object Qbuffer_position, Qposition, Qobject;
252Lisp_Object Qright_to_left, Qleft_to_right;
252 253
253/* Cursor shapes */ 254/* Cursor shapes */
254Lisp_Object Qbar, Qhbar, Qbox, Qhollow; 255Lisp_Object Qbar, Qhbar, Qbox, Qhollow;
@@ -904,6 +905,7 @@ static void store_mode_line_noprop_char P_ ((char));
904static int store_mode_line_noprop P_ ((const unsigned char *, int, int)); 905static int store_mode_line_noprop P_ ((const unsigned char *, int, int));
905static void x_consider_frame_title P_ ((Lisp_Object)); 906static void x_consider_frame_title P_ ((Lisp_Object));
906static void handle_stop P_ ((struct it *)); 907static void handle_stop P_ ((struct it *));
908static void handle_stop_backwards P_ ((struct it *, EMACS_INT));
907static int tool_bar_lines_needed P_ ((struct frame *, int *)); 909static int tool_bar_lines_needed P_ ((struct frame *, int *));
908static int single_display_spec_intangible_p P_ ((Lisp_Object)); 910static int single_display_spec_intangible_p P_ ((Lisp_Object));
909static void ensure_echo_area_buffers P_ ((void)); 911static void ensure_echo_area_buffers P_ ((void));
@@ -2654,6 +2656,9 @@ init_iterator (it, w, charpos, bytepos, row, base_face_id)
2654 /* Are multibyte characters enabled in current_buffer? */ 2656 /* Are multibyte characters enabled in current_buffer? */
2655 it->multibyte_p = !NILP (current_buffer->enable_multibyte_characters); 2657 it->multibyte_p = !NILP (current_buffer->enable_multibyte_characters);
2656 2658
2659 /* Do we need to reorder bidirectional text? */
2660 it->bidi_p = !NILP (current_buffer->bidi_display_reordering);
2661
2657 /* Non-zero if we should highlight the region. */ 2662 /* Non-zero if we should highlight the region. */
2658 highlight_region_p 2663 highlight_region_p
2659 = (!NILP (Vtransient_mark_mode) 2664 = (!NILP (Vtransient_mark_mode)
@@ -2744,6 +2749,10 @@ init_iterator (it, w, charpos, bytepos, row, base_face_id)
2744 it->glyph_row = row; 2749 it->glyph_row = row;
2745 it->area = TEXT_AREA; 2750 it->area = TEXT_AREA;
2746 2751
2752 /* Forget any previous info about this row being reversed. */
2753 if (it->glyph_row)
2754 it->glyph_row->reversed_p = 0;
2755
2747 /* Get the dimensions of the display area. The display area 2756 /* Get the dimensions of the display area. The display area
2748 consists of the visible window area plus a horizontally scrolled 2757 consists of the visible window area plus a horizontally scrolled
2749 part to the left of the window. All x-values are relative to the 2758 part to the left of the window. All x-values are relative to the
@@ -2799,6 +2808,21 @@ init_iterator (it, w, charpos, bytepos, row, base_face_id)
2799 it->start_of_box_run_p = 1; 2808 it->start_of_box_run_p = 1;
2800 } 2809 }
2801 2810
2811 /* If we are to reorder bidirectional text, init the bidi
2812 iterator. */
2813 if (it->bidi_p)
2814 {
2815 /* Note the paragraph direction that this buffer wants to
2816 use. */
2817 if (EQ (current_buffer->bidi_paragraph_direction, Qleft_to_right))
2818 it->paragraph_embedding = L2R;
2819 else if (EQ (current_buffer->bidi_paragraph_direction, Qright_to_left))
2820 it->paragraph_embedding = R2L;
2821 else
2822 it->paragraph_embedding = NEUTRAL_DIR;
2823 bidi_init_it (charpos, bytepos, &it->bidi_it);
2824 }
2825
2802 /* If a buffer position was specified, set the iterator there, 2826 /* If a buffer position was specified, set the iterator there,
2803 getting overlays and face properties from that position. */ 2827 getting overlays and face properties from that position. */
2804 if (charpos >= BUF_BEG (current_buffer)) 2828 if (charpos >= BUF_BEG (current_buffer))
@@ -3764,18 +3788,18 @@ handle_invisible_prop (it)
3764 else 3788 else
3765 { 3789 {
3766 int invis_p; 3790 int invis_p;
3767 EMACS_INT newpos, next_stop, start_charpos; 3791 EMACS_INT newpos, next_stop, start_charpos, tem;
3768 Lisp_Object pos, prop, overlay; 3792 Lisp_Object pos, prop, overlay;
3769 3793
3770 /* First of all, is there invisible text at this position? */ 3794 /* First of all, is there invisible text at this position? */
3771 start_charpos = IT_CHARPOS (*it); 3795 tem = start_charpos = IT_CHARPOS (*it);
3772 pos = make_number (IT_CHARPOS (*it)); 3796 pos = make_number (tem);
3773 prop = get_char_property_and_overlay (pos, Qinvisible, it->window, 3797 prop = get_char_property_and_overlay (pos, Qinvisible, it->window,
3774 &overlay); 3798 &overlay);
3775 invis_p = TEXT_PROP_MEANS_INVISIBLE (prop); 3799 invis_p = TEXT_PROP_MEANS_INVISIBLE (prop);
3776 3800
3777 /* If we are on invisible text, skip over it. */ 3801 /* If we are on invisible text, skip over it. */
3778 if (invis_p && IT_CHARPOS (*it) < it->end_charpos) 3802 if (invis_p && start_charpos < it->end_charpos)
3779 { 3803 {
3780 /* Record whether we have to display an ellipsis for the 3804 /* Record whether we have to display an ellipsis for the
3781 invisible text. */ 3805 invisible text. */
@@ -3788,17 +3812,16 @@ handle_invisible_prop (it)
3788 do 3812 do
3789 { 3813 {
3790 /* Try to skip some invisible text. Return value is the 3814 /* Try to skip some invisible text. Return value is the
3791 position reached which can be equal to IT's position 3815 position reached which can be equal to where we start
3792 if there is nothing invisible here. This skips both 3816 if there is nothing invisible there. This skips both
3793 over invisible text properties and overlays with 3817 over invisible text properties and overlays with
3794 invisible property. */ 3818 invisible property. */
3795 newpos = skip_invisible (IT_CHARPOS (*it), 3819 newpos = skip_invisible (tem, &next_stop, ZV, it->window);
3796 &next_stop, ZV, it->window);
3797 3820
3798 /* If we skipped nothing at all we weren't at invisible 3821 /* If we skipped nothing at all we weren't at invisible
3799 text in the first place. If everything to the end of 3822 text in the first place. If everything to the end of
3800 the buffer was skipped, end the loop. */ 3823 the buffer was skipped, end the loop. */
3801 if (newpos == IT_CHARPOS (*it) || newpos >= ZV) 3824 if (newpos == tem || newpos >= ZV)
3802 invis_p = 0; 3825 invis_p = 0;
3803 else 3826 else
3804 { 3827 {
@@ -3816,7 +3839,7 @@ handle_invisible_prop (it)
3816 /* If we ended up on invisible text, proceed to 3839 /* If we ended up on invisible text, proceed to
3817 skip starting with next_stop. */ 3840 skip starting with next_stop. */
3818 if (invis_p) 3841 if (invis_p)
3819 IT_CHARPOS (*it) = next_stop; 3842 tem = next_stop;
3820 3843
3821 /* If there are adjacent invisible texts, don't lose the 3844 /* If there are adjacent invisible texts, don't lose the
3822 second one's ellipsis. */ 3845 second one's ellipsis. */
@@ -3826,8 +3849,47 @@ handle_invisible_prop (it)
3826 while (invis_p); 3849 while (invis_p);
3827 3850
3828 /* The position newpos is now either ZV or on visible text. */ 3851 /* The position newpos is now either ZV or on visible text. */
3829 IT_CHARPOS (*it) = newpos; 3852 if (it->bidi_p && newpos < ZV)
3830 IT_BYTEPOS (*it) = CHAR_TO_BYTE (newpos); 3853 {
3854 /* With bidi iteration, the region of invisible text
3855 could start and/or end in the middle of a non-base
3856 embedding level. Therefore, we need to skip
3857 invisible text using the bidi iterator, starting at
3858 IT's current position, until we find ourselves
3859 outside the invisible text. Skipping invisible text
3860 _after_ bidi iteration avoids affecting the visual
3861 order of the displayed text when invisible properties
3862 are added or removed. */
3863 if (it->bidi_it.first_elt)
3864 {
3865 /* If we were `reseat'ed to a new paragraph,
3866 determine the paragraph base direction. We need
3867 to do it now because next_element_from_buffer may
3868 not have a chance to do it, if we are going to
3869 skip any text at the beginning, which resets the
3870 FIRST_ELT flag. */
3871 bidi_paragraph_init (it->paragraph_embedding, &it->bidi_it);
3872 }
3873 do
3874 {
3875 bidi_get_next_char_visually (&it->bidi_it);
3876 }
3877 while (it->stop_charpos <= it->bidi_it.charpos
3878 && it->bidi_it.charpos < newpos);
3879 IT_CHARPOS (*it) = it->bidi_it.charpos;
3880 IT_BYTEPOS (*it) = it->bidi_it.bytepos;
3881 /* If we overstepped NEWPOS, record its position in the
3882 iterator, so that we skip invisible text if later the
3883 bidi iteration lands us in the invisible region
3884 again. */
3885 if (IT_CHARPOS (*it) >= newpos)
3886 it->prev_stop = newpos;
3887 }
3888 else
3889 {
3890 IT_CHARPOS (*it) = newpos;
3891 IT_BYTEPOS (*it) = CHAR_TO_BYTE (newpos);
3892 }
3831 3893
3832 /* If there are before-strings at the start of invisible 3894 /* If there are before-strings at the start of invisible
3833 text, and the text is invisible because of a text 3895 text, and the text is invisible because of a text
@@ -3836,7 +3898,7 @@ handle_invisible_prop (it)
3836 overlay property instead of a text property, this is 3898 overlay property instead of a text property, this is
3837 already handled in the overlay code.) */ 3899 already handled in the overlay code.) */
3838 if (NILP (overlay) 3900 if (NILP (overlay)
3839 && get_overlay_strings (it, start_charpos)) 3901 && get_overlay_strings (it, it->stop_charpos))
3840 { 3902 {
3841 handled = HANDLED_RECOMPUTE_PROPS; 3903 handled = HANDLED_RECOMPUTE_PROPS;
3842 it->stack[it->sp - 1].display_ellipsis_p = display_ellipsis_p; 3904 it->stack[it->sp - 1].display_ellipsis_p = display_ellipsis_p;
@@ -3857,7 +3919,7 @@ handle_invisible_prop (it)
3857 first invisible character. */ 3919 first invisible character. */
3858 if (!STRINGP (it->object)) 3920 if (!STRINGP (it->object))
3859 { 3921 {
3860 it->position.charpos = IT_CHARPOS (*it) - 1; 3922 it->position.charpos = newpos - 1;
3861 it->position.bytepos = CHAR_TO_BYTE (it->position.charpos); 3923 it->position.bytepos = CHAR_TO_BYTE (it->position.charpos);
3862 } 3924 }
3863 it->ellipsis_p = 1; 3925 it->ellipsis_p = 1;
@@ -4571,43 +4633,46 @@ display_prop_string_p (prop, string)
4571 return 0; 4633 return 0;
4572} 4634}
4573 4635
4574 4636/* Look for STRING in overlays and text properties in W's buffer,
4575/* Determine which buffer position in W's buffer STRING comes from. 4637 between character positions FROM and TO (excluding TO).
4576 AROUND_CHARPOS is an approximate position where it could come from. 4638 BACK_P non-zero means look back (in this case, TO is supposed to be
4577 Value is the buffer position or 0 if it couldn't be determined. 4639 less than FROM).
4640 Value is the first character position where STRING was found, or
4641 zero if it wasn't found before hitting TO.
4578 4642
4579 W's buffer must be current. 4643 W's buffer must be current.
4580 4644
4581 This function is necessary because we don't record buffer positions
4582 in glyphs generated from strings (to keep struct glyph small).
4583 This function may only use code that doesn't eval because it is 4645 This function may only use code that doesn't eval because it is
4584 called asynchronously from note_mouse_highlight. */ 4646 called asynchronously from note_mouse_highlight. */
4585 4647
4586int 4648static EMACS_INT
4587string_buffer_position (w, string, around_charpos) 4649string_buffer_position_lim (w, string, from, to, back_p)
4588 struct window *w; 4650 struct window *w;
4589 Lisp_Object string; 4651 Lisp_Object string;
4590 int around_charpos; 4652 EMACS_INT from, to;
4653 int back_p;
4591{ 4654{
4592 Lisp_Object limit, prop, pos; 4655 Lisp_Object limit, prop, pos;
4593 const int MAX_DISTANCE = 1000;
4594 int found = 0; 4656 int found = 0;
4595 4657
4596 pos = make_number (around_charpos); 4658 pos = make_number (from);
4597 limit = make_number (min (XINT (pos) + MAX_DISTANCE, ZV)); 4659
4598 while (!found && !EQ (pos, limit)) 4660 if (!back_p) /* looking forward */
4599 { 4661 {
4600 prop = Fget_char_property (pos, Qdisplay, Qnil); 4662 limit = make_number (min (to, ZV));
4601 if (!NILP (prop) && display_prop_string_p (prop, string)) 4663 while (!found && !EQ (pos, limit))
4602 found = 1; 4664 {
4603 else 4665 prop = Fget_char_property (pos, Qdisplay, Qnil);
4604 pos = Fnext_single_char_property_change (pos, Qdisplay, Qnil, limit); 4666 if (!NILP (prop) && display_prop_string_p (prop, string))
4667 found = 1;
4668 else
4669 pos = Fnext_single_char_property_change (pos, Qdisplay, Qnil,
4670 limit);
4671 }
4605 } 4672 }
4606 4673 else /* looking back */
4607 if (!found)
4608 { 4674 {
4609 pos = make_number (around_charpos); 4675 limit = make_number (max (to, BEGV));
4610 limit = make_number (max (XINT (pos) - MAX_DISTANCE, BEGV));
4611 while (!found && !EQ (pos, limit)) 4676 while (!found && !EQ (pos, limit))
4612 { 4677 {
4613 prop = Fget_char_property (pos, Qdisplay, Qnil); 4678 prop = Fget_char_property (pos, Qdisplay, Qnil);
@@ -4622,6 +4687,35 @@ string_buffer_position (w, string, around_charpos)
4622 return found ? XINT (pos) : 0; 4687 return found ? XINT (pos) : 0;
4623} 4688}
4624 4689
4690/* Determine which buffer position in W's buffer STRING comes from.
4691 AROUND_CHARPOS is an approximate position where it could come from.
4692 Value is the buffer position or 0 if it couldn't be determined.
4693
4694 W's buffer must be current.
4695
4696 This function is necessary because we don't record buffer positions
4697 in glyphs generated from strings (to keep struct glyph small).
4698 This function may only use code that doesn't eval because it is
4699 called asynchronously from note_mouse_highlight. */
4700
4701EMACS_INT
4702string_buffer_position (w, string, around_charpos)
4703 struct window *w;
4704 Lisp_Object string;
4705 EMACS_INT around_charpos;
4706{
4707 Lisp_Object limit, prop, pos;
4708 const int MAX_DISTANCE = 1000;
4709 EMACS_INT found = string_buffer_position_lim (w, string, around_charpos,
4710 around_charpos + MAX_DISTANCE,
4711 0);
4712
4713 if (!found)
4714 found = string_buffer_position_lim (w, string, around_charpos,
4715 around_charpos - MAX_DISTANCE, 1);
4716 return found;
4717}
4718
4625 4719
4626 4720
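Note on the string_buffer_position split above: a minimal sketch of the search pattern (forward up to a limit, then backward up to a limit), not the Emacs implementation. It scans a plain int array linearly, whereas the real code jumps between `display' property changes. `demo_position_lim', `demo_position' and DEMO_MAX_DISTANCE are hypothetical names. The sketch assumes 1 <= BEGV <= AROUND < ZV and that BUF has at least ZV elements, so that 0 can mean "not found".

```c
#include <assert.h>

enum { DEMO_MAX_DISTANCE = 3 };

/* Look for NEEDLE in BUF between FROM and TO (excluding TO), clamped
   to [BEGV, ZV).  BACK_P non-zero means look back, in which case TO
   is expected to be less than FROM.  Return the position found, or 0.  */
long
demo_position_lim (const int *buf, long begv, long zv, int needle,
                   long from, long to, int back_p)
{
  long pos = from;

  if (!back_p)   /* looking forward */
    {
      long limit = to < zv ? to : zv;
      for (; pos < limit; pos++)
        if (buf[pos] == needle)
          return pos;
    }
  else           /* looking back */
    {
      long limit = to > begv ? to : begv;
      for (; pos > limit; pos--)
        if (buf[pos] == needle)
          return pos;
    }
  return 0;
}

/* Search forward from AROUND first, then backward, each time going at
   most DEMO_MAX_DISTANCE positions.  */
long
demo_position (const int *buf, long begv, long zv, int needle, long around)
{
  long found = demo_position_lim (buf, begv, zv, needle, around,
                                  around + DEMO_MAX_DISTANCE, 0);
  if (!found)
    found = demo_position_lim (buf, begv, zv, needle, around,
                               around - DEMO_MAX_DISTANCE, 1);
  return found;
}
```

Because TO is excluded in both directions, a match sitting exactly on the limit is reported as not found.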
4627/*********************************************************************** 4721/***********************************************************************
@@ -5088,6 +5182,8 @@ push_it (it)
5088 p = it->stack + it->sp; 5182 p = it->stack + it->sp;
5089 5183
5090 p->stop_charpos = it->stop_charpos; 5184 p->stop_charpos = it->stop_charpos;
5185 p->prev_stop = it->prev_stop;
5186 p->base_level_stop = it->base_level_stop;
5091 p->cmp_it = it->cmp_it; 5187 p->cmp_it = it->cmp_it;
5092 xassert (it->face_id >= 0); 5188 xassert (it->face_id >= 0);
5093 p->face_id = it->face_id; 5189 p->face_id = it->face_id;
@@ -5138,6 +5234,8 @@ pop_it (it)
5138 --it->sp; 5234 --it->sp;
5139 p = it->stack + it->sp; 5235 p = it->stack + it->sp;
5140 it->stop_charpos = p->stop_charpos; 5236 it->stop_charpos = p->stop_charpos;
5237 it->prev_stop = p->prev_stop;
5238 it->base_level_stop = p->base_level_stop;
5141 it->cmp_it = p->cmp_it; 5239 it->cmp_it = p->cmp_it;
5142 it->face_id = p->face_id; 5240 it->face_id = p->face_id;
5143 it->current = p->current; 5241 it->current = p->current;
@@ -5315,8 +5413,8 @@ back_to_previous_visible_line_start (it)
5315 if (IT_CHARPOS (*it) <= BEGV) 5413 if (IT_CHARPOS (*it) <= BEGV)
5316 break; 5414 break;
5317 5415
5318 /* If selective > 0, then lines indented more than that values 5416 /* If selective > 0, then lines indented more than its value are
5319 are invisible. */ 5417 invisible. */
5320 if (it->selective > 0 5418 if (it->selective > 0
5321 && indented_beyond_p (IT_CHARPOS (*it), IT_BYTEPOS (*it), 5419 && indented_beyond_p (IT_CHARPOS (*it), IT_BYTEPOS (*it),
5322 (double) it->selective)) /* iftc */ 5420 (double) it->selective)) /* iftc */
@@ -5473,7 +5571,30 @@ reseat (it, pos, force_p)
5473 if (force_p 5571 if (force_p
5474 || CHARPOS (pos) > it->stop_charpos 5572 || CHARPOS (pos) > it->stop_charpos
5475 || CHARPOS (pos) < original_pos) 5573 || CHARPOS (pos) < original_pos)
5476 handle_stop (it); 5574 {
5575 if (it->bidi_p)
5576 {
5577 /* For bidi iteration, we need to prime prev_stop and
5578 base_level_stop with our best estimates. */
5579 if (CHARPOS (pos) < it->prev_stop)
5580 {
5581 handle_stop_backwards (it, BEGV);
5582 if (CHARPOS (pos) < it->base_level_stop)
5583 it->base_level_stop = 0;
5584 }
5585 else if (CHARPOS (pos) > it->stop_charpos
5586 && it->stop_charpos >= BEGV)
5587 handle_stop_backwards (it, it->stop_charpos);
5588 else /* force_p */
5589 handle_stop (it);
5590 }
5591 else
5592 {
5593 handle_stop (it);
5594 it->prev_stop = it->base_level_stop = 0;
5595 }
5596
5597 }
5477 5598
5478 CHECK_IT (it); 5599 CHECK_IT (it);
5479} 5600}
@@ -5510,9 +5631,14 @@ reseat_1 (it, pos, set_stop_p)
5510 it->sp = 0; 5631 it->sp = 0;
5511 it->string_from_display_prop_p = 0; 5632 it->string_from_display_prop_p = 0;
5512 it->face_before_selective_p = 0; 5633 it->face_before_selective_p = 0;
5634 if (it->bidi_p)
5635 it->bidi_it.first_elt = 1;
5513 5636
5514 if (set_stop_p) 5637 if (set_stop_p)
5515 it->stop_charpos = CHARPOS (pos); 5638 {
5639 it->stop_charpos = CHARPOS (pos);
5640 it->base_level_stop = CHARPOS (pos);
5641 }
5516} 5642}
5517 5643
5518 5644
@@ -5624,7 +5750,7 @@ reseat_to_string (it, s, string, charpos, precision, field_width, multibyte)
5624 5750
5625/*********************************************************************** 5751/***********************************************************************
5626 Iteration 5752 Iteration
5627 ***********************************************************************/ 5753***********************************************************************/
5628 5754
5629/* Map enum it_method value to corresponding next_element_from_* function. */ 5755/* Map enum it_method value to corresponding next_element_from_* function. */
5630 5756
@@ -5676,6 +5802,13 @@ get_next_display_element (it)
5676 5802
5677 if (it->what == IT_CHARACTER) 5803 if (it->what == IT_CHARACTER)
5678 { 5804 {
5805 /* UAX#9, L4: "A character is depicted by a mirrored glyph if
5806 and only if (a) the resolved directionality of that character
5807 is R..." */
5808 /* FIXME: Do we need an exception for characters from display
5809 tables? */
5810 if (it->bidi_p && it->bidi_it.type == STRONG_R)
5811 it->c = bidi_mirror_char (it->c);
5679 /* Map via display table or translate control characters. 5812 /* Map via display table or translate control characters.
5680 IT->c, IT->len etc. have been set to the next character by 5813 IT->c, IT->len etc. have been set to the next character by
5681 the function call above. If we have a display table, and it 5814 the function call above. If we have a display table, and it
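The mirroring step added to get_next_display_element above can be illustrated with a small standalone sketch. The names mirror_pairs, mirror_char_sketch and display_char_sketch are hypothetical, and the table covers only a few ASCII pairs; the real bidi_mirror_char lives in bidi.c, which this hunk does not show, so nothing here should be read as its actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ASCII-only mirror table; NOT the table used by bidi.c.  */
static const struct { int from, to; } mirror_pairs[] = {
  { '(', ')' }, { ')', '(' },
  { '[', ']' }, { ']', '[' },
  { '{', '}' }, { '}', '{' },
  { '<', '>' }, { '>', '<' },
};

/* Return the mirrored counterpart of C, or C itself if the table
   lists none.  */
static int
mirror_char_sketch (int c)
{
  size_t i;

  assert (c >= 0);		/* character codes are non-negative */
  for (i = 0; i < sizeof mirror_pairs / sizeof mirror_pairs[0]; i++)
    if (mirror_pairs[i].from == c)
      return mirror_pairs[i].to;
  return c;
}

/* Mirror only when the resolved direction is right-to-left, the way
   the hunk tests for STRONG_R before calling bidi_mirror_char.  */
static int
display_char_sketch (int c, int resolved_is_r2l)
{
  return resolved_is_r2l ? mirror_char_sketch (c) : c;
}
```

With this sketch, an opening parenthesis in a right-to-left run is displayed as a closing one, while the same character in a left-to-right run is left alone.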
@@ -5690,7 +5823,7 @@ get_next_display_element (it)
5690 Lisp_Object dv; 5823 Lisp_Object dv;
5691 struct charset *unibyte = CHARSET_FROM_ID (charset_unibyte); 5824 struct charset *unibyte = CHARSET_FROM_ID (charset_unibyte);
5692 enum { char_is_other = 0, char_is_nbsp, char_is_soft_hyphen } 5825 enum { char_is_other = 0, char_is_nbsp, char_is_soft_hyphen }
5693 nbsp_or_shy = char_is_other; 5826 nbsp_or_shy = char_is_other;
5694 int decoded = it->c; 5827 int decoded = it->c;
5695 5828
5696 if (it->dp 5829 if (it->dp
@@ -5908,12 +6041,12 @@ get_next_display_element (it)
5908 happen actually, but due to bugs it may 6041 happen actually, but due to bugs it may
5909 happen. Let's print the char as is, there's 6042 happen. Let's print the char as is, there's
5910 not much meaningful we can do with it. */ 6043 not much meaningful we can do with it. */
5911 str[0] = it->c; 6044 str[0] = it->c;
5912 str[1] = it->c >> 8; 6045 str[1] = it->c >> 8;
5913 str[2] = it->c >> 16; 6046 str[2] = it->c >> 16;
5914 str[3] = it->c >> 24; 6047 str[3] = it->c >> 24;
5915 len = 4; 6048 len = 4;
5916 } 6049 }
5917 6050
5918 for (i = 0; i < len; i++) 6051 for (i = 0; i < len; i++)
5919 { 6052 {
@@ -6082,8 +6215,22 @@ set_iterator_to_next (it, reseat_p)
6082 else 6215 else
6083 { 6216 {
6084 xassert (it->len != 0); 6217 xassert (it->len != 0);
6085 IT_BYTEPOS (*it) += it->len; 6218
6086 IT_CHARPOS (*it) += 1; 6219 if (!it->bidi_p)
6220 {
6221 IT_BYTEPOS (*it) += it->len;
6222 IT_CHARPOS (*it) += 1;
6223 }
6224 else
6225 {
6226 /* If this is a new paragraph, determine its base
6227 direction (a.k.a. its base embedding level). */
6228 if (it->bidi_it.new_paragraph)
6229 bidi_paragraph_init (it->paragraph_embedding, &it->bidi_it);
6230 bidi_get_next_char_visually (&it->bidi_it);
6231 IT_BYTEPOS (*it) = it->bidi_it.bytepos;
6232 IT_CHARPOS (*it) = it->bidi_it.charpos;
6233 }
6087 xassert (IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it))); 6234 xassert (IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it)));
6088 } 6235 }
6089 break; 6236 break;
@@ -6236,7 +6383,7 @@ next_element_from_display_vector (it)
6236 it->face_id = it->saved_face_id; 6383 it->face_id = it->saved_face_id;
6237 6384
6238 /* KFS: This code used to check ip->dpvec[0] instead of the current element. 6385 /* KFS: This code used to check ip->dpvec[0] instead of the current element.
6239 That seemed totally bogus - so I changed it... */ 6386 That seemed totally bogus - so I changed it... */
6240 gc = it->dpvec[it->current.dpvec_index]; 6387 gc = it->dpvec[it->current.dpvec_index];
6241 6388
6242 if (GLYPH_CODE_P (gc) && GLYPH_CODE_CHAR_VALID_P (gc)) 6389 if (GLYPH_CODE_P (gc) && GLYPH_CODE_CHAR_VALID_P (gc))
@@ -6471,6 +6618,45 @@ next_element_from_stretch (it)
6471 return 1; 6618 return 1;
6472} 6619}
6473 6620
6621/* Scan forward from CHARPOS in the current buffer, until we find a
6622 stop position > current IT's position. Then handle the stop
6623 position before that. This is called when we bump into a stop
6624 position while reordering bidirectional text. */
6625
6626static void
6627handle_stop_backwards (it, charpos)
6628 struct it *it;
6629 EMACS_INT charpos;
6630{
6631 EMACS_INT where_we_are = IT_CHARPOS (*it);
6632 struct display_pos save_current = it->current;
6633 struct text_pos save_position = it->position;
6634 struct text_pos pos1;
6635 EMACS_INT next_stop;
6636
6637 /* Scan in strict logical order. */
6638 it->bidi_p = 0;
6639 do
6640 {
6641 it->prev_stop = charpos;
6642 SET_TEXT_POS (pos1, charpos, CHAR_TO_BYTE (charpos));
6643 reseat_1 (it, pos1, 0);
6644 compute_stop_pos (it);
6645 /* We must advance forward, right? */
6646 if (it->stop_charpos <= it->prev_stop)
6647 abort ();
6648 charpos = it->stop_charpos;
6649 }
6650 while (charpos <= where_we_are);
6651
6652 next_stop = it->stop_charpos;
6653 it->stop_charpos = it->prev_stop;
6654 it->bidi_p = 1;
6655 it->current = save_current;
6656 it->position = save_position;
6657 handle_stop (it);
6658 it->stop_charpos = next_stop;
6659}
6474 6660
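The loop in handle_stop_backwards can be sketched outside of Emacs as follows. This is a simplified model under stated assumptions: stop positions are given as a sorted array, next_stop_after is a hypothetical stand-in for reseat_1 plus compute_stop_pos, and find_enclosing_stops and struct stop_pair are illustrative names that do not exist in xdisp.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for compute_stop_pos: return the smallest
   stop position strictly greater than POS, or END if there is none.
   STOPS must be sorted in ascending order.  */
static long
next_stop_after (const long *stops, size_t n, long pos, long end)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (stops[i] > pos)
      return stops[i];
  return end;
}

struct stop_pair { long prev_stop; long next_stop; };

/* Scan forward from CHARPOS, in logical order, until the next stop
   lies beyond WHERE_WE_ARE.  The result mirrors what the real
   function leaves in it->prev_stop and it->stop_charpos.  */
static struct stop_pair
find_enclosing_stops (const long *stops, size_t n,
		      long charpos, long where_we_are, long end)
{
  struct stop_pair r;

  assert (charpos <= where_we_are && where_we_are < end);
  do
    {
      r.prev_stop = charpos;
      r.next_stop = next_stop_after (stops, n, charpos, end);
      /* We must advance forward, as the original insists.  */
      assert (r.next_stop > r.prev_stop);
      charpos = r.next_stop;
    }
  while (charpos <= where_we_are);
  return r;
}
```

For stops at 10, 20 and 30, scanning from 1 with the iterator at 25 yields prev_stop 20 and next_stop 30, that is, the last stop at or before the current position and the first one after it.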
6475/* Load IT with the next display element from current_buffer. Value 6661/* Load IT with the next display element from current_buffer. Value
6476 is zero if end of buffer reached. IT->stop_charpos is the next 6662 is zero if end of buffer reached. IT->stop_charpos is the next
@@ -6485,6 +6671,53 @@ next_element_from_buffer (it)
6485 6671
6486 xassert (IT_CHARPOS (*it) >= BEGV); 6672 xassert (IT_CHARPOS (*it) >= BEGV);
6487 6673
6674 /* With bidi reordering, the character to display might not be the
6675 character at IT_CHARPOS. BIDI_IT.FIRST_ELT non-zero means that
6676 we were reseat()ed to a new buffer position, which is potentially
6677 a different paragraph. */
6678 if (it->bidi_p && it->bidi_it.first_elt)
6679 {
6680 it->bidi_it.charpos = IT_CHARPOS (*it);
6681 it->bidi_it.bytepos = IT_BYTEPOS (*it);
6682 /* If we are at the beginning of a line, we can produce the next
6683 element right away. */
6684 if (it->bidi_it.bytepos == BEGV_BYTE
6685 /* FIXME: Should support all Unicode line separators. */
6686 || FETCH_CHAR (it->bidi_it.bytepos - 1) == '\n'
6687 || FETCH_CHAR (it->bidi_it.bytepos) == '\n')
6688 {
6689 bidi_paragraph_init (it->paragraph_embedding, &it->bidi_it);
6690 bidi_get_next_char_visually (&it->bidi_it);
6691 }
6692 else
6693 {
6694 int orig_bytepos = IT_BYTEPOS (*it);
6695
6696 /* We need to prime the bidi iterator starting at the line's
6697 beginning, before we will be able to produce the next
6698 element. */
6699 IT_CHARPOS (*it) = find_next_newline_no_quit (IT_CHARPOS (*it), -1);
6700 IT_BYTEPOS (*it) = CHAR_TO_BYTE (IT_CHARPOS (*it));
6701 it->bidi_it.charpos = IT_CHARPOS (*it);
6702 it->bidi_it.bytepos = IT_BYTEPOS (*it);
6703 bidi_paragraph_init (it->paragraph_embedding, &it->bidi_it);
6704 do
6705 {
6706 /* Now return to buffer position where we were asked to
6707 get the next display element, and produce that. */
6708 bidi_get_next_char_visually (&it->bidi_it);
6709 }
6710 while (it->bidi_it.bytepos != orig_bytepos
6711 && it->bidi_it.bytepos < ZV_BYTE);
6712 }
6713
6714 it->bidi_it.first_elt = 0; /* paranoia: bidi.c does this */
6715 /* Adjust IT's position information to where we ended up. */
6716 IT_CHARPOS (*it) = it->bidi_it.charpos;
6717 IT_BYTEPOS (*it) = it->bidi_it.bytepos;
6718 SET_TEXT_POS (it->position, IT_CHARPOS (*it), IT_BYTEPOS (*it));
6719 }
6720
6488 if (IT_CHARPOS (*it) >= it->stop_charpos) 6721 if (IT_CHARPOS (*it) >= it->stop_charpos)
6489 { 6722 {
6490 if (IT_CHARPOS (*it) >= it->end_charpos) 6723 if (IT_CHARPOS (*it) >= it->end_charpos)
@@ -6510,12 +6743,51 @@ next_element_from_buffer (it)
6510 success_p = 0; 6743 success_p = 0;
6511 } 6744 }
6512 } 6745 }
6746 else if (!(!it->bidi_p
6747 || BIDI_AT_BASE_LEVEL (it->bidi_it)
6748 || IT_CHARPOS (*it) == it->stop_charpos))
6749 {
6750 /* With bidi non-linear iteration, we could find ourselves
6751 far beyond the last computed stop_charpos, with several
6752 other stop positions in between that we missed. Scan
6753 them all now, in buffer's logical order, until we find
6754 and handle the last stop_charpos that precedes our
6755 current position. */
6756 handle_stop_backwards (it, it->stop_charpos);
6757 return GET_NEXT_DISPLAY_ELEMENT (it);
6758 }
6513 else 6759 else
6514 { 6760 {
6761 if (it->bidi_p)
6762 {
6763 /* Take note of the stop position we just moved across,
6764 for when we will move back across it. */
6765 it->prev_stop = it->stop_charpos;
6766 /* If we are at base paragraph embedding level, take
6767 note of the last stop position seen at this
6768 level. */
6769 if (BIDI_AT_BASE_LEVEL (it->bidi_it))
6770 it->base_level_stop = it->stop_charpos;
6771 }
6515 handle_stop (it); 6772 handle_stop (it);
6516 return GET_NEXT_DISPLAY_ELEMENT (it); 6773 return GET_NEXT_DISPLAY_ELEMENT (it);
6517 } 6774 }
6518 } 6775 }
6776 else if (it->bidi_p
6777 /* We can sometimes back up for reasons that have nothing
6778 to do with bidi reordering. E.g., compositions. The
6779 code below is only needed when we are above the base
6780 embedding level, so test for that explicitly. */
6781 && !BIDI_AT_BASE_LEVEL (it->bidi_it)
6782 && IT_CHARPOS (*it) < it->prev_stop)
6783 {
6784 if (it->base_level_stop <= 0)
6785 it->base_level_stop = BEGV;
6786 if (IT_CHARPOS (*it) < it->base_level_stop)
6787 abort ();
6788 handle_stop_backwards (it, it->base_level_stop);
6789 return GET_NEXT_DISPLAY_ELEMENT (it);
6790 }
6519 else 6791 else
6520 { 6792 {
6521 /* No face changes, overlays etc. in sight, so just return a 6793 /* No face changes, overlays etc. in sight, so just return a
@@ -6670,9 +6942,9 @@ next_element_from_composition (it)
6670 line on the display without producing glyphs. 6942 line on the display without producing glyphs.
6671 6943
6672 OP should be a bit mask including some or all of these bits: 6944 OP should be a bit mask including some or all of these bits:
6673 MOVE_TO_X: Stop on reaching x-position TO_X. 6945 MOVE_TO_X: Stop upon reaching x-position TO_X.
6674 MOVE_TO_POS: Stop on reaching buffer or string position TO_CHARPOS. 6946 MOVE_TO_POS: Stop upon reaching buffer or string position TO_CHARPOS.
6675 Regardless of OP's value, stop in reaching the end of the display line. 6947 Regardless of OP's value, stop upon reaching the end of the display line.
6676 6948
6677 TO_X is normally a value 0 <= TO_X <= IT->last_visible_x. 6949 TO_X is normally a value 0 <= TO_X <= IT->last_visible_x.
6678 This means, in particular, that TO_X includes window's horizontal 6950 This means, in particular, that TO_X includes window's horizontal
@@ -6708,6 +6980,8 @@ move_it_in_display_line_to (struct it *it,
6708 struct glyph_row *saved_glyph_row; 6980 struct glyph_row *saved_glyph_row;
6709 struct it wrap_it, atpos_it, atx_it; 6981 struct it wrap_it, atpos_it, atx_it;
6710 int may_wrap = 0; 6982 int may_wrap = 0;
6983 enum it_method prev_method = it->method;
6984 EMACS_INT prev_pos = IT_CHARPOS (*it);
6711 6985
6712 /* Don't produce glyphs in produce_glyphs. */ 6986 /* Don't produce glyphs in produce_glyphs. */
6713 saved_glyph_row = it->glyph_row; 6987 saved_glyph_row = it->glyph_row;
@@ -6725,7 +6999,7 @@ move_it_in_display_line_to (struct it *it,
6725#define BUFFER_POS_REACHED_P() \ 6999#define BUFFER_POS_REACHED_P() \
6726 ((op & MOVE_TO_POS) != 0 \ 7000 ((op & MOVE_TO_POS) != 0 \
6727 && BUFFERP (it->object) \ 7001 && BUFFERP (it->object) \
6728 && IT_CHARPOS (*it) >= to_charpos \ 7002 && IT_CHARPOS (*it) == to_charpos \
6729 && (it->method == GET_FROM_BUFFER \ 7003 && (it->method == GET_FROM_BUFFER \
6730 || (it->method == GET_FROM_DISPLAY_VECTOR \ 7004 || (it->method == GET_FROM_DISPLAY_VECTOR \
6731 && it->dpvec + it->current.dpvec_index + 1 >= it->dpend))) 7005 && it->dpvec + it->current.dpvec_index + 1 >= it->dpend)))
@@ -6749,7 +7023,14 @@ move_it_in_display_line_to (struct it *it,
6749 if ((op & MOVE_TO_POS) != 0 7023 if ((op & MOVE_TO_POS) != 0
6750 && BUFFERP (it->object) 7024 && BUFFERP (it->object)
6751 && it->method == GET_FROM_BUFFER 7025 && it->method == GET_FROM_BUFFER
6752 && IT_CHARPOS (*it) > to_charpos) 7026 && (prev_method == GET_FROM_IMAGE
7027 || prev_method == GET_FROM_STRETCH)
7028 /* Passed TO_CHARPOS from left to right. */
7029 && ((prev_pos < to_charpos
7030 && IT_CHARPOS (*it) > to_charpos)
7031 /* Passed TO_CHARPOS from right to left. */
7032 || (prev_pos > to_charpos
7033 && IT_CHARPOS (*it) < to_charpos)))
6753 { 7034 {
6754 if (it->line_wrap != WORD_WRAP || wrap_it.sp < 0) 7035 if (it->line_wrap != WORD_WRAP || wrap_it.sp < 0)
6755 { 7036 {
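The condition added in this hunk checks whether the iterator jumped across TO_CHARPOS in either direction, which can happen once positions no longer increase monotonically along a display line. A minimal sketch of just that predicate, with the hypothetical name passed_target and plain long positions instead of iterator state:

```c
#include <assert.h>

/* Return non-zero if moving from PREV_POS to CUR_POS skipped over
   TARGET without landing on it.  Hypothetical helper; the real code
   inlines this test and also checks the iterator method.  */
static int
passed_target (long prev_pos, long cur_pos, long target)
{
  return ((prev_pos < target && cur_pos > target)	/* left to right */
	  || (prev_pos > target && cur_pos < target));	/* right to left */
}
```

Landing exactly on the target is not "passing" it; that case is left to BUFFER_POS_REACHED_P, which this commit changes from >= to == for the same reason.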
@@ -6763,6 +7044,9 @@ move_it_in_display_line_to (struct it *it,
6763 atpos_it = *it; 7044 atpos_it = *it;
6764 } 7045 }
6765 7046
7047 prev_method = it->method;
7048 if (it->method == GET_FROM_BUFFER)
7049 prev_pos = IT_CHARPOS (*it);
6766 /* Stop when ZV reached. 7050 /* Stop when ZV reached.
6767 We used to stop here when TO_CHARPOS reached as well, but that is 7051 We used to stop here when TO_CHARPOS reached as well, but that is
6768 too soon if this glyph does not fit on this line. So we handle it 7052 too soon if this glyph does not fit on this line. So we handle it
@@ -7028,6 +7312,8 @@ move_it_in_display_line_to (struct it *it,
7028 break; 7312 break;
7029 } 7313 }
7030 7314
7315 if (it->method == GET_FROM_BUFFER)
7316 prev_pos = IT_CHARPOS (*it);
7031 /* The current display element has been consumed. Advance 7317 /* The current display element has been consumed. Advance
7032 to the next. */ 7318 to the next. */
7033 set_iterator_to_next (it, 1); 7319 set_iterator_to_next (it, 1);
@@ -11033,6 +11319,17 @@ text_outside_line_unchanged_p (w, start, end)
11033 && overlay_touches_p (Z - end)) 11319 && overlay_touches_p (Z - end))
11034 unchanged_p = 0; 11320 unchanged_p = 0;
11035 } 11321 }
11322
11323 /* Under bidi reordering, adding or deleting a character at the
11324 beginning of a paragraph, before the first strong directional
11325 character, can change the base direction of the paragraph (unless
11326 the buffer specifies a fixed paragraph direction), which will
11327 require redisplaying the whole paragraph. It might be worthwhile
11328 to find the paragraph limits and widen the range of redisplayed
11329 lines to that, but for now just give up this optimization. */
11330 if (!NILP (XBUFFER (w->buffer)->bidi_display_reordering)
11331 && NILP (XBUFFER (w->buffer)->bidi_paragraph_direction))
11332 unchanged_p = 0;
11036 } 11333 }
11037 11334
11038 return unchanged_p; 11335 return unchanged_p;
@@ -12323,162 +12620,400 @@ set_cursor_from_row (w, row, matrix, delta, delta_bytes, dy, dvpos)
12323 struct glyph *glyph = row->glyphs[TEXT_AREA]; 12620 struct glyph *glyph = row->glyphs[TEXT_AREA];
12324 struct glyph *end = glyph + row->used[TEXT_AREA]; 12621 struct glyph *end = glyph + row->used[TEXT_AREA];
12325 struct glyph *cursor = NULL; 12622 struct glyph *cursor = NULL;
12326 /* The first glyph that starts a sequence of glyphs from a string
12327 that is a value of a display property. */
12328 struct glyph *string_start;
12329 /* The X coordinate of string_start. */
12330 int string_start_x;
12331 /* The last known character position in row. */ 12623 /* The last known character position in row. */
12332 int last_pos = MATRIX_ROW_START_CHARPOS (row) + delta; 12624 int last_pos = MATRIX_ROW_START_CHARPOS (row) + delta;
12333 /* The last known character position before string_start. */
12334 int string_before_pos;
12335 int x = row->x; 12625 int x = row->x;
12336 int cursor_x = x; 12626 int cursor_x = x;
12337 /* Last buffer position covered by an overlay. */ 12627 EMACS_INT pt_old = PT - delta;
12338 int cursor_from_overlay_pos = 0; 12628 EMACS_INT pos_before = MATRIX_ROW_START_CHARPOS (row) + delta;
12339 int pt_old = PT - delta; 12629 EMACS_INT pos_after = MATRIX_ROW_END_CHARPOS (row) + delta;
12340 12630 struct glyph *glyph_before = glyph - 1, *glyph_after = end;
12341 /* Skip over glyphs not having an object at the start of the row. 12631 /* Non-zero means we've found a match for cursor position, but that
12342 These are special glyphs like truncation marks on terminal 12632 glyph has the avoid_cursor_p flag set. */
12343 frames. */ 12633 int match_with_avoid_cursor = 0;
12634 /* Non-zero means we've seen at least one glyph that came from a
12635 display string. */
12636 int string_seen = 0;
12637 /* Largest buffer position seen so far during scan of glyph row. */
12638 EMACS_INT bpos_max = last_pos;
12639 /* Last buffer position covered by an overlay string with an integer
12640 `cursor' property. */
12641 EMACS_INT bpos_covered = 0;
12642
12643 /* Skip over glyphs not having an object at the start and the end of
12644 the row. These are special glyphs like truncation marks on
12645 terminal frames. */
12344 if (row->displays_text_p) 12646 if (row->displays_text_p)
12345 while (glyph < end 12647 {
12346 && INTEGERP (glyph->object) 12648 if (!row->reversed_p)
12347 && glyph->charpos < 0) 12649 {
12650 while (glyph < end
12651 && INTEGERP (glyph->object)
12652 && glyph->charpos < 0)
12653 {
12654 x += glyph->pixel_width;
12655 ++glyph;
12656 }
12657 while (end > glyph
12658 && INTEGERP ((end - 1)->object)
12659 /* CHARPOS is zero for blanks inserted by
12660 extend_face_to_end_of_line. */
12661 && (end - 1)->charpos <= 0)
12662 --end;
12663 glyph_before = glyph - 1;
12664 glyph_after = end;
12665 }
12666 else
12667 {
12668 struct glyph *g;
12669
12670 /* If the glyph row is reversed, we need to process it from back
12671 to front, so swap the edge pointers. */
12672 end = glyph - 1;
12673 glyph += row->used[TEXT_AREA] - 1;
12674 /* Reverse the known positions in the row. */
12675 last_pos = pos_after = MATRIX_ROW_START_CHARPOS (row) + delta;
12676 pos_before = MATRIX_ROW_END_CHARPOS (row) + delta;
12677
12678 while (glyph > end + 1
12679 && INTEGERP (glyph->object)
12680 && glyph->charpos < 0)
12681 {
12682 --glyph;
12683 x -= glyph->pixel_width;
12684 }
12685 if (INTEGERP (glyph->object) && glyph->charpos < 0)
12686 --glyph;
12687 /* By default, put the cursor on the rightmost glyph. */
12688 for (g = end + 1; g < glyph; g++)
12689 x += g->pixel_width;
12690 cursor_x = x;
12691 while (end < glyph
12692 && INTEGERP ((end + 1)->object)
12693 && (end + 1)->charpos <= 0)
12694 ++end;
12695 glyph_before = glyph + 1;
12696 glyph_after = end;
12697 }
12698 }
12699 else if (row->reversed_p)
12700 {
12701 /* In R2L rows that don't display text, put the cursor on the
12702 rightmost glyph. Case in point: an empty last line that is
12703 part of an R2L paragraph. */
12704 cursor = end - 1;
12705 x = -1; /* will be computed below, at label compute_x */
12706 }
12707
12708 /* Step 1: Try to find the glyph whose character position
12709 corresponds to point. If that's not possible, find 2 glyphs
12710 whose character positions are the closest to point, one before
12711 point, the other after it. */
12712 if (!row->reversed_p)
12713 while (/* not marched to end of glyph row */
12714 glyph < end
12715 /* glyph was not inserted by redisplay for internal purposes */
12716 && !INTEGERP (glyph->object))
12348 { 12717 {
12718 if (BUFFERP (glyph->object))
12719 {
12720 EMACS_INT dpos = glyph->charpos - pt_old;
12721
12722 if (glyph->charpos > bpos_max)
12723 bpos_max = glyph->charpos;
12724 if (!glyph->avoid_cursor_p)
12725 {
12726 /* If we hit point, we've found the glyph on which to
12727 display the cursor. */
12728 if (dpos == 0)
12729 {
12730 match_with_avoid_cursor = 0;
12731 break;
12732 }
12733 /* See if we've found a better approximation to
12734 POS_BEFORE or to POS_AFTER. Note that we want the
12735 first (leftmost) glyph of all those that are the
12736 closest from below, and the last (rightmost) of all
12737 those from above. */
12738 if (0 > dpos && dpos > pos_before - pt_old)
12739 {
12740 pos_before = glyph->charpos;
12741 glyph_before = glyph;
12742 }
12743 else if (0 < dpos && dpos <= pos_after - pt_old)
12744 {
12745 pos_after = glyph->charpos;
12746 glyph_after = glyph;
12747 }
12748 }
12749 else if (dpos == 0)
12750 match_with_avoid_cursor = 1;
12751 }
12752 else if (STRINGP (glyph->object))
12753 {
12754 Lisp_Object chprop;
12755 int glyph_pos = glyph->charpos;
12756
12757 chprop = Fget_char_property (make_number (glyph_pos), Qcursor,
12758 glyph->object);
12759 if (INTEGERP (chprop))
12760 {
12761 bpos_covered = bpos_max + XINT (chprop);
12762 /* If the `cursor' property covers buffer positions up
12763 to and including point, we should display cursor on
12764 this glyph. */
12765 /* Implementation note: bpos_max == pt_old when, e.g.,
12766 we are in an empty line, where bpos_max is set to
12767 MATRIX_ROW_START_CHARPOS, see above. */
12768 if (bpos_max <= pt_old && bpos_covered >= pt_old)
12769 {
12770 cursor = glyph;
12771 break;
12772 }
12773 }
12774
12775 string_seen = 1;
12776 }
12349 x += glyph->pixel_width; 12777 x += glyph->pixel_width;
12350 ++glyph; 12778 ++glyph;
12351 } 12779 }
12780 else if (glyph > end) /* row is reversed */
12781 while (!INTEGERP (glyph->object))
12782 {
12783 if (BUFFERP (glyph->object))
12784 {
12785 EMACS_INT dpos = glyph->charpos - pt_old;
12352 12786
12353 string_start = NULL; 12787 if (glyph->charpos > bpos_max)
12354 while (glyph < end 12788 bpos_max = glyph->charpos;
12355 && !INTEGERP (glyph->object) 12789 if (!glyph->avoid_cursor_p)
12356 && (!BUFFERP (glyph->object) 12790 {
12357 || (last_pos = glyph->charpos) < pt_old 12791 if (dpos == 0)
12358 || glyph->avoid_cursor_p)) 12792 {
12793 match_with_avoid_cursor = 0;
12794 break;
12795 }
12796 if (0 > dpos && dpos > pos_before - pt_old)
12797 {
12798 pos_before = glyph->charpos;
12799 glyph_before = glyph;
12800 }
12801 else if (0 < dpos && dpos <= pos_after - pt_old)
12802 {
12803 pos_after = glyph->charpos;
12804 glyph_after = glyph;
12805 }
12806 }
12807 else if (dpos == 0)
12808 match_with_avoid_cursor = 1;
12809 }
12810 else if (STRINGP (glyph->object))
12811 {
12812 Lisp_Object chprop;
12813 int glyph_pos = glyph->charpos;
12814
12815 chprop = Fget_char_property (make_number (glyph_pos), Qcursor,
12816 glyph->object);
12817 if (INTEGERP (chprop))
12818 {
12819 bpos_covered = bpos_max + XINT (chprop);
12820 /* If the `cursor' property covers buffer positions up
12821 to and including point, we should display cursor on
12822 this glyph. */
12823 if (bpos_max <= pt_old && bpos_covered >= pt_old)
12824 {
12825 cursor = glyph;
12826 break;
12827 }
12828 }
12829 string_seen = 1;
12830 }
12831 --glyph;
12832 if (glyph == end)
12833 break;
12834 x -= glyph->pixel_width;
12835 }
12836
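Step 1 above can be modelled on a bare array of buffer positions listed in visual order. This is a sketch under assumptions: struct nearest and find_nearest are hypothetical names, glyphs are reduced to their charpos, and the avoid_cursor_p and display-string cases are left out. It keeps the same comparisons as the hunk, strict for the "before" candidate and non-strict for the "after" candidate.

```c
#include <assert.h>
#include <stddef.h>

/* Indices into the glyph array; -1 means "none found".  */
struct nearest { long exact; long before; long after; };

/* Scan CHARPOS[0..N-1] in visual order looking for POINT.  ROW_START
   and ROW_END seed the best-known positions, as pos_before and
   pos_after do in set_cursor_from_row.  */
static struct nearest
find_nearest (const long *charpos, size_t n, long point,
	      long row_start, long row_end)
{
  struct nearest r = { -1, -1, -1 };
  long pos_before = row_start, pos_after = row_end;
  size_t i;

  assert (row_start <= row_end);
  for (i = 0; i < n; i++)
    {
      long dpos = charpos[i] - point;

      if (dpos == 0)
	{
	  r.exact = (long) i;	/* hit point: stop scanning */
	  break;
	}
      if (dpos < 0 && dpos > pos_before - point)
	{
	  pos_before = charpos[i];	/* closer from below */
	  r.before = (long) i;
	}
      else if (dpos > 0 && dpos <= pos_after - point)
	{
	  pos_after = charpos[i];	/* closer from above */
	  r.after = (long) i;
	}
    }
  return r;
}
```

Because reordered text interleaves positions, the nearest "before" and "after" glyphs need not be adjacent; for the visual order 1 2 9 8 7 3 and point 5, the closest position below point is the last glyph and the closest above it is the one just before that.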
12837 /* Step 2: If we didn't find an exact match for point, we need to
12838 look for a proper place to put the cursor among glyphs between
12839 GLYPH_BEFORE and GLYPH_AFTER. */
12840 if (!(BUFFERP (glyph->object) && glyph->charpos == pt_old)
12841 && bpos_covered < pt_old)
12359 { 12842 {
12360 if (! STRINGP (glyph->object)) 12843 if (row->ends_in_ellipsis_p && pos_after == last_pos)
12361 { 12844 {
12362 string_start = NULL; 12845 EMACS_INT ellipsis_pos;
12363 x += glyph->pixel_width; 12846
12364 ++glyph; 12847 /* Scan back over the ellipsis glyphs. */
12365 /* If we are beyond the cursor position computed from the 12848 if (!row->reversed_p)
12366 last overlay seen, that overlay is not in effect for
12367 current cursor position. Reset the cursor information
12368 computed from that overlay. */
12369 if (cursor_from_overlay_pos
12370 && last_pos >= cursor_from_overlay_pos)
12371 { 12849 {
12372 cursor_from_overlay_pos = 0; 12850 ellipsis_pos = (glyph - 1)->charpos;
12373 cursor = NULL; 12851 while (glyph > row->glyphs[TEXT_AREA]
12852 && (glyph - 1)->charpos == ellipsis_pos)
12853 glyph--, x -= glyph->pixel_width;
12854 /* That loop always goes one position too far, including
12855 the glyph before the ellipsis. So scan forward over
12856 that one. */
12857 x += glyph->pixel_width;
12858 glyph++;
12374 } 12859 }
12375 } 12860 else /* row is reversed */
12376 else
12377 {
12378 if (string_start == NULL)
12379 { 12861 {
12380 string_before_pos = last_pos; 12862 ellipsis_pos = (glyph + 1)->charpos;
12381 string_start = glyph; 12863 while (glyph < row->glyphs[TEXT_AREA] + row->used[TEXT_AREA] - 1
12382 string_start_x = x; 12864 && (glyph + 1)->charpos == ellipsis_pos)
12865 glyph++, x += glyph->pixel_width;
12866 x -= glyph->pixel_width;
12867 glyph--;
12383 } 12868 }
12384 /* Skip all glyphs from a string. */ 12869 }
12385 do 12870 else if (match_with_avoid_cursor
12871 /* zero-width characters produce no glyphs */
12872 || eabs (glyph_after - glyph_before) == 1)
12873 {
12874 cursor = glyph_after;
12875 x = -1;
12876 }
12877 else if (string_seen)
12878 {
12879 int incr = row->reversed_p ? -1 : +1;
12880
12881 /* Need to find the glyph that came out of a string which is
12882 present at point. That glyph is somewhere between
12883 GLYPH_BEFORE and GLYPH_AFTER, and it came from a string
12884 positioned between POS_BEFORE and POS_AFTER in the
12885 buffer. */
12886 struct glyph *stop = glyph_after;
12887 EMACS_INT pos = pos_before;
12888
12889 x = -1;
12890 for (glyph = glyph_before + incr;
12891 row->reversed_p ? glyph > stop : glyph < stop; )
12386 { 12892 {
12387 Lisp_Object cprop; 12893
12388 int pos; 12894 /* Any glyphs that come from the buffer are here because
12389 if ((cursor == NULL || glyph > cursor) 12895 of bidi reordering. Skip them, and only pay
12390 && (cprop = Fget_char_property (make_number ((glyph)->charpos), 12896 attention to glyphs that came from some string. */
12391 Qcursor, (glyph)->object), 12897 if (STRINGP (glyph->object))
12392 !NILP (cprop))
12393 && (pos = string_buffer_position (w, glyph->object,
12394 string_before_pos),
12395 (pos == 0 /* from overlay */
12396 || pos == pt_old)))
12397 { 12898 {
12398 /* Compute the first buffer position after the overlay. 12899 Lisp_Object str;
12399 If the `cursor' property tells us how many positions 12900 EMACS_INT tem;
12400 are associated with the overlay, use that. Otherwise, 12901
12401 estimate from the buffer positions of the glyphs 12902 str = glyph->object;
12402 before and after the overlay. */ 12903 tem = string_buffer_position_lim (w, str, pos, pos_after, 0);
12403 cursor_from_overlay_pos = (pos ? 0 : last_pos 12904 if (pos <= tem)
12404 + (INTEGERP (cprop) ? XINT (cprop) : 0)); 12905 {
12405 cursor = glyph; 12906 /* If the string from which this glyph came is
12406 cursor_x = x; 12907 found in the buffer at point, then we've
12908 found the glyph we've been looking for. */
12909 if (tem == pt_old)
12910 {
12911 /* The glyphs from this string could have
12912 been reordered. Find the one with the
12913 smallest string position. Or there could
12914 be a character in the string with the
12915 `cursor' property, which means display
12916 cursor on that character's glyph. */
12917 int strpos = glyph->charpos;
12918
12919 cursor = glyph;
12920 for (glyph += incr;
12921 EQ (glyph->object, str);
12922 glyph += incr)
12923 {
12924 Lisp_Object cprop;
12925 int gpos = glyph->charpos;
12926
12927 cprop = Fget_char_property (make_number (gpos),
12928 Qcursor,
12929 glyph->object);
12930 if (!NILP (cprop))
12931 {
12932 cursor = glyph;
12933 break;
12934 }
12935 if (glyph->charpos < strpos)
12936 {
12937 strpos = glyph->charpos;
12938 cursor = glyph;
12939 }
12940 }
12941
12942 goto compute_x;
12943 }
12944 pos = tem + 1; /* don't find previous instances */
12945 }
12946 /* This string is not what we want; skip all of the
12947 glyphs that came from it. */
12948 do
12949 glyph += incr;
12950 while ((row->reversed_p ? glyph > stop : glyph < stop)
12951 && EQ (glyph->object, str));
12407 } 12952 }
12408 x += glyph->pixel_width; 12953 else
12409 ++glyph; 12954 glyph += incr;
12410 } 12955 }
12411 while (glyph < end && EQ (glyph->object, string_start->object)); 12956
12957 /* If we reached the end of the line, and END was from a string,
12958 the cursor is not on this line. */
12959 if (glyph == end
12960 && STRINGP ((glyph - incr)->object)
12961 && row->continued_p)
12962 return 0;
12412 } 12963 }
12413 } 12964 }
12414 12965
12966 compute_x:
12415 if (cursor != NULL) 12967 if (cursor != NULL)
12968 glyph = cursor;
12969 if (x < 0)
12416 { 12970 {
12417 glyph = cursor; 12971 struct glyph *g;
12418 x = cursor_x;
12419 }
12420 else if (row->ends_in_ellipsis_p && glyph == end)
12421 {
12422 /* Scan back over the ellipsis glyphs, decrementing positions. */
12423 while (glyph > row->glyphs[TEXT_AREA]
12424 && (glyph - 1)->charpos == last_pos)
12425 glyph--, x -= glyph->pixel_width;
12426 /* That loop always goes one position too far, including the
12427 glyph before the ellipsis. So scan forward over that one. */
12428 x += glyph->pixel_width;
12429 glyph++;
12430 }
12431 else if (string_start
12432 && (glyph == end || !BUFFERP (glyph->object) || last_pos > pt_old))
12433 {
12434 /* We may have skipped over point because the previous glyphs
12435 are from string. As there's no easy way to know the
12436 character position of the current glyph, find the correct
12437 glyph on point by scanning from string_start again. */
12438 Lisp_Object limit;
12439 Lisp_Object string;
12440 struct glyph *stop = glyph;
12441 int pos;
12442
12443 limit = make_number (pt_old + 1);
12444 glyph = string_start;
12445 x = string_start_x;
12446 string = glyph->object;
12447 pos = string_buffer_position (w, string, string_before_pos);
12448 /* If POS == 0, STRING is from overlay. We skip such glyphs
12449 because we always put the cursor after overlay strings. */
12450 while (pos == 0 && glyph < stop)
12451 {
12452 string = glyph->object;
12453 SKIP_GLYPHS (glyph, stop, x, EQ (glyph->object, string));
12454 if (glyph < stop)
12455 pos = string_buffer_position (w, glyph->object, string_before_pos);
12456 }
12457 12972
12458 while (glyph < stop) 12973 /* Need to compute x that corresponds to GLYPH. */
12974 for (g = row->glyphs[TEXT_AREA], x = row->x; g < glyph; g++)
12459 { 12975 {
12460 pos = XINT (Fnext_single_char_property_change 12976 if (g >= row->glyphs[TEXT_AREA] + row->used[TEXT_AREA])
12461 (make_number (pos), Qdisplay, Qnil, limit)); 12977 abort ();
12462 if (pos > pt_old) 12978 x += g->pixel_width;
12463 break; 12979 }
12464 /* Skip glyphs from the same string. */ 12980 }
12465 string = glyph->object; 12981
12466 SKIP_GLYPHS (glyph, stop, x, EQ (glyph->object, string)); 12982 /* ROW could be part of a continued line, which might have other
12467 /* Skip glyphs from an overlay. */ 12983 rows whose start and end charpos occlude point. Only set
12468 while (glyph < stop 12984 w->cursor if we found a better approximation to the cursor
12469 && ! string_buffer_position (w, glyph->object, pos)) 12985 position than we have from previously examined rows. */
12470 { 12986 if (w->cursor.vpos >= 0
12471 string = glyph->object; 12987 /* Make sure cursor.vpos specifies a row whose start and end
12472 SKIP_GLYPHS (glyph, stop, x, EQ (glyph->object, string)); 12988 charpos occlude point. This is because some callers of this
12473 } 12989 function leave cursor.vpos at the row where the cursor was
12474 } 12990 displayed during the last redisplay cycle. */
12475 12991 && MATRIX_ROW_START_CHARPOS (MATRIX_ROW (matrix, w->cursor.vpos)) <= pt_old
12476 /* If we reached the end of the line, and END was from a string, 12992 && pt_old < MATRIX_ROW_END_CHARPOS (MATRIX_ROW (matrix, w->cursor.vpos)))
12477 the cursor is not on this line. */ 12993 {
12478 if (glyph == end && row->continued_p) 12994 struct glyph *g1 =
12995 MATRIX_ROW_GLYPH_START (matrix, w->cursor.vpos) + w->cursor.hpos;
12996
12997 /* Keep the candidate whose buffer position is the closest to
12998 point. */
12999 if (BUFFERP (g1->object)
13000 && (g1->charpos == pt_old /* an exact match always wins */
13001 || (BUFFERP (glyph->object)
13002 && eabs (g1->charpos - pt_old)
13003 < eabs (glyph->charpos - pt_old))))
13004 return 0;
13005 /* If this candidate gives an exact match, use that. */
13006 if (!(BUFFERP (glyph->object) && glyph->charpos == pt_old)
13007 /* Otherwise, keep the candidate that comes from a row
13008 spanning fewer buffer positions. This may win when one or
13009 both candidate positions are on glyphs that came from
13010 display strings, for which we cannot compare buffer
13011 positions. */
13012 && MATRIX_ROW_END_CHARPOS (MATRIX_ROW (matrix, w->cursor.vpos))
13013 - MATRIX_ROW_START_CHARPOS (MATRIX_ROW (matrix, w->cursor.vpos))
13014 < MATRIX_ROW_END_CHARPOS (row) - MATRIX_ROW_START_CHARPOS (row))
12479 return 0; 13015 return 0;
12480 } 13016 }
12481
12482 w->cursor.hpos = glyph - row->glyphs[TEXT_AREA]; 13017 w->cursor.hpos = glyph - row->glyphs[TEXT_AREA];
12483 w->cursor.x = x; 13018 w->cursor.x = x;
12484 w->cursor.vpos = MATRIX_ROW_VPOS (row, matrix) + dvpos; 13019 w->cursor.vpos = MATRIX_ROW_VPOS (row, matrix) + dvpos;
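The candidate comparison added to set_cursor_from_row above can be read in isolation as a small decision function. The following is a minimal sketch, not the code from xdisp.c: struct cursor_cand and keep_previous_candidate are hypothetical names, and each candidate is reduced to the four values the comparison actually consults.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical summary of one cursor candidate: the glyph under the
   proposed cursor position and the buffer range of its row.  */
struct cursor_cand
{
  int from_buffer;   /* nonzero if the glyph came from buffer text */
  long charpos;      /* buffer position of the glyph, if from_buffer */
  long row_start;    /* start charpos of the candidate's row */
  long row_end;      /* end charpos of the candidate's row */
};

/* Return 1 if the previously found candidate PREV should be kept,
   0 if the new candidate NEXT should replace it.  PT is point.  */
int
keep_previous_candidate (const struct cursor_cand *prev,
                         const struct cursor_cand *next, long pt)
{
  assert (prev != NULL && next != NULL);

  /* An exact match on PREV always wins; otherwise PREV wins if it
     is strictly closer to point than NEXT.  */
  if (prev->from_buffer
      && (prev->charpos == pt
          || (next->from_buffer
              && labs (prev->charpos - pt) < labs (next->charpos - pt))))
    return 1;

  /* NEXT is not an exact match: prefer the row that spans fewer
     buffer positions.  */
  if (!(next->from_buffer && next->charpos == pt)
      && (prev->row_end - prev->row_start
          < next->row_end - next->row_start))
    return 1;

  return 0;
}
```

The second test is the only one that can decide between two candidates that both sit on glyphs from display strings, which is why it compares row spans rather than glyph positions.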
@@ -13025,6 +13560,32 @@ try_cursor_movement (window, startp, scroll_step)
13025 ++row; 13560 ++row;
13026 if (!row->enabled_p) 13561 if (!row->enabled_p)
13027 rc = CURSOR_MOVEMENT_MUST_SCROLL; 13562 rc = CURSOR_MOVEMENT_MUST_SCROLL;
13563 /* If rows are bidi-reordered, back up until we find a row
13564 that does not belong to a continuation line. This is
13565 because we must consider all rows of a continued line as
13566 candidates for cursor positioning, since row start and
13567 end positions change non-linearly with vertical position
13568 in such rows. */
13569 /* FIXME: Revisit this when glyph ``spilling'' in
13570 continuation lines' rows is implemented for
13571 bidi-reordered rows. */
13572 if (!NILP (XBUFFER (w->buffer)->bidi_display_reordering))
13573 {
13574 while (MATRIX_ROW_CONTINUATION_LINE_P (row))
13575 {
13576 xassert (row->enabled_p);
13577 --row;
13578 /* If we hit the beginning of the displayed portion
13579 without finding the first row of a continued
13580 line, give up. */
13581 if (row <= w->current_matrix->rows)
13582 {
13583 rc = CURSOR_MOVEMENT_MUST_SCROLL;
13584 break;
13585 }
13586
13587 }
13588 }
13028 } 13589 }
13029 13590
13030 if (rc == CURSOR_MOVEMENT_CANNOT_BE_USED) 13591 if (rc == CURSOR_MOVEMENT_CANNOT_BE_USED)
@@ -13148,6 +13709,46 @@ try_cursor_movement (window, startp, scroll_step)
13148 } 13709 }
13149 else if (scroll_p) 13710 else if (scroll_p)
13150 rc = CURSOR_MOVEMENT_MUST_SCROLL; 13711 rc = CURSOR_MOVEMENT_MUST_SCROLL;
13712 else if (!NILP (XBUFFER (w->buffer)->bidi_display_reordering))
13713 {
13714 /* With bidi-reordered rows, there could be more than
13715 one candidate row whose start and end positions
13716 occlude point. We need to let set_cursor_from_row
13717 find the best candidate. */
13718 /* FIXME: Revisit this when glyph ``spilling'' in
13719 continuation lines' rows is implemented for
13720 bidi-reordered rows. */
13721 int rv = 0;
13722
13723 do
13724 {
13725 rv |= set_cursor_from_row (w, row, w->current_matrix,
13726 0, 0, 0, 0);
13727 /* As soon as we've found the first suitable row
13728 whose ends_at_zv_p flag is set, we are done. */
13729 if (rv
13730 && MATRIX_ROW (w->current_matrix, w->cursor.vpos)->ends_at_zv_p)
13731 {
13732 rc = CURSOR_MOVEMENT_SUCCESS;
13733 break;
13734 }
13735 ++row;
13736 }
13737 while (MATRIX_ROW_BOTTOM_Y (row) < last_y
13738 && MATRIX_ROW_START_CHARPOS (row) <= PT
13739 && PT <= MATRIX_ROW_END_CHARPOS (row)
13740 && cursor_row_p (w, row));
13741 /* If we didn't find any candidate rows, or exited the
13742 loop before all the candidates were examined, signal
13743 to the caller that this method failed. */
13744 if (rc != CURSOR_MOVEMENT_SUCCESS
13745 && (!rv
13746 || (MATRIX_ROW_START_CHARPOS (row) <= PT
13747 && PT <= MATRIX_ROW_END_CHARPOS (row))))
13748 rc = CURSOR_MOVEMENT_CANNOT_BE_USED;
13749 else
13750 rc = CURSOR_MOVEMENT_SUCCESS;
13751 }
13151 else 13752 else
13152 { 13753 {
13153 do 13754 do
@@ -14474,15 +15075,39 @@ try_window_reusing_current_matrix (w)
14474 { 15075 {
14475 struct glyph *glyph = row->glyphs[TEXT_AREA] + w->cursor.hpos; 15076 struct glyph *glyph = row->glyphs[TEXT_AREA] + w->cursor.hpos;
14476 struct glyph *end = glyph + row->used[TEXT_AREA]; 15077 struct glyph *end = glyph + row->used[TEXT_AREA];
15078 struct glyph *orig_glyph = glyph;
15079 struct cursor_pos orig_cursor = w->cursor;
14477 15080
14478 for (; glyph < end 15081 for (; glyph < end
14479 && (!BUFFERP (glyph->object) 15082 && (!BUFFERP (glyph->object)
14480 || glyph->charpos < PT); 15083 || glyph->charpos != PT);
14481 glyph++) 15084 glyph++)
14482 { 15085 {
14483 w->cursor.hpos++; 15086 w->cursor.hpos++;
14484 w->cursor.x += glyph->pixel_width; 15087 w->cursor.x += glyph->pixel_width;
14485 } 15088 }
15089 /* With bidi reordering, charpos changes non-linearly
15090 with hpos, so the right glyph could be to the
15091 left. */
15092 if (!NILP (XBUFFER (w->buffer)->bidi_display_reordering)
15093 && (!BUFFERP (glyph->object) || glyph->charpos != PT))
15094 {
15095 struct glyph *start_glyph = row->glyphs[TEXT_AREA];
15096
15097 glyph = orig_glyph - 1;
15098 orig_cursor.hpos--;
15099 orig_cursor.x -= glyph->pixel_width;
15100 for (; glyph >= start_glyph
15101 && (!BUFFERP (glyph->object)
15102 || glyph->charpos != PT);
15103 glyph--)
15104 {
15105 w->cursor.hpos--;
15106 w->cursor.x -= glyph->pixel_width;
15107 }
15108 if (BUFFERP (glyph->object) && glyph->charpos == PT)
15109 w->cursor = orig_cursor;
15110 }
14486 } 15111 }
14487 } 15112 }
14488 15113
@@ -14749,6 +15374,8 @@ row_containing_pos (w, charpos, start, end, dy)
14749 int dy; 15374 int dy;
14750{ 15375{
14751 struct glyph_row *row = start; 15376 struct glyph_row *row = start;
15377 struct glyph_row *best_row = NULL;
15378 EMACS_INT mindif = BUF_ZV (XBUFFER (w->buffer)) + 1;
14752 int last_y; 15379 int last_y;
14753 15380
14754 /* If we happen to start on a header-line, skip that. */ 15381 /* If we happen to start on a header-line, skip that. */
@@ -14781,7 +15408,30 @@ row_containing_pos (w, charpos, start, end, dy)
14781 && !row->ends_at_zv_p 15408 && !row->ends_at_zv_p
14782 && !MATRIX_ROW_ENDS_IN_MIDDLE_OF_CHAR_P (row))) 15409 && !MATRIX_ROW_ENDS_IN_MIDDLE_OF_CHAR_P (row)))
14783 && charpos >= MATRIX_ROW_START_CHARPOS (row)) 15410 && charpos >= MATRIX_ROW_START_CHARPOS (row))
14784 return row; 15411 {
15412 struct glyph *g;
15413
15414 if (NILP (XBUFFER (w->buffer)->bidi_display_reordering))
15415 return row;
15416 /* In bidi-reordered rows, there could be several rows
15417 occluding point. We need to find the one which fits
15418 CHARPOS the best. */
15419 for (g = row->glyphs[TEXT_AREA];
15420 g < row->glyphs[TEXT_AREA] + row->used[TEXT_AREA];
15421 g++)
15422 {
15423 if (!STRINGP (g->object))
15424 {
15425 if (g->charpos > 0 && eabs (g->charpos - charpos) < mindif)
15426 {
15427 mindif = eabs (g->charpos - charpos);
15428 best_row = row;
15429 }
15430 }
15431 }
15432 }
15433 else if (best_row)
15434 return best_row;
14785 ++row; 15435 ++row;
14786 } 15436 }
14787} 15437}
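The best-row search added to row_containing_pos above amounts to a nearest-position scan over the glyphs of every candidate row. A minimal sketch under simplifying assumptions: fit_glyph, fit_row and best_row_for_pos are hypothetical stand-ins for the real structures, and the caller is assumed to pass only rows whose start and end positions already bracket CHARPOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced versions of struct glyph and struct glyph_row.  */
struct fit_glyph
{
  long charpos;      /* buffer position, or <= 0 if none */
  int from_string;   /* nonzero if the glyph came from a string */
};

struct fit_row
{
  const struct fit_glyph *glyphs;
  size_t used;
};

/* Return the index of the row that holds the glyph whose buffer
   position is closest to CHARPOS, or -1 if no glyph qualifies.
   ZV is the end of the accessible buffer portion; ZV + 1 serves as
   the initial "worse than anything" distance.  On ties the earlier
   row is kept, because the comparison is strict.  */
int
best_row_for_pos (const struct fit_row *rows, size_t nrows,
                  long charpos, long zv)
{
  long mindif = zv + 1;
  int best = -1;

  assert (rows != NULL || nrows == 0);

  for (size_t r = 0; r < nrows; r++)
    for (size_t i = 0; i < rows[r].used; i++)
      {
        const struct fit_glyph *g = &rows[r].glyphs[i];
        long dif;

        if (g->from_string || g->charpos <= 0)
          continue;
        dif = labs (g->charpos - charpos);
        if (dif < mindif)
          {
            mindif = dif;
            best = (int) r;
          }
      }
  return best;
}
```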
@@ -14926,6 +15576,18 @@ try_window_id (w)
14926 if (!NILP (XBUFFER (w->buffer)->word_wrap)) 15576 if (!NILP (XBUFFER (w->buffer)->word_wrap))
14927 GIVE_UP (21); 15577 GIVE_UP (21);
14928 15578
15579 /* Under bidi reordering, adding or deleting a character at the
15580 beginning of a paragraph, before the first strong directional
15581 character, can change the base direction of the paragraph (unless
15582 the buffer specifies a fixed paragraph direction), which will
15583 require redisplaying the whole paragraph. It might be worthwhile
15584 to find the paragraph limits and widen the range of redisplayed
15585 lines to that, but for now just give up this optimization and
15586 redisplay from scratch. */
15587 if (!NILP (XBUFFER (w->buffer)->bidi_display_reordering)
15588 && NILP (XBUFFER (w->buffer)->bidi_paragraph_direction))
15589 GIVE_UP (22);
15590
14929 /* Make sure beg_unchanged and end_unchanged are up to date. Do it 15591 /* Make sure beg_unchanged and end_unchanged are up to date. Do it
14930 only if buffer has really changed. The reason is that the gap is 15592 only if buffer has really changed. The reason is that the gap is
14931 initially at Z for freshly visited files. The code below would 15593 initially at Z for freshly visited files. The code below would
@@ -16501,6 +17163,8 @@ display_line (it)
16501 int wrap_row_used = -1, wrap_row_ascent, wrap_row_height; 17163 int wrap_row_used = -1, wrap_row_ascent, wrap_row_height;
16502 int wrap_row_phys_ascent, wrap_row_phys_height; 17164 int wrap_row_phys_ascent, wrap_row_phys_height;
16503 int wrap_row_extra_line_spacing; 17165 int wrap_row_extra_line_spacing;
17166 struct display_pos row_end;
17167 int cvpos;
16504 17168
16505 /* We always start displaying at hpos zero even if hscrolled. */ 17169 /* We always start displaying at hpos zero even if hscrolled. */
16506 xassert (it->hpos == 0 && it->current_x == 0); 17170 xassert (it->hpos == 0 && it->current_x == 0);
@@ -16589,6 +17253,11 @@ display_line (it)
16589 17253
16590 it->continuation_lines_width = 0; 17254 it->continuation_lines_width = 0;
16591 row->ends_at_zv_p = 1; 17255 row->ends_at_zv_p = 1;
17256 /* A row that displays right-to-left text must always have
17257 its last face extended all the way to the end of line,
17258 even if this row ends in ZV. */
17259 if (row->reversed_p)
17260 extend_face_to_end_of_line (it);
16592 break; 17261 break;
16593 } 17262 }
16594 17263
@@ -16996,7 +17665,116 @@ display_line (it)
16996 compute_line_metrics (it); 17665 compute_line_metrics (it);
16997 17666
16998 /* Remember the position at which this line ends. */ 17667 /* Remember the position at which this line ends. */
16999 row->end = it->current; 17668 row->end = row_end = it->current;
17669 if (it->bidi_p)
17670 {
17671 /* ROW->start and ROW->end must be the smallest and largest
17672 buffer positions in ROW. But if ROW was bidi-reordered,
17673 these two positions can be anywhere in the row, so we must
17674 rescan all of the ROW's glyphs to find them. */
17675 /* FIXME: Revisit this when glyph ``spilling'' in continuation
17676 lines' rows is implemented for bidi-reordered rows. */
17677 EMACS_INT min_pos = ZV + 1, max_pos = 0;
17678 struct glyph *g;
17679 struct it save_it;
17680 struct text_pos tpos;
17681
17682 for (g = row->glyphs[TEXT_AREA];
17683 g < row->glyphs[TEXT_AREA] + row->used[TEXT_AREA];
17684 g++)
17685 {
17686 if (BUFFERP (g->object))
17687 {
17688 if (g->charpos > 0 && g->charpos < min_pos)
17689 min_pos = g->charpos;
17690 if (g->charpos > max_pos)
17691 max_pos = g->charpos;
17692 }
17693 }
17694 /* Empty lines have a valid buffer position at their first
17695 glyph, but that glyph's OBJECT is zero, as if it didn't come
17696 from a buffer. If we didn't find any valid buffer positions
17697 in this row, maybe we have such an empty line. */
17698 if (min_pos == ZV + 1 && row->used[TEXT_AREA])
17699 {
17700 for (g = row->glyphs[TEXT_AREA];
17701 g < row->glyphs[TEXT_AREA] + row->used[TEXT_AREA];
17702 g++)
17703 {
17704 if (INTEGERP (g->object))
17705 {
17706 if (g->charpos > 0 && g->charpos < min_pos)
17707 min_pos = g->charpos;
17708 if (g->charpos > max_pos)
17709 max_pos = g->charpos;
17710 }
17711 }
17712 }
17713 if (min_pos <= ZV)
17714 {
17715 if (min_pos != row->start.pos.charpos)
17716 {
17717 row->start.pos.charpos = min_pos;
17718 row->start.pos.bytepos = CHAR_TO_BYTE (min_pos);
17719 }
17720 if (max_pos == 0)
17721 max_pos = min_pos;
17722 }
17723 /* For ROW->end, we need the position that is _after_ max_pos,
17724 in the logical order, unless we are at ZV. */
17725 if (row->ends_at_zv_p)
17726 {
17727 row_end = row->end = it->current;
17728 if (!row->used[TEXT_AREA])
17729 {
17730 row->start.pos.charpos = row_end.pos.charpos;
17731 row->start.pos.bytepos = row_end.pos.bytepos;
17732 }
17733 }
17734 else if (row->used[TEXT_AREA] && max_pos)
17735 {
17736 SET_TEXT_POS (tpos, max_pos + 1, CHAR_TO_BYTE (max_pos + 1));
17737 row_end = it->current;
17738 row_end.pos = tpos;
17739 /* If the character at max_pos+1 is a newline, skip that as
17740 well. Note that this may skip some invisible text. */
17741 if (FETCH_CHAR (tpos.bytepos) == '\n'
17742 || (FETCH_CHAR (tpos.bytepos) == '\r' && it->selective))
17743 {
17744 save_it = *it;
17745 it->bidi_p = 0;
17746 reseat_1 (it, tpos, 0);
17747 set_iterator_to_next (it, 1);
17748 /* Record the position after the newline of a continued
17749 row. We will need that to set ROW->end of the last
17750 row produced for a continued line. */
17751 if (row->continued_p)
17752 {
17753 save_it.eol_pos.charpos = IT_CHARPOS (*it);
17754 save_it.eol_pos.bytepos = IT_BYTEPOS (*it);
17755 }
17756 else
17757 {
17758 row_end = it->current;
17759 save_it.eol_pos.charpos = save_it.eol_pos.bytepos = 0;
17760 }
17761 *it = save_it;
17762 }
17763 else if (!row->continued_p
17764 && row->continuation_lines_width
17765 && it->eol_pos.charpos > 0)
17766 {
17767 /* Last row of a continued line. Use the position
17768 recorded in IT->eol_pos, to the effect that the
17769 newline belongs to this row, not to the row which
17770 displays the character with the largest buffer
17771 position. */
17772 row_end.pos = it->eol_pos;
17773 it->eol_pos.charpos = it->eol_pos.bytepos = 0;
17774 }
17775 row->end = row_end;
17776 }
17777 }
17000 17778
17001 /* Record whether this row ends inside an ellipsis. */ 17779 /* Record whether this row ends inside an ellipsis. */
17002 row->ends_in_ellipsis_p 17780 row->ends_in_ellipsis_p
@@ -17015,7 +17793,18 @@ display_line (it)
17015 it->right_user_fringe_face_id = 0; 17793 it->right_user_fringe_face_id = 0;
17016 17794
17017 /* Maybe set the cursor. */ 17795 /* Maybe set the cursor. */
17018 if (it->w->cursor.vpos < 0 17796 cvpos = it->w->cursor.vpos;
17797 if ((cvpos < 0
17798 /* In bidi-reordered rows, keep checking for proper cursor
17799 position even if one has been found already, because buffer
17800 positions in such rows change non-linearly with ROW->VPOS,
17801 when a line is continued. One exception: when we are at ZV,
17802 display cursor on the first suitable glyph row, since all
17803 the empty rows after that also have their position set to ZV. */
17804 /* FIXME: Revisit this when glyph ``spilling'' in continuation
17805 lines' rows is implemented for bidi-reordered rows. */
17806 || (it->bidi_p
17807 && !MATRIX_ROW (it->w->desired_matrix, cvpos)->ends_at_zv_p))
17019 && PT >= MATRIX_ROW_START_CHARPOS (row) 17808 && PT >= MATRIX_ROW_START_CHARPOS (row)
17020 && PT <= MATRIX_ROW_END_CHARPOS (row) 17809 && PT <= MATRIX_ROW_END_CHARPOS (row)
17021 && cursor_row_p (it->w, row)) 17810 && cursor_row_p (it->w, row))
@@ -17033,7 +17822,11 @@ display_line (it)
17033 it->current_y += row->height; 17822 it->current_y += row->height;
17034 ++it->vpos; 17823 ++it->vpos;
17035 ++it->glyph_row; 17824 ++it->glyph_row;
17036 it->start = it->current; 17825 /* The next row should use the same value of the reversed_p flag as this
17826 one. set_iterator_to_next decides when it's a new paragraph, and
17827 PRODUCE_GLYPHS recomputes the value of the flag accordingly. */
17828 it->glyph_row->reversed_p = row->reversed_p;
17829 it->start = row_end;
17037 return row->displays_text_p; 17830 return row->displays_text_p;
17038} 17831}
17039 17832
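The rescan that display_line performs above to recover ROW->start and ROW->end from a reordered row is a plain minimum/maximum pass over the glyphs. A minimal sketch, assuming a reduced glyph type (span_glyph and row_pos_range are hypothetical names) and covering only the first pass over glyphs that come from buffer text:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced glyph: only what the scan needs.  */
struct span_glyph
{
  long charpos;      /* buffer position of the glyph */
  int from_buffer;   /* nonzero if the glyph's object is the buffer */
};

/* Find the smallest and largest buffer positions among GLYPHS.
   Return 1 and store them in *MIN_OUT and *MAX_OUT if at least one
   valid position was found; return 0 and leave the outputs
   untouched otherwise.  ZV + 1 is the "nothing found" sentinel,
   as in the hunk above.  */
int
row_pos_range (const struct span_glyph *glyphs, size_t used, long zv,
               long *min_out, long *max_out)
{
  long min_pos = zv + 1, max_pos = 0;

  assert (min_out != NULL && max_out != NULL);

  for (size_t i = 0; i < used; i++)
    {
      if (!glyphs[i].from_buffer)
        continue;
      if (glyphs[i].charpos > 0 && glyphs[i].charpos < min_pos)
        min_pos = glyphs[i].charpos;
      if (glyphs[i].charpos > max_pos)
        max_pos = glyphs[i].charpos;
    }
  if (min_pos > zv)
    return 0;
  *min_out = min_pos;
  *max_out = max_pos;
  return 1;
}
```

In a left-to-right row the two results are simply the positions of the first and last glyphs; the scan is needed because a reordered row can hold its smallest or largest position anywhere.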
@@ -20562,6 +21355,17 @@ append_glyph (it)
20562 glyph = it->glyph_row->glyphs[area] + it->glyph_row->used[area]; 21355 glyph = it->glyph_row->glyphs[area] + it->glyph_row->used[area];
20563 if (glyph < it->glyph_row->glyphs[area + 1]) 21356 if (glyph < it->glyph_row->glyphs[area + 1])
20564 { 21357 {
21358 /* If the glyph row is reversed, we need to prepend the glyph
21359 rather than append it. */
21360 if (it->glyph_row->reversed_p && area == TEXT_AREA)
21361 {
21362 struct glyph *g;
21363
21364 /* Make room for the additional glyph. */
21365 for (g = glyph - 1; g >= it->glyph_row->glyphs[area]; g--)
21366 g[1] = *g;
21367 glyph = it->glyph_row->glyphs[area];
21368 }
20565 glyph->charpos = CHARPOS (it->position); 21369 glyph->charpos = CHARPOS (it->position);
20566 glyph->object = it->object; 21370 glyph->object = it->object;
20567 if (it->pixel_width > 0) 21371 if (it->pixel_width > 0)
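The prepend step that append_glyph takes for reversed rows shifts every existing glyph up by one slot and then writes the new glyph at index 0. A minimal sketch of that technique on a plain int array; prepend_int is a hypothetical helper, not a function from xdisp.c:

```c
#include <assert.h>
#include <stddef.h>

/* Insert VALUE at the front of ARR, which currently holds *USED
   elements and has room for CAPACITY.  Return 1 on success, or 0
   if the array is already full.  */
int
prepend_int (int *arr, size_t *used, size_t capacity, int value)
{
  assert (arr != NULL && used != NULL);

  if (*used >= capacity)
    return 0;
  /* Shift from the back so that no element is overwritten before
     it has been copied.  */
  for (size_t i = *used; i > 0; i--)
    arr[i] = arr[i - 1];
  arr[0] = value;
  ++*used;
  return 1;
}
```

Each prepend costs time proportional to the number of elements already stored, so building a row of n glyphs this way is quadratic in n. The hunk above accepts that cost for reversed rows.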
@@ -20591,6 +21395,18 @@ append_glyph (it)
20591 glyph->u.ch = it->char_to_display; 21395 glyph->u.ch = it->char_to_display;
20592 glyph->slice = null_glyph_slice; 21396 glyph->slice = null_glyph_slice;
20593 glyph->font_type = FONT_TYPE_UNKNOWN; 21397 glyph->font_type = FONT_TYPE_UNKNOWN;
21398 if (it->bidi_p)
21399 {
21400 glyph->resolved_level = it->bidi_it.resolved_level;
21401 if ((it->bidi_it.type & 7) != it->bidi_it.type)
21402 abort ();
21403 glyph->bidi_type = it->bidi_it.type;
21404 }
21405 else
21406 {
21407 glyph->resolved_level = 0;
21408 glyph->bidi_type = UNKNOWN_BT;
21409 }
20594 ++it->glyph_row->used[area]; 21410 ++it->glyph_row->used[area];
20595 } 21411 }
20596 else 21412 else
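The check `(it->bidi_it.type & 7) != it->bidi_it.type` above aborts when the bidi type does not survive masking to its low three bits. The mask suggests that the glyph's bidi_type member is three bits wide, but that is an inference; the field's declaration is not part of this hunk. The general form of the test, as a minimal sketch with a hypothetical helper name:

```c
#include <assert.h>

/* Return 1 if VALUE can be stored in an unsigned field of NBITS
   bits without losing information, 0 otherwise.  NBITS must be
   less than 32 so that the shift is well defined.  */
int
fits_in_bits (unsigned value, unsigned nbits)
{
  assert (nbits < 32);
  return (value & ((1u << nbits) - 1u)) == value;
}
```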
@@ -20643,6 +21459,13 @@ append_composite_glyph (it)
20643 glyph->face_id = it->face_id; 21459 glyph->face_id = it->face_id;
20644 glyph->slice = null_glyph_slice; 21460 glyph->slice = null_glyph_slice;
20645 glyph->font_type = FONT_TYPE_UNKNOWN; 21461 glyph->font_type = FONT_TYPE_UNKNOWN;
21462 if (it->bidi_p)
21463 {
21464 glyph->resolved_level = it->bidi_it.resolved_level;
21465 if ((it->bidi_it.type & 7) != it->bidi_it.type)
21466 abort ();
21467 glyph->bidi_type = it->bidi_it.type;
21468 }
20646 ++it->glyph_row->used[area]; 21469 ++it->glyph_row->used[area];
20647 } 21470 }
20648 else 21471 else
@@ -20817,6 +21640,13 @@ produce_image_glyph (it)
20817 glyph->u.img_id = img->id; 21640 glyph->u.img_id = img->id;
20818 glyph->slice = slice; 21641 glyph->slice = slice;
20819 glyph->font_type = FONT_TYPE_UNKNOWN; 21642 glyph->font_type = FONT_TYPE_UNKNOWN;
21643 if (it->bidi_p)
21644 {
21645 glyph->resolved_level = it->bidi_it.resolved_level;
21646 if ((it->bidi_it.type & 7) != it->bidi_it.type)
21647 abort ();
21648 glyph->bidi_type = it->bidi_it.type;
21649 }
20820 ++it->glyph_row->used[area]; 21650 ++it->glyph_row->used[area];
20821 } 21651 }
20822 else 21652 else
@@ -20863,6 +21693,13 @@ append_stretch_glyph (it, object, width, height, ascent)
20863 glyph->u.stretch.height = height; 21693 glyph->u.stretch.height = height;
20864 glyph->slice = null_glyph_slice; 21694 glyph->slice = null_glyph_slice;
20865 glyph->font_type = FONT_TYPE_UNKNOWN; 21695 glyph->font_type = FONT_TYPE_UNKNOWN;
21696 if (it->bidi_p)
21697 {
21698 glyph->resolved_level = it->bidi_it.resolved_level;
21699 if ((it->bidi_it.type & 7) != it->bidi_it.type)
21700 abort ();
21701 glyph->bidi_type = it->bidi_it.type;
21702 }
20866 ++it->glyph_row->used[area]; 21703 ++it->glyph_row->used[area];
20867 } 21704 }
20868 else 21705 else
@@ -23040,7 +23877,7 @@ mouse_face_from_buffer_pos (Lisp_Object window,
23040 associated with the end position, which must not be 23877 associated with the end position, which must not be
23041 highlighted. */ 23878 highlighted. */
23042 Lisp_Object prev_object; 23879 Lisp_Object prev_object;
23043 int pos; 23880 EMACS_INT pos;
23044 23881
23045 while (glyph > row->glyphs[TEXT_AREA]) 23882 while (glyph > row->glyphs[TEXT_AREA])
23046 { 23883 {
@@ -23672,7 +24509,8 @@ note_mouse_highlight (f, x, y)
23672 && XFASTINT (w->last_modified) == BUF_MODIFF (b) 24509 && XFASTINT (w->last_modified) == BUF_MODIFF (b)
23673 && XFASTINT (w->last_overlay_modified) == BUF_OVERLAY_MODIFF (b)) 24510 && XFASTINT (w->last_overlay_modified) == BUF_OVERLAY_MODIFF (b))
23674 { 24511 {
23675 int hpos, vpos, pos, i, dx, dy, area; 24512 int hpos, vpos, i, dx, dy, area;
24513 EMACS_INT pos;
23676 struct glyph *glyph; 24514 struct glyph *glyph;
23677 Lisp_Object object; 24515 Lisp_Object object;
23678 Lisp_Object mouse_face = Qnil, overlay = Qnil, position; 24516 Lisp_Object mouse_face = Qnil, overlay = Qnil, position;
@@ -23960,7 +24798,7 @@ note_mouse_highlight (f, x, y)
23960 struct glyph_row *r 24798 struct glyph_row *r
23961 = MATRIX_ROW (w->current_matrix, vpos); 24799 = MATRIX_ROW (w->current_matrix, vpos);
23962 int start = MATRIX_ROW_START_CHARPOS (r); 24800 int start = MATRIX_ROW_START_CHARPOS (r);
23963 int pos = string_buffer_position (w, object, start); 24801 EMACS_INT pos = string_buffer_position (w, object, start);
23964 if (pos > 0) 24802 if (pos > 0)
23965 { 24803 {
23966 help = Fget_char_property (make_number (pos), 24804 help = Fget_char_property (make_number (pos),
@@ -24015,7 +24853,8 @@ note_mouse_highlight (f, x, y)
24015 struct glyph_row *r 24853 struct glyph_row *r
24016 = MATRIX_ROW (w->current_matrix, vpos); 24854 = MATRIX_ROW (w->current_matrix, vpos);
24017 int start = MATRIX_ROW_START_CHARPOS (r); 24855 int start = MATRIX_ROW_START_CHARPOS (r);
24018 int pos = string_buffer_position (w, object, start); 24856 EMACS_INT pos = string_buffer_position (w, object,
24857 start);
24019 if (pos > 0) 24858 if (pos > 0)
24020 pointer = Fget_char_property (make_number (pos), 24859 pointer = Fget_char_property (make_number (pos),
24021 Qpointer, w->buffer); 24860 Qpointer, w->buffer);
@@ -24824,6 +25663,11 @@ syms_of_xdisp ()
24824 staticpro (&previous_help_echo_string); 25663 staticpro (&previous_help_echo_string);
24825 help_echo_pos = -1; 25664 help_echo_pos = -1;
24826 25665
25666 Qright_to_left = intern ("right-to-left");
25667 staticpro (&Qright_to_left);
25668 Qleft_to_right = intern ("left-to-right");
25669 staticpro (&Qleft_to_right);
25670
24827#ifdef HAVE_WINDOW_SYSTEM 25671#ifdef HAVE_WINDOW_SYSTEM
24828 DEFVAR_BOOL ("x-stretch-cursor", &x_stretch_cursor_p, 25672 DEFVAR_BOOL ("x-stretch-cursor", &x_stretch_cursor_p,
24829 doc: /* *Non-nil means draw block cursor as wide as the glyph under it. 25673 doc: /* *Non-nil means draw block cursor as wide as the glyph under it.