| author | Eli Zaretskii | 2011-08-18 13:53:55 +0300 |
|---|---|---|
| committer | Eli Zaretskii | 2011-08-18 13:53:55 +0300 |
| commit | c094bb0cf7eee9defdd76b8432dcbc24a7c6856d (patch) | |
| tree | 924c51e3b5751afa43cad773694e6568e55c9e4c | |
| parent | 4dcb0d7a58bff52c1155fd93c03dcab4567038f2 (diff) | |
Improve documentation of bidi in ELisp manual.
doc/lispref/nonascii.texi (Character Properties): Document use of
`bidi-class' and `mirroring' properties as part of reordering.
Provide cross-references to "Bidirectional Display".
doc/lispref/display.texi (Bidirectional Display): Document the pitfalls of
concatenating strings with bidirectional content, with possible
solutions. Document string-mark-left-to-right. Mention paragraph
direction in modes that inherit from prog-mode. Document use of
`bidi-class' and `mirroring' properties as part of reordering.
etc/NEWS: Mark string-mark-left-to-right as documented.
| -rw-r--r-- | doc/lispref/ChangeLog | 12 |
| -rw-r--r-- | doc/lispref/display.texi | 104 |
| -rw-r--r-- | doc/lispref/nonascii.texi | 7 |
| -rw-r--r-- | etc/NEWS | 1 |
4 files changed, 107 insertions, 17 deletions
diff --git a/doc/lispref/ChangeLog b/doc/lispref/ChangeLog index 56175a34eee..03a20ba5830 100644 --- a/doc/lispref/ChangeLog +++ b/doc/lispref/ChangeLog | |||
| @@ -1,3 +1,15 @@ | |||
| 1 | 2011-08-18 Eli Zaretskii <eliz@gnu.org> | ||
| 2 | |||
| 3 | * nonascii.texi (Character Properties): Document use of | ||
| 4 | `bidi-class' and `mirroring' properties as part of reordering. | ||
| 5 | Provide cross-references to "Bidirectional Display". | ||
| 6 | |||
| 7 | * display.texi (Bidirectional Display): Document the pitfalls of | ||
| 8 | concatenating strings with bidirectional content, with possible | ||
| 9 | solutions. Document string-mark-left-to-right. Mention paragraph | ||
| 10 | direction in modes that inherit from prog-mode. Document use of | ||
| 11 | `bidi-class' and `mirroring' properties as part of reordering. | ||
| 12 | |||
| 1 | 2011-08-16 Eli Zaretskii <eliz@gnu.org> | 13 | 2011-08-16 Eli Zaretskii <eliz@gnu.org> |
| 2 | 14 | ||
| 3 | * modes.texi (Major Mode Conventions): Improve the documentation | 15 | * modes.texi (Major Mode Conventions): Improve the documentation |
diff --git a/doc/lispref/display.texi b/doc/lispref/display.texi index 64a9054f596..7e7851452d8 100644 --- a/doc/lispref/display.texi +++ b/doc/lispref/display.texi | |||
| @@ -5992,6 +5992,7 @@ left-to-right and right-to-left characters. | |||
| 5992 | for editing and displaying bidirectional text. | 5992 | for editing and displaying bidirectional text. |
| 5993 | 5993 | ||
| 5994 | @cindex logical order | 5994 | @cindex logical order |
| 5995 | @cindex reading order | ||
| 5995 | @cindex visual order | 5996 | @cindex visual order |
| 5996 | @cindex unicode bidirectional algorithm | 5997 | @cindex unicode bidirectional algorithm |
| 5997 | Emacs stores right-to-left and bidirectional text in the so-called | 5998 | Emacs stores right-to-left and bidirectional text in the so-called |
| @@ -6006,17 +6007,16 @@ for display. Reordering of bidirectional text for display in Emacs is | |||
| 6006 | a ``Full bidirectionality'' class implementation of the @acronym{UBA}. | 6007 | a ``Full bidirectionality'' class implementation of the @acronym{UBA}. |
| 6007 | 6008 | ||
| 6008 | @defvar bidi-display-reordering | 6009 | @defvar bidi-display-reordering |
| 6009 | The buffer-local variable @code{bidi-display-reordering} controls | 6010 | This buffer-local variable controls whether text in the buffer is |
| 6010 | whether text in the buffer is reordered for display. If its value is | 6011 | reordered for display. If its value is non-@code{nil}, Emacs reorders |
| 6011 | non-@code{nil}, Emacs reorders characters that have right-to-left | 6012 | characters that have right-to-left directionality when they are |
| 6012 | directionality when they are displayed. The default value is | 6013 | displayed. The default value is @code{t}. Text in overlay strings |
| 6013 | @code{t}. Text in overlay strings (@pxref{Overlay | 6014 | (@pxref{Overlay Properties,,before-string}), display strings |
| 6014 | Properties,,before-string}), display strings (@pxref{Overlay | 6015 | (@pxref{Overlay Properties,,display}), and @code{display} text |
| 6015 | Properties,,display}), and @code{display} text properties | 6016 | properties (@pxref{Display Property}) is also reordered for display if |
| 6016 | (@pxref{Display Property}) is also reordered if the buffer whose text | 6017 | the buffer whose text includes these strings is reordered. Turning |
| 6017 | includes these strings is reordered for display. Turning off | 6018 | off @code{bidi-display-reordering} for a buffer turns off reordering |
| 6018 | @code{bidi-display-reordering} for a buffer turns off reordering of | 6019 | of all the overlay and display strings in that buffer. |
| 6019 | all the overlay and display strings in that buffer. | ||
| 6020 | 6020 | ||
| 6021 | Reordering of strings that are unrelated to any buffer, such as text | 6021 | Reordering of strings that are unrelated to any buffer, such as text |
| 6022 | displayed on the mode line (@pxref{Mode Line Format}) or header line | 6022 | displayed on the mode line (@pxref{Mode Line Format}) or header line |
| @@ -6056,7 +6056,7 @@ it is reordered for display. That is, the entire chunk of text | |||
| 6056 | covered by these properties is reordered together. Moreover, the | 6056 | covered by these properties is reordered together. Moreover, the |
| 6057 | bidirectional properties of the characters in this chunk of text are | 6057 | bidirectional properties of the characters in this chunk of text are |
| 6058 | ignored, and Emacs reorders them as if they were replaced with a | 6058 | ignored, and Emacs reorders them as if they were replaced with a |
| 6059 | single character @code{u+FFFC}, known as the @dfn{Object Replacement | 6059 | single character @code{U+FFFC}, known as the @dfn{Object Replacement |
| 6060 | Character}. This means that placing a display property over a portion | 6060 | Character}. This means that placing a display property over a portion |
| 6061 | of text may change the way that the surrounding text is reordered for | 6061 | of text may change the way that the surrounding text is reordered for |
| 6062 | display. To prevent this unexpected effect, always place such | 6062 | display. To prevent this unexpected effect, always place such |
| @@ -6073,9 +6073,9 @@ begins at the right margin and is continued or truncated at the left | |||
| 6073 | margin. | 6073 | margin. |
| 6074 | 6074 | ||
| 6075 | @defvar bidi-paragraph-direction | 6075 | @defvar bidi-paragraph-direction |
| 6076 | Emacs determines the base direction of each paragraph dynamically, | 6076 | By default, Emacs determines the base direction of each paragraph |
| 6077 | based on the text at the beginning of the paragraph. The precise | 6077 | dynamically, based on the text at the beginning of the paragraph. The |
| 6078 | method of determining the base direction is specified by the | 6078 | precise method of determining the base direction is specified by the |
| 6079 | @acronym{UBA}; in a nutshell, the first character in a paragraph that | 6079 | @acronym{UBA}; in a nutshell, the first character in a paragraph that |
| 6080 | has an explicit directionality determines the base direction of the | 6080 | has an explicit directionality determines the base direction of the |
| 6081 | paragraph. However, sometimes a buffer may need to force a certain | 6081 | paragraph. However, sometimes a buffer may need to force a certain |
| @@ -6087,6 +6087,13 @@ dynamic determination of the base direction, and instead forces all | |||
| 6087 | paragraphs in the buffer to have the direction specified by its | 6087 | paragraphs in the buffer to have the direction specified by its |
| 6088 | buffer-local value. The value can be either @code{right-to-left} or | 6088 | buffer-local value. The value can be either @code{right-to-left} or |
| 6089 | @code{left-to-right}. Any other value is interpreted as @code{nil}. | 6089 | @code{left-to-right}. Any other value is interpreted as @code{nil}. |
| 6090 | The default is @code{nil}. | ||
| 6091 | |||
| 6092 | @cindex @code{prog-mode}, and @code{bidi-paragraph-direction} | ||
| 6093 | Modes that are meant to display program source code should force a | ||
| 6094 | @code{left-to-right} paragraph direction. The easiest way of doing so | ||
| 6095 | is to derive the mode from Prog Mode, which already sets | ||
| 6096 | @code{bidi-paragraph-direction} to that value. | ||
| 6090 | @end defvar | 6097 | @end defvar |
| 6091 | 6098 | ||
| 6092 | @defun current-bidi-paragraph-direction &optional buffer | 6099 | @defun current-bidi-paragraph-direction &optional buffer |
| @@ -6099,3 +6106,70 @@ non-@code{nil}, the returned value will be identical to that value; | |||
| 6099 | otherwise, the returned value reflects the paragraph direction | 6106 | otherwise, the returned value reflects the paragraph direction |
| 6100 | determined dynamically by Emacs. | 6107 | determined dynamically by Emacs. |
| 6101 | @end defun | 6108 | @end defun |
| 6109 | |||
| 6110 | @cindex layout on display, and bidirectional text | ||
| 6111 | @cindex jumbled display of bidirectional text | ||
| 6112 | @cindex concatenating bidirectional strings | ||
| 6113 | Reordering of bidirectional text for display can have surprising and | ||
| 6114 | unpleasant effects when two strings with bidirectional content are | ||
| 6115 | juxtaposed in a buffer, or otherwise programmatically concatenated | ||
| 6116 | into a string of text. A typical example is a buffer whose lines are | ||
| 6117 | actually sequences of items, or fields, separated by whitespace or | ||
| 6118 | punctuation characters. This is used in specialized modes such as | ||
| 6119 | Buffer-menu Mode or various email summary modes, like Rmail Summary | ||
| 6120 | Mode. Because these separator characters are @dfn{weak}, i.e.@: have | ||
| 6121 | no strong directionality, they take on the directionality of | ||
| 6122 | surrounding text. As a result, a numeric field that follows a field | ||
| 6123 | with bidirectional content can be displayed @emph{to the left} of the | ||
| 6124 | preceding field, producing a jumbled display and messing up the | ||
| 6125 | expected layout. | ||
| 6126 | |||
| 6127 | To countermand this, you can use one of the following techniques for | ||
| 6128 | forcing correct order of fields on display: | ||
| 6129 | |||
| 6130 | @itemize @minus | ||
| 6131 | @item | ||
| 6132 | Append the special character @code{U+200E}, LEFT-TO-RIGHT MARK, or | ||
| 6133 | @acronym{LRM}, to the end of each field that may have bidirectional | ||
| 6134 | content, or prepend it to the beginning of the following field. The | ||
| 6135 | function @code{string-mark-left-to-right}, described below, comes in | ||
| 6136 | handy for this purpose. (In a right-to-left paragraph, use | ||
| 6137 | @code{U+200F}, RIGHT-TO-LEFT MARK, or @acronym{RLM}, instead.) This | ||
| 6138 | is one of the solutions recommended by | ||
| 6139 | @uref{http://www.unicode.org/reports/tr9/#Separators, the | ||
| 6140 | @acronym{UBA}}. | ||
| 6141 | |||
| 6142 | @item | ||
| 6143 | Include the tab character in the field separator. The tab character | ||
| 6144 | plays the role of @dfn{segment separator} in the @acronym{UBA} | ||
| 6145 | reordering, whose effect is to make each field a separate segment, and | ||
| 6146 | thus reorder them separately. | ||
| 6147 | @end itemize | ||
| 6148 | |||
| 6149 | @defun string-mark-left-to-right string | ||
| 6150 | This subroutine returns its argument @var{string}, possibly modified, | ||
| 6151 | such that the result can be safely concatenated with another string, | ||
| 6152 | or juxtaposed with another string in a buffer, without disrupting the | ||
| 6153 | relative layout of this string and the next one on display. If the | ||
| 6154 | string returned by this function is displayed as part of a | ||
| 6155 | left-to-right paragraph, it will always appear on display to the left | ||
| 6156 | of the text that follows it. The function works by examining the | ||
| 6157 | characters of its argument, and if any of those characters could cause | ||
| 6158 | reordering on display, the function appends the @acronym{LRM} | ||
| 6159 | character to the string. The appended @acronym{LRM} character is made | ||
| 6160 | @emph{invisible} (@pxref{Invisible Text}), to hide it on display. | ||
| 6161 | @end defun | ||
| 6162 | |||
| 6163 | The reordering algorithm uses the bidirectional properties of the | ||
| 6164 | characters stored as their @code{bidi-class} property | ||
| 6165 | (@pxref{Character Properties}). Lisp programs can change these | ||
| 6166 | properties by calling the @code{put-char-code-property} function. | ||
| 6167 | However, doing this requires a thorough understanding of the | ||
| 6168 | @acronym{UBA}, and is therefore not recommended. Any changes to the | ||
| 6169 | bidirectional properties of a character have global effect: they | ||
| 6170 | affect all Emacs frames and windows. | ||
| 6171 | |||
| 6172 | Similarly, the @code{mirroring} property is used to display the | ||
| 6173 | appropriate mirrored character in the reordered text. Lisp programs | ||
| 6174 | can affect the mirrored display by changing this property. Again, any | ||
| 6175 | such changes affect all of Emacs display. | ||
diff --git a/doc/lispref/nonascii.texi b/doc/lispref/nonascii.texi index 83f9f424834..7b6d665b2ac 100644 --- a/doc/lispref/nonascii.texi +++ b/doc/lispref/nonascii.texi | |||
| @@ -392,7 +392,8 @@ The value is an integer number. | |||
| 392 | @item bidi-class | 392 | @item bidi-class |
| 393 | Corresponds to the Unicode @code{Bidi_Class} property. The value is a | 393 | Corresponds to the Unicode @code{Bidi_Class} property. The value is a |
| 394 | symbol whose name is the Unicode @dfn{directional type} of the | 394 | symbol whose name is the Unicode @dfn{directional type} of the |
| 395 | character. | 395 | character. Emacs uses this property when it reorders bidirectional |
| 396 | text for display (@pxref{Bidirectional Display}). | ||
| 396 | 397 | ||
| 397 | @item decomposition | 398 | @item decomposition |
| 398 | Corresponds to the Unicode @code{Decomposition_Type} and | 399 | Corresponds to the Unicode @code{Decomposition_Type} and |
| @@ -440,7 +441,9 @@ defined mirroring glyph. All the characters whose @code{mirrored} | |||
| 440 | property is @code{N} have @code{nil} as their @code{mirroring} | 441 | property is @code{N} have @code{nil} as their @code{mirroring} |
| 441 | property; however, some characters whose @code{mirrored} property is | 442 | property; however, some characters whose @code{mirrored} property is |
| 442 | @code{Y} also have @code{nil} for @code{mirroring}, because no | 443 | @code{Y} also have @code{nil} for @code{mirroring}, because no |
| 443 | appropriate characters exist with mirrored glyphs. | 444 | appropriate characters exist with mirrored glyphs. Emacs uses this |
| 445 | property to display mirror images of characters when appropriate | ||
| 446 | (@pxref{Bidirectional Display}). | ||
| 444 | 447 | ||
| 445 | @item old-name | 448 | @item old-name |
| 446 | Corresponds to the Unicode @code{Unicode_1_Name} property. The value | 449 | Corresponds to the Unicode @code{Unicode_1_Name} property. The value |
diff --git a/etc/NEWS b/etc/NEWS --- a/etc/NEWS +++ b/etc/NEWS | |||
| @@ -1043,6 +1043,7 @@ of function value which looks like (closure ENV ARGS &rest BODY). | |||
| 1043 | *** New function `special-variable-p' to check whether a variable is | 1043 | *** New function `special-variable-p' to check whether a variable is |
| 1044 | declared as dynamically bound. | 1044 | declared as dynamically bound. |
| 1045 | 1045 | ||
| 1046 | +++ | ||
| 1046 | ** New function `string-mark-left-to-right'. | 1047 | ** New function `string-mark-left-to-right'. |
| 1047 | Given a string containing right-to-left (RTL) script, this function | 1048 | Given a string containing right-to-left (RTL) script, this function |
| 1048 | returns another string with a terminating LRM (left-to-right mark) | 1049 | returns another string with a terminating LRM (left-to-right mark) |
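
Illustration (not part of the commit): the new display.texi text says that `string-mark-left-to-right` examines the characters of its argument and appends an LRM (U+200E) if any of them could cause reordering, and that a tab acts as a segment separator. The Python sketch below mimics that idea outside Emacs, using the standard `unicodedata` module. The names `mark_left_to_right`, `format_row`, and `RTL_CLASSES` are hypothetical, the set of bidi classes treated as "could cause reordering" is an assumption, and the sketch does not make the appended LRM invisible as the documented Emacs function does.

```python
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK
RLM = "\u200f"  # U+200F RIGHT-TO-LEFT MARK (for right-to-left paragraphs)

# Assumption: these bidi classes are the ones that "could cause reordering".
# The actual Emacs function may use a different test.
RTL_CLASSES = {"R", "AL", "RLE", "RLO"}


def mark_left_to_right(field: str) -> str:
    """Return FIELD, with LRM appended if it contains right-to-left characters."""
    if any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in field):
        return field + LRM
    return field


def format_row(fields, separator=" "):
    """Join FIELDS so that each one stays to the left of the next on display."""
    return separator.join(mark_left_to_right(f) for f in fields)


if __name__ == "__main__":
    # A Hebrew field followed by a numeric field, as in a summary buffer.
    row = format_row(["\u05e9\u05dc\u05d5\u05dd", "42"])
    print([hex(ord(ch)) for ch in row])
    # The separators discussed in the text have these bidi classes:
    print(unicodedata.bidirectional(" "))   # whitespace (weak/neutral)
    print(unicodedata.bidirectional("\t"))  # segment separator
```

Appending the mark only to fields that contain right-to-left characters leaves pure left-to-right fields unchanged. The alternative from the itemized list, including a tab in the separator, corresponds to calling `format_row(fields, separator="\t")` here.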