Diffstat (limited to 'doc')
| -rw-r--r-- | doc/emacs/ChangeLog | 4 | ||||
| -rw-r--r-- | doc/emacs/nxml-mode.texi | 834 |
2 files changed, 838 insertions, 0 deletions
diff --git a/doc/emacs/ChangeLog b/doc/emacs/ChangeLog
index 02d9e85ac50..634782b4419 100644
--- a/doc/emacs/ChangeLog
+++ b/doc/emacs/ChangeLog
| @@ -1,3 +1,7 @@ | |||
| 1 | 2007-11-23 Mark A. Hershberger <mah@everybody.org> | ||
| 2 | |||
| 3 | * nxml-mode.texi: Initial merge of nxml. | ||
| 4 | |||
| 1 | 2007-11-18 Richard Stallman <rms@gnu.org> | 5 | 2007-11-18 Richard Stallman <rms@gnu.org> |
| 2 | 6 | ||
| 3 | * flymake.texi (Example -- Configuring a tool called directly): | 7 | * flymake.texi (Example -- Configuring a tool called directly): |
diff --git a/doc/emacs/nxml-mode.texi b/doc/emacs/nxml-mode.texi
new file mode 100644
index 00000000000..e2ab3fdbd58
--- /dev/null
+++ b/doc/emacs/nxml-mode.texi
| @@ -0,0 +1,834 @@ | |||
| 1 | \input texinfo @c -*- texinfo -*- | ||
| 2 | @c %**start of header | ||
| 3 | @setfilename nxml-mode.info | ||
| 4 | @settitle nXML Mode | ||
| 5 | @c %**end of header | ||
| 6 | |||
| 7 | @dircategory Emacs | ||
| 8 | @direntry | ||
| 9 | * nXML Mode: (nxml-mode.info). | ||
| 10 | @end direntry | ||
| 11 | |||
| 12 | @node Top | ||
| 13 | @top nXML Mode | ||
| 14 | |||
| 15 | This manual documents nxml-mode, an Emacs major mode for editing | ||
| 16 | XML with RELAX NG support. This manual is not yet complete. | ||
| 17 | |||
| 18 | @menu | ||
| 19 | * Completion:: | ||
| 20 | * Inserting end-tags:: | ||
| 21 | * Paragraphs:: | ||
| 22 | * Outlining:: | ||
| 23 | * Locating a schema:: | ||
| 24 | * DTDs:: | ||
| 25 | * Limitations:: | ||
| 26 | @end menu | ||
| 27 | |||
| 28 | @node Completion | ||
| 29 | @chapter Completion | ||
| 30 | |||
| 31 | Apart from real-time validation, the most important feature that | ||
| 32 | nxml-mode provides for assisting in document creation is "completion". | ||
| 33 | Completion assists the user in inserting characters at point, based on | ||
| 34 | knowledge of the schema and on the contents of the buffer before | ||
| 35 | point. | ||
| 36 | |||
| 37 | The traditional GNU Emacs key combination for completion in a | ||
| 38 | buffer is @kbd{M-@key{TAB}}. However, many window systems | ||
| 39 | and window managers use this key combination themselves (typically for | ||
| 40 | switching between windows) and do not pass it to applications. It's | ||
| 41 | hard to find key combinations in GNU Emacs that are both easy to type | ||
| 42 | and not taken by something else. @kbd{C-@key{RET}} (i.e. | ||
| 43 | pressing the Enter or Return key, while the Ctrl key is held down) is | ||
| 44 | available. It won't be available on a traditional terminal (because | ||
| 45 | it is indistinguishable from Return), but it will work with a window | ||
| 46 | system. Therefore we adopt the following solution by default: use | ||
| 47 | @kbd{C-@key{RET}} when there's a window system and | ||
| 48 | @kbd{M-@key{TAB}} when there's not. In the following, I | ||
| 49 | will assume that a window system is being used and will therefore | ||
| 50 | refer to @kbd{C-@key{RET}}. | ||
| 51 | |||
| 52 | Completion works by examining the symbol preceding point. This | ||
| 53 | is the symbol to be completed. The symbol to be completed may be | ||
| 54 | empty. Completion considers what symbols starting with the symbol to | ||
| 55 | be completed would be valid replacements for the symbol to be | ||
| 56 | completed, given the schema and the contents of the buffer before | ||
| 57 | point. These symbols are the possible completions. An example may | ||
| 58 | make this clearer. Suppose the buffer looks like this (where @point{} | ||
| 59 | indicates point): | ||
| 60 | |||
| 61 | @example | ||
| 62 | <html xmlns="http://www.w3.org/1999/xhtml"> | ||
| 63 | <h@point{} | ||
| 64 | @end example | ||
| 65 | |||
| 66 | @noindent | ||
| 67 | and the schema is XHTML. In this context, the symbol to be completed | ||
| 68 | is @samp{h}. The possible completions consist of just | ||
| 69 | @samp{head}. Another example is: | ||
| 70 | |||
| 71 | @example | ||
| 72 | <html xmlns="http://www.w3.org/1999/xhtml"> | ||
| 73 | <head> | ||
| 74 | <@point{} | ||
| 75 | @end example | ||
| 76 | |||
| 77 | @noindent | ||
| 78 | In this case, the symbol to be completed is empty, and the possible | ||
| 79 | completions are @samp{base}, @samp{isindex}, | ||
| 80 | @samp{link}, @samp{meta}, @samp{script}, | ||
| 81 | @samp{style}, @samp{title}. Another example is: | ||
| 82 | |||
| 83 | @example | ||
| 84 | <html xmlns="@point{} | ||
| 85 | @end example | ||
| 86 | |||
| 87 | @noindent | ||
| 88 | In this case, the symbol to be completed is empty, and the possible | ||
| 89 | completions are just @samp{http://www.w3.org/1999/xhtml}. | ||
| 90 | |||
| 91 | When you type @kbd{C-@key{RET}}, what happens depends | ||
| 92 | on what the set of possible completions is. | ||
| 93 | |||
| 94 | @itemize @bullet | ||
| 95 | @item | ||
| 96 | If the set of completions is empty, nothing | ||
| 97 | happens. | ||
| 98 | @item | ||
| 99 | If there is one possible completion, then that completion is | ||
| 100 | inserted, together with any following characters that are | ||
| 101 | required. For example, in this case: | ||
| 102 | |||
| 103 | @example | ||
| 104 | <html xmlns="http://www.w3.org/1999/xhtml"> | ||
| 105 | <@point{} | ||
| 106 | @end example | ||
| 107 | |||
| 108 | @noindent | ||
| 109 | @kbd{C-@key{RET}} will yield | ||
| 110 | |||
| 111 | @example | ||
| 112 | <html xmlns="http://www.w3.org/1999/xhtml"> | ||
| 113 | <head@point{} | ||
| 114 | @end example | ||
| 115 | @item | ||
| 116 | If there is more than one possible completion, but all | ||
| 117 | possible completions share a common non-empty prefix, then that prefix | ||
| 118 | is inserted. For example, suppose the buffer is: | ||
| 119 | |||
| 120 | @example | ||
| 121 | <html x@point{} | ||
| 122 | @end example | ||
| 123 | |||
| 124 | @noindent | ||
| 125 | The symbol to be completed is @samp{x}. The possible completions | ||
| 126 | are @samp{xmlns} and @samp{xml:lang}. These share a | ||
| 127 | common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}} | ||
| 128 | will yield: | ||
| 129 | |||
| 130 | @example | ||
| 131 | <html xml@point{} | ||
| 132 | @end example | ||
| 133 | |||
| 134 | @noindent | ||
| 135 | Typically, you would do @kbd{C-@key{RET}} again, which would | ||
| 136 | have the result described in the next item. | ||
| 137 | @item | ||
| 138 | If there is more than one possible completion, but the | ||
| 139 | possible completions do not share a non-empty prefix, then Emacs will | ||
| 140 | prompt you to input the symbol in the minibuffer, initializing the | ||
| 141 | minibuffer with the symbol to be completed, and popping up a buffer | ||
| 142 | showing the possible completions. You can now input the symbol to be | ||
| 143 | inserted. The symbol you input will be inserted in the buffer instead | ||
| 144 | of the symbol to be completed. Emacs will then insert any required | ||
| 145 | characters after the symbol. For example, if the buffer contains: | ||
| 146 | |||
| 147 | @example | ||
| 148 | <html xml@point{} | ||
| 149 | @end example | ||
| 150 | |||
| 151 | @noindent | ||
| 152 | Emacs will prompt you in the minibuffer with | ||
| 153 | |||
| 154 | @example | ||
| 155 | Attribute: xml@point{} | ||
| 156 | @end example | ||
| 157 | |||
| 158 | @noindent | ||
| 159 | and the buffer showing possible completions will contain | ||
| 160 | |||
| 161 | @example | ||
| 162 | Possible completions are: | ||
| 163 | xml:lang xmlns | ||
| 164 | @end example | ||
| 165 | |||
| 166 | @noindent | ||
| 167 | If you input @kbd{xmlns}, the result will be: | ||
| 168 | |||
| 169 | @example | ||
| 170 | <html xmlns="@point{} | ||
| 171 | @end example | ||
| 172 | |||
| 173 | @noindent | ||
| 174 | (If you do @kbd{C-@key{RET}} again, the namespace URI will | ||
| 175 | be inserted. Should that happen automatically?) | ||
| 176 | @end itemize | ||
| 177 | |||
| 178 | @node Inserting end-tags | ||
| 179 | @chapter Inserting end-tags | ||
| 180 | |||
| 181 | The main redundancy in XML syntax is end-tags. nxml-mode provides | ||
| 182 | several ways to make it easier to enter end-tags. You can use all of | ||
| 183 | these without a schema. | ||
| 184 | |||
| 185 | You can use @kbd{C-@key{RET}} after @samp{</} | ||
| 186 | to complete the rest of the end-tag. | ||
| 187 | |||
| 188 | @kbd{C-c C-f} inserts an end-tag for the element containing | ||
| 189 | point. This command is useful when you want to input the start-tag, | ||
| 190 | then input the content and finally input the end-tag. The @samp{f} | ||
| 191 | is mnemonic for finish. | ||
| 192 | |||
| 193 | If you want to keep tags balanced and input the end-tag at the | ||
| 194 | same time as the start-tag, before inputting the content, then you can | ||
| 195 | use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts | ||
| 196 | the end-tag and leaves point before the end-tag. @kbd{C-c C-b} | ||
| 197 | is similar but more convenient for block-level elements: it puts the | ||
| 198 | start-tag, point and the end-tag on successive lines, appropriately | ||
| 199 | indented. The @samp{i} is mnemonic for inline and the | ||
| 200 | @samp{b} is mnemonic for block. | ||
| 201 | |||
| 202 | Finally, you can customize nxml-mode so that @kbd{/} | ||
| 203 | automatically inserts the rest of the end-tag when it occurs after | ||
| 204 | @samp{<}, by doing | ||
| 205 | |||
| 206 | @display | ||
| 207 | @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}} | ||
| 208 | @end display | ||
| 209 | |||
| 210 | @noindent | ||
| 211 | and then following the instructions in the displayed buffer. | ||
| 212 | |||
| 213 | @node Paragraphs | ||
| 214 | @chapter Paragraphs | ||
| 215 | |||
| 216 | Emacs has several commands that operate on paragraphs, most | ||
| 217 | notably @kbd{M-q}. nXML mode redefines these to work in a way | ||
| 218 | that is useful for XML. The exact rules that are used to find the | ||
| 219 | beginning and end of a paragraph are complicated; they are designed | ||
| 220 | mainly to ensure that @kbd{M-q} does the right thing. | ||
| 221 | |||
| 222 | A paragraph consists of one or more complete, consecutive lines. | ||
| 223 | A group of lines is not considered a paragraph unless it contains some | ||
| 224 | non-whitespace characters between tags or inside comments. A blank | ||
| 225 | line separates paragraphs. A single tag on a line by itself also | ||
| 226 | separates paragraphs. More precisely, if one tag together with any | ||
| 227 | leading and trailing whitespace completely occupy one or more lines, | ||
| 228 | then those lines will not be included in any paragraph. | ||
| 229 | |||
| 230 | A start-tag at the beginning of the line (possibly indented) may | ||
| 231 | be treated as starting a paragraph. Similarly, an end-tag at the end | ||
| 232 | of the line may be treated as ending a paragraph. The following rules | ||
| 233 | are used to determine whether such a tag is in fact treated as a | ||
| 234 | paragraph boundary: | ||
| 235 | |||
| 236 | @itemize @bullet | ||
| 237 | @item | ||
| 238 | If the schema does not allow text at that point, then it | ||
| 239 | is a paragraph boundary. | ||
| 240 | @item | ||
| 241 | If the end-tag corresponding to the start-tag is not at | ||
| 242 | the end of its line, or the start-tag corresponding to the end-tag is | ||
| 243 | not at the beginning of its line, then it is not a paragraph | ||
| 244 | boundary. For example, in | ||
| 245 | |||
| 246 | @example | ||
| 247 | <p>This is a paragraph with an | ||
| 248 | <emph>emphasized</emph> phrase. | ||
| 249 | @end example | ||
| 250 | |||
| 251 | @noindent | ||
| 252 | the @samp{<emph>} start-tag would not be considered as | ||
| 253 | starting a paragraph, because its corresponding end-tag is not at the | ||
| 254 | end of the line. | ||
| 255 | @item | ||
| 256 | If there is text that is a sibling in the element tree, then | ||
| 257 | it is not a paragraph boundary. For example, in | ||
| 258 | |||
| 259 | @example | ||
| 260 | <p>This is a paragraph with an | ||
| 261 | <emph>emphasized phrase that takes one source line</emph> | ||
| 262 | @end example | ||
| 263 | |||
| 264 | @noindent | ||
| 265 | the @samp{<emph>} start-tag would not be considered as | ||
| 266 | starting a paragraph, even though its end-tag is at the end of its | ||
| 267 | line, because the text @samp{This is a paragraph with an} | ||
| 268 | is a sibling of the @samp{emph} element. | ||
| 269 | @item | ||
| 270 | Otherwise, it is a paragraph boundary. | ||
| 271 | @end itemize | ||
| 272 | |||
| 273 | @node Outlining | ||
| 274 | @chapter Outlining | ||
| 275 | |||
| 276 | nXML mode allows you to display all or part of a buffer as an | ||
| 277 | outline, in a similar way to Emacs' outline mode. An outline in nXML | ||
| 278 | mode is based on recognizing two kinds of element: sections and | ||
| 279 | headings. There is one heading for every section and one section for | ||
| 280 | every heading. A section contains its heading as or within its first | ||
| 281 | child element. A section also contains its subordinate sections (its | ||
| 282 | subsections). The text content of a section consists of anything in a | ||
| 283 | section that is neither a subsection nor a heading. | ||
| 284 | |||
| 285 | Note that this is a different model from that used by XHTML. | ||
| 286 | nXML mode's outline support will not be useful for XHTML unless you | ||
| 287 | adopt a convention of adding a @code{div} to enclose each | ||
| 288 | section, rather than having sections implicitly delimited by different | ||
| 289 | @code{h@var{n}} elements. This limitation may be removed | ||
| 290 | in a future version. | ||
| 291 | |||
| 292 | The variable @code{nxml-section-element-name-regexp} gives | ||
| 293 | a regexp for the local names (i.e. the part of the name following any | ||
| 294 | prefix) of section elements. The variable | ||
| 295 | @code{nxml-heading-element-name-regexp} gives a regexp for the | ||
| 296 | local names of heading elements. For an element to be recognized | ||
| 297 | as a section | ||
| 298 | |||
| 299 | @itemize @bullet | ||
| 300 | @item | ||
| 301 | its start-tag must occur at the beginning of a line | ||
| 302 | (possibly indented); | ||
| 303 | @item | ||
| 304 | its local name must match | ||
| 305 | @code{nxml-section-element-name-regexp}; | ||
| 306 | @item | ||
| 307 | either its first child element or a descendant of that | ||
| 308 | first child element must have a local name that matches | ||
| 309 | @code{nxml-heading-element-name-regexp}; the first such element | ||
| 310 | is treated as the section's heading. | ||
| 311 | @end itemize | ||
| 312 | |||
| 313 | @noindent | ||
| 314 | You can customize these variables using @kbd{M-x | ||
| 315 | customize-variable}. | ||
| 316 | |||
| 317 | There are three possible outline states for a section: | ||
| 318 | |||
| 319 | @itemize @bullet | ||
| 320 | @item | ||
| 321 | normal, showing everything, including its heading, text | ||
| 322 | content and subsections; each subsection is displayed according to the | ||
| 323 | state of that subsection; | ||
| 324 | @item | ||
| 325 | showing just its heading, with both its text content and | ||
| 326 | its subsections hidden; all subsections are hidden regardless of their | ||
| 327 | state; | ||
| 328 | @item | ||
| 329 | showing its heading and its subsections, with its text | ||
| 330 | content hidden; each subsection is displayed according to the state of | ||
| 331 | that subsection. | ||
| 332 | @end itemize | ||
| 333 | |||
| 334 | In the last two states, where the text content is hidden, the | ||
| 335 | heading is displayed specially, in an abbreviated form. An element | ||
| 336 | like this: | ||
| 337 | |||
| 338 | @example | ||
| 339 | <section> | ||
| 340 | <title>Food</title> | ||
| 341 | <para>There are many kinds of food.</para> | ||
| 342 | </section> | ||
| 343 | @end example | ||
| 344 | |||
| 345 | @noindent | ||
| 346 | would be displayed on a single line like this: | ||
| 347 | |||
| 348 | @example | ||
| 349 | <-section>Food...</> | ||
| 350 | @end example | ||
| 351 | |||
| 352 | @noindent | ||
| 353 | If there are hidden subsections, then a @code{+} will be used | ||
| 354 | instead of a @code{-} like this: | ||
| 355 | |||
| 356 | @example | ||
| 357 | <+section>Food...</> | ||
| 358 | @end example | ||
| 359 | |||
| 360 | @noindent | ||
| 361 | If there are non-hidden subsections, then the section will instead be | ||
| 362 | displayed like this: | ||
| 363 | |||
| 364 | @example | ||
| 365 | <-section>Food... | ||
| 366 | <-section>Delicious Food...</> | ||
| 367 | <-section>Distasteful Food...</> | ||
| 368 | </-section> | ||
| 369 | @end example | ||
| 370 | |||
| 371 | @noindent | ||
| 372 | The heading is always displayed with an indent that corresponds to its | ||
| 373 | depth in the outline, even if it is not actually indented in the buffer. | ||
| 374 | The variable @code{nxml-outline-child-indent} controls how much | ||
| 375 | a subheading is indented with respect to its parent heading when the | ||
| 376 | heading is being displayed specially. | ||
| 377 | |||
| 378 | Commands to change the outline state of sections are bound to | ||
| 379 | key sequences that start with @kbd{C-c C-o} (@kbd{o} is | ||
| 380 | mnemonic for outline). The third and final key has been chosen to be | ||
| 381 | consistent with outline mode. In the following descriptions, | ||
| 382 | current section means the section containing point, or, more precisely, | ||
| 383 | the innermost section containing the character immediately following | ||
| 384 | point. | ||
| 385 | |||
| 386 | @itemize @bullet | ||
| 387 | @item | ||
| 388 | @kbd{C-c C-o C-a} shows all sections in the buffer | ||
| 389 | normally. | ||
| 390 | @item | ||
| 391 | @kbd{C-c C-o C-t} hides the text content | ||
| 392 | of all sections in the buffer. | ||
| 393 | @item | ||
| 394 | @kbd{C-c C-o C-c} hides the text content | ||
| 395 | of the current section. | ||
| 396 | @item | ||
| 397 | @kbd{C-c C-o C-e} shows the text content | ||
| 398 | of the current section. | ||
| 399 | @item | ||
| 400 | @kbd{C-c C-o C-d} hides the text content | ||
| 401 | and subsections of the current section. | ||
| 402 | @item | ||
| 403 | @kbd{C-c C-o C-s} shows the current section | ||
| 404 | and all its direct and indirect subsections normally. | ||
| 405 | @item | ||
| 406 | @kbd{C-c C-o C-k} shows the headings of the | ||
| 407 | direct and indirect subsections of the current section. | ||
| 408 | @item | ||
| 409 | @kbd{C-c C-o C-l} hides the text content of the | ||
| 410 | current section and of its direct and indirect | ||
| 411 | subsections. | ||
| 412 | @item | ||
| 413 | @kbd{C-c C-o C-i} shows the headings of the | ||
| 414 | direct subsections of the current section. | ||
| 415 | @item | ||
| 416 | @kbd{C-c C-o C-o} hides as much as possible without | ||
| 417 | hiding the current section's text content; the headings of ancestor | ||
| 418 | sections of the current section and their child sections will | ||
| 419 | not be hidden. | ||
| 420 | @end itemize | ||
| 421 | |||
| 422 | When a heading is displayed specially, you can use | ||
| 423 | @key{RET} in that heading to show the text content of the section | ||
| 424 | in the same way as @kbd{C-c C-o C-e}. | ||
| 425 | |||
| 426 | You can also use the mouse to change the outline state: | ||
| 427 | @kbd{S-mouse-2} hides the text content of a section in the same | ||
| 428 | way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially | ||
| 429 | displayed heading shows the text content of the section in the same | ||
| 430 | way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially | ||
| 431 | displayed start-tag toggles the display of subheadings on and | ||
| 432 | off. | ||
| 433 | |||
| 434 | The outline state for each section is stored with the first | ||
| 435 | character of the section (as a text property). Every command that | ||
| 436 | changes the outline state of any section updates the display of the | ||
| 437 | buffer so that each section is displayed correctly according to its | ||
| 438 | outline state. If the section structure is subsequently changed, then | ||
| 439 | it is possible for the display to no longer correctly reflect the | ||
| 440 | stored outline state. @kbd{C-c C-o C-r} can be used to refresh | ||
| 441 | the display so it is correct again. | ||
| 442 | |||
| 443 | @node Locating a schema | ||
| 444 | @chapter Locating a schema | ||
| 445 | |||
| 446 | nXML mode has a configurable set of rules to locate a schema for | ||
| 447 | the file being edited. The rules are contained in one or more schema | ||
| 448 | locating files, which are XML documents. | ||
| 449 | |||
| 450 | The variable @samp{rng-schema-locating-files} specifies | ||
| 451 | the list of the file-names of schema locating files that nXML mode | ||
| 452 | should use. The order of the list is significant: when file | ||
| 453 | @var{x} occurs in the list before file @var{y} then rules | ||
| 454 | from file @var{x} have precedence over rules from file | ||
| 455 | @var{y}. A filename specified in | ||
| 456 | @samp{rng-schema-locating-files} may be relative. If so, it will | ||
| 457 | be resolved relative to the document for which a schema is being | ||
| 458 | located. It is not an error if relative file-names in | ||
| 459 | @samp{rng-schema-locating-files} do not exist. You can use | ||
| 460 | @kbd{M-x customize-variable @key{RET} rng-schema-locating-files | ||
| 461 | @key{RET}} to customize the list of schema locating | ||
| 462 | files. | ||
| 463 | |||
| 464 | By default, the @samp{rng-schema-locating-files} list has two | ||
| 465 | members: @samp{schemas.xml}, and | ||
| 466 | @samp{@var{dist-dir}/schema/schemas.xml} where | ||
| 467 | @samp{@var{dist-dir}} is the directory containing the nXML | ||
| 468 | distribution. The first member will cause nXML mode to use a file | ||
| 469 | @samp{schemas.xml} in the same directory as the document being | ||
| 470 | edited if such a file exists. The second member contains rules for the | ||
| 471 | schemas that are included with the nXML distribution. | ||
| 472 | |||
| 473 | @menu | ||
| 474 | * Commands for locating a schema:: | ||
| 475 | * Schema locating files:: | ||
| 476 | @end menu | ||
| 477 | |||
| 478 | @node Commands for locating a schema | ||
| 479 | @section Commands for locating a schema | ||
| 480 | |||
| 481 | The command @kbd{C-c C-s C-w} will tell you what schema | ||
| 482 | is currently being used. | ||
| 483 | |||
| 484 | The rules for locating a schema are applied automatically when | ||
| 485 | you visit a file in nXML mode. However, if you have just created a new | ||
| 486 | file and the schema cannot be inferred from the file-name, then this | ||
| 487 | will not locate the right schema. In this case, you should insert the | ||
| 488 | start-tag of the root element and then use the command @kbd{C-c C-s | ||
| 489 | C-a}, which reapplies the rules based on the current content of | ||
| 490 | the document. It is usually not necessary to insert the complete | ||
| 491 | start-tag; often just @samp{<@var{name}} is | ||
| 492 | enough. | ||
| 493 | |||
| 494 | If you want to use a schema that has not yet been added to the | ||
| 495 | schema locating files, you can use the command @kbd{C-c C-s C-f} | ||
| 496 | to manually select the file containing the schema for the document in | ||
| 497 | the current buffer. Emacs will read the file-name of the schema from the | ||
| 498 | minibuffer. After reading the file-name, Emacs will ask whether you | ||
| 499 | wish to add a rule to a schema locating file that persistently | ||
| 500 | associates the document with the selected schema. The rule will be | ||
| 501 | added to the first file in the list specified by | ||
| 502 | @samp{rng-schema-locating-files}; it will create the file if | ||
| 503 | necessary, but will not create a directory. If the variable | ||
| 504 | @samp{rng-schema-locating-files} has not been customized, this | ||
| 505 | means that the rule will be added to the file @samp{schemas.xml} | ||
| 506 | in the same directory as the document being edited. | ||
| 507 | |||
| 508 | The command @kbd{C-c C-s C-t} allows you to select a schema by | ||
| 509 | specifying an identifier for the type of the document. The schema | ||
| 510 | locating files determine the available type identifiers and what | ||
| 511 | schema is used for each type identifier. This is useful when it is | ||
| 512 | impossible to infer the right schema from either the file-name or the | ||
| 513 | content of the document, even though the schema is already in the | ||
| 514 | schema locating file. A situation in which this can occur is when | ||
| 515 | there are multiple variants of a schema where all valid documents have | ||
| 516 | the same document element. For example, XHTML has Strict and | ||
| 517 | Transitional variants. In a situation like this, a schema locating file | ||
| 518 | can define a type identifier for each variant. As with @kbd{C-c | ||
| 519 | C-s C-f}, Emacs will ask whether you wish to add a rule to a schema | ||
| 520 | locating file that persistently associates the document with the | ||
| 521 | specified type identifier. | ||
| 522 | |||
| 523 | The command @kbd{C-c C-s C-l} adds a rule to a schema | ||
| 524 | locating file that persistently associates the document with | ||
| 525 | the schema that is currently being used. | ||
| 526 | |||
| 527 | @node Schema locating files | ||
| 528 | @section Schema locating files | ||
| 529 | |||
| 530 | Each schema locating file specifies a list of rules. The rules | ||
| 531 | from each file are appended in order. To locate a schema, each rule is | ||
| 532 | applied in turn until a rule matches. The first matching rule is then | ||
| 533 | used to determine the schema. | ||
| 534 | |||
| 535 | Schema locating files are designed to be useful for other | ||
| 536 | applications that need to locate a schema for a document. In fact, | ||
| 537 | there is nothing specific to locating schemas in the design; it could | ||
| 538 | equally well be used for locating a stylesheet. | ||
| 539 | |||
| 540 | @menu | ||
| 541 | * Schema locating file syntax basics:: | ||
| 542 | * Using the document's URI to locate a schema:: | ||
| 543 | * Using the document element to locate a schema:: | ||
| 544 | * Using type identifiers in schema locating files:: | ||
| 545 | * Using multiple schema locating files:: | ||
| 546 | @end menu | ||
| 547 | |||
| 548 | @node Schema locating file syntax basics | ||
| 549 | @subsection Schema locating file syntax basics | ||
| 550 | |||
| 551 | There is a schema for schema locating files in the file | ||
| 552 | @samp{locate.rnc} in the schema directory. Schema locating | ||
| 553 | files must be valid with respect to this schema. | ||
| 554 | |||
| 555 | The document element of a schema locating file must be | ||
| 556 | @samp{locatingRules} and the namespace URI must be | ||
| 557 | @samp{http://thaiopensource.com/ns/locating-rules/1.0}. The | ||
| 558 | children of the document element specify rules. The order of the | ||
| 559 | children is the same as the order of the rules. Here's a complete | ||
| 560 | example of a schema locating file: | ||
| 561 | |||
| 562 | @example | ||
| 563 | <?xml version="1.0"?> | ||
| 564 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | ||
| 565 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | ||
| 566 | <documentElement localName="book" uri="docbook.rnc"/> | ||
| 567 | </locatingRules> | ||
| 568 | @end example | ||
| 569 | |||
| 570 | @noindent | ||
| 571 | This says to use the schema @samp{xhtml.rnc} for a document with | ||
| 572 | namespace @samp{http://www.w3.org/1999/xhtml}, and to use the | ||
| 573 | schema @samp{docbook.rnc} for a document whose document element has the local name | ||
| 574 | @samp{book}. If the document element had both a namespace URI | ||
| 575 | of @samp{http://www.w3.org/1999/xhtml} and a local name of | ||
| 576 | @samp{book}, then the matching rule that comes first will be | ||
| 577 | used and so the schema @samp{xhtml.rnc} would be used. There is | ||
| 578 | no precedence between different types of rule; the first matching rule | ||
| 579 | of any type is used. | ||
| 580 | |||
| 581 | As usual with XML-related technologies, resources are identified | ||
| 582 | by URIs. The @samp{uri} attribute identifies the schema by | ||
| 583 | specifying the URI. The URI may be relative. If so, it is resolved | ||
| 584 | relative to the URI of the schema locating file that contains the | ||
| 585 | attribute. This means that if the value of the @samp{uri} attribute | ||
| 586 | does not contain a @samp{/}, then it will refer to a filename in | ||
| 587 | the same directory as the schema locating file. | ||
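| | |||
| | As an illustrative sketch of this resolution (the directory name | ||
| | @samp{rnc} and the location @samp{/home/jjc/docs/schemas.xml} are | ||
| | hypothetical, chosen only for this example), suppose the schema | ||
| | locating file @samp{/home/jjc/docs/schemas.xml} contains the rule | ||
| | |||
| | @example | ||
| | <documentElement localName="book" uri="rnc/docbook.rnc"/> | ||
| | @end example | ||
| | |||
| | @noindent | ||
| | Because the relative URI is resolved against the schema locating | ||
| | file, and not against the document being edited, this rule would | ||
| | refer to the schema @samp{/home/jjc/docs/rnc/docbook.rnc}. | ||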
| 588 | |||
| 589 | @node Using the document's URI to locate a schema | ||
| 590 | @subsection Using the document's URI to locate a schema | ||
| 591 | |||
| 592 | A @samp{uri} rule locates a schema based on the URI of the | ||
| 593 | document. The @samp{uri} attribute specifies the URI of the | ||
| 594 | schema. The @samp{resource} attribute can be used to specify | ||
| 595 | the schema for a particular document. For example, | ||
| 596 | |||
| 597 | @example | ||
| 598 | <uri resource="spec.xml" uri="docbook.rnc"/> | ||
| 599 | @end example | ||
| 600 | |||
| 601 | @noindent | ||
| 602 | specifies that the schema for @samp{spec.xml} is | ||
| 603 | @samp{docbook.rnc}. | ||
| 604 | |||
| 605 | The @samp{pattern} attribute can be used instead of the | ||
| 606 | @samp{resource} attribute to specify the schema for any document | ||
| 607 | whose URI matches a pattern. The pattern has the same syntax as an | ||
| 608 | absolute or relative URI except that the path component of the URI can | ||
| 609 | use a @samp{*} character to stand for zero or more characters | ||
| 610 | within a path segment (i.e. any character other than @samp{/}). | ||
| 611 | Typically, the URI pattern looks like a relative URI, but, whereas a | ||
| 612 | relative URI in the @samp{resource} attribute is resolved into a | ||
| 613 | particular absolute URI using the base URI of the schema locating | ||
| 614 | file, a relative URI pattern matches if it matches some number of | ||
| 615 | complete path segments of the document's URI ending with the last path | ||
| 616 | segment of the document's URI. For example, | ||
| 617 | |||
| 618 | @example | ||
| 619 | <uri pattern="*.xsl" uri="xslt.rnc"/> | ||
| 620 | @end example | ||
| 621 | |||
| 622 | @noindent | ||
| 623 | specifies that the schema for documents with a URI whose path ends | ||
| 624 | with @samp{.xsl} is @samp{xslt.rnc}. | ||
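| | |||
| | A pattern may also span more than one path segment. As a sketch | ||
| | (the directory name @samp{stylesheets} and the file names below are | ||
| | hypothetical), the rule | ||
| | |||
| | @example | ||
| | <uri pattern="stylesheets/*.xml" uri="xslt.rnc"/> | ||
| | @end example | ||
| | |||
| | @noindent | ||
| | should, following the matching rule described above, apply to a | ||
| | document such as @samp{file:///home/jjc/stylesheets/main.xml}, whose | ||
| | last two path segments match the pattern, but not to | ||
| | @samp{file:///home/jjc/docs/main.xml}. | ||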
| 625 | |||
| 626 | A @samp{transformURI} rule locates a schema by | ||
| 627 | transforming the URI of the document. The @samp{fromPattern} | ||
| 628 | attribute specifies a URI pattern with the same meaning as the | ||
| 629 | @samp{pattern} attribute of the @samp{uri} element. The | ||
| 630 | @samp{toPattern} attribute is a URI pattern that is used to | ||
| 631 | generate the URI of the schema. Each @samp{*} in the | ||
| 632 | @samp{toPattern} is replaced by the string that matched the | ||
| 633 | corresponding @samp{*} in the @samp{fromPattern}. The | ||
| 634 | resulting string is appended to the initial part of the document's URI | ||
| 635 | that was not explicitly matched by the @samp{fromPattern}. The | ||
| 636 | rule matches only if the transformed URI identifies an existing | ||
| 637 | resource. For example, the rule | ||
| 638 | |||
| 639 | @example | ||
| 640 | <transformURI fromPattern="*.xml" toPattern="*.rnc"/> | ||
| 641 | @end example | ||
| 642 | |||
| 643 | @noindent | ||
| 644 | would transform the URI @samp{file:///home/jjc/docs/spec.xml} | ||
| 645 | into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this | ||
| 646 | rule specifies that to locate a schema for a document | ||
| 647 | @samp{@var{foo}.xml}, Emacs should test whether a file | ||
| 648 | @samp{@var{foo}.rnc} exists in the same directory as | ||
| 649 | @samp{@var{foo}.xml}, and, if so, should use it as the | ||
| 650 | schema. | ||
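| | |||
| | The same mechanism can presumably place schemas in a subdirectory. | ||
| | The following sketch assumes a hypothetical directory named | ||
| | @samp{schema} next to each document: | ||
| | |||
| | @example | ||
| | <transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/> | ||
| | @end example | ||
| | |||
| | @noindent | ||
| | Following the substitution described above, the URI | ||
| | @samp{file:///home/jjc/docs/spec.xml} would become | ||
| | @samp{file:///home/jjc/docs/schema/spec.rnc}, and the rule would | ||
| | match only if that file exists. | ||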
| 651 | |||
| 652 | @node Using the document element to locate a schema | ||
| 653 | @subsection Using the document element to locate a schema | ||
| 654 | |||
| 655 | A @samp{documentElement} rule locates a schema based on | ||
| 656 | the local name and prefix of the document element. For example, a rule | ||
| 657 | |||
| 658 | @example | ||
| 659 | <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/> | ||
| 660 | @end example | ||
| 661 | |||
| 662 | @noindent | ||
| 663 | specifies that when the name of the document element is | ||
| 664 | @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used | ||
| 665 | as the schema. Either the @samp{prefix} or | ||
| 666 | @samp{localName} attribute may be omitted to allow any prefix or | ||
| 667 | local name. | ||
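| | |||
| | For instance, assuming a schema file named @samp{docbook.rnc} (a | ||
| | hypothetical name used only for illustration), a rule that omits the | ||
| | @samp{prefix} attribute might look like this: | ||
| | |||
| | @example | ||
| | <documentElement localName="book" uri="docbook.rnc"/> | ||
| | @end example | ||
| | |||
| | @noindent | ||
| | Such a rule would select @samp{docbook.rnc} whenever the local name | ||
| | of the document element is @samp{book}, whatever its prefix. | ||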
| 668 | |||
| 669 | A @samp{namespace} rule locates a schema based on the | ||
| 670 | namespace URI of the document element. For example, a rule | ||
| 671 | |||
| 672 | @example | ||
| 673 | <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/> | ||
| 674 | @end example | ||
| 675 | |||
| 676 | @noindent | ||
| 677 | specifies that when the namespace URI of the document element is | ||
| 678 | @samp{http://www.w3.org/1999/XSL/Transform}, then | ||
| 679 | @samp{xslt.rnc} should be used as the schema. | ||
| 680 | |||
| 681 | @node Using type identifiers in schema locating files | ||
| 682 | @subsection Using type identifiers in schema locating files | ||
| 683 | |||
| 684 | Type identifiers allow a level of indirection in locating the | ||
| 685 | schema for a document. Instead of associating the document directly | ||
| 686 | with a schema URI, the document is associated with a type identifier, | ||
| 687 | which is in turn associated with a schema URI. nXML mode does not | ||
| 688 | constrain the format of type identifiers. They can be simply strings | ||
| 689 | without any formal structure or they can be public identifiers or | ||
| 690 | URIs. Note that these type identifiers have nothing to do with the | ||
| 691 | DOCTYPE declaration. When comparing type identifiers, whitespace is | ||
| 692 | normalized in the same way as with the @samp{xsd:token} | ||
| 693 | datatype: leading and trailing whitespace is stripped; other sequences | ||
| 694 | of whitespace are normalized to a single space character. | ||
| 695 | |||
| 696 | Each of the rules described in previous sections that uses a | ||
| 697 | @samp{uri} attribute to specify a schema can instead use a | ||
| 698 | @samp{typeId} attribute to specify a type identifier. The type | ||
| 699 | identifier can be associated with a URI using a @samp{typeId} | ||
| 700 | element. For example, | ||
| 701 | |||
| 702 | @example | ||
| 703 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | ||
| 704 | <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/> | ||
| 705 | <typeId id="XHTML" typeId="XHTML Strict"/> | ||
| 706 | <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/> | ||
| 707 | <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/> | ||
| 708 | </locatingRules> | ||
| 709 | @end example | ||
| 710 | |||
| 711 | @noindent | ||
| 712 | declares three type identifiers: @samp{XHTML} (representing the | ||
| 713 | default variant of XHTML to be used), @samp{XHTML Strict} and | ||
| 714 | @samp{XHTML Transitional}. Such a schema locating file would | ||
| 715 | use @samp{xhtml-strict.rnc} for a document whose namespace is | ||
| 716 | @samp{http://www.w3.org/1999/xhtml}. But it is considerably | ||
| 717 | more flexible than a schema locating file that simply specified | ||
| 718 | |||
| 719 | @example | ||
| 720 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/> | ||
| 721 | @end example | ||
| 722 | |||
| 723 | @noindent | ||
| 724 | A user can easily use @kbd{C-c C-s C-t} to select between XHTML | ||
| 725 | Strict and XHTML Transitional. Also, a user can easily add a schema locating file | ||
| 726 | |||
| 727 | @example | ||
| 728 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | ||
| 729 | <typeId id="XHTML" typeId="XHTML Transitional"/> | ||
| 730 | </locatingRules> | ||
| 731 | @end example | ||
| 732 | |||
| 733 | @noindent | ||
| 734 | that makes the default variant of XHTML be XHTML Transitional. | ||
| 735 | |||
| 736 | @node Using multiple schema locating files | ||
| 737 | @subsection Using multiple schema locating files | ||
| 738 | |||
| 739 | The @samp{include} element includes rules from another | ||
| 740 | schema locating file. The behavior is exactly as if the rules from | ||
| 741 | that file were included in place of the @samp{include} element. | ||
| 742 | Relative URIs are resolved into absolute URIs before the inclusion is | ||
| 743 | performed. For example, | ||
| 744 | |||
| 745 | @example | ||
| 746 | <include rules="../rules.xml"/> | ||
| 747 | @end example | ||
| 748 | |||
| 749 | @noindent | ||
| 750 | includes the rules from @samp{rules.xml}. | ||
| 751 | |||
| 752 | The process of locating a schema takes as input a list of schema | ||
| 753 | locating files. The rules in all these files and in the files they | ||
| 754 | include are resolved into a single list of rules, which are applied | ||
| 755 | strictly in order. Sometimes this order is not what is needed. | ||
| 756 | For example, suppose you have two schema locating files, a private | ||
| 757 | file | ||
| 758 | |||
| 759 | @example | ||
| 760 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | ||
| 761 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | ||
| 762 | </locatingRules> | ||
| 763 | @end example | ||
| 764 | |||
| 765 | @noindent | ||
| 766 | followed by a public file | ||
| 767 | |||
| 768 | @example | ||
| 769 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | ||
| 770 | <transformURI fromPattern="*.xml" toPattern="*.rnc"/> | ||
| 771 | <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/> | ||
| 772 | </locatingRules> | ||
| 773 | @end example | ||
| 774 | |||
| 775 | @noindent | ||
| 776 | Because the private file comes first, the XHTML @samp{namespace} | ||
| 777 | rule takes precedence over the @samp{transformURI} rule, which | ||
| 778 | is almost certainly not what is needed. This can be solved by adding | ||
| 779 | an @samp{applyFollowingRules} element to the private file: | ||
| 780 | |||
| 781 | @example | ||
| 782 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | ||
| 783 | <applyFollowingRules ruleType="transformURI"/> | ||
| 784 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | ||
| 785 | </locatingRules> | ||
| 786 | @end example | ||
| | |||
| | @noindent | ||
| | With this element in place, the @samp{transformURI} rules from the | ||
| | files that follow are applied at that point, before the XHTML | ||
| | @samp{namespace} rule. | ||
| 787 | |||
| 788 | @node DTDs | ||
| 789 | @chapter DTDs | ||
| 790 | |||
| 791 | nxml-mode is designed to support the creation of standalone XML | ||
| 792 | documents that do not depend on a DTD. Although it is common practice | ||
| 793 | to insert a DOCTYPE declaration referencing an external DTD, this has | ||
| 794 | undesirable side-effects. It means that the document is no longer | ||
| 795 | self-contained. It also means that different XML parsers may interpret | ||
| 796 | the document in different ways, since the XML Recommendation does not | ||
| 797 | require XML parsers to read the DTD. With DTDs, it was impractical to | ||
| 798 | get validation without using an external DTD or reference to a | ||
| 799 | parameter entity. With RELAX NG and other schema languages, you can | ||
| 800 | simultaneously get the benefits of validation and standalone XML | ||
| 801 | documents. Therefore, I recommend that you do not reference an | ||
| 802 | external DOCTYPE in your XML documents. | ||
| 803 | |||
| 804 | One problem is entities for characters. Typically, as well as | ||
| 805 | providing validation, DTDs also provide a set of character entities | ||
| 806 | for documents to use. Schemas cannot provide this functionality, | ||
| 807 | because schema validation happens after XML parsing. The recommended | ||
| 808 | solution is to either use the Unicode characters directly, or, if this | ||
| 809 | is impractical, use character references. nXML mode supports this by | ||
| 810 | providing commands for entering characters and character references | ||
| 811 | using the Unicode names, and can display the glyph corresponding to a | ||
| 812 | character reference. | ||
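| | |||
| | As an illustration (the element name @samp{p} and the sample text | ||
| | are arbitrary), the two fragments below write the same character, | ||
| | U+00E9 LATIN SMALL LETTER E WITH ACUTE. The first depends on a DTD | ||
| | that declares an @samp{eacute} entity; the second uses a character | ||
| | reference and so needs no DTD: | ||
| | |||
| | @example | ||
| | <!-- requires a DTD that declares the eacute entity --> | ||
| | <p>caf&eacute;</p> | ||
| | <!-- self-contained: character reference for U+00E9 --> | ||
| | <p>caf&#xE9;</p> | ||
| | @end example | ||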
| 813 | |||
| 814 | @node Limitations | ||
| 815 | @chapter Limitations | ||
| 816 | |||
| 817 | nXML mode has some limitations: | ||
| 818 | |||
| 819 | @itemize @bullet | ||
| 820 | @item | ||
| 821 | DTD support is limited. Internal parsed general entities declared | ||
| 822 | in the internal subset are supported provided they do not contain | ||
| 823 | elements. Other usage of DTDs is ignored. | ||
| 824 | @item | ||
| 825 | The restrictions on RELAX NG schemas in section 7 of the RELAX NG | ||
| 826 | specification are not enforced. | ||
| 827 | @item | ||
| 828 | Unicode support has problems. This stems mostly from the fact that | ||
| 829 | the XML (and RELAX NG) character model is based squarely on Unicode, | ||
| 830 | whereas the Emacs character model is not. Emacs 23 is slated to have | ||
| 831 | full Unicode support, which should improve the situation here. | ||
| 832 | @end itemize | ||
| 833 | |||
| 834 | @bye | ||