author Eric M. Ludlam 2012-12-12 20:19:48 -0800
committer Glenn Morris 2012-12-12 20:19:48 -0800
commit cfa49c1e3385ec06ca12648c494fc1a5a143fb86
tree f3c2baf135b56bf17c3088bacbf211d89a53f6b3
parent 9b97b14348484722bde7bf1213816b6d9dd7cf3f
Import bovine manual from CEDET trunk
Ref http://lists.gnu.org/archive/html/emacs-devel/2012-11/msg00419.html
and preceding discussion

Imported from bzr://cedet.bzr.sourceforge.net/bzrroot/cedet/code/trunk
doc/texi/semantic/bovine.texi
-rw-r--r-- doc/misc/ChangeLog     6
-rw-r--r-- doc/misc/bovine.texi 480
2 files changed, 486 insertions, 0 deletions
diff --git a/doc/misc/ChangeLog b/doc/misc/ChangeLog
index 22e0e9d85ae..3557e27184c 100644
--- a/doc/misc/ChangeLog
+++ b/doc/misc/ChangeLog
@@ -1,3 +1,9 @@
2012-12-13 Eric Ludlam <zappo@gnu.org>
           David Ponce <david@dponce.com>
           Richard Kim <emacs18@gmail.com>

    * bovine.texi: New file, imported from CEDET trunk.

2012-12-12 Glenn Morris <rgm@gnu.org>

    * flymake.texi (Customizable variables, Locating the buildfile):
diff --git a/doc/misc/bovine.texi b/doc/misc/bovine.texi
new file mode 100644
index 00000000000..b24e0e0dd7d
--- /dev/null
+++ b/doc/misc/bovine.texi
@@ -0,0 +1,480 @@
\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename bovine.info
@set TITLE Bovine parser development
@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
@settitle @value{TITLE}

@c *************************************************************************
@c @ Header
@c *************************************************************************

@c Merge all indexes into a single index for now.
@c We can always separate them later into two or more as needed.
@syncodeindex vr cp
@syncodeindex fn cp
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp

@c @footnotestyle separate
@c @paragraphindent 2
@c @@smallbook
@c %**end of header

@copying
This manual documents Bovine parser development in Semantic.

Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004 Eric M. Ludlam
Copyright @copyright{} 2001, 2002, 2003, 2004 David Ponce
Copyright @copyright{} 2002, 2003 Richard Y. Kim

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with the
Invariant Sections being list their titles, with the Front-Cover Texts
being list, and with the Back-Cover Texts being list. A copy of the
license is included in the section entitled ``GNU Free Documentation
License''.
@end quotation
@end copying

@ifinfo
@dircategory Emacs
@direntry
* Semantic bovine parser development: (bovine).
@end direntry
@end ifinfo

@iftex
@finalout
@end iftex

@c @setchapternewpage odd
@c @setchapternewpage off

@ifinfo
This file documents parser development with the bovine parser generator
@emph{Infrastructure for parser based text analysis in Emacs}

Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004 @value{AUTHOR}
@end ifinfo

@titlepage
@sp 10
@title @value{TITLE}
@author by @value{AUTHOR}
@vskip 0pt plus 1 fill
Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004 @value{AUTHOR}
@page
@vskip 0pt plus 1 fill
@insertcopying
@end titlepage
@page

@c MACRO inclusion
@include semanticheader.texi


@c *************************************************************************
@c @ Document
@c *************************************************************************
@contents

@node Top
@top @value{TITLE}

The @dfn{bovine} parser is the original @semantic{} parser, and is an
implementation of an @acronym{LL} parser. It is well suited to simple
languages and offers many conveniences that make grammars easy to
write. Those conveniences make it less powerful than a Bison-like
@acronym{LALR} parser. For more information, @inforef{top, the Wisent
Parser Manual, wisent}.

Bovine @acronym{LL} grammars are stored in files with a @file{.by}
extension. When compiled, the contents are converted into a file of
the form @file{NAME-by.el}, which in turn is byte-compiled.
@inforef{top, Grammar Framework Manual, grammar-fw}.

@menu
* Starting Rules:: The starting rules for the grammar.
* Bovine Grammar Rules:: Rules used to parse a language
* Optional Lambda Expression:: Actions to take when a rule is matched
* Bovine Examples:: Simple Samples
* GNU Free Documentation License::
* Index::
@end menu

@node Starting Rules
@chapter Starting Rules

In Bison, one and only one nonterminal is designated as the ``start''
symbol. In @semantic{}, one or more nonterminals can be designated as
``start'' symbols. They are declared following the @code{%start}
keyword, separated by spaces. @inforef{start Decl, ,grammar-fw}.

If no @code{%start} keyword is used in a grammar, then the very first
nonterminal is used. Internally, the first start nonterminal is
targeted by the reserved symbol @code{bovine-toplevel}, so it can be
found by the parser harness.

To find locally defined variables, the local context handler needs to
parse the body of functional code. The @code{%scopestart} declaration
specifies the name of a nonterminal used as the goal to parse a local
context, @inforef{scopestart Decl, ,grammar-fw}. Internally, the
scopestart nonterminal is targeted by the reserved symbol
@code{bovine-inner-scope}, so it can be found by the parser harness.
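
As a minimal sketch of these declarations, assume a hypothetical
grammar whose nonterminals are named @code{declaration} and
@code{codeblock} (neither name comes from this manual). Its prologue
might then contain:

@example
;; Top-level goal; more names could follow, separated by spaces.
%start      declaration
;; Goal used when parsing the body of functional code.
%scopestart codeblock
@end example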

@node Bovine Grammar Rules
@chapter Bovine Grammar Rules

The rules are what allow the compiler to create tags from a language
file. Once the setup is done in the prologue, you can start writing
rules. @inforef{Grammar Rules, ,grammar-fw}.

@example
@var{result} : @var{components1} @var{optional-semantic-action1}
       | @var{components2} @var{optional-semantic-action2}
       ;
@end example

@var{result} is a nonterminal, that is, a symbol synthesized in your grammar.
@var{components} is a list of elements that are to be matched if @var{result}
is to be made. @var{optional-semantic-action} is an optional sequence
of simplified Emacs Lisp expressions for concocting the parse tree.
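
To make the template concrete, here is a small hypothetical rule. The
names @code{SEMICOLON} and @code{statement} are invented for
illustration, and the fragment is only a sketch of the general shape:

@example
%token <punctuation> SEMICOLON "[;]"

statement : symbol SEMICOLON
            ( $1 )    ;; optional semantic action
          | symbol    ;; alternative with no action
          ;
@end example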

In Bison, each time an element of @var{components} is found, it is
@dfn{shifted} onto the parser stack (the stack of matched elements).
When all of the elements of @var{components} have been matched, they are
@dfn{reduced} to @var{result}. @xref{Algorithm,,, bison}.

A particular @var{result} written into your grammar becomes
the parser's goal. It is designated by a @code{%start} statement
(@pxref{Starting Rules}). The value returned by the associated
@var{optional-semantic-action} is the parser's result. It should be
a tree of @semantic{} @dfn{tags}, @inforef{Semantic Tags, ,
semantic-appdev}.

@var{components} is made up of symbols. A symbol such as @code{FOO}
means that a syntactic token of class @code{FOO} must be matched.

@menu
* How Lexical Tokens Match::
* Grammar-to-Lisp Details::
* Order of components in rules::
@end menu

@node How Lexical Tokens Match
@section How Lexical Tokens Match

A lexical rule must be used to define how to match a lexical token.

For instance:

@example
%keyword FOO "foo"
@end example

means that @code{FOO} is a reserved language keyword, matched as such
by looking it up in a keyword table, @inforef{keyword Decl,
,grammar-fw}. This is because @code{"foo"} will be converted to
@code{FOO} in the lexical analysis stage. Thus the symbol @code{FOO}
won't be available any other way.

If we specify our token in this way:

@example
%token <symbol> FOO "foo"
@end example

then @code{FOO} will match the string @code{"foo"} explicitly, but it
won't do so at the lexical level, allowing use of the text
@code{"foo"} in other forms of regular expressions.

In that case, @code{FOO} is a @code{symbol}-type token. To match, a
@code{symbol} must first be encountered, and then it must
@code{string-match "foo"}.

@table @strong
@item Caution:
Be especially careful to remember that @code{"foo"}, and more
generally the @code{%token}'s match-value string, is a regular expression!
@end table
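
Because the match value is tested with @code{string-match}, an
unanchored pattern also matches inside longer symbols. The following
sketch shows one way to restrict the match to the whole token text,
using the standard Emacs regular expression anchors for the start and
end of a string. The two declarations are alternatives, not meant to
appear together:

@example
;; Unanchored: also matches "foobar" and "snafoo".
%token <symbol> FOO "foo"

;; Anchored: matches only the exact text "foo".
%token <symbol> FOO "\\`foo\\'"
@end example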

Non-symbol tokens are also allowed. For example:

@example
%token <punctuation> PERIOD "[.]"

filename : symbol PERIOD symbol
         ;
@end example

@code{PERIOD} is a @code{punctuation}-type token that will explicitly
match one period when used in the above rule.

@table @strong
@item Please Note:
@code{symbol}, @code{punctuation}, etc., are predefined lexical token
types, based on the @dfn{syntax class}-character associations
currently in effect.
@end table

@node Grammar-to-Lisp Details
@section Grammar-to-Lisp Details

For the bovinator, lexical token matching patterns are @emph{inlined}.
When the grammar-to-Lisp converter encounters a lexical token
declaration of the form:

@example
%token <@var{type}> @var{token-name} @var{match-value}
@end example

it substitutes every occurrence of @var{token-name} in rules with its
expanded form:

@example
@var{type} @var{match-value}
@end example

For example:

@example
%token <symbol> MOOSE "moose"

find_a_moose: MOOSE
            ;
@end example

will generate this pseudo-equivalent rule:

@example
find_a_moose: symbol "moose"   ;; invalid syntax!
            ;
@end example

Thus, from the bovinator's point of view, the @var{components} part of a
rule is made up of symbols and strings. A string in the mix means
that the previous symbol must have the additional constraint of
exactly matching it, as described in @ref{How Lexical Tokens Match}.

@table @strong
@item Please Note:
For the bovinator, this task was mixed into the language definition to
simplify implementation, though Bison's technique is more efficient.
@end table

@node Order of components in rules
@section Order of components in rules

If a rule has multiple components, order is important. For example,

@example
headerfile : symbol PERIOD symbol
           | symbol
           ;
@end example

would match @samp{foo.h} or the @acronym{C++} header @samp{foo}.
The bovine parser will first attempt to match the long form, and then
the short form. If they were in reverse order, then the long form
would never be tested.
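
For contrast, here is a sketch of the reversed ordering described
above, in which the short form shadows the long one. It reuses the
@code{PERIOD} token declared earlier:

@example
headerfile : symbol                 ;; tried first, and it matches
           | symbol PERIOD symbol   ;; never tested
           ;
@end example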

@c @xref{Default syntactic tokens}.

@node Optional Lambda Expression
@chapter Optional Lambda Expressions

The @acronym{OLE} (@dfn{Optional Lambda Expression}) is converted into
a bovine lambda. This lambda has special shortcuts to simplify
reading the semantic action definition. An @acronym{OLE} like this:

@example
( $1 )
@end example

results in a lambda whose return value consists entirely of the string
or object found by matching the first element of the match (element
zero of the match list). An @acronym{OLE} like this:

@example
( ,(foo $1) )
@end example

executes @code{foo} on the first argument, and then splices its return
value into the return list, whereas:

@example
( (foo $1) )
@end example

executes @code{foo}, and places its return value in the return list as
a single element.
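
As an illustration, assume a hypothetical function @code{foo} that
returns the list @code{("a" "b")} for the matched element. Following
the description above, the two forms would then produce:

@example
( ,(foo $1) )   ;; spliced: ( "a" "b" )
( (foo $1) )    ;; placed:  ( ("a" "b") )
@end example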

Here are other things that can appear inline:

@table @code
@item $1
The first object matched.

@item ,$1
The first object spliced into the list (assuming it is a list from a
nonterminal).

@item '$1
The first object matched, placed in a list, i.e., @code{( $1 )}.

@item foo
The symbol @code{foo} (exactly as displayed).

@item (foo)
A function call to @code{foo} which is stuck into the return list.

@item ,(foo)
A function call to @code{foo} which is spliced into the return list.

@item '(foo)
A function call to @code{foo} which is stuck into the return list in a list.

@item (EXPAND @var{$1} @var{nonterminal} @var{depth})
A list starting with @code{EXPAND} performs a recursive parse on the
token passed to it (represented by @samp{$1} above). The
@dfn{semantic list} is a common token to expand, as there are often
interesting things in the list. The @var{nonterminal} is a symbol in
your table which the bovinator will start with when parsing.
@var{nonterminal}'s definition is the same as any other nonterminal.
@var{depth} should be at least @samp{1} when descending into a
semantic list.

@item (EXPANDFULL @var{$1} @var{nonterminal} @var{depth})
This is like @code{EXPAND}, except that the parser will iterate over
@var{nonterminal} until there are no more matches, the same way the
parser iterates over the starting rule (@pxref{Starting Rules}). This
lets you have much simpler rules in this specific case, and also lets
you have positional information in the returned tokens, and error
skipping.

@item (ASSOC @var{symbol1} @var{value1} @var{symbol2} @var{value2} @dots{})
This is used for creating an association list. Each @var{symbol} is
included in the list if the associated @var{value} is non-@code{nil}.
While the items are all listed explicitly, the created structure is an
association list of the form:

@example
((@var{symbol1} . @var{value1}) (@var{symbol2} . @var{value2}) @dots{})
@end example

@item (TAG @var{name} @var{class} [@var{attributes}])
This creates one tag in the current buffer.

@table @var
@item name
Is a string that represents the tag in the language.

@item class
Is the kind of tag being created, such as @code{function}, or
@code{variable}, though any symbol will work.

@item attributes
Is an optional set of labeled values such as @w{@code{:constant-flag t :parent
"parenttype"}}.
@end table

@item (TAG-VARIABLE @var{name} @var{type} @var{default-value} [@var{attributes}])
@itemx (TAG-FUNCTION @var{name} @var{type} @var{arg-list} [@var{attributes}])
@itemx (TAG-TYPE @var{name} @var{type} @var{members} @var{parents} [@var{attributes}])
@itemx (TAG-INCLUDE @var{name} @var{system-flag} [@var{attributes}])
@itemx (TAG-PACKAGE @var{name} @var{detail} [@var{attributes}])
@itemx (TAG-CODE @var{name} @var{detail} [@var{attributes}])
Create a tag with @var{name} of respectively the class
@code{variable}, @code{function}, @code{type}, @code{include},
@code{package}, and @code{code}.
@inforef{Creating Tags, , semantic-appdev}, for the Lisp
functions these translate into.
@end table
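
The following sketch combines several of these forms. All token and
rule names are hypothetical, the @code{semantic-list} token type and
its pattern are assumptions, and a real grammar would typically need
additional rules to consume the parentheses and separators found inside
the expanded list:

@example
%token <semantic-list> PAREN_BLOCK "^("

;; A variable such as "int x" becomes a variable tag.
variable : symbol symbol
           (TAG-VARIABLE $2 $1 nil)
         ;

;; A function such as "int f (...)": the argument list is a
;; semantic list, parsed again with `variable' as the goal.
function : symbol symbol PAREN_BLOCK
           (TAG-FUNCTION $2 $1 (EXPANDFULL $3 variable 1))
         ;
@end example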

If the declaration @code{%quotemode backquote} is specified, then use
@code{,@@} to splice a list in, and @code{,} to evaluate the expression.
This lets you send @code{$1} as a symbol into a list instead of having
it expanded inline.
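
A sketch of an action written for this mode follows. The nonterminals
@code{name-list} and @code{value}, the token @code{EQUAL}, and the
function @code{foo} are all hypothetical:

@example
%quotemode backquote

pair : name-list EQUAL value
       ( ,@@$1 ,(foo $3) )   ;; splice $1, evaluate (foo $3)
     ;
@end example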

@node Bovine Examples
@chapter Examples

The rule:

@example
any-symbol: symbol
          ;
@end example

is equivalent to

@example
any-symbol: symbol
            ( $1 )
          ;
@end example

which, if it matched the string @samp{"A"}, would return

@example
( "A" )
@end example

If this rule were used like this:

@example
%token <punctuation> EQUAL "="
@dots{}
assign: any-symbol EQUAL any-symbol
        ( $1 $3 )
      ;
@end example

it would match @samp{"A=B"}, and return

@example
( ("A") ("B") )
@end example

The letters @samp{A} and @samp{B} come back in lists because
@samp{any-symbol} is a nonterminal, not an actual lexical element.

To get a better result with nonterminals, use @code{,} to splice lists
in, like this:

@example
%token <punctuation> EQUAL "="
@dots{}
assign: any-symbol EQUAL any-symbol
        ( ,$1 ,$3 )
      ;
@end example

which would return

@example
( "A" "B" )
@end example

@node GNU Free Documentation License
@appendix GNU Free Documentation License

@include fdl.texi

@node Index
@unnumbered Index
@printindex cp

@iftex
@contents
@summarycontents
@end iftex

@bye

@c Following comments are for the benefit of ispell.

@c LocalWords: bovinator inlined