JAVA TREEBUILDER | SPECIAL EDITION FOR GJ DOCUMENTATION
JTB GJ1.1.2 is a special version of JTB for
GJ. It generates syntax trees that work with GJ.
By using gj.util instead of java.util, all the classes
generated by JTB GJ1.1.2 are type-casting free. You can go to JTB
Home for the standard JTB that works with javac.
Overview of Generated Files

To begin using JTB GJ, simply run it with your grammar file as an argument (a list of command-line parameters can be obtained by running JTB GJ without any arguments). This will generate the file jtb.out.jj and the directories syntaxtree, visitor, and gj.
Let's take a look at all the files and directories JTB GJ generates.
jtb.out.jj

This file is the same as the input grammar file except that it now contains code for building the syntax tree during parsing. Typically, this file can be left alone after generation. The only thing that needs to be done to it is to run it through JavaCC to generate your parser.

syntaxtree

This directory contains syntax tree node classes generated based on the productions in your JavaCC grammar. Each production will have its own class. If your grammar contains 97 productions, this directory will contain 97 classes (plus the special automatically generated nodes--these will be discussed later), with names corresponding to the left-hand side names of the productions. Like jtb.out.jj, after generation these files don't need to be edited. Generate them once, compile them once, and forget about them.

Let's examine one of the classes generated from a production. Take, for example, the following production (from the Java1.1.jj grammar):

    void ImportDeclaration() :
    {}
    {
       "import" Name() [ "." "*" ] ";"
    }

This production will generate the file ImportDeclaration.java in the directory (and package) syntaxtree. This file will look like this:

    //
    // Generated by JTB GJ1.1
    //

    package syntaxtree;

    /**
     * Grammar production:
     * f0 -> "import"
     * f1 -> Name()
     * f2 -> [ "." "*" ]
     * f3 -> ";"
     */
    public class ImportDeclaration implements Node {
       public NodeToken f0;
       public Name f1;
       public NodeOptional f2;
       public NodeToken f3;

       public ImportDeclaration(NodeToken n0, Name n1, NodeOptional n2, NodeToken n3) {
          f0 = n0;
          f1 = n1;
          f2 = n2;
          f3 = n3;
       }

       public ImportDeclaration(Name n0, NodeOptional n1) {
          f0 = new NodeToken("import");
          f1 = n0;
          f2 = n1;
          f3 = new NodeToken(";");
       }

       public void accept(visitor.Visitor v) {
          v.visit(this);
       }
       public <R,A> R accept(visitor.GJVisitor<R,A> v, A argu) {
          return v.visit(this,argu);
       }
       public <R> R accept(visitor.GJNoArguVisitor<R> v) {
          return v.visit(this);
       }
       public <A> void accept(visitor.GJVoidVisitor<A> v, A argu) {
          v.visit(this,argu);
       }
    }

Let us now examine this file from the top down. The first set of comments shows which version of JTB created this file. The second group of comments is for your benefit, showing the names of the fields of this class (the children of the node) and which parts of the original production they represent. All parts of a production are represented in the tree, including tokens.

Notice that this class is in the package "syntaxtree". Separating the generated tree node classes into their own package greatly simplifies file organization, particularly when the grammar contains a large number of productions. If the grammar is stable and not subject to change, once these classes are generated and compiled, it's not necessary to pay them any more attention. All of the work is done in the visitor classes.

Next you'll note that this class implements an interface named Node.
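This is one of eight tree node classes and interfaces automatically generated
for every grammar: the interfaces Node and NodeListInterface, and the classes
NodeChoice, NodeList, NodeListOptional, NodeOptional, NodeSequence, and NodeToken.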
These will be discussed in greater detail below.

Next come the member variables of the ImportDeclaration class. These are generated based on the RHS of the production. Their types depend on the items in the RHS, and their names begin with f0 and work their way up. You may be wondering why these variables are declared public. Since the visitors that must access these fields reside in a different package from the syntax tree nodes, package visibility cannot be used. We decided that breaking encapsulation was a necessary evil in this case.

The next portion of the generated class is the standard constructor. It is called from the tree-building actions in the annotated grammar, so you will probably not need to use it. Following the first constructor is a convenience constructor with the constant tokens of the production already filled in by the appropriate NodeToken. This constructor's purpose is to help in manual construction of syntax trees.

After the constructors are the accept() methods. These methods are the way in which visitors interact with the node classes. void accept(visitor.Visitor v) works with Visitor, R accept(visitor.GJVisitor<R,A> v, A argu) works with GJVisitor, and so on.
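
The way accept() and visit() cooperate can be sketched in a few lines. The following is a minimal, self-contained illustration, not the generated code: the nested types are simplified stand-ins written for this example (the production is reduced to three tokens, and only the GJVisitor-style accept() is shown), and the Printer visitor and render() helper are hypothetical names.

```java
class AcceptSketch {
    // Stand-in for visitor.GJVisitor<R,A>: one visit() overload per node class.
    interface GJVisitor<R, A> {
        R visit(NodeToken n, A argu);
        R visit(ImportDeclaration n, A argu);
    }

    // Stand-in for syntaxtree.Node, reduced to a single accept() variant.
    interface Node {
        <R, A> R accept(GJVisitor<R, A> v, A argu);
    }

    static class NodeToken implements Node {
        final String tokenImage;
        NodeToken(String s) { tokenImage = s; }
        // accept() simply calls back the overload that matches this class.
        public <R, A> R accept(GJVisitor<R, A> v, A argu) { return v.visit(this, argu); }
    }

    // Simplified: the name is a single token and the optional ".*" is omitted.
    static class ImportDeclaration implements Node {
        final NodeToken f0, f1, f2;
        ImportDeclaration(NodeToken n0, NodeToken n1, NodeToken n2) { f0 = n0; f1 = n1; f2 = n2; }
        public <R, A> R accept(GJVisitor<R, A> v, A argu) { return v.visit(this, argu); }
    }

    // Hypothetical visitor: returns the source text; argu is the separator after "import".
    static class Printer implements GJVisitor<String, String> {
        public String visit(NodeToken n, String sep) { return n.tokenImage; }
        public String visit(ImportDeclaration n, String sep) {
            return n.f0.accept(this, sep) + sep + n.f1.accept(this, sep) + n.f2.accept(this, sep);
        }
    }

    static String render(String sep) {
        Node root = new ImportDeclaration(
            new NodeToken("import"), new NodeToken("java.util.List"), new NodeToken(";"));
        return root.accept(new Printer(), sep);
    }

    public static void main(String[] args) {
        System.out.println(render(" "));
    }
}
```

Because the return type R and the argument type A are type parameters of the visitor, the caller gets a String back without any casts.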
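visitor

This directory is where the visitors you write can be placed. JTB GJ automatically generates four sets of visitors, each made up of an interface and a default depth-first implementation:

    Visitor and DepthFirstVisitor
    GJVisitor and GJDepthFirst
    GJNoArguVisitor and GJNoArguDepthFirst
    GJVoidVisitor and GJVoidDepthFirst
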
With regard to DepthFirstVisitor, our intent is for the programmer to override only those methods for which specific actions must be performed. For example, in a visitor that simply counts the number of assignment statements in a Java source file, only the method visit(Assignment n) would need to be overridden. Continuing our example above, here is the visit(ImportDeclaration n) method of class DepthFirstVisitor:

    /**
     * f0 -> "import"
     * f1 -> Name()
     * f2 -> [ "." "*" ]
     * f3 -> ";"
     */
    public void visit(ImportDeclaration n) {
       n.f0.accept(this);
       n.f1.accept(this);
       n.f2.accept(this);
       n.f3.accept(this);
    }

The comments above each visit method are for the programmer's benefit, showing which field corresponds to which part of the production. In this example n.f0 is a reference to one of the automatically generated classes, NodeToken. n.f1 refers to a nonterminal of type Name. n.f2 refers to a NodeOptional which stores a NodeSequence (more on this later). n.f3 refers to another NodeToken.

The other three sets of default visitors provide support for different visit() methods, matching the accept() methods of the node classes:

    GJDepthFirst: R visit(ImportDeclaration n, A argu)
    GJNoArguDepthFirst: R visit(ImportDeclaration n)
    GJVoidDepthFirst: void visit(ImportDeclaration n, A argu)

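
The idea of overriding only the methods you care about can be sketched as follows. This is a self-contained illustration under stated assumptions: the nested Visitor, DepthFirstVisitor, Assignment and Block types are tiny stand-ins written for this example (Block is a hypothetical node that just holds children), not the classes JTB GJ generates, and AssignmentCounter is a hypothetical user-written visitor.

```java
import java.util.Arrays;
import java.util.List;

class CountSketch {
    // Stand-in for visitor.Visitor with two node types only.
    interface Visitor {
        void visit(Assignment n);
        void visit(Block n);
    }

    interface Node { void accept(Visitor v); }

    static class Assignment implements Node {
        public void accept(Visitor v) { v.visit(this); }
    }

    // Hypothetical container node standing in for any production with children.
    static class Block implements Node {
        final List<Node> f0;
        Block(Node... children) { f0 = Arrays.asList(children); }
        public void accept(Visitor v) { v.visit(this); }
    }

    // Default behaviour: walk the whole tree and do nothing else.
    static class DepthFirstVisitor implements Visitor {
        public void visit(Assignment n) { }
        public void visit(Block n) { for (Node child : n.f0) child.accept(this); }
    }

    // Only visit(Assignment) is overridden; traversal is inherited.
    static class AssignmentCounter extends DepthFirstVisitor {
        int count = 0;
        @Override public void visit(Assignment n) { count++; }
    }

    static Node sampleTree() {
        return new Block(new Assignment(), new Block(new Assignment(), new Assignment()));
    }

    static int countAssignments(Node root) {
        AssignmentCounter counter = new AssignmentCounter();
        root.accept(counter);
        return counter.count;
    }

    public static void main(String[] args) {
        System.out.println(countAssignments(sampleTree()));
    }
}
```

The counter never mentions Block: the inherited depth-first traversal reaches every Assignment on its behalf.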
Note: All visitor classes must implement one of the Visitor, GJVisitor, GJNoArguVisitor, and GJVoidVisitor interfaces, either directly or by subclassing a class which does so (such as DepthFirstVisitor, GJDepthFirst, GJNoArguDepthFirst, or GJVoidDepthFirst).

gj

This directory contains the gj packages -- more specifically, gj.util and gj.lang. You may get a run-time error when you execute files compiled by gjc if the gj packages are not in your classpath. However, as long as the class containing your main() method is in the same directory as jtb.out.jj, you don't need to worry about the classpath of the gj packages. That is why JTB GJ generates this gj directory.

The Automatically Generated Classes

Six classes and two interfaces are automatically generated for every grammar file. The six classes are responsible for the various EBNF grammar constructs such as ( )+, ( )*, ( )?, etc.

Node

The interface Node is implemented by all syntax tree nodes. Node looks like this:

    public interface Node extends java.io.Serializable {
       public void accept(visitor.Visitor v);
       public <R,A> R accept(visitor.GJVisitor<R,A> v, A argu);
       public <R> R accept(visitor.GJNoArguVisitor<R> v);
       public <A> void accept(visitor.GJVoidVisitor<A> v, A argu);
    }

All tree node classes implement the accept() methods. In the case of all the automatically generated classes, each accept() method simply calls the corresponding visit(XXXX n) method (where XXXX is the name of the production) of the visitor passed to it. Note that the visit() methods are overloaded, i.e. the distinguishing feature is the argument each takes, as opposed to its name.

The Node interface extends java.io.Serializable, meaning
that you can serialize your trees (or subtrees) to an output stream
and read them back in. If you are not familiar with object serialization,
see the Java documentation on the java.io.Serializable interface.
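
As a rough illustration of what serializability buys you, the sketch below writes a node to a byte array and reads it back using the standard java.io object streams. The nested Node and NodeToken types are minimal stand-ins written for this example, and roundTrip() and roundTripImage() are hypothetical helper names; with real generated classes the same stream calls would apply to any tree root.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializeSketch {
    // Stand-in for syntaxtree.Node; only the Serializable part matters here.
    interface Node extends Serializable { }

    static class NodeToken implements Node {
        final String tokenImage;
        NodeToken(String s) { tokenImage = s; }
    }

    // Serializes a tree to memory and deserializes a fresh copy of it.
    static Node roundTrip(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    // Convenience wrapper that converts checked exceptions for easy calling.
    static String roundTripImage(String image) {
        try {
            return ((NodeToken) roundTrip(new NodeToken(image))).tokenImage;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripImage("import"));
    }
}
```

Replacing the byte array streams with file streams would store the tree on disk instead.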
NodeListInterface

The interface NodeListInterface is implemented by NodeList, NodeListOptional, and NodeSequence. NodeListInterface looks like this:

    public interface NodeListInterface extends Node {
       public void addNode(Node n);
       public Node elementAt(int i);
       public gj.util.Enumeration<Node> elements();
       public int size();
    }

You probably won't need to worry about this interface. It can be useful, though, when writing code which only deals with the Vector-like functionality of any of the three classes listed above.
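
A sketch of code written against the interface rather than a concrete list class is shown below. It is an approximation: java.util.Enumeration and java.util.Vector stand in for their gj.util counterparts, the nested types are simplified stand-ins written for this example, and join() and demo() are hypothetical helpers.

```java
import java.util.Enumeration;
import java.util.Vector;

class ListSketch {
    interface Node { }

    // Same four methods as the generated interface, using java.util.Enumeration.
    interface NodeListInterface extends Node {
        void addNode(Node n);
        Node elementAt(int i);
        Enumeration<Node> elements();
        int size();
    }

    static class NodeToken implements Node {
        final String tokenImage;
        NodeToken(String s) { tokenImage = s; }
        @Override public String toString() { return tokenImage; }
    }

    // Stand-in list class backed by a Vector.
    static class NodeList implements NodeListInterface {
        private final Vector<Node> nodes = new Vector<>();
        public void addNode(Node n) { nodes.addElement(n); }
        public Node elementAt(int i) { return nodes.elementAt(i); }
        public Enumeration<Node> elements() { return nodes.elements(); }
        public int size() { return nodes.size(); }
    }

    // Accepts any implementation of the interface; no cast is needed on the elements.
    static String join(NodeListInterface list) {
        StringBuilder sb = new StringBuilder();
        for (Enumeration<Node> e = list.elements(); e.hasMoreElements(); ) {
            sb.append(e.nextElement());
        }
        return sb.toString();
    }

    static String demo() {
        NodeList list = new NodeList();
        list.addNode(new NodeToken("."));
        list.addNode(new NodeToken("*"));
        return join(list) + "/" + list.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```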
NodeChoice is the class which JTB GJ uses to represent choice
points in a grammar. An example of this would be
|
Option | Description |
-h | Displays a help message including a table with brief descriptions of these options. |
-o NAME | Specifies the filename JTB should use to output the annotated grammar rather than use the default jtb.out.jj. |
-np NAME | Specifies the directory and package in which JTB should place the generated syntax tree classes, rather than the default syntaxtree. Note: for nested packages, JTB assumes the current directory is the one directly above the package stated. For example, if you used "-np=foo.bar.bletch", JTB will assume you are in the directory foo/bar and will generate a directory called bletch to store the node classes. |
-vp NAME | Specifies the directory and package in which JTB should place the generated visitor classes, rather than the default visitor. The above note for the -np option applies to this option as well. |
-p NAME | Shorthand for "-np NAME.syntaxtree -vp NAME.visitor". |
-si | Reads input from standard input (typically the keyboard) rather than an input grammar file. |
-w | JTB will no longer overwrite existing files. |
-e | Suppresses JTB semantic error checking. |
-jd | Generates JavaDoc-friendly comments in generated visitors and syntax tree classes. |
-f | Generates descriptive node class child field names such as whileStatement and nodeToken2 rather than f0, f1, etc. |
-ns NAME | Specifies the name of the class (e.g. mypackage.MyClass) that all node classes should subclass. This class must be supplied by the user. |
-pp | Generates parent pointers in all node classes as well as getParent() and setParent() methods. The parent reference of a given node will automatically be set when the node is passed to the constructor of another node. The root node's parent will be null. |
-tk | Stores special tokens into the parse tree. |
public void flushWriter() | Flushes the OutputStream or Writer that TreeDumper is using to output the syntax tree. |
public void printSpecials(boolean b) | Allows you to specify whether or not to print special tokens. |
public void startAtNextToken() | Starts the tree dumper on the line containing the next token visited. For example, if the next token begins on line 50 and the dumper is currently on line 1 of the file, it will set its current line to 50 and continue printing from there, as opposed to printing 49 blank lines and then printing the token. |
public void resetPosition() | Resets the position of the internal "cursor" to the first line and column. For example, if the internal cursor is at line twenty and the next token begins on line twenty-one, a single carriage return is output, then the token. If resetPosition() is called first, the internal cursor is reset to line 1, so twenty carriage returns are output, then the token. When using a dumper on a syntax tree more than once, you need to call either this method or startAtNextToken() between dumps. |

    root.accept(new DepthFirstVisitor() {
       public void visit(MethodDeclaration n) {
          dumper.startAtNextToken();
          n.f0.accept(dumper);
          n.f1.accept(dumper);
          n.f2.accept(dumper);
          n.f3.accept(dumper);
          // skip n.f4, the method body
          System.out.println();
       }
    });

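
The cursor behaviour described for startAtNextToken() and resetPosition() can be modelled with a toy class. This is only a sketch of the documented line-tracking behaviour, not TreeDumper's implementation: MiniDumper, printToken() and the helper methods are hypothetical names, columns are ignored, and output goes to a string buffer.

```java
class CursorSketch {
    // Toy model of a dumper that tracks only the current line.
    static class MiniDumper {
        private final StringBuilder out = new StringBuilder();
        private int curLine = 1;
        private boolean jumpToNextToken = false;

        void startAtNextToken() { jumpToNextToken = true; }
        void resetPosition() { curLine = 1; }

        // Emits line breaks until the cursor reaches the token's line, then the token.
        void printToken(int beginLine, String image) {
            if (jumpToNextToken) {
                curLine = beginLine;   // adopt the token's line, printing no blank lines
                jumpToNextToken = false;
            }
            while (curLine < beginLine) {
                out.append('\n');
                curLine++;
            }
            out.append(image);
        }

        int lineBreaks() {
            return (int) out.chars().filter(c -> c == '\n').count();
        }
    }

    // Line breaks printed before a first token on beginLine.
    static int blankLines(boolean useStartAtNextToken, int beginLine) {
        MiniDumper d = new MiniDumper();
        if (useStartAtNextToken) d.startAtNextToken();
        d.printToken(beginLine, "x");
        return d.lineBreaks();
    }

    // Cursor at line 20, resetPosition(), then a token on line 21.
    static int lineBreaksAfterReset() {
        MiniDumper d = new MiniDumper();
        d.startAtNextToken();
        d.printToken(20, "a");
        d.resetPosition();
        d.printToken(21, "b");
        return d.lineBreaks();
    }

    public static void main(String[] args) {
        System.out.println(blankLines(false, 50) + " " + blankLines(true, 50)
            + " " + lineBreaksAfterReset());
    }
}
```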
public TreeFormatter(int indentAmt, int wrapWidth) | Allows you to specify the number of spaces per indentation level and the number of columns per line, after which tokens are wrapped to the next line (the default constructor assumes an indentAmt of 3 and a wrapWidth of 0, i.e. no line wrapping). |
protected void add(FormatCommand cmd) | Use this method to add FormatCommands to the command queue to be executed when the next token in the tree is visited. |
protected FormatCommand force(int i) | A Force command inserts one or more line breaks and indents the next line to the current indentation level. Without an argument, adds just one line break. Use add(force()); |
protected FormatCommand indent() | An Indent command increases the indentation level by one or more. Without an argument, just adds one indent level. Use add(indent()); |
protected FormatCommand outdent() | An Outdent command is the reverse of the Indent command: it reduces the indentation level. Use add(outdent()); |
protected FormatCommand space() | A Space command simply adds one or more spaces between tokens. Without an argument, adds just one space. Use add(space()); |
protected void processList(NodeListInterface n, FormatCommand cmd) | Visits each element of a NodeList, NodeListOptional, or NodeSequence and inserts an optional FormatCommand between each element (but not after the last one). |

    /**
     * f0 -> [ PackageDeclaration() ]
     * f1 -> ( ImportDeclaration() )*
     * f2 -> ( TypeDeclaration() )*
     * f3 -> <EOF>
     */
    public void visit(CompilationUnit n) {
       if ( n.f0.present() ) {
          n.f0.accept(this);
          add(force(2));
       }

       if ( n.f1.present() ) {
          processList(n.f1, force());
          add(force(2));
       }

       if ( n.f2.present() ) {
          processList(n.f2, force(2));
          add(force());
       }

       n.f3.accept(this);
    }

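
The "between each element, but not after the last one" rule of processList() can be sketched with plain strings. This is an illustrative model only: the elements and the separator are strings standing in for tree nodes and a FormatCommand, and the class and demo() helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class ProcessListSketch {
    // Emits every element, placing the optional separator between neighbours only.
    static String processList(List<String> elements, String separator) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < elements.size(); i++) {
            out.append(elements.get(i));
            boolean isLast = (i == elements.size() - 1);
            if (separator != null && !isLast) {
                out.append(separator);
            }
        }
        return out.toString();
    }

    // Two import lines separated by one line break, like processList(n.f1, force()).
    static String demo() {
        return processList(Arrays.asList("import a.B;", "import c.D;"), "\n");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the CompilationUnit example above, this is why an explicit add(force(2)) follows each processList() call: the list itself leaves nothing after its last element.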
Message | Description |
Production "SomeProduction" has the same name as a JTB-generated class. | A production within the input grammar has a name which is reserved by JTB, such as Node, NodeList, etc. |
Message | Description |
Javacode block must be specially handled. | See the Known Issues section of the Release Notes page. |
Non-void return type in SomeProduction(). | All productions in a grammar on which JTB is to be used should have a return type of void. JTB replaces all return types in the grammar upon processing. |
Block of Java code in SomeProduction(). | A production contains a block of embedded Java code. While it's possible this may not cause problems, the Java code could interact or interfere with the code JTB inserts into the grammar. A JTB grammar should ideally contain no embedded Java code. |
Extra parentheses in SomeProduction(). | A production contains extraneous parentheses (i.e. not enclosing a choice or followed by "*", "+", or "?"). This formerly caused JTB to misbehave, but it has been corrected for 1.1 (see the section on NodeSequence). However, to be safe, we still flag this so you are aware of it in case any lingering bugs remain. |
Maintained by Wanjun Wang, wanjun@purdue.edu. | Created September 4, 1997.
Last modified May 20, 2000. |