JJTree
JJTree is a parse tree builder written by the same folks who gave us JavaCC
and is included with the JavaCC distribution. JTB is similar in function
to JJTree; however, there are significant differences both in how each
tool works and in how you program against the tree each one generates.
Some of these differences are listed below:
- JTB's design concentrates on simplicity and ease of use, while JJTree
  provides greater flexibility at the cost of added complexity and
  development time.
- JTB's tree node classes implement the Visitor design pattern as
  described in the book Design Patterns: Elements of Reusable
  Object-Oriented Software. See the page entitled "Why Visitors?" for
  information on the benefits of this design pattern. Note that this
  functionality has recently been added to JJTree.
- JTB's nodes preserve type information by storing references to the actual
  types of their children, whereas tree nodes created by JJTree store their
  children as a Vector of nodes. This eliminates any ambiguity in the
  tree structure as well as the type casts that might be necessary to access
  specific node types in a JJTree syntax tree.
- JTB operates on a standard, unmodified JavaCC grammar file.
  This approach presents several advantages in our eyes:
  - A gentler learning curve, since no additional grammar notation needs to
    be learned.
  - No additional work needs to be done on a grammar file to prepare it for
    processing with JTB.
  - A grammar will only have one possible tree structure. Any programmer
    who sees a JavaCC grammar file will immediately know the structure of
    the JTB tree.
  Although we consider the advantages to be important enough to take this
  approach, there are also possible disadvantages:
  - The tree generated by JTB may not be as flexible as the one generated
    by JJTree. For example, with JTB, all nonterminals generate a class.
    With JJTree, certain productions can be flagged so as not to generate a
    node class.
  - A JTB tree will require more memory than a JJTree tree in which certain
    nonterminals have been suppressed from generating nodes.
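The difference in how children are stored can be illustrated with a small hand-written sketch. This is not output from either tool: the class names (`UntypedNode`, `Identifier`, `Assignment`) and the toy production are hypothetical, and the classes each tool actually generates will differ in detail.

```java
import java.util.Vector;

// Hand-written illustration only; these classes are assumptions made for
// this comparison and are not generated by JJTree or JTB.
class ChildStorageDemo {

    // Vector-style storage: children are kept in an untyped collection.
    static class UntypedNode {
        final String label;
        final Vector<Object> children = new Vector<>();
        UntypedNode(String label) { this.label = label; }
    }

    // Typed storage: one field per child of the toy production
    //   Assignment ::= Identifier "=" Identifier
    static class Identifier {
        final String name;
        Identifier(String name) { this.name = name; }
    }

    static class Assignment {
        final Identifier target;
        final Identifier source;
        Assignment(Identifier target, Identifier source) {
            this.target = target;
            this.source = source;
        }
    }

    // With a Vector, the caller must know the child's position and cast it.
    static String targetOfUntyped(UntypedNode assignment) {
        UntypedNode first = (UntypedNode) assignment.children.elementAt(0);
        return first.label;
    }

    // With typed fields, the compiler checks both the position and the type.
    static String targetOfTyped(Assignment assignment) {
        return assignment.target.name;
    }

    public static void main(String[] args) {
        UntypedNode untyped = new UntypedNode("Assignment");
        untyped.children.addElement(new UntypedNode("x"));
        untyped.children.addElement(new UntypedNode("y"));

        Assignment typed =
            new Assignment(new Identifier("x"), new Identifier("y"));

        System.out.println(targetOfUntyped(untyped));
        System.out.println(targetOfTyped(typed));
    }
}
```

In the untyped version, a wrong index or a wrong cast only fails at run time; in the typed version, the same mistake would not compile.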
Like JJTree, JTB takes a JavaCC grammar file as input and outputs syntax
tree node classes as well as an annotated grammar file which contains code
to build the tree during parse. JTB also generates a default Visitor
class whose methods visit the tree depth first.
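A default depth-first visitor of the kind described above might look roughly like the following. This is a minimal hand-written sketch of the pattern, not code generated by JTB: the node classes (`Num`, `Sum`), the `accept` method, and the visitor names are all assumptions chosen for illustration.

```java
// Hand-written sketch of the Visitor pattern with a default depth-first
// traversal. All class and method names here are illustrative assumptions.
class DepthFirstDemo {

    interface Node {
        void accept(Visitor v);
    }

    interface Visitor {
        void visit(Sum n);
        void visit(Num n);
    }

    static class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public void accept(Visitor v) { v.visit(this); }
    }

    static class Sum implements Node {
        final Node left;
        final Node right;
        Sum(Node left, Node right) { this.left = left; this.right = right; }
        public void accept(Visitor v) { v.visit(this); }
    }

    // Default visitor: each visit method does nothing but visit the children.
    static class DepthFirstVisitor implements Visitor {
        public void visit(Sum n) {
            n.left.accept(this);
            n.right.accept(this);
        }
        public void visit(Num n) {
            // Leaf node: no children to visit.
        }
    }

    // A task-specific visitor overrides only the method it cares about and
    // inherits the traversal of every other node type.
    static class NumCounter extends DepthFirstVisitor {
        int count = 0;
        @Override
        public void visit(Num n) { count++; }
    }

    static int countNums(Node root) {
        NumCounter counter = new NumCounter();
        root.accept(counter);
        return counter.count;
    }

    public static void main(String[] args) {
        Node tree = new Sum(new Sum(new Num(1), new Num(2)), new Num(3));
        System.out.println(countNums(tree));
    }
}
```

Because `NumCounter` extends the default visitor, it never has to spell out how to walk a `Sum`; only the node type where work happens needs its own method.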
JJTree: A Hands-On Comparison
I recently attempted to compare the two tools further by using each of
them to perform a small sample task: finding undeclared variables in a
grammar for a small subset of Scheme. This example can be downloaded from
the Examples page.
In all fairness to JJTree, note that I am not a JJTree expert.
There may be ways of doing things which I was not aware of. The JJTree
examples included with JavaCC which I used as a study model probably didn't
cover the whole spectrum of the tool's capabilities as well as tricks,
workarounds, and good JJTree programming style. In addition,
working on a larger project would probably reveal more advantages and disadvantages
of using JJTree.
For this comparison, I used JJTree 0.3pre3 (included with JavaCC 0.7pre5)
and JTB 1.0. My attempt at objective observation produced the following:
| Observation | JJTree | JTB |
| Preparation Time | High. | None. |
| | The grammar had to be analyzed and the parse tree planned. A mistake in planning the parse tree could result in much time wasted if discovered after visitor programming has begun. | JTB's parse tree is fixed and cannot be manually altered. |
| | The grammar file also had to be annotated according to the planned parse tree. This in turn had to be tested and debugged (for which I used the dump() method of Node). | The bare grammar can be used without modification. |
| Visitor Programming | About equal to JTB. | About equal to JJTree. |
| | Slightly more time had to be spent providing visit() methods which did nothing but visit certain nodes' children. | JTB automatically provides a visitor whose methods visit each node's children. Extending this class allows you to override only the necessary visit() methods in which some task needs to be performed. |
| Parse Tree Nodes | The working directory got very cluttered after generating the tree classes. There is no way to automatically generate tree classes into a specified directory. | JTB automatically generates the syntax tree nodes into a package and subdirectory called "syntaxtree", keeping the grammar's directory clean. |
| | The generated node classes had to be modified, which is common when dealing with parse trees. This, however, gets somewhat messy when programming for a grammar with hundreds of potential tree node classes. | JTB's parse tree classes don't need to be edited. Once generated, they should be left alone. All work should be done within visitors. Should certain node classes require data to be stored in them, we have found that a good alternative solution is to store the data in a Hashtable (using the object itself as the hash key) which can be passed from visitor to visitor as needed. |
| | Children are stored in a vector. There is no way to guarantee the type or number of children in a node. | JTB stores actual references to its children's types. Children of nodes will always be where you expect them. |
| Tokens | Special productions and classes are needed to store tokens in the parse tree. | JTB automatically stores all tokens in the tree. Advantage: All tokens are readily available for use in the visitors. No additional work needs to be done to store them. Disadvantage: Memory is wasted on unneeded tokens. |
| General Observations | I had to continuously refer back to my annotated grammar file to see how I had arranged the parse tree. | JTB inserts helpful comments above visit() methods, indicating precisely which children correspond to what part of the production. In addition, since the tree will always look the same for a given grammar, there is less ambiguity. |
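The Hashtable technique mentioned in the table, where data is stored beside the tree and keyed by the node object itself, can be sketched as follows. This is a hand-written illustration under stated assumptions: the `Identifier` class stands in for a generated node, and the two static methods stand in for two separate visitors.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

// Hand-written sketch; the node class and both passes are hypothetical
// stand-ins for generated nodes and visitors.
class SideTableDemo {

    // Stand-in for a generated node class. It is never edited to hold data,
    // and it does not override equals/hashCode, so every node object is a
    // distinct key even when two nodes carry the same name.
    static class Identifier {
        final String name;
        Identifier(String name) { this.name = name; }
    }

    // First pass: record, per node, whether its name has been declared.
    static void markDeclared(List<Identifier> uses, List<String> declared,
                             Hashtable<Identifier, Boolean> table) {
        for (Identifier id : uses) {
            table.put(id, Boolean.valueOf(declared.contains(id.name)));
        }
    }

    // Second pass: read the data back, using the node itself as the key.
    static List<String> undeclaredNames(List<Identifier> uses,
                                        Hashtable<Identifier, Boolean> table) {
        List<String> result = new ArrayList<>();
        for (Identifier id : uses) {
            if (!table.get(id).booleanValue()) {
                result.add(id.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Identifier> uses = new ArrayList<>();
        uses.add(new Identifier("x"));
        uses.add(new Identifier("y"));

        List<String> declared = new ArrayList<>();
        declared.add("x");

        // The table travels from one pass to the next; the nodes stay untouched.
        Hashtable<Identifier, Boolean> table = new Hashtable<>();
        markDeclared(uses, declared, table);
        System.out.println(undeclaredNames(uses, table));
    }
}
```

The trade-off is one extra lookup per access in exchange for node classes that can be regenerated at any time without losing hand edits.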
ANTLR
ANTLR is a tool descended from the Purdue Compiler Construction Tool Set
(a.k.a. PCCTS). It is not a tool for use with JavaCC; rather, it is
a set of tools amongst which are a Java parser generator and a tree builder.
I have never used ANTLR or PCCTS so I cannot comment on them. However,
you can find the tool at http://java.magelang.com/antlr.
NewJacc
According to the NewJacc documentation, NewJacc is "a parser generator
system built upon Sun Microsystems JavaCC tool and the Purdue University
Java Tree Builder (JTB) tool. NewJacc's principle extension is to provide
users with a way to associate rewrite rules with individual productions
in the language grammar. These rules are used to describe how the parse
tree should be traversed. Users can easily control what action is performed
at each node in the tree during their traversals. This provides users with
great leverage in the construction of a variety of source to source translation
tools."
I have not used NewJacc and am not thoroughly familiar with it, but
I have tinkered with some of the examples provided with it, and it does
look like a very interesting tool. NewJacc can be found at http://hopper.cs.wvu.edu/software.html.