JavaCFG - Installation and Usage
- After unzipping or gunzipping and untar'ing, a
directory called "JavaCFG" will be created,
and you should cd into it.
- Create the Java bytecode with the command
javac CFGParse.java
- Write a grammar file such as
the racecar grammar
or sample from some examples provided in the
grammars directory.
- attempt to parse an input-string
with the command
java CFGParse grammar-file-path input-string
- if input-string is generated by the grammar,
two trees are created:
- the first tree is represented by a file system style tree
that is viewable and clickable in a Java window.
Directory icons (
)
represent variables in the grammar,
while file icons (
)
represent terminals. You can see which
productions were used in the derivation tree by opening
up the directories to reveal the right hand side of the
corresponding production.
- the second tree is represented by a LaTeX file called
"input-string.tex" which contains information
for drawing the derivation tree in a more standard fashion
than the first representation
- if input-string is not generated by the grammar
this fact is reported in a window created by the program. No
LaTeX file is created in this case.
- If you want convert the LaTeX tree into PDF
you'll need LaTeX installed. The command
pdflatex filename.tex
creates a file called "filename.pdf"
which contains a picture of the derivation tree.
Grammar Files
Grammars are stored as text files
which should end with the .cfg extension. They are
specified as follows:
- Each line consists of white-space separated strings which represent
productions from a common variable
- The first string of any line must have length 1 and is a variable
- Subsequent strings on a given line are the right hand sides
of productions from the variable for that line
- The first variable on the first line is the start variable
- Any character which isn't a variable, is a terminal
- THE FOLLOWING IS NOT RECOMMENDED (EARLEY'S PARSING ALGORITHM MAKES
NO GUARANTEES ABOUT EPSILONS):
A single double-quote symbol " represents the empty
string ε
(we think of " as two single quotes with nothing in between)
For example the language
{anbn | n > 0}
∪
{bnan | n > 0}
has the context free grammar:
S → X | Y
X → aXb | ab
Y → bYa | ba
which is represented in the grammar file by:
S X Y
X aXb ab
Y bYa ba
Directory Structure and Source Code
After untaring or unzipping you will find the following files
- CFG.java
- A class the defines the context free grammar object used by JavaCFG.
Embedded in this class is some additional functionality,
added by
Yoav Hirsch for de-epsilonizing, and converting grammars into
Chomsky Normal Form. Instructions for how to call on the conversion
algorithms is embedded in the file.
- CFGParse.java
-
- A class created for simplifying the parsing command.
- EarleyAgent.java
- A class used in the parsing algorithm.
- EarleyRule.java
- A class used in the parsing algorithm
- MalformedGrammarException.java
-
An exception class used to let the user
know that their grammar has not been
defined properly, has a typo, or
is not the right type for the given operation.
- Rule.java
- A class which defines the productions of a CFG.
- TreeModel2LaTeX.java
- A class for converting JTree objects into LaTeX
trees.
- grammars
- A directory containing some examples of grammars
NOTE: some of these grammars contain epsilons
and that the success of the parsing algorithm
on such grammars is not guaranteed.
- instructions.html
- This file
Last modified: Sun Oct 10 04:06:45 2004