CS W3139-02 Homework 3 Solutions



Non-Programming Problems

  1. The resulting tree is drawn below:
                                           _______
                                          | o L o |
                                           -------
                                      /               \
                                  /                       \
                              /                               \
                     _______                                    _______
                    | o H o |                                  | o P o |
                     -------                                    -------
                  /           \                              /           \
        _______                  _______              _______             _______
       | o D o |                | o J o |            | o N o |           | o R o |
        -------                  -------              -------             -------
        /     \                  /     \              /     \             /     \
 ___________   ___________   _______   _______   _______   _______   _______   _______
| o A o B o | | o E o F o | | o I o | | o K o | | o M o | | o O o | | o Q o | | o S o | 
 -----------   -----------   -------   -------   -------   -------   -------   -------
  1. Exercise 5.1

  2. Exercise 5.2

    When rehashing, we choose a table size that is roughly twice as large and prime. In our case, the appropriate new table size is 19, with hash function h(x) = x(mod 19).

  3. Exercise 6.2

  4. Exercise 6.3

    1. The result of three deleteMins, starting with both the heaps in Exercise 6.2, is as follows:

      After the first deleteMin:

                 2
               /   \
              /     \
             /       \
            /         \
           3           4
          / \         / \
         /   \       /   \
        6     7     5     8
       / \   / \   / \   /
      15 14 12  9 10 11 13 
      

      After the second deleteMin:

                 3
               /   \
              /     \
             /       \
            /         \
           6           4
          / \         / \
         /   \       /   \
        13    7     5     8
       / \   / \   / \   
      15 14 12  9 10 11  
      

      After the third deleteMin (resulting heap):

                 4
               /   \
              /     \
             /       \
            /         \
           6           5
          / \         / \
         /   \       /   \
        13    7     10    8
       / \   / \   /    
      15 14 12  9 11  
      



    2. After the first deleteMin:
                 2
               /   \
              /     \
             /       \
            /         \
           3           4
          / \         / \
         /   \       /   \
        12    6     5     8
       / \   / \   / \   /
      15 14 9   7 10 11 13 
      

      After the second deleteMin:

                 3
               /   \
              /     \
             /       \
            /         \
           6           4
          / \         / \
         /   \       /   \
        12    7     5     8
       / \   / \   / \   
      15 14 9  13 10 11  
      

      After the third deleteMin (resulting heap):

                 4
               /   \
              /     \
             /       \
            /         \
           6           5
          / \         / \
         /   \       /   \
        12    7     10    8
       / \   / \   / 
      15 14 9  13 11
      


Programming Problems



Here are solutions to programming problems 1 and 2, provided by Changbin Wang:

PROBLEM 1

File README:

README

Data Structures and Algorithm Analysis in JAVA (W3139-02)
Professor: Jason Nieh

Homework #3: HuffmanCoding project

Submitted by: Changbin Wang

================================================================================
Files submitted:
--------------------------------------------------------------------------------
For the Huffman code project:
./HuffmanCoding:
	HuffmanCoding.java		--	Main class of the Huffman coding project
	HuffmanTree.java		--	Structure for holding Huffman tree
	HuffmanQueue.java		--	Priority queue for constructing Huffman tree
	HuffmanNode.java		--	Nodes for Huffman Tree
	CodeTable.java			--	Code table for storing the code value for chars
	CodeTableItem.java		--	Item for code table
	PrintStack.java			--	Print tool for word wrapping 
	Comparable.java(*)
	Overflow.java(*)
	result.log				--	Test runs

(* indicate that the class codes are copied directly from DataStructures package)

To compile: At CUNIX prompt(in ./HuffmanCoding directory): javac *.java

To run: At CUNIX prompt(in ./HuffmanCoding directory): java HuffmanCoding

Log File: Contains several test runs of the program
================================================================================
This project will create a Huffman code for any given message file.  This given
message file should be in ASCII data format.  The following function realized in
this project:

 1 -- Generate a frequency table for the given file by calculating the occurance
	  of characters in the file; The program will display the frequency table 
	  upon user selection;

 2 -- Construct a Huffman tree for the given file.  The frequency table will be 
	  convert into a priority queue according to the frequencies of characters; 
	  the priority queue will then be used to construct the Huffman tree. The 
	  program will display the Huffman tree upon user selection;

 3 -- Generate a code table by traversing the Huffman tree.  The code is generated
	  when traversing the tree recursively.  Note that in the code table, the code
	  of a character is stored as a char value, this value will be transformed 
	  into a BitSet bit pattern.  The user will be prompted to display the code
	  table;

 4 -- Generate the bit pattern (BitSet) of the input file.  The bit pattern of the
	  given file is generated from the code table and will be displayed upon user
	  selection;

 5 -- Given a bit pattern (BitSet) of the given file, the program will translate 
	  it back to its original message.  This function is useful when try to use
	  Huffman coding to compress ascii datafile. 
================================================================================
Note Points:

 a -- If the given datafile is not ended by a CRNL combination (EOF), the program 
	  will add such a combination to the Huffman tree (the file is not changed). 
	  An EOF is read as two distinct character: a carriage return and a new line 
	  feed, i.e., when calculating the character number, an EOF is taken as two
	  characters (although it is OK to take then as just one character).

 b -- The program will work for any file with different size. Characters with value
	  lower than 33 or more than 166 is NAMED as the following:

			ASCII		Name
			================
			  10	     NL
			  13		 CR
			  32		 SB
			 other		 CC

 c -- The EOF of a file will be displayed in the result as "EOF" or "" and 
	  will be placed in the frequency table with index 0.

File HuffmanCoding.java:

import java.io.*;
import java.util.BitSet;

/**
 * HuffmanCoding class, the main class for the Huffman coding project
 */
public class HuffmanCoding 
{
	public static void main (String [] args)
	{
		BufferedReader in = new BufferedReader(
			new InputStreamReader(System.in));

		try
		{
				// Prompt for file name and create a file object
			System.out.print("Please input the text file name: ");
			String fileName = in.readLine();
			System.out.println();
			File f = new File(fileName);

				// Exit the program if the file cannot be found
			if (!f.exists() )
			{
				System.err.println("Error: File not found!");
				System.exit(1);
			}

				// New instance of a HuffmanCoding class
			HuffmanCoding hc = new HuffmanCoding(f);

				// Print out the frequency table
			System.out.print("Print out the frequency table (y/n)?");
			String yOrn = in.readLine();
			System.out.println();

			if(yOrn.charAt(0) == 'Y' || yOrn.charAt(0) == 'y')
			{
				hc.printOriginMessage(f);

				System.out.println();

				hc.printFreqs();
			}

			System.out.println("");

			HuffmanTree ht = new HuffmanTree(hc.freqs, hc.charCounter);

			System.out.println("");

				// Print the Huffman tree
			System.out.print("Print out the Huffman tree (y/n)?");
			yOrn = in.readLine();
			System.out.println();

			if(yOrn.charAt(0) == 'Y' || yOrn.charAt(0) == 'y')
			{
				ht.printHuffmanTree();
			}

			System.out.println("");

			System.out.println("");

				// Print the code table
			System.out.print("Print out the code table(y/n)?");
			yOrn = in.readLine();
			System.out.println();

			if(yOrn.charAt(0) == 'Y' || yOrn.charAt(0) == 'y')
			{
				hc.printOriginMessage(f);

				ht.printCodeTable();
			}

			System.out.println("");

			System.out.println("");

				// Print the bit pattern of the input message
			System.out.print("Print the bit pattern for the input file (y/n)?");
			yOrn = in.readLine();
			System.out.println();

			BitSet bs = hc.encode(f, ht);

				// The logical size of a BitSet; used only for JDK1.2
			//int bssize = bs.length();

				// The logical size of a BitSet; used  for JDK1.1.*
			int bssize = CodeTable.lengthOf(bs);

			if(yOrn.charAt(0) == 'Y' || yOrn.charAt(0) == 'y')
			{
				System.out.println("Here is the coded message:\n");

				hc.printBitPattern(bs, 32);

				System.out.println("\n\n"+(f.length()+1)+" characters encoded in "+
					(bssize-1)+ " bits, "+ ((float)(bssize-1))/f.length() +
					" bits per char.");
			}

			System.out.println("");

			System.out.println("");

				// Translate the bit pattern to ascii message
			System.out.print("Translate the bit pattern back to its original form(y/n)?");
			yOrn = in.readLine();
			System.out.println();

			if(yOrn.charAt(0) == 'Y' || yOrn.charAt(0) == 'y')
			{
				System.out.println("DECODED MESSAGE:\n");

				String ds = hc.decode(bs, ht);

				System.out.println(ds);

				System.out.println();

				System.out.println("Note: <EOF> indicates the EOF of the input message file.");
			}

			System.out.println("");
		}
		catch (Exception e)				// Catch all the exceptions
		{
			System.out.println(e);
		}
	}

	/**
	 * Constructor for HuffmanCoding
	 * @param txtFile the file to be coded
	 */
	public HuffmanCoding(File txtFile)
	{
		this.txtFile = txtFile;	

		freqs = new int[256];

		for( int i = 0; i < 256; i++) freqs[i] = 0;

		charCounter = 0;

		try
		{
				// Get file size
			int fileSize = (int)txtFile.length();
			
				// Include EOF in the file size
			fileSize ++;
	
				// Open the fileinput stream
			FileInputStream in = new FileInputStream(txtFile);

			int index = -1;
			for(int i = 0; i < fileSize; i++)
			{
					// Read the file byte by byte
				index = (int)in.read();	

					// Read the EOF simbol and put into freqs[0]
				if (index == -1) 
				{
					if(freqs[0] == 0) charCounter ++;
					freqs[0] ++;
				}
				else if (index > 0 && index < 256)
				{
					if (freqs[index] == 0) charCounter ++;
						// Put the byte into the array
					freqs[index]++;							
				}
			}
				// Report erro if NULL appears in the file; should never happen
			if(freqs[0] != 1) System.out.println("There should not be a "
				+ "NULL character and only one EOF character in the file! "
				+ freqs[0]);
		}
		catch (IOException e)
		{
			System.out.println("Error: Cannot read file!");
		}
	}

	/**
	 * Print out the original message
	 */
	public void printOriginMessage(File f)
	{
		System.out.println("ORIGINAL INPUT MESSAGE:");
		try
		{
				// Get file size
			int fileSize = (int)f.length();	
			
				// Open the fileinput stream
			FileInputStream in = new FileInputStream(f);
			
			char [] b = new char[fileSize];

			for(int i=0;i<fileSize;i++) b[i] = (char)in.read();

				// Output the message in the file and a <EOF> indicating
				// the EOF character of the file
			System.out.print(String.copyValueOf(b, 0, fileSize)+"<EOF>");		
			
			System.out.println();
		}
		catch (IOException e)
		{
			System.out.println("Error: Cannot read file!");
		}
	}


	/**
	 * Print out the frequency table of the file
	 */
	public void printFreqs()
	{
		System.out.println("        Frequency Table       ");
		System.out.println("==============================");
		System.out.println("CHARACTER | ASCII | FREQUENCY ");
		System.out.println("==============================");

		for(int i = 0; i < 256; i++)
		{
			if(freqs[i] > 0)
			{
				char c = (char)i;
				String s = String.valueOf(c);

				if( i < 33 || i > 126)
				{
					if( i == 10 ) s = "NL";
					else if( i == 13 ) s = "CR";
					else if( i == 32 ) s = "SB";
					else if( i == 0 ) s = "EOF";
					else s = "CC";
				}
				String sp1 = space(3);
				String sp2;
				if( i > 0 )
					sp2 = space(13-lenOfInt(i)-s.length());
				else
					sp2 = space(13-5-s.length());
				String sp3 = space(9-lenOfInt(freqs[i]));
				if( i > 0 )
					System.out.print(sp1 + s + sp2 + i + sp3 + freqs[i]);
				else
					System.out.print(sp1 + s + sp2 +"-1(*)"+ sp3 + freqs[0]);
				System.out.println();
			}
		}
		System.out.println();
		System.out.println("* Not indicate the real ASCII code");
	}

	/**
	 * Return the BitSet of a text file
	 * @param f, the file to be coded
	 * @param ht, the Huffman tree for the file
	 */
	public BitSet encode( File f, HuffmanTree ht )
	{
		return ht.encode(f);
	}

	/**
	 * Return the ascii string for the bit pattern
	 * @param bs, the bitset
	 * @param ht, the Huffman tree
	 */
	public String decode( BitSet bs, HuffmanTree ht )
	{
		return ht.decode(bs);
	}

	/**
	 * Print the bitSet in a bit pattern
	 * @param bs, the BitSet
	 * @param lineWidth, the line width when printing with word wrapped manner
	 */
	public void printBitPattern( BitSet bs , int lineWidth)
	{
		PrintStack ps = new PrintStack(lineWidth);
			
			// The logical size of a BitSet; used only in JDK1.2
		//int n = bs.length();

			// The logical size of a BitSet; used only in JDK1.1.*
		int n = CodeTable.lengthOf(bs);

		if(!bs.get(n-1)) System.err.println("Wrong code!");
		else for(int i = n - 2; i >= 0; i--)
			ps.printWord(bs.get(i)?"1":"0");
	}
	
	File txtFile;			// The file to be coded

	int [] freqs;			// Integer array to store character frequencies

	int charCounter;		// Number of different characters


	/**
	 * Used when print the frequency table
	 */
	private String space( int i )
	{
		char [] c = new char[i];
		for (int j = 0; j< i; j++) c[j]=' ';
		return String.copyValueOf(c);
	}

	/**
	 * Used when print the frequency table
	 */
	private int lenOfInt( int i)
	{
		int tmp = i;
		int len = 0;
		if(i == 0) return 1;
		while(tmp > 0) 
		{
			tmp /= 10;
			len ++;
		}
		return len;
	}
}

File HuffmanTree.java:

import java.io.*;
import java.util.BitSet;

public class HuffmanTree
{
	/**
	 * Construct a Huffman tree using a priority queue
	 * @param freqs the frequency table
	 * @param charCounter the unique character number in the frequency table
	 */
	public HuffmanTree (int [] freqs, int charCounter)
	{
		this.freqs = freqs;

		if (charCounter <= 0) 
		{
			System.err.println ("Empty input file!");
			System.exit(1);
		}
		else
		{
				// The priority queue used to construct a huffman tree
			hQueue = new HuffmanQueue(charCounter);
	
			try
			{
					// Insert elements in the frequency table into the 
					// priority queue. The elements of the queue is Huffman nodes
				for (int i = 0; i < 256; i++)
					if (freqs[i] > 0) hQueue.insert(
						new HuffmanNode(freqs[i], null, (char)i));

				HuffmanNode item1, item2;
	
				for (int i = 0; i < charCounter; i++)
				{
					item1 = hQueue.deleteMin();
					if (hQueue.isEmpty()) 
					{
							// Remember the root of the Huffman tree
						root = item1;
						break;
					}
					item2 = hQueue.deleteMin();

						// Insert the new formed Huffman tree into the priority queue
					hQueue.insert (
						new HuffmanNode(item1.freq + item2.freq, item1, item2));
				}
			}
			catch (Overflow e)
			{
				System.err.println("The queue is full!");
			}
		}

		codeTable = traverse();
	}

	/**
	 * Return the bit pattern for the file to be encoded
	 * @param f the file to be encoded
	 */
	public BitSet encode( File f )
	{
		int codeLength = codeTable.totalCodeLength();

		BitSet bs = new BitSet();
		bs.set(codeLength);

		try
		{
				// Get file size
			int fileSize = (int)f.length() +1;						
	
				// Open the fileinput stream
			FileInputStream in = new FileInputStream(f);		

			int bitIndex = codeLength - 1;
			for(int i = 0; i < fileSize; i++)
			{
					// Read the file byte by byte
				int b = (int)in.read();
				char c = (b>=0)? (char)b: (char)0; 
				BitSet cbs = codeTable.transCode(c);

					//The logical size of BitSet; used for JDK1.2
				//int n = cbs.length();

					//The logical size of BitSet; used for JDK1.1.*
				int n = CodeTable.lengthOf(cbs);
				for(int j = n - 2; j >= 0; j--)
				{
					if (cbs.get(j)) bs.set(bitIndex --);
					else bitIndex --;
				}
			}
		}
		catch (IOException e)
		{
			System.out.println("Error: Cannot read file!");
		}

		return bs;
	}

	/**
	 * Decoding the given bit pattern into ascii string using the Huffman tree
	 * @param bs the BitSet to be decoded
	 * @return the decoded string
	 */
	public String decode( BitSet bs )
	{
		String s = "";

			// Logical size of a Bitset; used for JDK1.2
		//int n = bs.length();

			// Logical size of a Bitset; used for JDK1.2
		int n = CodeTable.lengthOf(bs);

		if( n > 0 && root == null) 
		{
			System.err.println("Wront input or Huffman tree!");
			System.exit(1);
		}
		else if (root == null)
		{
			System.out.println("Empty input and Huffman tree!");
			System.exit(1);
		}

		HuffmanNode hn = root;

		int bitIndex = n - 2;

		while(bitIndex >= -1)
		{
			HuffmanNode hn1 = 
				(bitIndex >-1)?searchThroughPath(hn, bs.get(bitIndex)):
				searchThroughPath(hn, true);
			if(hn1 == null)
			{
				char tc = hn.getChar();
				String ts = (((int)tc) >0)?String.valueOf(tc):"<EOF>";
				s += ts;
				hn = root;
			}
			else 
			{
				hn = hn1;
				bitIndex --;
			}
		}
		return s;
	}

	/**
	 * Print out the Huffman tree
	 */
	public void printHuffmanTree()
	{
		printHuffmanTree(root, 0);
	}

	/**
	 * Print out the code table
	 */
	public void printCodeTable()
	{
		codeTable.printTable();
	}

	private HuffmanQueue hQueue;

	private CodeTable codeTable;

	private HuffmanNode root;

	private int [] freqs;

	/**
	 * Return a code table by traverse the Huffman tree
	 * @return the code table
	 */
	private CodeTable traverse()
	{
		CodeTable ct = new CodeTable();

		traverse( root, ct, (char)0, 0);

		return ct;
	}

	/**
	 * Internal method to traverse the tree in order to get a code table
	 * @param t the tree node
	 * @param ct the code table to be generated
	 * @param code the code of a character
	 * @param len the length of the code (bit number)
	 */
	private void traverse( HuffmanNode t, CodeTable ct, char code, int len)
	{
		if (t.left == null)
		{
			char c = t.getChar();
			ct.put( c, new CodeTableItem( code, len, freqs[(int)c] ));
		}
		else
		{
			code <<= 1;
			len ++;
			traverse (t.left, ct, code, len);
			traverse ((HuffmanNode)t.right, ct, (char)((int)code | 1), len);
		}
	}

	/**
	 * Internal method to determine the path of a bit pattern when decoding a 
	 * bit pattern back into a ascii string
	 * @param t the HuffmanNode indicating the tree node
	 * @param right the direction switch
	 * @return the next tree node along the path; null if t is a leaf
	 */
	private HuffmanNode searchThroughPath( HuffmanNode t, boolean right)
	{
		HuffmanNode hn;
		if( t.left != null && right ) hn = (HuffmanNode)t.right;
		else if( t.left != null && !right) hn = t.left;
		else hn = null;

		return hn;
	}

	/**
	 * Internal method to print out the tree by recursive
	 * @param t the node 
	 * @param indent the space before print out the node
	 */
	private void printHuffmanTree( HuffmanNode t, int indent )
	{
		if (t == null ) return;
		HuffmanNode tmpright;
		String s = "";
		if (t.left == null) tmpright = null;
		else tmpright = (HuffmanNode)t.right;

		printHuffmanTree(tmpright, indent + 4);
		for( int i=0; i<indent; i++) System.out.print(" ");
		if(t.left == null) 
		{
			char c = t.getChar();
			s = "("+String.valueOf(c)+")";
			if( (int)c < 33 || (int)c > 126)
			{
				if( (int)c == 10 ) s = "(NL)";
				else if( (int)c == 0 ) s = "(EOF)";
				else if( (int)c == 13 ) s = "(CR)";
				else if( (int)c == 32 ) s = "(SB)";
				else s = "(CC)";
			}
		}
		System.out.println(t.freq+s);
		printHuffmanTree(t.left, indent+4);
	}
}

File HuffmanQueue.java:

/**
 * Implements a priority queue with binary heap.
 * Note that all "matching" is based on the compareTo method.
 * @author Changbin Wang
 */
public class HuffmanQueue
{

	/**
	 * Construct the priority queue with default capacity
	 */
	public HuffmanQueue()
	{
		this(DEFAULT_CAPACITY);
	}

    /**
     * Construct the priority queue.
     * @param capacity the capacity of the priority queue.
     */
    public HuffmanQueue( int capacity )
    {
        currentSize = 0;
        array = new HuffmanNode[ capacity + 1 ];
    }

    /**
     * Insert into the priority queue, maintaining heap order.
     * Duplicates are allowed.
     * @param x the item to insert.
     * @exception Overflow if container is full.
     */
    public void insert( HuffmanNode x ) throws Overflow
    {
        if( isFull( ) )
            throw new Overflow( );

            // Percolate up
        int hole = ++currentSize;
        for( ; hole > 1 && x.compareTo( array[ hole / 2 ] ) < 0; hole /= 2 )
            array[ hole ] = array[ hole / 2 ];
        array[ hole ] = x;
    }

    /**
     * Find the smallest item in the priority queue.
     * @return the smallest item, or null, if empty.
     */
    public HuffmanNode findMin( )
    {
        if( isEmpty( ) )
            return null;
        return array[ 1 ];
    }

    /**
     * Remove the smallest item from the priority queue.
     * @return the smallest item, or null, if empty.
     */
    public HuffmanNode deleteMin( )
    {
        if( isEmpty( ) )
            return null;

        HuffmanNode minItem = findMin( );
        array[ 1 ] = array[ currentSize-- ];
        percolateDown( 1 );

        return minItem;
    }

    /**
     * Establish heap order property from an arbitrary
     * arrangement of items. Runs in linear time.
     */
    private void buildHeap( )
    {
        for( int i = currentSize / 2; i > 0; i-- )
            percolateDown( i );
    }

    /**
     * Test if the priority queue is logically empty.
     * @return true if empty, false otherwise.
     */
    public boolean isEmpty( )
    {
        return currentSize == 0;
    }

    /**
     * Test if the priority queue is logically full.
     * @return true if full, false otherwise.
     */
    public boolean isFull( )
    {
        return currentSize == array.length - 1;
    }

    /**
     * Make the priority queue logically empty.
     */
    public void makeEmpty( )
    {
        currentSize = 0;
    }

    private static final int DEFAULT_CAPACITY = 256;

    private int currentSize;      // Number of elements in heap
    private HuffmanNode [ ] array; // The heap array

    /**
     * Internal method to percolate down in the heap.
     * @param hole the index at which the percolate begins.
     */
    private void percolateDown( int hole )
    {
        int child;
        HuffmanNode tmp = array[ hole ];

        for( ; hole * 2 <= currentSize; hole = child )
        {
            child = hole * 2;
            if( child != currentSize &&
                    array[ child + 1 ].compareTo( array[ child ] ) < 0 )
                child++;
            if( array[ child ].compareTo( tmp ) < 0 )
                array[ hole ] = array[ child ];
            else
                break;
        }
        array[ hole ] = tmp;
	}
}

File HuffmanNode.java:

public final class HuffmanNode implements Comparable
{
	/**
	 * Construct a HuffmanNode with two HuffmanNodes as leaves
	 * @param freq the priority number when used in priority queue
	 * @param left the left leaf
	 * @param right the right leaf
	 */
	public HuffmanNode(int freq, HuffmanNode left, HuffmanNode right)
	{
		this.freq = freq;
		this.left = left;
		this.right = right;
	}

	/**
	 * Construct a HuffmanNode with a HuffmanNode and a character as leaves
	 * @param freq the priority number when used in priority queue
	 * @param left the left leaf, this leaf should always be null
	 * @param c the charactor as right leaf
	 */
	public HuffmanNode(int freq, HuffmanNode left, char c)
	{
		this.freq = freq;
		this.left = left;
		this.right = new Character(c);
	}

	/**
	 * Return the character stored in this node when this node is a leaf
	 * @return the character stored in this node
	 */
	public char getChar()
	{
		 return ((Character)right).charValue();
	}

	/**
	 * Implement the interface method of Comparable
	 * @param rhs the right hand side to be compared
	 * @return the value of comparision
	 */
    public int compareTo( Comparable rhs )
    {
        return freq < ((HuffmanNode)rhs).freq ? -1 :
               freq == ((HuffmanNode)rhs).freq ? 0 : 1;
    }

	int freq;

	HuffmanNode left;

	Object right;

}

File CodeTable.java:

import java.io.*;
import java.util.BitSet;

/**
 * CodeTable class for holding the code table
 */
public final class CodeTable
{
	/**
	 * Construct a code table
	 */
	public CodeTable()
	{
			// Initialize the array the number of ASCII characters
		array = new CodeTableItem [256];
	}

	/** 
	 * Put a character and its code into the code table
	 * @param c the character
	 * @param code the code information of the char
	 */
	public void put ( char c, CodeTableItem code)
	{
		if ((int)c >= 256) System.out.println("Wrong index");
		else array [(int)c] = code;
	}

	/**
	 * Return the code information of a character
	 * @param c the char to be searched
	 * @return the code information of the char
	 */
	public CodeTableItem get( char c )
	{
		if ((int)c >=256) return null;
		else return array[ (int)c ];
	}

	/**
	 * Return the total code length for a encoded file
	 * @return teh total code length
	 */
	public int totalCodeLength()
	{
		int sum = 0;
		for(int i = 0; i < 256; i++)
			if(array[i] != null) sum += array[i].len * array[i].freq;
		return sum;
	}

	/**
	 * Print a bit pattern of a given character 
	 * @param c the char to be printed
	 */
	public void printCode( char c )
	{
		BitSet bs = transCode(c);

			// The logical size of the bitset; used only for JDK1.2
		//int n = bs.length();

			// The logical size of the bitset; used only for JDK1.1.*
		int n = lengthOf(bs);

		if(!bs.get(n-1)) System.err.println("Wrong code!");
		else
		{
			for(int i = n - 2; i >= 0; i--)
			{
				if (bs.get(i)) System.out.print("1");
				else System.out.print("0");
			}
		}
	}

	/**
	 * Transform a code into bit pattern for a given char
	 * @param c the character
	 * @return the BitSet bit pattern
	 */
	public BitSet transCode( char c )
	{
		CodeTableItem codeItem = get( c );
		int len = codeItem.len;
		char code = codeItem.code;

		BitSet bs = new BitSet();
		bs.set(len);

		transCode(bs, (int)code, 0);

		return bs;
	}

	/**
	 * Print out the whole table
	 */
	public void printTable()
	{
		System.out.println("                     Code Table                     ");
		System.out.println("====================================================");
		System.out.println("CHARACTER | ASCII | FREQUENCY | BIT-PATTERN | LENGTH");
		System.out.println("====================================================");

		for(int i = 0; i < 256; i++)
		{
			if(array[i] != null)
			{
				char c = (char)i;
				String s = String.valueOf(c);
				CodeTableItem cti = array[i];
				char code = cti.code;
				int len = cti.len;
				int freq = cti.freq;

				if( i < 33 || i > 126)
				{
					if( i == 10 ) s = "NL";
					else if( i == 13 ) s = "CR";
					else if( i == 32 ) s = "SB";
					else if(i == 0 ) s = "EOF";
					else s = "CC";
				}
				String sp1 = space(3);
				String sp2;
				if( i > 0 )
					sp2 = space(13-lenOfInt(i)-s.length());
				else
					sp2 = space(13-5-s.length());
				String sp3 = space(9-lenOfInt(freq));
				String sp4 = space(16-len);
				String sp5 = space(7);
				if( i > 0 )
					System.out.print(sp1 + s + sp2 + i + sp3 + freq + sp4);
				else
					System.out.print(sp1 + s + sp2 +"-1(*)"+ sp3 + freq + sp4);
				printCode(c);
				System.out.println(sp5 + len);
			}
		}
		System.out.println();
		System.out.println("* Not indicate the real ASCII code");
	}

	private CodeTableItem [] array;

	/** 
	 * Internal method for transforming a code into bit pattern
	 * @param bs the generated bitset
	 * @param code the code in the table, a char datatype
	 * @param index the bit index in the bitset
	 */
	private void transCode(BitSet bs, int code, int index)
	{
		if(code % 2 == 1)
		{
			bs.set(index);
			index ++;
		}
		else if (code % 2 == 0)
		{
			index ++;
		}
		if (code <= 1) return;
		transCode(bs, code/2, index);
	}

	/**
	 * Used for print out the table
	 */
	private String space( int i )
	{
		char [] c = new char[i];
		for (int j = 0; j< i; j++) c[j]=' ';
		return String.copyValueOf(c);
	}

	/**
	 * Used for print out the table
	 */
	private int lenOfInt( int i)
	{
		int tmp = i;
		int len = 0;
		if(i == 0) return 1;
		while(tmp > 0) 
		{
			tmp /= 10;
			len ++;
		}
		return len;
	}

	/**
	 * Used for calculate the logical size of a BitSet
	 * Used when using Java JDK version lower than 1.2
	 * @param bs the bit set
	 * @return the logical size of a bitset
	 */
	public static int lengthOf(BitSet bs)
	{
		int len;

		int size = bs.size();

		len = size;

		while(!bs.get(size--)) len --;

		return len + 1;
	}
}

File PrintStack.java:

/**
 * Class PrintStack mimic a stack to print word list or sentence
 * in word wrapping manner
 */
public class PrintStack
{
	int lineWidth;			// The width of a line 
	int currentSize;		// Used to count the number of output charaters

	/**
	 * Construct the class
	 * @param n the line width
	 */
	public PrintStack(int n)
	{
		lineWidth = n;
		currentSize = 0;
	}

	/**
	 * Method used to print out word list with word wrapping feature
	 * @param str the word to print out
	 */
	public void printWord (String str)
	{
		int n = str.length();				// The length of the word

		// Output word if this line can make room for it
		if((currentSize + n) <= lineWidth)	
		{
			currentSize += n;				// Increment the current size of the line 
											// by the word length
			System.out.print(str);
		}
		// Else start a new line and set outputed charactor length to the length of word
		else								
		{
			System.out.println();
			currentSize = n;
			System.out.print(str);
		}
	}

	/**
	 * Method used to print out sentences with word wrapping feature
	 * @param str the sentence to be output
	 */
	public void printSentence (String str)
	{
		int n = str.length();				// The length of the sentence

		int nbeg = 0;			// The beginning index in str of each new line
								// initialized to 0
		// Output new lines if not finish
		while(nbeg < n)			
		{
			int nend = nbeg+lineWidth;		// Index of end index of a line in sentence
			nend = (nend>=n)?n:nend;		// Set end index to the length of sentence
											// if this index exceed sentence length

			// If the last charater of the line is not space"
			// back track the characters to find a space
			for(int i= nend-1; i>0 && 
				((str.charAt(i)>64 && str.charAt(i)<91)||
				(str.charAt(i)>96 && str.charAt(i)<123)); i--) 
				nend -= 1;

			// Output the line according to the start and end indice
			System.out.println(str.substring(nbeg, nend));

			// Update the begining point of the new output line
			nbeg = nend;
		}
	}
}

PROBLEM 2

File README:

README

Data Structures and Algorithm Analysis in JAVA (W3139-02)
Professor: Jason Nieh

Homework #3: Concordance project

Submitted by: Changbin Wang

================================================================================
Files submitted:
--------------------------------------------------------------------------------
For the Concordance project:
./Concordance:
	Condordance.java		--	Main class of the project
	CitationList.java		--  Word and sentence info list in separate chaining
	CitationListItr.java	--  Linked list iterator for CitationList
	CitationListNode.java	--  Node for CitationList
	CitationPoint.java		--  Structure to store sentence information
	Comparable.java(*)
	Hashable.java(*)
	JavaHashtable.java		--	Symbol table implemented using Java Hashtable
	JHTableItem.java		--	Table item for JavaHashtable
	LinkedList.java(*)
	LinkedListItr.java(*)
	ListNode.java(*)
	PrintStack.java			--  Print words or sentences with word wrapped
	SeparateChaining.java	--	Symbol table implemented using separate chaining
	SPTableItem.java		--	Table item for SeparateChaining
	StackLi.java(*)
	Underflow.java(*)
	Word.java				--	Wrapped class for a word and its sentence info
	Result1.log				--	The test runs using separate chaining implementation
	Result2.log				--	The test runs using Java Hashtable implementation

(* indicate that the class codes are copied directly from DataStructures package)

To compile: At CUNIX prompt(in ./Concordance directory): javac *.java

To run: At CUNIX prompt(in ./Concordance directory): java Concordance

Log File: Contains several test runs of the program using "~cs3139-2/bible/philemon"
=================================================================================

This program supply the following functions in order to get the concordance of 
distinct words in a text file and test out different data structure implementation
of the symbol table.

 1 -- Print the total number of distinct words in the alphabetized list;
 2 -- Find the sentences contain certain word. The program will prompt for a user
      input and then do searching in a linked list or a binary search tree. The 
	  result first report the frequency of the word in the text and then let the
	  user to confirm whether or not to display all the sentence contain the word;
 3 -- Print the word with maximum occurrences. The program first output the 
      frequency of the word in the text and then let the user to select whether or
	  not to display the sentences contain the word; 
 4 -- Test searching time; Help the user to compare the searching time of two 
      different implemetation of the alphabetized list. This program does only
	  the searching time according to the requirement of the assignment, and the 
	  other running time such as insertion and delete can be implemented easily;
 5 -- Print out the total number of collision happened when insert word and its 
	  sentence information into the symbol table(for separate chaining implementation);
 6 -- Print out the Print the bucket number with most collision and the number 
	  of collisions in this position(for separate chaining implementation);
 7 -- Print out the table size(for separate chaining implementation);
 8 -- Print out the load factor(for separate chaining implementation);
 9 -- Exit the running of the program.

================================================================================
 Note points:

 a. The symbol table is implemented using two different methods.  The first method
	is using separate chaining implementation for a hash table.  In this implementation,
	the words are used as keys while linked list storing the collision words as 
	value corresponding to each hashed index.  The linked list which stores the 
	collided words store the words and their sentence informations list.  The second 
	method is using the Hashtable class provide by Java package.  The words are used 
	as keys and the sentence information lists are used as values for the table.

 b. Running time: This program will only show the running time for searching a given
	word according to the requirement of the homework assignment.  For both implementation, 
	the average successful searching takes O(1) running time, while for the separate
	chaining implementation, the unsuccessful searching may take longer time than
	using Hashtable class in Java package. The following is a table of running time 
	tested on different	files.
	
	(The word "you" is used to search in the symbol tables of all the files).

	=======================================================================================
	file name  |   Number of    |       Running time (ms)        |    Rnning time (ms)    |
			   |different words | for separate chaining implement| for Java Hashtable impl|
    =======================================================================================
	philemon   |      191       |               0				 |			  0			  |
	---------------------------------------------------------------------------------------
	habakkuk   |      504       |               1                |            0           |
	---------------------------------------------------------------------------------------
	micah      |      738       |               0                |            0           |
	---------------------------------------------------------------------------------------
	ezra       |      1088      |               1                |            0           |
	---------------------------------------------------------------------------------------
	leviticus  |      1412      |               0                |            0           |
	---------------------------------------------------------------------------------------
	mark       |      1663      |               1                |            0           |
	---------------------------------------------------------------------------------------
	numbers    |      2065      |               0                |            0           |
	=======================================================================================

	From the proceeding table, for separate chaining implementation, the running time changes
	randomly. This may be the effects of collision of the searched word with other words hashed
	to the same position.  For the Java Hashtable implementation, the running time is always
	very small (indetectable).

	In both the test cases, there is also the CPU effects of the computer, since the program
	running time depends on the work load of the CPU.
==============================================================================================

File Result1.log:

Please input the text file name: 
Error: File not existing!
Please try again: 
===============================================
Please select an implementation method of the symbol table:
    0 -- Seperate chaining             
    1 -- Hashtable in Java package     
===============================================
Your selection(0/1): 
Program is precessing data ...

Data processing successful!

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
There are totally 191 different words in the file!

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
Please input the word you want to look for: 
 appears 3 times in sentences of the file.

Do you want to see all the sentences (y/n)?
I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

For we have great joy and consolation in thy love, because the bowels of 
the saints are refreshed by thee, brother.

Wherefore, though I might be much bold in Christ to enjoin thee that which 
is convenient,  Yet for love's sake I rather beseech thee, being such an 
one as Paul the aged, and now also a prisoner of Jesus Christ.



===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
Word  appears in the text file with maximum frequency, which is 16 
times.

Do you want to display all sentences (y/n)?
Paul, a prisoner of Jesus Christ, and Timothy our brother, unto Philemon 
our dearly beloved, and fellowlabourer,  And to our beloved Apphia, and 
Archippus our fellowsoldier, and to the church in thy house:  Grace to you,
 and peace, from God our Father and the Lord Jesus Christ.

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

For we have great joy and consolation in thy love, because the bowels of 
the saints are refreshed by thee, brother.

Wherefore, though I might be much bold in Christ to enjoin thee that which 
is convenient,  Yet for love's sake I rather beseech thee, being such an 
one as Paul the aged, and now also a prisoner of Jesus Christ.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

For perhaps he therefore departed for a season, that thou shouldest 
receive him for ever;  Not now as a servant, but above a servant, a 
brother beloved, specially to me, but how much more unto thee, both in the 
flesh, and in the Lord?

For perhaps he therefore departed for a season, that thou shouldest 
receive him for ever;  Not now as a servant, but above a servant, a 
brother beloved, specially to me, but how much more unto thee, both in the 
flesh, and in the Lord?

Yea, brother, let me have joy of thee in the Lord: refresh my bowels in 
the Lord.

Yea, brother, let me have joy of thee in the Lord: refresh my bowels in 
the Lord.

Having confidence in thy obedience I wrote unto thee, knowing that thou 
wilt also do more than I say.

There salute thee Epaphras, my fellowprisoner in Christ Jesus;  Marcus, 
Aristarchus, Demas, Lucas, my fellowlabourers.

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
Please input the word you want to look for: 
Searching takes totally 0 milliseconds and the following is the results:
 appears 3 times in sentences of the file.

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
Totally 28 collisions happened during hashing all the words!

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
Position index=63 in the table has most collision and the number of 
collision is 2

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
The size of the table is: 503

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9): 
The load factor of this table is: 0.32405567

===============================================
You selected the separate chaining implementation
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Print the total number of collisions     
 6 -- Print the bucket number with most collision
      and the number of collisions             
 7 -- Print the size of the hash table         
 8 -- Print the load factor of the table       
 9 -- Exit                                     
===============================================
Your selection(1-9):

File Result2.log:

Please input the text file name: philemon

===============================================
Please select an implementation method of the symbol table:
    0 -- Seperate chaining             
    1 -- Hashtable in Java package     
===============================================
Your selection(0/1): 1

Program is precessing data ...

Data processing successful!

===============================================
You selected implementation using Java Hashtable
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Exit                                     
===============================================
Your selection(1-5): 1

There are totally 191 different words in the file!

===============================================
You selected implementation using Java Hashtable
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Exit                                     
===============================================
Your selection(1-5): 2

Please input the word you want to look for: 
 appears 3 times in sentences of the file.

Do you want to see all the sentences (y/n)? y

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

For we have great joy and consolation in thy love, because the bowels of 
the saints are refreshed by thee, brother.

Wherefore, though I might be much bold in Christ to enjoin thee that which 
is convenient,  Yet for love's sake I rather beseech thee, being such an 
one as Paul the aged, and now also a prisoner of Jesus Christ.



===============================================
You selected implementation using Java Hashtable
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Exit                                     
===============================================
Your selection(1-5): 3

Word  appears in the text file with maximum frequency, which is 16 
times.

Do you want to display all sentences (y/n)? y

Paul, a prisoner of Jesus Christ, and Timothy our brother, unto Philemon 
our dearly beloved, and fellowlabourer,  And to our beloved Apphia, and 
Archippus our fellowsoldier, and to the church in thy house:  Grace to you,
 and peace, from God our Father and the Lord Jesus Christ.

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

I thank my God, making mention of thee always in my prayers,  Hearing of 
thy love and faith, which thou hast toward the Lord Jesus, and toward all 
saints;  That the communication of thy faith may become effectual by the 
acknowledging of every good thing which is in you in Christ Jesus.

For we have great joy and consolation in thy love, because the bowels of 
the saints are refreshed by thee, brother.

Wherefore, though I might be much bold in Christ to enjoin thee that which 
is convenient,  Yet for love's sake I rather beseech thee, being such an 
one as Paul the aged, and now also a prisoner of Jesus Christ.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

I beseech thee for my son Onesimus, whom I have begotten in my bonds:  
Which in time past was to thee unprofitable, but now profitable to thee 
and to me:  Whom I have sent again: thou therefore receive him, that is, 
mine own bowels:  Whom I would have retained with me, that in thy stead he 
might have ministered unto me in the bonds of the gospel:  But without thy 
mind would I do nothing; that thy benefit should not be as it were of 
necessity, but willingly.

For perhaps he therefore departed for a season, that thou shouldest 
receive him for ever;  Not now as a servant, but above a servant, a 
brother beloved, specially to me, but how much more unto thee, both in the 
flesh, and in the Lord?

For perhaps he therefore departed for a season, that thou shouldest 
receive him for ever;  Not now as a servant, but above a servant, a 
brother beloved, specially to me, but how much more unto thee, both in the 
flesh, and in the Lord?

Yea, brother, let me have joy of thee in the Lord: refresh my bowels in 
the Lord.

Yea, brother, let me have joy of thee in the Lord: refresh my bowels in 
the Lord.

Having confidence in thy obedience I wrote unto thee, knowing that thou 
wilt also do more than I say.

There salute thee Epaphras, my fellowprisoner in Christ Jesus;  Marcus, 
Aristarchus, Demas, Lucas, my fellowlabourers.

===============================================
You selected implementation using Java Hashtable
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Exit                                     
===============================================
Your selection(1-5): 4

Please input the word you want to look for: you

Searching takes totally 0 milliseconds and the following is the results:
 appears 3 times in sentences of the file.

===============================================
You selected implementation using Java Hashtable
Please selection functions from the menu:      
 1 -- Print the total number of unique words   
 2 -- Find the sentences contain certain word  
 3 -- Print the word with maximum occurrences  
      and the sentences containing the word    
 4 -- Test searching time                      
 5 -- Exit                                     
===============================================
Your selection(1-5): 5
RESULT2.LOG

3

File Concordance.java:

import java.io.*;

/**
 * class Concordance
 * 
 * Main class for the concordance project. 
 * @ Author: Changbin Wang (cw170@columbia.edu)
 */

public class Concordance
{
	/**
	 * Constructor of the concordance class
	 * @param fineName the file for getting concordance
	 * @param ls boolean to indicate the implemetation method
	 */
	public Concordance(File txtFile, boolean ls)
	{
		this.ls = ls;

		this.txtFile = txtFile;								

		try
		{
				// Get file size
			fileSize = (int)txtFile.length();				   
	
				// Initialize the array
			allChar = new char[fileSize];					

				// Open the fileinput stream
			FileInputStream in = new FileInputStream(txtFile); 

			for(int i=0;i<fileSize;i++)
			{
					// Read the file byte by byte
				char c = (char)in.read();	
					// Put the byte into the array
				if(c != 10) allChar[i]= c;	
					// Replace all the carriage return with space
				else allChar[i] = ' ';		
			}
		}
		catch (OutOfMemoryError e1)
		{
			System.out.println("Error: Out of memory when open charactor array!");
		}
		catch (IOException e2)
		{
			System.out.println("Error: Cannot read file!");
		}

			// Construct separate chaining hashtable
		if (ls) spTable = new SeparateChaining(500);
			// Construct java hashtable implementation
		else jhTable = new JavaHashtable();				
	}

	/**
	 * method to extract words and sentences from the character array
	 */
	public void parseArray()
	{
			// Stack for temporary store words in a sentence
		StackLi wStack = new StackLi();		
			// Index for the beginning point of word
		int wbeg = 0;						
			// Length of each word	
		int wlen = 0;
			// Index for the beginning point of sentence
		int sbeg = 0;
			// Length of each sentence
		int slen = 0;
			// Number of space between words and sentences
		int nsp  = 0;
			// True if the index is inside a word
		boolean word = true;
			// True if the index is inside a sentence
		boolean sentence = true;			
		
		for(int i=0; i<fileSize; i++)
		{
				// Not one of ". ? !"
			if(allChar[i]!=46 && allChar[i]!=33 && allChar[i]!=63) 
			{
					// Between 'a-z' or 'A-Z' ?
				if((allChar[i]>64 && allChar[i]<91)||
					(allChar[i]>96 && allChar[i]<123))	
				{
						// First enter a word
					if(!word)							
					{
							// Set the beginning point of word
						wbeg = wbeg+wlen+nsp;
							// Initial length of the word 
						wlen = 0;						
							// # of space = 0
						nsp = 0;
							// Inside the word
						word = true;					
					}
						// Increment the word length
					wlen ++;							

						// First enter a sentence
					if(!sentence)						
					{
							// Set the beginning point of word
						sbeg = sbeg + slen + nsp;
							// Initial length of the sentence
						slen = 0;
							// Inside the sentence
						sentence = true;				
					}
				}
					// One of "space ; , ', :"
				else									
				{
						// First meet a space
					if(word)							
					{
							// In space area
						word = false;					
							// Push the proceeding word into stack
						if(i!=0) wStack.push(String.valueOf(allChar, wbeg, wlen));
					}
						// Increment space number
					nsp ++;							
				}
					// Increment sentence length
				slen ++;							
			}
				// One of ". ? !", end of sentence
			else									
			{
					// Increment the sentence length to include '.' or '?' or '!'
				slen ++;								
														
					// Push the last word of the sentence into the stack
				if(word)								
				{										
					word = false;
					if(i!=0) wStack.push(String.valueOf(allChar, wbeg, wlen));
				}
					// Not really space, but used to exclude the space before a sentence
				nsp ++;									
														
					// First meet a '.', '?' or '!'
				if(sentence)							
				{
						// Outside a sentence
					sentence = false;					

						// Store the starting point and the length of the sentence
					CitationPoint cp = new CitationPoint(sbeg, slen);

						// Pop words out of the temporary stack and insert words into 
						// symbol table; All words are changed to upper case
					while(!wStack.isEmpty())
					{
						String stmp = (String)wStack.topAndPop();
						Word w = new Word(stmp.toUpperCase(), cp);
						if(ls) spTable.insert(w);
						else jhTable.insert(w);
					}
				}
			}
		}
			// Output a message if the input file is empty and exit running
		if((ls && spTable.getWordNum()==0)||(!ls && jhTable.getWordNum()==0))
		{
			System.out.println("The input file is an empty file!");
			System.exit(1);
		}
	}

	/**
	 * Method to print out all sentences contain certain word
	 * @param index the index corresponding to the word
	 * @param lineWidth used to indicate the line width for word wrapping
	 */
	public void printSentences(String word, int lineWidth)
	{
			// Construct a PrintStack tool for printing with word wrapping
		PrintStack ps = new PrintStack(lineWidth);

			// Get out the linked list for certain word
		LinkedList list = (ls)?spTable.get(word):jhTable.get(word);

			// The list is null, no sentence for such index
		if (list == null) return;

			// Print out all sentence in thelist
        if( list.isEmpty( ) )
            System.out.print( "Error: Empty list" );	// Empty list, should not happen
        else
        {
				// Iterator point the first element
            LinkedListItr itr = list.first( );			

				// Extract and print the sentence out of the array according to 
				// sentence information in the list
            for( ; !itr.isPastEnd( ); itr.advance( ) )	
			{
				CitationPoint cp = (CitationPoint) itr.retrieve();
				String str = String.copyValueOf(allChar, cp.nbeg, cp.nlen);
				ps.printSentence(str);
				System.out.println();
			}
         }
	}

		// Name of input file
	File txtFile;			

		// Array holding all characters in the file
	char[] allChar;			

		// Bytes(characters) of the file 
	int fileSize;			

		// Symbol table implemented using separate chaining
	SeparateChaining spTable=null;

		// Symbol table implemented using java hashtable
	JavaHashtable jhTable = null; 

		// Boolean to indicate the implemetation of symbol table
	boolean ls;				


	/**
	 * main function of the class
	 */
	public static void main(String [] args)
	{
			// PrintStack used to output long sentence
		PrintStack psm = new PrintStack(75);

		if(args.length > 0) System.out.println("Usage: Concordance");
		else
		{
				// Input stream for various input use
			BufferedReader in = new BufferedReader(
				new InputStreamReader(System.in));

			try
			{
					// Prompt for file name and create a file object
				System.out.print("Please input the text file name: ");
				String fileName = in.readLine();
				System.out.println();
				File f = new File(fileName);

					// Prompt for file name if the file cannot be found
				while (!f.exists())
				{
					System.out.println("Error: File not existing!");
						// Prompt for file name 
					System.out.print("Please try again: ");
					fileName = in.readLine();
					System.out.println();
					f = new File(fileName);
				}

					// Prompt for selection of implemetation method of alphabetized list
				System.out.println("===============================================");
				psm.printSentence("Please select an implementation method "+
									"of the symbol table:");
				System.out.println("    0 -- Seperate chaining             ");
				System.out.println("    1 -- Hashtable in Java package     ");
				System.out.println("===============================================");
				System.out.print("Your selection(0/1): ");

				String txtLine = in.readLine();
				int x = Integer.parseInt(txtLine);
				boolean ls = (x == 0)? true:false;
				System.out.println();

				System.out.println("Program is precessing data ...");
				System.out.println();

					// Construct the instance of the concordance class
				Concordance prog = new Concordance(f, ls);

					// Parse the charater array into words and sentence and generate
					// the alphabetized list and symbol table.
				prog.parseArray();

				System.out.println("Data processing successful!");
				System.out.println();

				for(;;)
				{
					if(ls)
					{
						System.out.println("===============================================");
						System.out.println("You selected the separate chaining implementation");
						System.out.println("Please selection functions from the menu:      ");
						System.out.println(" 1 -- Print the total number of unique words   ");
						System.out.println(" 2 -- Find the sentences contain certain word  ");
						System.out.println(" 3 -- Print the word with maximum occurrences  ");
						System.out.println("      and the sentences containing the word    ");
						System.out.println(" 4 -- Test searching time                      ");
						System.out.println(" 5 -- Print the total number of collisions     ");
						System.out.println(" 6 -- Print the bucket number with most collision");
						System.out.println("      and the number of collisions             ");
						System.out.println(" 7 -- Print the size of the hash table         ");
						System.out.println(" 8 -- Print the load factor of the table       ");
						System.out.println(" 9 -- Exit                                     ");
						System.out.println("===============================================");
						System.out.print("Your selection(1-9): ");
					}
					else
					{
						System.out.println("===============================================");
						System.out.println("You selected implementation using Java Hashtable");
						System.out.println("Please selection functions from the menu:      ");
						System.out.println(" 1 -- Print the total number of unique words   ");
						System.out.println(" 2 -- Find the sentences contain certain word  ");
						System.out.println(" 3 -- Print the word with maximum occurrences  ");
						System.out.println("      and the sentences containing the word    ");
						System.out.println(" 4 -- Test searching time                      ");
						System.out.println(" 5 -- Exit                                     ");
						System.out.println("===============================================");
						System.out.print("Your selection(1-5): ");
					}

					txtLine = in.readLine();
					x = Integer.parseInt(txtLine);

					switch(x)
					{
					case 1:			// Output the total number of distinct words in the file
						int num = 
							(ls)? prog.spTable.getWordNum():prog.jhTable.getWordNum();
						System.out.println();
						psm.printSentence("There are totally "+num+
							" different words in the file!");
						System.out.println();
						break;		

					case 2:
						String word1;
						int freq;

						System.out.println();
							// Prompt for a word to be searched.
						System.out.print("Please input the word you want to look for: ");

						word1 = in.readLine();
						System.out.println();
						
						word1 = word1.toUpperCase();

						LinkedList list = (ls)?prog.spTable.get(word1):prog.jhTable.get(word1);
						freq = (ls)?prog.spTable.getFreq(word1):prog.jhTable.getFreq(word1);
						if(list == null && freq <= 0)
								System.out.println("Sorry, no such word in the text!");
						else
						{
							psm.printSentence("<"+word1.toUpperCase()+"> appears "+
									freq+" times in sentences of the file.");
								
							System.out.println();
							System.out.print("Do you want to see all the sentences (y/n)?");
							String yOrn = in.readLine();
							System.out.println();
							if(yOrn.charAt(0)=='y' || yOrn.charAt(0) == 'Y') 
								prog.printSentences(word1, 75);
							System.out.println();
						}
						System.out.println();
						break;

					case 3:			// Print out the word with maximum concordance and 
						String w;	// the frequency

						System.out.println();

						if(ls) 
						{
							CitationListItr ci = prog.spTable.findMaxFreq();
							w = ci.retrieveWord();
							freq = ci.retrieveFreq();
						}
						else
						{
							JHTableItem jhi = prog.jhTable.findMaxFreq();
							w = jhi.getWord();
							freq = jhi.getFreq();
						}
							
						psm.printSentence("Word <"+w+"> appears in the text file"+
							" with maximum frequency, which is "+freq+" times.");
						System.out.println();
						System.out.print("Do you want to display all sentences (y/n)?");
						String yOrn = in.readLine();
						System.out.println();
						if(yOrn.charAt(0)=='y' || yOrn.charAt(0) == 'Y')
								prog.printSentences(w, 75);
						break;

					case 4:			// Test the searching time for different implementation
						System.out.println();
						System.out.print("Please input the word you want to look for: ");

						word1 = in.readLine();
						System.out.println();
          
						word1 = word1.toUpperCase();

						long start = System.currentTimeMillis();
						list = (ls)?
							prog.spTable.get(word1):prog.jhTable.get(word1);
						long end = System.currentTimeMillis();
		
						long interval = end - start;
						psm.printSentence("Searching takes totally "+
							interval+" milliseconds and the following is the results:");

						if(list==null) 
							System.out.println("Sorry, no such word in the text!");
						else
						{
							freq = (ls)?prog.spTable.getFreq(word1):prog.jhTable.getFreq(word1);
							psm.printSentence("<"+word1+"> appears "+freq+
								" times in sentences of the file.");
						}
						System.out.println();
						break;

					case 5:					
						if(!ls)	return;
						else
						{
							System.out.println();
							psm.printSentence("Totally "+prog.spTable.getTotalCollision()+
								" collisions happened during hashing all the words!");
							System.out.println();
						}
						break;

					case 6:
						if(ls) 
						{
							System.out.println();
							psm.printSentence("Position index="+prog.spTable.getCollisionBucket()+
								" in the table has most collision and the number of"+
								" collision is "+prog.spTable.getMaxCollision());
							System.out.println();
						}
						break;

					case 7:
						if(ls)
						{
							System.out.println();
							psm.printSentence("The size of the table is: "+
								prog.spTable.getSize());
							System.out.println();
						}
						break;

					case 8:
						if(ls)
						{
							System.out.println();
							psm.printSentence("The load factor of this table is: "+
								prog.spTable.getLoadFactor());
							System.out.println();
						}
						break;

					case 9:
						if(ls) return;
						break;
					}
				}
			}
			catch (Exception e)				// Catch all the exceptions
			{
				System.out.println(e);
			}
		}
	}
}

File CitationList.java

/**
 * Linked list to store the sentence information
 * Access to the list is via CitationListItr.
 * @author Changbin Wang (cw170@columbia.edu)
 */
public class CitationList
{
    /**
     * Construct the list
     */
	public CitationList()
	{
		try
		{
			header = new CitationListNode( null, null );
		}
		catch (OutOfMemoryError e)
		{
			System.out.println(
				"Error: Not enough memory for opening a new list!");
		}

		currentSize = 0;
	}

	/**
	 * Return the size of the list indicate by the member variable of the class
	 */
	public int getSize()
	{
		return currentSize;
	}

    /**
     * Test if the list is logically empty.
     * @return true if empty, false otherwise.
     */
	public boolean isEmpty()
	{
		return header.next == null;
	}

    /**
     * Make the list logically empty.
     */
	public void makeEmpty()
	{
		header.next = null;
		currentSize = 0;
	}

    /**
     * Return an iterator representing the header node.
     */
	public CitationListItr zeroth()
	{
		return new CitationListItr( header );
	}

    /**
     * Return an iterator representing the first node in the list.
     * This operation is valid for empty lists.
     */
	public CitationListItr first()
	{
		return new CitationListItr( header.next );
	}

    /**
     * Return iterator corresponding to the first node containing an item.
     * @param x the item to search for.
     * @return an iterator; iterator isPastEnd if item is not found.
     */
	public CitationListItr find(String x)
	{
		CitationListNode itr = header.next;

		while (itr != null && !itr.element.equals(x) )
			itr = itr.next;

		return new CitationListItr(itr);
	}

    /**
     * Return iterator corresponding to the node with maximum frequency.
     * @return an iterator; iterator isPastEnd if item is not found.
     */
	public CitationListItr findMaxFreq()
	{
		CitationListItr itr = first();
		CitationListNode maxNode = itr.current;
		for (; !itr.isPastEnd(); itr.advance())
			maxNode=(maxNode.freq>=itr.retrieveFreq())?maxNode:itr.current;
		return new CitationListItr(maxNode);
	}

    /**
     * Remove the string.(not used in this project)
     * @param x the string to remove.
     */
	public void remove( String x )
	{
		CitationListItr p = findPrevious(x);

		if( p.current.next != null )
			p.current.next = p.current.next.next;
		else System.out.println("No such word in the list!");

		currentSize -= 1;		// Decrement the list size
	}

    /**
     * Return iterator prior to the node containing the string.
     * @param x the string to search for.
     * @return appropriate iterator if the item is found. Otherwise, the
     * iterator corresponding to the last element in the list is returned.
     */
	public CitationListItr findPrevious( String x )
	{
		CitationListNode itr = header;

		while (itr.next !=null && !itr.next.element.equals(x) )
			itr = itr.next;

		return new CitationListItr(itr);
	}

    /**
     * Insert a word and the information of sentence containing it into 
	 * the linked list.
	 * @param w the word inserted
	 * @return true if the list dosn't contain the given word; false if 
	 * the list has had word already
     */
	public boolean insert(Word w)
	{
		String word = w.getWord();
		CitationPoint cp = w.getSentenceInfo();

		CitationListItr itr = find(word);

		if(itr.isPastEnd())
			insert(word, cp, zeroth());	// Insert as a new node
		else							// Insert as an old node
		{
			itr.retrieveSentenceList().insert(cp,
				itr.retrieveSentenceList().findPrevious(null));
			itr.current.addFreq();
		}

		return (itr.isPastEnd() && currentSize >1);
	}

	private CitationListNode header;
	
	private int currentSize;			// The number of non-header nodes in the list

	/**
     * Insert after p, used for insert a new word.
     * @param x the item to insert.
     * @param p the position prior to the newly inserted item.
	 * @param theIndex the index used to access the symbol table
     */
	private void insert( String x, CitationPoint cp, CitationListItr p)
	{
		if( p != null && p.current != null )
			p.current.next = new CitationListNode( x, cp, p.current.next);
		currentSize ++;
	}
}

File CitationListItr.java:

/**
 * Linked list implementation of the list iterator
 */
public class CitationListItr
{
    /**
     * Construct the list iterator
     * @param theNode any node in the linked list.
     */
	CitationListItr( CitationListNode theNode )
	{
		current = theNode;
	}

    /**
     * Test if the current position is past the end of the list.
     * @return true if the current position is null.
     */
	public boolean isPastEnd()
	{
		return current == null;
	}

   /**
    * Return the item stored in the current position.
    * @return the stored item or null if the current position
    * is not in the list.
    */
	public String retrieveWord()
	{
		return isPastEnd()? null : current.element;
	}

   /**
    * Return the frequency of item stored in the current position.
    * @return the frequency of stored item or -1 if the current position
    * is not in the list.
    */
	public int retrieveFreq()
	{
		return isPastEnd()? -1 : current.freq;
	}

   /**
    * Return the linked list storing the sentence information.
    * @return the linked list
    */
	public LinkedList retrieveSentenceList()
	{
		return isPastEnd()? null : current.citeList;
	}

    /**
     * Advance the current position to the next node in the list.
     * If the current position is null, then do nothing.
     */
	public void advance()
	{
		if( !isPastEnd() ) current = current.next;
	}

	CitationListNode current;
}

File CitationListNode.java:

// Basic node stored in a string linked list
public class CitationListNode
{
	public CitationListNode( String theElement, CitationPoint cp )
	{
		this ( theElement, cp, null );
	}

	public CitationListNode( String theElement, 
		CitationPoint cp, CitationListNode n) 
	{
		element = theElement;
		next = n;
		freq = 1;
		citeList = new LinkedList();
		citeList.insert(cp, citeList.zeroth());
	}

	/**
	 * Increase the frequency when a word is inserted 
	 * into a list containing it
	 */
	public void addFreq()
	{
		freq ++;
	}

	int freq;			// The frequency of the word

	String element;

	LinkedList citeList;

	CitationListNode next;

}

File CitationPoint.java:

/**
 * Class Citation used to store sentence information
 */
public class CitationPoint extends Object
{
	int nbeg;		// The beginning index of the sentence in the array
	int nlen;		// The length of the sentence

	/**
	 * Constructor for the class
	 * @param _nbeg the beginning index
	 * @param _nlen the length of the sentence
	 */
	public CitationPoint (int _nbeg, int _nlen)
	{
		this.nbeg = _nbeg;
		this.nlen = _nlen;
	}
}

File JavaHashtable.java:

import java.util.Hashtable;

/**
 * JavaHashtable class, Symbol table implemented using Java Hashtable package
 * @author: Changbin Wang (cw170@columbia.edu)
 */
public class JavaHashtable
{
	/**
	 * Constructor
	 */
	public JavaHashtable()
	{
		table = new Hashtable();

		currentSize = 0;
	}

	/**
	 * Insert a word and its sentence information into the table
	 * @param w the word and its sentence information
	 */
	public void insert(Word w)
	{
		String word = w.getWord();
		CitationPoint cp = w.getSentenceInfo();

			// If the word exists in the table, increase its frequency
			// and insert its sentence information into the linked list
		if(table.containsKey(word))
		{
			JHTableItem ti = (JHTableItem)table.get(word);
			ti.insert(cp);
			maxFreq = (maxFreq.getFreq() >= ti.getFreq())? maxFreq: ti;
			table.put(word,ti);
		}
			// If the word isn't in the table, insert the word and 
			// its sentence information
		else
		{
			JHTableItem ti = new JHTableItem(word);
			ti.insert(cp);
			table.put(word, ti);
			currentSize ++;
			if(currentSize == 1) maxFreq = ti;
		}
	}

	/**
	 * Get the sentence information of a given word
	 * @param word, the word looked for
	 * @return the linked list containing the sentence information
	 */
	public LinkedList get(String word)
	{
		JHTableItem ti = (JHTableItem)table.get(word);
		return (ti==null)?null:ti.getList();
	}

	/**
	 * Return the frequency of word in the text
	 * @param word, the word looked for
	 * @return the frequency
	 */
	public int getFreq(String word)
	{
		JHTableItem ti = (JHTableItem)table.get(word);
		return (ti==null)?-1:ti.getFreq();
	}

	/**
	 * Return the total distinct word number of the text
	 * @return the total word number
	 */
	public int getWordNum()
	{
		return currentSize;
	}

	/**
	 * Return the word item with maximum frequency
	 * @return the table item with maximum frequency
	 */
	public JHTableItem findMaxFreq()
	{
		return maxFreq;
	}

	private Hashtable table;

	private int currentSize;

	private JHTableItem maxFreq;
}

File JHTableItem.java:

/**
 * JHTableItem class, the symbol table item used in the Hashtable(Java)
 * implementation
 */
class JHTableItem
{
	/**
	 * Constructor
	 * @param word, the word and its sentence information stored in the item
	 */
	public JHTableItem(String word)
	{
		this.word = word;
		citeList = new LinkedList();
		freq = 0;
	}

	/**
	 * Insert a sentence information into the linked list of this item
	 * @param cp the sentence information
	 */
	public void insert(CitationPoint cp)
	{
		citeList.insert(cp, citeList.findPrevious(null));
		freq ++;
	}

	/**
	 * Return the linked list for the sentence information of the word
	 * @return the linked list of sentence information
	 */
	public LinkedList getList()
	{
		return citeList;
	}

	/**
	 * Return the frequency of the work this item stands for
	 * @return the frequency
	 */
	public int getFreq()
	{
		return freq;
	}

	/**
	 * Return the word this item stands for
	 * @return the word
	 */
	public String getWord()
	{
		return word;
	}

	private LinkedList citeList;

	private int freq;

	private String word;
}

File PrintStack.java:

/**
 * Class PrintStack mimic a stack to print word list or sentence
 * in word wrapping manner
 */
public class PrintStack
{
	int lineWidth;			// The width of a line 
	int currentSize;		// Used to count the number of output charaters

	/**
	 * Construct the class
	 * @param n the line width
	 */
	public PrintStack(int n)
	{
		lineWidth = n;
		currentSize = 0;
	}

	/**
	 * Method used to print out word list with word wrapping feature
	 * @param str the word to print out
	 */
	public void printWord (String str)
	{
		int n = str.length();				// The length of the word

		// Output word if this line can make room for it
		if((currentSize + n) <= lineWidth)	
		{
			currentSize += n;				// Increment the current size of the line 
											// by the word length
			System.out.print(str);
		}
		// Else start a new line and set outputed charactor length to the length of word
		else								
		{
			System.out.println();
			currentSize = n;
			System.out.print(str);
		}
	}

	/**
	 * Method used to print out sentences with word wrapping feature
	 * @param str the sentence to be output
	 */
	public void printSentence (String str)
	{
		int n = str.length();				// The length of the sentence

		int nbeg = 0;			// The beginning index in str of each new line
								// initialized to 0
		// Output new lines if not finish
		while(nbeg < n)			
		{
			int nend = nbeg+lineWidth;		// Index of end index of a line in sentence
			nend = (nendgt;=n)?n:nend;		// Set end index to the length of sentence
											// if this index exceed sentence length

			// If the last charater of the line is not space"
			// back track the characters to find a space
			for(int i= nend-1; igt;0 && 
				((str.charAt(i)gt;64 && str.charAt(i)<91)||
				(str.charAt(i)gt;96 && str.charAt(i)<123)); i--) 
				nend -= 1;

			// Output the line according to the start and end indice
			System.out.println(str.substring(nbeg, nend));

			// Update the begining point of the new output line
			nbeg = nend;
		}
	}
}

File SeparateChaining.java:

    // SeparateChaining class
    //
    // CONSTRUCTION: with an approximate initial size or default of 101
    //
    // ******************PUBLIC OPERATIONS*********************
    // void insert( x )       --> Insert x
    // void remove( x )       --> Remove x
    // Hashable find( x )     --> Return item that matches x
    // void makeEmpty( )      --> Remove all items
    // int hash( String str, int tableSize )
    //                        --> Static method to hash strings
    // ********************************************************

    /**
     * Separate chaining table implementation of symbol table.
     * Note that all "matching" is based on the equals method.
     * @author Changbin Wang
     */
    public class SeparateChaining
    {
        /**
         * Construct the hash table.
         */
        public SeparateChaining( )
        {
            this( DEFAULT_TABLE_SIZE );
        }

        /**
         * Construct the hash table.
         * @param size approximate table size.
         */
        public SeparateChaining( int size )
        {
            chainElements = new SPTableItem[ nextPrime( size ) ];
            for( int i = 0; i < chainElements.length; i++ )
                chainElements[ i ] = new SPTableItem( );
        }

        /**
         * Insert into the hash table. If the item is
         * already present, then do nothing.
         * @param x the item to insert.
         */
        public void insert( Hashable x )
        {
            SPTableItem whichElement = chainElements[ x.hash( chainElements.length ) ];
			whichElement.insert((Word)x);
        }

        /**
         * Remove from the hash table.
         * @param x the item to remove.
         */
        public void remove( Hashable x )
        {
           chainElements[ x.hash( chainElements.length ) ].remove(((Word)x).getWord());
        }

        /**
         * Find an item in the hash table.
         * @param word the item to search for.
         * @return the matching item, or null if not found.
         */
        public LinkedList get( String word )
        {
			Word x = new Word(word, null);
            return chainElements[ x.hash( chainElements.length ) ].get( word );
        }

		public int getFreq( String word )
		{
			Word x = new Word(word, null);
            return chainElements[ x.hash( chainElements.length ) ].getFreq( word );
		}

        /**
         * Make the hash table logically empty.
         */
        public void makeEmpty( )
        {
            for( int i = 0; i < chainElements.length; i++ )
                chainElements[ i ].makeEmpty( );
        }

		/**
		 * Return the total number of collisions in the hash table
		 * @return the total number of collisions
		 */
		public int getTotalCollision()
		{
			int sum = 0;
            for( int i = 0; i < chainElements.length; i++ )
                sum += chainElements[i].getCollision();
			return sum;
		}

		/**
		 * Return the number of collisions in the collision bucket
		 * @return the max number of collisions in different position
		 */
		public int getMaxCollision()
		{
			int max = chainElements[0].getCollision();
            for( int i = 0; i < chainElements.length; i++ )
			{
				int c = chainElements[i].getCollision();
                max = (max>=c)? max: c;
			}
			return max;
		}

		/**
		 * Return the collision bucket position in the table
		 * @return the position of collision bucket
		 */
		public int getCollisionBucket()
		{
			int cmax = chainElements[0].getCollision();
			int pos = 0;

			for( int i = 0; i < chainElements.length; i++ )
			{
				int c = chainElements[i].getCollision();
                if(cmax < c)
				{
					cmax = c;
					pos = i;
				}
			}
			return pos;
		}

		/** 
		 * Return the total number of distinct words hashed into the table
		 * @return the number of distinct words
		 */
		public int getWordNum()
		{
			int sum = 0;
			for( int i = 0; i < chainElements.length; i++ )
				sum += chainElements[i].getWordNum();
			return sum;
		}

		/**
		 * Return the table size
		 * @return the size of the table
		 */
		public int getSize()
		{
			return chainElements.length;
		}

		/** 
		 * Return the load factor of this hash table
		 * @return the load factor
		 */
		public float getLoadFactor()
		{
			int oPos = 0;
			for( int i = 0; i < chainElements.length; i++ )
				if(chainElements[i].getWordNum()>0)
					oPos ++;
			return ((float)oPos)/getSize();
		}

		/**
		 * Return the list iterator of word with maximum occurance frequency
		 * @return the list iterator
		 */
		public CitationListItr findMaxFreq()
		{
			CitationListItr mf = chainElements[0].findMaxFreq();
			for (int i = 1; i < chainElements.length; i++ )
			{
				CitationListItr cl = chainElements[i].findMaxFreq();
				if(cl.retrieveFreq() > mf.retrieveFreq()) mf = cl;
			}
			return mf;
		}
				
        /**
         * A hash routine for String objects.
         * @param key the String to hash.
         * @param tableSize the size of the hash table.
         * @return the hash value.
         */
        public static int hash( String key, int tableSize )
        {
            int hashVal = 0;

            for( int i = 0; i < key.length( ); i++ )
                hashVal = 37 * hashVal + key.charAt( i );

            hashVal %= tableSize;
            if( hashVal < 0 )
                hashVal += tableSize;

            return hashVal;
        }

        private static final int DEFAULT_TABLE_SIZE = 101;

            /** The array of Lists. */
        private SPTableItem [ ] chainElements; 

        /**
         * Internal method to find a prime number at least as large as n.
         * @param n the starting number (must be positive).
         * @return a prime number larger than or equal to n.
         */
        private static int nextPrime( int n )
        {
            if( n % 2 == 0 )
                n++;

            for( ; !isPrime( n ); n += 2 )
                ;

            return n;
        }

        /**
         * Internal method to test if a number is prime.
         * Not an efficient algorithm.
         * @param n the number to test.
         * @return the result of the test.
         */
        private static boolean isPrime( int n )
        {
            if( n == 2 || n == 3 )
                return true;

            if( n == 1 || n % 2 == 0 )
                return false;

            for( int i = 3; i * i <= n; i += 2 )
                if( n % i == 0 )
                    return false;

            return true;
        }
}

File SPTableItem.java:

// SPTableItem class
// Class used to hold values corresponding to keys in the symbol table
//
class SPTableItem
{
	public SPTableItem()
	{
		collision = 0;

		itemList = new CitationList();
	}

	public void makeEmpty()
	{
		collision = 0;
		itemList.makeEmpty();
	}

	/**
	 * Insert a word and its sentence information into the hash table
	 * @param w the word inserted
	 */
	public void insert (Word w)
	{
		if(itemList.insert(w)) collision++;
	}

	/**
	 * Return the sentences information containing a certain word
	 * @param word the word looked for
	 * @return the linked list of sentence information
	 */
	public LinkedList get(String word)
	{
		CitationListItr itr = itemList.find(word);

		return (itr.isPastEnd())? null: itr.retrieveSentenceList();
	}

	/**
	 * Return the frequency of a certain given word
	 * @param word the word looked for
	 * @return the frequency
	 */
	public int getFreq(String word)
	{
		CitationListItr itr = itemList.find(word);
		return (itr.isPastEnd())? -1: itr.retrieveFreq();
	}

	/**
	 * Remove a word from the hash table (not used in this project)
	 * @param word the word to be removed
	 */
	public void remove(String word)
	{
		itemList.remove(word);
	}

	/**
	 * Return the word number of this position of the hash table
	 * @retun the word number
	 */
	public int getWordNum()
	{
		return itemList.getSize();
	}

	/**
	 * Return the collision number of this position
	 * @return the number of collision
	 */
	public int getCollision()
	{
		return collision;
	}

	/**
	 * Return the list iterator of the word with maximum occurance frequency
	 * @return the resulted list iterator
	 */
	public CitationListItr findMaxFreq()
	{
		return itemList.findMaxFreq();
	}

	private int collision;

	private CitationList itemList;

}

File Word.java:

/**
 * Word class, wrapped class for a string word and its sentence information
 */
public class Word implements Comparable, Hashable
{
	/**
	 * Constrcutor
	 */
	public Word()
	{
		this(null, null);
	}

	/**
	 * Constructor
	 * @param w the word
	 * @param cp the sentence information
	 */
	public Word (String w, CitationPoint cp)
	{
		word = w;
		citePoint = cp;
	}

	/**
	 * Return the string word
	 */
	public String getWord()
	{
		return word;
	}

	/**
	 * Return the sentence information
	 * @return the sentence information
	 */
	public CitationPoint getSentenceInfo()
	{
		return citePoint;
	}

	/**
	 * Implement the compareTo method of Comparable interface
	 */
    public int compareTo( Comparable rhs )
    {
        return word.compareTo(((Word)rhs).word);
    }

	/**
	 * Equals 
	 */
    public boolean equals( Object rhs )
    {
        return rhs != null && word == ((Word)rhs).word;
    }

	/**
	 * The hash function of the word
	 * @param tableSize, the table size of the hash table
	 * @return the hash index 
	 */
    public int hash( int tableSize )
    {
        int hashVal = 0;

        for( int i = 0; i < word.length( ); i++ )
            hashVal = 37 * hashVal + word.charAt( i );

        hashVal %= tableSize;
        if( hashVal < 0 )
            hashVal += tableSize;

        return hashVal;
    }

	private String word;

	private CitationPoint citePoint;
}


[CS3139 Home]