COMS 4180 Project Page: Michael E. Locasto


  "But, Holmes!" I cried, "How do you plan to keep them from knowing?"
  "My dear Watson," stated Holmes calmly, "I propose to employ a cipher of my own devising..."

Proposal

While simple mono-alphabetic substitution ciphers have long since been rendered useless by modern cryptoanalytic methods, it is still an interesting topic to explore to learn about the basics of cryptography. Simple substitutions may be easy to crack with a computer, and may only resist an "untrained by diligent interceptor"[1] for a matter of hours. However, even simple encryption is useful to stop any given viewer - and it's an encryption method that the ordinary people can comprehend without having to know about NP-Complete problems and polynomial time reducibility.

In addition, as a graduation present, I was given a book of "Sit & Solve cryptograms" that use simple mono-alphabetic substitution to encode famous quotes. Because Larry Wall reminds us that laziness is one of a programmer's prime virtues, I do not want to employ my brain to decode the quotes, preferring rather to let my CPU do the work for this and all future such problems.

Goals and Requirements

In this project, I will be working to decode both mono & polyalphabetic ciphers.

My primary purpose is two-fold: solve these cryptograms and do some statistical research into the threshold values and critical mass of any given plaintext KB (knowledge base) and corrosponding ciphertext. In addition, I will attempt to employ the Kasiski method for decoding polyalphabetic ciphers. The method for decoding monoalphabetic English ciphertexts will be through mapping pure frequency distributions between cipher and plain texts.

Preliminary Design

A KB will be a simple XML file (or table in a database) that maps a given English character or punctuation mark to a frequency. The KB will be updated through automatic scanning of webpages and through text selections that I enter myself. Thus, a module will be constructed to recieve input and update this KB as necessary. I'd rather use XML than a database because a db would be overkill and yet another requirement that adds complexity to the software. A simple file can be easily distributed with the software and requires no third party drivers or a database needlessly running a small table.

A core backend will read this KB in and compare the freqency distribution it generates from a given ciphertext. The backend will then make available a best approximation of the original plaintext, as well as any further suggestions (eg, this three letter word could be AND or ANY).

Hopefully, I'll be able to incorporate this backend and a master KB (XML file) into a Java Servlet and run this as a service via the web for demo or just plain fun purposes.

I have to do research yet into the Kasiski method to plan out a design for that.

Literature Review & Survey

The liturature review.

Proposal Presentation/Webpage

Can be found [at this page].

Paper

The final write up is here.

Source Code, Documentation, & Overall Results

Souce code and object code (JAR file) for the stand-alone software is here, in .tar.gz format. Compiled with j2sdk1.4.1. It includes some KBs in properties files.

The web-based demo is available.

Overall results:

Final Presentation

The final presentation consists of the source and object code, the final writeup, this webpage, and the web-based demo.

References

1. Pfleeger, Charles P. Security In Computing, Second Edition Prentice Hall. 1997.


Copyright © 2002 Michael E. Locasto