COMS W4705: Homeworks

Gilbert Street

  • December 18th: Homework 2 solutions are available here.

  • November 30th: Homework 3 is released (analytical questions only). It is due on December 12th, 2011.
  • November 5th: Homework 2 is released: analytical part, programming part. Both parts are due November 22nd. Files for the programming assignment: corpus.en.gz, corpus.de.gz, devwords.txt. Sample results: sample_t_model1.txt, alignment_sample_model1.txt, alignment_sample_model2.txt.
  • October 18th: Homework 1 solutions are available here.
  • September 19th: Homework 1 is released here. The analytical problems are due September 30th; the programming assignment is due October 11th (see the homework for full details). Files for the programming assignment: count_freqs.py, ner_train.dat, ner_dev_fixed.dat, ner_dev_fixed.key, eval_ne_tagger.py.
    The old development set (which needs parameter smoothing to tag some sentences) is still available: ner_dev.dat, ner_dev.key.

    Submission guidelines and policy

    Submission Instructions

    1. Analytical part: problems must be submitted in hard copy (either hand-written or printed) to the box in front of 723 CEPSR/Shapiro.
    2. Programming assignments: Place the files for all problems in a directory named [your_uni]_h[X], where X is the number of the programming assignment. For instance, if your uni is xy1234 and you are submitting the first programming assignment, the directory should be called xy1234_h1. Compress the directory with either zip or tar and gzip, then upload the archive to the directory for programming assignment X on the Courseworks page for this class.
    3. Update on late policy: we will give students 3 "free" days that can be used as they wish across the 4 problem sets. Specifically, we will not penalize the first 3 late days that a student incurs on problem sets. After that, the penalties posted on the problem sets will apply (e.g., 5 points per day late on the first problem set). The final (0 point) deadline will still apply; for example, for problem set 1, any solutions handed in after Oct 3 (analytical part) or Oct 14 (programming part) will get 0 points.
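
    The directory naming and packaging steps in item 2 can be sketched in Python as follows. This is only an illustration, not an official submission script: the function names are hypothetical, and the archive file name (directory name plus .tar.gz) is an assumption, since the instructions above only fix the name of the directory itself.

```python
import os
import tarfile


def submission_dir_name(uni, assignment_number):
    """Build the directory name in the form [your_uni]_h[X]."""
    return "{0}_h{1}".format(uni, assignment_number)


def package_submission(parent_dir, uni, assignment_number):
    """Tar and gzip the submission directory; return the archive path.

    The archive name (<directory name>.tar.gz) is an assumption made
    for this sketch.
    """
    dir_name = submission_dir_name(uni, assignment_number)
    src = os.path.join(parent_dir, dir_name)
    if not os.path.isdir(src):
        raise ValueError("expected a directory named " + dir_name)
    archive_path = os.path.join(parent_dir, dir_name + ".tar.gz")
    tar = tarfile.open(archive_path, "w:gz")
    try:
        # arcname keeps the [your_uni]_h[X] directory as the top level
        # of the archive instead of the full local path.
        tar.add(src, arcname=dir_name)
    finally:
        tar.close()
    return archive_path
```

    For example, package_submission(".", "xy1234", 1) would pack ./xy1234_h1 into ./xy1234_h1.tar.gz, which can then be uploaded to Courseworks.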

    Programming Assignments Policy and Guidelines

    1. Your code should compile and run on the CLIC machines. We recommend solving problems in Python (<= v2.7 / 3.0), Java (<= v1.6), or Perl (<= v5.10) to ensure compatibility. If you want to use any other language, please request approval from the TAs before you start coding.
    2. Document your code! Undocumented code will result in lower scores.
    3. Write a brief report describing the results of your experiments, any observations you made, your design choices, and instructions on how to build (if necessary) and run your implementation (command line arguments, whether data is fed to your program on stdin or from a file, etc.). The report is part of your solution and will be scored. It can be in plain text or PDF.
    4. Make sure your program implements any specific functionality we ask for (input/output format etc.).
    5. Efficiency of your implementation matters only when we ask for it (in that case, your algorithms should meet the stated time and space requirements).
    6. You should be able to solve all problems using pre-installed standard libraries. Do not use any NLP or machine learning libraries. If you choose to use third-party libraries or modules (e.g., numeric computing frameworks such as numpy), make sure they are installed on CLIC. When in doubt about whether it is okay to use third-party code, ask the TAs.
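
    Item 3 asks the report to state whether data is fed to the program on stdin or from a file. One way to support both, using only the standard library, is sketched below. The function name and the single optional command line argument are assumptions made for illustration; each assignment specifies its own input and output format.

```python
import sys


def read_lines(path=None):
    """Return input lines without trailing newlines.

    Reads from the file at `path` if one is given, otherwise from stdin.
    """
    if path is None:
        return [line.rstrip("\n") for line in sys.stdin]
    f = open(path)
    try:
        return [line.rstrip("\n") for line in f]
    finally:
        f.close()


def main(argv):
    # Hypothetical usage: python my_program.py [input_file]
    # With no argument, the program reads from stdin instead.
    path = argv[1] if len(argv) > 1 else None
    for line in read_lines(path):
        # Placeholder processing: echo each line to stdout.
        sys.stdout.write(line + "\n")
```

    Calling main(sys.argv) at the bottom of the script makes it runnable from the command line. The sketch avoids argparse on purpose, since that module is not available in every Python version listed in item 1.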

    Group Work and Academic Honesty Policy

    All problems must be solved individually. You may discuss the problems with other students, but you have to do the write-up and implementation yourself. We will check homework assignments for duplicates. Violations will result in a grade of zero and further steps may be taken in accordance with the CS department's academic honesty policy.