W6998 Spring 2015: Seminar on Spoken Dialogue Systems

Computer Science COMSE6998_006_2015_1:
Seminar on Spoken Dialogue Systems
Spring 2015

Announcements

5/4: Congratulations to the FOOD SPEAK team (Annanya, Leighanne, and Mandi) on winning the best demo prize!
3/20: Think of where you might submit your work. A demo track option: Sigdial demonstration (deadline is April 30); a research track option: EMNLP short paper (deadline is June 15).
3/9: Related work and methods descriptions submission deadline is extended to 3/13.
2/8: System descriptions are due 2/23. More details have been added to the deliverables for the system descriptions.
1/26: Class cancelled due to snow.
First Class January 26. Welcome to the Seminar on Dialogue Systems!

General Information

Description

Seminar on Dialogue Systems introduces students to research on automatic spoken dialogue systems. Dialogue systems process spoken (or typed) user input and respond to the user in natural language. Dialogue systems are used commercially to process customer requests, such as providing train schedule information, customer service for banks and stores, etc. Dialogue systems running on personal/home devices, like Apple’s Siri or Amazon’s Echo, answer open-ended questions. In this course, we will discuss state-of-the-art research on dialogue systems and their components: speech recognition, language understanding, dialogue management, and natural language generation. We will cover different types of dialogue systems, including information providing systems, task-oriented systems, tutoring systems, and multimodal systems. In the course project, the students will have a chance to build their own dialogue system. Classes will be lecture and discussion with an emphasis on group participation. There will be no final exam in this course.

Prerequisites

COMS 3133/4/7/9 (Data Structures) or equivalent programming ability in at least one systems or scripting language (Java, Python)

Instructor: Svetlana Stoyanchev

sstoyanchev [who is at] cs [dot] columbia [dot] edu
Office Hours: Mondays 2-4 CEPSR 7LW3 (speech lab) or by appointment on Skype
Skype ID: svetastenchikova

TA: Victor Soto

vsoto [who is at] cs [dot] columbia [dot] edu
Office Hours: Tuesdays 4-6 CEPSR 7LW3 (speech lab)

Lectures

Mondays 4:10-6:00, Mudd Building Room 545

Grade Breakdown

20% Class participation

30% Class presentation

Send your list of preferred papers for the presentation by Feb 2; it is important that you email your preferences.

50% Course Project

Project Description: 5%
Related Work and Methodology: 5%
In-class demonstration: 20%
Final paper draft (including experiments and evaluation): 20%

Reading schedule

Schedule is tentative and highly subject to change.

Discussion papers:

Date	Topics	Readings	Due dates
1/26	Class Introduction Lecture slides	Explore links and videos in the Course Resources
2/2	Task-oriented dialogue systems: information-providing, troubleshooting, making reservations	Background reading: Pierre Lison's thesis Chapter 2 Discussion papers: Raux, A., Langner, B., Bohus, D., Black, A., and Eskenazi, M. Let's Go Public! Taking a Spoken Dialog System to the Real World, in Interspeech-2005, Lisbon, Portugal William Swartout, David Traum, Ron Artstein, Dan Noren, Paul Debevec, Kerry Bronnenkant, Josh Williams, Anton Leuski, Shrikanth S. Narayanan, Diane Piepol, Chad Lane, Jacquelyn Morie, Priti Aggarwal, Matt Liewer, Jen-Yuan Chiang, Jillian Gerten, Selina Chu and Kyle White, Ada and Grace: Toward Realistic and Engaging Virtual Museum Guides, in Proceedings of the 10th International Conference on Intelligent Virtual Agents (IVA), 2010 Meena, R., Boyer, J., Skantze, G., & Gustafson, J. Crowdsourcing Street-level Geographic Information Using a Spoken Dialogue System. In 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue - SIGdial, 2014 Additional papers: J. Weizenbaum ELIZA--A Computer Program For the Study of Natural Language Communication Between Man and Machine Communications of the ACM Volume 9, Number 1 (January 1966): 36-45. Diane Litman and Scott Silliman. 2004. ITSPOKE: An Intelligent Tutoring Spoken Dialogue System. Companion Proceedings of the Human Language Technology Conference: 4th Meeting of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), Boston, MA. Raux, A., Bohus, D., Langner, B., Black, A., and Eskenazi, M. - Doing Research in a Deployed Spoken Dialog System: One Year of Let's Go! Public Experience, in Interspeech-2006, Pittsburgh, PA David Traum, Priti Aggarwal, Ron Artstein, Susan Foutz, Jillian Gerten, Athanasios Katsamanis, Anton Leuski, Dan Noren, and William Swartout. Ada and Grace: Direct interaction with museum visitors. In Yukiko Nakano, Michael Neff, Ana Paiva, and Marilyn Walker, editors, Intelligent Virtual Agents (IVA-2012), volume 7502 of Lecture Notes in Computer Science, pages 245-251. Springer Berlin / Heidelberg, 2012.	Create an account on wit.ai and go through the Quickstart guide; send a list of 5 preferred papers for your presentation
2/9	Discussion: Speech Recognition and Language Understanding for Dialogue Systems; Lecture slides (presented on 2/2) Lecture: Dialogue Modelling and Management	Background papers: Renato De Mori, Frédéric Béchet, Dilek Hakkani-Tur, Michael McTear, Giuseppe Riccardi, and Gokhan Tur Spoken Language Understanding: Interpreting the signs given by a speech signal, IEEE SIGNAL PROCESSING MAGAZINE, MAY 2008 W. Ward and S. Issar, Recent improvements in the CMU spoken language understanding system, in Proc. of the ARPA Human Language Technology Workshop. 1996, pp. 213–216, Morgan Kaufmann Publishers, Inc. Discussion papers: Mandy Korpusik, Nicole Schmidt, Jennifer Drexler, Scott Cyphers, and James Glass DATA COLLECTION AND LANGUAGE UNDERSTANDING OF FOOD DESCRIPTIONS In IEEE SLT Workshop, 2014 Presenter: Cecilia Reyes Fabrizio Morbini and Eric Forbell and Kenji Sagae Improving Classification-Based Natural Language Understanding with Non-Expert Annotation in Proceedings of SigDial 2014 Presenter: Leighanne Hsu Ali El-Kahky, Derek Liu, Ruhi Sarikaya, Gokhan Tur, Dilek Hakkani-Tur, and Larry Heck Extending Domain Coverage of Language Understanding Systems via Intent Transfer Between Domains Using Knowledge Graphs and Search Query Click Logs IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2014 Presenter: Dara Pir Additional papers: Yulan He, Steve Young A data-driven spoken language understanding system, In IEEE Workshop on Automatic Speech Recognition and Understanding Murat Akbacak, Dilek Hakkani-Tur, and Gokhan Tur, Rapidly building domain-specific entity-centric language models using semantic web knowledge resources, in Proceedings of Interspeech, ISCA - International Speech Communication Association, September 2014	Download and install OpenDial; go through the tutorial.
2/16	Discussions: Dialogue Modelling and Management Lecture slides	Background papers: Mark G Core and James Allen. 1997. Coding dialogs with the DAMSL annotation scheme. In AAAI fall symposium on communicative action in humans and machines, pages 28–35. Boston, MA. Jean Carletta, Stephen Isard, Gwyneth Doherty Sneddon, Amy Isard, Jacqueline C Kowtko, and Anne H Anderson. 1997. The reliability of a dialogue structure coding scheme. Computational linguistics, 23(1):13–31. Karen Lochbaum, A collaborative computational planning model of intentional structure in Computational Linguistics, 1998 David R. Traum, Speech Acts for Dialogue Agents, in Michael Wooldridge and Anand Rao, editors, ``Foundations And Theories Of Rational Agents'', Kluwer Academic Publishers, pages 169--201, 1999. Harry Bunt, Jan Alexandersson, Jean Carletta, Jae-Woong Choe, Alex Chengyu Fang, Koiti Hasida, Kiyong Lee, Volha Petukhova, Andrei Popescu-Belis, Laurent Romary, Claudia Soria, and David Traum Towards an ISO standard for dialogue act annotation. Proceedings of the seventh international conference on language resources and evaluation (LREC 2010), Paris: ELRA (2010) Discussion papers: David Traum and Staffan Larsson, The Information State Approach to Dialogue Management. in Current and New Directions in Discourse and Dialogue, Ed. Jan van Kuppevelt and Ronnie Smith, Kluwer, pages 325--353, 2003. Students' questions Ben Hixon, Rebecca Passonneau Open Dialogue Management for Relational Databases NAACL 2013 Presenter: Jee Hyun Wang slides ; Students' questions Pierre Lison. Model-based Bayesian Reinforcement Learning for Dialogue Management. In Proceedings of the 14th International Conference of the Speech Communication Association (Interspeech 2013), Lyon, France, 2013. Presenter: Edward Li** slides; Students' questions Additional Papers: S. Bangalore, G. Di Fabbrizio and A. Stent, ``Learning the Structure of Task-driven Human-Human Dialogs'', with ACL-Coling, Sydney, Australia, 2006. Bohus, D., Raux, A., Harris, T., Eskenazi, M., and Rudnicky, A. (2007) - Olympus: an open-source framework for conversational spoken language interface research, in HLT-NAACL 2007 workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technology, Rochester, NY Levin, Pieraccini and Eckert A Stochastic Model of Human-Machine Interaction for Learning Dialog Strategies. Transactions on Speech and Audio Processing, 2000. Anton Leuski and David DeVault A Study in How NLU Performance Can Affect the Choice of Dialogue System Architecture, in the 13th annual SIGdial Meeting on Discourse and Dialogue (SigDial 2012).	Find a team (email the instructor); make an appointment with the TA or instructor to discuss your project ideas
2/23	Information Presentation in Dialogue slides (Frameworks summary)	Discussion papers: students' questions Taghi Paksima, Kallirroi Georgila, and Johanna D. Moore. Evaluating the Effectiveness of Information Presentation in a Full End-to-End Dialogue System. In Proceedings of the 10th Annual SIGdial Meeting on Discourse and Dialogue, pp. 1-10, London, UK, 2009. [Presenter: Samara Trilling ] Margaret Mitchell, Dan Bohus, Ece Kamar Crowdsourcing Language Generation Templates for Dialogue Systems. INLG, 2014 [presenter:Sarah Ita Levitan] Additional papers: M. Walker, S. Whittaker, A. Stent, P. Maloor, J. Moore, M. Johnston, and G. Vasireddy Generation and evaluation of user tailored responses in multimodal dialogue. Cognitive Science, 28:811–840, 2004. David DeVault, David Traum, and Ron Artstein, Making Grammar-Based Generation Easier to Deploy in Dialogue Systems In proceedings of The 9th SIGdial Workshop on Discourse and Dialogue (SIGdial 2008), June, 2008. Srinivasan Janarthanam, Oliver Lemon Adaptive Generation in Dialogue Systems Using Dynamic User Modeling Computational Linguistics December 2014	Project Descriptions Due; Each team gives a 5-minute elevator speech about project in class.
3/2	Evaluation of Dialogue Systems SDSEval_03_02_15.pptx students' questions	Background Reading: H. Hastie (2012) Metrics and Evaluation of Spoken Dialogue Systems. In Data-Driven Methods for Adaptive Spoken Dialogue Systems Computational Learning for Conversational Interfaces. Oliver Lemon and Olivier Pietquin (Editors). (available for download at Columbia library) Marilyn A. Walker, Candace Kamm and Diane J. Litman. Towards Developing General Models of Usability with PARADISE. Natural Language Engineering, 2000. Discussion papers: (NLG) Brian McMahan and Matthew Stone Training an integrated sentence planner on user dialogue SigDial, 2014 [ presenter: Ananya Aniruddh Poddar] F. Jurčíček and S. Keizer and F. Mairesse and B. Thomson and K. Yu and S. Young Real user evaluation of spoken dialogue systems using Amazon Mechanical Turk. in Proceedings of Interspeech, 2011 [ presenter: Mandi Wang ] Additional papers: Hua Ai and Diane Litman. Assessing User Simulation for Dialog Systems using Human Judges and Automatic Evaluation Measures. Journal of Natural Language Engineering, Volume 17 Issue 4, pages 511-540. Favre B, Cheung K, Kazemian S, Lee A, Liu Y, Munteanu C, Nenkova A, Ochei D, Penn G, Tratz S, Voss C. and Zelle F. Automatic Human Utility Evaluation of ASR Systems: does WER Really predict Performance? INTERSPEECH, Lyon, 2013.	Create a GitHub repository for your team and send the git repo URL to the instructor and TA
3/9	Error recovery in dialogue systems (Lecture slides) (Student's questions)	Background papers: D. Bohus and A. I. Rudnicky. 2005. A principled approach for rejection threshold optimization in spoken dialog systems. In INTERSPEECH. Discussion papers: (Eval) K. Georgila, J. Henderson, and O. Lemon. 2005. Learning User Simulations for Information State Update Dialogue Systems. In Proceedings of Interspeech.[ presenter: Xiaoqian Ma ] Skantze, G. Making grounding decisions: Data-driven estimation of dialogue costs and confidence thresholds. In Proceedings of SigDial, 2007 (pp. 206-210). Antwerp, Belgium. Alex Liu, Rose Sloan, Mei-Vern Then, Svetlana Stoyanchev, Julia Hirschberg and Elizabeth Shriberg Detecting Inappropriate Clarification Requests in Spoken Dialogue Systems presenter: Daniel Sadik Maxson Additional papers: S. Stoyanchev, A. Liu, J. Hirschberg Towards Natural Clarification Questions in Dialogue Systems The Questions, discourse and dialogue symposium at AISB, 2014 S. Stoyanchev, A. Liu, J. Hirschberg Modelling Human Clarification Strategies Proceedings of the SIGDIAL 2013 Conference S. Stoyanchev and M. Johnston Localized Error Detection for Targeted Clarification in a Virtual Assistant in ICASSP 2015 ? A. Gravano, J. Hirschberg A Corpus-Based Study of Interruptions in Spoken Dialogue. INTERSPEECH, 2012	Related work and method description due
3/16	Spring Break
3/23	Tutoring dialog systems; adaptation in dialogue systems; Students' questions; discussion on frameworks OpenDial	Discussion papers: Alexandria Katarina Vail and Kristy Elizabeth Boyer Adapting to Personality Over Time: Examining the Effectiveness of Dialogue Policy Progressions in Task-Oriented Interaction in Proceedings of Sigdial, 2014 [presenter: Anna Prokofieva] Kate Forbes-Riley and Diane Litman. Adapting to Multiple Affective States in Spoken Dialogue. Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), pages 217-226, Seoul, South Korea, July. [presenter: Michael Yang] Additional papers: Keelan Evanini, Youngsoon So, Jidong Tao, Diego Zapata, Christine Luce, Laura Battistini, and Xinhao Wang. 2014. Performance of a trialogue-based prototype system for English language assessment for young learners. Proceedings of the Interspeech Workshop on Child Computer Interaction (WOCCI 2014), Singapore, September 19, 2014. Lallé, S., Mostow, J., Luengo, V., & Guin, N. (2013). Comparing Student Models in Different Formalisms by Predicting their Impact on Help Success. In Proceedings of the 16th International Conference on Artificial Intelligence in Education, 2013
3/30	Entrainment in dialogue	Invited speaker: Rivka Levitan, Brooklyn College CUNY. Background Reading (please read and submit questions for at least 1 of these papers): Chapter 15 in: R. Levitan, Acoustic-Prosodic Entrainment in Human-Human and Human-Computer Dialogue Z. Xia, R. Levitan, J. Hirschberg, Prosodic Entrainment in Mandarin and English: A Cross-Linguistic Comparison in Proceedings of Speech Prosody. Dublin, Ireland, May 2014. Rivka Levitan, Stefan Benus, Agustin Gravano and Julia Hirschberg Entrainment and Turn-Taking in Human-Human Dialogue, AAAI 2015 Spring Symposium Discussion papers (please read and submit questions for at least 2 of these papers): J. Lopes, M. Eskenazi and I. Trancoso, 2013, Automated Two-Way Entrainment to Improve Spoken Dialog System Performance, Proceedings ICASSP 2013. (this paper is available on Columbia network. Please email us if you have trouble accessing this.) D. Reitter, D. Moore Predicting Success in Dialogue, in Proceedings of ACL 2007 Zhichao Hu, Gabrielle Halberg, Carolynn R Jimenez, and Marilyn A Walker. Entrainment in pedestrian direction giving: How many kinds of entrainment? In Proceedings of 5th International Workshop on Spoken Dialog System, 2014.
4/6	Search and Dialogue; Voice search; Question answering	Guest Speaker: David Elson, Google Background papers: Rivka Levitan, David Elson: Detecting Retries of Voice Search Queries. ACL 2014 Discussion papers: Jiepu Jiang, Wei Jeng, and Daqing He. 2013. How do users respond to voice input errors?: lexical and phonetic query reformulation in voice search. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '13). Lu Wang, Larry Heck, and Dilek Hakkani-Tur, Leveraging Semantic Web Search and Browse Sessions for Multi-Turn Spoken Dialog Systems, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014. Additional Papers: J. Feng, S. Bangalore Effects of Word Confusion Networks on Voice Search in Proceedings of EACL 2009 David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager, Nico Schlaefer, and Chris Welty Building Watson: An Overview of the DeepQA Project , AI Magazine 2010	Make an appointment with the instructor or TA to show demo progress and discuss your project;
4/13	Domain Knowledge Acquisition; Situated Dialogue Systems; Dialogue with Robots	Guest lecture: Aasish Pappu, Yahoo labs Background Reading: Aasish Pappu, Alexander I. Rudnicky Learning Situated Knowledge Bases through Dialog In Proceedings of Interspeech 2014 Aasish Pappu Knowledge Discovery Through Spoken Dialog, Thesis 2014 Discussion papers: James Allen, Nathanael Chambers, George Ferguson, Lucian Galescu, Hyuckchul Jung, Mary Swift, and William Taysom. Plow: A collaborative task learning agent. In Proceedings of the National Conference on Artificial Intelligence, volume 22, page 1514. Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999, 2007. Hakkani-Tür, Dilek, et al. "Probabilistic enrichment of knowledge graph entities for relation detection in conversational understanding." Proceedings of Interspeech. 2014. Joyce Y Chai, Lanbo She, Rui Fang, Spencer Ottarson, Cody Littley, Changsong Liu, and Kenneth Hanson. Collaborative effort towards common ground in situated human-robot dialogue. In Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction, pages 33–40. ACM, 2014. Additional papers: H. Holzapfel, D. Neubig, and A. Waibel, “A dialogue approach to learning object descriptions and semantic categories,” Robotics and Autonomous Systems, vol. 56, no. 11, pp. 1004–1013, Nov. 2008
4/20	Statistical dialogue systems; Dialogue system competitions: Dialog State Tracking Challenge; REAL challenge	Guest lecture: Sungjin Lee, Yahoo Labs. Required Reading: Sungjin Lee and Maxine Eskenazi Recipe For Building Robust Spoken Dialog State Trackers: Dialog State Tracking Challenge System Description Sigdial 2013 Discussion papers: Jason Williams, Steve Young Partially observable Markov decision processes for spoken dialog systems Computer Speech and Language 2007 Ronnie Smith Comparative Error Analysis of Dialog State Tracking , Sigdial 2014
4/27	Multimodal dialogue systems	Guest Speaker: Michael Johnston, Interactions Corporation Background papers: Required M. Johnston and S. Bangalore Robust Understanding in Multimodal Interfaces, in Computational Linguistics, Volume 35, Number 3, September 2009 Michael Johnston, John Chen, Patrick Ehlen, Hyuckchul Jung, Jay Lieske, Aarthy Reddy, Ethan Selfridge, Svetlana Stoyanchev, Brant Vasilieff, Jay Wilpon, MVA: The Multimodal Virtual Assistant, Proceedings of Annual SIGdial Meeting on Discourse and Dialogue, 2014 M. Walker, S. Whittaker, A. Stent, P. Maloor, J. Moore, M. Johnston, and G. Vasireddy Generation and evaluation of user tailored responses in multimodal dialogue. Cognitive Science, 28:811–840, 2004. Discussion papers: Larry Heck, Dilek Hakkani-Tur, Madhu Chinthakunta, Gokhan Tur, Rukmini Iyer, Partha Parthasarathy, Lisa Stifelman, Elizabeth Shriberg, and Ashley Fidler, Multimodal Conversational Search and Browse, IEEE Workshop on Speech, Language and Audio in Multimedia, August 2013 Bohus, D., Horvitz, E., (2014) - Managing Human-Robot Engagement with Forecasts and ... um ... Hesitations, in Proceedings of ICMI'2014, Istanbul, Turkey
5/4	In-class Project Demonstrations Final slides
5/15	Final Draft Due (no extensions)

Course Presentation

Prepare 10-20 slides presenting the content of the paper you are assigned. Short papers will have 10-15 minute presentations and long papers will have 20-30 minute presentations.

describe the task addressed by the paper
approach proposed by the authors
data or system used
summarize the related work
describe the experiments and results

Prepare a critical analysis of the paper using a research review form:

Clarity
Originality
Implementation and soundness
Substance
Evaluation
Meaningful Comparison
Impact
Recommendation

The presentation sessions will be followed by a panel discussion with the presenters as panelists. Deliverables: 1) send a link to your slides before the class, 2) write a paragraph on each point of the critical analysis, 3) present the slides in class, 4) lead a discussion on the critical analysis of the paper.

Project

The project will involve 1) building a spoken dialogue system (SDS) in a domain of your choice; 2) proposing a research question (optional); and 3) evaluating your system/research question. You may use one of the existing frameworks (e.g. OpenDial, WitAI). You may use any architecture/platform that you are familiar with: a stand-alone application, a web application, or an Android or iPhone app. The TA and instructor will provide support with the OpenDial Java framework and with WitAI using the Python API (which we found to work on Linux but to have installation issues on a Mac).
The SDS should allow the user to ask questions related to your domain. You can structure the interactions you support however you want. You should design your system by first deciding what types of user interactions you will support and then creating a Dialog Flow Diagram, a directed graph showing the System States (e.g. Greeting, Help, Info-request, Exit) as nodes with arcs showing which state can follow which other state (e.g. Greeting can be followed by Info-request or Help or Exit, so there should be arrows connecting the Greeting node to each of these 3). Your SDS will identify the next state by considering the current state, the legal states this state is connected to in the graph, and the user input.
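As a rough illustration of such a diagram in code, the sketch below stores the graph as a dictionary and picks the next state from the legal successors of the current one. The successor lists beyond those given for Greeting, the cue words, and the function name next_state are all assumptions made for this example; a real system would replace the keyword matching with an NLU component.

```python
# Hypothetical dialog flow graph: each state maps to the states that may follow it.
DIALOG_FLOW = {
    "Greeting": ["Info-request", "Help", "Exit"],
    "Help": ["Info-request", "Exit"],
    "Info-request": ["Info-request", "Help", "Exit"],
    "Exit": [],
}

# Hypothetical cue words per state, checked in this order (illustration only).
STATE_KEYWORDS = {
    "Help": ["help", "what can"],
    "Exit": ["bye", "quit", "exit"],
    "Info-request": ["when", "where", "what", "find", "show"],
}


def next_state(current, user_input):
    """Choose the next state using the current state, its legal successors,
    and the user input. Stay in the current state if nothing matches."""
    legal = set(DIALOG_FLOW[current])
    text = user_input.lower()
    for candidate, cues in STATE_KEYWORDS.items():
        if candidate in legal and any(cue in text for cue in cues):
            return candidate
    return current  # no legal transition matched: stay and re-prompt


if __name__ == "__main__":
    state = "Greeting"
    for utterance in ["Can you help me?", "Find concerts tonight", "ok bye"]:
        state = next_state(state, utterance)
        print(utterance, "->", state)
```

Keeping the graph in one data structure makes it easy to check that every transition the system takes is one drawn in the diagram.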
Your SDS should handle multi-turn dialogues (in contrast with a single-turn search request or Q&A).
You should build your application with an idea of evaluation. What will make this application successful?
Your final report will be written in the form of a research paper and will include the following sections:

Introduction
Related Work
Method
Experiments
Conclusions

Project deliverables will include 1) System description, 2) Related Work and Method description, 3) System demonstration, and 4) Final paper draft. Each deliverable will contribute towards your final paper.
Project report deliverables are submitted by each team using the CourseWorks website. Please submit the documents in PDF format. Please include a section with team member names and the contribution made by each team member.
The code and running instructions will be kept in a GitHub repository. The instructor and TA will monitor your progress by occasionally pulling your current version of the code. Your project grade will take into account how readable and well-documented your code is. Include documentation with the instructions for running your system.

1. System Description
This report will form an initial draft of the Introduction section of your final report. Describe the domain that you chose for the system, which system architecture you intend to use, what the dialogue flow is, and an example dialogue with the system.

Your system should engage in an interactive dialogue (i.e. more than one turn of question/answer) and support multiple (at least 4-5) domain concepts (types of things that users can ask about and that your system will need to recognize in order to respond appropriately). For example, a flight arrival system might allow users to specify (1) an airline, (2) a departure city, (3) an arrival city, and (4) a flight number, although users might specify only 2 or 3 in any given query. A restaurant domain system might have neighborhood, price range, star rating, and cuisine.
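One possible way to represent the domain concepts of the flight arrival example is a frame with one slot per concept, filled over several turns. This is only a sketch: the class name FlightFrame, its methods, and the sample values are assumptions for illustration and are not part of OpenDial or WitAI.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FlightFrame:
    """Hypothetical frame holding the four concepts of the flight example."""
    airline: Optional[str] = None
    departure_city: Optional[str] = None
    arrival_city: Optional[str] = None
    flight_number: Optional[str] = None

    def update(self, **slots: Optional[str]) -> None:
        """Merge the concepts recognized in one user turn into the frame."""
        known = {f.name for f in fields(self)}
        for name, value in slots.items():
            if name not in known:
                raise KeyError(f"unknown domain concept: {name}")
            if value is not None:
                setattr(self, name, value)

    def missing(self) -> List[str]:
        """Concepts the user has not specified yet, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


if __name__ == "__main__":
    frame = FlightFrame()
    # First turn: the user mentions only two of the four concepts.
    frame.update(airline="Delta", arrival_city="Boston")
    print(frame.missing())
```

The list returned by missing() is what a dialogue manager could use to decide which follow-up question to ask, which is what makes the dialogue multi-turn.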

Example topics for the project include (but are not limited to):

A voice interface for an existing API:

A calendar system that interfaces with Google Calendar and allows a user to add/remove/query events in the calendar
A system that queries weather information
A system that answers questions in a dialogue about current events in NYC: find concerts/plays/movies at NYC venues
A voice interface for a travel API, e.g. TripAdvisor, that allows users to query hotels
Google search API
A text base (you can run an indexer, such as Lucene, on a text collection and use its API to search)

A chatbot system that uses a database on the back end, e.g.
- a chat interface for a toy that talks with children
- a chat interface that may be used in a museum to provide information for visitors

Deliverables: 1) a 5-minute in-class presentation of the project idea; 2) a 1-2 page description of the proposed dialogue system (Introduction part of the paper). The description should include:

Motivation for your choice of domain in terms of interestingness, utility, or challenge.
System functionality: describe the type of input you will accept from users, the domain concepts in your system, and the type of output you want to produce in as much detail as possible.
Front-end: what architecture will the system run on (stand-alone, iPhone, Android)?
Back-end: identify where you will get the data for the system (which API, database, ontology)
Framework: which framework do you intend to use for the system (if any), which ASR, TTS will you use, how will you implement NLU for your system.
Research or demo: is your project focusing on system development or system & research? What research questions do you intend to address?
Evaluation: how do you plan to evaluate your system/research question (user study)?

The project will have two options: 1) focus on developing an SDS; 2) focus on a research question. In both cases, you will be developing a system. Indicate in your submission what the focus of your project is.
Research focus: Describe a research question you intend to investigate using your system. Which experimental conditions will you investigate? (Having 2 experimental conditions is reasonable.) What is your hypothesis (which experimental condition do you expect to work best and why)? How will you evaluate your hypothesis?
Both system & research: Summarize the related work. Describe the evaluation procedure. For example, you may propose to evaluate your experimental system by having 10 people use the system and answer a questionnaire, or by measuring the speech recognition/language understanding performance of the system. Possible research questions include (but are not limited to):

Compare system performance using different speech recognisers (e.g. Kaldi, Pocket Sphinx, Google, Nuance)
Compare system performance with different TTS engines (e.g. Festival)
Build a statistical NLU for the OpenDial system. ** (this has a practical application), or try connecting WitAI NLU as a module for OpenDial
Build a multimodal graphical display for your system (e.g. as a module in OpenDial) and compare voice-only and multimodal condition
Experiment with clarification strategies in your system
Experiment with different methods of information presentation or natural language generation
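For questions that compare speech recognisers, one common measure of recognition performance is word error rate (WER): the word-level edit distance between a reference transcript and the recogniser output, divided by the number of reference words. The function below is a minimal sketch of that computation; the function name and the example sentences are made up for illustration, and text normalization is limited to lowercasing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a word-level Levenshtein distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word ("a") and one substituted word out of five reference words.
    print(word_error_rate("find a hotel in boston", "find hotel in austin"))
```

Running the same set of recorded user utterances through each recogniser and averaging this score gives one experimental condition per recogniser.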

See an example project description from a related course.

2. Related Work and Method Description
Finalize your choice of system architecture. Think through the design of your system. Test proposed tools to ensure the feasibility of your proposal. At this point we need to make sure that, whether you are using Wit.AI or OpenDial, the framework of your choice will be enough to support all the functionality you are proposing to implement in your system. If you are going to use a third-party API in the back end of your system, it is also very important that you understand what the input and output of the API are and what you will be able to do with it.

Deliverables: A report with a refined Introduction and a draft of the Related Work and Method sections. The report should show that you thought through the system design and functionality and confirmed the feasibility of your approach by trying out the functionality of the proposed tools. Evaluation criteria:

Clearly describe your motivation and goals.
Compare your proposed system with previous work in academia and/or industry.
Describe the design for your proposed system/experiments.
What is the system functionality from a user perspective (what tasks can a user accomplish using the system)?
Implementation details: Which ASR/NLU/TTS components is your system using? How will you train/create rules for NLU in your domain?
Describe third-party APIs that you will be using in your system, if any.
A high-level design of a dialogue manager: what will be the set of states and/or frames?
Motivation for your choices of tools/infrastructure (make sure that what you are proposing is feasible).
Describe how your system accesses the data and what intermediate representations you are using.
Describe your evaluation plan.

3. System Demonstration
Develop a functional dialogue system. This system will be used for your experiments.
Deliverables: Please prepare a demonstration of your system to be shown during the class (see the schedule). Each group will have up to 20 minutes to present. I recommend preparing a ~15-minute presentation with 5 minutes for questions. After the presentations, everyone can vote for the best presentation. There will be time to play with other teams' systems. Please submit your documented source code, with a README describing its content, to your GitHub repository.

4. Final Report
Recruit experiment participants (from the class or outside the class) who will be users of your system. Run an evaluation of your system/experimental condition. Analyze and summarize the results in the Evaluation section. In the Conclusions section, summarize your system and what you learned: what would you do differently, and how would you extend this work?
Deliverables: Submit a complete paper draft that includes your improved Introduction, Related Work, and Method sections and new Evaluation and Conclusion sections. It will be graded by the TA and the instructor according to the following criteria:

Clarity (overall clarity of your paper)
Implementation and soundness (method description)
Substance (how much work is accomplished)
Evaluation
Meaningful comparison (related work)
Impact of accompanying software (we will look at your github source code and evaluate for potential reusability)

In addition to the paper, please also include an Admin section and a Framework evaluation section with the following information:
Admin Section

URL of your github.
URL of your final report if you agree for it to be published on the course webpage
If you are planning to submit your work to a workshop, please indicate which workshop (I will be glad to help with your paper draft)
For each group member, a paragraph description of the project tasks accomplished

Framework evaluation section (OpenDial/WitAI)

What are the strengths and weaknesses of the framework you were using?
On a scale from 1 (worst) – 10 (best) How easy was it to start using the framework? How easy was it to debug your code?
What are the challenges that you discovered while using the framework?
For Wit.ai – how much training data did you need to gather to make it usable?
For OpenDial – how much effort did it take to build NLU/DM/NLG/external components?
What suggestions would you make for the authors of the framework to improve it?
What extensions would you suggest for the framework?
Would you use this framework for another dialogue system?

Links and Resources

Course slides can be found in Dropbox

Research-related:

S. Keshav, How to Read a Paper, ACM SIGCOMM Computer Communication Review 37(3) : 83-84, July 2007.
Philip W. L. Fong, How to Read a CS Research Paper, July 2004.
Research papers and authors can be found at Citeseer, Google Scholar
Research paper review form
Overleaf : A tool for collaborative editing in LaTeX format

Dialogue systems-related websites and articles:

Pierre Lison's thesis with an overview of dialogue systems
Chatbot challenge
Real challenge
Eliza chatbot interface
News about a new Virtual Assistant in financial domain
Dialogue-related conference: Sigdial

Student workshops for potential paper submission:

Real Challenge: Submit an idea for a dialogue system application; deadline May 14
ESSLLI in Barcelona (deadline March 25)
NAACL student workshop Beijing (deadline March 31)

Videos and talks:

Talk: The Story of Siri, by Adam Cheyer
Talk: Talking to Muppets, Challenges of Voice Interfaces for Kids by Oren Jacob
Talk: Alex Lebrun, the CEO and founder of Wit AI
Virtual humans: Ada and Grace in Museum of Science in Boston
Microsoft's Cortana

Amazon's Echo

A conversation between two chatbots

Sample APIs for projects (back-end):

Frameworks and tools for speech-enabled application development:

Commercial API (free for research purposes) that provides ASR/NLU capabilities: Wit.ai
Research Dialog Framework that provides Dialog management capabilities: OpenDial
SOX: for recording audio. "brew install sox" on OS X and "apt-get install sox" on Ubuntu.

Other:

Github repository
A website for setting up an online survey: Survey monkey

svetlana.stoyanchev [ at ] gmail [dot] com

Design adapted from David Elson's/Smaranda Muresan site design