Time and Location
Mon/Wed 11:00 AM-12:15 PM, Clapp 422
4th Hour: Fri 11:00-11:50 AM, Clapp 422
Email: ponbarry at mtholyoke dot edu
Office: Clapp 226
Office Hours: Mon 3:00-4:00 PM, Thurs 3:00-4:00 PM, or by appointment
Natural Language Processing (NLP) is the scientific and engineering discipline concerned with getting computers to understand and process human language. Speech recognition, machine translation, and search engines are all NLP systems that have revolutionized how we work with information.
This course introduces the fundamental techniques for automated text and speech analysis and understanding. It covers computational algorithms, hands-on practice, and insights from linguistics.
Provisional topics include: language modeling, part-of-speech tagging, speech recognition, speech synthesis, prosodic analysis, conversational dialogue, context-free grammars, syntactic parsing, coreference, text classification, sentiment analysis, and machine translation.
- Understand the challenges of processing natural language; appreciate the basic linguistic issues that underlie NLP problems.
- Recognize and understand NLP terminology and methods in widespread use in modern NLP systems.
- Gain experience implementing several components of NLP systems.
- Be able to read and understand current research papers published in conferences such as ACL, Interspeech, and SIGDial.
CS 211 (Data Structures) or permission from the instructor. You should be comfortable designing programs, writing code, and reasoning mathematically. If you have any questions about whether you have the right background, please talk to me.
This class is a mixture of traditional lectures, small group activities, and hands-on activities in the lab (Clapp 202). In the lab, you will gain experience with Unix tools, text analysis with Python, and software for speech analysis, recognition, and synthesis.
All readings will be available on Moodle. Most of these are draft chapters from Speech and Language Processing by Dan Jurafsky and James H. Martin. You do not need to buy this book, but if you are interested in learning more about NLP, it is an excellent resource.
We will use Piazza for announcements, Q&A, discussions, and short exercises.
There will be 4 or 5 homework assignments. They will be a mixture of conceptual and programming exercises.
The final project is an integral part of this course. It will give you an opportunity to creatively extend one of the homework projects. There will be milestones throughout the semester to guide your project development.
Participation includes contributing constructively to small group and whole-class discussions, presenting your final project, giving informal presentations throughout the semester, giving structured feedback to your peers, and arriving to class on time.
- Homework assignments: 50%
- Short exercises and participation: 15%
- Literature reviews: 10%
- Final Project: 25%
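As a rough illustration of how the weights above combine, here is a minimal Python sketch. The component names, the function name, and the 0-100 score scale are assumptions made for this example; only the percentages come from the list above.

```python
# Weights taken from the grading breakdown; they sum to 1.0.
WEIGHTS = {
    "homework": 0.50,
    "exercises_participation": 0.15,
    "literature_reviews": 0.10,
    "final_project": 0.25,
}


def course_grade(scores):
    """Weighted average of component scores.

    Assumption: each component score is already averaged and
    expressed on a 0-100 scale.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "homework": 90,
        "exercises_participation": 80,
        "literature_reviews": 70,
        "final_project": 100,
    }
    print(f"Course grade: {course_grade(example):.1f}")
```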
You can use up to 3 free late days on the homework assignments (any lateness of up to 24 hours counts as 1 day). After your late days are spent, late homework will be penalized 15% per day late.
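The late policy can be sketched in Python as follows. This is only an illustration under stated assumptions: lateness is rounded up to whole days, the 3 free late days form a single pool shared across all homework assignments, and the 15% penalty is taken off the earned score for each penalized day. The function names are hypothetical.

```python
import math

FREE_LATE_DAYS = 3       # from the policy: up to 3 free late days
PENALTY_PER_DAY = 0.15   # from the policy: 15% per day late


def late_days(hours_late):
    """Round lateness up to whole days: up to 24 hours counts as 1 day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(score, hours_late, free_days_left=FREE_LATE_DAYS):
    """Return (adjusted score, free late days remaining).

    Assumption: free late days are spent first, then each remaining
    late day removes 15% of the earned score, never going below zero.
    """
    days = late_days(hours_late)
    free_used = min(days, free_days_left)
    penalized_days = days - free_used
    factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
    return score * factor, free_days_left - free_used
```

Under these assumptions, a homework submitted 30 hours late counts as 2 late days. With all 3 free days still available, it loses no points and leaves 1 free day; with only 1 free day left, one day is free and the other is penalized.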
If you have a disability for which you require accommodations, please make an appointment to see the instructor within the first two weeks of classes so that we can make appropriate arrangements. You will need to have a letter from the AccessAbility Services Office, located in Wilder Hall B4 (phone: 413-538-2634).
In all your work for this class, it is very important for you to follow the Honor Code: "I will honor myself, my fellow students, and Mount Holyoke College by acting responsibly, honestly, and respectfully in both my words and my deeds." If you are not sure how this applies in a particular context, please ask for clarification. Collaboration on homework assignments is encouraged. However, when you write up your work, it is important that you write only what you understand, and that it is in your own words. If you have any questions about what constitutes an Honor Code violation in this class, please ask your instructor.
| Date | Type | Topic | Reading | Due |
|---|---|---|---|---|
| Jan 21 | Lecture | Introduction | J&M Ch. 1 | |
| Jan 26 | Lecture | Words | J&M Ch. 2.1-2.2, J&M2E Ch. 5.1 | |
| Jan 28 | Lab | Unix/Regex | J&M Ch. 2.3; Unix for Poets | |
| Feb 2 | Snow day | | | HW1 due |
| Feb 4 | Lecture | N-grams | J&M Ch. 4.1-4.2 | |
| Feb 9 | Lecture | Language modeling | J&M Ch. 4.3-4.4 | |
| Feb 11 | Lab | Smoothing; building LMs | | |
| Feb 16 | Lecture | Phonetics | J&M2E Phonetics Chapter | HW2 due |
| Feb 23 | Lecture | Viterbi algorithm | J&M Ch. 7.4 | |
| Feb 25 | Lecture | Discussion; ASR decoding | Lang of Food Ch. 12; J&M2E Section 9.2 (ASR Chapter) | |
| Mar 2 | No class | | | |
| Mar 4 | Lecture | Guest Lecture: Information Retrieval | | |
| Mar 9 | Lecture | Final Project out; Viterbi algorithm | | |
| Mar 11 | Lecture | Sentiment Analysis | Pang and Lee (2008) | HW3 due |
| Mar 23 | Lecture | Sentiment Analysis; Naive Bayes | | |
| Mar 25 | Discussion | Behavioral Signal Processing | Narayanan and Georgiou (2013) | |
| Mar 30 | Lecture | POS Tagging | J&M Ch. 8.3-8.5 | |
| Apr 1 | Lab | POS Lab | | Lit Review due |
| Apr 6 | Lecture | Syntactic Parsing | | |
| Apr 8 | Lab | Project work day | | Final Project Proposal due |
| Apr 13 | Lab | Parsing Lab | | |
| Apr 15 | Lab | Project work day | | |
| Apr 20 | Lecture | Wrap-up | | Final Project Progress Report due |
| Apr 22 | Discussion | Final Project Presentations | | |
| Apr 27 | Discussion | Final Project Presentations | | Final Project Write-up due 5/1 |