Time and Location
Mon/Wed 11:00 AM-12:15 PM, Clapp 422
4th Hour: Friday 11-11:50 AM, Clapp 422

Instructor
Heather Pon-Barry
ponbarry at mtholyoke dot edu
Office: Clapp 226
Phone: x2241
Office Hours: Mon 3:00-4:00 PM, Thurs 3:00-4:00 PM, or by appointment

Course Overview
Natural Language Processing (NLP) is the scientific and engineering discipline concerned with getting computers to understand and process human language. Speech recognition, machine translation, and search engines are all NLP systems that have revolutionized how we work with information.

This course introduces the fundamental techniques for automated text and speech analysis and understanding. It covers computational algorithms, hands-on practice, and insights from linguistics.

Provisional topics include: language modeling, part-of-speech tagging, speech recognition, speech synthesis, prosodic analysis, conversational dialogue, context-free grammars, syntactic parsing, coreference, text classification, sentiment analysis, and machine translation.

Website: https://www.mtholyoke.edu/courses/ponbarry/cs341.html

Learning Objectives

Prerequisites
CS 211 (data structures) or permission from the instructor. You should be comfortable designing programs, writing code, and reasoning mathematically. If you have any questions about whether you have the right background, please talk to me.

Class Format
This class is a mixture of traditional lectures, small group activities, and hands-on activities in the lab (Clapp 202). In the lab, you will gain experience with Unix tools, text analysis with Python, and software for speech analysis, recognition, and synthesis.

Reading Materials
All readings will be available on Moodle. Most of these are draft chapters from "Speech and Language Processing" by Dan Jurafsky and James Martin. You do not need to buy this book, but if you are interested in learning more about NLP, it is an excellent resource.

Piazza
We will use Piazza for announcements, Q&A, discussions, and short exercises.

Homework
There will be 4 or 5 homework assignments. They will be a mixture of conceptual and programming exercises.

Final Project
The final project is an integral part of this course. It will give you an opportunity to creatively extend one of the homework projects. There will be milestones throughout the semester to guide your project development.

Participation
This includes constructive participation in small group and whole-class discussions, a presentation of your final project, informal presentations throughout the semester, giving structured feedback to your peers, and arriving to class on time.

Grading

Late Days
You can use up to 3 free late days on the homework assignments (any lateness of up to 24 hours counts as 1 day). After your late days are spent, late homework will be penalized 15% per day late.
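As a rough illustration of how this arithmetic could work out, here is a minimal Python sketch. It is not the official grading procedure. The function names are hypothetical, and the sketch assumes that free late days are applied automatically before any penalty, that the 15% is taken from the earned score for each penalized day, and that the score cannot drop below zero. Please ask the instructor if you need the exact rule.

```python
import math

FREE_LATE_DAYS = 3       # from the policy above
PENALTY_PER_DAY = 0.15   # 15% per day late, from the policy above


def days_late(hours_late):
    """Count late days: any started 24-hour period counts as a full day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(score, hours_late, free_days_left=FREE_LATE_DAYS):
    """Return (adjusted_score, free_days_remaining).

    Assumptions (not stated in the syllabus): free days are used first,
    the penalty is linear in the number of penalized days, and the
    adjusted score is floored at zero.
    """
    late = days_late(hours_late)
    used = min(late, free_days_left)        # spend free late days first
    penalized = late - used                 # remaining days incur the penalty
    multiplier = max(0.0, 1.0 - PENALTY_PER_DAY * penalized)
    return score * multiplier, free_days_left - used
```

For example, under these assumptions a homework submitted 30 hours late counts as 2 late days. With all 3 free days available it keeps its full score and leaves 1 free day for later assignments.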

Accommodations
If you have a disability for which you require accommodations, please make an appointment to see the instructor within the first two weeks of classes so that we can make appropriate arrangements. You will need to have a letter from the AccessAbility Services Office, located in Wilder Hall B4 (phone: 413-538-2634).

Academic Integrity
In all your work for this class, it is very important for you to follow the Honor Code: "I will honor myself, my fellow students, and Mount Holyoke College by acting responsibly, honestly, and respectfully in both my words and my deeds." If you are not sure how this applies in a particular context, please ask for clarification. Collaboration on homework assignments is encouraged. However, when you write up your work, it is important that you write only what you understand, and that it is in your own words. If you have any questions about what constitutes an Honor Code violation in this class, please ask your instructor.

Schedule

Date | Format | Topic | Reference | Assignment Due
Jan 21 | Lecture | Introduction | J&M Ch. 1 |
Jan 26 | Lecture | Words | J&M Ch. 2.1-2.2, J&M2E Ch. 5.1 |
Jan 28 | Lab | Unix/Regex | J&M Ch. 2.3; Unix for Poets |
Feb 2 | | Snow day | | HW1 due
Feb 4 | Lecture | N-grams | J&M Ch. 4.1-4.2 |
Feb 6 | Lab | N-grams/LMs | |
Feb 9 | Lecture | Language modeling | J&M Ch. 4.3-4.4 |
Feb 11 | Lab | Smoothing; building LMs | |
Feb 16 | Lecture | Phonetics | J&M2E Phonetics Chapter | HW2 due
Feb 18 | Lab | Spectrograms | |
Feb 23 | Lecture | Viterbi algorithm | J&M Ch. 7.4 |
Feb 25 | Lecture | Discussion; ASR decoding | Lang of Food Ch. 12; J&M2E Section 9.2 (ASR Chapter) |
Mar 2 | | No class | |
Mar 4 | Lecture | Guest Lecture: Information Retrieval | |
Mar 9 | Lecture | Final Project out; Viterbi algorithm | |
Mar 11 | Lecture | Sentiment Analysis | Pang and Lee (2008) | HW3 due
Mar 23 | Lecture | Sentiment Analysis; Naive Bayes | |
Mar 25 | Discussion | Behavioral Signal Processing | Narayanan and Georgiou (2013) |
Mar 30 | Lecture | POS Tagging | J&M Ch. 8.3-8.5 |
Apr 1 | Lab | POS Lab | | Lit Review due
Apr 6 | Lecture | Syntactic Parsing | |
Apr 8 | Lab | Project work day | | Final Project Proposal due
Apr 13 | Lab | Parsing Lab | |
Apr 15 | Lab | Project work day | |
Apr 20 | Lecture | Wrap-up | | Final Project Progress Report due
Apr 22 | Discussion | Final Project Presentations | |
Apr 27 | Discussion | Final Project Presentations | | Final Project Write-up due 5/1