General Information

Time and Location
Tues/Thurs 11:30 AM-12:45 PM, Carr 102

Instructor
Heather Pon-Barry
Office: Clapp 226
Office Hours: Tues/Thurs 2-3 PM
Appointments: https://ponbarry.youcanbook.me

TA
Mahima Ghale
Office Hours: Tues 5-7 PM, Thurs 7-8 PM in Clapp 218

Course Overview
Natural Language Processing (NLP) is the scientific and engineering discipline concerned with getting computers to understand and process human language. Speech recognition, machine translation, and search engines are all NLP systems that have revolutionized how we work with information.

This course introduces the fundamental techniques for automated text and speech analysis and understanding. It covers computational algorithms, hands-on practice, and insights from linguistics.

Provisional topics include: language modeling, part-of-speech tagging, speech recognition, speech synthesis, prosodic analysis, conversational dialogue, context-free grammars, syntactic parsing, coreference, text classification, sentiment analysis, and machine translation.

Website: https://www.mtholyoke.edu/courses/ponbarry/cs341.html

Learning Objectives

Prerequisites
CS 211 (data structures) or permission from the instructor. You should be comfortable designing programs, writing code, and reasoning mathematically. If you have any questions about whether you have the right background, please talk to me.

Class Format
This class is a mixture of traditional lectures, small group activities, and hands-on lab activities (in Kendade 307). In the lab, you will gain experience with Unix tools, text analysis with Python, and software for speech analysis, recognition, and synthesis.

Reading Materials
All readings will be available online or on Moodle. Most of these are chapters from Speech and Language Processing (3rd edition draft) by Dan Jurafsky and James Martin. The library has a copy of Speech and Language Processing (2nd edition) on reserve; it is an excellent resource.

Piazza
Course Piazza site: https://piazza.com/mtholyoke/spring2017/cs341nl/home
We will use Piazza for announcements, Q&A, discussions, and reading responses.

Homework
There will be 4 or 5 homework assignments. They will be a mixture of conceptual and programming exercises.

Final Project
The final project is an integral part of this course. It will give you an opportunity to creatively extend one of the homeworks or explore a topic of interest to you. There will be milestones throughout the semester to guide your project development.

Participation
This includes constructive participation in small group and whole-class discussions, a presentation of your final project, informal presentations throughout the semester, giving structured feedback to your peers, and arriving to class on time.

Grading

Late Days
You can use up to 3 free late days on the homework assignments (any lateness of up to 24 hours counts as 1 day). After your late days are spent, late homework will be penalized 10% per day late.
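
As a rough illustration of the late-day arithmetic, here is a minimal Python sketch. It is not an official grading script. The function names, the rule that free days are spent before any penalty applies, and the cap of the penalty at 100% are all assumptions made for the example.

```python
import math

FREE_LATE_DAYS = 3       # free late days per semester, from the policy above
PENALTY_PER_DAY = 0.10   # 10% per penalized day, from the policy above


def late_days(hours_late):
    """Round lateness up to whole days: up to 24 hours late counts as 1 day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(hours_late, free_days_left=FREE_LATE_DAYS):
    """Return (free_days_used, penalty_fraction) for one homework.

    Assumptions (not stated in the policy): free days are spent first,
    only the remaining days are penalized, and the penalty is capped at 100%.
    """
    days = late_days(hours_late)
    used = min(days, free_days_left)
    penalized = days - used
    penalty = min(1.0, penalized * PENALTY_PER_DAY)
    return used, penalty
```

Under these assumptions, a homework submitted 30 hours late with all 3 free days remaining uses 2 free days and incurs no penalty.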

Accommodations
If you have a disability for which you require accommodations, please make an appointment to see the instructor within the first two weeks of classes so that we can make appropriate arrangements. You will need to have a letter from the AccessAbility Services Office, located in Wilder Hall B4 (phone: 413-538-2634).

Academic Integrity
In all your work for this class, it is very important that you follow the Honor Code: “I will honor myself, my fellow students, and Mount Holyoke College by acting responsibly, honestly, and respectfully in both my words and my deeds.” If you are not sure how this applies in a particular context, please ask for clarification. Collaboration on homework assignments is encouraged. However, when you write up your work, it is important that you write only what you understand, and that it is in your own words. If you have any questions about what constitutes an Honor Code violation in this class, please ask your instructor.

Schedule

This is a provisional schedule and will be updated throughout the semester.

Unless otherwise noted, chapters listed in the Reference column refer to Jurafsky & Martin’s Speech and Language Processing (3rd edition draft). SLP2 refers to the Speech and Language Processing 2nd edition chapters available on Moodle.

Date | Format | Topic | Reference | Assignment Due
Jan 24 | Lecture | Introduction [slides] | Ch. 1 |
Jan 26 | Lab | Unix/Regex Lab [slides] | Ch. 2.1; Unix for Poets |
Jan 31 | Lecture | Words; N-grams [slides] | Ch. 2.2-2.3, SLP2 Ch. 5.1 | HW1 due
Feb 2 | Lab | N-gram Lab [moodle] | |
Feb 7 | Lecture | Language modeling [slides] | Ch. 4.1 | HW2 due
Feb 9 | | SNOW DAY | |
Feb 14 | Lab | Building LMs [slides] | |
Feb 16 | Lecture | Smoothing [slides] | Ch. 4.3-4.4 | Piazza post on Green, Heer, and Manning (2015)
Feb 21 | Lecture | HMMs [slides] | Ch. 9.1-9.4 | HW3 due
Feb 23 | Guest Lecture | Phi Beta Kappa Visiting Scholar Barbara Grosz | “Barbie wants to get to know your child” (2015); Green, Heer, and Manning (2015) | Piazza post on “Barbie wants to get to know your child” (2015)
Feb 28 | Lecture | Viterbi algorithm | |
Mar 2 | Lab | Dialogue | |
Mar 7 | Guest Lecture | TBD | | HW4 due
Mar 9 | | No class | | Lit Review due
Mar 21 | Lecture | Phonetics | SLP2 Ch. 7 | Project Proposal due
Mar 23 | Lab | Spectrograms; ASR decoding | |
Mar 28 | Discussion | Behavioral Signal Processing | Narayanan and Georgiou (2013) |
Mar 30 | Lecture | Syntactic Parsing | |
Apr 4 | Lab | Parsing Lab | | Revised Project Proposal + Lit Review due
Apr 6 | Lecture | Sentiment Analysis; Naive Bayes | Pang and Lee (2008) |
Apr 11 | Lab | Project work day | |
Apr 13 | Lab | Parsing Lab | |
Apr 18 | Lab | Project work day | | Project Progress Report due
Apr 20 | Lecture | Wrap-up | |
Apr 25 | Discussion | Final Project Presentations | |
Apr 27 | Discussion | Final Project Presentations | | Project Write-up due 5/1