CSE 40657/60657
Natural Language Processing

Term: Spring 2021
Time: MWF 1–1:50pm
Room: Fitzpatrick 356
Instructor: David Chiang

Computers process massive amounts of information every day in the form of human language. Although they do not understand it, they can learn how to do things like answer questions about it, or translate it into other languages. This course is a systematic introduction to the ideas that form the foundation of current language technologies and research into future language technologies.

The official prerequisites are CSE 20312 or CDT 30020. Students should be experienced with writing substantial programs in Python. The course also makes use of finite automata, context-free grammars, basic linear algebra, multivariable differential calculus, and probability theory. Ideally, students should have taken CSE 30151, Math 10560, and ACMS 30440, but please contact the instructor if you have questions about the necessary background.

Staff

Instructor
Prof. David Chiang
Office hours: TBD
Teaching assistant
Toan Nguyen
Office hours: TBD

Schedule

The readings for each week should be done before the lectures that week; homeworks and project milestones should be done by Friday at 5pm in that week.

I'm revising the schedule for 2021 more extensively than usual. The general plan is stable, but the day-to-day topics are subject to change.

| Unit | Week | Assignment | Mon | Wed | Fri |
| Foundations | 1 | Chapter 1 | | 02/03 Language | 02/05 Probability |
| | 2 | Project idea | 02/08 N-gram language models | 02/10 Overfitting and regularization | 02/12 Weighted automata |
| | 3 | | 02/15 Recurrent neural networks | 02/17 Gradient descent | 02/19 IBM Model 1 |
| | 4 | HW1: Text prediction | 02/22 Attention | 02/24 Transformers | 02/26 Decoding |
| Inputting Language | 5 | | 03/01 Text input | 03/03 Weighted transducers | 03/05 Phonetics and phonology |
| | 6 | HW2: Machine translation | 03/08 Speech recognition | 03/10 Writing systems | 03/12 Character and handwriting recognition |
| Analyzing Language | 7 | | 03/15 Morphology | 03/17 Part-of-speech tagging | 03/19 Conditional random fields |
| | 8 | Project baseline | 03/22 Syntax | 03/24 Context-free grammars | 03/26 CKY parsing |
| | 9 | | 03/29 Neural parsing | 03/31 Beyond CFGs | 04/02 Good Friday |
| Understanding Language | 10 | HW3: Parsing | 04/05 Bags of words; topic models | 04/07 Word embeddings | 04/09 Projects |
| | 11 | | 04/12 Recognizing entities and relations | 04/14 Graph semantics and graph grammars | 04/16 Projects |
| | 12 | HW4: Entity recognition | 04/19 Logical semantics | 04/21 Mini-break | 04/23 Projects |
| Generating Language | 13 | | 04/26 Unconstrained generation | 04/28 Generating from vectors | 04/30 Projects |
| | 14 | HW5: Generation | 05/03 Generating from graphs | 05/05 Generating from logical forms | 05/07 Projects |
| | 15 | | 05/10 Conclusion | | |
| | Final | Project report | | | |

Requirements

Your work in this course consists of five homework assignments, a research project, and participation (whether you come to class, whether you appear to be paying attention, and whether you appear to have done the readings).

All written work should be submitted through Sakai.

requirement points
homeworks 5 × 30
project 4 × 30
participation 30
total 300

letter grade points
A 280–300
A− 270–279
B+ 260–269
B 250–259
B− 240–249
C+ 230–239
C 220–229
C− 210–219
D 180–209
F 0–179
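
As an illustration only, the point totals and grade cutoffs above can be written as a small lookup. This is a sketch, not an official grade calculator: the names `GRADE_FLOORS`, `total_points`, and `letter_grade` are made up here, the minus grades are spelled with an ASCII hyphen, and it assumes that a total is compared against the lower end of each range (the table does not say how fractional totals are handled).

```python
# Lower end of each letter-grade range, copied from the table above.
# Assumption: a total at or above a floor earns that letter.
GRADE_FLOORS = [
    (280, "A"), (270, "A-"),
    (260, "B+"), (250, "B"), (240, "B-"),
    (230, "C+"), (220, "C"), (210, "C-"),
    (180, "D"),
]


def total_points(homeworks, project, participation):
    """Sum up to 5 homework scores, 4 project scores, and participation
    (each out of 30, for a maximum of 300)."""
    return sum(homeworks) + sum(project) + participation


def letter_grade(points):
    """Map a point total to a letter grade using GRADE_FLOORS."""
    for floor, letter in GRADE_FLOORS:
        if points >= floor:
            return letter
    return "F"
```

For example, `letter_grade(total_points([28, 27, 30, 25, 29], [30, 28, 27, 26], 30))` looks up the grade for a total of 280 points.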

Policies

Honor Code

Students in this course are expected to abide by the Academic Code of Honor Pledge: “As a member of the Notre Dame community, I will not participate in or tolerate academic dishonesty.”

The following table summarizes how you may work with other students and use print/online sources:

| | Resources | Solutions |
| Consulting | allowed | not allowed |
| Copying | cite | not allowed |

See the CSE Guide to the Honor Code for definitions of the above terms.

If an instructor sees behavior that is, in his judgement, academically dishonest, he is required to file either an Honor Code Violation Report or a formal report to the College of Engineering Honesty Committee.

Late Submissions

In the case of a serious illness or other excused absence, as defined by university policies, coursework submissions will be accepted late by the same number of days as the excused absence. Otherwise, you may submit part of an assignment on time for full credit and part of the assignment late with a penalty of 30% per week (that is, your score for that part will be $\lfloor 0.7^t s\rfloor$, where $s$ is your raw score and $t$ is the possibly fractional number of weeks late). No part of the assignment may be submitted more than once. No work may be submitted after the final project due date.
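
To make the late penalty concrete, here is a minimal sketch of the formula $\lfloor 0.7^t s\rfloor$ in Python. The function name `late_score` is hypothetical, and the sketch assumes that $t$ is computed as days late divided by 7; how lateness is actually measured is up to the instructor.

```python
import math


def late_score(raw_score, weeks_late):
    """Score for the late part of an assignment: floor(0.7**t * s).

    raw_score  -- s, the raw score earned on the late part
    weeks_late -- t, possibly fractional (assumed here: days late / 7)
    """
    if weeks_late < 0:
        raise ValueError("weeks_late must be non-negative")
    return math.floor(0.7 ** weeks_late * raw_score)


# A part worth 30 raw points, turned in at various delays.
for t in (0, 0.5, 2):
    print(t, late_score(30, t))  # prints 0 30, then 0.5 25, then 2 14
```

Because the penalty compounds, two weeks late keeps 49% of the raw score rather than 40%.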

Students with Disabilities

Any student who has a documented disability and is registered with Disability Services should speak with the professor as soon as possible regarding accommodations. Students who are not registered should contact the Office of Disability Services.