Computers process massive amounts of information every day in the form of human language. Although they do not understand it, they can learn how to do things like answer questions about it, or translate it into other languages. This course is a systematic introduction to the ideas that form the foundation of current language technologies and research into future language technologies.
The official prerequisites are CSE 20312 or CDT 30020. Students should be experienced with writing substantial programs in Python. The course also makes use of finite automata, context-free grammars, basic linear algebra, multivariable differential calculus, and probability theory. Ideally, students should have taken CSE 30151, Math 10560, and ACMS 30440, but please contact the instructor if you have questions about the necessary background.
The best way to contact the teaching staff is on Piazza.
I'm revising the schedule for 2021 more extensively than usual. The general plan is stable, but the day-to-day topics are subject to change.
| Unit | Week | Assignment | Mon | Wed | Fri |
|---|---|---|---|---|---|
| Foundations | 1 | Chapter 1; Project idea (due 02/12) | | 02/03 Language | 02/05 Probability |
| | 2 | Chapter 2 | 02/08 N-gram language models | 02/10 Overfitting and regularization | 02/12 Weighted automata |
| | 3 | HW1: Text prediction (due 02/26) | 02/15 RNNs: motivation | 02/17 RNNs: definition | 02/19 RNNs: training |
| | 4 | Chapter 3 | 02/22 IBM Model 1 and 2 | 02/24 Training the IBM models | 02/26 Attention |
| | 5 | HW2: Machine translation (due 03/12) | 03/01 NMT: motivation | 03/03 NMT: RNNs | 03/05 NMT: Transformers |
| Inputting Language | 6 | Chapter 4 | 03/08 Phonetics and phonology | 03/10 Speech recognition | 03/12 Writing systems and character/handwriting recognition |
| Analyzing Language | 7 | Chapter 5; Project baseline (due 03/26) | 03/15 Morphology | 03/17 Syntax | 03/19 Context-free grammars |
| | 8 | | 03/22 CKY parsing | 03/24 Binarization and unary rules | 03/26 Neural parsing |
| | 9 | HW3: Parsing (due 04/09) | 03/29 Neural parsing, cont. | 03/31 Beyond CFGs | 04/02 Good Friday |
| Understanding Language | 10 | Chapter 6 | 04/05 Bags of words; topic models | 04/07 Word embeddings | 04/09 Projects |
| | 11 | HW4: Semantic parsing (due 04/23) | 04/12 Recognizing entities and relations | 04/14 Graph semantics and graph grammars | 04/16 Projects |
| | 12 | | 04/19 Logical semantics | 04/21 Mini-break | 04/23 Projects |
| Generating Language | 13 | HW5: Generation (due 05/07); Chapter 7 | 04/26 Question Answering | 04/28 Summarization | 04/30 Projects |
| | 14 | | 05/03 Generating from graphs | 05/05 Generating from logical forms | 05/07 Projects |
| | 15 | Project report (due 05/18) | 05/10 Conclusion | | |
Your work in this course consists of five homework assignments and a research project.
All written work should be submitted through Sakai.
requirement | points |
---|---|
homeworks | 5 × 30 |
project | 3 × 30 + 60 |
total | 300 |
letter grade | points |
---|---|
A | 280–300 |
A− | 270–279 |
B+ | 260–269 |
B | 250–259 |
B− | 240–249 |
C+ | 230–239 |
C | 220–229 |
C− | 210–219 |
D | 180–209 |
F | 0–179 |
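As an illustration only, the grade scale above can be read as a lookup on the lower bound of each band. The sketch below assumes that point totals are integers between 0 and 300; the names `GRADE_FLOORS` and `letter_grade` are made up for this example and are not part of any course tooling.

```python
# Lower bound of each letter-grade band, copied from the table above.
# Letter names use an ASCII hyphen for the minus sign.
GRADE_FLOORS = [
    (280, "A"), (270, "A-"),
    (260, "B+"), (250, "B"), (240, "B-"),
    (230, "C+"), (220, "C"), (210, "C-"),
    (180, "D"),
]


def letter_grade(points: int) -> str:
    """Map a point total (assumed to be an integer in 0..300) to a letter grade."""
    if not 0 <= points <= 300:
        raise ValueError("points must be between 0 and 300")
    # Bands are listed from highest to lowest, so the first match wins.
    for floor, letter in GRADE_FLOORS:
        if points >= floor:
            return letter
    return "F"


print(letter_grade(284))  # A
print(letter_grade(209))  # D
```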
Students in this course are expected to abide by the Academic Code of Honor Pledge: “As a member of the Notre Dame community, I will not participate in or tolerate academic dishonesty.”
The following table summarizes how you may work with other students and use print/online sources:
| | Resources | Solutions |
|---|---|---|
| Consulting | allowed | not allowed |
| Copying | cite | not allowed |
If an instructor sees behavior that is, in his judgement, academically dishonest, he is required to file either an Honor Code Violation Report or a formal report to the College of Engineering Honesty Committee.
In the case of a serious illness or other excused absence, as defined by university policies, coursework submissions will be accepted late by the same number of days as the excused absence. Otherwise, you may submit part of an assignment on time for full credit and part of the assignment late with a penalty of 30% per week (that is, your score for that part will be $\lfloor 0.7^t s\rfloor$, where $s$ is your raw score and $t$ is the possibly fractional number of weeks late). No part of the assignment may be submitted more than once. No work may be submitted after the final project due date.
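To make the late penalty concrete, here is a minimal sketch of the formula $\lfloor 0.7^t s\rfloor$ in Python. The function name `late_score` is hypothetical, and the sketch assumes that $t$ is measured in weeks as a floating-point number; it is not the script actually used for grading.

```python
import math


def late_score(raw_score: float, weeks_late: float) -> int:
    """Return floor(0.7**t * s) for raw score s submitted t weeks late."""
    if weeks_late < 0:
        raise ValueError("weeks_late must be non-negative")
    # Floating-point arithmetic is used here, so results that land exactly
    # on an integer boundary may be sensitive to rounding.
    return math.floor(0.7 ** weeks_late * raw_score)


print(late_score(30, 0))    # 30 (on time, no penalty)
print(late_score(30, 0.5))  # 25 (half a week late)
print(late_score(30, 2))    # 14 (two weeks late)
```

For example, a part with a raw score of 30 that is submitted two weeks late is worth $\lfloor 0.49 \times 30\rfloor = 14$ points.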
Any student who has a documented disability and is registered with Disability Services should speak with the professor as soon as possible regarding accommodations. Students who are not registered should contact the Office of Disability Services.