Computers process massive amounts of information every day in the form of human language. Although they do not understand it, they can learn how to do things like answer questions about it, or translate it into other languages. This course is a systematic introduction to the ideas that form the foundation of current language technologies and research into future language technologies.
The official prerequisites are CSE 20312 or CDT 30020. Students should be experienced with writing substantial programs in Python. The course also makes use of finite automata, context-free grammars, basic linear algebra, multivariable differential calculus, and probability theory. Ideally, students should have taken CSE 30151, Math 10560, and ACMS 30440, but please contact the instructor if you have questions about the necessary background.
The best way to contact the teaching staff is by Piazza. You can also try our Slack channel, #cse-40657-fa18.
The readings for each week should be done before that week's lectures; homework assignments and project milestones are due by Friday at 5pm of that week.
Unit | Week | Assignment | Mon | Wed | Fri |
 | 1 | Chapter 1, Chapter 2 | | 08/22 Language | 08/24 Probability |
Getting Language | 2 | Chapter 3; Project idea | 08/27 Language models; Finite automata | 08/29 Smoothing | 08/31 Recurrent neural networks |
 | 3 | Chapter 4, Chapter 5 | 09/03 Text input; Finite transducers and composition | 09/05 Viterbi algorithm | 09/07 continued |
 | 4 | HW1: Text input | 09/10 Text normalization; Unsupervised training | 09/12 continued | 09/14 continued |
 | 5 | Chapter 6, Chapter 7 | 09/17 Phonetics and phonology | 09/19 Speech recognition | 09/21 Character and handwriting recognition |
Analyzing Language | 6 | HW2: Text correction; Chapter 8 | 09/24 Syntax | 09/26 Morphology | 09/28 Part-of-speech tagging |
 | 7 | Chapter 9, Chapter 10 | 10/01 Context-free grammars | 10/03 CKY algorithm | 10/05 continued |
 | 8 | Chapter 11; Project baseline | 10/08 Adding annotations | 10/10 Unsupervised annotation | 10/12 Beyond CFGs |
Fall break | | | | | |
Understanding Language | 9 | Chapter 12 | 10/22 Text classification; Bag of words | 10/24 Topic modeling; Word embeddings | 10/26 Projects |
 | 10 | HW3: Parsing | 10/29 Entity recognition | 10/31 Graph semantics and graph grammars | 11/02 Projects |
 | 11 | Slides | 11/05 Logical semantics | 11/07 continued | 11/09 Projects |
Generating Language | 12 | HW4: Entity recognition | 11/12 Generation | 11/14 continued | 11/16 Projects |
 | 13 | Chapter 14 | 11/19 Word-based machine translation | Thanksgiving | Thanksgiving |
 | 14 | Chapter 15 | 11/26 Syntax-based machine translation; Synchronous CFGs | 11/28 continued | 11/30 Projects |
 | 15 | HW5: Machine translation | 12/03 Adding a language model | 12/05 Conclusion | |
Final | | Project report | | | |
Your work in this course consists of five homework assignments, a research project, and participation (whether you come to class, whether you appear to be paying attention, and whether you appear to have done the readings).
All written work should be submitted through Sakai.
requirement | points |
homeworks | 5 × 30 |
project | 4 × 30 |
participation | 30 |
total | 300 |
letter grade | points |
A | 280–300 |
A− | 270–279 |
B+ | 260–269 |
B | 250–259 |
B− | 240–249 |
C+ | 230–239 |
C | 220–229 |
C− | 210–219 |
D | 180–209 |
F | 0–179 |
Students in this course are expected to abide by the Academic Code of Honor Pledge: “As a member of the Notre Dame community, I will not participate in or tolerate academic dishonesty.”
The following table summarizes how you may work with other students and use print/online sources:
 | Resources | Solutions |
---|---|---|
Consulting | allowed | not allowed |
Copying | cite | not allowed |
If an instructor sees behavior that is, in his judgement, academically dishonest, he is required to file either an Honor Code Violation Report or a formal report to the College of Engineering Honesty Committee.
In the case of a serious illness or other excused absence, as defined by university policies, coursework submissions will be accepted late by the same number of days as the excused absence. Otherwise, you may submit part of an assignment on time for full credit and part of the assignment late with a penalty of 30% per week (that is, your score for that part will be $\lfloor 0.7^t s\rfloor$, where $s$ is your raw score and $t$ is the possibly fractional number of weeks late). No part of the assignment may be submitted more than once. No work may be submitted after the final project due date.
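To make the late penalty concrete, here is a minimal Python sketch of the formula $\lfloor 0.7^t s\rfloor$. The function name `late_score` and the small tolerance added before taking the floor are assumptions made for illustration; they are not part of the course policy. The tolerance only guards against floating-point round-off (for example, `0.7 ** 2 * 100` evaluates to slightly less than 49).

```python
import math

def late_score(raw_score, weeks_late):
    """Penalized score for one part of an assignment.

    raw_score  -- the raw score s for the late part
    weeks_late -- the (possibly fractional) number of weeks late, t

    Hypothetical helper: computes floor(0.7**t * s).
    """
    if weeks_late < 0:
        raise ValueError("weeks_late must be non-negative")
    # The 1e-9 tolerance is an assumption of this sketch; it keeps
    # round-off from pushing an exact integer result down by one.
    return math.floor(0.7 ** weeks_late * raw_score + 1e-9)

# A part worth 30 raw points:
print(late_score(30, 0))    # on time: 30
print(late_score(30, 0.5))  # half a week late: 25
print(late_score(30, 1))    # one week late: 21
```

Because the penalty compounds, each additional week multiplies the score by 0.7 rather than subtracting a fixed 30% of the original.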
Any student who has a documented disability and is registered with Disability Services should speak with the professor as soon as possible regarding accommodations. Students who are not registered should contact the Office of Disability Services.