Theory of Neural Networks

CSE 60963 | Fall 2024

Lectures: MWF 11:30am–12:20pm, 312 DeBartolo Hall
Instructor: David Chiang
Instructor Office Hours: T 2–3:30pm, F 3–4:30pm, 179 Fitzpatrick
Teaching Assistant: Andy Yang
TA Office Hours: M 2–3pm, 150M Fitzpatrick

Introduction to the theory of neural networks: expressivity (what functions a neural network can and cannot compute) and trainability (what functions a neural network can and cannot learn). Neural network architectures covered will include feed-forward, recurrent, convolutional and attention (transformer) neural networks.

This offering of the course focuses on the expressivity of neural networks for sequence data (language models) and on how they relate to theoretical models of computation such as automata, Turing machines, logics, and Boolean circuits.

Links

  • This website for all content
  • Canvas for submitting assignments
  • Ed for discussions

Prerequisites

  • Theory: Students must be familiar with finite automata, Turing machines, and first-order logic, and comfortable reading and writing proofs. This requirement is satisfied by Theory of Computing (CSE 30151) or equivalent.
  • Neural networks: Students should minimally understand feed-forward neural networks and how they are trained by gradient descent (backpropagation). They should ideally be familiar with recurrent neural networks and transformers. This requirement is satisfied by one of the following or permission of the instructor:
    • Neural Networks (CSE 40868/60868)
    • Natural Language Processing (CSE 40657/60657)

Topics

Week | Topics | Readings | Assignments
8/26 | Introduction (class starts Wednesday) | Chapter 1 |
9/2 | Perceptrons and feedforward NNs | Chapter 2 | HW1 due 9/13
9/9 | Feedforward NNs and universal approximation theorems (no class Friday) | Chapter 3 |
9/16 | Recurrent NNs (RNNs) and finite automata | | PP1 due 10/4
9/23 | RNNs with intermediate steps and Turing machines (and beyond) | Chapter 4; Slides (1); Slides (2) |
9/30 | Transformers, circuit complexity and descriptive complexity | Chapter 5; Slides |
10/7 | Transformers and first-order logic | Chapter 6 | HW2 due 10/18
10/14 | Transformers and counting logics | Chapter 7 |
10/21 | Fall break (no class) | |
10/28 | Transformers with intermediate steps and Turing machines | Chapter 8; Slides | PP2 due 11/15
11/4 | State-space models | Chapter 9 |
11/11 | Optimization | Chapter 10; Dwaraknath, Understanding the Neural Tangent Kernel |
11/18 | Optimization continued | | HW3 due 12/6
11/25 | Generalization (Thanksgiving, no class Wed–Fri) | Chapter 11 |
12/2 | Generalization (and optimization) | Wilber and Werness, Double Descent, parts 1 and 2 | PP3 due 12/16
12/9 | In-context learning (class ends Wednesday) | |

Requirements

There will be three homework assignments and three programming projects, each worth 50 points. Grades will be assigned as follows.

Letter grade | Points
A | 280–300
A− | 270–279
B+ | 260–269
B | 250–259
B− | 240–249
C+ | 230–239
C | 220–229
C− | 210–219
D | 180–209
F | 0–179

Honor Code

Students in this course are expected to abide by the Academic Code of Honor Pledge: “As a member of the Notre Dame community, I will not participate in or tolerate academic dishonesty.”

The following table summarizes how you may work with other students and use print/online sources:

 | Resources | Solutions
Consulting | allowed | not allowed
Copying | cite | not allowed

See the CSE Guide to the Honor Code for definitions of the above terms.

If an instructor sees behavior that is, in his judgement, academically dishonest, he is required to file either an Honor Code Violation Report or a formal report to the College of Engineering Honesty Committee.

Late Submissions

In the case of a serious illness or other excused absence, as defined by university policies, coursework submissions will be accepted late by the same number of days as the excused absence. Otherwise, you may submit part of an assignment on time for full credit and part of the assignment late with a penalty of 30% per week (that is, your score for that part will be \(\lfloor 0.7^t s\rfloor\), where \(s\) is your raw score and \(t\) is the possibly fractional number of weeks late). No part of the assignment may be submitted more than once. No work may be submitted after the registrar-assigned final exam date.
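
As an illustration of the late-penalty formula above, the following is a minimal sketch in Python. The function name `late_score` and the sample scores are hypothetical, chosen only for this example. It also assumes that a number of days late is converted to weeks by dividing by 7, which the policy implies but does not state.

```python
import math


def late_score(raw_score: float, weeks_late: float) -> int:
    """Score for a late part: floor(0.7**t * s), with s the raw score
    and t the (possibly fractional) number of weeks late."""
    if weeks_late < 0:
        raise ValueError("weeks_late must be non-negative")
    return math.floor(0.7 ** weeks_late * raw_score)


# Hypothetical example: a part with raw score 40, submitted 3.5 days late.
# Assumption: days are converted to weeks by dividing by 7.
print(late_score(40, 3.5 / 7))  # 0.7**0.5 * 40 is about 33.47, so this prints 33

# Hypothetical example: a part with raw score 50, submitted two weeks late.
print(late_score(50, 2))  # 0.49 * 50 = 24.5, so this prints 24
```

Because the penalty compounds at 30% per week, a part submitted two weeks late keeps about 49% of its raw score, not 40%.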

Students with Disabilities

Any student who has a documented disability and is registered with Disability Services should speak with the professor as soon as possible regarding accommodations. Students who are not registered should contact the Office of Disability Services.