Theory of Neural Networks

CSE 60963 | Fall 2024

Programming Projects

For each project, you can choose one of our suggested topics, or you can propose your own topic.

Disclaimer about our suggested topics: Since you are probably the first person ever to implement the constructions below, there's a real chance that you will find bugs in them! Humanity thanks you.

To propose your own topic: Please email your idea(s) to Prof. Chiang three weeks before the due date; you should converge on a topic by two weeks before the due date. It should involve some programming and it should relate to the course material somehow. While our suggested topics are implementations of theoretical constructions, you're welcome to propose experiments instead.

PP1

Implement one of the following, or propose a different topic. When the instructions say to construct a ReLU network, you should construct it as a PyTorch Sequential object containing only Linear and ReLU as submodules.
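
For concreteness, here is a minimal sketch of the expected form of such an object; the sizes and weights below are placeholders, not part of any particular construction:

```python
import torch.nn as nn

# a two-layer ReLU network from R^3 to R: Linear, ReLU, Linear, and nothing else
net = nn.Sequential(
    nn.Linear(3, 8),   # hidden layer (the width 8 is a placeholder)
    nn.ReLU(),
    nn.Linear(8, 1),   # output layer
)
```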

  1. Given a Boolean function \(f: \{0,1\}^d \rightarrow \{0,1\}\), construct a two-layer ReLU network that computes \(f\) on Boolean inputs. Assume that \(f\) is represented as a Boolean formula with variables \(x_1, \ldots, x_d\). (A minimal sketch of one possible construction appears after this list.)
  2. Given a \(\rho\)-Lipschitz continuous function \(f : [0,1]^d \rightarrow \mathbb{R}\) and \(\epsilon>0\), construct a three-layer ReLU network that approximates \(f\) with error at most \(\epsilon\). Assume that \(f\) is represented as a Python function from Tensor to float.
  3. Given a continuous piecewise linear function \(f : \mathbb{R}^d \rightarrow \mathbb{R}\) with \(k\) pieces, construct a ReLU network with \(O(\log k)\) layers that computes \(f\).
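
For topic 1, here is a minimal sketch of one possible construction, via the truth table and therefore exponential in \(d\). The name boolean_to_relu is ours, and it assumes \(f\) is available as a Python function on 0/1 values rather than as a formula; a construction that works directly on the formula can be much smaller.

```python
import itertools
import torch
import torch.nn as nn

def boolean_to_relu(f, d):
    """Two-layer ReLU network computing f: {0,1}^d -> {0,1} on Boolean inputs.

    Sketch only: one hidden unit per satisfying assignment (minterm),
    so up to 2^d hidden units.
    """
    minterms = [a for a in itertools.product([0, 1], repeat=d) if f(*a)]
    width = max(len(minterms), 1)          # avoid a zero-width layer if f is unsatisfiable
    hidden = nn.Linear(d, width)
    output = nn.Linear(width, 1)
    with torch.no_grad():
        hidden.weight.zero_()
        hidden.bias.zero_()
        output.weight.zero_()
        output.bias.zero_()
        for j, a in enumerate(minterms):
            # hidden unit j outputs 1 exactly on assignment a and 0 on every other Boolean input
            for i, ai in enumerate(a):
                hidden.weight[j, i] = 1.0 if ai else -1.0
            hidden.bias[j] = 1.0 - float(sum(a))
            output.weight[0, j] = 1.0      # minterms are disjoint, so the sum equals f(x)
    return nn.Sequential(hidden, nn.ReLU(), output)

# example: f(x1, x2, x3) = (x1 AND NOT x2) OR x3
net = boolean_to_relu(lambda x1, x2, x3: (x1 and not x2) or x3, 3)
print(net(torch.tensor([1.0, 0.0, 0.0])).item())   # 1.0
```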

PP2

Implement one of the following, or propose a different topic. When the instructions say to construct a simple ReLU RNN, you should construct it as a PyTorch RNN(nonlinearity='relu').
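
For concreteness, here is a minimal sketch of the required object and how to set its parameters by hand; the sizes and values below are placeholders:

```python
import torch
import torch.nn as nn

# a simple ReLU RNN: h_t = ReLU(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)
rnn = nn.RNN(input_size=4, hidden_size=6, nonlinearity='relu', batch_first=True)

with torch.no_grad():                    # set the parameters by hand
    rnn.weight_ih_l0.zero_()             # placeholder values
    rnn.weight_hh_l0.zero_()
    rnn.bias_ih_l0.zero_()
    rnn.bias_hh_l0.zero_()

x = torch.zeros(1, 5, 4)                 # (batch, time, input): one-hot input symbols
h0 = torch.zeros(1, 1, 6)                # (layers, batch, hidden): initial hidden state
outputs, h_final = rnn(x, h0)
```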

  1. Given a finite automaton \(M\), construct a simple ReLU RNN that recognizes the same language as \(M\). (A minimal sketch of one possible construction appears after this list.)
  2. Given a Turing machine \(M\), construct a simple ReLU RNN (a non-simple ReLU RNN is also acceptable) that simulates \(M\) using intermediate steps.
  3. Given a probabilistic finite automaton \(M\) and \(\epsilon > 0\), construct a simple ReLU RNN that approximates \(M\) with total variation distance at most \(\epsilon\), as shown by Svete et al. (2024).
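
For topic 1, here is a minimal sketch of one possible encoding of a deterministic finite automaton in a simple ReLU RNN, with hidden units indexed by (state, last symbol) pairs plus one start unit per state. The name dfa_to_rnn is ours, and this is not necessarily the construction from class.

```python
import torch
import torch.nn as nn

def dfa_to_rnn(n_states, n_symbols, delta, start, accept):
    """Simple ReLU RNN whose hidden state tracks the DFA's current state.

    Hidden layout (a sketch):
      unit q*n_symbols + a        : current state is q and the last symbol read was a
      unit n_states*n_symbols + q : current state is q and nothing has been read yet
    """
    H = n_states * n_symbols + n_states
    rnn = nn.RNN(input_size=n_symbols, hidden_size=H, nonlinearity='relu', batch_first=True)

    def state_units(q):   # all hidden units meaning "current state is q"
        return [q * n_symbols + a for a in range(n_symbols)] + [n_states * n_symbols + q]

    W_ih = torch.zeros(H, n_symbols)
    W_hh = torch.zeros(H, H)
    b_ih = torch.full((H,), -1.0)   # the "nothing read yet" units go to 0 after the first step
    for q in range(n_states):
        for a in range(n_symbols):
            r = delta[q][a]
            row = r * n_symbols + a          # unit (r, a) computes ReLU(x_t[a] + [state was q] - 1)
            W_ih[row, a] = 1.0
            for u in state_units(q):
                W_hh[row, u] = 1.0
    with torch.no_grad():
        rnn.weight_ih_l0.copy_(W_ih)
        rnn.weight_hh_l0.copy_(W_hh)
        rnn.bias_ih_l0.copy_(b_ih)
        rnn.bias_hh_l0.zero_()

    h0 = torch.zeros(1, 1, H)
    h0[0, 0, n_states * n_symbols + start] = 1.0
    readout = torch.zeros(H)                 # accept iff readout . h_T equals 1
    for r in accept:
        for u in state_units(r):
            readout[u] = 1.0
    return rnn, h0, readout

# example: DFA over {0,1} accepting strings with an even number of 1s
delta = [[0, 1], [1, 0]]                     # delta[state][symbol]
rnn, h0, readout = dfa_to_rnn(2, 2, delta, start=0, accept=[0])
x = torch.eye(2)[[1, 1, 0]].unsqueeze(0)     # the string "110", one-hot, shape (1, 3, 2)
_, hT = rnn(x, h0)
print(bool(readout @ hT[0, 0] > 0.5))        # True (two 1s is even)
```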

PP3

Implement one of the following, or propose a different topic. When the instructions say to construct a transformer, you should construct it as a PyTorch TransformerEncoder. To get unique-hard or average-hard attention, you'll need to monkey-patch TransformerEncoderLayer.self_attn.
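
Here is a minimal sketch of one way to do that patching, assuming a single attention head, batch_first=True, and no masking; the class name HardAttention is ours. Depending on your PyTorch version, you may also want to confirm that the fused encoder fast path (which bypasses self_attn) is not taken; using a single head appears to disable it, but check your version.

```python
import torch
import torch.nn as nn

class HardAttention(nn.MultiheadAttention):
    """Drop-in replacement for self_attn that uses hard rather than softmax attention.

    A sketch, not a definitive implementation: single head, batch_first=True, masks ignored.
    average=True gives average-hard attention (uniform weight over all maximizing positions);
    average=False gives unique-hard attention (all weight on one maximizing position,
    with tie-breaking as in torch.argmax).
    """
    def __init__(self, embed_dim, average=False):
        super().__init__(embed_dim, num_heads=1, batch_first=True)
        self.average = average

    def forward(self, query, key, value, **kwargs):
        # reuse the parent's packed input projections and its output projection
        w_q, w_k, w_v = self.in_proj_weight.chunk(3, dim=0)
        b_q, b_k, b_v = self.in_proj_bias.chunk(3, dim=0)
        q = query @ w_q.T + b_q
        k = key @ w_k.T + b_k
        v = value @ w_v.T + b_v
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # (batch, query, key)
        if self.average:
            attn = (scores == scores.amax(dim=-1, keepdim=True)).float()
            attn = attn / attn.sum(dim=-1, keepdim=True)
        else:
            attn = torch.zeros_like(scores)
            attn.scatter_(-1, scores.argmax(dim=-1, keepdim=True), 1.0)
        return self.out_proj(attn @ v), attn

d_model = 16
layer = nn.TransformerEncoderLayer(d_model, nhead=1, dim_feedforward=32,
                                   dropout=0.0, batch_first=True)
layer.self_attn = HardAttention(d_model, average=False)   # the monkey-patch
encoder = nn.TransformerEncoder(layer, num_layers=2)      # clones the patched layer
```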

  1. Given an LTL formula \(\phi\), construct a unique-hard or average-hard attention transformer encoder that recognizes the language defined by \(\phi\).
  2. Given a \(K_t[\#]\) formula \(\phi\), construct a softmax attention transformer encoder that recognizes the language defined by \(\phi\).
  3. Given a Turing machine \(M\), construct an average-hard attention transformer decoder that simulates \(M\) using intermediate steps.