Course Notes

I discovered that I learn best when I take good notes. To share this (very large) collection of notes I've made on topics in AI and mathematics, I've exported my notes as HTML pages, which you can find below. Because of the conversion process, some links may be broken. Let me know about any broken links, or send general comments, here.

Instructors: I generally only include derivations that have instructional significance beyond a problem set (which also means that they are often publicly available elsewhere). If you believe I've included a derivation that is specific to a problem set in a listed course (i.e. not useful beyond that particular question), please contact me and I will take it down.

Mathematical Foundations

This is a collection of subjects that are necessary for understanding machine learning. We will look at how we construct distributions, linear transformations, and the very fabric of calculus. These notes are not listed in strict order of difficulty.

Linear Algebra

Linear algebra is the study of vectors and the linear transformations applied to them. It's the foundation of many machine learning concepts, where models are represented as a composition of matrices. These (somewhat messy) notes cover linear algebra from abstract and numerical approaches.


Real Analysis

Real Analysis builds the principles of calculus from the ground up. The results are pretty simple (nothing beyond a high school calculus class), but the formalism creates a certain proof mindset that I've found indispensable in my other AI courses. These notes include formal definitions of the basic operations, limits, continuity, derivatives, and integration. There is also a bit of point-set topology in these notes.

View Notes

Probability and Distributions

One of the most fundamental topics of machine learning: how do we model real-world stochastic phenomena? These notes are cobbled together from various classes, and they go through simple probability and distributions for both the single-variable and multivariate cases.


Algorithms

Computers do stuff. How can we make them do stuff faster? Or, in some cases, turn a difficult problem into an easier one? These notes cover some of the basic algorithms that are a must-have for computer scientists. In many cases, knowing these algorithms can help AI researchers know what is tractable and what is really hard.

View Notes

AI Foundations

In this section, we explore the core principles behind AI models. How do we model the real world? How do we optimize these models? How do we evaluate them? These notes are not listed in strict order of difficulty.

Basic Optimization

Say we know what we want. Can we get there? These notes cover this journey. We focus mostly on gradient-based methods. You'll find all the matrix calculus basics here, as well as gradient-descent algorithms and a brief discussion of second-order methods. These notes do not cover Convex Optimization, which I will cover in a separate set of notes.

View Notes

Classic Learning Approaches

Before neural networks, there were many other learning algorithms, and they still work pretty well in constrained settings. Compared with neural networks, these methods have many more provable properties. Furthermore, they can be part of a larger pipeline that involves neural networks. Therefore, it is still essential to know how these classical approaches work. These notes include PCA, Naive Bayes, Support Vector Machines, and more.

View Notes

Neural Network Fundamentals

Neural networks are the building blocks of many machine learning models. Before we can talk about all the cool stuff, we need to understand the basics: how are they defined, and how are they optimized? These notes also cover some core ML principles like feature selection and the bias-variance tradeoff.

View Notes

Probabilistic Graphical Models

The real world is stochastic but also full of dependencies. Clouds are correlated with rain, and winter is correlated with snow. Probabilistic Graphical Models are a very elegant way of modeling these dependencies. Here, we introduce Bayesian Models, Markov Random Fields, and their properties. We see how they can be sampled, evaluated, and trained. Finally, we look at some (pretty complicated) theories of variational inference.

View Notes

Information Theory

Information Theory deals with the question of communication. We make a lot of noise, and sometimes this noise carries important information. How can we measure degrees of randomness? How can we compress randomness? These notes also include discussions of Markov Processes and other stochastic processes. Information Theory is important to AI because many of our objectives can be decomposed into quantities introduced by information theorists: entropy, divergence, etc.

View Notes

AI / ML Methods

These notes cover various AI approaches, including ways of seeing the world (computer vision), processing language (NLP), learning from reinforcement (RL), and others. Some of these notes include state-of-the-art methods.

Computer Vision

Vision is a very important sensory modality for understanding the world. These notes give an overview of the methods we use to make computers see. They include discussions of CNNs and other algorithms.

View Notes

Natural Language Processing

One of the things that sets us apart from animals is the presence of a natural language, which allows for a very efficient way of exchanging information. In these notes, we will look at how we can equip machines with the ability to read and comprehend.

View Notes

Deep Generative Models

Much of machine learning is modeling a distribution. In these notes, we will look deeply at advanced distribution modeling techniques, like Variational Autoencoders, GANs, and even diffusion models.


Reinforcement Learning Theory

All living creatures (including humans) want to increase rewards and reduce punishments. We can construct elaborate behaviors to accomplish this goal. In these notes, we formalize this reward learning problem and present some theoretically grounded algorithms that form the basis for modern RL algorithms.

View Notes

Deep Reinforcement Learning

Moving from the tabular worlds of reinforcement learning theory, we bring RL to the real world through neural networks and various techniques like actor-critic algorithms and inverse reinforcement learning. Admittedly, these notes are not the most refined; I will reupload them when they get better.

View Notes

Meta Learning

As we grow up, we gain knowledge, but we also learn how to learn. More specifically, we learn certain strategies that speed up this learning process. In the human world, we might call these study skills. In the AI world, we call this meta-learning. In these notes, we will cover the AI models capable of meta-learning, and how we might apply them to real-world problems.

View Notes