Maximilian Du

maxjdu at stanford dot edu

Google Scholar  /  GitHub  /  Resume  /  CV

Hey there! I am an undergraduate researcher at Stanford University. I study computer science (AI Track) and creative writing (prose track minor). I work in Chelsea Finn's IRIS lab on robot learning.

My research and project experiences include reinforcement learning, behavior cloning, computer vision, and robotics. I am also excited about the interdisciplinary connections between psychology and computer science, especially when it comes to the mechanisms of learning, teaching, and training.

In my free time, I love to write short fiction and creative non-fiction. Currently, I'm working on a book with the Stanford Storytelling Project that focuses on animal trainers and the human-animal connection.

Scroll down to see my research, other projects, writing, and course work. Feeling adventurous? Check these out as well:

Whale Book  /  Other Writing  /  Photography  /  Personal Gallery


Learning Smarter from Mistakes

Current Work 2022

Agents that imitate experts can make mistakes when running in the real world. When we correct the agent, the corrections also indicate which parts of the task are the hardest. Can we use this insight to improve data efficiency in behavior cloning agents?

Improving LSTM Neural Networks for Better Short-Term Wind Power Predictions

Maximilian Du
IEEE Renewable Energy and Power Engineering (REPE) 2019

I created modified LSTM (Long Short-Term Memory) neural networks and used auxiliary weather forecast data to improve ultra-short-term wind farm output predictions, for use in a smart grid.

Paper  /  Code

Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning

Maximilian Du*, Olivia Lee*, Suraj Nair, Chelsea Finn
Robotics: Science and Systems 2022

We show that robots can benefit from audio data while accomplishing visually-occluded tasks. We learn policies end-to-end from vision and audio (from a gripper-mounted microphone) to complete difficult tasks, like extracting keys from a bag when the keys are not initially visible.

Website  /  Paper  /  Code


Sixteen Pixels is (Almost) All You Need: Crafting Parameterized Image Uncrumpling Models

Maximilian Du*, Niveditha Iyer*, Tejas Narayanan*
Winning CS231N Final Project 2022

We created a decrumpling model that takes in an image of a crumpled document and smooths it out. We find that an adversarial paradigm with a small PatchGAN yields both the most realistic results and the best quantitative scores.

Paper  /  Code


Media Annotator

When reviewing video and audio media for my book, I often find myself needing a real-time annotator. So I coded one! It listens for your keystrokes anywhere on the screen, so you can be focused on your media. It uses system time, so it runs in lockstep with any media player. It exports your annotations to a simple text format that you can add to any literature review notes. And best of all, your annotations are attached to simple macros (number keys) and can be easily changed!


MidiStyle: Audio Style Transfer

CS 229 Final Project

Style transfer is pretty pervasive in visual tasks, using anything from Gram Matrix methods to CycleGAN. Can we try using established vision style transfer algorithms on audio? In this project, we show that this is indeed possible. Using a spectrogram representation, we change the sound of a piano into that of a harp, harpsichord, electric guitar, and even timpani. We generate our own data using MIDI, and we test on a real-world piano.

Website   /   Code

Coding Basics

Literally every time I want to make a line plot in Matplotlib, or save a model in PyTorch, or load a CSV, I find myself searching it up and copy/pasting. To unify all of these simple things, I'm working on a large repository of code basics for researchers. This includes NumPy, PyTorch, LaTeX, Pandas, file loading, Matplotlib, Seaborn, and others. I'm also including easily runnable PyTorch implementations of common ML models.


Thompson Sampling Simulator

CS 109 Winning Final Project

The multi-armed bandit is a fascinating theoretical problem, but it also raises a compelling philosophical question: how do we balance exploration and exploitation? We look at one algorithm, known as Thompson sampling. Here, we simulate ants finding a good location for a nest. We also implement Tandem Running, which allows ants to "persuade" other ants, resulting in faster convergence.

Simulation   /   Code
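
For readers new to the idea, here is a minimal sketch of Thompson sampling on a Bernoulli bandit. It is not the simulator's code: the function name `thompson_sampling`, the three "nest site" success probabilities, and the uniform Beta(1, 1) prior are all assumptions made for illustration, and Tandem Running is not modeled.

```python
import random


def thompson_sampling(true_probs, n_rounds=2000, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pulls per arm.

    true_probs holds hypothetical success probabilities (e.g. how often a
    scouting trip to each nest site turns out well). They are hidden from
    the agent and only used to generate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1] * n_arms  # 1 + observed successes (Beta(1, 1) prior)
    beta = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Sample a plausible success rate for each arm from its posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Observe a 0/1 reward and update that arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

Because uncertain arms produce widely spread samples, they still get tried early on (exploration), while arms with strong evidence win most of the draws later (exploitation).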

The Tricks, Goofs, and Whales of Machine Learning

Machine learning is the abstraction of real learning processes into computational ones. Therefore, we find that it often imitates nature. In this 90-minute talk, I explore this parallel by looking at connections between robot learning and animal training. How does classical conditioning map to the mathematical Bellman Backup? How do animal trainers help their animals achieve good exploration policies? What do animal superstitions tell us about domain adaptation?

This talk was originally made for Stanford Splash, but it works for any audience of high school or undergraduate college students. If you want the slide deck or a presentation, please email me.

Automatic Source Vetter

While doing research for my whale book, I found myself needing to keep track of current events and follow whale-related developments as they happen. I'm also a full-time student, which makes this highly impractical. So, I've scripted up a system that monitors a large collection of RSS newsfeeds for a list of keywords. If any of the keywords are found, the relevant links are saved to a SQL database for later perusal. The script also looks at relevant YouTube channels and specialty sites. While you may not find any use in my specialty site translators, the codebase is highly modular, so you can implement your own translator. It is also useful for any CS researcher because arXiv has RSS feed links.

Code (still actively updating)
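
The core loop described above (scan feed items for keywords, save matching links to SQL) can be sketched with only the Python standard library. This is an illustrative sketch, not the project's code: the sample feed, the `hits` table, and the helper names `find_matches` and `save_matches` are all hypothetical, and a real run would fetch feeds over the network instead of using an inline string.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 snippet standing in for a downloaded feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>Orca pod spotted near coast</title><link>https://example.com/orca</link></item>
<item><title>Local bakery opens</title><link>https://example.com/bakery</link></item>
</channel></rss>"""


def find_matches(feed_xml, keywords):
    """Return (keyword, title, link) for each item whose title has a keyword."""
    root = ET.fromstring(feed_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Case-insensitive substring match; keep the first keyword that hits.
        hit = next((k for k in keywords if k.lower() in title.lower()), None)
        if hit:
            matches.append((hit, title, link))
    return matches


def save_matches(conn, matches):
    """Store matches, skipping links that were already saved."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits (keyword TEXT, title TEXT, link TEXT UNIQUE)"
    )
    conn.executemany("INSERT OR IGNORE INTO hits VALUES (?, ?, ?)", matches)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
save_matches(conn, find_matches(SAMPLE_FEED, ["orca", "whale"]))
rows = conn.execute("SELECT keyword, link FROM hits").fetchall()
print(rows)
```

The `UNIQUE` constraint on `link` together with `INSERT OR IGNORE` means the same article is not stored twice when a feed is polled repeatedly.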


From paper to presentation, I tell stories of my research. But I also play with writing as an art form. I work on fiction, creative non-fiction, and some poetry. In all three, I often focus on the magic, fragility, and danger of innocence. Following this trend, I'm also drawn to the story of animals and our relationship with them. Find my writing here.

I'm currently working on a non-fiction book about captive killer whales, advised by DCI fellow Melissa Dyrdahl and Professor Jonah Willihnganz from the Stanford Storytelling Project.

Selected Coursework

Computer Science

CS 330 Deep Multi-task and Meta Learning

CS 231N Deep Learning for Computer Vision

CS 228 Probabilistic Graphical Models

CS 229 Machine Learning

CS 285 Deep Reinforcement Learning (Berkeley, self-study)

CS 161 Design and Analysis of Algorithms

CS 110 Principles of Computer Systems

CS 107E Systems from the Ground Up

CS 106B Programming Abstractions in C++


Mathematics

MATH 115 Real Analysis

MATH 113 Linear Algebra and Matrix Theory

MATH 51 Linear Algebra and Multivariable Calculus

EE 263 Introduction to Linear Dynamical Systems (self-study)

CS 109 Probability for Computer Science

CS 103 Mathematical Foundations of Computing


Psychology

PSYCH 30 Introduction to Perception

PSYCH 50 Cognitive Neuroscience

PSYCH 1 Intro to Psychology

Writing, Literature, & Philosophy

ENGLISH 127A Moby Dick & The Role of Animals in Fiction

PHIL 2 Ethical Philosophy

ENGLISH 190 Intermediate Fiction

ENGLISH 92 Introduction to Poetry

ENGLISH 91 Introduction to Creative Non-Fiction