Smaller CS / Non-CS Excursions
Ever sit down to watch or listen to something and want to note down a critical moment? I sure do! In my narrative interviews, I often have to do this manually on a piece of notebook paper. In my video and audio reviews, I do it manually in a Google Doc on a split screen. Finally, I put my foot down and programmed my very own media annotator. It has a load of useful features. It listens for your keystrokes anywhere on the screen, so you can stay focused on your media. It uses system time, so it runs in lockstep with any media player. It exports your annotations to a simple text format that you can add to any literature review notes. And best of all, your annotations are attached to simple macros (the number keys) and can be easily changed! Code is here.
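The core idea can be sketched in a few lines of Python (this is a minimal illustration, not the actual tool — the real annotator listens for global keystrokes, which needs an OS-level hook; the macro labels here are made-up examples):

```python
import time

# Hypothetical macro table: number keys map to editable annotation labels.
MACROS = {"1": "key moment", "2": "question", "3": "follow up"}

def format_annotation(start, key, now=None):
    """Return one export line: elapsed time plus the macro label for `key`."""
    now = time.monotonic() if now is None else now
    mins, secs = divmod(int(now - start), 60)
    label = MACROS.get(key, "unknown")
    return f"[{mins:02d}:{secs:02d}] {label}"

def export(annotations):
    """Join annotations into a simple text format for literature notes."""
    return "\n".join(annotations)
```

Because the timestamps come from the system clock rather than the media player, the notes stay in lockstep with whatever player is running.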
Literally every time I want to make a line plot in Matplotlib, save a model in PyTorch, or load a CSV, I find myself searching for it and copy/pasting. To unify all of these simple things, I'm working on a large repository of code basics for researchers. It covers NumPy, PyTorch, LaTeX, Pandas, file loading, and other topics I haven't gotten to yet. I also plan to include simple implementations of architectures like VAEs and GANs.
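As a flavor of what these snippets look like, here is a hedged sketch of the CSV-loading basic using only the standard library (the repository's actual snippets may use pandas instead):

```python
import csv
import io

def load_csv(fileobj):
    """Read a CSV with a header row into a list of row dicts."""
    return list(csv.DictReader(fileobj))

# Demo with an in-memory file; a real call would use open("data.csv").
rows = load_csv(io.StringIO("name,score\nada,9\nalan,8\n"))
```

Keeping each snippet this small and self-contained is the whole point: copy, paste, move on.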
When I read a paper, I like to give it one read, hide it, and then try to summarize the key findings in a few paragraphs. I've collected these summaries into one large LaTeX document, currently around 100 pages. When I get the time, I will put it on GitHub.
Training Whales, Robots, and You!
Reinforcement learning and animal training have a lot in common, because both revolve around conveying information through rewards. I've had some fascinating conversations with whale and dolphin trainers, and what I've discovered is that we share very similar insights but use different terminologies. To explore this, I've prepared a 105-minute talk that examines these connections through a survey of literature and anecdotes from both fields. The talk was originally made for Stanford Splash, but it works for any audience of high schoolers and can be adapted for older audiences.
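The shared core of the two fields can be illustrated with a toy value update (this is a generic incremental-learning sketch I'm using for illustration, not material from the talk): whether the learner is a dolphin or an agent, a reward after an action nudges the estimated value of that action upward, and unrewarded actions fade.

```python
def update(values, action, reward, lr=0.1):
    """Nudge the estimated value of `action` toward the observed reward."""
    values[action] += lr * (reward - values[action])
    return values

values = {"jump": 0.0, "spin": 0.0}
for _ in range(50):
    update(values, "jump", 1.0)   # "jump" is rewarded every time
    update(values, "spin", 0.0)   # "spin" goes unrewarded
```

After a few dozen trials, the rewarded behavior dominates — the same convergence a trainer sees across a session of reinforced repetitions.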
Not to be all pretentious or anything, but I think I'm building what will become the world's most comprehensive archival database on whales in captivity. It currently contains around 3,000 individual documents, including images, videos, books, audio, web snapshots, reports, and research articles. After I finish my book, I can release most of it for educational use.
Sixteen Pixels is (Almost) All You Need: Crafting Parameterized Image Uncrumpling Models (CS231N Winning Final project)
As smartphone technology continues to evolve, handheld document scanning is becoming more pervasive. Unlike flatbed scanners, which physically flatten documents, smartphone scanners must digitally remove creases and crinkles to produce the best possible scan. In this work, we create a procedurally generated dataset of paired crumpled and uncrumpled images. We then implement and compare denoising and style transfer architectures on this new problem. We find that an adversarial paradigm with a small PatchGAN yields the most realistic results along with the best quantitative scores. Paper and code are available.
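The procedural-pairing idea can be sketched as follows (the distortion here — a sinusoidal row shift — is my own simplified stand-in, not the paper's actual crumpling pipeline): start from a clean grayscale image and synthesize a distorted partner, so every training example is a (target, input) pair.

```python
import math

def crumple(image, amplitude=2, wavelength=8):
    """Shift each row sideways by a sinusoid to fake a crease pattern."""
    w = len(image[0])
    out = []
    for y, row in enumerate(image):
        shift = int(round(amplitude * math.sin(2 * math.pi * y / wavelength)))
        out.append([row[(x - shift) % w] for x in range(w)])
    return out

# A small synthetic grayscale image, paired with its distorted version.
clean = [[(x + y) % 256 for x in range(16)] for y in range(16)]
pair = (clean, crumple(clean))  # one training example: (target, input)
```

Because the clean original is known exactly, the uncrumpling model can be trained with direct supervision on pixel-level reconstruction.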
MidiStyle: Audio Style Transfer (CS229 Final Project)
Style transfer is pervasive in visual tasks, using anything from Gram matrix methods to CycleGAN. Can we apply established vision style transfer algorithms to audio? In this project, we show that this is indeed possible. Using a spectrogram representation, we transform a piano into a harp, a harpsichord, an electric guitar, and even timpani. We generate our own data using MIDI and test on a real-world piano. Our project website can be found here and our code here.
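The spectrogram representation is what lets vision methods apply at all: it turns audio into an image-like grid of time frames by frequency bins. Here is a minimal standard-library sketch (the project itself would presumably use an FFT library rather than this naive DFT):

```python
import cmath
import math

def spectrogram(signal, frame=64, hop=32):
    """Return a list of per-frame magnitude spectra (first frame//2 bins)."""
    frames = []
    for start in range(0, len(signal) - frame + 1, hop):
        chunk = signal[start:start + frame]
        spectrum = []
        for k in range(frame // 2):
            s = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame)
                    for n in range(frame))
            spectrum.append(abs(s))
        frames.append(spectrum)
    return frames

# A pure tone at 8 cycles per frame concentrates energy in one bin.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
```

Once audio lives in this 2D form, Gram matrix losses and image-to-image networks can operate on it directly, and the result is inverted back to a waveform.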
Thompson Sampling Simulator (CS109 Winning Final Project)
The Multi-Armed Bandit is a fascinating theoretical question, but it is also a compelling philosophical one: how do we balance exploration vs. exploitation? We look at one algorithm, known as Thompson sampling, and simulate ants searching for a good nest location. We also implement Tandem Running, which allows ants to "persuade" other ants, resulting in faster convergence. The code is here, and as of June 2022, the simulation is still running here.
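The basic Thompson sampling loop for Bernoulli bandits is short enough to sketch in full (this is the textbook algorithm, not the project's ant simulation, which layers Tandem Running on top): keep a Beta posterior per arm, sample one draw from each, and pull the arm with the highest draw.

```python
import random

def thompson_step(successes, failures, true_probs, rng):
    """Sample a Beta draw per arm, pull the argmax, record the outcome."""
    draws = [rng.betavariate(s + 1, f + 1)
             for s, f in zip(successes, failures)]
    arm = draws.index(max(draws))
    if rng.random() < true_probs[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1
    return arm

rng = random.Random(0)
true_probs = [0.2, 0.8]          # arm 1 is the better "nest site"
wins, losses = [0, 0], [0, 0]
pulls = [thompson_step(wins, losses, true_probs, rng) for _ in range(500)]
```

Early on, both arms get sampled (exploration); as evidence accumulates, the posterior for the better arm concentrates and it wins most draws (exploitation) — the balance emerges from the sampling itself rather than from an explicit schedule.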
Noninvasive Pipe Monitor
In my childhood home, we had a problem with sewage occasionally backing up into our basement. If we saw it coming (by the main sewage line filling up), we could minimize the damage. So I devised a capacitive sensor and an Arduino that constantly monitor the pipes: higher water levels mean higher capacitance, and the Arduino trips an alarm if the capacitance stays above a threshold for long enough. The code is here.
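The threshold-plus-duration logic can be sketched in Python (the real project runs on an Arduino, and the numbers here are made up): requiring several consecutive high readings keeps a brief splash from causing a false alarm.

```python
THRESHOLD = 120      # hypothetical capacitance reading
HOLD_SAMPLES = 5     # consecutive high readings required to trip

def alarm_states(readings):
    """Yield True once readings have exceeded THRESHOLD for HOLD_SAMPLES."""
    high_count = 0
    for r in readings:
        high_count = high_count + 1 if r > THRESHOLD else 0
        yield high_count >= HOLD_SAMPLES

# Five sustained high readings trip the alarm; the single spike at the end
# does not.
states = list(alarm_states([100, 130, 130, 130, 125, 121, 90, 150]))
```

On the microcontroller the same counter is reset whenever a reading drops back below the threshold, so only a sustained rise in the line sets off the alarm.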