Welcome to Depth First Learning!

DFL is a compendium of curricula to help you deeply understand Machine Learning.

Each of our posts is a self-contained lesson plan targeting a significant research paper, complete with readings, questions, and answers.

We can guarantee that honestly engaging with the material will leave you with a thorough understanding of the methods, background, and significance of that paper.

Want to stay up to date on future in-person or online DFL study groups? Fill out this short form.

April 07, 2020

Resurrecting the Sigmoid: Theory and Practice

With the success of deep networks across tasks ranging from vision to language, it is important to understand how to properly train very deep neural networks with gradient-based methods. This paper studies, from a rigorous theoretical perspective, which combinations of weight initializations and activation functions result in deep networks that are trainable. ⟹

March 02, 2020

Stein Variational Gradient Descent

Stein Variational Gradient Descent is a powerful, nonparametric Bayesian inference algorithm. It is based on its namesake, Stein's Method. In this guide, we work through the mathematics and fundamentals behind this approach, including Kernelized Stein Discrepancy, before fully understanding Stein Variational Gradient Descent. We end by considering two application areas for SVGD. One is Reinforcement Learning; the other is viewing SVGD through the lens of gradient flow. ⟹

September 23, 2019

Neural ODEs

Neural Ordinary Differential Equations (Neural ODEs) are deep learning architectures that combine neural networks and ordinary differential equations, providing new models for the familiar litany of tasks ranging from supervised learning to generative modeling to time series forecasting. In this curriculum, we will dive deep into these models with the end goal of implementing them ourselves. ⟹

May 02, 2019

Wasserstein GAN

The Wasserstein GAN (WGAN) is a GAN variant that uses the 1-Wasserstein distance, rather than the Jensen-Shannon (JS) divergence, to measure the difference between the model and target distributions. This seemingly simple change has big consequences! Not only does WGAN train more easily, but it also achieves very impressive results, generating some stunning images. ⟹

April 15, 2019

Announcing the 2019 DFL Fellows

After we launched Depth First Learning last year, we wanted to keep the momentum going and continue producing high-quality study guides for machine learning. Subsequently, we launched the Depth First Learning Fellowship with funding provided by Jane Street. We were blown away by the response. With over 100 applicants from 5 continents, we had a tremendously hard time selecting only four proposals. ⟹
