This document summarizes recent developments in deep learning for video action recognition. It first covers earlier approaches: hand-crafted improved dense trajectories and two-stream convolutional neural networks. It then focuses on advances in 3D convolutional networks, which were enabled by large video datasets such as Kinetics. The state-of-the-art results it reports come from inflated 3D convolutional networks and from temporal aggregation methods such as temporal linear encoding. The document also gives an overview of popular datasets and challenges, and concludes with tips on training models at scale.