A TensorFlow implementation of the idea from the paper *Training Deep Nets with Sublinear Memory Cost*.
The code is messy and doesn't really work, but it can be a starting point for someone who wants to properly reimplement the idea in TensorFlow.
TensorFlow's `gradients` function takes a tensor `V` and builds a computational graph that computes the gradient of `V` with respect to the parameters. The backward-pass graph uses the values of all nodes of the forward graph, so every intermediate activation has to stay in memory until the backward pass consumes it.
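For reference, the stock API is typically used like this (a minimal TF1-style sketch; the model and shapes are invented for illustration):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

x = tf.placeholder(tf.float32, [None, 784])
w = tf.Variable(tf.zeros([784, 10]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))

# tf.gradients adds backward ops to the graph; by default those ops
# read every intermediate tensor produced on the forward pass.
grad_w = tf.gradients(loss, [w])[0]
```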
`try.ipynb` contains my reimplementation of the `gradients` function. It also builds a graph that computes the gradient, but on the backward pass it uses only the values of the nodes listed in `store_activations_set`; all other necessary values are recomputed on the fly.
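The mechanism can be illustrated on a toy chain of layers. This is only a sketch of the recompute-on-backward idea, not the notebook's actual code; `layer`, `checkpoint_every`, and the chain structure are made up for the example:

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

def layer(h, i):
    # stand-in for one layer of the chain; any differentiable op works
    return tf.tanh(h + float(i))

n_layers = 6
checkpoint_every = 2  # plays the role of store_activations_set

x = tf.placeholder(tf.float32, [None, 4])

# Forward pass: keep only every k-th activation; the rest can be freed
# once their forward consumers have run, since backprop won't read them.
stored = [x]
h = x
for i in range(1, n_layers + 1):
    h = layer(h, i)
    if i % checkpoint_every == 0:
        stored.append(h)
loss = tf.reduce_sum(h)

# Backward pass: walk the checkpoints in reverse, recomputing each
# segment from its stored input before backpropagating through it.
grad = tf.gradients(loss, stored[-1])[0]
for seg in reversed(range(len(stored) - 1)):
    inp = tf.stop_gradient(stored[seg])  # cut the graph at the checkpoint
    h = inp
    for i in range(seg * checkpoint_every + 1, (seg + 1) * checkpoint_every + 1):
        h = layer(h, i)  # recomputed on the fly
    grad = tf.gradients(h, inp, grad_ys=grad)[0]

# grad now holds d(loss)/dx, computed while keeping only the checkpoints.
with tf.Session() as sess:
    print(sess.run(grad, {x: np.ones([2, 4], np.float32)}).shape)
```

With trainable variables inside `layer`, one would additionally accumulate `tf.gradients(h, params, grad_ys=grad)` per segment; the sketch only propagates the gradient back to the input.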
Right now `store_activations_set` is hardcoded. `MemoryOptimizer.py` is a not-yet-working attempt to choose the contents of `store_activations_set` automatically, based on an analysis of the forward computation graph.
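For a linear chain the paper's baseline strategy is simple: storing every ~√n-th activation balances the number of kept checkpoints against the cost of recomputing one segment, which is where the sublinear O(√n) memory bound comes from. A hypothetical helper illustrating that heuristic (not what `MemoryOptimizer.py` actually does; real graphs are not simple chains):

```python
import math

def choose_checkpoints(n_nodes):
    # Storing every ~sqrt(n)-th node of a chain keeps O(sqrt(n))
    # checkpoints while each recomputed segment is O(sqrt(n)) long.
    stride = max(1, round(math.sqrt(n_nodes)))
    return set(range(0, n_nodes, stride))

print(sorted(choose_checkpoints(100)))  # 10 checkpoints for a 100-node chain
```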
`try.ipynb` builds a simple MNIST network and runs a few iterations of optimization. Use TensorBoard to explore the constructed graph.
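If the notebook does not already write the graph out, one way to do it (assuming a `./logs` directory for the event files):

```python
import tensorflow.compat.v1 as tf

# dump the current default graph so TensorBoard can render it
writer = tf.summary.FileWriter("./logs", tf.get_default_graph())
writer.close()
```

Then run `tensorboard --logdir ./logs` and open the Graphs tab.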