GPU Computing For Data Science - John Joo
John Joo
[email protected]
Data Science Evangelist @ Domino Data Lab
Outline
• Why use GPUs?
• Make predictions
Little Information in One “Noisy Simulation”
Price(t+1) = Price(t) · e^(InterestRate · dt) + noise
Many “Noisy Simulations” ➡ Actionable Information
Price(t+1) = Price(t) · e^(InterestRate · dt) + noise
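A minimal sketch of the idea on these two slides, using only the Python standard library: one run of the noisy price update says little, while the average over many runs settles near the noise-free value. The starting price, interest rate, time step, noise level, and run count below are illustrative assumptions, not values from the talk, and the function name is hypothetical.

```python
import math
import random

def simulate_final_price(rng, price0=100.0, rate=0.05, dt=1 / 252,
                         n_steps=252, noise_std=1.0):
    """Apply Price(t+1) = Price(t) * e^(rate * dt) + noise for n_steps steps."""
    price = price0
    growth = math.exp(rate * dt)  # deterministic growth per step
    for _ in range(n_steps):
        price = price * growth + rng.gauss(0.0, noise_std)  # add zero-mean noise
    return price

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# One "noisy simulation": a single draw, hard to act on by itself.
single_run = simulate_final_price(rng)

# Many "noisy simulations": the average is far more stable.
many_runs = [simulate_final_price(rng) for _ in range(2000)]
mean_final = sum(many_runs) / len(many_runs)

# Noise-free reference: 252 steps of growth compound to e^(rate * 1 year).
noise_free = 100.0 * math.exp(0.05)

print("single run:", round(single_run, 2))
print("mean of 2000 runs:", round(mean_final, 2))
print("noise-free value:", round(noise_free, 2))
```

Each run is independent of the others, which is why this kind of workload maps well onto many simple cores running in parallel.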
Monte Carlo Simulations Are Often Slow
• Lots of simulation data is required to create valid models
• Machine learning
• Search
GPUs Make Deep Learning Accessible
                     Google Datacenter    Stanford AI Lab
# of machines        1,000                3
# of CPUs or GPUs    2,000 CPUs           12 GPUs
Adam Coates, Brody Huval, Tao Wang, David Wu, Bryan Catanzaro, Andrew Ng; JMLR W&CP 28(3):1337–1345, 2013
CPU vs GPU Architecture: Structured for Different Purposes
CPU: 4-8 high-performance cores
GPU: 100s-1000s of bare-bones cores
Both CPU and GPU are required
Compute-intensive functions: GPU
Everything else: CPU
Getting Started: Software
• TensorFlow - ML
• mxnet - NN
• scikit-cuda
• cudamat
• gputools
• HiPLARM
• Drop-in library
Drop-in Library
26 sec
17x speed up
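The drop-in idea can be sketched as follows: the same array code runs on the GPU when a compatible library is available and falls back to the CPU otherwise. This example assumes CuPy as the NumPy-compatible GPU library; the talk does not say which drop-in library produced the timing above, so treat the library choice, the function name, and the matrix size as assumptions. No speed-up figure is implied by this sketch.

```python
import numpy as np

# Assumption: CuPy stands in for "a drop-in library" here. If it is not
# installed, or no usable CUDA device is found, fall back to NumPy on the CPU.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp, on_gpu = cp, True
except Exception:
    cp = None
    xp, on_gpu = np, False

def gram_matrix(a_host):
    """Compute a @ a.T, on the GPU if available, and return a NumPy array."""
    a = xp.asarray(a_host)   # copies host data to the GPU when xp is CuPy
    out = a @ a.T            # the compute-intensive step
    return cp.asnumpy(out) if on_gpu else out  # bring the result back to host

rng = np.random.default_rng(0)
data = rng.standard_normal((256, 64))
result = gram_matrix(data)

print("ran on GPU:", on_gpu)
print("result shape:", result.shape)
```

Only the array module changes between the two paths; the surrounding code, which stays on the CPU, is untouched.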
Some sample Jupyter notebooks
• https://app.dominodatalab.com/johnjoo/gpu_examples
blog.dominodatalab.com
[email protected]