Multi-Armed Bandit Algorithm with Deep Learning

Built using Python 3.6.6 with tox 3.4.0 and TensorFlow 1.12.2.

Policies implemented:
- Epsilon Decreasing
- Epsilon First
- Epsilon Greedy
- EXP3
- GreedyMix
- Softmax
- Softmax Decreasing
- SoftMix
- UCB1
- UCB1-Tuned
- UCB2

Arms:
- Bernoulli
- Normal
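For orientation, here is a minimal sketch of how two of the listed policies (Epsilon Greedy and UCB1) can interact with Bernoulli arms. It is not this project's API: the names BernoulliArm, EpsilonGreedy, UCB1, select_arm, update, and run are hypothetical, and the sketch uses only the Python standard library rather than TensorFlow. The UCB1 index shown is the textbook form, mean reward plus sqrt(2 ln t / n), which may differ in detail from the implementation here.

```python
import math
import random


class BernoulliArm:
    """Hypothetical arm: pays 1.0 with probability p, otherwise 0.0."""

    def __init__(self, p, rng):
        self.p = p
        self.rng = rng

    def pull(self):
        return 1.0 if self.rng.random() < self.p else 0.0


class _MeanTracker:
    """Shared bookkeeping: pull counts and running mean reward per arm."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class EpsilonGreedy(_MeanTracker):
    """Explore a uniformly random arm with probability epsilon, else exploit."""

    def __init__(self, n_arms, epsilon, rng):
        super().__init__(n_arms)
        self.epsilon = epsilon
        self.rng = rng

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)


class UCB1(_MeanTracker):
    """Pick the arm with the highest mean plus exploration bonus."""

    def select_arm(self):
        # Play every arm once so each count is non-zero.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)

        def index(arm):
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[arm])
            return self.values[arm] + bonus

        return max(range(len(self.counts)), key=index)


def run(policy, arms, n_rounds):
    """Play n_rounds and return the cumulative reward."""
    total_reward = 0.0
    for _ in range(n_rounds):
        arm = policy.select_arm()
        reward = arms[arm].pull()
        policy.update(arm, reward)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    rng = random.Random(0)
    arms = [BernoulliArm(p, rng) for p in (0.2, 0.5, 0.8)]
    for policy in (EpsilonGreedy(len(arms), 0.1, rng), UCB1(len(arms))):
        reward = run(policy, arms, 5000)
        print(type(policy).__name__, reward, policy.counts)
```

Both policies share the same select_arm/update interface, so the run loop does not need to know which one it is driving. The other listed policies could plug into the same loop under that assumption.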