Tired: loss minimalists. Wired: loss maximalists.
by @sharifshameem :)
A dataset fed through a data loader that doesn't reshuffle after each epoch? Generously contributed by @richardgalvez.
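A minimal sketch of the fix, assuming PyTorch and a standard map-style Dataset (the toy tensors below are hypothetical stand-ins for the contributor's data): with shuffle=True the DataLoader re-draws the sample order at the start of every epoch, which removes the telltale periodic pattern in the loss.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset standing in for the contributor's data.
dataset = TensorDataset(torch.randn(1024, 10), torch.randint(0, 2, (1024,)))

# Without shuffling, every epoch visits the samples in the same order,
# so the loss repeats the same bumps epoch after epoch.
loader_fixed_order = DataLoader(dataset, batch_size=32, shuffle=False)

# With shuffle=True, the order is re-drawn each epoch.
loader_shuffled = DataLoader(dataset, batch_size=32, shuffle=True)
```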
An educational post! We're looking at the validation accuracy of a model as a function of the dropout rate we train with. This trend is consistent with my overall experience: models with less dropout train faster at first, but models with higher dropout win out eventually. One model's dropout is quite extreme (0.85), but it is still gaining on the others! What's going to happen as we train longer? #soexciting
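A minimal sketch of the kind of sweep the post is describing, assuming PyTorch and a small hypothetical MLP classifier (the layer sizes are illustrative, not from the post); the plotted curves would be validation accuracy per epoch for each dropout rate.

```python
import torch
import torch.nn as nn

def make_mlp(dropout_rate: float) -> nn.Module:
    """Hypothetical classifier whose only difference across runs is the dropout rate."""
    return nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Dropout(p=dropout_rate),  # active under model.train(), disabled under model.eval()
        nn.Linear(256, 10),
    )

# One model per dropout setting, including the extreme 0.85 from the post.
models = {p: make_mlp(p) for p in (0.0, 0.25, 0.5, 0.85)}
```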
A Spatial Transformer Network identifying right whales, with plots of the L2 regularization term and the loss.
Contributed by @robibok
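A minimal sketch of the two pieces the caption mentions, assuming PyTorch; the SimpleSTN module below is a generic illustration, not the contributor's whale model. The localization head predicts a 2x3 affine matrix that resamples the input, and the L2 regularization comes in through the optimizer's weight_decay term.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleSTN(nn.Module):
    """Predicts a 2x3 affine matrix and resamples the input with it."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, 2 * 3),
        )
        # Initialize the final layer to the identity transform so training starts stable.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

model = SimpleSTN()
# weight_decay applies an L2 penalty to the weights, i.e. the "L2 reg" in the caption.
optimizer = torch.optim.Adam(model.parameters(), weight_decay=1e-4)
```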