Ilya Sutskever, James Martens, George Dahl, Geoffrey Hinton

On the importance of initialization and momentum in deep learning

