To use the updaters, pass a new updater instance to the updater() method in the network configuration for either a ComputationGraph or MultiLayerNetwork.
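
A minimal configuration sketch, assuming the standard NeuralNetConfiguration builder API; the layer sizes, loss function, and the Adam learning rate below are placeholders:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .updater(new Adam(1e-3))                      // the updater is passed here
        .list()
        .layer(0, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(784).nOut(10)
                .activation(Activation.SOFTMAX)
                .build())
        .build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();
```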


NadamUpdater

[source]

The Nadam updater, a variant of Adam that incorporates Nesterov momentum.

applyUpdater
```java
public void applyUpdater(INDArray gradient, int iteration, int epoch)
```

Calculate the update based on the given gradient

  • param gradient the gradient to get the update for
  • param iteration
  • return the gradient
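
As a hedged illustration only (not the library's implementation), a simplified Nadam-style step can be written with ND4J array operations; the coefficients, iteration counter, and array sizes below are placeholders:

```java
// Illustrative Nadam-style update: Adam moments plus a Nesterov-style look-ahead on the first moment.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);                       // first moment
INDArray v = Nd4j.zeros(3);                       // second moment
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.002, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;
int t = 1;                                        // 1-based iteration counter

m = m.mul(beta1).add(gradient.mul(1 - beta1));
v = v.mul(beta2).add(gradient.mul(gradient).mul(1 - beta2));
INDArray mNesterov = m.mul(beta1).add(gradient.mul(1 - beta1)).div(1 - Math.pow(beta1, t)); // look-ahead + bias correction
INDArray vHat = v.div(1 - Math.pow(beta2, t));
INDArray update = mNesterov.mul(lr).div(Transforms.sqrt(vHat, true).add(epsilon));
gradient.assign(update);                          // the updater writes the update back into the gradient
```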

NesterovsUpdater

[source]

Nesterov’s momentum. Keeps track of the previous layer’s gradient and uses it as a way of updating the gradient.

applyUpdater

Get the Nesterov update

  • param gradient the gradient to get the update for
  • param iteration
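
A minimal sketch of this rule, illustrative only (not the library code), using the common reformulation update = -mu * vPrev + (1 + mu) * v; the learning rate, momentum, and array sizes are placeholders:

```java
// Illustrative Nesterov momentum step operating on an ND4J gradient array.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray v = Nd4j.zeros(3);                 // velocity, kept between iterations
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.01, momentum = 0.9;

INDArray vPrev = v.dup();                                          // remember the previous velocity
v = v.mul(momentum).sub(gradient.mul(lr));                         // v = mu * v - lr * g
INDArray update = vPrev.mul(-momentum).add(v.mul(1 + momentum));   // -mu * vPrev + (1 + mu) * v
gradient.assign(update);                                           // the updater overwrites the gradient in place
```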

RmsPropUpdater

RMS Prop updates:

  • http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
  • http://cs231n.github.io/neural-networks-3/#ada
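
An illustrative RMSProp-style step (not the library implementation), keeping a decaying average of squared gradients per weight; the decay rate, epsilon, and array sizes are placeholders:

```java
// Illustrative RMSProp update: divide the learning rate by the root of a running average of g^2.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray cache = Nd4j.zeros(3);             // per-weight running average of squared gradients
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.001, decay = 0.95, epsilon = 1e-8;

cache = cache.mul(decay).add(gradient.mul(gradient).mul(1 - decay));            // E[g^2]
INDArray update = gradient.mul(lr).div(Transforms.sqrt(cache.add(epsilon), true));
gradient.assign(update);                    // the updater overwrites the gradient in place
```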


AdaGradUpdater

Vectorized Learning Rate used per Connection Weight.

See also: http://cs231n.github.io/neural-networks-3/#ada

applyUpdater

Gets feature-specific learning rates. AdaGrad keeps a history of gradients being passed in. Note that each gradient passed in becomes adapted over time, hence the opName adagrad.

  • param gradient the gradient to get learning rates for
  • param iteration
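
An illustrative AdaGrad-style step (not the library implementation), showing the accumulated squared-gradient history that scales each weight's learning rate; constants and sizes are placeholders:

```java
// Illustrative AdaGrad update: the history of squared gradients only grows, so effective learning rates decay.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray historicalGradient = Nd4j.zeros(3);   // running sum of squared gradients, one entry per weight
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.01, epsilon = 1e-6;

historicalGradient.addi(gradient.mul(gradient));                                           // accumulate g^2
INDArray update = gradient.mul(lr).div(Transforms.sqrt(historicalGradient.add(epsilon), true));
gradient.assign(update);                        // the updater overwrites the gradient in place
```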

AdaMaxUpdater

The AdaMax updater, a variant of Adam. Reference: http://arxiv.org/abs/1412.6980

applyUpdater

Calculate the update based on the given gradient

  • param gradient the gradient to get the update for
  • param iteration
  • return the gradient
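
A hedged sketch of an AdaMax-style step (not the library code), which replaces Adam's second-moment estimate with an exponentially weighted infinity norm; constants and sizes are placeholders:

```java
// Illustrative AdaMax update: u = max(beta2 * u, |g|) takes the place of the squared-gradient average.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);                 // first moment
INDArray u = Nd4j.zeros(3);                 // exponentially weighted infinity norm
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.002, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;
int t = 1;                                  // 1-based iteration counter

m = m.mul(beta1).add(gradient.mul(1 - beta1));
u = Transforms.max(u.mul(beta2), Transforms.abs(gradient, true));   // element-wise maximum
double stepSize = lr / (1 - Math.pow(beta1, t));                    // bias correction for the first moment
INDArray update = m.mul(stepSize).div(u.add(epsilon));
gradient.assign(update);                    // the updater overwrites the gradient in place
```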

NoOpUpdater

NoOp updater: gradient updater that makes no changes to the gradient


AdamUpdater

[source]

The Adam updater.

applyUpdater
```java
public void applyUpdater(INDArray gradient, int iteration, int epoch)
```

  • param gradient the gradient to get the update for
  • param iteration
  • return the gradient
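
An illustrative Adam-style step with bias correction (not the library implementation); coefficients, iteration counter, and array sizes are placeholders:

```java
// Illustrative Adam update: bias-corrected first and second moments of the gradient.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);        // first moment (mean of gradients)
INDArray v = Nd4j.zeros(3);        // second moment (mean of squared gradients)
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.001, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;
int t = 1;                          // 1-based iteration counter

m = m.mul(beta1).add(gradient.mul(1 - beta1));
v = v.mul(beta2).add(gradient.mul(gradient).mul(1 - beta2));
double alpha = lr * Math.sqrt(1 - Math.pow(beta2, t)) / (1 - Math.pow(beta1, t));  // bias correction
INDArray update = m.mul(alpha).div(Transforms.sqrt(v, true).add(epsilon));
gradient.assign(update);            // the update is written back into the gradient array
```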

AdaDeltaUpdater

[source]

AdaDelta updater. A more robust AdaGrad that keeps track of a moving window average of the gradient rather than the ever-decaying learning rates of AdaGrad.

applyUpdater

Get the updated gradient for the given gradient and also update the state of AdaDelta.

  • param gradient the gradient to get the updated gradient for
  • param iteration
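
A hedged sketch of an AdaDelta-style step (not the library code); rho, epsilon, and array sizes are placeholders:

```java
// Illustrative AdaDelta update: decaying averages of squared gradients and squared updates,
// so no explicit learning rate appears in the rule.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray msg = Nd4j.zeros(3);      // moving average of squared gradients
INDArray msdx = Nd4j.zeros(3);     // moving average of squared updates
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double rho = 0.95, epsilon = 1e-6;

msg = msg.mul(rho).add(gradient.mul(gradient).mul(1 - rho));
INDArray rmsUpdate = Transforms.sqrt(msdx.add(epsilon), true);
INDArray rmsGrad = Transforms.sqrt(msg.add(epsilon), true);
INDArray update = gradient.mul(rmsUpdate).div(rmsGrad);
msdx = msdx.mul(rho).add(update.mul(update).mul(1 - rho));   // track the squared updates as state
gradient.assign(update);           // the update replaces the gradient in place
```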

SgdUpdater

[source]

The SGD updater applies a learning rate only.


GradientUpdater

Gradient modifications: Calculates an update and tracks related information for gradient changes over time for handling updates.
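
A hedged sketch of driving an updater directly through this interface, assuming the stateSize/instantiate methods of the updater configuration classes in the ND4J learning package; in normal use the network handles this internally:

```java
// Illustrative only: instantiate a GradientUpdater from its configuration and apply it to a gradient.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.learning.GradientUpdater;
import org.nd4j.linalg.learning.config.Adam;

long numParams = 3;
Adam config = new Adam(1e-3);
INDArray state = Nd4j.zeros(1, (int) config.stateSize(numParams));   // updater state view (m and v for Adam)
GradientUpdater updater = config.instantiate(state, true);

INDArray gradient = Nd4j.create(new double[][]{{0.1, -0.2, 0.05}});  // row vector matching the state layout
updater.applyUpdater(gradient, 0, 0);   // the computed update is written back into the gradient
```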


AMSGradUpdater

The AMSGrad updater. Reference: On the Convergence of Adam and Beyond - https://openreview.net/forum?id=ryQu7f-RZ
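
A hedged sketch of an AMSGrad-style step (not the library code), which keeps a running maximum of the second-moment estimate so the effective step size cannot grow; constants and sizes are placeholders:

```java
// Illustrative AMSGrad update: like Adam, but the denominator uses vHatMax = max(vHatMax, v).
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);          // first moment
INDArray v = Nd4j.zeros(3);          // second moment
INDArray vHatMax = Nd4j.zeros(3);    // running maximum of the second moment
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.001, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;

m = m.mul(beta1).add(gradient.mul(1 - beta1));
v = v.mul(beta2).add(gradient.mul(gradient).mul(1 - beta2));
vHatMax = Transforms.max(vHatMax, v);                                 // never let the denominator shrink
INDArray update = m.mul(lr).div(Transforms.sqrt(vHatMax, true).add(epsilon));
gradient.assign(update);             // the updater overwrites the gradient in place
```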