To use the updaters, pass a new updater instance to the updater() method in the network configuration for either a ComputationGraph or MultiLayerNetwork.
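
A minimal configuration sketch, assuming the standard NeuralNetConfiguration builder API; the layer sizes, loss function, and the Adam learning rate below are placeholders:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .updater(new Adam(1e-3))                      // the updater is passed here
        .list()
        .layer(0, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(784).nOut(10)
                .activation(Activation.SOFTMAX)
                .build())
        .build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();
```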


NadamUpdater

[source]

The Nadam updater, a variant of Adam that incorporates Nesterov momentum.

applyUpdater
```java
public void applyUpdater(INDArray gradient, int iteration, int epoch)
```

Calculate the update based on the given gradient

  • param gradient the gradient to get the update for
  • param iteration
  • return the gradient
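
As a hedged illustration only (not the library's implementation), a simplified Nadam-style step can be written with ND4J array operations; the coefficients, iteration counter, and array sizes below are placeholders:

```java
// Illustrative Nadam-style update: Adam moments plus a Nesterov-style look-ahead on the first moment.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);                       // first moment
INDArray v = Nd4j.zeros(3);                       // second moment
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.002, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;
int t = 1;                                        // 1-based iteration counter

m = m.mul(beta1).add(gradient.mul(1 - beta1));
v = v.mul(beta2).add(gradient.mul(gradient).mul(1 - beta2));
INDArray mNesterov = m.mul(beta1).add(gradient.mul(1 - beta1)).div(1 - Math.pow(beta1, t)); // look-ahead + bias correction
INDArray vHat = v.div(1 - Math.pow(beta2, t));
INDArray update = mNesterov.mul(lr).div(Transforms.sqrt(vHat, true).add(epsilon));
gradient.assign(update);                          // the updater writes the update back into the gradient
```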

NesterovsUpdater

[source]

Nesterov’s momentum. Keeps track of the previous layer’s gradient and uses it as a way of updating the gradient.

applyUpdater

Get the Nesterov update

  • param gradient the gradient to get the update for
  • param iteration
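
A minimal sketch of this rule, illustrative only (not the library code), using the common reformulation update = -mu * vPrev + (1 + mu) * v; the learning rate, momentum, and array sizes are placeholders:

```java
// Illustrative Nesterov momentum step operating on an ND4J gradient array.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray v = Nd4j.zeros(3);                 // velocity, kept between iterations
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.01, momentum = 0.9;

INDArray vPrev = v.dup();                                          // remember the previous velocity
v = v.mul(momentum).sub(gradient.mul(lr));                         // v = mu * v - lr * g
INDArray update = vPrev.mul(-momentum).add(v.mul(1 + momentum));   // -mu * vPrev + (1 + mu) * v
gradient.assign(update);                                           // the updater overwrites the gradient in place
```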

RmsPropUpdater

RMS Prop updates:

  • http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
  • http://cs231n.github.io/neural-networks-3/#ada
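
An illustrative RMSProp-style step (not the library implementation), keeping a decaying average of squared gradients per weight; the decay rate, epsilon, and array sizes are placeholders:

```java
// Illustrative RMSProp update: divide the learning rate by the root of a running average of g^2.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray cache = Nd4j.zeros(3);             // per-weight running average of squared gradients
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.001, decay = 0.95, epsilon = 1e-8;

cache = cache.mul(decay).add(gradient.mul(gradient).mul(1 - decay));            // E[g^2]
INDArray update = gradient.mul(lr).div(Transforms.sqrt(cache.add(epsilon), true));
gradient.assign(update);                    // the updater overwrites the gradient in place
```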


AdaGradUpdater

Vectorized Learning Rate used per Connection Weight.

See also: http://cs231n.github.io/neural-networks-3/#ada

applyUpdater

Gets feature-specific learning rates. AdaGrad keeps a history of gradients being passed in. Note that each gradient passed in becomes adapted over time, hence the opName adagrad.

  • param gradient the gradient to get learning rates for
  • param iteration
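
An illustrative AdaGrad-style step (not the library implementation), showing the accumulated squared-gradient history that scales each weight's learning rate; constants and sizes are placeholders:

```java
// Illustrative AdaGrad update: the history of squared gradients only grows, so effective learning rates decay.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray historicalGradient = Nd4j.zeros(3);   // running sum of squared gradients, one entry per weight
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.01, epsilon = 1e-6;

historicalGradient.addi(gradient.mul(gradient));                                           // accumulate g^2
INDArray update = gradient.mul(lr).div(Transforms.sqrt(historicalGradient.add(epsilon), true));
gradient.assign(update);                        // the updater overwrites the gradient in place
```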

AdaMaxUpdater

The AdaMax updater, a variant of Adam. Reference: http://arxiv.org/abs/1412.6980

applyUpdater

Calculate the update based on the given gradient

  • param gradient the gradient to get the update for
  • param iteration
  • return the gradient
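
A hedged sketch of an AdaMax-style step (not the library code), which replaces Adam's second-moment estimate with an exponentially weighted infinity norm; constants and sizes are placeholders:

```java
// Illustrative AdaMax update: u = max(beta2 * u, |g|) takes the place of the squared-gradient average.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);                 // first moment
INDArray u = Nd4j.zeros(3);                 // exponentially weighted infinity norm
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.002, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;
int t = 1;                                  // 1-based iteration counter

m = m.mul(beta1).add(gradient.mul(1 - beta1));
u = Transforms.max(u.mul(beta2), Transforms.abs(gradient, true));   // element-wise maximum
double stepSize = lr / (1 - Math.pow(beta1, t));                    // bias correction for the first moment
INDArray update = m.mul(stepSize).div(u.add(epsilon));
gradient.assign(update);                    // the updater overwrites the gradient in place
```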

NoOpUpdater

NoOp updater: gradient updater that makes no changes to the gradient


AdamUpdater

[source]

The Adam updater.

applyUpdater
```java
public void applyUpdater(INDArray gradient, int iteration, int epoch)
```

  • param gradient the gradient to get the update for
  • param iteration
  • return the gradient
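
An illustrative Adam-style step with bias correction (not the library implementation); coefficients, iteration counter, and array sizes are placeholders:

```java
// Illustrative Adam update: bias-corrected first and second moments of the gradient.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);        // first moment (mean of gradients)
INDArray v = Nd4j.zeros(3);        // second moment (mean of squared gradients)
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.001, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;
int t = 1;                          // 1-based iteration counter

m = m.mul(beta1).add(gradient.mul(1 - beta1));
v = v.mul(beta2).add(gradient.mul(gradient).mul(1 - beta2));
double alpha = lr * Math.sqrt(1 - Math.pow(beta2, t)) / (1 - Math.pow(beta1, t));  // bias correction
INDArray update = m.mul(alpha).div(Transforms.sqrt(v, true).add(epsilon));
gradient.assign(update);            // the update is written back into the gradient array
```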

AdaDeltaUpdater

[source]

AdaDelta updater. A more robust AdaGrad that keeps track of a moving window average of the gradient rather than the ever-decaying learning rates of AdaGrad.

applyUpdater

Get the updated gradient for the given gradient and also update the state of AdaDelta.

  • param gradient the gradient to get the updated gradient for
  • param iteration
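
A hedged sketch of an AdaDelta-style step (not the library code); rho, epsilon, and array sizes are placeholders:

```java
// Illustrative AdaDelta update: decaying averages of squared gradients and squared updates,
// so no explicit learning rate appears in the rule.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray msg = Nd4j.zeros(3);      // moving average of squared gradients
INDArray msdx = Nd4j.zeros(3);     // moving average of squared updates
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double rho = 0.95, epsilon = 1e-6;

msg = msg.mul(rho).add(gradient.mul(gradient).mul(1 - rho));
INDArray rmsUpdate = Transforms.sqrt(msdx.add(epsilon), true);
INDArray rmsGrad = Transforms.sqrt(msg.add(epsilon), true);
INDArray update = gradient.mul(rmsUpdate).div(rmsGrad);
msdx = msdx.mul(rho).add(update.mul(update).mul(1 - rho));   // track the squared updates as state
gradient.assign(update);           // the update replaces the gradient in place
```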

SgdUpdater

[source]

The SGD updater applies a learning rate only.


GradientUpdater

Gradient modifications: Calculates an update and tracks related information for gradient changes over time for handling updates.
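
A hedged sketch of driving an updater directly through this interface, assuming the stateSize/instantiate methods of the updater configuration classes in the ND4J learning package; in normal use the network handles this internally:

```java
// Illustrative only: instantiate a GradientUpdater from its configuration and apply it to a gradient.
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.learning.GradientUpdater;
import org.nd4j.linalg.learning.config.Adam;

long numParams = 3;
Adam config = new Adam(1e-3);
INDArray state = Nd4j.zeros(1, (int) config.stateSize(numParams));   // updater state view (m and v for Adam)
GradientUpdater updater = config.instantiate(state, true);

INDArray gradient = Nd4j.create(new double[][]{{0.1, -0.2, 0.05}});  // row vector matching the state layout
updater.applyUpdater(gradient, 0, 0);   // the computed update is written back into the gradient
```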


AMSGradUpdater

The AMSGrad updater. Reference: On the Convergence of Adam and Beyond - https://openreview.net/forum?id=ryQu7f-RZ
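
A hedged sketch of an AMSGrad-style step (not the library code), which keeps a running maximum of the second-moment estimate so the effective step size cannot grow; constants and sizes are placeholders:

```java
// Illustrative AMSGrad update: like Adam, but the denominator uses vHatMax = max(vHatMax, v).
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

INDArray m = Nd4j.zeros(3);          // first moment
INDArray v = Nd4j.zeros(3);          // second moment
INDArray vHatMax = Nd4j.zeros(3);    // running maximum of the second moment
INDArray gradient = Nd4j.create(new double[]{0.1, -0.2, 0.05});
double lr = 0.001, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8;

m = m.mul(beta1).add(gradient.mul(1 - beta1));
v = v.mul(beta2).add(gradient.mul(gradient).mul(1 - beta2));
vHatMax = Transforms.max(vHatMax, v);                                 // never let the denominator shrink
INDArray update = m.mul(lr).div(Transforms.sqrt(vHatMax, true).add(epsilon));
gradient.assign(update);             // the updater overwrites the gradient in place
```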