Update '*var' according to the AdaMax algorithm.
tf.raw_ops.ResourceApplyAdaMax(
    var,
    m,
    v,
    beta1_power,
    lr,
    beta1,
    beta2,
    epsilon,
    grad,
    use_locking=False,
    name=None
)
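$$
\begin{aligned}
m_t &\leftarrow \text{beta1} \cdot m_{t-1} + (1 - \text{beta1}) \cdot g \\
v_t &\leftarrow \max(\text{beta2} \cdot v_{t-1},\ |g|) \\
\text{variable} &\leftarrow \text{variable} - \frac{\text{learning\_rate}}{1 - \text{beta1}^t} \cdot \frac{m_t}{v_t + \text{epsilon}}
\end{aligned}
$$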
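The update rule above can be sketched in plain Python to make the arithmetic concrete. This is only an illustrative reference over flat lists of floats, not the TensorFlow kernel: the helper name `adamax_step` is hypothetical, and it assumes that `beta1_power` holds `beta1` raised to the current step count `t`, that `lr` plays the role of `learning_rate`, and that `grad` is `g`.

```python
def adamax_step(var, m, v, grad, beta1_power, lr, beta1, beta2, epsilon):
    """One AdaMax update over flat lists of floats (hypothetical helper).

    Returns new (var, m, v) lists instead of updating in place, unlike the
    real op, which mutates the resource variables.
    """
    # m_t <- beta1 * m_{t-1} + (1 - beta1) * g
    new_m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, grad)]
    # v_t <- max(beta2 * v_{t-1}, abs(g))
    new_v = [max(beta2 * vi, abs(gi)) for vi, gi in zip(v, grad)]
    # Bias-corrected step size: learning_rate / (1 - beta1^t)
    step = lr / (1.0 - beta1_power)
    # variable <- variable - step * m_t / (v_t + epsilon)
    new_var = [
        xi - step * mi / (vi + epsilon)
        for xi, mi, vi in zip(var, new_m, new_v)
    ]
    return new_var, new_m, new_v


# First step (t = 1), so beta1_power is assumed to be beta1 ** 1.
beta1, beta2 = 0.9, 0.999
var, m, v = [1.0, 2.0], [0.0, 0.0], [0.0, 0.0]
var, m, v = adamax_step(
    var, m, v,
    grad=[0.1, -0.2],
    beta1_power=beta1 ** 1,
    lr=0.001,
    beta1=beta1,
    beta2=beta2,
    epsilon=1e-7,
)
```

In this sketch the caller is responsible for tracking the step count and passing the matching `beta1_power` on each call, which mirrors how the op takes it as an explicit scalar argument.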
| Args | |
|---|---|
| `var` | A `Tensor` of type `resource`. Should be from a Variable(). |
| `m` | A `Tensor` of type `resource`. Should be from a Variable(). |
| `v` | A `Tensor` of type `resource`. Should be from a Variable(). |
| `beta1_power` | A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `int64`, `qint8`, `quint8`, `qint32`, `bfloat16`, `qint16`, `quint16`, `uint16`, `complex128`, `half`, `uint32`, `uint64`. Must be a scalar. |
| `lr` | A `Tensor`. Must have the same type as `beta1_power`. Scaling factor. Must be a scalar. |
| `beta1` | A `Tensor`. Must have the same type as `beta1_power`. Momentum factor. Must be a scalar. |
| `beta2` | A `Tensor`. Must have the same type as `beta1_power`. Momentum factor. Must be a scalar. |
| `epsilon` | A `Tensor`. Must have the same type as `beta1_power`. Ridge term. Must be a scalar. |
| `grad` | A `Tensor`. Must have the same type as `beta1_power`. The gradient. |
| `use_locking` | An optional `bool`. Defaults to `False`. If `True`, updating of the `var`, `m`, and `v` tensors will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention. |
| `name` | A name for the operation (optional). |


| Returns |
|---|
| The created `Operation`. |