Optimizer that implements the Adadelta algorithm.
Inherits From: Optimizer
tf.compat.v1.train.AdadeltaOptimizer(
    learning_rate=0.001,
    rho=0.95,
    epsilon=1e-08,
    use_locking=False,
    name='Adadelta'
)
Migrate to TF2
tf.compat.v1.train.AdadeltaOptimizer is compatible with eager mode and
tf.function.
When eager execution is enabled, learning_rate, rho,
and epsilon can each be a callable that
takes no arguments and returns the actual value to use. This can be useful
for changing these values across different invocations of optimizer
functions.
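The callable form described above can be sketched as follows. This is a minimal illustration that assumes TF2 with eager execution enabled (the default); the `state` dictionary is a hypothetical holder introduced only for this example and is not part of the API.

```python
import tensorflow as tf

# Hypothetical mutable holder for the current learning rate (illustration only).
state = {"lr": 0.001}

optimizer = tf.compat.v1.train.AdadeltaOptimizer(
    learning_rate=lambda: state["lr"],  # zero-argument callable
    rho=0.95,
    epsilon=1e-08,
)

x = tf.Variable([1.0, 2.0, 3.0])
grad = tf.constant([0.1, 0.2, 0.3])

# The callable is evaluated when the optimizer function is invoked,
# so this step reads learning_rate = 0.001.
optimizer.apply_gradients(zip([grad], [x]))

# Changing the held value affects the next invocation.
state["lr"] = 0.0005
optimizer.apply_gradients(zip([grad], [x]))
```

A plain float would be fixed at construction time; the callable lets the value change between calls without rebuilding the optimizer.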
To switch to native TF2 style, use tf.keras.optimizers.Adadelta
instead. Note that, because of implementation differences,
tf.keras.optimizers.Adadelta and
tf.compat.v1.train.AdadeltaOptimizer may differ slightly in
floating-point numerics, even though both use the same formula for the
variable updates.
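For orientation, the shared update rule can be sketched in plain Python. This is an illustrative reading of the standard Adadelta rule scaled by learning_rate, not the TensorFlow kernel itself; the helper name `adadelta_step` and its list-based state are assumptions made for this example.

```python
import math


def adadelta_step(var, grad, accum, accum_update,
                  lr=0.001, rho=0.95, eps=1e-08):
    """Apply one Adadelta update to lists of floats and return the new state.

    accum        : running average of squared gradients
    accum_update : running average of squared updates
    """
    new_var, new_accum, new_accum_update = [], [], []
    for v, g, a, au in zip(var, grad, accum, accum_update):
        # Decayed average of squared gradients.
        a = rho * a + (1.0 - rho) * g * g
        # Ratio of RMS(previous updates) to RMS(gradients), times the gradient.
        step = math.sqrt(au + eps) / math.sqrt(a + eps) * g
        # Decayed average of squared updates.
        au = rho * au + (1.0 - rho) * step * step
        new_var.append(v - lr * step)
        new_accum.append(a)
        new_accum_update.append(au)
    return new_var, new_accum, new_accum_update


var = [1.0, 2.0, 3.0]
grad = [0.1, 0.2, 0.3]
accum = [0.0, 0.0, 0.0]
accum_update = [0.0, 0.0, 0.0]

var, accum, accum_update = adadelta_step(var, grad, accum, accum_update)
```

Because both accumulators start at zero, the first steps are very small. They are governed mostly by epsilon, which is why the differing epsilon defaults between the two optimizers can matter early in training.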
Structural mapping to native TF2
Before:
optimizer = tf.compat.v1.train.AdadeltaOptimizer(
    learning_rate=learning_rate,
    rho=rho,
    epsilon=epsilon)
After:
optimizer = tf.keras.optimizers.Adadelta(
    learning_rate=learning_rate,
    rho=rho,
    epsilon=epsilon)
How to map arguments
| TF1 Arg Name | TF2 Arg Name | Note |
|---|---|---|
| learning_rate | learning_rate | Be careful when setting learning_rate to a tensor value computed from the global step. In TF1 this usually implied a dynamic learning rate that was recomputed at each step. In TF2 (eager + function) it is treated as a scalar value that is computed only once, rather than a symbolic placeholder that is recomputed each time. |
| rho | rho | - |
| epsilon | epsilon | Default value is 1e-08 in TF1, but 1e-07 in TF2. |
| use_locking | - | Not applicable in TF2. |
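One way to keep a step-dependent learning rate after migrating is to pass a learning rate schedule object to the Keras optimizer, so the rate is derived from the optimizer's own iteration counter. The sketch below assumes that approach; the source only describes the pitfall, and the decay parameters here are illustrative values, not recommendations. It also passes epsilon explicitly to keep the TF1 default.

```python
import tensorflow as tf

# Illustrative decay parameters (assumptions for this example only).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001,
    decay_steps=1000,
    decay_rate=0.5,
)

optimizer = tf.keras.optimizers.Adadelta(
    learning_rate=schedule,  # evaluated from the optimizer's step count
    rho=0.95,
    epsilon=1e-08,           # TF1 default; the TF2 default is 1e-07
)

x = tf.Variable([1.0, 2.0, 3.0])
grad = tf.constant([0.1, 0.2, 0.3])

# Each call advances optimizer.iterations, which the schedule reads.
optimizer.apply_gradients(zip([grad], [x]))
```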
Before & after usage example
Before:
import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.compat.v1.train.AdadeltaOptimizer(learning_rate=0.001)
# Apply one Adadelta update to x using the given gradient.
optimizer.apply_gradients(zip([grad], [x]))
After:
import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.keras.optimizers.Adadelta(learning_rate=0.001)
# Apply one Adadelta update to x using the given gradient.
optimizer.apply_gradients(zip([grad], [x]))