tf.compat.v1.train.AdadeltaOptimizer

Optimizer that implements the Adadelta algorithm.

Inherits From: Optimizer

Migrate to TF2

tf.compat.v1.train.AdadeltaOptimizer is compatible with eager mode and tf.function. When eager execution is enabled, learning_rate, rho, and epsilon can each be a callable that takes no arguments and returns the actual value to use. This can be useful for changing these values across different invocations of optimizer functions.
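As a minimal sketch of the callable form described above, the snippet below passes a zero-argument lambda as learning_rate and changes the value it returns between two calls. It assumes eager execution (the TF2 default). The lr_holder dictionary and the chosen values are hypothetical, used only for illustration.

```python
import tensorflow as tf

# Hypothetical mutable holder; any zero-argument callable would do.
lr_holder = {"value": 1.0}

optimizer = tf.compat.v1.train.AdadeltaOptimizer(
    learning_rate=lambda: lr_holder["value"])

x = tf.Variable([1.0, 2.0, 3.0])
grad = tf.constant([0.1, 0.2, 0.3])

# First call: the lambda is evaluated and returns 1.0.
optimizer.apply_gradients(zip([grad], [x]))
after_first = x.numpy().copy()

# Change the value the callable returns; the next call picks it up.
lr_holder["value"] = 0.0
optimizer.apply_gradients(zip([grad], [x]))
after_second = x.numpy().copy()
```

With a learning rate of 0.0 the second call should leave x where the first call put it, which is one way to confirm that the callable is re-evaluated on each invocation.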

To switch to native TF2 style, use tf.keras.optimizers.Adadelta instead. Note that, because of implementation differences, tf.keras.optimizers.Adadelta and tf.compat.v1.train.AdadeltaOptimizer may produce slightly different floating-point results even though both use the same formula for the variable updates.
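For reference, the update formula mentioned above can be sketched in NumPy. This is an illustrative sketch that assumes the standard Adadelta rule with the step scaled by the learning rate; it is not the library's kernel. The function name adadelta_step, its argument names, and the default values shown are assumptions for this example.

```python
import numpy as np

def adadelta_step(var, grad, accum, accum_update,
                  lr=0.001, rho=0.95, epsilon=1e-8):
    """One Adadelta step (illustrative sketch, not the TensorFlow kernel)."""
    # Running average of squared gradients.
    accum = rho * accum + (1.0 - rho) * grad ** 2
    # Step scaled by the ratio of the two RMS terms.
    update = np.sqrt(accum_update + epsilon) / np.sqrt(accum + epsilon) * grad
    # Running average of squared updates.
    accum_update = rho * accum_update + (1.0 - rho) * update ** 2
    # Apply the step, scaled by the learning rate.
    var = var - lr * update
    return var, accum, accum_update

x = np.array([1.0, 2.0, 3.0])
grad = np.array([0.1, 0.2, 0.3])
new_x, accum, accum_update = adadelta_step(
    x, grad, np.zeros_like(x), np.zeros_like(x))
```

Because epsilon appears inside both square roots, changing it alters the size of the first steps, when accum_update is still zero.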

Structural mapping to native TF2

Before:

optimizer = tf.compat.v1.train.AdadeltaOptimizer(
  learning_rate=learning_rate,
  rho=rho,
  epsilon=epsilon)

After:

optimizer = tf.keras.optimizers.Adadelta(
  learning_rate=learning_rate,
  rho=rho,
  epsilon=epsilon)

How to map arguments

| TF1 Arg Name | TF2 Arg Name | Note |
| --- | --- | --- |
| learning_rate | learning_rate | Be careful when passing a learning_rate tensor computed from the global step. In TF1 this usually implied a dynamic learning rate that was recomputed at each step. In TF2 (eager + function) the tensor is treated as a scalar value that is computed only once, not as a symbolic placeholder that is re-evaluated each time. |
| rho | rho | - |
| epsilon | epsilon | Default value is 1e-08 in TF1, but 1e-07 in TF2. |
| use_locking | - | Not applicable in TF2. |
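The two notes above can be sketched together. The snippet uses a tf.keras.optimizers.schedules.ExponentialDecay schedule as one possible way to get a step-dependent learning rate in TF2, and sets epsilon explicitly to the TF1 default. The schedule type and all numeric values are assumptions chosen for illustration, not a prescribed migration path.

```python
import tensorflow as tf

# Hypothetical decay settings, for illustration only.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1.0,
    decay_steps=1000,
    decay_rate=0.96)

# The schedule object is evaluated by the optimizer at each step,
# unlike a tensor value that was computed once up front.
optimizer = tf.keras.optimizers.Adadelta(
    learning_rate=schedule,
    rho=0.95,
    epsilon=1e-8)  # set explicitly to match the TF1 default

# The schedule can also be called directly to inspect its value.
lr_at_start = float(schedule(0))
lr_after_decay = float(schedule(1000))
```

Setting epsilon explicitly is only needed when the goal is to stay close to the TF1 numerics; otherwise the TF2 default applies.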

Before & after usage example

Before:

import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.compat.v1.train.AdadeltaOptimizer(learning_rate=0.001)
# Apply one Adadelta update to x using the given gradient.
optimizer.apply_gradients(zip([grad], [x]))

After:

import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.keras.optimizers.Adadelta(learning_rate=0.001)
# Apply one Adadelta update to x using the given gradient.
optimizer.apply_gradients(zip([grad], [x]))