tf.compat.v1.train.AdamOptimizer

Optimizer that implements the Adam algorithm.

Inherits From: Optimizer

Migrate to TF2

tf.compat.v1.train.AdamOptimizer is compatible with eager mode and tf.function. When eager execution is enabled, learning_rate, beta1, beta2, and epsilon can each be a callable that takes no arguments and returns the actual value to use. This can be useful for changing these values across different invocations of optimizer functions.
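As a minimal sketch of the callable form described above, assuming TF 2.x with eager execution enabled (the default): the `hparams` dictionary is a hypothetical holder introduced only for illustration. Any zero-argument callable should work the same way.

```python
import tensorflow as tf

# Hypothetical mutable holder for the learning rate (not part of the API).
hparams = {"learning_rate": 0.001}

# Pass a zero-argument callable instead of a fixed float.
optimizer = tf.compat.v1.train.AdamOptimizer(
    learning_rate=lambda: hparams["learning_rate"])

x = tf.Variable([1.0, 2.0, 3.0])
grad = tf.constant([0.1, 0.2, 0.3])

# First invocation: the callable returns 0.001.
optimizer.apply_gradients(zip([grad], [x]))
after_first = x.numpy().copy()

# Change the value between invocations; the next call picks up 0.0001.
hparams["learning_rate"] = 0.0001
optimizer.apply_gradients(zip([grad], [x]))
after_second = x.numpy().copy()
```

With a constant gradient, each of these early Adam steps moves every element by roughly the current learning rate, so the second step here should be about ten times smaller than the first.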

To switch to native TF2 style, use tf.keras.optimizers.Adam instead. Note that, because the two implementations differ, tf.keras.optimizers.Adam and tf.compat.v1.train.AdamOptimizer may produce slightly different floating-point results even though both use the same formula for the variable updates.

Structural Mapping to Native TF2

Before:

optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)

After:

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

How to Map Arguments

| TF1 Arg Name | TF2 Arg Name | Note |
|---|---|---|
| learning_rate | learning_rate | Be careful when setting learning_rate to a tensor value computed from the global step. In TF1 this usually implied a dynamic learning rate that was recomputed at each step. In TF2 (eager + function) it is treated as a scalar value that is computed only once, not as a symbolic placeholder recomputed each time. |
| beta1 | beta_1 | |
| beta2 | beta_2 | |
| epsilon | epsilon | Default value is 1e-08 in TF1, but 1e-07 in TF2. |
| use_locking | N/A | Not applicable in TF2. |
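The mapping above can be sketched as follows. This is one possible way to apply it, not the only one: it assumes a learning-rate schedule object is an acceptable replacement for a TF1 learning rate computed from the global step, and the `decay_steps` and `decay_rate` values are purely illustrative. Setting `epsilon=1e-8` explicitly keeps the TF1 default.

```python
import tensorflow as tf

# A schedule is evaluated as a function of the step, so the learning rate
# stays dynamic instead of being computed once as a scalar.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001,
    decay_steps=1000,   # illustrative value
    decay_rate=0.96)    # illustrative value

optimizer = tf.keras.optimizers.Adam(
    learning_rate=schedule,
    beta_1=0.9,      # TF1 name: beta1
    beta_2=0.999,    # TF1 name: beta2
    epsilon=1e-8)    # TF1 default; the TF2 default is 1e-7
# use_locking has no TF2 counterpart and is simply dropped.
```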

Before & After Usage Example

Before:

import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
# Apply one Adam update to x using the precomputed gradient.
optimizer.apply_gradients(zip([grad], [x]))

After:

import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
# Apply one Adam update to x using the precomputed gradient.
optimizer.apply_gradients(zip([grad], [x]))
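To see why the differing epsilon defaults change results only slightly, here is a plain-Python sketch of a single Adam step on a scalar. It is not either library's implementation. It assumes the form of the update in which epsilon is added to the square root of the uncorrected second-moment estimate and the bias correction is folded into the step size. The `adam_step` helper is hypothetical.

```python
import math

def adam_step(param, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a scalar; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    # Bias correction folded into the effective step size.
    lr_t = lr * math.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    param = param - lr_t * m / (math.sqrt(v) + eps)
    return param, m, v

# Same inputs, differing only in epsilon (TF1 default vs. TF2 default).
p_tf1_eps, _, _ = adam_step(1.0, 0.1, 0.0, 0.0, t=1, eps=1e-8)
p_tf2_eps, _, _ = adam_step(1.0, 0.1, 0.0, 0.0, t=1, eps=1e-7)
```

Both results land very close to 1.0 - 0.001. The larger epsilon yields a marginally smaller step, so passing epsilon=1e-8 to tf.keras.optimizers.Adam is one way to remove this particular source of divergence when migrating.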