Optimizer that implements the Adam algorithm.
Inherits From: Optimizer
tf.compat.v1.train.AdamOptimizer(
    learning_rate=0.001,
    beta1=0.9,
    beta2=0.999,
    epsilon=1e-08,
    use_locking=False,
    name='Adam'
)
Migrate to TF2
tf.compat.v1.train.AdamOptimizer is compatible with eager mode and
tf.function.
When eager execution is enabled, learning_rate, beta1, beta2, and
epsilon can each be a callable that takes no arguments and returns the
actual value to use. This can be useful for changing these values across
different invocations of optimizer functions.
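As a rough illustration of the callable form, the sketch below passes a zero-argument lambda as learning_rate and changes the value it returns between two update steps. It assumes TensorFlow 2.x, where eager execution is on by default. The current_lr dictionary is a hypothetical holder introduced only for this example.

```python
import tensorflow as tf

# Hypothetical mutable holder; the optimizer reads it through the lambda below.
current_lr = {"value": 0.1}

x = tf.Variable([1.0, 2.0, 3.0])
grad = tf.constant([0.1, 0.2, 0.3])

# With eager execution enabled, the callable is evaluated on each
# call to apply_gradients, so later changes to current_lr take effect.
optimizer = tf.compat.v1.train.AdamOptimizer(
    learning_rate=lambda: current_lr["value"])

optimizer.apply_gradients(zip([grad], [x]))  # step taken with learning rate 0.1

current_lr["value"] = 0.01
optimizer.apply_gradients(zip([grad], [x]))  # step taken with learning rate 0.01

print(x.numpy())
```

Because the gradient is held constant here, each step should move every element by roughly the learning rate in effect at that step, which makes the change between the two calls easy to see.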
To switch to native TF2 style, use tf.keras.optimizers.Adam
instead. Note that, because of implementation differences,
tf.keras.optimizers.Adam and tf.compat.v1.train.AdamOptimizer may
differ slightly in floating-point numerics, even though both use the
same formula for the variable updates.
Structural Mapping to Native TF2
Before:
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
After:
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
How to Map Arguments
| TF1 Arg Name | TF2 Arg Name | Note |
|---|---|---|
| learning_rate | learning_rate | Be careful when setting learning_rate to a tensor computed from the global step. In TF1 this usually implied a dynamic learning rate that was recomputed at each step. In TF2 (eager + function), the tensor is treated as a scalar value that is computed only once, not as a symbolic placeholder that is re-evaluated each time. |
| beta1 | beta_1 | |
| beta2 | beta_2 | |
| epsilon | epsilon | Default value is 1e-08 in TF1, but 1e-07 in TF2. |
| use_locking | N/A | Not applicable in TF2. |
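One possible way to keep a step-dependent learning rate in TF2 is to pass a schedule object instead of a precomputed tensor. The sketch below uses tf.keras.optimizers.schedules.ExponentialDecay; the decay settings are illustrative assumptions, not values from this page.

```python
import tensorflow as tf

# Illustrative decay settings (assumed for this sketch).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001, decay_steps=1000, decay_rate=0.96)

# The optimizer evaluates the schedule against its own iteration
# counter, so the learning rate is recomputed at each step.
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer.apply_gradients(zip([grad], [x]))

# The schedule can also be queried directly for a given step.
lr_start = float(schedule(0))
lr_later = float(schedule(1000))
print(lr_start, lr_later)
```

A schedule keeps the decay logic inside the optimizer, which avoids the "computed once" pitfall described in the learning_rate note.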
Before & After Usage Example
Before:
import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
# Apply one Adam update to x using the precomputed gradient.
optimizer.apply_gradients(zip([grad], [x]))
After:
import tensorflow as tf

x = tf.Variable([1, 2, 3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
# Apply one Adam update to x using the precomputed gradient.
optimizer.apply_gradients(zip([grad], [x]))
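Since the default epsilon differs between the two optimizers, a migration that needs to stay close to the TF1 numerics may want to set epsilon explicitly. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution. It runs one step with each optimizer on identical inputs and measures the largest element-wise difference. The variable names x_tf1, x_tf2, and max_diff are introduced only for this example.

```python
import numpy as np
import tensorflow as tf

grad = tf.constant([0.1, 0.2, 0.3])

# TF1-style optimizer; epsilon defaults to 1e-08.
x_tf1 = tf.Variable([1, 2, 3], dtype=tf.float32)
tf1_optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
tf1_optimizer.apply_gradients(zip([grad], [x_tf1]))

# Keras optimizer with epsilon set to the TF1 default instead of 1e-07.
x_tf2 = tf.Variable([1, 2, 3], dtype=tf.float32)
tf2_optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, epsilon=1e-08)
tf2_optimizer.apply_gradients(zip([grad], [x_tf2]))

# Largest element-wise difference between the two updated variables.
max_diff = np.max(np.abs(x_tf1.numpy() - x_tf2.numpy()))
print(max_diff)
```

Small nonzero differences are still possible because of the implementation differences mentioned earlier.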