Long Short-Term Memory layer - Hochreiter 1997.
Inherits From: RNN, Layer, Operation
tf.keras.layers.LSTM(
units,
activation='tanh',
recurrent_activation='sigmoid',
use_bias=True,
kernel_initializer='glorot_uniform',
recurrent_initializer='orthogonal',
bias_initializer='zeros',
unit_forget_bias=True,
kernel_regularizer=None,
recurrent_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
recurrent_constraint=None,
bias_constraint=None,
dropout=0.0,
recurrent_dropout=0.0,
seed=None,
return_sequences=False,
return_state=False,
go_backwards=False,
stateful=False,
unroll=False,
use_cudnn='auto',
**kwargs
)
Based on the available runtime hardware and constraints, this layer chooses between implementations (cuDNN-based or backend-native) to maximize performance. If a GPU is available and all the arguments to the layer meet the requirements of the cuDNN kernel (listed below), the layer uses a fast cuDNN implementation when running on the TensorFlow backend. The requirements for using the cuDNN implementation are:
- activation == tanh
- recurrent_activation == sigmoid
- dropout == 0 and recurrent_dropout == 0
- unroll is False
- use_bias is True
- Inputs, if masking is used, are strictly right-padded.
- Eager execution is enabled in the outermost context.
For example:
import numpy as np
import keras

inputs = np.random.random((32, 10, 8))
lstm = keras.layers.LSTM(4)
output = lstm(inputs)
output.shape  # (32, 4)

lstm = keras.layers.LSTM(4, return_sequences=True, return_state=True)
whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
whole_seq_output.shape  # (32, 10, 4)
final_memory_state.shape  # (32, 4)
final_carry_state.shape  # (32, 4)
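
The argument-level cuDNN requirements listed above can be written down as a small check. This is a minimal sketch, not part of Keras: `meets_cudnn_argument_requirements` is a hypothetical helper that only mirrors the constructor-argument conditions. It does not detect a GPU, inspect mask padding, or check for eager execution, so a `True` result does not guarantee that the cuDNN kernel is actually selected.

```python
def meets_cudnn_argument_requirements(
    activation="tanh",
    recurrent_activation="sigmoid",
    dropout=0.0,
    recurrent_dropout=0.0,
    unroll=False,
    use_bias=True,
):
    """Hypothetical helper (not a Keras API).

    Returns True when the constructor arguments satisfy the argument-level
    conditions listed in this page. Hardware, masking, and eager-execution
    conditions are deliberately out of scope.
    """
    return (
        activation == "tanh"
        and recurrent_activation == "sigmoid"
        and dropout == 0
        and recurrent_dropout == 0
        and unroll is False
        and use_bias is True
    )


# The layer defaults satisfy every argument-level condition.
defaults_ok = meets_cudnn_argument_requirements()

# Any non-zero recurrent dropout rules the cuDNN path out.
with_recurrent_dropout = meets_cudnn_argument_requirements(recurrent_dropout=0.2)

print(defaults_ok, with_recurrent_dropout)
```

A check like this can be handy in configuration code that wants to warn early when a hyperparameter choice is likely to push the layer onto the slower backend-native path.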
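| Args | |
|---|---|
| units | Positive integer, dimensionality of the output space. |
| activation | Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). |
| recurrent_activation | Activation function to use for the recurrent step. Default: sigmoid (sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). |
| use_bias | Boolean (default True), whether the layer should use a bias vector. |
| kernel_initializer | Initializer for the kernel weights matrix, used for the linear transformation of the inputs. Default: "glorot_uniform". |
| recurrent_initializer | Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. Default: "orthogonal". |
| bias_initializer | Initializer for the bias vector. Default: "zeros". |
| unit_forget_bias | Boolean (default True). If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in Jozefowicz et al. |
| kernel_regularizer | Regularizer function applied to the kernel weights matrix. Default: None. |
| recurrent_regularizer | Regularizer function applied to the recurrent_kernel weights matrix. Default: None. |
| bias_regularizer | Regularizer function applied to the bias vector. Default: None. |
| activity_regularizer | Regularizer function applied to the output of the layer (its "activation"). Default: None. |
| kernel_constraint | Constraint function applied to the kernel weights matrix. Default: None. |
| recurrent_constraint | Constraint function applied to the recurrent_kernel weights matrix. Default: None. |
| bias_constraint | Constraint function applied to the bias vector. Default: None. |
| dropout | Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0. |
| recurrent_dropout | Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0. |
| seed | Random seed for dropout. |
| return_sequences | Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False. |
| return_state | Boolean. Whether to return the last state in addition to the output. Default: False. |
| go_backwards | Boolean (default: False). If True, process the input sequence backwards and return the reversed sequence. |
| stateful | Boolean (default: False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. |
| unroll | Boolean (default: False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. |
| use_cudnn | Whether to use a cuDNN-backed implementation. "auto" will attempt to use cuDNN when feasible, and will fall back to the default implementation if not. |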
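
To make the roles of the constructor arguments more concrete, here is a minimal NumPy sketch of an LSTM forward pass. It is an illustration under stated assumptions, not the Keras implementation: the helper names `init_lstm_weights` and `lstm_forward` are hypothetical, the weights are drawn from a small uniform range rather than the "glorot_uniform" and "orthogonal" initializers, and the gates are assumed to be packed in the order input, forget, cell candidate, output. Dropout, regularizers, and constraints are left out.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm_weights(input_dim, units, unit_forget_bias=True, seed=0):
    """Hypothetical initializer; uniform noise stands in for the real initializers."""
    rng = np.random.default_rng(seed)
    # Four gates share one matrix, packed along the last axis (assumed order: i, f, c, o).
    kernel = rng.uniform(-0.1, 0.1, size=(input_dim, 4 * units))
    recurrent_kernel = rng.uniform(-0.1, 0.1, size=(units, 4 * units))
    bias = np.zeros(4 * units)
    if unit_forget_bias:
        # Add 1 to the forget-gate slice so the gate starts mostly open.
        bias[units:2 * units] = 1.0
    return kernel, recurrent_kernel, bias


def lstm_forward(inputs, kernel, recurrent_kernel, bias,
                 return_sequences=False, go_backwards=False):
    """Run the recurrence over inputs of shape (batch, timesteps, feature)."""
    batch, timesteps, _ = inputs.shape
    units = recurrent_kernel.shape[0]
    h = np.zeros((batch, units))  # memory (hidden) state
    c = np.zeros((batch, units))  # carry (cell) state
    steps = range(timesteps - 1, -1, -1) if go_backwards else range(timesteps)
    outputs = []
    for t in steps:
        z = inputs[:, t, :] @ kernel + h @ recurrent_kernel + bias
        i, f, g, o = np.split(z, 4, axis=-1)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # recurrent_activation
        c = f * c + i * np.tanh(g)                    # activation on the candidate
        h = o * np.tanh(c)                            # activation on the carry state
        outputs.append(h)
    out = np.stack(outputs, axis=1) if return_sequences else h
    return out, h, c


inputs = np.random.default_rng(1).random((32, 10, 8))
kernel, recurrent_kernel, bias = init_lstm_weights(input_dim=8, units=4)
last, h, c = lstm_forward(inputs, kernel, recurrent_kernel, bias)
seq, _, _ = lstm_forward(inputs, kernel, recurrent_kernel, bias, return_sequences=True)
print(last.shape, seq.shape)
```

With `unit_forget_bias=True` and zero pre-activations, the forget gate starts at sigmoid(1), roughly 0.73 instead of 0.5, which biases the cell toward retaining its carry state early in training. The last slice of the full sequence matches the single output returned when `return_sequences=False`.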
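| Call arguments | |
|---|---|
| inputs | A 3D tensor, with shape (batch, timesteps, feature). |
| mask | Binary tensor of shape (samples, timesteps) indicating whether a given timestep should be masked (optional). An individual True entry indicates that the corresponding timestep should be utilized, while a False entry indicates that the corresponding timestep should be ignored. Defaults to None. |
| training | Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the cell when calling it. This is only relevant if dropout or recurrent_dropout is used (optional). Defaults to None. |
| initial_state | List of initial state tensors to be passed to the first call of the cell (optional, None causes creation of zero-filled initial state tensors). Defaults to None. |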
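
The mask and initial_state semantics can be illustrated with a toy recurrence. The sketch below is not an LSTM and not Keras code: `masked_rnn` is a hypothetical function using the update h_t = tanh(mean(x_t) + h_{t-1}), and it assumes that an ignored timestep (mask entry False) leaves the state unchanged, which is my understanding of how masked steps are handled. Under that assumption, a right-padded sequence yields the same final state as the unpadded sequence.

```python
import numpy as np


def masked_rnn(inputs, mask=None, initial_state=None):
    """Toy recurrence with True = use timestep, False = ignore timestep."""
    batch, timesteps, _ = inputs.shape
    if initial_state is None:
        # None means a zero-filled initial state.
        h = np.zeros((batch, 1))
    else:
        h = np.asarray(initial_state, dtype=float)
    if mask is None:
        mask = np.ones((batch, timesteps), dtype=bool)
    for t in range(timesteps):
        candidate = np.tanh(inputs[:, t, :].mean(axis=-1, keepdims=True) + h)
        keep = mask[:, t][:, None]
        # Assumption: ignored timesteps carry the previous state forward.
        h = np.where(keep, candidate, h)
    return h


rng = np.random.default_rng(0)
inputs = rng.random((2, 5, 3))
# Sample 0 is right-padded after 3 real timesteps; sample 1 uses all 5.
mask = np.array([
    [True, True, True, False, False],
    [True, True, True, True, True],
])

padded = masked_rnn(inputs, mask=mask)
unpadded = masked_rnn(inputs[:1, :3, :])
print(np.allclose(padded[0], unpadded[0]))

# A non-zero initial state changes the result for the same inputs.
warm = masked_rnn(inputs, mask=mask, initial_state=np.full((2, 1), 0.5))
```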
| Attributes | |
|---|---|
| activation | |
| bias_constraint | |
| bias_initializer | |
| bias_regularizer | |
| dropout | |
| input | Retrieves the input tensor(s) of a symbolic operation. Only returns the tensor(s) corresponding to the first time the operation was called. |
| kernel_constraint | |
| kernel_initializer | |
| kernel_regularizer | |
| output | Retrieves the output tensor(s) of a layer. Only returns the tensor(s) corresponding to the first time the operation was called. |
| recurrent_activation | |
| recurrent_constraint | |
| recurrent_dropout | |
| recurrent_initializer | |
| recurrent_regularizer | |
| unit_forget_bias | |
| units | |
| use_bias | |
Methods
from_config
@classmethod
from_config(
config
)
Creates a layer from its config.
This method is the reverse of get_config,
capable of instantiating the same layer from the config
dictionary. It does not handle layer connectivity
(handled by Network), nor weights (handled by set_weights).
| Args | |
|---|---|
| config | A Python dictionary, typically the output of get_config. |
| Returns | |
|---|---|
| A layer instance. |
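
As a short illustration of the round trip described above, the following sketch rebuilds a layer from its config. It assumes a Keras 3 installation importable as `keras` (the same layer is exposed as `tf.keras.layers.LSTM`); the variable names are illustrative. The clone shares hyperparameters with the original but not weights, which would have to be transferred separately with `get_weights` and `set_weights` once both layers are built.

```python
import keras

# Configure a layer with a few non-default arguments.
layer = keras.layers.LSTM(4, return_sequences=True, dropout=0.1)

# get_config returns a plain dictionary of constructor arguments.
config = layer.get_config()

# from_config instantiates a new, independent layer from that dictionary.
clone = keras.layers.LSTM.from_config(config)

print(clone.units, clone.return_sequences, clone.dropout)
```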
get_initial_state
get_initial_state(
batch_size
)
inner_loop
inner_loop(
sequences, initial_state, mask, training=False
)
reset_state
reset_state()
reset_states
reset_states()
symbolic_call
symbolic_call(
*args, **kwargs
)