Create a Trajectory transitioning between StepTypes LAST and FIRST.
tf_agents.trajectories.boundary(
observation: tf_agents.typing.types.NestedSpecTensorOrArray,
action: tf_agents.typing.types.NestedSpecTensorOrArray,
policy_info: tf_agents.typing.types.NestedSpecTensorOrArray,
reward: tf_agents.typing.types.NestedSpecTensorOrArray,
discount: tf_agents.typing.types.SpecTensorOrArray
) -> tf_agents.trajectories.Trajectory
All inputs may be batched.
The input discount is used to infer the outer shape of the inputs: it is
always expected to have a scalar inner shape, so its full shape ([B], [T],
or [B, T]) is exactly the outer shape shared by all inputs.
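
To illustrate how the outer shape can be taken from discount, here is a minimal NumPy-only sketch. It is not the library's implementation: `SketchTrajectory` and `boundary_sketch` are hypothetical names introduced for this example, and only the field names and the StepType integer codes (FIRST=0, MID=1, LAST=2) follow TF-Agents. The sketch also assumes flat (non-nested) array inputs.

```python
import collections

import numpy as np

# Integer codes used by TF-Agents' StepType.
FIRST, MID, LAST = 0, 1, 2

# Hypothetical stand-in for tf_agents.trajectories.Trajectory (same field names).
SketchTrajectory = collections.namedtuple(
    "SketchTrajectory",
    ["step_type", "observation", "action", "policy_info",
     "next_step_type", "reward", "discount"])


def boundary_sketch(observation, action, policy_info, reward, discount):
    """Illustrative only: builds a LAST -> FIRST trajectory from arrays."""
    discount = np.asarray(discount)
    # discount has a scalar inner shape, so its shape is the outer shape:
    # [B], [T], or [B, T].
    outer_shape = discount.shape
    return SketchTrajectory(
        step_type=np.full(outer_shape, LAST, dtype=np.int32),
        observation=np.asarray(observation),
        action=np.asarray(action),
        policy_info=policy_info,
        next_step_type=np.full(outer_shape, FIRST, dtype=np.int32),
        reward=np.asarray(reward),
        discount=discount)


# A batch of B=2 boundary steps with 3-dimensional observations.
traj = boundary_sketch(
    observation=np.zeros((2, 3), dtype=np.float32),
    action=np.array([0, 1], dtype=np.int32),
    policy_info=(),
    reward=np.zeros(2, dtype=np.float32),
    discount=np.ones(2, dtype=np.float32))
print(traj.step_type.shape, traj.step_type, traj.next_step_type)
```

Both step-type fields take the shape of discount, which is why discount must carry the full outer shape even when the other inputs have extra inner dimensions.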
| Args | |
|---|---|
| observation | Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...]. |
| action | Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...]. |
| policy_info | Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...]. |
| reward | Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...]. |
| discount | Tensor or np.ndarray; shaped [B], [T], or [B, T] (optional). |
| Returns |
|---|
| A Trajectory instance. |