View source on GitHub
Hierarchical copy all-reduce implementation of CrossDeviceOps.
Inherits From: CrossDeviceOps
tf.distribute.HierarchicalCopyAllReduce(
num_packs=1
)
It reduces values to one GPU along the edges of a device hierarchy, then broadcasts the result back to each GPU along the same path. For the batch API, tensors are repacked or aggregated for more efficient cross-device transport.
This reduction was designed for the NVIDIA DGX-1 and assumes that GPUs are
connected the way they are on a DGX-1 machine. If your GPUs are interconnected
differently, it is likely to be slower than tf.distribute.ReductionToOneDevice.
For reduces that are not all-reduce, it falls back to
tf.distribute.ReductionToOneDevice.
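The reduce-then-broadcast pattern described above can be sketched in plain Python. This is a toy simulation, not TensorFlow's implementation: the function name `hierarchical_all_reduce`, the fixed two-level grouping, and the `group_size` parameter are assumptions made for illustration, and each simulated device holds a single number.

```python
def hierarchical_all_reduce(values, group_size):
    """Toy sum all-reduce over a two-level hierarchy (illustrative only).

    values[i] is the number held by simulated device i. Devices are split
    into consecutive groups of `group_size`; the first device of each group
    acts as its leader, and device 0 acts as the root.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")

    groups = [
        list(range(start, min(start + group_size, len(values))))
        for start in range(0, len(values), group_size)
    ]

    # Level 1: every group reduces its members onto the group leader.
    partial = {group[0]: sum(values[i] for i in group) for group in groups}

    # Level 2: the leaders reduce onto the root.
    total = sum(partial.values())

    # Broadcast back along the same path: root -> leaders -> members.
    result = [None] * len(values)
    for group in groups:
        leader = group[0]
        result[leader] = total
        for member in group[1:]:
            result[member] = result[leader]
    return result


print(hierarchical_all_reduce([1, 2, 3, 4, 5, 6, 7, 8], group_size=4))
```

With eight simulated devices split into two groups of four, every device ends up holding the full sum of all eight inputs.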
Here is how you can use HierarchicalCopyAllReduce in
tf.distribute.MirroredStrategy:
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
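| Args | |
|---|---|
| num_packs | A non-negative integer. The number of packs to split values into. If zero, no packing will be done. |

| Raises | |
|---|---|
| ValueError | If num_packs is negative. |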
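To make the role of num_packs concrete, here is a minimal sketch of one way a batch of tensors could be packed before transport and unpacked afterwards. The helper names `pack` and `unpack` are hypothetical, tensors are modeled as flat Python lists, and the concatenate-then-split scheme is an assumption for illustration rather than a description of TensorFlow internals.

```python
def pack(tensors, num_packs):
    """Concatenate flat tensors and split them into `num_packs` chunks.

    Hypothetical helper: num_packs == 0 means no packing, and a negative
    value raises ValueError, mirroring the constructor contract.
    """
    if num_packs < 0:
        raise ValueError("num_packs must be non-negative")
    if num_packs == 0:
        return [list(t) for t in tensors]  # leave every tensor as is

    flat = [x for t in tensors for x in t]
    base = len(flat) // num_packs
    # The first num_packs - 1 chunks have `base` elements each;
    # the last chunk takes whatever remains.
    packs = [flat[i * base:(i + 1) * base] for i in range(num_packs - 1)]
    packs.append(flat[(num_packs - 1) * base:])
    return packs


def unpack(packs, lengths):
    """Restore the original tensors from packs, given their lengths."""
    flat = [x for p in packs for x in p]
    tensors, start = [], 0
    for n in lengths:
        tensors.append(flat[start:start + n])
        start += n
    return tensors


gradients = [[1, 2], [3], [4, 5, 6]]
packed = pack(gradients, num_packs=2)
restored = unpack(packed, [len(g) for g in gradients])
```

Fewer, larger packs mean fewer cross-device transfers, at the cost of the extra concatenate and split steps.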
Methods
batch_reduce
batch_reduce(
reduce_op, value_destination_pairs, options=None
)
Reduce values to destinations in batches.
See tf.distribute.StrategyExtended.batch_reduce_to. This can only be
called in the cross-replica context.
| Args | |
|---|---|
| reduce_op | A tf.distribute.ReduceOp specifying how values should be combined. |
| value_destination_pairs | A sequence of (value, destinations) pairs. See tf.distribute.CrossDeviceOps.reduce for descriptions. |
| options | A tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. |

| Returns | |
|---|---|
| A list of tf.Tensor or tf.distribute.DistributedValues, one per pair in value_destination_pairs. | |

| Raises | |
|---|---|
| ValueError | If value_destination_pairs is not an iterable of tuples of tf.distribute.DistributedValues and destinations. |
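The shape of the batch_reduce arguments and return value can be illustrated with a toy model in plain Python. Everything here is a stand-in: `toy_batch_reduce` is a hypothetical function, the strings "SUM" and "MEAN" play the role of tf.distribute.ReduceOp values, per-replica values are plain lists of numbers, and destinations are lists of device strings.

```python
def toy_batch_reduce(reduce_op, value_destination_pairs):
    """Toy model of batch_reduce: returns one result per input pair."""
    if reduce_op not in ("SUM", "MEAN"):
        raise ValueError("reduce_op must be 'SUM' or 'MEAN'")

    results = []
    for pair in value_destination_pairs:
        if not isinstance(pair, tuple) or len(pair) != 2:
            raise ValueError("expected (value, destinations) tuples")
        per_replica_values, destinations = pair

        combined = sum(per_replica_values)
        if reduce_op == "MEAN":
            combined /= len(per_replica_values)

        # Place the combined value on every destination device.
        results.append({device: combined for device in destinations})
    return results


pairs = [
    ([1.0, 3.0], ["GPU:0", "GPU:1"]),  # reduced onto both devices
    ([10.0, 20.0], ["GPU:0"]),         # reduced onto a single device
]
summed = toy_batch_reduce("SUM", pairs)
averaged = toy_batch_reduce("MEAN", pairs)
```

Each input pair yields exactly one entry in the returned list, in the same order as the input.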
broadcast
broadcast(
tensor, destinations
)
Broadcast tensor to destinations.
This can only be called in the cross-replica context.
| Args | |
|---|---|
| tensor | A tf.Tensor-like object. The value to broadcast. |
| destinations | A tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to broadcast to. Note that if it's a tf.Variable, the value is broadcast to the devices of that variable; this method doesn't update the variable. |

| Returns | |
|---|---|
| A tf.Tensor or tf.distribute.DistributedValues. | |
reduce
reduce(
reduce_op, per_replica_value, destinations, options=None
)
Reduce per_replica_value to destinations.
See tf.distribute.StrategyExtended.reduce_to. This can only be called in
the cross-replica context.
| Args | |
|---|---|
| reduce_op | A tf.distribute.ReduceOp specifying how values should be combined. |
| per_replica_value | A tf.distribute.DistributedValues, or a tf.Tensor-like object. |
| destinations | A tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to reduce to. To perform an all-reduce, pass the same object as both per_replica_value and destinations. Note that if it's a tf.Variable, the value is reduced to the devices of that variable, and this method doesn't update the variable. |
| options | A tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. |

| Returns | |
|---|---|
| A tf.Tensor or tf.distribute.DistributedValues. | |

| Raises | |
|---|---|
| ValueError | If per_replica_value can't be converted to a tf.distribute.DistributedValues or if destinations is not a string, tf.Variable or tf.distribute.DistributedValues. |
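In typical use these methods are reached through a strategy rather than called directly. The following is a minimal sketch, assuming TensorFlow 2.x is installed and using only public tf.distribute calls; `replica_fn` is a hypothetical function for illustration. It runs on any device set, though whether the hierarchical path is actually taken depends on the devices available.

```python
import tensorflow as tf

# Configure the strategy to use HierarchicalCopyAllReduce for
# cross-device communication.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=1))


def replica_fn():
    # Each replica contributes the value 1.0.
    return tf.constant(1.0)


per_replica_result = strategy.run(replica_fn)

# Called in the cross-replica context: sum the per-replica values.
total = strategy.reduce(
    tf.distribute.ReduceOp.SUM, per_replica_result, axis=None)
```

Because every replica contributes 1.0, the sum equals strategy.num_replicas_in_sync.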