View source on GitHub
Partitioning of a sequence of values into contiguous subsequences ("rows").
tf.experimental.RowPartition(
row_splits,
row_lengths=None,
value_rowids=None,
nrows=None,
uniform_row_length=None,
nvals=None,
internal=False
)
A RowPartition describes how a sequence with nvals items should be
divided into nrows contiguous subsequences ("rows"). For example, a
RowPartition could be used to partition the vector [1, 2, 3, 4, 5] into
subsequences [[1, 2], [3], [], [4, 5]]. Note that RowPartition stores
information about how values are partitioned, but does not include the
partitioned values themselves. tf.RaggedTensor is used to pair a values
tensor with one or more RowPartitions, providing a complete encoding for a
ragged tensor (i.e. a tensor with variable-length dimensions).
RowPartitions may be defined using several different schemes:
- row_lengths: an integer vector with shape [nrows], which specifies the length of each row.
- row_splits: an integer vector with shape [nrows+1], specifying the "split points" between each row.
- row_starts: an integer vector with shape [nrows], which specifies the start offset for each row. Equivalent to row_splits[:-1].
- row_limits: an integer vector with shape [nrows], which specifies the stop offset for each row. Equivalent to row_splits[1:].
- value_rowids: an integer vector with shape [nvals], corresponding one-to-one with sequence values, which specifies the row that each value belongs to. If the partition has empty trailing rows, then nrows must also be specified.
- uniform_row_length: an integer scalar, specifying the length of every row. This scheme may only be used if all rows have the same length.
For example, the following RowPartitions all represent the partitioning of
8 values into 5 sublists as follows: [[*, *, *, *], [], [*, *, *], [*], []].
p1 = RowPartition.from_row_lengths([4, 0, 3, 1, 0])
p2 = RowPartition.from_row_splits([0, 4, 4, 7, 8, 8])
p3 = RowPartition.from_row_starts([0, 4, 4, 7, 8], nvals=8)
p4 = RowPartition.from_row_limits([4, 4, 7, 8, 8])
p5 = RowPartition.from_value_rowids([0, 0, 0, 0, 2, 2, 2, 3], nrows=5)
For more information about each scheme, see the documentation for its
factory method. For additional examples, see the documentation on
tf.RaggedTensor.
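The schemes above are interchangeable descriptions of the same partition. As a rough illustration in plain Python (not the TensorFlow implementation; the helper name `encodings_from_row_lengths` is hypothetical), the other encodings can be derived from row_lengths like this:

```python
from itertools import accumulate


def encodings_from_row_lengths(row_lengths):
    """Derive the other encodings from row_lengths (illustrative helper)."""
    # row_splits is the running total of the row lengths, with a leading zero.
    row_splits = [0] + list(accumulate(row_lengths))
    return {
        "row_splits": row_splits,
        "row_starts": row_splits[:-1],
        "row_limits": row_splits[1:],
        # Each row index is repeated once per value stored in that row.
        "value_rowids": [
            row for row, length in enumerate(row_lengths) for _ in range(length)
        ],
        "nrows": len(row_lengths),
        "nvals": row_splits[-1],
    }


encodings = encodings_from_row_lengths([4, 0, 3, 1, 0])
print(encodings["row_splits"])    # [0, 4, 4, 7, 8, 8]
print(encodings["value_rowids"])  # [0, 0, 0, 0, 2, 2, 2, 3]
```

Note that value_rowids alone cannot recover the trailing empty row, which is why nrows has to be supplied alongside it.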
Precomputed Encodings
RowPartition always stores at least one encoding of the partitioning, but
it can be configured to cache additional encodings as well. This can
avoid unnecessary recomputation in eager mode. (In graph mode, optimizations
such as common subexpression elimination will typically prevent these
unnecessary recomputations.) To check which encodings are precomputed, use
RowPartition.has_precomputed_<encoding>. To cache an additional
encoding, use RowPartition.with_precomputed_<encoding>.
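The caching idea can be pictured with a toy class. This is a sketch in plain Python under the assumption that one primary encoding (row_splits) is always stored and a second one (row_lengths) is optional; `CachingPartition` is a hypothetical stand-in, not the TensorFlow class:

```python
class CachingPartition:
    """Toy stand-in for the caching idea; not the TensorFlow class."""

    def __init__(self, row_splits, row_lengths=None):
        self._row_splits = list(row_splits)
        self._row_lengths = row_lengths  # None until precomputed

    def has_precomputed_row_lengths(self):
        return self._row_lengths is not None

    def row_lengths(self):
        if self._row_lengths is not None:
            return self._row_lengths  # reuse the cached encoding
        # Otherwise recompute from row_splits on every call.
        return [
            end - start
            for start, end in zip(self._row_splits, self._row_splits[1:])
        ]

    def with_precomputed_row_lengths(self):
        # Return a new object that also carries the extra encoding.
        return CachingPartition(self._row_splits, self.row_lengths())


p = CachingPartition([0, 4, 4, 7, 8, 8])
q = p.with_precomputed_row_lengths()
print(p.has_precomputed_row_lengths(), q.has_precomputed_row_lengths())  # False True
print(q.row_lengths())  # [4, 0, 3, 1, 0]
```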
Args |
|---|
row_splits
[nrows+1].
row_lengths
[nrows]
value_rowids
[nvals].
nrows
uniform_row_length
nvals
internal
Raises |
|---|
TypeError
TypeError
ValueError
ValueError
ValueError
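| Attributes | |
|---|---|
| dtype | DType used to encode the row partition (either int32 or int64). |
| static_nrows | The number of rows in this partition, if statically known. It satisfies self.row_lengths().shape == [self.static_nrows], self.row_starts().shape == [self.static_nrows], self.row_limits().shape == [self.static_nrows], and self.row_splits().shape == [self.static_nrows + 1]. |
| static_nvals | The number of values in this partition, if statically known. It satisfies self.value_rowids().shape == [self.static_nvals]. |
| static_uniform_row_length | The number of values in each row of this partition, if statically known. |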
Methods
from_row_lengths
@classmethod
from_row_lengths(
    row_lengths, validate=True, dtype=None, dtype_hint=None
)
Creates a RowPartition with rows partitioned by row_lengths.
This RowPartition divides a sequence values into rows by indicating
the length of each row:
partitioned_rows = [[values.pop(0) for _ in range(length)]
for length in row_lengths]
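The pseudocode above consumes values with pop(0). A runnable plain-Python sketch of the same partitioning, using slices and leaving values untouched, might look like the following; the function name `partition_by_row_lengths` and the explicit checks are illustrative assumptions, not part of the TensorFlow API:

```python
def partition_by_row_lengths(values, row_lengths):
    """Split values into rows whose sizes are given by row_lengths."""
    if any(length < 0 for length in row_lengths):
        raise ValueError("row_lengths must be nonnegative")
    if sum(row_lengths) != len(values):
        raise ValueError("row_lengths must sum to len(values)")
    rows, start = [], 0
    for length in row_lengths:
        rows.append(values[start:start + length])  # next `length` values
        start += length
    return rows


print(partition_by_row_lengths([1, 2, 3, 4, 5], [2, 1, 0, 2]))
# [[1, 2], [3], [], [4, 5]]
```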
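| Args | |
|---|---|
| row_lengths | A 1-D integer tensor with shape [nrows]. Must be nonnegative. |
| validate | If true, then use assertions to check that the arguments form a valid RowPartition. |
| dtype | Optional dtype for the RowPartition. If missing, the type is inferred from the type of row_lengths, dtype_hint, or tf.int64. |
| dtype_hint | Optional dtype for the RowPartition, used when dtype is None. If the conversion to dtype_hint is not possible, this argument has no effect. |

| Returns | |
|---|---|
| A RowPartition. |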
from_row_limits
@classmethod
from_row_limits(
    row_limits, validate=True, dtype=None, dtype_hint=None
)
Creates a RowPartition with rows partitioned by row_limits.
Equivalent to: from_row_splits(concat([[0], row_limits], axis=0)).
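| Args | |
|---|---|
| row_limits | A 1-D integer tensor with shape [nrows]. Must be sorted in ascending order. |
| validate | If true, then use assertions to check that the arguments form a valid RowPartition. |
| dtype | Optional dtype for the RowPartition. If missing, the type is inferred from the type of row_limits, dtype_hint, or tf.int64. |
| dtype_hint | Optional dtype for the RowPartition, used when dtype is None. If the conversion to dtype_hint is not possible, this argument has no effect. |

| Returns | |
|---|---|
| A RowPartition. |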
from_row_splits
@classmethod
from_row_splits(
    row_splits, validate=True, dtype=None, dtype_hint=None
)
Creates a RowPartition with rows partitioned by row_splits.
This RowPartition divides a sequence values into rows by indicating
where each row begins and ends:
partitioned_rows = []
for i in range(len(row_splits) - 1):
    row_start = row_splits[i]
    row_end = row_splits[i + 1]
    partitioned_rows.append(values[row_start:row_end])
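The constraints on row_splits (non-empty, starting at zero, sorted in ascending order) can be checked before slicing. The following is a minimal plain-Python sketch; `partition_by_row_splits` is a hypothetical helper, and the final length check is an added assumption that mirrors row_splits[-1] == nvals:

```python
def partition_by_row_splits(values, row_splits):
    """Split values into rows delimited by consecutive row_splits entries."""
    if len(row_splits) == 0:
        raise ValueError("row_splits must not be empty")
    if row_splits[0] != 0:
        raise ValueError("row_splits[0] must be zero")
    if any(a > b for a, b in zip(row_splits, row_splits[1:])):
        raise ValueError("row_splits must be sorted in ascending order")
    if row_splits[-1] != len(values):
        raise ValueError("row_splits[-1] must equal len(values)")
    # Row i is the slice values[row_splits[i]:row_splits[i + 1]].
    return [values[start:end] for start, end in zip(row_splits, row_splits[1:])]


print(partition_by_row_splits(list("abcdefgh"), [0, 4, 4, 7, 8, 8]))
# [['a', 'b', 'c', 'd'], [], ['e', 'f', 'g'], ['h'], []]
```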
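| Args | |
|---|---|
| row_splits | A 1-D integer tensor with shape [nrows+1]. Must not be empty, and must be sorted in ascending order. row_splits[0] must be zero. |
| validate | If true, then use assertions to check that the arguments form a valid RowPartition. |
| dtype | Optional dtype for the RowPartition. If missing, the type is inferred from the type of row_splits, dtype_hint, or tf.int64. |
| dtype_hint | Optional dtype for the RowPartition, used when dtype is None. If the conversion to dtype_hint is not possible, this argument has no effect. |

| Returns | |
|---|---|
| A RowPartition. |

| Raises | |
|---|---|
| ValueError | If row_splits is an empty list. |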
from_row_starts
@classmethod
from_row_starts(
    row_starts, nvals, validate=True, dtype=None, dtype_hint=None
)
Creates a RowPartition with rows partitioned by row_starts.
Equivalent to: from_row_splits(concat([row_starts, [nvals]], axis=0)).
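To make the equivalence concrete, here is a plain-Python sketch that appends nvals to row_starts to obtain row_splits. The helper name `row_splits_from_row_starts` is hypothetical, and the checks simply restate the documented constraints on row_starts:

```python
def row_splits_from_row_starts(row_starts, nvals):
    """Build row_splits by appending nvals to row_starts."""
    row_starts = list(row_starts)
    if row_starts and row_starts[0] != 0:
        raise ValueError("row_starts[0] must be zero")
    if any(a > b for a, b in zip(row_starts, row_starts[1:])):
        raise ValueError("row_starts must be sorted in ascending order")
    if row_starts and row_starts[-1] > nvals:
        raise ValueError("row_starts[-1] must not exceed nvals")
    # The last row ends where the values end.
    return row_starts + [nvals]


print(row_splits_from_row_starts([0, 4, 4, 7, 8], nvals=8))
# [0, 4, 4, 7, 8, 8]
```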
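| Args | |
|---|---|
| row_starts | A 1-D integer tensor with shape [nrows]. Must be nonnegative and sorted in ascending order. If nrows>0, then row_starts[0] must be zero. |
| nvals | A scalar tensor indicating the number of values. |
| validate | If true, then use assertions to check that the arguments form a valid RowPartition. |
| dtype | Optional dtype for the RowPartition. If missing, the type is inferred from the type of row_starts, dtype_hint, or tf.int64. |
| dtype_hint | Optional dtype for the RowPartition, used when dtype is None. If the conversion to dtype_hint is not possible, this argument has no effect. |

| Returns | |
|---|---|
| A RowPartition. |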
from_uniform_row_length
@classmethod
from_uniform_row_length(
    uniform_row_length, nvals=None, nrows=None, validate=True, dtype=None,
    dtype_hint=None
)
Creates a RowPartition with rows partitioned by uniform_row_length.
This RowPartition divides a sequence values into rows that all have
the same length:
partitioned_rows = [[values.pop(0) for _ in range(uniform_row_length)]
for _ in range(nrows)]
Note that either or both of nvals and nrows must be specified.
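How nvals and nrows determine each other can be sketched in plain Python. This is only an illustration of the rules stated in this section (the function name `resolve_uniform_partition` is hypothetical), including the special case where uniform_row_length is zero:

```python
def resolve_uniform_partition(uniform_row_length, nvals=None, nrows=None):
    """Return (nrows, nvals), filling in whichever one was omitted."""
    if nvals is None and nrows is None:
        raise ValueError("either nvals or nrows must be specified")
    if nvals is None:
        nvals = uniform_row_length * nrows
    if nrows is None:
        if uniform_row_length == 0:
            nrows = 0  # cannot be inferred from nvals, so default to zero
        elif nvals % uniform_row_length != 0:
            raise ValueError("nvals must be divisible by uniform_row_length")
        else:
            nrows = nvals // uniform_row_length
    if uniform_row_length * nrows != nvals:
        raise ValueError("uniform_row_length * nrows must equal nvals")
    return nrows, nvals


print(resolve_uniform_partition(5, nvals=20))  # (4, 20)
print(resolve_uniform_partition(0, nrows=3))   # (3, 0)
```

The second call shows why nrows matters when uniform_row_length may be zero: three empty rows and zero empty rows both hold zero values.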
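| Args | |
|---|---|
| uniform_row_length | A scalar integer tensor. Must be nonnegative. The size of the outer axis of values must be evenly divisible by uniform_row_length. |
| nvals | A nonnegative scalar integer tensor for the number of values. Must be specified if nrows is not specified. If not specified, defaults to uniform_row_length*nrows. |
| nrows | The number of rows in the constructed RowPartition. If not specified, then it defaults to nvals/uniform_row_length (or 0 if uniform_row_length==0). nrows only needs to be specified if uniform_row_length might be zero. uniform_row_length*nrows must be nvals. |
| validate | If true, then use assertions to check that the arguments form a valid RowPartition. |
| dtype | Optional dtype for the RowPartition. If missing, the type is inferred from the type of uniform_row_length, dtype_hint, or tf.int64. |
| dtype_hint | Optional dtype for the RowPartition, used when dtype is None. If the conversion to dtype_hint is not possible, this argument has no effect. |

| Returns | |
|---|---|
| A RowPartition. |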
from_value_rowids
@classmethod
from_value_rowids(
    value_rowids, nrows=None, validate=True, dtype=None, dtype_hint=None
)
Creates a RowPartition with rows partitioned by value_rowids.
This RowPartition divides a sequence values into rows by specifying
which row each value should be added to:
partitioned_rows = [[] for _ in range(nrows)]
for (value, rowid) in zip(values, value_rowids):
    partitioned_rows[rowid].append(value)
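A runnable plain-Python version of this scheme, including the default for nrows and the documented constraints on value_rowids, could look like the sketch below. The function name `partition_by_value_rowids` is hypothetical and the checks are an illustration, not TensorFlow's validation code:

```python
def partition_by_value_rowids(values, value_rowids, nrows=None):
    """Group values into rows according to their row index."""
    if len(values) != len(value_rowids):
        raise ValueError("value_rowids must correspond one-to-one with values")
    if any(rowid < 0 for rowid in value_rowids):
        raise ValueError("value_rowids must be nonnegative")
    if any(a > b for a, b in zip(value_rowids, value_rowids[1:])):
        raise ValueError("value_rowids must be sorted in ascending order")
    # Smallest row count that can hold every referenced row.
    min_nrows = value_rowids[-1] + 1 if value_rowids else 0
    if nrows is None:
        nrows = min_nrows
    elif nrows < min_nrows:
        raise ValueError("nrows is incompatible with value_rowids")
    rows = [[] for _ in range(nrows)]
    for value, rowid in zip(values, value_rowids):
        rows[rowid].append(value)
    return rows


print(partition_by_value_rowids(list("abcdefgh"), [0, 0, 0, 0, 2, 2, 2, 3], nrows=5))
# [['a', 'b', 'c', 'd'], [], ['e', 'f', 'g'], ['h'], []]
```

Without nrows=5 the result would stop at row 3, since the trailing empty row leaves no trace in value_rowids.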
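| Args | |
|---|---|
| value_rowids | A 1-D integer tensor with shape [nvals], which corresponds one-to-one with values, and specifies each value's row index. Must be nonnegative, and must be sorted in ascending order. |
| nrows | An integer scalar specifying the number of rows. This should be specified if the RowPartition may contain empty trailing rows. Must be greater than value_rowids[-1] (or greater than or equal to zero if value_rowids is empty). Defaults to value_rowids[-1] + 1 (or zero if value_rowids is empty). |
| validate | If true, then use assertions to check that the arguments form a valid RowPartition. |
| dtype | Optional dtype for the RowPartition. If missing, the type is inferred from the type of value_rowids, dtype_hint, or tf.int64. |
| dtype_hint | Optional dtype for the RowPartition, used when dtype is None. If the conversion to dtype_hint is not possible, this argument has no effect. |

| Returns | |
|---|---|
| A RowPartition. |

| Raises | |
|---|---|
| ValueError | If nrows is incompatible with value_rowids. |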
Example:
print(RowPartition.from_value_rowids(
    value_rowids=[0, 0, 0, 0, 2, 2, 2, 3],
    nrows=4))
tf.RowPartition(row_splits=[0 4 4 7 8])
is_uniform
is_uniform()
Returns true if the partition is known to be uniform statically.
This is based upon the existence of self._uniform_row_length. For example:
RowPartition.from_row_lengths([3,3,3]).is_uniform() == false
RowPartition.from_uniform_row_length(5, nvals=20).is_uniform() == true
RowPartition.from_row_lengths([2,0,2]).is_uniform() == false
| Returns | |
|---|---|
| Whether a RowPartition is known to be uniform statically. |
nrows
nrows()
Returns the number of rows created by this RowPartition.
| Returns | |
|---|---|
| scalar integer Tensor |
nvals
nvals()
Returns the number of values partitioned by this RowPartition.
If the sequence partitioned by this RowPartition is a tensor, then
nvals is the size of that tensor's outermost dimension -- i.e.,
nvals == values.shape[0].
| Returns | |
|---|---|
| scalar integer Tensor |
offsets_in_rows
offsets_in_rows()
Return the offset of each value.
RowPartition takes an array x and converts it into sublists.
offsets[i] is the index of x[i] in its sublist.
Given a shape, such as: [*,*,*],[*,*],[],[*,*]
This returns: 0,1,2,0,1,0,1
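As a plain-Python illustration of what these offsets mean (a sketch with hypothetical helper names, not the TensorFlow implementation), they can be computed either directly from the row lengths or as each value's index minus the start of its row:

```python
from itertools import accumulate


def offsets_in_rows(row_lengths):
    """Offset of each value inside its own row."""
    return [offset for length in row_lengths for offset in range(length)]


def offsets_from_splits(row_splits):
    """Same result, computed as value index minus the start of its row."""
    offsets = []
    for start, limit in zip(row_splits, row_splits[1:]):
        offsets.extend(index - start for index in range(start, limit))
    return offsets


row_lengths = [3, 2, 0, 2]
row_splits = [0] + list(accumulate(row_lengths))
print(offsets_in_rows(row_lengths))     # [0, 1, 2, 0, 1, 0, 1]
print(offsets_from_splits(row_splits))  # [0, 1, 2, 0, 1, 0, 1]
```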
| Returns | |
|---|---|
| an offset for every value. |
row_lengths
row_lengths()
Returns the lengths of rows in this RowPartition.
| Returns | |
|---|---|
| A 1-D integer Tensor with shape [self.nrows]. The returned tensor is nonnegative. tf.reduce_sum(self.row_lengths()) == self.nvals(). |
row_limits
row_limits()
Returns the limit indices for rows in this row partition.
These indices specify where the values for each row end.
partition.row_limits() is equal to partition.row_splits()[1:].
| Returns | |
|---|---|
| A 1-D integer Tensor with shape [self.nrows]. The returned tensor is nonnegative, and is sorted in ascending order. self.row_limits()[-1] == self.nvals(). |
row_splits
row_splits()
Returns the row-split indices for this row partition.
row_splits specifies where the values for each row begin and end.
In particular, the values for row i are stored in the slice
values[row_splits[i]:row_splits[i+1]].
| Returns | |
|---|---|
| A 1-D integer Tensor with shape [self.nrows+1]. The returned tensor is non-empty, and is sorted in ascending order. self.row_splits()[0] == 0. self.row_splits()[-1] == self.nvals(). |
row_starts
row_starts()
Returns the start indices for rows in this row partition.
These indices specify where the values for each row begin.
partition.row_starts() is equal to partition.row_splits()[:-1].
| Returns | |
|---|---|
| A 1-D integer Tensor with shape [self.nrows()]. The returned tensor is nonnegative, and is sorted in ascending order. self.row_starts()[0] == 0. self.row_starts()[-1] <= self.nvals(). |
uniform_row_length
uniform_row_length()
Returns the length of each row in this partition, if rows are uniform.
If all rows in this RowPartition have the same length, then this returns
that length as a scalar integer Tensor. Otherwise, it returns None.
| Returns | |
|---|---|
| A scalar Tensor with type=self.dtype, or None. |
value_rowids
value_rowids()
Returns the row indices for this row partition.
value_rowids specifies the row index for each value. In particular,
value_rowids[i] is the row index for values[i].
| Returns | |
|---|---|
| A 1-D integer Tensor with shape [self.nvals()]. The returned tensor is nonnegative, and is sorted in ascending order. |
with_dtype
with_dtype(
dtype
)
Returns a copy of this RowPartition with the given encoding dtype.
| Args | |
|---|---|
| dtype | The dtype for encoding tensors, such as row_splits and nrows. One of tf.int32 or tf.int64. |
| Returns | |
|---|---|
| A copy of this RowPartition, with the encoding tensors cast to the given type. |