Release Note
This release contains 8 new features, 3 bug fixes and 7 documentation improvements.
🆕 Features
Milvus document store (#587)
This release supports the Milvus vector database as a document store.
da = DocumentArray(storage='milvus', config={'n_dim': 3})
Root_id for document stores (#808)
When working with a vector database, you can now retrieve the root document even if you search at a nested level with sub-indices (for example, at chunk level).
top_level_matches = da.find(query=np.random.rand(512), on='@.[image]', return_root=True)
To allow this we now store the root_id in the chunks' tags. You can enable this by passing root_id=True in your document store configuration.
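The idea can be illustrated without a vector database. The following is a minimal sketch in plain Python, not DocArray's implementation: dictionaries stand in for Documents and chunks, the tag key name `root_id` is an assumption, and `roots_for_matches` is a hypothetical helper.

```python
# Conceptual sketch only: plain dicts stand in for Documents and chunks.
# `roots_for_matches` and the "root_id" tag key are assumptions for illustration.

def roots_for_matches(matched_chunks, roots_by_id):
    """Map chunk-level matches back to their root documents, keeping rank order."""
    seen = set()
    roots = []
    for chunk in matched_chunks:
        root_id = chunk["tags"]["root_id"]  # written to the chunk's tags at index time
        if root_id not in seen:  # several matched chunks may share one root
            seen.add(root_id)
            roots.append(roots_by_id[root_id])
    return roots


roots_by_id = {
    "doc-a": {"id": "doc-a", "uri": "a.png"},
    "doc-b": {"id": "doc-b", "uri": "b.png"},
}
# Chunk-level search results, best match first.
matched_chunks = [
    {"id": "c3", "tags": {"root_id": "doc-b"}},
    {"id": "c1", "tags": {"root_id": "doc-a"}},
    {"id": "c4", "tags": {"root_id": "doc-b"}},
]
top_level = roots_for_matches(matched_chunks, roots_by_id)
print([d["id"] for d in top_level])
```

Because each chunk carries the identifier of its root, the lookup needs no traversal of the nested structure.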
Filtering based on text keywords for Qdrant (#849)
You can now filter based on text keywords for the Qdrant document store.
filter = {
'must': [
{"key": "info", "match": {"text": "shoes"}}
]
}
results = da.find(np.random.rand(n_dim), filter=filter)
RGB-D representation of 3D meshes (#753)
DocArray already supports 3D mesh representation in different formats and this release adds support for RGB-D representation.
doc.load_uris_to_rgbd_tensor()
Load multi page tiff files into chunks (#845)
Multi-page TIFF images can now be loaded with load_uri_to_image_tensor(). Each page is stored as a chunk of the Document.
d = Document(uri="foo.tiff")
d.load_uri_to_image_tensor()
print(d)
<Document ('id', 'uri', 'chunks') at 7f907d786d6c11ec840a1e008a366d49>
└─ chunks
├─ <Document ('id', 'parent_id', 'granularity', 'tensor') at 7aa4c0ba66cf6c300b7f07fdcbc2fdc8>
├─ <Document ('id', 'parent_id', 'granularity', 'tensor') at bc94a3e3ca60352f2e4c9ab1b1bb9c22>
└─ <Document ('id', 'parent_id', 'granularity', 'tensor') at 36fe0d1daf4442ad6461c619f8bb25b7>
Store key frame indices when loading video tensor from uri (#880)
keyframe_indices are now stored in a Document's tags when loading a video to tensor. This allows extracting the section of the video between key frames.
d = Document(uri="video.mp4").load_uri_to_video_tensor()
print(d.tags['keyframe_indices'])
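One way to use these indices is to cut the frame sequence into segments, each starting at a key frame. The sketch below is a hedged illustration in plain Python: a list of labels stands in for the video tensor's frames, the index values are made up, and `split_at_key_frames` is a hypothetical helper rather than a DocArray function.

```python
# Conceptual sketch: a list of labels stands in for the frames of a video tensor.
# `split_at_key_frames` is a hypothetical helper, not part of DocArray.

def split_at_key_frames(frames, keyframe_indices):
    """Return one segment per key frame, each running up to the next key frame."""
    # Cast defensively in case the indices are stored as floats in the tags.
    bounds = sorted(int(i) for i in keyframe_indices) + [len(frames)]
    return [frames[start:end] for start, end in zip(bounds, bounds[1:])]


frames = [f"frame-{i}" for i in range(10)]
keyframe_indices = [0, 4, 7]  # example values; real ones would come from d.tags
segments = split_at_key_frames(frames, keyframe_indices)
print([len(s) for s in segments])
```

The same slicing applies to a tensor whose first axis is the frame index.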
Better plotting of embeddings for nested and complex data (#891)
You can now choose which meta field parameters to exclude when calling DocumentArray's plot_embeddings() method. This makes it easier to plot embeddings for complex and nested data.
docs.plot_embeddings(exclude_fields_metas=['chunks'])
Better support for information retrieval evaluation (#826)
This release adds a max_rel_per_label parameter to better support metric calculations that require the number of relevant Documents.
metrics = da.evaluate(['recall_at_k'], max_rel_per_label={i: 1 for i in range(3)})
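To see why the number of relevant Documents matters, consider recall at k: its denominator is the total number of relevant Documents for the query's label, which cannot be inferred from the returned matches alone. The following is a minimal sketch under that definition, not DocArray's implementation; the function `recall_at_k` here and the label names are hypothetical.

```python
# Conceptual sketch of recall@k with an explicit count of relevant documents.
# This standalone `recall_at_k` is for illustration, not DocArray's metric function.

def recall_at_k(binary_relevance, k, max_rel):
    """Fraction of all relevant documents that appear in the top k results."""
    if max_rel <= 0:
        raise ValueError("max_rel must be positive")
    hits = sum(binary_relevance[:k])
    return hits / max_rel


# Number of relevant documents per label, analogous to max_rel_per_label.
max_rel_per_label = {"cat": 4, "dog": 2}

# 1 marks a relevant match, 0 an irrelevant one, best match first.
matches_for_cat_query = [1, 0, 1, 0, 0]
score = recall_at_k(matches_for_cat_query, k=3, max_rel=max_rel_per_label["cat"])
print(score)  # 0.5: two of the four relevant documents are in the top 3
```

Without the per-label count, the metric could only be normalized by the number of relevant matches that happened to be retrieved, which overstates recall.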
🐞 Bug Fixes
Support length calculation independently from list-like behavior (#840)
DocArray 0.19 added the ability to instantiate a document store without list-like behavior for improved performance. However, calculating the length of certain document stores relied on such list-like behavior. This release fixes length calculation for the Redis document store, making it independent from list-like behavior.
Remove cosine similarity field with false assignment (#835)
In the Weaviate document store, cosine distance is no longer mistakenly assigned to the cosine_similarity field.
Rebuild index after clearing storage (#837)
The index for Redis and Elasticsearch document stores is now rebuilt when _clear_storage is called.
📗 Documentation Improvements
🤟 Contributors
We would like to thank all contributors to this release: