feat: encode primitive type in the proto #772
Merged
9 commits
6271f43 refactor: change embedding and base node (samsja)
64cc33e fix: expose any url to user (samsja)
1fc7db6 refactor: docarray type are relfected at the proto level (samsja)
6cb5256 fix: add test and fix dump to proto (samsja)
f4084e8 refactor: move get nested class to document (samsja)
5c87798 fix: fix mypy return type hint (samsja)
5b5b54d feat: add id as a type (samsja)
a045aa4 feat: adapt proto to the new id type (samsja)
b7df023 fix(proto): does not return mixin anymore type (samsja)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,4 +1,5 @@` |
| | | `  from docarray.document.any_document import AnyDocument` |
| | | `+ from docarray.document.base_node import BaseNode` |
| | | `  from docarray.document.document import BaseDocument` |
| | | |
| | | `- __all__ = ['AnyDocument', 'BaseDocument']` |
| | | `+ __all__ = ['AnyDocument', 'BaseDocument', 'BaseNode']` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,7 +1,16 @@` |
| | | `- from typing import Dict, Iterable` |
| | | `+ from abc import abstractmethod` |
| | | `+ from typing import TYPE_CHECKING, Dict, Iterable, Type` |
| | | |
| | | `  from pydantic.fields import ModelField` |
| | | |
| | | `+ if TYPE_CHECKING:` |
| | | `+     from docarray.document.mixins.proto import ProtoMixin` |
| | | |
| | | |
| | | `  class AbstractDocument(Iterable):` |
| | | `      __fields__: Dict[str, ModelField]` |
| | | |
| | | `+     @classmethod` |
| | | `+     @abstractmethod` |
| | | `+     def _get_nested_document_class(cls, field: str) -> Type['ProtoMixin']:` |
| | | `+         ...` |

JohannesMessner marked this conversation as resolved.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,6 +1,6 @@` |
| | | `- from docarray.document.base_node import BaseNode` |
| | | `+ from docarray.typing.embedding import Embedding` |
| | | `+ from docarray.typing.id import ID` |
| | | `+ from docarray.typing.tensor import Tensor` |
| | | `+ from docarray.typing.url import AnyUrl, ImageUrl` |
| | | |
| | | `- from docarray.typing.ndarray import Embedding, Tensor` |
| | | `- from docarray.typing.url import ImageUrl` |
| | | |
| | | `- __all__ = ['Tensor', 'Embedding', 'BaseNode', 'ImageUrl']` |
| | | `+ __all__ = ['Tensor', 'Embedding', 'ImageUrl', 'AnyUrl', 'ID']` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1,18 @@` |
| | | `+ from typing import TypeVar` |
| | | |
| | | `+ from docarray.proto import NodeProto` |
| | | `+ from docarray.typing.tensor import Tensor` |
| | | |
| | | `+ T = TypeVar('T', bound='Embedding')` |
| | | |
| | | |
| | | `+ class Embedding(Tensor):` |
| | | `+     def _to_node_protobuf(self: T, field: str = 'tensor') -> NodeProto:` |
| | | `+         """Convert Document into a NodeProto protobuf message. This function should` |
| | | `+         be called when the Document is nested into another Document that need to be` |
| | | `+         converted into a protobuf` |
| | | `+         :param field: field in which to store the content in the node proto` |
| | | `+         :return: the nested item protobuf message` |
| | | `+         """` |
| | | |
| | | `+         return super()._to_node_protobuf(field='embedding')` |
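The override above pins the proto field name to `'embedding'` regardless of the `field` argument the caller passes. A minimal sketch of that pattern, assuming a plain `dict` in place of `NodeProto`; `SketchTensor`, `SketchEmbedding` and `to_node` are hypothetical names used only for illustration and are not part of docarray:

```python
from typing import Any, Dict, List


class SketchTensor:
    """Hypothetical stand-in for Tensor; stores a payload and nothing else."""

    def __init__(self, data: List[float]) -> None:
        self.data = data

    def to_node(self, field: str = 'tensor') -> Dict[str, Any]:
        # A dict stands in for NodeProto: the key plays the role of the proto field.
        return {field: self.data}


class SketchEmbedding(SketchTensor):
    """Hypothetical stand-in for Embedding."""

    def to_node(self, field: str = 'tensor') -> Dict[str, Any]:
        # As in the diff, the incoming `field` is ignored and 'embedding' is forced.
        return super().to_node(field='embedding')


tensor_node = SketchTensor([0.0, 1.0]).to_node()
embedding_node = SketchEmbedding([0.0, 1.0]).to_node(field='anything')
print(tensor_node)
print(embedding_node)
```

Because the subclass discards its `field` parameter, the parameter mostly serves to keep the signature compatible with the parent method.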
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1,45 @@` |
| | | `+ from typing import TYPE_CHECKING, Optional, Type, TypeVar, Union` |
| | | `+ from uuid import UUID` |
| | | |
| | | `+ from docarray.document.base_node import BaseNode` |
| | | `+ from docarray.proto import NodeProto` |
| | | |
| | | `+ if TYPE_CHECKING:` |
| | | `+     from pydantic import BaseConfig` |
| | | `+     from pydantic.fields import ModelField` |
| | | |
| | | |
| | | `+ T = TypeVar('T', bound='ID')` |
| | | |
| | | |
| | | `+ class ID(str, BaseNode):` |
| | | `+     """` |
| | | `+     Represent an unique ID` |
| | | `+     """` |
| | | |
| | | `+     @classmethod` |
| | | `+     def __get_validators__(cls):` |
| | | `+         yield cls.validate` |
| | | |
| | | `+     @classmethod` |
| | | `+     def validate(` |
| | | `+         cls: Type[T],` |
| | | `+         value: Union[str, int, UUID],` |
| | | `+         field: Optional['ModelField'] = None,` |
| | | `+         config: Optional['BaseConfig'] = None,` |
| | | `+     ) -> T:` |
| | | |
| | | `+         try:` |
| | | `+             id: str = str(value)` |
| | | `+             return cls(id)` |
| | | `+         except Exception:` |
| | | `+             raise ValueError(f'Expected a str, int or UUID, got {type(value)}')` |
| | | |
| | | `+     def _to_node_protobuf(self) -> NodeProto:` |
| | | `+         """Convert an ID into a NodeProto message. This function should` |
| | | `+         be called when the self is nested into another Document that need to be` |
| | | `+         converted into a protobuf` |
| | | |
| | | `+         :return: the nested item protobuf message` |
| | | `+         """` |
| | | `+         return NodeProto(id=self)` |
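The `validate` classmethod above coerces its input with `str()` and wraps the result in the `str` subclass. A minimal sketch of just that coercion step, assuming no pydantic and no protobuf; `SketchID` is a hypothetical name used only for illustration:

```python
from typing import Type, TypeVar, Union
from uuid import UUID

T = TypeVar('T', bound='SketchID')


class SketchID(str):
    """Hypothetical stand-in for the ID type in the diff."""

    @classmethod
    def validate(cls: Type[T], value: Union[str, int, UUID]) -> T:
        # Mirror the diff: stringify the input, then wrap it in the subclass.
        try:
            return cls(str(value))
        except Exception:
            raise ValueError(f'Expected a str, int or UUID, got {type(value)}')


from_int = SketchID.validate(42)
from_uuid = SketchID.validate(UUID(int=1))
print(repr(from_int), type(from_int).__name__)
print(from_uuid)
```

Since `str()` succeeds on almost any Python object, the `except` branch in this sketch is only reached when an object's own `__str__` raises; a stricter variant would probably check `isinstance(value, (str, int, UUID))` first.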
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,3 +1,4 @@` |
| | | `- from .image_url import ImageUrl` |
| | | `+ from docarray.typing.url.any_url import AnyUrl` |
| | | `+ from docarray.typing.url.image_url import ImageUrl` |
| | | |
| | | `- __all__ = ['ImageUrl']` |
| | | `+ __all__ = ['ImageUrl', 'AnyUrl']` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1,25 @@` |
| | | `+ import numpy as np` |
| | | |
| | | `+ from docarray import Document` |
| | | `+ from docarray.document import AnyDocument` |
| | | `+ from docarray.typing import AnyUrl, Embedding, ImageUrl, Tensor` |
| | | |
| | | |
| | | `+ def test_proto_all_types():` |
| | | `+     class Mymmdoc(Document):` |
| | | `+         tensor: Tensor` |
| | | `+         embedding: Embedding` |
| | | `+         any_url: AnyUrl` |
| | | `+         image_url: ImageUrl` |
| | | |
| | | `+     doc = Mymmdoc(` |
| | | `+         tensor=np.zeros((3, 224, 224)),` |
| | | `+         embedding=np.zeros((100, 1)),` |
| | | `+         any_url='http://jina.ai',` |
| | | `+         image_url='http://jina.ai',` |
| | | `+     )` |
| | | |
| | | `+     new_doc = AnyDocument.from_protobuf(doc.to_protobuf())` |
| | | |
| | | `+     for field, value in new_doc:` |
| | | `+         assert isinstance(value, doc._get_nested_document_class(field))` |
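The test above checks that each field keeps its type after a protobuf round trip, even though the decoding side (`AnyDocument`) has no schema. One way such behaviour can be obtained is to record the type in the key under which each value is stored. The sketch below illustrates that idea only; it uses plain dicts instead of protobuf messages, and `SketchUrl`, `SketchImageUrl`, `node_key`, `KEY_TO_TYPE` and `from_node` are hypothetical names, not the PR's actual implementation:

```python
from typing import Dict, Type


class SketchUrl(str):
    """Hypothetical stand-in for AnyUrl."""

    node_key = 'any_url'

    def to_node(self) -> Dict[str, str]:
        # Store the value under a key that names the concrete type.
        return {type(self).node_key: str(self)}


class SketchImageUrl(SketchUrl):
    """Hypothetical stand-in for ImageUrl."""

    node_key = 'image_url'


# Lookup table used when decoding: key in the node -> class to rebuild.
KEY_TO_TYPE: Dict[str, Type[SketchUrl]] = {
    cls.node_key: cls for cls in (SketchUrl, SketchImageUrl)
}


def from_node(node: Dict[str, str]) -> SketchUrl:
    # Each node holds exactly one key/value pair in this sketch.
    (key, raw), = node.items()
    return KEY_TO_TYPE[key](raw)


doc = {
    'any_url': SketchUrl('http://jina.ai'),
    'image_url': SketchImageUrl('http://jina.ai'),
}
encoded = {name: value.to_node() for name, value in doc.items()}
decoded = {name: from_node(node) for name, node in encoded.items()}

for name, value in decoded.items():
    print(name, type(value).__name__)
```

Both fields hold the same string, yet each comes back as its original class because the key, not the payload, carries the type.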
Empty file.