
Bug in hnsw with AnyTensor #1352

@samsja

Context

The `embedding` field of predefined documents (here `ImageDoc`) is not recognized by our indexer: searching on it with `HnswDocumentIndex` raises a `KeyError`.

from docarray import DocList
from docarray.documents import ImageDoc
from docarray.index import HnswDocumentIndex
import numpy as np

# create some data
dl = DocList[ImageDoc](
    [
        ImageDoc(
            url="https://upload.wikimedia.org/wikipedia/commons/2/2f/Alpamayo.jpg",
            tensor=np.zeros((3, 224, 224)),
            embedding=np.random.random((128,)),
        )
        for _ in range(100)
    ]
)

# create a Document Index
index = HnswDocumentIndex[ImageDoc](work_dir='/tmp/test_index')

# index your data
index.index(dl)

# find similar Documents
query = dl[0]
results, scores = index.find(query, limit=10, search_field='embedding')

  File "/home/sami/Documents/workspace/Jina/docarray2/docarray/docarray/index/backends/hnswlib.py", line 262, in _find
    docs, scores = self._find_batched(
  File "/home/sami/Documents/workspace/Jina/docarray2/docarray/docarray/index/backends/hnswlib.py", line 248, in _find_batched
    index = self._hnsw_indices[search_field]
KeyError: 'embedding'

Process finished with exit code 1
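
To make the failure mode concrete, here is a toy model in plain Python. It is not docarray's code: `NdArray`, `TorchTensor`, `AnyTensor`, `build_indices` and `find` below are stand-ins invented for illustration, and the assumption (suggested by the issue title, not confirmed by the traceback) is that the per-field index is only created for fields annotated with a concrete tensor class, so a field annotated with an optional union such as `AnyTensor` is silently skipped and the later lookup fails.

```python
from typing import Optional, Union, get_type_hints


# Hypothetical stand-ins for tensor types; these are NOT docarray classes.
class NdArray: ...
class TorchTensor: ...

AnyTensor = Union[NdArray, TorchTensor]


class CustomDoc:
    # User-defined schema with a concrete tensor type.
    embedding: NdArray


class PredefinedDoc:
    # Mimics a predefined document whose embedding is an optional union.
    embedding: Optional[AnyTensor]


def build_indices(schema):
    """Create one (toy) index per field whose annotation is a concrete tensor class."""
    indices = {}
    for name, annotation in get_type_hints(schema).items():
        # A Union is not a class, so this check skips Optional[AnyTensor] fields.
        if isinstance(annotation, type) and issubclass(annotation, NdArray):
            indices[name] = []
    return indices


def find(indices, search_field):
    # Same lookup shape as the failing line in the traceback.
    return indices[search_field]


print(build_indices(CustomDoc))      # the concrete field gets an index
print(build_indices(PredefinedDoc))  # the union-typed field gets none

try:
    find(build_indices(PredefinedDoc), "embedding")
except KeyError as err:
    print("KeyError:", err)
```

If this is indeed the mechanism, the fix belongs in the schema inspection step (unwrapping `Optional` and union annotations) rather than in `_find_batched`, where the error surfaces.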

Status: Done
