
Subindex find is broken by da['@c'] = new_da  #829

Description

@AnneYang720

After using da['@c'] = new_da with the Redis or Elasticsearch storage backend, find at the subindex level no longer works. A minimal reproducible example (MRE) is below.

from docarray import Document, DocumentArray
import numpy as np

with DocumentArray(
    storage='elasticsearch', # or redis
    config={
        'n_dim': 128,
    },
    subindex_configs={'@c': {'n_dim': 3}},
) as da:
    da.extend(
        [
            Document(
                id=f'{i}',
                chunks=[
                    Document(id=f'sub{i}_0', embedding=np.random.random(3)),
                    Document(id=f'sub{i}_1', embedding=np.random.random(3)),
                ],
            )
            for i in range(1)
        ]
    )
    res = da.find(np.random.random(3), on='@c')  # this works

    da['@c'] = [
        Document(id='sub0_0', embedding=np.random.random(3)),
        Document(id='sub0_1', embedding=np.random.random(3)),
    ]
    res = da.find(np.random.random(3), on='@c')  # this fails

The reason is that da['@c'] = new_da calls subindex_da.clear() in _update_subindices_set, which in turn calls _clear_storage. The _clear_storage implementations for Redis and Elasticsearch delete the index in the database, so the later find has no index left to search.
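
To make the failure mode concrete, here is a toy model in plain Python. It does not use docarray, Redis, or Elasticsearch: ToyBackend, ToySubindex, rebuild_index_on_clear, and find_survives_assignment are hypothetical names made up for this sketch, and recreating the index after clearing is only one possible direction for a fix, not necessarily the one the maintainers would choose.

```python
class ToyBackend:
    """Hypothetical stand-in for a search backend (not Redis/Elasticsearch)."""

    def __init__(self):
        self.docs = {}        # doc id -> embedding
        self.indices = set()  # names of search indices that currently exist

    def search(self, index_name):
        # Searching requires the index itself, not just the stored documents.
        if index_name not in self.indices:
            raise LookupError(f'no such index: {index_name}')
        return sorted(self.docs)


class ToySubindex:
    """Hypothetical subindex whose clear step drops the backend index."""

    def __init__(self, backend, index_name, rebuild_index_on_clear):
        self.backend = backend
        self.index_name = index_name
        self.rebuild_index_on_clear = rebuild_index_on_clear
        backend.indices.add(index_name)

    def _clear_storage(self):
        # Mirrors the behavior described above: the index is deleted too.
        self.backend.docs.clear()
        self.backend.indices.discard(self.index_name)
        if self.rebuild_index_on_clear:
            # Assumed remedy: recreate the index right after dropping it.
            self.backend.indices.add(self.index_name)

    def assign(self, new_docs):
        # Plays the role of da['@c'] = new_da: clear first, then write.
        self._clear_storage()
        self.backend.docs.update(new_docs)

    def find(self):
        return self.backend.search(self.index_name)


def find_survives_assignment(rebuild_index_on_clear):
    """Return True if find still works after an assignment."""
    sub = ToySubindex(ToyBackend(), 'subindex_c', rebuild_index_on_clear)
    sub.assign({'sub0_0': [0.1, 0.2, 0.3]})
    try:
        sub.find()
        return True
    except LookupError:
        return False
```

In this model, find_survives_assignment(False) reproduces the broken sequence (the documents are written, but the index they should be searched through is gone), while find_survives_assignment(True) keeps find working because the index exists again before the new Documents are stored.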
