When a user creates a Document Index with a schema, it is not always clear which types should be used in the database:
class MySchema(BaseDocument):
    text: str  # should this be a string or a varchar? if varchar, what length?
    embedding: NdArray  # should this be a float tensor or a boolean tensor?
    num: float  # float32 or float64?

index = MyDocIndex[MySchema]
Right now, the method python_type_to_db_type resolves these ambiguities, but it leaves the user no choice.
We should enable an (optional!) feature like this:
class MySchema(BaseDocument):
    text: str = Field(..., col_type='varchar', max_len=2048)
    embedding: NdArray = Field(col_type='boolean_tensor')
    num: float = Field(col_type='float64')

index = MyDocIndex[MySchema]
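One possible way to resolve the per-field overrides is sketched below. This is only an illustration under assumptions, not the Document Index implementation: it uses standard-library dataclasses and their field metadata as a stand-in for BaseDocument and Field, and the default mapping, the column_config helper, and the extra count field are all hypothetical. The idea is that an explicit col_type wins, any other hints (such as max_len) are passed through, and python_type_to_db_type remains the fallback.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict

# Hypothetical default mapping, standing in for the real python_type_to_db_type logic.
DEFAULT_DB_TYPES: Dict[type, str] = {str: "string", float: "float32", int: "int64"}


def python_type_to_db_type(python_type: type) -> str:
    """Fallback used when the schema gives no explicit col_type."""
    return DEFAULT_DB_TYPES.get(python_type, "unknown")


def column_config(schema: type) -> Dict[str, Dict[str, Any]]:
    """Resolve one database column config per schema field (hypothetical helper)."""
    configs: Dict[str, Dict[str, Any]] = {}
    for f in fields(schema):
        overrides = dict(f.metadata)  # user-supplied hints, possibly empty
        # An explicit col_type takes precedence over the default mapping.
        col_type = overrides.pop("col_type", None) or python_type_to_db_type(f.type)
        # Remaining hints (e.g. max_len) are passed through untouched.
        configs[f.name] = {"col_type": col_type, **overrides}
    return configs


# Dataclass stand-in for a BaseDocument schema; metadata plays the role of Field(...).
@dataclass
class MySchema:
    text: str = field(default="", metadata={"col_type": "varchar", "max_len": 2048})
    num: float = field(default=0.0, metadata={"col_type": "float64"})
    count: int = 0  # no override, so the default mapping applies
```

Calling column_config(MySchema) gives each backend one dict per column to work from, and schemas that declare no overrides keep behaving as they do today, which keeps the feature optional.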