feat: Quantization support for elastic search by vanitabhagwat · Pull Request #355 · ExpediaGroup/feast

vanitabhagwat · 2026-04-07T06:00:12Z

What this PR does / why we need it:

Quantization strategy - Implements memory-efficient vector compression (int8/int4/BBQ), trading some accuracy for roughly 4x (int8) to 8x (int4) memory reduction; BBQ compresses further.
Query path duality - Keeps the backward-compatible exact search (script_score) and adds approximate nearest-neighbor search (native knn).
Parameter validation - Validates the new configuration parameters.
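
A rough sketch of the two query paths, for orientation only. The helper name `build_search_body`, the default multiplier of 10, and the cosine-based script are assumptions made for illustration; only `use_native_knn` and `knn_num_candidates_multiplier` are names taken from this PR.

```python
from typing import Any, Dict, List


def build_search_body(
    query_vector: List[float],
    vector_field: str,
    top_k: int,
    use_native_knn: bool = False,
    knn_num_candidates_multiplier: int = 10,  # assumed default, not from the PR
) -> Dict[str, Any]:
    """Return an Elasticsearch search body for either query path."""
    if use_native_knn:
        # Approximate path: top-level kNN search over the indexed vectors.
        return {
            "size": top_k,
            "knn": {
                "field": vector_field,
                "query_vector": query_vector,
                "k": top_k,
                "num_candidates": top_k * knn_num_candidates_multiplier,
            },
        }
    # Exact path: brute-force scoring of every matching document.
    return {
        "size": top_k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": (
                        f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0"
                    ),
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }
```

Keeping the exact path as the default means existing deployments see no behavior change until they opt in to the approximate path.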

Which issue(s) this PR fixes:

Misc

* Adding support for Valkey Search, adding changes to the online_write_batch functionality
* Addressing PR comments
* addressing linting error
* fix tests
* addressing PR comments
* addressing PR comments
* fixing linting

---------

Co-authored-by: Manisha4 <Manisha4@github.com>

* Adding support for Valkey Search, adding changes to the online_write_batch functionality
* Addressing PR comments
* addressing linting error
* Adding changes to support search in valkey
* fix tests
* adding unit tests
* reformatting files and adding checks and more tests
* reformatting files and adding checks and more tests
* reformatting files and adding checks and more tests
* Fix linter errors: type annotations and code formatting
  - Add explicit type annotation for schema_fields to support both TagField and VectorField
  - Encode project string to bytes for consistency with other hash values
  - Decode doc_key bytes to string for hmget compatibility
  - Fix code formatting: break long lines and remove extra blank lines
  - Remove tests for multiple vector fields (Feast enforces one vector per feature view)
  - Fix config type: use 'eg-valkey' (hyphen) not 'eg_valkey' (underscore)
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* addressing PR comments
* addressing PR comments
* fixing linting
* Fix missing feature_name argument in retrieve_online_documents_v2
  Add the third argument (vector_field.name) to _get_vector_index_name call to match the updated function signature.
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* addressing comments, PR changes for some fixes and merge conflicts
* fixing tests
* fixing tests
* fixing linting
* fixing linting

---------

Co-authored-by: Manisha4 <Manisha4@github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

- Add config fields: vector_index_type, hnsw_m, hnsw_ef_construction, rescore_oversample
- Add use_native_knn toggle (default: false for backward compatibility)
- Add knn_num_candidates_multiplier for query tuning
- Comprehensive Pydantic validation for all config constraints
- Support for int8, int4, bbq quantization with HNSW and flat indices
- Dimension validation (int4 requires even dims, bbq requires >=64)
- Dual query path: script_score (exact) vs native knn (approximate)
- 17 unit tests covering config validation and index creation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
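
To make the new config fields concrete, here is a minimal sketch of how they might map onto an Elasticsearch `dense_vector` field definition. The function name `build_vector_mapping` and the `similarity` default are assumptions for illustration; the PR's actual mapping code may differ.

```python
from typing import Any, Dict, Optional


def build_vector_mapping(
    vector_field_length: int,
    similarity: str = "cosine",  # assumed default
    vector_index_type: Optional[str] = None,
    hnsw_m: Optional[int] = None,
    hnsw_ef_construction: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a dense_vector field mapping with optional quantization settings."""
    mapping: Dict[str, Any] = {
        "type": "dense_vector",
        "dims": vector_field_length,
        "index": True,
        "similarity": similarity,
    }
    if vector_index_type is None:
        # No explicit index type: let Elasticsearch pick its default.
        return mapping

    index_options: Dict[str, Any] = {"type": vector_index_type}
    # m and ef_construction only apply to HNSW-based index types.
    if vector_index_type.endswith("hnsw"):
        if hnsw_m is not None:
            index_options["m"] = hnsw_m
        if hnsw_ef_construction is not None:
            index_options["ef_construction"] = hnsw_ef_construction
    mapping["index_options"] = index_options
    return mapping
```

For example, `build_vector_mapping(128, vector_index_type="int8_hnsw", hnsw_m=16)` yields a mapping whose `index_options` contain only `type` and `m`.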

Manisha4 · 2026-04-20T16:30:27Z

            or 512
        )

+        # Validate vector_field_length is positive


nit: it would be nice to move this into a separate method, e.g. _build_vector_mapping(); that might be more readable.

Manisha4 · 2026-04-20T16:50:14Z

+                    if isinstance(feature_data, dict)
+                    else None
                )
+                if feature_value is not None:


Your change now appends feature_value only when it is not None. Is this the intended change?

Manisha4 · 2026-04-20T16:54:43Z

+    elif isinstance(value, list):
+        if not value:
+            # Empty list - create empty float list
+            pass


An empty list now returns an empty ValueProto with no field set, which downstream code may not expect. The old code would not have raised ValueError for an empty list, since all(isinstance(v, float) for v in []) is vacuously True; it would have called float_list_val.val.extend([]). So the old behavior created an empty float_list_val, while the new behavior creates a completely unset proto.

Maybe keep the val_proto.float_list_val.val.extend(value) call for the empty case to match the old behavior.
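
The point about the old check hinges on how `all()` treats an empty iterable. A tiny sketch, where `passes_old_float_check` is a hypothetical stand-in for the old condition rather than the actual Feast code:

```python
from typing import Any, List


def passes_old_float_check(value: List[Any]) -> bool:
    """Stand-in for the old guard: every element must be a float."""
    # all() over an empty iterable returns True, so [] passes this check.
    return all(isinstance(v, float) for v in value)


empty_ok = passes_old_float_check([])
mixed_ok = passes_old_float_check([1.0, 2])
```

Here `empty_ok` is `True` and `mixed_ok` is `False`, which is why an empty list used to reach the `extend` call instead of raising.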

Manisha4 · 2026-04-20T17:03:18Z

+            "int4_flat",
+            "bbq_flat",
+        }
+        if self.rescore_oversample is not None and self.rescore_oversample != 0:


Seems like your code is using 3 different ways to check if rescore_oversample is enabled.

I guess here there are 2 ways, None and zero, so a reader has to figure out which one each code path cares about. If someone adds a new code path later and writes if config.rescore_oversample: (the Pythonic truthy check), they'll treat 0.0 and None as equivalent, which happens to be correct here, but only by accident. And if someone writes if config.rescore_oversample is not None:, thinking None is the only disabled state, they'll incorrectly try to apply oversample=0 to Elasticsearch, which will either error or silently do something unexpected.

Maybe just use None as the only "disabled" value.

-> Validation

    if self.rescore_oversample is not None and self.rescore_oversample < 1.0:
        raise ValueError(
            f"rescore_oversample must be >= 1.0, got {self.rescore_oversample}"
        )

    if self.rescore_oversample is not None:
        if self.vector_index_type not in quantized_types:
            raise ValueError(...)

-> Use

    if config.online_store.rescore_oversample is not None:
        knn_query["rescore_vector"] = {"oversample": config.online_store.rescore_oversample}

Manisha4 · 2026-04-20T17:05:06Z

+            # int4 quantization requires even number of dimensions
+            if "int4" in index_type and vector_field_length % 2 != 0:
+                raise ValueError(
+                    f"int4 quantization ('{index_type}') requires even number of dimensions, "


Can you reference documentation in the error message that the user can refer to?
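
One way this could look, as a sketch only. The function name `validate_vector_dims` and the message wording are hypothetical; the dimension rules (positive length, even dimensions for int4, at least 64 for bbq) and the documentation URL are the ones mentioned elsewhere in this PR.

```python
DENSE_VECTOR_DOCS = (
    "https://www.elastic.co/docs/reference/elasticsearch/"
    "mapping-reference/dense-vector"
)


def validate_vector_dims(index_type: str, vector_field_length: int) -> None:
    """Raise ValueError when the dimension count is incompatible with index_type."""
    if vector_field_length <= 0:
        raise ValueError(
            f"vector_field_length must be positive, got {vector_field_length}"
        )
    # int4 quantization requires an even number of dimensions.
    if "int4" in index_type and vector_field_length % 2 != 0:
        raise ValueError(
            f"int4 quantization ('{index_type}') requires an even number of "
            f"dimensions, got {vector_field_length}. See {DENSE_VECTOR_DOCS}"
        )
    # bbq quantization requires at least 64 dimensions.
    if "bbq" in index_type and vector_field_length < 64:
        raise ValueError(
            f"bbq quantization ('{index_type}') requires at least 64 "
            f"dimensions, got {vector_field_length}. See {DENSE_VECTOR_DOCS}"
        )
```

Putting the link in a module-level constant keeps every error message pointing at the same page.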

Manisha4 · 2026-04-20T17:08:06Z

-        )
+        client = self._get_client(config)
+        if not client.indices.exists(index=table.name):
+            client.indices.create(index=table.name, mappings=index_mapping)


Can we add a log in the else branch of the code above, since you're now checking whether the index exists?

    if not client.indices.exists(index=table.name):
        client.indices.create(index=table.name, mappings=index_mapping)
    else:
        logger.info(
            f"Index '{table.name}' already exists; skipping creation. "
            f"To apply mapping changes, delete the index first."
        )
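
A runnable sketch of the same idea, using a small in-memory stand-in for the Elasticsearch client so it can be exercised without a cluster. `ensure_index`, `_FakeIndices`, and `_FakeClient` are hypothetical names introduced here; the `indices.exists` and `indices.create` calls mirror the ones in the diff.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def ensure_index(client: Any, index_name: str, index_mapping: Dict[str, Any]) -> bool:
    """Create the index if missing. Return True if it was created."""
    if client.indices.exists(index=index_name):
        logger.info(
            "Index '%s' already exists; skipping creation. "
            "To apply mapping changes, delete the index first.",
            index_name,
        )
        return False
    client.indices.create(index=index_name, mappings=index_mapping)
    return True


class _FakeIndices:
    """In-memory stand-in for client.indices, for illustration only."""

    def __init__(self) -> None:
        self.created: Dict[str, Dict[str, Any]] = {}

    def exists(self, index: str) -> bool:
        return index in self.created

    def create(self, index: str, mappings: Dict[str, Any]) -> None:
        self.created[index] = mappings


class _FakeClient:
    def __init__(self) -> None:
        self.indices = _FakeIndices()


fake = _FakeClient()
first = ensure_index(fake, "driver_stats", {"properties": {}})
second = ensure_index(fake, "driver_stats", {"properties": {"x": {"type": "keyword"}}})
```

The second call leaves the original mapping untouched, which is exactly the situation the log message is meant to surface.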

Manisha4 · 2026-04-20T17:11:35Z

    val_proto = ValueProto()
    if isinstance(value, ValueProto):
        return value
+    # Check bool before int/float since bool is a subclass of int in Python


Can you add a test for this change?
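
A sketch of what such a test could assert. It uses a simplified stand-in, `classify_scalar`, instead of the real conversion function, and the returned labels are illustrative only.

```python
def classify_scalar(value: object) -> str:
    """Simplified stand-in for the type dispatch under review."""
    # bool must be checked first: isinstance(True, int) is True in Python.
    if isinstance(value, bool):
        return "bool_val"
    if isinstance(value, int):
        return "int64_val"
    if isinstance(value, float):
        return "double_val"
    raise TypeError(f"Unsupported type: {type(value).__name__}")


def test_bool_is_not_treated_as_int() -> None:
    assert isinstance(True, int)  # the reason ordering matters
    assert classify_scalar(True) == "bool_val"
    assert classify_scalar(False) == "bool_val"
    assert classify_scalar(1) == "int64_val"
    assert classify_scalar(1.0) == "double_val"


test_bool_is_not_treated_as_int()
```

If the `bool` branch were moved below the `int` branch, the second and third assertions would fail, so the test pins down the ordering.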

- Enforce (1.0, 10.0) exclusive range per ES documentation
- Use None as single sentinel value (remove 0 confusion)
- Add validation for None vector_index_type with rescore
- Apply ruff formatting

Fixes:
- https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/dense-vector
- Addresses review comment about dual sentinel values

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Updated the elastic search to use vector_index defined in the feature view to identify vector fields (#348)
  * updated the elastic search to use vector_index defined in the feature view to identify vector fields
  * fix: formatting
  * Added logging and switched to use open source elastic search
  Co-authored-by: vanitabhagwat <vbhagwat@expediagroup.com>
* fix: ES integration tests (#350)
  * fix: ES integration tests
  * fix: Added fromisoformat() for converting timestamps
  Co-authored-by: vanitabhagwat <vbhagwat@expediagroup.com>
* fix: Elasticsearch online store - correctness, performance, and robustness fixes (#353)
  Co-authored-by: vanitabhagwat <vbhagwat@expediagroup.com>
* Feature/vector store (#357)
  * feat: Valkey Online Write Batch Vector Search Support (#351)
    * Adding support for Valkey Search, adding changes to the online_write_batch functionality
    * Addressing PR comments
    * addressing linting error
    * fix tests
    * addressing PR comments
    * addressing PR comments
    * fixing linting
    Co-authored-by: Manisha4 <Manisha4@github.com>
  * feat: Support Vector Search in Valkey (#354)
    * Adding support for Valkey Search, adding changes to the online_write_batch functionality
    * Addressing PR comments
    * addressing linting error
    * Adding changes to support search in valkey
    * fix tests
    * adding unit tests
    * reformatting files and adding checks and more tests
    * reformatting files and adding checks and more tests
    * reformatting files and adding checks and more tests
    * Fix linter errors: type annotations and code formatting
      - Add explicit type annotation for schema_fields to support both TagField and VectorField
      - Encode project string to bytes for consistency with other hash values
      - Decode doc_key bytes to string for hmget compatibility
      - Fix code formatting: break long lines and remove extra blank lines
      - Remove tests for multiple vector fields (Feast enforces one vector per feature view)
      - Fix config type: use 'eg-valkey' (hyphen) not 'eg_valkey' (underscore)
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    * addressing PR comments
    * addressing PR comments
    * fixing linting
    * Fix missing feature_name argument in retrieve_online_documents_v2
      Add the third argument (vector_field.name) to _get_vector_index_name call to match the updated function signature.
      Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
    * addressing comments, PR changes for some fixes and merge conflicts
    * fixing tests
    * fixing tests
    * fixing linting
    * fixing linting
    Co-authored-by: Manisha4 <Manisha4@github.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
  * fix: Valkey vector search - remove unsupported SORTBY (#356)
    * fix: Valkey vector search - remove unsupported SORTBY and fix tag filter syntax
      Valkey Search KNN queries return results pre-sorted by distance, so explicit SORTBY is not supported and causes a ResponseError. This removes the .sort_by() call from the query builder.
      Additionally, fixes the project tag filter to use unquoted syntax with backslash escaping for special characters (e.g. hyphens, dots) instead of the quoted syntax which was returning empty results.
      Updates unit tests to reflect both changes: replaces three metric-specific sort order tests with a single test asserting no SORTBY is set, and updates escaping assertions to match the new backslash-escape approach.
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    * style: apply ruff format to eg_valkey.py and test_valkey.py
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    Co-authored-by: Manisha4 <Manisha4@github.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
  Co-authored-by: Manisha4 <Manisha4@github.com>
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: Quantization support for elastic search (#355)
* feat: Add retrieve_online_documents_v3 SDK with multi-vector and hybrid fusion (#358)
  * feat: implement retrieve_online_documents_v3 SDK method
    - Multi-vector search with configurable fusion (RRF, WEIGHTED_LINEAR, VECTOR_ONLY) via the ES retriever API. Valkey gracefully degrades to single-vector KNN with warnings.
    - "embedding" magic key for V2-to-V3 migration convenience
    - Reserved output fields: final_score, signal_scores
    - include_signal_scores and distance_metric accepted as reserved params
    - ODFV and reserved-name collision validation
    - Shared signal_scores encoding via _signal_scores helper
  * update tests
  * update tests
  * fixing linting
  * docs: clarify final_score semantics in Valkey V3 docstring
    Correct the Valkey final_score description: Valkey's __distance__ is lower-is-better across all metrics (COSINE, L2, IP), not higher-is-better for IP. Call out the ordering inversion vs Elasticsearch so callers don't assume cross-backend score portability.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * fix: plumb include_signal_scores through V3 and align defaults to False
    Valkey/ES/provider/passthrough/online-store defaults were True, mismatching the SDK's False default. Align all layers on False and thread the parameter from retrieve_online_documents_v3 through the internal dispatcher, provider, and online stores so callers can opt in today and transparently pick up the explain-based per-signal path when it lands, with no API change required.
    Tighten docstrings to describe the current best-effort behavior instead of hinting at latency tradeoffs that aren't wired yet.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  * updating doc string
  * fix: preserve ranked row order in V3 retrieve_online_documents
    _retrieve_from_online_store_v3 was passing the driver's ranked rows through _get_unique_entities_from_values, which sorts and dedupes by entity-key bytes. That helper is correct for batch entity lookups but wrong here: ES/Valkey have already ordered rows by relevance, and the sort was scrambling them in the final DataFrame (e.g. doc_10 jumping ahead of doc_3 because "10" < "3" lexicographically).
    Replace the helper call with an identity mapping so the driver's rank order flows through untouched. No change to V1, V2, batch reads, or the helper itself; V3 output now matches the order returned by the online store.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
  Co-authored-by: Manisha4 <Manisha4@github.com>
  Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* fix: apply rescore_oversample to V3 ES kNN retrievers (#363)
  V3's retriever construction was silently ignoring rescore_oversample. V2 honored it (lines 483-486, 617-620) but V3 never added rescore_vector to its kNN clauses. On quantized indices (int8_hnsw / int4_hnsw / bbq_hnsw), this meant V3 queries returned lower recall than the config promised, with no error or warning.
  Wire rescore_oversample into each kNN retriever the same way V2 does. Covers single-vector and multi-vector V3 queries; BM25 retrievers skip the branch since they lack a "knn" key. Existing config validation (lines 102-127) already prevents rescore on non-quantized indices, so no new validation needed.
  Added three unit tests in TestRetrieveOnlineDocumentsV3QueryBuilding:
  - rescore_vector appears in single-vector query body when configured
  - rescore_vector appears on every kNN retriever in multi-vector query
  - rescore_vector absent when rescore_oversample is None
  Co-authored-by: Manisha4 <Manisha4@github.com>
  Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: vanitabhagwat <92561664+vanitabhagwat@users.noreply.github.com>
Co-authored-by: vanitabhagwat <vbhagwat@expediagroup.com>
Co-authored-by: Manisha4 <Manisha4@github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
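
The rescore wiring described in the last commit above can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: `apply_rescore_to_retrievers` is a hypothetical helper, and the retriever dictionaries are simplified.

```python
from typing import Any, Dict, List, Optional


def apply_rescore_to_retrievers(
    retrievers: List[Dict[str, Any]],
    rescore_oversample: Optional[float],
) -> List[Dict[str, Any]]:
    """Add rescore_vector to every kNN retriever, in place; skip the others."""
    if rescore_oversample is None:
        return retrievers
    for retriever in retrievers:
        knn = retriever.get("knn")
        if knn is not None:
            # Only kNN retrievers carry a "knn" key; BM25 ones are left as-is.
            knn["rescore_vector"] = {"oversample": rescore_oversample}
    return retrievers


retrievers = [
    {"knn": {"field": "title_vec", "query_vector": [0.1, 0.2], "k": 5, "num_candidates": 50}},
    {"standard": {"query": {"match": {"title": "hotel"}}}},
    {"knn": {"field": "body_vec", "query_vector": [0.3, 0.4], "k": 5, "num_candidates": 50}},
]
apply_rescore_to_retrievers(retrievers, 2.0)
```

After the call, both kNN retrievers carry `rescore_vector` and the `standard` retriever is unchanged, which mirrors the three unit-test cases listed in the commit message.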

Manisha4 and others added 3 commits April 6, 2026 12:12

vanitabhagwat force-pushed the feature/int8-quantization-es branch from b91534a to d1ebd93 Compare April 9, 2026 01:07

vanitabhagwat changed the title ~~Added quantization support~~ fix: Quantization support Apr 9, 2026

vanitabhagwat changed the title ~~fix: Quantization support~~ fix: Quantization support for elastic search Apr 9, 2026

vanitabhagwat added 5 commits April 8, 2026 18:22

fix: Add type guard for feature_value in retrieve_online_documents

8f57538

- Fix mypy error: Argument 1 to b64decode has incompatible type
- Add None check before decoding feature_value
- Ensures type safety without changing behavior

style: Fix ruff linting issues in elasticsearch tests

ea0f206

- Remove unused patch import
- Fix import ordering in test methods
- No functional changes, all tests passing

style: Apply ruff formatting to elasticsearch.py

5912a5a

- Auto-format with ruff for consistent code style
- No functional changes, all tests passing

validation fixes

b0cb271

fix: formatting

a5090bf

vanitabhagwat changed the title ~~fix: Quantization support for elastic search~~ feat: Quantization support for elastic search Apr 17, 2026

Manisha4 reviewed Apr 20, 2026

vanitabhagwat and others added 3 commits April 27, 2026 00:10

Address review comments

16dc062

updated rescore boundaries

cdc2b88

vanitabhagwat changed the base branch from feature/vector-store to ess-vector-store April 28, 2026 17:42

Merge branch 'ess-vector-store' into feature/int8-quantization-es

ae3b406

Manisha4 approved these changes Apr 28, 2026

vanitabhagwat merged commit 23bfd01 into ess-vector-store Apr 28, 2026
27 of 29 checks passed

vanitabhagwat merged 12 commits into ess-vector-store from feature/int8-quantization-es
