This document provides a high-level introduction to the Feast feature store repository, explaining its purpose, architecture, and core components. Feast is an open-source feature store for machine learning that provides consistent offline (training) and online (serving) feature access.
For detailed information on specific topics:
Sources: README.md26-36 docs/getting-started/quickstart.md1-16
Feast (Feature Store) is an open-source feature store for machine learning that manages existing infrastructure to productionize analytic data for model training and online inference.
For Data Scientists: A tool to define, store, and retrieve features for both model development and deployment without software engineering concerns.
For MLOps Engineers: A library that connects existing infrastructure (databases, application servers, analytical databases, orchestration tools) enabling data scientists to ship features to production via a friendly SDK.
For Data Engineers: A centralized catalog for feature definitions providing a single source of truth, with abstraction for reading/writing to multiple offline and online data stores.
For AI Engineers: A platform to scale AI applications by enabling seamless integration of richer data and facilitating fine-tuning with optimized data pipelines.
Feast solves critical problems:
Sources: README.md29-36 docs/getting-started/quickstart.md4-14
The Feast repository is organized into several main directories:
| Directory | Purpose | Key Components |
|---|---|---|
| sdk/python/feast/ | Python SDK and core logic | FeatureStore, Registry, Provider |
| sdk/python/feast/infra/ | Infrastructure abstractions | Offline stores, online stores, compute engines |
| protos/ | Protocol buffer definitions | API contracts for cross-language compatibility |
| infra/ | Deployment infrastructure | Docker images, Helm charts, Kubernetes operator |
| ui/ | Web UI (React) | Feature browsing and visualization |
| docs/ | Documentation | User guides, API references |
| examples/ | Example implementations | Quickstart, tutorials |
Sources: sdk/python/feast/feature_store.py1-100 docs/reference/codebase-structure.md1-30
Sources: sdk/python/feast/feature_store.py100-203 sdk/python/feast/infra/provider.py49-62 sdk/python/feast/repo_config.py193-257
The FeatureStore class in sdk/python/feast/feature_store.py100-189 is the primary interface for all Feast operations:
Key Attributes:

- config: RepoConfig - Configuration for the feature store
- repo_path: Path - Path to the feature repository
- _registry: BaseRegistry - Registry for metadata storage
- _provider: Provider - Provider for infrastructure operations

Initialization Flow:

- Configuration comes from feature_store.yaml or an explicit RepoConfig
- The registry is selected by registry_type (file, SQL, Snowflake, or remote)
- The provider (PassthroughProvider by default) is chosen based on the provider setting

Sources: sdk/python/feast/feature_store.py116-178 sdk/python/feast/repo_config.py255-296
Sources: sdk/python/feast/feature_store.py821-1007 sdk/python/feast/infra/passthrough_provider.py58-130 sdk/python/feast/repo_operations.py399-415
Feast uses several core objects to define features:
- Entity sdk/python/feast/entity.py (e.g. driver_id, customer_id)
- FeatureView sdk/python/feast/feature_view.py (schema made of Field objects)
- OnDemandFeatureView sdk/python/feast/on_demand_feature_view.py
- StreamFeatureView sdk/python/feast/stream_feature_view.py
- FeatureService sdk/python/feast/feature_service.py

Sources: sdk/python/feast/entity.py sdk/python/feast/feature_view.py sdk/python/feast/on_demand_feature_view.py sdk/python/feast/stream_feature_view.py
The Provider interface sdk/python/feast/infra/provider.py49-105 abstracts infrastructure operations. The default implementation is PassthroughProvider sdk/python/feast/infra/passthrough_provider.py58-130 which delegates to:
Offline Stores - Historical feature retrieval

- OfflineStore sdk/python/feast/infra/offline_stores/offline_store.py
- get_historical_features() for point-in-time joins

Online Stores - Low-latency feature serving

- OnlineStore sdk/python/feast/infra/online_stores/online_store.py35-150
- online_write_batch(), online_read()

Batch Engines - Scalable materialization

- ComputeEngine sdk/python/feast/infra/compute_engines/base.py
- materialize() for offline-to-online data transfer

Sources: sdk/python/feast/repo_config.py39-106 sdk/python/feast/infra/provider.py533-545 sdk/python/feast/infra/passthrough_provider.py58-130
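The delegation pattern described above can be sketched in plain Python. This is a toy illustration, not Feast's implementation: `InMemoryOnlineStore` and `ToyPassthroughProvider` are hypothetical names, and the method signatures are deliberately simplified compared with the real `OnlineStore` interface.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryOnlineStore:
    """Hypothetical online store: keeps the latest values per (table, entity key)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], Dict[str, float]] = {}

    def online_write_batch(
        self, table: str, rows: List[Tuple[str, Dict[str, float]]]
    ) -> None:
        # Later writes for the same entity key overwrite earlier ones.
        for entity_key, values in rows:
            self._rows[(table, entity_key)] = values

    def online_read(
        self, table: str, entity_keys: List[str]
    ) -> List[Optional[Dict[str, float]]]:
        # Unknown keys yield None so the output lines up with the input order.
        return [self._rows.get((table, key)) for key in entity_keys]


class ToyPassthroughProvider:
    """Hypothetical provider that forwards each call to its store unchanged."""

    def __init__(self, online_store: InMemoryOnlineStore) -> None:
        self.online_store = online_store

    def online_write_batch(self, table, rows):
        self.online_store.online_write_batch(table, rows)

    def online_read(self, table, entity_keys):
        return self.online_store.online_read(table, entity_keys)


provider = ToyPassthroughProvider(InMemoryOnlineStore())
provider.online_write_batch("driver_stats", [("driver_1001", {"conv_rate": 0.5})])
result = provider.online_read("driver_stats", ["driver_1001", "driver_9999"])
```

Because the provider adds no logic of its own, swapping the store object is enough to change the backing infrastructure in this sketch.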
The Registry manages all feature store metadata including feature views, entities, data sources, and infrastructure state.
Registry Types:
| Type | Implementation | Storage | Use Case |
|---|---|---|---|
| file | Registry | Local/S3/GCS file | Development, single-user |
| sql | SqlRegistry | PostgreSQL/MySQL | Production, multi-user |
| snowflake.registry | SnowflakeRegistry | Snowflake table | Snowflake-native deployments |
| remote | RemoteRegistry | Remote gRPC service | Distributed deployments |
Key Operations:

- apply_entity(), apply_feature_view() - Register objects
- get_entity(), get_feature_view() - Retrieve objects
- list_entities(), list_feature_views() - List objects
- refresh() - Reload registry cache
- commit() - Persist changes

Sources: sdk/python/feast/feature_store.py155-177 sdk/python/feast/infra/registry/base_registry.py sdk/python/feast/repo_config.py39-44
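To make the apply / commit / refresh cycle concrete, here is a minimal sketch of a file-backed registry. `ToyFileRegistry` is a hypothetical class written for this page; it borrows the operation names listed above but stores plain JSON purely for illustration, which is an assumption and not how Feast serializes its registry.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


class ToyFileRegistry:
    """Hypothetical registry: in-memory cache backed by a JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._entities: Dict[str, dict] = {}
        self.refresh()

    def apply_entity(self, name: str, join_key: str) -> None:
        # Changes only touch the in-memory cache until commit() is called.
        self._entities[name] = {"name": name, "join_key": join_key}

    def get_entity(self, name: str) -> dict:
        return self._entities[name]

    def list_entities(self) -> List[dict]:
        return sorted(self._entities.values(), key=lambda e: e["name"])

    def commit(self) -> None:
        # Persist the cached state to the backing file.
        self.path.write_text(json.dumps(self._entities))

    def refresh(self) -> None:
        # Reload the cache from the backing file, if it exists yet.
        if self.path.exists():
            self._entities = json.loads(self.path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    registry_path = Path(tmp) / "registry.json"
    writer = ToyFileRegistry(registry_path)
    writer.apply_entity("driver", join_key="driver_id")
    writer.commit()

    # A second client only sees state that has been committed.
    reader = ToyFileRegistry(registry_path)
    entity_names = [e["name"] for e in reader.list_entities()]
```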
Feast is configured via feature_store.yaml which is loaded by RepoConfig sdk/python/feast/repo_config.py193-296:
Core Configuration:
Configuration is loaded by:

- load_repo_config() sdk/python/feast/repo_config.py
- import_class()

Sources: sdk/python/feast/repo_config.py193-296 examples/quickstart/quickstart.ipynb README.md108-117
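Loading a store implementation by name at configuration time generally comes down to a dynamic import. The sketch below shows that technique with the standard library only. `load_class`, `resolve`, and the `STORE_TYPES` mapping are hypothetical stand-ins, and the idea that short type names map to dotted class paths is an assumption made for illustration; consult repo_config.py for the actual behavior of import_class().

```python
import importlib
from typing import Any

# Hypothetical mapping from a short config value to a dotted class path.
STORE_TYPES = {"ordered": "collections.OrderedDict"}


def load_class(module_name: str, class_name: str) -> Any:
    """Import a module by name and return the named attribute from it."""
    module = importlib.import_module(module_name)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(f"{module_name} has no class {class_name!r}") from exc


def resolve(store_type: str) -> Any:
    """Accept either a known short name or a full dotted class path."""
    dotted = STORE_TYPES.get(store_type, store_type)
    module_name, class_name = dotted.rsplit(".", 1)
    return load_class(module_name, class_name)


store_cls = resolve("ordered")          # short name from the mapping
custom_cls = resolve("decimal.Decimal")  # full dotted path, used as-is
```

Resolving classes lazily like this means optional backends only need to be importable when a configuration actually selects them.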
The Feast CLI sdk/python/feast/cli.py provides commands for managing the feature store:
Repository Management:

- feast init <project> - Initialize a new feature repository
- feast apply - Register feature definitions and update infrastructure
- feast plan - Preview changes without applying them
- feast teardown - Remove all infrastructure

Feature Materialization:

- feast materialize <start> <end> - Materialize features for a time range
- feast materialize-incremental <end> - Materialize since last materialization

Feature Serving:

- feast serve - Start Python feature server
- feast ui - Start web UI for feature browsing

Inspection:

- feast feature-views list - List all feature views
- feast entities list - List all entities
- feast feature-services list - List all feature services

Sources: sdk/python/feast/repo_operations.py223-240 docs/reference/feast-cli-commands.md1-30 README.md45-128

- Feature definitions live in a Python file (feature_definitions.py)
- feast apply to register with registry

Implementation: sdk/python/feast/feature_store.py821-1007 sdk/python/feast/repo_operations.py399-415

- fs.get_historical_features(entity_df, features)
- Returns a RetrievalJob that can be converted to a dataframe

Implementation: sdk/python/feast/feature_store.py1356-1643
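Historical retrieval relies on a point-in-time join: for each entity row, take the most recent feature value at or before that row's timestamp so that no future data leaks into training. The following is a minimal pure-Python sketch of that idea, not Feast's implementation. The function name, the `driver_id` and `conv_rate` columns, and the `ttl` cutoff handling are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def point_in_time_join(
    entity_rows: List[dict], feature_rows: List[dict], ttl: timedelta
) -> List[dict]:
    """Attach to each entity row the newest feature value not in its future."""
    by_key: Dict[int, List[dict]] = {}
    for row in feature_rows:
        by_key.setdefault(row["driver_id"], []).append(row)

    joined = []
    for entity in entity_rows:
        best = None
        for candidate in by_key.get(entity["driver_id"], []):
            age = entity["event_timestamp"] - candidate["event_timestamp"]
            # Skip values from the future and values older than the ttl window.
            if timedelta(0) <= age <= ttl:
                if best is None or candidate["event_timestamp"] > best["event_timestamp"]:
                    best = candidate
        joined.append({**entity, "conv_rate": best["conv_rate"] if best else None})
    return joined


features = [
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1, 10), "conv_rate": 0.1},
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1, 12), "conv_rate": 0.2},
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1, 14), "conv_rate": 0.3},
]
entities = [
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1, 13)},
    {"driver_id": 1002, "event_timestamp": datetime(2024, 1, 1, 13)},
]
training_rows = point_in_time_join(entities, features, ttl=timedelta(days=1))
```

For driver 1001 at 13:00 the join picks the 12:00 value and ignores the 14:00 one, since it lies in the future relative to the entity row. Driver 1002 has no feature rows and gets `None`.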

- feast materialize or materialize_incremental()
- Values are written to the online store through online_write_batch()

Implementation: sdk/python/feast/feature_store.py1644-1968 sdk/python/feast/infra/passthrough_provider.py376-534
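Conceptually, materialization copies the latest value per entity within a time range from the offline store into the online store. The sketch below illustrates that with plain dictionaries. The `materialize` function here is a hypothetical helper, the column names are invented, and treating both ends of the range as inclusive is an assumption rather than a statement about Feast's behavior.

```python
from datetime import datetime
from typing import Dict, List


def materialize(
    offline_rows: List[dict],
    online_store: Dict[int, dict],
    start: datetime,
    end: datetime,
) -> int:
    """Write the newest in-range row per entity into the online store."""
    latest: Dict[int, dict] = {}
    for row in offline_rows:
        ts = row["event_timestamp"]
        if not (start <= ts <= end):
            continue  # outside the requested window (inclusive bounds assumed)
        key = row["driver_id"]
        if key not in latest or ts > latest[key]["event_timestamp"]:
            latest[key] = row

    for key, row in latest.items():
        online_store[key] = {
            "conv_rate": row["conv_rate"],
            "event_timestamp": row["event_timestamp"],
        }
    return len(latest)


offline_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1), "conv_rate": 0.1},
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 2), "conv_rate": 0.2},
    {"driver_id": 1002, "event_timestamp": datetime(2024, 1, 5), "conv_rate": 0.9},
]
online_store: Dict[int, dict] = {}
written = materialize(
    offline_rows, online_store, start=datetime(2024, 1, 1), end=datetime(2024, 1, 3)
)
```

An incremental variant would use the end of the previous run as the next `start`, so each run only scans new data.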

- fs.get_online_features(features, entity_rows)
- online_read()
- OnlineResponse with feature values

Implementation: sdk/python/feast/feature_store.py1969-2331
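Online retrieval is essentially a keyed lookup followed by assembling a column-oriented response. The sketch below mimics that shape with a dictionary as the online store. The function is a hypothetical stand-in that reuses the name for readability, and bare feature names are used as a simplification; it does not reproduce the real `OnlineResponse` type.

```python
from typing import Dict, List, Optional


def get_online_features(
    online_store: Dict[int, Dict[str, float]],
    features: List[str],
    entity_rows: List[Dict[str, int]],
) -> Dict[str, List[Optional[float]]]:
    """Look up each entity and return one list per column, in request order."""
    response: Dict[str, list] = {"driver_id": []}
    for name in features:
        response[name] = []

    for entity in entity_rows:
        key = entity["driver_id"]
        stored = online_store.get(key, {})
        response["driver_id"].append(key)
        for name in features:
            # Missing entities or features come back as None.
            response[name].append(stored.get(name))
    return response


store = {1001: {"conv_rate": 0.2, "acc_rate": 0.9}}
online_result = get_online_features(
    store,
    features=["conv_rate", "acc_rate"],
    entity_rows=[{"driver_id": 1001}, {"driver_id": 1002}],
)
```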
Sources: sdk/python/feast/feature_store.py sdk/python/feast/infra/passthrough_provider.py
Feast provides a complete feature store solution with:

- FeatureStore class for all operations

The architecture enables teams to decouple ML workflows from infrastructure while maintaining consistency between training and serving environments.
For more details on specific components, refer to the related wiki sections linked at the beginning of this document.
Sources: README.md26-41 sdk/python/feast/feature_store.py100-203 docs/getting-started/quickstart.md1-60