Context
The current GitHub Actions setup gives us good coverage, but it is doing more work than we probably need on every PR and master push. Recent master runs show the largest costs are in the master-only integration/build workflow, with full Python integration across 3.10/3.11/3.12 and a very slow feature-server-dev Docker build. PR CI also has some low-signal jobs that look redundant with other checks.
Observations
- unit-tests runs the full Python unit suite across Python 3.10, 3.11, and 3.12 on Ubuntu, plus 3.11 and 3.12 on macOS.
- Several files under sdk/python/tests/unit behave more like local functional/integration tests: subprocess CLI calls, feast init/apply, Spark setup, Docker/testcontainers, feature server startup, etc.
- smoke_tests.yml installs full CI dependencies on Python 3.10/3.11/3.12 just to import feast.cli.
- check_skip_tests.yml exists as a reusable docs/examples/community skip gate, but the main unit/linter/smoke workflows do not call it.
- master_only.yml runs full integration plus benchmark upload across the same Python version matrix.
- The feature-server-dev Docker build is the master workflow long pole and should get its own cache/layering/toolchain audit.
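To make the second observation easier to act on, a rough inventory of which unit-test files look functional could be produced with a small script. The sketch below is only a heuristic: the `FUNCTIONAL_HINTS` module list and the function names are assumptions made for illustration, and importing one of these modules does not prove that a test is slow or needs external services.

```python
import ast
from pathlib import Path

# Assumption: importing any of these modules hints that a test does more than
# pure in-process unit work. The list is illustrative, not exhaustive.
FUNCTIONAL_HINTS = {"subprocess", "testcontainers", "docker", "pyspark"}


def functional_hints(source: str) -> set[str]:
    """Return the hint modules imported anywhere in the given Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have no module name.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in FUNCTIONAL_HINTS:
                found.add(top_level)
    return found


def scan(test_root: str) -> dict[str, set[str]]:
    """Map each test file under test_root to the hint modules it imports."""
    report = {}
    for path in sorted(Path(test_root).rglob("test_*.py")):
        hints = functional_hints(path.read_text(encoding="utf-8"))
        if hints:
            report[str(path)] = hints
    return report
```

Calling `scan("sdk/python/tests/unit")` from the repository root would return a starting list of candidates to review by hand. Tests that shell out or start servers through shared fixtures would not be caught by an import scan alone.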
Suggested Cleanup Tracks
- Low-risk CI hygiene
  - Collapse or remove the redundant smoke workflow.
  - Add path/doc-only skipping for low-risk workflows.
  - Move benchmark publishing out of blocking master integration, or limit it to one Python version.
- Test-suite restructuring
  - Keep pure unit tests as the default fast PR gate.
  - Move or mark local functional tests separately: CLI subprocess, feature repo apply/init, Spark, Docker/testcontainers, and feature server startup.
  - Run that local functional slice on one Python/OS combination.
- Matrix policy
  - Keep full unit coverage on one primary Python version.
  - Use smaller compatibility smoke/subset checks for other Python versions and macOS.
  - Run full compatibility coverage nightly or on release-sensitive paths.
- Master integration/build improvements
  - Consider full integration on Python 3.11, with smaller compatibility coverage for 3.10/3.12.
  - Split benchmark collection into a scheduled or non-blocking job.
  - Audit feature-server-dev Docker build caching, Node version alignment, image context, and layer ordering.
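One way to picture the matrix policy track is a small helper that a setup job could run to emit a dynamic matrix, which a later job reads through `fromJSON`. This is a sketch under assumptions, not a proposal for the final design: using 3.11 as the primary unit-test version, the `full` and `smoke` suite labels, the runner labels, and the function names are all hypothetical. Only the Python/OS combinations come from the current unit-tests matrix described above.

```python
import json

# Assumptions for illustration: 3.11 as the primary version and the suite
# labels "full" and "smoke" are hypothetical, not existing workflow inputs.
PRIMARY_PYTHON = "3.11"
UBUNTU_PYTHONS = ["3.10", "3.11", "3.12"]
MACOS_PYTHONS = ["3.11", "3.12"]


def build_matrix(event_name: str) -> list[dict[str, str]]:
    """Return matrix entries: full everywhere on schedule, trimmed otherwise."""
    nightly = event_name == "schedule"
    entries = []
    for os_name, versions in (
        ("ubuntu-latest", UBUNTU_PYTHONS),
        ("macos-latest", MACOS_PYTHONS),
    ):
        for version in versions:
            # Only the primary version on Ubuntu keeps the full suite on PRs.
            is_primary = os_name == "ubuntu-latest" and version == PRIMARY_PYTHON
            suite = "full" if nightly or is_primary else "smoke"
            entries.append(
                {"os": os_name, "python-version": version, "suite": suite}
            )
    return entries


def to_matrix_json(event_name: str) -> str:
    """Serialize entries in the {"include": [...]} shape used by a job matrix."""
    return json.dumps({"include": build_matrix(event_name)})
```

A setup step could pass `github.event_name` to `to_matrix_json` and write the result to a job output. With this shape, a pull request runs one full job and four smoke jobs, while a scheduled run keeps full coverage on all five combinations.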
This issue is intentionally broad so contributors can pick off independent pieces without needing to solve the whole CI design in one PR.