Move benchmarks to daily cron #1302
Merged
jsturtevant merged 5 commits into hyperlight-dev:main on Mar 11, 2026
Delete Benchmarks.yml and add its features (artifact upload, baseline_tag, baseline_run_id, retention_days inputs) to dep_benchmarks.yml. Update CreateRelease.yml to call dep_benchmarks.yml with a matrix directly. Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Remove the benchmarks job from ValidatePullRequest.yml and add a new DailyBenchmarks.yml workflow that runs benchmarks daily, comparing against the previous day's run artifacts with 90-day retention. Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Replace references to per-PR benchmarks and Benchmarks.yml with the new DailyBenchmarks.yml and dep_benchmarks.yml workflows. Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
simongdavies (Contributor) previously approved these changes on Mar 11, 2026 and left a comment:
LGTM, I fired off a Copilot review for backup
Pull request overview
Moves benchmark execution out of PR validation and into a scheduled daily workflow, while reworking the benchmark workflow into a reusable component that can compare against either the prior day’s artifacts or a release baseline.
Changes:
- Add `DailyBenchmarks.yml` scheduled workflow that performs day-over-day comparisons using prior run artifacts.
- Update `dep_benchmarks.yml` to support baseline selection (previous run vs. release tag) and upload benchmark results as artifacts with configurable retention.
- Remove benchmarks from `ValidatePullRequest.yml`, switch release benchmarking to the reusable workflow, and update docs/remove legacy workflow.
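For readers unfamiliar with the pattern, a scheduled caller workflow of this kind could look roughly like the sketch below. It is not the `DailyBenchmarks.yml` merged in this PR: the cron time, the job names, the `gh run list` lookup of the previous successful run, and the input types are all assumptions made for illustration. Only the file names and the `baseline_run_id` / `retention_days` input names come from the PR.

```yaml
# Hypothetical sketch of a daily caller workflow; NOT the file merged in this PR.
name: Daily Benchmarks (sketch)

on:
  schedule:
    - cron: "0 4 * * *"    # assumed time of day; cron schedules are evaluated in UTC
  workflow_dispatch: {}     # also allow manual runs

permissions:
  actions: read             # needed to list earlier runs and read their artifacts
  contents: read

jobs:
  find-baseline:            # assumed job name
    runs-on: ubuntu-latest
    outputs:
      run_id: ${{ steps.prev.outputs.run_id }}
    steps:
      - id: prev
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Most recent successful run of this workflow; the current run is
          # still in progress, so the status filter excludes it.
          run_id=$(gh run list \
            --repo "$GITHUB_REPOSITORY" \
            --workflow DailyBenchmarks.yml \
            --status success \
            --limit 1 \
            --json databaseId \
            --jq '.[0].databaseId // empty')
          echo "run_id=$run_id" >> "$GITHUB_OUTPUT"

  benchmarks:
    needs: find-baseline
    uses: ./.github/workflows/dep_benchmarks.yml
    with:
      # Empty on the very first run, when there is no earlier run to compare with.
      baseline_run_id: ${{ needs.find-baseline.outputs.run_id }}
      retention_days: 90
```

Looking up the baseline in a separate job keeps the reusable workflow unaware of where the run ID came from, so a release workflow could pass a tag instead.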
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| docs/benchmarking-hyperlight.md | Updates documentation to reflect daily scheduled benchmarks and revised release workflow usage. |
| .github/workflows/dep_benchmarks.yml | Expands reusable benchmark runner to support baseline download and artifact retention/upload. |
| .github/workflows/ValidatePullRequest.yml | Removes per-PR benchmark job from PR validation pipeline. |
| .github/workflows/DailyBenchmarks.yml | Adds daily scheduled benchmarking + baseline discovery + failure notification. |
| .github/workflows/CreateRelease.yml | Switches release benchmarking to use the reusable workflow with a matrix. |
| .github/workflows/Benchmarks.yml | Deletes the legacy benchmarks workflow. |
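As a rough illustration of the "baseline download and artifact retention/upload" row above, a reusable workflow with these inputs might be shaped like the sketch below. This is an assumption-laden outline, not the merged `dep_benchmarks.yml`: the input types and defaults, the runner, the artifact name `benchmark-results`, the `baseline/` and `results/` paths, the release asset pattern, and the placeholder benchmark step are all hypothetical. Only the three input names are taken from the PR.

```yaml
# Hypothetical sketch of a reusable benchmark workflow; NOT the merged dep_benchmarks.yml.
on:
  workflow_call:
    inputs:
      baseline_tag:          # release tag to compare against (type assumed)
        type: string
        required: false
        default: ""
      baseline_run_id:       # earlier workflow run to compare against (type assumed)
        type: string
        required: false
        default: ""
      retention_days:        # how long to keep uploaded results (type assumed)
        type: number
        required: false
        default: 90

jobs:
  benchmark:
    runs-on: ubuntu-latest   # assumed runner
    steps:
      - uses: actions/checkout@v4

      - name: Download baseline from a previous run
        if: ${{ inputs.baseline_run_id != '' }}
        uses: actions/download-artifact@v4
        with:
          name: benchmark-results          # assumed artifact name
          path: baseline
          run-id: ${{ inputs.baseline_run_id }}
          github-token: ${{ github.token }} # required when downloading from another run

      - name: Download baseline from a release
        if: ${{ inputs.baseline_run_id == '' && inputs.baseline_tag != '' }}
        env:
          GH_TOKEN: ${{ github.token }}
          BASELINE_TAG: ${{ inputs.baseline_tag }}
        # The asset pattern is a placeholder; real asset names are not given in the PR.
        run: gh release download "$BASELINE_TAG" --pattern 'benchmarks_*' --dir baseline

      - name: Run benchmarks
        # Placeholder: the project's actual benchmark command goes here.
        run: echo "run benchmarks and write output to results/"

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results          # assumed artifact name
          path: results/
          retention-days: ${{ inputs.retention_days }}
```

Passing the tag through an environment variable rather than interpolating it into the script avoids shell injection through workflow inputs.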
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
jsturtevant
reviewed
Mar 11, 2026
jsturtevant
previously approved these changes
Mar 11, 2026
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
jsturtevant
approved these changes
Mar 11, 2026
Running benchmarks on every PR is slow (>30 min) and will only get slower with each added benchmark. Furthermore, I don't think most PR authors look at the results anyway.
They now run daily and compare against the previous day's results. Unfortunately, the artifact retention period is capped at 90 days, so this may be worth looking into in another PR (e.g. saving results to a different branch or something).