[dbsp] Always GC storage at startup. #5803

Merged
blp merged 1 commit into main from always-gc on Mar 11, 2026

Conversation

@blp (Member) commented Mar 11, 2026

Until now, Feldera ran GC at startup only if a checkpoint file existed and was readable. As a result, if a pipeline crashed or was force-stopped before its first checkpoint, then on restart it did not clear any files left in storage by previous runs. This PR fixes the problem by running GC at startup unconditionally: if we can't read the checkpoint file now, there's no reason to believe that we will be able to read it later.

I tested this manually by force-stopping a pipeline that writes to storage.
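The behavior change described above can be illustrated with a minimal sketch. This is not the code from the PR: `Storage`, `Checkpoint`, `init_storage_old`, and `init_storage_new` are hypothetical stand-ins, and the in-memory map only models the idea that files not referenced by any known checkpoint get deleted. The names `gc_startup` and `checkpoint_list` are borrowed from the review discussion, but their signatures here are assumptions.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory stand-in for a storage backend: file name -> size in bytes.
type Storage = BTreeMap<String, u64>;

/// Hypothetical checkpoint record: only the set of file names it references.
struct Checkpoint {
    files: BTreeSet<String>,
}

/// Deletes every file that no known checkpoint references and returns the
/// number of bytes still in use afterwards (assumed behavior, for illustration).
fn gc_startup(storage: &mut Storage, checkpoint_list: &[Checkpoint]) -> u64 {
    let keep: BTreeSet<&str> = checkpoint_list
        .iter()
        .flat_map(|c| c.files.iter().map(String::as_str))
        .collect();
    storage.retain(|name, _| keep.contains(name.as_str()));
    storage.values().sum()
}

/// Old behavior as described: with an empty checkpoint list, only measure usage.
fn init_storage_old(storage: &mut Storage, checkpoint_list: &[Checkpoint]) -> u64 {
    if checkpoint_list.is_empty() {
        storage.values().sum()
    } else {
        gc_startup(storage, checkpoint_list)
    }
}

/// New behavior as described: always run GC, even with an empty checkpoint list.
fn init_storage_new(storage: &mut Storage, checkpoint_list: &[Checkpoint]) -> u64 {
    gc_startup(storage, checkpoint_list)
}
```

In this model, a storage map holding only files from a run that never checkpointed is left untouched by `init_storage_old` and emptied by `init_storage_new`.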

Signed-off-by: Ben Pfaff <blp@feldera.com>
@blp requested a review from gz on Mar 11, 2026, 20:17
@blp self-assigned this on Mar 11, 2026
@blp added the labels bug (Something isn't working), storage (Persistence for internal state in DBSP operators), and rust (Pull requests that update Rust code) on Mar 11, 2026
@blp added this pull request to the merge queue on Mar 11, 2026
@mythical-fred left a comment

The fix makes sense -- if we can't read the checkpoint list, preserving orphaned files forever is wrong. But this changes startup behavior with no new test coverage.

@@ -70,15 +70,7 @@ impl Checkpointer {
}

This changes the startup behavior: previously, if checkpoint_list was empty (no checkpoint or unreadable), storage files were preserved and only measured. Now gc_startup() runs unconditionally and will delete everything it doesn't recognize as belonging to a known checkpoint.

The fix is correct in intent, but it needs a test for the specific case this addresses: pipeline force-stopped before first checkpoint -> restart -> orphaned files are cleaned up. Without it there's no regression guard if gc_startup()'s handling of an empty checkpoint list ever changes.
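As a rough illustration of the regression scenario requested above (force-stop before the first checkpoint, restart, orphaned files cleaned up), here is a self-contained sketch that uses only the Rust standard library. It does not use Feldera's storage backend or `Checkpointer`: the `gc_startup` below is a hypothetical stand-in that removes files not named in a reference set, and `orphans_left_after_restart` and the file names are invented for the example.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for startup GC: remove every regular file in `dir`
/// whose name is not in `referenced`. Returns the number of files removed.
fn gc_startup(dir: &Path, referenced: &HashSet<String>) -> io::Result<usize> {
    // Collect the orphans first so that nothing is deleted while iterating.
    let mut orphans = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let name = entry.file_name().to_string_lossy().into_owned();
        if entry.file_type()?.is_file() && !referenced.contains(&name) {
            orphans.push(entry.path());
        }
    }
    for path in &orphans {
        fs::remove_file(path)?;
    }
    Ok(orphans.len())
}

/// Simulates the scenario: a run writes files, is force-stopped before any
/// checkpoint exists, and the restart runs GC with an empty checkpoint list.
/// Returns how many entries remain in `dir` after GC, then cleans up `dir`.
fn orphans_left_after_restart(dir: &Path) -> io::Result<usize> {
    fs::create_dir_all(dir)?;
    // "First run": state files written, no checkpoint taken (names are made up).
    fs::write(dir.join("batch-0.tmp"), b"state")?;
    fs::write(dir.join("batch-1.tmp"), b"state")?;
    // "Restart": no checkpoint is known, so no file is referenced.
    gc_startup(dir, &HashSet::new())?;
    let left = fs::read_dir(dir)?.count();
    fs::remove_dir_all(dir)?;
    Ok(left)
}
```

A real test would presumably drive the actual startup path against a temporary storage directory. The sketch only pins down the expectation that nothing is left behind when the checkpoint list is empty.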

Merged via the queue into main with commit 30bcbaf Mar 11, 2026
1 check passed
@blp deleted the always-gc branch on Mar 11, 2026, 22:07