Until now, Feldera has only run GC at startup if a checkpoint file existed and was readable. This meant that if a pipeline crashed or was force-stopped before its first checkpoint, then upon restart it did not clear any files left in storage by previous runs. This commit fixes the problem by running GC at startup unconditionally; if we can't read the checkpoint file now, there's no reason to believe that we will be able to read it later.

I tested this manually by force-stopping a pipeline that writes to storage.

Signed-off-by: Ben Pfaff <blp@feldera.com>
mythical-fred left a comment
The fix makes sense -- if we can't read the checkpoint list, preserving orphaned files forever is wrong. But this changes startup behavior with no new test coverage.
@@ -70,15 +70,7 @@ impl Checkpointer {
}
This changes the startup behavior: previously, if checkpoint_list was empty (no checkpoint or unreadable), storage files were preserved and only measured. Now gc_startup() runs unconditionally and will delete everything it doesn't recognize as belonging to a known checkpoint.
The fix is correct in intent, but it needs a test for the specific case this addresses: pipeline force-stopped before first checkpoint -> restart -> orphaned files are cleaned up. Without it there's no regression guard if gc_startup()'s handling of an empty checkpoint list ever changes.
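A rough sketch of the shape such a regression test could take. It does not use the real `Checkpointer` API: `gc_startup_sketch` and `scratch_dir` are hypothetical stand-ins written for illustration, and the assumption is that startup GC amounts to "delete every storage file not referenced by a known checkpoint". The actual test would need to drive `gc_startup()` itself.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for startup GC (not Feldera's implementation):
/// deletes every regular file in `dir` whose name is not in `known`,
/// and returns the sorted names of the files it removed.
fn gc_startup_sketch(dir: &Path, known: &HashSet<String>) -> io::Result<Vec<String>> {
    // Collect the candidates first so nothing is deleted mid-iteration.
    let mut orphans = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if !entry.file_type()?.is_file() {
            continue;
        }
        let name = entry.file_name().to_string_lossy().into_owned();
        if !known.contains(&name) {
            orphans.push(name);
        }
    }
    for name in &orphans {
        fs::remove_file(dir.join(name))?;
    }
    orphans.sort();
    Ok(orphans)
}

/// Hypothetical helper: creates an empty scratch directory under the
/// system temp dir, replacing any leftover from an earlier run.
fn scratch_dir(tag: &str) -> io::Result<PathBuf> {
    let dir = std::env::temp_dir().join(format!("gc_sketch_{}_{}", tag, std::process::id()));
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

/// Models the case from this PR: files were written, no checkpoint was
/// ever taken (empty known set), and the pipeline restarts.
#[test]
fn orphans_removed_when_no_checkpoint_exists() -> io::Result<()> {
    let dir = scratch_dir("no_checkpoint")?;
    fs::write(dir.join("batch-0.dat"), b"leftover")?;
    fs::write(dir.join("batch-1.dat"), b"leftover")?;

    let removed = gc_startup_sketch(&dir, &HashSet::new())?;

    assert_eq!(removed, vec!["batch-0.dat".to_string(), "batch-1.dat".to_string()]);
    assert_eq!(fs::read_dir(&dir)?.count(), 0);
    fs::remove_dir_all(&dir)
}
```

A companion case with a non-empty known set would also be worth having, to confirm that files belonging to a readable checkpoint survive the unconditional GC.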