Skip to content

adapters: configurable error handling for failed batches during snapshot reading#5777

Draft
swanandx wants to merge 2 commits intomainfrom
issue5750
Draft

adapters: configurable error handling for failed batches during snapshot reading#5777
swanandx wants to merge 2 commits intomainfrom
issue5750

Conversation

@swanandx
Copy link
Contributor

@swanandx swanandx commented Mar 7, 2026

Fixes issue #5750 by introducing configurable snapshot error handling. Previously, failed batch errors were always non-fatal, causing the connector to just log error and skip batches. Now users can choose snapshot_error_mode:

  • ignore (default): log warning and skip batch (previous behavior)
  • fail: treat error as fatal

Describe Manual Test Plan

Checklist

  • Unit tests added/updated
  • Integration tests added/updated
  • Documentation updated
  • Changelog updated

Breaking Changes?

Mark if you think the answer is yes for any of these components:

Describe Incompatible Changes

swanandx and others added 2 commits March 8, 2026 00:22
…hot reading

Fixes issue #5750 by introducing configurable snapshot error handling.
Previously, failed batch errors were always non-fatal, causing the connector to just log error
and skip batches. Now users can choose snapshot_error_mode:
  - ignore (default): log warning and skip batch (previous behavior)
  - fail: treat error as fatal

Signed-off-by: Swanand Mulay <73115739+swanandx@users.noreply.github.com>
Signed-off-by: feldera-bot <feldera-bot@feldera.com>
@lalithsuresh
Copy link
Contributor

@swanandx shouldn't "fail: treat error as fatal" should be the default, even if it means differing from previous behavior?

@swanandx
Copy link
Contributor Author

swanandx commented Mar 9, 2026

@swanandx shouldn't "fail: treat error as fatal" should be the default, even if it means differing from previous behavior?

i thought fatal error would stop the whole pipeline, and to make sure we don't break any users current setup, i kept it that way. Now that I know treating it as fatal error won't stop / break anything, we can make it the default.

Copy link
Contributor

@ryzhyk ryzhyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, reporting a fatal error doesn't automatically stop the pipeline or the connector. You need to stop the connector yourself by exiting worker_task_inner. You also need to make sure that the is_closed method returns true (which I think will happen automatically when worker_task_inner terminates). This will ensure that the controller is aware that the connector is gone. But the connector should still respond to Queue commands to flush any inputs that have already been parsed. And it should commit the current transaction.

@swanandx , we can talk offline about the details.


/// Controls how the connector handles errors during snapshot loading.
///
/// * `"ignore"` - Log a non-fatal warning and skip the failed batch (default).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// * `"ignore"` - Log a non-fatal warning and skip the failed batch (default).
/// * `"ignore"` - Log a non-fatal error and skip the failed batch (default).

/// Controls how the connector handles errors during snapshot loading.
///
/// * `"ignore"` - Log a non-fatal warning and skip the failed batch (default).
/// * `"fail"` - Treat the error as fatal and stop the pipeline immediately.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// * `"fail"` - Treat the error as fatal and stop the pipeline immediately.
/// * `"fail"` - Treat the error as fatal and stop the connector immediately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants