Skip to content

fix(queue): emit retrying Queue Local Event instead of failed#186

Closed
hughsw wants to merge 5 commits intobee-queue:masterfrom
hughsw:fix-retrying
Closed

fix(queue): emit retrying Queue Local Event instead of failed#186
hughsw wants to merge 5 commits intobee-queue:masterfrom
hughsw:fix-retrying

Conversation

@hughsw
Copy link
Collaborator

@hughsw hughsw commented Feb 21, 2020

This fixes the problem reported in #184

This fix could conceivably break code that has worked around the problem, e.g. by looking at job.options.retries or job.status in their queue failed handler...

@hughsw hughsw requested a review from skeggse February 21, 2020 14:57
@hughsw
Copy link
Collaborator Author

hughsw commented Feb 21, 2020

Apologies for the commit churn -- It's taking a little while to get my build/test configuration running in a Docker container....

Copy link
Member

@skeggse skeggse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, in a vacuum. I think it may break existing code that has come to expect jobs that fail and then retry to emit the failed event. What do you think about emitting both retrying and failed when a job fails and retries?

@hughsw
Copy link
Collaborator Author

hughsw commented Feb 21, 2020

I have undone the destructuring syntax, and fixed the test (grrr, so I thought... Tests are failing due to timeout; I still have a lot to learn aboue bee-queue and bee-queue testing...).

The question about emitting both 'failed' and 'retrying' is interesting. My gut reaction is that the two messages is a more surprising behavior than what the docs imply will happen, and what the names of the events imply. But, two messages doesn't contradict what the docs say, and the retrying docs even mention that the job being retried has just failed. But that's also just being explicit -- 'retrying' means the job failed.

Emitting only one of 'retrying' or 'failed' for the queue when a job fails seems less surprising. It's what the PubSub events do. It's what the jobs do. The retrying messages let you know the job is still viable. 'succeeded' or 'failed' are the final word on the job. Otherwise you have to actively dig around in implementation-specific details to make a determination about the job's final status...

@hughsw
Copy link
Collaborator Author

hughsw commented Feb 21, 2020

Indeed, the reason I filed the issue, #184 was that Queue Local is a simple way to manage job lifecycle that didn't have the complexity or overhead of the PubSub or the Job callbacks. It would be very surprising that Queue Local callbacks have different semantics than the other two sets, and have no simple way to know if a job has really failed yet...

@skeggse
Copy link
Member

skeggse commented Feb 21, 2020

I agree with everything you're saying, and I don't think we're in a position to ship a new major revision of the library. Shipping the change that makes retrying fire instead of failed represents a major change in the API surface and since we don't know what downstream folks are doing with the library this can break their expectations (even though the API does seem like it should only emit one or the other). Maybe as a compromise we can create a new event on the Queue like failure:final or finalFailure that signals that a job has finished and is not being retried, and in bee-queue v2 can remove or deprecate the new event in favor of a fixed failure event.

Thanks also for pointing out the documentation/confusion problems here - I think we can help with that some with more documentation, but I don't see an optimal/foolproof solution here.

@hughsw
Copy link
Collaborator Author

hughsw commented Feb 22, 2020

I agree.

Here's the simplest path forward that I can see:

  • we make no code changes
  • we update the documentation to remove any mention of a Queue Local 'retrying' event
  • we update the Queue Local 'failed' documentation to be explicit that:
    • all job failures emit the event, regardless of whether the job will be retried
    • using only Queue Local events, there is no simple, stateless way to determine that a job has failed for good (won't be retried)
    • perhaps mention job.status as a potential work-around in the disclaimer regarding Queue Local events and job failure? (At present it is only documented in example code for Queue#getJob -- not sure if you want to promote job.status as a bona-fide feature? -- would definitely want to protect it with a getter)
  • we add a comment or two to the code regarding how things could be changed in order to unify the 'succeeded', 'retrying', 'failed' callbacks across the three classes of events

@compwright compwright changed the title fix(queue): Emit retrying Queue Local Event instead of failed, update tests fix(queue): emit retrying Queue Local Event instead of failed Nov 22, 2022
@compwright compwright closed this Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments