
Updating CommonVoice CTC English and Conformer English #2560

Merged
TParcollet merged 51 commits into speechbrain:develop from TParcollet:update_cv on Jun 26, 2024

@TParcollet TParcollet commented Jun 1, 2024

The CommonVoice CTC English and Conformer English recipes were badly out of date.
This PR brings them up to date.
Models are still training; I'll update the results once they finish.

  • update code
  • train models
  • verify dynamic batching does not break legacy code
  • update readme with results

@TParcollet added the "enhancement" and "work in progress" labels on Jun 1, 2024
Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics added 2 commits June 3, 2024 13:33
@TParcollet added the "ready to review" label and removed the "work in progress" label on Jun 3, 2024
@TParcollet (Collaborator, Author)

@Adel-Moumen this is ready to review and merge. The Conformer WER is not there yet, but I'll edit the README directly on the main repo once the decoding is done :p

Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics added 2 commits June 3, 2024 18:24

@Adel-Moumen Adel-Moumen left a comment

Hi, overall LGTM. I left one comment about avoid_if_longer_than_val_test.

Comment on lines +283 to +286
test_data = test_data.filtered_sorted(
    sort_key="duration",
    key_max_value={"duration": hparams["avoid_if_longer_than_val_test"]},
)

I think we should definitely add a warning/log saying that we are skipping files whose duration is greater than avoid_if_longer_than_val_test.
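
A minimal sketch of what such a warning could look like, in plain Python. This is not the code from the PR: filter_by_max_duration and the list-of-dicts input are hypothetical stand-ins for the dataset object, and the inclusive upper bound is an assumption about how the duration limit is applied.

```python
import logging

logger = logging.getLogger(__name__)


def filter_by_max_duration(items, max_duration):
    """Keep items no longer than max_duration, sorted by duration.

    `items` is a hypothetical stand-in (a list of dicts with a "duration"
    key), not the actual dataset object used in the recipe.
    """
    # Assumption: the limit is inclusive (duration <= max_duration is kept).
    kept = sorted(
        (item for item in items if item["duration"] <= max_duration),
        key=lambda item: item["duration"],
    )
    n_skipped = len(items) - len(kept)
    if n_skipped > 0:
        # Tell the user how many files were dropped and why.
        logger.warning(
            "Skipping %d of %d files with duration > %.1fs "
            "(avoid_if_longer_than_val_test).",
            n_skipped,
            len(items),
            max_duration,
        )
    return kept
```

In the recipe itself, the same count could presumably be obtained by comparing len(test_data) before and after the filtered_sorted call, so the warning would not need a separate pass over the data.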

@TParcollet (Collaborator, Author)

Do it :p

@TParcollet (Collaborator, Author)

I merge cuz I'm the boss blblblblb.

@TParcollet TParcollet merged commit 0eb40ad into speechbrain:develop Jun 26, 2024

Labels

enhancement New feature or request ready to review Waiting on reviewer to provide feedback

3 participants