
Fixed output_all_hiddens for hubert in huggingface_wav2vec #1587

Merged
TParcollet merged 2 commits into speechbrain:develop from gorinars:fix-hubert-output-all
Sep 29, 2022

Conversation

@gorinars
Collaborator

I am trying to extract all hidden representations from several HF models using the output_all_hiddens property recently implemented in #1570.

Specifically, I used source=["facebook/wav2vec2-base", "facebook/hubert-base-ls960", "microsoft/wavlm-base", "microsoft/wavlm-base-plus"] in HuggingFaceWav2Vec2 class.

Everything works fine except HuBERT, where we have dim(out) = 2, so the code crashes.

Unlike the others, it does not expose a 512-dimensional representation in out[1], which is not used anyway.

Taking the last element of the output to access all transformer layers should work for all these models, unless I am missing something.
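For illustration, here is a minimal sketch of the two output layouts being described (the field order and shapes are assumptions based on this discussion of the HF base models, not code from SpeechBrain itself):

```python
import torch

B, T, D = 1, 99, 768
hidden_states = tuple(torch.rand(B, T, D) for _ in range(13))

# wav2vec2/WavLM-style tuple: (last_hidden_state, extract_features, hidden_states)
wav2vec2_out = (hidden_states[-1], torch.rand(B, T, 512), hidden_states)

# HuBERT-style tuple: (last_hidden_state, hidden_states) -- no 512-dim entry
hubert_out = (hidden_states[-1], hidden_states)

# Indexing position 2 raises IndexError for the HuBERT-style tuple, but the
# last element holds all transformer layers in both layouts:
assert len(wav2vec2_out[-1]) == 13
assert len(hubert_out[-1]) == 13
```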

@gorinars gorinars requested a review from BenoitWang September 28, 2022 16:20
@BenoitWang
Collaborator

Hi @gorinars, thanks for trying this PR and pointing this out.

It is true that I didn't test all the models; however, if I got it right, these models output attentions at the very end, which is None if not specified, and that blocks your code. So we should use the second-to-last output.

Please check the HuggingFace code for these models: wav2vec2, HuBERT

@gorinars
Collaborator Author

Thanks for the quick reply @BenoitWang.

It seems that all the models I am testing have output_attentions=False, and in that case, for some reason, I do not see None in the out variable.

Let me test a bit more with different settings.

Here is a simple test that I currently use; it passes for wav2vec2 and WavLM but fails for HuBERT.

import pytest
import torch

from speechbrain.lobes.models.huggingface_wav2vec import HuggingFaceWav2Vec2


@pytest.mark.slow
@pytest.mark.parametrize(
    "model",
    [
        "facebook/wav2vec2-base",
        "facebook/hubert-base-ls960",
        "microsoft/wavlm-base",
        "microsoft/wavlm-base-plus",
        "microsoft/wavlm-base-plus-sd",
    ],
)
@pytest.mark.parametrize("batch_size", [1, 4])
def test_sb_wav2vec(batch_size, model):
    model = HuggingFaceWav2Vec2(model, "data")

    # 2 seconds of random audio at 16 kHz
    wav = torch.rand([batch_size, 32000])

    # Extract wav2vec output
    out = model.model(wav, output_hidden_states=True)

    out_expected_len = 99
    assert len(out) == 3  # fails for HuBERT, where len(out) == 2
    assert out[0].shape == torch.Size([batch_size, out_expected_len, 768])
    assert out[1].shape == torch.Size((batch_size, out_expected_len, 512))
    assert len(out[2]) == 13
    assert out[2][12].shape == torch.Size((batch_size, out_expected_len, 768))

@gorinars
Collaborator Author

gorinars commented Sep 28, 2022

OK, so enforcing model.model.config.output_attentions=True in the test above makes len(out) == 4.
With that, I actually think the safest approach would be to use explicit names, like

out[0] -> out.last_hidden_state
out[2] -> out.hidden_states

Thoughts?
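For context, HF output objects support both tuple indexing and attribute access; a minimal stand-in (a simplification for illustration, not the actual transformers class) shows why named access is robust where positional indices are not:

```python
from typing import NamedTuple, Optional, Tuple

import torch


class BaseModelOutputSketch(NamedTuple):
    # Simplified stand-in for a transformers output class (assumption)
    last_hidden_state: torch.Tensor
    hidden_states: Optional[Tuple[torch.Tensor, ...]] = None
    attentions: Optional[Tuple[torch.Tensor, ...]] = None


out = BaseModelOutputSketch(
    last_hidden_state=torch.rand(1, 99, 768),
    hidden_states=tuple(torch.rand(1, 99, 768) for _ in range(13)),
)

# Named access does not depend on how many optional fields a given model
# populates, unlike positional indices such as out[1] or out[2]:
final_layer = out.last_hidden_state
all_layers = torch.stack(out.hidden_states)
assert all_layers.shape == (13, 1, 99, 768)
```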

@BenoitWang
Collaborator

Yes, much neater, and let's hope that they always use the same names for all the models :).

Thanks!

@TParcollet
Collaborator

@BenoitWang could you review the PR and merge if it looks good to you? Thanks!

@TParcollet TParcollet merged commit 211083f into speechbrain:develop Sep 29, 2022