Optimize minDCF memory footprint #3037
Open
othman-istaiteh wants to merge 3 commits into speechbrain:develop
What does this PR do?
This PR refactors the minDCF evaluation metric to resolve Out-Of-Memory (OOM) errors during the evaluation of large datasets.
Because of its memory footprint, the previous implementation crashed when attempting to compute minDCF for large trial lists, such as VoxCeleb1-H (Vox-H) and VoxCeleb1-E (Vox-E).
Complexity Improvements:
Where N is the total number of scores and T is the number of unique thresholds (typically T ≤ N_pos + N_neg):
Old Implementation:
New Implementation:
Algorithmic Details:
- Replaced the O(N * T) tensor expansion blocks with 1D torch.searchsorted operations to compute the False Acceptance and False Rejection rates.
- The computation of intermediate thresholds ((thresholds[0:-1] + thresholds[1:]) / 2) was removed. Because p_miss and p_fa are step functions that only change state at observed scores, evaluating midpoints is mathematically redundant.
Testing:
- pytest tests/unittests/test_metrics.py
- pytest --doctest-modules speechbrain/utils/metric_stats.py
Fixes #<issue_number>
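To illustrate the searchsorted-based counting described under Algorithmic Details, here is a minimal sketch. It is not the code from this PR: the function name min_dcf_sketch is hypothetical, and the default costs (c_miss=1.0, c_fa=1.0, p_target=0.01) and the conventions "miss = positive score <= threshold" and "false accept = negative score > threshold" are assumptions made for the example.

```python
import torch


def min_dcf_sketch(positive_scores, negative_scores,
                   c_miss=1.0, c_fa=1.0, p_target=0.01):
    """Hypothetical illustration of minDCF without an (N x T) comparison matrix."""
    # Candidate thresholds: each distinct observed score.
    # torch.unique returns the values in ascending order by default.
    thresholds = torch.unique(torch.cat([positive_scores, negative_scores]))

    # Sort each score set once so that counts become binary searches.
    pos_sorted, _ = torch.sort(positive_scores)
    neg_sorted, _ = torch.sort(negative_scores)

    # With right=True, searchsorted returns, for each threshold, the number
    # of sorted elements that are <= that threshold.
    n_miss = torch.searchsorted(pos_sorted, thresholds, right=True)
    n_neg_le = torch.searchsorted(neg_sorted, thresholds, right=True)

    # Assumed conventions: a miss is a positive score <= threshold,
    # a false accept is a negative score > threshold.
    p_miss = n_miss.float() / pos_sorted.numel()
    p_fa = (neg_sorted.numel() - n_neg_le).float() / neg_sorted.numel()

    # Detection cost at every candidate threshold; all tensors here are 1D.
    c_det = c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)
    c_min, min_index = torch.min(c_det, dim=0)
    return float(c_min), float(thresholds[min_index])


if __name__ == "__main__":
    pos = torch.tensor([0.9, 0.8, 0.7])
    neg = torch.tensor([0.1, 0.2, 0.3])
    print(min_dcf_sketch(pos, neg))
```

Every intermediate tensor in this sketch is one-dimensional, so no score-by-threshold matrix is materialized. A midpoint between two adjacent distinct scores has no score lying between it and the lower of the two, so it yields the same counts as that lower score; this is why the sketch evaluates only the observed scores.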
Breaking Changes:
No breaking changes. The function signature and return types remain identical.
Before submitting
PR review
Reviewer checklist