Skip to content

feat: add max_rel_per_label to support recall for labeled data#826

Merged
JoanFM merged 11 commits into
docarray:mainfrom
guenthermi:feat-add-max_rel_per_label
Nov 30, 2022
Merged

feat: add max_rel_per_label to support recall for labeled data#826
JoanFM merged 11 commits into
docarray:mainfrom
guenthermi:feat-add-max_rel_per_label

Conversation

@guenthermi

@guenthermi guenthermi commented Nov 22, 2022

Copy link
Copy Markdown
Contributor

Signed-off-by: Michael Guenther guenthermi50@gmail.com

Goals:

  • Support recall@k and F1 measure@k for labeled datasets.
  • check and update documentation, if required. See guide

For labeled datasets it is not trivial to calculate metrics like recall and F1 measure, which require the knowledge of the number of relevant documents in the document collection for each query since neither the whole set of relevant documents nor the number of documents with a specific label is provided to the evaluate function.

To enable the calculation of recall and F1 measure, this PR

  • adds a parameter max_rel_per_label: Dict to the evaluate function which provides the number of relevant documents for each label, i.e., the number of documents in the collection with this label.
  • calculates those label counts for max_rel_per_label in the embed_and_evaluate_function.

Code Example:

# example DocumentArray with matches and labels for evaluation 
da = DocumentArray([Document(text=str(i), tags={'label': i}) for i in range(3)])
for d in da:
  d.matches = da
# each label occurs one time in the data collection (not provided to the evaluate function)
max_rel_per_label = {i: 1 for i in range(3)}
# evaluate matches
metrics = da.evaluate(['recall_at_k'], max_rel_per_label=max_rel_per_label)
print(metrics)
{'recall_at_k': 1.0}

Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
@codecov-commenter

codecov-commenter commented Nov 22, 2022

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 89.47368% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.81%. Comparing base (7a5b0bf) to head (66f6ee5).
⚠️ Report is 608 commits behind head on main.

Files with missing lines Patch % Lines
docarray/array/mixins/evaluation.py 94.11% 1 Missing ⚠️
docarray/math/evaluation.py 50.00% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (7a5b0bf) and HEAD (66f6ee5). Click for more details.

HEAD has 13 uploads less than BASE
Flag BASE (7a5b0bf) HEAD (66f6ee5)
docarray 26 13
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #826      +/-   ##
==========================================
- Coverage   86.38%   77.81%   -8.57%     
==========================================
  Files         138      138              
  Lines        7122     7137      +15     
==========================================
- Hits         6152     5554     -598     
- Misses        970     1583     +613     
Flag Coverage Δ
docarray 77.81% <89.47%> (-8.57%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

caller_max_rel = kwargs.pop('max_rel', None)
for d, gd in zip(self, ground_truth):
max_rel = caller_max_rel or len(gd.matches)
if caller_max_rel:

@gmastrapas gmastrapas Nov 23, 2022

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to refactor the if else logic here a bit

for d, gd in zip(self, ground_truth):
    if caller_max_rel:
        max_rel = caller_max_rel

    if ground_truth_type == 'labels':
        if max_rel_per_label:
            max_rel = max_rel_per_label.get(d.tags[label_tag], None)
            if max_rel is None:
                raise ValueError(
                    '`max_rel_per_label` misses the label ' + str(d.tags[label_tag])
                )
        else:
            raise ValueError('max_rel is required or something')
    else:
        max_rel = len(gd.matches)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is correct, that caller_max_rel is used if the user provides a max_rel attribute explicitly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception when max_rel is not set should also not be their because most of the metrics do not require max_rel, but setting it to None might be better than setting it to len(gd.matches). I will change this.

Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
@guenthermi guenthermi force-pushed the feat-add-max_rel_per_label branch from d65eb05 to 29777b7 Compare November 24, 2022 08:27
Signed-off-by: Michael Guenther <guenthermi50@gmail.com>

@bwanglzu bwanglzu left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comment

Comment thread docarray/array/mixins/evaluation.py Outdated
Comment on lines +459 to +464
if ground_truth and label_tag in ground_truth[0].tags:
max_rel_per_label = dict(Counter([d.tags[label_tag] for d in ground_truth]))
elif not ground_truth and label_tag in query_data[0].tags:
max_rel_per_label = dict(Counter([d.tags[label_tag] for d in query_data]))
else:
max_rel_per_label = None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't understand, max_rel_per_label is a variable you passed into the function, then what is this max_rel_per_label?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay i see, these are two functions

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaberate a bit the naming, max_rel_per_label? why not num_relevant_documents_per_label or something like that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok done

@guenthermi guenthermi requested review from bwanglzu and removed request for gmastrapas November 28, 2022 10:35
@guenthermi guenthermi force-pushed the feat-add-max_rel_per_label branch 3 times, most recently from 802d8da to 62e70f8 Compare November 28, 2022 11:04
@guenthermi guenthermi closed this Nov 28, 2022
@guenthermi guenthermi reopened this Nov 28, 2022
Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
@guenthermi guenthermi force-pushed the feat-add-max_rel_per_label branch 2 times, most recently from 9dbabed to 22f0dc9 Compare November 28, 2022 12:12
Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
@guenthermi guenthermi force-pushed the feat-add-max_rel_per_label branch from 68852b9 to 5739b4f Compare November 28, 2022 13:57

@JoanFM JoanFM left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see some changes in Documentation

Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
@guenthermi guenthermi requested review from JoanFM and removed request for JoanFM and bwanglzu November 29, 2022 13:25
@guenthermi guenthermi requested review from JoanFM and bwanglzu and removed request for JoanFM and bwanglzu November 29, 2022 13:25
@JoanFM JoanFM requested a review from alexcg1 November 29, 2022 13:32

In this case, the keyword argument `k` is passed to all metric functions, even though it does not fulfill any specific function for the calculation of the reciprocal rank.

### The max-rel parameter

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### The max-rel parameter
### The max_rel parameter

### The max-rel parameter

Some metric functions shown in the table above require a `max_rel` parameter.
This parameter should be set to the number of relevant documents in the document collection.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This parameter should be set to the number of relevant documents in the document collection.
This parameter should be set to the number of relevant Documents in the Document collection.


Some metric functions shown in the table above require a `max_rel` parameter.
This parameter should be set to the number of relevant documents in the document collection.
Without the knowledge of this number, metrics like `recall_at_k` and `f1_score_at_k` can be calculated.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can or cannot?

This parameter should be set to the number of relevant documents in the document collection.
Without the knowledge of this number, metrics like `recall_at_k` and `f1_score_at_k` can be calculated.

In the `evaluate` function, one can provide a keyword argument `max_rel`, which is then used for all queries.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change all "one" to "you"

{'recall_at_k': 1.0}
```

Since all relevant documents are in the matches, the recall is one.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize Documents throughout text

Signed-off-by: Michael Guenther <guenthermi50@gmail.com>
@guenthermi guenthermi requested review from alexcg1 and removed request for JoanFM November 29, 2022 14:35

@alexcg1 alexcg1 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@JoanFM JoanFM merged commit 8a4224d into docarray:main Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants