HotpotQA

A Dataset for Diverse, Explainable Multi-hop Question Answering

What is HotpotQA?

HotpotQA is a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting facts to enable more explainable question answering systems. It was collected by a team of NLP researchers at Carnegie Mellon University, Stanford University, and Université de Montréal.

For more details about HotpotQA, please refer to our EMNLP 2018 paper, "HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering".

If you work on open-domain multi-hop question answering, you might also be interested in BeerQA, a more recent dataset published by one of our authors (Peng Qi). BeerQA features open-domain questions that may require a varying number of reasoning hops to answer, and it includes HotpotQA as one of its components.

Getting started

HotpotQA is distributed under a CC BY-SA 4.0 License. The training and development sets can be downloaded below.

A more comprehensive guide to data download, preprocessing, baseline model training, and evaluation is available in our GitHub repository, linked below.

Once you have built your model, you can evaluate its performance with the evaluation script we provide below:

  python hotpot_evaluate_v1.py <path_to_prediction> <path_to_gold>
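
As a rough illustration of what a prediction file can look like, the sketch below assumes the layout described in the GitHub repository README: a JSON object with an "answer" map from question id to answer string, and an "sp" map from question id to a list of [title, sentence index] pairs. The helper names, the question id, and the article titles are made up for this example; please check the repository for the authoritative format.

```python
import json


def build_predictions(items):
    """Collect (question_id, answer, supporting_facts) triples into one dict.

    supporting_facts is a list of (article_title, sentence_index) pairs.
    The "answer"/"sp" layout is an assumption; verify it against the README.
    """
    pred = {"answer": {}, "sp": {}}
    for qid, answer, supporting_facts in items:
        pred["answer"][qid] = answer
        pred["sp"][qid] = [[title, int(idx)] for title, idx in supporting_facts]
    return pred


def write_predictions(pred, path):
    """Serialize the prediction dict to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(pred, f)


# Hypothetical question id and article titles, for illustration only.
example = build_predictions([
    ("example-question-id", "yes",
     [("Example Article A", 0), ("Example Article B", 2)]),
])
```

Calling write_predictions(example, "pred.json") would then produce a file that can be passed as <path_to_prediction> in the command above.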

To submit your models and evaluate them on the official test sets, please read our submission guide hosted on Codalab.

We also release the processed Wikipedia dump used to create HotpotQA (also under a CC BY-SA 4.0 License). It serves as the corpus for the fullwiki setting in our evaluation, and we hope it will also be useful as a standalone resource for future research involving processed Wikipedia text. The documentation for this corpus is linked below.

Stay connected!

Join our Google group to receive updates or initiate discussions about HotpotQA!

If you use HotpotQA in your research, please cite our paper with the following BibTeX entry:

@inproceedings{yang2018hotpotqa,
  title={{HotpotQA}: A Dataset for Diverse, Explainable Multi-hop Question Answering},
  author={Yang, Zhilin and Qi, Peng and Zhang, Saizheng and Bengio, Yoshua and Cohen, William W. and Salakhutdinov, Ruslan and Manning, Christopher D.},
  booktitle={Conference on Empirical Methods in Natural Language Processing ({EMNLP})},
  year={2018}
}
Leaderboard (Distractor Setting)
In the distractor setting, a question-answering system reads 10 paragraphs to provide an answer (Ans) to a question. It must also justify its answer with supporting facts (Sup).
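Each entry below lists, on separate lines: rank, submission date, model, affiliation, reference (where given), and scores in the order Ans EM, Ans F1, Sup EM, Sup F1, Joint EM, Joint F1.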
1
Aug 7, 2023
Beam Retrieval (single model)
BUPT & Tencent
(Zhang, Zhang, Zhang, et al. 2023)
72.69 85.04 66.25 90.09 50.53 77.54
2
Jul 7, 2022
PipNet (single model)
Tencent Cloud Xiaowei

72.26 84.86 63.71 89.41 48.76 76.95
3
Jun 27, 2022
Smoothing R3 (single model)
Fudan University & Huawei Poisson Lab
Rethinking Label Smoothing on Multi-hop Question Answering
72.07 84.34 65.44 89.55 49.73 76.69
4
Jan 28, 2022
FE2H on ALBERT (single model)
Nanjing University
From Easy to Hard: Two-stage Selector and Reader for Multi-hop Question Answering
71.89 84.44 64.98 89.14 50.04 76.54
5
May 16, 2022
R3 (single model)
Fudan University & Huawei Poisson Lab
Rethinking Label Smoothing on Multi-hop Question Answering
71.27 83.57 65.25 88.98 49.81 76.02
6
May 28, 2021
SAE+ (single model)
JD AI Research

70.74 83.61 63.70 88.95 48.15 75.72
7
Jul 12, 2021
S2G+EGA (single model)
Shanghai Jiao Tong University

70.92 83.44 63.86 88.68 48.76 75.47
8
Feb 27, 2021
S2G+ (single model)
Shanghai Jiao Tong University

70.72 83.53 64.30 88.72 48.60 75.45
9
Jan 11, 2021
AMGN+ (single model)
Anonymous

70.53 83.37 63.57 88.83 47.77 75.24
10
Mar 23, 2022
RD Model (single model)


70.35 82.86 63.57 88.81 47.96 75.17
11
Feb 14, 2022
FE2H on ELECTRA (single model)
Anonymous

69.54 82.69 64.78 88.71 48.46 74.90
12
Sep 6, 2020
SpiderNet-large (single model)
Kingsoft AI Lab

70.15 83.02 63.82 88.85 47.54 74.88
13
Feb 25, 2023
GIT (single model)
KAIST

70.07 82.86 62.59 88.53 47.22 74.84
14
Feb 20, 2021
S2G+ (single model)
Anonymous

69.38 82.17 64.30 88.72 48.00 74.36
15
Dec 30, 2021
AnonymousS (single model)
Anonymous

69.66 82.42 62.99 87.85 47.84 74.27
16
Nov 23, 2020
Anonymous (single model)
Anonymous

70.24 82.36 62.26 88.46 46.81 74.27
17
Dec 1, 2019
HGN-large (single model)
Anonymous

69.22 82.19 62.76 88.47 47.11 74.21
18
Nov 15, 2020
AMGN (single model)
Anonymous

69.89 82.79 62.67 88.12 46.59 74.20
19
Dec 15, 2021
BoSe (single model)
Anonymous

69.66 82.43 62.52 87.73 47.52 74.18
20
Jun 10, 2020
BFR-Graph (single model)
Anonymous

70.06 82.20 61.33 88.41 45.92 74.13
21
Apr 9, 2021
KIFGraph (single model)
LAB

69.53 82.42 61.79 87.98 46.49 74.12
22
Dec 14, 2021
Anonymous (single model)
Anonymous

69.43 82.47 61.85 87.59 46.57 73.93
23
May 11, 2020
GSAN-large (single model)
Anonymous

68.57 81.62 62.36 88.73 46.06 73.89
24
Sep 14, 2021
GIT (single model)
KAIST

69.12 82.01 62.05 88.19 46.50 73.87
25
Oct 6, 2020
FFReader-large (single model)
Kyoto University
(Alkhaldi et al., 2021)
68.89 82.16 62.10 88.42 45.61 73.78
26
May 28, 2020
ETC-large (single model)
Anonymous

68.12 81.18 63.25 89.09 46.40 73.62
27
May 28, 2020
Longformer (single model)
Anonymous

68.00 81.25 63.09 88.34 45.91 73.16
28
May 24, 2021
RealFormer (single model)
Anonymous

67.41 80.59 63.38 89.00 46.14 73.13
29
Apr 15, 2022
EGF Reader-large (single model)
Anonymous

68.10 80.96 62.60 88.20 46.15 72.96
30
Oct 18, 2019
C2F Reader (single model)
Joint Laboratory of HIT and iFLYTEK Research
(Shao, Cui et al. 2020)
67.98 81.24 60.81 87.63 44.67 72.73
31
Feb 11, 2021
Text-CAN large (single model)
Usyd NLP

67.53 80.80 61.62 86.95 45.75 72.52
32
Jun 15, 2020
SEGraph (single model)
Anonymous

68.03 81.17 61.70 87.43 44.86 72.40
33
Jan 24, 2021
S2G-large (single model)
Anonymous

67.34 80.24 62.66 87.61 45.80 72.26
34
Jun 29, 2021
()


67.44 80.27 60.08 86.16 44.69 71.46

Jun 30, 2021
() (single model)
Anonymous

67.44 80.27 60.08 86.16 44.69 71.46
36
Nov 19, 2019
SAE-large (single model)
JD AI Research
Tu, Huang et al., AAAI 2020
66.92 79.62 61.53 86.86 45.36 71.45
37
Sep 27, 2019
HGN (single model)
Microsoft Dynamics 365 AI Research
Fang et al., 2019
66.07 79.36 60.33 87.33 43.57 71.03
38
Aug 19, 2020
SpiderNet-Base (single model)
Anonymous

66.38 79.53 60.35 86.90 43.83 70.90
39
Jul 29, 2019
TAP 2 (ensemble)
IBM Research AI & IISc

66.64 79.82 57.21 86.69 41.21 70.65
40
Oct 1, 2019
EPS + BERT(wwm) (single model)
Anonymous

65.79 79.05 58.50 86.26 42.47 70.48
41
Mar 2, 2021
S2G-base (single model)
Anonymous

63.72 77.02 61.33 87.19 43.74 69.51
42
Feb 24, 2021
BDR+JNM (single model)
Anonymous

65.13 77.96 56.85 85.03 41.91 69.12
43
Jul 29, 2019
TAP 2 (single model)
IBM Research AI & IISc

64.99 78.59 55.47 85.57 39.77 69.12
44
Dec 3, 2020
AnonymousK (single model)
Anonymous

63.63 77.15 57.00 86.17 40.04 68.75
45
May 5, 2021
GAR-BERT (single model)
York University

62.67 76.35 59.50 87.98 40.64 68.74
46
May 31, 2019
EPS + BERT(large) (single model)
Anonymous

63.29 76.36 58.25 85.60 41.39 67.92
47
Jul 30, 2020
()


60.66 74.67 57.05 87.02 37.85 66.65
48
May 11, 2020
GSAN-base (single model)
Anonymous

61.25 74.74 57.74 86.28 39.56 66.62
49
Feb 12, 2021
Text-CAN (single model)
Usyd NLP

60.17 73.99 58.33 85.75 39.31 65.95
50
Aug 31, 2019
SAE (single model)
JD AI Research
Tu, Huang et al., AAAI 2020
60.36 73.58 56.93 84.63 38.81 64.96
51
Mar 13, 2021
GAR (single model)
York University

56.61 71.40 58.36 87.27 36.79 64.01

Mar 15, 2021
()


56.61 71.40 58.36 87.27 36.79 64.01
53
Jun 13, 2019
P-BERT (single model)
Anonymous

61.18 74.16 51.38 82.76 35.42 63.79
54
Sep 16, 2019
LQR-net 2 + BERT-Base (single model)
Anonymous

60.20 73.78 56.21 84.09 36.56 63.68
55
Apr 11, 2019
EPS + BERT (single model)
Anonymous

60.13 73.31 52.55 83.20 35.40 63.41
56
May 16, 2019
PIPE (single model)
Anonymous

59.77 72.77 52.53 82.82 35.54 62.92
57
Dec 1, 2019
SEval (single model)
Anonymous

61.87 74.37 45.73 80.50 33.32 62.73
58
Jun 8, 2019
TAP (single model)


58.63 71.48 46.84 82.98 32.03 61.90
59
Aug 14, 2019
SAQA (single model)
Anonymous

55.07 70.22 57.62 84.19 35.94 61.72
60
Sep 2, 2019
MKGN (single model)
Anonymous

57.09 70.69 54.26 83.54 35.59 61.69
61
Apr 19, 2019
GRN + BERT (single model)
Anonymous

55.12 68.98 52.55 84.06 32.88 60.31
62
Jun 19, 2019
LQR-net + BERT-Base (single model)
Anonymous

57.20 70.66 50.20 82.42 31.18 59.99
63
Apr 22, 2019
DFGN (single model)
Shanghai Jiao Tong University & ByteDance AI Lab
(Xiao, Qu, Qiu et al. ACL19)
56.31 69.69 51.50 81.62 33.62 59.82
64
Nov 21, 2018
QFE (single model)
NTT Media Intelligence Laboratories
(Nishida et al., ACL'19)
53.86 68.06 57.75 84.49 34.63 59.61
65
Jun 3, 2020
IRC (single model)
NTT Media Intelligence Laboratories
(Nishida et al., 2021)
58.54 72.67 36.56 79.53 23.57 59.43
66
Apr 17, 2019
LQR-net (ensemble)
Anonymous

55.19 69.55 47.15 82.42 28.42 58.86
67
Mar 4, 2019
GRN (single model)
Anonymous

52.92 66.71 52.37 84.11 31.77 58.47
68
Mar 1, 2019
DFGN + BERT (single model)
Anonymous

55.17 68.49 49.85 81.06 31.87 58.23
69
Mar 4, 2019
BERT Plus (single model)
CIS Lab

55.84 69.76 42.88 80.74 27.13 58.23
70
May 18, 2019
KGNN (single model)
Tsinghua University
(Ye et al., 2019)
50.81 65.75 38.74 76.79 22.40 52.82
71
Jul 14, 2021
RoBERTa-L Two-step Model (single model)
Anonymous

67.61 80.36 1.10 64.01 0.76 52.50
72
Mar 13, 2021
GAR-NOSF (single model)
York University

56.20 71.17 9.37 54.76 6.25 41.42

Mar 15, 2021
()


56.20 71.17 9.37 54.76 6.25 41.42
74
Aug 24, 2020
()


56.78 70.93 8.35 53.77 5.23 40.89
75
Oct 10, 2018
Baseline Model (single model)
Carnegie Mellon University, Stanford University, & Université de Montréal
(Yang, Qi, Zhang, et al. 2018)
45.60 59.02 20.32 64.49 10.83 40.16
76
Aug 24, 2020
()


52.61 68.17 9.00 53.62 5.76 39.25
-
Feb 3, 2020
Unsupervised Decomposition (single model)
Facebook AI Research, New York University & University College London
Perez et al. EMNLP 2020
66.33 79.34 N/A N/A N/A N/A
-
Sep 24, 2019
ChainEx (single model)
UT Austin
(Chen et al., 2019)
61.20 74.11 N/A N/A N/A N/A
-
Feb 27, 2019
DecompRC (single model)
University of Washington
(Min et al., ACL'19)
55.20 69.63 N/A N/A N/A N/A
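To make the Ans, Sup, and Joint columns above more concrete, here is a minimal sketch of how such scores can be computed for a single question. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and articles), treats supporting facts as a set of (title, sentence index) pairs, and forms the joint scores as described in our paper: joint precision and recall are the products of the answer and supporting-fact precision and recall, and joint EM is 1 only when both parts match exactly. The function names are ours for illustration, and the official evaluation script remains the reference, since it handles details (such as yes/no answers) that this sketch omits.

```python
import re
import string
from collections import Counter

PUNCT = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def prf(num_same, num_pred, num_gold):
    """Precision, recall, and F1 from overlap counts."""
    if num_same == 0:
        return 0.0, 0.0, 0.0
    p = num_same / num_pred
    r = num_same / num_gold
    return p, r, 2 * p * r / (p + r)


def answer_scores(pred, gold):
    """Return (EM, precision, recall, F1) over normalized answer tokens."""
    p_tokens, g_tokens = normalize(pred).split(), normalize(gold).split()
    same = sum((Counter(p_tokens) & Counter(g_tokens)).values())
    em = float(normalize(pred) == normalize(gold))
    return (em,) + prf(same, len(p_tokens), len(g_tokens))


def sup_scores(pred, gold):
    """Return (EM, precision, recall, F1) over sets of (title, index) pairs."""
    pred_set, gold_set = set(map(tuple, pred)), set(map(tuple, gold))
    same = len(pred_set & gold_set)
    em = float(pred_set == gold_set)
    return (em,) + prf(same, len(pred_set), len(gold_set))


def joint_scores(ans, sup):
    """Combine answer and supporting-fact scores into (joint EM, joint F1)."""
    a_em, a_p, a_r, _ = ans
    s_em, s_p, s_r, _ = sup
    p, r = a_p * s_p, a_r * s_r
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return a_em * s_em, f1


# Made-up example: correct answer, one extra supporting fact predicted.
ans = answer_scores("the Eiffel Tower", "Eiffel Tower")
sup = sup_scores([("A", 0), ("B", 1), ("C", 2)], [("A", 0), ("B", 1)])
joint = joint_scores(ans, sup)
```

In this example the answer matches exactly, but the extra supporting fact lowers Sup precision to 2/3, so Joint EM is 0 and Joint F1 is 0.8. Leaderboard numbers correspond to such per-question scores averaged over the test set and reported as percentages.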
Leaderboard (Fullwiki Setting)
In the fullwiki setting, a question-answering system must find the answer to a question within the whole of Wikipedia. As in the distractor setting, systems are evaluated on the accuracy of their answers (Ans) and the quality of the supporting facts they use to justify them (Sup).
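Each entry below lists, on separate lines: rank, submission date, model, affiliation, reference (where given), and scores in the order Ans EM, Ans F1, Sup EM, Sup F1, Joint EM, Joint F1.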
1
May 10, 2021
AISO (single model)
Institute of Computing Technology, Chinese Academy of Sciences
(Zhu, Pang et al., EMNLP 2021)
67.46 80.52 61.17 86.02 44.87 72.00
2
Jan 31, 2023
Chain-of-Skills (single model)
Carnegie Mellon University, Microsoft Research and UIUC
Ma et al. ACL 2023
67.38 80.14 61.25 85.31 45.65 71.65
3
Feb 1, 2021
TPRR (single model)
Huawei Poisson Lab & Parallel Distributed Computing Lab

66.95 79.50 59.43 84.25 44.37 70.83
4
Jan 15, 2021
HopRetriever + Sp-search (single model)
Huawei Noah's Ark Lab & Huawei Cloud
(Li, Li, Shang, et al. 2020)
67.13 79.91 57.38 83.52 43.20 70.61
5
Dec 1, 2020
EBS-Large (single model)
Samsung SDS AI Research

66.18 79.32 57.29 83.98 41.95 70.04
6
Dec 18, 2020
HopRetriever (single model)
Huawei Noah's Ark Lab

67.13 79.91 57.23 82.59 43.10 69.84
7
Nov 30, 2020
IRRR+ (single model)
Stanford University & Samsung Research
(Qi, Lee, Sido, and Manning. 2020)
66.33 79.10 56.92 83.24 42.75 69.60
8
Dec 31, 2020
Anonymous (single model)
Anonymous

65.68 78.49 58.24 83.31 43.44 69.54
9
Sep 7, 2020
EBS-SH (single model)
Samsung SDS AI Research

65.53 78.61 55.90 83.13 40.91 68.94
10
Aug 3, 2020
IRRR (single model)
Stanford University & Samsung Research
(Qi, Lee, Sido, and Manning. 2020)
65.71 78.19 55.93 82.05 42.14 68.59
11
Oct 27, 2020
Anonymous (single model)
Anonymous

65.21 78.02 56.61 82.44 42.26 68.54
12
Sep 10, 2020
Anonymous (single model)
Anonymous

65.05 78.02 55.35 82.69 40.51 68.37
13
Aug 6, 2020
Anonymous (single model)
Anonymous

64.94 78.18 54.49 82.48 39.44 68.10
14
Aug 28, 2020
Anonymous (ensemble)
Anonymous

65.26 78.27 54.22 82.21 40.02 68.08
15
Oct 29, 2020
HopRetriever-V2 (single model)
anonymous

64.83 77.81 56.08 81.79 40.95 67.75
16
May 13, 2021
Anonymous (single model)
Anonymous

62.90 75.82 57.71 81.26 42.18 67.08
17
Dec 4, 2021
AFSGraph-retriever (single model)
Anonymous

64.55 77.79 55.65 81.23 41.05 66.98
18
May 19, 2021
Anonymous (single model)
Anonymous

62.67 75.51 57.54 80.93 42.03 66.87
19
Aug 26, 2020
Recursive Dense Retriever (single model)
Facebook AI & UCSB & UMass
Xiong, Li et al., ICLR 2021
62.28 75.29 57.46 80.86 41.78 66.55
20
May 21, 2020
Step-by-Step Retriever (single model)
Joint Laboratory of HIT and iFLYTEK Research

62.95 75.43 54.61 80.00 40.36 66.22
21
Nov 28, 2020
Anonymous (single model)
Anonymous

61.79 74.71 53.51 80.05 38.43 64.45
22
Jun 9, 2020
HopRetriever-V1 (single model)
anonymous

60.83 73.93 53.07 79.26 38.00 63.91
23
May 21, 2020
DDRQA (single model)
Georgia Institute of Technology & Peking University
(Yuyu, Ping et al. 2020)
62.53 75.91 51.01 78.86 36.04 63.88
24
Jul 6, 2020
Anonymous (single model)
Anonymous

64.29 77.23 51.12 78.57 36.29 63.75
25
Mar 6, 2020
DR model large (single model)
Anonymous

62.01 75.32 49.88 77.77 35.44 62.95
26
Nov 24, 2021
()


61.71 74.57 50.04 77.16 36.77 62.92

Nov 24, 2021
HopAns (single model)
ptf

61.71 74.57 50.04 77.16 36.77 62.92
28
Nov 21, 2020
Anonymous (single model)
Anonymous

60.44 73.22 52.01 77.05 37.98 62.86
29
Nov 15, 2021
Multi-dimensional-AFSGraph (single model)
Anonymous

61.53 74.61 50.33 77.24 36.21 62.44
30
Feb 11, 2020
HGN-albert + SemanticRetrievalMRS IR (single model)
Anonymous

59.74 71.41 51.03 77.37 37.92 62.26
31
Aug 19, 2021
Tree-shaped-cluster (single model)
Anonymous

60.31 73.14 49.87 76.83 35.85 61.73
32
Feb 6, 2021
AFSgraph (single model)
Anonymous

60.08 72.97 49.96 76.85 35.89 61.66
33
Nov 6, 2019
Robustly Fine-tuned Graph-based Recurrent Retriever (single model)
Salesforce Research & University of Washington
(Asai et al., ICLR 2020)
60.04 72.96 49.08 76.41 35.35 61.18
34
Oct 4, 2020
AFSgraph model (single model)
Anonymous

60.06 72.97 48.49 75.94 35.03 60.90
35
Dec 1, 2019
HGN-large + SemanticRetrievalMRS IR (single model)
Anonymous

57.85 69.93 51.01 76.82 37.17 60.74
36
Jan 24, 2021
DPR-recurrent (single model)
Anonymous

59.79 72.65 47.95 74.89 34.54 60.23
37
Jan 19, 2021
RoBERTa-DenseRetriever (single model)
Anonymous

59.60 72.43 47.87 74.79 34.53 60.05
38
Oct 7, 2019
HGN + SemanticRetrievalMRS IR (single model)
Microsoft Dynamics 365 AI Research
Fang et al., 2019
56.71 69.16 49.97 76.39 35.63 59.86
39
Jul 27, 2020
()


58.89 71.60 48.03 75.69 34.46 59.84
40
Jan 21, 2021
GraphRR-Fast (single model)
Anonymous

58.21 70.86 42.91 71.30 30.95 56.85
41
Feb 13, 2020
DR model (single model)
Anonymous

58.82 71.68 41.55 72.54 29.34 56.82
42
Dec 8, 2019
Quark + SemanticRetrievalMRS IR (single model)
Allen Institute for AI and Indian Institute of Technology
A Simple Yet Strong Pipeline for HotpotQA
55.50 67.51 45.64 72.95 32.89 56.23
43
May 6, 2021
GAR-BERT (single model)
York University

52.28 64.84 49.00 74.73 33.00 56.10
44
Sep 20, 2019
Graph-based Recurrent Retriever (single model)
Anonymous

56.04 68.87 44.14 73.03 29.18 55.31
45
Sep 28, 2019
MIR+EPS+BERT (single model)
Anonymous

52.86 64.79 42.75 72.00 31.19 54.75
46
Mar 14, 2021
GAR (single model)
York University

48.22 61.33 48.34 73.89 30.61 52.95
47
Feb 4, 2020
Transformer-XH-final(BERT-base) (single model)
University of Maryland, Microsoft AI & Research
(Zhao et al. ICLR 2020)
51.60 64.07 40.91 71.42 26.14 51.29
48
Sep 21, 2019
Transformer-XH (single model)
Anonymous

48.95 60.75 41.66 70.01 27.13 49.57
49
May 15, 2019
SemanticRetrievalMRS (single model)
UNC-NLP
(Nie et al., EMNLP'2019)
45.32 57.34 38.67 70.83 25.14 47.60
50
Nov 28, 2020
()


43.22 54.35 38.62 63.61 25.37 44.88
51
Feb 21, 2020
DrKIT (single model)
Carnegie Mellon University, Google Research
(Dhingra et al, ICLR 2020)
42.13 51.72 37.05 59.84 24.69 42.88
52
Nov 28, 2020
()


38.94 50.72 38.29 62.19 23.33 41.77
53
Jul 31, 2019
Entity-centric BERT Pipeline (single model)
Anonymous

41.82 53.09 26.26 57.29 17.01 39.18
54
May 21, 2019
GoldEn Retriever (single model)
Stanford University
(Qi et al., EMNLP-IJCNLP 2019)
37.92 48.58 30.69 64.24 18.04 39.13
55
Aug 14, 2019
PR-Bert (single model)
KingSoft AI Lab

43.33 53.79 21.90 59.63 14.50 39.11
56
Dec 4, 2019
SAFSr-Bert (single model)
Anonymous

39.35 51.40 24.21 58.54 13.34 37.00
57
Feb 21, 2019
Cognitive Graph QA (single model)
Tsinghua KEG & Alibaba DAMO Academy
(Ding et al., ACL'19)
37.12 48.87 22.82 57.69 12.42 34.92
58
Mar 14, 2021
GAR-NOSF (single model)
York University

47.50 60.62 7.62 44.79 4.88 33.36
59
Apr 12, 2021
IKFGraph (single model)
anonymous

35.82 45.33 15.97 51.20 11.46 30.38
60
Jul 8, 2022
AnonymousQ (single model)
Anonymous

36.85 45.95 15.25 46.76 11.54 29.07

Feb 12, 2024
()


36.85 45.95 15.25 46.76 11.54 29.07
62
May 15, 2023
HGN Model-reproduce (single model)
Peking University

33.51 42.69 15.59 49.32 10.95 28.40
63
Mar 5, 2019
MUPPET (single model)
Technion
(Feldman and El-Yaniv, ACL'19)
30.61 40.26 16.65 47.33 10.85 27.01
64
Apr 7, 2019
GRN + BERT (single model)
Anonymous

29.87 39.14 13.16 49.67 8.26 25.84
65
May 20, 2019
Entity-centric IR (single model)
Anonymous

35.36 46.26 0.06 43.16 0.02 25.47
66
May 19, 2019
KGNN (single model)
Tsinghua University
(Ye et al., 2019)
27.65 37.19 12.65 47.19 7.03 24.66
67
Aug 16, 2019
SAQA (single model)
Anonymous

28.44 38.62 14.69 47.17 8.62 24.49
68
Mar 4, 2019
GRN (single model)
Anonymous

27.34 36.48 12.23 48.75 7.40 23.55
69
Nov 25, 2018
QFE (single model)
NTT Media Intelligence Laboratories
(Nishida et al., ACL'19)
28.66 38.06 14.20 44.35 8.69 23.10
70
Nov 29, 2019
SAFSr_model (single model)
Anonymous

28.91 39.14 8.03 40.55 4.06 20.90
71
Oct 12, 2018
Baseline Model (single model)
Carnegie Mellon University, Stanford University, & Université de Montréal
(Yang, Qi, Zhang, et al. 2018)
23.95 32.89 3.86 37.71 1.85 16.15
72
Nov 26, 2023
()


7.35 12.14 0.00 7.84 0.00 1.11
73
Jan 30, 2021
graph-recurrent-retriever+roberta-base w. S/R-pretraining (single model)
Anonymous

58.13 70.96 0.00 0.00 0.00 0.00
74
Mar 1, 2019
()


30.00 40.65 0.00 0.00 0.00 0.00
75
Jun 25, 2024
Mistral multi hop with very large sources (single model)
Anonymous

7.98 22.14 0.00 0.00 0.00 0.00
-
Dec 13, 2022
()


58.05 71.08 N/A N/A N/A N/A
-
May 19, 2019
TPReasoner w/o BERT (single model)
Anonymous

36.04 47.43 N/A N/A N/A N/A
-
Mar 3, 2019
MultiQA (single model)
Anonymous

30.73 40.23 N/A N/A N/A N/A