Baseline vs. best model performance at recovering MRS arguments, aggregated by predication type.
MATCH indicates how often the gold predication itself was found in the result analysis.
ARG2 etc. indicate how often, given that a gold predication was found in the result analysis, the value of that particular argument matched (as judged by finding a matching pair of gold/result elementary predications (EPs) with that value as their ARG0).


aggregate baseline best
exact tree match 40.3% 47.5%
_v_ . ARG1 91.50% 93.21%
_v_ . ARG2 93.08% 94.70%
_v_ . ARG3 90.35% 90.95%
_v_ . MATCH 96.39% 96.70%
_p_ . ARG1 80.03% 83.99%
_p_ . ARG2 94.35% 95.62%
_p_ . MATCH 93.94% 94.67%
_n_ . ARG1 93.81% 95.14%
_n_ . MATCH 98.33% 98.41%
_a_ . ARG1 93.37% 95.08%
_a_ . MATCH 97.51% 97.65%
. MATCH 92.09% 93.31%
_in_p_rel . ARG1 78.06% 80.53%
_in_p_rel . ARG2 95.95% 96.40%
_for_p_rel . ARG1 78.84% 82.60%
_for_p_rel . ARG2 93.09% 94.74%
_of_p_rel . ARG1 97.00% 97.16%
_of_p_rel . ARG2 94.26% 95.49%
_and_c_rel . L-INDEX 84.98% 88.45%
_and_c_rel . R-INDEX 88.96% 90.02%
compound_rel . ARG1 99.80% 99.79%
compound_rel . ARG2 95.70% 96.68%
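
The MATCH and argument scores described in the caption can be sketched roughly as follows. This is a minimal illustration, not the evaluation code behind the table: the `EP` class, the `score` function, and the choice to align gold and result EPs by predicate name plus character span are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class EP:
    """Hypothetical elementary predication (illustration only)."""
    pred: str                                   # predicate name, e.g. "_chase_v_1_rel"
    span: tuple                                 # character span; assumed alignment key
    arg0: str                                   # the EP's own variable
    args: dict = field(default_factory=dict)    # role -> variable, e.g. {"ARG1": "x1"}

    def key(self):
        # Assumption: two EPs "match" if predicate name and span agree.
        return (self.pred, self.span)


def owners(eps):
    """Map each variable to the keys of the EPs that have it as ARG0."""
    table = {}
    for ep in eps:
        table.setdefault(ep.arg0, set()).add(ep.key())
    return table


def score(gold, result):
    """Return (hits, totals) Counters keyed by (predicate, "MATCH" or role)."""
    result_by_key = {ep.key(): ep for ep in result}
    gold_owner, result_owner = owners(gold), owners(result)
    hits, totals = Counter(), Counter()
    for g in gold:
        totals[(g.pred, "MATCH")] += 1
        r = result_by_key.get(g.key())
        if r is None:
            continue  # argument scores are conditioned on the EP being found
        hits[(g.pred, "MATCH")] += 1
        for role, gold_var in g.args.items():
            totals[(g.pred, role)] += 1
            result_var = r.args.get(role)
            # Variable names differ between analyses, so compare the EPs
            # that carry each value as their ARG0 rather than the names.
            shared = gold_owner.get(gold_var, set()) & result_owner.get(result_var, set())
            if shared:
                hits[(g.pred, role)] += 1
    return hits, totals


# Toy example: the result analysis attaches ARG2 of "chase" to the wrong noun.
gold = [
    EP("_dog_n_1_rel", (4, 7), "x1"),
    EP("_chase_v_1_rel", (8, 14), "e2", {"ARG1": "x1", "ARG2": "x3"}),
    EP("_cat_n_1_rel", (19, 22), "x3"),
]
result = [
    EP("_dog_n_1_rel", (4, 7), "x5"),
    EP("_chase_v_1_rel", (8, 14), "e1", {"ARG1": "x5", "ARG2": "x5"}),
    EP("_cat_n_1_rel", (19, 22), "x9"),
]
hits, totals = score(gold, result)
```

In this toy case all three gold predications are found, ARG1 of the verb counts as a hit, and ARG2 counts as a miss. Summing `hits` and `totals` over predicates of one type (for instance, names containing `_v_`) and dividing would give per-type percentages of the kind reported in the aggregate rows.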