Baseline vs. best model performance at recovering MRS arguments, aggregated by predication type.
MATCH indicates how often the gold predication itself was found in the result analysis.
ARG2 etc. indicate how often, given that the gold predication was found in the result analysis, the value of that particular argument matched (as judged by finding a matching pair of gold/result elementary predications (EPs) that have that value as their ARG0).
aggregate | baseline | best |
exact tree match | 40.3% | 47.5% |
_v_ . ARG1 | 91.50% | 93.21% |
_v_ . ARG2 | 93.08% | 94.70% |
_v_ . ARG3 | 90.35% | 90.95% |
_v_ . MATCH | 96.39% | 96.70% |
_p_ . ARG1 | 80.03% | 83.99% |
_p_ . ARG2 | 94.35% | 95.62% |
_p_ . MATCH | 93.94% | 94.67% |
_n_ . ARG1 | 93.81% | 95.14% |
_n_ . MATCH | 98.33% | 98.41% |
_a_ . ARG1 | 93.37% | 95.08% |
_a_ . MATCH | 97.51% | 97.65% |
. MATCH | 92.09% | 93.31% |
_in_p_rel . ARG1 | 78.06% | 80.53% |
_in_p_rel . ARG2 | 95.95% | 96.40% |
_for_p_rel . ARG1 | 78.84% | 82.60% |
_for_p_rel . ARG2 | 93.09% | 94.74% |
_of_p_rel . ARG1 | 97.00% | 97.16% |
_of_p_rel . ARG2 | 94.26% | 95.49% |
_and_c_rel . L-INDEX | 84.98% | 88.45% |
_and_c_rel . R-INDEX | 88.96% | 90.02% |
compound_rel . ARG1 | 99.80% | 99.79% |
compound_rel . ARG2 | 95.70% | 96.68% |
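
The MATCH and argument scores described in the caption can be illustrated with a minimal sketch. It is not the evaluation code behind the table. The `EP` class, the `key` and `score` helpers, and the toy data are hypothetical. The sketch also assumes that a gold EP and a result EP count as "the same" when they share a predicate name and character span, and that each ARG0 variable belongs to exactly one EP; the caption specifies neither.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EP:
    """A simplified elementary predication (hypothetical structure)."""
    pred: str                   # predicate name, e.g. "_chase_v_1_rel"
    span: tuple                 # (start, end) character span; assumed alignment key
    arg0: str                   # the EP's own variable
    args: dict = field(default_factory=dict)  # role name -> variable


def key(ep):
    # Assumption: EPs are aligned across analyses by predicate and span.
    return (ep.pred, ep.span)


def score(gold, result):
    """Return MATCH and per-role accuracies for one gold/result pair."""
    result_by_key = {key(r): r for r in result}
    gold_by_arg0 = {g.arg0: g for g in gold}
    result_by_arg0 = {r.arg0: r for r in result}
    counts = defaultdict(lambda: [0, 0])  # label -> [hits, total]

    for g in gold:
        r = result_by_key.get(key(g))
        counts["MATCH"][1] += 1
        if r is None:
            continue  # argument scores are conditional on MATCH
        counts["MATCH"][0] += 1
        for role, g_var in g.args.items():
            counts[role][1] += 1
            # Variable names differ between analyses, so compare the EPs
            # that carry each argument value as their ARG0.
            g_target = gold_by_arg0.get(g_var)
            r_target = result_by_arg0.get(r.args.get(role))
            if g_target and r_target and key(g_target) == key(r_target):
                counts[role][0] += 1

    return {label: hits / total for label, (hits, total) in counts.items()}


# Toy data for "The dog chased the cat": the result analysis finds every
# predication but attaches ARG2 of the verb to the wrong noun.
gold = [
    EP("_dog_n_1_rel", (4, 7), "x1"),
    EP("_cat_n_1_rel", (19, 22), "x2"),
    EP("_chase_v_1_rel", (8, 14), "e1", {"ARG1": "x1", "ARG2": "x2"}),
]
result = [
    EP("_dog_n_1_rel", (4, 7), "x5"),
    EP("_cat_n_1_rel", (19, 22), "x9"),
    EP("_chase_v_1_rel", (8, 14), "e3", {"ARG1": "x5", "ARG2": "x5"}),
]
scores = score(gold, result)
```

On this toy input, MATCH and ARG1 score 1.0 and ARG2 scores 0.0. Reproducing the per-type rows of the table would additionally require keying the counts on the predication type (for example `_v_` or `_p_`) and summing over a whole test set, which the sketch omits.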