Scope of Negation with MRS
Packard, W., Bender, E. M., Read, J., Oepen, S., & Dridan, R. (2014). Simple Negation Scope Resolution Through Deep Parsing: A Semantic Solution to a Semantic Problem. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, MD.
Here you will find the data and code necessary to reproduce the results published in this paper.
The system runs under Linux and has dependencies that will make it difficult to run on other systems.
- Download the code and data and unpack it.
- Decide which experiment you want to reproduce.
- The system can be run with gold cues or the cues predicted by the system of Read et al. (2012).
The input data is stored in the tsdb/1212/goldcue/ and tsdb/1212/syscue/ directories.
- There are three datasets: training (called sst in the system file structure; its results are not included in the paper), development (called CDD in the paper and ssd in the system file structure), and evaluation (called CDE in the paper and sse in the system file structure).
The input profile for the evaluation data with gold cues, for example, is: tsdb/1212/goldcue/sse.ace.prob
- Finally, the variant of the system to use must be selected: the options discussed in the paper are Ranker (the system of Read et al.), Crawler, and two system combinations: Crawler_N (backoff when no MRS is available) and Crawler_P (backoff when confidence score is too low).
- To reproduce results reported for the Ranker system, use e.g.:
$ perl eval.cd-sco.pl -s mrs_experiments/ssd_ranker.txt -g gold/cdd.txt
- To reproduce results reported for the Crawler system or the combinations, use the following steps.
The system that is tested (Crawler, Crawler_N, Crawler_P, or oracle) is determined by the combination of choices made in the "Run the system" and "Evaluate the system" steps below.
- Navigate into the uw/scope-predictor/ directory.
- Run the system:
This produces a file uw/scope-predictor/connl.txt, which is used by the scripts in the following step but (contrary to its name) is not exactly CoNLL-formatted.
- Evaluate the system:
- For gold cues with Crawler, use e.g.:
bash scripts/eval-ssd-only.sh
- For gold cues with Crawler_P or Crawler_N, use e.g.:
bash scripts/eval-ssd.sh
- For system cues with Crawler, use e.g.:
bash scripts/eval-ssd-only-syscue.sh
- For system cues with Crawler_P or Crawler_N, use e.g.:
bash scripts/eval-ssd-syscue.sh
- For oracle comparisons (with gold cues only), use e.g.:
bash scripts/eval-ssd-oracle.sh
The scoring script tends to print a few warnings about uninitialized values; these can be ignored.
Please note that an evaluation is meaningful only if the evaluation script matches the data profile that was used when running the system, in both dataset and cue type (system vs. gold).
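One way to keep the evaluation script consistent with the cue type and system variant is a small lookup. The following is only a minimal sketch: the function name eval_script and its argument keywords (gold, sys, crawler, crawler_n, crawler_p, oracle) are hypothetical and not part of the distribution; the script paths are the development-set (ssd) scripts listed above.

```shell
# Hypothetical helper (not shipped with the system): map a cue type and a
# system variant to the development-set evaluation script named in this README.
eval_script() {
  local cue="$1" variant="$2"
  case "$cue:$variant" in
    gold:crawler)                  echo "scripts/eval-ssd-only.sh" ;;
    gold:crawler_n|gold:crawler_p) echo "scripts/eval-ssd.sh" ;;
    sys:crawler)                   echo "scripts/eval-ssd-only-syscue.sh" ;;
    sys:crawler_n|sys:crawler_p)   echo "scripts/eval-ssd-syscue.sh" ;;
    gold:oracle)                   echo "scripts/eval-ssd-oracle.sh" ;;
    *)
      # Oracle comparisons are described for gold cues only, so anything
      # else is rejected rather than guessed.
      echo "unsupported combination: $cue/$variant" >&2
      return 1
      ;;
  esac
}
```

For example, `bash "$(eval_script gold crawler_p)"` would run scripts/eval-ssd.sh, assuming the command is issued from the directory in which the scripts/ paths above resolve.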
The parsed profiles in tsdb/1212 were created using the ACE parser (version 0.9.16) with the English Resource Grammar (version 1212), with support for the CoNLL token format provided by [incr tsdb()].
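The profile naming described earlier (a cue directory plus a dataset name) can be sketched as a path builder. This is an illustration under an assumption: only tsdb/1212/goldcue/sse.ace.prob is given explicitly in this README, and the sketch assumes that the other cue/dataset combinations follow the same pattern. The function name profile_path is hypothetical.

```shell
# Hypothetical helper: build the input profile path for a cue type
# (gold or sys) and a dataset (sst, ssd, or sse).
# Assumption: every profile follows the pattern of the one documented
# example, tsdb/1212/goldcue/sse.ace.prob.
profile_path() {
  local cue="$1" dataset="$2"
  case "$cue" in
    gold|sys) ;;
    *) echo "unknown cue type: $cue" >&2; return 1 ;;
  esac
  case "$dataset" in
    sst|ssd|sse) ;;
    *) echo "unknown dataset: $dataset" >&2; return 1 ;;
  esac
  echo "tsdb/1212/${cue}cue/${dataset}.ace.prob"
}
```

Checking that the returned directory exists before running the system (for instance with `[ -d "$(profile_path gold sse)" ]`) is a cheap way to catch a mistyped dataset name.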