An Empirical Study on Compositionality in Compound Nouns

Siva Reddy, Diana McCarthy and Suresh Manandhar
Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP-2011), pp. 210-218
2011

Abstract

A multiword is compositional if its meaning can be expressed in terms of the meanings of its constituents. In this paper, we collect and analyse compositionality judgments for a range of compound nouns using Mechanical Turk. Unlike existing compositionality datasets, our dataset has judgments on the contribution of constituent words as well as judgments for the phrase as a whole. We use this dataset to study the relation between the judgments at the constituent level and those for the whole phrase. We then evaluate two types of distributional models for compositionality detection: constituent-based models and composition-function-based models. Both types show competitive performance, though the composition-function-based models perform slightly better. In both types, additive models perform better than their multiplicative counterparts.
Dataset: http://www.cs.york.ac.uk/aig/nl/dat...
Presentation: http://www.cs.york.ac.uk/aig/nl/dat...
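
The two model families named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the toy vectors are invented (they are not taken from the dataset), the equal weights in the additive variants are arbitrary, and all function names are hypothetical. The sketch assumes each word and the compound itself has a distributional vector, and scores compositionality with cosine similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compose_additive(u, v, a=0.5, b=0.5):
    """Additive composition: weighted sum of the constituent vectors (weights are assumed)."""
    return [a * x + b * y for x, y in zip(u, v)]


def compose_multiplicative(u, v):
    """Multiplicative composition: element-wise product of the constituent vectors."""
    return [x * y for x, y in zip(u, v)]


def composition_function_score(w1, w2, compound, compose):
    """Composition-function-based score: compare the composed vector with the compound's own vector."""
    return cosine(compose(w1, w2), compound)


def constituent_score(w1, w2, compound, combine):
    """Constituent-based score: compare each constituent with the compound, then combine the two similarities."""
    s1 = cosine(w1, compound)
    s2 = cosine(w2, compound)
    return combine(s1, s2)


# Invented toy co-occurrence vectors for a hypothetical two-word compound.
modifier = [2.0, 1.0, 0.0, 1.0]
head = [1.0, 3.0, 0.0, 1.0]
compound = [2.0, 3.0, 0.0, 1.0]

scores = {
    "composition/additive": composition_function_score(modifier, head, compound, compose_additive),
    "composition/multiplicative": composition_function_score(modifier, head, compound, compose_multiplicative),
    "constituent/additive": constituent_score(modifier, head, compound, lambda s1, s2: 0.5 * s1 + 0.5 * s2),
    "constituent/multiplicative": constituent_score(modifier, head, compound, lambda s1, s2: s1 * s2),
}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

A higher score is read here as "more compositional". In an evaluation such as the one the abstract describes, scores like these would be compared against the human judgments; that comparison is not shown in this sketch.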

BibTeX

@InProceedings{reddy-mccarthy-manandhar:2011:IJCNLP-2011,
  author    = {Reddy, Siva  and  McCarthy, Diana  and  Manandhar, Suresh},
  title     = {An Empirical Study on Compositionality in Compound Nouns},
  booktitle = {Proceedings of 5th International Joint Conference on Natural Language Processing (IJCNLP-2011)},
  month     = {November},
  year      = {2011},
  address   = {Chiang Mai, Thailand},
  publisher = {Asian Federation of Natural Language Processing},
  pages     = {210--218},
  url       = {http://www.aclweb.org/anthology/I/I11/I11-1024.pdf},
  abstract = {A multiword is compositional if its meaning
can be expressed in terms of the meaning
of its constituents. In this paper, we
collect and analyse the compositionality
judgments for a range of compound nouns
using Mechanical Turk. Unlike existing
compositionality datasets, our dataset
has judgments on the contribution of constituent
words as well as judgments for the
phrase as a whole. We use this dataset
to study the relation between the judgments
at constituent level to that for the
whole phrase. We then evaluate two different
types of distributional models for compositionality
detection - constituent based
models and composition function based
models. Both the models show competitive
performance though the composition
function based models perform slightly
better. In both types, additive models perform
better than their multiplicative counterparts.
Dataset: http://www.cs.york.ac.uk/aig/nl/datasets/compositionalityDataset/ijcnlp_compositionality_data.tgz
Presentation: http://www.cs.york.ac.uk/aig/nl/datasets/compositionalityDataset/EmpStdyComp.pdf}
}