Cats Climb Entails Mammals Move: Preserving Hyponymy in Compositional Distributional Semantics

Year 2021
Volume 22
Issue 3
Pages 311-353
Authors
Gemma De las Cuevas, Andreas Klingler, Martha Lewis, and Tim Netzer
Abstract
To give vector-based representations of meaning more structure, one approach, proposed by Piedeleu et al. (2015), Sadrzadeh et al. (2018), and
Bankova et al. (2018), is to use positive semidefinite (psd) matrices. These allow us to model similarity of words as well as the hyponymy
or is-a relationship. To compose words into phrases and sentences, we may represent adjectives, verbs, and other functional words as multilinear,
positivity-preserving maps, following the compositional distributional approach introduced in Coecke et al. (2010) and extended to
the realm of psd matrices in Piedeleu et al. (2015). However, it is not clear how to learn representations of functional words when working with
psd matrices. In this paper, we introduce a generic way of composing the psd matrices corresponding to words. We propose that psd matrices
for verbs, adjectives, and other functional words be lifted to completely positive (CP) maps that match their grammatical type. This lifting is
carried out by our composition rule called Compression, Compr. In contrast to previous composition rules like Fuzz and Phaser (Coecke and
Meichanetzidis, 2020) (a.k.a. KMult and BMult (Lewis, 2019a)), Compr preserves hyponymy. Mathematically, Compr is itself a CP map, and
is therefore linear and generally non-commutative. We give a number of proposals for the structure of Compr, based on spiders, cups, and
caps, and generate a range of composition rules. We test these rules on sentence entailment datasets from Kartsaklis and Sadrzadeh (2016), and
see some improvements over the performance of Fuzz and Phaser. We go on to estimate the parameters of a simplified form of Compr based on
entailment information from the aforementioned datasets, and find that whilst this learnt operator does not consistently outperform previously
proposed mechanisms, it is competitive and has the potential to improve with the use of a less simplified version.
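
To illustrate why a completely positive, linear composition rule can preserve hyponymy, the following is a minimal numerical sketch, not the paper's Compr operator. It assumes that hyponymy between psd matrices is modelled by the Löwner order (A is a hyponym of B when B - A is psd), which the abstract does not state explicitly. The matrices `cat` and `mammal`, the helper names, and the randomly drawn Kraus operators are all hypothetical and chosen only for illustration.

```python
import numpy as np


def is_psd(m, tol=1e-9):
    """Return True if the symmetric part of m has no eigenvalue below -tol."""
    sym = (m + m.T) / 2  # symmetrise to guard against round-off
    return bool(np.linalg.eigvalsh(sym).min() >= -tol)


def is_hyponym(a, b, tol=1e-9):
    """Assumed ordering: a is a hyponym of b when b - a is psd (Loewner order)."""
    return is_psd(b - a, tol)


def apply_cp_map(kraus_ops, x):
    """Apply the CP map x -> sum_k K_k x K_k^T, given in Kraus form (real case)."""
    return sum(k @ x @ k.T for k in kraus_ops)


rng = np.random.default_rng(0)
dim = 4

# Toy "cat": a rank-one psd matrix built from a random vector.
v = rng.normal(size=dim)
cat = np.outer(v, v)

# Toy "mammal": cat plus another psd matrix, so cat <= mammal by construction.
w = rng.normal(size=(dim, dim))
mammal = cat + w @ w.T

# Hypothetical CP map standing in for a functional word (e.g. a verb).
kraus_ops = [rng.normal(size=(dim, dim)) for _ in range(3)]

cat_image = apply_cp_map(kraus_ops, cat)
mammal_image = apply_cp_map(kraus_ops, mammal)

print("cat is a hyponym of mammal:", is_hyponym(cat, mammal))
print("order preserved after the CP map:", is_hyponym(cat_image, mammal_image))
```

The order is preserved because the map is linear and positive: the difference of the two images equals the image of `mammal - cat`, which is psd. This is a property of any CP map in this form, so the sketch says nothing about how Compr itself is structured or learnt.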