Investigating Dowty’s proto-roles with embeddings

Alessandro Lenci
Second author
2021-01-01

Abstract

Distributional semantics represents words as multidimensional vectors that record their statistical distribution in context. Although this approach is widely used in fields as distant as Natural Language Processing, psycholinguistic modeling and semantic analysis, relatively little work has focused on characterizing the semantic information encoded in these vectors, especially for verbs. Here we investigate whether and to what extent distributional vectors encode the semantic content of Dowty’s semantic proto-roles, which can be characterized as the set of entailment relations that an argument receives by virtue of its role in the event described by a predicate (Dowty 1989, 1991). We created several linear mappings between various kinds of static embeddings and a semantic space built from the proto-role annotations collected by White et al. (2016). Our results show that, to a certain extent, proto-role information is available in distributional models, and that a linear mapping can be used to infer the semantic characteristics of the arguments of novel verbs. This tests the possibility of developing large-scale models able to extract the semantic properties of a wide inventory of verbs. Finally, we report a qualitative analysis in which we discuss which entailment relations our technique associates with a few semantic verb classes whose semantic roles are notoriously difficult to describe.
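
The idea of a linear mapping from an embedding space to a proto-role property space can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mappings were estimated, so ridge-regularized least squares is an assumption here, and the data are synthetic stand-ins for real embeddings and annotations. The names `fit_linear_map` and `predict`, the dimensions, and the regularization strength are all hypothetical.

```python
import numpy as np


def fit_linear_map(X, Y, ridge=1.0):
    """Return W minimizing ||X W - Y||^2 + ridge * ||W||^2 (closed form)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)


def predict(X, W):
    """Map embeddings X into the property space."""
    return X @ W


# Synthetic data only: real work would use static embeddings of
# verb-argument items and property ratings derived from annotations.
rng = np.random.default_rng(0)
n_train, n_test = 200, 20      # arbitrary sizes (assumption)
emb_dim, n_props = 50, 10      # arbitrary dimensions (assumption)

W_true = rng.normal(size=(emb_dim, n_props))
X_train = rng.normal(size=(n_train, emb_dim))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_props))

# Held-out items play the part of "novel verbs" unseen during fitting.
X_test = rng.normal(size=(n_test, emb_dim))
Y_test = X_test @ W_true

W = fit_linear_map(X_train, Y_train, ridge=1.0)
Y_pred = predict(X_test, W)

# Per-property Pearson correlation between predicted and true scores.
corrs = [np.corrcoef(Y_pred[:, j], Y_test[:, j])[0, 1] for j in range(n_props)]
mean_corr = float(np.mean(corrs))
print(f"mean held-out correlation: {mean_corr:.3f}")
```

Evaluating on held-out items mirrors the question raised in the abstract, namely whether a mapping learned on some verbs carries over to novel ones. With real data the correlations would be expected to be far lower than on this noise-light synthetic example.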
2021
Lebani, Gianluca E.; Lenci, Alessandro

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1134760