CINECA IRIS Institutional Research Information System

On Computing Entity Relatedness in Wikipedia, with Applications

Ponza M.; Ferragina P.; Chakrabarti S.

2020-01-01

Abstract

Many text mining tasks, such as clustering, classification, retrieval, and named entity linking, benefit from a measure of relatedness between entities in a knowledge graph. We present a thorough study of all entity relatedness measures in recent literature based on Wikipedia as the knowledge graph. To facilitate this study, we introduce a new dataset with human judgments of entity relatedness. No clear dominance is seen between measures based on textual similarity and graph proximity. Some of the better measures involve expensive global graph computations. We propose a new, space-efficient, computationally lightweight, two-stage framework for relatedness computation. In the first stage, a small weighted subgraph is dynamically grown around the two query entities; in the second stage, relatedness is derived based on computations on this subgraph. Our system shows better agreement with human judgment than existing proposals both on the new dataset and on an established one. Our framework also shows improvements with respect to the state-of-the-art on three different extrinsic evaluations in the domains of ranking entity pairs, entity linking, and synonym extraction.
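The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not say how the subgraph is grown or weighted, so the Jaccard edge weights, the top-`k` neighbour selection, the cosine score, and the names `grow_subgraph` and `relatedness` are all assumptions made for illustration, and the tiny `toy` link graph is invented.

```python
from math import sqrt
from typing import Dict, Set

Graph = Dict[str, Set[str]]            # entity -> linked entities (toy, undirected)
Subgraph = Dict[str, Dict[str, float]]  # entity -> {other entity: edge weight}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two neighbour sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def grow_subgraph(graph: Graph, u: str, v: str, k: int = 2) -> Subgraph:
    """Stage 1 (assumed): keep u, v and the k most similar neighbours of each,
    then weight every pair of kept nodes by Jaccard overlap."""
    nodes = {u, v}
    for q in (u, v):
        nbrs = graph.get(q, set())
        # Sort by descending similarity, breaking ties alphabetically.
        ranked = sorted(nbrs, key=lambda n: (-jaccard(nbrs, graph.get(n, set())), n))
        nodes.update(ranked[:k])
    return {
        a: {b: jaccard(graph.get(a, set()), graph.get(b, set())) for b in nodes if b != a}
        for a in nodes
    }


def relatedness(graph: Graph, u: str, v: str, k: int = 2) -> float:
    """Stage 2 (assumed): cosine between the edge-weight vectors of u and v
    over the other nodes of the small subgraph."""
    sub = grow_subgraph(graph, u, v, k)
    others = [n for n in sub if n not in (u, v)]
    xu = [sub[u][n] for n in others]
    xv = [sub[v][n] for n in others]
    dot = sum(a * b for a, b in zip(xu, xv))
    norm = sqrt(sum(a * a for a in xu)) * sqrt(sum(b * b for b in xv))
    return dot / norm if norm else 0.0


# Invented toy link graph, for illustration only.
toy: Graph = {
    "Pisa": {"Tuscany", "Italy", "Galileo"},
    "Florence": {"Tuscany", "Italy", "Dante"},
    "Tuscany": {"Pisa", "Florence", "Italy"},
    "Italy": {"Pisa", "Florence", "Tuscany"},
    "Galileo": {"Pisa"},
    "Dante": {"Florence"},
    "Tokyo": {"Japan"},
    "Japan": {"Tokyo"},
}

print(relatedness(toy, "Pisa", "Florence"))
print(relatedness(toy, "Pisa", "Tokyo"))
```

The point of the sketch is only the shape of the computation: all work after stage 1 touches a handful of nodes, which is why such a scheme can avoid global graph computations.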

Year: 2020
DOI: https://dx.doi.org/10.1016/j.knosys.2019.105051
All authors: Ponza, M.; Ferragina, P.; Chakrabarti, S.
Appears in types: 1.1 Journal article

Files in this item:

File	Type	License	Size	Format
On computing entity relatedness in Wikipedia.pdf (not available)	Publisher's final version	NOT PUBLIC - private/restricted access	1.93 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1011848
