CINECA IRIS Institutional Research Information System

Nine Features in a Random Forest to Learn Taxonomical Semantic Relations

Santus, Enrico (first author); Lenci, Alessandro (co-first author); Chiu, Tin Shing (co-first author); Lu, Qin (co-first author); Huang, Chu Ren (co-first author)

2016-01-01

Abstract

ROOT9 is a supervised system for the classification of hypernyms, co-hyponyms and random words that is derived from the already introduced ROOT13 (Santus et al., 2016). It relies on a Random Forest algorithm and nine unsupervised corpus-based features. We evaluate it with a 10-fold cross validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i.e. adjectives, nouns and verbs). When all the classes are present, ROOT9 achieves an F1 score of 90.7%, against a baseline of 57.2% (vector cosine). When the classification is binary, ROOT9 achieves the following results against the baseline: hypernyms-co-hyponyms 95.7% vs. 69.8%, hypernyms-random 91.8% vs. 64.1% and co-hyponyms-random 97.8% vs. 79.4%. In order to compare the performance with the state-of-the-art, we have also evaluated ROOT9 on subsets of the Weeds et al. (2014) datasets, proving that it is in fact competitive. Finally, we investigated whether the system learns the semantic relation or simply learns the prototypical hypernyms, as claimed by Levy et al. (2015). The second possibility seems to be the more likely, even though ROOT9 can be trained on negative examples (i.e., switched hypernyms) to drastically reduce this bias.
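As a rough illustration of two ideas the abstract mentions, the sketch below shows a vector cosine score (the baseline measure) and one possible way to build "switched hypernym" negative examples by reversing hypernym pairs. The function names, the label strings `"hyper"`, `"coord"` and `"switched"`, and the toy data are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce the nine ROOT9 features or the Random Forest classifier.

```python
import math


def cosine(u, v):
    """Vector cosine between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def add_switched_negatives(pairs):
    """Return the input pairs plus a reversed copy of each hypernym pair.

    Each item is (word1, word2, label). The label names are hypothetical:
    "hyper" marks a hyponym-hypernym pair, and the reversed copy is
    labelled "switched" so it can serve as a negative example.
    """
    augmented = list(pairs)
    for word1, word2, label in pairs:
        if label == "hyper":
            augmented.append((word2, word1, "switched"))
    return augmented


# Toy data, invented for this example only.
toy_pairs = [("dog", "animal", "hyper"), ("dog", "cat", "coord")]
toy_augmented = add_switched_negatives(toy_pairs)
```

Because cosine is symmetric, `cosine(u, v)` equals `cosine(v, u)`, so a cosine score on its own gives the same value for a hypernym pair and its switched counterpart.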

Year: 2016
ISBN: 978-2-9517408-9-1
Publication type: 4.1 Conference proceedings contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/843152

Warning: the data displayed have not been validated by the university.
