The Tanl Lemmatizer Enriched with a Sequence of Cascading Filters

Attardi, Giuseppe; Dei Rossi, S.; Simi, Maria

We have extended an existing lemmatizer, which relies on a lexicon of about 1.2 million forms in which lemmas are indexed by rich PoS tags, with a sequence of cascading filters, each in charge of specific issues related to out-of-dictionary words. The last two filters are devoted to resolving semantic ambiguities between words of the same syntactic category by querying external resources: an enriched index built on the Italian Wikipedia and the Google index.
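The abstract gives no implementation details, so the following is only a minimal sketch of what a lexicon lookup followed by a cascade of filters might look like. Everything in it is an assumption made for illustration: the toy lexicon entries, the names `lowercase_filter`, `make_scoring_filter` and `lemmatize`, the stopping rule (stop as soon as exactly one candidate remains), and the made-up counts that stand in for a query to an external resource. It is not the Tanl code and does not reproduce the filters described in the paper.

```python
from typing import Callable, Dict, List, Tuple

Candidates = List[str]
# A filter receives the form, its PoS tag and the current candidate lemmas,
# and returns a (possibly refined) list of candidates.
Filter = Callable[[str, str, Candidates], Candidates]

# Toy lexicon (hypothetical entries): (form, PoS tag) -> candidate lemmas.
LEXICON: Dict[Tuple[str, str], Candidates] = {
    ("case", "NOUN"): ["casa"],
    # Same syntactic category, two possible lemmas.
    ("principi", "NOUN"): ["principe", "principio"],
}


def lowercase_filter(form: str, pos: str, candidates: Candidates) -> Candidates:
    """Hypothetical out-of-dictionary filter: retry the lookup in lowercase."""
    if candidates:
        return candidates
    return list(LEXICON.get((form.lower(), pos), []))


def make_scoring_filter(score: Callable[[str], float]) -> Filter:
    """Hypothetical disambiguation filter: keep the best-scoring candidate.

    `score` is a stand-in for a query to an external resource.
    """
    def scoring_filter(form: str, pos: str, candidates: Candidates) -> Candidates:
        if len(candidates) <= 1:
            return candidates
        return [max(candidates, key=score)]
    return scoring_filter


def lemmatize(form: str, pos: str, filters: List[Filter]) -> str:
    """Look the form up, then run the filters until one candidate remains."""
    candidates = list(LEXICON.get((form, pos), []))
    for apply_filter in filters:
        if len(candidates) == 1:
            break
        candidates = apply_filter(form, pos, candidates)
    # Fall back to the surface form when no candidate was found;
    # otherwise take the first remaining candidate.
    return candidates[0] if candidates else form


# Made-up counts, used only to make the example runnable.
TOY_COUNTS = {"principe": 1, "principio": 5}
FILTERS: List[Filter] = [
    lowercase_filter,
    make_scoring_filter(lambda lemma: TOY_COUNTS.get(lemma, 0)),
]

if __name__ == "__main__":
    for form in ("case", "Case", "principi", "xyz"):
        print(form, "->", lemmatize(form, "NOUN", FILTERS))
```

In this sketch the order of the filter list is the order of the cascade, so a cheap normalisation step runs before the more expensive disambiguation step, and a form already resolved by the lexicon skips the filters entirely.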


Year: 2012
ISBN: 9783642358272
Publication type: 4.1 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11568/238500
