The SMAPH system implements a pipeline of four main steps: (1) Fetching -- it fetches the search results returned by a search engine given the query to be annotated; (2) Spotting -- search result snippets are parsed to identify candidate mentions for the entities to be annotated. This is done in a novel way by detecting the keywords-in-context by looking at the bold parts of the search snippets; (3) Candidate generation -- candidate entities are generated in two ways: from the Wikipedia pages occurring in the search results, and from an existing annotator, using the mentions identified in the spotting step as input; (4) Pruning -- a binary SVM classifier is used to decide which entities to keep/discard in order to generate the final annotation set for the query. The SMAPH system ranked third on the development set and first on the final blind test of the 2014 ERD Challenge short text track.

The SMAPH system for query entity recognition and disambiguation

CORNOLTI, MARCO;FERRAGINA, PAOLO;
2014-01-01

Abstract

The SMAPH system implements a pipeline of four main steps: (1) Fetching -- it fetches the search results returned by a search engine given the query to be annotated; (2) Spotting -- search result snippets are parsed to identify candidate mentions for the entities to be annotated. This is done in a novel way by detecting the keywords-in-context by looking at the bold parts of the search snippets; (3) Candidate generation -- candidate entities are generated in two ways: from the Wikipedia pages occurring in the search results, and from an existing annotator, using the mentions identified in the spotting step as input; (4) Pruning -- a binary SVM classifier is used to decide which entities to keep/discard in order to generate the final annotation set for the query. The SMAPH system ranked third on the development set and first on the final blind test of the 2014 ERD Challenge short text track.
2014
9781450330237
File in questo prodotto:
File Dimensione Formato  
erd04-smaph.pdf

solo utenti autorizzati

Tipologia: Documento in Post-print
Licenza: NON PUBBLICO - Accesso privato/ristretto
Dimensione 265.96 kB
Formato Adobe PDF
265.96 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/640065
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 23
  • ???jsp.display-item.citation.isi??? ND
social impact