Recently, user-generated content in social media opened up new alluring possibilities for understanding the geospatial aspects of many real-world phenomena. Yet, the vast majority of such content lacks explicit, structured geographic information. Here, we describe the design and implementation of a novel approach for associating geographic in- formation to text documents. GSP exploits powerful machine learning algorithms on top of the rich, interconnected Linked Data in order to overcome limitations of previous state-of-the-art approaches. In detail, our technique performs semantic annotation to identify relevant tokens in the input document, traverses a sub-graph of Linked Data for extract- ing possible geographic information related to the identified tokens and optimizes its results by means of a Support Vector Machine classifier. We compare our results with those of 4 state-of-the-art techniques and baselines on ground-truth data from 2 evaluation datasets. Our GSP tech- nique achieves excellent performances, with the best F 1 = 0.91, sensibly outperforming benchmark techniques that achieve F 1 ≤ 0.78.

GSP (Geo-Semantic-Parsing): Geoparsing and Geotagging with Machine Learning on Top of Linked Data

Marco Avvenuti;Leonardo Nizzoli;
2018-01-01

Abstract

Recently, user-generated content in social media opened up new alluring possibilities for understanding the geospatial aspects of many real-world phenomena. Yet, the vast majority of such content lacks explicit, structured geographic information. Here, we describe the design and implementation of a novel approach for associating geographic in- formation to text documents. GSP exploits powerful machine learning algorithms on top of the rich, interconnected Linked Data in order to overcome limitations of previous state-of-the-art approaches. In detail, our technique performs semantic annotation to identify relevant tokens in the input document, traverses a sub-graph of Linked Data for extract- ing possible geographic information related to the identified tokens and optimizes its results by means of a Support Vector Machine classifier. We compare our results with those of 4 state-of-the-art techniques and baselines on ground-truth data from 2 evaluation datasets. Our GSP tech- nique achieves excellent performances, with the best F 1 = 0.91, sensibly outperforming benchmark techniques that achieve F 1 ≤ 0.78.
2018
978-3-319-93416-7
File in questo prodotto:
File Dimensione Formato  
2018-Geoparsing-ESWC18.pdf

solo utenti autorizzati

Tipologia: Versione finale editoriale
Licenza: NON PUBBLICO - Accesso privato/ristretto
Dimensione 666.69 kB
Formato Adobe PDF
666.69 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
2018-Geoparsing-Postprint.pdf

Open Access dal 01/07/2019

Tipologia: Documento in Post-print
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 953.54 kB
Formato Adobe PDF
953.54 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/920361
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 35
  • ???jsp.display-item.citation.isi??? 14
social impact