GSP (Geo-Semantic-Parsing): Geoparsing and Geotagging with Machine Learning on Top of Linked Data
Marco Avvenuti; Leonardo Nizzoli
2018-01-01
Abstract
Recently, user-generated content in social media has opened up alluring new possibilities for understanding the geospatial aspects of many real-world phenomena. Yet the vast majority of such content lacks explicit, structured geographic information. Here, we describe the design and implementation of a novel approach for associating geographic information with text documents. GSP exploits powerful machine learning algorithms on top of the rich, interconnected Linked Data in order to overcome limitations of previous state-of-the-art approaches. In detail, our technique performs semantic annotation to identify relevant tokens in the input document, traverses a sub-graph of Linked Data to extract possible geographic information related to the identified tokens, and optimizes its results by means of a Support Vector Machine classifier. We compare our results with those of 4 state-of-the-art techniques and baselines on ground-truth data from 2 evaluation datasets. Our GSP technique achieves excellent performance, with a best F1 = 0.91, markedly outperforming benchmark techniques that achieve F1 ≤ 0.78.

| File | Type | License | Access | Size | Format |
|---|---|---|---|---|---|
| 2018-Geoparsing-ESWC18.pdf | Final published version | Not public (private/restricted access) | Authorized users only | 666.69 kB | Adobe PDF |
| 2018-Geoparsing-Postprint.pdf | Post-print | All rights reserved | Open Access from 01/07/2019 | 953.54 kB | Adobe PDF |


