Towards a general architecture for social media data capture from a multi-domain perspective
BECHINI, ALESSIO;GAZZE', DAVIDE;MARCHETTI, ANDREA;TESCONI, MAURIZIO
2016-01-01
Abstract
Online Social Media (OSM) platforms, such as Facebook or Twitter, have become part of everyday life as powerful communication tools. They let users communicate anywhere, anytime, and shape their own public image. For this reason, OSM platforms are becoming more and more popular, and Social Media data may play a crucial role in various decision-making processes. In this setting, research topics connected to the monitoring of Social Media data are becoming increasingly important. The presented work is grounded in extensive direct experience in data collection from different Social Media sources, and in the different methodologies applied in three reference domains (namely Online Reputation, Social Media Intelligence, and Opinion Mining in tourism). The crawlers developed for these domains provide valuable input for eliciting diverse requirements. Building on the lessons learned in these fields, a general architecture for data capture from Social Media sources has been devised, and the interfaces of its constituent modules have been defined. The resulting API can be exploited for an orderly re-engineering of the crawling tools in the reference domains, each implementing a specific version of the generic architecture.