Addressing Event-Driven Concept Drift in Twitter Stream: A Stance Detection Application
Bechini A.; Bondielli A.; Ducange P.; Marcelloni F.; Renda A.
2021-01-01
Abstract
The content posted by users on social networks represents an important source of information for a myriad of applications in the wide field known as 'social sensing'. The Twitter platform in particular hosts the thoughts, opinions and comments of its users, expressed in the form of tweets. As a consequence, tweets are often analyzed with text mining and natural language processing techniques for relevant tasks, ranging from brand reputation and sentiment analysis to stance detection. In most cases the intelligent systems designed to accomplish these tasks are based on a classification model that, once trained, is deployed into the data flow for online monitoring. In this work we show that this approach is inadequate for the task of stance detection from tweets. The sequence of tweets collected every day is in fact a data stream and, as is well known in the literature on data stream mining, classification models may suffer from concept drift, i.e., a change in the data distribution that can degrade their performance. We present a broad experimental campaign for the case study of the online monitoring of the stance expressed on Twitter about the vaccination topic in Italy. We compare different learning schemes and propose a novel one, aimed at addressing event-driven concept drift.

File | Access | License | Size | Format
---|---|---|---|---
Addressing_Event-Driven_Concept_Drift_in_Twitter_Stream_A_Stance_Detection_Application.pdf | Open access | Creative Commons | 2.72 MB | Adobe PDF