A Dependency-Aware Utterances Permutation Strategy to Improve Conversational Evaluation

Faggioli, G.; Ferrante, M.; Ferro, N.; Perego, R.; Tonellotto, N.

doi:10.1007/978-3-030-99736-6_13

The rapid growth in the number and complexity of conversational agents has highlighted the need for suitable evaluation tools to describe their performance. The main evaluation paradigms start from analyzing conversations in which the user explores information needs by following a scripted dialogue with the agent. We argue that this is not a realistic setting: different users ask different questions (and in a different order), obtaining distinct answers and changing the conversation path. We analyze what happens to the performance of conversational systems when we change the order of the utterances in a scripted conversation while respecting the temporal dependencies between them. Our results highlight that system performance varies widely. Our experiments show that different orders of utterances produce completely different rankings of systems by performance. The current way of evaluating conversational systems is thus biased. Motivated by these observations, we propose a new evaluation approach based on dependency-aware utterance permutations to increase the power of our evaluation tools.
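To make the idea of a dependency-aware permutation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the temporal dependencies of a conversation can be written as a mapping from each utterance to the set of utterances that must precede it; the names `dependency_aware_permutations`, `utterances`, and `depends_on`, as well as the four-utterance example, are hypothetical and chosen only for illustration. The function enumerates every reordering of the conversation in which no utterance appears before one it depends on.

```python
from typing import Dict, Iterator, List, Set


def dependency_aware_permutations(
    utterances: List[str], depends_on: Dict[str, Set[str]]
) -> Iterator[List[str]]:
    """Yield every ordering in which each utterance follows all its prerequisites.

    `depends_on[u]` is the (assumed) set of utterances that must come before `u`.
    Utterances missing from the mapping have no prerequisites.
    """

    def extend(prefix: List[str], placed: Set[str]) -> Iterator[List[str]]:
        # A complete ordering has been built: emit a copy of it.
        if len(prefix) == len(utterances):
            yield list(prefix)
            return
        for u in utterances:
            # An utterance can be placed only once all its prerequisites are placed.
            if u not in placed and depends_on.get(u, set()) <= placed:
                prefix.append(u)
                placed.add(u)
                yield from extend(prefix, placed)
                # Backtrack so the next candidate starts from the same prefix.
                prefix.pop()
                placed.remove(u)

    yield from extend([], set())


# Hypothetical conversation: u2 and u3 both refer back to u1; u4 is self-contained.
utterances = ["u1", "u2", "u3", "u4"]
depends_on = {"u2": {"u1"}, "u3": {"u1"}}

orders = list(dependency_aware_permutations(utterances, depends_on))
print(len(orders))  # 8 of the 24 unconstrained permutations are admissible
for order in orders:
    print(" -> ".join(order))
```

Under these assumptions, each admissible ordering could be replayed against the systems under evaluation, and the resulting scores compared across orderings. Full enumeration grows factorially with conversation length, so sampling a subset of orderings would be a reasonable variation for longer conversations.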


2022-01-01



Year

2022

ISBN

978-3-030-99735-9
978-3-030-99736-6


Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1163067


CINECA IRIS Institutional Research Information System
