Enhanced news retrieval: Passages lead the way!
Nardini F. M.; Perego R.; Tonellotto N.
2019-01-01
Abstract
We observe that most relevant terms in unstructured news articles are primarily concentrated towards the beginning and the end of the document. Exploiting this observation, we propose a novel version of the classical BM25 weighting model, called BM25 Passage (BM25P), which scores query results by computing a linear combination of term statistics in the different portions of news articles. Our experimentation, conducted using three publicly available news datasets, demonstrates that BM25P markedly outperforms BM25 in terms of effectiveness by up to 17.44% in NDCG@5 and 85% in NDCG@1.
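The abstract does not give the scoring formula, so the following is only a minimal sketch of one way a passage-weighted BM25 could look. It assumes the document is split into equal-length passages, that the per-passage term frequencies are combined linearly with hypothetical weights, and that the combined frequency is fed into the standard BM25 saturation formula. The function names, the number of passages, the weight values, and the defaults `k1=1.2` and `b=0.75` are illustrative assumptions, not the authors' settings.

```python
import math
from collections import Counter


def split_passages(tokens, num_passages):
    """Split a token list into contiguous, near-equal passages (assumed scheme)."""
    n = len(tokens)
    bounds = [round(i * n / num_passages) for i in range(num_passages + 1)]
    return [tokens[bounds[i]:bounds[i + 1]] for i in range(num_passages)]


def weighted_tf(tokens, term, weights):
    """Linear combination of the term's frequency in each passage."""
    passages = split_passages(tokens, len(weights))
    return sum(w * Counter(p)[term] for w, p in zip(weights, passages))


def bm25p_score(query, doc, doc_freq, num_docs, avg_len, weights, k1=1.2, b=0.75):
    """Score one tokenized document for a tokenized query.

    doc_freq maps a term to the number of documents containing it.
    weights holds one hypothetical weight per passage.
    """
    # Standard BM25 length normalization, computed once per document.
    norm = k1 * (1 - b + b * len(doc) / avg_len)
    score = 0.0
    for term in query:
        tf = weighted_tf(doc, term, weights)
        if tf == 0:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score
```

With all weights set to 1 the combined frequency equals the plain term frequency, so the sketch reduces to ordinary BM25. Raising the weights of the first and last passages boosts documents whose query terms appear near the beginning or the end, which is the behaviour the abstract motivates.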