Upper bound approximations for BlockMaxWand
Tonellotto N.
2017-01-01
Abstract
BlockMaxWand (BMW) is a recent advance on the Wand dynamic pruning technique, which allows efficient retrieval without any effectiveness degradation to rank K. However, while BMW uses docid-sorted indices, it relies on recording the upper bound of the term weighting model scores for each block of postings in the inverted index. Such a requirement can be disadvantageous in situations such as when an index must be updated. In this work, we examine the appropriateness of upper-bound approximations, which have previously been shown suitable for Wand, in providing efficient retrieval for BMW. Experiments on the ClueWeb12 category B13 corpus using 5000 queries from a real search engine’s query log demonstrate that BMW still provides benefits w.r.t. Wand when approximate upper bounds are used, and that, if the approximations of the upper bounds are tight, BMW with approximate upper bounds can provide efficiency gains w.r.t. Wand with exact upper bounds, in particular for queries of short to medium length.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.