Feature Selection and Negative Evidence in Automated Text Categorization
SIMI, MARIA
2000-01-01
Abstract
We tackle two different problems of text categorization, namely feature selection (FS) and classifier induction. We propose a new FS technique based on a simplified version of the χ² statistic, and a novel variant of the well-known k-NN method based on the exploitation of negative evidence. We report the results of systematic experiments with these two methods on the Reuters-21578 benchmark.