Towards a Reference Corpus of Web Genres for the Evaluation of Genre Identification Systems