The rapid development of the World Wide Web as a medium of commerce and information dissemination has generated a growing interest of web portal managers in systems able to identify user profiles from the web access logs. The interpretation of these profiles can help re-organize the web portal, e.g., by restructuring the site's content more efficiently, or even to build adaptive web portals, i.e., portals whose organization and presentation change depending on the specific visitor's needs. In this paper, we assume that the pages of the web portal have been prearranged in a number of different categories. We introduce a systematic approach to determine a hierarchy of user profiles from the history of users' accesses to the categories. First, we filter the access log by removing both occasional users and categories of poor interest. Then, we apply an Unsupervised Fuzzy Divisive Hierarchical Clustering (UFDHC) algorithm to cluster the users of the web portal into a hierarchy of fuzzy groups characterized by a set of common interests and each represented by a prototype, which defines the profile of the group typical member. To identify the profile a specific user belongs to, we propose a novel classification method which completely exploits the information contained in the hierarchy. To prove the effectiveness of our approach, we apply the UFDHC algorithm to access log data collected over a period of 15 days and use the classification method to associate a profile with the users defined by access log data collected during subsequent 60 days. Finally, we highlight the good characteristics of our system by comparing our results with the ones obtained by applying a profiling system based on a modified version of the fuzzy C-means.

A hierarchical fuzzy clustering-based system to create user profiles

LAZZERINI, BEATRICE;MARCELLONI, FRANCESCO
2007

Abstract

The rapid development of the World Wide Web as a medium of commerce and information dissemination has generated a growing interest of web portal managers in systems able to identify user profiles from the web access logs. The interpretation of these profiles can help re-organize the web portal, e.g., by restructuring the site's content more efficiently, or even to build adaptive web portals, i.e., portals whose organization and presentation change depending on the specific visitor's needs. In this paper, we assume that the pages of the web portal have been prearranged in a number of different categories. We introduce a systematic approach to determine a hierarchy of user profiles from the history of users' accesses to the categories. First, we filter the access log by removing both occasional users and categories of poor interest. Then, we apply an Unsupervised Fuzzy Divisive Hierarchical Clustering (UFDHC) algorithm to cluster the users of the web portal into a hierarchy of fuzzy groups characterized by a set of common interests and each represented by a prototype, which defines the profile of the group typical member. To identify the profile a specific user belongs to, we propose a novel classification method which completely exploits the information contained in the hierarchy. To prove the effectiveness of our approach, we apply the UFDHC algorithm to access log data collected over a period of 15 days and use the classification method to associate a profile with the users defined by access log data collected during subsequent 60 days. Finally, we highlight the good characteristics of our system by comparing our results with the ones obtained by applying a profiling system based on a modified version of the fuzzy C-means.
Lazzerini, Beatrice; Marcelloni, Francesco
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11568/194892
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? 12
social impact