Web log preprocessing is an important step in Web log mining.
Preprocessing the Web log is a prerequisite for obtaining accurate information and directly affects the accuracy of the subsequent mining algorithms.
To handle massive Web logs, this paper proposes an optimized Web log preprocessing method and implements it on the distributed computing platform Hadoop.
We compared the performance of the Hadoop platform with that of a stand-alone machine, and the comparison demonstrates the efficiency of Web log preprocessing on Hadoop.