叶志雄, 王丹弘. Detection method of phishing website based on imbalance SVM-incremental learning of massive data[J]. 2016, 29(12): 26-31. DOI: 10.13992/j.cnki.tetas.2016.12.006.
Detection method of phishing website based on imbalance SVM-incremental learning of massive data
Abstract
Every year, phishing websites cause users heavy losses in e-commerce, communications, banking, and other fields, so preventing them effectively has become a difficult task. Based on an analysis of real data, this paper extracts two kinds of features to describe a web page: URL-related characteristics and the page's text content. A separate classifier is built for each feature representation and optimized following the idea of incremental learning, which improves the algorithm's online learning ability. Finally, a classification ensemble method combines the predictions of the individual classifiers to achieve online intelligent detection of phishing websites. Experimental results show that the ensemble classifier has good online learning ability and generalization ability.
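The pipeline outlined in the abstract (one classifier per feature view, incremental updates, then an ensemble) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the paper's method: the URL cues in `url_features`, the bag-of-words `text_features`, a linear SVM trained by hinge-loss subgradient steps as a stand-in for the authors' incremental SVM, the `pos_weight` cost that gives the rarer phishing class more influence, margin averaging as the ensemble rule, and the invented toy stream.

```python
import re


def url_features(url):
    """URL view: hypothetical binary cues, not the paper's actual feature set."""
    scheme, sep, rest = url.lower().partition("://")
    if not sep:  # no scheme given
        scheme, rest = "", url.lower()
    host = rest.split("/")[0]
    return {
        "bias": 1.0,
        "no_https": float(scheme != "https"),
        "has_at": float("@" in url),
        "has_ip": float(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        "host_hyphen": float("-" in host),
        "many_dots": float(host.count(".") >= 3),
    }


def text_features(text):
    """Text view: bag-of-words counts over lowercase alphabetic tokens."""
    feats = {"bias": 1.0}
    for token in re.findall(r"[a-z]+", text.lower()):
        key = "w=" + token
        feats[key] = feats.get(key, 0.0) + 1.0
    return feats


class IncrementalLinearSVM:
    """Linear SVM updated one sample at a time with hinge-loss subgradient steps."""

    def __init__(self, lr=0.1, reg=0.001, pos_weight=1.0):
        self.w = {}
        self.lr = lr
        self.reg = reg
        self.pos_weight = pos_weight  # assumed extra cost for the rare phishing class

    def margin(self, feats):
        return sum(self.w.get(k, 0.0) * v for k, v in feats.items())

    def partial_fit(self, feats, label):
        """Update on one sample; label is +1 (phishing) or -1 (legitimate)."""
        violated = label * self.margin(feats) < 1.0
        cost = self.pos_weight if label > 0 else 1.0
        for k, v in feats.items():
            # L2 shrinkage on this sample's feature keys only (a sparse simplification)
            w = self.w.get(k, 0.0) * (1.0 - self.lr * self.reg)
            if violated:
                w += self.lr * cost * label * v
            self.w[k] = w


class EnsembleDetector:
    """Averages the margins of one incremental classifier per feature view."""

    def __init__(self, pos_weight=1.0):
        self.views = [
            (lambda url, text: url_features(url), IncrementalLinearSVM(pos_weight=pos_weight)),
            (lambda url, text: text_features(text), IncrementalLinearSVM(pos_weight=pos_weight)),
        ]

    def score(self, url, text):
        margins = [clf.margin(extract(url, text)) for extract, clf in self.views]
        return sum(margins) / len(margins)

    def predict(self, url, text):
        return 1 if self.score(url, text) > 0 else -1

    def update(self, url, text, label):
        for extract, clf in self.views:
            clf.partial_fit(extract(url, text), label)


# Toy stream invented for illustration: (url, page text, label)
stream = [
    ("https://example.com/login", "welcome back sign in to view your orders", -1),
    ("https://shop.example.org/cart", "your shopping cart items and delivery options", -1),
    ("http://192.0.2.7/secure-login/verify", "urgent verify your account password now", 1),
    ("https://news.example.net/today", "daily news weather and sports headlines", -1),
    ("http://paypa1-secure.example-login.test/@confirm",
     "account suspended confirm password immediately", 1),
]

detector = EnsembleDetector(pos_weight=2.0)
for _ in range(50):  # replaying the stream stands in for a long-running feed
    for url, text, label in stream:
        detector.update(url, text, label)

predictions = [detector.predict(url, text) for url, text, _ in stream]
```

Because `update` touches only the weights of the current sample's features, newly labelled pages can be folded in as they arrive without retraining on stored data, which is the property the abstract refers to as online learning ability. A production system would likely replace the hand-written learner with a tested library implementation.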