朱壮军, 王彬. Application of massive data extraction technology in IDC bad information monitoring system[J]. 2020, 33(11): 82-87. DOI: 10.13992/j.cnki.tetas.2020.11.014.
Application of massive data extraction technology in IDC bad information monitoring system
摘要
本文阐述了某电信企业在建设IDC不良信息监测系统过程中，为高效处理每天的海量数据，选取了多种数据采集技术，进行反复方案论证和实验对比，最终选择了"Hadoop脚本+FTP"方式，极大提高了数据采集效率，实现了海量数据高效采集和处理，保证IDC不良信息监测系统能够及时发现和处理IDC中包含的不良信息，助力IDC业务健康发展，避免给国家和社会带来负面影响。
Abstract
This paper describes how a telecom operator, while building an IDC bad information monitoring system, evaluated several data extraction technologies through repeated scheme demonstration and experimental comparison in order to efficiently process tens of terabytes of data every day. The "Hadoop script + FTP" approach was finally selected. It greatly improved data extraction efficiency and enabled efficient collection and processing of massive data. As a result, the monitoring system can discover and handle bad information hosted in the IDC in a timely manner, which supports the healthy development of IDC business and avoids negative impact on the country and society.