王红雨, 杜刚, 朱艳云, et al. A variant keyword matching method based on the stroke order features of Chinese characters[J]. 2020, 33(12): 14-18. DOI: 10.13992/j.cnki.tetas.2020.12.003.
Spam short messages often contain large numbers of split characters and visually similar characters. Such messages can bypass keyword filtering and reach users. Because split and similar characters are numerous and vary flexibly, adding them all to the keyword database would make the database redundant. This paper proposes a variant keyword matching method based on the stroke order features of Chinese characters. First, the split characters in spam short messages are merged based on the stroke order features of Chinese characters. Second, the suspected keywords contained in the messages are looked up through an index table built from the characters of the keywords. Finally, a pyramid matching method is proposed to match the keywords. The proposed method can effectively reduce the redundancy of the keyword database and improve the efficiency of keyword matching.
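
The first two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the sketch assumes that the stroke sequence of a left-right character equals the concatenation of its components' sequences, and that the index table maps each keyword character to the keywords containing it. The table `STROKES` and the functions `merge_split_chars`, `build_index`, and `suspected_keywords` are hypothetical names with toy data, using the common five-class stroke coding (1 horizontal, 2 vertical, 3 left-falling, 4 dot, 5 turning). The pyramid matching step is left out because the abstract does not describe it.

```python
# Toy stroke-order table (assumption: illustrative data, not from the paper).
# Codes: 1 horizontal, 2 vertical, 3 left-falling, 4 dot, 5 turning.
STROKES = {
    "女": "531", "子": "521", "好": "531521",
    "日": "2511", "月": "3511", "明": "25113511",
    "天": "1134",
}

# Reverse lookup from a stroke sequence to the character it spells.
BY_STROKES = {seq: ch for ch, seq in STROKES.items()}


def merge_split_chars(text):
    """Merge adjacent characters whose concatenated strokes form a known character."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and all(c in STROKES for c in pair):
            merged = BY_STROKES.get(STROKES[pair[0]] + STROKES[pair[1]])
            if merged is not None:
                out.append(merged)
                i += 2
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


def build_index(keywords):
    """Map each character to the set of keywords that contain it."""
    index = {}
    for kw in keywords:
        for ch in kw:
            index.setdefault(ch, set()).add(kw)
    return index


def suspected_keywords(text, index):
    """Collect every keyword that shares at least one character with the text."""
    found = set()
    for ch in text:
        found |= index.get(ch, set())
    return found


index = build_index(["明天", "好运"])
restored = merge_split_chars("日月天见")
candidates = suspected_keywords(restored, index)
```

In this sketch, the split pair 日 and 月 is restored to 明 before lookup, so the keyword database only needs to store 明天 rather than every split variant. The candidates returned by the index are only suspects; a separate matching step would still have to confirm them. Note that naive pairwise merging would also fuse legitimate adjacent characters (for example 女子), which a real system would need to handle.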