School of Information Engineering, Ningxia University, Yinchuan 750021, Ningxia, China
HU Dezhou (2000—), male, master's student; research interest: named entity recognition (nxuhdz@163.com).
LI Guanfeng (1979—), male, associate professor, Ph.D.; research interests: semantic computing and knowledge graphs (ligf@nxu.edu.cn).
Received: 2023-12-07; Published in print: 2026-03-25
HU Dezhou, LI Guanfeng, LI Rui, et al. A Chinese Named Entity Recognition Method Integrating Multi-Head Attention[J]. Journal of Northwest Engineering Technology, 2026, 25(1): 27-32. DOI: 10.26974/j.cnki.XBGC.2026.01.004.
To address the limited feature-extraction capacity of existing Chinese named entity recognition (NER) models and their difficulty in modeling long-range dependencies, this study proposes a method that integrates multi-head attention (MA) with a bidirectional gated recurrent unit (BiGRU). First, the ALBERT (A Lite BERT) pre-trained language model generates contextualized representation vectors, which are fed into a BiGRU to extract global semantic features. A multi-head attention mechanism is then applied to capture long-range dependencies and further enhance the semantic representations. Finally, a conditional random field (CRF) layer decodes the optimal label sequence. Experimental results show that the proposed method achieves F1 scores above 95% on both the People's Daily and MSRA datasets, outperforming competing models. In addition, compared with the BERT-BiLSTM-CRF model, it reduces training time by approximately 14.5%, demonstrating its effectiveness and generality.
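The multi-head attention step described in the abstract splits the BiGRU hidden states into several subspaces, applies scaled dot-product attention in each, and concatenates the results. The following is a minimal NumPy sketch of that component only; the sequence length, model dimension, and random projection matrices are illustrative stand-ins for learned parameters, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(H, num_heads, rng):
    """Scaled dot-product multi-head self-attention over hidden
    states H of shape (seq_len, d_model), e.g. BiGRU outputs."""
    seq_len, d_model = H.shape
    assert d_model % num_heads == 0
    d_k = d_model // num_heads
    # Random projections stand in for the learned Wq, Wk, Wv, Wo.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) * 0.02
                      for _ in range(4))
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    # Split into heads: (num_heads, seq_len, d_k)
    split = lambda X: X.reshape(seq_len, num_heads, d_k).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    # Attention weights per head: (num_heads, seq_len, seq_len)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_k)
    out = softmax(scores) @ Vh                 # (num_heads, seq_len, d_k)
    # Concatenate heads and project back to d_model.
    concat = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

rng = np.random.default_rng(0)
H = rng.standard_normal((10, 256))             # hypothetical BiGRU outputs
Z = multi_head_attention(H, num_heads=8, rng=rng)
print(Z.shape)                                 # (10, 256)
```

In the paper's pipeline these attended features would then be passed to the CRF layer for label decoding; here the sketch only shows that the attention block preserves the (seq_len, d_model) shape so it can slot between the BiGRU and the CRF.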