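1. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001
2. School of Computer Science and Information Engineering, Heilongjiang University of Science and Technology, Harbin, Heilongjiang 150022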
WANG Hongbin, male, associate professor
HE Ming, male, assistant researcher, Ph.D.
Received: 2022-06-13
Published online: 2024-03-05
Published in print: 2024-04-05
Hongbin WANG, Shuai ZHANG, Ming HE, et al. Automatic labeling method for underwater acoustic signal samples based on multilayer optimal convolution[J]. Journal of Harbin Engineering University, 2024, 45(4): 758-763. DOI: 10.11990/jheu.202206048.
Deep learning applications in underwater acoustic research require large volumes of data, yet the number of available samples is limited. To address this problem, this paper proposes a multilayer optimal convolution network model. A similarity-based selection method picks the best convolution kernels so that more representative features are extracted. A layer feature fusion strategy is explored in which the outputs of multiple convolution layers are superimposed to obtain more comprehensive feature information. A reduction strategy is then used to optimize the model, which effectively shortens the computation time. Together, kernel selection, feature fusion, and an attention mechanism effectively address the problem. Experimental results show that the best labeling accuracy of the model on the dataset is 1.12% higher than that of the baseline model, while the running time is reduced by 43.5%. The model therefore improves both the accuracy and the efficiency of underwater acoustic signal labeling.
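The abstract names two steps, similarity-based kernel selection and the superposition of multilayer convolution outputs, without giving their details. The following is only a minimal sketch of how such steps could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: cosine similarity as the similarity measure, "best" read as "least redundant", a greedy selection order, 1-D kernels, a ReLU activation, averaging across kernels, and the function names `select_kernels`, `conv_layer`, and `fused_features`.

```python
import numpy as np


def select_kernels(kernels, keep):
    """Greedily keep `keep` kernels that are least similar to those already kept.

    Assumption: similarity is the absolute cosine similarity between
    flattened kernels, and every kernel has a non-zero norm.
    """
    flat = kernels.reshape(len(kernels), -1).astype(float)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = np.abs(unit @ unit.T)  # pairwise |cosine similarity|

    chosen = [0]  # start from the first kernel (arbitrary choice)
    while len(chosen) < keep:
        rest = [i for i in range(len(kernels)) if i not in chosen]
        # pick the candidate whose highest similarity to the kept set is lowest
        nxt = min(rest, key=lambda i: sim[i, chosen].max())
        chosen.append(nxt)
    return sorted(chosen)


def conv_layer(signal, kernels):
    """Apply each 1-D kernel with 'same' padding, average the responses, apply ReLU."""
    responses = np.stack([np.convolve(signal, k, mode="same") for k in kernels])
    return np.maximum(responses.mean(axis=0), 0.0)


def fused_features(signal, layers):
    """Run the layers in sequence and superimpose (sum) every layer's output."""
    outputs = []
    x = signal
    for kernels in layers:
        x = conv_layer(x, kernels)
        outputs.append(x)
    return np.sum(outputs, axis=0)
```

Under these assumptions, a call such as `select_kernels(kernels, 3)` returns the indices of the retained kernels, and `fused_features(signal, [kernels[idx], kernels[idx]])` returns a feature vector of the same length as `signal`. Summation is used for the fusion because the abstract speaks of superimposing the layer outputs; concatenation would be an equally plausible reading.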