Hongbin WANG, Shuai ZHANG, Ming HE, et al. Automatic labeling method for underwater acoustic signal samples based on multilayer optimal convolution[J]. Journal of Harbin Engineering University, 2024, 45(4): 758-763. DOI: 10.11990/jheu.202206048.
Automatic labeling method for underwater acoustic signal samples based on multilayer optimal convolution
The application of deep learning in underwater acoustic research often faces problems such as large data volume requirements and limited sample sizes. This study proposes a multilayer optimized convolutional network model that addresses these problems through optimization, feature fusion, and attention mechanisms. First, the best convolution kernels are selected using a similarity-based optimization method to extract representative features. Then, by exploring a layer feature fusion strategy, the multilayer convolution outputs are superimposed to obtain comprehensive feature information. Finally, a reduction strategy is used to optimize the model, which effectively shortens the operation time. The experimental results show that the best annotation accuracy of the model on the data set is 1.12% higher than that of the baseline model, and the running time is reduced by 43.5%. Therefore, this model improves the accuracy and efficiency of underwater acoustic signal labeling.
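The pipeline outlined in the abstract (similarity-based kernel selection, superposition of multilayer outputs, channel attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything specific in it is an assumption made for illustration: the similarity measure (absolute Pearson correlation between kernel outputs), the greedy selection rule, the random kernels, the tanh nonlinearity, the hand-off between layers, and the squeeze-and-excitation style gate, as well as all function names.

```python
import numpy as np


def conv_bank(x, kernels):
    """Filter the 1-D signal x with each kernel; 'same' mode keeps the length."""
    return np.stack([np.convolve(x, k, mode="same") for k in kernels])


def select_kernels(feature_maps, keep):
    """Greedily keep the `keep` least mutually similar feature maps.

    Similarity is |Pearson correlation| between kernel outputs. This measure
    is an assumption for illustration, not the paper's stated criterion.
    """
    sim = np.abs(np.corrcoef(feature_maps))
    chosen = [int(np.argmax(feature_maps.var(axis=1)))]  # seed: highest variance
    while len(chosen) < keep:
        remaining = [i for i in range(len(sim)) if i not in chosen]
        # add the candidate whose closest already-chosen map is least similar
        chosen.append(min(remaining, key=lambda i: sim[i, chosen].max()))
    return sorted(chosen)


def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style gating over channels (hypothetical weights)."""
    squeeze = features.mean(axis=1)               # one statistic per channel
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate per channel
    return features * gate[:, None], gate


def multilayer_fused_features(x, kernel_banks, keep):
    """Run several layers, keep `keep` kernels per layer, and sum the outputs."""
    fused = np.zeros((keep, len(x)))
    layer_input = x
    for kernels in kernel_banks:
        maps = np.tanh(conv_bank(layer_input, kernels))  # convolution + nonlinearity
        maps = maps[select_kernels(maps, keep)]          # drop redundant kernels
        fused += maps                                    # superimpose layer outputs
        layer_input = maps.mean(axis=0)                  # toy hand-off to next layer
    return fused


rng = np.random.default_rng(0)
signal = rng.standard_normal(256)                        # stand-in for an acoustic frame
banks = [rng.standard_normal((8, 5)) for _ in range(3)]  # 3 layers, 8 random kernels each
fused = multilayer_fused_features(signal, banks, keep=4)
weighted, gate = channel_attention(
    fused, rng.standard_normal((2, 4)), rng.standard_normal((4, 2))
)
print(fused.shape, weighted.shape)  # (4, 256) (4, 256)
```

Keeping only a subset of kernels per layer is also one plausible reading of how a reduction strategy could shorten running time, since later stages process fewer channels; the abstract does not specify the actual mechanism.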