Automatic labeling method for underwater acoustic signal samples based on multilayer optimal convolution

Hongbin WANG; Shuai ZHANG; Ming HE; Xiake CHEN

doi:10.11990/jheu.202206048

Updated: 2025-09-01

- Journal of Harbin Engineering University, Vol. 45, Issue 4, Pages 758-763 (2024)
- Author affiliations:
  1. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001
  2. School of Computer Science and Information Engineering, Heilongjiang University of Science and Technology, Harbin, Heilongjiang 150022
- CLC: U661.31
- Received: 13 June 2022
  Online First: 05 March 2024
  Published: 05 April 2024
Hongbin WANG, Shuai ZHANG, Ming HE, et al. Automatic labeling method for underwater acoustic signal samples based on multilayer optimal convolution[J]. Journal of Harbin Engineering University, 2024, 45(4): 758-763. DOI: 10.11990/jheu.202206048.

Abstract

Deep learning applications in underwater acoustic research require large volumes of data, yet the number of available samples is limited. To address this problem, this paper proposes a multilayer optimal convolution network model. A similarity-based selection method chooses the best convolution kernels so that more representative features are extracted. A layer feature fusion strategy then superimposes the outputs of multiple convolution layers to obtain more comprehensive feature information. A reduction strategy is applied to optimize the model, which effectively shortens the computation time. Together, kernel selection, feature fusion, and attention mechanisms effectively address the problem. The experimental results show that the best labeling accuracy of the model on the data set is 1.12 % higher than that of the baseline model, while the running time is reduced by 43.5 %. The model therefore improves both the accuracy and the efficiency of underwater acoustic signal labeling.
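
The abstract does not spell out how kernels are scored or how layer outputs are combined, so the following is only a minimal sketch of the two ideas, not the authors' implementation. It assumes cosine similarity against a hypothetical reference feature vector as the selection criterion, and element-wise summation (after cropping to the shortest map) as the way outputs are superimposed. All function names and the toy data are illustrative assumptions.

```python
import math


def correlate_valid(signal, kernel):
    """Valid-mode 1-D cross-correlation, the operation a convolution layer applies."""
    m = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(m))
            for i in range(len(signal) - m + 1)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_kernels(signal, reference, kernels, keep):
    """Assumed criterion: keep the `keep` kernels whose feature maps are most
    similar to `reference`. Returns the kept indices (ascending) and all scores."""
    scores = [cosine_similarity(correlate_valid(signal, k), reference)
              for k in kernels]
    ranked = sorted(range(len(kernels)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep]), scores


def fuse_layers(feature_maps):
    """Assumed fusion: crop every map to the shortest length and sum element-wise."""
    n = min(len(f) for f in feature_maps)
    return [sum(f[i] for f in feature_maps) for i in range(n)]


# Toy data (hypothetical): a short periodic signal and three candidate kernels.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernels = [[1.0, 0.0, -1.0], [1.0, 1.0, 1.0], [-1.0, 0.0, 1.0]]
# Hypothetical reference: here simply the response of the first kernel.
reference = correlate_valid(signal, kernels[0])

kept, scores = select_kernels(signal, reference, kernels, keep=2)
fused = fuse_layers([correlate_valid(signal, kernels[i]) for i in kept])
```

In this toy setup the third kernel is the negation of the first, so it receives the lowest score and is discarded. A real model would score learned kernels on actual underwater acoustic features, and the similarity measure used in the paper may differ from the cosine measure assumed here.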
