Construction and value of a vestibular function calibration test recognition model based on dual-stream ViT and ConvNeXt architecture

罗旭; 吴沛霞; 郝维明; 屈寅弘; 陈寒

doi:10.12025/j.issn.1008-6358.2025.20250219

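Abstract (translated from the Chinese):
Objective To construct a deep learning model based on a dual-stream architecture that improves the accuracy of interpreting videonystagmography calibration test results while also effectively recognizing saccadic undershoot waveforms.
Methods A vestibular function calibration recognition model with cross-modal feature fusion was built by combining a vision transformer (ViT) with a modified ConvNeXt convolutional network. The model takes trajectory plots and spatial distribution maps as inputs, classifies the calibration data within a multi-task learning framework, and directly determines whether an undershoot waveform is present.
Results The model performed very well in judging the degree of calibration compliance. Accuracy, sensitivity, and specificity for leftward, central, and rightward calibration were all greater than 90%, and the AUC values were all close to 0.99. The best accuracy was 97.66% (central calibration), the best sensitivity 98.98% (central calibration), the best specificity 96.87% (rightward calibration), and the highest AUC 0.997 (rightward calibration). The model also performed well in recognizing undershoot waveforms, with an accuracy of 87.50%, a sensitivity of 89.66%, a specificity of 85.71%, an F1 score of 86.67%, and an AUC of 0.931.
Conclusions The model built on ViT and ConvNeXt effectively improves the accuracy of interpreting vestibular function calibration test results and can also recognize undershoot waveforms, offering a new approach to both calibration test interpretation and undershoot waveform recognition.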

Abstract:
Objective To develop a deep learning model based on a dual-stream architecture that improves the efficiency and accuracy of interpreting videonystagmography calibration test results while enabling effective recognition of saccadic undershoot waveforms.
Methods A vestibular function calibration test recognition model with cross-modal feature fusion was constructed by integrating a vision transformer (ViT) with a modified ConvNeXt convolutional network. The model took trajectory plots and spatial distribution maps as inputs and used a multi-task learning framework both to classify the calibration data and to directly identify undershoot waveforms.
Results The model showed outstanding performance in assessing calibration compliance. Its accuracy, sensitivity, and specificity for left-side, middle, and right-side calibration were all greater than 90%, and the AUC values were all greater than 0.99. The best accuracy was 97.66% (middle), the best sensitivity 98.98% (middle), the best specificity 96.87% (right side), and the highest AUC 0.997 (right side). The model also showed promising performance in undershoot waveform recognition, with an accuracy of 87.50%, a sensitivity of 89.66%, a specificity of 85.71%, an F1 score of 86.67%, and an AUC of 0.931.
Conclusions The proposed method not only significantly improves the efficiency and accuracy of interpreting calibration test results, but also provides a novel solution for undershoot waveform recognition.
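
For readers less familiar with how the undershoot-recognition metrics relate to one another, the following minimal sketch computes accuracy, sensitivity, specificity, and F1 score from a binary confusion matrix. The counts used (26 true positives, 3 false negatives, 30 true negatives, 5 false positives) are an assumption: the abstract does not report a confusion matrix or a test-set size, and these are simply one set of integers that happens to reproduce the reported percentages. The function name `binary_metrics` is likewise illustrative and not taken from the paper.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classification metrics as percentages.

    tp/fn: positive cases (undershoot present) classified correctly/incorrectly.
    tn/fp: negative cases (no undershoot) classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    values = {
        "accuracy": (tp + tn) / total,        # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),        # recall on the positive class
        "specificity": tn / (tn + fp),        # recall on the negative class
        "f1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of precision and recall
    }
    return {name: round(100 * value, 2) for name, value in values.items()}


# Hypothetical counts (NOT reported in the source), chosen only because
# they are consistent with the percentages given in the abstract.
metrics = binary_metrics(tp=26, fn=3, tn=30, fp=5)
print(metrics)
```

Under these assumed counts the four values round to 87.5, 89.66, 85.71, and 86.67, matching the reported figures. The AUC cannot be derived from a single confusion matrix, because it depends on the model's scores across all decision thresholds.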
