Medical Image Segmentation Model Based on Feature Enhancement and Adaptive Region Masking
HUANG Huilei, LIU Quanjin, JI Wen, YANG Yihao, LING Yaqi, WU Lei, TAN Zhenkun
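Abstract:
To address the insufficient feature information and inaccurate target localization of Mask2Former in medical image segmentation, this paper studies a medical image segmentation model that combines feature enhancement with adaptive region masking. The model follows a lightweight design strategy: it reduces the number of target regions and introduces a deformable attention module and a region mask decoding module, which improves the computational efficiency of medical image segmentation and eases the pressure caused by the scarcity of annotated samples. In the fusion pixel decoder, a feature enhancement module is built to strengthen the semantic feature information output by deformable attention, which in turn promotes thorough fusion of multi-scale features. In the region mask decoder, an adaptive query feature generation network replaces the randomly generated object query features of the original Mask2Former, improving the accuracy of mask generation. Image segmentation experiments show that the model reaches mean Dice scores of 86.93%, 78.12%, and 89.90% on the VerSe (spine), LiTS (liver), and STS (teeth) datasets, respectively, outperforming mainstream algorithms such as U-Net, DeepLabV3, and MaskFormer.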
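The abstract reports results as mean Dice scores. As a reminder of what that metric measures, the following is a minimal sketch of the Dice coefficient for binary masks, averaged over a set of cases. The abstract does not say how the authors average (per class, per slice, or per volume), so the function names `dice_score` and `mean_dice`, the flattened 0/1 mask representation, and the convention for two empty masks are all assumptions made for illustration, not the paper's evaluation code.

```python
from typing import Iterable, Sequence, Tuple


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flattened binary masks.

    Masks are sequences of 0/1 values (1 = foreground); hypothetical helper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Overlap: pixels marked foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average Dice over (prediction, ground truth) pairs; assumes a non-empty input."""
    scores = [dice_score(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)
```

For example, a prediction `[1, 1, 0, 0]` against a ground truth `[1, 0, 0, 0]` overlaps in one pixel out of three foreground pixels in total, giving a Dice score of 2/3.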
Keywords: Mask2Former; medical image segmentation; feature enhancement; adaptive; inductive bias; region mask decoder
Foundation: National Natural Science Foundation of China (62272280); Shandong Provincial Natural Science Foundation (ZR2020KF103)
DOI: 10.13757/j.cnki.cn34-1328/n.2025.03.011