Transactions on Pattern Analysis and Machine Intelligence

[IS2D] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs (IEEE TPAMI2017)

2024.01.08

Hello. In the previous post, [IS2D] Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs (ICLR2015), we looked at DeepLabV1, a representative image segmentation model. Today we will look at DeepLabV2, the improved successor to DeepLabV1. Background Because DeepLabV2 was also proposed for semantic image segmentation, it shares the same challenges as DeepLabV1: 1) loss of image resolution caused by repeated pooling operations on the input image, 2) the need to secure invariance to spatial transformations, 3) even for the same object, objects of varying sizes ..

Reading Papers Together/Transformer

[Transformer] P2T: Pyramid Pooling Transformer for Scene Understanding (IEEE TPAMI2022)

2024.01.07

Hello. In the previous post, [Transformer] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (ICCV2021), we looked at PVT, which makes use of a feature pyramid. Today we will look at the Pyramid Pooling Transformer (P2T), another model that develops the feature pyramid idea further within a Transformer. P2T: Pyramid Pooling Transformer for Scene Understanding Recently, the vision transformer has achieved great success by pushi..
