Image Segmentation
Models
U-Net
Convolutional Networks for Biomedical Image Segmentation
- Left half (contracting path): height and width shrink at each level while the number of channels grows
- Right half (expansive path):
  - upsample the downscaled feature map back to the original height and width, with as many output channels as there are classes
  - reuse information from the corresponding layers in the left half via skip connections
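To make the two halves concrete, here is a minimal sketch that only traces tensor shapes through a U-Net-style network; it does no actual convolution. The function name `unet_shapes` and the parameters `base_channels` and `depth` are hypothetical. It assumes padded convolutions (a conv block leaves height and width unchanged), 2x2 pooling on the way down, 2x upsampling on the way up, and an input size divisible by `2 ** depth`.

```python
def unet_shapes(height, width, in_channels=3, num_classes=2,
                base_channels=64, depth=4):
    """Trace (name, channels, height, width) through a U-Net-style network.

    Assumption: padded convolutions, so only pooling and upsampling
    change the spatial size.
    """
    shapes = [("input", in_channels, height, width)]
    skips = []
    c, h, w = base_channels, height, width

    # Left half: channels double at each level, height and width halve.
    for level in range(depth):
        shapes.append((f"down{level}", c, h, w))
        skips.append((c, h, w))      # kept for the skip connection
        h, w = h // 2, w // 2        # 2x2 pooling
        c *= 2

    shapes.append(("bottleneck", c, h, w))

    # Right half: upsample, concatenate the matching left-half features,
    # then convolve back down to that level's channel count.
    for level in reversed(range(depth)):
        skip_c, skip_h, skip_w = skips[level]
        h, w = h * 2, w * 2          # 2x upsampling
        assert (h, w) == (skip_h, skip_w), "skip and upsampled maps must align"
        c = skip_c
        shapes.append((f"up{level}", c, h, w))

    # Final 1x1 convolution: one output channel per class.
    shapes.append(("output", num_classes, h, w))
    return shapes


for name, c, h, w in unet_shapes(256, 256, in_channels=1, num_classes=3):
    print(f"{name:>10}: {c:4d} x {h:3d} x {w:3d}")
```

The output row has the input's height and width but `num_classes` channels, which is what the right-half bullets describe.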
Mask R-CNN
Mask R-CNN
- mask: a per-pixel value of 0 or 1, indicating whether the pixel belongs to an object or not
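A small sketch of what such a binary mask looks like, assuming the model outputs a per-pixel foreground probability that is then thresholded. The helper `binarize_mask`, the 0.5 threshold, and the probability values are illustrative assumptions, not taken from a real model.

```python
def binarize_mask(probabilities, threshold=0.5):
    """Turn per-pixel foreground probabilities into a 0/1 mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]


# Made-up probabilities for a 3x3 region.
probs = [
    [0.1, 0.2, 0.7],
    [0.3, 0.9, 0.8],
    [0.0, 0.6, 0.4],
]
mask = binarize_mask(probs)
print(mask)  # [[0, 0, 1], [0, 1, 1], [0, 1, 0]]
```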
SAM
MaskFormer
OneFormer
Concepts
Transposed Convolution
https://makeyourownneuralnetwork.blogspot.com/2020/02/calculating-output-size-of-convolutions.html
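- Internally (e.g. in PyTorch's `ConvTranspose2d`), the effective zero padding is calculated as dilation * (kernel_size - 1) - padding. With dilation = 1, kernel_size = 2 and padding = 0, this is 1 * (2 - 1) - 0 = 1, so zero padding of 1 is added on each side of both dimensions of the input array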
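The padding rule can be checked with a pure-Python sketch. It covers only the simplest case (stride 1, dilation 1, padding 0), and all function names are hypothetical. The first function computes a transposed convolution directly by scattering scaled copies of the kernel. The second zero-pads the input by the amount from the formula and slides the flipped kernel over it. Both give the same result.

```python
def implicit_padding(kernel_size, padding=0, dilation=1):
    """Zero padding added to each side of the input, per the formula above."""
    return dilation * (kernel_size - 1) - padding


def transposed_conv_direct(x, k):
    """Stride-1 transposed convolution: each input value scatters a scaled kernel."""
    n, m = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out = [[0] * (m + kw - 1) for _ in range(n + kh - 1)]
    for i in range(n):
        for j in range(m):
            for a in range(kh):
                for b in range(kw):
                    out[i + a][j + b] += x[i][j] * k[a][b]
    return out


def transposed_conv_via_padding(x, k):
    """Same result (stride 1, dilation 1, padding 0) via zero padding."""
    n, m = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = implicit_padding(kh), implicit_padding(kw)
    # Zero-pad the input on every side.
    padded = [[0] * (m + 2 * pw) for _ in range(n + 2 * ph)]
    for i in range(n):
        for j in range(m):
            padded[i + ph][j + pw] = x[i][j]
    # Flip the kernel in both dimensions, then slide it over the padded input.
    flipped = [row[::-1] for row in k[::-1]]
    out_h = len(padded) - kh + 1
    out_w = len(padded[0]) - kw + 1
    return [
        [
            sum(padded[i + a][j + b] * flipped[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


x = [[1, 2], [3, 4]]   # 2x2 input
k = [[1, 0], [2, 3]]   # 2x2 kernel, so implicit padding is 1
print(transposed_conv_direct(x, k))       # [[1, 2, 0], [5, 11, 6], [6, 17, 12]]
print(transposed_conv_via_padding(x, k))  # same 3x3 result
```

With a 2x2 kernel the 2x2 input is padded to 4x4, and sliding the kernel over it yields the 3x3 output.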
Convolution output size
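As a quick reference, the standard output-size formulas along one spatial dimension can be sketched as below. The function names are hypothetical, and the parameter names follow the common kernel_size / stride / padding / dilation convention used earlier in these notes.

```python
def conv_output_size(n, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a regular convolution along one dimension."""
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_transpose_output_size(n, kernel_size, stride=1, padding=0,
                               dilation=1, output_padding=0):
    """Output length of a transposed convolution along one dimension."""
    return ((n - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


print(conv_output_size(28, 3))                        # 26
print(conv_output_size(28, 3, padding=1))             # 28
print(conv_output_size(224, 7, stride=2, padding=3))  # 112
print(conv_transpose_output_size(2, 2))               # 3
print(conv_transpose_output_size(16, 2, stride=2))    # 32
```

The last line shows why a kernel of 2 with stride 2 is a common choice for upsampling: it exactly doubles the spatial size.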