What do you mean by semantic segmentation?

Semantic segmentation is the process of assigning a class label to every pixel in an image. This contrasts with image classification, which assigns a single label to the whole image. Semantic segmentation treats multiple objects of the same class as a single entity.


Instance segmentation, on the other hand, treats multiple objects in the same class as separate objects (or instances). Instance segmentation is typically more difficult than semantic segmentation.
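To make the difference concrete, here is a minimal sketch on a toy mask. The class id, the grid, and the helper name label_instances are all made up for illustration. It uses 4-connected components as a stand-in for instance ids, which only works when objects do not touch; real instance segmentation models also have to separate touching or overlapping objects.

```python
from collections import deque

# Toy semantic mask: 0 = background, 1 = one object class (hypothetical id).
# Both blobs share the same semantic label.
semantic_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]


def label_instances(mask, class_id):
    """Give each 4-connected blob of class_id its own instance id (1, 2, ...)."""
    rows, cols = len(mask), len(mask[0])
    instances = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != class_id or instances[r][c]:
                continue
            # Found an unvisited pixel of the class: flood-fill a new instance.
            next_id += 1
            instances[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == class_id
                            and not instances[ny][nx]):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


instance_mask, num_instances = label_instances(semantic_mask, class_id=1)
for row in instance_mask:
    print(row)
```

The semantic mask says only "these pixels belong to the class", while the instance mask additionally tells the two objects apart.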

  • Classical Methods

Segmentation at Gray Level

Gray-level segmentation is the simplest form of semantic segmentation. It assigns a label to a region based on hard-coded rules, which are expressed in terms of pixel properties such as gray-level intensity.
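One of the simplest hard-coded rules is a fixed intensity threshold. The sketch below assumes an 8-bit grayscale image stored as nested lists; the function name threshold_segment and the cutoff of 128 are illustrative choices, not values from any particular method.

```python
def threshold_segment(image, threshold=128):
    """Label a pixel 1 (foreground) if its gray level is >= threshold, else 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Toy 3x4 grayscale image with a bright region on a dark background.
image = [
    [12, 40, 200, 220],
    [30, 180, 210, 35],
    [25, 20, 190, 15],
]

labels = threshold_segment(image, threshold=128)
for row in labels:
    print(row)
```

A rule like this is cheap and easy to reason about, but it breaks down as soon as lighting or contrast varies across the image.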

Split and Merge is one example of this technique. The algorithm recursively splits an image into sub-regions until each sub-region can be given a single label, and then merges adjacent sub-regions that share the same label.
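The following is a simplified sketch of the idea, not a reference implementation. It assumes a region is uniform enough to label when its gray levels span at most tol, labels each region by thresholding its mean, and performs the merge implicitly by painting labels into one map, so adjacent regions with the same label become one area. The names split and split_and_merge and both parameters are hypothetical.

```python
def split(image, top, left, height, width, tol, leaves):
    """Recursively quarter a region until its gray levels span at most tol."""
    pixels = [image[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    if max(pixels) - min(pixels) <= tol or (height == 1 and width == 1):
        leaves.append((top, left, height, width, sum(pixels) / len(pixels)))
        return
    h2, w2 = max(height // 2, 1), max(width // 2, 1)
    for t, h in ((top, h2), (top + h2, height - h2)):
        for l, w in ((left, w2), (left + w2, width - w2)):
            if h > 0 and w > 0:  # skip empty halves of 1-pixel-wide regions
                split(image, t, l, h, w, tol, leaves)


def split_and_merge(image, tol=20, threshold=128):
    """Return a label map and the number of regions produced by the split step."""
    rows, cols = len(image), len(image[0])
    leaves = []
    split(image, 0, 0, rows, cols, tol, leaves)
    labels = [[0] * cols for _ in range(rows)]
    for top, left, height, width, mean in leaves:
        label = 1 if mean >= threshold else 0
        # Painting into a shared map merges neighbors that got the same label.
        for r in range(top, top + height):
            for c in range(left, left + width):
                labels[r][c] = label
    return labels, len(leaves)


image = [
    [10, 12, 200, 205],
    [11, 13, 198, 202],
    [10, 12, 14, 201],
    [9, 11, 13, 199],
]

labels, num_leaves = split_and_merge(image)
for row in labels:
    print(row)
```

On this toy image three quadrants are uniform and stay whole, while the mixed bottom-right quadrant is split down to single pixels before its pieces are merged back into the dark and bright areas.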

Conditional Random Fields

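An image can be segmented by training a model that assigns a class to each pixel. If the model is imperfect, its per-pixel predictions can be noisy, producing segmentations that could not occur in the real world, such as isolated pixels of one class scattered inside another object. A conditional random field (CRF) reduces this noise by taking the labels of neighboring pixels into account when deciding each pixel's label.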
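As a rough illustration of how neighbor information can clean up noisy per-pixel predictions, the sketch below minimizes a simple grid energy: a per-pixel cost plus a penalty for every pair of 4-connected neighbors with different labels (a Potts model). It uses iterated conditional modes, a basic greedy optimizer, rather than the inference methods used in practical CRF libraries. The function name icm_smooth, the cost definition (1 minus the predicted probability), and the penalty weight are all assumptions made for this example.

```python
def icm_smooth(unary, pairwise_weight=1.0, iterations=5):
    """Greedily lower a Potts-style energy with iterated conditional modes.

    unary[r][c][k] is the cost of giving pixel (r, c) label k.
    """
    rows, cols, num_labels = len(unary), len(unary[0]), len(unary[0][0])
    # Start from the label each pixel would pick on its own.
    labels = [[min(range(num_labels), key=lambda k: unary[r][c][k])
               for c in range(cols)] for r in range(rows)]
    for _ in range(iterations):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbors = [labels[y][x]
                             for y, x in ((r - 1, c), (r + 1, c),
                                          (r, c - 1), (r, c + 1))
                             if 0 <= y < rows and 0 <= x < cols]

                def cost(k):
                    # Own cost plus a penalty per disagreeing neighbor.
                    disagreements = sum(n != k for n in neighbors)
                    return unary[r][c][k] + pairwise_weight * disagreements

                best = min(range(num_labels), key=cost)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels


# Hypothetical foreground probabilities from an imperfect classifier:
# the center pixel is wrongly predicted as background.
fg_prob = [
    [0.9, 0.9, 0.9],
    [0.9, 0.4, 0.9],
    [0.9, 0.9, 0.9],
]
# Cost of label 0 (background) is p, cost of label 1 (foreground) is 1 - p.
unary = [[[p, 1.0 - p] for p in row] for row in fg_prob]

smoothed = icm_smooth(unary)
for row in smoothed:
    print(row)
```

The isolated background pixel in the center is flipped to foreground because disagreeing with all four neighbors costs more than overriding its weak individual prediction.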

  • Deep Learning Methods

Deep learning has simplified the semantic segmentation pipeline and produces impressive results. This section discusses the most popular models and loss functions used to train deep learning methods.

Model Architectures

The Fully Convolutional Network (FCN) is one of the simplest and most popular architectures for semantic segmentation. In the paper Fully Convolutional Networks for Semantic Segmentation, the input image is downsampled to a smaller size while gaining more channels through a series of convolutions.
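The trade of spatial resolution for channels can be traced with simple shape arithmetic. The sketch below assumes a hypothetical encoder in which every stage halves the height and width and doubles the channel count; the stage count and channel numbers are illustrative and are not taken from the FCN paper.

```python
def encoder_shapes(height, width, in_channels=3, base_channels=64, stages=4):
    """Trace (channels, height, width) through a hypothetical encoder.

    Each stage is assumed to halve the resolution and double the channels.
    """
    shapes = [(in_channels, height, width)]
    channels = base_channels
    for _ in range(stages):
        height, width = height // 2, width // 2
        shapes.append((channels, height, width))
        channels *= 2
    return shapes


shapes = encoder_shapes(224, 224)
for shape in shapes:
    print(shape)
```

The final feature map is much smaller than the input, which is why a segmentation network then has to upsample it to recover a label for every input pixel.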


The U-Net architecture is an upgrade of the FCN architecture. It uses skip connections that pass the output of each convolution block in the downsampling path to the transposed-convolution block at the same level of the upsampling path.
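The way skip connections line up can also be followed through shapes alone. This sketch assumes a U-Net-style layout in which each encoder level halves the resolution and doubles the channels, each transposed convolution does the reverse, and the saved encoder output is concatenated channel-wise with the upsampled features. The function name unet_shapes, the depth, and the channel counts are illustrative, and the input size is assumed to be divisible by 2 to the power of depth.

```python
def unet_shapes(height, width, base_channels=64, depth=3):
    """Trace (channels, height, width) through a hypothetical U-Net-style net."""
    skips = []
    channels = base_channels
    # Downsampling path: save each conv block's output for the skip connection.
    for _ in range(depth):
        skips.append((channels, height, width))
        height, width = height // 2, width // 2
        channels *= 2
    bottleneck = (channels, height, width)

    # Upsampling path: upsample, then concatenate the same-level skip.
    decoder_inputs = []
    for skip_channels, skip_h, skip_w in reversed(skips):
        channels //= 2
        height, width = height * 2, width * 2
        if (height, width) != (skip_h, skip_w):
            raise ValueError("input size must be divisible by 2 ** depth")
        decoder_inputs.append((channels + skip_channels, height, width))
    return skips, bottleneck, decoder_inputs


skips, bottleneck, decoder_inputs = unet_shapes(128, 128)
print("skips:         ", skips)
print("bottleneck:    ", bottleneck)
print("decoder inputs:", decoder_inputs)
```

Because each skip has the same height and width as the upsampled features at its level, the two can be joined directly, which gives the decoder access to fine spatial detail that was lost during downsampling.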