Sunday, May 4, 2025

CNN in Deep Learning: Algorithm and Machine Learning Uses

 Artificial Intelligence has come a long way and has been seamlessly bridging the gap between the potential of humans and machines. And data enthusiasts all around the globe work on numerous aspects of AI and turn visions into reality - and one such amazing area is the domain of Computer Vision. This field aims to enable and configure machines to view the world as humans do, and use the knowledge for several tasks and processes (such as Image Recognition, Image Analysis and Classification, and so on). And the advancements in Computer Vision with Deep Learning have been a considerable success, particularly with the Convolutional Neural Network algorithm.



Introduction to CNN

Yann LeCun, director of Facebook’s AI Research Group, pioneered convolutional neural networks. In 1988, he built the first one, LeNet, which was used for character recognition tasks like reading zip codes and digits.

Have you ever wondered how facial recognition works on social media, or how object detection helps in building self-driving cars, or how disease detection is done using visual imagery in healthcare? It’s all possible thanks to convolutional neural networks (CNN). Here’s an example of convolutional neural networks that illustrates how they work:

Imagine there’s an image of a bird, and you want to identify whether it’s really a bird or some other object. The first thing you do is feed the pixels of the image in the form of arrays to the input layer of the neural network (multi-layer networks used to classify things). The hidden layers carry out feature extraction by performing different calculations and manipulations. There are multiple hidden layers like the convolution layer, the ReLU layer, and pooling layer, that perform feature extraction from the image. Finally, there’s a fully connected layer that identifies the object in the image.

Convolutional Neural Network to identify the image of a bird

Fig: Convolutional Neural Network to identify the image of a bird

What is Convolutional Neural Network?

A convolutional neural network is a feed-forward neural network that is generally used to analyze visual images by processing data with grid-like topology. It’s also known as a ConvNet. A convolutional neural network is used to detect and classify objects in an image.

Below is a neural network that identifies two types of flowers: Orchid and Rose.

In CNN, every image is represented in the form of an array of pixel values.

The convolution operation forms the basis of any convolutional neural network. Let’s understand the convolution operation using two matrices, a and b, of 1 dimension.

a = [5,3,7,5,9,7]

b = [1,2,3]

In convolution operation, the arrays are multiplied element-wise, and the product is summed to create a new array, which represents a*b.

The first three elements of the matrix a are multiplied with the elements of matrix b. The product is summed to get the result.

The next three elements from the matrix a are multiplied by the elements in matrix b, and the product is summed up.

This process continues until the convolution operation is complete.

How Does CNN Recognize Images?

Consider the following images:

The boxes that are colored represent a pixel value of 1, and 0 if not colored.

When you press backslash (\), the below image gets processed.

When you press forward-slash (/), the below image is processed:

Here is another example to depict how CNN recognizes an image:

As you can see from the above diagram, only those values are lit that have a value of 1.

Layers in a Convolutional Neural Network

A convolution neural network has multiple hidden layers that help in extracting information from an image. The four important layers in CNN are:

  1. Convolution layer
  2. ReLU layer
  3. Pooling layer
  4. Fully connected layer
  5. ReLU layer/ Activation Layer
  6. Flattening
  7. Output Layer

Convolution Layer

This is the first step in the process of extracting valuable features from an image. A convolution layer has several filters that perform the convolution operation. Every image is considered as a matrix of pixel values.

Consider the following 5x5 image whose pixel values are either 0 or 1. There’s also a filter matrix with a dimension of 3x3. Slide the filter matrix over the image and compute the dot product to get the convolved feature matrix.

ReLU layer

ReLU stands for the rectified linear unit. Once the feature maps are extracted, the next step is to move them to a ReLU layer. 

ReLU performs an element-wise operation and sets all the negative pixels to 0. It introduces non-linearity to the network, and the generated output is a rectified feature map. Below is the graph of a ReLU function:

No comments:

Post a Comment

Scientists from Russia and Vietnam discover new antimicrobial compounds in marine sponges

  Scientists from the G. B. Elyakov Pacific Institute of Bioorganic Chemistry of the Far Eastern Branch of the Russian Academy of Sciences, ...