What is a Convolutional Neural Network?
The term convolutional neural network (CNN) can sound like something out of science fiction, an odd mix of mathematics and biology with some computer science thrown in. Yet CNNs have been among the most powerful innovations in the field of computer vision.
A CNN is a type of neural network (NN) used to identify and classify images. Object detection and face recognition are among its most widely used applications in deep learning.
Neural networks rose to prominence in 2012, when Alex Krizhevsky used them to win that year's ImageNet competition, dropping the classification error rate from 26% to 15%, an astonishing improvement for computer vision at the time. Since then, many companies have put deep learning at the core of their services.
Facebook uses it for automatic image tagging, Amazon for product and ad recommendations, Google for photo search, and Pinterest for its personalized feed.
Convolutional networks also have commercial uses such as character recognition for digitizing text, which makes natural language processing possible for handwritten and analog documents.
Basic Operations in a Convolutional Neural Network
1) The Convolution Operation
The main aim of a convolutional layer is to detect visual features in images, such as lines, edges, or changes in color. A very useful property is that once the layer has learned a feature at one location, it can recognize that feature anywhere else in the image. A dense neural network, in contrast, has to learn the pattern again if it appears at a new position.
Convolutional layers operate on three-dimensional tensors called feature maps, with two spatial axes (height and width) and a channel axis, also called depth. For an RGB image the depth is 3, since there are three channels: red, green, and blue. For black-and-white images, such as the MNIST digits, the depth is 1. To visualize the process, we start with a window in the top-left corner of the image; this window passes its information to the first neuron of the hidden layer.
For example, with a 28×28 pixel input image and a 5×5 window, the first hidden layer has a space of 24×24 neurons, because the window can only move 23 positions to the right and 23 positions down before hitting the right (or bottom) border of the input image.
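The 24×24 figure can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming no padding is applied; the function name conv_output_size and its stride argument (how many pixels the window moves per step, 1 in this example) are illustrative and not taken from any library.

```python
def conv_output_size(input_size: int, window: int, stride: int = 1) -> int:
    """Number of window positions along one axis when no padding is used."""
    if window > input_size:
        raise ValueError("window must not be larger than the input")
    # The first position is at offset 0; every further position moves by `stride`.
    return (input_size - window) // stride + 1


# 28x28 input with a 5x5 window: 23 moves plus the starting position.
side = conv_output_size(28, 5)
print(side, "x", side)  # 24 x 24
```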
With a stride of 1, the filter moves one pixel at a time as it traverses the whole image. With a stride of 2, the filter moves 2 pixels at a time, and so on. The given figure shows how the convolution layer works with the stride set to 2.
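To make the effect of the stride concrete, here is a small pure-Python sketch of a sliding window over a toy image. The 5×5 image and the 3×3 vertical-edge filter are made-up illustrations (a real network learns its filter values), and the function convolve2d is a hypothetical helper, not a library call.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (lists of lists) without padding.

    As in most deep learning libraries, the kernel is not flipped,
    so strictly speaking this computes a cross-correlation.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            total = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[top + di][left + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 5x5 image: dark on the left (0), bright from the third column on (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Illustrative 3x3 filter that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

stride_1 = convolve2d(image, kernel, stride=1)  # 3x3 feature map
stride_2 = convolve2d(image, kernel, stride=2)  # 2x2 feature map
print(stride_1[0])  # [3.0, 3.0, 0.0]
print(stride_2[0])  # [3.0, 0.0]
```

The larger stride visits fewer positions, so the feature map shrinks from 3×3 to 2×2 while still responding to the edge.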
- Non-Linearity (ReLU)
After the convolution, a non-linear function called the rectified linear unit (ReLU) is applied. It is defined as:
g(x) = max(0, x).
ReLU is important because it introduces non-linearity into the neural network: convolution on its own is a linear operation, while most of the real-world data we want our CNN to learn is non-linear. ReLU replaces every negative value in the feature map with zero.
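As a quick illustration of g(x) = max(0, x), the sketch below applies ReLU element by element to a small feature map. The values in feature_map are invented for the example, and relu_map is a hypothetical helper name.

```python
def relu(x: float) -> float:
    """Rectified linear unit: g(x) = max(0, x)."""
    return max(0.0, x)


def relu_map(feature_map):
    """Apply ReLU to every value of a 2D feature map (list of lists)."""
    return [[relu(value) for value in row] for row in feature_map]


# Made-up feature map with positive and negative responses.
feature_map = [[-3.0, 1.5],
               [0.0, -0.5]]
activated = relu_map(feature_map)
print(activated)  # [[0.0, 1.5], [0.0, 0.0]]
```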
2) The Pooling Operation
The pooling layer simplifies the information collected by the convolutional layer by creating a condensed summary of it.
For this MNIST example, a 2×2 window is chosen for the pooling layer, so each pooling unit summarizes a 2×2 region of the convolutional layer's output.
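A minimal sketch of 2×2 pooling follows, assuming max-pooling (the variant this article uses later) and non-overlapping windows. The 4×4 feature map is made up for the example and max_pool2d is a hypothetical helper.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value of each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Made-up 4x4 feature map; each 2x2 block collapses to one value.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Applied to a 24×24 feature map, the same function would return a 12×12 summary.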
Basic Algorithm of a 2D CNN
Convolutional Neural Networks with Python
A convolutional neural network can be implemented in Python. In this example, the convolutional layer uses 32 filters with a 5×5 window, and the pooling layer uses a 2×2 window. The network is configured to process a tensor of size (28, 28, 1), the size of the MNIST images (the third value is the color channel of the image). This is specified through input_shape = (28, 28, 1) on the first layer.
The Conv2d layer has 832 parameters, that is 32 × (25 + 1): each of the 32 filters has a 5×5 weight matrix W (25 weights) and one bias b.
Note: Max-pooling simply takes the maximum value from each window of a matrix, so it does not require any parameters.
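The shapes and parameter counts described above can be reproduced with plain arithmetic. This is a sketch under stated assumptions: stride 1 and no padding for the convolution, non-overlapping windows for the pooling. The helper names conv2d_summary and max_pool_summary are hypothetical and do not belong to any framework.

```python
def conv2d_summary(input_shape, filters, window):
    """Output shape and parameter count of a conv layer (stride 1, no padding)."""
    height, width, channels = input_shape
    out_shape = (height - window + 1, width - window + 1, filters)
    # Each filter has window * window weights per input channel, plus one bias.
    params = filters * (window * window * channels + 1)
    return out_shape, params


def max_pool_summary(input_shape, window):
    """Output shape of a max-pooling layer; it has no trainable parameters."""
    height, width, channels = input_shape
    return (height // window, width // window, channels), 0


conv_shape, conv_params = conv2d_summary((28, 28, 1), filters=32, window=5)
pool_shape, pool_params = max_pool_summary(conv_shape, window=2)
print(conv_shape, conv_params)  # (24, 24, 32) 832
print(pool_shape, pool_params)  # (12, 12, 32) 0
```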
Discussion and conclusion
The aim of this article was to present deep convolutional neural networks, a topic of great current interest. These networks generally give excellent results on identification and classification tasks. They are also used for interpreting text, video, and audio data. Convolutional neural networks can also be a good choice when the objective is to find patterns in sequences.
Steps to perform for a simple CNN project:
- Provide an image dataset to the convolutional layer.
- Apply the convolution operation to the image and the ReLU activation function to the resulting matrix.
- Apply a pooling function to reduce the dimensionality.
- Add more convolutional layers as required until satisfied.
- Flatten the output and forward it to the fully connected layer to generate the results.
- Output the class using an activation function, classifying the images with logistic regression and its cost function.
There are also some famous convolutional neural network architectures, such as GoogLeNet, VGGNet, ResNet, and AlexNet.
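The steps listed above can be sketched as a single forward pass in pure Python. Everything specific here is an assumption made for illustration: the random "image", the two random 5×5 filters, the random fully connected weights, and the use of softmax (the multi-class form of logistic regression) as the final activation. The network is untrained, so the predicted class is meaningless; training with a cost function is left out.

```python
import math
import random


def convolve2d(image, kernel):
    """Stride-1 sliding window over a square image, no padding."""
    k = len(kernel)
    size = len(image) - k + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(size)
        ]
        for i in range(size)
    ]


def relu_map(fmap):
    """Replace negative values with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2d(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    n = len(fmap) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(n)
        ]
        for i in range(n)
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
# Step 1: a stand-in 28x28 single-channel image with random pixel values.
image = [[random.random() for _ in range(28)] for _ in range(28)]
# Two untrained 5x5 filters (a real model learns these values).
kernels = [[[random.uniform(-1, 1) for _ in range(5)] for _ in range(5)]
           for _ in range(2)]

# Steps 2-3: convolution, ReLU, then 2x2 max-pooling for each filter.
pooled_maps = [max_pool2d(relu_map(convolve2d(image, k))) for k in kernels]

# Step 5: flatten all pooled maps into one feature vector (2 * 12 * 12 values).
features = [v for fmap in pooled_maps for row in fmap for v in row]

# Fully connected layer with untrained weights, one row per class.
n_classes = 10
weights = [[random.uniform(-0.1, 0.1) for _ in features]
           for _ in range(n_classes)]
logits = [sum(w * x for w, x in zip(row, features)) for row in weights]

# Step 6: softmax activation yields class probabilities.
probabilities = softmax(logits)
predicted_class = probabilities.index(max(probabilities))
print(len(features), predicted_class)
```

In practice a deep learning framework would handle these operations, along with training, far more efficiently; the loops here only show the order of the steps.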