Introduction:
As we know, the concept of artificial neurons and perceptrons is based on the human brain: they work much like biological neurons. The idea was actually established back in the 1950s, when scientists studied how the human brain works.
The idea rests on the perceptron (we may call perceptrons the predecessors of artificial neurons): the parts of a biological neuron, i.e. the axon, dendrites, and cell body, can be replicated with simple mathematical models that require little knowledge of how they work internally. It works this way: the dendrites receive signals, and once they have received a sufficient amount, they send these signals down the axon. The outgoing signal then becomes the input for other neurons. Some signals are more valuable than others and can activate further neurons. Connections can be weaker or stronger, and new connections can form while previous ones still exist. All of this can be modeled by a function that takes a list of weighted input signals and produces an output signal only if the sum of those weighted inputs reaches a particular bias (threshold). Simple classification tasks can be done by this model alone.
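A minimal sketch of such a perceptron in Python (the function name, weights, and threshold below are illustrative, not from the original):

    # A perceptron fires (outputs 1) only if the weighted sum of its
    # inputs reaches a particular bias (threshold).
    def perceptron(inputs, weights, bias):
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total >= bias else 0

    # Example: an AND-like classification with hand-picked weights.
    print(perceptron([1, 1], [0.6, 0.6], 1.0))  # 1, since 1.2 >= 1.0
    print(perceptron([1, 0], [0.6, 0.6], 1.0))  # 0, since 0.6 < 1.0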

In an artificial neural network, each neuron has weights. We give an input to the neuron, the input is combined with the weights, and then an activation function maps the weighted input to an output. The inside of a neuron is shown in the figure below:

Three processes take place here. In the first stage, every input is multiplied by a particular value (its weight).

Then, the weighted inputs are summed and a bias b is added.

At last, the sum is passed to an activation function.

An activation function is used to turn the input into an output of a well-behaved, predictable form. The most commonly used activation functions are the sigmoid and ReLU.
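As a sketch of all three steps plus the two common activation functions (written with NumPy; the numbers are illustrative):

    import numpy as np

    def sigmoid(z):
        # Squashes any real number into the range (0, 1).
        return 1.0 / (1.0 + np.exp(-z))

    def relu(z):
        # Keeps positive values and clips negative values to 0.
        return np.maximum(0.0, z)

    inputs = np.array([2.0, 3.0])
    weights = np.array([0.5, -0.5])
    bias = 1.0

    # Steps 1 and 2: multiply each input by its weight, sum, add the bias b.
    z = np.dot(weights, inputs) + bias
    # Step 3: pass the sum to an activation function.
    print(sigmoid(z), relu(z))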
Neural Network:
When a bunch of neurons are connected together, they form a network called a neural network. The figure below shows what a neural network looks like:

The network shown in the image above has two inputs, two hidden layers, and one output layer.
“A hidden layer is any layer between the input (first) layer and the output (last) layer. There can be multiple hidden layers!”
There can be any number of layers, with no restriction on the number of neurons in those layers. The basic structure remains the same: feed the input(s) forward through the neurons in the network to get the output(s) at the end.
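A minimal feed-forward sketch with NumPy (the layer sizes and all weight values below are made up for illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # 2 inputs -> one hidden layer with 2 neurons -> 1 output.
    x = np.array([1.0, 0.5])
    W1 = np.array([[0.1, 0.4],
                   [0.2, 0.3]])   # hidden-layer weights
    b1 = np.array([0.1, 0.1])     # hidden-layer biases
    W2 = np.array([[0.5, 0.6]])   # output-layer weights
    b2 = np.array([0.2])          # output-layer bias

    h = sigmoid(W1 @ x + b1)   # feed the inputs forward through the hidden layer
    y = sigmoid(W2 @ h + b2)   # ...and on through the output layer
    print(y)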
Loss Calculation:
The loss function is a significant part of an artificial neural network; it measures the difference between the ground-truth value and the predicted value. MSE is used for loss calculation:

    MSE = (1/n) * sum( (y_true - y_pred)^2 )
Before training, we have to choose how to calculate the loss for our model; we can find the loss with different methods such as cross entropy or MSE (mean squared error). After finding the loss, we optimize our neural network to give us better results. Optimizing is done by tuning the various parameters in the network, and it can be done using different already-implemented optimizers.
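For illustration, both losses can be computed directly with NumPy (the arrays below are made-up ground-truth and predicted values):

    import numpy as np

    y_true = np.array([1.0, 0.0, 1.0, 1.0])   # ground truth (made up)
    y_pred = np.array([0.9, 0.2, 0.8, 0.6])   # predictions (made up)

    # Mean squared error: average of the squared differences.
    mse = np.mean((y_true - y_pred) ** 2)

    # Binary cross entropy: punishes confident wrong predictions heavily.
    cross_entropy = -np.mean(y_true * np.log(y_pred)
                             + (1 - y_true) * np.log(1 - y_pred))

    print(mse, cross_entropy)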
Backpropagation:
The technique through which we propagate the total loss backward through the model is called backpropagation. We use this technique to find out how much loss every node is responsible for, and based on that we can further optimize our model. Backpropagation works by calculating a delta at every unit.
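A sketch of that delta calculation for a single sigmoid neuron under MSE loss (all values are made up; a real network repeats this layer by layer):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x, y_true = np.array([0.5, 1.0]), 1.0   # one training example
    w, b = np.array([0.1, 0.2]), 0.0        # current parameters

    # Forward pass.
    z = np.dot(w, x) + b
    y = sigmoid(z)

    # Backward pass: for L = (y - y_true)^2 the delta at this unit is
    # dL/dz = 2 * (y - y_true) * sigmoid'(z), with sigmoid'(z) = y * (1 - y).
    delta = 2 * (y - y_true) * y * (1 - y)
    grad_w = delta * x    # how much loss each weight is responsible for
    grad_b = delta        # how much loss the bias is responsible for
    print(grad_w, grad_b)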

The image above demonstrates the working flow of a simple linear model using TensorFlow. A linear regression model that has only one explanatory variable is called simple linear regression. This means it requires only 2-D sample points, where one variable is dependent and the other is independent. Its job is therefore to predict, as accurately as possible, the value of the dependent variable as a function of the independent variable.
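A minimal simple-linear-regression sketch, assuming the TensorFlow 1.x API used throughout this text (the training data and the learning rate below are made up):

    import numpy as np
    import tensorflow as tf  # assumes TensorFlow 1.x

    # Made-up 2-D sample points: x independent, y dependent (roughly y = 3x + 1).
    x_train = np.array([0.0, 1.0, 2.0, 3.0], dtype=np.float32)
    y_train = np.array([1.1, 3.9, 7.2, 9.8], dtype=np.float32)

    x = tf.placeholder(tf.float32)   # independent variable
    y = tf.placeholder(tf.float32)   # dependent variable
    W = tf.Variable(0.0)             # learnable weight
    b = tf.Variable(0.0)             # learnable bias

    y_pred = W * x + b
    loss = tf.reduce_mean(tf.square(y_pred - y))   # MSE loss
    train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(1000):
            sess.run(train, feed_dict={x: x_train, y: y_train})
        print(sess.run([W, b]))   # should approach [3, 1]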
We are using the MNIST dataset because it is one of the most commonly available large datasets of handwritten digits.
Requirements:
- Python & an editor (Jupyter, PyCharm)
One-Hot Encoding:
One-hot encoding is the process in which we take categorical values and convert them into a particular form so that our ML algorithms can work with them better.
After loading the MNIST dataset, we do one-hot encoding: we convert each class number from a single integer into a vector whose length equals the number of possible classes; in our case there are 10 possible classes.
For example, the label 3 becomes the vector [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
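A small NumPy sketch of such an encoding (the helper name one_hot is ours, not from the original):

    import numpy as np

    def one_hot(labels, num_classes=10):
        # Turn each integer class label into a vector of length num_classes
        # with a 1 at the label's index and 0 everywhere else.
        encoded = np.zeros((len(labels), num_classes))
        encoded[np.arange(len(labels)), labels] = 1
        return encoded

    print(one_hot([3]))   # [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]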
Helper function:
After this, we write a helper function to plot images from the dataset; in this case we use it to plot 9 images, as sketched below.
Helper functions are usually written to make our tasks easier and more efficient.
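A sketch of such a helper with matplotlib (the function name and the 3x3 grid are our assumptions; MNIST images are 28x28 pixels, often stored flattened):

    import matplotlib.pyplot as plt

    def plot_images(images, labels):
        # Plot the first 9 images from the dataset in a 3x3 grid.
        fig, axes = plt.subplots(3, 3)
        for i, ax in enumerate(axes.flat):
            ax.imshow(images[i].reshape(28, 28), cmap='binary')
            ax.set_xlabel('Label: {}'.format(labels[i]))
            ax.set_xticks([])
            ax.set_yticks([])
        plt.show()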
Variables & Placeholders:
Think of a Variable in TensorFlow as a normal variable like the ones we use in programming languages: variables are initialized, and we can change them later as well. Placeholders, on the other hand, differ from variables in that they have no initial value; they simply allocate a memory block whose value will be supplied in the future.
Moreover, placeholders are used when the value has to change at run time, and they also save memory: if you want to change a value 100 times, you can't just make 100 variables, as that would take more memory, so we just make one placeholder with tf.placeholder.
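A short sketch of the difference, again assuming the TensorFlow 1.x API:

    import tensorflow as tf  # assumes TensorFlow 1.x

    v = tf.Variable(3.0)             # has an initial value; can be changed later
    p = tf.placeholder(tf.float32)   # no initial value; fed at run time
    double = p * 2.0

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())   # variables need initialization
        print(sess.run(v))                             # 3.0
        # One placeholder is reused with different values via feed_dict,
        # instead of creating a new variable for every value.
        print(sess.run(double, feed_dict={p: 5.0}))    # 10.0
        print(sess.run(double, feed_dict={p: 7.0}))    # 14.0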
Weights & Biases:
Weights and biases are the parameters of your model; they are the learnable parameters. They are used the same way in linear regression. Most machine learning algorithms include some learnable parameters like these.
Bias is simply how biased you are. Say you are Pakistani, and you are asked, “Which nationality has the most beautiful men?” You say Pakistani men; we can say that because you are biased. So your formula is Y = WX + Pakistani, where W is the weight and X is the input.
So what do you understand? Bias is that kind of pre-assumption built into a model, just like the one you have.
“Weight is the strength of the connection: if I increase the input, how much influence does it have on the output?”
When weights are near zero, the inputs they connect have little effect on the output. Many algorithms will automatically set such weights to zero in order to simplify the network.
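For the MNIST model in this text, the learnable parameters could be declared like this (a TensorFlow 1.x sketch; the zero initialization and the 784x10 shape, i.e. 28x28 flattened pixels by 10 classes, are our assumptions):

    import tensorflow as tf  # assumes TensorFlow 1.x

    x = tf.placeholder(tf.float32, [None, 784])   # flattened 28x28 images

    weights = tf.Variable(tf.zeros([784, 10]))    # learnable W
    biases = tf.Variable(tf.zeros([10]))          # learnable b

    logits = tf.matmul(x, weights) + biases       # Y = XW + b for every image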
Learning Rate:
The learning rate determines how much the weights are updated at each step once training starts.
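In other words, each training step moves every weight a small step against its gradient, scaled by the learning rate (the 0.01 below is an illustrative value):

    learning_rate = 0.01   # illustrative value

    # Gradient-descent update rule: new_weight = weight - lr * gradient.
    def update(weight, gradient, lr=learning_rate):
        return weight - lr * gradient

    print(update(0.5, 2.0))   # 0.48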