What Is a Neural Network?
A short description about building a simple neural network model for sine function.
As briefly introduced in the previous post, a neural network is made up of thousands or millions (or more) of perceptrons. The whole model can be thought of as a function approximator: even though we don't know the true function, we can find one that matches it exactly or comes very close.
We cannot see exactly what a neural network does behind the scenes or how it does it, but with enough training it outputs predictions. Because of this, a neural network (or the hidden layers between its input and output) is sometimes called a black box.
A NN model starts out untrained, typically with randomly initialized weights, and as it iterates over the training data it updates itself into a better model.
Every time it outputs predictions, it compares them with the true y values and computes a loss. Using this loss and the backpropagation technique, the model updates its weights and biases to reduce the loss of the next prediction.
Backpropagation is a way of computing how much each weight in each layer contributed to the error in the predicted values. As a simplified example, suppose we have a model with two hidden layers, where the first layer contributes 40% of the output and the second contributes 60%. When we update the weights of the two layers, we adjust them in that same ratio. We don't want to punish two workers equally when one of them was the main cause of a problem.
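The predict, compare, update cycle can be sketched in plain Python. This is only a minimal illustration, not the TensorFlow model built later in this post: it assumes a single hidden layer of four tanh units (rather than two ReLU layers), full-batch gradient descent, and a hypothetical helper named train_tiny_network. Backpropagation here is simply the chain rule applied from the loss back to each weight.

```python
import math
import random


def train_tiny_network(steps=500, hidden=4, lr=0.01, seed=0):
    """Fit sin(x) on [-pi, pi] with one tanh hidden layer; return the loss per step."""
    rng = random.Random(seed)
    xs = [-math.pi + 2 * math.pi * i / 49 for i in range(50)]
    ys = [math.sin(x) for x in xs]
    n = len(xs)

    # Randomly initialized weights, zero biases (illustrative choices).
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    losses = []
    for _ in range(steps):
        gw1, gb1 = [0.0] * hidden, [0.0] * hidden
        gw2, gb2 = [0.0] * hidden, 0.0
        loss = 0.0
        for x, y in zip(xs, ys):
            # Forward pass: input -> hidden (tanh) -> output.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            pred = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = pred - y
            loss += err * err / n

            # Backward pass: chain rule from the loss to each parameter.
            d_pred = 2 * err / n
            gb2 += d_pred
            for j in range(hidden):
                gw2[j] += d_pred * h[j]
                d_z = d_pred * w2[j] * (1 - h[j] ** 2)  # derivative of tanh
                gw1[j] += d_z * x
                gb1[j] += d_z

        # Gradient-descent update: move each parameter against its gradient.
        for j in range(hidden):
            w1[j] -= lr * gw1[j]
            b1[j] -= lr * gb1[j]
            w2[j] -= lr * gw2[j]
        b2 -= lr * gb2
        losses.append(loss)
    return losses


losses = train_tiny_network()
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The recorded losses should shrink as training proceeds, which is the whole point of the update step. How low they go depends on the layer size, learning rate and number of steps.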
One advantage of a neural network is that, given a substantial amount of data, it can find an approximate function that works almost (or exactly) like the true function. This is also a drawback: it needs a large amount of data. If the dataset is small, a neural network performs poorly, and classical supervised or unsupervised algorithms will often work much better.
To build a model, we first need to define a model function to pass to a TensorFlow Estimator.
The arguments of the function must be named exactly as above, or else it will throw an exception. For example, if 'features' is instead named 'feature', it throws a "model_fn () must include features argument" exception.
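To make the naming requirement concrete, here is a small sketch. It assumes the argument names of the TensorFlow 1.x Estimator convention (features, labels, mode, params). The function check_model_fn_args is a hypothetical stand-in written for this illustration and is not TensorFlow's own code. Both model functions are empty placeholders.

```python
import inspect


def model_fn(features, labels, mode, params):
    """Placeholder using the argument names the Estimator convention expects."""
    return None


def misnamed_model_fn(feature, labels, mode, params):
    """Same placeholder, but 'features' is misspelled as 'feature'."""
    return None


def check_model_fn_args(fn):
    """Hypothetical name check: the argument must be called 'features' exactly."""
    names = inspect.signature(fn).parameters
    if "features" not in names:
        raise ValueError(f"model_fn ({fn.__name__}) must include features argument")


check_model_fn_args(model_fn)  # passes silently

try:
    check_model_fn_args(misnamed_model_fn)
except ValueError as error:
    print(error)
```

The check looks at argument names rather than positions, which is why a one-letter misspelling is enough to trigger the error.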
In this model, we use only two hidden layers, of size 50 and 100, each with biases and ReLU activation functions.
You can modify these layers and hyperparameters to get a better fit, or to approximate a different function.
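As a rough picture of what such an architecture computes, the sketch below runs a single forward pass through layers of width 50 and 100 in plain Python. It is not the TensorFlow code from this post: the helper names, the Gaussian weight initialization and the seed are all assumptions for illustration, and no training happens here.

```python
import random


def relu(values):
    """Apply ReLU element-wise: negative activations become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def build_network(sizes=(1, 50, 100, 1), seed=0):
    """Create (weights, biases) per layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Run one scalar input through the network; ReLU on hidden layers only."""
    activations = [x]
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations[0]


network = build_network()
print([len(biases) for _, biases in network])  # layer widths: [50, 100, 1]
print(forward(network, 0.5))                   # untrained, so not yet close to sin(0.5)
```

Changing the sizes tuple is the plain-Python equivalent of modifying the hidden layers.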
I reshaped the x values to (-1, 1) to make them 2-dimensional, as that is the minimum number of dimensions TensorFlow expects. The first dimension is the total number of samples and the second holds each sample's single value. The same goes for the shape of y.
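The effect of that reshape can be shown without any library. The sketch below uses nested lists and two hypothetical helpers, to_column and shape, to mirror what reshaping to (-1, 1) produces in NumPy, where -1 means "infer this dimension from the data".

```python
def to_column(values):
    """Turn a flat list of n numbers into n rows of one value each: shape (n, 1)."""
    return [[v] for v in values]


def shape(rows):
    """Return (number of rows, values per row) for a rectangular nested list."""
    return (len(rows), len(rows[0]))


xs = [0.0, 0.5, 1.0, 1.5]
x_column = to_column(xs)
print(x_column)         # [[0.0], [0.5], [1.0], [1.5]]
print(shape(x_column))  # (4, 1)
```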
We can save our model's checkpoints in the 'model' sub-directory.
Now that the model is trained, let's check how much error (Mean Squared Error) it makes on the same x and y values.
Our MSE value is 0.0015171621. Your value may differ depending on the parameters and the random initialization.
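For reference, the Mean Squared Error is the average of the squared differences between true and predicted values. A minimal plain-Python version is sketched below. The function name and the sample numbers are made up for illustration and are unrelated to the value reported above.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only.
print(mean_squared_error([0.0, 1.0, 0.0], [0.1, 0.9, -0.2]))
```

Because the errors are squared, a few large misses raise the MSE much more than many small ones.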
Now to predict with the same x values, do the following.
The first five predicted values:
These are the graphs of the predicted values after 10000, 20000 and 30000 iterations.
Though I stopped training after 30000 iterations, you can train longer with different parameters to get even closer to the sine function.
Usually, if the target function is not as complex as, say, image classification, a model with two hidden layers (with or without biases) and ReLU activations is enough to find a good approximation.
When we work with a neural network, we encounter the terms steps, batch size and epochs.
One epoch means one pass over all of the training samples. A step is one pass over a single batch of samples. For example, if we have 1000 samples and the batch size is 100, then 5 epochs take 50 steps. It can be thought of as: steps = (number of samples / batch size) × number of epochs.
Above we used a batch size of 50 and 10000 steps, while the number of epochs was None. When None is passed as the number of epochs, training iterates over the data as many times as needed until it reaches 10000 steps.
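The relationship between samples, batch size, epochs and steps can be checked with two small helpers. The function names are hypothetical, and the sample count of 1000 in the second call is an assumption for illustration only, since the number of training samples used above is not stated here.

```python
import math


def steps_for(num_samples, batch_size, epochs):
    """Steps needed to run the given number of epochs (last batch may be partial)."""
    return math.ceil(num_samples / batch_size) * epochs


def epochs_covered(num_samples, batch_size, steps):
    """How many passes over the data a fixed number of steps amounts to."""
    return steps * batch_size / num_samples


print(steps_for(1000, 100, 5))          # 50, as in the example above
print(epochs_covered(1000, 50, 10000))  # 500.0 passes, assuming 1000 samples
```

This is why a step limit alone is enough to stop training when the epoch count is left as None.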
Some concepts, such as backpropagation, activation functions and the computation graph, were not fully explained in this post. They will be covered in later posts.
Thank you for reading, and let me know if you find any typos or errors.
Neural networks come in many different forms, such as convolutional neural networks, recurrent neural networks, LSTM networks, GANs and many more. Here we will explore how to construct a basic neural network that finds an approximate function for the sine function.