
Understanding the keras workflow with Google Colaboratory


Another post starts with you beautiful people!
Hope you have learnt the core concepts of Deep Learning from my previous post. If not, please visit it once, because it is required before creating our first keras model. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. The keras workflow has the following four steps: specify the architecture, compile the model, fit the model, and predict. Let's understand how we can achieve each step:
  1. Specify the architecture: In the first step you define the architecture of your model. How many layers do you want? How many nodes in each layer? What activation function do you want to use?
  2. Compile the model: This step specifies the loss function and some details about optimization.
  3. Fit the model: This step is the cycle of backpropagation and optimization of the model weights with your data.
  4. Predict: In this last step you use your model to make predictions.
Now we will explore each step with respect to code. Hope you have set up your Google Colab notebook as mentioned in the previous post. Open the notebook and import the required basic libraries as below:

The other two imports besides pandas and numpy are keras libraries to build our model. Next, we load a dataset in Colab and read it so that we can work through the next three steps. You can upload any local dataset using the following code:

Next, we will read the dataset and count its columns. We need to specify how many columns are in the input when building a keras model, because this is the number of nodes in the input layer:

Next, we will start to build our model. The first line of the model specification initializes the model. For this step we will use Sequential(), which is a linear stack of layers. A sequential model requires that each layer has weights or connections only to the one layer coming directly after it in the network diagram. Then we start adding layers using the .add() method of the model. Here the standard layer type is the Dense layer. It is called Dense because all the nodes in the previous layer connect to all the nodes in the current layer. In each layer we give the number of nodes as the first argument, then an activation function, and, for the first layer only, the input shape:

In the input_shape argument we pass a one-element tuple: the number of columns followed by a comma with nothing after it. The number of rows is left out, which means there can be any number of rows or data points. The last layer has only 1 node because it is the output layer, and it matches those diagrams where we ended with a single node as the output or prediction of the model. So our model has two hidden layers and an output layer. In the hidden layers we are using 50 or 32 nodes, for example, but you can put a larger number of nodes here and keras will do all the maths for you! So don't be afraid to try a bigger network.
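To make the word "Dense" concrete, here is a minimal plain-Python sketch of what one fully connected layer computes for a single row of data. It is only an illustration, not how keras implements it, and the function names, weights, and input values are all made up for the example.

```python
def relu(x):
    """Rectified linear activation: negative inputs become 0."""
    return max(0.0, x)


def dense_forward(inputs, weights, biases, activation=relu):
    """Compute the output of one fully connected (dense) layer.

    inputs  : list of n_in values coming from the previous layer
    weights : n_out lists, each holding n_in weights (one list per node)
    biases  : list of n_out bias values
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        # Every node sees every input: that is what makes the layer "dense".
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs


# Hypothetical layer: 3 input columns feeding 2 nodes.
row = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.0], [1.0, 1.0, -0.5]]
biases = [0.0, 1.0]

hidden = dense_forward(row, weights, biases)
print(hidden)  # [0.0, 2.5]
```

The first node sums to -1.5, which relu turns into 0.0, while the second node sums to 2.5 and passes through unchanged.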

Next, we will compile and fit the model. The compile() method has two important arguments. The first one is the optimizer, which controls the learning rate. Some optimizers adapt the learning rate automatically during training, and 'adam' is usually a good choice for that task. The second argument is the loss function. As we have seen, mean squared error ('mean_squared_error' in keras) is a common choice for a regression problem:
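As a quick reminder of what this loss measures, here is a small plain-Python sketch of mean squared error. The function name and the sample numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical targets and predictions: the errors are 1, 0 and -2.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]

loss = mean_squared_error(actual, predicted)
print(loss)  # (1 + 0 + 4) / 3
```

Because the errors are squared, one large mistake costs much more than several small ones.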

After compiling, we can fit our model using the fit() method. Fitting the model means keras applies backpropagation and gradient descent with our data to update the weights:
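To see what "updating the weights with gradient descent" means, here is a tiny plain-Python sketch that fits a one-weight, one-bias model by hand. It is a simplified stand-in: keras works on mini-batches and backpropagates through every layer, while this sketch uses the whole dataset at each step. The function name, data, and settings are assumptions for illustration.

```python
def fit_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []  # loss recorded at the start of every epoch
    for _ in range(epochs):
        preds = [w * x + b for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        history.append(loss)
        # Gradients of the loss with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b, history = fit_linear(xs, ys)
print(round(w, 3), round(b, 3))
print(history[0], history[-1])
```

The loss in history shrinks epoch after epoch, which is the same kind of progress keras prints while fitting.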

When you run the above cells, you will see output showing the optimization progress, like below:
Now change the number of nodes to 150 in each hidden layer and see how the loss value in the output changes. That's it! You now know how to specify, compile, and fit a deep learning model using keras!

Now we will learn how to use keras for a classification problem. A few changes are required. For the loss function we use the most common classification loss, 'categorical_crossentropy'. We add a metric, 'accuracy', to print the accuracy score at the end of each epoch. And we change the activation function of the output layer to 'softmax' so that the predictions can be interpreted as probabilities. To understand each change, we will apply a keras model to a classification dataset:
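To see why softmax output can be read as probabilities, and what categorical crossentropy rewards, here is a short plain-Python sketch. It is an illustration only, not keras's own code, and the raw scores are invented.

```python
import math


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log of the probability assigned to the true class."""
    return -sum(
        t * math.log(p)
        for t, p in zip(one_hot_target, probabilities)
        if t > 0
    )


# Hypothetical raw scores from a 2-node output layer.
probs = softmax([2.0, 0.5])
loss_if_first_class = categorical_crossentropy([1, 0], probs)
loss_if_second_class = categorical_crossentropy([0, 1], probs)

print(probs)
print(loss_if_first_class, loss_if_second_class)
```

The loss is small when the model puts high probability on the true class and large when it does not.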


In this problem our goal is to take information about the passengers and predict which ones survived. So we will separate the target variable from the dataframe, convert this target from a class vector (integers) to a binary class matrix using keras's to_categorical, and then perform all four basic keras steps:
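The conversion from a class vector to a binary class matrix is often called one-hot encoding. Here is a plain-Python sketch of the idea. The helper to_one_hot is a hypothetical stand-in written for this example, not keras's to_categorical, and the labels are made up.

```python
def to_one_hot(labels, num_classes=None):
    """Convert integer class labels into rows with a single 1.0 per row."""
    if num_classes is None:
        # Assume classes are numbered 0 .. max(labels).
        num_classes = max(labels) + 1
    matrix = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        matrix.append(row)
    return matrix


# Hypothetical target column: 0 = did not survive, 1 = survived.
survived = [0, 1, 1, 0]

target = to_one_hot(survived)
for label, row in zip(survived, target):
    print(label, row)
```

Each row has one column per class, which is why the output layer of the classification model needs one node per class.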

Here we are using Stochastic Gradient Descent (SGD) as the optimizer, which you learnt about in the last post. The output layer has 2 nodes because there are two classes to predict: survived or not survived. You can try more nodes in the hidden layer.

We have learnt how to use keras for classification as well as for regression problems. Now we will learn how to save and reload our model and then make predictions on new data. For saving, a keras model has the save() method, and for reloading a saved model, keras provides the load_model function in keras.models. For making predictions, a keras model has the predict() method:

Great! Now it's time to learn how to try different learning rates and select the best one to optimize our model. Even though we use an optimizer such as SGD when compiling our model, there may be a situation where the model stops improving at some point. One cause of this is the dying neuron problem: a node with ReLU activation receives negative input for every row, so it always outputs 0, its gradient is 0, and its weights never update again. When this happens you should try a different learning rate or another optimizer algorithm. We will use the same titanic dataset, but here we will create a function get_new_model() that returns a fresh, untrained model each time, so that every learning rate starts from scratch, as shown below:

Next, we will use this function, iterate over the list of learning rates, and compile and fit a model for each one:

Once you run the above cell, you can see how the model performs for each of the three learning rates, as below:

You can see that with learning rate 0.01 the loss is lowest! In this way you can select an appropriate learning rate.
If you remember, we cross validated (for example with k-fold) our training data before training our various machine learning models. In deep learning we usually use a single validation split instead of cross validation, because deep learning is generally used on large data and the repeated training that cross validation needs would take much longer. keras makes this step easy for us. We just need to pass the keyword argument 'validation_split' to the fit() method. So for a classification problem the code will be like below:
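Why can a middle value win? Here is a plain-Python sketch that minimizes a simple one-weight loss, (w - 3)^2, with three learning rates. The loss function and the three rates are assumptions picked for illustration; they are not the model or the results from this post.

```python
def final_loss(learning_rate, steps=200, start=0.0):
    """Run gradient descent on loss(w) = (w - 3)^2 and return the final loss."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3.0)
        w -= learning_rate * gradient
    return (w - 3.0) ** 2


# Hypothetical learning rates: far too small, moderate, and too large.
learning_rates = [0.000001, 0.01, 1.0]

results = {lr: final_loss(lr) for lr in learning_rates}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(lr, loss)
print("best learning rate:", best_lr)
```

With the tiny rate the weight barely moves, with the large rate it jumps back and forth over the minimum without getting closer, and the moderate rate ends with the lowest loss.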
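A validation split simply holds back part of the data from training. The keras documentation says the validation samples are taken from the end of the data you pass in, before any shuffling. Here is a plain-Python sketch of that idea; the function name and the sample rows are hypothetical.

```python
def split_for_validation(rows, validation_split=0.25):
    """Hold back the last fraction of the rows for validation."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    split_at = int(len(rows) * (1.0 - validation_split))
    return rows[:split_at], rows[split_at:]


# Hypothetical dataset of 8 rows, identified here by their index.
rows = list(range(8))

train_rows, val_rows = split_for_validation(rows, validation_split=0.25)
print(train_rows)  # [0, 1, 2, 3, 4, 5]
print(val_rows)    # [6, 7]
```

Because the held-back rows come from the end, it is a good idea to shuffle your data first if it is sorted in some order.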

In general we should keep training our model until it stops improving. keras provides a way to do this with the help of early stopping. For early stopping we need to create an early stopping monitor before fitting the model. The main argument of this monitor is 'patience', which is how many epochs the model can go without improving before we stop training. Generally 2 or 3 is a good choice for patience. You pass the monitor to fit() in a list through the 'callbacks' argument, as below:

Here we pass epochs as 30. Older keras releases trained for 10 epochs by default, while recent releases default to a single epoch. Because optimization will stop automatically when it is no longer helpful, it is okay to specify a larger maximum such as 30 rather than relying on the default.
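The patience rule itself is easy to sketch in plain Python. The function below is a hypothetical illustration of the logic, not the keras callback, and the validation losses are simulated numbers.

```python
def train_with_early_stopping(val_losses, patience=2):
    """Return (epochs_run, best_loss) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in val_losses:
        epochs_run += 1
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop: no improvement for `patience` epochs in a row
    return epochs_run, best_loss


# Simulated validation loss per epoch (made-up numbers).
simulated = [0.90, 0.70, 0.60, 0.62, 0.61, 0.59, 0.58]

epochs_run, best = train_with_early_stopping(simulated, patience=2)
print(epochs_run, best)  # 5 0.6
```

With patience 2 the loop stops after epoch 5, even though epochs 6 and 7 would have improved the loss. With patience 3 it would run all 7 epochs, which shows the trade-off in choosing this value.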

Always remember that creating a great model in deep learning requires experimentation. So start experimenting with different architectures, more or fewer layers, and so on. In the next post we will work on an image classification problem and use the keras library to solve it. Till then, go chase your dreams, have an awesome day, make every second count, and see you later in my next post.
