
Understanding the Keras workflow with Google Colaboratory


Another post starts with you beautiful people!
Hope you have learnt the core concepts of Deep Learning from my previous post. If not, please visit it once, because it is required before creating our first Keras model. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. The Keras workflow has the following four steps: specify the architecture, compile the model, fit the model, and predict. Let's understand how we can achieve each step-
  1. Specify the architecture:- In the first step you define the architecture of your model: how many layers do you want? How many nodes in each layer? Which activation function do you want to use?
  2. Compile the model:- This step specifies the loss function and some details about the optimization.
  3. Fit the model:- This step is the cycle of backpropagation and optimization of the model weights with your data.
  4. Predict:- In this last step you use your model to make predictions.
Now we will explore each step with respect to code. Hope you have set up your Google Colab notebook as mentioned in the previous post. Open the notebook and import the required basic libraries.
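
The import cell is not reproduced in this copy of the post. A minimal sketch of what it could look like, assuming the Keras that ships with TensorFlow (as in Colab) and assuming the two Keras imports are Dense and Sequential, the two classes used later in the post:

```python
import numpy as np
import pandas as pd

# Assumption: Keras is imported through TensorFlow, as available in Colab.
from tensorflow.keras.layers import Dense       # fully connected layer
from tensorflow.keras.models import Sequential  # linear stack of layers
```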

Besides pandas and numpy, the other two imports are Keras libraries used to build our model. Next, we load a dataset into Colab and read it before moving on to the remaining steps. You can upload any local dataset from your machine into the Colab session.

Next, we will read the dataset so that we can find the number of columns in the input. We need to specify this number when building a Keras model because it is the number of nodes in the input layer.
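
A minimal sketch of this step, not the original cell: it assumes the data is a CSV file with numeric columns, and the column names (including `target`) are hypothetical. A small in-memory CSV stands in for the uploaded file so the snippet runs anywhere; in Colab you would give pd.read_csv the name of your uploaded file instead.

```python
import io

import pandas as pd

# Stand-in for an uploaded CSV file; the column names are hypothetical.
csv_text = """feature_a,feature_b,feature_c,target
1.0,0.5,3.2,10.1
2.0,0.1,1.8,7.4
3.0,0.9,2.5,12.3
4.0,0.3,0.7,6.0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Separate the predictors from the column we want to predict.
predictors = df.drop(columns=["target"]).to_numpy()
target = df["target"].to_numpy()

# The number of predictor columns is the number of nodes in the input layer.
n_cols = predictors.shape[1]
print(n_cols)
```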

Next, we will start to build our model. The first line of the model specification initializes the model with Sequential(), which is a linear stack of layers. A Sequential model requires that each layer has weights or connections only to the one layer coming directly after it in the network diagram. Then we start adding layers using the .add() method of the model. Here the standard layer type is the Dense layer. It is called Dense because all the nodes in the previous layer connect to all the nodes in the current layer. In each layer we define the first argument as the number of nodes, then an activation function, and, in the first layer only, the input shape.
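
The architecture described above can be sketched as follows. This is an illustration rather than the original cell: the value of `n_cols` is an assumption, and the layer sizes are just the example sizes this post talks about.

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

n_cols = 10  # assumed number of predictor columns in the dataset

model = Sequential()
# First hidden layer: 50 nodes, and the shape of one input row.
model.add(Dense(50, activation="relu", input_shape=(n_cols,)))
# Second hidden layer: 32 nodes.
model.add(Dense(32, activation="relu"))
# Output layer: a single node holding the prediction.
model.add(Dense(1))

print(len(model.layers), model.count_params())
```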

In the input_shape argument we pass the number of columns followed by a comma, that is, a one-element tuple. The shape describes a single row of data and says nothing about the number of rows, so there can be any number of rows or data points. Here the last layer has only 1 node because this is the output layer, and it matches those diagrams where we ended with a single node as the output or prediction of the model. So in our model there are two hidden layers and an output layer. In the hidden layers we are using, for example, 50 or 32 nodes, but you can put a larger number of nodes here and Keras will do all the maths for you! So don't be afraid to try a bigger network.

Next, we will compile and fit the model. Here we pass two arguments to the compile() method. The first is the optimizer, which controls how the weights are updated, including the learning rate. Some optimizers adapt the learning rate automatically during training, and 'Adam' is one of the best choices for that task. The second argument is the loss function. As we have seen, 'mean squared error' is a common choice for a regression problem.

After compiling, we can fit our model using the fit() method. Fitting the model means Keras applies backpropagation and gradient descent with our data to update the weights.
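
A minimal sketch of compiling and fitting a regression model. The original dataset is not shown in this copy of the post, so synthetic data is assumed here, and the number of epochs is chosen only to keep the run short.

```python
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Synthetic regression data standing in for the uploaded dataset (assumption).
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 5)).astype("float32")
noise = rng.normal(scale=0.1, size=200).astype("float32")
target = predictors.sum(axis=1) + noise

n_cols = predictors.shape[1]

# Step 1: specify the architecture.
model = Sequential()
model.add(Dense(50, activation="relu", input_shape=(n_cols,)))
model.add(Dense(32, activation="relu"))
model.add(Dense(1))

# Step 2: compile with an optimizer and a loss function.
model.compile(optimizer="adam", loss="mean_squared_error")

# Step 3: fit; the returned history holds the loss of each epoch.
history = model.fit(predictors, target, epochs=5, verbose=0)
print(history.history["loss"])
```

With verbose left at its default, fit() prints the loss after every epoch instead of staying silent.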

When you run the above cells, you will see output showing the optimization progress, with the loss reported for each epoch.
Now change the number of nodes to 150 in each hidden layer and see the loss function value in the output. That's it! You now know how to specify, compile, and fit a deep learning model using Keras!

Now we will learn how to use Keras for a classification problem. A few changes are required: we use the most common classification loss function, 'categorical_crossentropy'; we add a metric such as 'accuracy' so that the accuracy score is printed at the end of each epoch; and we change the activation function of the output layer to 'softmax' so that the predictions can be interpreted as probabilities. To understand each change we will apply a Keras model to a classification dataset-


In this problem (the Titanic dataset) our goal is to take information about the passengers and predict which ones survived. So we will separate the target variable from the dataframe, convert this target from a class vector (integers) to a binary class matrix using Keras's to_categorical, and then perform all four basic steps of the Keras workflow.
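
A minimal sketch of the classification workflow. The Titanic file itself is not reproduced here, so a synthetic table with a 0/1 `survived` vector is assumed in its place; the hidden layer size and epoch count are also assumptions.

```python
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

# Synthetic stand-in for the passenger data (assumption).
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 6)).astype("float32")
survived = (predictors[:, 0] + predictors[:, 1] > 0).astype("int64")

# Class vector (0 or 1) -> binary class matrix with one column per class.
target = to_categorical(survived, num_classes=2)
n_cols = predictors.shape[1]

model = Sequential()
model.add(Dense(32, activation="relu", input_shape=(n_cols,)))
# Two output nodes, one per class; softmax turns them into probabilities.
model.add(Dense(2, activation="softmax"))

model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(predictors, target, epochs=5, verbose=0)

probabilities = model.predict(predictors[:3], verbose=0)
print(probabilities)
```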

Here we are using 'Stochastic Gradient Descent' (SGD) as the optimizer, which you learnt about in the last post. In the output layer we use 2 nodes because there are two prediction classes: survived or not survived. You can try more nodes in the hidden layer.

We have learnt how to use Keras for classification as well as regression problems. Now we will learn how to save and reload our model and then make predictions on new data. For saving, a Keras model has the save() method; for reloading a saved model, Keras provides the load_model function; and for making predictions, a Keras model has the predict() method.
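
A minimal sketch of saving, reloading, and predicting. The file name `model_file.keras`, the temporary directory, and the synthetic data are all assumptions made for illustration; in Colab you would choose your own path.

```python
import os
import tempfile

import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.utils import to_categorical

# Train a small classifier on synthetic data so there is something to save.
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 6)).astype("float32")
survived = (predictors[:, 0] > 0).astype("int64")
target = to_categorical(survived, num_classes=2)

model = Sequential()
model.add(Dense(32, activation="relu", input_shape=(predictors.shape[1],)))
model.add(Dense(2, activation="softmax"))
model.compile(optimizer="sgd", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(predictors, target, epochs=3, verbose=0)

# Save the architecture and weights to a single file (hypothetical name).
model_path = os.path.join(tempfile.mkdtemp(), "model_file.keras")
model.save(model_path)

# Reload the model and predict on rows it has not seen before.
my_model = load_model(model_path)
new_data = rng.normal(size=(3, 6)).astype("float32")
predictions = my_model.predict(new_data, verbose=0)

# Column 1 holds the predicted probability of the "survived" class.
probability_survived = predictions[:, 1]
print(probability_survived)
```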

Great! Now it's time to learn how to try different learning rates and select the best one to optimize our model. Even though we are using an optimizer such as SGD when compiling our model, there may be a situation where the model stops improving at some point. One possible cause is the dying neuron problem, where a node outputs zero for every input and so its weights stop being updated. A poorly chosen learning rate can also stall training. To address this you should try a different learning rate or another optimizer. We will use the same Titanic dataset, but here we will create a function get_new_model() that returns a fresh, unoptimized model each time we want to test a setting.

Next, we will use this function, iterate over a list of learning rates, and compile and fit a model for each one.
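
A minimal sketch of this experiment. The body and signature of get_new_model(), the three learning rates in the list, and the synthetic data are assumptions; only the function name and the idea of looping over learning rates come from the post.

```python
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical

# Synthetic stand-in for the Titanic predictors and target (assumption).
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 6)).astype("float32")
survived = (predictors[:, 0] + predictors[:, 1] > 0).astype("int64")
target = to_categorical(survived, num_classes=2)
n_cols = predictors.shape[1]


def get_new_model(n_cols):
    """Return a fresh, uncompiled model so every run starts from new weights."""
    model = Sequential()
    model.add(Dense(32, activation="relu", input_shape=(n_cols,)))
    model.add(Dense(2, activation="softmax"))
    return model


lr_to_test = [0.000001, 0.01, 1.0]  # assumed values, from tiny to large
final_loss = {}

for lr in lr_to_test:
    model = get_new_model(n_cols)
    model.compile(optimizer=SGD(learning_rate=lr),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(predictors, target, epochs=5, verbose=0)
    final_loss[lr] = history.history["loss"][-1]
    print("learning rate:", lr, "final loss:", final_loss[lr])
```

Creating a new model inside the loop matters: reusing one model would let each learning rate continue from the weights left by the previous one, so the comparison would not be fair.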

Once you run the above cell, you can see your model's loss and accuracy for all three learning rates.

You can see that with learning rate 0.01, the loss is minimum! In this way you can select an appropriate learning rate.
If you remember, we cross validated (for example with k-fold) our training data before training our various machine learning models. In deep learning, instead of cross validation we commonly use a validation split, because deep learning is generally used on large data and the repeated training that cross validation needs would take much longer. Keras makes this step easy for us. We just need to pass the keyword argument 'validation_split' to the fit() method, giving the fraction of the training data to hold out for validation.
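
A minimal sketch of a validation split for a classification problem. The 0.3 fraction, the model size, and the synthetic data are assumptions for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

# Synthetic stand-in for the classification data (assumption).
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 6)).astype("float32")
survived = (predictors[:, 0] + predictors[:, 1] > 0).astype("int64")
target = to_categorical(survived, num_classes=2)

model = Sequential()
model.add(Dense(32, activation="relu", input_shape=(predictors.shape[1],)))
model.add(Dense(2, activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Hold out the last 30% of the rows for validation; train on the rest.
history = model.fit(predictors, target, validation_split=0.3,
                    epochs=5, verbose=0)

# Validation scores are recorded next to the training scores.
print(history.history["val_loss"])
print(history.history["val_accuracy"])
```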

In general we should keep training our model until it stops improving. Keras provides a way to do this with the EarlyStopping callback. We need to create an early stopping monitor before fitting the model. Here we give the monitor one argument, 'patience', which is how many epochs the model can go without improving before we stop training. Generally 2 or 3 is a good choice for patience. You pass the monitor to fit() in a list through the 'callbacks' argument.
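
A minimal sketch of early stopping, again on assumed synthetic data. The monitor watches the validation loss by default, so a validation split is passed as well.

```python
import numpy as np
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

# Synthetic stand-in for the classification data (assumption).
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 6)).astype("float32")
survived = (predictors[:, 0] + predictors[:, 1] > 0).astype("int64")
target = to_categorical(survived, num_classes=2)

model = Sequential()
model.add(Dense(32, activation="relu", input_shape=(predictors.shape[1],)))
model.add(Dense(2, activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Stop once the validation loss has not improved for 2 epochs in a row.
early_stopping_monitor = EarlyStopping(patience=2)

history = model.fit(predictors, target, validation_split=0.3,
                    epochs=30, callbacks=[early_stopping_monitor],
                    verbose=0)

# Training may end before the maximum of 30 epochs is reached.
epochs_run = len(history.history["loss"])
print(epochs_run)
```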

By default fit() runs for only a small number of epochs (1 in current Keras versions; older versions used 10). Here we pass 30: because optimization will stop automatically when it is no longer helpful, it is okay to specify a higher maximum number of epochs than the default.

Always remember that creating a great model in deep learning requires experimentation. So start experimenting with different architectures, more or fewer layers, and so on. In the next post we will work on an image classification problem and use the Keras library to solve it. Till then Go chase your dreams, have an awesome day, make every second count and see you later in my next post.
