
Understanding the Keras workflow with Google Colaboratory


Another post starts with you beautiful people!
Hope you have learnt the core concepts of Deep Learning from my previous post. If not, please read it once, because those concepts are required before creating our first Keras model. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. The Keras workflow has the following four steps: specify the architecture, compile the model, fit the model, and predict. Let's understand how we can achieve each step-
  1. Specify the architecture:- In the first step you define the architecture of your model: how many layers do you want? How many nodes in each layer? Which activation function do you want to use?
  2. Compile the model:- This step specifies the loss function and some details about optimization.
  3. Fit the model:- This step runs the cycle of backpropagation and optimization of the model weights with your data.
  4. Predict:- In this last step you use your model to make predictions.
Now we will explore each step with code. Hope you have set up your Google Colab notebook as mentioned in my previous post. Open the notebook and import the required basic libraries as below-
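A minimal sketch of that import cell (the original post showed it as a screenshot):

    import pandas as pd
    import numpy as np

    # Keras pieces used to define the network
    from keras.models import Sequential
    from keras.layers import Dense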

The other two imports besides pandas and numpy are the Keras classes we need to build our model. Next, we will load a dataset into Colab and read it so we can work through the remaining three steps. You can upload any local dataset using the following code-
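Colab ships a small files helper for this; running the cell opens a file picker in the browser:

    # Opens a file-picker dialog and returns {filename: bytes} for each upload
    from google.colab import files
    uploaded = files.upload()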

Next, we will read the dataset so that we can find the number of nodes in the input layer. We need to know how many columns are in the input data when building a Keras model, because this is the number of nodes in the input layer-
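A sketch of that cell, assuming a wages-style regression CSV; the file name and the 'wage_per_hour' target column are placeholders for whatever dataset you uploaded:

    import io

    # Read the uploaded CSV (file/column names here are just placeholders)
    df = pd.read_csv(io.BytesIO(uploaded['hourly_wages.csv']))
    predictors = df.drop('wage_per_hour', axis=1).values
    target = df['wage_per_hour'].values
    n_cols = predictors.shape[1]  # number of nodes in the input layer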

Next, we will start to build our model. The first line of the model specification initializes a Sequential model. For this step we use Sequential(), which is a linear stack of layers. A sequential model requires that each layer has weights connecting only to the one layer coming directly after it in the network diagram. Then we start adding layers using the .add() method of the model. Here the standard layer type is the Dense layer. It is called Dense because all the nodes in the previous layer connect to all the nodes in the current layer. In each layer we pass the number of nodes as the first argument, then an activation function, then the input shape-
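A sketch of the architecture cell, using the 50- and 32-node hidden layers mentioned below:

    model = Sequential()

    # Hidden layers: 50 and 32 nodes with ReLU activation
    model.add(Dense(50, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(32, activation='relu'))

    # Output layer: a single node holds the regression prediction
    model.add(Dense(1))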

In the input_shape argument, we pass the number of columns followed by a comma and then a blank value, which means there can be any number of rows or data points. The last layer has only 1 node because it is the output layer, and it matches those diagrams where we ended with only a single node as the output or the prediction of the model. So our model has two hidden layers and an output layer. In the hidden layers we are using 50 and 32 nodes as an example, but you can put a larger number of nodes here and Keras will do all the maths for you! So don't be afraid to try a bigger network.

Next, we will compile and fit the model. The compile() method takes two arguments. The first is the optimizer, which basically controls the learning rate; there are a few algorithms that can adjust the learning rate automatically, and 'Adam' is one of the best for that task. The second argument is the loss function; we have seen that 'mean squared error' is a common choice for a regression problem-
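The compile cell is a one-liner:

    # Adam optimizer with mean squared error loss for regression
    model.compile(optimizer='adam', loss='mean_squared_error')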

After compiling, we can fit our model using the fit() method. Fitting the model means Keras applies backpropagation and gradient descent with our data to update the weights-
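A sketch of the fit cell (the epoch count is just an example):

    # Train the model; the printed loss should decrease across epochs
    model.fit(predictors, target, epochs=10)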

When you run the above cells, you will see output showing the optimization progress, with the loss value printed at each epoch.
Now change the number of nodes to 150 in each hidden layer and watch the loss value in the output. That's it! You now know how to specify, compile, and fit a deep learning model using Keras!

Now we will learn how to use Keras for a classification problem. There are a few changes required. For the loss function we use the most common classification loss, 'categorical_crossentropy'; we add a metric like 'accuracy' to print the accuracy score at the end of each epoch; and we change the output activation function to 'softmax' so we can interpret the predictions as probabilities. To understand each change we will apply a Keras model to a classification problem dataset-


In this problem our goal is to take information about the passengers and predict which ones survived. So we will separate the target variable from the dataframe, convert this target from a class vector (integers) to a binary class matrix using Keras's to_categorical, and then perform all four basic steps required for Keras-
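A sketch of the whole classification workflow, assuming a fully numeric Titanic CSV with a 'survived' column uploaded via files.upload() (file and column names are assumptions):

    from keras.utils import to_categorical

    # Read the uploaded Titanic data (names here are placeholders)
    df = pd.read_csv(io.BytesIO(uploaded['titanic_all_numeric.csv']))
    predictors = df.drop('survived', axis=1).values
    target = to_categorical(df['survived'])  # integers -> binary class matrix
    n_cols = predictors.shape[1]

    # Specify the architecture
    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(2, activation='softmax'))  # one output node per class

    # Compile and fit
    model.compile(optimizer='sgd', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(predictors, target, epochs=10)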

Here we are using 'Stochastic Gradient Descent' (SGD) as the optimizer, which you learnt about in my last post. In the output layer we use 2 nodes because we have two prediction classes: survived or not survived. You can try more nodes in the hidden layer.

We have learnt how to use Keras for classification as well as regression problems. Now we will learn how to save and reload our model, and then make predictions for new data. For saving, Keras has the save() method; for reloading your saved model, Keras has the load_model API; and for making predictions, a Keras model has the predict() method-
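A minimal sketch (the file name is arbitrary, and here we reuse a few rows of predictors as stand-in 'new' data):

    from keras.models import load_model

    model.save('model_file.h5')             # saves architecture + weights (needs h5py)
    my_model = load_model('model_file.h5')  # reload the saved model

    new_data = predictors[:5]               # stand-in for unseen samples
    predictions = my_model.predict(new_data)
    probability_survived = predictions[:, 1]  # probability of the positive class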

Great! Now it's time to learn how to use different learning rates and select the best one to optimize our model. Although we are using optimizer algorithms like SGD in our model compilation, there may still be a situation where your model stops improving at some point. One common cause of this is the dying neuron problem, where a node starts outputting zero for all inputs and stops learning. To solve this problem you should try another optimizer algorithm. We will use the same Titanic dataset, but here we will create a function get_new_model() that returns a fresh, unoptimized model to optimize, as shown below-
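A sketch of such a function; the layer sizes are just examples:

    def get_new_model(input_shape):
        # Build a fresh, uncompiled model so every learning rate
        # starts its optimization from scratch
        model = Sequential()
        model.add(Dense(100, activation='relu', input_shape=input_shape))
        model.add(Dense(100, activation='relu'))
        model.add(Dense(2, activation='softmax'))
        return model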

Next, we will use this function, iterate over a list of learning rates, and compile & fit the model for each one-
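A sketch of that loop; the three learning-rate values are assumed examples (note that newer Keras versions spell the SGD argument learning_rate instead of lr):

    from keras.optimizers import SGD

    lr_to_test = [0.000001, 0.01, 1.0]  # assumed example values

    for lr in lr_to_test:
        print('\n\nTesting model with learning rate: %f\n' % lr)
        model = get_new_model((n_cols,))
        my_optimizer = SGD(lr=lr)
        model.compile(optimizer=my_optimizer, loss='categorical_crossentropy')
        model.fit(predictors, target, epochs=10)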

Once you run the above cell, you can compare the training loss for each of the three learning rates we mentioned.

You can see that with learning rate 0.01, the loss is minimum! In this way you can select an appropriate learning rate.
If you remember, we used to cross-validate (for example, k-fold) our training data before training our various machine learning models. In deep learning, instead of cross validation we do a validation split, because deep learning is generally used on large data, and the repeated training of cross validation would take too long. Keras makes this step easy for us: we just need to use the keyword argument 'validation_split' with the fit() method. So for a classification problem the code will be like below-
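A sketch, holding out 30% of the data for validation:

    # 30% of the data is held out; validation loss/accuracy print each epoch
    model.fit(predictors, target, epochs=10, validation_split=0.3)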

In general we should keep training our model until it stops improving. Keras provides a way to do this with 'Early Stopping'. For early stopping we need to set up an early stopping monitor before fitting the model. This monitor takes a 'patience' argument: the number of epochs the model can go without improving before we stop training. Generally 2 or 3 is a good choice for patience. You pass this monitor via the 'callbacks' argument of the fit() method, as below-
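A sketch of the early-stopping setup (it monitors validation loss by default, so a validation_split is needed):

    from keras.callbacks import EarlyStopping

    # Stop if the validation loss fails to improve for 2 consecutive epochs
    early_stopping_monitor = EarlyStopping(patience=2)
    model.fit(predictors, target, epochs=30, validation_split=0.3,
              callbacks=[early_stopping_monitor])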

The default number of epochs is 10, but here we pass 30. Since optimization will automatically stop when it is no longer helping, it is okay to specify a maximum of 30 epochs rather than using the default of 10.

Always remember: creating a great model in deep learning requires experimentation. So start experimenting with different architectures, more or fewer layers, etc. In the next post we will work on an image classification problem and use the Keras library to solve it. Till then go chase your dreams, have an awesome day, make every second count and see you later in my next post.
