
Making a deep learning model to predict handwritten digits using Keras

Another post starts with you beautiful people!
In the previous post we learnt the Keras workflow. In this post we will understand how to solve an image-related problem with a simple neural network model using Keras. For this exercise we will use the MNIST handwritten digit dataset. The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. This is a very popular dataset to get started with images. You can download this dataset from this link.

In this dataset, each image shows a single digit and is composed of a 28 by 28 pixel grid. Each pixel value represents how dark that pixel is, so 0 is the lightest possible value while 255 is the darkest. Our goal is to create a deep learning model that will predict which digit an image shows. The 28 x 28 pixel grid is flattened into 784 features for each image. Let's load the training and test datasets in our Colab notebook and explore-
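A minimal sketch of this loading step, assuming the Kaggle-style train.csv and test.csv files (the file names and paths are assumptions; adjust them to wherever you uploaded the data)-

# Assumed file names; change the paths to your own upload location
import pandas as pd

train = pd.read_csv('train.csv')   # labelled images: one 'label' column + 784 pixel columns
test = pd.read_csv('test.csv')     # unlabelled images: 784 pixel columns only

print(train.shape, test.shape)
print(train['label'].value_counts())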




You can see the training dataset has digit images from 0-9. Since the data type of the digits is int64, we will optimize memory by changing the pixel values to float32 and the labels to int32. One important point here is that to process an image in a neural network model, we need to scale it first. This scaling is also called normalization. In normalization, an image with grey-scale values from 0-255 is changed into the 0-1 pixel range. We can achieve this by dividing each value by its maximum value, that is 255, as shown below-
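A sketch of the conversion and scaling, assuming the label column is named 'label' as in the Kaggle file-

# Split the training data into pixel features and digit labels
X = train.drop('label', axis=1).values.astype('float32')  # pixels as float32
y = train['label'].values.astype('int32')                 # labels as int32
X_test = test.values.astype('float32')

# Normalization: bring grey-scale values from the 0-255 range into 0-1
X = X / 255.0
X_test = X_test / 255.0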


Next we convert our target variable to a binary class matrix using the keras.utils API-
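A one-line sketch with to_categorical, which turns each digit label into a 10-element one-hot row-

from keras.utils import to_categorical

# e.g. the label 3 becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_cat = to_categorical(y, num_classes=10)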
We have already learnt the remaining steps in the previous post; as a recap, we will now create a model, compile it and fit it. Here we will use the softmax activation function on the output layer to turn the outputs into probability-like values and allow one of the 10 classes to be selected as the model's output prediction. Logarithmic loss is used as the loss function (called categorical_crossentropy) and the efficient ADAM gradient descent algorithm is used to learn the weights-
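A sketch of the model definition, compilation and fitting; the layer sizes, epoch count and early-stopping patience here are illustrative assumptions you should experiment with-

from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

# A simple fully connected network over the 784 flattened pixels
model = Sequential()
model.add(Dense(50, activation='relu', input_shape=(784,)))
model.add(Dense(50, activation='relu'))
model.add(Dense(10, activation='softmax'))  # one probability per digit class

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Stop training automatically once the validation loss stops improving
early_stopping = EarlyStopping(patience=2)
model.fit(X, y_cat,
          validation_split=0.3,   # hold out 30% of the samples for validation
          epochs=30,
          callbacks=[early_stopping])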


The above cell will print the training log for each epoch-

See here, our model is trained on 29399 samples and validated on the remaining 12601 samples because we used the validation_split argument with a value of 30%. After epoch 7 the model stopped training by itself because it was no longer improving; that is the early stopping callback at work. We got a model accuracy of around 97%. Try to train the above model with a different number of nodes, increase the number of hidden layers and see if you get any better result! Now we can use our model to make predictions on the test dataset-
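A sketch of the prediction step-

import numpy as np

# Each row of 'probabilities' holds 10 class probabilities; argmax picks the most likely digit
probabilities = model.predict(X_test)
predicted_digits = np.argmax(probabilities, axis=1)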
You can also save your results in a CSV file as below-
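A sketch of the submission file, using the Kaggle 'ImageId,Label' format (the output file name is an assumption)-

# ImageId starts from 1 in the Kaggle submission format
submission = pd.DataFrame({'ImageId': range(1, len(predicted_digits) + 1),
                           'Label': predicted_digits})
submission.to_csv('submission.csv', index=False)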

In our submission file we are predicting that the image with ImageId 1 in the test dataset is the digit 2, the image with ImageId 2 is the digit 0, and so on. That was quite easy, right? With the help of a GPU-supported environment like Google Colab and the powerful deep learning library Keras we are able to achieve a very good accuracy. In the next post we will move ahead and learn advanced deep learning with Keras. Till then Go chase your dreams, have an awesome day, make every second count and see you later in my next post.





