
Machine Learning::Confusion Matrix


Another post starts with you beautiful people!
Thanks for your overwhelming response on my previous post about decision trees and random forests.
In this post we will continue our Machine Learning journey and learn how to interpret the confusion matrix.
After reading this post we will know:




  • What the confusion matrix is and why we need to use it.
  • How to calculate a confusion matrix.
  • How to create a confusion matrix.
A confusion matrix is a technique for summarizing the performance of a classification algorithm.
Classification accuracy, the ratio of correct predictions to total predictions made, can be misleading on its own if we have an unequal number of observations in each class or if we have more than two classes in our dataset.
For a quick revision, remember the following formula:-
error rate = (1 - (correct predictions / total predictions)) * 100
The main problem with classification accuracy is that it hides the detail we need to better understand the performance of our classification model.

There are two examples where we are most likely to encounter this problem:

  • When our data has more than 2 classes. With 3 or more classes we may get a classification accuracy of 80%, but we don’t know if that is because all classes are being predicted equally well or whether one or two classes are being neglected by the model.
  • When our data does not have an even number of observations in each class. We may achieve an accuracy of 90% or more, but this is not a good score if 90 of every 100 records belong to one class, because we can reach it simply by always predicting the most common class value.
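
The second case is easy to reproduce. Below is a minimal sketch in plain Python, assuming a made-up set of 100 labels with a 90/10 split (the labels and variable names are only for illustration, they are not from any real dataset). It shows a "model" that always predicts the most common class and still scores 90% accuracy while finding none of the minority records.

```python
from collections import Counter

# Hypothetical labels: 90 records of class 0 and 10 of class 1 (imbalanced).
y_true = [0] * 90 + [1] * 10

# A "model" that ignores its input and always predicts the most common class.
most_common_class = Counter(y_true).most_common(1)[0][0]
y_pred = [most_common_class] * len(y_true)

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
error_rate = (1 - accuracy) * 100  # error rate in percent

# How many of the minority-class records did the "model" find?
minority_found = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)

print(f"accuracy: {accuracy:.2f}")
print(f"error rate: {error_rate:.1f}%")
print(f"minority records found: {minority_found} of {y_true.count(1)}")
```

The accuracy looks fine, yet the class we probably care about is never predicted. This is exactly the detail that accuracy hides.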

But thankfully we can tease apart this detail by using a confusion matrix. Calculating a confusion matrix can give us a better idea of what our classification model is getting right and what types of errors it is making. A confusion matrix is a summary of prediction results on a classification problem. The number of correct and incorrect predictions are summarized with count values and broken down by each class. This is the key to the confusion matrix.

Below is the process for calculating a confusion matrix:-
1. We need a test dataset or a validation dataset with expected outcome values.
2. Make a prediction for each row in our test dataset.
3. From the expected outcomes and predictions, count-
  • The number of correct predictions for each class.
  • The number of incorrect predictions for each class, organized by the class that was predicted.

4. These numbers are then organized into a table, or a matrix as follows:

  • Expected down the side: Each row of the matrix corresponds to an actual class.
  • Predicted across the top: Each column of the matrix corresponds to a predicted class.

5. The counts of correct and incorrect classifications are then filled into the table.
6. The total number of correct predictions for a class goes into the expected row for that class value and the predicted column for that same class value, so correct predictions lie on the diagonal.
7. In the same way, the number of incorrect predictions for a class goes into the expected row for that class value and the predicted column for the class that was wrongly predicted.
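
The steps above can be sketched in a few lines of plain Python. This is only an illustration: the function name `confusion_matrix`, the three class names and the ten expected/predicted values are all assumptions made up for the example. Rows are the expected classes and columns are the predicted classes, as described in step 4.

```python
def confusion_matrix(expected, predicted, classes):
    """Return a table where matrix[i][j] counts the rows whose expected class
    is classes[i] and whose predicted class is classes[j]."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for e, p in zip(expected, predicted):
        matrix[index[e]][index[p]] += 1
    return matrix

# Hypothetical expected outcomes and predictions for ten test rows.
expected  = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "cat", "dog", "dog", "dog", "dog", "bird", "bird", "cat", "bird"]
classes = ["cat", "dog", "bird"]

matrix = confusion_matrix(expected, predicted, classes)

# Correct predictions sit on the diagonal, everything else is an error.
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / len(expected)

print("columns (predicted):", classes)
for cls, row in zip(classes, matrix):
    print(f"expected {cls:>4}: {row}")
print(f"accuracy: {accuracy:.2f}")
```

Reading along a row tells us how the records of one actual class were spread over the predicted classes. If you use scikit-learn, its `confusion_matrix` function follows the same layout (actual classes in rows, predicted classes in columns).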

Let's get hands-on with a dataset of patient records, where we need to predict whether a patient has diabetes or not.
Attribute Information:
1. Number of times pregnant
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. Diastolic blood pressure (mm Hg)
4. Triceps skin fold thickness (mm)
5. 2-Hour serum insulin (mu U/ml)
6. Body mass index (weight in kg/(height in m)^2)
7. Diabetes pedigree function
8. Age (years)

9. Class variable (0 or 1)


Checking the first 5 rows of data:- pima.head()

Checking the type of each attribute:- pima.info()

Distribution of negative cases and positive cases:- pima.groupby('label')['skin'].count()

Define X and y (where X holds the independent attributes, i.e. the features, and y is the dependent attribute, i.e. the label):-

Split X and y into training and testing sets:-

Train a logistic regression model on the training set:-

Output-

Make class predictions for the testing set:-

Calculate accuracy:-

Let's calculate the null accuracy (the accuracy that could be achieved by always predicting the most frequent class) and see how good our model is compared to this baseline.

Examine the class distribution of the testing set:-
Calculate the percentage of ones:-
Calculate the percentage of zeros:-
Calculate null accuracy (for binary classification problems coded as 0/1):-

So the null accuracy score is 0.677, and our model's accuracy is only a little better than the null accuracy.
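
The null accuracy calculation itself is simple. Here is a minimal sketch in plain Python, assuming a made-up list of 0/1 test labels (these are not the diabetes data, so the result differs from the 0.677 above). The helper name `null_accuracy` is hypothetical.

```python
from collections import Counter

def null_accuracy(y):
    """Accuracy obtained by always predicting the most frequent class in y."""
    counts = Counter(y)
    return counts.most_common(1)[0][1] / len(y)

# Hypothetical 0/1 test labels (illustration only): 13 zeros and 7 ones.
y_test = [0] * 13 + [1] * 7

# For labels coded as 0/1, the mean is the share of ones.
share_of_ones = sum(y_test) / len(y_test)
share_of_zeros = 1 - share_of_ones
binary_null_accuracy = max(share_of_ones, share_of_zeros)

print(f"share of ones:  {share_of_ones:.2f}")
print(f"share of zeros: {share_of_zeros:.2f}")
print(f"null accuracy:  {binary_null_accuracy:.2f}")
print(f"null accuracy (any number of classes): {null_accuracy(y_test):.2f}")
```

The `max(...)` shortcut only works for binary problems coded as 0/1, while the `Counter` version works for any number of classes.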

Let's plot the confusion matrix:-

Output:-

Let's apply random forest:-

Make class predictions for the testing set and check accuracy:-

Confusion matrix for random forest:-

Output:-

Basic terminology for confusion matrix:-

  • True Positives (TP): we correctly predicted that they do have diabetes
  • True Negatives (TN): we correctly predicted that they don't have diabetes
  • False Positives (FP): we incorrectly predicted that they do have diabetes (a "Type I error")
  • False Negatives (FN): we incorrectly predicted that they don't have diabetes (a "Type II error")
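
To make the four terms concrete, here is a small sketch in plain Python that counts them from actual and predicted labels. The function name `binary_counts` and the ten labels are assumptions for illustration only, with 1 meaning "has diabetes" and 0 meaning "does not have diabetes".

```python
def binary_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1   # predicted diabetes, patient has diabetes
            else:
                fp += 1   # predicted diabetes, patient does not (Type I error)
        else:
            if actual == positive:
                fn += 1   # predicted no diabetes, patient has it (Type II error)
            else:
                tn += 1   # predicted no diabetes, patient does not have it
    return tp, tn, fp, fn

# Hypothetical labels for ten patients (illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = binary_counts(y_true, y_pred)

# Same layout as a 2x2 confusion matrix with actual classes in rows
# and predicted classes in columns, ordered 0 then 1.
matrix = [[tn, fp],
          [fn, tp]]

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(matrix)
```
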
Let's understand the metrics in terms of business context:-
Suppose you are the owner of the Ferrari company and you are manufacturing a limited edition supercar.
The head of the marketing department has the details of 10,000 customers who they are thinking of advertising to.
You have created a model which predicts whether a customer will buy the car or not.
According to the model, you will advertise only to those customers the model predicts as buyers.
So in this case your model can make two kinds of mistakes-
1) False positive: it predicts a non-buyer as a buyer (falsely predicting that the customer will buy). These mistakes lower precision, the share of predicted buyers who really are buyers.
2) False negative: it predicts a buyer as a non-buyer (falsely predicting that the customer will not buy). These mistakes lower recall, the share of real buyers that the model finds.

Now which metric do you think is important?
In this case, if the model predicts a non-buyer as a buyer, the company loses only a small amount: the money spent on advertising to that person (at most $50). This is a false positive, and it hurts precision.

But on the other side of the coin, if the model predicts a buyer as a non-buyer, the company is not going to advertise the car to that buyer, and in the end it loses a customer who had the potential to buy the car. This is a false negative, and it hurts recall.
So in this case recall is the metric to optimize.
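
Here is a minimal sketch of how the two metrics come out of the counts, using the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN). The four counts below are invented for the 10,000 customers purely for illustration; only the $50 advertising cost comes from the example above.

```python
# Hypothetical counts for the 10,000 customers (illustration only).
tp = 80     # buyers the model flagged as buyers
fp = 320    # non-buyers the model flagged as buyers
fn = 20     # buyers the model missed
tn = 9580   # non-buyers the model correctly skipped

def precision(tp, fp):
    """Share of predicted buyers who really are buyers."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of real buyers that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

p = precision(tp, fp)
r = recall(tp, fn)

# Cost of each kind of mistake in this scenario.
wasted_ad_spend = fp * 50   # at most $50 per wrongly targeted customer
missed_buyers = fn          # each one is a lost potential sale

print(f"precision: {p:.2f}")
print(f"recall:    {r:.2f}")
print(f"wasted ad spend: at most ${wasted_ad_spend}")
print(f"missed buyers:   {missed_buyers}")
```

With numbers like these, low precision costs only some advertising money, while every missed buyer is a lost sale, which is why recall matters more here.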

What is F1-Score?
F1 Score is the harmonic mean of Precision and Recall. Therefore, this score takes both false positives and false negatives into account.
Intuitively it is not as easy to understand as accuracy, but F1 is usually more useful than accuracy, especially if we have an uneven class distribution.
Accuracy works best if false positives and false negatives have similar cost.
If the costs of false positives and false negatives are very different, it’s better to look at both Precision and Recall.
If we have a specific goal in mind like 'Precision is the king. We don't care much about recall', then there's no problem.
Higher precision is better. But if we don't have such a strong goal, we will want a combined metric. That's the F-measure. With it, we can compare different models on a single number that balances precision and recall.
F1-Score = 2 * (Recall * Precision) / (Recall + Precision)
The closer to 1 the better.


Say we have a precision of 80% and a recall of 15%. If we create a new model with a different algorithm, suppose the new model's precision is 70% but its recall is 20%.
The first case has an F-measure of 25.3%. The second has 31.1%. Even though the simple average of precision and recall goes down between the two (from 47.5% to 45%), the F-measure goes up, because it rewards raising the weaker of the two values. So the drop in precision is worth the gain in recall.
The F-score allows us to judge just how much of a tradeoff is worthwhile. If we instead made our system have a 30% precision and 20% recall, our F-measure would be 24%, and the tradeoff wouldn't be worth it.
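
We can check these three numbers with a short sketch in plain Python. The function name `f1_score` and the model labels are only for illustration; the precision and recall values are the ones from the example above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs from the example above.
models = {
    "first model":  (0.80, 0.15),
    "second model": (0.70, 0.20),
    "third model":  (0.30, 0.20),
}

scores = {}
for name, (p, r) in models.items():
    scores[name] = f1_score(p, r)
    simple_average = (p + r) / 2
    print(f"{name}: F1 = {scores[name]:.3f}, simple average = {simple_average:.3f}")
```

Notice how F1 stays close to the smaller of the two values, so a model cannot get a high F1 by being good at only precision or only recall.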

What is F-beta Score?
The beta parameter determines the weight of recall relative to precision in the combined score.
beta < 1 lends more weight to precision, while beta > 1 favors recall (beta -> 0 considers only precision, beta -> inf only recall).
If we are trying to decide between two different models that both have high precision but lower recall, which one will we choose?
One method is to choose the model with the higher area under the ROC curve; another is to choose the model with the higher F-beta score.

Try different values to understand how a change in the beta value affects the output:-
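
Here is a minimal sketch in plain Python, assuming the usual definition F-beta = (1 + beta^2) * Precision * Recall / (beta^2 * Precision + Recall). The function name `f_beta` is hypothetical, and the precision and recall values are the 80% / 15% pair from the F1 example, used only for illustration.

```python
def f_beta(precision, recall, beta):
    """F-beta score: beta < 1 favors precision, beta > 1 favors recall."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator

precision, recall = 0.80, 0.15  # high precision, low recall

results = {}
for beta in (0.1, 0.5, 1, 2, 10):
    results[beta] = f_beta(precision, recall, beta)
    print(f"beta = {beta:>4}: F-beta = {results[beta]:.3f}")
```

For this model the score falls as beta grows, because a larger beta puts more weight on its weak recall. With beta = 1 we get back the F1 score. If you work with scikit-learn, `fbeta_score(y_true, y_pred, beta=...)` computes the same quantity directly from the labels.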




That's it for today, guys. Try out what you learned above on a different dataset and explore more!





