
Can you build a model to predict toxic comments?


Another post starts with you beautiful people!
I hope you learned something new about a very powerful machine learning model from my previous post, How to use LightGBM?
By now you must have an idea that there is hardly any area left where a machine learning model cannot be applied; yes, it's everywhere!
Continuing our journey, today we will learn how to deal with a problem whose features are texts/sentences. You see examples of such problems on internet sites and in emails, posts, social media, etc. Data scientists at industry giants like Quora, Twitter, Facebook, and Google are working very smartly to build machine learning models to classify texts/sentences/words.
Today we are going to do the same, and believe me friends, once you get some hands-on practice, you will be wearing the same hat.

Challenge Link: jigsaw-toxic-comment-classification-challenge
Problem: We’re challenged to build a multi-headed model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate.
Solution Format: We need to create a model which predicts a probability of each type of toxicity for each comment.

Let's start our analysis by digging into the given datasets:
#loading the datasets from kaggle

#checking the shape of the datasets

#Let's check the class imbalance in the train dataset


#check and fill NA


#store the target in a variable
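The steps listed above can be sketched as follows. This is only a minimal sketch: the small DataFrame is an invented stand-in for the competition's train file (in practice you would load it with pd.read_csv), and the "unknown" fill value is an assumption. The six label column names are the ones used in the Jigsaw training data.

```python
import pandas as pd

# The six label columns in the Jigsaw toxic-comment training file.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Toy stand-in for pd.read_csv("train.csv"); these rows are invented for illustration.
train = pd.DataFrame({
    "id": ["a1", "a2", "a3", "a4"],
    "comment_text": ["Thanks for the edit!", "You are an idiot", None, "I will find you"],
    "toxic": [0, 1, 0, 1],
    "severe_toxic": [0, 0, 0, 0],
    "obscene": [0, 1, 0, 0],
    "threat": [0, 0, 0, 1],
    "insult": [0, 1, 0, 0],
    "identity_hate": [0, 0, 0, 0],
})

# Shape of the dataset: (rows, columns).
print(train.shape)

# Class imbalance: fraction of comments that carry each label.
label_share = train[LABELS].mean()
print(label_share)

# Count missing comments, then fill them with a placeholder string (assumed value).
n_missing = int(train["comment_text"].isna().sum())
train["comment_text"] = train["comment_text"].fillna("unknown")

# Store the targets in a separate variable.
y = train[LABELS]
```

On the real data the label shares are typically small, which is why it is worth looking at the imbalance before trusting any accuracy figure.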

Next, we will work on cleaning the text. We will write a function that cleans the text using the regular expression package-
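A cleaning function of this kind might look like the sketch below. The exact rules in the original code are not shown here, so the contraction list and the choice to drop all non-word characters are assumptions; clean_text is a hypothetical name.

```python
import re


def clean_text(text: str) -> str:
    """Lowercase a comment, expand a few contractions, and strip non-word characters."""
    text = text.lower()
    # Expand common contractions so that "can't" and "cannot" share a token.
    text = re.sub(r"what's", "what is ", text)
    text = re.sub(r"can't", "cannot ", text)
    text = re.sub(r"n't", " not ", text)
    text = re.sub(r"i'm", "i am ", text)
    text = re.sub(r"'re", " are ", text)
    text = re.sub(r"'ll", " will ", text)
    text = re.sub(r"'ve", " have ", text)
    # Replace anything that is not a letter, digit, or underscore with a space.
    text = re.sub(r"\W", " ", text)
    # Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(clean_text("You CAN'T  do that!!"))
```

With pandas, a function like this can be applied to a whole column with train["comment_text"].map(clean_text), and the same call works for the test data.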


Let's apply the above function to both the train and test datasets-

Next, we will split the train and test datasets-

Now we will transform the text into feature vectors that can be used as input to an estimator, and for this purpose we will use TfidfVectorizer-

Next, we will learn the vocabulary from the training text, then use it to create a document-term matrix-
We will then transform the test data into a document-term matrix using the same vocabulary-
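The vectorizing steps above can be sketched like this. The example sentences are invented, and the vectorizer is left at scikit-learn's default settings as an assumption; the original post may well have used other parameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented example comments standing in for the cleaned train and test text.
train_text = ["you are great", "you are an idiot", "thanks for the help"]
test_text = ["you are a great help", "completely unseen words"]

# Default settings; parameters such as max_features or stop_words could be tuned.
vect = TfidfVectorizer()

# fit_transform learns the vocabulary from the training text and
# returns the training document-term matrix (a sparse matrix).
X_train_dtm = vect.fit_transform(train_text)

# transform reuses the learned vocabulary, so the test matrix has the same columns.
X_test_dtm = vect.transform(test_text)

print(X_train_dtm.shape, X_test_dtm.shape)
```

Note that the test data is only transformed, never fitted: words that did not appear in the training text are simply ignored, so a test comment made up entirely of unseen words becomes an all-zero row.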

Now it's time to import and instantiate the Logistic Regression model-
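Importing and instantiating the model takes two lines. The parameter values here are assumptions, not the settings from the original code.

```python
from sklearn.linear_model import LogisticRegression

# C is the inverse regularization strength; 1.0 is scikit-learn's default.
# max_iter is raised as a precaution so the solver has room to converge.
logreg = LogisticRegression(C=1.0, max_iter=1000)
```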


Since there is more than one target variable, we will use a for loop to fit our model on each one, and then we will look at the predictions and accuracy of our model-
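One way to write that loop is sketched below: one binary logistic regression is fitted per label, and the probability of the positive class is stored for each test comment. The toy data is invented and uses only three of the six labels to stay short; the model settings are assumptions.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Invented toy data; the real training file has six label columns.
train = pd.DataFrame({
    "comment_text": [
        "thank you for the helpful edit",
        "you are a stupid idiot",
        "i will hurt you",
        "great article, well written",
        "stupid idiot go away",
        "i will find you and hurt you",
    ],
    "toxic":  [0, 1, 1, 0, 1, 1],
    "insult": [0, 1, 0, 0, 1, 0],
    "threat": [0, 0, 1, 0, 0, 1],
})
test = pd.DataFrame({"comment_text": ["what a helpful article", "you idiot"]})
labels = ["toxic", "insult", "threat"]

vect = TfidfVectorizer()
X_train_dtm = vect.fit_transform(train["comment_text"])
X_test_dtm = vect.transform(test["comment_text"])

logreg = LogisticRegression(C=1.0, max_iter=1000)
predictions = pd.DataFrame(index=test.index)
train_accuracy = {}

for label in labels:
    y = train[label]
    # Refit the same estimator on the current label.
    logreg.fit(X_train_dtm, y)
    # Accuracy on the training data itself (an optimistic figure).
    train_accuracy[label] = accuracy_score(y, logreg.predict(X_train_dtm))
    # Column 1 of predict_proba is the probability of the positive class.
    predictions[label] = logreg.predict_proba(X_test_dtm)[:, 1]

print(train_accuracy)
print(predictions)
```

Keep in mind that training accuracy flatters the model, and with labels this imbalanced a model can score high accuracy just by predicting "not toxic"; a held-out validation split gives a more honest picture.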

With the above technique I got the accuracy below-


Finally, we submit our predictions to Kaggle as below-
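A submission for this challenge is a CSV file with an id column followed by one probability column per label. The sketch below builds such a file from made-up ids and probabilities; in practice the values would come from the per-label predictions computed earlier.

```python
import pandas as pd

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Made-up ids and probabilities, standing in for the test ids and model output.
test_ids = ["t1", "t2"]
probs = {label: [0.10, 0.90] for label in LABELS}

# One row per test comment: the id first, then one probability per label.
submission = pd.DataFrame({"id": test_ids, **probs})

# index=False keeps the DataFrame's row index out of the file.
csv_text = submission.to_csv(index=False)
print(csv_text)
```

Passing a file name, for example submission.to_csv("submission.csv", index=False), writes the file to disk instead of returning a string.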

If you want to see the actual code, you can find it here- Kaggle Toxic Classify

The above model is just for your reference so that you can at least start your text classification journey in Machine Learning. So please download my code, improve the logic, apply different models, try to improve the score, and share your learning.

Meanwhile Friends! Go chase your dreams, have an awesome day, make every second count and see you later in my next post.
