
Can you build a model to predict toxic comments?


Another post starts with you beautiful people!
I hope you learnt something new about a very powerful machine learning model from my previous post, How to use LightGBM?
By now you must have an idea that there is hardly any area left where a machine learning model cannot be applied; yes, it's everywhere!
Continuing our journey, today we will learn how to deal with a problem whose features are texts/sentences. You see examples of such problems on internet sites and in emails, posts, social media, etc. Data scientists at industry giants like Quora, Twitter, Facebook and Google are working very smartly to build machine learning models that classify texts/sentences/words.
Today we are going to do the same, and believe me friends, once you get some hands-on practice you will be wearing the same hat too.

Challenge Link: jigsaw-toxic-comment-classification-challenge
Problem: We’re challenged to build a multi-headed model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate.
Solution Format: We need to create a model which predicts a probability of each type of toxicity for each comment.

Let's first start our analysis by digging into the given datasets:-
#loading the datasets from kaggle

#checking the shape of the datasets

#Lets check the class imbalance in the train dataset


#check and fill NA


#store the target in a variable

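The exploration steps listed above can be sketched as follows. This is only a minimal sketch: the tiny DataFrames are made-up stand-ins for the Kaggle files (in practice you would load `train.csv` and `test.csv` with `pd.read_csv`), and the column names `id`, `comment_text` and the six label columns are assumed to match the competition data.

```python
import pandas as pd

# Assumed label columns of the competition's train.csv
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Toy stand-ins for the Kaggle datasets (made-up rows, for illustration only)
train = pd.DataFrame({
    "id": ["a1", "a2", "a3", "a4"],
    "comment_text": ["Thanks for the edit!", "You are an idiot", None, "Nice article"],
    "toxic": [0, 1, 0, 0],
    "severe_toxic": [0, 0, 0, 0],
    "obscene": [0, 1, 0, 0],
    "threat": [0, 0, 0, 0],
    "insult": [0, 1, 0, 0],
    "identity_hate": [0, 0, 0, 0],
})
test = pd.DataFrame({"id": ["b1", "b2"], "comment_text": ["Great work", None]})

# Checking the shape of the datasets
print(train.shape, test.shape)

# Class imbalance: share of positive rows for each label
positive_rate = train[LABELS].mean()
print(positive_rate)

# Check and fill NA in the text column
print(train["comment_text"].isnull().sum(), test["comment_text"].isnull().sum())
train["comment_text"] = train["comment_text"].fillna("unknown")
test["comment_text"] = test["comment_text"].fillna("unknown")

# Store the targets in a variable
y = train[LABELS]
```

On the real data the positive rates are expected to be small for every label, which is why the class imbalance is worth checking before trusting any accuracy number.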
Next, we will work on cleaning the texts. We will write a function that cleans the text using Python's regular expression package:
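A cleaning function of this kind might look like the sketch below. The name `clean_text` and the specific substitutions are illustrative assumptions, not necessarily the rules used in the original notebook.

```python
import re

def clean_text(text):
    """Lowercase a comment, expand a few contractions and drop non-word characters."""
    text = text.lower()
    # Expand some common contractions (illustrative, not exhaustive)
    text = re.sub(r"what's", "what is ", text)
    text = re.sub(r"can't", "cannot ", text)
    text = re.sub(r"n't", " not ", text)
    text = re.sub(r"i'm", "i am ", text)
    # Replace anything that is not a letter, digit or underscore with a space
    text = re.sub(r"\W", " ", text)
    # Collapse runs of whitespace into a single space
    text = re.sub(r"\s+", " ", text)
    return text.strip()

print(clean_text("I'm SURE you can't do that!!!"))
```

With pandas, such a function can be applied to a whole text column through `Series.apply`, for example `train["comment_text"].apply(clean_text)`.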


Let's apply the above function to both the train and test datasets:

Next, we will split the train and test datasets-

Now we will transform the text into feature vectors that can be used as input to an estimator, and for this purpose we will use TfidfVectorizer:

Next, we will learn the vocabulary from the training text and use it to create a document-term matrix:
Then we will transform the test data into a document-term matrix using the same vocabulary:
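The vectorizing steps above can be sketched with scikit-learn as follows. The sentences are made-up examples and the variable names (`vect`, `X_dtm`, `test_X_dtm`) are assumptions; the vectorizer is left at its default settings, which the original notebook may well have changed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up stand-ins for the cleaned train and test comments
train_text = ["you are great", "you are awful", "great article thanks"]
test_text = ["awful article", "completely unseen words"]

# Instantiate the vectorizer with default settings
vect = TfidfVectorizer()

# Learn the vocabulary from the training text and build its document-term matrix
X_dtm = vect.fit_transform(train_text)

# Reuse the learned vocabulary for the test text (transform only, no refitting)
test_X_dtm = vect.transform(test_text)

print(X_dtm.shape, test_X_dtm.shape)
```

Calling `transform` rather than `fit_transform` on the test data keeps both matrices in the same feature space; words that never appeared in the training text are simply ignored.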

Now it's time to import and instantiate the Logistic Regression model-


Since there is more than one target variable, we will use a for loop to fit our model on each label in turn, and then look at the predictions and accuracy of our model:
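One way to write such a loop is sketched below. The toy comments, the two-label subset and the default `LogisticRegression` settings are all assumptions made for illustration; the accuracy printed here is measured on the training data itself, so it says little about how the model behaves on unseen comments.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy data (assumption): the real inputs are the cleaned Kaggle comments
train_text = [
    "you are an idiot", "what a stupid idiot", "i will hurt you",
    "stupid and useless edit", "thanks for the helpful edit",
    "great article well written", "i agree with this change",
    "nice work on the page",
]
test_text = ["you stupid idiot", "thanks for the great article"]

# Two of the six competition labels, to keep the sketch short
targets = {
    "toxic": np.array([1, 1, 1, 1, 0, 0, 0, 0]),
    "insult": np.array([1, 1, 0, 1, 0, 0, 0, 0]),
}

vect = TfidfVectorizer()
X_dtm = vect.fit_transform(train_text)
test_X_dtm = vect.transform(test_text)

# Instantiate the model once and refit it for every label
logreg = LogisticRegression()

predictions = {}
for label, y in targets.items():
    logreg.fit(X_dtm, y)
    train_accuracy = accuracy_score(y, logreg.predict(X_dtm))
    print(f"{label}: training accuracy = {train_accuracy:.3f}")
    # Column 1 of predict_proba is the probability of the positive class
    predictions[label] = logreg.predict_proba(test_X_dtm)[:, 1]
```

Using `predict_proba` instead of `predict` matches the solution format, which asks for a probability of each type of toxicity for each comment.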

With the above technique I got the following accuracy:


And finally, we submit our predictions to Kaggle as below:
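Building the submission file could look like the following sketch. The ids and probabilities are placeholders (random numbers, not real model output), and the layout of an `id` column followed by one probability column per label is assumed from the solution format described earlier. The CSV is written to an in-memory buffer here; in practice you would pass a file name such as `submission.csv` to `to_csv`.

```python
import io

import numpy as np
import pandas as pd

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Placeholder ids and probabilities standing in for the real test predictions
test_ids = ["b1", "b2", "b3"]
rng = np.random.default_rng(0)
predictions = {label: rng.random(len(test_ids)) for label in LABELS}

# One row per test comment: its id followed by one probability per label
submission = pd.DataFrame({"id": test_ids})
for label in LABELS:
    submission[label] = predictions[label]

# Write the CSV without the pandas index column
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])
```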

If you want to see the actual code, you can find it here: Kaggle Toxic Classify

The above model is just for your reference, so that you can at least start your text classification journey in machine learning. So please download my code, improve the logic, apply different models, try to improve the score and share your learning.

Meanwhile Friends! Go chase your dreams, have an awesome day, make every second count and see you later in my next post.

