
How to train a custom dataset with YOLOv7 for instance segmentation?

 

Another post starts with you beautiful people! It is overwhelming for me to see the massive interest in my last three posts of the YOLOv7 series💓. Your response keeps me motivated to share my learning with you all 💝. If you have not checked my previous posts about YOLOv7, here are the links; read them first and then proceed with this post-

  1. Train a custom dataset with YOLOv7
  2. Export custom YOLOv7 model to ONNX
  3. Export custom YOLOv7 model to TensorRT

Till now we have learned about object detection with YOLOv7. In this post, we are going to learn how we can train a custom dataset for the instance segmentation task with YOLOv7 👌. For your information, instance segmentation is the task of detecting and delineating each distinct object of interest appearing in an image. For our hands-on we need a dataset having images and their annotations in polygon format and, of course, in YOLO format. So I have found and downloaded the American Sign Language dataset in the required format from this link.

This dataset has all letters A through Z in American Sign Language labeled with polygon labels. For reference, the alphabet and its corresponding signs look like below-

For this post also I am using the Google Colab environment. Let's first install the required library and restart the Colab notebook runtime once it is installed-
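The exact package did not survive in this text-only version of the post; as a placeholder, a typical setup cell for this branch looks like the sketch below (the package shown is my assumption, not necessarily the one from the original screenshot), followed by Runtime > Restart runtime:

    # Assumption: upgrading PyYAML is a common Colab prerequisite for the
    # YOLOv7 config parser; restart the runtime once the install completes.
    !pip install -U pyyaml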

Next, we will download the code repo of YOLOv7's segmentation branch since, at the time of writing this post, the author has not officially merged the segmentation code into the main branch. So I have downloaded the u7 branch and uploaded it to my Google Drive so that I can access it in my Colab notebook 😀. After uploading the branch code to Drive, I unzipped it as below-
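A minimal sketch of that cell, assuming the branch was saved to Drive as yolov7-u7.zip (the archive name and Drive path are my placeholders; adjust them to where you uploaded the file):

    from google.colab import drive
    drive.mount('/content/drive')

    # Unzip the u7 branch archive uploaded to Drive (placeholder path).
    !unzip -q "/content/drive/MyDrive/yolov7-u7.zip" -d /content/

    # The segmentation code lives in the seg/ sub-folder of the u7 branch.
    %cd /content/yolov7-u7/seg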

The unzipped folder structure of the dataset will look like this-
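The screenshot of the tree is not reproduced here; a typical YOLO-format segmentation export (for example, from Roboflow) unpacks into a layout like the following, where each label file holds one line per object, consisting of the class id followed by normalized polygon coordinates:

    American-Sign-Language/
    ├── train/
    │   ├── images/   # .jpg files
    │   └── labels/   # .txt files: class_id x1 y1 x2 y2 ... (normalized polygon)
    ├── valid/
    │   ├── images/
    │   └── labels/
    └── test/
        ├── images/
        └── labels/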

Next, we will prepare the required YAML files for our instance segmentation training. First, we will create our configuration YAML file, where we will define the paths of the training and validation images, the number of target classes, and the names of the target classes as below-
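A sketch of that file, assuming the dataset lives under /content/American-Sign-Language (adjust the paths to your own folders):

    # data.yaml -- dataset paths and class definitions (paths are placeholders)
    train: /content/American-Sign-Language/train/images
    val: /content/American-Sign-Language/valid/images

    nc: 26  # number of classes: letters A-Z
    names: ['A','B','C','D','E','F','G','H','I','J','K','L','M',
            'N','O','P','Q','R','S','T','U','V','W','X','Y','Z']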

Here I have provided the train and validation image paths, and the number of classes as 26 since our classes are the letters A-Z. Next, we will create a second YAML file having the same content as the default yolov7-seg.yaml but with one difference- the value of the 'nc' parameter changes from 80 to 26-
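Only that one line changes; everything else is copied verbatim from the branch's default model config (under models/segment/ in my copy of the u7 branch):

    # yolov7-seg-custom.yaml -- identical to the default yolov7-seg.yaml
    # except for this one line:
    nc: 26  # number of classes (default is 80)
    # anchors, backbone and head sections are copied unchanged
    # from the original file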

Next, we will create the hyperparameter YAML file having the same content as the official hyp.scratch-high.yaml, as below-
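Since the contents are identical, the simplest approach is to copy the file from the repo rather than retype it; a few leading entries are shown here as a sanity check (file location given as I recall it in this branch):

    # hyp.scratch-high.yaml -- first entries shown for reference; copy the
    # complete file from the repo (seg/data/hyps/ in my copy of the branch)
    lr0: 0.01            # initial learning rate
    lrf: 0.1             # final learning-rate fraction (OneCycleLR)
    momentum: 0.937      # SGD momentum
    weight_decay: 0.0005 # optimizer weight decay
    # ... the remaining loss and augmentation entries are copied unchanged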

Next, we will download the pre-trained weight file for the training as below-
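A sketch of the download cell, assuming the yolov7-seg.pt checkpoint is still published under the repo's v0.1 release (check the releases page if the URL has moved):

    !wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-seg.pt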

Now we are ready to start our instance segmentation training with the below command-
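A sketch of that command, run from the seg/ folder; the file names match the ones created above, while the batch size and image size here are my assumptions rather than values confirmed by the post:

    # Note it is segment/train.py, NOT the top-level train.py
    # (see the comments section below for the same point).
    !python segment/train.py \
        --data data.yaml \
        --cfg yolov7-seg-custom.yaml \
        --hyp hyp.scratch-high.yaml \
        --weights yolov7-seg.pt \
        --img 640 \
        --batch-size 16 \
        --epochs 100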

Here, change the .yaml file paths and the weight file path as per yours. In my Colab environment, 100 epochs took 1 hour to complete. For better accuracy, you should train for 300 epochs on this dataset. Once the training is completed, you will see the following output in the console-

After completion of the training, let's check the trained model on a test image as below-
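A sketch of the image-inference command; runs/train-seg/exp/weights/best.pt is the trainer's default save location in this branch, and the test image path is a placeholder:

    !python segment/predict.py \
        --weights runs/train-seg/exp/weights/best.pt \
        --source /content/test_sign.jpg \
        --conf-thres 0.25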

You will see the following output in the console after successfully running the above command-

Let's open the processed image and see what it looks like-
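A minimal sketch for viewing the result inline, assuming the default output folder of predict.py and my placeholder file name:

    from IPython.display import Image, display

    # predict.py saves the annotated copy under runs/predict-seg/exp/ by
    # default; the file name mirrors the input image (placeholder here).
    display(Image(filename='runs/predict-seg/exp/test_sign.jpg'))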

See, our trained model is able to do instance segmentation and predict the target class of a sign 💥. Let me share a few more output images-




Indeed, we have a strong sign language detection model. Let's run inference on a video-
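The same predict.py script accepts a video file as the source; a sketch with a placeholder video path:

    !python segment/predict.py \
        --weights runs/train-seg/exp/weights/best.pt \
        --source /content/sign_video.mp4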

The output in the console will look like this after a successful run of the above command-

Let's open this video in our Colab notebook-
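Colab cannot render a raw .mp4 from disk directly, so one common trick is to embed it as base64 HTML; a sketch assuming the default output path (if the browser refuses the codec, re-encode the file with ffmpeg first):

    from base64 import b64encode
    from IPython.display import HTML

    # Read the processed video written by predict.py (placeholder path)
    # and embed it as a base64 data URL so the Colab cell can play it.
    mp4 = open('runs/predict-seg/exp/sign_video.mp4', 'rb').read()
    data_url = 'data:video/mp4;base64,' + b64encode(mp4).decode()
    HTML(f'<video width=640 controls><source src="{data_url}" type="video/mp4"></video>')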

And it will show the processed video file as below-

That's it for today, guys! We have learned another useful technique with YOLOv7. Here I am sharing my Colab notebook for this tutorial. So no need to rest; just copy the notebook into your Colab environment and try it with your own dataset. In my next post I will share something useful again; till then 👉 Go chase your dreams, have an awesome day, make every second count, and see you later in my next post 😇







Comments

  1. Hello, first of all thank you for the great tutorial; however, I can't seem to reproduce your Colab. It's giving me an error in loss.py at line 198, anchors, shape = self.anchors[i], p[i].shape, saying that a list object has no attribute 'shape'. Did you encounter this problem?

  2. Nvm, I was running the wrong file. For anyone about to make this mistake: run the train.py located in the folder /seg/segment, NOT the one in /seg. Thanks for the tutorial, cheers!

  3. Yes, the correct train.py is inside the /seg/segment/ folder, as I mentioned already in this post.

  4. I was also trying to run instance segmentation on a custom dataset of occluded objects. I am facing problems annotating objects formed by two polygons/contours rather than a single one. In COCO-style format, I can use segmentation as a list of lists. How should I do it in YOLO format? Should I put all x, y coordinate pairs of the two polygons on the same line? If I put them on separate lines, they will be recognized as two distinct instances of the same class, as far as I understand. Can you suggest how to handle this?

    Replies
    1. If the object is one, then its annotation should also be one. May I know what kind of object you have?


