
Exploratory Data Analysis using Python

Another post starts with you beautiful people!
In my previous posts and pages we learnt the basic and advanced topics of Python required in Data Science.
Now it's time to do EDA. Sounds interesting!

Exploratory Data Analysis (EDA) is a crucial step of the data analytics process.
It involves exploring the data, identifying its important features, and asking interesting questions of it, using the statistical and visualization tools studied in earlier classes, such as descriptive statistics and basic plotting.

In this post we will use a dataset about TB (tuberculosis) in countries and their territories.
Specifically, we will be using data files for TB deaths, the spread of TB, and the number of new cases of TB to answer some important questions.

Since we are going to perform some Exploratory Data Analysis in our TB dataset, these are the questions we want to answer:

  • Which countries have the highest and most infectious TB incidence?
  • What is the general world tendency in the period from 1990 to 2007?
  • Which countries don't follow that tendency?
  • What other facts about the disease do we know that we can check with our data?

First, set the local path where you want to put the files. For example, I am using-
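
The path used in the post is not shown here, so this is only a minimal sketch. The helper name `make_data_dir` and the folder name `tb_data` are my own assumptions; any writable folder works.

```python
import tempfile
from pathlib import Path


def make_data_dir(base, name="tb_data"):
    """Create the folder for the CSV files if needed and return its path."""
    path = Path(base) / name
    path.mkdir(parents=True, exist_ok=True)
    return path


# Example: a folder under the system temp directory (an assumption for the demo).
local_path = make_data_dir(tempfile.gettempdir())
print(local_path)
```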

Second, import the required libraries-
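
The exact import list is not visible in this post. As an assumption, a typical set for this kind of analysis could look like this (a plotting library such as matplotlib would be added if you want charts):

```python
import urllib.request  # standard library: download files from a URL

import numpy as np     # numerical helpers
import pandas as pd    # data frames for tabular data

print(pd.__version__, np.__version__)
```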

Next, we will get our dataset from an internet resource and save it to our local disk-

For more details about the urllib.request library, please see its documentation.
After the above step, the dataset will be saved in your local path as CSV files, and we are ready to use them.
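
A hedged sketch of the download step using `urllib.request.urlretrieve`. The real dataset links are not shown here, so the helper `download_files` is hypothetical and the demo uses a local `file://` URL with made-up content in place of the real source.

```python
import tempfile
import urllib.request
from pathlib import Path


def download_files(sources, dest_dir):
    """Download each URL in `sources` (name -> URL) to dest_dir/<name>.csv."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = {}
    for name, url in sources.items():
        target = dest_dir / f"{name}.csv"
        urllib.request.urlretrieve(url, target)
        saved[name] = target
    return saved


# Offline demo: a file:// URL stands in for the real download link.
work_dir = Path(tempfile.mkdtemp())
sample = work_dir / "sample_source.csv"
sample.write_text("country,1990,2007\nA,10,5\n")  # made-up content
saved = download_files({"tb_deaths": sample.as_uri()}, work_dir / "downloads")
print(saved["tb_deaths"])
```

With real links you would pass a dictionary with one entry per file (deaths, existing cases, new cases).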

Now we will read the CSV files and do some beautification-
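
One plausible clean-up, sketched on made-up data. I am assuming a layout of one row per country and one column per year, with comma thousands separators; the real files may differ.

```python
import io

import pandas as pd

# Made-up numbers for illustration only, not real TB figures.
raw_csv = io.StringIO(
    "country,1990,1991,1992\n"
    'Country A,"1,200","1,100",900\n'
    "Country B,50,55,60\n"
)

cases_df = pd.read_csv(raw_csv, index_col=0, thousands=",")
cases_df.columns.name = "year"
cases_df = cases_df.T                        # years as rows, countries as columns
cases_df.index = cases_df.index.astype(int)  # "1990" -> 1990
print(cases_df)
```

Having years as rows makes per-country columns easy to select and plot over time.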

After this, let's explore some of the data-
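
Two common ways to take a first look, shown on a small made-up frame (the column names and values are assumptions, not the real dataset):

```python
import pandas as pd

# Made-up values for illustration only, not real TB figures.
cases_df = pd.DataFrame(
    {"Country A": [120, 110, 90], "Country B": [50, 55, 60]},
    index=pd.Index([1990, 1991, 1992], name="year"),
)

first_rows = cases_df.head(2)  # the first two years
summary = cases_df.describe()  # count, mean, std, min, quartiles, max per country
print(first_rows)
print(summary)
```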

If you want to check the percentage change in existing cases over the years-
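
A minimal sketch with pandas' `pct_change`, which compares each year with the previous one. The data is made up, with years as rows and countries as columns.

```python
import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {"Country A": [100, 110, 99], "Country B": [50, 40, 50]},
    index=pd.Index([1990, 1991, 1992], name="year"),
)

# Year-over-year change in percent; the first year has nothing to compare with.
pct_change = cases_df.pct_change() * 100
print(pct_change.round(1))
```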

Let us look at the curious case of Spain. What do you infer?

Let us go ahead and do some plotting-
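
A hedged sketch of a line plot per country using the pandas plotting wrapper around matplotlib. The data is made up, and the `Agg` backend is only selected so the script also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {"Country A": [120, 110, 90], "Country B": [50, 55, 60]},
    index=pd.Index([1990, 1991, 1992], name="year"),
)

ax = cases_df.plot(title="TB cases per year (made-up data)")  # one line per country
ax.set_ylabel("cases")
```

Calling `ax.figure.savefig("cases.png")` afterwards would write the chart to a file.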

How about box-plots-
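
A box plot shows the spread of each country's values across the years. This is a sketch on made-up data, again using the pandas plotting wrapper.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {"Country A": [120, 110, 90, 85], "Country B": [50, 55, 60, 58]},
    index=pd.Index([1990, 1991, 1992, 1993], name="year"),
)

ax = cases_df.plot(kind="box")  # one box per country column
ax.set_ylabel("cases")
```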

Which country has the highest number of existing and new TB cases?
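
One way to answer this, assuming years as rows and countries as columns, is `idxmax` along the columns. The sketch below uses made-up data; the same call would be applied to both the existing-cases and the new-cases frames.

```python
import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {
        "Country A": [120, 110, 90],
        "Country B": [50, 55, 60],
        "Country C": [100, 115, 95],
    },
    index=pd.Index([1990, 1991, 1992], name="year"),
)

# For every year, the name of the country with the largest value.
top_country = cases_df.idxmax(axis=1)
print(top_country)
```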



What about world trends?
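
A simple sketch of a world trend: aggregate all countries per year. Whether the sum or the mean is the right aggregate depends on whether the figures are counts or rates, so both are shown. The data is made up.

```python
import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {"Country A": [120, 110, 90], "Country B": [50, 55, 60]},
    index=pd.Index([1990, 1991, 1992], name="year"),
)

world_total = cases_df.sum(axis=1)   # suitable for absolute counts
world_mean = cases_df.mean(axis=1)   # suitable for per-capita rates
print(world_total)
print(world_mean)
```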



What about specific countries?
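
To look at specific countries, select their columns. Putting the world average next to them makes the comparison easier. The country names and values below are placeholders.

```python
import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {
        "Country A": [120, 110, 90],
        "Country B": [50, 55, 60],
        "Country C": [100, 115, 95],
    },
    index=pd.Index([1990, 1991, 1992], name="year"),
)

selected = cases_df[["Country A", "Country C"]]
comparison = selected.assign(world_mean=cases_df.mean(axis=1))
print(comparison.round(1))
```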



Let us think about outlier countries-

Proportion of countries that are outliers-

Filter the data frame-
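
The outlier rule used in the post is not shown, so this sketch assumes the common box-plot rule: a value is an outlier if it lies above Q3 + 1.5 × IQR for that year. The data is made up.

```python
import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {
        "Country A": [100, 90, 80],
        "Country B": [110, 95, 85],
        "Country C": [105, 100, 90],
        "Country D": [95, 92, 88],
        "Country E": [900, 850, 800],
    },
    index=pd.Index([1990, 1991, 1992], name="year"),
)

# Quartiles across countries, computed separately for every year.
q1 = cases_df.quantile(0.25, axis=1)
q3 = cases_df.quantile(0.75, axis=1)
upper_fence = q3 + 1.5 * (q3 - q1)

is_outlier = cases_df.gt(upper_fence, axis=0)  # True where a value exceeds its year's fence
outlier_share = is_outlier.mean(axis=1)        # proportion of outlier countries per year

# Keep only the countries that are an outlier in at least one year.
outlier_countries = cases_df.loc[:, is_outlier.any(axis=0)]
print(outlier_share)
print(outlier_countries)
```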




What do you infer from the above dataset? Can we somehow combine all of that information?

Compare this with the rest of the world-
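
A sketch of such a comparison: average the outlier countries and the remaining countries separately for each year. The list `outlier_names` is assumed to be already known, and the data is made up.

```python
import pandas as pd

# Made-up values for illustration only.
cases_df = pd.DataFrame(
    {
        "Country A": [100, 90, 80],
        "Country B": [110, 95, 85],
        "Country E": [900, 850, 800],
    },
    index=pd.Index([1990, 1991, 1992], name="year"),
)
outlier_names = ["Country E"]  # assumed result of an earlier outlier check

comparison = pd.DataFrame(
    {
        "outliers": cases_df[outlier_names].mean(axis=1),
        "rest_of_world": cases_df.drop(columns=outlier_names).mean(axis=1),
    }
)
print(comparison)
```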




What about percentage change?
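
Here the interesting number is the overall change from the first to the last year for each group. This is a minimal sketch on made-up group averages; the column names are my own.

```python
import pandas as pd

# Made-up group averages for illustration only.
comparison = pd.DataFrame(
    {"outliers": [900, 850, 810], "rest_of_world": [100, 90, 80]},
    index=pd.Index([1990, 1991, 1992], name="year"),
)

# Percentage change between the first and the last year, per group.
total_change = (comparison.iloc[-1] / comparison.iloc[0] - 1) * 100
print(total_change.round(1))
```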


Let's see TB cases in China-

Hope you enjoyed today's learning.
If you are reading and practicing these learnings, then no doubt you are a future data scientist.
Last but not least, as a data scientist- ASK THE RIGHT QUESTION!
