Python Advanced- Visualizing the Titanic Disaster

Another post starts with you beautiful people!
Today we will work on a famous dataset: the Titanic dataset from Kaggle.
This dataset gives details of the passengers aboard the Titanic, along with a column recording whether each passenger survived. Those who survived are represented as “1” while those who did not survive are represented as “0”.

The columns in the dataset are as follows:
PassengerId: Passenger Identity
Survived: Whether passenger survived or not
Pclass: Class of ticket
Name: Name of passenger
Sex: Sex of passenger (Male or Female)
Age: Age of passenger
SibSp: Number of siblings and/or spouses travelling with the passenger
Parch: Number of parents and/or children travelling with the passenger
Ticket: Ticket number
Fare: Price of ticket
Cabin: Cabin number

Let's start with some hands-on work.

Let's generate descriptive statistics:
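As a minimal sketch of this step, pandas' describe() can produce the summary. The tiny DataFrame below is made-up sample data (not the real Titanic rows), and the file name train.csv in the comment is an assumption about how you saved the Kaggle file.

```python
import pandas as pd

# Tiny made-up sample with a few of the Titanic columns (NOT the real data).
# With the Kaggle file you would instead load it, for example:
#   titanic_df = pd.read_csv("train.csv")   # file name is an assumption
titanic_df = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Age": [22.0, 38.0, None, 35.0],
    "Fare": [7.25, 71.28, 7.92, 13.0],
})

# describe() reports count, mean, std, min, quartiles and max for each
# numeric column; missing values (NaN) are left out of a column's count.
summary = titanic_df.describe()
print(summary)
```

Notice that the count for Age is lower than for the other columns, which is a quick way to spot missing values.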

Note: if you see the error ImportError: No module named 'seaborn', it means you need to install the seaborn library by running the command pip install seaborn in the command prompt.

Let's identify the children in the dataset:
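One possible way to do this is to add a new column that says "child" for young passengers and keeps the Sex value for everyone else. The column name person, the cut-off age of 16, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd

CHILD_AGE_LIMIT = 16  # assumed cut-off; adjust to your own definition of a child

# Made-up sample rows (NOT the real data).
titanic_df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 4.0, None, 10.0],
})

# True only where Age is known and below the limit (NaN compares as False).
is_child = titanic_df["Age"] < CHILD_AGE_LIMIT

# Keep the Sex value for non-children, otherwise label the passenger "child".
titanic_df["person"] = titanic_df["Sex"].where(~is_child, "child")
print(titanic_df)
```

Passengers with a missing Age keep their Sex label here, because there is no way to tell whether they were children.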

Let's count each category of person (male, female, child):
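A short sketch of the counting step, assuming the categories live in a column called person (a hypothetical name) and using made-up values:

```python
import pandas as pd

# Made-up labels standing in for a "person" column (NOT the real data).
person = pd.Series(
    ["male", "female", "child", "male", "female", "male"], name="person"
)

# value_counts() counts each distinct label, most frequent first.
person_counts = person.value_counts()
print(person_counts)
```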

Now plot the male, female, and child counts for each Pclass:
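Here is a hedged sketch of such a plot. It first builds the table of counts with pandas, then defines a helper that draws it with seaborn's countplot. The helper name plot_person_by_class, the person column, and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Made-up sample rows (NOT the real data).
titanic_df = pd.DataFrame({
    "Pclass": [1, 1, 3, 3, 3],
    "person": ["female", "male", "male", "child", "male"],
})

# Table of counts: one row per ticket class, one column per person category.
class_counts = pd.crosstab(titanic_df["Pclass"], titanic_df["person"])
print(class_counts)


def plot_person_by_class(df):
    """Draw grouped bars of person categories per Pclass (hypothetical helper)."""
    # Imported here so the counting above works even without seaborn installed.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.countplot(data=df, x="Pclass", hue="person")
    ax.set_title("Passengers by class and person category")
    plt.show()
```

Call plot_person_by_class(titanic_df) in your notebook to see the chart; the crosstab gives you the same numbers as a table.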

People who survived and who didn't:

How many males and females survived:
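Both questions can be answered with a few lines of pandas. This is only a sketch on made-up rows, so the numbers it prints say nothing about the real dataset.

```python
import pandas as pd

# Made-up sample rows (NOT the real data).
titanic_df = pd.DataFrame({
    "Sex": ["female", "female", "female", "male", "male", "male"],
    "Survived": [1, 1, 0, 0, 1, 0],
})

# Overall: how many passengers survived (1) and how many did not (0).
overall = titanic_df["Survived"].value_counts()

# Per sex: rows are Sex, columns are the Survived values 0 and 1.
survival_by_sex = pd.crosstab(titanic_df["Sex"], titanic_df["Survived"])

# Survived is 0/1, so summing it counts the survivors in each group.
survivors_by_sex = titanic_df.groupby("Sex")["Survived"].sum()

print(overall)
print(survival_by_sex)
print(survivors_by_sex)
```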

Result: More females survived than males.

Let's compute the pairwise correlation of columns, excluding NA/null values:
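A minimal sketch of this step uses DataFrame.corr(), which skips missing values pair by pair. Selecting the numeric columns first is my own choice to keep text columns out of the calculation, and the rows are made up.

```python
import pandas as pd

# Made-up sample rows (NOT the real data).
titanic_df = pd.DataFrame({
    "Sex": ["female", "male", "male", "female"],
    "Pclass": [1, 2, 3, 3],
    "Age": [38.0, None, 22.0, 30.0],
    "Fare": [80.0, 30.0, 8.0, 7.0],
})

# Keep only numeric columns, since correlation is undefined for text.
numeric_df = titanic_df.select_dtypes(include="number")

# Pearson correlation for every pair of columns; rows where either value
# of a pair is NaN are ignored for that pair only.
corr = numeric_df.corr()
print(corr)
```

To view the matrix as a picture, you can pass it to seaborn's heatmap, for example sns.heatmap(corr, annot=True).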

See how, with the help of the visualizations above, you can easily turn a dataset into a story.
Try it in your notebook and share your thoughts in the comments.
