
Posts

Showing posts from August, 2017

Basic concept to Visualization

Another post starts with you beautiful people! Today we will go through some plotting concepts so that you can draw clearer plots. How to create multiple plots on a single set of axes- It is time now to put together some of what we have learned and combine line plots on a common set of axes. The data set here comes from records of undergraduate degrees awarded to women in a variety of fields from 1970 to 2011. You can download this dataset from here- download dataset. We can compare trends in degrees most easily by viewing two curves on the same set of axes, so we will issue two plt.plot() commands to draw line plots of different colors on the same axes. Here, year is plotted on the x-axis, while physical_sciences and computer_science are plotted on the y-axis. Result :- Well done! It looks like, for the last 25 years or so, more women have been awarded undergraduate degrees in the Physical Sciences than in Computer Science. How to use axes()- Rather than overlaying li
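For the multiple-plots step described above, a minimal sketch of the two plt.plot() calls might look like this. The numbers are made-up placeholders rather than values from the real dataset, and the colors, labels and output file name are assumptions chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder values for illustration only; NOT taken from the real dataset.
year = [1970, 1980, 1990, 2000, 2010]
physical_sciences = [10.0, 20.0, 30.0, 40.0, 50.0]
computer_science = [5.0, 15.0, 25.0, 20.0, 15.0]

# Two plt.plot() calls in a row draw on the same set of axes.
plt.plot(year, physical_sciences, color="blue", label="Physical Sciences")
plt.plot(year, computer_science, color="red", label="Computer Science")

plt.xlabel("Year")
plt.ylabel("Degrees awarded to women")
plt.legend()

# Hypothetical output file name; use plt.show() instead in an interactive session.
plt.savefig("degrees.png")
```

Because both calls share the current axes, the two curves are overlaid and the legend tells them apart.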

Exploring The File Import

Another post starts with you beautiful people! Today we will explore various file import options in Python, which I learned from a great learning site- DataCamp. In order to import data into Python, we should first have an idea of what files are in our working directory. We will work through step-by-step examples as given below-

Importing entire text files- In this exercise, we'll be working with the file mobydick.txt [download here]. It is a text file that contains the opening sentences of Moby Dick, one of the great American novels! Here you'll get experience opening a text file, printing its contents to the shell and, finally, closing it-

    # Open a file: file
    file = open('mobydick.txt', mode='r')
    # Print it
    print(file.read())
    # Check whether file is closed
    print(file.closed)
    # Close file
    file.close()
    # Check whether file is closed
    print(file.closed)

Importing text files line by line- For large files, we may not want to print all of th
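As a hedged sketch of reading a file line by line, the example below uses a with statement and readline(). It writes its own small stand-in file (a hypothetical sample.txt with placeholder text) so that it runs without mobydick.txt; with the real file you would only change the path.

```python
import os
import tempfile

# Create a small stand-in file so the sketch runs anywhere.
# The file name and its contents are placeholders, not mobydick.txt.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, mode="w") as f:
    f.write("first line\nsecond line\nthird line\n")

# The with statement closes the file automatically when the block ends.
with open(path, mode="r") as file:
    first = file.readline()   # reads only the first line
    second = file.readline()  # reads the next line
    print(first, end="")
    print(second, end="")

# The file is already closed here, without an explicit file.close().
print(file.closed)
```

Reading with readline() keeps only one line in memory at a time, which is the reason to prefer it over read() for large files.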

Central Limit Theorem and Hypothesis Testing

Another post starts with you beautiful people! Today we will learn about an important topic in statistics. Statistical inference is the process of deducing properties of an underlying distribution by analysis of data. Inferential statistical analysis infers properties about a population: this includes testing hypotheses and deriving estimates. Statistics are helpful in analyzing most collections of data. Hypothesis testing can justify conclusions even when no scientific theory exists. You can find more about this here- tell me more about Statistical_hypothesis_testing. Our case study here applies statistical inference to the average experience of a Data Science Specialization (DSS) batch taught at a leading university. We will study how accurately we can characterize the actual average participant experience (population mean) from samples of the data (sample mean). We can quantify the certainty of the outcome through confidence intervals. Let's plot the d
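To make the idea of estimating a population mean from a sample concrete, here is a minimal sketch using only the standard library. The population is synthetic (made-up values, not the DSS batch data), and the sample size of 100 and the normal-approximation value z = 1.96 for a roughly 95% interval are assumptions for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic "years of experience" population; made-up, NOT the DSS batch data.
population = [random.expovariate(1 / 5.0) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw one random sample and compute its mean and standard error.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval, relying on the central limit theorem.
z = 1.96
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print("population mean:", round(population_mean, 2))
print("sample mean:", round(sample_mean, 2))
print("approx. 95% CI:", (round(ci_low, 2), round(ci_high, 2)))
```

Even though the synthetic population is skewed, the central limit theorem says the sample mean is approximately normally distributed for a reasonably large sample, which is what justifies the z-based interval.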

Exploratory Data Analysis using Python

Another post starts with you beautiful people! In my previous posts and pages we have learnt the basic and advanced topics of Python required in Data Science. Now it's time to do EDA, sounds interesting! Exploratory Data Analysis (EDA) is a crucial step of the data analytics process. It involves exploring the data and identifying its important features, as well as asking interesting questions of the data by using statistical and visualization tools studied in earlier classes, such as descriptive statistics and basic plotting. In this post we will use a dataset of TB data on countries and their territories. Specifically, we will be using data files for TB deaths, the spread of TB, and the number of new cases of TB to answer some important questions. Since we are going to perform some Exploratory Data Analysis on our TB dataset, these are the questions we want to answer: Which are the countries with the highest infectious TB incidence? What is the general world