What is Descriptive statistics? | Data science -TechnicalTeacher

Descriptive statistics using the Pandas and SciPy libraries
In practice, data science is most useful for generating predictions and for exploring data through visualization. Python is one of the most popular languages for analyzing data.
Here, we discuss descriptive statistics in detail, with the help of the Pandas and SciPy libraries in Python.
First, you will learn about these two libraries, which we use in our Python project. Afterwards, we will discuss what descriptive statistics is.
Here is a short note on each library.


Pandas is a Python library used for data analysis.


SciPy is a library of scientific computing routines (including linear algebra and statistics) that builds on NumPy.
Now, we will learn about descriptive statistics.

What is Descriptive statistics?

Descriptive statistics can be used to compute statistical measures of one or more samples. In other words, it describes and summarizes data in a meaningful way.
Descriptive statistics is divided into two parts:
1-It describes the values of the observations in a variable.
Examples are Sum, Median, Mean, Max etc.
2-It also describes the spread of a variable.
Examples are Standard Deviation, Variance, Counts, Quartiles etc.
We can use descriptive statistics to analyze data. For this, we can use Python in a Jupyter notebook.
You can follow the steps below when working with a Jupyter notebook in Python.
Step 1-Import libraries like NumPy, Pandas and SciPy.
        import numpy as np
        import pandas as pd
        from pandas import Series, DataFrame
        from scipy import stats
Step 2-Create an Excel file and save it as a .csv file. Here, I have created a Fruits.csv file and kept it inside the folder YoursTechnicalTeacher. The variable address holds the path to that file.
Fruits = pd.read_csv(address, index_col='Order Date', encoding='cp1252', parse_dates=True)

Important points to remember-

1-The read_csv() method reads the data from a csv file into a data frame.
2-The head() method shows the first five rows of the data.
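
The two points above can be tried without Fruits.csv. The sketch below reads CSV text from memory instead of from a file. The column names Apples and Bananas and all the values are made up for illustration; only the Order Date column comes from the call shown in Step 2.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of Fruits.csv (made-up columns and values)
csv_text = """Order Date,Apples,Bananas
2021-01-01,10,4
2021-01-02,7,9
2021-01-03,12,5
"""

# index_col makes 'Order Date' the row index;
# parse_dates=True asks pandas to parse that index as dates
fruits = pd.read_csv(io.StringIO(csv_text), index_col="Order Date", parse_dates=True)

# head() shows the first five rows (here the data has only three)
print(fruits.head())
```

With a real file, the first argument would be the file path instead of the in-memory text.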
[Image: library & csv file]
Step 3-Now we will start with the values of the observations in a variable.

i-Sum– the sum is found by adding all the values.

For example,
we have values a,b,c,d,e,f
Then sum = a+b+c+d+e+f

By default, pandas adds the values down each column. When we need the sum row-wise, we can pass axis=1 to sum().
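
As a minimal sketch of column-wise and row-wise sums, assuming a small made-up data frame (the column names and values are not from Fruits.csv):

```python
import pandas as pd

# Made-up data for illustration
fruits = pd.DataFrame({"Apples": [10, 7, 12], "Bananas": [4, 9, 5]})

column_totals = fruits.sum()      # default axis=0: one total per column
row_totals = fruits.sum(axis=1)   # axis=1: one total per row

print(column_totals)  # Apples 29, Bananas 18
print(row_totals)     # 14, 16, 17
```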



ii-Median– it gives you the middle value of a dataset once the values are sorted.

If the number of values (n) is odd, then median = ((n+1)/2)th term.
If the number of values (n) is even, then median = average of the (n/2)th and ((n+2)/2)th terms.
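
The two rules can be checked against pandas with a short sketch. The helper median_by_rule is a hypothetical name written only for this illustration, and the values are made up.

```python
import pandas as pd

def median_by_rule(values):
    """Apply the odd/even rule above to the sorted values (terms are 1-based)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # odd n: the ((n+1)/2)th term
        return ordered[(n + 1) // 2 - 1]
    # even n: average of the (n/2)th and ((n+2)/2)th terms
    return (ordered[n // 2 - 1] + ordered[(n + 2) // 2 - 1]) / 2

odd = pd.Series([7, 1, 5])       # sorted: 1, 5, 7
even = pd.Series([7, 1, 5, 3])   # sorted: 1, 3, 5, 7

print(median_by_rule(odd), odd.median())    # both give 5
print(median_by_rule(even), even.median())  # both give 4.0
```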

iii-Mean– it gives you the average value of a dataset.

For example,
we have values a,b,c,d,e,f
Then mean = (a+b+c+d+e+f)/6
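
The same calculation with six made-up numbers, done by hand and with the pandas mean() method:

```python
import pandas as pd

# Six made-up values standing in for a, b, c, d, e, f
values = pd.Series([2, 4, 6, 8, 10, 12])

manual_mean = values.sum() / len(values)   # (2+4+6+8+10+12)/6 = 42/6
pandas_mean = values.mean()

print(manual_mean, pandas_mean)  # both give 7.0
```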

[Image: median and mean]


iv-Max– it gives you the maximum value of a dataset.

idxmax() method– it gives you the index label of the row that contains the maximum value.
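
A small sketch of both methods follows. pandas names the index lookup idxmax(). The data frame, its column names and its row labels are made up for illustration.

```python
import pandas as pd

# Made-up data with readable row labels
fruits = pd.DataFrame(
    {"Apples": [10, 7, 12], "Bananas": [4, 9, 5]},
    index=["Mon", "Tue", "Wed"],
)

max_values = fruits.max()      # largest value in each column
max_labels = fruits.idxmax()   # row label where each column's maximum occurs

print(max_values)  # Apples 12, Bananas 9
print(max_labels)  # Apples Wed, Bananas Tue
```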
Step 4-Now, we will work on the spread of a variable.
  • Variance– the average of the squared differences between each value and the mean.
  • Standard deviation– the square root of the variance.
  • Count– the number of occurrences of each item in a dataset. It also shows the unique values.
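
The three measures above can be sketched with made-up values. One point to be aware of: the pandas var() and std() methods divide by n-1 by default (the sample versions), so ddof=0 is passed here to match the "average of squared differences" definition. Using value_counts() for the per-item counts is an assumption about which method is meant.

```python
import pandas as pd

# Made-up values with mean 5
values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

population_variance = values.var(ddof=0)   # average squared difference from the mean
sample_variance = values.var()             # pandas default, divides by n-1
population_std = values.std(ddof=0)        # square root of the population variance

n_values = values.count()                  # number of non-missing values
occurrences = values.value_counts()        # how often each unique value occurs
unique_values = values.unique()            # the distinct values

print(population_variance, population_std)  # 4.0 and 2.0
print(occurrences)
```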
Note: One important method is describe(), which reports the descriptive statistics for every variable in a data set all at one time.
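
As a sketch of getting all the statistics at once, the pandas describe() method summarizes every numeric column, and SciPy offers stats.describe() for a single array of values. The data frame below is made up for illustration.

```python
import pandas as pd
from scipy import stats

# Made-up data for illustration
fruits = pd.DataFrame({"Apples": [10, 7, 12], "Bananas": [4, 9, 5]})

# pandas: count, mean, std, min, quartiles and max for each numeric column
summary = fruits.describe()
print(summary)

# SciPy: number of observations, min/max, mean, variance, skewness, kurtosis
apples = stats.describe(fruits["Apples"])
print(apples.nobs, apples.minmax, apples.mean, apples.variance)
```

Both calls report the sample variance or standard deviation (dividing by n-1) by default.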
I hope you have understood all the steps in descriptive statistics. They will help you analyze data in data science, since these are also data analysis methods.
Thank you and happy coding!!
