# Statistics – Stepping Into the Analyst's Shoes!!

Okay!!! Long story short: I want to cover as many of the concepts a Project Manager will need as I can, in the shortest time span possible. So we'll dive right into the definition of Statistics to start with!

## What is Statistics?

Some definitions of Statistics:

• Statistics is concerned with collecting and processing data, summarizing information in the form of tables, graphs, charts etc.
• Statistics also involves estimating parameters, testing hypotheses, and designing experiments in such a way that valid inferences can be drawn from empirical evidence.
• Numbers calculated to describe important features of the data are also referred to as statistics. For example, (i) the proportion of females and (ii) the average age of unemployed persons in a sample of residents of a town are statistics.

## But then What is Data?

In our context, data refers to a collection of information about the unit(s) of interest to us. The unit(s) may be customer(s), part(s) manufactured, or anything else that we want to analyze and understand.

## How do you visualize Data?

Given below are the marks scored by a group of 15 participants in a training exercise conducted for your organization (these are fictional numbers on a scale of 0 to 100):

39,49,46,52,66,47,53,48,51,46,48,49,53,51,56

Try to gain an understanding about how this group performed as a whole, just by looking at this data.

## How do you visualize this Data Further?

Let’s first sort the data in ascending order:

39,46,46,47,48,48,49,49,51,51,52,53,53,56,66

Makes sense? A little more than when we looked at the raw data, I guess.

We can now clearly see the minimum value, the maximum value and, to an extent, get an idea of the middle value of the data. Now, let’s start bucketing the observations into intervals of 5, starting from the minimum value and covering the maximum value.
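The bucketing step can be sketched in a few lines of Python. This is only an illustration: the bucket edges (39-43, 44-48, and so on) are an assumption that follows from starting at the minimum with a width of 5, and the names `marks`, `width` and `buckets` are made up for this example.

```python
from collections import Counter

# Marks of the 15 participants, as listed in the article
marks = [39, 49, 46, 52, 66, 47, 53, 48, 51, 46, 48, 49, 53, 51, 56]

width = 5          # size of each interval
low = min(marks)   # first bucket starts at the minimum value

# Map every mark to the index of the bucket it falls into
counts = Counter((m - low) // width for m in marks)

# Number of buckets needed to cover the maximum value
n_bins = (max(marks) - low) // width + 1

buckets = []
for i in range(n_bins):
    start = low + i * width
    end = start + width - 1
    buckets.append((start, end, counts.get(i, 0)))
    # Print a simple text bar for each bucket
    print(f"{start}-{end}: {'#' * counts.get(i, 0)} ({counts.get(i, 0)})")
```

Choosing different bucket edges changes the individual counts, so it is worth trying one or two alignments before reading too much into the shape.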

## Can you help me visualize this Data Further?

Let’s use some Charting Techniques.

12 out of 15 observations are in the range 46 - 55. Now this gives me some perspective.
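The "12 out of 15" figure can be checked directly from the marks. A minimal sketch, assuming the range 46 - 55 is inclusive at both ends; the variable names are illustrative only.

```python
# Marks of the 15 participants, as listed in the article
marks = [39, 49, 46, 52, 66, 47, 53, 48, 51, 46, 48, 49, 53, 51, 56]

# Count observations in the range 46-55 (assumed inclusive at both ends)
in_range = sum(1 for m in marks if 46 <= m <= 55)
share = in_range / len(marks)

print(f"{in_range} of {len(marks)} observations ({share:.0%}) lie in 46-55")
```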

## You know what!!! I can visualize this Data further……

## Basic Measures – Central Tendency

One of the most basic measures used to summarize data is the measure of “Central Tendency”.

A Central Tendency measure gives an idea of a particular value around which the entire set of data is grouped.

Different measures of Central Tendency are available, and the right one depends on the type of data and sometimes on the data itself.

The Central Tendency measures most often used are:

1) Simple Arithmetic Mean

2) Weighted Arithmetic Mean

3) Median

4) Mode

The Simple Arithmetic Mean may be used in some organizations to find a representative target for their teams. The Weighted Arithmetic Mean is used in some organizations to calculate the expected estimate in PERT calculations, where it is a weighted average of the optimistic, most likely and pessimistic estimates.

## The Shoe Shop Story – Mode As a Measure in Action

An analyst joined a Shoes Factory Outlet, and part of his job description was to manage the inventory so that there was a regular supply of shoes for the customers.

While analyzing the historical data, he looked at Measures of Central Tendency.

The Analyst knew that if he identified the Mode, which is the value that occurs the most number of times in the historical data, his inventory problem would be more than half solved.
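As a minimal sketch of what the analyst is looking for, the snippet below finds the most frequent value in a list of shoe sizes. The `sizes_sold` list is made up purely for illustration and is not the outlet's data. Python's `statistics.multimode` is used here because it returns every value that ties for most frequent, rather than just one of them.

```python
from statistics import multimode

# Hypothetical shoe sizes sold; invented numbers for illustration only
sizes_sold = [6, 7, 8, 7, 9, 7, 8, 7, 10, 6, 7, 8]

# multimode returns a list of all values tied for the highest frequency
modes = multimode(sizes_sold)

print("Most frequently sold size(s):", modes)
```

If the list comes back with more than one size, the data has several modes and the stocking decision needs a closer look.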

Upon analysis, he identified 7 as the mode and interpreted that this shoe size drives the maximum sales for the Shoes Factory Outlet.

## Statistical Case Study

A Project Manager was tasked with analyzing and selecting an appropriate target for an Operations function.

When he analyzed the data, he found that it was skewed and not normally distributed.

While it was normal practice to baseline targets around the mean, in this case that practice would have meant selecting a non-representative target.

Since the data was skewed, the Project Manager proposed that the median be used as a representative target.

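## What is the Median and how is it different from the Mean?

The Mean is the sum of all values divided by the number of values, while the Median is the middle value once the data is sorted, so a few extreme values pull the Mean but barely move the Median. A Project Manager may decide to baseline the Target Metrics at the Mean or the Median. It really depends on being able to visualize the data.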
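To see the difference on the marks from earlier, here is a small sketch using Python's `statistics` module. The second data set, `skewed`, is hypothetical: it swaps the top score of 66 for 96 purely to show how one extreme value affects each measure.

```python
from statistics import mean, median

# Marks of the 15 participants, as listed in the article
marks = [39, 49, 46, 52, 66, 47, 53, 48, 51, 46, 48, 49, 53, 51, 56]

mean_marks = mean(marks)      # sum of values / number of values
median_marks = median(marks)  # middle value of the sorted data

# Hypothetical variation: replace the top score 66 with 96
skewed = [96 if m == 66 else m for m in marks]
mean_skewed = mean(skewed)
median_skewed = median(skewed)

print(f"Original: mean={mean_marks:.2f}, median={median_marks}")
print(f"Skewed:   mean={mean_skewed:.2f}, median={median_skewed}")
```

The mean moves up with the extreme value while the median stays where it was, which is why the median is often the safer baseline for skewed data.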

## Basic Measures – Dispersion

Another basic measure used to summarize data is the measure of “Dispersion”.

Dispersion measures indicate the spread of data points around the central tendency measure (say mean), and give an idea of variability in the data.

There are different ways to measure Dispersion as well, and we need to use the appropriate measure, given a situation.

Some popular measures of Dispersion are:

1) Range

2) Standard Deviation

3) Inter-Quartile Range

4) Coefficient of Variation
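The four measures listed above can be sketched on the same marks with Python's `statistics` module. Two choices here are assumptions rather than anything the article specifies: `stdev` is the sample standard deviation (dividing by n - 1), and `quantiles` uses the module's default quartile method. Other tools may use slightly different quartile conventions and so report a different Inter-Quartile Range.

```python
from statistics import mean, stdev, quantiles

# Marks of the 15 participants, as listed in the article
marks = [39, 49, 46, 52, 66, 47, 53, 48, 51, 46, 48, 49, 53, 51, 56]

# 1) Range: distance between the largest and smallest value
data_range = max(marks) - min(marks)

# 2) Standard Deviation (sample version, divides by n - 1)
sd = stdev(marks)

# 3) Inter-Quartile Range: spread of the middle half of the data
q1, q2, q3 = quantiles(marks, n=4)
iqr = q3 - q1

# 4) Coefficient of Variation: standard deviation relative to the mean
cv = sd / mean(marks)

print(f"Range={data_range}, SD={sd:.2f}, IQR={iqr}, CV={cv:.1%}")
```

The Range reacts to the single lowest and highest marks, while the Inter-Quartile Range ignores them, so comparing the two gives a quick hint about outliers.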