What is the Empirical Rule in Statistics?
The Empirical Rule in statistics states that almost all (99.7%) of the observations in a normal distribution lie within 3 Standard Deviations from the Mean. This is a very important rule and helps in forecasting.
The formula shows the predicted percentage of observations that will lie within each Standard Deviation from the Mean.
The Rule says that:
- 68% of the observations will lie within +/- 1 Standard Deviation from the mean
- 95% of the observations will lie within +/- 2 Standard Deviations from the mean
- 99.7% of the observations will lie within +/- 3 Standard Deviations from the mean
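The three percentages in the Rule are rounded values of the area under a normal curve. As a minimal sketch, assuming only Python's standard library, they can be recomputed from the error function; the helper name `share_within` is illustrative and not part of the original article.

```python
import math

def share_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    # For a normally distributed X: P(|X - mean| <= k * sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {share_within(k):.2%}")
# within +/- 1 SD: 68.27%
# within +/- 2 SD: 95.45%
# within +/- 3 SD: 99.73%
```

The Rule rounds these to 68%, 95%, and 99.7%.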
How to Use?
The Rule is used to forecast the trend of a data set. When the data set is extensive and it is impractical to study the entire population, the Empirical Rule can be applied to a sample to estimate how the data in the population will behave. Suppose you are asked to find the average salary of all the accountants in the US. That is a difficult task because the population is enormous. In that case, you can select, say, 90 observations randomly from the entire population.
So now you have 90 salaries. Find the Mean and Standard Deviation of these observations. If the observations follow a normal distribution, the Rule can be applied, and the salaries of all accountants in the US can be estimated.
Say the Mean salary of the sample comes out to be $90,000 and the Standard Deviation is $5,000. Then an estimated 68% of all the accountants in the US are being paid within +/- 1 Standard Deviation of the Mean, that is, $90,000 +/- (1*$5,000). So the range is $85,000 to $95,000.
If we spread a bit more, then 95% of all the accountants in the US are being paid in the range of Mean +/- 2 Standard Deviations, that is, $90,000 +/- (2*$5,000). So the range is $80,000 to $100,000.
In a broader range, 99.7% of all accountants are drawing salaries in the range of Mean +/- 3 Standard Deviations, that is, $90,000 +/- (3*$5,000). The range is $75,000 to $105,000.
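The three salary ranges follow the same arithmetic, Mean +/- k Standard Deviations. A short sketch of that calculation, using the sample figures from this example; the function name `empirical_ranges` is a hypothetical helper chosen for illustration.

```python
def empirical_ranges(mean, sd):
    """Return the (low, high) range for each Empirical Rule band around the mean."""
    # Coverage labels are the rounded percentages the Rule quotes for k = 1, 2, 3.
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}

# Sample Mean and Standard Deviation from the accountant example.
ranges = empirical_ranges(90_000, 5_000)
for label, (low, high) in ranges.items():
    print(f"{label}: ${low:,} to ${high:,}")
# 68%: $85,000 to $95,000
# 95%: $80,000 to $100,000
# 99.7%: $75,000 to $105,000
```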
You can see that an estimate about the population can be made without studying the entire population. Someone planning to work as an accountant in the US can expect, with about 99.7% likelihood under these assumptions, a salary between $75,000 and $105,000.
This kind of estimation helps to ease work and make forecasts regarding the future.
Empirical Rule Examples
Mr. X is trying to find the average number of years a person survives after retirement, taking the retirement age to be 60. If the Mean survival time of 50 random observations is 20 years and the SD is 3 years, find the probability that a person will draw a pension for more than 23 years.
The Empirical Rule states that 68% of the observations will lie within 1 Standard Deviation from the Mean. Here the Mean of the observations is 20.
68% of the observations will lie within 20 +/- 1 (Standard Deviation), which is 20 +/- 3. So the range is 17 to 23.
There is a 68% chance that the number of years a person survives after retirement lies between 17 and 23. The percentage lying outside this range is (100 – 68) = 32%. Because a normal distribution is symmetric, this 32% is split equally between the two sides, which means a 16% chance that the number of years will be below 17 and a 16% chance that it will be greater than 23.
So the probability that the person will draw a pension for more than 23 years is 16%.
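The 16% figure can be reproduced in a few lines and compared with the exact normal tail area. This is a sketch that assumes survival years are normally distributed with the sample Mean and SD given above; the variable names are illustrative.

```python
import math

mean_years, sd_years = 20, 3
threshold = 23  # one Standard Deviation above the Mean

# Empirical Rule: 68% lies within 1 SD, so half of the remaining 32% is in each tail.
rule_tail = (1 - 0.68) / 2

# Exact upper-tail probability of a normal distribution, for comparison.
z = (threshold - mean_years) / sd_years
exact_tail = 0.5 * math.erfc(z / math.sqrt(2))

print(f"Empirical Rule estimate: {rule_tail:.1%}")   # 16.0%
print(f"Exact normal tail:       {exact_tail:.1%}")  # 15.9%
```

The small gap comes from the Rule rounding 68.27% down to 68%.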
Empirical Rule vs. Chebyshev’s Theorem
The Empirical Rule is applied to data sets that follow a normal distribution, that is, a bell-shaped one. In a normal distribution, half of the probability lies on each side of the Mean.
If the data set is not normally distributed, then there is another Rule that applies to all types of data sets, which is Chebyshev’s Theorem. It says three things:
- At least 3/4th of all the observations will lie within 2 Standard Deviations from the Mean. This is a guaranteed lower bound rather than an approximation. It means that if there are 100 observations, then at least 75 of them will lie within +/- 2 Standard Deviations from the Mean.
- At least 8/9th of all observations will lie within 3 Standard Deviations from the Mean.
- At least 1 – 1/k^2 of all the observations lie within k Standard Deviations from the Mean. Here k can be any number greater than 1, not only a whole number.
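The general statement 1 – 1/k^2 reproduces the first two bullets for k = 2 and k = 3. A minimal sketch of the bound follows; the function name `chebyshev_lower_bound` is illustrative, and the check on k reflects the fact that the bound is only positive, and so only informative, when k is greater than 1.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of observations within k standard deviations, for any distribution."""
    if k <= 1:
        # For k <= 1 the expression is zero or negative, so it says nothing useful.
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
# k = 1.5: at least 55.6%
# k = 2: at least 75.0%
# k = 3: at least 88.9%
```

For normally distributed data the Empirical Rule gives much tighter figures (95% and 99.7% for k = 2 and 3), which is the price Chebyshev's Theorem pays for making no assumption about the shape of the distribution.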
When to Use?
Data is like gold in the modern world. Huge volumes of data flow from different sources and are used for different approximations and forecasts. If a data set follows a normal distribution, it shows a bell-shaped curve, and the Empirical Rule can be used. It is applied to observations to create an approximation for the population.
Once it is seen that the observations show a normal distribution structure, the Empirical Rule is applied to find several probabilities of the observations. The Rule is extremely useful for many statistical forecasts.
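One rough way to see whether observations show a normal structure is to count how many fall within 1, 2, and 3 Standard Deviations of the sample Mean and compare the shares with 68%, 95%, and 99.7%. The sketch below does this on simulated data; the helper `shares_within_sd`, the seed, and the simulated salaries are assumptions made for illustration, and this is an informal check rather than a formal normality test.

```python
import random
import statistics

def shares_within_sd(data):
    """Share of observations within 1, 2, and 3 sample standard deviations of the sample mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    n = len(data)
    return {k: sum(abs(x - mean) <= k * sd for x in data) / n for k in (1, 2, 3)}

# Simulated, normally distributed salaries (hypothetical data, not real observations).
random.seed(0)
sample = [random.gauss(90_000, 5_000) for _ in range(10_000)]

shares = shares_within_sd(sample)
for k, share in shares.items():
    print(f"within +/- {k} SD: {share:.1%}")
```

Shares close to 68%, 95%, and 99.7% are consistent with a bell-shaped data set; shares far from them suggest the Empirical Rule should not be relied on.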
The Empirical Rule is a statistical concept that helps portray the probability of observations and is very useful when finding an approximation for a huge population. It should always be noted that these are approximations. There is always a chance of outliers that do not fit the distribution. So the findings are not exact, and precautionary measures should be taken when acting on the forecast.