Outlier formula provides a graphical tool to calculate the data which is located outside the given set of distribution which may be inner or outer side depending upon the variables.
What is the Outlier Formula?
An outlier is the data point of the given sample or given observation or in a distribution that shall lie outside the overall pattern. A Commonly used rule that says that a data point will be considered as an outlier if it has more than 1.5 IQR below the first quartile or above the third quartile.
Said differently, low outliers shall lie below Q1-1.5 IQRand high outliers shall lie Q3+1.5IQR
One needs to calculate median, quartiles, including IQR, Q1, and Q3.
The outlier formula is represented as follows,
The Formula for Q1 = ¼ (n + 1)th term The Formula for Q3 = ¾ (n + 1)th term The Formula for Q2 = Q3 – Q1
Step by Step Calculation of Outlier
The below steps needs to be followed to calculate the Outlier.
- Step 1: First calculate the quartiles i.e., Q1, Q2 and interquartile
- Step 2: Now calculate the value Q2 * 1.5
- Step 3: Now Subtract Q1 value from the value calculated in Step2
- Step 4: Here Add Q3 with the value calculated in step2
- Step 5: Create the range of the values calculated in Step3 and Step4
- Step 6: Arrange the data in ascending order
- Step 7: Check whether there any values that lie below or higher than the range created in Step5.
Example
Consider a data set of the following numbers: 10, 2, 4, 7, 8, 5, 11, 3, 12. You are required to calculate all the Outliers.
Solution:
First, we need to arrange data in ascending order to find the median, which will be Q2 for us.
2, 3, 4, 5, 7, 8, 10, 11, 12
Now since the number of observations is odd, which is 9, the median would lie on a 5th position, which is 7, and the same will be Q2 for this example.
4.9 (1,067 ratings) 250+ Courses | 40+ Projects | 1000+ Hours | Full Lifetime Access | Certificate of Completion
Therefore, the calculation of Q1 is as follows –
Q1 = ¼ (9 + 1)
= ¼ (10)
Q1 will be –
Q1 = 2.5 term
This means that Q1 is the average of the 2nd and 3rd position of the observations, which is 3 & 4 here, and an average of the same is (3+4)/2 = 3.5
Therefore, the calculation of Q3 is as follows –
Q3 = ¾ (9 + 1)
= ¾ (10)
Q3 will be –
Q3 = 7.5 term
This means that Q3 is the average of the 7th and 8th position of the observations, which is 10 & 11 here, and an average of the same is (10+11)/2 = 10.5
Now, low outliers shall lie below Q1-1.5IQR, and high outliers shall lie Q3+1.5IQR
So, the values are 3.5 – (1.5*7) = -7 and higher range is 10.5 + (1.5*7) = 110.25.
Since there are no observations that lie either above or lower than 110.25 and -7, we don’t have any outliers in this sample.
Example of Outlier Formula in Excel (with Excel Template)
Creative coaching classes are considering rewarding students who are in the top 25% However, they want to avoid any outliers. The data is for the 25 students. Use the Outlier equation to determine if there is an outlier?
Solution:
Below is given data to calculate the outlier.
The number of observations here is 25, and our first step would be converting the above raw data in ascending order.
Median will be –
The median value = ½ (n+1)
= ½ = ½ (26)
= 13th term
The Q2 or median is 68.00
Which is 50% of the population.
Q1 will be –
Q1 = ¼ (n+1)th term
= ¼ (25+1)
= ¼ (26)
= 6.5th term, which is equivalent to 7th term
The Q1 is 56.00, which is bottom 25%
Q3 will be –
Finally, Q3 = ¾ (n+1)th term
= ¾ (26)
= 19.50 term
Here the average needs to be taken, which is of 19th and 20th terms which are 77 and 77 and the average of same is (77+77)/2 = 77.00
The Q3 is 77, which is the top 25%
Low Range
Now, low outliers shall lie below Q1-1.5IQR, and high outliers shall lie Q3+1.5IQR
High Range –
So, the values are 56 – (1.5*68) = -46 and higher range is 77 + (1.5*68) = 179.
There are no outliers.
Relevance and Uses
Outliers formula is very important to know as there could be data that would get skewed by such value. Take an example of observations 2, 4, 6, 101, and now if somebody takes an average of these values, it will be 28.25, but 75% of the observations lie below 7, and hence one would be an incorrect decision regarding observations of this sample.
It can be noticed here that 101 clearly appears to outline, and if this is removed, then the average would be 4, which does say about the values or observations that they lie within the range of 4. Hence it is very important to conduct this calculation to avoid any misusage leading information of the data. These are widely used by statisticians around the world whenever they are conducting any research.
Recommended Articles
This has been a guide to Outlier Formula. Here we discuss step by step calculation of Outlier along with some practical examples in excel and downloadable excel template. You can learn more about excel modeling from the following articles –
- What is Quartile Deviation?
- QUARTILE Function in Excel
- Frequency Excel Formula
- Find Mode in Excel
- Interest on Loan
- 250+ Courses
- 40+ Projects
- 1000+ Hours
- Full Lifetime Access
- Certificate of Completion