Excel Data Analysis ToolPak
The Data Analysis ToolPak is an Excel add-in that provides tools for statistical data analysis and other important calculations. It is not enabled by default. To enable it, go to the ‘File’ tab, open ‘Options,’ select ‘Add-ins,’ choose to manage the Excel add-ins, and then check ‘Analysis ToolPak.’
Steps to Load the Data Analysis Toolpak Add-in
Below are the steps to load the Data Analysis ToolPak add-in:
- Click on ‘File.’
- Click on ‘Options’ from the list.
- Click on ‘Add-ins’ and then choose ‘Excel Add-ins’ for ‘Manage’. Click on ‘Go.’
- The ‘Excel Add-ins’ dialog box will appear with the list of add-ins. Please check ‘Analysis ToolPak’ and click on ‘OK.’
- The command ‘Data Analysis’ will appear under the ‘Data’ tab in Excel at the extreme right of the ribbon, as displayed below.
List of Functions Available in Excel Data Analysis ToolPak
Below is the list of available functions in Analysis Toolpak Excel Add-in:
- ANOVA: Single Factor in Excel
- Correlation in Excel
- Rank and Percentile in Excel
- Descriptive Statistics in Excel
Now let us discuss each of them in detail –
#1 – ANOVA: Single Factor in Excel
ANOVA stands for Analysis of Variance and is the first set of options available in the Analysis ToolPak add-in. In one-way ANOVA, we analyze whether there are any statistically significant differences between the means of three or more independent groups. The null hypothesis proposes that there is no statistically significant difference between the group means. We test this hypothesis by checking the p-value.
Let us understand this with an ANOVA example in Excel.
Suppose we have the following data from the experiment conducted to check ‘Can self-control be restored during intoxication?’ We categorized 44 males into 4 equal groups comprising 11 males in each group.
- Group A received 0.62mg/kg of alcohol.
- Group AC received alcohol plus caffeine.
- Group AR received alcohol and a monetary reward for performance.
- Group P received a placebo.
Scores on the word stem completion task involving “controlled (effortful) memory processes” were recorded, and the result is as follows:
We need to test the null hypothesis, which proposes that all means are equal (there is no significant difference).
How to Run the ANOVA Test?
To run the ANOVA one-way test, we need to perform the following steps:
- Step 1: Click on the ‘Data Analysis’ command available in the ‘Data’ tab under ‘Analysis.’
- Step 2: Select ‘Anova: Single Factor’ from the list and click on ‘OK.’
- Step 3: We get ‘Anova: Single Factor’ dialog box. We need to select Input Range as our data with column heading.
- Step 4: As we have included the column headings in our selection, we need to tick the checkbox for ‘Labels in the first row.’
- Step 5: For output range, we have selected F1. Please click on ‘OK.’
We now have ANOVA analysis.
The larger the F-statistic, the more likely it is that the groups have different means, which is evidence against the null hypothesis that all means are equal. An F-statistic greater than the critical value is equivalent to a p-value less than alpha, and in both cases we reject the null hypothesis. When that happens, we conclude that there is a significant difference between the groups.
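To make the F-statistic less of a black box, here is a minimal Python sketch of the one-way ANOVA calculation: the variance between groups divided by the variance within groups. The function name `one_way_anova` and the three small groups of scores are hypothetical and for illustration only; they are not the data from the experiment above.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples (one list per group)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three groups (illustrative only).
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]
f_stat, df_b, df_w = one_way_anova(groups)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

The sketch stops at the F-statistic because the p-value requires the F distribution, which the Python standard library does not provide. If SciPy is available, `scipy.stats.f_oneway` returns both the statistic and the p-value.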
#2 – Correlation in Excel
Correlation is a statistical measure available in Analysis Toolpak Excel Add-in, and it shows the extent to which two or more variables fluctuate together. A positive correlation in excel indicates the extent to which those variables increase or decrease in parallel. A negative correlation indicates the extent to which one variable increases as the other decreases.
We have the following data related to advertising costs and sales for a company. We want to find out the relationship between both so that we can plan our budget accordingly and expect sales (set target considering other factors also).
How to Find Correlation Between Two Sets of Variables?
To find out the correlation between these two sets of variables, we will follow the below-mentioned steps:
- Step 1: Click on ‘Data Analysis’ under the ‘Analysis’ group available in ‘Data.’
- Step 2: Choose ‘Correlation’ from the list and click on ‘OK.’
- Step 3: Choose the range ‘$A$1:$B$16’ as the input range and ‘$F$1’ as the output range. Tick the checkbox for ‘Labels in the first row’ because the input range includes the column headings. As each variable is in its own column, choose ‘Columns’ for ‘Grouped By.’
- Step 4: Confirm the output range and then click on ‘OK.’
- We get the result.
As we can see, the correlation coefficient between advertising cost (column head) and sales (row head) is approximately +0.86274, which indicates a strong positive linear relationship between the two. We can now decide on the advertising budget and the expected sales accordingly.
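The coefficient reported by the Correlation tool is the Pearson correlation coefficient. Below is a minimal Python sketch of that calculation. The function name `pearson_r` and the advertising and sales figures are hypothetical and for illustration only; they are not the worksheet data used above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical figures (illustrative only).
advertising = [10, 20, 30, 40, 50]
sales = [25, 41, 58, 80, 95]
r = pearson_r(advertising, sales)
print(f"r = {r:.4f}")
```

A value close to +1 means the two series rise together almost linearly, a value close to -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.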
#3 – Rank and Percentile in Excel
A percentile in Excel refers to a value below which a certain percentage of scores fall, and the ‘Rank and Percentile’ tool is available in the Analysis ToolPak add-in. For example, if a particular score is in the 90th percentile, the student has scored better than 90% of the people who took the test. Let us understand this with an example.
We have the following data for the scores obtained by the students of a class.
We want to find out the rank and percentile for every student.
How to Find Rank and Percentile?
The steps would be:
- Step 1: Click on ‘Data Analysis’ under the ‘Analysis’ group available in ‘Data.’
- Step 2: Click on ‘Rank and Percentile’ from the list and then click on ‘OK.’
- Step 3: Select ‘$B$1:$B$17’ as the input range and ‘$D$1’ as the output range.
- Step 4: As we have data field heads in columns, i.e., the data is grouped in columns, we need to select ‘Columns’ for ‘Grouped By.’
- Step 5: We have selected column heading in our input range; that is why we need to check for ‘Labels in the first row’ then click on ‘OK.’
- We get the result.
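The following Python sketch shows one way to reproduce a rank and percentile table. It assumes that ranks run in descending order with ties sharing a rank, and that the percent column follows the inclusive percent-rank definition: the count of values below a score divided by (n - 1). The function name `rank_and_percentile` and the scores are hypothetical and for illustration only.

```python
def rank_and_percentile(scores):
    """Return rows of (point, score, rank, percent), sorted by rank.

    point   : 1-based position of the score in the input
    rank    : 1 for the highest score; tied scores share the same rank
    percent : fraction of the other scores that are strictly lower
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least 2 scores")
    rows = []
    for point, value in enumerate(scores, start=1):
        rank = 1 + sum(1 for other in scores if other > value)
        below = sum(1 for other in scores if other < value)
        rows.append((point, value, rank, below / (n - 1)))
    # Stable sort: tied scores keep their original input order.
    rows.sort(key=lambda row: row[2])
    return rows


# Hypothetical scores (illustrative only).
scores = [70, 90, 80, 90, 60]
table = rank_and_percentile(scores)
for point, value, rank, percent in table:
    print(f"{point:>5} {value:>5} {rank:>4} {percent:>7.2%}")
```

With these scores, the two students who scored 90 share rank 1, and the next student is ranked 3 rather than 2.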
#4 – Descriptive Statistics in Excel
The ‘Descriptive Statistics’ tool included in the Analysis ToolPak add-in reports the following information about a sample:
- Central Tendency
- Mean: This is the arithmetic average.
- Median: This is the mid-point of the distribution.
- Mode: It is the most frequently occurring number.
- Measures of Variability
- Range: This is the difference between the largest and smallest values.
- Variance: This indicates how far the numbers are spread out.
- Standard Deviation: How much variation exists from the average/mean.
- Skewness: This indicates how symmetrical the distribution of a variable is.
- Kurtosis: This indicates the peakedness or flatness of a distribution.
Below we have the marks scored by students in Economics. We want to find out the descriptive statistics.
To do the same, the steps are:
- Step 1: Click on the ‘Data Analysis’ command available in the ‘Analysis’ group in ‘Data.’
- Step 2: Choose ‘Descriptive Statistics’ from the list and click on ‘OK.’
- Step 3: Choose ‘$A$1:$A$15’ as the input range, choose ‘Columns’ for ‘Grouped By,’ and tick ‘Labels in the first row.’
- Step 4: Choose ‘$C$1’ as the output range and make sure that we have checked the box for ‘Summary Statistics.’ Click on ‘OK.’
Now we have our descriptive statistics for the data.
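The measures listed above can also be computed outside Excel. Below is a minimal Python sketch that assumes the sample (n - 1) formulas for variance and standard deviation, and the bias-corrected sample formulas for skewness and excess kurtosis, which is the convention I understand Excel's SKEW and KURT functions to follow. The function name `describe` and the marks are hypothetical and for illustration only.

```python
from statistics import mean, median, mode, stdev, variance


def describe(values):
    """Summary statistics for a sample of at least 4 values that are not all equal."""
    n = len(values)
    if n < 4:
        raise ValueError("sample kurtosis needs at least 4 values")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("all values are identical; skewness and kurtosis are undefined")

    # Sums of cubed and fourth-power standardized deviations.
    z3 = sum(((x - m) / s) ** 3 for x in values)
    z4 = sum(((x - m) / s) ** 4 for x in values)

    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    return {
        "mean": m,
        "median": median(values),
        "mode": mode(values),  # first most frequent value encountered
        "range": max(values) - min(values),
        "variance": variance(values),  # sample variance
        "std_dev": s,
        "skewness": skewness,
        "kurtosis": kurtosis,  # excess kurtosis: about 0 for a normal distribution
    }


# Hypothetical marks (illustrative only).
marks = [55, 62, 62, 70, 81]
stats = describe(marks)
for name, value in stats.items():
    print(f"{name:>9}: {value:.4f}")
```

One difference to keep in mind: when no value repeats, `statistics.mode` (Python 3.8+) returns the first value in the data, whereas Excel reports #N/A for the mode.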
This has been a guide to the Data Analysis ToolPak add-in in Excel. We discussed the steps to load the add-in and how to use four of its tools, 1) ANOVA, 2) Correlation, 3) Rank and Percentile, and 4) Descriptive Statistics, along with practical examples.