## What is a Linear Regression?

Linear regression is basically a statistical modeling technique which used to show the relationship between one dependent variable and one or more independent variable. It is one of the most common types of predictive analysis. This type of distribution forms in a line hence this is called linear regression. In this article, we will take the examples of Linear Regression Analysis in Excel.

To do linear regression analysis first, we need to add excel add-ins by following steps.

Click on File – Options (This will open Excel Options Pop up for you).

Click on Add-ins – Select Excel Add-ins from Manage Drop Down in excel, then Click on Go.

This will open Add-ins Pop up. Select Analysis ToolPak then click Ok.

Data Analysis Add-in will appear under Insert Tab.

Let us understand by the below Examples of linear regression analysis in excel.

### Linear Regression Analysis Examples

#### Example #1

Suppose we have monthly sales and spent on marketing for last year, and now we need to predict future sales on the basis of last year’s sales and marketing spent.

Month |
Advertising |
Sales |

Jan | 40937 | 502729 |

Feb | 42376 | 507553 |

Mar | 43355 | 516885 |

Apr | 44126 | 528347 |

May | 45060 | 537298 |

Jun | 49546 | 544066 |

Jul | 56105 | 553664 |

Aug | 59322 | 563201 |

Sep | 59877 | 568657 |

Oct | 60481 | 569384 |

Nov | 62356 | 573764 |

Dec | 63246 | 582746 |

Click on Data Analysis under Data Tab, and this will open Data Analysis Pop up for you.

Now select Regression from the list and click Ok.

Regression pop up will open.

Select Range of Sales $C$1:$C$13 in the Y-axis box as this is the dependent variable and $B$1:$B$14 in X-axis as advertisement spent is the independent variable.

Checkmark on the Labels box if you have selected headers in data else it will give you the error.

Select Output range if you want to get the value on the specific range on the worksheet else select New Worksheet Ply: and this will add a new worksheet and give you the result.

Then check on the Residuals box and click Ok.

This will add worksheets and give you the following result.

Let us understand the output.

**Summary Output**

**Multiple R:** This represents the correlation coefficient. The value 1 shows a positive relationship, and value 0 shows no relationship.

**R Square:** R Square represents the coefficient of determination. This tells you the percentage of points fall on the regression line. 0.49 means that 49% of values fit the model

**Adjusted R square****:** This is adjusted R square, which requires when you have more than one X variable.

**Standard Error:** This represents an estimate of the standard deviation of error. This is the precision that the regression coefficient is measured.** **

**Observations:** This is the number of observations that you have taken in a sample.

4.9 (927 ratings) 16 Courses | 15+ Projects | 90+ Hours | Full Lifetime Access | Certificate of Completion

**ANOVA – ****Df: **Degrees of freedom

**SS: **Sum of Squares.

**MS: **we have two MS

- Regression MS is Regression SS/Regression Df.
- Residual MS is the mean squared error (Residual SS / Residual Df).

**F: **F test for the null hypothesis.

**Significance F: **P-Values associated with Significance

**Coefficient: **Coefficient gives you the estimate of least squares.

**T Statistic: **T Statistic for null hypothesis vs the alternate hypothesis.

**P-Value: **This is the p-value for the hypothesis test.

**Lower 95% and Upper 95%: **These are the lower boundary and the upper boundary for the confidence interval

**Residuals Output.: **We have 12 observations based on the data. 2^{nd} column represents Predicted sales and 3^{rd} column Residuals. Residuals are basically the difference in predicted sales from the actual one.

#### Example#2

Select the predicted sales and marketing column

Go to the chart group under the insert tab. Select the scatter chart icon

This will insert the scatter plot in excel. See image below

Right-click on any point then select **Add Trendline in excel.** This will add a trendline to your chart.

- You can format the trendline by doing right-click anywhere on the trendline and then select format trendline.
- You can make more improvements to the chart. i.e., formatting the trendline, color and change title, etc
- You can also show the formula on the graph by checking in the Display formula on the chart, and display R squared value on the chart.

**Some More Examples of Linear Regression Analysis:**

- Prediction of Umbrella sold based on the Rain happened in Area.
- Prediction of AC sold based on the Temperature in Summer.
- During the exam season, sales of Stationary basically, Exam guide sales increased.
- Prediction of sales when Advertising has done based on High TRP serial where an advertisement is done, Popularity of Brand Ambassador, and the Footfalls at the place of holding where an advertisement is being published.
- Sales of a house based on the Locality, Area, and price.

#### Example #3

Suppose we have nine students with their IQ level and the number they scored on Test.

Student |
Test Score |
IQ |

Ram | 100 | 145 |

Shyam | 97 | 140 |

Kul | 93 | 130 |

Kappu | 91 | 125 |

Raju | 89 | 115 |

Vishal | 86 | 110 |

Vivek | 82 | 100 |

Vinay | 78 | 95 |

Kumar | 75 | 90 |

**Step 1:** First, find out the dependent and independent variables. Here Test score is the dependent variable, and IQ is the independent variable as Test score is varying as IQ gets change.

**Step 2: **Go to Data Tab – Click on Data Analysis – Select regression – click Ok.

This will Open the Regression window for you.

**Step 3.** Input Test Score range in Input Y Range Box and IQ in Input X Range Box. (Check on Labels if you have headers in your data range. Select output options, then check on the desired Residuals. Click Ok.

You will get the summary output shown in the below Image.

**Step 4:** Analysing the Regression by Summary Output** **

**Summary Output**

**Multiple R:** Here, the correlation coefficient is 0.99, which is very near to 1, which means the Linear relationship is very positive.

**R Square:** R Square value is 0.983, which means that 98.3% of values fit the model.

**P-value: **Here, P-value is 1.86881E-07, which is very less than .1, Which means IQ has significant predictive values.

**See the chart below.**

You can see that almost all the points are falling inline or a nearby trendline.

#### Example #4

We need to predict sales of AC based on the sales & temperature for a different month.

Month |
Temp |
Sales |

Jan | 25 | 38893 |

Feb | 28 | 42254 |

Mar | 31 | 42845 |

Apr | 33 | 47917 |

May | 37 | 51243 |

Jun | 40 | 69588 |

Jul | 38 | 56570 |

Aug | 37 | 50000 |

Follow the below steps to get the regression result.

**Step 1:** First, find out the dependent and independent variables. Here Sales is the dependent variable, and Temperature is an independent variable as Sales is varying as Temp gets change.

**Step 2: **Go to Data Tab – Click on Data Analysis – Select regression – click Ok.

This will Open the Regression window for you.

**Step 3.** Input Sales in Input Y Range Box and Temp in Input X Range Box. (Check on Labels if you have headers in your data range. Select output options, then check on the desired Residuals. Click Ok.

This will give you a summary output as below.

**Step 4: **Analyse the result.

**Multiple R:** Here, the correlation coefficient is 0.877, which is near to 1, which means the Linear relationship is positive.

**R Square:** R Square value is 0.770, which means that 77% of values fit the model

**P-Value: **Here, P-value is 1.86881E-07, which is very less than .1, Which means IQ has significant predictive values.

#### Example #5

Now let us do a Regression Analysis for Multiple independent variables:

You need to predict the sales of a Mobile which is going to launch next year. You have the price and Population of the countries which are affecting the sales of mobiles.

Mobile Version |
Sales |
Quantity |
Population |

US | 63860 | 858 | 823 |

UK | 61841 | 877 | 660 |

KZ | 60876 | 873 | 631 |

CH | 58188 | 726 | 842 |

HN | 52728 | 864 | 573 |

AU | 52388 | 680 | 809 |

NZ | 51075 | 728 | 661 |

RU | 49019 | 689 | 778 |

Follow the below steps to get the regression result.

**Step 1.** First, find out the dependent and independent variables. Here Sales is dependent variable and quantity and population. Both are independent variables as Sales is varying with quantity and population of the country.

**Step 2. **Go to Data Tab – Click on Data Analysis – Select regression – click Ok.

This will Open the Regression window for you.

**Step 3.** Input Sales in Input Y Range Box and select quantity and population in Input X Range Box. (Check on Labels if you have headers in your data range. Select output options, then check on the desired Residuals. Click Ok.

Now Run the regression using data analysis under Data Tab. This will give you the below result.

** **

**Summary Output**

**Multiple R:** Here, the correlation coefficient is 0.93, which is very near to 1, which means the Linear relationship is very positive.

**R Square:** R Square value is 0.866, which means that 86.7% of values fit the model.

**Significance F: **Significance F is less than .1, which means that the regression equation has significant predictive value.

**P-Value**: If you look at P-value for Quantity and Population, you can see that values are less than .1, which means quantity and population have significant predictive value. The fewer P values mean that a variable has more significant predictive values.

However, both quantity and population have significant predictive value, but If you look at P-value for quantity and population, than you can see that quantity has a lesser P-value in excel than Population. This means quantity has a more significant predictive value than Population.

### Things to Remember

- Always check the Dependent and Independent variables whenever you are selecting any data.
- Linear regression analysis considers the relationship between the Mean of the variables.
- This only model the relationship between the variables that are linear
- Sometimes it is not the best fit for a real-world problem. For Example: (Age and the wages). Most of the time, Wage increase as Age is increasing. However, after retirement, Age increases but wages goes decrease.

### Recommended Articles

This has been a guide to Linear Regression and its definition. Here we discuss how to perform a linear regression analysis in excel with the help of examples and a downloadable excel sheet. You can learn more about excel from the following articles –

- Statistics in Excel
- Confidence Interval in Excel
- Formula of Regression
- Descriptive Statistics in Excel

- 16 Courses
- 15+ Projects
- 90+ Hours
- Full Lifetime Access
- Certificate of Completion