Least Squares Regression

Least Squares Regression Method Definition

A least-squares regression method is a form of regression analysis which establishes the relationship between the dependent and independent variable along with a linear line. This line is referred to as the “line of best fit.”

Regression Analysis is a statistical method with the help of which one can estimate or predict the unknown values of one variable from the known values of another variable. The variable which is used to predict the variable interest is called the independent or explanatory variable, and the variable that is being predicted is called the dependent or explained variable.

Let us consider two variables, x & y. These are plotted on a graph with values of x on the x-axis values of y on the y-axis. These values are represented by the dots in the below graph. A straight line is drawn through the dots – referred to as the line of best fit.

Least Squares Regression

The objective of least squares regression is to ensure that the line drawn through the set of values provided establishes the closest relationship between the values.

Least Squares Regression Formula

The regression line under the Least Squares method is calculated using the following formula –

ŷ = a + bx
Least-Squares-Regression-Formula

You are free to use this image on your website, templates etc, Please provide us with an attribution linkHow to Provide Attribution?Article Link to be Hyperlinked
For eg:
Source: Least Squares Regression (wallstreetmojo.com)

Where,

  • ŷ = dependent variable
  • x = independent variable
  • a = y-intercept
  • b = slope of the line

The slope of line b is calculated using the following formula –

Slope of line Formula 1

Or

Slope of line Formula 2

Y-intercept, ‘a’ is calculated using the following formula –

Y intercept

Line of Best Fit in the Least Square Regression

The line of best fit is a straight line drawn through a scatter of data points that best represents the relationship between them.

Let us consider the following graph wherein a set of data is plotted along the x and y-axis. These data points are represented using the blue dots. Three lines are drawn through these points – a green, a red, and a blue line. The green line passes through a single point, and the red line passes through three data points. However, the blue line passes through four data points, and the distance between the residual points to the blue line is minimal as compared to the other two lines.

Line of Best Fit

In the above graph, the blue line represents the line of best fit as it lies closest to all the values and the distance between the points outside the line to the line is minimal (i.e., the distance between the residuals to the line of best fit – also referred to as the sums of squares of residuals). In the other two lines, the orange and the green, the distance between the residuals to the lines is greater as compared to the blue line.

The least-squares method provides the closest relationship between the dependent and independent variablesIndependent VariablesIndependent variable is an object or a time period or a input value, changes to which are used to assess the impact on an output value (i.e. the end objective) that is measured in mathematical or statistical or financial modeling.read more by minimizing the distance between the residuals, and the line of best fit, i.e., the sum of squares of residuals is minimal under this approach. Hence the term “least squares.”

Examples of Least Squares Regression Line

Let us apply these formulae in the below question –

You can download this Least Squares Regression Excel Template here – Least Squares Regression Excel Template

Example #1

The details pertaining to the experience of technicians in a company (in a number of years) and their performance rating is provided in the table below. Using these values, estimate the performance rating for a technician with 20 years of experience.

Experience of Technician (in Years)Performance Rating
1687
1288
1889
468
378
1080
575
1283

Solution –

To calculate the least squares first we will calculate the Y-intercept (a) and slope of a line(b) as follows –

Least Squares Example 1.2

The slope of Line (b)

Least Squares Example 1.3
  • b = 6727 – [(80*648)/8] / 1018 – [(80)2/8]
  • = 247/218
  • = 1.13

Y-intercept (a)

Least Squares Example 1.4
  • a = 648 – (1.13)(80) /8
  • = 69.7

The regression line is calculated as follows –

Least Squares Example 1.5

Substituting 20 for the value of x in the formula,

  • ŷ = a + bx
  • ŷ = 69.7 + (1.13)(20)
  • ŷ = 92.3

The performance rating for a technician with 20 years of experience is estimated to be 92.3.

Example #2

Least Squares Regression Equation Using Excel

The least-squares regression equation can be computed using excel by the following steps –

Least Squares Regression Excel 1.1
  • Insert a scatter graph using the data points.
Least Squares Regression Excel 1.2
  • Insert a trendline within the scatter graph.
Least Squares Regression Excel 1.3
  • Under trendline options – select linear trendline and select display equation on chart.
Least Squares Regression Excel 1.4
  • The least-squares regression equation for the given set of excel data is displayed on the chart.
Least Squares Regression Excel 1.5

Thus, the least-squares regression equation for the given set of excel data is calculated. Using the equation, predictions, and trend analyses may be made. Excel tools also provide for detailed regression computations.

Advantages

  • The least-squares method of regression analysis is best suited for prediction models and trend analysis. It is best used in the fields of economics, finance, and stock markets wherein the value of any future variable is predicted with the help of existing variables and the relationship between the same.
  • The least-squares method provides the closest relationship between the variables. The difference between the sums of squares of residuals to the line of best fit is minimal under this method.
  • The computation mechanism is simple and easy to apply.

Disadvantages

  • The least-squares method relies on establishing the closest relationship between a given set of variables. The computation mechanism is sensitive to the data, and in case of any outliers (exceptional data), results may tend to majorly affect.
  • This type of calculation is best suited for linear models. For nonlinear equations, more exhaustive computation mechanisms are applied.

Conclusion

The least-squares method is one of the most popularly used methods for prediction models and trend analysis. When calculated appropriately, it delivers the best results.

Recommended Articles

This has been a guide to Least Squares Regression Method and its definition. Here we discuss the formula to calculate the least-squares regression line along with excel examples. You can learn more from the following articles –

  • 16 Courses
  • 15+ Projects
  • 90+ Hours
  • Full Lifetime Access
  • Certificate of Completion
LEARN MORE >>

Reader Interactions

Comments

  1. Scott Julien says

    Thank you for this helpful tutorial.

    • Dheeraj Vaidya says

      Thanks for your kind words!

Leave a Reply

Your email address will not be published. Required fields are marked *