Lasso Meaning
Lasso (Least Absolute Shrinkage and Selection Operator) is a regression method that acts as a regularizer to reduce errors and produce more accurate predictions. It works by adding a penalty term to the residual sum of squares (RSS), shrinking coefficient estimates toward zero.

Key Takeaways
- Lasso, or Least Absolute Shrinkage and Selection Operator, is a regression method that applies a penalty to minimize prediction errors and perform automatic feature selection.
- The concept first appeared in the geophysics literature in 1986. Statistician Robert Tibshirani independently introduced and popularized it in 1996.
- By shrinking some coefficients exactly to zero, lasso eliminates the least useful predictors. The result is a sparse, simpler model that often estimates more accurately.
- L1 regularization shrinks only selected coefficients, some all the way to zero, whereas ridge regression shrinks all coefficients, penalizing large ones most, without eliminating any.
Lasso Regression Explained
The lasso regression model is a regularization technique that shrinks coefficient estimates, pulling some toward or exactly to zero, to produce more realistic predictions. It is typically applied when a model with too many coefficients overfits the training data.
There are two common regularization techniques for assigning penalties: ridge and lasso. When a model exclusively uses the L1 technique, it is known as lasso regularization. This technique adds a penalty proportional to the absolute values of the coefficients. The role of this penalty is to compress the coefficients and reduce the model's variance, lowering the risk of overfitting. The model is left with fewer effective coefficients: some shrink exactly to zero and drop out of the model entirely. However, the amount of shrinkage must be chosen wisely; developers typically use cross-validation for this purpose.
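The effect of the L1 penalty on an individual coefficient can be sketched with the soft-thresholding operator, which is how lasso shrinks small coefficients exactly to zero. This is a minimal pure-Python illustration; the function name `soft_threshold` is ours, not from any particular library.

```python
def soft_threshold(beta, lam):
    """Soft-thresholding: the effect of an L1 penalty on a single coefficient.

    Shrinks beta toward zero by lam, and sets it exactly to zero
    whenever |beta| <= lam -- which is how lasso drops features.
    """
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

coefs = [2.5, -0.8, 0.1, -3.0]
shrunk = [soft_threshold(b, 0.5) for b in coefs]
print(shrunk)  # the small coefficient 0.1 is shrunk exactly to zero
```

Note that every coefficient is pulled toward zero by the same amount, but only those already smaller than the penalty are eliminated; this is the mechanism behind lasso's feature selection.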
Lasso regression is widely used in finance for tasks like asset pricing, risk management, and portfolio optimization due to its efficient way of shrinking coefficients and selecting pertinent characteristics. It is a useful tool in financial analysis and decision-making processes due to its capacity to handle high-dimensional data and provide parsimonious models.
The lasso first gained visibility in the 1990s. Its name originally referred to a looped rope used to catch animals; similarly, this feature-selection technique "captures" the relevant variables while discarding the rest. It was designed for linear regression models. Before it, stepwise selection played the central role in choosing covariates.
However, stepwise selection's rising prediction errors became a major limitation. In 1986, the lasso penalty appeared in the geophysics literature. In 1996, statistician Robert Tibshirani independently rediscovered and popularized the method, drawing on Breiman's non-negative garrote.
Formula
Let us look at the equation for a better understanding of the concept:
Lasso cost = Residual Sum of Squares + λ × (Sum of the absolute values of the coefficients)
Where
λ refers to the shrinkage amount. If λ = 0, the penalty vanishes and the model reduces to ordinary least squares, keeping all the features. As λ approaches ∞, the penalty dominates and all coefficients are shrunk to zero, so no feature is retained. For intermediate values of λ, only selected coefficients survive in the model.
The residual sum of squares measures the unexplained variation in the model. A smaller value indicates a better fit to the training data, although a value driven too low by an overly complex model can signal overfitting.
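The formula above can be computed directly. The sketch below evaluates the lasso cost for a tiny made-up dataset (the numbers and the helper name `lasso_objective` are illustrative); note that with λ = 0 the cost is just the plain residual sum of squares.

```python
def lasso_objective(y_true, y_pred, coefs, lam):
    """Lasso cost: residual sum of squares plus lam times the L1 norm of the coefficients."""
    rss = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    l1 = sum(abs(b) for b in coefs)
    return rss + lam * l1

# Hypothetical observed values, model predictions, and fitted coefficients
y_true = [3.0, 5.0, 7.0]
y_pred = [2.8, 5.1, 7.3]
coefs = [2.0, -0.5]

print(lasso_objective(y_true, y_pred, coefs, 0.0))  # lam = 0: plain RSS
print(lasso_objective(y_true, y_pred, coefs, 1.0))  # lam = 1: RSS plus L1 penalty of 2.5
```

Increasing λ raises the cost attached to every nonzero coefficient, which is what pressures the fitting procedure to zero some of them out.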
Pros And Cons
Let us look at the advantages and disadvantages of the model that influence the coefficients and their absolute value:
| Pros | Cons |
|---|---|
| Performs feature selection, identifying the most useful predictors. | The model may sometimes exclude genuinely necessary coefficients. |
| Uses a penalty to prevent overfitting in high-dimensional data sets. | Small changes in the data set can produce very different selected features. |
| Creates a simpler model by eliminating the less important coefficients. | Shrinking coefficients introduces bias into the estimates. |
| Increases sparsity by shrinking some coefficients exactly to zero. | It is sensitive to multicollinearity, arbitrarily picking one of several correlated predictors. |
Examples
Let us look at the examples to comprehend the concept better:
Example #1
Suppose James is an investor who wants to build a portfolio that consists of stocks providing high returns with low risk. He collects data, including historical prices of desired company stocks and macroeconomic indicators like GDP (Gross Domestic Product) and interest rates. Now, James will calculate the unconstrained problems that can deviate from his expected results.
For instance, high interest rates can cause returns to shrink. Using lasso regression, James can eliminate the stocks that provide low returns from the model. By the end, he will have a lean portfolio of equities with above-average returns.
Example #2
According to medical research published in April 2023, the lasso regression technique and multivariate Cox regression were used to develop a chemokine-related lncRNA risk model. Here, lncRNA (long non-coding RNA) is a form of RNA molecule that does not code for protein; however, its dysregulation can contribute to cancer-related diseases. The study also emphasizes how crucial it is to use molecular indicators such as these in clinical practice in order to enhance treatment plans and patient outcomes.
Lasso Vs Ridge Vs Elastic Net
Although elastic net combines lasso and ridge regression, the three techniques differ widely. So, let us compare them:
| Basis | Lasso | Ridge | Elastic Net |
|---|---|---|---|
| Meaning | A regularization technique for reducing errors and giving accurate results. | A regularization technique used when many independent variables are highly correlated (multicollinear). | A regression technique combining the lasso and ridge penalties. |
| Purpose | To shrink some coefficients exactly to zero, leading to a sparse model. | To prevent overfitting by shrinking coefficients toward, but not exactly to, zero. | To overcome the limitations of the other two regression methods. |
| Type of Regularization Technique | L1 or lasso regularization. | L2 or ridge regularization. | A weighted combination of L1 and L2. |
| Penalty Term | Proportional to the sum of the absolute values of the coefficients. | Proportional to the sum of the squares of the coefficients. | A weighted mix of the L1 and L2 penalty terms. |
| Shrinkage | Shrinks selected, less important coefficients all the way to zero. | Shrinks all coefficients, penalizing large ones most, but none reach exactly zero. | Uses both penalties to trade off sparsity and stability during shrinkage. |
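The three penalty terms compared above can be written out directly. This is a minimal pure-Python sketch; the function names and the mixing parameter `alpha` (weighting the L1 share, as in common elastic-net formulations) are illustrative.

```python
def l1_penalty(coefs, lam):
    # Lasso: lam times the sum of absolute coefficient values
    return lam * sum(abs(b) for b in coefs)

def l2_penalty(coefs, lam):
    # Ridge: lam times the sum of squared coefficient values
    return lam * sum(b * b for b in coefs)

def elastic_net_penalty(coefs, lam, alpha):
    # Elastic net: a weighted combination -- alpha toward lasso, (1 - alpha) toward ridge
    return alpha * l1_penalty(coefs, lam) + (1 - alpha) * l2_penalty(coefs, lam)

coefs = [1.0, -2.0, 0.5]
print(l1_penalty(coefs, 0.1))                 # penalizes coefficients linearly
print(l2_penalty(coefs, 0.1))                 # penalizes large coefficients quadratically
print(elastic_net_penalty(coefs, 0.1, 0.5))   # equal-weight blend of the two
```

Because the L2 term squares each coefficient, it punishes large coefficients far more heavily than small ones, while the L1 term applies the same marginal pressure everywhere; blending them is what lets elastic net keep lasso's sparsity while gaining ridge's stability under multicollinearity.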