# Feature Selection — Coefficients and Lasso Regularization

**Linear Regression Coefficient**

Linear regression predicts a quantitative response Y on the basis of predictor variables X1, X2, … Xn. It assumes that there is a linear relationship between X(s) and Y. Mathematically, we write this linear relationship as Y ≈ β0 + β1X1 + β2X2 + … + βnXn.

**The magnitude of the coefficients is directly influenced by the scale of the features**. Therefore, to compare coefficients across features, it is important that all features are on a similar scale. This is why normalization is important for variable importance and feature selection in linear models.

The magnitude of a coefficient is indicative of the influence of that feature on the outcome. This interpretation relies on the usual assumptions of linear models:

- There is a linear relationship between the features and the target.
- The features are independent of each other (no correlation).
- The features are normally distributed.
- All features are on the same scale; if not, we need to standardize them (e.g., with StandardScaler).
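As a quick sketch of the scaling step (the numbers below are toy values, invented purely for illustration):

```python
# Minimal sketch: standardizing features so that coefficient magnitudes
# become comparable across features. The data here is made up.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])  # two features on very different scales

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# After scaling, each column has mean ~0 and unit variance
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```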

# Implementation

## Logistic Regression Coefficient

**The higher the absolute value of a coefficient, the more influence** the feature has, and hence the more important it is.

1. Imports.

2. Apply StandardScaler to the features.

3. `from sklearn.feature_selection import SelectFromModel`, using LogisticRegression as the classifier.

4. `sel_.get_support()` gets you the list of selected features, and `sel_.estimator_.coef_` gets you all the coefficients.

5. Find the mean of the absolute coefficients with `np.abs(sel_.estimator_.coef_).mean()`, and plot the coefficient frequencies as a histogram.

6. **Keep all the features whose absolute coefficient is greater than the mean absolute coefficient:**

`np.sum(np.abs(sel_.estimator_.coef_) > np.abs(sel_.estimator_.coef_).mean())`
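The steps above can be sketched end to end. The dataset (sklearn's breast cancer data) is an assumption for illustration; the names (`sel_`, etc.) mirror the text:

```python
# Sketch of coefficient-based selection with LogisticRegression.
# Dataset and split are assumptions for illustration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize so coefficient magnitudes are comparable
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)

# SelectFromModel with the default threshold keeps features whose
# importance (|coef|) is above the mean importance
sel_ = SelectFromModel(LogisticRegression(max_iter=5000))
sel_.fit(X_train_s, y_train)

selected = sel_.get_support()      # boolean mask of kept features
coefs = sel_.estimator_.coef_      # fitted coefficients
n_above_mean = np.sum(np.abs(coefs) > np.abs(coefs).mean())
print(selected.sum(), n_above_mean)
```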

# Implementation

## Linear Regression Coefficient

This is exactly the same as for Logistic Regression, except we don’t add a penalty: we use plain LinearRegression().

**1. Prepare the data: include only the numerical variables, then train-test split.**

2. **Implement the selector.**

**3. Get the selected features.**

**4. Keep all the features whose absolute coefficient is higher than the mean absolute coefficient:**

`np.sum(np.abs(sel_.estimator_.coef_) > np.abs(sel_.estimator_.coef_).mean())`
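The same recipe, sketched for regression; the diabetes dataset (all numeric, no missing values) is assumed here for illustration:

```python
# Sketch of coefficient-based selection with plain LinearRegression
# (no penalty). Dataset choice is an assumption for illustration.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SelectFromModel

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

sel_ = SelectFromModel(LinearRegression())
sel_.fit(X_train, y_train)

coefs = sel_.estimator_.coef_
# Features kept: those whose |coefficient| exceeds the mean |coefficient|
n_kept = np.sum(np.abs(coefs) > np.abs(coefs).mean())
print(n_kept, sel_.get_support().sum())
```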

# Lasso Regularization

Regularization consists of adding a penalty to the parameters of a machine learning model to reduce the freedom of the model and avoid overfitting. In linear model regularization, the penalty is applied to the coefficients that multiply each of the predictors. Lasso (l1) regularization has the property that it is able to shrink some of the coefficients exactly to zero. Therefore, those features can be removed from the model.

As we increase the regularization strength, **Lasso shrinks different coefficients to zero at different points**, so we can rank the features and remove first those that shrink to zero fastest.

**We DON’T use the l2 (Ridge) penalty for feature selection**, as it shrinks all coefficients together towards zero but does not set any of them exactly to zero.

# Classification Lasso Selection

Use the l1 penalty (= Lasso) in LogisticRegression inside SelectFromModel.

`sel_.get_support()` gets you the features selected.

The features whose coefficients shrank to zero under the regularization (`coef_ == 0`) were 15, and that is the number of features removed.

For exploratory purposes only, apply the Ridge (l2) penalty instead: none of the coefficients will be shrunk to 0 and hence no features will be removed.
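A sketch of l1-based selection for classification; the dataset and the value of `C` (the inverse of the regularization strength) are assumptions chosen for illustration, not the post's exact settings:

```python
# Sketch of Lasso (l1) selection for classification: features whose
# coefficient is shrunk exactly to zero are dropped. C=0.05 is an
# assumed, deliberately strong regularization for illustration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)
X_s = StandardScaler().fit_transform(X)

sel_ = SelectFromModel(
    LogisticRegression(penalty='l1', C=0.05, solver='liblinear'))
sel_.fit(X_s, y)

removed = np.sum(sel_.estimator_.coef_ == 0)  # shrunk exactly to zero
print('features removed:', removed)
```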

**Regression Feature selection using Lasso**

1- We need to prepare the data with only the numerical variables, remove all NA values, and train-test split.

2- Apply the SelectFromModel with Lasso and `alpha=100`.

*The LinearRegression object from sklearn does not allow for regularization. So if you want to fit a regularized linear regression, you need to specifically import Lasso. The alpha parameter sets the penalization; set it high to force the algorithm to shrink some coefficients to zero.*

`get_support()` gets you all the features selected.

**The number of features whose coefficients shrunk to 0 was 4, and those were removed.**
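A sketch of the regression version; the dataset and the `alpha` value below are assumptions for illustration (the post uses `alpha=100` on its own data):

```python
# Sketch of Lasso selection for regression: Lasso (not LinearRegression)
# is imported so that the alpha penalty can be applied. alpha=1.0 is an
# assumed value, high enough to zero out some coefficients here.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel

X, y = load_diabetes(return_X_y=True)

sel_ = SelectFromModel(Lasso(alpha=1.0))
sel_.fit(X, y)

selected = sel_.get_support()
removed = np.sum(sel_.estimator_.coef_ == 0)  # shrunk exactly to zero
print('selected:', selected.sum(), 'removed:', removed)
```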