
Python implementation of Lasso Regression

Step 1: Import the Necessary Libraries

First, we import the required libraries.

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

# Import numpy and pandas for data handling
import numpy as np
import pandas as pd

# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Scikit-learn libraries for preprocessing and modeling
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

Step 2: Generate or Load the Dataset

We generate a synthetic dataset for Lasso Regression: X values drawn uniformly between 0 and 15, and Y values that follow a linear relationship with X plus Gaussian noise.

# Generating a dataset
np.random.seed(0)
X = np.random.rand(300, 1) * 15  # 300 random data points scaled by 15
Y = 3 * X + 2 + np.random.randn(300, 1) * 3  # Linear relation with noise

# Convert the data to a DataFrame for easier handling
data = pd.DataFrame({'X': X.flatten(), 'Y': Y.flatten()})
data.head()

Output:

           X          Y
0   8.232203  27.166127
1  10.727840  34.880065
2   9.041451  27.332404
3   8.173248  25.805978
4   6.354822  16.792283

Step 3: Visualize the Data

We visualize the relationship between X and Y to understand the pattern.

plt.scatter(data['X'], data['Y'], color='blue', label='Data')
plt.title("Scatter Plot of X vs y")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()


Step 4: Train-Test Split

Now, we split the data into training and testing sets.

X_train, X_test, Y_train, Y_test = train_test_split(data[['X']], data['Y'], test_size=0.2, random_state=42)

print("Training set size:", X_train.shape[0])
print("Testing set size:", X_test.shape[0])

Output:

Training set size: 240
Testing set size: 60

Step 5: Feature Scaling

Since Lasso Regression is sensitive to feature scaling, we scale the features using StandardScaler.

# Initialize the scaler
scaler = StandardScaler()

# Scale the training and test data
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Display the first 5 rows of scaled training data
print("Scaled Training Data:\n", X_train_scaled[:5])

Output:

Scaled Training Data:
 [[ 0.02403683]
 [-0.91675579]
 [-0.252981  ]
 [ 0.35312163]
 [-1.69304482]]
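
To make the scaling step concrete, the same values can be reproduced by hand: StandardScaler subtracts the training-set mean and divides by the training-set standard deviation. The following is a minimal sketch that reuses the X_train and scaler objects created above; it should reproduce the first rows of X_train_scaled up to floating-point precision.

# StandardScaler computes z = (x - mean) / std, using statistics learned from the training set
manual_scaled = (X_train - scaler.mean_) / scaler.scale_
print("Manually scaled (first 5 rows):\n", manual_scaled[:5].values)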

Step 6: Fit the Lasso Regression Model

We train the Lasso Regression model on the scaled data.

# Initialize the Lasso model with alpha (regularization strength)
lasso_model = Lasso(alpha=0.1)

# Fit the model
lasso_model.fit(X_train_scaled, Y_train)

# Print the coefficients and intercept
print("Model Coefficients:", lasso_model.coef_)
print("Intercept:", lasso_model.intercept_)

Output:

Model Coefficients: [12.80212688]
Intercept: 24.567883271133404
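
The alpha parameter sets the regularization strength: larger values shrink the coefficient more aggressively toward zero, and a large enough alpha drives it to exactly zero. The following minimal sketch refits the model on the same scaled training data for a few candidate alphas to illustrate this (the exact numbers depend on the data):

# Compare how the learned coefficient shrinks as alpha increases
for alpha in [0.01, 0.1, 1.0, 10.0, 50.0]:
    model = Lasso(alpha=alpha)
    model.fit(X_train_scaled, Y_train)
    print(f"alpha={alpha}: coefficient={model.coef_[0]:.4f}")

In practice, alpha is usually tuned rather than fixed by hand, for example with cross-validation (scikit-learn provides LassoCV for this).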

Step 7: Evaluate the Model

We now make predictions on the test set and evaluate the model’s performance using R², Mean Absolute Error (MAE), and Mean Squared Error (MSE).

# Make predictions
Y_pred = lasso_model.predict(X_test_scaled)

# Evaluate the model
r2 = r2_score(Y_test, Y_pred)
mae = mean_absolute_error(Y_test, Y_pred)
mse = mean_squared_error(Y_test, Y_pred)

print("R² Score:", r2)
print("Mean Absolute Error:", mae)
print("Mean Squared Error:", mse)

Output:

R² Score: 0.9379259627793001
Mean Absolute Error: 2.4745583747562288
Mean Squared Error: 10.05613859213474
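
For reference, these metrics follow directly from their definitions: R² = 1 - SS_res / SS_tot, MAE is the mean of the absolute errors, and MSE is the mean of the squared errors. The sketch below recomputes them with NumPy from the Y_test and Y_pred arrays above and should match the scikit-learn values up to floating-point precision.

# Recompute the metrics from their definitions
errors = Y_test - Y_pred
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((Y_test - Y_test.mean()) ** 2)

print("R² (manual):", 1 - ss_res / ss_tot)
print("Mean Absolute Error (manual):", np.mean(np.abs(errors)))
print("Mean Squared Error (manual):", np.mean(errors ** 2))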

Step 8: Visualize the Predictions

We compare the true values and predicted values using a scatter plot.

# Visualizing the true vs predicted values
plt.scatter(X_test, Y_test, color='blue', label='True Values')
plt.scatter(X_test, Y_pred, color='red', label='Predicted Values')
plt.title("True vs Predicted Values")
plt.xlabel("X")
plt.ylabel("Y")
plt.legend()
plt.show()


Step 9: Visualize the Lasso Regression Line

Finally, we draw the fitted Lasso Regression line across the full range of X, on top of the complete dataset, to see how well the model captures the trend.

# Build a smooth range of X values covering the whole dataset, scale them, and predict
x_line = pd.DataFrame({'X': np.linspace(data['X'].min(), data['X'].max(), 100)})
y_line = lasso_model.predict(scaler.transform(x_line))

# Plot the Lasso Regression Line over the data
plt.scatter(data['X'], data['Y'], color='blue', label='Data')
plt.plot(x_line['X'], y_line, color='red', label='Lasso Regression Line')
plt.title("Lasso Regression Line")
plt.xlabel("X")
plt.ylabel("Y")
plt.legend()
plt.show()


Summary of Outputs:

  1. Scatter Plot of X vs Y: Visualizes the roughly linear relationship in the data.
  2. Training and Testing Split Sizes: Confirms the split ratio between training and testing datasets.
  3. Scaled Features: Shows the scaled features ready for model fitting.
  4. Model Coefficients and Intercept: Highlights the learned parameters of the Lasso regression model.
  5. R², MAE, and MSE Scores: Evaluates the model's predictive accuracy.
  6. True vs Predicted Values: Plots the comparison between actual and predicted values.
  7. Lasso Regression Line: Illustrates the fitted Lasso regression line.

This completes the Python implementation of Lasso Regression. The model captures the underlying linear relationship well, and the evaluation metrics (R² ≈ 0.94 on the test set) confirm a good fit.

