Machine Learning February 02, 2025

Random Forest for Regression

Random Forest can also be used for regression tasks. Below is an example using the California Housing dataset, loaded through scikit-learn's fetch_california_housing.

Step 1: Import Required Libraries

Before implementing Random Forest, import the necessary Python libraries.


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

Step 2: Load and Explore the Dataset

In this step we load the data and explore it to understand the features. Datasets for practice are often taken from open repositories such as Kaggle or the UCI Machine Learning Repository.

Here we use the California Housing dataset as the regression example. The target column is the median house value of each district.


housing = fetch_california_housing()
df_housing = pd.DataFrame(housing.data, columns=housing.feature_names)
df_housing['target'] = housing.target

df_housing.head() #print top 5 rows of the dataset
OUTPUT:
	MedInc	HouseAge	AveRooms	AveBedrms	Population	AveOccup	Latitude	Longitude	target
0	8.3252	41.0	6.984127	1.023810	322.0	2.555556	37.88	-122.23	4.526
1	8.3014	21.0	6.238137	0.971880	2401.0	2.109842	37.86	-122.22	3.585
2	7.2574	52.0	8.288136	1.073446	496.0	2.802260	37.85	-122.24	3.521
3	5.6431	52.0	5.817352	1.073059	558.0	2.547945	37.85	-122.25	3.413
4	3.8462	52.0	6.281853	1.081081	565.0	2.181467	37.85	-122.25	3.422
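
Beyond head(), a per-column summary is a quick way to check data types, missing values, and value ranges. The sketch below defines a small helper called summarize (a hypothetical name, not part of pandas or scikit-learn) and assumes every column is numeric, as is the case for this dataset. It runs on a tiny stand-in frame built from the first rows shown above so that it works without downloading anything; passing df_housing instead should summarize the full data.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing-value count, min, mean, max.

    Assumes all columns are numeric (hypothetical helper for illustration).
    """
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "min": df.min(),
        "mean": df.mean(),
        "max": df.max(),
    })


# Tiny stand-in frame: three columns from the first three rows of the output above
demo = pd.DataFrame({
    "MedInc": [8.3252, 8.3014, 7.2574],
    "HouseAge": [41.0, 21.0, 52.0],
    "target": [4.526, 3.585, 3.521],
})

summary = summarize(demo)
print(summary)
```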

Step 3: Split the Data

In this step we split the dataset into training and testing sets. No feature scaling is applied here: tree-based models such as Random Forest split on feature thresholds, so they are not sensitive to the scale of the features.

# Split into training and testing sets (80% train, 20% test)
X = df_housing.drop(columns=['target'])
y = df_housing['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
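
It can help to confirm the split sizes before training. The sketch below uses placeholder arrays of zeros with the same shape as the California Housing data (20,640 rows and 8 features) instead of the real values, which is an assumption made only so the check runs without a download.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same shape as California Housing (not the real values)
X_demo = np.zeros((20640, 8))
y_demo = np.zeros(20640)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# With test_size=0.2, 20% of the rows go to the test set
print(X_tr.shape, X_te.shape)
```

With these sizes, 16,512 rows are used for training and 4,128 are held out for testing.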

Step 4: Initialize Random Forest Regressor

We create an instance of RandomForestRegressor with 100 trees (n_estimators=100) and a fixed random_state so that the results are reproducible.

rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)

Step 5: Train the Model

Train the model using the training data (X_train, y_train).

rf_regressor.fit(X_train, y_train)

OUTPUT:
RandomForestRegressor(random_state=42)
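
Once a forest is fitted, its feature_importances_ attribute gives a rough view of which features the trees relied on. The sketch below is an illustration on synthetic data from make_regression (a stand-in, not the housing data), with made-up column names feature_0 to feature_4. The values are impurity-based importances, which sum to 1.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: 5 features, only 2 of which drive the target
X_syn, y_syn = make_regression(
    n_samples=300, n_features=5, n_informative=2, random_state=42
)
X_syn = pd.DataFrame(X_syn, columns=[f"feature_{i}" for i in range(5)])

model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X_syn, y_syn)

# Impurity-based importances, one value per feature, sorted from high to low
importances = pd.Series(
    model.feature_importances_, index=X_syn.columns
).sort_values(ascending=False)
print(importances)
```

For the model in this post, the same two lines at the end should work with rf_regressor and X_train.columns in place of model and X_syn.columns.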

Step 6: Predict on Test Data

Now that the model is trained, we use it to predict the target for the unseen test features (X_test). The predictions will be compared with the actual values (y_test) in the next step.

y_pred_reg = rf_regressor.predict(X_test)
print(y_pred_reg)

OUTPUT:
[0.5095    0.74161   4.9232571 ... 4.7582187 0.71409   1.65083  ]
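
A raw array of predictions is hard to judge on its own, so it is often placed next to the actual values. The sketch below defines a hypothetical helper, compare_predictions, and runs it on a few made-up numbers chosen only for illustration; they are not results from the model above.

```python
import numpy as np
import pandas as pd


def compare_predictions(y_true, y_pred) -> pd.DataFrame:
    """Put actual and predicted values side by side with their residuals.

    Hypothetical helper for illustration; residual = actual - predicted.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return pd.DataFrame({
        "actual": y_true,
        "predicted": y_pred,
        "residual": y_true - y_pred,
    })


# Made-up values, not output of the trained model
comparison = compare_predictions([0.5, 1.0, 5.0], [0.5, 0.75, 4.5])
print(comparison)
```

With the variables from this post, the call would be compare_predictions(y_test, y_pred_reg).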

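Step 7: Evaluate Performance

We evaluate the predictions with Mean Squared Error (MSE), the average of the squared differences between the actual and predicted values. Lower is better.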

mse = mean_squared_error(y_test, y_pred_reg)
print(f'Mean Squared Error: {mse:.2f}')
OUTPUT:
Mean Squared Error: 0.26

Key Takeaways

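  • Random Forest Regressor is an ensemble method that combines the predictions of many decision trees.
  • It works well for predicting continuous values such as house prices.
  • Averaging the predictions of many trees reduces the overfitting of any single tree.
  • Mean Squared Error (MSE) measures prediction error; lower values mean better predictions.
  • It is scalable, because the trees can be trained in parallel, and robust for real-world regression tasks.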
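
The averaging mentioned in the takeaways can be checked directly: the fitted trees of a RandomForestRegressor are available in its estimators_ attribute, and the forest's prediction is the mean of their individual predictions. The sketch below verifies this on synthetic data from make_regression, used here as a stand-in for the housing data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data
X_avg, y_avg = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

forest = RandomForestRegressor(n_estimators=25, random_state=0)
forest.fit(X_avg, y_avg)

# One row of predictions per tree: shape (n_trees, n_samples)
tree_preds = np.stack([tree.predict(X_avg) for tree in forest.estimators_])

# Averaging across trees should reproduce the forest's own prediction
manual_average = tree_preds.mean(axis=0)
matches = np.allclose(manual_average, forest.predict(X_avg))
print(matches)
```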

    Next Blog: Gradient Boosting in Machine Learning
Purnima