Random Forest for Regression
We can also use Random Forest for regression tasks. Below is an example using the California Housing dataset.
Step 1: Import Required Libraries
Before implementing Random Forest, import the necessary Python libraries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.datasets import fetch_california_housing
from sklearn.metrics import mean_squared_error
Step 2: Load and Explore the Dataset
In this step we load the data, which in practice might come from an open repository such as Kaggle or the UCI Machine Learning Repository, and explore it to understand the features and their importance.
We will use the California Housing dataset, which ships with scikit-learn, as the regression example.
housing = fetch_california_housing()
df_housing = pd.DataFrame(housing.data, columns=housing.feature_names)
df_housing['target'] = housing.target
df_housing.head()  # print the top 5 rows of the dataset
OUTPUT:
MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude target
0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88 -122.23 4.526
1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86 -122.22 3.585
2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85 -122.24 3.521
3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85 -122.25 3.413
4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85 -122.25 3.422
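If you want a quick feel for the data before modeling, a minimal exploration sketch such as the following can help (describe() and isnull() are standard pandas methods on the df_housing frame created above):
# Quick exploration sketch: summary statistics and a missing-value check
print(df_housing.describe())        # count, mean, std, min/max per column
print(df_housing.isnull().sum())    # the California Housing data has no missing values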
Step 3: Split the Data
In this step we split the dataset into training and testing sets. Feature scaling is not required here, since tree-based models such as Random Forest are insensitive to the scale of the features.
# Split into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(
    df_housing.drop(columns=['target']),
    df_housing['target'],
    test_size=0.2,
    random_state=42
)
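As an optional sanity check, you can print the shapes of the resulting splits; with the 20,640-row California Housing data an 80/20 split should give roughly 16,512 training and 4,128 test rows.
# Optional sanity check on the split sizes
print(X_train.shape, X_test.shape)   # expected: (16512, 8) and (4128, 8)
print(y_train.shape, y_test.shape)   # expected: (16512,) and (4128,)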
Step 4: Initialize Random Forest Regressor
We create an instance of RandomForestRegressor with 100 trees.
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)
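Beyond n_estimators and random_state, RandomForestRegressor exposes several other hyperparameters that are often worth tuning. The values below are illustrative choices only, not tuned recommendations, and the rf_regressor_tuned name is introduced just for this sketch.
# Illustrative (not tuned) hyperparameter choices for RandomForestRegressor
rf_regressor_tuned = RandomForestRegressor(
    n_estimators=200,       # number of trees in the forest
    max_depth=None,         # grow each tree until its leaves are pure
    min_samples_leaf=2,     # require at least 2 samples per leaf to smooth predictions
    n_jobs=-1,              # build trees in parallel on all CPU cores
    random_state=42
)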
Step 5: Train the Model
Train the model using the training data (X_train, y_train).
rf_regressor.fit(X_train, y_train)
OUTPUT:
RandomForestRegressor(random_state=42)
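Random Forest can also estimate its generalization error without a separate validation set by using out-of-bag (OOB) samples. The sketch below refits a copy of the model with oob_score=True; rf_oob is a name introduced only for this example.
# Sketch: out-of-bag (OOB) estimate of generalization performance
rf_oob = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=42)
rf_oob.fit(X_train, y_train)
print(f'OOB R^2 score: {rf_oob.oob_score_:.3f}')   # R^2 computed on out-of-bag samples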
Step 6: Predict on Test Data
Now that the model is trained, we use it to make predictions on the test features (X_test) and compare them with the true values (y_test).
y_pred_reg = rf_regressor.predict(X_test)
print(y_pred_reg)
OUTPUT:
[0.5095 0.74161 4.9232571 ... 4.7582187 0.71409 1.65083 ]
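Since matplotlib is already imported in Step 1, an optional predicted-versus-actual scatter plot is a quick way to see how closely the predictions track the true values; this visualization is a sketch, not part of the original pipeline.
# Optional visualization: predicted vs. actual target values
plt.figure(figsize=(6, 6))
plt.scatter(y_test, y_pred_reg, alpha=0.3)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')   # perfect-prediction line
plt.xlabel('Actual median house value')
plt.ylabel('Predicted median house value')
plt.title('Random Forest: predicted vs. actual')
plt.show()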
Step 7: Evaluate Performance
We evaluate the regressor with Mean Squared Error (MSE), the average of the squared differences between predicted and actual values; lower values indicate better predictions.
mse = mean_squared_error(y_test, y_pred_reg)
print(f'Mean Squared Error: {mse:.2f}')
OUTPUT:
Mean Squared Error: 0.26
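MSE is only one way to summarize regression error. A sketch of a few complementary scikit-learn metrics (RMSE, MAE and R^2) is shown below; the extra names come from sklearn.metrics.
from sklearn.metrics import mean_absolute_error, r2_score

rmse = mean_squared_error(y_test, y_pred_reg) ** 0.5   # root mean squared error, in target units
mae = mean_absolute_error(y_test, y_pred_reg)          # average absolute prediction error
r2 = r2_score(y_test, y_pred_reg)                      # fraction of target variance explained
print(f'RMSE: {rmse:.2f}, MAE: {mae:.2f}, R^2: {r2:.2f}')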
Key Takeaways
- Random Forest Regressor is an ensemble method using multiple decision trees.
- Works well for continuous data prediction like house prices.
- Reduces overfitting by averaging multiple trees.
- Mean Squared Error (MSE) quantifies the average squared prediction error; lower values mean better predictions.
- Scalable & robust for real-world regression tasks.
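As a closing illustration, the trained forest's feature_importances_ attribute shows which features drive the predictions; the short sketch below plots them with the seaborn import from Step 1 (this plot is an addition, not part of the original walkthrough).
# Sketch: impurity-based feature importances of the trained forest
importances = pd.Series(rf_regressor.feature_importances_, index=housing.feature_names).sort_values(ascending=False)
sns.barplot(x=importances.values, y=importances.index)
plt.xlabel('Importance')
plt.title('Random Forest feature importances')
plt.show()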
Next Blog: Gradient Boosting in Machine Learning