Step-by-Step Python Implementation of Neural Network for Classification
Objective:
We will build a neural network to classify the classic Iris dataset, which contains 150 flower samples from three species, each described by four measurements: sepal length, sepal width, petal length, and petal width.
Step 1: Install Dependencies
Before proceeding, ensure you have the required libraries installed. If not, install them using:
pip install tensorflow scikit-learn numpy pandas matplotlib
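To confirm the installs succeeded, here is a quick sanity check you can run (the exact version numbers will vary with your environment):
# Print installed versions; any recent TensorFlow 2.x release should work
import tensorflow as tf
import sklearn
print("TensorFlow:", tf.__version__)
print("scikit-learn:", sklearn.__version__)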
Step 2: Import Libraries
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
Explanation:
- tensorflow and keras are used for building and training the neural network.
- sklearn.datasets provides the Iris dataset.
- train_test_split splits data into training and testing sets.
- StandardScaler standardizes each feature to zero mean and unit variance, which helps the network train faster and more stably.
- matplotlib.pyplot plots the training curves in Step 8.
Step 3: Load and Preprocess the Data
# Load the Iris dataset
iris = load_iris()
X = iris.data # Feature matrix
y = iris.target # Target labels
# Convert labels to categorical (one-hot encoding)
y = keras.utils.to_categorical(y, num_classes=3)
# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Normalize the data for better performance
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
Explanation:
- load_iris() loads the dataset.
- X = iris.data contains four numerical features.
- y = iris.target represents the class labels (0, 1, or 2).
- We one-hot encode y so each class becomes a vector (e.g., [1,0,0] for class 0); see the short demo after this list.
- The dataset is split into training (80%) and testing (20%).
- StandardScaler() standardizes each feature (zero mean, unit variance). Note that it is fit on the training set only and then applied to the test set, so no test-set information leaks into training.
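To make these two preprocessing steps concrete, here is a short demo of what they produce; it assumes the code above has already run:
# One-hot encoding turns each integer label into a 3-element vector
print(keras.utils.to_categorical([0, 1, 2], num_classes=3))
# -> [[1. 0. 0.]
#     [0. 1. 0.]
#     [0. 0. 1.]]

# After standardization, each training feature has roughly zero mean
# and unit variance: z = (x - mean) / std
print(X_train.mean(axis=0).round(2))  # ~ [0. 0. 0. 0.]
print(X_train.std(axis=0).round(2))   # ~ [1. 1. 1. 1.]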
Step 4: Build the Neural Network Model
# Define the neural network model
model = Sequential([
    Dense(10, activation='relu', input_shape=(X_train.shape[1],)),  # First hidden layer; input_shape declares the 4 input features
    Dense(8, activation='relu'),    # Second hidden layer
    Dense(3, activation='softmax')  # Output layer: 3 neurons, one per class
])
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Print model summary
model.summary()
Explanation:
- We use the Sequential API to define our model.
- First hidden layer: Dense(10, activation='relu') with 10 neurons; input_shape=(4,) tells Keras to expect the four Iris features (the input layer itself is implicit).
- Second hidden layer: Dense(8, activation='relu') adds capacity for learning non-linear combinations of the features.
- Output Layer: Dense(3, activation='softmax') (since we have 3 classes).
- Activation Functions:
- ReLU (Rectified Linear Unit) is used for hidden layers to handle non-linearity.
- Softmax is used for the output layer to turn raw scores into class probabilities (see the worked example after this list).
- Loss Function:
- categorical_crossentropy is used for multi-class classification.
- Optimizer:
- Adam is chosen because it adapts the learning rate for each parameter automatically, which usually gives fast, stable convergence.
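To see what softmax and categorical cross-entropy actually compute, here is a hand-worked NumPy illustration for a single sample; it mirrors the math, not Keras's internal (more numerically stable) implementation:
# Raw scores (logits) produced by the last layer before softmax
logits = np.array([2.0, 1.0, 0.1])

# Softmax: exponentiate, then normalize so the outputs sum to 1
probs = np.exp(logits) / np.sum(np.exp(logits))
print(probs.round(3))  # [0.659 0.242 0.099]

# Categorical cross-entropy against the one-hot true label [1, 0, 0]:
# loss = -sum(y_true * log(probs)) = -log(probability of the true class)
y_true = np.array([1.0, 0.0, 0.0])
print(round(-np.sum(y_true * np.log(probs)), 3))  # -log(0.659) ≈ 0.417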
Step 5: Train the Model
# Train the model
history = model.fit(X_train, y_train, epochs=100, batch_size=10, validation_data=(X_test, y_test))
Explanation:
- epochs=100 means the model sees the entire training set 100 times.
- batch_size=10 updates the weights after every 10 samples.
- validation_data=(X_test, y_test) tracks performance on held-out data after each epoch. Reusing the test set for validation is convenient in a tutorial but slightly optimistic; see the alternative sketched below.
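A common alternative, sketched below with an arbitrary 10% split, lets Keras hold out part of the training data instead, so the test set stays untouched until the final evaluation:
# Alternative: carve a validation set out of the training data
history = model.fit(X_train, y_train,
                    epochs=100,
                    batch_size=10,
                    validation_split=0.1)  # last 10% of X_train/y_train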
Step 6: Evaluate the Model
# Evaluate the model on test data
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {accuracy:.4f}")
Explanation:
- evaluate() calculates the test loss and accuracy.
- The higher the accuracy, the better the model generalizes to unseen data; a per-class breakdown is sketched below.
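Overall accuracy can mask mistakes concentrated in one class. A short sketch using scikit-learn's classification_report for per-class precision and recall; it assumes the model and data from the steps above:
from sklearn.metrics import classification_report

# Convert one-hot labels and predicted probabilities back to class indices
y_true = np.argmax(y_test, axis=1)
y_pred = np.argmax(model.predict(X_test), axis=1)
print(classification_report(y_true, y_pred, target_names=iris.target_names))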
Step 7: Make Predictions
# Make a prediction on a sample
sample = np.array([X_test[0]]) # Taking one test sample
prediction = model.predict(sample)
predicted_class = np.argmax(prediction)
print(f"Predicted class: {iris.target_names[predicted_class]}")
Explanation:
- We take one test sample and pass it through the trained model.
- predict() returns probabilities for each class.
- argmax() selects the class with the highest probability. The same steps apply to brand-new measurements, as sketched below.
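For a brand-new measurement there is one easy-to-miss step: the input must be standardized with the scaler fitted earlier. A minimal sketch using illustrative sepal/petal values (in cm):
# Illustrative measurement: sepal length, sepal width, petal length, petal width
new_flower = np.array([[5.1, 3.5, 1.4, 0.2]])

# Reuse the scaler fitted on the training data -- do not refit it here
new_flower_scaled = scaler.transform(new_flower)

prediction = model.predict(new_flower_scaled)
print(f"Predicted class: {iris.target_names[np.argmax(prediction)]}")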
Step 8: Visualize Training Performance
# Plot training & validation accuracy
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.title('Model Accuracy')
plt.show()
Explanation:
- We plot the training and validation accuracy over time.
- Training and validation accuracy should both rise over the epochs; a widening gap between the two curves is a classic sign of overfitting (compare with the loss curves sketched below).
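The loss curves are a useful companion view: overfitting typically shows up as training loss continuing to fall while validation loss flattens or rises. A sketch using the same history object:
# Plot training & validation loss
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.title('Model Loss')
plt.show()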
Key Takeaways
- Neural networks can be used for multi-class classification problems.
- Data preprocessing (normalization, one-hot encoding) is crucial for better performance.
- The Sequential API makes it easy to build deep learning models.
- ReLU activation is preferred for hidden layers, while Softmax is used for multi-class output.
- Cross-entropy loss is ideal for classification problems.
- Batch size and epochs need tuning for optimal performance (the EarlyStopping sketch below removes the need to hand-pick the epoch count).
- Adam optimizer is widely used for efficient learning.
- Evaluation metrics (accuracy, loss) help assess model performance.
- Visualization of training curves helps diagnose underfitting or overfitting.
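Rather than hand-picking the epoch count, one common option is Keras's EarlyStopping callback; a minimal sketch (the patience of 10 epochs is an arbitrary choice):
from tensorflow.keras.callbacks import EarlyStopping

# Stop once validation loss has not improved for 10 consecutive epochs,
# then restore the best weights seen during training
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True)
history = model.fit(X_train, y_train,
                    epochs=200,  # upper bound; training may stop earlier
                    batch_size=10,
                    validation_split=0.1,
                    callbacks=[early_stop])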
Next Blog: Python Implementation of Neural Network for Regression