Unsupervised Learning January 01, 2025

Step-wise Python Implementation of LLE

Let's use the Iris dataset for this demonstration. Iris is a well-known machine learning dataset of 150 iris flowers, each described by 4 features (sepal length, sepal width, petal length, and petal width) and labeled with one of 3 classes (species).
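Before going further, it can help to confirm these numbers directly. The short sketch below only inspects the dataset as shipped with scikit-learn; nothing in it is specific to LLE.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)                       # (150, 4): 150 flowers, 4 features
print(iris.feature_names)            # names of the four measurements
print(iris.target_names.tolist())    # ['setosa', 'versicolor', 'virginica']
print(np.bincount(y).tolist())       # [50, 50, 50]: 50 flowers per species
```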

Step 1: Import Necessary Libraries

# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

Step 2: Load and Prepare the Dataset

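We'll use the Iris dataset, which is available directly from sklearn.datasets. We will also standardize the features before applying LLE: the algorithm finds each point's neighbors by Euclidean distance, so features measured on larger scales would otherwise dominate the neighbor search.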

# Load Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Labels

# Standardize the features (important for LLE)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
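As a quick sanity check, the sketch below verifies that standardization did what we expect: every feature should end up with mean 0 and standard deviation 1. The variable names feature_means and feature_stds are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

X = load_iris().data
X_scaled = StandardScaler().fit_transform(X)

# Per-feature statistics after scaling (StandardScaler uses the population std)
feature_means = X_scaled.mean(axis=0)
feature_stds = X_scaled.std(axis=0)

print("means:", np.round(feature_means, 6))
print("stds: ", np.round(feature_stds, 6))
```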

Step 3: Apply Locally Linear Embedding (LLE)

We now apply LLE. We set n_neighbors=10, the number of nearest neighbors used to reconstruct each point, and n_components=2, which reduces the data to 2 dimensions for visualization.

# Apply Locally Linear Embedding (LLE)
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
X_lle = lle.fit_transform(X_scaled)

# X_lle now contains the data projected into 2D
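The choice n_neighbors=10 is a starting point, not a rule. The sketch below refits LLE for a few neighbor counts and prints the fitted model's reconstruction_error_ attribute. The candidate values (5, 10, 20, 30) are arbitrary picks for illustration, and the error is only a rough diagnostic; it is usually worth looking at the resulting plots as well before settling on a value.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(load_iris().data)

errors = {}
for k in (5, 10, 20, 30):  # illustrative candidates, not recommended values
    lle_k = LocallyLinearEmbedding(n_neighbors=k, n_components=2, random_state=42)
    embedding = lle_k.fit_transform(X_scaled)
    errors[k] = lle_k.reconstruction_error_
    print(f"n_neighbors={k:2d}  shape={embedding.shape}  "
          f"reconstruction_error={errors[k]:.3e}")
```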

Step 4: Visualize the Result

Now, we can plot the 2D projection of the Iris dataset after applying LLE. We'll use the matplotlib library to visualize the low-dimensional representation of the data.

# Plot the 2D projection of the data
plt.figure(figsize=(8, 6))

# Scatter plot of the low-dimensional embedding
plt.scatter(X_lle[:, 0], X_lle[:, 1], c=y, cmap='viridis', edgecolors='k', s=100)
plt.colorbar()
plt.title('Locally Linear Embedding of the Iris Dataset')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.show()

Step 5: Output and Interpretation

  • The scatter plot shows the Iris dataset in a 2D space where each point represents a flower, color-coded by species.
  • What do we see?
    • LLE preserves local neighborhood relationships, so points that are close in the original 4D feature space (often flowers of the same species) should remain close together in the 2D representation.
    • If flowers of the same species form compact groups in the plot, that suggests LLE has reduced the dataset from 4D to 2D while keeping much of the local structure intact. The plot alone is a visual check rather than proof.
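To put a number on "local structure preserved", one option is scikit-learn's trustworthiness score, which ranges from 0 to 1 and measures how far the nearest neighbors in the embedding are also neighbors in the original space. This metric is not part of the walkthrough above; it is added here as one possible check, with n_neighbors=10 chosen to match the LLE setting.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import LocallyLinearEmbedding, trustworthiness
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(load_iris().data)
X_lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2).fit_transform(X_scaled)

# Values closer to 1 mean neighborhoods in 2D better match those in 4D
score = trustworthiness(X_scaled, X_lle, n_neighbors=10)
print(f"trustworthiness (k=10): {score:.3f}")
```

A score well below 1 would be a hint to try a different n_neighbors or another embedding method.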

Complete Code:

# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Load Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Labels

# Standardize the features (important for LLE)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Apply Locally Linear Embedding (LLE)
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
X_lle = lle.fit_transform(X_scaled)

# Plot the 2D projection of the data
plt.figure(figsize=(8, 6))

# Scatter plot of the low-dimensional embedding
plt.scatter(X_lle[:, 0], X_lle[:, 1], c=y, cmap='viridis', edgecolors='k', s=100)
plt.colorbar()
plt.title('Locally Linear Embedding of the Iris Dataset')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.show()

Real-Life Use Case

Application in Image Processing: In real-world applications, LLE can be used in image processing where high-dimensional pixel data needs to be reduced to lower dimensions for visualization or further analysis. For example, images of objects or faces often lie on a manifold in high-dimensional space, and LLE can help to visualize these in 2D or 3D by preserving the local geometric structure of the data.
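As a small illustration of the image use case, the sketch below applies LLE to scikit-learn's handwritten digits dataset, where each 8x8 image is a 64-dimensional pixel vector. The dataset, the restriction to 6 digit classes, n_neighbors=30, and the dense eigen-solver are all assumptions made for this example, not settings taken from the walkthrough above.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding

# Digits 0-5 only, to keep the example fast; each row is a flattened 8x8 image
digits = load_digits(n_class=6)
X_img, y_img = digits.data, digits.target
print("input shape:", X_img.shape)  # (n_images, 64)

# The dense solver is deterministic and fine for a dataset of this size
lle_img = LocallyLinearEmbedding(n_neighbors=30, n_components=2, eigen_solver="dense")
X_img_2d = lle_img.fit_transform(X_img)
print("embedded shape:", X_img_2d.shape)  # (n_images, 2)
```

The 2D coordinates in X_img_2d can be passed to the same plt.scatter call used for Iris, with c=y_img for the digit labels.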

Expected Output

  • Plot Output:
    • You should see a scatter plot where each point corresponds to an iris flower, colored by species. Because LLE is unsupervised, any grouping by species comes from the measurements alone, not from the labels.
    • The reduction from 4 dimensions to 2 lets us inspect visually how well LLE preserved the local structure of the data in the lower-dimensional space.

Conclusion

This step-by-step guide demonstrated how to apply Locally Linear Embedding (LLE) to the Iris dataset using Python. LLE reduces the dataset from 4D to 2D while aiming to preserve local structure, which can be useful for visualization or as a preprocessing step for other machine learning tasks.

Purnima