PCA (principal component analysis) is an algorithm used to find the principal components of a dataset.
Principal components are the directions along which the data varies the most, that is, the directions in which it is most spread out.

Eigenvectors and Eigenvalues

The covariance matrix of a dataset can be decomposed into eigenvectors and eigenvalues.
Every eigenvector has a corresponding eigenvalue.
An eigenvector is a direction, while its eigenvalue is a number that tells how much variance the data has along that direction.
The eigenvector with the highest eigenvalue is therefore the first principal component.
The number of eigenvector/eigenvalue pairs equals the number of dimensions (features) of the dataset, so the four-feature iris dataset used below has four of them.
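
The relationship between eigenvectors, eigenvalues, and variance can be checked by hand with NumPy. The following is a minimal sketch, not how any particular library implements PCA: the five 2-D points are made-up illustrative values, and the variable names are chosen only for this example.

```python
import numpy as np

# Small made-up 2-D dataset (illustrative values only); rows are samples.
data = np.array([
    [2.0, 1.9],
    [0.0, 0.1],
    [-2.0, -2.1],
    [1.0, 1.2],
    [-1.0, -1.1],
])

# Center each feature, then build the covariance matrix (features x features).
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort from largest to smallest eigenvalue so the principal component comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each column of `eigenvectors` is a direction; the first one has the most variance.
principal_component = eigenvectors[:, 0]

# The variance of the data projected onto that direction equals its eigenvalue.
projected_variance = np.var(centered @ principal_component, ddof=1)

print("eigenvalues:", eigenvalues)
print("principal component:", principal_component)
print("variance along it:", projected_variance)
```

There are two eigenvalues because the data has two dimensions, and the variance measured along the first eigenvector matches the largest eigenvalue.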

import numpy as np  
import matplotlib.pyplot as plt  
from mpl_toolkits.mplot3d import Axes3D  
import pandas as pd

from sklearn import decomposition  
from sklearn import datasets


iris = datasets.load_iris()
X = iris.data
y = iris.target
# Fit PCA on the data, then project it onto the first three principal components
pca = decomposition.PCA(n_components=3)
X = pca.fit_transform(X)

fig = plt.figure(1, figsize=(11, 8))
# Create the 3D axes through the figure so they are registered with it
ax = fig.add_axes([0, 0, .95, 1], projection='3d')
ax.view_init(elev=48, azim=134)


for name, label in [('Setosa', 0), ('Versicolour', 1), ('Virginica', 2)]:  
    ax.text3D(X[y == label, 0].mean(),
              X[y == label, 1].mean() + 1.5,
              X[y == label, 2].mean(), name,
              bbox=dict(alpha=.5, edgecolor='w', facecolor='w'))
# Reorder the labels to have colors matching the cluster results
y = np.choose(y, [1, 2, 0]).astype(float)
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=y, cmap=plt.cm.nipy_spectral)
plt.show()
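
To see how much of the spread each principal component accounts for, the fitted scikit-learn `PCA` object exposes `explained_variance_` and `explained_variance_ratio_`. This is a small standalone sketch that fits its own `PCA` on the iris data rather than reusing the variables from the plotting script.

```python
from sklearn import datasets, decomposition

iris = datasets.load_iris()

# Keep three components, as in the plotting example.
pca = decomposition.PCA(n_components=3)
pca.fit(iris.data)

# explained_variance_ holds the variance along each kept component;
# explained_variance_ratio_ is that variance as a share of the total.
variances = pca.explained_variance_
ratios = pca.explained_variance_ratio_

for i, (var, ratio) in enumerate(zip(variances, ratios), start=1):
    print(f"component {i}: variance={var:.3f}, share={ratio:.1%}")
```

The components come back ordered from most to least variance, so the first line printed describes the first principal component.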

