Soft K-Means Clustering

Introduction / Background

Soft K-Means Clustering is an extension of K-Means Clustering in which each point is given a probability (a "responsibility") of belonging to each cluster, rather than a hard assignment to its single nearest mean. These responsibilities are computed with the softmax function:

\[ \text{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^n e^{x_j}} \]
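As a minimal sketch (the helper name and example values below are illustrative, not from the original), softmax can be applied to the negated distances so that closer means receive higher probability:

import numpy as np

def softmax(z):
    # subtract the row-wise max before exponentiating for numerical stability
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

# probabilities of one point belonging to each of 3 means, given its distances to them
dists_to_means = np.array([0.5, 2.0, 4.0])
print(softmax(-dists_to_means))  # roughly [0.80, 0.18, 0.02]: the closest mean dominates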

Implementation

Algorithm Inputs

def soft_k_means(X, k=3, max_iterations=3, beta=1.0):

The inputs to this function are: X, an N x d matrix of data points with one row per point (shown below); k, the number of clusters; max_iterations, the maximum number of iterations to run; and beta, the inverse "temperature" that controls how soft the assignments are (larger beta behaves more like hard K-Means).

\[ \begin{bmatrix} x_{1,1} & \dots & x_{1,d} \\ x_{2,1} & \dots & x_{2,d} \\ \vdots & \ddots & \vdots \\ x_{N,1} & \dots & x_{N,d} \end{bmatrix} \]

Setup

This algorithm follows the same setup as K-Means Clustering: the k means are initialized by choosing k distinct points from X at random.
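These are the corresponding setup lines from the overall code at the end of this post: count the N points, create an empty result history, and draw k distinct rows of X as the initial means.

N, dim = X.shape
ret = []  # history of results, one entry per iteration
mus = X[np.random.choice(N, size=(k,), replace=False)]  # k distinct points of X as initial means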

Main Loop

for i in range(max_iterations):
    # 1. calculate the distance between each point and each of the k means
    # 2. calculate the probability distribution of each point belonging to each mean
    # (optionally calculate the loss)
    # 3. if the assignments did not change since the last iteration, stop early
    # 4. update each mean to be the responsibility-weighted average of all points

Loop Step 1

dists = euclidean(X, mus)
\[ \begin{bmatrix} \text{dist}_{1,1} & \dots & \text{dist}_{1,k} \\ \text{dist}_{2,1} & \dots & \text{dist}_{2,k} \\ \vdots & \ddots & \vdots \\ \text{dist}_{N,1} & \dots & \text{dist}_{N,k} \end{bmatrix} \]
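The euclidean helper computes the pairwise distances between all N points and all k means, producing the N x k matrix above. Here is a minimal sketch of such a helper, assuming it is not carried over from the K-Means post (the name matches the call above, but the body here is illustrative):

import numpy as np

def euclidean(X, mus):
    # pairwise Euclidean distances between the N rows of X and the k rows of mus
    diffs = X[:, np.newaxis, :] - mus[np.newaxis, :, :]  # shape (N, k, d)
    return np.sqrt(np.sum(diffs**2, axis=2))             # shape (N, k)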

Loop Step 2

The distances are negated and scaled by beta before the softmax, so that closer means receive higher probability; each row of r is then a probability distribution (the responsibilities) over the k means.

exps = np.exp(-beta*dists) # Nxk
r = exps/np.sum(exps, axis=1, keepdims=True) # Nxk
\[ \begin{bmatrix} e^{-\beta\,\text{dist}_{1,1}} & \dots & e^{-\beta\,\text{dist}_{1,k}} \\ e^{-\beta\,\text{dist}_{2,1}} & \dots & e^{-\beta\,\text{dist}_{2,k}} \\ \vdots & \ddots & \vdots \\ e^{-\beta\,\text{dist}_{N,1}} & \dots & e^{-\beta\,\text{dist}_{N,k}} \end{bmatrix} \]
\[ \begin{bmatrix} e^{-\beta\,\text{dist}_{1,1}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{1,m}} & \dots & e^{-\beta\,\text{dist}_{1,k}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{1,m}} \\ e^{-\beta\,\text{dist}_{2,1}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{2,m}} & \dots & e^{-\beta\,\text{dist}_{2,k}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{2,m}} \\ \vdots & \ddots & \vdots \\ e^{-\beta\,\text{dist}_{N,1}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{N,m}} & \dots & e^{-\beta\,\text{dist}_{N,k}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{N,m}} \end{bmatrix} \]
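beta controls how soft the assignments are. This illustrative snippet (the values are not from the original) shows its effect on a single point's responsibilities:

d = np.array([[0.5, 2.0, 4.0]])  # distances from one point to 3 means
for beta in (0.1, 1.0, 10.0):
    e = np.exp(-beta*d)
    print(beta, e/np.sum(e, axis=1, keepdims=True))
# small beta -> nearly uniform responsibilities; large beta -> nearly one-hot, i.e. hard K-Means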

Loop Step 3

labels = np.argmin(dists, axis=1)  # hard assignment of each point to its nearest mean
if len(ret) > 0 and (ret[-1] == labels).all():
    print(f"Early stop at index {i}")
    break
ret.append(labels)
\[ \begin{bmatrix} \text{label}_{1} \\ \text{label}_{2} \\ \vdots \\ \text{label}_{N} \end{bmatrix} \]
\[ \text{where } \text{label}_{i} \in \{0, 1, \dots, k-1\} \]

Loop Step 4

for j in range(k):
    mus[j] = r[:,j].dot(X)/np.sum(r[:,j])
Here r[:,j] is the column of responsibilities for mean j:
\[ \begin{bmatrix} e^{-\beta\,\text{dist}_{1,j}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{1,m}} \\ e^{-\beta\,\text{dist}_{2,j}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{2,m}} \\ \vdots \\ e^{-\beta\,\text{dist}_{N,j}}/\sum_{m=1}^k e^{-\beta\,\text{dist}_{N,m}} \end{bmatrix} \]
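In other words, each mean is updated to the responsibility-weighted average of all the points:

\[ \mu_j = \frac{\sum_{i=1}^{N} r_{i,j}\, x_i}{\sum_{i=1}^{N} r_{i,j}} \]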

Overall code

import numpy as np

def soft_k_means(X, k=3, max_iterations=3, beta=1.0):
    N, dim = X.shape
    ret = []  # history of (labels, loss) per iteration
    # Setup: initialize the k means as k distinct random points from X
    mus = X[np.random.choice(N, size=(k,), replace=False)]

    for i in range(max_iterations):
        # Step 1: Nxk distances between every point and every mean
        dists = euclidean(X, mus)
        # Step 2: softmax over the negated, scaled distances -> Nxk responsibilities
        exps = np.exp(-beta*dists)
        r = exps/np.sum(exps, axis=1, keepdims=True)
        # Step 3: hard labels for the early-stop check
        labels = np.argmin(dists, axis=1)
        # the line below is the optional loss calculation
        loss = sum([np.sum(dists[labels == j, j]) for j in range(k)])
        if len(ret) > 0 and (ret[-1][0] == labels).all():
            print(f"EARLY STOP AT {i}, max_iterations={max_iterations}")
            break
        ret.append((labels, loss))

        # Step 4: each mean becomes the responsibility-weighted average of all points
        for j in range(k):
            mus[j] = r[:,j].dot(X)/np.sum(r[:,j])

    return ret
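A quick usage sketch (the synthetic data and parameter values here are illustrative, not from the original):

import numpy as np

np.random.seed(0)
# three 2-D blobs around different centers
X = np.vstack([
    np.random.randn(50, 2) + [0, 0],
    np.random.randn(50, 2) + [5, 5],
    np.random.randn(50, 2) + [0, 5],
])

history = soft_k_means(X, k=3, max_iterations=20, beta=1.0)
final_labels, final_loss = history[-1]
print(f"iterations run: {len(history)}, final loss: {final_loss:.2f}")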
