Applying Clustering

This section continues the example presented in the previous section on K-means. In addition to discussing code for implementing agglomerative clustering, it also includes applications of various accuracy measures useful for analyzing clutering performance. 

Affinity Propagation

Affinity propagation is another example of a clustering algorithm. As opposed to K-means, this approach does not require us to set the number of clusters beforehand. The main idea here is that we would like to cluster our data based on the similarity of the observations (or how they "correspond" to each other).

Let's define a similarity metric such that s(x_i, x_j) > s(x_i, x_k) if an observation x_i is more similar to observation x_j and less similar to observation x_k. A simple example of such a similarity metric is a negative square of distance s(x_i,x_j)=−||x_i−x_j||^2.

Now, let's describe "correspondence" by making two zero matrices. One of them, r_{i,k}, determines how well the kth observation is as a "role model" for the ith observation with respect to all other possible "role models". Another matrix, a_{i,k} determines how appropriate it would be for ith observation to take the kth observation as a "role model". This may sound confusing, but it becomes more understandable with some hands-on practice.

The matrices are updated sequentially with the following rules:

r_{i,k} \leftarrow s_(x_i, x_k) - \max_{k' \neq k} \left\{ a_{i,k'} + s(x_i, x_k') \right\}

a_{i,k} \leftarrow \min \left( 0, r_{k,k} + \sum_{i' \not\in \{i,k\}} \max(0, r_{i',k}) \right), \ \ \  i \neq k

a_{k,k} \leftarrow \sum_{i' \neq k} \max(0, r_{i',k})