Clustering

Affinity Propagation

Affinity propagation is another example of a clustering algorithm. Unlike K-means, this approach does not require us to set the number of clusters beforehand. The main idea here is that we would like to cluster our data based on the similarity of the observations (or how they "correspond" to each other).

Let's define a similarity metric such that \(s(x_i, x_j) > s(x_i, x_k)\) if observation \(x_i\) is more similar to observation \(x_j\) than to observation \(x_k\). A simple example of such a similarity metric is the negative squared distance \(s(x_i, x_j) = -||x_i - x_j||^2\).
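
As a quick illustration, here is a minimal NumPy sketch that builds such a similarity matrix (the four 2-D points are made up for this example):

```python
import numpy as np

# Toy data: two tight groups of points.
X = np.array([[1.0, 1.0],
              [1.2, 0.8],
              [5.0, 5.1],
              [4.8, 5.3]])

# Pairwise negative squared distances: S[i, j] = -||x_i - x_j||^2.
diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]   # shape (n, n, d)
S = -np.sum(diff ** 2, axis=-1)                    # shape (n, n)
print(S)
```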

Now, let's describe "correspondence" by making two zero-initialized matrices. One of them, \(r_{i,k}\) (the "responsibility"), quantifies how well-suited the \(k\)th observation is to serve as a "role model" for the \(i\)th observation, compared with all other candidate "role models". The other, \(a_{i,k}\) (the "availability"), quantifies how appropriate it would be for the \(i\)th observation to take the \(k\)th observation as its "role model". This may sound confusing, but it becomes more understandable with some hands-on practice.

The matrices are updated sequentially with the following rules:

\(r_{i,k} \leftarrow s(x_i, x_k) - \max_{k' \neq k} \left\{ a_{i,k'} + s(x_i, x_{k'}) \right\}\)

\(a_{i,k} \leftarrow \min \left( 0, r_{k,k} + \sum_{i' \not\in \{i,k\}} \max(0, r_{i',k}) \right), \ \ \  i \neq k\)

\(a_{k,k} \leftarrow \sum_{i' \neq k} \max(0, r_{i',k})\)
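
To make these rules concrete, here is a minimal NumPy sketch of the message-passing loop (the function name, iteration count, and `damping` factor are illustrative choices rather than part of the description above; damping is commonly used to keep the updates from oscillating). The diagonal entries \(s(x_k, x_k)\) act as "preferences": the larger they are, the more observations end up as "role models" (exemplars), which is how the number of clusters emerges from the data.

```python
import numpy as np

def affinity_propagation(S, n_iter=200, damping=0.5):
    """Illustrative message-passing loop for the update rules above."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r_{i,k}
    A = np.zeros((n, n))  # availabilities  a_{i,k}

    for _ in range(n_iter):
        # r_{i,k} <- s(x_i, x_k) - max_{k' != k} { a_{i,k'} + s(x_i, x_{k'}) }
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[np.arange(n), idx]
        AS[np.arange(n), idx] = -np.inf
        second = AS.max(axis=1)
        max_excl = np.repeat(first[:, None], n, axis=1)
        max_excl[np.arange(n), idx] = second
        R = damping * R + (1 - damping) * (S - max_excl)

        # a_{i,k} <- min(0, r_{k,k} + sum_{i' not in {i,k}} max(0, r_{i',k})), i != k
        # a_{k,k} <- sum_{i' != k} max(0, r_{i',k})
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())     # keep r_{k,k} unclipped in the column sums
        A_new = Rp.sum(axis=0)[None, :] - Rp   # drops the i' = i term from each column sum
        diag = A_new.diagonal().copy()         # a_{k,k} is not clipped at zero
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new

    # Each observation picks as its "role model" the k maximizing a_{i,k} + r_{i,k}.
    return np.argmax(A + R, axis=1)

# Usage with the similarity matrix S built above. The diagonal (the
# "preferences") is set to the median similarity, a common default; with
# all-zero preferences every point would become its own exemplar.
np.fill_diagonal(S, np.median(S))
labels = affinity_propagation(S)
print(labels)  # observations in the same cluster share the same exemplar index
```

The same algorithm is available out of the box as `sklearn.cluster.AffinityPropagation`, which exposes the damping factor and the preferences as parameters.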