1

I'm trying to solve an optimization problem. I have a list of around 4000 geo coordinates data, and I want to cluster them into 30 groups based on the distance, so that the closer properties belongs to one group.

I started with FCM algorithm In formula -

Minimize $J=\sum_{i=1}^c\sum_{j=1}^n {{u_i}_j}^2{d_i}_j$

But the issue is that the size of the clusters are uneven. I hope to find a solution so that the size of a cluster is within a certain range(upper and lower bound).

Subject to $\forall 1\le j\le n : \sum_{i=1}^c {{u_i}_j}=1$

and $\sum_{j=1}^n {{u_i}_j}=\frac nc$

I've also tried to process the data after performing the FCM by moving the properties in the too large cluster to it's next closest cluster, but found it doesn't make sense.

I think Lagrange Multiplier would be a good solution. The formula as below. I thought of solving it using matrix method, but since this will be a huge matrice $c*n+c+n\approx10,000 $ by $10,000$. and I'm not familiar with matrix calculation, and how to implement it with programming.

n=number of property $j\in[1,n]$

c=number of cluster $i\in[1,c]$

$L=\sum_{i=1}^c\sum_{j=1}^n {{u_i}_j}^2{d_i}_j+\sum_{j=1}^n\alpha_j(1-\sum_{i=1}^c {{u_i}_j})+\sum_{i=1}^c\beta_j(\frac nc-\sum_{j=1}^n {{u_i}_j})$

Any ideas for implementation and solution are appreciated. Thanks!

qshng
  • 121

0 Answers0