Questions tagged [clustering]

Clustering is grouping (partitioning) a set of objects so that items in the same group are more similar to each other than to items in different groups, where the notion of similarity may be variously defined.

Clustering is a task of grouping (partitioning) a set of objects so that items in the same group are more similar (closer) to each other than to items in different groups. Often the notion of similarity is expressed as a distance measure, with greater distance conveying less similarity. The study of clustering algorithms (cluster analysis) originated in the social sciences but has become important in statistical data analysis (data mining) and in machine learning.

Examples of such algorithms are $K$-means and self-organizing map.

328 questions
2
votes
1 answer

Cluster points so that within each cluster holds a certain maximum distance between points

Currently I'm struggling with a (for me) new field, namely clustering. I would really appreciate any help I could get! The starting situation is that a data set $(x_k)_{k\in\{1,\dots,n\}} \subseteq \mathbb{R}^N$ is given. The task is to partition…
Murp
  • 1,164
2
votes
1 answer

What's the algorithm for agglomerative hierarchical clustering?

I have read some descriptions about agglomerative hierarchical clustering, however, I cannot seem to find an accurate description of the algorithm. My notes give: Assign each observation to own single-object cluster. Calculate distances between…
mavavilj
  • 7,270
2
votes
2 answers

What defines a convex Cluster and how it differentiates from other types?

I keep encountering the term "convex cluster" which I cannot understand what it means. I am exploring different types of clustering methods and in the description sections some mention advantages/drawbacks related to the convex properties of the…
2
votes
2 answers

Classification with equal number of elements per class

I am looking for a way to classify a set of points $\vec x$ such that each class contains roughly the same number of points. Currently I start with the first point $\vec x_0$ use this as reference point for the first class and the next point get…
2
votes
0 answers

Ward's method merging cost formula

I cannot fill in the steps in this equation: $$\Delta (A,B) = \sum_{x\in A\cup B} \vert\vert x -m_{A\cup B}\vert\vert^2 \ - \sum_{x \in A} \vert\vert x - m_{A} \vert\vert^2 - \sum_{x\in B} \vert\vert x - m_B\vert\vert^2 \\ = \frac{n_A…
juq
  • 55
2
votes
4 answers

What is the correct definition of Minkowski distance

I'm really confused. In a book ISBN: 978-0-470-27680-8 is written: The Euclidean distance can be generalized as a special case of a family of metrics, called Minkowski distance or L p norm, defined as, $$ D(\mathbf{x}_i,{\mathbf…
1
vote
0 answers

UPGMA: Distance between clusters for multi-dimensional data

How would you calculate the distance for multi-dimensional data? From wikipedia: The distance between any two clusters A and B is taken to be the average of all distances between pairs of objects "x" in A and "y" in B, that is, the mean distance…
1
vote
0 answers

Explanation of Equation for K-means Initialization

I'm currently working on a college project and was having trouble deciphering a formula I ran across. The problem involves the initialization of cluster centers for the K-means algorithm, and here is how it is shown: Consider the following heuristic…
1
vote
1 answer

distance function for hierarchical clustering

I would like to implement hierarchical clustering for a dataset with several dimensions, very different from each other. E.g. meters VS percentage VS times. I want to adopt a distance method that would allow to deal with that, by standardizing them.…
Mike
  • 111
1
vote
1 answer

$K$-means - how to calculate minimum distance

I was reading this article on $K$-Means and I got lost when it was time to assign objects to clusters. After calculating the centroids distance to every object, how can I calculate the minimum distance in other to assign objects to the nearest…
Kennedy
  • 131
1
vote
1 answer

How exactly is the prototype clustering solved using gradient descent?

How exactly is the prototype clustering solved using gradient descent? I don't see any derivatives.
mavavilj
  • 7,270
1
vote
1 answer

Best vector distance measure for contrasting clustering?

Let say i have V-vectors, everyone with size N, which measure of distance should I use if i want to cluster them by the following criteria : the more elements vector has in common with other vectors of the cluster and the fewer elements in common…
sten
  • 149
1
vote
1 answer

clustering over sphere surface

I have two algorithms, both of which create points on the surface of a higher dimensional sphere (say in $\mathbb{R}^n$). Now, I want to check, which algorithm gives a more uniform spread of points over the surface of the sphere. In lower…
1
vote
1 answer

three-dimensional vector clustering

I am looking for one or more algorithms that cluster points in non-euclidian vector space. My axes, specifically, are X and Y in space and Z in time. I was thinking about first clustering in X and Y only, then going ahead and internally clustering…
1
vote
1 answer

update of centroids in $k$ means clustering

I have been reading the lecture notes of Andrew Ng about Clustering techniques available in: http://cs229.stanford.edu/notes/cs229-notes7a.pdf but I have a problem in the second for of step $2$ which is the following: For what I know, the previous…
Lila
  • 479
1
2 3