K-mean clustering and the Security domain.
Clustering Clustering is an unsupervised algorithm to discover groups of similar things, ideas, or people. Unlike supervised algorithms, we're not training clustering algorithms with examples of known labels. Instead, clustering tries to find structures within a training set where no point of the data is the label. K-Means Clustering K-Means is a clustering algorithm with one fundamental property: the number of clusters is defined in advance . In addition to K-Means, there are other types of clustering algorithms like Hierarchical Clustering, Affinity Propagation, or Spectral Clustering. How K-Means Works Suppose our goal is to find a few similar groups in a dataset like: K-Means begins with k randomly placed centroids. Centroids, as their name suggests, are the center points of the clusters . For example, here we're adding four random centroids: Then we assign each existing data point to its nearest centroid: After the assignment, we move the centroids to the average ...