Elucidating the solution structure of the K-means cost function using energy landscape theory
Publication Date
2022-02-07
Journal Title
The Journal of Chemical Physics
ISSN
0021-9606
Publisher
AIP Publishing
Type
Article
This Version
AM (accepted manuscript)
Citation
Dicks, L., & Wales, D. J. (2022). Elucidating the solution structure of the K-means cost function using energy landscape theory. The Journal of Chemical Physics. https://doi.org/10.1063/5.0078793
Abstract
The K-means algorithm, routinely used in many scientific fields, generates clustering solutions that depend on the initial cluster coordinates. The number of solutions may be large, which can make locating the global minimum challenging. Hence, the topography of the cost function surface is crucial to understanding the performance of the algorithm. Here we employ the energy landscape approach to elucidate the topography of the K-means cost function surface for Fisher’s Iris dataset. For any number of clusters we find that the solution landscapes have a funnelled structure that is usually associated with efficient global optimisation. An analysis of the barriers between clustering solutions shows that the funnelled structures result from remarkably small barriers between almost all clustering solutions. The funnelled structure becomes less well defined as the number of clusters increases, and we analyse kinetic analogues to quantify the increased difficulty of locating the global minimum for these different landscapes.
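The dependence on initial cluster coordinates described in the abstract can be illustrated with a minimal sketch. It is not the authors' energy landscape method: it simply restarts a plain Lloyd's-iteration K-means from many random initial centres and collects the converged values of the cost function. The data are synthetic Gaussian blobs rather than Fisher's Iris dataset, and all function names (`lloyd`, `cost`, `make_blobs`) and parameter values are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cost(points, centres):
    """K-means cost: sum of squared distances to the nearest centre."""
    return sum(min(sq_dist(p, c) for c in centres) for p in points)


def lloyd(points, centres, max_iter=100):
    """Iterate Lloyd's algorithm from the given initial centres."""
    centres = [tuple(c) for c in centres]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        groups = [[] for _ in centres]
        for p in points:
            k = min(range(len(centres)), key=lambda i: sq_dist(p, centres[i]))
            groups[k].append(p)
        # Update step: move each centre to the mean of its members
        # (an empty cluster keeps its previous centre).
        new = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if new == centres:  # assignments are stable, so we have converged
            break
        centres = new
    return centres, cost(points, centres)


def make_blobs(rng, means, n_per_blob=30, spread=0.4):
    """Hypothetical synthetic data: one Gaussian blob per mean."""
    return [
        tuple(rng.gauss(m, spread) for m in mean)
        for mean in means
        for _ in range(n_per_blob)
    ]


rng = random.Random(0)
# Four blobs but only three clusters, so several distinct solutions exist.
points = make_blobs(rng, [(0, 0), (3, 0), (0, 3), (3, 3)])

# Restart from many random initial centres and record each converged cost.
final_costs = []
for _ in range(50):
    init = rng.sample(points, 3)
    _, c = lloyd(points, init)
    final_costs.append(c)

distinct = sorted({round(c, 6) for c in final_costs})
print(len(distinct), "distinct converged costs; lowest:", distinct[0])
```

Each distinct converged cost corresponds to a different clustering solution reached from a different starting point. The lowest value found is only a candidate for the global minimum, since random restarts give no guarantee that every solution has been visited.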
Sponsorship
Engineering and Physical Sciences Research Council (EP/L015552/1)
EPSRC (1819290)
Identifiers
External DOI: https://doi.org/10.1063/5.0078793
This record's URL: https://www.repository.cam.ac.uk/handle/1810/332953