Clustering


This page contains all problems about Clustering.


Problem 1

Source: Spring 2023 Final Part 2, Problem 1

You run the k-means clustering algorithm on a dataset and it converges to a certain clustering with associated inertia I. You then duplicate each data point in the dataset and run k-means again on this twice-as-big dataset, with the same initial centroids as before. Which of the following is true? Select all that apply.

The inertia will be twice as much as before, 2I.

The centroids found will be the same as before.


Problem 2

Source: Winter 2024 Final Part 2, Problem 2

You run the k-means clustering algorithm on a dataset and it converges to a certain clustering with associated inertia I. You then duplicate each data point in the dataset and run k-means again on this twice-as-big dataset, with the same initial centroids as before. Which of the following is true? Select all that apply.

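Bubbles 1 and 3: “The centroids found will be the same as before” and “The inertia will be twice as much as before, 2I.” Duplicating every point does not change which centroid each point is closest to, and the mean of a cluster in which every point appears twice equals the mean of the original cluster, so each iteration produces the same centroids. Inertia is the sum of squared distances from each point to its centroid, and every term in that sum now appears twice, so the total is 2I.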
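The claim can be checked numerically. The following is a minimal sketch, not code from the exam: the helper names (sq_dist, mean, kmeans), the toy dataset, and the initial centroids are all made up for illustration, and the k-means loop is a plain implementation of the usual assign-then-update iteration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def kmeans(points, init_centroids, max_iter=100):
    """Run k-means from the given initial centroids.

    Returns (centroids, inertia), where inertia is the sum of squared
    distances from each point to its nearest centroid.
    """
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            mean(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


# Hypothetical toy data: two well-separated groups of four points each.
data = [(0, 0), (0, 2), (2, 0), (2, 2), (10, 10), (10, 12), (12, 10), (12, 12)]
init = [(1, 0), (9, 10)]

centroids_a, inertia_a = kmeans(data, init)         # original dataset
centroids_b, inertia_b = kmeans(data + data, init)  # every point duplicated

print(centroids_a, inertia_a)
print(centroids_b, inertia_b)
```

The toy data uses integer coordinates so that the cluster means come out exact and the two runs can be compared with ==. On arbitrary real-valued data, the centroids and the ratio of inertias may differ by floating-point rounding, so a tolerance-based comparison would be more appropriate.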

