Suppose a data set is generated by sampling examples uniformly at random from r spherical Gaussians with a standard deviation of 1. In which cases is Kmeans with Kmeans++ initialization likely to be significantly better than Kmeans with standard initialization?

a. The clusters are very close to each other.
b. The clusters are far from each other.
c. r is large.
d. r is small.
e. All clusters have equal probability.
f. One cluster has much higher probability than the others.

Answer:

B. The clusters are far from each other.

Step-by-step explanation:

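Kmeans++ picks each new initial center with probability proportional to its squared distance from the nearest center already chosen, so the initial centers tend to be spread out. When the clusters are far from each other, this makes it likely that each cluster receives its own initial center, whereas standard (uniform random) initialization can easily place several centers in one cluster and none in another. Therefore the correct answer is option B.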
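
To illustrate the reasoning, here is a minimal sketch that compares the two initializations on well-separated clusters. Everything in it is an assumption made for illustration and not part of the original question: the data is 2-dimensional, the cluster means sit on a line 100 units apart, all clusters have equal probability, and the function names (sample_mixture, uniform_init, kmeanspp_init, mean_clusters_covered) are hypothetical. It only looks at the initialization step and counts how many distinct clusters receive an initial center; it does not run the full Kmeans iterations.

```python
import random


def sample_mixture(rng, r, n, spacing):
    """Draw n 2-D points from r unit-variance spherical Gaussians (equal weights)."""
    means = [(spacing * i, 0.0) for i in range(r)]  # assumed layout: means on a line
    labels = [rng.randrange(r) for _ in range(n)]
    points = [(means[c][0] + rng.gauss(0, 1), means[c][1] + rng.gauss(0, 1))
              for c in labels]
    return points, labels


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def uniform_init(rng, points, k):
    """Standard initialization: k distinct data points chosen uniformly at random."""
    return rng.sample(range(len(points)), k)


def kmeanspp_init(rng, points, k):
    """Kmeans++ seeding: sample each new center with probability proportional
    to its squared distance from the nearest center chosen so far."""
    chosen = [rng.randrange(len(points))]  # first center is uniform
    d2 = [sq_dist(p, points[chosen[0]]) for p in points]
    for _ in range(k - 1):
        idx = rng.choices(range(len(points)), weights=d2)[0]
        chosen.append(idx)
        # Keep, for every point, the distance to its nearest chosen center.
        d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]
    return chosen


def mean_clusters_covered(init, r=5, n=200, spacing=100.0, trials=200, seed=0):
    """Average number of distinct true clusters that receive an initial center."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        points, labels = sample_mixture(rng, r, n, spacing)
        total += len({labels[i] for i in init(rng, points, r)})
    return total / trials


print("Kmeans++ init:", mean_clusters_covered(kmeanspp_init))
print("standard init:", mean_clusters_covered(uniform_init))
```

Under these assumptions, the Kmeans++ seeding should cover close to all r clusters on average, while the standard initialization typically leaves some clusters without an initial center. Reducing spacing toward 0 should shrink the gap, which matches the intuition behind option B.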