Am I allowed to choose k initial centroids that are not contained in the original data set, in other words, without randomly sampling them from the data?
For instance, in the two graphs below, the coloured points in the middle are my original data set.
- In the left graph, the 5 red points are the initial centroids I selected using my own method.
- In the right graph, the initial centroids are evenly distributed on the magenta circle. Note that although all values in my original data set are positive, some initial centroids will have negative coordinates, depending on where they fall on the circle.
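
To make the second method concrete, here is a minimal sketch of how I place the centroids on the circle. The data here is synthetic, and the function name `circle_centroids`, the choice of the data mean as the circle's centre, and the `radius_scale` factor are all just assumptions for illustration, not part of any library.

```python
import numpy as np

def circle_centroids(data, k, radius_scale=1.5):
    """Place k initial centroids evenly on a circle around 2-D data.

    Assumptions for this sketch: the circle is centred on the data mean,
    and its radius is the distance to the farthest point times radius_scale.
    """
    center = data.mean(axis=0)
    radius = radius_scale * np.linalg.norm(data - center, axis=1).max()
    # k equally spaced angles in [0, 2*pi)
    angles = 2.0 * np.pi * np.arange(k) / k
    return center + radius * np.column_stack((np.cos(angles), np.sin(angles)))

# Synthetic stand-in for my data: every coordinate is positive.
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=(200, 2))

init = circle_centroids(data, k=5)
print(init)
print("any negative coordinate:", bool((init < 0).any()))
```

The resulting `init` array would then be handed to the k-means implementation as the starting centroids, assuming the implementation accepts user-supplied ones.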
I wonder whether I have made any fundamental mistakes that I am not yet aware of in selecting initial centroids with the two methods proposed above.
Even if there are no fundamental mistakes, are there any disadvantages to selecting initial centroids in these two ways?