- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 3.9.4
- Component/s: Clustering Algorithms
- Labels: None
As reported by Sergio Queiroz:
I was running BisectingKMeansClusteringAlgorithm on some documents and saw that, contrary to expectations, it did not return hard clusterings, i.e., the same document appeared in multiple clusters. I looked at the code and figured out that the problem was likely due to a bug at line 442, where the following block starts:
    if (it < iterations - 1) {
        previousResult = result;
        result = Lists.newArrayList();
        for (int i = 0; i < partitions; i++) {
            result.add(new IntArrayList(selected.columns()));
        }
    }
Because of this condition, the result list is not re-initialized in the last iteration, so the last iteration adds elements to the partitions left over from the iteration before it. I removed the "if" (so that the code inside it executes in every iteration) and the algorithm started behaving as expected.
I believe Sergio's insight is correct – I looked at the code and can't find the reason for the 'if' to be there.
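The failure mode described in the report can be reproduced in isolation. The following is a minimal sketch, not the library's actual code: the class `PartitionResetDemo`, its methods, and the modulo assignment rule (a stand-in for the real nearest-centroid step) are all hypothetical, and it assumes the reset block sits at the top of the iteration loop, before documents are assigned.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionResetDemo {
    // Hypothetical stand-in for the clustering loop. When guardReset is true,
    // the partitions are NOT re-created in the last iteration (the reported bug).
    static List<List<Integer>> cluster(int iterations, int partitions,
                                       int documents, boolean guardReset) {
        List<List<Integer>> result = newPartitions(partitions);
        for (int it = 0; it < iterations; it++) {
            // Assumption: the reset happens before assignment in each iteration.
            if (!guardReset || it < iterations - 1) {
                result = newPartitions(partitions);
            }
            // Placeholder assignment rule; the real algorithm would pick the
            // nearest centroid here.
            for (int doc = 0; doc < documents; doc++) {
                result.get((doc + it) % partitions).add(doc);
            }
        }
        return result;
    }

    // Creates a fresh list of empty partitions.
    static List<List<Integer>> newPartitions(int partitions) {
        List<List<Integer>> fresh = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            fresh.add(new ArrayList<>());
        }
        return fresh;
    }

    // Counts document-to-cluster assignments; a hard clustering has exactly
    // one assignment per document.
    static int totalAssignments(List<List<Integer>> clusters) {
        int total = 0;
        for (List<Integer> cluster : clusters) {
            total += cluster.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("guarded reset:   " + cluster(3, 2, 4, true));
        System.out.println("unguarded reset: " + cluster(3, 2, 4, false));
    }
}
```

With the guard in place, the last iteration appends to the lists filled by the previous one, so the number of assignments exceeds the number of documents. Resetting in every iteration leaves each document in exactly one partition.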