Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 3.0.1
- Fix Version/s: 3.1.0
- Component/s: Clustering Algorithms
- Labels: None
Description
For example, take two sentences, "cat ate cheese too" and "mouse ate cheese too". When I construct the GST from them, the tree itself is built correctly.
However, I found a bug when I call PhraseNode.getInternalDocumentsRepresentation(): the result of this function is wrong.
After analysing the STC code, I found that PhraseNode.docs.set() is called when the node is created.
The bug occurs when a new ISuffixableElement is added to the GST after the node has been created: the existing node for "ate cheese too EOS" does not record the second document, "mouse ate cheese too", when that document's suffix traverses it.
The cause appears to be in the source code of org.carrot2.text.suffixtrees.SuffixTree. When insertPrefix() is called and a matching edge is found, it breaks out directly, so the node's "docs" and "elementsInNode" properties are not updated.
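To make the suspected behaviour concrete, here is a minimal sketch using a toy word-level suffix trie. It is not the Carrot2 code: ToySuffixTrie, Node, markOnlyOnCreate, add() and docsFor() are hypothetical names made up for illustration, and the real SuffixTree compresses edges, which this toy does not. It only shows how marking documents at node creation alone loses the second document on a shared suffix.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy word-level suffix trie. NOT the Carrot2 SuffixTree or PhraseNode API. */
class ToySuffixTrie {
    /** Hypothetical node: children keyed by word, plus the documents recorded on it. */
    static final class Node {
        final Map<String, Node> children = new HashMap<>();
        final BitSet docs = new BitSet();
    }

    final Node root = new Node();
    private final boolean markOnlyOnCreate; // true imitates the suspected bug
    private int nextDoc = 0;

    ToySuffixTrie(boolean markOnlyOnCreate) {
        this.markOnlyOnCreate = markOnlyOnCreate;
    }

    /** Inserts every suffix of the sentence (with an EOS marker); returns the document index. */
    int add(String sentence) {
        int doc = nextDoc++;
        String[] words = (sentence + " EOS").split(" ");
        for (int start = 0; start < words.length; start++) {
            Node node = root;
            for (int i = start; i < words.length; i++) {
                Node child = node.children.get(words[i]);
                boolean created = (child == null);
                if (created) {
                    child = new Node();
                    node.children.put(words[i], child);
                }
                // Suspected bug: the document is recorded only when the node is new.
                // Assumed fix: record it on every node the suffix passes through.
                if (created || !markOnlyOnCreate) {
                    child.docs.set(doc);
                }
                node = child;
            }
        }
        return doc;
    }

    /** Returns a copy of the documents recorded on the node reached by the phrase. */
    BitSet docsFor(String phrase) {
        Node node = root;
        for (String word : phrase.split(" ")) {
            node = node.children.get(word);
            if (node == null) {
                return new BitSet();
            }
        }
        return (BitSet) node.docs.clone();
    }

    public static void main(String[] args) {
        for (boolean buggy : new boolean[] {true, false}) {
            ToySuffixTrie trie = new ToySuffixTrie(buggy);
            trie.add("cat ate cheese too");
            trie.add("mouse ate cheese too");
            System.out.println((buggy ? "mark on create only: " : "mark on every visit: ")
                    + trie.docsFor("ate cheese too EOS"));
        }
    }
}
```

With marking on creation only, the node for "ate cheese too EOS" holds just document 0; with marking on every visit it holds documents 0 and 1, which is the result I expected from the GST.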
Please check and confirm this. I have been blocked for days trying to integrate it into my experiments.
Thank you.
Assigning to Dawid for investigation and a possible fix.