Modifier and Type | Method and Description |
---|---|
`void` | `Cluster.clutoClusterSimilarityGraph(DISCO disco, int n, float minSim, java.lang.String outputDir)` Creates a sparse graph file that can be clustered with CLUTO's scluster program. Important note: This method only works with word spaces of type DISCO.WordspaceType.SIM. |
`void` | `Cluster.clutoClusterVectors(DISCO disco, java.util.ArrayList<java.lang.String> wordList, java.lang.String outputDir)` Creates a sparse matrix file for use with CLUTO's vcluster program. |
`float` | `Compositionality.compositionalSemanticSimilarity(java.lang.String multiWords1, java.lang.String multiWords2, Compositionality.VectorCompositionMethod compositionMethod, DISCO.SimilarityMeasure simMeasure, DISCO disco, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda)` Computes the semantic similarity between two multi-word terms, phrases, sentences or paragraphs. |
`float` | `TextSimilarity.directedTextSimilarity(java.lang.String text, java.lang.String hypothesis, DISCO disco)` Computes the semantic similarity of text and hypothesis according to the algorithm given in Jijkoun & De Rijke (2005): "Recognizing Textual Entailment Using Lexical Similarity". The method tests whether the hypothesis is licensed by the text. |
`static ReturnDataBN` | `Cluster.filterOutliers(DISCO disco, java.lang.String word, int n)` Takes the list of the n most similar words of the input word and filters out all words that do not appear in the similarity list of at least one of the other similar words of the input word. The resulting list of similar words will have size <= n. Important note: This method only works with word spaces of type DISCO.WordspaceType.SIM. |
`static java.lang.String[]` | `Cluster.growSet(DISCO disco, java.lang.String[] inputSet)` Retrieves the similar words for all the words in the input set and extends the input set by all words that appear in the similarity lists of all the input words. |
`java.util.ArrayList<Rank.WordAndRank>` | `Rank.highestRanking(DISCO disco, java.util.Set<java.lang.String> words, DISCO.WordspaceType type)` Finds the words in the index in whose similarity or collocation lists the words rank highest. |
`int` | `Rank.rankCol(DISCO disco, java.lang.String w1, java.lang.String w2)` Computes the rank of w2 among the collocations of w1. |
`int` | `Rank.rankSim(DISCO disco, java.lang.String w1, java.lang.String w2)` Computes the rank of w2 in the similarity list of w1. |
`java.util.ArrayList<ReturnDataCol>` | `Compositionality.similarWords(java.util.HashMap<java.lang.String,java.lang.Float> wordvector, DISCO disco, DISCO.SimilarityMeasure simMeasure)` Finds the most similar words in the DISCO word space for an input word vector. |
`float` | `TextSimilarity.textSimilarity(java.lang.String text1, java.lang.String text2, DISCO disco)` Computes the semantic similarity between the two texts as an average of both directed text similarities. |
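
The descriptions of `Cluster.filterOutliers` and `TextSimilarity.textSimilarity` above can be illustrated with a minimal, self-contained sketch. It does not call the DISCO library: the word space is replaced by a plain map from each word to its ordered similarity list, and the class name `OutlierFilterSketch`, the method `symmetricSimilarity`, and the toy data are all hypothetical names chosen for illustration. The actual implementations may differ in details such as tie handling or the return type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy map stands in for a DISCO word space of type SIM.
class OutlierFilterSketch {

    // Keeps a candidate only if it appears in the similarity list of at least
    // one OTHER candidate among the n most similar words of the input word.
    static List<String> filterOutliers(Map<String, List<String>> simLists,
                                       String word, int n) {
        List<String> all = simLists.getOrDefault(word, List.of());
        List<String> candidates = all.subList(0, Math.min(n, all.size()));
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            for (String other : candidates) {
                boolean supported = !other.equals(candidate)
                        && simLists.getOrDefault(other, List.of()).contains(candidate);
                if (supported) {
                    kept.add(candidate);
                    break; // one supporting neighbour is enough
                }
            }
        }
        return kept; // size is at most n by construction
    }

    // Symmetric text similarity as the average of the two directed scores.
    // The directed scores are passed in; computing them is out of scope here.
    static float symmetricSimilarity(float textToHypothesis, float hypothesisToText) {
        return (textToHypothesis + hypothesisToText) / 2.0f;
    }

    public static void main(String[] args) {
        Map<String, List<String>> simLists = Map.of(
                "apple", List.of("pear", "plum", "laptop"),
                "pear", List.of("plum", "apple"),
                "plum", List.of("pear", "apple"),
                "laptop", List.of("computer", "tablet"));
        // "laptop" is in no other candidate's similarity list, so it is dropped.
        System.out.println(filterOutliers(simLists, "apple", 3)); // [pear, plum]
        System.out.println(symmetricSimilarity(0.75f, 0.25f));    // 0.5
    }
}
```

With a real word space, the map lookups would presumably be replaced by queries against the `DISCO` instance passed as the first argument of each method.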