public class Compositionality
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
static class | Compositionality.VectorCompositionMethod Implemented methods of vector composition. |
Constructor and Description |
---|
Compositionality() |
Modifier and Type | Method and Description |
---|---|
java.util.HashMap<java.lang.String,java.lang.Float> | composeWordVectors(java.util.ArrayList<java.util.HashMap<java.lang.String,java.lang.Float>> wordvectorList, Compositionality.VectorCompositionMethod compositionMethod, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda) Compose two or more word vectors by the composition method given in compositionMethod. |
java.util.HashMap<java.lang.String,java.lang.Float> | composeWordVectors(java.util.HashMap<java.lang.String,java.lang.Float> wv1, java.util.HashMap<java.lang.String,java.lang.Float> wv2, Compositionality.VectorCompositionMethod compositionMethod, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda) Compose two word vectors by the composition method given in compositionMethod. |
float | compositionalSemanticSimilarity(java.lang.String multiWords1, java.lang.String multiWords2, Compositionality.VectorCompositionMethod compositionMethod, DISCO.SimilarityMeasure simMeasure, DISCO disco, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda) This method computes the semantic similarity between two multi-word terms, phrases, sentences or paragraphs. |
void | printWordVector(java.util.HashMap<java.lang.String,java.lang.Float> wordvector) Utility function. |
float | semanticSimilarity(java.util.HashMap<java.lang.String,java.lang.Float> wordvector1, java.util.HashMap<java.lang.String,java.lang.Float> wordvector2) Computes the semantic similarity between the two input word vectors according to the vector similarity measure DISCO.SimilarityMeasure.KOLB, which is described in Kolb 2009. |
float | semanticSimilarity(java.util.HashMap<java.lang.String,java.lang.Float> wordvector1, java.util.HashMap<java.lang.String,java.lang.Float> wordvector2, DISCO.SimilarityMeasure simMeasure) Computes the semantic similarity between the two input word vectors according to the vector similarity measure given in simMeasure. |
java.util.ArrayList<ReturnDataCol> | similarWords(java.util.HashMap<java.lang.String,java.lang.Float> wordvector, DISCO disco, DISCO.SimilarityMeasure simMeasure) Find the most similar words in the DISCO word space for an input word vector. |
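The a, b, c and lambda arguments listed in the summary above are only read by the COMBINED and DILATION composition methods. As a rough illustration of what such parameters typically control, the sketch below composes two sparse word vectors with the formulas of Mitchell and Lapata: combined p = a*u + b*v + c*(u .* v), and dilation p = (u.u)*v + (lambda - 1)*(u.v)*u. That DISCO uses exactly these formulas is an assumption, and the class CompositionSketch with its method names is hypothetical, not part of the DISCO API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of DISCO. Sparse word vectors are maps
// from feature name to weight; a missing feature counts as 0.
final class CompositionSketch {

    // Dot product over the features present in both vectors.
    static float dot(Map<String, Float> u, Map<String, Float> v) {
        float sum = 0f;
        for (Map.Entry<String, Float> e : u.entrySet()) {
            Float w = v.get(e.getKey());
            if (w != null) {
                sum += e.getValue() * w;
            }
        }
        return sum;
    }

    // Union of the feature names of both vectors.
    static Set<String> features(Map<String, Float> u, Map<String, Float> v) {
        Set<String> all = new HashSet<>(u.keySet());
        all.addAll(v.keySet());
        return all;
    }

    // Assumed COMBINED formula: p = a*u + b*v + c*(u .* v).
    static HashMap<String, Float> combined(Map<String, Float> u, Map<String, Float> v,
                                           float a, float b, float c) {
        HashMap<String, Float> p = new HashMap<>();
        for (String f : features(u, v)) {
            float uf = u.getOrDefault(f, 0f);
            float vf = v.getOrDefault(f, 0f);
            p.put(f, a * uf + b * vf + c * uf * vf);
        }
        return p;
    }

    // Assumed DILATION formula: p = (u.u)*v + (lambda - 1)*(u.v)*u.
    static HashMap<String, Float> dilation(Map<String, Float> u, Map<String, Float> v,
                                           float lambda) {
        float uu = dot(u, u);
        float uv = dot(u, v);
        HashMap<String, Float> p = new HashMap<>();
        for (String f : features(u, v)) {
            float uf = u.getOrDefault(f, 0f);
            float vf = v.getOrDefault(f, 0f);
            p.put(f, uu * vf + (lambda - 1f) * uv * uf);
        }
        return p;
    }
}
```

For u = {x: 1, y: 2} and v = {x: 3, z: 4}, combined with a = b = c = 1 yields {x: 7, y: 2, z: 4}, and dilation with lambda = 2 yields {x: 18, y: 6, z: 20}. Note that under these formulas dilation is not symmetric in u and v.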
public java.util.HashMap<java.lang.String,java.lang.Float> composeWordVectors(java.util.HashMap<java.lang.String,java.lang.Float> wv1, java.util.HashMap<java.lang.String,java.lang.Float> wv2, Compositionality.VectorCompositionMethod compositionMethod, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda)
Compose two word vectors by the composition method given in compositionMethod.
Parameters:
wv1 - word vector #1
wv2 - word vector #2
compositionMethod - One of the methods in VectorCompositionMethod.
a - only needed for composition method COMBINED.
b - only needed for composition method COMBINED.
c - only needed for composition method COMBINED.
lambda - only needed for composition method DILATION.
Returns:
the composed word vector, or null.
public java.util.HashMap<java.lang.String,java.lang.Float> composeWordVectors(java.util.ArrayList<java.util.HashMap<java.lang.String,java.lang.Float>> wordvectorList, Compositionality.VectorCompositionMethod compositionMethod, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda)
Compose two or more word vectors by the composition method given in compositionMethod.
Parameters:
wordvectorList - a list of word vectors to be combined. The list must have at least two elements. The ordering of the list has no influence on the result.
compositionMethod - One of the methods in VectorCompositionMethod.
a - only needed for composition method COMBINED.
b - only needed for composition method COMBINED.
c - only needed for composition method COMBINED.
lambda - only needed for composition method DILATION.
Returns:
the composed word vector, or null.
public void printWordVector(java.util.HashMap<java.lang.String,java.lang.Float> wordvector)
Utility function.
Parameters:
wordvector - the word vector to print
public float semanticSimilarity(java.util.HashMap<java.lang.String,java.lang.Float> wordvector1, java.util.HashMap<java.lang.String,java.lang.Float> wordvector2, DISCO.SimilarityMeasure simMeasure)
Computes the semantic similarity between the two input word vectors according to the vector similarity measure given in simMeasure.
Parameters:
wordvector1 - the first word vector
wordvector2 - the second word vector
simMeasure - One of the similarity measures enumerated in DISCO.SimilarityMeasure.
Returns:
the similarity between the two input word vectors. If simMeasure is unknown, the return value is -3.0F.
public float semanticSimilarity(java.util.HashMap<java.lang.String,java.lang.Float> wordvector1, java.util.HashMap<java.lang.String,java.lang.Float> wordvector2)
Computes the semantic similarity between the two input word vectors according to the vector similarity measure DISCO.SimilarityMeasure.KOLB, which is described in Kolb 2009.
Parameters:
wordvector1 - the first word vector
wordvector2 - the second word vector
public float compositionalSemanticSimilarity(java.lang.String multiWords1, java.lang.String multiWords2, Compositionality.VectorCompositionMethod compositionMethod, DISCO.SimilarityMeasure simMeasure, DISCO disco, java.lang.Float a, java.lang.Float b, java.lang.Float c, java.lang.Float lambda) throws java.io.IOException
This method computes the semantic similarity between two multi-word terms, phrases, sentences or paragraphs. The word vectors of each input are combined using composeWordVectors(). The two resulting vectors are then compared using Compositionality.semanticSimilarity().
Parameters:
multiWords1 - a tokenized string containing a multi-word term, phrase, sentence or paragraph.
multiWords2 - a tokenized string containing a multi-word term, phrase, sentence or paragraph.
compositionMethod - a vector composition method.
simMeasure - a similarity measure.
disco - a DISCO word space.
a - only needed for composition method COMBINED.
b - only needed for composition method COMBINED.
c - only needed for composition method COMBINED.
lambda - only needed for composition method DILATION.
Returns:
the similarity between multiWords1 and multiWords2.
Throws:
java.io.IOException
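The two steps described for compositionalSemanticSimilarity (compose each input into one vector, then compare the two vectors) can be sketched without a DISCO installation. In the sketch below a plain in-memory map stands in for the DISCO word space, composition is simple vector addition, and the comparison is cosine similarity. All three choices are assumptions made for illustration, and PhraseSimilaritySketch is a hypothetical class, not part of the DISCO API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the compose-then-compare pipeline.
// A plain map from word to sparse vector replaces the DISCO word space.
final class PhraseSimilaritySketch {

    // Sum the vectors of all whitespace-separated tokens found in the word space.
    // Tokens without a vector are skipped (an assumption of this sketch).
    static Map<String, Float> composeByAddition(String tokenized,
                                                Map<String, Map<String, Float>> wordSpace) {
        Map<String, Float> sum = new HashMap<>();
        for (String token : tokenized.trim().split("\\s+")) {
            Map<String, Float> vec = wordSpace.get(token);
            if (vec == null) {
                continue;
            }
            for (Map.Entry<String, Float> e : vec.entrySet()) {
                sum.merge(e.getKey(), e.getValue(), Float::sum);
            }
        }
        return sum;
    }

    // Cosine similarity of two sparse vectors; 0 if either vector has zero length.
    static float cosine(Map<String, Float> u, Map<String, Float> v) {
        double dot = 0, normU = 0, normV = 0;
        for (Map.Entry<String, Float> e : u.entrySet()) {
            normU += e.getValue() * e.getValue();
            Float w = v.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        for (float w : v.values()) {
            normV += w * w;
        }
        if (normU == 0 || normV == 0) {
            return 0f;
        }
        return (float) (dot / Math.sqrt(normU * normV));
    }

    // Compose both inputs, then compare the two resulting vectors.
    static float phraseSimilarity(String phrase1, String phrase2,
                                  Map<String, Map<String, Float>> wordSpace) {
        return cosine(composeByAddition(phrase1, wordSpace),
                      composeByAddition(phrase2, wordSpace));
    }
}
```

The real method declares java.io.IOException, which the in-memory sketch has no reason to throw.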
public java.util.ArrayList<ReturnDataCol> similarWords(java.util.HashMap<java.lang.String,java.lang.Float> wordvector, DISCO disco, DISCO.SimilarityMeasure simMeasure) throws java.io.IOException
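Find the most similar words in the DISCO word space for an input word vector. Even if the input vector was composed from several words (for example with Compositionality.composeWordVectors()), the most similar words will only be single-token words from the index.
Parameters:
wordvector - input word vector
disco - DISCO word space
simMeasure - a similarity measure.
Returns:
the words whose similarity to wordvector is greater than zero, ordered by similarity value (highest value first).
Throws:
java.io.IOException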
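The return contract of similarWords (only words with a similarity greater than zero, highest value first) can be illustrated with a small ranking sketch. It assumes an in-memory map in place of the DISCO word space and cosine similarity as the measure. SimilarWordsSketch and its Scored pair are hypothetical; DISCO returns ReturnDataCol objects, whose fields are not shown in this page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical ranking sketch, not part of the DISCO API.
final class SimilarWordsSketch {

    // Hypothetical result pair standing in for DISCO's ReturnDataCol.
    static final class Scored {
        final String word;
        final float value;

        Scored(String word, float value) {
            this.word = word;
            this.value = value;
        }
    }

    // Cosine similarity of two sparse vectors; 0 if either vector has zero length.
    static float cosine(Map<String, Float> u, Map<String, Float> v) {
        double dot = 0, normU = 0, normV = 0;
        for (Map.Entry<String, Float> e : u.entrySet()) {
            normU += e.getValue() * e.getValue();
            Float w = v.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        for (float w : v.values()) {
            normV += w * w;
        }
        if (normU == 0 || normV == 0) {
            return 0f;
        }
        return (float) (dot / Math.sqrt(normU * normV));
    }

    // Score every word in the word space against the query vector, keep
    // scores greater than zero, and sort with the highest value first.
    static List<Scored> similarWords(Map<String, Float> query,
                                     Map<String, Map<String, Float>> wordSpace) {
        List<Scored> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, Float>> e : wordSpace.entrySet()) {
            float sim = cosine(query, e.getValue());
            if (sim > 0f) {
                result.add(new Scored(e.getKey(), sim));
            }
        }
        result.sort(Comparator.comparingDouble((Scored s) -> s.value).reversed());
        return result;
    }
}
```

Because the candidates are the entries of the word space, every result is a single indexed word, even when the query vector was composed from several words.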