Maximal Marginal Relevance
Calculate Maximal Marginal Relevance (MMR) between candidate keywords and the document.
MMR considers the similarity of each candidate keyword/keyphrase to the document, along with its similarity to the keywords and keyphrases that have already been selected. The result is a selection of keywords that are relevant to the document while remaining diverse among themselves.
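The selection described above can be illustrated with a minimal sketch. It is not the code from the library's source file: `mmr_sketch` and `_cosine_similarity` are hypothetical names, the argument names are borrowed from the parameters documented for this function, and the scoring rule `(1 - diversity) * relevance - diversity * redundancy` is one common way to write MMR, assumed here for illustration. The sketch also assumes at least one candidate and non-zero embeddings, and it returns cosine similarities to the document as the score.

```python
from typing import List, Tuple

import numpy as np


def _cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T


def mmr_sketch(
    doc_embedding: np.ndarray,
    word_embeddings: np.ndarray,
    words: List[str],
    top_n: int = 5,
    diversity: float = 0.8,
) -> List[Tuple[str, float]]:
    """Illustrative MMR selection (hypothetical helper, not the library API)."""
    # Relevance of every candidate to the document, shape (n_words,)
    word_doc_sim = _cosine_similarity(
        word_embeddings, doc_embedding.reshape(1, -1)
    ).ravel()
    # Similarity between every pair of candidates, shape (n_words, n_words)
    word_sim = _cosine_similarity(word_embeddings, word_embeddings)

    # Start with the candidate most similar to the document
    selected = [int(np.argmax(word_doc_sim))]
    candidates = [i for i in range(len(words)) if i != selected[0]]

    for _ in range(min(top_n - 1, len(candidates))):
        relevance = word_doc_sim[candidates]
        # Highest similarity of each remaining candidate to any selected keyword
        redundancy = word_sim[np.ix_(candidates, selected)].max(axis=1)
        # Trade relevance off against redundancy
        scores = (1 - diversity) * relevance - diversity * redundancy
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        candidates.remove(best)

    # Keywords in selection order, scored by similarity to the document
    return [(words[i], round(float(word_doc_sim[i]), 4)) for i in selected]
```

With `diversity=0.0` the sketch reduces to picking the candidates most similar to the document. As `diversity` approaches 1, a candidate that closely resembles an already selected keyword is penalized more heavily, so a less similar but more distinct candidate can be chosen instead.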
Parameters:
Name | Type | Description | Default |
---|---|---|---|
doc_embedding | ndarray | The document embeddings | required |
word_embeddings | ndarray | The embeddings of the selected candidate keywords/keyphrases | required |
words | List[str] | The selected candidate keywords/keyphrases | required |
top_n | int | The number of keywords/keyphrases to return | 5 |
diversity | float | How diverse the selected keywords/keyphrases are. Values range from 0 to 1, with 0 being not diverse at all and 1 being most diverse. | 0.8 |
Returns:
Type | Description |
---|---|
List[Tuple[str, float]] | The selected keywords/keyphrases with their distances |
Source code in keybert\_mmr.py