Term Score Decline
Visualize the ranks of all terms across all topics.
Each topic is represented by a set of words. These words, however, do not all equally represent the topic. This visualization shows how many words are needed to represent a topic and at which point the beneficial effect of adding words starts to decline.
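The idea behind the plot can be sketched without a fitted model. The example below is a minimal illustration, not the library's implementation: the `topic_terms` data and the helpers `term_rank_curve` and `marginal_drops` are hypothetical names introduced here. It orders each topic's term scores from highest to lowest, pairs them with their rank (optionally on a log10 scale), and shows how quickly the gain from each additional word shrinks.

```python
import math

# Hypothetical (word, score) pairs per topic; in practice these would come
# from a fitted topic model. The numbers are made up for illustration.
topic_terms = {
    0: [("data", 0.09), ("model", 0.07), ("learning", 0.04), ("training", 0.02), ("set", 0.01)],
    1: [("game", 0.12), ("team", 0.05), ("season", 0.03), ("player", 0.02), ("win", 0.015)],
}


def term_rank_curve(terms, log_scale=False):
    """Return (x, y): 1-based term ranks and scores sorted from highest to lowest."""
    scores = sorted((score for _, score in terms), reverse=True)
    ranks = range(1, len(scores) + 1)
    # On a log scale, rank 1 maps to 0.0, rank 10 to 1.0, and so on.
    x = [math.log10(r) for r in ranks] if log_scale else list(ranks)
    return x, scores


def marginal_drops(scores):
    """Return the drop in score between each pair of consecutive ranks."""
    return [round(a - b, 6) for a, b in zip(scores, scores[1:])]


# One curve per topic: the x values are ranks, the y values are term scores.
for topic, terms in topic_terms.items():
    x, y = term_rank_curve(terms)
    print(topic, list(zip(x, y)), marginal_drops(y))
```

Where the drops become small, adding further words contributes little to the topic representation.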
Parameters:
Name | Type | Description | Default |
---|---|---|---|
topic_model | | A fitted BERTopic instance. | required |
topics | List[int] | A selection of topics to visualize. These will be colored red where all others will be colored black. | None |
log_scale | bool | Whether to represent the ranking on a log scale | False |
custom_labels | Union[bool, str] | If bool, whether to use custom topic labels that were defined using | False |
title | str | Title of the plot. | '<b>Term score decline per Topic</b>' |
width | int | The width of the figure. | 800 |
height | int | The height of the figure. | 500 |
Returns:
Name | Type | Description |
---|---|---|
fig | Figure | A plotly figure |
Examples:
To visualize the ranks of all words across all topics, simply run:
topic_model.visualize_term_rank()
Or if you want to save the resulting figure:
fig = topic_model.visualize_term_rank()
fig.write_html("path/to/file.html")
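The documented parameters can be combined with saving in one step. The helper below is a small sketch: `save_term_rank_plot` is a hypothetical wrapper, not part of BERTopic, and it assumes `topic_model` is an already fitted BERTopic instance. It only passes through the `topics` and `log_scale` parameters listed above.

```python
def save_term_rank_plot(topic_model, path, topics=None, log_scale=False):
    """Build the term score decline figure and write it to an HTML file.

    `topic_model` is assumed to be a fitted BERTopic instance; `topics` and
    `log_scale` are forwarded unchanged to `visualize_term_rank`.
    """
    fig = topic_model.visualize_term_rank(topics=topics, log_scale=log_scale)
    fig.write_html(path)
    return fig


# Example call, assuming a fitted `topic_model` exists in scope:
# save_term_rank_plot(topic_model, "path/to/file.html", topics=[0, 1], log_scale=True)
```

Returning the figure keeps it available for further adjustment after the file is written.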
Reference:
This visualization was heavily inspired by the "Term Probability Decline" visualization found in an analysis by the amazing tmtoolkit. Reference to that specific analysis can be found here.
Source code in bertopic\plotting\_term_rank.py