BaseRepresentation
Bases: BaseEstimator
The base representation model for fine-tuning topic representations.
Source code in bertopic\representation\_base.py
extract_topics(topic_model, documents, c_tf_idf, topics)
Extract topics
Every representation model that inherits from this class automatically receives the same four arguments: topic_model, documents, c_tf_idf, and topics. A representation model therefore only has access to the topic information carried by those arguments.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
topic_model | | The BERTopic model, fitted up to the point where topic representations are calculated. | required |
documents | DataFrame | A dataframe with columns "Document" and "Topic" that contains every document together with its assigned topic. | required |
c_tf_idf | csr_matrix | A c-TF-IDF representation that is typically identical to | required |
topics | Mapping[str, List[Tuple[str, float]]] | A dictionary mapping each topic (key) to a list of (word, weight) tuples (value) as calculated by c-TF-IDF. These are the default topics returned when no representation model is used. | required |
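As a rough illustration of the contract described above, the sketch below defines a hypothetical `TopWordsRepresentation` whose `extract_topics` has the same four-argument signature and returns a mapping in the same shape as `topics`. It is an assumption-laden sketch, not code from the library: to stay dependency-free it does not inherit from `BaseRepresentation`, and it passes `None` in place of the fitted model, the documents dataframe, and the c-TF-IDF matrix, none of which it uses. The class name, the `top_n` parameter, and the sample data are all invented for the example.

```python
from typing import Any, Dict, List, Mapping, Tuple

# Same shape as the `topics` argument: topic -> list of (word, weight) tuples.
Topics = Mapping[str, List[Tuple[str, float]]]


class TopWordsRepresentation:
    """Hypothetical representation model that keeps the top_n heaviest words.

    In real use this class would inherit from BaseRepresentation; the plain
    class here is a stand-in so the sketch runs without extra packages.
    """

    def __init__(self, top_n: int = 3) -> None:
        self.top_n = top_n

    def extract_topics(
        self,
        topic_model: Any,   # fitted model (unused in this sketch)
        documents: Any,     # dataframe of documents and topics (unused)
        c_tf_idf: Any,      # sparse c-TF-IDF matrix (unused)
        topics: Topics,     # default c-TF-IDF topics
    ) -> Dict[str, List[Tuple[str, float]]]:
        # Sort each topic's words by weight, highest first, and truncate.
        return {
            topic: sorted(words, key=lambda pair: pair[1], reverse=True)[: self.top_n]
            for topic, words in topics.items()
        }


# Made-up default topics, purely for illustration.
default_topics = {
    "0": [("cat", 0.9), ("dog", 0.7), ("pet", 0.4), ("food", 0.1)],
    "1": [("stock", 0.8), ("market", 0.6)],
}

model = TopWordsRepresentation(top_n=2)
refined = model.extract_topics(
    topic_model=None, documents=None, c_tf_idf=None, topics=default_topics
)
print(refined)
```

Because the method only sees its four arguments, anything else a custom model needs (such as `top_n` here) has to be supplied through its constructor.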