
Large Language Models (LLM)

In this tutorial, we will go through the Large Language Models (LLMs) that can be used in KeyLLM. Being able to choose the LLM allows you to leverage the model that suits your use case.
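
All examples below also assume that KeyBERT itself is installed:

pip install keybert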

OpenAI

To use OpenAI's external API, we need to define our key and use the keybert.llm.OpenAI model.

We install the package first:

pip install openai

Then we run OpenAI as follows:

import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

# Create your OpenAI LLM
openai.api_key = "sk-..."
llm = OpenAI()

# Load it in KeyLLM
kw_model = KeyLLM(llm)

# Extract keywords
keywords = kw_model.extract_keywords(MY_DOCUMENTS)
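
Here, MY_DOCUMENTS is a placeholder for a list of strings; extract_keywords returns one list of keywords per document. A minimal sketch (the printed output is illustrative and will vary by model):

# MY_DOCUMENTS is simply a list of input texts
MY_DOCUMENTS = [
    "The website mentions that it only takes a couple of days to deliver but I still have not received mine."
]

keywords = kw_model.extract_keywords(MY_DOCUMENTS)
print(keywords)  # illustrative output: [['website', 'deliver', 'days', 'received']]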

If you want to use a chat-based model, please run the following instead:

import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

# Create your LLM
openai.api_key = "sk-..."
llm = OpenAI(model="gpt-3.5-turbo", chat=True)

# Load it in KeyLLM
kw_model = KeyLLM(llm)
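
The chat-based model is used in exactly the same way; extraction does not change:

# Extract keywords with the chat-based model
keywords = kw_model.extract_keywords(MY_DOCUMENTS)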

Cohere

To use Cohere's external API, we need to define our key and use the keybert.llm.Cohere model.

We install the package first:

pip install cohere

Then we run Cohere as follows:

import cohere
from keybert.llm import Cohere
from keybert import KeyLLM

# Create your Cohere LLM
co = cohere.Client(my_api_key)
llm = Cohere(co)

# Load it in KeyLLM
kw_model = KeyLLM(llm)

# Extract keywords
keywords = kw_model.extract_keywords(MY_DOCUMENTS)

LiteLLM

LiteLLM allows you to use any closed-source LLM with KeyLLM.

We install the package first:

pip install litellm

Let's use OpenAI as an example:

import os
from keybert.llm import LiteLLM
from keybert import KeyLLM

# Select LLM
os.environ["OPENAI_API_KEY"] = "sk-..."
llm = LiteLLM("gpt-3.5-turbo")

# Load it in KeyLLM
kw_model = KeyLLM(llm)
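
As with the other backends, extraction itself stays the same:

# Extract keywords
keywords = kw_model.extract_keywords(MY_DOCUMENTS)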

🤗 Hugging Face Transformers

To use a Hugging Face transformers model, load in a pipeline and point to any model found on their model hub (https://huggingface.co/models). Let's use Llama 2 as an example:

from torch import cuda, bfloat16
import transformers

model_id = 'meta-llama/Llama-2-7b-chat-hf'

# 4-bit Quantization to load Llama 2 with less GPU memory
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,  
    bnb_4bit_quant_type='nf4',  
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)

# Llama 2 Model & Tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map='auto',
)
model.eval()

# Our text generator
generator = transformers.pipeline(
    model=model, tokenizer=tokenizer,
    task='text-generation',
    temperature=0.1,
    max_new_tokens=500,
    repetition_penalty=1.1
)
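
If you want, you can sanity-check the pipeline before handing it to KeyLLM; the prompt below is just a hypothetical example:

# Quick check that the generator works (prompt is a hypothetical example)
output = generator("Name three keywords that describe fast food delivery:")
print(output[0]["generated_text"])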

Then, we load the generator in KeyLLM with a custom prompt:

from keybert.llm import TextGeneration
from keybert import KeyLLM

prompt = """
<s>[INST] <<SYS>>

You are a helpful assistant specialized in extracting comma-separated keywords.
You are to the point and only give the answer in isolation without any chat-based fluff.

<</SYS>>
I have the following document:
- The website mentions that it only takes a couple of days to deliver but I still have not received mine.

Please give me the keywords that are present in this document and separate them with commas.
Make sure to only return the keywords and say nothing else. For example, don't say:
"Here are the keywords present in the document"
[/INST] website, deliver, days, received [INST]

I have the following document:
- [DOCUMENT]

Please give me the keywords that are present in this document and separate them with commas.
Make sure to only return the keywords and say nothing else. For example, don't say:
"Here are the keywords present in the document"
[/INST]
"""

# Load it in KeyLLM
llm = TextGeneration(generator, prompt=prompt)
kw_model = KeyLLM(llm)
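
Extraction then follows the same pattern as before; the example document mirrors the one used in the prompt:

documents = [
    "The website mentions that it only takes a couple of days to deliver but I still have not received mine."
]

keywords = kw_model.extract_keywords(documents)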

LangChain

To use LangChain, we can simply load in any LLM and pass it to KeyLLM as a QA chain.

We install the package first:

pip install langchain

Then we run LangChain as follows:

from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI
chain = load_qa_chain(OpenAI(temperature=0, openai_api_key=my_openai_api_key), chain_type="stuff")

Finally, you can pass the chain to KeyBERT as follows:

from keybert.llm import LangChain
from keybert import KeyLLM

# Create your LLM
llm = LangChain(chain)

# Load it in KeyLLM
kw_model = KeyLLM(llm)
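
Extraction again works as before:

# Extract keywords
keywords = kw_model.extract_keywords(MY_DOCUMENTS)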