Prompt Formatter

Modules concerning the prompt formatters used in Gen AI applications.

AgnosticPromptFormatter(message_separator='\n', content_separator='\n')

Bases: BasePromptFormatter

A prompt formatter that formats a prompt without any model-specific formatting.

The AgnosticPromptFormatter class formats a prompt by joining the content of the prompt templates using a specified separator. It is designed to work independently of specific model types.

Attributes:

    content_separator (str): A string used to separate each content in a message.
    message_separator (str): A string used to separate each message.

Usage

The AgnosticPromptFormatter can be used to format a prompt for any model. The content_separator and message_separator can be customized to define the format of the prompt.

Usage example:

prompt = [
    (PromptRole.USER, ["Hello", "how are you?"]),
    (PromptRole.ASSISTANT, ["I'm fine", "thank you!"]),
    (PromptRole.USER, ["What is the capital of France?"]),
]
prompt_formatter = AgnosticPromptFormatter(
    message_separator="\n###\n",
    content_separator="---"
)
print(prompt_formatter.format(prompt))

Output example:

Hello---how are you?
###
I'm fine---thank you!
###
What is the capital of France?
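The joining behavior shown above can be sketched in plain Python. This is a minimal illustration of the two-level join, not the library's actual implementation, and plain role strings stand in for PromptRole values:

```python
def format_agnostic(prompt, message_separator="\n", content_separator="\n"):
    """Join each message's content parts, then join the resulting messages.

    Roles are ignored, matching the model-agnostic output above.
    """
    messages = [content_separator.join(contents) for _role, contents in prompt]
    return message_separator.join(messages)


prompt = [
    ("user", ["Hello", "how are you?"]),
    ("assistant", ["I'm fine", "thank you!"]),
    ("user", ["What is the capital of France?"]),
]
print(format_agnostic(prompt, message_separator="\n###\n", content_separator="---"))
```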

Initializes a new instance of the AgnosticPromptFormatter class.

Parameters:

    message_separator (str): A string used to separate each message. Defaults to "\n".
    content_separator (str): A string used to separate each content in a message. Defaults to "\n".

HuggingFacePromptFormatter(model_name_or_path, content_separator='\n')

Bases: BasePromptFormatter

A prompt formatter that formats a prompt using a HuggingFace model's specific formatting.

The HuggingFacePromptFormatter class is designed to format a prompt using a HuggingFace model's chat template. It does so via the tokenizer's apply_chat_template method.

Attributes:

    content_separator (str): A string used to separate each content in a message.
    tokenizer (PreTrainedTokenizer): The HuggingFace model tokenizer used for chat templating.

Usage

The HuggingFacePromptFormatter can be used to format a prompt using a HuggingFace model's specific formatting. The content_separator and model_name_or_path can be customized to define the format of the prompt. The model_name_or_path defines the name of the HuggingFace model whose tokenizer will be used to format the prompt using the apply_chat_template method.

Usage example:

prompt = [
    (PromptRole.USER, ["Hello", "how are you?"]),
    (PromptRole.ASSISTANT, ["I'm fine", "thank you!"]),
    (PromptRole.USER, ["What is the capital of France?"]),
]
prompt_formatter = HuggingFacePromptFormatter(
    model_name_or_path="mistralai/Mistral-7B-Instruct-v0.1",
    content_separator="---"
)
print(prompt_formatter.format(prompt))

Output example:

<s>[INST] Hello---how are you? [/INST]I'm fine---thank you!</s> [INST] What is the capital of France? [/INST]
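Before chat templating can be applied, the (role, contents) pairs must be converted into the list of {"role", "content"} dicts that a HuggingFace tokenizer's apply_chat_template method expects. A minimal sketch of that conversion follows; to_chat_messages is a hypothetical helper, and plain role strings stand in for PromptRole values:

```python
def to_chat_messages(prompt, content_separator="\n"):
    """Convert (role, contents) pairs into the {'role', 'content'} dicts
    expected by a HuggingFace tokenizer's apply_chat_template method."""
    return [
        {"role": role, "content": content_separator.join(contents)}
        for role, contents in prompt
    ]


messages = to_chat_messages(
    [("user", ["Hello", "how are you?"]), ("assistant", ["I'm fine", "thank you!"])],
    content_separator="---",
)
# The resulting list can then be passed to
# tokenizer.apply_chat_template(messages, tokenize=False)
print(messages)
```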
Using a gated model

If you're trying to access the chat template of a gated model, you need to:

1. Request access to the gated repo using your HuggingFace account.
2. Log in to HuggingFace on your system:
    1. Install huggingface-hub: pip install huggingface-hub
    2. Log in: huggingface-cli login
    3. Enter your HuggingFace token when prompted.
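The login steps above can be run from a terminal using the standard huggingface-hub CLI (the login command prompts interactively for your access token):

```shell
# Install the HuggingFace Hub client
pip install huggingface-hub

# Log in; you will be prompted to paste your HuggingFace access token
huggingface-cli login
```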

Initializes a new instance of the HuggingFacePromptFormatter class.

Parameters:

    model_name_or_path (str): The model name or path of the HuggingFace model tokenizer to be loaded. Required.
    content_separator (str): A string used to separate each content in a message. Defaults to "\n".

LlamaPromptFormatter(model_name='Meta-Llama-3.1-8B-Instruct', content_separator='\n')

Bases: HuggingFacePromptFormatter

A prompt formatter that formats a prompt using a Llama model's specific formatting.

The LlamaPromptFormatter class is designed to format a prompt using a Llama model's chat template. It does so via the tokenizer's apply_chat_template method.

Attributes:

    content_separator (str): A string used to separate each content in a message.
    tokenizer (PreTrainedTokenizer): The HuggingFace model tokenizer used for chat templating.

Usage

The LlamaPromptFormatter can be used to format a prompt using a Llama model's specific formatting. The content_separator and model_name can be customized to define the format of the prompt. The model_name defines the name of the HuggingFace model whose tokenizer will be used to format the prompt using the apply_chat_template method.

Usage example:

prompt = [
    (PromptRole.USER, ["Hello", "how are you?"]),
    (PromptRole.ASSISTANT, ["I'm fine", "thank you!"]),
    (PromptRole.USER, ["What is the capital of France?"]),
]
prompt_formatter = LlamaPromptFormatter(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct",
    content_separator="---"
)
print(prompt_formatter.format(prompt))

Output example:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Hello---how are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I'm fine---thank you!<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Using a gated model

If you're trying to access the chat template of a gated model, you need to:

1. Request access to the gated repo using your HuggingFace account.
2. Log in to HuggingFace on your system:
    1. Install huggingface-hub: pip install huggingface-hub
    2. Log in: huggingface-cli login
    3. Enter your HuggingFace token when prompted.

Initializes a new instance of the LlamaPromptFormatter class.

Parameters:

    model_name (str): The name of the Llama model tokenizer to be loaded. Defaults to "Meta-Llama-3.1-8B-Instruct".
    content_separator (str): A string used to separate each content in a message. Defaults to "\n".

MistralPromptFormatter(model_name='Mistral-7B-Instruct-v0.3', content_separator='\n')

Bases: HuggingFacePromptFormatter

A prompt formatter that formats a prompt using a Mistral model's specific formatting.

The MistralPromptFormatter class is designed to format a prompt using a Mistral model's chat template. It does so via the tokenizer's apply_chat_template method.

Attributes:

    content_separator (str): A string used to separate each content in a message.
    tokenizer (PreTrainedTokenizer): The HuggingFace model tokenizer used for chat templating.

Usage

The MistralPromptFormatter can be used to format a prompt using a Mistral model's specific formatting. The content_separator and model_name can be customized to define the format of the prompt. The model_name defines the name of the HuggingFace model whose tokenizer will be used to format the prompt using the apply_chat_template method.

Usage example:

prompt = [
    (PromptRole.USER, ["Hello", "how are you?"]),
    (PromptRole.ASSISTANT, ["I'm fine", "thank you!"]),
    (PromptRole.USER, ["What is the capital of France?"]),
]
prompt_formatter = MistralPromptFormatter(
    model_name="mistralai/Mistral-7B-Instruct-v0.1",
    content_separator="---"
)
print(prompt_formatter.format(prompt))

Output example:

<s>[INST] Hello---how are you? [/INST]I'm fine---thank you!</s> [INST] What is the capital of France? [/INST]
Using a gated model

If you're trying to access the chat template of a gated model, you need to:

1. Request access to the gated repo using your HuggingFace account.
2. Log in to HuggingFace on your system:
    1. Install huggingface-hub: pip install huggingface-hub
    2. Log in: huggingface-cli login
    3. Enter your HuggingFace token when prompted.

Initializes a new instance of the MistralPromptFormatter class.

Parameters:

    model_name (str): The name of the Mistral model tokenizer to be loaded. Defaults to "Mistral-7B-Instruct-v0.3".
    content_separator (str): A string used to separate each content in a message. Defaults to "\n".

OpenAIPromptFormatter(content_separator='\n')

Bases: BasePromptFormatter

A prompt formatter that formats a prompt with OpenAI's specific formatting.

The OpenAIPromptFormatter class formats a prompt using OpenAI's specific formatting.

Attributes:

    content_separator (str): A string used to separate each content in a message.

Usage

The OpenAIPromptFormatter can be used to format a prompt for OpenAI's models. The content_separator can be customized to define the format of the prompt.

Usage example:

prompt = [
    (PromptRole.USER, ["Hello", "how are you?"]),
    (PromptRole.ASSISTANT, ["I'm fine", "thank you!"]),
    (PromptRole.USER, ["What is the capital of France?"]),
]
prompt_formatter = OpenAIPromptFormatter(
    content_separator="---"
)
print(prompt_formatter.format(prompt))

Output example:

User: Hello---how are you?
Assistant: I'm fine---thank you!
User: What is the capital of France?
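The "Role: content" layout shown above can be sketched in plain Python. This is a minimal illustration of the output format, not the library's actual implementation; it assumes roles render as capitalized labels, with plain role strings standing in for PromptRole values:

```python
def format_openai_style(prompt, content_separator="\n"):
    """Render each message as 'Role: content', one message per line."""
    lines = [
        f"{role.capitalize()}: {content_separator.join(contents)}"
        for role, contents in prompt
    ]
    return "\n".join(lines)


prompt = [
    ("user", ["Hello", "how are you?"]),
    ("assistant", ["I'm fine", "thank you!"]),
    ("user", ["What is the capital of France?"]),
]
print(format_openai_style(prompt, content_separator="---"))
```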

Initializes a new instance of the OpenAIPromptFormatter class.

Parameters:

    content_separator (str): The separator used between the strings in a single message. Defaults to "\n".