LlamaIndex (3): LlamaIndex Prompt

1. LlamaIndex Prompt

Prompts are the fundamental input that gives large language models (LLMs) their expressive power. LlamaIndex uses prompts to build indices, perform insertions, traverse at query time, and synthesize the final answer.

LlamaIndex provides a set of default prompts, as well as chat-specific prompts.

Users can also supply their own prompt templates to further customize the framework's behavior. The best way to customize is to copy the default prompts from the links above and use them as the basis for any modifications.

1.1 Usage Examples

1.1.1 Custom Prompts

Building from a format string

from llama_index.core import PromptTemplate

# create a prompt template
template = PromptTemplate(
    template="We have provided context information below.\n\n---------------------\n{context_str}\n---------------------\nGiven this information, please answer the question:\n{query_str}\n"
)

context_str = "In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters"
query_str = "How many params does llama 2 have"

# text prompt (for completion API)
prompt = template.format(context_str=context_str, query_str=query_str)
print(prompt)
print("==========================")
# message prompts (for chat API)
messages = template.format_messages(context_str=context_str, query_str=query_str)
print(messages)

Output

We have provided context information below.

---------------------
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters
---------------------
Given this information, please answer the question:
How many params does llama 2 have

==========================
[ChatMessage(role=<MessageRole.USER: 'user'>, content='We have provided context information below.\n\n---------------------\nIn this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters\n---------------------\nGiven this information, please answer the question:\nHow many params does llama 2 have\n', additional_kwargs={})]

You can also build from chat messages:

from llama_index.core import ChatPromptTemplate
from llama_index.core.llms import ChatMessage, MessageRole

message_templates = [
    ChatMessage(content="You are an expert system.", role=MessageRole.SYSTEM),
    ChatMessage(
        content="Generate a short story about {topic}",
        role=MessageRole.USER,
    ),
]
chat_template = ChatPromptTemplate(message_templates=message_templates)

# you can create message prompts (for chat API)
messages = chat_template.format_messages(topic="bear")
print(messages)
print("==========================")
# or easily convert to text prompt (for completion API)
prompt = chat_template.format(topic="bear")
print(prompt)

Output

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are an expert system.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Generate a short story about bear', additional_kwargs={})]
==========================
system: You are an expert system.
user: Generate a short story about bear
assistant:

1.1.2 Getting and Setting Custom Prompts

Because LlamaIndex is a multi-step pipeline, it is important to identify the operation you want to modify and pass the custom prompt in at the right place. For example, prompts are used in response synthesizers, retrievers, index construction, and more; some of these modules are nested inside others (the synthesizer is nested inside the query engine).
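Nested sub-modules surface their prompts under namespaced keys such as `response_synthesizer:text_qa_template`. The collection logic can be sketched in plain Python (a toy illustration only; `Module` is a hypothetical class, not LlamaIndex's actual implementation):

```python
# Toy sketch: a module exposes its own prompts plus those of its
# sub-modules, prefixing each nested key with the sub-module's name.
class Module:
    def __init__(self, prompts=None, submodules=None):
        self.prompts = prompts or {}
        self.submodules = submodules or {}

    def get_prompts(self):
        all_prompts = dict(self.prompts)
        for name, sub in self.submodules.items():
            for key, prompt in sub.get_prompts().items():
                all_prompts[f"{name}:{key}"] = prompt
        return all_prompts


# A synthesizer nested inside a query engine, as described above:
synth = Module(prompts={"text_qa_template": "...", "refine_template": "..."})
engine = Module(submodules={"response_synthesizer": synth})
print(list(engine.get_prompts()))
# ['response_synthesizer:text_qa_template', 'response_synthesizer:refine_template']
```

This is why passing a custom prompt at the right level matters: the key tells you which nested module actually owns the prompt.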

Full example: https://docs.llamaindex.ai/en/stable/examples/prompts/prompt_mixin/

Commonly Used Prompts

  • text_qa_template - used to get an initial answer to a query from the retrieved nodes.
  • refine_template - used when the retrieved text does not fit into a single LLM call with response_mode="compact" (the default), or when more than one node is retrieved with response_mode="refine". The answer from the first query is inserted as existing_answer, and the LLM must update or repeat the existing answer based on the new context.
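The refine flow described above can be sketched in plain Python (a toy illustration of the control flow, not LlamaIndex's implementation; `answer_fn` and `refine_fn` stand in for the two templated LLM calls):

```python
# Toy sketch of the refine loop: the first chunk is answered with the
# text_qa_template; every later chunk refines the previous answer,
# which is passed back in as existing_answer.
def synthesize(chunks, query, answer_fn, refine_fn):
    answer = None
    for chunk in chunks:
        if answer is None:
            answer = answer_fn(context_str=chunk, query_str=query)
        else:
            answer = refine_fn(
                context_msg=chunk, query_str=query, existing_answer=answer
            )
    return answer


# Stub "LLM calls" so the control flow is visible without a model:
result = synthesize(
    ["chunk one", "chunk two"],
    "q",
    answer_fn=lambda context_str, query_str: f"answer({context_str})",
    refine_fn=lambda context_msg, query_str, existing_answer: (
        f"refine({existing_answer}, {context_msg})"
    ),
)
print(result)  # refine(answer(chunk one), chunk two)
```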

Getting Prompts

You can call the get_prompts method on many LlamaIndex modules to get the list of prompts used by the module and its nested sub-modules.

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    load_index_from_storage,
    StorageContext,
)
from IPython.display import Markdown, display


# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(response_mode="compact")
prompts_dict = query_engine.get_prompts()
print(list(prompts_dict.keys()))

Output

['response_synthesizer:text_qa_template', 'response_synthesizer:refine_template']

Updating Prompts

You can customize the prompts of any module that implements get_prompts by calling the update_prompts function. Just pass in argument values whose keys match the keys of the prompt dictionary obtained from get_prompts.

# define prompt viewing function
def display_prompt_dict(prompts_dict):
    for k, p in prompts_dict.items():
        text_md = f"**Prompt Key**: {k}<br>" f"**Text:** <br>"
        display(Markdown(text_md))
        print(p.get_template())
        display(Markdown("<br><br>"))


from llama_index.core import PromptTemplate

qa_prompt_tmpl_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query in the style of a Shakespeare play.\n"
    "Query: {query_str}\n"
    "Answer: "
)

qa_prompt_tmpl = PromptTemplate(qa_prompt_tmpl_str)

query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": qa_prompt_tmpl}
)

prompts_dict = query_engine.get_prompts()
display_prompt_dict(prompts_dict)

Output

Prompt Key: response_synthesizer:text_qa_template
Text:

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query in the style of a Shakespeare play.
Query: {query_str}
Answer:



Prompt Key: response_synthesizer:refine_template
Text:

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer:

Modifying Prompts Used in the Query Engine

For query engines, you can also pass custom prompts in directly at query time, in two ways:

high-level API

query_engine = index.as_query_engine(
    text_qa_template=custom_qa_prompt, refine_template=custom_refine_prompt
)

low-level composition API

from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

retriever = index.as_retriever()
synth = get_response_synthesizer(
    text_qa_template=custom_qa_prompt, refine_template=custom_refine_prompt
)
query_engine = RetrieverQueryEngine(retriever, response_synthesizer=synth)

The two approaches above are equivalent; see [1] [2] for more information.

Modifying Prompts Used During Index Construction

Some indices use different types of prompts during construction (the most common ones, VectorStoreIndex and SummaryIndex, don't use any prompts at all).

For example, TreeIndex uses a summary prompt to hierarchically summarize nodes, while KeywordTableIndex uses a keyword-extraction prompt to extract keywords.

There are two equivalent ways to do this:

index = TreeIndex(nodes, summary_template=custom_prompt)
index = TreeIndex.from_documents(docs, summary_template=custom_prompt)

See [3] for more information.

1.2 Advanced Prompt Features [4] [5]

1.2.1 Partial Formatting

Partially format a prompt, filling in some variables while leaving others to be filled in later.

from llama_index.core import PromptTemplate

prompt_tmpl_str = "{foo} {bar}"
prompt_tmpl = PromptTemplate(prompt_tmpl_str)
partial_prompt_tmpl = prompt_tmpl.partial_format(foo="abc")

fmt_str = partial_prompt_tmpl.format(bar="def")
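Conceptually this works like functools.partial applied to a format string; a stdlib-only sketch of the same idea:

```python
from functools import partial

# Bind some format variables now, supply the rest later:
fmt = "{foo} {bar}".format
partial_fmt = partial(fmt, foo="abc")

print(partial_fmt(bar="def"))  # abc def
```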

1.2.2 Template Variable Mappings

LlamaIndex prompt abstractions usually expect certain specific keys. For example, text_qa_prompt expects context_str for the context and query_str for the user query.

But if you are trying to adapt an existing string template for use with LlamaIndex, changing its template variables can be annoying.

Instead, you can define template_var_mappings:

template_var_mappings = {"context_str": "my_context", "query_str": "my_query"}

prompt_tmpl = PromptTemplate(
    qa_prompt_tmpl_str, template_var_mappings=template_var_mappings
)
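What the mapping does can be sketched in plain Python (a toy illustration; `format_with_mappings` is a hypothetical helper, not the library's implementation): callers keep using the expected keys (context_str, query_str), which are renamed to the template's own variable names before formatting.

```python
# Toy sketch: rename expected keys to the template's own variable
# names before formatting (hypothetical helper, for illustration only).
def format_with_mappings(template, var_mappings, **kwargs):
    remapped = {var_mappings.get(k, k): v for k, v in kwargs.items()}
    return template.format(**remapped)


tmpl = "Context: {my_context}\nQuery: {my_query}"
mappings = {"context_str": "my_context", "query_str": "my_query"}
print(format_with_mappings(tmpl, mappings, context_str="ctx", query_str="q"))
# Context: ctx
# Query: q
```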

1.2.3 Function Mappings

You can also pass functions as template variables instead of fixed values, which enables things like dynamic few-shot prompting. An example that reformats context_str:

def format_context_fn(**kwargs):
    # format context with bullet points
    context_list = kwargs["context_str"].split("\n\n")
    fmtted_context = "\n\n".join([f"- {c}" for c in context_list])
    return fmtted_context


prompt_tmpl = PromptTemplate(
    qa_prompt_tmpl_str, function_mappings={"context_str": format_context_fn}
)

prompt_tmpl.format(context_str="context", query_str="query")

Output

'Context information is below.\n---------------------\n- context\n---------------------\nGiven the context information and not prior knowledge, answer the query in the style of a Shakespeare play.\nQuery: query\nAnswer: '

More examples: https://docs.llamaindex.ai/en/stable/module_guides/models/prompts/#example-guides

Official Resources


LlamaIndex (3): LlamaIndex Prompt
https://mztchaoqun.com.cn/posts/D16_LlamaIndex_Prompt/
Author: mztchaoqun
Published: April 7, 2024