Key references:

Real-world use cases: https://python.langchain.com.cn/docs/use_cases

Video tutorial index: https://python.langchain.com.cn/docs/additional_resources/youtube

LangChain API reference: https://api.python.langchain.com/en/latest/api_reference.html

What is LangChain?

LangChain is a framework for developing applications powered by language models.

LangChain provides components and interfaces that integrate a broad ecosystem, including vector database services, data sources, and LLM application development tools, making it easy for developers to use language models to build features such as document analysis and summarization, chatbots, and code analysis.

In LLM application development, LangChain lets you: flexibly build and manage prompt templates; conveniently handle context memory across multi-turn conversations; augment the LLM with external knowledge bases; and construct agents and custom tools to serve complex LLM call chains.

From the perspective of the core modules …

Model I/O: interfaces for working with language models, making it convenient to implement and organize interactions with an LLM (a small sketch follows the list)

  • prompts: templatize, dynamically select, and manage model inputs
  • models: call language models through common interfaces
  • output_parsers: extract information from model outputs
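As a quick illustration of these three pieces working together, an output parser can turn raw completion text into structured data. This is a minimal sketch; the "list five" prompt and the environment-based API key are illustrative assumptions, not part of the original notes:

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import CommaSeparatedListOutputParser

# parser that expects a comma-separated list and returns a Python list
parser = CommaSeparatedListOutputParser()

prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

llm = OpenAI(temperature=0)  # assumes OPENAI_API_KEY is set in the environment
output = llm(prompt.format(subject="programming languages"))
print(parser.parse(output))  # a Python list of five language names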

Data connection: interfaces for working with application-specific data, supporting the full LLM-related data-processing pipeline (an end-to-end sketch follows the list)

  • Document loaders: load documents from many different sources
  • Document transformers: split documents, drop redundant documents, and more
  • Text embedding models: turn unstructured text into lists of floating-point numbers
  • Vector stores: store and retrieve embedded data
  • Retrievers: query your data
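A minimal end-to-end sketch of how these five components chain together. Hedged: the file name, the FAISS store (which needs the faiss package), and OpenAI embeddings are illustrative choices, not requirements; Part 3 below walks through the same pipeline in depth with Pinecone:

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

docs = TextLoader("state_of_the_union.txt").load()  # document loader
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50
).split_documents(docs)  # document transformer
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())  # embeddings + vector store
retriever = vectorstore.as_retriever()  # retriever
print(retriever.get_relevant_documents("What did the president say?")[0].page_content)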

Chains: construct call sequences, supporting the chained calls needed to build complex applications with LLMs (chaining to other LLMs / chaining to other components)

Memory: persist application state between runs of a chain, managing and using the conversational context (message history) with the LLM

Agents: flexibly chain calls to the LLM and other tools based on user input (similar to LLM plugins)

  • Action agents: at each step, decide the next action using the outputs of all previous actions
  • Plan-and-execute agents: decide the complete sequence of actions up front, then execute them all without updating the plan

Callbacks: log and stream the intermediate steps of any chain, hooking into the various stages of an LLM application (e.g., logging, monitoring, streaming, and other tasks)
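A minimal sketch of a chain with a callback attached (hedged: the prompt, model settings, and choice of StdOutCallbackHandler, which simply prints chain events to stdout, are illustrative):

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.callbacks import StdOutCallbackHandler

chain = LLMChain(
    llm=OpenAI(temperature=0),
    prompt=PromptTemplate(input_variables=["country"], template="What is the capital of {country}?"),
    callbacks=[StdOutCallbackHandler()]  # logs chain start/end events as the chain runs
)
chain.run(country="Norway")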

From the perspective of practical applications …

Part 1: Prompt Engineering

Manage and use prompt templates more effectively: the prompt template objects LangChain provides could in principle be replaced with f-strings, but they standardize the prompting process, allow parameters to be added flexibly, build prompts in an object-oriented way, and integrate better with LangChain's other modules.

  • Using the PromptTemplate object
import os

from langchain.llms import OpenAI
from langchain import PromptTemplate

openai = OpenAI(
    model_name="text-davinci-003",
    openai_api_key=os.getenv('OPENAI_API_KEY')
)

template = "Question: {query}\nAnswer: "
prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template
)

prompt = prompt_template.format(query="Who's Jackie Chan?")

response = openai(prompt)
  • Using the FewShotPromptTemplate object

    Wrap the few-shot examples in a PromptTemplate object

    Structurally decompose the few-shot prompt into prefix + examples + suffix

from langchain import FewShotPromptTemplate, PromptTemplate

# create examples
examples = [
    {"query": "xxx?", "answer": "xxx"},
    {"query": "xxx?", "answer": "xxx"},
]

# create an example template
example_template = "User: {query}\nAI: {answer}"

# create a prompt example from the above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """The following are excerpts from conversations
with an AI assistant. The assistant is typically sarcastic
and witty, producing creative and funny responses to the
users questions. Here are some examples: """
# and the suffix is our user input and output indicator
suffix = "User: {query}\nAI: "

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

prompt = few_shot_prompt_template.format(query="What is the meaning of life?")

When there are many examples, dynamically select and construct the few-shot prompt

Length-based selection: limits excessive token usage and avoids errors caused by exceeding the LLM's maximum context window

from langchain.prompts.example_selector import LengthBasedExampleSelector

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50  # this sets the max length that examples should be
)

dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,  # use example_selector instead of examples
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

Part 2: Conversational Memory

Conveniently manage and organize the context of a continuous conversation

  • Helps the stateless LLM interact in a way that resembles a stateful environment, able to consider and refer back to past interactions
  • You can implement your own memory module, use multiple types of memory in the same chain, combine them with agents, and more

Use ConversationChain to manage the LLM's conversational memory

from langchain import OpenAI
from langchain.chains import ConversationChain

# first initialize the large language model
llm = OpenAI(
    temperature=0,
    openai_api_key=os.getenv('OPENAI_API_KEY'),
    model_name="text-davinci-003"
)

# now initialize the conversation chain
conversation = ConversationChain(llm=llm)

conversation.prompt.template contains two parameters, {history} and {input}: history is where the conversational memory is used, and input is where the latest human query is placed.
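You can print the default template to see this for yourself:

# inspect where {history} and {input} are interpolated
print(conversation.prompt.template)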

ConversationChain can be used with several types of conversational memory; each modifies the text passed to the {history} parameter.

LangChain provides several different implementations of the memory used in a ConversationChain

  • Using ConversationBufferMemory, whose buffer stores every interaction in the chat history verbatim

The number of tokens consumed per interaction grows roughly linearly, so it is easy to exceed the context window limit

from langchain.chains.conversation.memory import ConversationBufferMemory

conversation_buf = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)
  • Using ConversationSummaryMemory, which summarizes each new interaction and appends it to a running summary of all past interactions

Per-interaction token consumption grows more slowly than linear but does not converge; when there are only a few conversation turns it actually consumes relatively more tokens

from langchain.chains.conversation.memory import ConversationSummaryMemory

conversation = ConversationChain(
    llm=llm,
    memory=ConversationSummaryMemory(llm=llm)  # summarization is powered by the LLM
)

See conversation.memory.prompt.template for the detailed prompt template used to generate the summary

  • Using ConversationBufferWindowMemory, which keeps only a given number of the most recent interactions and forgets the rest
from langchain.chains.conversation.memory import ConversationBufferWindowMemory

conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=6)
)
  • Using ConversationSummaryBufferMemory, which remembers distant interactions (as a summary) while storing recent interactions in their raw form
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory

conversation_sum_bufw = ConversationChain(
    llm=llm,
    memory=ConversationSummaryBufferMemory(llm=llm, max_token_limit=650)
)
  • Other memory types are also available, such as ConversationKnowledgeGraphMemory and ConversationEntityMemory

Part 3: Knowledge Bases

Conveniently augment the LLM with external knowledge bases

  • Use retrieval augmentation to fetch relevant information from an external knowledge base and supply it to the LLM
  • Use a vector database as the knowledge base to ground the LLM in source knowledge
  1. Get the knowledge base data
  • This could be code documentation for an LLM coding assistant, company documents for an internal chatbot, or anything else

As an example, use the Hugging Face datasets library to fetch a subset of Wikipedia:

from datasets import load_dataset

data = load_dataset("wikipedia", "20220301.simple", split='train[:10000]')

"""
Dataset({
    features: ['id', 'url', 'title', 'text'],
    num_rows: 10000
})
"""
  2. Create text chunks

Why split the text into smaller chunks?

  • Improves "embedding accuracy"
  • Reduces the amount of text fed into the LLM: limiting the input improves the LLM's ability to follow instructions, lowers generation cost, and yields faster responses
  • Narrows the information source down to smaller chunks, giving more precise provenance
  • Splits very long texts (beyond the maximum context window) so they can be added to the knowledge base

Use RecursiveCharacterTextSplitter to split text into chunks no longer than chunk_size

  • Note: the tiktoken library is used here to count tokens
import tiktoken  # ! pip install tiktoken

tokenizer = tiktoken.get_encoding('p50k_base')

# create the length function
def tiktoken_len(text):
    tokens = tokenizer.encode(text, disallowed_special=())
    return len(tokens)

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,
    chunk_overlap=20,
    length_function=tiktoken_len,
    separators=["\n\n", "\n", " ", ""]
)

chunks = text_splitter.split_text(data[6]['text'])[:3]
  3. Create embeddings

The embedding results are stored in a vector database, where text chunks with similar meanings can be found by computing the distance between embeddings in the vector space.
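For intuition, "distance" here is typically cosine similarity between embedding vectors. A small numpy sketch, independent of any particular vector database (the toy 3-dimensional vectors are illustrative; real embeddings have hundreds or thousands of dimensions):

import numpy as np

def cosine_similarity(a, b):
    # dot product of the vectors divided by the product of their norms
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# values closer to 1.0 mean the two texts are more similar in meaning
print(cosine_similarity([0.1, 0.9, 0.0], [0.2, 0.8, 0.1]))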

from langchain.embeddings.openai import OpenAIEmbeddings

embed = OpenAIEmbeddings(
    document_model_name='text-embedding-ada-002',
    query_model_name='text-embedding-ada-002',
    openai_api_key=os.getenv('OPENAI_API_KEY')
)

texts = [
    'first chunk of text',
    'second chunk of text',
    'third chunk of text'
]
res = embed.embed_documents(texts)  # three 1536-dimensional vectors for text-embedding-ada-002
  4. Use a vector database

Popular open-source vector databases: Milvus, etc.; popular closed-source vector databases: Pinecone, etc.

Using the Pinecone vector database as an example (an API key is required: https://app.pinecone.io/):

import pinecone

pinecone.init(
    api_key="YOUR_API_KEY",  # find api key in console at app.pinecone.io
    environment="YOUR_ENV"   # find next to api key in console
)

index_name = 'langchain-retrieval'  # any index name of your choosing

# create a new index
pinecone.create_index(
    name=index_name,
    metric='dotproduct',
    dimension=len(res[0])  # 1536, dim of text-embedding-ada-002
)

# link to the new index
index = pinecone.GRPCIndex(index_name)

index.describe_index_stats()
"""
{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}
"""

Since no vectors have been added yet, the new Pinecone index's total_vector_count is 0.

Indexing works by iterating over the data to be added to the knowledge base, creating an ID, embedding, and metadata for each item, and then adding these to the index.

  • This process can be batched to speed it up
from tqdm.auto import tqdm
from uuid import uuid4

batch_limit = 100

texts = []
metadatas = []

for i, record in enumerate(tqdm(data)):
    # get metadata fields for this record
    metadata = {
        'wiki-id': str(record['id']),
        'source': record['url'],
        'title': record['title']
    }
    # create chunks from the record text
    record_texts = text_splitter.split_text(record['text'])
    # create individual metadata dicts for each chunk
    record_metadatas = [{
        "chunk": j, "text": text, **metadata
    } for j, text in enumerate(record_texts)]
    # append these to current batches
    texts.extend(record_texts)
    metadatas.extend(record_metadatas)
    # if we have reached the batch_limit we can add texts
    if len(texts) >= batch_limit:
        ids = [str(uuid4()) for _ in range(len(texts))]
        embeds = embed.embed_documents(texts)
        index.upsert(vectors=zip(ids, embeds, metadatas))
        texts = []
        metadatas = []

# upsert whatever is left over after the final full batch
if texts:
    ids = [str(uuid4()) for _ in range(len(texts))]
    embeds = embed.embed_documents(texts)
    index.upsert(vectors=zip(ids, embeds, metadatas))
index.describe_index_stats()
"""
{'dimension': 1536,
 'index_fullness': 0.1,
 'namespaces': {'': {'vector_count': 27437}},
 'total_vector_count': 27437}
"""
  5. Vector storage and querying

Reconnect to the index we created, this time through the LangChain library

  • The index built standalone in the previous step has no dependency on LangChain
from langchain.vectorstores import Pinecone

text_field = "text"

# switch back to the normal (non-gRPC) index for langchain
index = pinecone.Index(index_name)

vectorstore = Pinecone(
    index, embed.embed_query, text_field
)

Query directly using the similarity_search method

query = "who was Benito Mussolini?"

vectorstore.similarity_search(
    query=query,
    k=3  # return 3 most relevant docs
)

This returns the relevant context retrieved from the knowledge base, which the LLM can then use to generate an answer.

Take generative question answering (pass the question to the LLM and instruct it to answer based on information returned from the knowledge base) as an example:

Use the RetrievalQA chain to generate an answer from the information retrieved from the vector database

from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# completion llm
llm = ChatOpenAI(
    openai_api_key=os.getenv('OPENAI_API_KEY'),
    model_name='gpt-3.5-turbo',
    temperature=0
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

response = qa.run(query)

Use RetrievalQAWithSourcesChain to let users see the sources of the information, increasing trust in the answers provided

from langchain.chains import RetrievalQAWithSourcesChain

qa_with_sources = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

response = qa_with_sources(query)  # the result includes both the answer and its sources

Part 4: Conversational Agents

Agents can be seen as tooling for the LLM: for example, an agent is an LLM that can use a calculator, run a search, or execute code


The three elements of an agent: a base LLM, the tools it interacts with, and the agent (type) that controls the interaction

LangChain provides a large number of prebuilt tools. The example below includes a calculator tool and a general-purpose conversational tool; the agent can pick whichever it needs, handling precise calculations as well as ordinary conversation:

"""Basic LLM"""
from langchain import OpenAI

llm = OpenAI(
openai_api_key=os.getenv("OPENAI_API_KEY"),
temperature=0,
model_name="text-davinci-003"
)


"""Tool(s) to be interacted"""
from langchain.chains import LLMMathChain
from langchain.agents import Tool

llm_math = LLMMathChain(llm=llm)
# initialize the math tool
math_tool = Tool(
name='Calculator',
func=llm_math.run,
description='Useful for when you need to answer questions about math.'
)
llm_tool = Tool(
name ='Language Model',
func = llm_chain.run,
description ='use this tool for general purpose queries and logic'
)
# when giving tools to LLM, we must pass as list of tools
tools = [math_tool, llm_tool]
# directly load pre-constructed tools
# from langchain.agents import load_tools
# tools = load_tools(['llm-math'], llm=llm)


"""Agent"""
from langchain.agents import initialize_agent

zero_shot_agent = initialize_agent(
agent="zero-shot-react-description", # stateless + ReAct method
tools=tools,
llm=llm,
verbose=True,
max_iterations=3
)
zero_shot_agent("what is (4.5*2.1)^2.2?")
zero_shot_agent("what is the capital of Norway?")

Agent types

  • Zero Shot ReAct ("zero-shot-react-description"): stateless (no memory)

  • Conversational ReAct ("conversational-react-description"): stateful (with memory)

    from langchain.memory import ConversationBufferMemory

    memory = ConversationBufferMemory(memory_key="chat_history")

    conversational_agent = initialize_agent(
        agent='conversational-react-description',
        tools=tools,
        llm=llm,
        verbose=True,
        max_iterations=3,
        memory=memory,
    )
  • ReAct Docstore ("react-docstore"): uses LangChain's docstore for information search and lookup

    from langchain import Wikipedia
    from langchain.agents.react.base import DocstoreExplorer

    docstore = DocstoreExplorer(Wikipedia())
    tools = [
        Tool(
            name="Search",  # search for relevant articles
            func=docstore.search,
            description='search wikipedia'
        ),
        Tool(
            name="Lookup",  # find relevant chunks of information in retrieved articles
            func=docstore.lookup,
            description='lookup a term in wikipedia'
        )
    ]

    docstore_agent = initialize_agent(
        tools=tools,
        llm=llm,
        agent="react-docstore",
        verbose=True,
        max_iterations=3
    )

    docstore_agent("What were Archimedes' last words?")
  • Self-ask With Search ("self-ask-with-search"): performs search and question-asking steps as needed to arrive at a final answer

    from langchain import SerpAPIWrapper

    # initialize the search chain
    search = SerpAPIWrapper(serpapi_api_key='serp_api_key')

    # create a search tool
    tools = [
        Tool(
            name="Intermediate Answer",
            func=search.run,
            description='google search'
        )
    ]

    # initialize the search-enabled agent
    self_ask_with_search = initialize_agent(
        tools=tools,
        llm=llm,
        agent="self-ask-with-search",
        verbose=True
    )

    self_ask_with_search("who lived longer; Plato, Socrates, or Aristotle?")

Part 5: Custom Tools

A simple custom tool

Take a simple tool that calculates the circumference of a circle as an example:

from langchain.tools import BaseTool  # base class for all LangChain tools
from math import pi
from typing import Union

class CircumferenceTool(BaseTool):
    # required attributes of a LangChain tool
    name = "Circumference calculator"
    description = "use this tool when you need to calculate a circumference using the radius of a circle"

    # synchronous call (the default)
    def _run(self, radius: Union[int, float]):
        return float(radius) * 2.0 * pi

    # asynchronous call
    def _arun(self, radius: int):
        raise NotImplementedError("This tool does not support async")
from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory


# initialize LLM (we use ChatOpenAI because we'll later define a `chat` agent)
llm = ChatOpenAI(
    openai_api_key=os.getenv("OPENAI_API_KEY"),
    temperature=0,
    model_name='gpt-3.5-turbo'
)

# initialize conversational memory
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)
from langchain.agents import initialize_agent

tools = [CircumferenceTool()]

# initialize agent with tools
agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)
agent("can you calculate the circumference of a circle that has a radius of 7.81mm")

The agent gives an inaccurate answer here. Inspecting the call chain shows this happened because the agent decided not to use the circumference calculator tool and attempted the math itself.

Inspect the agent's built-in prompt and, building on it, add a sentence telling the model it should never attempt to do math on its own.

print(agent.agent.llm_chain.prompt.messages[0].prompt.template)
sys_msg = """Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Unfortunately, Assistant is terrible at maths. When provided with math questions, no matter how simple, assistant always refers to its trusty tools and absolutely does NOT try to answer math questions by itself

Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
"""

new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt
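With the updated system message in place, re-running the same question should now make the agent route the calculation to the Circumference calculator tool:

agent("can you calculate the circumference of a circle that has a radius of 7.81mm")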

Tools with multiple parameters

Below is an example of a hypotenuse calculator tool for triangles:

from typing import Optional, Union
from math import sqrt, cos, sin, radians

desc = (
    "use this tool when you need to calculate the length of a hypotenuse "
    "given one or two sides of a triangle and/or an angle (in degrees). "
    "To use the tool, you must provide at least two of the following parameters "
    "['adjacent_side', 'opposite_side', 'angle']."
)  # teach the LLM about the input format / requirements

class PythagorasTool(BaseTool):
    name = "Hypotenuse calculator"
    description = desc

    def _run(
        self,
        adjacent_side: Optional[Union[int, float]] = None,
        opposite_side: Optional[Union[int, float]] = None,
        angle: Optional[Union[int, float]] = None
    ):
        # check for the values we have been given
        if adjacent_side and opposite_side:
            return sqrt(float(adjacent_side)**2 + float(opposite_side)**2)
        elif adjacent_side and angle:
            # the angle arrives in degrees, so convert before taking cos/sin
            return adjacent_side / cos(radians(float(angle)))
        elif opposite_side and angle:
            return opposite_side / sin(radians(float(angle)))
        else:
            return "Could not calculate the hypotenuse of the triangle. Need two or more of `adjacent_side`, `opposite_side`, or `angle`."

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")

tools = [PythagorasTool()]
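One way to make the existing agent aware of the new tool (a sketch following the same prompt-update pattern as above; the triangle question is illustrative):

# refresh both the agent's prompt and the executor's tool list
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt
agent.tools = tools

agent("If I have a triangle with the opposite side of length 51 and the adjacent side of 40, what is the length of the hypotenuse?")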

More advanced tool usage

  • Add expert models (capable of tasks the LLM itself cannot do) as tools, so that the agent acts as a controller for these models.
  • Beyond that, tools can be used to integrate with an endless range of functions and services, or to communicate with a series of expert models.
  • LangChain's default tools usually suffice for running SQL queries, performing calculations, or doing vector search; when they cannot meet your requirements, you need to build your own tools.

For example: the open-source expert model Salesforce/blip-image-captioning-large takes an image and produces a description of it (hosted on Hugging Face).
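A sketch of wrapping this model as a LangChain tool, assuming the transformers and Pillow libraries are installed (the tool description, URL-based input, and max_new_tokens setting are illustrative choices):

import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from langchain.tools import BaseTool

# load the expert model once, outside the tool
hf_model = "Salesforce/blip-image-captioning-large"
processor = BlipProcessor.from_pretrained(hf_model)
model = BlipForConditionalGeneration.from_pretrained(hf_model)

class ImageCaptionTool(BaseTool):
    name = "Image captioner"
    description = "use this tool when given the URL of an image that you need to describe"

    def _run(self, url: str):
        # download the image and prepare it for the model
        image = Image.open(requests.get(url, stream=True).raw).convert('RGB')
        inputs = processor(image, return_tensors="pt")
        # generate a caption and decode it back to text
        out = model.generate(**inputs, max_new_tokens=20)
        return processor.decode(out[0], skip_special_tokens=True)

    def _arun(self, url: str):
        raise NotImplementedError("This tool does not support async")

tools = [ImageCaptionTool()]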