Implementing a Chatbot

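In the previous chapter we built the simplest possible LangChain application, which called an LLM through a fixed prompt template. In this note we build a more complex example: a chatbot that can "remember" the conversation history and hold a multi-turn conversation.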

Conversation History in LangChain

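An LLM is actually stateless and has no "memory" of its own. It appears to remember what was said earlier in a multi-turn conversation only because the application layer sends the entire conversation history to it on every call. The following example shows how to pass the conversation history in LangChain.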

from langchain.globals import set_debug, set_verbose
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama

# Print extra debugging information
set_debug(True)
set_verbose(True)

# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ('human', 'Hi, my name is Aiko.'),
    ('ai', 'Hi, Aiko.'),
    ('human', 'What is my name?'),
])

# Create the LLM object
model = ChatOllama(model='llama3:8b-instruct-q5_K_M', temperature=1)

# Assemble the chain
chain = prompt_template | model | StrOutputParser()

# Invoke the chain and print the result
result = chain.invoke({})
print(result)

Output:

Your name is Aiko! Nice to meet you!

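Here we used ChatPromptTemplate.from_messages() to create the prompt. It is LangChain's prompt template for multi-turn conversations, and its argument is a list of messages. Each human tuple is a message we sent to the LLM, and each ai tuple is a message the LLM replied with. The LLM generates its next output from the conversation history that this list represents.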
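
The message list is the whole of the model's "memory". As a minimal sketch in plain Python, without LangChain, the following shows why the application has to resend the growing history on every turn. The fake_llm function is a hypothetical stand-in for a real model call, and the chat helper is likewise an illustrative name, not part of any library.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content), the same shape as the tuples above


def fake_llm(messages: List[Message]) -> str:
    """Hypothetical stand-in for an LLM: it only sees the messages passed in."""
    question = messages[-1][1]
    if question == 'What is my name?':
        # Search the earlier messages of this call for an introduction.
        for role, content in messages[:-1]:
            if role == 'human' and content.startswith('Hi, my name is '):
                name = content[len('Hi, my name is '):].rstrip('.')
                return f'Your name is {name}.'
        return 'I do not know your name.'
    return 'Hi!'


# The application layer owns the conversation history.
history: List[Message] = []


def chat(user_message: str) -> str:
    """Append the user message, call the model with the full history, store the reply."""
    history.append(('human', user_message))
    reply = fake_llm(history)
    history.append(('ai', reply))
    return reply


# Without any history the stand-in model cannot answer.
print(fake_llm([('human', 'What is my name?')]))
# With the history resent on each call, it can.
chat('Hi, my name is Aiko.')
print(chat('What is my name?'))
```

The first print shows the reply without history, the second the reply once the earlier introduction is included in the messages.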

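Besides human and ai messages, most LLMs also support system messages, which are typically used to set global rules for the conversation. The following code is an example that instructs the LLM to add emoji to its replies to liven things up.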

prompt_template = ChatPromptTemplate.from_messages([
    ('system', 'You are a helpful AI assistant. You always add emojis to your responses.'),
    ('human', 'Hi, my name is Aiko.'),
    ('ai', 'Hi, Aiko.'),
    ('human', 'What is my name?'),
])

With the system prompt added, the output may look like this:

Your name is Aiko 🙋‍♀️! 😊

Maintaining Conversation History on the Server Side

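The code above hard-codes a fixed list as the conversation history, which is not how real applications work. The history of our conversation with the AI has to be updated continuously. In addition, for an LLM service to talk with multiple people, we need a session mechanism so that each user has a separate conversation history in their own session. We could implement all of this ourselves, but LangChain also provides a wrapper for it.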

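Part of the session-related implementation (the ChatMessageHistory class used below) is packaged in langchain_community, so we need to install that package first.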

pip install langchain_community

Below is the code for an AI chatbot that supports multi-turn conversation.

from langchain.globals import set_debug, set_verbose
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Print extra debugging information
set_debug(True)
set_verbose(True)

# Keep the conversation histories in memory, in a dict keyed by session ID
chat_history = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    """Return the conversation history for the given session ID."""
    if session_id not in chat_history:
        chat_history[session_id] = ChatMessageHistory()
    return chat_history[session_id]


# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ('system', 'You are a helpful AI assistant. You always add emojis to your responses.'),
    MessagesPlaceholder(variable_name='history')
])

# Create the LLM object
model = ChatOllama(model='llama3:8b-instruct-q5_K_M', temperature=1)

# Assemble the chain
chain = prompt_template | model | StrOutputParser()

# Wrap the chain with the session handling layer
chain_with_history = RunnableWithMessageHistory(chain, get_session_history, input_messages_key='history')

# Invoke the chain and print the result
while True:
    message = input()
    if message == 'exit':
        break
    result = chain_with_history.invoke({'history': [('human', message)]},
                                       config={'configurable': {'session_id': 'abc123'}})
    print(result)

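In this code, all conversation histories are kept in the chat_history dict. In real projects we might store them in a database instead. Either way, the lookup logic is encapsulated in the get_session_history function, which takes a session ID and returns the conversation history so far.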

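After assembling the chain, we wrapped it in a RunnableWithMessageHistory object, which adds the session mechanism to the whole chain. We can then use this object for multi-turn conversation with history. In the invoke() call we pass the session ID in addition to the current input message; calls with the same session ID belong to the same conversation.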

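That said, LangChain's session layer has its problems: it is wrapped so deeply that it is inflexible to use, while in real projects there is a lot of room to manipulate and optimize the conversation history. It may well not meet our needs, so it is covered here only for reference. It is perfectly fine not to use it in practice: passing the full conversation history that we maintain ourselves to the chain as it was before being wrapped in RunnableWithMessageHistory has the same effect.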
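
As a rough illustration of maintaining the history ourselves, the sketch below keeps one message list per session in a plain dict and sends only the most recent messages on each call. SessionStore, max_messages, ask, and echo_chain are hypothetical names introduced for this sketch. echo_chain merely stands in for the unwrapped chain, which in the real code would be called as chain.invoke({'history': messages}); the fixed-size window is just one assumed example of the kind of optimization the history allows.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


class SessionStore:
    """Keeps one message list per session ID and exposes a trimmed view of it."""

    def __init__(self, max_messages: int = 6) -> None:
        self.max_messages = max_messages
        self._sessions: Dict[str, List[Message]] = defaultdict(list)

    def window(self, session_id: str) -> List[Message]:
        """Return a copy of the most recent messages of this session."""
        return list(self._sessions[session_id][-self.max_messages:])

    def append(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append((role, content))


def ask(store: SessionStore, chain: Callable[[List[Message]], str],
        session_id: str, user_message: str) -> str:
    """Record the user message, call the chain with the trimmed history, record the reply."""
    store.append(session_id, 'human', user_message)
    reply = chain(store.window(session_id))
    store.append(session_id, 'ai', reply)
    return reply


def echo_chain(messages: List[Message]) -> str:
    """Hypothetical stand-in for the chain: reports what it was given."""
    return f'I can see {len(messages)} message(s); the last one is: {messages[-1][1]}'


store = SessionStore(max_messages=4)
print(ask(store, echo_chain, 'alice', 'Hello'))
print(ask(store, echo_chain, 'bob', 'Hi'))
print(ask(store, echo_chain, 'alice', 'How are you?'))
```

Each session ID sees only its own messages, and the window caps how much history is sent to the model.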

Streaming Output

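An LLM usually takes a long time to output a large block of text, and a chatbot is a fairly real-time application, so we normally stream the output rather than keep the user waiting. LangChain wraps streaming in a method that makes this easy to implement. Here is an example.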

from langchain.globals import set_debug, set_verbose
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Print extra debugging information
set_debug(True)
set_verbose(True)

# Keep the conversation histories in memory, in a dict keyed by session ID
chat_history = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    """Return the conversation history for the given session ID."""
    if session_id not in chat_history:
        chat_history[session_id] = ChatMessageHistory()
    return chat_history[session_id]


# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ('system', 'You are a helpful AI assistant. You always add emojis to your responses.'),
    MessagesPlaceholder(variable_name='history')
])

# Create the LLM object
model = ChatOllama(model='llama3:8b-instruct-q5_K_M', temperature=1)

# Assemble the chain and wrap it with the session handling layer
chain = prompt_template | model | StrOutputParser()
chain_with_history = RunnableWithMessageHistory(chain, get_session_history, input_messages_key='history')

# Stream the chain output and print each chunk as it arrives
while True:
    message = input()
    if message == 'exit':
        break
    for response in chain_with_history.stream({'history': [('human', message)]},
                                              config={'configurable': {'session_id': 'abc123'}}):
        print(response, end='')
    print('')

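In this code, we call the chain's stream() method instead of invoke(). Iterating over its return value reads the LLM output as a stream. In real projects, if we are writing a web service, we would usually send the streamed messages via SSE (Server-Sent Events); see the chapters on web frameworks for details, which are not covered here.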
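
For the web-service case, the sketch below shows only the framing of an SSE stream: each event is a data: line followed by a blank line. fake_stream is a hypothetical stand-in for chain_with_history.stream(...), to_sse is an illustrative helper name, and the [DONE] end marker is a convention assumed here rather than part of the SSE format or of LangChain. How the frames are written to the HTTP response depends on the web framework.

```python
import json
from typing import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for chain.stream(...): yields text chunks one by one."""
    yield from ['Your name', ' is', ' Aiko!\n']


def to_sse(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap each text chunk in an SSE event frame."""
    for chunk in chunks:
        # JSON-encode the chunk so that newlines inside it cannot break the framing.
        yield f'data: {json.dumps(chunk, ensure_ascii=False)}\n\n'
    # Assumed end-of-stream marker so the client knows when to stop listening.
    yield 'data: [DONE]\n\n'


for frame in to_sse(fake_stream()):
    print(frame, end='')
```

On the client side, each data payload would be JSON-decoded and appended to the text shown so far.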

Building Complex Prompts with Mustache Templates

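Sometimes the prompt we need is quite complex and may contain conditional and loop logic, so generating it requires a template engine. Earlier versions of LangChain integrated the widely used Jinja2 template engine, whereas the latest version at the time of writing uses the less common Mustache instead; the usage is much the same. The prompt built in the following example is somewhat more complex and implements a kind of "role play". In the system prompt we use the template engine to iterate over a list, npc_data, and read the attributes of its elements.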

from langchain.globals import set_debug, set_verbose
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama

# Print extra debugging information
set_debug(True)
set_verbose(True)

# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ('system', """
    {{#npc_data}}
    Name: {{name}}
    Age: {{age}}
    Desc: {{desc}}
    {{/npc_data}}
    Now you play the role of {{npc}}. You print what {{npc}} says and surround his or her actions with `*`.
    Reply with the format below in {{lang}}:
    {{npc}}: What {{npc}} says.*{{npc}}'s action and psychological activities*
    """),
    MessagesPlaceholder(variable_name='history')
], template_format='mustache')

# Create the LLM object
model = ChatOllama(model='llama3:8b-instruct-q5_K_M', temperature=1)

# Assemble the chain
chain = prompt_template | model | StrOutputParser()

# Some data used to fill in the template
npc_data = [
    {'name': 'Tom', 'age': '18', 'desc': 'a worker in a factory'},
    {'name': 'Jerry', 'age': '17', 'desc': 'a high school student'},
]

player = 'Tom'
npc = 'Jerry'

# Invoke the chain and print the result
user_input = input()
result = chain.invoke(
    {'npc_data': npc_data, 'npc': npc, 'player': player, 'lang': 'English', 'history': [('human', user_input)]})
print(result)

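When building the prompt with ChatPromptTemplate.from_messages(), we specified the template_format parameter. Its default value is f-string, which only supports basic string substitution. Here we set it to mustache, that is, we use the Mustache template engine, so our system prompt renders into the content we need.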
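
To see what the {{#npc_data}}...{{/npc_data}} section contributes, the plain-Python sketch below builds roughly the same text by hand: the section body is repeated once per list element, with {{name}}, {{age}}, and {{desc}} resolved against that element. The render_npc_section function is a hypothetical helper written for this illustration; it only approximates what the template engine does, and the exact whitespace of the real rendering may differ.

```python
from typing import Dict, List

npc_data = [
    {'name': 'Tom', 'age': '18', 'desc': 'a worker in a factory'},
    {'name': 'Jerry', 'age': '17', 'desc': 'a high school student'},
]


def render_npc_section(items: List[Dict[str, str]]) -> str:
    """Repeat the section body once per item, like a Mustache list section."""
    blocks = []
    for item in items:
        # Inside the section, names are looked up on the current list element.
        blocks.append(
            f"Name: {item['name']}\nAge: {item['age']}\nDesc: {item['desc']}"
        )
    return '\n'.join(blocks)


print(render_npc_section(npc_data))
```

An empty list produces no text at all, which is also how a Mustache section behaves for an empty list.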

Author: Gacfox
