mirror of
https://github.com/NoFxAiOS/nofx.git
synced 2026-06-06 05:51:19 +08:00
feat: port NOFXi agent module onto latest dev base (#1485)
* feat: integrate NOFXi agent into dev * Enhance NOFXi agent workflow and diagnostics
203
docs/agent-skills/diagnostic-skills.zh-CN.md
Normal file
@@ -0,0 +1,203 @@
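# NOFXi Diagnostic and Configuration Skills (First Batch)

This document captures the first batch of high-frequency diagnostic and configuration skills for the trading assistant.

The goal is not to make the model "think harder", but to have it follow stable, reusable troubleshooting paths when it meets common problems.

## Design Principles

- Answer from a skill first; do not re-plan freely for high-frequency questions
- Classify the problem first, then give causes, checks, and suggested fixes
- When a tool can verify the current state, check before drawing a conclusion
- For sensitive information, only guide the user on what to fill in; never echo it back in full
- When a conclusion is uncertain, label it explicitly as "more likely" or "suspect first"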

## skill_model_api_setup

### When to Use

- The user asks where to apply for a given LLM's API key
- The user asks how to fill in the base URL
- The user asks how to fill in the model name
- The user asks how to connect OpenAI / Claude / Gemini / DeepSeek / Qwen / Kimi / Grok / MiniMax

### Handling Strategy

1. Confirm which provider the user wants to configure
2. Tell the user the minimum set of fields to prepare:
   - provider
   - API key
   - custom_api_url
   - custom_model_name
3. If the system already has a default URL and default model name, offer those recommended values first
4. Organize the answer as steps; do not explain concepts in general terms

### Known Implementation Facts

- The system ships a default runtime configuration per provider; see `agent.resolveModelRuntimeConfig(...)`
- Common providers already have a default URL and default model name

## skill_model_config_diagnosis

### When to Use

- The model was saved successfully but the agent is still unavailable
- The message "AI unavailable" appears
- A message says the model is not enabled
- A message says custom_api_url is invalid
- The trader does not pick up the new configuration

### Check First

1. Whether any enabled model exists
2. Whether the API key is empty
3. Whether custom_api_url is a valid HTTPS URL
4. Whether custom_model_name is empty or does not match
5. Whether the current trader is bound to this model
6. Whether a trader reload was triggered after the model was updated

### Known Implementation Facts

- A non-HTTPS `custom_api_url` is rejected by the backend; see `api/handler_ai_model.go`
- An enabled model that lacks an API key or URL prevents the agent from becoming ready; see `agent.ensureAIClientForStoreUser(...)`
- After a model configuration is updated, the system tries to remove and reload the related traders so the new configuration takes effect immediately

### Output Format

- Symptom
- More likely cause
- What to check first
- How to fix it next
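
The HTTPS requirement on `custom_api_url` in this skill can be illustrated with a small check. This is a minimal sketch, assuming a standalone helper; `validateCustomAPIURL` is a hypothetical name and is not the code in `api/handler_ai_model.go`, whose actual rules may be stricter or differ in detail.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// validateCustomAPIURL is a hypothetical helper (not the project's handler).
// It accepts only absolute URLs that use the https scheme and name a host.
func validateCustomAPIURL(raw string) error {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return errors.New("custom_api_url is empty")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("custom_api_url is not a valid URL: %w", err)
	}
	if u.Scheme != "https" {
		return fmt.Errorf("custom_api_url must use https, got scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return errors.New("custom_api_url has no host")
	}
	return nil
}

func main() {
	// api.example.com is a placeholder, not a real provider endpoint.
	for _, raw := range []string{"https://api.example.com/v1", "http://api.example.com/v1", ""} {
		fmt.Printf("%q -> %v\n", raw, validateCustomAPIURL(raw))
	}
}
```

A check like this maps directly onto item 3 of the checklist: when it fails, the diagnosis can name the URL as the more likely cause before looking at trader bindings or reloads.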

## skill_exchange_api_setup

### When to Use

- The user wants to create a new exchange API
- The user does not know which permissions the exchange requires
- The user asks what to enter for API key / secret / passphrase

### General Handling Strategy

1. Confirm the exchange type first
2. State the required permissions and the prohibited permissions
3. State whether extra fields are needed
4. Emphasize the IP whitelist and permission settings
5. Guide the user back into the system to finish the binding

### Special Rules

- OKX requires a passphrase in addition to the API Key and Secret
- Bybit perpetual / futures trading requires the contract permission
- Enabling the withdrawal permission is not recommended

### Reference Documents

- `docs/getting-started/okx-api.md`
- `docs/getting-started/bybit-api.md`

## skill_exchange_api_diagnosis

### When to Use

- `invalid signature`
- `timestamp` errors
- `IP not allowed`
- `permission denied`
- The exchange cannot be reached

### Check First

1. Whether the system clock is synchronized
2. Whether the API Key / Secret are correct
3. Whether an extra field is missing, such as the OKX passphrase
4. Whether the IP whitelist includes the current server
5. Whether trading or contract permissions are enabled
6. Whether the key has expired or been regenerated

### Known Implementation Facts

- Clock drift is a frequent root cause of `invalid signature` / `timestamp` errors; see `docs/guides/TROUBLESHOOTING.zh-CN.md`
- A missing OKX passphrase causes signature-related problems; see `docs/getting-started/okx-api.md`

### Output Format

- Error symptom
- Most common root cause
- Order in which to check
- Fix steps
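
## skill_trader_start_diagnosis

### When to Use

- The trader will not start
- The trader started but has not begun trading
- The page shows it as started but nothing ever happens
- The user suspects a problem with the strategy / model / exchange binding

### Check First

1. Whether an enabled model configuration exists
2. Whether an enabled exchange configuration exists
3. Whether the trader is bound to an exchange_id / strategy_id / ai_model_id
4. Whether the exchange balance and permissions meet the conditions for placing orders
5. Whether the AI's most recent decision was wait, hold, or a failed order

### Answering Principles

- Distinguish three cases: "not started", "started but the AI chose not to trade", and "tried to place an order but failed"
- Do not equate "no position opened" with "system failure"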

## skill_order_execution_diagnosis

### When to Use

- An order fails
- Only shorts are opened, never longs / only one side is traded
- Leverage errors
- position side mismatch

### Check First

1. Whether the account mode matches, for example whether Binance is in Hedge Mode
2. Whether a sub-account leverage limit applies
3. Whether the contract permission is enabled
4. Whether the balance, margin, and tradable symbols meet the conditions

### Known Implementation Facts

- On Binance in One-way Mode, `position side mismatch` or one-sided behavior may occur
- Some sub-accounts have a low leverage cap; exceeding it fails outright
- These problems are documented explicitly in `docs/guides/TROUBLESHOOTING.md`
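
## skill_strategy_diagnosis

### When to Use

- The user says the strategy did not take effect
- The user says the prompt preview differs from what actually runs
- The user says the trader's behavior did not change after the strategy was modified

### Check First

1. Whether the user is editing the strategy template or the trader's custom prompt
2. Whether the strategy was actually saved
3. Whether the current configuration needs to be re-read for comparison
4. Whether "did not take effect" means not saved, not bound, or a runtime result that differs from expectations

### Answering Principles

- Identify the object before troubleshooting: strategy template / trader / prompt override
- If the currently saved value can be read, do not judge from memory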

## Future Extensions

The next batch could add:

- `skill_balance_and_position_diagnosis`
- `skill_market_data_diagnosis`
- `skill_prompt_generation_diagnosis`
- `skill_strategy_test_run_diagnosis`
- `skill_exchange_specific_setup_<exchange>`
- `skill_model_provider_setup_<provider>`
613
docs/architecture/AGENT_CURRENT_DESIGN.zh-CN.md
Normal file
@@ -0,0 +1,613 @@
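# NOFXi Agent: Current Design

## Purpose

This document describes the NOFXi Agent's design as it is actually implemented today, not the idealized design of earlier versions. It focuses on these questions:

- Where user messages enter
- Which requests go to the planner
- Which memory layers exist today
- How the planner generates and executes a plan
- How tools are designed today
- What problems dynamic snapshots and current references each solve
- Why, for some questions, the model "seems to have history but still asks follow-up questions"

The main implementation files covered here:

- `agent/agent.go`
- `agent/web.go`
- `api/agent_routes.go`
- `agent/planner_runtime.go`
- `agent/execution_state.go`
- `agent/memory.go`
- `agent/history.go`
- `agent/tools.go`

## Overview in Brief

The Agent's current runtime model can be summarized as:

1. The frontend sends the message to `/api/agent/chat/stream`
2. The backend puts the logged-in user's identity into the context
3. Apart from `/clear` and `/status`, the Agent sends every message to the planner
4. The planner combines the memory layers, dynamic snapshots, and tool schema to generate a plan
5. The steps in the plan (`tool / reason / ask_user / respond`) are executed
6. During execution, the execution state, short-term verbatim history, long-term summary, and current object references are updated continuously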

## Request Entry

### Frontend Entry

The frontend Agent page is at:

- `web/src/pages/AgentChatPage.tsx`

Chat currently uses:

- `POST /api/agent/chat/stream`

The request body carries:

- `message`
- `lang`
- `user_key`

### Backend Route Entry

Routes are registered in:

- `api/agent_routes.go`

Here the request:

1. Passes through `authMiddleware`
2. Has `user_id` read from the login session
3. Has that ID written into the request context via `agent.WithStoreUserID(...)`

### Agent Web Handler

The actual HTTP handlers are in:

- `agent/web.go`

Main entry points:

- `HandleChat(...)`
- `HandleChatStream(...)`

These then call into:

- `HandleMessageForStoreUser(...)`
- `HandleMessageStreamForStoreUser(...)`

## Outer Dispatch

The outer dispatch has now been consolidated.

In `agent/agent.go`, every input except these two commands is handed to the planner:

- `/clear`
- `/status`

In other words, none of the following are handled directly in the outer layer any more:

- setup flow
- trade confirmation
- direct trade regex
- natural-language configuration flows
- natural-language strategy creation

All of them go through the planner.

This is an important principle of the current design:

- The less dispatching the outer layer does, the clearer the behavioral boundaries
- Natural-language understanding is handed to planner + tool as uniformly as possible
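
The dispatch rule described in this section (two slash commands handled directly, everything else sent to the planner) can be sketched as follows. This is only an illustration: the `Route` type and `classify` function are hypothetical names, and the real logic in `agent/agent.go` may normalize input differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Route and its values are illustrative names, not identifiers from agent/agent.go.
type Route string

const (
	RouteClear   Route = "clear"
	RouteStatus  Route = "status"
	RoutePlanner Route = "planner"
)

// classify keeps the outer layer thin: it recognizes exactly two slash
// commands and hands every other message to the planner.
func classify(message string) Route {
	switch strings.TrimSpace(message) {
	case "/clear":
		return RouteClear
	case "/status":
		return RouteStatus
	default:
		return RoutePlanner
	}
}

func main() {
	for _, msg := range []string{"/clear", "/status", "帮我创建一个激进策略", "buy 0.1 BTC"} {
		fmt.Printf("%q -> %s\n", msg, classify(msg))
	}
}
```

Note that a trade-like message such as `buy 0.1 BTC` also lands on the planner route; under this design there is no regex shortcut in the outer layer.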

## The Five Memory Layers

There are currently five layers, not three or four:

1. `chatHistory`
2. `TaskState`
3. `ExecutionState`
4. `CurrentReferences`
5. `Persistent Preferences`

### 1. chatHistory

Defined in:

- `agent/history.go`

Role:

- Stores the raw user / assistant messages from the last few turns
- Keeps the most recent verbatim context available to the model
- Provides the raw material that is later summarized into `TaskState`

Characteristics:

- Keeps only short-term verbatim messages
- In-memory
- Cleared by `/clear`

Good fits:

- The verbatim text of the last few turns
- The user's latest wording
- The natural-language context from a moment ago

Bad fits:

- Long-lived truths
- Current external system state
- The exact execution position of the current flow

### 2. TaskState

Defined in:

- `agent/memory.go`

Role:

- Stores high-level summaries that still matter across turns
- Is injected into planner / reasoning / final response

Persistence key:

- `agent_task_state_<userID>`

Fields:

- `CurrentGoal`
- `ActiveFlow`
- `OpenLoops`
- `ImportantFacts`
- `LastDecision`
- `UpdatedAt`

Good fits:

- The current high-level goal
- Open loops that still hold across turns
- Key facts
- The latest important decision and its reason

Bad fits:

- Step-level to-dos
- "Which tool to call next"
- Dynamic balances, positions, and whether a configuration exists
- Any real-time state that a tool can re-read

### 3. ExecutionState

Defined in:

- `agent/execution_state.go`

Role:

- Stores the execution state of the current plan
- Supports resuming after `ask_user`
- Stores the plan, the current step, the execution log, the waiting state, and so on

Persistence key:

- `agent_execution_state_<userID>`

Current key fields:

- `SessionID`
- `Goal`
- `Status`
- `PlanID`
- `Steps`
- `CurrentStepID`
- `DynamicSnapshots`
- `ExecutionLog`
- `SummaryNotes`
- `Waiting`
- `CurrentReferences`
- `FinalAnswer`
- `LastError`

### 4. CurrentReferences

Defined in:

- `agent/execution_state.go`

Role:

- Records what "this / that / the one just now" actually refers to in the current conversation

Currently supported reference objects:

- `strategy`
- `trader`
- `model`
- `exchange`

This addresses a common problem:

- The user mentioned the "激进" (aggressive) strategy just one turn ago
- In the next turn they say "改一下这个策略" ("change this strategy")
- Without a structured reference, the model tends to ask again even though it has the chat history

`CurrentReferences` is not a system-state snapshot; it is:

- The object currently in focus in the conversation
- The object the current pronoun is bound to

### 5. Persistent Preferences

Corresponding tools:

- `get_preferences`
- `manage_preferences`

Role:

- Stores the user's long-term preferences

Good fits:

- Reply in Chinese by default
- Prefers an aggressive style
- Pays more attention to BTC / ETH
- Dislikes high-frequency trading
- A daily briefing at a fixed time

How it differs from `TaskState`:

- `TaskState` leans toward a summary of the current task
- `Persistent Preferences` leans toward a long-term user profile

## What DynamicSnapshots Is

`DynamicSnapshots` is a snapshot of the current, real system state.

It is neither history nor long-term memory; it is the "current facts" that the planner inserts before planning or during execution.

Typical information that currently goes into the snapshot:

- The current model configuration list
- The current exchange configuration list
- The current strategy list
- The current trader list
- The current balance
- The current positions
- Recent trade history

Role:

- Keeps the planner from blindly trusting old conclusions
- Avoids "it was not configured earlier, it has since been configured, yet the agent still says it is missing"
- Avoids "the balance used to be A, and the agent keeps answering from the old observation"

In one sentence:

- `DynamicSnapshots` = what really exists in the world right now

## The Difference Between CurrentReferences and DynamicSnapshots

The two are easy to confuse, but their responsibilities are entirely different.

`DynamicSnapshots`:

- A snapshot of the current system state
- The candidate set / the current facts
- For example, there are currently two strategies: `激进` and `新策略`

`CurrentReferences`:

- The object currently in focus in the conversation
- What "this" actually refers to
- For example, the "this strategy" the user just mentioned is `激进`

Think of it this way:

- `DynamicSnapshots` is the map
- `CurrentReferences` is the point on the map your finger is on right now
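
The map-versus-finger distinction can be sketched in code. The example below is an assumption-laden illustration, not the project's resolver: `Ref` and `resolveStrategy` are hypothetical, and the policy shown (trust the focused reference only if it still exists in the snapshot, otherwise accept a sole candidate, otherwise ask) is one plausible way to combine the two structures.

```go
package main

import "fmt"

// Ref is a hypothetical minimal entity reference (ID plus display name).
type Ref struct{ ID, Name string }

// resolveStrategy decides what "this strategy" means.
//   - current is the focus taken from CurrentReferences (may be nil).
//   - snapshot is the candidate set taken from DynamicSnapshots.
// It returns ok=false when the agent should ask the user instead of guessing.
func resolveStrategy(current *Ref, snapshot []Ref) (Ref, bool) {
	if current != nil {
		// The focus only counts if the object still exists right now.
		for _, s := range snapshot {
			if s.ID == current.ID {
				return s, true
			}
		}
	}
	// With no usable focus, a single candidate is unambiguous.
	if len(snapshot) == 1 {
		return snapshot[0], true
	}
	return Ref{}, false
}

func main() {
	snapshot := []Ref{{ID: "s1", Name: "激进"}, {ID: "s2", Name: "新策略"}}
	focus := &Ref{ID: "s1", Name: "激进"}

	if ref, ok := resolveStrategy(focus, snapshot); ok {
		fmt.Println("resolved:", ref.Name)
	}
	if _, ok := resolveStrategy(nil, snapshot); !ok {
		fmt.Println("ambiguous: ask the user which strategy they mean")
	}
}
```

The second call shows why chat history alone is not enough: with two candidates in the snapshot and no structured focus, the only safe move is a follow-up question.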

## Planner Inputs

The main planner logic is in:

- `agent/planner_runtime.go`

When generating a plan, the following are currently sent to the model together:

- The current user request
- tool schema
- `Persistent Preferences`
- `TaskState`
- `ExecutionState`
- `Resume context`
- `Structured waiting state`
- `Observation context`

The observation context is no longer the single array used in older versions; it is layered into:

- `dynamic_snapshots`
- `execution_log`
- `summary_notes`

## Plan Structure

The planner currently allows only these four step types:

- `tool`
- `reason`
- `ask_user`
- `respond`

This means the Agent today is not a "free-form responder"; instead it:

- Plans first
- Then executes the steps
- Replans when necessary

## Step Execution Flow

The core logic of `executePlan(...)` is:

1. Find the next pending step
2. Mark the step as running
3. Execute it according to its type
4. Write the result back to `ExecutionState`
5. Trigger replanning when necessary

Each step type behaves as follows:

### tool

- Calls an internal tool
- Writes the result to `ExecutionLog`
- Updates `CurrentReferences` based on the result
- Triggers the replanner when necessary

### reason

- Makes one short reasoning call
- Produces a brief piece of intermediate reasoning
- Writes it to `ExecutionLog`

### ask_user

- Enters `waiting_user`
- Saves `WaitingState`
- Returns the question directly to the user

### respond

- Generates the final answer
- Marks the current execution as completed
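
The step loop above can be sketched as a small state machine. This is a simplified illustration, not the real `executePlan(...)`: the `Step` and `State` types, the status strings other than `waiting_user`, and the `runPlan` function are hypothetical, tool and reasoning calls are replaced by log entries, and replanning is omitted entirely.

```go
package main

import "fmt"

// StepType enumerates the four step kinds the planner may emit.
type StepType string

const (
	StepTool    StepType = "tool"
	StepReason  StepType = "reason"
	StepAskUser StepType = "ask_user"
	StepRespond StepType = "respond"
)

// Step and State are hypothetical, trimmed-down stand-ins for the real types.
type Step struct {
	ID     string
	Type   StepType
	Status string // "pending", "running", or "completed" (assumed values)
	Input  string // tool name, reasoning note, question, or answer text
}

type State struct {
	Status       string
	Steps        []Step
	ExecutionLog []string
	FinalAnswer  string
}

// runPlan walks pending steps in order and returns the text to show the user.
// Tool and reason steps only append to the log here; no real call is made.
func runPlan(s *State) string {
	for i := range s.Steps {
		step := &s.Steps[i]
		if step.Status != "pending" {
			continue
		}
		step.Status = "running"
		switch step.Type {
		case StepTool, StepReason:
			s.ExecutionLog = append(s.ExecutionLog, string(step.Type)+": "+step.Input)
			step.Status = "completed"
		case StepAskUser:
			// Stop here; the step stays "running" until the user replies.
			s.Status = "waiting_user"
			return step.Input
		case StepRespond:
			step.Status = "completed"
			s.Status = "completed"
			s.FinalAnswer = step.Input
			return s.FinalAnswer
		default:
			s.Status = "failed"
			return ""
		}
	}
	return s.FinalAnswer
}

func main() {
	state := &State{Status: "running", Steps: []Step{
		{ID: "1", Type: StepTool, Status: "pending", Input: "get_strategies"},
		{ID: "2", Type: StepReason, Status: "pending", Input: "no strategy named 激进 exists yet"},
		{ID: "3", Type: StepRespond, Status: "pending", Input: "strategy created"},
	}}
	fmt.Println(runPlan(state), "|", state.Status, "|", len(state.ExecutionLog), "log entries")
}
```

The point of the sketch is the control flow: `ask_user` and `respond` both end the loop, but only `respond` marks the execution as completed, which is what lets a later user turn resume the same plan.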

## What WaitingState Is

`WaitingState` handles replies such as:

- The user replies `是` ("yes")
- The user replies `继续` ("continue")
- The user replies `那个就行` ("that one is fine")

Without a structured waiting state, short replies like these easily lose their context.

Current fields:

- `Question`
- `Intent`
- `PendingFields`
- `ConfirmationTarget`
- `CreatedAt`

Its job is to:

- Tell the planner exactly what the previous turn was waiting for
- Make it easier to interpret this turn's short reply as "an answer to the previous question"

## How CurrentReferences Is Updated

It is currently updated along two paths:

### 1. When a user message matches an object name

If the user says:

- `修改激进策略` ("modify the 激进 strategy")
- `停止 lky` ("stop lky")
- `用 DeepSeek` ("use DeepSeek")

the system tries to match a name or ID against the current user's strategy / trader / model / exchange lists.

On a successful match, `CurrentReferences` is updated.

### 2. When a tool successfully returns an object

For example:

- `manage_strategy(create/update/activate)`
- `manage_trader(create/update)`
- `manage_model_config(update)`
- `manage_exchange_config(update)`

Whenever a tool returns a concrete object, the system writes the corresponding ID / name back to the current references.
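
The first update path (matching object names in the user message) could look roughly like the sketch below. It is a guess at the general technique, not the project's matcher: `Ref` and `matchEntity` are hypothetical, matching is a plain case-insensitive substring test, and a message that hits more than one candidate is treated as no match.

```go
package main

import (
	"fmt"
	"strings"
)

// Ref is a hypothetical minimal entity reference (ID plus display name).
type Ref struct{ ID, Name string }

// matchEntity returns the single candidate whose name or ID appears in the
// message. It reports false when nothing matches or when several candidates
// match, since a focus should not be set on an ambiguous hit.
func matchEntity(message string, candidates []Ref) (Ref, bool) {
	msg := strings.ToLower(message)
	var hit Ref
	found := false
	for _, c := range candidates {
		byName := c.Name != "" && strings.Contains(msg, strings.ToLower(c.Name))
		byID := c.ID != "" && strings.Contains(msg, strings.ToLower(c.ID))
		if !byName && !byID {
			continue
		}
		if found {
			return Ref{}, false // second hit: ambiguous
		}
		hit, found = c, true
	}
	return hit, found
}

func main() {
	strategies := []Ref{{ID: "stg-001", Name: "激进"}, {ID: "stg-002", Name: "新策略"}}
	traders := []Ref{{ID: "trd-001", Name: "lky"}}

	if ref, ok := matchEntity("修改激进策略", strategies); ok {
		fmt.Println("strategy focus:", ref.ID, ref.Name)
	}
	if ref, ok := matchEntity("停止 LKY", traders); ok {
		fmt.Println("trader focus:", ref.ID, ref.Name)
	}
}
```

Substring matching is deliberately naive here; the limitations section of this document already lists nickname mapping and multi-candidate disambiguation as work that a stronger implementation would need.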

## Tool Design

Tools currently follow a "resource-oriented" design, not a "page-action-oriented" one.

### Current Main Tools

Configuration resources:

- `get_exchange_configs`
- `manage_exchange_config`
- `get_model_configs`
- `manage_model_config`

Strategy resources:

- `get_strategies`
- `manage_strategy`

trader resources:

- `manage_trader`

Trading / query resources:

- `search_stock`
- `execute_trade`
- `get_positions`
- `get_balance`
- `get_market_price`
- `get_trade_history`

### Why This Design

Advantages:

- The tool schema is stable
- Behavioral boundaries are clear
- The planner learns it more easily
- Create, read, update, and delete are uniform across resources

`manage_strategy` currently supports:

- `list`
- `get_default_config`
- `create`
- `update`
- `delete`
- `activate`
- `duplicate`

`manage_trader` currently supports:

- `list`
- `create`
- `update`
- `delete`
- `start`
- `stop`

## Why "Creating a Strategy" Should Not Depend on an Exchange and Model by Default

In the current design, a strategy template should be an independent resource:

- `strategy`

while the runtime object is:

- `trader`

A more sensible boundary is:

- Creating a strategy template: use `manage_strategy`
- Getting a strategy running: use `manage_trader`

In other words:

- A strategy does not depend on an exchange or model by default
- Only when the user asks to "run / deploy / create a trader" does it need to be linked to an exchange / model / trader

## A Complete Example

User input:

`帮我创建一个新的激进策略模板,名字就叫激进。创建完后,再把这个策略绑定到 trader lky。`

(In English: "Create a new aggressive strategy template for me and name it 激进. Once it is created, bind this strategy to trader lky.")

The rough flow today:

1. The frontend calls `/api/agent/chat/stream`
2. The backend injects `store_user_id`
3. The Agent enters the planner
4. The planner refreshes the dynamic snapshots:
   - Current strategies
   - Current traders
5. A plan is generated, for example:
   - `get_strategies`
   - `manage_strategy(create)`
   - `manage_trader(update)`
   - `respond`
6. After `manage_strategy(create)` runs:
   - `ExecutionLog` is written
   - `CurrentReferences.strategy` is updated
7. When `manage_trader(update)` runs:
   - The ID of the newly created strategy is used directly
8. The final reply is output

If the user then says:

`把这个策略的 prompt 改激进一点` ("make this strategy's prompt a bit more aggressive")

the system first resolves "this strategy" from `CurrentReferences.strategy`.

## Why the Model Still Asks Even Though It "Has History"

Because "having chat history" is not the same as "having a structured object binding".

Without `CurrentReferences`:

- The model can only infer which strategy "this strategy" is from the verbatim text
- Once several messages intervene, or there are multiple candidate strategies
- It tends to ask again

So in the current design, `CurrentReferences` is the key piece that fills this gap.

## Current Known Limitations

### 1. The outer layer has been largely consolidated, but it is still not a pure graph runtime

It is more unified than before, but overall it is still:

- Agent main entry
- Planner
- Tool execution

rather than a full node-graph engine.

### 2. ExecutionState is still a single slot per userID

This means:

- Multiple parallel tasks from the same user can still interfere with each other

A more thorough direction would be:

- Multi-instance storage per thread / session

### 3. CurrentReferences is still a lightweight implementation

It currently covers only:

- strategy
- trader
- model
- exchange

To make it stronger later, it would need to consider:

- Disambiguation among multiple candidates
- Nickname mapping
- Stable entity binding across longer conversations

## The Core Idea of the Current Design

In summary:

- `chatHistory` records the verbatim messages
- `Persistent Preferences` records long-term preferences
- `TaskState` records the high-level summary
- `ExecutionState` records the current flow
- `DynamicSnapshots` records the current facts
- `CurrentReferences` records the objects currently being referred to
- `planner` decides the steps
- `tools` carry out the concrete actions

This is how the NOFXi Agent actually runs today.
454
docs/architecture/AGENT_MEMORY_AND_PLANNING.md
Normal file
@@ -0,0 +1,454 @@
# NOFXi Agent Memory And Planning Design

## Purpose

This document explains how the current NOFXi agent handles:

- short-term conversation memory
- durable task memory
- durable execution / planning state
- planner execution and replanning
- state reset and resume behavior

The implementation described here is primarily in:

- `agent/history.go`
- `agent/memory.go`
- `agent/execution_state.go`
- `agent/planner_runtime.go`
- `agent/agent.go`

## High-Level Model

The current agent uses three different layers of state:

1. `chatHistory`
   Recent in-memory user/assistant turns for the live conversation.

2. `TaskState`
   Durable summarized context that should survive beyond recent turns.

3. `ExecutionState`
   Durable workflow state for the currently running or recently blocked plan.

These three layers serve different purposes and should not be treated as the same thing.

## State Layers

### 1. `chatHistory`

Defined in `agent/history.go`.

Role:

- stores recent `user` / `assistant` messages in memory
- keyed by `userID`
- used as short-term conversational context
- acts as the source material for later compression into `TaskState`

Characteristics:

- in-memory only
- capped by `maxTurns`
- cleared by `/clear`
- not suitable as durable truth

Typical contents:

- the last few user questions
- the last few assistant replies
- temporary conversational wording

### 2. `TaskState`

Defined in `agent/memory.go`.

Role:

- stores durable, structured, non-derivable context
- persisted through `system_config`
- injected into planning and reasoning prompts

Storage key:

- `agent_task_state_<userID>`

Fields:

- `CurrentGoal`
- `ActiveFlow`
- `OpenLoops`
- `ImportantFacts`
- `LastDecision`
- `UpdatedAt`

Intended contents:

- user goal that still matters across turns
- high-level unresolved issues that still matter across turns
- facts that tools cannot cheaply re-fetch
- latest important decision summary

Explicitly not intended for:

- step-level pending items such as "wait for API key"
- execution actions such as "call get_exchange_configs"
- live balances
- current positions
- current market prices
- mutable configuration availability

Those should be checked from tools at planning time instead of being trusted from old summaries.

### 3. `ExecutionState`

Defined in `agent/execution_state.go`.

Role:

- stores the current execution workflow
- allows the agent to resume after `ask_user`
- persists plan steps, observations, and completion status

Storage key:

- `agent_execution_state_<userID>`

Fields:

- `SessionID`
- `UserID`
- `Goal`
- `Status`
- `PlanID`
- `Steps`
- `CurrentStepID`
- `Observations`
- `FinalAnswer`
- `LastError`
- `UpdatedAt`

This is the planner's working state, not a general memory store.

## Data Flow

### Request Entry

Entry points:

- `HandleMessage(...)`
- `HandleMessageStream(...)`

Flow:

1. user message enters `agent`
2. slash commands and explicit direct branches are handled first
3. all other requests go into planner flow via `thinkAndAct(...)` / `thinkAndActStream(...)`

### Planner Flow

The planner pipeline in `agent/planner_runtime.go` is:

1. append user message into `chatHistory`
2. emit `planning` SSE event
3. load `ExecutionState`
4. optionally reset stale `ExecutionState`
5. optionally refresh dynamic configuration snapshots
6. create a fresh execution plan with the LLM
7. execute steps one by one
8. persist `ExecutionState` after important transitions
9. append assistant answer into `chatHistory`
10. maybe compress old conversation into `TaskState`

## Short-Term vs Durable Memory

### What lives in `chatHistory`

Good fits:

- raw recent messages
- conversational wording
- latest assistant phrasing

Bad fits:

- long-lived truths
- current external system state

### What lives in `TaskState`

Good fits:

- durable goal
- high-level unfinished work that remains relevant across turns
- important facts the user stated
- previous decisions and why they were made

Bad fits:

- pending steps inside the current plan
- execution-level reminders such as "wait for a field" or "call a tool"
- old conclusions about whether tools exist
- old conclusions about whether model/exchange config is present
- live operational state that can change outside the chat

### What lives in `ExecutionState`

Good fits:

- current plan steps
- observations from tool calls
- blocked-on-user-input status
- exact current workflow state
- step-level pending work and block reasons

Bad fits:

- evergreen user profile
- long-term semantic memory

## Planning Logic

### Plan Creation

`createExecutionPlan(...)` sends the following into the planner model:

- available tool definitions
- persistent preferences
- `TaskState` context
- `ExecutionState` JSON
- current user request

The planner must return JSON only, using these step types:

- `tool`
- `reason`
- `ask_user`
- `respond`

### Step Execution

`executePlan(...)` runs the plan loop:

- `tool`
  call the tool and append an observation
- `reason`
  run a reasoning sub-call and append an observation
- `ask_user`
  save the `waiting_user` state and return the question
- `respond`
  generate the final answer and mark the plan completed

After each completed step, `replanAfterStep(...)` may:

- continue
- replace remaining steps
- ask user
- finish

## Resume Behavior

When `ExecutionState.Status == waiting_user`, the next user turn is treated as a reply to the pending question.

Current safeguards:

- latest asked question is extracted from the stored plan
- the user reply is appended as a `user_reply` observation
- planner prompt receives explicit `Resume context`

This makes short replies like `是` much less likely to be misread as unrelated new intents.
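
The resume safeguards can be sketched as a single function that runs before planning. The types and names below (`Observation`, `ExecutionState` with a `PendingQuestion` field, `attachUserReply`) are hypothetical simplifications; in particular, the document says the pending question is extracted from the stored plan, whereas this sketch assumes it is already available as a field.

```go
package main

import "fmt"

// Observation and ExecutionState are trimmed-down, hypothetical stand-ins.
type Observation struct {
	Kind    string
	Content string
}

type ExecutionState struct {
	Status          string
	PendingQuestion string // assumed to be extracted from the stored plan
	Observations    []Observation
}

// attachUserReply treats the new message as an answer to the pending question
// when the previous run stopped in waiting_user. It returns the resume context
// to inject into the planner prompt, and false when there is nothing to resume.
func attachUserReply(s *ExecutionState, reply string) (string, bool) {
	if s == nil || s.Status != "waiting_user" {
		return "", false
	}
	s.Observations = append(s.Observations, Observation{Kind: "user_reply", Content: reply})
	s.Status = "planning"
	ctx := fmt.Sprintf("The assistant asked: %q. The user replied: %q.", s.PendingQuestion, reply)
	return ctx, true
}

func main() {
	state := &ExecutionState{
		Status:          "waiting_user",
		PendingQuestion: "Bind the new strategy to trader lky?",
	}
	if ctx, ok := attachUserReply(state, "是"); ok {
		fmt.Println(ctx)
		fmt.Println("observations:", len(state.Observations), "status:", state.Status)
	}
}
```

With the question and the reply placed side by side in the prompt, a one-character answer such as `是` carries enough context for the planner to continue the blocked plan.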

## Dynamic State Refresh

Configuration and trader management requests are dynamic by nature. Their truth can change outside the current chat, for example:

- user configures exchange in the UI
- user adds model in another tab
- user creates trader elsewhere

Because of that, configuration/trader requests should not trust stale model conclusions.

Current protection in `planner_runtime.go`:

- detects config / trader intent with `isConfigOrTraderIntent(...)`
- clears `TaskState` context from the planner prompt for these requests
- refreshes `ExecutionState.Observations` with fresh snapshots from:
  - `toolGetModelConfigs(...)`
  - `toolGetExchangeConfigs(...)`
  - `toolListTraders(...)`

This makes the planner rely more on current system state and less on older narrative memory.

## Reset Strategy

The system currently resets or weakens stale execution state when:

- user says retry-like phrases such as `再试`, `继续`, `try again`, `continue`
- request is config / trader related and old execution state is failed / completed / waiting

Reset scope:

- `ExecutionState` may be cleared
- `TaskState` is not globally deleted, but it is intentionally ignored for config/trader planning

Manual reset:

- `/clear`

This clears:

- short-term chat history
- task state
- execution state

## Compression Design

`maybeCompressHistory(...)` moves older short-term chat content into `TaskState` when:

- recent message count exceeds the configured window
- estimated token count exceeds the threshold

Compression strategy:

1. keep recent conversation in `chatHistory`
2. summarize older turns into structured `TaskState`
3. persist new `TaskState`
4. replace `chatHistory` with recent slice

Important design rule:

- `TaskState` should keep durable context only
- it should not become a stale copy of mutable operational state
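
The compression trigger and the split into "older" and "recent" turns can be sketched as below. Several details are assumptions, labeled in the code: either threshold alone is taken to trigger compression (the text does not say whether both are required), the token estimate is a crude bytes-divided-by-four heuristic, and `splitForCompression` is a hypothetical name rather than part of `maybeCompressHistory(...)`.

```go
package main

import "fmt"

// Message is a hypothetical minimal chat turn.
type Message struct{ Role, Content string }

// estimateTokens is a rough heuristic (about one token per four bytes).
// The real estimator is not described in this document.
func estimateTokens(msgs []Message) int {
	total := 0
	for _, m := range msgs {
		total += len(m.Content)/4 + 1
	}
	return total
}

// splitForCompression returns the older turns to summarize into TaskState and
// the recent turns to keep in chatHistory. older is nil when no compression
// is needed. Assumption: exceeding EITHER threshold triggers compression.
func splitForCompression(history []Message, keepRecent, maxMessages, maxTokens int) (older, recent []Message) {
	if keepRecent < 0 {
		keepRecent = 0
	}
	if len(history) <= keepRecent {
		return nil, history
	}
	if len(history) <= maxMessages && estimateTokens(history) <= maxTokens {
		return nil, history
	}
	cut := len(history) - keepRecent
	return history[:cut], history[cut:]
}

func main() {
	history := make([]Message, 0, 10)
	for i := 0; i < 10; i++ {
		history = append(history, Message{Role: "user", Content: fmt.Sprintf("message %d", i)})
	}
	older, recent := splitForCompression(history, 4, 8, 1000)
	fmt.Printf("summarize %d older turns, keep %d recent turns\n", len(older), len(recent))
}
```

Only the `older` slice would be handed to the summarizer; the `recent` slice replaces `chatHistory`, which corresponds to steps 1 and 4 of the compression strategy.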

## Current Architecture Diagram

```mermaid
flowchart TD
    U[User Message] --> A[HandleMessage / HandleMessageStream]
    A --> B{Direct command?}
    B -->|Yes| C[Direct branch or slash command]
    B -->|No| D[thinkAndAct / thinkAndActStream]

    D --> E[Append user turn to chatHistory]
    D --> F[Load ExecutionState]
    F --> G{waiting_user?}
    G -->|Yes| H[Attach user_reply observation]
    G -->|No| I[Create fresh ExecutionState]

    H --> J[Refresh dynamic snapshots if config/trader intent]
    I --> J
    J --> K[createExecutionPlan via LLM]
    K --> L[Execution plan]
    L --> M[executePlan loop]

    M --> N[tool step]
    M --> O[reason step]
    M --> P[ask_user step]
    M --> Q[respond step]

    N --> R[Append Observation]
    O --> R
    R --> S[replanAfterStep]
    S --> M

    P --> T[Persist waiting_user ExecutionState]
    T --> UQ[Return question to user]

    Q --> V[Persist completed ExecutionState]
    V --> W[Append assistant turn to chatHistory]
    W --> X[maybeCompressHistory]
    X --> Y[Persist TaskState]
    Y --> Z[Final response]
```

## Memory Relationship Diagram

```mermaid
flowchart LR
    CH[chatHistory\nin-memory\nrecent turns]
    TS[TaskState\npersisted summary\nsystem_config]
    ES[ExecutionState\npersisted workflow\nsystem_config]
    PL[Planner Prompt]

    CH -->|recent raw turns| PL
    ES -->|current workflow JSON| PL
    TS -->|durable structured context| PL

    CH -->|old turns compressed| TS
    PL -->|plan / observations / status| ES
```

## State Transition Diagram

```mermaid
stateDiagram-v2
    [*] --> planning
    planning --> running: plan created
    running --> waiting_user: ask_user step
    waiting_user --> planning: user replies
    running --> completed: respond step finished
    running --> failed: step error
    failed --> planning: retry / continue / config-trader reset
    completed --> planning: new relevant request or retry flow
```
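
The state diagram above can also be expressed as a transition table, which is handy for validating status changes before persisting `ExecutionState`. This is a sketch under assumptions: the status strings are taken from the diagram, but the `transitions` map and `canTransition` function are hypothetical and do not appear in the source.

```go
package main

import "fmt"

// transitions lists, for each status in the diagram, the statuses it may move to.
var transitions = map[string][]string{
	"planning":     {"running"},
	"running":      {"waiting_user", "completed", "failed"},
	"waiting_user": {"planning"},
	"failed":       {"planning"},
	"completed":    {"planning"},
}

// canTransition reports whether the diagram allows moving from one status to another.
func canTransition(from, to string) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	checks := [][2]string{
		{"planning", "running"},
		{"running", "waiting_user"},
		{"waiting_user", "running"}, // not allowed: a reply goes back through planning
		{"completed", "planning"},
	}
	for _, c := range checks {
		fmt.Printf("%s -> %s: %v\n", c[0], c[1], canTransition(c[0], c[1]))
	}
}
```

The one transition worth noticing is that `waiting_user` never returns straight to `running`: a user reply always passes through `planning` again, so the plan can be rebuilt with the new information.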

## Known Design Tradeoffs

### Strengths

- separates short-term chat from durable task summary
- allows blocked flows to resume
- supports replanning after every meaningful step
- can recover from stale assumptions better for dynamic config/trader requests

### Weaknesses

- `TaskState` is still summary-driven, so summarization quality matters
- planner still depends on model compliance for some transitions
- `ExecutionState` is single-track per user, not multiple concurrent workflows
- config/trader intent detection is heuristic and keyword-based

## Practical Guidance

### When to trust `TaskState`

Trust it for:

- user intent continuity
- open loops
- durable facts

Do not trust it for:

- whether current exchange/model/trader config exists now
- whether a specific operational action is currently possible

### When to trust `ExecutionState`

Trust it for:

- current plan continuity
- exact blocked step
- latest observation chain

Do not trust it blindly when:

- user has changed configuration outside the chat
- the system capabilities changed after deployment

### When to fetch live state again

Always prefer fresh tool snapshots before answering about:

- existing model configs
- existing exchange configs
- existing traders
- whether trader creation can proceed

## Suggested Future Improvements

- add workflow versioning so capability changes invalidate stale `ExecutionState`
- separate `waiting_user_confirmation` from generic `waiting_user`
- introduce code-level handling for short confirmations such as `是`, `好`, `继续`
- move dynamic state refresh from heuristic to explicit planner preflight stage
- support multiple concurrent execution sessions per user if needed
453
docs/architecture/AGENT_MEMORY_AND_PLANNING.zh-CN.md
Normal file
@@ -0,0 +1,453 @@
# NOFXi Agent Memory and Planning Design

## Purpose

This document explains how the current NOFXi agent handles the following:

- Short-term conversation memory
- Persisted task memory
- Persisted execution / planning state
- Planner execution and replanning
- State reset and resume

It mainly corresponds to the following implementation files:

- `agent/history.go`
- `agent/memory.go`
- `agent/execution_state.go`
- `agent/planner_runtime.go`
- `agent/agent.go`

## Overall Model

The current agent uses three distinct layers of state:

1. `chatHistory`
   Holds the raw user/assistant turns from the last few rounds of the current conversation; kept in memory.

2. `TaskState`
   Holds structured summaries that remain valuable across turns; persisted to storage.

3. `ExecutionState`
   Holds the execution state of the current planning flow and supports continuing after the flow is interrupted.

These three layers have different responsibilities and must not be conflated.
|
||||
|
||||
## Three Layers of State

### 1. `chatHistory`

Defined in: `agent/history.go`

Purpose:

- Stores recent `user` / `assistant` messages keyed by `userID`
- Serves as short-term conversational context
- Provides the raw material that is later compressed into `TaskState`

Properties:

- Exists only in memory
- Capped by `maxTurns`
- Cleared by `/clear`
- Not suitable as a long-term source of truth

Typical contents:

- The last few user questions
- The last few assistant answers
- Transient wording and contextual phrasing

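The properties above can be illustrated with a minimal sketch of a capped, in-memory history. This is not the code in `agent/history.go`: the `message` and `history` types, and the reading of `maxTurns` as "one user message plus one assistant message per turn", are assumptions made for illustration. The sketch is also not safe for concurrent use.

```go
package main

import "fmt"

// message is a hypothetical shape for one chat entry; the real type in
// agent/history.go may differ.
type message struct {
	Role    string // "user" or "assistant"
	Content string
}

// history keeps a bounded list of recent messages per user, in memory only.
type history struct {
	maxTurns int
	byUser   map[string][]message
}

func newHistory(maxTurns int) *history {
	return &history{maxTurns: maxTurns, byUser: make(map[string][]message)}
}

// add appends a message and drops the oldest entries once the cap is exceeded.
// Assumption: one turn is a user message plus an assistant message.
func (h *history) add(userID, role, content string) {
	msgs := append(h.byUser[userID], message{Role: role, Content: content})
	if limit := h.maxTurns * 2; len(msgs) > limit {
		msgs = msgs[len(msgs)-limit:]
	}
	h.byUser[userID] = msgs
}

// clear mirrors what /clear does to the short-term layer.
func (h *history) clear(userID string) { delete(h.byUser, userID) }

func main() {
	h := newHistory(2)
	for i := 1; i <= 3; i++ {
		h.add("u1", "user", fmt.Sprintf("question %d", i))
		h.add("u1", "assistant", fmt.Sprintf("answer %d", i))
	}
	// Only the two most recent turns remain.
	fmt.Println(len(h.byUser["u1"]), h.byUser["u1"][0].Content)
}
```

Because nothing here is written to storage, a process restart loses the history, which is why this layer cannot serve as a long-term source of truth.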
### 2. `TaskState`

Defined in: `agent/memory.go`

Purpose:

- Stores persistent, structured context that cannot easily be re-derived from tools
- Persisted through `system_config`
- Injected into planner / reasoning prompts

Storage key:

- `agent_task_state_<userID>`

Fields:

- `CurrentGoal`
- `ActiveFlow`
- `OpenLoops`
- `ImportantFacts`
- `LastDecision`
- `UpdatedAt`

Suitable contents:

- User goals that are still valid
- High-level open loops that still hold across turns
- Important facts that cannot simply be re-read through tools
- The most recent key decision and its rationale

Unsuitable contents:

- Step-level to-dos such as "wait for the user to provide an API key"
- Execution actions such as "call get_exchange_configs"
- Real-time balances
- Current positions
- Current market prices
- Changeable state, such as whether a given configuration exists

Dynamic information of this kind should be re-checked through tools during planning instead of being trusted from an old summary.

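As a rough sketch of the structure described above, the listed fields can be written as a Go struct and serialized to JSON for storage under the documented key. Only the field names and the key format come from this document. The Go types, the JSON tags, and the `taskStateKey` helper are assumptions, and the real definition in `agent/memory.go` may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TaskState lists the fields named in this document.
// The field types and JSON tags are guesses made for illustration.
type TaskState struct {
	CurrentGoal    string    `json:"current_goal"`
	ActiveFlow     string    `json:"active_flow"`
	OpenLoops      []string  `json:"open_loops"`
	ImportantFacts []string  `json:"important_facts"`
	LastDecision   string    `json:"last_decision"`
	UpdatedAt      time.Time `json:"updated_at"`
}

// taskStateKey (hypothetical helper) builds the storage key in the
// documented format agent_task_state_<userID>.
func taskStateKey(userID string) string {
	return "agent_task_state_" + userID
}

func main() {
	ts := TaskState{
		CurrentGoal:    "create a BTC trader",
		OpenLoops:      []string{"user wants to compare two strategies before going live"},
		ImportantFacts: []string{"user prefers low leverage"},
		LastDecision:   "start with a small position size, at the user's request",
		UpdatedAt:      time.Now().UTC(),
	}
	raw, err := json.Marshal(ts)
	if err != nil {
		panic(err)
	}
	fmt.Println(taskStateKey("42"))
	fmt.Println(string(raw))
}
```

Note that the example values are all long-lived statements made by the user. Nothing in them describes live balances, positions, or whether a configuration currently exists.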
### 3. `ExecutionState`

Defined in: `agent/execution_state.go`

Purpose:

- Stores the state of the workflow currently being executed
- Supports resuming execution after an `ask_user` step
- Persists plan steps, observations, and the final status

Storage key:

- `agent_execution_state_<userID>`

Fields:

- `SessionID`
- `UserID`
- `Goal`
- `Status`
- `PlanID`
- `Steps`
- `CurrentStepID`
- `Observations`
- `FinalAnswer`
- `LastError`
- `UpdatedAt`

It is the planner's "working state", not a general-purpose memory store.

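The following sketch shows how a state with these fields could survive a JSON round trip, which is what makes resuming after `ask_user` possible. Only the field names and the key format are taken from this document. The `PlanStep` and `Observation` element types, all Go types and JSON tags, and the helpers `executionStateKey` and `roundTrip` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// PlanStep and Observation are hypothetical element types.
type PlanStep struct {
	ID   string `json:"id"`
	Type string `json:"type"` // tool, reason, ask_user, or respond
}

type Observation struct {
	Kind    string `json:"kind"`
	Summary string `json:"summary"`
}

// ExecutionState uses the field names listed in this document; types are guesses.
type ExecutionState struct {
	SessionID     string        `json:"session_id"`
	UserID        string        `json:"user_id"`
	Goal          string        `json:"goal"`
	Status        string        `json:"status"`
	PlanID        string        `json:"plan_id"`
	Steps         []PlanStep    `json:"steps"`
	CurrentStepID string        `json:"current_step_id"`
	Observations  []Observation `json:"observations"`
	FinalAnswer   string        `json:"final_answer"`
	LastError     string        `json:"last_error"`
	UpdatedAt     time.Time     `json:"updated_at"`
}

// executionStateKey builds the documented key agent_execution_state_<userID>.
func executionStateKey(userID string) string {
	return "agent_execution_state_" + userID
}

// roundTrip simulates saving the state and loading it back on the next message.
func roundTrip(in ExecutionState) (ExecutionState, error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return ExecutionState{}, err
	}
	var out ExecutionState
	err = json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	st := ExecutionState{
		UserID:        "42",
		Goal:          "create a trader",
		Status:        "waiting_user",
		Steps:         []PlanStep{{ID: "s1", Type: "tool"}, {ID: "s2", Type: "ask_user"}},
		CurrentStepID: "s2",
		UpdatedAt:     time.Now().UTC(),
	}
	restored, err := roundTrip(st)
	if err != nil {
		panic(err)
	}
	fmt.Println(executionStateKey(st.UserID), restored.Status, restored.CurrentStepID)
}
```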
## Data Flow

### Request Entry

Entry functions:

- `HandleMessage(...)`
- `HandleMessageStream(...)`

Flow:

1. The user message enters the `agent`
2. Slash commands and explicit direct branches are handled first
3. All other requests enter the planner flow: `thinkAndAct(...)` / `thinkAndActStream(...)`

### Planner Main Flow

The planner pipeline in `agent/planner_runtime.go` runs as follows:

1. Append the user message to `chatHistory`
2. Emit the `planning` SSE event
3. Load `ExecutionState`
4. Reset a stale `ExecutionState` if needed
5. Refresh the dynamic configuration snapshots if needed
6. Call the LLM to generate a new execution plan
7. Execute the plan step by step
8. Persist `ExecutionState` after each key state change
9. Append the assistant answer to `chatHistory`
10. Compress older conversation into `TaskState` if needed

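## Short-Term Memory vs Persistent Memory

### What belongs in `chatHistory`

Suitable:

- Recent raw messages
- Conversational wording
- How the assistant phrased its most recent reply

Unsuitable:

- Long-term truth
- The current state of external systems

### What belongs in `TaskState`

Suitable:

- Ongoing goals
- High-level open loops that still matter across turns
- Important facts the user stated explicitly
- Past key decisions and their rationale

Unsuitable:

- Steps in the current plan that have not run yet
- Execution-level to-dos such as "wait for a field" or "call a tool"
- Stale conclusions such as "does the system have this tool"
- Changeable state such as "is there currently a model / exchange configuration"
- Dynamic state that can be re-queried through tools

### What belongs in `ExecutionState`

Suitable:

- The current plan steps
- Observations from tool calls
- Whether the flow is currently blocked waiting for the user to supply information
- The exact execution position within the current workflow
- Step-level to-dos and the reason for a block

Unsuitable:

- Long-term user profiles
- General long-term semantic memory
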
## Planning Logic

### Plan Generation

`createExecutionPlan(...)` sends the following information to the planner model:

- The currently available tool definitions
- Persisted user preferences
- `TaskState` context
- `ExecutionState` JSON
- The current user request

The planner must return JSON, and each step type must be one of:

- `tool`
- `reason`
- `ask_user`
- `respond`

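The "must return JSON with one of four step types" constraint can be enforced in code before a plan is executed. The sketch below is one way to do that. The JSON shape (`steps`, `id`, `type`) and the names `planStep`, `executionPlan`, and `parsePlan` are assumptions for illustration, since this document does not specify the planner's actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// planStep and executionPlan model an assumed JSON shape for the planner output.
type planStep struct {
	ID   string `json:"id"`
	Type string `json:"type"`
}

type executionPlan struct {
	Steps []planStep `json:"steps"`
}

// allowedStepTypes holds the four step types named in this document.
var allowedStepTypes = map[string]bool{
	"tool":     true,
	"reason":   true,
	"ask_user": true,
	"respond":  true,
}

// parsePlan rejects output that is not JSON, has no steps, or uses an unknown step type.
func parsePlan(raw string) (executionPlan, error) {
	var plan executionPlan
	if err := json.Unmarshal([]byte(raw), &plan); err != nil {
		return executionPlan{}, fmt.Errorf("plan is not valid JSON: %w", err)
	}
	if len(plan.Steps) == 0 {
		return executionPlan{}, fmt.Errorf("plan has no steps")
	}
	for _, s := range plan.Steps {
		if !allowedStepTypes[s.Type] {
			return executionPlan{}, fmt.Errorf("step %q has unsupported type %q", s.ID, s.Type)
		}
	}
	return plan, nil
}

func main() {
	plan, err := parsePlan(`{"steps":[{"id":"s1","type":"tool"},{"id":"s2","type":"respond"}]}`)
	fmt.Println(len(plan.Steps), err)

	_, err = parsePlan(`{"steps":[{"id":"s1","type":"shell"}]}`)
	fmt.Println(err)
}
```

Validating up front keeps a malformed plan from reaching the execution loop, where an unknown step type would otherwise have to be handled mid-run.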
### Step Execution

The execution loop in `executePlan(...)` handles each step type as follows:

- `tool`
  Calls the tool and records an observation
- `reason`
  Issues a reasoning sub-call and records an observation
- `ask_user`
  Saves the `waiting_user` status and returns the question to the user
- `respond`
  Generates the final answer and marks the flow as completed

After each step finishes, `replanAfterStep(...)` can additionally decide to:

- continue
- replace_remaining
- ask_user
- finish

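The loop described above can be sketched as follows. This is a simplified model, not the real `executePlan(...)`. Tool and reasoning calls are reduced to recording a string, the replan callback stands in for `replanAfterStep(...)`, and the step budget and the "no `respond` step means failed" rule are assumptions added to keep the sketch safe and short.

```go
package main

import "fmt"

type step struct {
	ID   string
	Type string // tool, reason, ask_user, or respond
}

type runState struct {
	Status       string
	Observations []string
}

// replanFunc stands in for replanAfterStep. It returns one of
// "continue", "replace_remaining", "ask_user", or "finish".
type replanFunc func(done step, obs []string) (decision string, replacement []step)

func executePlan(steps []step, replan replanFunc) runState {
	st := runState{Status: "running"}
	// The budget guards against a replanner that keeps replacing steps forever
	// (an assumption of this sketch, not a documented behavior).
	for budget := 20; len(steps) > 0 && budget > 0; budget-- {
		cur := steps[0]
		steps = steps[1:]
		switch cur.Type {
		case "tool", "reason":
			// A real implementation would call the tool or the model here.
			st.Observations = append(st.Observations, cur.Type+":"+cur.ID)
		case "ask_user":
			st.Status = "waiting_user"
			return st
		case "respond":
			st.Status = "completed"
			return st
		default:
			st.Status = "failed"
			return st
		}
		decision, replacement := replan(cur, st.Observations)
		switch decision {
		case "replace_remaining":
			steps = replacement
		case "ask_user":
			st.Status = "waiting_user"
			return st
		case "finish":
			st.Status = "completed"
			return st
		}
	}
	// Running out of steps or budget without a respond step is treated as a failure here.
	st.Status = "failed"
	return st
}

func main() {
	plan := []step{{ID: "s1", Type: "tool"}, {ID: "s2", Type: "ask_user"}}
	keepGoing := func(step, []string) (string, []step) { return "continue", nil }
	st := executePlan(plan, keepGoing)
	fmt.Println(st.Status, st.Observations)
}
```

In this sketch, replanning happens only after `tool` and `reason` steps, since `ask_user` and `respond` end the current run.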
## Resuming Execution

When `ExecutionState.Status == waiting_user`, the next user message is treated as a reply to the previous follow-up question.

Current safeguards:

- Extract the most recent follow-up question from the existing plan
- Append the user's reply as a `user_reply` observation
- Inject an explicit `Resume context` into the planner prompt

This reduces the chance that a short reply such as `是` is misread as a brand-new intent.

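A minimal sketch of these three safeguards is shown below. The `user_reply` observation kind and the `Resume context` label come from this document. The type shapes, the `Question` field, the exact prompt wording, and the helpers `lastQuestion` and `resume` are assumptions. For simplicity, the sketch takes the last `ask_user` step in the plan as the one that was asked.

```go
package main

import (
	"fmt"
	"strings"
)

type planStep struct {
	Type     string
	Question string // only meaningful for ask_user steps (hypothetical field)
}

type observation struct {
	Kind    string
	Summary string
}

type executionState struct {
	Status       string
	Steps        []planStep
	Observations []observation
}

// lastQuestion returns the question of the last ask_user step in the plan.
func lastQuestion(st executionState) string {
	for i := len(st.Steps) - 1; i >= 0; i-- {
		if st.Steps[i].Type == "ask_user" {
			return st.Steps[i].Question
		}
	}
	return ""
}

// resume records the reply as a user_reply observation and returns a
// "Resume context" block for the planner prompt. It returns "" when the
// state is not waiting for the user.
func resume(st *executionState, reply string) string {
	if st.Status != "waiting_user" {
		return ""
	}
	st.Observations = append(st.Observations, observation{Kind: "user_reply", Summary: reply})
	var b strings.Builder
	b.WriteString("Resume context:\n")
	fmt.Fprintf(&b, "- previous question: %s\n", lastQuestion(*st))
	fmt.Fprintf(&b, "- user reply: %s\n", reply)
	return b.String()
}

func main() {
	st := executionState{
		Status: "waiting_user",
		Steps: []planStep{
			{Type: "tool"},
			{Type: "ask_user", Question: "Use the existing Binance config for this trader?"},
		},
	}
	fmt.Print(resume(&st, "是"))
}
```

Pairing the short reply with the question it answers gives the planner enough context to interpret `是` as a confirmation.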
## Dynamic State Refresh

Configuration requests and trader-management requests are inherently dynamic. Their ground truth can change outside the chat, for example:

- The user configured an exchange in the Web UI
- The user added a model on another page
- The user created a trader elsewhere

Such requests therefore cannot rely on conclusions the model reached earlier.

Current safeguards in `planner_runtime.go`:

- Detect configuration / trader intent through `isConfigOrTraderIntent(...)`
- Stop injecting the old `TaskState` into the planner prompt for these requests
- Refresh the live snapshots in `ExecutionState.Observations` at the same time:
  - `toolGetModelConfigs(...)`
  - `toolGetExchangeConfigs(...)`
  - `toolListTraders(...)`

As a result, the planner relies more on the current system state than on descriptions held in old memory.

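Since the intent check is described elsewhere in this document as a keyword heuristic, it can be sketched roughly as below. The keyword list is invented for illustration and is not the list used by the real `isConfigOrTraderIntent(...)`. The `plannerInputs` helper is also hypothetical: it only returns which snapshot functions would be refreshed, by name, and does not call them.

```go
package main

import (
	"fmt"
	"strings"
)

// configOrTraderKeywords is an illustrative list, not the one used in planner_runtime.go.
var configOrTraderKeywords = []string{
	"trader", "exchange", "model", "api key", "交易所", "模型", "配置",
}

// isConfigOrTraderIntent reports whether the text mentions any keyword.
func isConfigOrTraderIntent(text string) bool {
	lower := strings.ToLower(text)
	for _, kw := range configOrTraderKeywords {
		if strings.Contains(lower, kw) {
			return true
		}
	}
	return false
}

// plannerInputs decides what reaches the planner prompt. For dynamic requests
// the old TaskState is dropped and the live snapshots are refreshed instead.
func plannerInputs(text, taskState string) (injected string, refresh []string) {
	if !isConfigOrTraderIntent(text) {
		return taskState, nil
	}
	return "", []string{"toolGetModelConfigs", "toolGetExchangeConfigs", "toolListTraders"}
}

func main() {
	injected, refresh := plannerInputs("Help me create a Trader", "goal: earlier summary")
	fmt.Printf("%q %v\n", injected, refresh)

	injected, refresh = plannerInputs("What is the BTC price?", "goal: earlier summary")
	fmt.Printf("%q %v\n", injected, refresh)
}
```

A substring match like this is cheap but imprecise. A keyword such as `model` also matches unrelated words that contain it, which is one reason to move toward an explicit preflight stage.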
## Reset Strategy

The system currently resets or de-emphasizes the old execution state in these cases:

- The user says something like `再试`, `继续`, `try again`, or `continue`
- The current request is configuration / trader related, and the old `ExecutionState` has already failed, completed, or is waiting for the user

Reset scope:

- `ExecutionState` may be cleared
- `TaskState` is not deleted as a whole, but it is deliberately ignored for configuration / trader requests

Manual cleanup:

- `/clear`

This command clears:

- The short-term chat history
- The task state
- The execution state

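The two reset triggers can be expressed as a small predicate. This is a sketch under assumptions: the phrase list is copied from the examples above and may not be exhaustive, the status strings follow the state names used in this document, and the function names `isRetryPhrase` and `shouldResetExecutionState` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// retryPhrases are the example phrases from this document; the real list may be longer.
var retryPhrases = []string{"再试", "继续", "try again", "continue"}

// isRetryPhrase reports whether the message contains one of the retry phrases.
func isRetryPhrase(text string) bool {
	lower := strings.ToLower(text)
	for _, p := range retryPhrases {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

// shouldResetExecutionState applies the two documented triggers.
// configOrTrader is the result of the intent check for the current message.
func shouldResetExecutionState(text, oldStatus string, configOrTrader bool) bool {
	if isRetryPhrase(text) {
		return true
	}
	if configOrTrader {
		switch oldStatus {
		case "failed", "completed", "waiting_user":
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldResetExecutionState("Try again", "failed", false))
	fmt.Println(shouldResetExecutionState("create a trader", "completed", true))
	fmt.Println(shouldResetExecutionState("what is my PnL", "running", false))
}
```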
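## Compression Design

`maybeCompressHistory(...)` compresses older short-term conversation into `TaskState` when the following conditions are met:

- The number of recent messages exceeds the window
- The estimated token count exceeds the threshold

Compression flow:

1. Keep the most recent turns in `chatHistory`
2. Summarize the earlier content into a structured `TaskState`
3. Persist the new `TaskState`
4. Replace `chatHistory` with the slice of recent messages

Key design principles:

- `TaskState` keeps only context that stays valid over the long term
- It must not turn into a stale copy of dynamic operational state
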
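The trigger and the split in steps 1 and 4 can be sketched as below. This document does not say whether the two conditions are combined with "or" or "and", so the sketch assumes either one is enough. The token estimate (about four characters per token) is a crude placeholder, and the names `estimateTokens` and `splitForCompression` are hypothetical. The summarization in step 2 needs a model call and is left out.

```go
package main

import "fmt"

type message struct {
	Role    string
	Content string
}

// estimateTokens is a rough placeholder: about four characters per token.
func estimateTokens(msgs []message) int {
	total := 0
	for _, m := range msgs {
		total += len([]rune(m.Content))/4 + 1
	}
	return total
}

// splitForCompression returns the older messages to summarize and the recent
// messages to keep. It returns (nil, msgs) when no threshold is exceeded.
// Assumption: exceeding either threshold triggers compression.
func splitForCompression(msgs []message, maxMessages, maxTokens, keepRecent int) (old, recent []message) {
	if len(msgs) <= maxMessages && estimateTokens(msgs) <= maxTokens {
		return nil, msgs
	}
	if keepRecent < 0 {
		keepRecent = 0
	}
	if keepRecent > len(msgs) {
		keepRecent = len(msgs)
	}
	cut := len(msgs) - keepRecent
	return msgs[:cut], msgs[cut:]
}

func main() {
	var msgs []message
	for i := 1; i <= 6; i++ {
		msgs = append(msgs, message{Role: "user", Content: fmt.Sprintf("message %d", i)})
	}
	old, recent := splitForCompression(msgs, 4, 1000, 2)
	fmt.Println(len(old), len(recent), recent[0].Content)
}
```

The `old` slice is what would be summarized into `TaskState`, and `recent` is what would replace `chatHistory`.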
## Current Architecture Diagram

```mermaid
flowchart TD
    U[User message] --> A[HandleMessage / HandleMessageStream]
    A --> B{Direct branch matched?}
    B -->|Yes| C[Handle slash command or shortcut branch directly]
    B -->|No| D[thinkAndAct / thinkAndActStream]

    D --> E[Write to chatHistory]
    D --> F[Load ExecutionState]
    F --> G{waiting_user?}
    G -->|Yes| H[Append user_reply observation]
    G -->|No| I[Create new ExecutionState]

    H --> J[Refresh dynamic snapshots for config or trader requests]
    I --> J
    J --> K[createExecutionPlan calls the LLM]
    K --> L[Execution plan produced]
    L --> M[executePlan loop]

    M --> N[tool step]
    M --> O[reason step]
    M --> P[ask_user step]
    M --> Q[respond step]

    N --> R[Write Observation]
    O --> R
    R --> S[replanAfterStep]
    S --> M

    P --> T[Persist waiting_user ExecutionState]
    T --> UQ[Return follow-up question to user]

    Q --> V[Persist completed ExecutionState]
    V --> W[Write assistant reply to chatHistory]
    W --> X[maybeCompressHistory]
    X --> Y[Persist TaskState]
    Y --> Z[Return final answer]
```

## Memory Relationship Diagram

```mermaid
flowchart LR
    CH[chatHistory\nin-memory\nrecent conversation]
    TS[TaskState\npersisted summary\nsystem_config]
    ES[ExecutionState\npersisted execution state\nsystem_config]
    PL[Planner Prompt]

    CH -->|recent raw conversation| PL
    ES -->|current workflow JSON| PL
    TS -->|long-lived structured context| PL

    CH -->|old messages compressed| TS
    PL -->|plan / observations / status| ES
```

## State Transition Diagram

```mermaid
stateDiagram-v2
    [*] --> planning
    planning --> running: plan created
    running --> waiting_user: ask_user step
    waiting_user --> planning: user replies
    running --> completed: respond step finished
    running --> failed: step error
    failed --> planning: retry / continue / config-trader reset
    completed --> planning: new relevant request or retry flow
```

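The transitions in the diagram can also be written as a lookup table, which makes illegal status changes easy to detect in code or tests. This is a sketch derived from the diagram alone. The `transitions` table and `canTransition` helper are hypothetical and do not appear in the implementation files listed in this document.

```go
package main

import "fmt"

// transitions mirrors the edges of the state diagram above.
var transitions = map[string][]string{
	"planning":     {"running"},
	"running":      {"waiting_user", "completed", "failed"},
	"waiting_user": {"planning"},
	"failed":       {"planning"},
	"completed":    {"planning"},
}

// canTransition reports whether the diagram contains an edge from one status to another.
func canTransition(from, to string) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition("running", "waiting_user"))
	fmt.Println(canTransition("planning", "completed"))
}
```

Every path back into `running` goes through `planning`, which matches the rule that a resumed or retried flow is replanned first.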
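## Trade-offs of the Current Design

### Strengths

- Separates short-term conversation from the long-term summary
- Supports resuming execution after `ask_user`
- Supports replanning after every key step
- Already resists contamination from stale conclusions better for dynamic requests such as configuration and trader creation

### Weaknesses

- The quality of `TaskState` still depends on how good the summarization is
- Some resume logic still depends on the model following instructions
- Each user currently has only one `ExecutionState`, so multiple concurrent workflows are not supported
- Configuration / trader intent detection is still a keyword heuristic
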
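## Practical Recommendations

### When to trust `TaskState`

Trust it for:

- Carrying the user's goal forward
- Tracking unfinished items
- Retaining facts that stay valid over the long term

Do not trust it for:

- Whether a model / exchange / trader configuration currently exists
- Whether a given operation can currently be executed

### When to trust `ExecutionState`

Trust it for:

- Whether the current workflow is still continuous
- Which step is currently blocked
- The most recent chain of observations

Do not trust it blindly when:

- The user has already changed configuration outside the chat
- The conclusions predate a change in system capabilities or the tool set

### When live state must be fetched again

Prefer re-fetching through tools in the following cases:

- Current model configurations
- Current exchange configurations
- Current trader list
- Whether the conditions for creating a trader are currently met
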
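## Suggested Future Improvements

- Add a version number or capability signature to `ExecutionState` so it is invalidated automatically when capabilities change
- Separate `waiting_user_confirmation` from the generic `waiting_user`
- Add code-level recognition for short confirmations such as `是`, `好`, `继续`
- Upgrade dynamic snapshot refresh from a heuristic to an explicit planner preflight stage
- Support multiple concurrent execution sessions per user if that becomes necessary
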