feat: port NOFXi agent module onto latest dev base (#1485)

* feat: integrate NOFXi agent into dev
* Enhance NOFXi agent workflow and diagnostics
2026-07-22 03:37:36 +08:00 · 2026-04-21 23:47:55 +08:00
parent 1ba50bdedf
commit 3ca95b294d
88 changed files with 22630 additions and 1143 deletions
--- a/docs/agent-skills/diagnostic-skills.zh-CN.md
+++ b/docs/agent-skills/diagnostic-skills.zh-CN.md
@@ -0,0 +1,203 @@
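+# NOFXi Diagnostic and Configuration Skills (First Batch)
+
+This document records the first batch of high-frequency diagnostic and configuration skills for the trading assistant.
+
+The goal is not to make the model "think harder", but to make it follow stable, reusable troubleshooting paths first when it meets common problems.
+
+## Design Principles
+
+- Answer according to a skill first; do not re-plan freely for high-frequency questions
+- Classify the problem first, then give the cause, checks, and fix suggestions
+- When a tool can verify the current state, check before drawing a conclusion
+- For sensitive information, only guide the user on how to fill it in; never echo it back in full
+- When a conclusion is uncertain, mark it explicitly as "more likely" or "suspect first"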
+
+## skill_model_api_setup
+
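+### Applicable Scenarios
+
+- The user asks where to apply for an API key for a given LLM
+- The user asks how to fill in the base URL
+- The user asks how to fill in the model name
+- The user asks how to connect OpenAI / Claude / Gemini / DeepSeek / Qwen / Kimi / Grok / MiniMax
+
+### Handling Strategy
+
+1. First confirm which provider the user wants to configure
+2. Tell the user the minimum set of fields to prepare:
+   - provider
+   - API key
+   - custom_api_url
+   - custom_model_name
+3. If the system already has a default URL and default model name, offer the recommended values first
+4. Organize the answer as steps; do not explain concepts in general terms
+
+### Known Implementation Facts
+
+- The system has built-in default runtime configuration per provider; see `agent.resolveModelRuntimeConfig(...)`
+- Common providers already have a default URL and a default model name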
+
+## skill_model_config_diagnosis
+
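+### Applicable Scenarios
+
+- The model was saved successfully but the agent is still unavailable
+- The message "AI unavailable" appears
+- The message says the model is not enabled
+- The message says custom_api_url is invalid
+- The trader does not pick up the configuration after it is set
+
+### Check First
+
+1. Whether an enabled model exists
+2. Whether the API key is empty
+3. Whether custom_api_url is a valid HTTPS URL
+4. Whether custom_model_name is empty or does not match
+5. Whether the current trader is bound to this model
+6. Whether a trader reload was triggered after the model was updated
+
+### Known Implementation Facts
+
+- A non-HTTPS `custom_api_url` is rejected by the backend; see `api/handler_ai_model.go`
+- An enabled model that lacks an API key or URL prevents the agent from becoming ready; see `agent.ensureAIClientForStoreUser(...)`
+- After a model configuration update, the system tries to remove and reload the related traders so the new configuration takes effect immediately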
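
The HTTPS rule for `custom_api_url` can be illustrated with a small check. This is a minimal sketch, assuming a hypothetical helper named `validateCustomAPIURL`; the real validation lives in `api/handler_ai_model.go` and may apply additional or different rules.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// validateCustomAPIURL is a hypothetical helper for illustration only.
// It accepts a URL only when it parses, uses the https scheme, and has a host.
func validateCustomAPIURL(raw string) error {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return errors.New("custom_api_url is empty")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("custom_api_url is not a valid URL: %w", err)
	}
	if u.Scheme != "https" {
		return fmt.Errorf("custom_api_url must use https, got scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return errors.New("custom_api_url has no host")
	}
	return nil
}

func main() {
	// The example URLs are placeholders, not real provider endpoints.
	for _, raw := range []string{"https://api.example.com/v1", "http://api.example.com/v1", ""} {
		fmt.Printf("%q -> %v\n", raw, validateCustomAPIURL(raw))
	}
}
```

Running the same kind of check during diagnosis lets the assistant report "URL is not HTTPS" as a concrete cause instead of a guess.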
+
+### Output Format
+
+- Symptom
+- More likely cause
+- What to check first
+- How to fix it next
+
+## skill_exchange_api_setup
+
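+### Applicable Scenarios
+
+- The user wants to create a new exchange API
+- The user does not know which permissions the exchange requires
+- The user asks what to enter for API key / secret / passphrase
+
+### General Handling Strategy
+
+1. First confirm the exchange type
+2. State the required permissions and the forbidden permissions
+3. State whether extra fields are needed
+4. Emphasize the IP whitelist and permission settings
+5. Guide the user back into the system to finish binding
+
+### Special Rules
+
+- OKX requires a passphrase in addition to the API key and secret
+- Bybit perpetual / futures trading requires the contract permission
+- Enabling the withdrawal permission is not recommended
+
+### Reference Documents
+
+- `docs/getting-started/okx-api.md`
+- `docs/getting-started/bybit-api.md`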
+
+## skill_exchange_api_diagnosis
+
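+### Applicable Scenarios
+
+- `invalid signature`
+- `timestamp` errors
+- `IP not allowed`
+- `permission denied`
+- Cannot connect to the exchange
+
+### Check First
+
+1. Whether the system clock is synchronized
+2. Whether the API key / secret is correct
+3. Whether an extra field is missing, such as the OKX passphrase
+4. Whether the IP whitelist includes the current server
+5. Whether the trading or contract permission is enabled
+6. Whether the key has expired or been regenerated
+
+### Known Implementation Facts
+
+- Clock drift is a frequent root cause of `invalid signature` / `timestamp` errors; see `docs/guides/TROUBLESHOOTING.zh-CN.md`
+- A missing OKX passphrase causes signature-related problems; see `docs/getting-started/okx-api.md`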
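
The clock-synchronization check listed first can be sketched as a simple comparison of two millisecond timestamps. This is an illustrative sketch only: `skewMillis`, `withinWindow`, and the 5000 ms window are assumptions, and how the exchange server time is obtained is exchange-specific and not shown.

```go
package main

import (
	"fmt"
	"time"
)

// skewMillis returns the absolute difference between the local clock and a
// server-reported time, both expressed as Unix milliseconds.
func skewMillis(localMs, serverMs int64) int64 {
	d := localMs - serverMs
	if d < 0 {
		d = -d
	}
	return d
}

// withinWindow reports whether the skew is small enough to be accepted.
// windowMs is an assumed tolerance; real exchanges document their own limits.
func withinWindow(localMs, serverMs, windowMs int64) bool {
	return skewMillis(localMs, serverMs) <= windowMs
}

func main() {
	local := time.Now().UnixMilli()
	// Pretend the exchange reported a time 8 seconds behind the local clock.
	server := local - 8000
	fmt.Println("skew ms:", skewMillis(local, server))
	fmt.Println("acceptable:", withinWindow(local, server, 5000))
}
```

If the skew exceeds the tolerance, synchronizing the system clock is the first fix to suggest, before regenerating any keys.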
+
+### Output Format
+
+- Error symptom
+- Most common root cause
+- Order of checks
+- Fix steps
+
+## skill_trader_start_diagnosis
+
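+### Applicable Scenarios
+
+- The trader will not start
+- The trader started but has not begun trading
+- The page shows it as started but nothing ever happens
+- The user suspects a problem with the strategy / model / exchange binding
+
+### Check First
+
+1. Whether there is an enabled model configuration
+2. Whether there is an enabled exchange configuration
+3. Whether the trader is bound to an exchange_id / strategy_id / ai_model_id
+4. Whether the exchange balance and permissions meet the conditions for placing orders
+5. Whether the AI's most recent decision was wait, hold, or a failed order
+
+### Answering Principles
+
+- Distinguish three cases: "not started", "started but the AI chose not to trade", and "tried to place an order but failed"
+- Do not equate "no position opened" with "system failure"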
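
The three-way distinction in the answering principles can be made explicit in code. The sketch below is hypothetical: `TraderObservation`, its fields, and the returned labels are invented for illustration and are not the project's types.

```go
package main

import "fmt"

// TraderObservation is a hypothetical summary of what the diagnosis has
// gathered so far; the field names are illustrative only.
type TraderObservation struct {
	Running        bool   // whether the trader is actually started
	LastDecision   string // assumed values: "wait", "hold", or an order action
	LastOrderError string // non-empty when the last order attempt failed
}

// classifyTraderIssue maps an observation onto the three cases the skill
// asks the assistant to keep apart, plus a normal case.
func classifyTraderIssue(o TraderObservation) string {
	switch {
	case !o.Running:
		return "not_started"
	case o.LastOrderError != "":
		return "order_failed"
	case o.LastDecision == "wait" || o.LastDecision == "hold":
		return "ai_chose_not_to_trade"
	default:
		return "trading_normally"
	}
}

func main() {
	obs := TraderObservation{Running: true, LastDecision: "wait"}
	fmt.Println(classifyTraderIssue(obs))
}
```

Classifying first keeps the answer from treating "the AI decided to wait" as a system fault.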
+
+## skill_order_execution_diagnosis
+
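+### Applicable Scenarios
+
+- Order placement fails
+- Only shorts are opened and never longs / only one side is opened
+- Leverage errors
+- position side mismatch
+
+### Check First
+
+1. Whether the account mode matches, for example whether Binance is in Hedge Mode
+2. Whether a sub-account leverage limit applies
+3. Whether the contract permission is enabled
+4. Whether balance, margin, and tradable symbols meet the conditions
+
+### Known Implementation Facts
+
+- In One-way Mode, Binance may produce `position side mismatch` or one-sided behavior
+- Some sub-accounts have a low leverage cap, and exceeding it fails immediately
+- These problems are already documented explicitly in `docs/guides/TROUBLESHOOTING.md`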
+
+## skill_strategy_diagnosis
+
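+### Applicable Scenarios
+
+- The user says the strategy did not take effect
+- The user says the prompt preview differs from the actual prompt
+- The user says the trader's behavior did not change after the strategy was modified
+
+### Check First
+
+1. Whether the object being edited is the strategy template or the trader's custom prompt
+2. Whether the strategy was actually saved successfully
+3. Whether the current configuration needs to be read again for comparison
+4. Whether "did not take effect" means not saved, not bound, or a runtime result that differs from expectations
+
+### Answering Principles
+
+- Pin down the "object" before troubleshooting: strategy template / trader / prompt override
+- If the currently saved value can be read, do not judge from memory
+
+## Future Extensions
+
+The next batch could add: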
+
+- `skill_balance_and_position_diagnosis`
+- `skill_market_data_diagnosis`
+- `skill_prompt_generation_diagnosis`
+- `skill_strategy_test_run_diagnosis`
+- `skill_exchange_specific_setup_<exchange>`
+- `skill_model_provider_setup_<provider>`
--- a/docs/architecture/AGENT_CURRENT_DESIGN.zh-CN.md
+++ b/docs/architecture/AGENT_CURRENT_DESIGN.zh-CN.md
@@ -0,0 +1,613 @@
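+# NOFXi Agent Current Design
+
+## Purpose
+
+This document describes the actual design of the current NOFXi Agent, not the idealized design of earlier versions. It focuses on these questions:
+
+- Where user messages enter
+- Which requests go to the planner
+- Which memory layers currently exist
+- How the planner generates and executes a plan
+- How tools are designed today
+- What problems dynamic snapshots and current references each solve
+- Why some questions "seem to have history, yet the model still asks again"
+
+The main implementation files covered by this document: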
+
+- `agent/agent.go`
+- `agent/web.go`
+- `api/agent_routes.go`
+- `agent/planner_runtime.go`
+- `agent/execution_state.go`
+- `agent/memory.go`
+- `agent/history.go`
+- `agent/tools.go`
+
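+## Overview in One Sentence
+
+The current Agent's runtime model can be summarized as:
+
+1. The frontend sends the message to `/api/agent/chat/stream`
+2. The backend puts the logged-in user's identity into the context
+3. Except for `/clear` and `/status`, the Agent sends every message to the planner
+4. The planner combines the memory layers, dynamic snapshots, and tool schema to generate a plan
+5. The `tool / reason / ask_user / respond` steps in the plan are executed
+6. During execution, the execution state, short-term raw messages, long-term summary, and current object references are updated continuously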
+
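+## Request Entry
+
+### Frontend Entry
+
+The frontend Agent page is at:
+
+- `web/src/pages/AgentChatPage.tsx`
+
+Chat currently uses:
+
+- `POST /api/agent/chat/stream`
+
+The request body carries:
+
+- `message`
+- `lang`
+- `user_key`
+
+### Backend Route Entry
+
+Routes are registered in:
+
+- `api/agent_routes.go`
+
+Here the request will:
+
+1. Pass through `authMiddleware`
+2. Have `user_id` extracted from the login session
+3. Have it written into the request context via `agent.WithStoreUserID(...)`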
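
Carrying the user ID in the request context usually follows Go's private-key pattern for `context.WithValue`. The sketch below is an assumption about the shape of that pattern: only the name `WithStoreUserID` appears in the source, and its real signature may differ. `StoreUserIDFrom` and `newRequestContext` are hypothetical names introduced here.

```go
package main

import (
	"context"
	"fmt"
)

// storeUserIDKey is an unexported key type, so no other package can collide
// with this context value.
type storeUserIDKey struct{}

// WithStoreUserID returns a child context carrying the user ID.
// Assumed signature; the project's actual function may differ.
func WithStoreUserID(ctx context.Context, userID string) context.Context {
	return context.WithValue(ctx, storeUserIDKey{}, userID)
}

// StoreUserIDFrom is a hypothetical reader for the value written above.
func StoreUserIDFrom(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(storeUserIDKey{}).(string)
	return id, ok && id != ""
}

// newRequestContext stands in for what the route does after authMiddleware
// has resolved the logged-in user (hypothetical helper).
func newRequestContext(userID string) context.Context {
	return WithStoreUserID(context.Background(), userID)
}

func main() {
	ctx := newRequestContext("user-123")
	id, ok := StoreUserIDFrom(ctx)
	fmt.Println(id, ok)
}
```

An unexported struct key keeps the value private to the package that defines it, which is the approach the `context` package documentation recommends.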
+
+### Agent Web Handler
+
+The actual HTTP handlers are in:
+
+- `agent/web.go`
+
+Main entry points:
+
+- `HandleChat(...)`
+- `HandleChatStream(...)`
+
+From there the request continues into:
+
+- `HandleMessageForStoreUser(...)`
+- `HandleMessageStreamForStoreUser(...)`
+
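+## Outermost Dispatch
+
+The outer dispatch has now been narrowed down.
+
+In `agent/agent.go`, every input except these two commands is handed to the planner:
+
+- `/clear`
+- `/status`
+
+In other words, none of the following are handled directly in the outer layer any more:
+
+- setup flow
+- trade confirmation
+- direct trade regex
+- natural-language configuration flows
+- natural-language strategy creation
+
+All of these go through the planner.
+
+This is an important principle of the current design:
+
+- The less dispatching the outer layer does, the clearer the behavior boundaries
+- Natural-language understanding is delegated to planner + tool wherever possible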
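
The narrowed outer dispatch amounts to a two-case switch with the planner as the default. This is a minimal sketch; `routeMessage` and its return labels are hypothetical, and the real code in `agent/agent.go` may normalize input differently.

```go
package main

import (
	"fmt"
	"strings"
)

// routeMessage decides where an incoming message goes. Only the two slash
// commands are handled outside the planner; everything else is planned.
func routeMessage(message string) string {
	switch strings.TrimSpace(message) {
	case "/clear":
		return "clear"
	case "/status":
		return "status"
	default:
		return "planner"
	}
}

func main() {
	for _, m := range []string{"/clear", "/status", "create an aggressive strategy"} {
		fmt.Printf("%q -> %s\n", m, routeMessage(m))
	}
}
```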
+
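+## The Current 5 Memory Layers
+
+There are currently 5 layers, not 3 or 4:
+
+1. `chatHistory`
+2. `TaskState`
+3. `ExecutionState`
+4. `CurrentReferences`
+5. `Persistent Preferences`
+
+### 1. chatHistory
+
+Defined in:
+
+- `agent/history.go`
+
+Role:
+
+- Stores the raw user / assistant messages of the last few turns
+- Keeps the most recent wording available to the model as context
+- Provides the raw material that is later summarized into `TaskState`
+
+Characteristics:
+
+- Keeps only short-term raw messages
+- In-memory
+- Cleared by `/clear`
+
+Good fits:
+
+- The original text of the last few turns
+- The user's latest wording
+- The natural-language context of what was just said
+
+Bad fits:
+
+- Long-lived truths
+- Current state of external systems
+- The exact execution position of the current flow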
+
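+### 2. TaskState
+
+Defined in:
+
+- `agent/memory.go`
+
+Role:
+
+- Stores high-level summaries that remain meaningful across turns
+- Injected into the planner / reasoning / final response
+
+Persistence key:
+
+- `agent_task_state_<userID>`
+
+Fields:
+
+- `CurrentGoal`
+- `ActiveFlow`
+- `OpenLoops`
+- `ImportantFacts`
+- `LastDecision`
+- `UpdatedAt`
+
+Good fits:
+
+- The current high-level goal
+- Open items that still hold across turns
+- Key facts
+- The most recent important decision and its reason
+
+Bad fits:
+
+- Step-level to-dos
+- "Which tool to call next"
+- Dynamic balances, positions, or whether a configuration exists
+- Any real-time state that a tool can read again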
+
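+### 3. ExecutionState
+
+Defined in:
+
+- `agent/execution_state.go`
+
+Role:
+
+- Stores the execution state of the current plan
+- Supports continuing after `ask_user`
+- Stores the plan, current step, execution log, waiting state, and so on
+
+Persistence key:
+
+- `agent_execution_state_<userID>`
+
+Current key fields: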
+
+- `SessionID`
+- `Goal`
+- `Status`
+- `PlanID`
+- `Steps`
+- `CurrentStepID`
+- `DynamicSnapshots`
+- `ExecutionLog`
+- `SummaryNotes`
+- `Waiting`
+- `CurrentReferences`
+- `FinalAnswer`
+- `LastError`
+
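+### 4. CurrentReferences
+
+Defined in:
+
+- `agent/execution_state.go`
+
+Role:
+
+- Records what "this / that / the one just now" actually refers to in the current conversation
+
+Currently supported reference objects:
+
+- `strategy`
+- `trader`
+- `model`
+- `exchange`
+
+It exists to solve a common problem:
+
+- The user just mentioned the "激进" (aggressive) strategy in the previous turn
+- In the next turn the user says "change this strategy"
+- Without a structured reference, the model tends to ask again even though it has the chat history
+
+`CurrentReferences` is not a system-state snapshot. It is:
+
+- The object currently in focus in the conversation
+- The object that pronouns are currently bound to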
+
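+### 5. Persistent Preferences
+
+Corresponding tools:
+
+- `get_preferences`
+- `manage_preferences`
+
+Role:
+
+- Stores the user's long-term preferences
+
+Good fits:
+
+- Reply in Chinese by default
+- Prefers an aggressive style
+- Pays more attention to BTC / ETH
+- Dislikes high-frequency trading
+- A briefing at a fixed time every day
+
+How it differs from `TaskState`:
+
+- `TaskState` leans toward a summary of the current task
+- `Persistent Preferences` leans toward a long-term user profile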
+
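+## What DynamicSnapshots Is
+
+`DynamicSnapshots` is a snapshot of the real current system state.
+
+It is neither history nor long-term memory. It is the set of "current facts" that the planner inserts before planning or during execution.
+
+Typical information that currently goes into the snapshot:
+
+- Current model configuration list
+- Current exchange configuration list
+- Current strategy list
+- Current trader list
+- Current balance
+- Current positions
+- Recent trade history
+
+Role:
+
+- Keeps the planner from blindly trusting old conclusions
+- Avoids "it was not configured earlier, it is configured now, yet the agent still says it is missing"
+- Avoids "the balance used to be A, and the old observation is still used to answer"
+
+In one sentence:
+
+- `DynamicSnapshots` = what really exists in the world right now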
+
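+## The Difference Between CurrentReferences and DynamicSnapshots
+
+The two are easy to confuse, but their responsibilities are completely different.
+
+`DynamicSnapshots`:
+
+- A snapshot of the current system state
+- The candidate set / the current facts
+- For example, there are currently two strategies: `激进` and `新策略`
+
+`CurrentReferences`:
+
+- The object currently in focus in the conversation
+- What "this" actually refers to
+- For example, the "this strategy" the user just mentioned is `激进`
+
+One way to think about it:
+
+- `DynamicSnapshots` is the map
+- `CurrentReferences` is the point on the map your finger is on right now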
+
+## Planner Input
+
+The main planner logic is in:
+
+- `agent/planner_runtime.go`
+
+When generating a plan, the following are currently sent to the model together:
+
+- The current user request
+- tool schema
+- `Persistent Preferences`
+- `TaskState`
+- `ExecutionState`
+- `Resume context`
+- `Structured waiting state`
+- `Observation context`
+
+The observation context is no longer the single array of the old version. It is layered into:
+
+- `dynamic_snapshots`
+- `execution_log`
+- `summary_notes`
+
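+## Plan Structure
+
+The planner currently allows only these 4 step types:
+
+- `tool`
+- `reason`
+- `ask_user`
+- `respond`
+
+This means the Agent is no longer a "free-form responder". Instead it:
+
+- Plans first
+- Then executes the steps
+- Replans when necessary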
+
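+## Step Execution Flow
+
+The core logic of `executePlan(...)` is:
+
+1. Find the next pending step
+2. Mark the step as running
+3. Execute it according to its type
+4. Write the result back to `ExecutionState`
+5. Trigger replanning when necessary
+
+Each step type behaves as follows:
+
+### tool
+
+- Calls an internal tool
+- Writes the result to `ExecutionLog`
+- Updates `CurrentReferences` based on the result
+- Triggers the replanner when necessary
+
+### reason
+
+- Makes one short reasoning call
+- Produces a brief piece of intermediate reasoning
+- Writes it to `ExecutionLog`
+
+### ask_user
+
+- Enters `waiting_user`
+- Saves `WaitingState`
+- Returns the question directly to the user
+
+### respond
+
+- Generates the final answer
+- Marks the current execution as completed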
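
The execution loop described above can be sketched as a small state machine over plan steps. This is a simplified illustration under stated assumptions: the struct shapes and status strings are guesses based on the field names in this document, tool and reasoning calls are replaced by log entries, and replanning is omitted.

```go
package main

import "fmt"

// Step and ExecutionState are trimmed-down, hypothetical versions of the
// structures named in this document.
type Step struct {
	ID     string
	Type   string // "tool", "reason", "ask_user", or "respond"
	Status string // "pending", "running", "completed", or "failed"
}

type ExecutionState struct {
	Status        string
	CurrentStepID string
	Steps         []Step
	ExecutionLog  []string
}

// nextPending returns the first step that has not run yet, or nil.
func nextPending(s *ExecutionState) *Step {
	for i := range s.Steps {
		if s.Steps[i].Status == "pending" {
			return &s.Steps[i]
		}
	}
	return nil
}

// executePlan runs steps until the plan blocks on the user, completes, or
// hits an unknown step type. Real tool and LLM calls are stubbed out.
func executePlan(s *ExecutionState) {
	s.Status = "running"
	for step := nextPending(s); step != nil; step = nextPending(s) {
		step.Status = "running"
		s.CurrentStepID = step.ID
		switch step.Type {
		case "tool", "reason":
			s.ExecutionLog = append(s.ExecutionLog, step.Type+":"+step.ID)
			step.Status = "completed"
		case "ask_user":
			step.Status = "completed"
			s.Status = "waiting_user"
			return
		case "respond":
			step.Status = "completed"
			s.Status = "completed"
			return
		default:
			step.Status = "failed"
			s.Status = "failed"
			return
		}
	}
}

func main() {
	s := &ExecutionState{Steps: []Step{
		{ID: "1", Type: "tool", Status: "pending"},
		{ID: "2", Type: "respond", Status: "pending"},
	}}
	executePlan(s)
	fmt.Println(s.Status, s.ExecutionLog)
}
```

Because the state records which steps are still pending, calling `executePlan` again after an `ask_user` pause continues from the next pending step rather than starting over.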
+
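+## What WaitingState Is
+
+`WaitingState` addresses replies such as:
+
+- The user replies `是` ("yes")
+- The user replies `继续` ("continue")
+- The user replies `那个就行` ("that one is fine")
+
+Without a structured waiting state, short replies like these easily lose their context.
+
+Current fields:
+
+- `Question`
+- `Intent`
+- `PendingFields`
+- `ConfirmationTarget`
+- `CreatedAt`
+
+Its role is to:
+
+- Tell the planner exactly what the previous turn was waiting for
+- Make it easier to interpret the short reply as "an answer to the previous question"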
+
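+## How CurrentReferences Is Updated
+
+It is currently updated along two paths:
+
+### 1. Updated when a user message matches an object name
+
+If the user says:
+
+- `修改激进策略` ("modify the 激进 strategy")
+- `停止 lky` ("stop lky")
+- `用 DeepSeek` ("use DeepSeek")
+
+the system tries to match the name or ID against the current user's strategy / trader / model / exchange lists.
+
+On a successful match, `CurrentReferences` is updated.
+
+### 2. Updated when a tool successfully returns an object
+
+For example:
+
+- `manage_strategy(create/update/activate)`
+- `manage_trader(create/update)`
+- `manage_model_config(update)`
+- `manage_exchange_config(update)`
+
+Whenever a tool returns a concrete object, the system writes the corresponding ID / name back to the current references.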
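
The first update path, matching a user message against known object names, can be sketched with a plain substring match. This is an illustrative sketch only: `Entity`, `matchEntity`, and the struct layout are hypothetical, and the sketch returns the first match without resolving conflicts between several candidates.

```go
package main

import (
	"fmt"
	"strings"
)

// Entity and CurrentReferences are hypothetical, simplified shapes.
type Entity struct {
	ID   string
	Name string
}

type CurrentReferences struct {
	Strategy *Entity
	Trader   *Entity
	Model    *Entity
	Exchange *Entity
}

// matchEntity returns the first candidate whose name or ID occurs in the
// message (case-insensitive), or nil when nothing matches.
func matchEntity(message string, candidates []Entity) *Entity {
	lower := strings.ToLower(message)
	for i := range candidates {
		c := candidates[i]
		nameHit := c.Name != "" && strings.Contains(lower, strings.ToLower(c.Name))
		idHit := c.ID != "" && strings.Contains(lower, strings.ToLower(c.ID))
		if nameHit || idHit {
			return &candidates[i]
		}
	}
	return nil
}

func main() {
	strategies := []Entity{{ID: "s1", Name: "激进"}, {ID: "s2", Name: "新策略"}}
	var refs CurrentReferences
	if e := matchEntity("修改激进策略", strategies); e != nil {
		refs.Strategy = e
	}
	fmt.Println(refs.Strategy != nil)
}
```

Substring matching is deliberately naive here; short IDs or overlapping names can produce false positives, which is why stronger disambiguation would be needed in practice.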
+
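+## Tool Design
+
+Tools currently follow a "resource-oriented tool" design, not a "page-action tool" design.
+
+### Current Main Tools
+
+Configuration resources:
+
+- `get_exchange_configs`
+- `manage_exchange_config`
+- `get_model_configs`
+- `manage_model_config`
+
+Strategy resources:
+
+- `get_strategies`
+- `manage_strategy`
+
+Trader resources:
+
+- `manage_trader`
+
+Trading / query resources:
+
+- `search_stock`
+- `execute_trade`
+- `get_positions`
+- `get_balance`
+- `get_market_price`
+- `get_trade_history`
+
+### Why It Is Designed This Way
+
+Advantages:
+
+- The tool schema is stable
+- Behavior boundaries are clear
+- The planner learns it more easily
+- Create / read / update / delete on resources is uniform
+
+`manage_strategy` currently supports: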
+
+- `list`
+- `get_default_config`
+- `create`
+- `update`
+- `delete`
+- `activate`
+- `duplicate`
+
+`manage_trader` currently supports:
+
+- `list`
+- `create`
+- `update`
+- `delete`
+- `start`
+- `stop`
+
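+## Why "Creating a Strategy" Should Not Depend on Exchange and Model by Default
+
+In the current design, a strategy template should be a standalone resource:
+
+- `strategy`
+
+while the runtime object is:
+
+- `trader`
+
+A more reasonable boundary is:
+
+- Creating a strategy template: use `manage_strategy`
+- Running a strategy: use `manage_trader`
+
+In other words:
+
+- A strategy does not depend on an exchange or model by default
+- Only when the user asks to "run / deploy / create a trader" does it need to be associated with exchange / model / trader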
+
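+## A Complete Example
+
+User input:
+
+`帮我创建一个新的激进策略模板，名字就叫激进。创建完后，再把这个策略绑定到 trader lky。` ("Create a new aggressive strategy template for me, named 激进. After it is created, bind this strategy to trader lky.")
+
+Rough current flow:
+
+1. The frontend requests `/api/agent/chat/stream`
+2. The backend injects `store_user_id`
+3. The Agent enters the planner
+4. The planner refreshes the dynamic snapshots:
+   - Current strategies
+   - Current traders
+5. A plan is generated, for example:
+   - `get_strategies`
+   - `manage_strategy(create)`
+   - `manage_trader(update)`
+   - `respond`
+6. After `manage_strategy(create)` runs:
+   - `ExecutionLog` is written
+   - `CurrentReferences.strategy` is updated
+7. When `manage_trader(update)` runs:
+   - The ID of the newly created strategy is used directly
+8. The final reply is output
+
+If the user then continues with:
+
+`把这个策略的 prompt 改激进一点` ("make this strategy's prompt a bit more aggressive")
+
+the system resolves "this strategy" from `CurrentReferences.strategy` first.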
+
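+## Why the Model Still Asks Again Even Though It "Has History"
+
+Because "having chat history" is not the same as "having a structured object binding".
+
+Without `CurrentReferences`:
+
+- The model can only infer which strategy "this strategy" is from the raw text
+- Once several messages intervene, or there are multiple candidate strategies
+- It tends to ask again
+
+So in the current design, `CurrentReferences` is the key piece that fills this gap.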
+
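+## Known Limitations
+
+### 1. The outer layer has been narrowed considerably, but it is still not a pure graph runtime
+
+It is more unified than before, but overall it is still:
+
+- Agent main entry
+- Planner
+- Tool execution
+
+rather than a complete node-graph engine.
+
+### 2. ExecutionState is still a single slot per userID
+
+This means:
+
+- Multiple parallel tasks of the same user can still interfere with each other
+
+A more thorough direction would be:
+
+- Multi-instance storage per thread / session
+
+### 3. CurrentReferences is still a lightweight implementation
+
+It currently covers only:
+
+- strategy
+- trader
+- model
+- exchange
+
+To make it stronger later, the following need to be considered:
+
+- Disambiguation among multiple candidates
+- Nickname mapping
+- Stable entity binding across longer sessions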
+
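+## Core Idea of the Current Design
+
+Summary in one sentence:
+
+- `chatHistory` records the raw messages
+- `Persistent Preferences` records long-term preferences
+- `TaskState` records the high-level summary
+- `ExecutionState` records the current flow
+- `DynamicSnapshots` records the current facts
+- `CurrentReferences` records the objects currently being referred to
+- `planner` decides the steps
+- `tools` carry out the actions
+
+This is the actual runtime design of the current NOFXi Agent.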
--- a/docs/architecture/AGENT_MEMORY_AND_PLANNING.md
+++ b/docs/architecture/AGENT_MEMORY_AND_PLANNING.md
@@ -0,0 +1,454 @@
+# NOFXi Agent Memory And Planning Design
+
+## Purpose
+
+This document explains how the current NOFXi agent handles:
+
+- short-term conversation memory
+- durable task memory
+- durable execution / planning state
+- planner execution and replanning
+- state reset and resume behavior
+
+The implementation described here is primarily in:
+
+- `agent/history.go`
+- `agent/memory.go`
+- `agent/execution_state.go`
+- `agent/planner_runtime.go`
+- `agent/agent.go`
+
+## High-Level Model
+
+The current agent uses three different layers of state:
+
+1. `chatHistory`
+Recent in-memory user/assistant turns for the live conversation.
+
+2. `TaskState`
+Durable summarized context that should survive beyond recent turns.
+
+3. `ExecutionState`
+Durable workflow state for the currently running or recently blocked plan.
+
+These three layers serve different purposes and should not be treated as the same thing.
+
+## State Layers
+
+### 1. `chatHistory`
+
+Defined in `agent/history.go`.
+
+Role:
+
+- stores recent `user` / `assistant` messages in memory
+- keyed by `userID`
+- used as short-term conversational context
+- acts as the source material for later compression into `TaskState`
+
+Characteristics:
+
+- in-memory only
+- capped by `maxTurns`
+- cleared by `/clear`
+- not suitable as durable truth
+
+Typical contents:
+
+- the last few user questions
+- the last few assistant replies
+- temporary conversational wording
+
+### 2. `TaskState`
+
+Defined in `agent/memory.go`.
+
+Role:
+
+- stores durable, structured, non-derivable context
+- persisted through `system_config`
+- injected into planning and reasoning prompts
+
+Storage key:
+
+- `agent_task_state_<userID>`
+
+Fields:
+
+- `CurrentGoal`
+- `ActiveFlow`
+- `OpenLoops`
+- `ImportantFacts`
+- `LastDecision`
+- `UpdatedAt`
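
The storage key and field list above suggest a JSON-serialized struct stored under a per-user key. The following is a minimal sketch under stated assumptions: the field types and JSON tags are guesses, and a plain map stands in for the `system_config` store. `taskStateKey`, `saveTaskState`, and `loadTaskState` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TaskState mirrors the field names listed above; types and tags are assumed.
type TaskState struct {
	CurrentGoal    string    `json:"current_goal"`
	ActiveFlow     string    `json:"active_flow"`
	OpenLoops      []string  `json:"open_loops"`
	ImportantFacts []string  `json:"important_facts"`
	LastDecision   string    `json:"last_decision"`
	UpdatedAt      time.Time `json:"updated_at"`
}

// taskStateKey builds the per-user storage key described in the document.
func taskStateKey(userID string) string {
	return "agent_task_state_" + userID
}

// saveTaskState serializes the state into the stand-in key/value store.
func saveTaskState(store map[string]string, userID string, ts TaskState) error {
	raw, err := json.Marshal(ts)
	if err != nil {
		return err
	}
	store[taskStateKey(userID)] = string(raw)
	return nil
}

// loadTaskState reads the state back; the bool reports whether it existed.
func loadTaskState(store map[string]string, userID string) (TaskState, bool, error) {
	raw, ok := store[taskStateKey(userID)]
	if !ok {
		return TaskState{}, false, nil
	}
	var ts TaskState
	if err := json.Unmarshal([]byte(raw), &ts); err != nil {
		return TaskState{}, false, err
	}
	return ts, true, nil
}

func main() {
	store := map[string]string{}
	ts := TaskState{CurrentGoal: "set up a BTC trader", UpdatedAt: time.Now().UTC()}
	if err := saveTaskState(store, "42", ts); err != nil {
		panic(err)
	}
	loaded, ok, err := loadTaskState(store, "42")
	fmt.Println(loaded.CurrentGoal, ok, err)
}
```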
+
+Intended contents:
+
+- user goal that still matters across turns
+- high-level unresolved issues that still matter across turns
+- facts that tools cannot cheaply re-fetch
+- latest important decision summary
+
+Explicitly not intended for:
+
+- step-level pending items such as "wait for API key"
+- execution actions such as "call get_exchange_configs"
+- live balances
+- current positions
+- current market prices
+- mutable configuration availability
+
+Those should be checked from tools at planning time instead of being trusted from old summaries.
+
+### 3. `ExecutionState`
+
+Defined in `agent/execution_state.go`.
+
+Role:
+
+- stores the current execution workflow
+- allows the agent to resume after `ask_user`
+- persists plan steps, observations, and completion status
+
+Storage key:
+
+- `agent_execution_state_<userID>`
+
+Fields:
+
+- `SessionID`
+- `UserID`
+- `Goal`
+- `Status`
+- `PlanID`
+- `Steps`
+- `CurrentStepID`
+- `Observations`
+- `FinalAnswer`
+- `LastError`
+- `UpdatedAt`
+
+This is the planner's working state, not a general memory store.
+
+## Data Flow
+
+### Request Entry
+
+Entry points:
+
+- `HandleMessage(...)`
+- `HandleMessageStream(...)`
+
+Flow:
+
+1. user message enters `agent`
+2. slash commands and explicit direct branches are handled first
+3. all other requests go into planner flow via `thinkAndAct(...)` / `thinkAndActStream(...)`
+
+### Planner Flow
+
+The planner pipeline in `agent/planner_runtime.go` is:
+
+1. append user message into `chatHistory`
+2. emit `planning` SSE event
+3. load `ExecutionState`
+4. optionally reset stale `ExecutionState`
+5. optionally refresh dynamic configuration snapshots
+6. create a fresh execution plan with the LLM
+7. execute steps one by one
+8. persist `ExecutionState` after important transitions
+9. append assistant answer into `chatHistory`
+10. maybe compress old conversation into `TaskState`
+
+## Short-Term vs Durable Memory
+
+### What lives in `chatHistory`
+
+Good fits:
+
+- raw recent messages
+- conversational wording
+- latest assistant phrasing
+
+Bad fits:
+
+- long-lived truths
+- current external system state
+
+### What lives in `TaskState`
+
+Good fits:
+
+- durable goal
+- high-level unfinished work that remains relevant across turns
+- important facts the user stated
+- previous decisions and why they were made
+
+Bad fits:
+
+- pending steps inside the current plan
+- execution-level reminders such as "wait for a field" or "call a tool"
+- old conclusions about whether tools exist
+- old conclusions about whether model/exchange config is present
+- live operational state that can change outside the chat
+
+### What lives in `ExecutionState`
+
+Good fits:
+
+- current plan steps
+- observations from tool calls
+- blocked-on-user-input status
+- exact current workflow state
+- step-level pending work and block reasons
+
+Bad fits:
+
+- evergreen user profile
+- long-term semantic memory
+
+## Planning Logic
+
+### Plan Creation
+
+`createExecutionPlan(...)` sends the following into the planner model:
+
+- available tool definitions
+- persistent preferences
+- `TaskState` context
+- `ExecutionState` JSON
+- current user request
+
+The planner must return JSON only, and every step must be one of these types:
+
+- `tool`
+- `reason`
+- `ask_user`
+- `respond`
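
Since the planner output is model-generated JSON, it is worth validating before execution. This is a minimal sketch, assuming a hypothetical plan schema with a `steps` array whose items have `id` and `type`; the actual JSON shape used by `createExecutionPlan(...)` is not shown in this document.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// PlanStep and Plan use an assumed JSON schema for illustration.
type PlanStep struct {
	ID   string `json:"id"`
	Type string `json:"type"`
}

type Plan struct {
	Steps []PlanStep `json:"steps"`
}

// allowedStepTypes lists the four step types the planner may emit.
var allowedStepTypes = map[string]bool{
	"tool":     true,
	"reason":   true,
	"ask_user": true,
	"respond":  true,
}

// parsePlan decodes the planner output and rejects unknown step types.
func parsePlan(raw []byte) (Plan, error) {
	var p Plan
	if err := json.Unmarshal(raw, &p); err != nil {
		return Plan{}, fmt.Errorf("plan is not valid JSON: %w", err)
	}
	if len(p.Steps) == 0 {
		return Plan{}, errors.New("plan has no steps")
	}
	for i, s := range p.Steps {
		if !allowedStepTypes[s.Type] {
			return Plan{}, fmt.Errorf("step %d has unsupported type %q", i, s.Type)
		}
	}
	return p, nil
}

func main() {
	raw := []byte(`{"steps":[{"id":"1","type":"tool"},{"id":"2","type":"respond"}]}`)
	plan, err := parsePlan(raw)
	fmt.Println(len(plan.Steps), err)
}
```

Rejecting a malformed plan up front gives the runtime a clear point at which to retry planning instead of failing midway through execution.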
+
+### Step Execution
+
+`executePlan(...)` executes the plan loop:
+
+- `tool`
+  call tool and append observation
+- `reason`
+  run reasoning sub-call and append observation
+- `ask_user`
+  save `waiting_user` state and return question
+- `respond`
+  generate final answer and mark completed
+
+After each completed step, `replanAfterStep(...)` may:
+
+- continue
+- replace remaining steps
+- ask user
+- finish
+
+## Resume Behavior
+
+When `ExecutionState.Status == waiting_user`, the next user turn is treated as a reply to the pending question.
+
+Current safeguards:
+
+- latest asked question is extracted from the stored plan
+- the user reply is appended as a `user_reply` observation
+- planner prompt receives explicit `Resume context`
+
+This makes short replies like `是` less likely to be misread as unrelated new intents.
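
The resume safeguards can be sketched as one function that attaches the reply and builds the resume context. This is a simplified illustration: the struct shapes are assumptions, and the pending question is kept in a `PendingQuestion` field for brevity, whereas the document says it is extracted from the stored plan.

```go
package main

import "fmt"

// Observation and ExecutionState are hypothetical, trimmed-down shapes.
type Observation struct {
	Kind    string
	Content string
}

type ExecutionState struct {
	Status          string
	PendingQuestion string // stand-in for the question taken from the stored plan
	Observations    []Observation
}

// attachUserReply records the reply as a user_reply observation when the
// state is waiting on the user, and returns text for the planner prompt.
// It reports false when there is nothing to resume.
func attachUserReply(s *ExecutionState, reply string) (string, bool) {
	if s == nil || s.Status != "waiting_user" {
		return "", false
	}
	s.Observations = append(s.Observations, Observation{Kind: "user_reply", Content: reply})
	resume := fmt.Sprintf("Pending question: %s\nUser reply: %s", s.PendingQuestion, reply)
	return resume, true
}

func main() {
	s := &ExecutionState{Status: "waiting_user", PendingQuestion: "Create trader lky now?"}
	resume, ok := attachUserReply(s, "是")
	fmt.Println(ok)
	fmt.Println(resume)
}
```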
+
+## Dynamic State Refresh
+
+Configuration and trader management requests are dynamic by nature. Their truth can change outside the current chat, for example:
+
+- user configures exchange in the UI
+- user adds model in another tab
+- user creates trader elsewhere
+
+Because of that, configuration/trader requests should not trust stale model conclusions.
+
+Current protection in `planner_runtime.go`:
+
+- detects config / trader intent with `isConfigOrTraderIntent(...)`
+- clears `TaskState` context from the planner prompt for these requests
+- refreshes `ExecutionState.Observations` with fresh snapshots from:
+  - `toolGetModelConfigs(...)`
+  - `toolGetExchangeConfigs(...)`
+  - `toolListTraders(...)`
+
+This makes the planner rely more on current system state and less on older narrative memory.
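
A keyword heuristic of the kind described for `isConfigOrTraderIntent(...)` might look like the sketch below. The keyword list and the function name `looksLikeConfigOrTraderIntent` are assumptions made for illustration; the real detection logic and vocabulary are not shown in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// configOrTraderKeywords is an assumed keyword list, not the project's own.
var configOrTraderKeywords = []string{
	"exchange", "model", "trader", "api key",
	"交易所", "模型", "配置",
}

// looksLikeConfigOrTraderIntent reports whether the message mentions any
// configuration or trader keyword (case-insensitive substring match).
func looksLikeConfigOrTraderIntent(message string) bool {
	lower := strings.ToLower(message)
	for _, kw := range configOrTraderKeywords {
		if strings.Contains(lower, kw) {
			return true
		}
	}
	return false
}

func main() {
	for _, m := range []string{"Create a trader for me", "帮我配置交易所", "What is the BTC price?"} {
		fmt.Printf("%q -> %v\n", m, looksLikeConfigOrTraderIntent(m))
	}
}
```

A heuristic like this is cheap but imprecise: it can fire on unrelated uses of a keyword and miss paraphrases, which is the tradeoff any keyword-based detector accepts.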
+
+## Reset Strategy
+
+The system currently resets or weakens stale execution state when:
+
+- user says retry-like phrases such as `再试`, `继续`, `try again`, `continue`
+- request is config / trader related and old execution state is failed / completed / waiting
+
+Reset scope:
+
+- `ExecutionState` may be cleared
+- `TaskState` is not globally deleted, but it is intentionally ignored for config/trader planning
+
+Manual reset:
+
+- `/clear`
+
+This clears:
+
+- short-term chat history
+- task state
+- execution state
+
+## Compression Design
+
+`maybeCompressHistory(...)` moves older short-term chat content into `TaskState` when:
+
+- recent message count exceeds the configured window
+- estimated token count exceeds the threshold
+
+Compression strategy:
+
+1. keep recent conversation in `chatHistory`
+2. summarize older turns into structured `TaskState`
+3. persist new `TaskState`
+4. replace `chatHistory` with recent slice
+
+Important design rule:
+
+- `TaskState` should keep durable context only
+- it should not become a stale copy of mutable operational state
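
The trigger and the split can be sketched as two small functions. This sketch rests on several assumptions: it treats the two conditions as alternatives (either one triggers compression), uses a crude characters-divided-by-four token estimate, and leaves out the LLM summarization step. All names are hypothetical.

```go
package main

import "fmt"

// Message is a simplified chat message.
type Message struct {
	Role    string
	Content string
}

// estimateTokens is a crude stand-in for a real token estimate: roughly one
// token per four characters, plus one per message.
func estimateTokens(msgs []Message) int {
	total := 0
	for _, m := range msgs {
		total += len([]rune(m.Content))/4 + 1
	}
	return total
}

// shouldCompress assumes either limit being exceeded triggers compression.
func shouldCompress(msgs []Message, maxMessages, maxTokens int) bool {
	return len(msgs) > maxMessages || estimateTokens(msgs) > maxTokens
}

// splitForCompression separates older messages (to be summarized into
// TaskState) from the recent slice that stays in chatHistory.
func splitForCompression(msgs []Message, keepRecent int) (older, recent []Message) {
	if keepRecent < 0 {
		keepRecent = 0
	}
	if len(msgs) <= keepRecent {
		return nil, msgs
	}
	cut := len(msgs) - keepRecent
	return msgs[:cut], msgs[cut:]
}

func main() {
	msgs := []Message{
		{Role: "user", Content: "configure OKX"},
		{Role: "assistant", Content: "which API key?"},
		{Role: "user", Content: "the new one"},
	}
	if shouldCompress(msgs, 2, 1000) {
		older, recent := splitForCompression(msgs, 2)
		fmt.Println(len(older), len(recent))
	}
}
```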
+
+## Current Architecture Diagram
+
+```mermaid
+flowchart TD
+    U[User Message] --> A[HandleMessage / HandleMessageStream]
+    A --> B{Direct command?}
+    B -->|Yes| C[Direct branch or slash command]
+    B -->|No| D[thinkAndAct / thinkAndActStream]
+
+    D --> E[Append user turn to chatHistory]
+    D --> F[Load ExecutionState]
+    F --> G{waiting_user?}
+    G -->|Yes| H[Attach user_reply observation]
+    G -->|No| I[Create fresh ExecutionState]
+
+    H --> J[Refresh dynamic snapshots if config/trader intent]
+    I --> J
+    J --> K[createExecutionPlan via LLM]
+    K --> L[Execution plan]
+    L --> M[executePlan loop]
+
+    M --> N[tool step]
+    M --> O[reason step]
+    M --> P[ask_user step]
+    M --> Q[respond step]
+
+    N --> R[Append Observation]
+    O --> R
+    R --> S[replanAfterStep]
+    S --> M
+
+    P --> T[Persist waiting_user ExecutionState]
+    T --> UQ[Return question to user]
+
+    Q --> V[Persist completed ExecutionState]
+    V --> W[Append assistant turn to chatHistory]
+    W --> X[maybeCompressHistory]
+    X --> Y[Persist TaskState]
+    Y --> Z[Final response]
+```
+
+## Memory Relationship Diagram
+
+```mermaid
+flowchart LR
+    CH[chatHistory\nin-memory\nrecent turns]
+    TS[TaskState\npersisted summary\nsystem_config]
+    ES[ExecutionState\npersisted workflow\nsystem_config]
+    PL[Planner Prompt]
+
+    CH -->|recent raw turns| PL
+    ES -->|current workflow JSON| PL
+    TS -->|durable structured context| PL
+
+    CH -->|old turns compressed| TS
+    PL -->|plan / observations / status| ES
+```
+
+## State Transition Diagram
+
+```mermaid
+stateDiagram-v2
+    [*] --> planning
+    planning --> running: plan created
+    running --> waiting_user: ask_user step
+    waiting_user --> planning: user replies
+    running --> completed: respond step finished
+    running --> failed: step error
+    failed --> planning: retry / continue / config-trader reset
+    completed --> planning: new relevant request or retry flow
+```
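
The state diagram above can be encoded as a transition table so that illegal status changes are caught in code. This is a sketch derived directly from the diagram's edges; `allowedTransitions` and `canTransition` are hypothetical names, and the document does not say whether the runtime enforces transitions this way.

```go
package main

import "fmt"

// allowedTransitions lists the edges of the state diagram above.
var allowedTransitions = map[string][]string{
	"planning":     {"running"},
	"running":      {"waiting_user", "completed", "failed"},
	"waiting_user": {"planning"},
	"failed":       {"planning"},
	"completed":    {"planning"},
}

// canTransition reports whether moving from one status to another matches
// an edge in the diagram.
func canTransition(from, to string) bool {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition("running", "waiting_user"))
	fmt.Println(canTransition("completed", "running"))
}
```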
+
+## Known Design Tradeoffs
+
+### Strengths
+
+- separates short-term chat from durable task summary
+- allows blocked flows to resume
+- supports replanning after every meaningful step
+- can recover from stale assumptions better for dynamic config/trader requests
+
+### Weaknesses
+
+- `TaskState` is still summary-driven, so summarization quality matters
+- planner still depends on model compliance for some transitions
+- `ExecutionState` is single-track per user, not multiple concurrent workflows
+- config/trader intent detection is heuristic and keyword-based
+
+## Practical Guidance
+
+### When to trust `TaskState`
+
+Trust it for:
+
+- user intent continuity
+- open loops
+- durable facts
+
+Do not trust it for:
+
+- whether current exchange/model/trader config exists now
+- whether a specific operational action is currently possible
+
+### When to trust `ExecutionState`
+
+Trust it for:
+
+- current plan continuity
+- exact blocked step
+- latest observation chain
+
+Do not trust it blindly when:
+
+- user has changed configuration outside the chat
+- the system capabilities changed after deployment
+
+### When to fetch live state again
+
+Always prefer fresh tool snapshots before answering about:
+
+- existing model configs
+- existing exchange configs
+- existing traders
+- whether trader creation can proceed
+
+## Suggested Future Improvements
+
+- add workflow versioning so capability changes invalidate stale `ExecutionState`
+- separate `waiting_user_confirmation` from generic `waiting_user`
+- introduce code-level handling for short confirmations such as `是`, `好`, `继续`
+- move dynamic state refresh from heuristic to explicit planner preflight stage
+- support multiple concurrent execution sessions per user if needed
--- a/docs/architecture/AGENT_MEMORY_AND_PLANNING.zh-CN.md
+++ b/docs/architecture/AGENT_MEMORY_AND_PLANNING.zh-CN.md
@@ -0,0 +1,453 @@
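+# NOFXi Agent Memory and Planning Design
+
+## Purpose
+
+This document explains how the current NOFXi agent handles:
+
+- short-term conversation memory
+- durable task memory
+- durable execution / planning state
+- planner execution and replanning
+- state reset and resume
+
+This document mainly corresponds to the following implementation files: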
+
+- `agent/history.go`
+- `agent/memory.go`
+- `agent/execution_state.go`
+- `agent/planner_runtime.go`
+- `agent/agent.go`
+
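+## Overall Model
+
+The current agent uses three different layers of state:
+
+1. `chatHistory`
+Holds the raw user/assistant messages of the last few turns of the current conversation, in memory.
+
+2. `TaskState`
+Holds structured summaries that remain valuable across turns, persisted to storage.
+
+3. `ExecutionState`
+Holds the execution state of the current planning flow and supports continuing after the flow is interrupted.
+
+The three layers have different responsibilities and must not be conflated.
+
+## The Three State Layers
+
+### 1. `chatHistory`
+
+Defined in: `agent/history.go`
+
+Role:
+
+- stores recent `user` / `assistant` messages keyed by `userID`
+- serves as short-term conversational context
+- serves as the raw material later compressed into `TaskState`
+
+Characteristics:
+
+- exists only in memory
+- capped by `maxTurns`
+- cleared by `/clear`
+- not suitable as a source of long-term truth
+
+Typical contents:
+
+- the last few user questions
+- the last few assistant replies
+- temporary wording and contextual phrasing
+
+### 2. `TaskState`
+
+Defined in: `agent/memory.go`
+
+Role:
+
+- stores durable, structured context that cannot easily be re-derived from tools
+- persisted through `system_config`
+- injected into planner / reasoning prompts
+
+Storage key: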
+
+- `agent_task_state_<userID>`
+
+Fields:
+
+- `CurrentGoal`
+- `ActiveFlow`
+- `OpenLoops`
+- `ImportantFacts`
+- `LastDecision`
+- `UpdatedAt`
+
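+Intended contents:
+
+- user goals that are still valid
+- high-level unresolved issues that still hold across turns
+- important facts that cannot simply be re-read through tools
+- the most recent key decision and its reason
+
+Not intended for:
+
+- step-level pending items such as "wait for the user to provide an API key"
+- execution actions such as "call get_exchange_configs"
+- live balances
+- current positions
+- current market prices
+- mutable state such as whether a given configuration exists
+
+Such dynamic information should be re-checked through tools at planning time instead of being trusted from old summaries.
+
+### 3. `ExecutionState`
+
+Defined in: `agent/execution_state.go`
+
+Role:
+
+- stores the state of the workflow currently being executed
+- supports resuming after `ask_user`
+- persists plan steps, observations, and the final status
+
+Storage key:
+
+- `agent_execution_state_<userID>`
+
+Fields: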
+
+- `SessionID`
+- `UserID`
+- `Goal`
+- `Status`
+- `PlanID`
+- `Steps`
+- `CurrentStepID`
+- `Observations`
+- `FinalAnswer`
+- `LastError`
+- `UpdatedAt`
+
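+This is the planner's "working state", not a general memory store.
+
+## Data Flow
+
+### Request Entry
+
+Entry functions:
+
+- `HandleMessage(...)`
+- `HandleMessageStream(...)`
+
+Flow:
+
+1. the user message enters `agent`
+2. slash commands and explicit direct branches are handled first
+3. all other requests enter the planner flow: `thinkAndAct(...)` / `thinkAndActStream(...)`
+
+### Planner Main Flow
+
+The planner pipeline in `agent/planner_runtime.go` is:
+
+1. append the user message to `chatHistory`
+2. emit a `planning` SSE event
+3. load `ExecutionState`
+4. reset a stale `ExecutionState` if needed
+5. refresh dynamic configuration snapshots if needed
+6. call the LLM to create a fresh execution plan
+7. execute the plan step by step
+8. persist `ExecutionState` after key state changes
+9. append the assistant answer to `chatHistory`
+10. compress old conversation into `TaskState` if needed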
+
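+## Short-Term vs Durable Memory
+
+### What belongs in `chatHistory`
+
+Good fits:
+
+- raw recent messages
+- conversational wording
+- the assistant's phrasing in the latest turn
+
+Bad fits:
+
+- long-lived truths
+- current state of external systems
+
+### What belongs in `TaskState`
+
+Good fits:
+
+- durable goals
+- high-level unfinished work that remains relevant across turns
+- important facts the user stated explicitly
+- past key decisions and their reasons
+
+Bad fits:
+
+- steps of the current plan that have not run yet
+- execution-level to-dos such as "wait for a field" or "call a tool"
+- stale conclusions such as "whether the system has this tool"
+- mutable state such as "whether a model/exchange config currently exists"
+- dynamic state that can be queried again through tools
+
+### What belongs in `ExecutionState`
+
+Good fits:
+
+- current plan steps
+- observations from tool calls
+- whether the flow is blocked waiting for user input
+- the exact execution position of the current workflow
+- step-level pending work and block reasons
+
+Bad fits:
+
+- long-term user profile
+- general long-term semantic memory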
+
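+## Planning Logic
+
+### Plan Creation
+
+`createExecutionPlan(...)` sends the following to the planner model:
+
+- available tool definitions
+- persistent user preferences
+- `TaskState` context
+- `ExecutionState` JSON
+- current user request
+
+The planner must return JSON, and each step type must be one of:
+
+- `tool`
+- `reason`
+- `ask_user`
+- `respond`
+
+### Step Execution
+
+The execution loop of `executePlan(...)` is:
+
+- `tool`
+  call the tool and record an observation
+- `reason`
+  run a reasoning sub-call and record an observation
+- `ask_user`
+  save the `waiting_user` state and return the question to the user
+- `respond`
+  generate the final answer and mark the plan completed
+
+After each step finishes, `replanAfterStep(...)` may decide to:
+
+- continue
+- replace_remaining
+- ask_user
+- finish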
+
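+## Resuming Execution
+
+When `ExecutionState.Status == waiting_user`, the next user message is treated as a reply to the previous question.
+
+Current safeguards:
+
+- the most recent question is extracted from the stored plan
+- the user reply is appended as a `user_reply` observation
+- an explicit `Resume context` is injected into the planner prompt
+
+This reduces the cases where a short reply such as `是` is misread as a brand-new intent.
+
+## Dynamic State Refresh
+
+Configuration and trader management requests are dynamic by nature. Their truth can change outside the chat, for example:
+
+- the user configures an exchange in the Web UI
+- the user adds a model on another page
+- the user creates a trader elsewhere
+
+Therefore, such requests must not rely on stale model conclusions.
+
+Current protection in `planner_runtime.go`:
+
+- detects config / trader intent with `isConfigOrTraderIntent(...)`
+- no longer injects the old `TaskState` into the planner prompt for these requests
+- also refreshes the live snapshots in `ExecutionState.Observations`:
+  - `toolGetModelConfigs(...)`
+  - `toolGetExchangeConfigs(...)`
+  - `toolListTraders(...)`
+
+This makes the planner rely more on the current system state than on descriptions in old memory.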
+
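+## Reset Strategy
+
+The system currently resets or weakens stale execution state when:
+
+- the user says something like `再试`, `继续`, `try again`, `continue`
+- the current request is config / trader related and the old `ExecutionState` has failed / completed / is waiting for the user
+
+Reset scope:
+
+- `ExecutionState` may be cleared
+- `TaskState` is not deleted as a whole, but it is deliberately ignored for config / trader requests
+
+Manual reset:
+
+- `/clear`
+
+This command clears:
+
+- short-term chat history
+- task state
+- execution state
+
+## Compression Design
+
+`maybeCompressHistory(...)` compresses older short-term conversation into `TaskState` when these conditions are met:
+
+- the recent message count exceeds the window
+- the estimated token count exceeds the threshold
+
+Compression flow:
+
+1. keep the most recent turns in `chatHistory`
+2. summarize earlier content into a structured `TaskState`
+3. persist the new `TaskState`
+4. replace `chatHistory` with the slice of recent messages
+
+Important design rule:
+
+- `TaskState` keeps only context that stays valid long term
+- it must not become a stale copy of dynamic operational state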
+
+## Current Architecture Diagram
+
+```mermaid
+flowchart TD
+    U[User Message] --> A[HandleMessage / HandleMessageStream]
+    A --> B{Direct branch hit?}
+    B -->|Yes| C[Handle slash command or shortcut branch directly]
+    B -->|No| D[thinkAndAct / thinkAndActStream]
+
+    D --> E[Append to chatHistory]
+    D --> F[Load ExecutionState]
+    F --> G{waiting_user?}
+    G -->|Yes| H[Append user_reply observation]
+    G -->|No| I[Create fresh ExecutionState]
+
+    H --> J[Refresh dynamic snapshots if config or trader request]
+    I --> J
+    J --> K[createExecutionPlan via LLM]
+    K --> L[Execution plan]
+    L --> M[executePlan loop]
+
+    M --> N[tool step]
+    M --> O[reason step]
+    M --> P[ask_user step]
+    M --> Q[respond step]
+
+    N --> R[Append Observation]
+    O --> R
+    R --> S[replanAfterStep]
+    S --> M
+
+    P --> T[Persist waiting_user ExecutionState]
+    T --> UQ[Return question to user]
+
+    Q --> V[Persist completed ExecutionState]
+    V --> W[Append assistant reply to chatHistory]
+    W --> X[maybeCompressHistory]
+    X --> Y[Persist TaskState]
+    Y --> Z[Return final answer]
+```
+
+## Memory Relationship Diagram
+
+```mermaid
+flowchart LR
+    CH[chatHistory\nin-memory\nrecent turns]
+    TS[TaskState\npersisted summary\nsystem_config]
+    ES[ExecutionState\npersisted execution state\nsystem_config]
+    PL[Planner Prompt]
+
+    CH -->|recent raw turns| PL
+    ES -->|current workflow JSON| PL
+    TS -->|long-term structured context| PL
+
+    CH -->|old messages compressed| TS
+    PL -->|plan / observations / status| ES
+```
+
+## State Transition Diagram
+
+```mermaid
+stateDiagram-v2
+    [*] --> planning
+    planning --> running: plan created
+    running --> waiting_user: ask_user step
+    waiting_user --> planning: user replies
+    running --> completed: respond step finished
+    running --> failed: step error
+    failed --> planning: retry / continue / config-trader reset
+    completed --> planning: new relevant request or retry flow
+```
+
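+## Tradeoffs of the Current Design
+
+### Strengths
+
+- separates short-term conversation from the long-term summary
+- supports resuming execution after `ask_user`
+- supports replanning after every key step
+- resists contamination from stale conclusions better for dynamic requests such as configuration / trader creation
+
+### Weaknesses
+
+- the quality of `TaskState` still depends on summarization quality
+- some resume logic still depends on model compliance
+- each user currently has only one `ExecutionState`, so multiple concurrent workflows are not supported
+- config / trader intent detection is still a keyword heuristic
+
+## Practical Guidance
+
+### When to trust `TaskState`
+
+Trust it for:
+
+- continuing the user's goal
+- tracking unfinished items
+- retaining facts that stay valid long term
+
+Do not trust it for:
+
+- whether a model / exchange / trader config currently exists
+- whether a given operation can currently be performed
+
+### When to trust `ExecutionState`
+
+Trust it for:
+
+- whether the current workflow is still continuous
+- which step is currently blocked
+- the most recent observation chain
+
+Do not trust it blindly for:
+
+- cases where the user has changed configuration outside the chat
+- old conclusions made before system capabilities or the tool set changed
+
+### When live state must be fetched again
+
+Prefer fetching these again through tools:
+
+- current model configs
+- current exchange configs
+- current trader list
+- whether the conditions for creating a trader are currently met
+
+## Suggested Next Steps
+
+- add a version number or capability signature to `ExecutionState` so it is invalidated automatically when capabilities change
+- separate `waiting_user_confirmation` from generic `waiting_user`
+- add code-level recognition for short confirmations such as `是`, `好`, `继续`
+- upgrade dynamic snapshot refresh from a heuristic to an explicit planner preflight stage
+- support multiple concurrent execution sessions per user if needed later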