Commit Graph

86 Commits

Author SHA1 Message Date
Liang Jiaqing
12c2fe1f79 add uv install guide & pyproject.toml; unify error prefix to !!!Error: 2026-04-24 10:00:23 +08:00
Liang Jiaqing
fd46d61543 fix: SSE tool_call streaming compat for non-standard index/id, MockToolCall handle list args & None fallback; sop: CDP bringToFront note 2026-04-23 23:44:50 +08:00
Liang Jiaqing
8a6a5715ff Fix responses call_id mapping for converted tool history 2026-04-23 20:53:44 +08:00
Liang Jiaqing
df2a7749b4 refactor: unify system prompt injection into _openai_stream; disable gpt done_hooks 2026-04-23 20:07:20 +08:00
Jiaqing Liang
6091bf0c6f fix: add requests to pip install & tune max_turns/prompt 2026-04-23 17:21:22 +08:00
Liang Jiaqing
1678114f1f fix: split concatenated JSON tool args in _parse_openai_sse to prevent _raw death spiral 2026-04-23 11:53:35 +08:00
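Commit 1678114f1f does not show how the split is done. One possible approach, sketched here under that caveat, is to decode back-to-back JSON values with the standard library's `JSONDecoder.raw_decode`. The function name `split_concatenated_json` is hypothetical and is not taken from `_parse_openai_sse`.

```python
import json


def split_concatenated_json(raw: str) -> list:
    """Decode back-to-back JSON values such as '{"a": 1}{"b": 2}'.

    Hypothetical helper: the real _parse_openai_sse logic may differ.
    """
    decoder = json.JSONDecoder()
    values, pos = [], 0
    while pos < len(raw):
        # raw_decode rejects leading whitespace, so skip it manually.
        while pos < len(raw) and raw[pos].isspace():
            pos += 1
        if pos >= len(raw):
            break
        # raw_decode returns the decoded value and the index where it ended.
        value, pos = decoder.raw_decode(raw, pos)
        values.append(value)
    return values
```

Each decoded value can then be treated as the arguments of a separate tool call instead of falling back to a raw, unparsed string.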
Liang Jiaqing
97573e6f46 fix: agent-level retry for intermittent SSL errors during streaming
- llmcore: prefix all error outputs with !!! marker
- ga: detect !!!Error: [SSL: in tail of content as incomplete response and retry
- add len(content)>100 guard to avoid false positives on short responses
2026-04-23 11:24:54 +08:00
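The detection rule described in 97573e6f46 can be sketched roughly as follows. The marker string and the `len(content) > 100` guard come from the commit message. The 200-character tail window and both function names are assumptions made for illustration.

```python
ERROR_MARKER = "!!!Error: [SSL:"  # prefix quoted in the commit message


def looks_like_ssl_truncation(content: str, tail_chars: int = 200) -> bool:
    """Return True when a long response ends with the SSL error marker."""
    # Guard from the commit: responses of 100 characters or fewer never retry.
    if len(content) <= 100:
        return False
    # tail_chars is an assumed window; the commit only says "tail of content".
    return ERROR_MARKER in content[-tail_chars:]


def ask_with_retry(ask, max_attempts: int = 2) -> str:
    """Call ask() again while the result looks truncated (hypothetical wrapper)."""
    content = ""
    for _ in range(max_attempts):
        content = ask()
        if not looks_like_ssl_truncation(content):
            break
    return content
```

Checking only the tail keeps a response that merely mentions the marker early on from being retried.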
Liang Jiaqing
17971f642a fix: guard all id/call_id fields with or "" to prevent None leaking into payload 2026-04-23 11:00:04 +08:00
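A short illustration of why 17971f642a uses `or ""` rather than relying on a `dict.get` default. The helper name `safe_call_id` is hypothetical.

```python
def safe_call_id(tool_call: dict) -> str:
    """Return the tool call id, or an empty string when it is missing or None."""
    # dict.get's default only applies when the key is absent, so an explicit
    # None would still pass through; `or ""` covers both cases.
    return tool_call.get("id") or ""
```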
Liang Jiaqing
08c583d1c6 fix: filter empty text blocks in message conversion & add prompt continuation line 2026-04-22 19:37:06 +08:00
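The filtering step in 08c583d1c6 might look like the sketch below. It assumes messages already use a content-block list, and it treats whitespace-only text as empty, which is an assumption rather than something the commit states. The function name is hypothetical.

```python
def drop_empty_text_blocks(content: list) -> list:
    """Remove text blocks whose text is missing, empty, or whitespace-only."""
    kept = []
    for block in content:
        is_text = block.get("type") == "text"
        # Assumption: whitespace-only text counts as empty.
        if is_text and not (block.get("text") or "").strip():
            continue
        kept.append(block)
    return kept
```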
Liang Jiaqing
274d47f35f Normalize OAI messages before conversion and update tool-call prompt 2026-04-22 17:16:05 +08:00
song
f14ad0c693 fix(llmcore): preserve thinking block signature in streaming SSE parser (#123)
Anthropic's extended thinking streaming protocol emits two delta types
for a single thinking block: `thinking_delta` (the textual reasoning)
and `signature_delta` (a base64 HMAC tag appended at the end of the
block). Both must be accumulated into the same `content_block`.

Current code only handles `thinking_delta`, so `signature_delta` events
are silently dropped. When the assistant's reply (with thinking) is
echoed back on the next turn, Anthropic's server validates the
signature and rejects the request with 400:

    "Invalid `signature` in `thinking` block"

Downstream effects observed in production (via sub2api relay logs):
- Every request with history triggers a 400 signature error
- The relay strips thinking blocks and retries, which changes the
  cache prefix and invalidates prompt caching, forcing a full rebuild
  of cache_creation_tokens (~20k-30k per affected request)
- Measured in a 5h window: 5/25 requests suffered cache invalidation,
  accounting for 53.5% of total spend that was otherwise avoidable

Fix:
1. Initialize `current_block` with an empty `signature` field when a
   thinking block starts, so the dict shape matches Anthropic's spec
   (`{type, thinking, signature}`).
2. Handle `signature_delta` events by appending `delta.signature` to
   `current_block["signature"]`. Using `+=` (rather than assignment)
   mirrors how `thinking_delta` is accumulated and is robust against
   future chunked signatures.
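The two steps above can be sketched as a small accumulator. This is a minimal illustration that assumes the SSE events have already been parsed into dicts. The function name `accumulate_blocks` is hypothetical and the real parser in llmcore may be structured differently.

```python
def accumulate_blocks(events):
    """Fold parsed streaming events into a list of completed content blocks."""
    blocks, current_block = [], None
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start":
            current_block = dict(event["content_block"])
            if current_block.get("type") == "thinking":
                # Step 1: give thinking blocks the {type, thinking, signature} shape.
                current_block.setdefault("thinking", "")
                current_block.setdefault("signature", "")
        elif kind == "content_block_delta" and current_block is not None:
            delta = event["delta"]
            if delta["type"] == "thinking_delta":
                current_block["thinking"] += delta["thinking"]
            elif delta["type"] == "signature_delta":
                # Step 2: append with += so chunked signatures are kept whole.
                current_block["signature"] += delta["signature"]
            elif delta["type"] == "text_delta":
                current_block["text"] = current_block.get("text", "") + delta["text"]
        elif kind == "content_block_stop" and current_block is not None:
            blocks.append(current_block)
            current_block = None
    return blocks
```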

No behavior change for clients that disable extended thinking, or for
upstreams that don't emit `signature_delta`. For `tool_use` threads
that require valid thinking signatures to preserve reasoning context,
this fix is required — the previous behavior silently corrupted them.

Verification:
- Before fix: upstream returns 400 + retry; cache_creation_tokens
  spike to ~25k on every 4th-5th request in a conversation
- After fix: upstream accepts the first attempt; cache_read_tokens
  dominate, cache_creation_tokens only appear on the first request
  of a fresh 5m prompt-cache window
2026-04-21 12:07:14 +08:00
Jiaqing Liang
116d7d3d23 refactor: plugins dir + opt-in langfuse via __getattr__ guard
- mv langfuse_tracing.py -> plugins/langfuse_tracing.py
- llmcore: load plugin lazily inside __getattr__ when langfuse_config present
  (PEP 562 module __getattr__ naturally fires only once after globals().update)
- llmcore: extract _record_usage() from 4 scattered [Cache] print sites
- agentmain: /resume scans only latest 10 files
2026-04-20 15:56:06 +08:00
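The PEP 562 behavior mentioned in 116d7d3d23 can be demonstrated in isolation. The sketch below builds a throwaway module in memory. `lazy_mod`, `_load_config`, and `load_count` are hypothetical names, and the `mykeys` and `proxies` values are placeholders, not the project's real configuration.

```python
import types

MODULE_SOURCE = '''
load_count = 0

def _load_config():
    # Stand-in for an expensive load (reading keys, importing a plugin, ...).
    global load_count
    load_count += 1
    return {"mykeys": {"example": "placeholder"}, "proxies": None}

def __getattr__(name):
    # PEP 562: called only when normal module attribute lookup fails.
    if name in ("mykeys", "proxies"):
        globals().update(_load_config())
        return globals()[name]
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

# Build the module in memory so the example is self-contained.
lazy_mod = types.ModuleType("lazy_mod")
exec(MODULE_SOURCE, lazy_mod.__dict__)
```

After the first access, `globals().update` places both names in the module namespace, so later lookups succeed normally and `__getattr__` is not called for them again.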
totoyang
8e6270e3a3 feat: optional Langfuse tracing for agent execution (#115)
Self-activating langfuse tracing via monkey-patch: independent module, zero impact when langfuse_config unset. Captures LLM generation, tool calls, token usage from SSE streams.

Co-authored-by: totoyang
2026-04-20 15:27:55 +08:00
Liang Jiaqing
86ca4625ad refactor: simplify HTTP error handling in _openai_stream, add non-stream support, broadcast history in MixinSession 2026-04-19 20:57:07 +08:00
Liang Jiaqing
745220e62c Revert "Merge pull request #108 from ggandmee-cloud/feat/cch-signing"
This reverts commit 3e61155b33, reversing
changes made to 4abb8e205d.
2026-04-19 16:18:45 +08:00
Shen Hao
9217fee211 [feat]: Add user agent configuration to llmcore 2026-04-19 15:38:41 +08:00
LJQ
3e61155b33 Merge pull request #108 from ggandmee-cloud/feat/cch-signing
feat: NativeClaudeSession supports optional CCH signing in fake_cc_system_prompt mode
2026-04-19 15:27:13 +08:00
ggandmee-cloud
c79b1c5140 feat: add CCH signing to NativeClaudeSession
Matches the real Claude Code client protocol, for compatibility with proxies that need to verify client identity.
2026-04-19 01:20:01 -04:00
Liang Jiaqing
c6319594c7 Clean up SOPs: drop the TM approach from web_setup, streamline the tmwebdriver troubleshooting flow 2026-04-19 11:16:29 +08:00
Liang Jiaqing
f8a380d4fd fix: increase default timeout for non-stream mode (10/240s vs 5/30s) 2026-04-18 16:45:37 +08:00
宋明明
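04b4818f87 fix(llmcore): add MiniMax timeout status code 529 to the retry mechanism
## Problem
When a request times out, the MiniMax API returns HTTP status code 529. That code was not in the set of retryable status codes, so the retry mechanism had no effect on MiniMax timeouts even when max_retries was configured.
Current behavior: MiniMax 529 → no retry → error raised immediately
Expected behavior: MiniMax 529 → retry logic triggered → max_retries takes effect

## Solution
Add status code 529 to the RETRYABLE set in llmcore.py.

## Thoughts on other changes considered
The original idea was to move the HTTP status codes that trigger LLM API retries out of the code and into a configurable file, assets/http_retry_codes.json, with a default configuration generated automatically on first run.

### Original motivation (config-file approach)
- ✓ Adapt to new error codes without modifying code
- ✓ Lower maintenance cost
### Practical considerations (hard-coded approach)
- There are only a handful of mainstream model vendors: OpenAI, Claude, MiniMax, Kimi, etc.
- The standard timeout-related error codes are essentially fixed: 408 (timeout), 429 (rate_limit), 5xx (server_error)
- MiniMax 529 is a special case but does not change often
- Hard-coding is simpler and more direct, and clearer to maintain

## Scope of impact
- llmcore.py: the retry mechanism in _openai_stream() now supports the MiniMax 529 error
- MiniMax API timeout scenarios now correctly trigger the retry logic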
2026-04-17 15:09:52 +08:00
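A rough sketch of the hard-coded approach chosen in 04b4818f87. The exact contents of the real RETRYABLE set are not shown in the commit, so the members below are an assumption based on the codes it mentions (408, 429, 5xx, 529). `send_with_retries` and its `send` callable are hypothetical stand-ins for the request logic in `_openai_stream()`.

```python
import time

# Assumed membership; only 529 is confirmed as newly added by the commit.
RETRYABLE = {408, 429, 500, 502, 503, 504, 529}


def send_with_retries(send, max_retries: int = 1, backoff: float = 0.0):
    """Call send() until it returns a non-retryable status or retries run out.

    send is a hypothetical callable returning a (status_code, body) tuple.
    """
    attempts = 0
    while True:
        status, body = send()
        if status not in RETRYABLE or attempts >= max_retries:
            return status, body
        attempts += 1
        # Linear backoff; pass backoff=0.0 to retry immediately.
        time.sleep(backoff * attempts)
```

With 529 in the set, a MiniMax timeout consumes one retry instead of surfacing as an error on the first attempt.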
Jiaqing Liang
f418963585 refactor: omit temperature from payload when default (1), avoid errors on reasoning models 2026-04-17 13:11:57 +08:00
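The idea in f418963585 can be illustrated with a small payload builder. The function name and payload fields are assumptions for illustration only.

```python
def build_payload(model: str, messages: list, temperature: float = 1) -> dict:
    """Build a chat payload, leaving temperature out when it is the default."""
    payload = {"model": model, "messages": messages}
    # Only send temperature when it differs from the default of 1, so models
    # that reject the parameter never see it.
    if temperature != 1:
        payload["temperature"] = temperature
    return payload
```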
Liang Jiaqing
7cadbd7403 feat: i18n support - auto-detect system language for zh/en prompts 2026-04-16 18:47:40 +08:00
Liang Jiaqing
9e18ce26dc fix: empty file ZeroDivisionError in file_read; remove SiderLLMSession; init global_mem with header 2026-04-16 16:04:18 +08:00
Liang Jiaqing
d18c8438d3 refactor: use with statement for requests.post in NativeClaudeSession 2026-04-14 23:35:51 +08:00
Liang Jiaqing
74abb77a0b refactor: rename default_model->model, add thinking_type support, fix pet urlopen error handling 2026-04-14 21:14:05 +08:00
Liang Jiaqing
499f0119bb refactor llm session params and thinking parsing 2026-04-13 20:20:31 +08:00
Jiaqing Liang
086599a5d6 fix: scroll ghost height reflow via overflow toggle; extend cache markers to last 2 user msgs; simplify cursor & merge JS fixes 2026-04-13 14:59:38 +08:00
Liang Jiaqing
d94e404f64 fix: use local var for claude tools conversion, avoid mutating self.tools 2026-04-12 14:46:45 +08:00
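A sketch of a non-mutating conversion in the spirit of d94e404f64. It maps the OpenAI function-tool layout (`type`/`function`/`parameters`) to the Claude layout (`name`/`description`/`input_schema`). The function name `to_claude_tools` is hypothetical, and the fallback for entries that are not OpenAI-style is an assumption.

```python
import copy


def to_claude_tools(openai_tools: list) -> list:
    """Return Claude-style tool definitions without touching the input list."""
    converted = []
    for tool in openai_tools:
        fn = tool.get("function")
        if tool.get("type") == "function" and fn:
            converted.append({
                "name": fn["name"],
                "description": fn.get("description", ""),
                # Deep copy so later edits to the result cannot leak back.
                "input_schema": copy.deepcopy(
                    fn.get("parameters", {"type": "object", "properties": {}})
                ),
            })
        else:
            # Assumption: anything else is already in Claude format.
            converted.append(copy.deepcopy(tool))
    return converted
```

Because the result is a fresh list held in a local variable, the session's stored tools stay in their original format for any other code that reads them.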
Liang Jiaqing
142de0c45b refactor: remove claude_tools_format flag, auto-convert tools in NativeClaudeSession.raw_ask 2026-04-12 14:43:46 +08:00
Liang Jiaqing
2b5cbff7be feat: --verbose flag, lazy mykeys loading, temperature config support
- agentmain: add --verbose arg for subagent monitoring mode
- llmcore: lazy-load mykeys/proxies via module __getattr__
- llmcore: fix auto_make_url regex for trailing slash cases
- llmcore: support temperature override from session config
- docs: update subagent.md with --verbose usage note
2026-04-12 14:29:32 +08:00
Liang Jiaqing
9e03a675ae fix: MixinSession copy before override & add no-tools warning; refine subagent SOP 2026-04-11 18:50:26 +08:00
Liang Jiaqing
6f1585e88f fix: comment out MixinSession copy to prevent tools not propagating to original session 2026-04-11 18:25:54 +08:00
Liang Jiaqing
de8adf76a9 feat: L4 session archiver + scheduler cron integration
- Add compress_session.py: compress raw model_responses into L4 archives
- Integrate 12h silent cron into scheduler.check() (runs before TASKS dir check)
- Whitelist compress_session.py in .gitignore (archives excluded)
- llmcore: refactor SSE warn handling, max_retries default 2->1
- scheduler: remove unused health_check(), INTERVAL 60->120
2026-04-11 15:55:35 +08:00
Liang Jiaqing
a737523f0a llmcore: remove runtime model param, upgrade Claude beta headers, softer backoff 2026-04-11 14:34:27 +08:00
Liang Jiaqing
7bfd6e43e6 fix: _fix_messages for Claude API compliance, raw_ask simplify, no_tool orphan fix, summary extraction improvement 2026-04-11 13:24:33 +08:00
Jiaqing Liang
5a1d3a41da fix: handle window object serialization in CDP bridge; improve file_write error msg; minor llmcore style cleanup 2026-04-10 14:04:41 +08:00
Liang Jiaqing
4b18ad683f Fix tool call conversion and working checkpoint result 2026-04-10 10:45:00 +08:00
Liang Jiaqing
2977be33c6 Update llmcore truncation, adb_ui parsing, stapp regex, and subagent docs 2026-04-09 18:43:20 +08:00
Liang Jiaqing
6628a3d987 tune: read_timeout 60s, simphtml print/newTabs fix, remove obsolete iframe sop 2026-04-08 20:45:06 +08:00
Liang Jiaqing
b4741a9a39 Optimize: force aggressive tag compression before history truncation to save context 2026-04-06 22:50:23 +08:00
Liang Jiaqing
aae6d810cd NativeClaudeSession: auto-convert OpenAI tools format to Claude; skip in NativeOAISession 2026-04-04 07:03:18 +08:00
Liang Jiaqing
14125ed57c refactor: move tools ownership from NativeToolClient to Session layer
- tools state now held by Session (NativeClaudeSession.tools)
- MixinSession.__setattr__ broadcasts tools/system to all sub-sessions
- NativeToolClient no longer duplicates tools storage
- fix: use type(s) is instead of isinstance to avoid catching NativeOAISession subclass
2026-04-03 23:05:28 +08:00
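Two points from 14125ed57c are sketched below: broadcasting selected attributes through `__setattr__`, and the difference between `type(s) is` and `isinstance`. The class bodies are simplified stand-ins, and the assumption that NativeOAISession subclasses NativeClaudeSession is inferred from the commit message rather than confirmed. `exact_claude_sessions` is a hypothetical helper.

```python
class BaseSession:
    def __init__(self):
        self.tools = None
        self.system = None


class NativeClaudeSession(BaseSession):
    pass


class NativeOAISession(NativeClaudeSession):  # assumed inheritance
    pass


class MixinSession:
    _BROADCAST = ("tools", "system")

    def __init__(self, sessions):
        # Bypass the custom __setattr__ while the sub-session list is created.
        object.__setattr__(self, "sessions", list(sessions))

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name in self._BROADCAST:
            # Push the new value down to every sub-session.
            for session in self.sessions:
                setattr(session, name, value)


def exact_claude_sessions(sessions):
    """Keep only sessions whose class is exactly NativeClaudeSession."""
    # isinstance would also match subclasses such as NativeOAISession.
    return [s for s in sessions if type(s) is NativeClaudeSession]
```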
Liang Jiaqing
97abc43a40 refactor(simphtml): rewrite list detection & cutlist for multi-list support
- simphtml: replace center-point ancestor-chain approach with global container scan;
  support multiple lists per page; add container-scoped selector prefixes;
  inline FAKE ELEMENT hints with hidden item previews; remove findMainContent
- ga: hot-reload simphtml on each web_scan; fix file_read total_lines for keyword search;
  add errors='replace' for global_mem encoding safety
- llmcore: stabilize NativeClaude session/device IDs across requests;
  rename no_system_prompt to fake_cc_system_prompt; deep-copy message content
- launch: adjust window width 700->600
2026-04-03 19:38:51 +08:00
Liang Jiaqing
da40ba413b feat: fold_turns UI folding + relaxed output limits + compressed history/key_info tags + larger context_win 2026-04-03 10:43:15 +08:00
Liang Jiaqing
b0d4563ae8 Refine llmcore debug and tool parsing 2026-04-02 21:16:45 +08:00
Liang Jiaqing
88f32b208b feat: support NativeToolClient and optimize tool use format for native API 2026-04-01 23:09:48 +08:00
Liang Jiaqing
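629e57ad83 Refactor: unify message format and restructure the Session architecture
Core changes:
- All Sessions now use the Claude content-block format internally
- Introduce a BaseSession base class to simplify the code structure
- tool_results changed from a string to a structured list of dicts
- NativeClaudeSession enhancements: support for cr_token, metadata, and thinking extraction
- ToolClient simplified: remove the structured branch and always use the protocol prompt
- MixinSession supports selecting a session by name
- ljqCtrl_sop: add a warning about the DPI coordinate pitfall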
2026-04-01 22:20:18 +08:00
Liang Jiaqing
368d68baa5 fix: remove stray brace in tool_pattern regex 2026-04-01 09:47:53 +08:00
Liang Jiaqing
1c38412e42 revert: restore the minimum-length limit in the tool-call regex 2026-04-01 09:35:37 +08:00