Liang Jiaqing
5b0d96b1b5
tg: support document upload, disable draft streaming; wx: streaming flush during task; llmcore: safeprint, max_tokens fixes
2026-04-25 21:33:38 +08:00
Jiaqing Liang
10776fed51
Remove responses max output token limit
2026-04-25 14:03:21 +08:00
Jiaqing Liang
afa4586e3d
fix: _drop_unsigned_thinking crashes on str content + _to_responses_input drops thinking-only assistant messages, causing tool_call to be swallowed
2026-04-25 13:53:08 +08:00
Liang Jiaqing
08181be4bf
fix: use correct token limit params for OpenAI APIs
2026-04-25 10:54:44 +08:00
Liang Jiaqing
cac3ba4769
feat: reasoning/thinking interop adaptation + larger history window + stronger summary prompt
2026-04-25 10:33:42 +08:00
Liang Jiaqing
12c2fe1f79
add uv install guide & pyproject.toml; unify error prefix to !!!Error:
2026-04-24 10:00:23 +08:00
Liang Jiaqing
fd46d61543
fix: SSE tool_call streaming compat for non-standard index/id, MockToolCall handle list args & None fallback; sop: CDP bringToFront note
2026-04-23 23:44:50 +08:00
Liang Jiaqing
8a6a5715ff
Fix responses call_id mapping for converted tool history
2026-04-23 20:53:44 +08:00
Liang Jiaqing
df2a7749b4
refactor: unify system prompt injection into _openai_stream; disable gpt done_hooks
2026-04-23 20:07:20 +08:00
Jiaqing Liang
6091bf0c6f
fix: add requests to pip install & tune max_turns/prompt
2026-04-23 17:21:22 +08:00
Liang Jiaqing
1678114f1f
fix: split concatenated JSON tool args in _parse_openai_sse to prevent _raw death spiral
2026-04-23 11:53:35 +08:00
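A minimal sketch of the splitting idea behind 1678114f1f, assuming the failure mode is several JSON objects arriving back to back in one tool-call `arguments` string. The function name `split_concatenated_json` is hypothetical and is not taken from `_parse_openai_sse`; only the standard-library `json.JSONDecoder.raw_decode` call is real.

```python
import json


def split_concatenated_json(raw: str) -> list:
    """Decode back-to-back JSON values such as '{"a": 1}{"b": 2}'.

    Hypothetical helper for illustration; raises json.JSONDecodeError
    if any segment is not valid JSON.
    """
    decoder = json.JSONDecoder()
    values, pos = [], 0
    while pos < len(raw):
        # raw_decode does not skip leading whitespace, so do it here.
        if raw[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(raw, pos)
        values.append(value)
    return values


if __name__ == "__main__":
    print(split_concatenated_json('{"path": "a.txt"}{"path": "b.txt"}'))
```

Each decoded value can then be treated as its own tool call instead of falling back to a raw, unparsed argument string.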
Liang Jiaqing
97573e6f46
fix: agent-level retry for intermittent SSL errors during streaming
...
- llmcore: prefix all error outputs with !!! marker
- ga: detect !!!Error: [SSL: in tail of content as incomplete response and retry
- add len(content)>100 guard to avoid false positives on short responses
2026-04-23 11:24:54 +08:00
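The detection rule in 97573e6f46 can be sketched roughly as follows. This is an illustration under assumptions: the helper names, the 200-character tail window, and the attempt count are hypothetical; only the `!!!Error: [SSL:` marker and the `len(content) > 100` guard come from the commit message.

```python
SSL_ERROR_MARKER = "!!!Error: [SSL:"


def looks_truncated_by_ssl(content: str, tail_chars: int = 200) -> bool:
    """Return True if a long response ends with an SSL error marker.

    tail_chars is an assumed window size, not a value from the commit.
    """
    # Guard from the commit: short responses are never treated as truncated,
    # which avoids false positives when the whole reply is just an error.
    if len(content) <= 100:
        return False
    return SSL_ERROR_MARKER in content[-tail_chars:]


def ask_with_retry(ask, prompt: str, max_attempts: int = 3) -> str:
    """Call ask(prompt) again while the reply looks cut off by an SSL error."""
    content = ""
    for _ in range(max_attempts):
        content = ask(prompt)
        if not looks_truncated_by_ssl(content):
            break
    return content
```

Checking only the tail keeps a legitimate answer that merely quotes the marker earlier in its text from triggering a retry.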
Liang Jiaqing
17971f642a
fix: guard all id/call_id fields with or "" to prevent None leaking into payload
2026-04-23 11:00:04 +08:00
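A small illustration of why 17971f642a uses `or ""` rather than a `.get()` default: a key that is present with the value `None` bypasses the default. The function name and the item shape below are assumptions for the example, not the project's actual conversion code.

```python
def tool_call_to_item(call: dict) -> dict:
    """Build a function-call item, never letting None reach the payload."""
    return {
        "type": "function_call",
        # call.get("id", "") would still return None when the key exists
        # with a None value; `or ""` covers both missing and None.
        "call_id": call.get("id") or "",
        "name": call.get("name") or "",
        "arguments": call.get("arguments") or "{}",
    }
```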
Liang Jiaqing
08c583d1c6
fix: filter empty text blocks in message conversion & add prompt continuation line
2026-04-22 19:37:06 +08:00
Liang Jiaqing
274d47f35f
Normalize OAI messages before conversion and update tool-call prompt
2026-04-22 17:16:05 +08:00
song
f14ad0c693
fix(llmcore): preserve thinking block signature in streaming SSE parser (#123)
...
Anthropic's extended thinking streaming protocol emits two delta types
for a single thinking block: `thinking_delta` (the textual reasoning)
and `signature_delta` (a base64 HMAC tag appended at the end of the
block). Both must be accumulated into the same `content_block`.
Current code only handles `thinking_delta`, so `signature_delta` events
are silently dropped. When the assistant's reply (with thinking) is
echoed back on the next turn, Anthropic's server validates the
signature and rejects the request with 400:
"Invalid `signature` in `thinking` block"
Downstream effects observed in production (via sub2api relay logs):
- Every request with history triggers a 400 signature error
- The relay strips thinking blocks and retries, which changes the
cache prefix and invalidates prompt caching, forcing a full rebuild
of cache_creation_tokens (~20k-30k per affected request)
- Measured in a 5h window: 5/25 requests suffered cache invalidation,
accounting for 53.5% of total spend that was otherwise avoidable
Fix:
1. Initialize `current_block` with an empty `signature` field when a
thinking block starts, so the dict shape matches Anthropic's spec
(`{type, thinking, signature}`).
2. Handle `signature_delta` events by appending `delta.signature` to
`current_block["signature"]`. Using `+=` (rather than assignment)
mirrors how `thinking_delta` is accumulated and is robust against
future chunked signatures.
No behavior change for clients that disable extended thinking, or for
upstreams that don't emit `signature_delta`. For `tool_use` threads
that require valid thinking signatures to preserve reasoning context,
this fix is required — the previous behavior silently corrupted them.
Verification:
- Before fix: upstream returns 400 + retry; cache_creation_tokens
spike to ~25k on every 4th-5th request in a conversation
- After fix: upstream accepts the first attempt; cache_read_tokens
dominate, cache_creation_tokens only appear on the first request
of a fresh 5m prompt-cache window
2026-04-21 12:07:14 +08:00
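The two-step fix described in f14ad0c693 can be sketched as below. This is a simplified stand-in, assuming events have already been parsed from SSE into dicts; `accumulate_blocks` is a hypothetical name, and the real parser in llmcore handles more event types than shown here.

```python
def accumulate_blocks(events):
    """Rebuild content blocks from already-parsed Anthropic stream events."""
    blocks, current = [], None
    for ev in events:
        kind = ev.get("type")
        if kind == "content_block_start":
            current = dict(ev["content_block"])
            if current.get("type") == "thinking":
                # Step 1: give the block the full {type, thinking, signature} shape.
                current.setdefault("thinking", "")
                current.setdefault("signature", "")
        elif kind == "content_block_delta" and current is not None:
            delta = ev["delta"]
            dtype = delta.get("type")
            if dtype == "thinking_delta":
                current["thinking"] += delta["thinking"]
            elif dtype == "signature_delta":
                # Step 2: append rather than assign, in case the signature is chunked.
                current["signature"] += delta["signature"]
            elif dtype == "text_delta":
                current["text"] = current.get("text", "") + delta["text"]
        elif kind == "content_block_stop" and current is not None:
            blocks.append(current)
            current = None
    return blocks
```

With the signature preserved, the thinking block can be echoed back unchanged on the next turn.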
Jiaqing Liang
116d7d3d23
refactor: plugins dir + opt-in langfuse via __getattr__ guard
...
- mv langfuse_tracing.py -> plugins/langfuse_tracing.py
- llmcore: load plugin lazily inside __getattr__ when langfuse_config present
(PEP 562 module __getattr__ naturally fires only once after globals().update)
- llmcore: extract _record_usage() from 4 scattered [Cache] print sites
- agentmain: /resume scans only latest 10 files
2026-04-20 15:56:06 +08:00
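The "fires only once" remark in 116d7d3d23 relies on how PEP 562 works: a module-level `__getattr__` is consulted only when normal attribute lookup fails. A rough sketch follows, using a dynamically built module so the example is self-contained. `make_lazy_module` and `loader` are hypothetical names; in llmcore the hook would live at module top level and update `globals()`.

```python
import types


def make_lazy_module(name, loader):
    """Create a module whose attributes are loaded on first access.

    loader is a zero-argument callable returning a dict of names to values.
    """
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Reached only when `attr` is not yet in the module namespace.
        loaded = loader()
        # Equivalent of globals().update(...): later lookups succeed
        # normally, so this hook is not called again for loaded names.
        mod.__dict__.update(loaded)
        if attr in loaded:
            return loaded[attr]
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    mod.__getattr__ = __getattr__
    return mod
```

Note that in this sketch a lookup of a name the loader never provides would run the loader again; a real implementation may want to cache that outcome.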
totoyang
8e6270e3a3
feat: optional Langfuse tracing for agent execution (#115)
...
Self-activating langfuse tracing via monkey-patch: independent module, zero impact when langfuse_config unset. Captures LLM generation, tool calls, token usage from SSE streams.
Co-authored-by: totoyang
2026-04-20 15:27:55 +08:00
Liang Jiaqing
86ca4625ad
refactor: simplify HTTP error handling in _openai_stream, add non-stream support, broadcast history in MixinSession
2026-04-19 20:57:07 +08:00
Liang Jiaqing
745220e62c
Revert "Merge pull request #108 from ggandmee-cloud/feat/cch-signing"
...
This reverts commit 3e61155b33, reversing
changes made to 4abb8e205d.
2026-04-19 16:18:45 +08:00
Shen Hao
9217fee211
[feat]: Add user agent configuration to llmcore
2026-04-19 15:38:41 +08:00
LJQ
3e61155b33
Merge pull request #108 from ggandmee-cloud/feat/cch-signing
...
feat: NativeClaudeSession supports optional CCH signing in fake_cc_system_prompt mode
2026-04-19 15:27:13 +08:00
ggandmee-cloud
c79b1c5140
feat: add CCH signing to NativeClaudeSession
...
Matches the real Claude Code client protocol, for compatibility with proxies that verify client identity.
2026-04-19 01:20:01 -04:00
Liang Jiaqing
c6319594c7
Clean up SOP: drop the TM approach from web_setup, streamline the tmwebdriver troubleshooting flow
2026-04-19 11:16:29 +08:00
Liang Jiaqing
f8a380d4fd
fix: increase default timeout for non-stream mode (10/240s vs 5/30s)
2026-04-18 16:45:37 +08:00
宋明明
04b4818f87
fix(llmcore): add MiniMax timeout error code 529 to the retry mechanism
...
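## Problem
The MiniMax API returns HTTP status 529 when a request times out. That status code was not in the set of retryable status codes, so the retry mechanism had no effect on MiniMax timeouts even when max_retries was configured.
Current behavior: MiniMax 529 → no retry → error raised immediately
Expected behavior: MiniMax 529 → retry logic triggered → max_retries takes effect
## Solution
Add status code 529 to the RETRYABLE set in llmcore.py.
## Thoughts on other changes
The initial idea was to move the HTTP status codes for LLM API retries out of the code and into a config file, assets/http_retry_codes.json, with a default configuration generated automatically on first run.
### Original intent (config-file approach)
- ✓ Adapt to new error codes without modifying code
- ✓ Lower maintenance cost
### Practical considerations (hardcoded approach)
- There are only a handful of mainstream model vendors: OpenAI, Claude, MiniMax, Kimi, etc.
- Standard timeout error codes are essentially fixed: 408 (timeout), 429 (rate_limit), 5xx (server_error)
- MiniMax 529 is a special case but does not change often
- Hardcoding is simpler and more direct, and clearer to maintain
## Scope of impact
- llmcore.py: the retry mechanism in _openai_stream() now supports the MiniMax 529 error
- MiniMax API timeout scenarios now correctly trigger the retry logic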
2026-04-17 15:09:52 +08:00
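A hedged sketch of the hardcoded approach chosen in 04b4818f87. Only the name `RETRYABLE` and the addition of 529 come from the commit; the other set members, the `post_with_retry` helper, and the linear backoff are assumptions for illustration and do not reproduce `_openai_stream()`.

```python
import time

# Assumed contents: the commit mentions 408, 429 and 5xx as the usual
# retryable codes, and adds 529 for MiniMax timeouts.
RETRYABLE = {408, 429, 500, 502, 503, 504, 529}


def post_with_retry(send, max_retries: int = 1, backoff: float = 0.0):
    """Call send() until it returns a non-retryable status or retries run out.

    send is a zero-argument callable returning (status_code, body).
    """
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE or attempt >= max_retries:
            return status, body
        attempt += 1
        # Simple linear backoff; the real delay policy is not specified here.
        time.sleep(backoff * attempt)
```

Before the fix, a 529 would have fallen into the "not retryable" branch on the first attempt regardless of `max_retries`.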
Jiaqing Liang
f418963585
refactor: omit temperature from payload when default (1), avoid errors on reasoning models
2026-04-17 13:11:57 +08:00
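The idea in f418963585 can be illustrated with a payload builder that leaves `temperature` out when it equals the default of 1. The function name and the payload fields shown are assumptions; the actual payload construction in llmcore is not reproduced here.

```python
def build_payload(model: str, messages: list, temperature: float = 1) -> dict:
    """Build a chat payload, sending temperature only when it is non-default."""
    payload = {"model": model, "messages": messages}
    # Some reasoning models reject an explicit temperature field, so the
    # key is omitted entirely unless the caller changed it from 1.
    if temperature != 1:
        payload["temperature"] = temperature
    return payload
```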
Liang Jiaqing
7cadbd7403
feat: i18n support - auto-detect system language for zh/en prompts
2026-04-16 18:47:40 +08:00
Liang Jiaqing
9e18ce26dc
fix: empty file ZeroDivisionError in file_read; remove SiderLLMSession; init global_mem with header
2026-04-16 16:04:18 +08:00
Liang Jiaqing
d18c8438d3
refactor: use with statement for requests.post in NativeClaudeSession
2026-04-14 23:35:51 +08:00
Liang Jiaqing
74abb77a0b
refactor: rename default_model->model, add thinking_type support, fix pet urlopen error handling
2026-04-14 21:14:05 +08:00
Liang Jiaqing
499f0119bb
refactor llm session params and thinking parsing
2026-04-13 20:20:31 +08:00
Jiaqing Liang
086599a5d6
fix: scroll ghost height reflow via overflow toggle; extend cache markers to last 2 user msgs; simplify cursor & merge JS fixes
2026-04-13 14:59:38 +08:00
Liang Jiaqing
d94e404f64
fix: use local var for claude tools conversion, avoid mutating self.tools
2026-04-12 14:46:45 +08:00
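A rough sketch of the tools conversion touched by d94e404f64 and the related commits 142de0c45b and aae6d810cd: OpenAI-style function tools are mapped to Claude's `name` / `description` / `input_schema` shape into a new local list, leaving the caller's list untouched. `to_claude_tools` is a hypothetical name, and the pass-through branch for already-converted tools is an assumption.

```python
import copy


def to_claude_tools(openai_tools: list) -> list:
    """Return Claude-format tools without mutating the input list."""
    converted = []
    for tool in openai_tools:
        fn = tool.get("function")
        if tool.get("type") == "function" and fn:
            converted.append({
                "name": fn["name"],
                "description": fn.get("description", ""),
                # Deep-copy so later edits cannot leak back into the source.
                "input_schema": copy.deepcopy(
                    fn.get("parameters") or {"type": "object", "properties": {}}
                ),
            })
        else:
            # Assumed to be in Claude format already; copy it through.
            converted.append(copy.deepcopy(tool))
    return converted
```

Because the result is a separate list, a session can keep its tools in OpenAI format and convert per request.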
Liang Jiaqing
142de0c45b
refactor: remove claude_tools_format flag, auto-convert tools in NativeClaudeSession.raw_ask
2026-04-12 14:43:46 +08:00
Liang Jiaqing
2b5cbff7be
feat: --verbose flag, lazy mykeys loading, temperature config support
...
- agentmain: add --verbose arg for subagent monitoring mode
- llmcore: lazy-load mykeys/proxies via module __getattr__
- llmcore: fix auto_make_url regex for trailing slash cases
- llmcore: support temperature override from session config
- docs: update subagent.md with --verbose usage note
2026-04-12 14:29:32 +08:00
Liang Jiaqing
9e03a675ae
fix: MixinSession copy before override & add no-tools warning; refine subagent SOP
2026-04-11 18:50:26 +08:00
Liang Jiaqing
6f1585e88f
fix: comment out MixinSession copy to prevent tools not propagating to original session
2026-04-11 18:25:54 +08:00
Liang Jiaqing
de8adf76a9
feat: L4 session archiver + scheduler cron integration
...
- Add compress_session.py: compress raw model_responses into L4 archives
- Integrate 12h silent cron into scheduler.check() (runs before TASKS dir check)
- Whitelist compress_session.py in .gitignore (archives excluded)
- llmcore: refactor SSE warn handling, max_retries default 2->1
- scheduler: remove unused health_check(), INTERVAL 60->120
2026-04-11 15:55:35 +08:00
Liang Jiaqing
a737523f0a
llmcore: remove runtime model param, upgrade Claude beta headers, softer backoff
2026-04-11 14:34:27 +08:00
Liang Jiaqing
7bfd6e43e6
fix: _fix_messages for Claude API compliance, raw_ask simplify, no_tool orphan fix, summary extraction improvement
2026-04-11 13:24:33 +08:00
Jiaqing Liang
5a1d3a41da
fix: handle window object serialization in CDP bridge; improve file_write error msg; minor llmcore style cleanup
2026-04-10 14:04:41 +08:00
Liang Jiaqing
4b18ad683f
Fix tool call conversion and working checkpoint result
2026-04-10 10:45:00 +08:00
Liang Jiaqing
2977be33c6
Update llmcore truncation, adb_ui parsing, stapp regex, and subagent docs
2026-04-09 18:43:20 +08:00
Liang Jiaqing
6628a3d987
tune: read_timeout 60s, simphtml print/newTabs fix, remove obsolete iframe sop
2026-04-08 20:45:06 +08:00
Liang Jiaqing
b4741a9a39
Optimize: force aggressive tag compression before history truncation to save context
2026-04-06 22:50:23 +08:00
Liang Jiaqing
aae6d810cd
NativeClaudeSession: auto-convert OpenAI tools format to Claude; skip in NativeOAISession
2026-04-04 07:03:18 +08:00
Liang Jiaqing
14125ed57c
refactor: move tools ownership from NativeToolClient to Session layer
...
- tools state now held by Session (NativeClaudeSession.tools)
- MixinSession.__setattr__ broadcasts tools/system to all sub-sessions
- NativeToolClient no longer duplicates tools storage
- fix: use type(s) is instead of isinstance to avoid catching NativeOAISession subclass
2026-04-03 23:05:28 +08:00
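The broadcast and the exact-type check from 14125ed57c might look roughly like this. All class names below are stand-ins, and the inheritance relationship (an OpenAI-style session subclassing the Claude-style one) is an assumption inferred from the commit message rather than confirmed from the code.

```python
class ClaudeLikeSession:
    """Stand-in for a session that owns its tools and system prompt."""
    def __init__(self):
        self.tools = None
        self.system = None


class OAILikeSession(ClaudeLikeSession):
    """Stand-in subclass; assumed to inherit from the Claude-style session."""


class MixinLikeSession:
    """Forwards writes of selected attributes to every wrapped session."""
    _BROADCAST = {"tools", "system"}

    def __init__(self, sessions):
        # Bypass our own __setattr__ while the instance is being set up.
        object.__setattr__(self, "sessions", list(sessions))

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name in self._BROADCAST:
            for s in self.sessions:
                setattr(s, name, value)


def is_exactly_claude(session) -> bool:
    # isinstance() would also match OAILikeSession; an identity check on
    # the type matches only the base class itself.
    return type(session) is ClaudeLikeSession
```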
Liang Jiaqing
97abc43a40
refactor(simphtml): rewrite list detection & cutlist for multi-list support
...
- simphtml: replace center-point ancestor-chain approach with global container scan;
support multiple lists per page; add container-scoped selector prefixes;
inline FAKE ELEMENT hints with hidden item previews; remove findMainContent
- ga: hot-reload simphtml on each web_scan; fix file_read total_lines for keyword search;
add errors='replace' for global_mem encoding safety
- llmcore: stabilize NativeClaude session/device IDs across requests;
rename no_system_prompt to fake_cc_system_prompt; deep-copy message content
- launch: adjust window width 700->600
2026-04-03 19:38:51 +08:00
Liang Jiaqing
da40ba413b
feat: fold_turns UI folding + relaxed output limits + compressed history/key_info tags + larger context_win
2026-04-03 10:43:15 +08:00