Only use a proxy for tgapp when mykeys['proxy'] is explicitly configured, avoiding the dead default local proxy and restoring Telegram polling startup.
Closes#175
subprocess.run was already patched with CREATE_NO_WINDOW, but Popen and
os.startfile were unprotected. Agent code could open visible GUI windows
via subprocess.Popen(['notepad.exe']) or os.startfile().
The proxy was hardcoded to http://127.0.0.1:2082 which breaks for users
without a local proxy (e.g. international users with direct access).
Now defaults to None; users who need a proxy can set it in mykey.py.
- Add /llm command to list available models
- Add /llm [n] command to switch to specific model
- Improve /status command to show current LLM info with emoji indicators
- Update /help text to include /llm command
- Unify command parsing logic using parts and op variables
This brings fsapp.py (Feishu/Lark) frontend in line with other chat frontends (QQ, DingTalk, WeCom) that already support all slash commands.
When the user runs '/continue N' in stapp, the agent's in-memory context
is restored, but the UI previously showed only a single '✅ restored' line
— all prior chat bubbles were missing.
This change parses the target session log and reconstructs the
user/assistant message pairs into st.session_state.messages, so reopening
a session feels like the conversation was never interrupted.
* continue_cmd.py: add extract_ui_messages(path)
- parses model_responses log into [{role, content}, ...]
- groups multi-turn LLM calls (prompts whose text starts with the
'### [WORKING MEMORY]' header) into a single assistant bubble,
inserting the existing '**LLM Running (Turn N) ...**' marker so
fold_turns() renders them as collapsible segments.
- two small helpers (_user_text / _assistant_text) keep parsing local.
* stapp.py: in the /continue branch, resolve the target log path BEFORE
calling handle_frontend_command (which snapshots the current log and
would otherwise shift list_sessions indices), then replace
session_state.messages with the reconstructed history on success.
Falls back to the previous behavior for bare /continue or failure.
Co-authored-by: wjl2023 <wjl2023@users.noreply.github.com>
- llmcore: prefix all error outputs with !!! marker
- ga: detect !!!Error: [SSL: in tail of content as incomplete response and retry
- add len(content)>100 guard to avoid false positives on short responses
Rework the Feishu frontend so each user turn renders as a single
collapsible task card that patches itself in place, replacing the
dq-based streaming path that produced many fragmented messages.
- One _TaskCard per turn; hook reacts to summary / exit_reason events
from the agent loop and patches the same card.
- Each step is a foldable panel: header shows the summary, expanding
reveals three sections (auto-hidden when empty):
* Thinking - from response.thinking (separate field, not content)
* Tool Calls - tool name + truncated JSON args
* Output - response.content, with protocol tags stripped so
the header summary is not duplicated inside
- Final reply rendered as a schema 2.0 markdown card for consistency.
- Code-review pass per code_review_principles.md:
* _TaskCard owns only stateful card lifecycle (start/step/done/fail)
* Pure formatting extracted to module-level _build_step_detail and
_fmt_tool_call (no more reaching into card._private from the hook)
* Hook is a ~10-line dispatcher
* Flattened a 4-level nested lambda into a named function
Anthropic's extended thinking streaming protocol emits two delta types
for a single thinking block: `thinking_delta` (the textual reasoning)
and `signature_delta` (a base64 HMAC tag appended at the end of the
block). Both must be accumulated into the same `content_block`.
Current code only handles `thinking_delta`, so `signature_delta` events
are silently dropped. When the assistant's reply (with thinking) is
echoed back on the next turn, Anthropic's server validates the
signature and rejects the request with 400:
"Invalid `signature` in `thinking` block"
Downstream effects observed in production (via sub2api relay logs):
- Every request with history triggers a 400 signature error
- The relay strips thinking blocks and retries, which changes the
cache prefix and invalidates prompt caching, forcing a full rebuild
of cache_creation_tokens (~20k-30k per affected request)
- Measured in a 5h window: 5/25 requests suffered cache invalidation,
accounting for 53.5% of total spend that was otherwise avoidable
Fix:
1. Initialize `current_block` with an empty `signature` field when a
thinking block starts, so the dict shape matches Anthropic's spec
(`{type, thinking, signature}`).
2. Handle `signature_delta` events by appending `delta.signature` to
`current_block["signature"]`. Using `+=` (rather than assignment)
mirrors how `thinking_delta` is accumulated and is robust against
future chunked signatures.
No behavior change for clients that disable extended thinking, or for
upstreams that don't emit `signature_delta`. For `tool_use` threads
that require valid thinking signatures to preserve reasoning context,
this fix is required — the previous behavior silently corrupted them.
Verification:
- Before fix: upstream returns 400 + retry; cache_creation_tokens
spike to ~25k on every 4th-5th request in a conversation
- After fix: upstream accepts the first attempt; cache_read_tokens
dominate, cache_creation_tokens only appear on the first request
of a fresh 5m prompt-cache window