From 5a9736b7f0d4ce4447471f79f14d67956b7a7dd3 Mon Sep 17 00:00:00 2001
From: Jinyi Han <15224975562@163.com>
Date: Sat, 14 Mar 2026 00:52:51 +0800
Subject: [PATCH] update the readme

---
 README.md                      | 621 ++++++++++++++++++---------------
 assets/images/bar.png          | Bin 0 -> 4535965 bytes
 assets/images/logo.jpg         | Bin 0 -> 282423 bytes
 assets/images/wechat_group.jpg | Bin 0 -> 136061 bytes
 assets/images/workflow.jpg     | Bin 0 -> 171553 bytes
 5 files changed, 342 insertions(+), 279 deletions(-)
 create mode 100644 assets/images/bar.png
 create mode 100644 assets/images/logo.jpg
 create mode 100644 assets/images/wechat_group.jpg
 create mode 100644 assets/images/workflow.jpg

diff --git a/README.md b/README.md
index bd7fe4a..a617e6b 100644
--- a/README.md
+++ b/README.md
@@ -1,92 +1,118 @@
-# GenericAgent — 3,300 Lines to Full OS Autonomy
+
+![]() "Order me a milk tea" — navigates a delivery app, picks items, and checks out. |
-![]() "Find GEM stocks with EXPMA golden cross, turnover > 5%" — quantitative screening via mootdx. |
-
![]() Autonomous web exploration — browses and summarizes on its own schedule. |
-![]() "Find expenses over ¥2K in the past 3 months" — drives Alipay on a phone via ADB. |
-![]() WeChat batch messaging — yes, it can drive WeChat too. |
-
+| *"Order me a milk tea"* — Navigates the delivery app, selects items, and completes checkout automatically. | *"Find GEM stocks with EXPMA golden cross, turnover > 5%"* — Screens stocks with quantitative conditions. |
+| 🌐 Autonomous Web Exploration | 💰 Expense Tracking | 💬 Batch Messaging |
+|:---:|:---:|:---:|
+| Autonomously browses and periodically summarizes web content. | *"Find expenses over ¥2K in the last 3 months"*: drives Alipay via ADB. | Sends bulk WeChat messages, fully driving the WeChat client. |
-1. You ask it to do something new
-2. It figures out how (install dependencies, write scripts, test)
-3. It saves the procedure as a new SOP in its memory
-4. Next time, it recalls and executes directly
+---
-The agent doesn't just execute — it **learns and remembers**.
+## 📅 Latest News
-## Quick Start
+- **2026-03-10:** [Released million-scale Skill Library](https://mp.weixin.qq.com/s/q2gQ7YvWoiAcwxzaiwpuiQ?scene=1&click_id=7)
+- **2026-03-08:** [Released "Dintal Claw" — a GenericAgent-powered government affairs bot](https://mp.weixin.qq.com/s/eiEhwo-j6S-WpLxgBnNxBg)
+- **2026-03-01:** [GenericAgent featured by Jiqizhixin (机器之心)](https://mp.weixin.qq.com/s/uVWpTTF5I1yzAENV_qm7yg)
+- **2026-01-11:** GenericAgent V1.0 public release
-> 💡 **Windows beginners**: never heard of Python? [Download the portable version](http://kw.fudan.edu.cn/resources/PC-Agent-Portable.zip) (19MB, unzip and run)
+---
+
+## 🚀 Quick Start
+
+#### Method 1: Standard Installation
```bash
-# 1. Clone
+# 1. Clone the repo
git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
-# 2. Install minimal deps
+# 2. Install minimal dependencies
pip install streamlit pywebview
-# 3. Configure API key
+# 3. Configure API Key
cp mykey_template.py mykey.py
-# Edit mykey.py with your LLM API key
+# Edit mykey.py and fill in your LLM API Key
# 4. Launch
python launch.pyw
```
-## QQ Bot (Optional)
+#### Method 2: Windows Portable Version (Recommended for beginners)
-QQ support uses `qq-botpy` over WebSocket, so no public webhook is required.
+[Download portable version](http://kw.fudan.edu.cn/resources/PC-Agent-Portable.zip) (19MB, unzip and run)
+
+Full guide: [WELCOME_NEW_USER.md](WELCOME_NEW_USER.md)
+
+#### Method 3: Android (Termux)
+
+```bash
+cd /sdcard/ga
+python agentmain.py
+```
+
+---
+
+## 🤖 Bot Interfaces (Optional)
+
+### QQ Bot
+
+Uses `qq-botpy` over a long-lived WebSocket connection, so **no public webhook is required**:
```bash
pip install qq-botpy
```
-Then add these fields to `mykey.py` or `mykey.json`:
+Add to `mykey.py`:
```python
qq_app_id = "YOUR_APP_ID"
@@ -94,216 +120,232 @@ qq_app_secret = "YOUR_APP_SECRET"
qq_allowed_users = ["YOUR_USER_OPENID"] # or ['*'] for public access
```
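
As a rough illustration of how an allow-list such as `qq_allowed_users` could be interpreted, the sketch below treats `'*'` as "everyone" and otherwise requires an exact match. `is_allowed` is a hypothetical helper written for this example, not a function from the repository, and the actual check in `qqapp.py` may differ.

```python
def is_allowed(user_id, allowed_users):
    """Return True if user_id may talk to the bot (hypothetical helper).

    Assumption: '*' in the list means public access, as the config comment suggests.
    IDs are compared as strings so that string openids and numeric IDs both work.
    """
    allowed = [str(u) for u in allowed_users]
    return "*" in allowed or str(user_id) in allowed


if __name__ == "__main__":
    qq_allowed_users = ["YOUR_USER_OPENID"]  # placeholder value from mykey.py
    print(is_allowed("YOUR_USER_OPENID", qq_allowed_users))
    print(is_allowed("SOMEONE_ELSE", qq_allowed_users))
```

Comparing IDs as strings is a design choice made here so the same helper would cover list entries written either as quoted openids or as unquoted numeric IDs.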
-Run QQ directly:
-
```bash
python qqapp.py
-```
-
-Or start it together with the desktop window:
-
-```bash
+# or launch together with the desktop floating window
python launch.pyw --qq
```
-Notes:
-- Create the bot at [QQ Open Platform](https://q.qq.com)
-- In sandbox mode, add your own QQ account to the message list first
-- After the first inbound message, the user's openid will be written to `temp/qqapp.log`
+> Create a bot at the [QQ Open Platform](https://q.qq.com) to get the AppID / AppSecret. After the first inbound message, the user's openid is written to `temp/qqapp.log`, ready to copy into `qq_allowed_users`.
-## Feishu / WeCom / DingTalk (Optional)
+---
-Feishu:
+### Lark (Feishu)
```bash
pip install lark-oapi
-python fsapp.py
-# or
-python launch.pyw --feishu
+python fsapp.py # or python launch.pyw --feishu
```
-Config keys in `mykey.py` / `mykey.json`:
-
```python
fs_app_id = "cli_xxx"
fs_app_secret = "xxx"
fs_allowed_users = ["ou_xxx"] # or ['*']
```
-Current Feishu support in this repo:
-- inbound: text, post rich text, image, file, audio, media, interactive/share cards
-- images are sent to multimodal-capable OpenAI-compatible backends as true image inputs on the first turn
-- outbound: interactive progress cards, uploaded image replies, uploaded file/media replies
+**Inbound support**: text, rich text post, images, files, audio, media, interactive cards / share cards
+**Outbound support**: streaming progress cards, image replies, file / media replies
+**Vision model**: Images are sent as true multimodal input to OpenAI Vision-compatible backends on the first turn
-Detailed setup guide: `assets/SETUP_FEISHU.md`
+Full setup: [assets/SETUP_FEISHU.md](assets/SETUP_FEISHU.md)
-WeCom:
+---
+
+### WeCom (Enterprise WeChat)
```bash
pip install wecom_aibot_sdk
-python wecomapp.py
-# or
-python launch.pyw --wecom
+python wecomapp.py # or python launch.pyw --wecom
```
-Config keys:
-
```python
wecom_bot_id = "your_bot_id"
wecom_secret = "your_bot_secret"
-wecom_allowed_users = ["your_user_id"] # or ['*']
-wecom_welcome_message = "Hello"
+wecom_allowed_users = ["your_user_id"]
+wecom_welcome_message = "Hello, I'm online."
```
-DingTalk:
+---
+
+### DingTalk
```bash
pip install dingtalk-stream
-python dingtalkapp.py
-# or
-python launch.pyw --dingtalk
+python dingtalkapp.py # or python launch.pyw --dingtalk
```
-Config keys:
-
```python
dingtalk_client_id = "your_app_key"
dingtalk_client_secret = "your_app_secret"
dingtalk_allowed_users = ["your_staff_id"] # or ['*']
```
-**Also runs on Android** — tested successfully on Termux with `python agentmain.py` (CLI frontend):
+---
+
+### Telegram Bot
+
+```python
+# mykey.py
+tg_bot_token = 'YOUR_BOT_TOKEN'
+tg_allowed_users = [YOUR_USER_ID]
+```
```bash
-# In Termux
-cd /sdcard/ga
-python agentmain.py
+python tgapp.py
```
-Once running, tell the agent: *"Execute web setup SOP to unlock browser tools"* — it handles the rest. See [WELCOME_NEW_USER.md](WELCOME_NEW_USER.md) for the full bootstrap sequence.
-
-## vs. Alternatives
-
-| | GenericAgent | OpenClaw | Claude Code |
-|---|---|---|---|
-| Codebase | ~3,300 lines | ~530,000 lines | Open-source (large) |
-| Deploy | `pip install` + API key | Multi-service orchestration | CLI + subscription |
-| Browser | Injects into real browser (keeps login state) | Sandboxed/headless | Via MCP plugins |
-| OS Control | Keyboard, mouse, vision, ADB | Multi-agent delegation | File + terminal |
-| Self-evolution | Grows SOPs & tools autonomously | Plugin ecosystem | Stateless per session |
-| Core shipped | 10 .py + 5 SOPs | Hundreds of modules | Rich CLI toolkit |
-
-## How It Works
-
-```
-User instruction
- ↓
-┌─────────────────────┐
-│ agent_loop.py (92L) │ ← Sense-Think-Act cycle
-│ "What do I know? │
-│ What should I do?" │
-└────────┬────────────┘
- ↓
-┌─────────────────────┐
-│ 7 Atomic Tools │ ← All capabilities derive from these
-│ code_run │ Execute any Python/PowerShell
-│ file_read/write │ Direct disk access
-│ file_patch │ Surgical code edits
-│ web_scan │ Read live web pages
-│ web_execute_js │ Control browser DOM
-│ ask_user │ Human-in-the-loop
-└────────┬────────────┘
- ↓
-┌─────────────────────┐
-│ Memory System │ ← Persistent across sessions
-│ L0: Meta-SOP │ How to manage memory itself
-│ L2: Global Facts │ Environment, credentials, paths
-│ L3: Task SOPs │ Learned procedures (self-growing)
-└─────────────────────┘
-```
-
-The agent starts with 7 primitive tools. Through `code_run`, it can install packages, write scripts, and interface with any hardware or API — effectively manufacturing new tools at runtime.
-
-
+
+
+
+| *"Order me a milk tea"*: navigates the delivery app, selects items, and completes checkout automatically | *"Find GEM stocks with EXPMA golden cross, turnover > 5%"*: screens stocks by quantitative conditions |
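-1. You ask it to do something new
-2. It figures out how on its own (installs dependencies, writes scripts, tests)
-3. It saves the procedure as a new SOP
-4. Next time, it calls the SOP directly
+
-The agent doesn't just execute; it **learns and remembers**.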
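+| 🌐 Autonomous Web Exploration | 💰 Expense Tracking | 💬 Batch Messaging |
+|:---:|:---:|:---:|
+| Autonomously browses and periodically summarizes web content | *"Find expenses over ¥2K in the last 3 months"*: drives Alipay via ADB | Sends WeChat messages in bulk, fully driving the WeChat client |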
-## Quick Start
+---
+
+## 📅 Latest News
+
+- **2026-03-10:** [Released million-scale Skill Library](https://mp.weixin.qq.com/s/q2gQ7YvWoiAcwxzaiwpuiQ?scene=1&click_id=7)
+- **2026-03-08:** [Released "Dintal Claw", a GenericAgent-powered government affairs bot](https://mp.weixin.qq.com/s/eiEhwo-j6S-WpLxgBnNxBg)
+- **2026-03-01:** [GenericAgent featured by Jiqizhixin (机器之心)](https://mp.weixin.qq.com/s/uVWpTTF5I1yzAENV_qm7yg)
+- **2026-01-11:** GenericAgent V1.0 public release
+
+---
+
+## 🚀 Quick Start
+
+#### Method 1: Standard Installation
```bash
-# 1. Clone
+# 1. Clone the repo
git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
@@ -312,168 +354,189 @@ pip install streamlit pywebview
# 3. Configure API Key
cp mykey_template.py mykey.py
-# Edit mykey.py with your LLM API key
+# Edit mykey.py and fill in your LLM API Key
# 4. Launch
python launch.pyw
```
-**Also runs on Android**: tested successfully on Termux with `python agentmain.py` (CLI frontend):
+#### Method 2: Windows Portable Version (Recommended for beginners)
+
+[Download portable version](http://kw.fudan.edu.cn/resources/PC-Agent-Portable.zip) (19MB, unzip and run)
+
+Full guide: [WELCOME_NEW_USER.md](WELCOME_NEW_USER.md)
+
+#### Method 3: Android (Termux)
```bash
-# In Termux
cd /sdcard/ga
python agentmain.py
```
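-Once running, tell the agent: *"Execute web setup SOP to unlock browser tools"* and it handles the rest. See [WELCOME_NEW_USER.md](WELCOME_NEW_USER.md) for the full bootstrap sequence.
+---
-## QQ Bot (Optional)
+## 🤖 Bot Interfaces (Optional)
-QQ support uses `qq-botpy` over a long-lived WebSocket connection, so no public webhook is required.
+### QQ Bot
+
+Uses `qq-botpy` over a long-lived WebSocket connection, so **no public webhook is required**: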
```bash
pip install qq-botpy
```
-Then add these fields to `mykey.py` or `mykey.json`:
+Add to `mykey.py`:
```python
qq_app_id = "YOUR_APP_ID"
qq_app_secret = "YOUR_APP_SECRET"
-qq_allowed_users = ["YOUR_USER_OPENID"] # or ['*'] for public access
+qq_allowed_users = ["YOUR_USER_OPENID"] # or ['*'] for public access
```
-How to run:
-
```bash
python qqapp.py
-```
-
-Or start it together with the desktop floating window:
-
-```bash
+# or launch together with the desktop floating window
python launch.pyw --qq
```
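-Notes:
-- Create the bot at the [QQ Open Platform](https://q.qq.com) and obtain its `AppID` / `AppSecret`
-- In sandbox mode, add your own QQ account to the message list first
-- After you first message the bot, the user's openid is logged in `temp/qqapp.log`, ready to copy into `qq_allowed_users`
+> Create a bot at the [QQ Open Platform](https://q.qq.com) to get the AppID / AppSecret. After the first message, the user's openid is logged in `temp/qqapp.log`.
-## Feishu / WeCom / DingTalk (Optional)
+---
-Feishu:
+### Lark (Feishu)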
```bash
pip install lark-oapi
-python fsapp.py
-# or
-python launch.pyw --feishu
+python fsapp.py # or python launch.pyw --feishu
```
-Config keys:
-
```python
fs_app_id = "cli_xxx"
fs_app_secret = "xxx"
fs_allowed_users = ["ou_xxx"] # or ['*']
```
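-Current Feishu support in this repo:
-- Inbound: text, rich text post, images, files, audio, media, interactive cards / share cards
-- On the first turn, images are sent as true multimodal image input to model backends that support OpenAI-compatible vision
-- Outbound: streaming progress cards, image replies, file or media replies
+**Inbound support**: text, rich text post, images, files, audio, media, interactive cards / share cards
+**Outbound support**: streaming progress cards, image replies, file / media replies
+**Vision model**: Images are sent as true multimodal input to OpenAI Vision-compatible backends on the first turn
-Detailed setup guide: `assets/SETUP_FEISHU.md`
+Full setup: [assets/SETUP_FEISHU.md](assets/SETUP_FEISHU.md)
-WeCom (Enterprise WeChat):
+---
+
+### WeCom (Enterprise WeChat)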
```bash
pip install wecom_aibot_sdk
-python wecomapp.py
-# or
-python launch.pyw --wecom
+python wecomapp.py # or python launch.pyw --wecom
```
-Config keys:
-
```python
wecom_bot_id = "your_bot_id"
wecom_secret = "your_bot_secret"
-wecom_allowed_users = ["your_user_id"] # or ['*']
+wecom_allowed_users = ["your_user_id"]
wecom_welcome_message = "Hello, I'm online."
```
-DingTalk:
+---
+
+### DingTalk
```bash
pip install dingtalk-stream
-python dingtalkapp.py
-# or
-python launch.pyw --dingtalk
+python dingtalkapp.py # or python launch.pyw --dingtalk
```
-Config keys:
-
```python
dingtalk_client_id = "your_app_key"
dingtalk_client_secret = "your_app_secret"
dingtalk_allowed_users = ["your_staff_id"] # or ['*']
```
-## Comparison
+---
-| | GenericAgent | OpenClaw | Claude Code |
-|---|---|---|---|
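-| Codebase | ~3,300 lines | ~530,000 lines | Open-source (large) |
-| Deploy | `pip install` + API key | Multi-service orchestration | CLI + subscription |
-| Browser | Injects into real browser (keeps login state) | Sandboxed/headless browser | Via MCP plugins |
-| OS Control | Keyboard, mouse, vision, ADB | Multi-agent delegation | File + terminal |
-| Self-evolution | Grows SOPs and tools autonomously | Plugin ecosystem | Stateless across sessions |
-| Core shipped | 10 .py + 5 SOPs | Hundreds of modules | Rich CLI toolkit |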
+### Telegram Bot
-## How It Works
+```python
+# mykey.py
+tg_bot_token = 'YOUR_BOT_TOKEN'
+tg_allowed_users = [YOUR_USER_ID]
+```
-The agent has 7 atomic tools: `code_run` (execute arbitrary code), `file_read/write/patch` (file operations), `web_scan` (web page perception), `web_execute_js` (browser control), and `ask_user` (human-in-the-loop).
+```bash
+python tgapp.py
+```
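-Through `code_run`, it can install any package, write any script, and interface with any hardware, effectively manufacturing new tools at runtime. Learned procedures are saved as SOPs and called directly next time.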
+---
-The core loop is only 92 lines (`agent_loop.py`): sense → think → act → remember.
+## 📊 Comparison with Similar Products
-
+