#1025pending
2026-07-04T00:11:06+08:00
A follow-up next step is available
tool: importer
A next-step marker (next/下一步) was detected in the session; open transcript.md for context.
#1024tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"python3 - <<'PY'\nimport urllib.request,json,base64\nrepos=['xpzouying/x-mcp','aicu-icu/xhs-mcp-server','yxwxl/login-state-reader','Panniantong/Agent-Reach','chubbyguan/chubbyskills']\nfor repo in repos:\n print('\\n===',repo,'===')\n try:\n meta=json.load(urllib.request.urlopen(urllib.request.Request(f'https://api.github.com/repos/{repo}',headers={'User-Agent':'Hermes-Agent'}),timeout=20))\n print('stars',meta.get('stargazers_count'),'lang',meta.get('language'),'updated',meta.get('updated_at'),'license',(meta.get('license') or {}).get('spdx_id'))\n print('desc',meta.get('description'))\n for name in ['README.md','README.zh-CN.md','README_CN.md','readme.md']:\n try:\n c=json.load(urllib.request.urlopen(urllib.request.Request(f'https://api.github.com/repos/{repo}/contents/{name}',headers={'User-Agent':'Hermes-Agent'}),timeout=20))\n txt=base64.b64decode(c['content']).decode('utf-8','replace')\n print('readme',name,'len',len(txt))\n keys=['小红书','xhs','xiaohongshu','插件','extension','Chrome','浏览器','cookie','评论','comment','导出','export','MCP','userscript','油猴']\n
...[truncated 410 chars]
#1023tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"python3 - <<'PY'\nimport urllib.request, urllib.parse, json, base64, time\nqueries=['xiaohongshu browser extension crawler','小红书 浏览器插件 采集 github','小红书 插件 导出 评论 github','xhs chrome extension crawler','xiaohongshu userscript export comments']\nfor q in queries:\n print('\\n===',q,'===')\n url='https://api.github.com/search/repositories?'+urllib.parse.urlencode({'q':q,'sort':'stars','order':'desc','per_page':10})\n req=urllib.request.Request(url,headers={'User-Agent':'Hermes-Agent','Accept':'application/vnd.github+json'})\n try:\n data=json.load(urllib.request.urlopen(req,timeout=20))\n for it in data.get('items',[])[:10]:\n print(f\"{it['stargazers_count']:>6}★ {it['full_name']} | {it.get('language')} | {it['html_url']}\")\n print(' ',(it.get('description') or '')[:180])\n except Exception as e: print('ERR',repr(e))\n time.sleep(1)\nPY","timeout":180}
#1022user
2026-07-04T00:11:06+08:00
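For this collection step, is there a local browser extension that can do it? Once it has collected automatically, the data would then get cleaned up.
tool: weixin
For this collection step, is there a local browser extension that can do it? Once it has collected automatically, the data would then get cleaned up.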
#1021assistant
2026-07-04T00:11:06+08:00
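Boss, I checked the current open-source ecosystem on GitHub. Ready-made solutions exist, and some are already more mature than writing our own Playwright scripts from scratch.
tool: assistant
Boss, I checked the current open-source ecosystem on GitHub. Ready-made solutions exist, and some are already more mature than writing our own Playwright scripts from scratch.
## Conclusion
**The project most worth borrowing from is `MediaCrawler`, followed by the browser-extension version of `xpzouying/xiaohongshu-mcp` (the x-mcp approach).**
For the "local browser simulating a real person clicking and browsing" idea you described, the open-source community has already converged on a clear, mature direction:
> **Local real Chrome + CDP connection + reuse of the existing login state/cookies + slow sampling + results saved locally + analysis by Hermes.**
It is not a traditional headless crawler.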
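To make the "local real Chrome + CDP connection" step concrete: a Chrome instance started with `--remote-debugging-port` serves a small HTTP endpoint, `/json/version`, whose JSON body includes a `webSocketDebuggerUrl`. The sketch below only discovers that URL using the standard library. The port `9222` and the helper names are assumptions for illustration; this is not code from MediaCrawler or x-mcp.

```python
import json
import urllib.request


def parse_ws_endpoint(version_json: str) -> str:
    """Extract the CDP WebSocket URL from a /json/version response body."""
    info = json.loads(version_json)
    url = info.get("webSocketDebuggerUrl")
    if not url:
        raise ValueError(
            "no webSocketDebuggerUrl found; was Chrome started with "
            "--remote-debugging-port?"
        )
    return url


def discover_ws_endpoint(host: str = "127.0.0.1", port: int = 9222,
                         timeout: float = 5.0) -> str:
    """Ask a locally running Chrome for its CDP WebSocket endpoint.

    The host and port are assumptions; use whatever port Chrome was
    actually launched with.
    """
    url = f"http://{host}:{port}/json/version"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_ws_endpoint(resp.read().decode("utf-8"))
```

The returned URL is the kind of endpoint an automation library can attach to (for example Playwright's `connect_over_cdp`), which is how an existing profile's login state gets reused instead of starting a fresh headless browser.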
---
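## Candidate projects
### 1. `NanmiCoder/MediaCrawler`: the most mature
GitHub: `NanmiCoder/MediaCrawler`
Stars: about **55k**
Tech: Python + Playwright / CDP
Supports: Xiaohongshu search, notes, comments, second-level comments (replies), creator profile pages, login-state caching, proxy pool
Its README makes one key point:
> By default it uses **CDP mode to connect to the user's existing Chrome browser**, so it can reuse the browser's existing login state, cookies, extensions, and so on, which lowers the risk of triggering the platform's risk-control detection.
This matches your idea exactly.
**I consider it the first-choice reference.**
Pros:
- Largest community, so the most pitfalls have already been hit
- Supports Xiaohongshu comments
- Supports login-state caching
- Supports CDP connection to an existing browser
- No JS reverse engineering needed; it obtains signatures through the browser context
- Closer to a real user's environment than headless on a VPS
Cons:
- It is still a scraping tool; it does not make the account "absolutely safe"
- Configuration and dependencies are heavier than a small tool
- Better suited to running on your local computer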
---
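### 2. `persist-1/Visual_MediaCrawler`: the visual version
GitHub: `persist-1/Visual_MediaCrawler`
Stars: about 64
Positioning: a front-end/back-end visual project built on MediaCrawler
Supports: Xiaohongshu search, detail pages, creator data collection
Pros:
- Has a UI, so it suits use beyond the pure command line
- Possibly a better fit for manual control on your local machine
Cons:
- Much smaller community
- Essentially still a wrapper around MediaCrawler, so its stability depends on upstream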
---
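### 3. `yangsijie666/xiaohongshu-crawler`: Playwright stealth crawler
GitHub: `yangsijie666/xiaohongshu-crawler`
Stars: few, about 8
Tech: Playwright + playwright-stealth + browserforge
Supports: search, note details, comment collection, JSON/Excel export, MCP service
Pros:
- A close match for "Xiaohongshu + Playwright + comments + MCP"
- Explicitly documents two-layer anti-detection, login-state persistence, and crash recovery
Cons:
- Few stars; the project is less mature than MediaCrawler
- Useful as a design reference, but not recommended as the main tool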
---
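### 4. `xpzouying/xiaohongshu-mcp` / `x-mcp`: the MCP ecosystem route
GitHub: `xpzouying/xiaohongshu-mcp`
Stars: about **14k**
We already tried this one on the VPS, where it ran into risk-control trouble. But its README now highlights a new direction:
> x-mcp browser-extension version: zero configuration, install the extension and it works.
This idea suits us well: **the browser extension gets the data locally, and Hermes/MCP only receives the results.**
Pros:
- MCP integration is a natural fit for Hermes
- The extension version is closer to a local, real-user environment than a server deployment
Cons:
- The extension still needs to be verified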
...[truncated 1357 chars]
#1020tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"python3 - <<'PY'\nimport urllib.request, json, base64\nrepos=['NanmiCoder/MediaCrawler','persist-1/Visual_MediaCrawler','yangsijie666/xiaohongshu-crawler','xpzouying/xiaohongshu-mcp','iFurySt/RedNote-MCP','Panniantong/Agent-Reach','aicu-icu/xhs-mcp-server','panda-lsy/XHS-HotCommentsCrawler']\nfor repo in repos:\n print('\\n===',repo,'===')\n try:\n meta=json.load(urllib.request.urlopen(urllib.request.Request(f'https://api.github.com/repos/{repo}',headers={'User-Agent':'Hermes-Agent'}),timeout=20))\n print('stars',meta.get('stargazers_count'),'lang',meta.get('language'),'updated',meta.get('updated_at'),'license',(meta.get('license') or {}).get('spdx_id'))\n print('desc',meta.get('description'))\n for name in ['README.md','README_CN.md','readme.md']:\n try:\n req=urllib.request.Request(f'https://api.github.com/repos/{repo}/contents/{name}',headers={'User-Agent':'Hermes-Agent'})\n c=json.load(urllib.request.urlopen(req,timeout=20))\n txt=base64.b64decode(c['content']).decode('utf-8','replace')\n low=txt.lower()\n hits=[]\n for kw in ['小红书',
...[truncated 789 chars]
#1019tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/browser-data-extraction/references/xiaohongshu-extraction.md","limit":100}
#1018tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/hermes-agent-configuration/references/xhs-cli-alternatives.md","limit":140}
#1017tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"python3 - <<'PY'\nimport urllib.request, urllib.parse, json, time\nqueries = [\n 'xiaohongshu crawler browser playwright',\n '小红书 爬虫 playwright',\n 'xiaohongshu mcp',\n 'xhs crawler comments',\n '小红书 评论 采集 开源',\n]\nfor q in queries:\n url='https://api.github.com/search/repositories?'+urllib.parse.urlencode({'q':q,'sort':'stars','order':'desc','per_page':8})\n print('\\n===',q,'===')\n req=urllib.request.Request(url, headers={'Accept':'application/vnd.github+json','User-Agent':'Hermes-Agent'})\n try:\n with urllib.request.urlopen(req, timeout=20) as r:\n data=json.load(r)\n for it in data.get('items',[])[:8]:\n print(f\"{it['stargazers_count']:>6}★ {it['full_name']} | {it.get('language')} | {it['html_url']}\")\n print(' ', (it.get('description') or '')[:180])\n except Exception as e:\n print('ERR',repr(e))\n time.sleep(1)\nPY","timeout":180}
#1016tool
2026-07-04T00:11:06+08:00
Tool call: tool_slimmer_request_full_tools
tool: tool_slimmer_request_full_tools
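{"reason":"Need to search online for mature open-source solutions and community strategies for local-browser Xiaohongshu collection; the current tool set has no web/search tool."}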
#1015user
2026-07-04T00:11:06+08:00
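Everyone borrows from everyone else, so go check whether there is a ready-made, mature open-source solution.
tool: weixin
Everyone borrows from everyone else, so go check whether there is a ready-made, mature open-source solution.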
#1014assistant
2026-07-04T00:11:06+08:00
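Yes, and the local approach is actually the direction I would recommend more, Boss.
tool: assistant
Yes, and the local approach is actually the direction I would recommend more, Boss.
Conclusion: **"a local browser simulating a real person clicking and browsing" is safer than the VPS, but do not treat it as a cure-all. It reduces the risk of IP/device anomalies; it does not eliminate the risk from account behavior.**
I would split the options into three tiers:
## 1. Safest: real manual browsing locally + Hermes analysis
You search Xiaohongshu normally on your local machine, open posts, and read comments; then you give me the links, screenshots, copied comment text, or exported HTML/JSON.
Suited to: high-value decisions such as hospitals, doctors, and wisdom-tooth extraction.
Pros: most human-like, lowest account risk.
Cons: slow, and you have to take part in the sampling.
This actually fits your need well, because you do not need a site-wide crawl; you only need to find the "high-signal comments".
## 2. Second safest: slow automated browsing with local Playwright/Chrome
Run a script on your local computer using your real Chrome user profile:
- Use the real browser profile
- Not headless
- Mouse scrolling, dwell time, random waits
- Only 1–2 keywords per run
- Only a few comments per note
- Save results as local JSON/Markdown
- Hermes only reads the results for analysis
This is much better than the VPS, because the network, device, and login environment are all closer to your own.
But I would configure it very conservatively:
```text
Searches per round: 1 keyword
Notes opened per round: 3–5
Dwell time per note: 30–90 seconds
Comment sample: first 20–50
After each round: wait at least several hours
On captcha / abnormal prompt: stop immediately
```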
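The conservative limits above could be encoded as a small policy object so that a local script cannot drift past them. This is a minimal sketch under stated assumptions: the class name, field names, stop markers, and the 3-hour reading of "several hours" are all hypothetical, and no browser automation is performed here.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PacingPolicy:
    """Per-round limits mirroring the conservative settings listed above."""
    keywords_per_round: int = 1
    notes_per_round: Tuple[int, int] = (3, 5)      # min, max notes opened
    dwell_seconds: Tuple[int, int] = (30, 90)      # min, max dwell per note
    comments_per_note: Tuple[int, int] = (20, 50)  # min, max comments sampled
    min_gap_hours: float = 3.0                     # assumption for "several hours"


# Hypothetical markers; a real page may word its warnings differently.
STOP_MARKERS = ("验证码", "captcha", "账号异常")


def plan_round(policy: PacingPolicy, rng: random.Random) -> List[Dict[str, float]]:
    """Draw a randomized plan for one round: one entry per note to open."""
    note_count = rng.randint(*policy.notes_per_round)
    return [
        {
            "dwell_s": rng.uniform(*policy.dwell_seconds),
            "max_comments": rng.randint(*policy.comments_per_note),
        }
        for _ in range(note_count)
    ]


def should_stop(page_text: str) -> bool:
    """Return True if the page text contains any stop marker."""
    lowered = page_text.lower()
    return any(marker.lower() in lowered for marker in STOP_MARKERS)
```

Keeping the limits in a frozen dataclass means that loosening them requires an explicit, visible change rather than an ad hoc tweak inside the browsing loop.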
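## 3. Not recommended as the main path: VPS + residential proxy + automation
Feasible in theory, but less stable than local.
The reason is that even if the proxy looks residential, the VPS's browser fingerprint, behavioral rhythm, and cookie environment can still easily look unnatural. You have already received an account warning, so I do not recommend testing with the main account again.
## The workflow I actually recommend
```text
Local browser collection
→ save search results / comment text
→ send to Hermes
→ Hermes does:
tallying of doctor/hospital/procedure mentions
positive/negative reputation classification
filtering by ad-account / paid-shill traits
recommend / avoid list
```
In other words: **the local side is responsible for "looking like a person"; the VPS is responsible for "organizing like a machine".**
If we make this semi-automatic, I suggest first building a "local sampling assistant" rather than a crawler:
- Input: region, hospital, procedure, doctor name
- It opens the Xiaohongshu search page
- You scan the QR code / log in manually
- It slowly assists in opening candidate posts
- It automatically saves the page text, comments, and a summary of the author's profile page to local files
- Finally, you upload them to me for analysis
This path fits your scenario best: a bit slower, but low-risk, and what you get is decision-ready information.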
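The "save to local files" step of such a sampling assistant could be as simple as appending one JSON record per note to a JSON Lines file. A minimal sketch, assuming a hypothetical record layout and file name (`samples.jsonl`); how the page text and comments are obtained is deliberately left out.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def save_sample(out_dir: str, query: str, note_url: str, page_text: str,
                comments: List[str], author_summary: str = "") -> Path:
    """Append one sampled note as a JSON line and return the file path.

    The field names are illustrative, not a format Hermes requires.
    """
    record = {
        "sampled_at": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "note_url": note_url,
        "page_text": page_text,
        "comments": comments,
        "author_summary": author_summary,
    }
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "samples.jsonl"
    with path.open("a", encoding="utf-8") as handle:
        # ensure_ascii=False keeps Chinese text readable in the file.
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path
```

An append-only file keeps earlier samples intact if a round is stopped early, and each line can be analyzed independently.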
#1013tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/hermes-agent-configuration/references/xhs-cli-alternatives.md","limit":220}
#1012tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/browser-data-extraction/references/xiaohongshu-extraction.md","limit":120}
#1011user
2026-07-04T00:11:06+08:00
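So is there a local approach for Xiaohongshu? For example, a local browser simulating a real person clicking and browsing? Would that be safer, even if it is a bit slower?
tool: weixin
So is there a local approach for Xiaohongshu? For example, a local browser simulating a real person clicking and browsing? Would that be safer, even if it is a bit slower?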
#1010user
2026-07-04T00:11:06+08:00
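Never mind then, keep it observation-only. When I need something, I will proactively remind you to go look up the details.
tool: weixin
Never mind then, keep it observation-only. When I need something, I will proactively remind you to go look up the details.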
#1009tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"python3 - <<'PY'\nimport json, urllib.request\nbase='http://127.0.0.1:8888'\nqueries=['小红书 汕头 牛肉 MCP 警告 风控 截断','小红书 MCP 汕头牛肉火锅','xiaohongshu xhs 汕头 牛肉']\nfor q in queries:\n print('\\n=== QUERY',q,'===')\n for path, payload in [\n ('/v1/default/banks/default/recall', {'query': q, 'max_tokens': 1200}),\n ('/v1/default/banks/default/reflect', {'query': q, 'max_tokens': 1200}),\n ]:\n req=urllib.request.Request(base+path, data=json.dumps(payload).encode(), headers={'Content-Type':'application/json'})\n try:\n with urllib.request.urlopen(req, timeout=60) as r:\n txt=r.read().decode('utf-8','replace')\n print('---',path,'status ok len',len(txt))\n print(txt[:1500])\n except Exception as e:\n print('---',path,'ERR',repr(e))\nPY","timeout":180}
#1008tool
2026-07-04T00:11:06+08:00
Tool call: patch
tool: patch
{"mode":"replace","path":"/root/social-media/xiaohongshu-mcp/SKILL.md","old_string":"\n11. **Account-risk lesson from prior run**: A prior exploratory run around `汕头牛肉火锅` was cut off and the Xiaohongshu account received a platform warning. Treat Xiaohongshu as account-risky, not just IP-risky. Do not run bulk searches/comment scraping from the user's main account on the VPS. Default to low-frequency, human-in-the-loop sampling, search-result-only triage, and local/residential execution for any substantial research.\n\n### Account-Safe Operating Policy\n\nFor future Xiaohongshu research tasks:\n\n1. **No bulk crawling on the VPS by default** — VPS IPs are high-risk and can transfer risk to cookies/account.\n2. **Do not use the user's main account for automation-heavy collection** — use manual sampling, a dedicated low-value research account, or local device flow.\n3. **Keep queries short and sparse** — avoid long keyword chains and rapid variations. Use one compact query, inspect results, then decide the next query.\n4. **Prefer search-card triage before details** — first collect titles, author names, likes/comments counts, and URLs; only open details/comments for a small handpicked
...[truncated 590 chars]
#1007tool
2026-07-04T00:11:06+08:00
Tool call: session_search
tool: session_search
{"query":"小红书 汕头 牛肉 MCP 警告 风控 截断","limit":10,"sort":"newest","role_filter":"user,assistant,tool"}
#1006tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"/root/.hermes/scripts/hermes_config_doctor.sh 2>&1 | sed -n '1,220p'","timeout":60}
#1005tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/.hermes/hindsight/config.json","limit":300}
#1004tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/.hermes/CONFIG_OWNERSHIP.md","limit":300}
#1003tool
2026-07-04T00:11:06+08:00
Tool call: skill_view
tool: skill_view
{"name":"hindsight-ops"}
#1002user
2026-07-04T00:11:06+08:00
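Only write a skill for successful experience; failed experience does not need a skill. That said, since Xiaohongshu was mentioned, hindsight should have automatically recalled the Xiaohongshu history and injected it into the context; I should not have had to ask you.
tool: weixin
Only write a skill for successful experience; failed experience does not need a skill. That said, since Xiaohongshu was mentioned, hindsight should have automatically recalled the Xiaohongshu history and injected it into the context; I should not have had to ask you.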
#1001assistant
2026-07-04T00:11:06+08:00
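Boss, I checked the local records and session leads: there is indeed a history entry from **June 9, "Xiaohongshu/Amap (高德) MCP integration + Shantou beef hotpot"**. The local skill also already records several key lessons: accessing Xiaohongshu from the VPS browser triggers **IP at risk**, `get_feed_detail` is unstable, and overly long search terms tend to return empty results.
tool: assistant
Boss, I checked the local records and session leads: there is indeed a history entry from **June 9, "Xiaohongshu/Amap (高德) MCP integration + Shantou beef hotpot"**. The local skill also already records several key lessons: accessing Xiaohongshu from the VPS browser triggers **IP at risk**, `get_feed_detail` is unstable, and overly long search terms tend to return empty results.
But the core information you added this time matters more: **the account received a warning**. That means the question is not "can it technically be scraped" but that **account risk control has already been triggered**. So next time I will change strategy:
1. **No more bulk search/comment scraping with the main account + VPS**
- VPS IP risk carries over to the cookies/account.
- A platform like Xiaohongshu cannot be treated like an ordinary web-crawling target.
2. **Start with low-frequency search cards; do not go straight into comments in bulk**
- Take only the title, author, interaction counts, and link.
- Manually or semi-automatically pick 3–5 high-value notes, then read their comments.
- No more running many keyword variants in one round.
3. **Treat the comment section as "sampled research", not full scraping**
- For each request, first collect a small number of high-signal comments.
- On a captcha, abnormal login, or warning prompt, stop immediately and do not retry.
4. **Put the collection side on a local/residential network wherever possible**
- The VPS handles analysis; it does not take on risky collection.
- The safest setup: a local browser or local CLI exports the search results / comment cache, and Hermes reads the JSON/HTML to analyze doctor/shop reputation.
5. **The main account is for manual verification only, not automation**
- For long-term use, use a low-value research account or a dedicated collection account.
- For high-value decisions such as hospitals/doctors, a small, high-quality sample is better than burning the account.
I have already added this lesson to the local `xiaohongshu-mcp` skill: from now on it defaults to the "account safety policy" and no longer treats this as a pure IP anti-scraping problem.
My recommendation: **for the next Xiaohongshu request, Hermes acts only as the analysis hub; collection goes through a local/residential network, low-frequency and cache-first.** That is the safest.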
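The "search cards first, then hand-pick 3–5 notes" step in point 2 can be sketched as a small ranking function over already-collected cards. The card fields (`title`, `author`, `likes`, `comments`, `url`) and the thresholds are hypothetical; real search output may use different names.

```python
from typing import Dict, List


def triage_cards(cards: List[Dict], top_n: int = 5,
                 min_comments: int = 10) -> List[Dict]:
    """Shortlist notes worth opening, ranked by comment count, then likes.

    Cards with fewer than min_comments are dropped, because the useful
    signal in this workflow lives in the comment section.
    """
    eligible = [c for c in cards if c.get("comments", 0) >= min_comments]
    eligible.sort(
        key=lambda c: (c.get("comments", 0), c.get("likes", 0)),
        reverse=True,
    )
    return eligible[:top_n]


# Usage with made-up cards:
cards = [
    {"title": "A", "author": "u1", "likes": 120, "comments": 45, "url": "https://example.com/a"},
    {"title": "B", "author": "u2", "likes": 900, "comments": 3, "url": "https://example.com/b"},
    {"title": "C", "author": "u3", "likes": 60, "comments": 80, "url": "https://example.com/c"},
]
shortlist = triage_cards(cards, top_n=2)
```

The shortlist is meant as a suggestion for a human to confirm, which keeps the number of detail pages opened per round small.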
#1000tool
2026-07-04T00:11:06+08:00
Tool call: patch
tool: patch
{"mode":"replace","path":"/root/social-media/xiaohongshu-mcp/SKILL.md","old_string":"10. **Browser tool blocked**: Navigating to `xiaohongshu.com` via Hermes browser tools is blocked (\"IP at risk\"), even with headless Chromium. The xhs-mcp Go binary's internal rod-based browser is the only working path. Do NOT attempt browser_navigate to xiaohongshu.com.\n","new_string":"10. **Browser tool blocked**: Navigating to `xiaohongshu.com` via Hermes browser tools is blocked (\"IP at risk\"), even with headless Chromium. The xhs-mcp Go binary's internal rod-based browser is the only working path. Do NOT attempt browser_navigate to xiaohongshu.com.\n\n11. **Account-risk lesson from prior run**: A prior exploratory run around `汕头牛肉火锅` was cut off and the Xiaohongshu account received a platform warning. Treat Xiaohongshu as account-risky, not just IP-risky. Do not run bulk searches/comment scraping from the user's main account on the VPS. Default to low-frequency, human-in-the-loop sampling, search-result-only triage, and local/residential execution for any substantial research.\n\n### Account-Safe Operating Policy\n\nFor future Xiaohongshu research tasks:\n\n1. **No bulk crawling on the VP
...[truncated 1140 chars]
#999tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path":"/root/social-media/xiaohongshu-mcp/SKILL.md","limit":500}
#998tool
2026-07-04T00:11:06+08:00
Tool call: session_search
tool: session_search
{"query":"小红书 汕头 牛肉 警告 截断 MCP OR xhs OR xiaohongshu","limit":5,"sort":"newest"}
#997tool
2026-07-04T00:11:06+08:00
Tool call: tool_slimmer_request_full_tools
tool: tool_slimmer_request_full_tools
{"reason":"User explicitly asked to check Hindsight records about Xiaohongshu; hindsight_recall is not available in the trimmed tool list."}
#996tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command":"python3 - <<'PY'\nimport os, json\npaths=['/root/.xiaohongshu-cli/search_sessions.json','/root/.xiaohongshu-cli/cookies.json','/root/.hermes/xhs-data/cookies.json','/root/social-media/xiaohongshu-mcp/SKILL.md']\nfor p in paths:\n print('\\n---',p,'---')\n try:\n st=os.stat(p); print('exists size',st.st_size,'mtime',st.st_mtime)\n if p.endswith('search_sessions.json'):\n print(open(p,encoding='utf-8').read()[:2000])\n elif p.endswith('SKILL.md'):\n s=open(p,encoding='utf-8').read()\n for kw in ['警告','风控','截断','牛肉','汕头','captcha','type=216','rate']:\n if kw.lower() in s.lower(): print('has',kw)\n except Exception as e: print('ERR',e)\nPY"}
#995tool
2026-07-04T00:11:06+08:00
Tool call: search_files
tool: search_files
{"path":"/root","pattern":"小红书|xhs|xiaohongshu|牛肉|汕头|警告|截断|风控","target":"content","limit":50}
#994tool
2026-07-04T00:11:06+08:00
Tool call: skill_view
tool: skill_view
{"name":"browser-data-extraction"}
#993user
2026-07-04T00:11:06+08:00
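Check hindsight for records about Xiaohongshu. Our earlier Xiaohongshu MCP run got cut off shortly after it started on "which beef restaurant in Shantou is good", and the account also received a warning
tool: weixin
Check hindsight for records about Xiaohongshu. Our earlier Xiaohongshu MCP run got cut off shortly after it started on "which beef restaurant in Shantou is good", and the account also received a warning
Given that, how will you adjust next time?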
#992assistant
2026-07-04T00:11:06+08:00
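Boss, conclusion first: **for searching Xiaohongshu from this VPS, the most stable approach right now is not to keep adding proxies and "forcing through", but to use the three existing tools in layers and do the real "reputation mining" as filtering at the comment layer.**
tool: assistant
Boss, conclusion first: **for searching Xiaohongshu from this VPS, the most stable approach right now is not to keep adding proxies and "forcing through", but to use the three existing tools in layers and do the real "reputation mining" as filtering at the comment layer.**
You already have the full stack installed and do not need to look for new projects; the key is how to orchestrate it:
- **VPS fallback layer**: use the existing `xhs-mcp` HTTP interface (`localhost:18060`) for low-frequency search and for scraping comments on high-interaction notes. It goes through a headless browser; it is slow, but harder to block on the spot than calling the API directly.
- **Main search layer**: use `xiaohongshu-cli` or `redbook` for keyword search. They use the internal signed API and are fast, but a flagged VPS IP easily gets a `type=216` captcha. So this layer is best run on a clean IP / residential proxy / your own local machine.
- **Comment-layer analysis**: the core of your need was never the post body; it is the "real experience + warnings + recommendations" in the comment section. So the flow should be:
1. Search: `汕头 医院 拔智齿 医生推荐` (Shantou, hospital, wisdom-tooth extraction, doctor recommendation)
2. From the search results, pick notes with high interaction that look like real experiences
3. Scrape the comments and filter by "whether the account is an ad account"
4. Summarize into: **recommended doctors / doctors to avoid / signal strength**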
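Step 4 of the flow above (recommended / avoid / signal strength) can start as a plain keyword tally before anything more sophisticated is tried. This is only a sketch: the keyword lists and doctor names are made-up examples, and substring matching will miss sarcasm, nicknames, and mixed opinions.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Illustrative keyword lists; tune them against real comments.
NEGATIVE = ("避雷", "不推荐", "技术差")
POSITIVE = ("推荐", "技术好", "不疼")


def tally_mentions(comments: Iterable[str],
                   doctors: List[str]) -> Dict[str, Counter]:
    """Count positive / negative / neutral mentions per doctor name."""
    result = {name: Counter() for name in doctors}
    for text in comments:
        for name in doctors:
            if name not in text:
                continue
            # Check negatives first: "不推荐" contains "推荐".
            if any(word in text for word in NEGATIVE):
                result[name]["negative"] += 1
            elif any(word in text for word in POSITIVE):
                result[name]["positive"] += 1
            else:
                result[name]["neutral"] += 1
    return result


def signal_strength(counts: Counter) -> int:
    """Total number of mentions, used as a rough confidence indicator."""
    return sum(counts.values())
```

A doctor with two mentions and one with forty should not be presented with equal confidence, which is why the total is reported alongside the positive/negative split.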
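**Ad-account filtering strategy**, which can be implemented from the traits you observed:
- The account's profile posts are highly homogeneous (all reviews of the same department, all recommendations of the same hospital, or all carrying product links)
- The same account posts near-identical text under multiple notes
- New account / few followers but a high comment frequency
- Comment content contains obvious external links, "add me on WeChat", or booking pitches
These can be scored directly with rules; no large model is needed.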
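The four traits above map naturally onto an additive rule score. A minimal sketch, assuming the per-account statistics have already been collected; every field name, weight, and threshold here is a hypothetical starting point rather than a tuned value.

```python
import re
from dataclasses import dataclass


@dataclass
class Commenter:
    """Per-account statistics; all fields are assumed to be precomputed."""
    follower_count: int
    account_age_days: int
    comments_last_7d: int
    homepage_same_topic_ratio: float  # share of posts on one topic/hospital
    near_duplicate_comments: int      # near-identical comments under other notes


# Contact or booking pitches and external links inside a comment.
CONTACT_PATTERN = re.compile(r"(加微|微信|vx|预约|私信|https?://)", re.IGNORECASE)


def ad_score(user: Commenter, comment_text: str) -> int:
    """Sum rule weights; a higher score means more ad-like."""
    score = 0
    if user.homepage_same_topic_ratio >= 0.8:   # homogeneous profile
        score += 3
    if user.near_duplicate_comments >= 2:       # copy-pasted comments
        score += 3
    is_new_or_small = user.account_age_days < 30 or user.follower_count < 50
    if is_new_or_small and user.comments_last_7d >= 20:
        score += 2
    if CONTACT_PATTERN.search(comment_text):    # link or contact pitch
        score += 2
    return score


def is_probable_ad(user: Commenter, comment_text: str, threshold: int = 4) -> bool:
    """Flag a comment when its score reaches the threshold."""
    return ad_score(user, comment_text) >= threshold
```

With a threshold of 4, no single weak trait is enough to flag a comment; it takes one strong trait plus another signal, which limits false positives on genuine new users.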
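**Stability advice on the VPS side**:
- `xhs-mcp`'s detail API is slow and `xsec_token` expires easily, so do not fetch details note by note; `noteCard.interactInfo` in the search results is usually enough for initial screening
- On `IP at risk` / timeout, kill chrome and `xhs-mcp` outright, then switch cookies or wait a while before restarting; do not spin in retries
- Do not run batch jobs at high frequency on a single IP; that easily takes the cookies down with it
**Recent open-source strategies worth borrowing**:
- `jackwener/xiaohongshu-cli`: Gaussian jitter, captcha cooldown, exponential backoff
- `lucasygu/redbook`: its analysis module already has "viral-spread analysis, keyword matrix", suitable for quantifying "which doctors are frequently mentioned positively/negatively"
- `xpzouying/xhs-mcp`: HTTP mode is more reliable than stdio, suitable for scripting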
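The three pacing techniques named for `xiaohongshu-cli` (Gaussian jitter, captcha cooldown, exponential backoff) are generic and easy to illustrate. The sketch below is not that project's implementation; all names and default values are assumptions.

```python
import random


def jittered_delay(mean_s: float = 3.0, stddev_s: float = 1.0,
                   floor_s: float = 0.5, rng=random) -> float:
    """Gaussian jitter: a normally distributed delay, clamped to a floor."""
    return max(floor_s, rng.gauss(mean_s, stddev_s))


def backoff_delay(attempt: int, base_s: float = 5.0,
                  cap_s: float = 600.0) -> float:
    """Exponential backoff: base * 2**attempt, capped at cap_s."""
    return min(cap_s, base_s * (2 ** attempt))


class CaptchaCooldown:
    """Block all requests for a fixed period after a captcha is seen."""

    def __init__(self, cooldown_s: float = 1800.0) -> None:
        self.cooldown_s = cooldown_s
        self._blocked_until = 0.0

    def trip(self, now: float) -> None:
        """Record a captcha at time `now` (seconds, e.g. time.monotonic())."""
        self._blocked_until = now + self.cooldown_s

    def allows(self, now: float) -> bool:
        """Return True once the cooldown window has passed."""
        return now >= self._blocked_until
```

Passing `now` explicitly instead of reading the clock inside the class keeps the cooldown logic deterministic and easy to test.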
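If you like, my next step can be to turn this flow directly into a set of scripts / a workflow wired into your current Hermes toolchain: input "region + hospital + procedure" and get a short report of "doctors to avoid/recommend + evidence sources". Shall I?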
#991tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command": "ls -la /root/.xiaohongshu-cli /root/.hermes/xhs-data /root/social-media/xiaohongshu-mcp 2>/dev/null && printf '\\n--- check_xhs_config.py ---\\n' && sed -n '1,220p' /root/check_xhs_config.py 2>/dev/null"}
#990tool
2026-07-04T00:11:06+08:00
Tool call: read_file
tool: read_file
{"path": "/root/browser-data-extraction/references/xiaohongshu-extraction.md"}
#989tool
2026-07-04T00:11:06+08:00
Tool call: skill_view
tool: skill_view
{"file_path": "references/xhs-cli-alternatives.md", "name": "hermes-agent-configuration"}
#988tool
2026-07-04T00:11:06+08:00
Tool call: skill_view
tool: skill_view
{"file_path": "references/xhs-mcp-http-api.md", "name": "hermes-agent-configuration"}
#987tool
2026-07-04T00:11:06+08:00
Tool call: skill_view
tool: skill_view
{"name": "xhs-mcp-setup"}
#986tool
2026-07-04T00:11:06+08:00
Tool call: search_files
tool: search_files
{"path": "/root", "pattern": "xhs|xiaohongshu|redbook|crawl|scrape|proxy|反爬|小红书", "target": "content", "limit": 20}
#985tool
2026-07-04T00:11:06+08:00
Tool call: terminal
tool: terminal
{"command": "find /root -maxdepth 3 -type f \\( -name \"*.md\" -o -name \"*.py\" -o -name \"*.json\" \\) 2>/dev/null | grep -i -E \"(xhs|xiaohongshu|redbook|crawl|scrape|proxy)\" | head -30"}
#984tool
2026-07-04T00:11:06+08:00
Tool call: web_search
tool: web_search
{"query": "小红书 VPS 访问限制 解决方案 2025 2026 开源项目 反爬"}
#983tool
2026-07-04T00:11:06+08:00
Tool call: hindsight_recall
tool: hindsight_recall
{"query": "小红书 VPS 访问 限制 方案 开源 爬取 评论"}
#982user
2026-07-04T00:11:06+08:00
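Help me work out a solution for a need. I have found that Xiaohongshu in mainland China is a local-experience platform, and its useful information is usually not in the post itself but in the comment section below it. For example, when I go to a hospital to have a wisdom tooth extracted, I search related topics: region + hospital name + Dr. xxx / xxx procedure (wisdom-tooth extraction)
tool: weixin
Help me work out a solution for a need. I have found that Xiaohongshu in mainland China is a local-experience platform, and its useful information is usually not in the post itself but in the comment section below it. For example, when I go to a hospital to have a wisdom tooth extracted, I search related topics: region + hospital name + Dr. xxx / xxx procedure (wisdom-tooth extraction)
Then there are usually many users sharing first-hand experience, warning against or recommending various doctors. This helps me screen options very well and book appointments with some very good doctors. (For example, one search on the doctor I originally chose showed that his reputation was very poor and his skills rather weak. There were negative warning posts, and in the process I also found quite a few recommendations in the comment sections.)
Although there are quite a few ads, ad shills differ clearly from normal users (at least when you click through to an ad account's profile, you find it is uniformly sales posts or review posts on that topic; that kind of account is not a normal one).
Because access to Xiaohongshu from the VPS seems to be restricted, do you have a good solution? Or are there recent strategies from the community or open-source projects worth borrowing?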
#981session
2026-07-04T00:11:06+08:00
Imported Hermes session: solutions for restricted Xiaohongshu access from a VPS
tool: importer
Imported from Hermes session 20260703_055856_2f2594d3 · weixin
inputs
outputs
[
"overview.md",
"artifacts/transcript.md",
"artifacts/process-summary.md"
]
files
[
"/root/aliyun-bailian-proxy/SKILL.md",
"/root/hermes-agent-configuration/references/xhs-mcp-http-api.md",
"/root/hermes-agent-configuration/references/aliyun-model-proxy.md",
"/root/hermes-agent-configuration/references/xhs-mcp-setup.md",
"/root/hermes-agent-configuration/references/xhs-cli-alternatives.md",
"/root/browser-data-extraction/references/xiaohongshu-extraction.md",
"/root/social-media/xiaohongshu-mcp/SKILL.md",
"/root/.xiaohongshu-cli/cookies.json",
"/root/.xiaohongshu-cli/search_sessions.json",
"/root/hermes_config_backups/20260624_223531_before_remove_cliproxyapi/CONFIG_OWNERSHIP.md",
"/root/hermes_config_backups/20260623_175247_before_cliproxyapi_aux_fast/CONFIG_OWNERSHIP.md",
"/root/.hermes/scripts/temp_oauth_connect_proxy.py",
"/root/.hermes/temp-oauth-proxy/proxy.py",
"/root/.hermes/xhs-data/cookies.json",
"/root/check_xhs_config.py",
"/root/aliyun-model-proxy-deploy.md",
"/root/test_xhs_mcp.py",
"/root/test_xhs_simple.py",
"/root/devops/proxy-client-configuration/SKILL.md",
"/root/devops/caddy-reverse-proxy/SKILL.md",
"/root/marketing_management_public_case_pool.md",
"/root/hermes-skills-overview.html",
"/root/racknerd-final.yaml",
"/root/import_sessions_to_hindsight.py",
"/root/devops/hermes-dashboard-maintenance/SKILL.md",
"/root/.hermes/bin/xhs-mcp",
"/root/.hermes/xhs-data",
"/root/.cache/rod/browser/chromium-1321438/chrome",
"/root/.cache/rod/browser/chromium-",
"/root/social-media/xiaohongshu-mcp",
"/root/.xiaohongshu-cli",
"/root/.hermes/config.yaml",
"/root/.hermes/scripts/product-hunt-daily.py",
"/root/.hermes/skills/browser-data-extraction",
"/root/devops/sing-box-ops/SKILL.md",
"/root/marketing_management_public_case_deepened.md",
"/root/book_knowledge/source/自媒体与内容生产/马督工自媒体教程/08-第六讲",
"/root/book_knowledge/source/自媒体与内容生产/马督工自媒体教程/04-第二讲",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/6.新质供应链_来自大湾区制造业的启示.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/12.领导力.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/7.战略模拟与行为反思.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/13.产业互联网平台化战略-产业重做一遍的方法论.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/3.供应链战略与决策分析.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/9.创新与创业.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/11.财务报告.md",
"/root/book_knowledge/source/商学院案例库/中欧商学院案例合集/1.产业发展与竞争分析.md",
"/root/ceibs_ceo_succession_v3_selfmedia.md",
"/root/ceibs_ceo_succession_selfmedia.md",
"/root/ceibs_ceo_succession_two_versions_comparison.md",
"/root/ceo_succession_public_cases_madugong_version.md",
"/root/.hermes/skills/book-derived/decode-od/chapters/ch19.md",
"/root/ceo_succession_original_case_public_audit.md",
"/root/ceo_succession_case_public_audit/original_a_version.txt",
"/root/ceo_succession_source_chain_case_pool.md",
"/root/marketing_public_case_replacement_pool.md",
"/root/ceo_succession_public_case_source_expansion.md",
"/root/.hermes/skills/writing-skills/public-case-source-replacement/SKILL.md",
"/root/ceibs_self_media_articles/中欧商学院案例_营销管理_自媒体稿.md",
"/root/.hermes/skills/book-derived/decode-od/chapters/ch18.md",
"/root/.hermes/skills/book-derived/madugong-self-media/SKILL.md",
"/root/weixin_file_archive/2026/07/01/014812_99bcc773_【话题】为什么创始人明明想交班",
"/root/marketing_management_public_case_original_vs_rewrite.md",
"/root/backups/runs/20260701_033059",
"/root/.config/rclone/rclone.conf",
"/root/.hermes/.env",
"/root/.hermes/temp-oauth-proxy/creds.env",
"/root/project-a",
"/root/.hermes/skills/media-case-map/SKILL.md",
"/root/.hermes/skills/collaborative-media-case-map/SKILL.md",
"/root/media_case_map_professional/output/专业人才陈皮书_自媒体案例地图与知识型稿件.md",
"/root/cstd_casebook/final/专业人才陈皮书_道法术器增补版.pdf",
"/root/cstd_casebook/final/CSTD八类原始转写稿_MD格式_按分类.zip",
"/root/media_case_map_professional/professional_pdf.txt",
"/root/media_case_map_professional/output/send/专业人才陈皮书_自媒体案例地图_参考内训师样式.html",
"/root/media_case_map_professional/output/send/专业人才陈皮书_知识型自媒体稿件.md",
"/root/.hermes/skills/social-media/xiaohongshu-mcp/scripts/inject-cookie.py",
"/root/.hermes/skills/social-media/xiaohongshu-mcp/templates/xhs-mcp.service",
"/root/.hermes/scripts/start-xhs-mcp.sh",
"/root/.hermes/hindsight/config.json",
"/root/docker-compose.yml",
"/root/hindsight-docker.env",
"/root/.hermes/CONFIG_OWNERSHIP.md",
"/root/.hermes/skills/hermes-agent/hindsight-ops/scripts/hindsight_ops.py",
"/root/hindsight-data",
"/root/.hermes/skills/hermes-agent-configuration/references/hindsight-bank-config-api.md",
"/root/.hermes/skills/hermes-agent-configuration/references/hindsight-bank-configuration.md",
"/root/.hermes/skills/hermes-agent-configuration/references/hindsight-performance-tuning.md",
"/root/.hermes/skills/hermes-agent-configuration/references/hindsight-pipeline-ops.md",
"/root/.hermes/skills/hermes-agent/hindsight-ops",
"/root/.hermes",
"/root/.hermes/auth.json",
"/root/weixin_file_archive",
"/root/.hermes/cache/documents",
"/root/weixin_file_archive/index.jsonl",
"/root/.hermes/memories/MEMORY.md",
"/root/.hermes/memories/USER.md",
"/root/.hermes/SOUL.md",
"/root/.hindsight/profiles/hermes.env",
"/root/.hermes/scripts/set-hermes-secret.sh",
"/root/.hermes/scripts/hermes_config_doctor.sh",
"/root/hindsight-docker.env.",
"/root/.local/bin/hermes",
"/root/.hermes/logs/gateway.log",
"/root/.hermes/logs/agent.log",
"/root/.config/systemd/user/hermes-gateway.service",
"/root/.local/bin/uv",
"/root/.cache/uv/archive-v0/LPWA6EWrRIreH1NY/bin/python",
"/root/.cache/uv/archive-v0/LPWA6EWrRIreH1NY/bin/amap-mcp-server",
"/root/.cache/uv/archive-v0/EatXorTLuHpK-0iT/bin/python",
"/root/.cache/uv/archive-v0/EatXorTLuHpK-0iT/bin/xhs-mcp",
"/root/.hermes/config.yaml.bak.20260606_071437",
"/root/.hermes/health_check_report.txt",
"/root/backups/hindsight/2026-06-23.sql.gz",
"/root/backups/hindsight/",
"/root/.hermes/cron/jobs.json",
"/root/.hermes/terminal_audit.log-45-",
"/root/.hermes/terminal_audit.log-46-",
"/root/.hermes/terminal_audit.log",
"/root/.hermes/terminal_audit.log-48-",
"/root/.hermes/terminal_audit.log-49-",
"/root/.hermes/terminal_audit.log-51-",
"/root/.hermes/terminal_audit.log-52-",
"/root/.hermes/state.db",
"/root/.hermes/terminal_audit.log-65-",
"/root/.hermes/terminal_audit.log-66-",
"/root/.hermes/terminal_audit.log-68-",
"/root/.hermes/terminal_audit.log-69-"
]