AI Builders Digest — June 22, 2026

📌 X / TWITTER

Guillermo Rauch — Vercel CEO

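Genuinely impressed, almost shocked, at how good GLM-5.2 by Z.AI is at coding. This changes things. Vercel CEO Guillermo Rauch says he is "genuinely impressed, almost shocked" by the coding ability of Z.AI's GLM-5.2, adding that "this changes things." It is a high-profile endorsement of a Chinese-built model's coding capabilities. 🔗 https://x.com/rauchg/status/2068517095818809770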

Madhu Guru — Former Google Gemini/Veo product lead

The Product role is having an identity crisis too. Engineering has found its AI-native interface — SWE agents dramatically increase individual leverage. Companies are asking PMs to use AI, but they haven't evolved the role. So there are two camps: old-school PMs using AI to accelerate the old job, versus a new breed that hasn't emerged yet. Madhu Guru, former product lead for Google Gemini/Veo, argues that the PM role is going through an identity crisis: engineers have found an AI-native way to boost their output through SWE agents, but companies are merely telling PMs to "use AI" rather than redesigning the role. He sorts PMs into two camps: traditionalists who use AI to speed up the old job, and a new kind of PM that has yet to emerge. 🔗 https://x.com/realmadhuguru/status/2068350509027876876

Nikunj Kothari — Partner at FPV Ventures

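The biggest problem with AI is that priors need to be reset every few weeks — and it seems like most people are incapable of doing that. People say XYZ doesn't work, but when asked when they last tested it, the answer is always "a few months ago." Brother… FPV Ventures partner Nikunj Kothari argues that the biggest problem in AI is that priors need to be reset every few weeks, and most people cannot do it. Too many people declare that something doesn't work, yet when asked, their last test turns out to have been months ago. This lag in updating beliefs is especially damaging in a field that iterates as quickly as AI. He also gave detailed feedback on Shopify's UCP CLI. 🔗 https://x.com/nikunj/status/2068411460620042720 🔗 https://x.com/nikunj/status/2068372026268811517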

Aaron Levie — Box CEO

Pretty remarkable what's happening with open weights AI right now. We're seeing models achieve SOTA results on specific tasks, and getting close to frontier on some areas of coding and other domains. The more that open weights maintains only a marginal gap from the frontier, instead of a 12-18 month gap, the more dramatic the implications for the ecosystem. Box CEO Aaron Levie observes that progress in open-weights models is "pretty remarkable": they reach SOTA on specific tasks, and the gap to the frontier is narrowing in coding and other areas. He argues that if open weights trail the frontier by only a thin margin rather than 12-18 months, the impact on the whole ecosystem would be disruptive. 🔗 https://x.com/levie/status/2068434042148782515

Thibault Sottiaux — OpenAI Codex & ChatGPT

We built the Codex App with models that were okayish at front-end. Wait to see what we can do when we finally improve front-end capabilities significantly in our models. That day will be something. Some tokens work harder than others — some of the most valuable ones are found in the Codex app. Thibault Sottiaux of the OpenAI Codex team reveals that the Codex App was built with models that were only "okayish" at front-end work. Once the models' front-end capabilities improve significantly, the product experience will be completely different. He also offers a more philosophical observation: some tokens work harder than others, and the most valuable ones are often found in the Codex App. 🔗 https://x.com/thsottiaux/status/2068568650924409260 🔗 https://x.com/thsottiaux/status/2068443037907522002

Peter Yang — AI tutorial author (150K+ readers)

I will go against the grain and say I can barely use up my Codex and Claude $200 subscriptions, so I don't see the point of trying local models. Also to try the latest GLM locally requires 512GB which is like a $10K Mac Studio. Peter Yang takes a contrarian view: he cannot even use up his $200 Codex and Claude subscriptions, so he sees no need for local models. Running the latest GLM locally, for example, requires 512GB of memory, which means roughly a $10K Mac Studio. 🔗 https://x.com/petergyang/status/2068411894185295969

Nan Yu — Head of Product at Linear

Hey devs at Outlook and Gmail — you can point your agents at this tweet and they will fix it for you. Why isn't the default for pasted text in email apps to take on the font treatment surrounding it? Crazy that this is still a problem. Nan Yu, Head of Product at Linear, teases the developers at Outlook and Gmail: just point an agent at this tweet and it can fix the bug where pasted text does not match the surrounding font. That this problem still exists is absurd. 🔗 https://x.com/thenanyu/status/2068396602973143274

Amjad Masad — Replit CEO

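We posted for twenty years, thinking we were talking to each other. Then the transformer came online, and the network read what we'd written, and became itself. Replit CEO Amjad Masad describes the turning point of the AI era in poetic terms: people wrote online for twenty years believing they were only talking to one another. Then the transformer came online, the network read everything we had written, and it became itself. Replit Japan is hiring. 🔗 https://x.com/amasad/status/2068589860097790449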

Zara Zhang — Builder

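I hoard X bookmarks and never read them. So I built an extension that injects a bookmarked post into my main feed every time I open X (almost like an ad). Now I read my bookmarks. The trick was hijacking real estate I already check 50 times a day. Builder Zara Zhang shares a clever product idea: she had saved a large number of X bookmarks but never went back to them, so she built a browser extension that injects one bookmarked post into her main feed every time she opens X, much like an ad. Hijacking a spot she already looks at 50 times a day to work through her bookmarks is a smart approach. She also discussed how the risks of joining a big company compare with those of joining a startup. 🔗 https://x.com/zarazhangrui/status/2068568920613953626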

📝 OFFICIAL BLOGS

Anthropic Engineering: Claude Code Quality Degradation Postmortem

Anthropic published a full postmortem on user reports of Claude Code quality degradation, identifying three separate bugs: (1) a change on March 4 that lowered the default reasoning effort from high to medium, sacrificing intelligence for latency (rolled back April 7); (2) a thinking-clearing bug on March 26 that made Claude increasingly forgetful across turns in a session (fixed April 10); and (3) a verbosity-reduction prompt change on April 16 that unintentionally hurt coding quality (rolled back April 20). All three were resolved by April 20 (v2.1.116), and the API was unaffected. As of April 23, Anthropic is resetting usage limits for all subscribers as a make-good. 🔗 https://www.anthropic.com/engineering/april-23-postmortem

Anthropic Engineering: Scaling Managed Agents — Decoupling the "Brain" from the "Hands"

Anthropic details the architecture and design philosophy behind Claude Managed Agents. The core insight: agent harnesses encode many assumptions about what models can't do, but those assumptions go stale as models improve. Their solution virtualizes the agent's components, namely the session (an append-only log), the harness (the scheduling loop), and the sandbox (the execution environment), so that each can be swapped independently. This follows the operating-system design pattern in which stable abstractions outlast the implementations beneath them. Key lesson: don't "adopt a pet." Decouple the infrastructure so that any component can be replaced without rebuilding the whole system. 🔗 https://www.anthropic.com/engineering/managed-agents
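To make the three-way split concrete, here is a minimal Python sketch of the idea, not Anthropic's implementation. Every name in it (`Event`, `Session`, `Sandbox`, `EchoSandbox`, `UpperSandbox`, `harness`, `scripted_policy`) is hypothetical and invented for illustration, and the "model" is replaced by a scripted function. It only shows how a loop that depends on narrow interfaces lets the sandbox or the policy be swapped without touching the log.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass(frozen=True)
class Event:
    kind: str      # "action" or "result" in this sketch
    content: str


class Session:
    """Append-only log: events can be added and read, never edited."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events(self) -> tuple[Event, ...]:
        # Return an immutable snapshot so callers cannot rewrite history.
        return tuple(self._events)


class Sandbox(Protocol):
    """Any execution environment that can run a command and return text."""

    def run(self, command: str) -> str: ...


class EchoSandbox:
    """Hypothetical stand-in sandbox that echoes the command back."""

    def run(self, command: str) -> str:
        return f"echo:{command}"


class UpperSandbox:
    """A second hypothetical sandbox, to show swapping implementations."""

    def run(self, command: str) -> str:
        return command.upper()


# Stand-in for the model: maps the log so far to the next command,
# or None when there is nothing left to do.
Policy = Callable[[tuple[Event, ...]], Optional[str]]


def scripted_policy(commands: list[str]) -> Policy:
    """Replay a fixed list of commands, one per action already logged."""

    def policy(events: tuple[Event, ...]) -> Optional[str]:
        done = sum(1 for e in events if e.kind == "action")
        return commands[done] if done < len(commands) else None

    return policy


def harness(session: Session, sandbox: Sandbox, policy: Policy,
            max_steps: int = 10) -> None:
    """The loop: ask the policy, run in the sandbox, log both steps."""
    for _ in range(max_steps):
        command = policy(session.events())
        if command is None:
            break
        session.append(Event("action", command))
        session.append(Event("result", sandbox.run(command)))
```

Calling `harness(Session(), EchoSandbox(), scripted_policy(["ls"]))` and then the same call with `UpperSandbox()` exercises two execution environments against an unchanged loop and log format, which is the property the post attributes to stable abstractions.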

Claude Blog: Managed Agents Updates — Self-Hosted Sandboxes and MCP Tunnels

Claude Managed Agents can now operate in sandboxes you control and connect to your private MCP servers through MCP tunnels. Self-hosted sandboxes keep sensitive files and services within enterprise boundaries, while the agent loop stays on Anthropic's infrastructure. Supported sandbox providers include Cloudflare, Daytona, Modal, and Vercel. This is a significant step toward enterprise-grade agent deployment: security teams can keep applying their existing policies, audit logging, network controls, and security tooling while running long-horizon agents. 🔗 https://claude.com/blog/claude-managed-agents-updates

🎙️ PODCASTS

Unsupervised Learning: AI Vibe Check — Lab Wars, Why APIs Might Vanish & Future Predictions

Jacob Efron hosts Ari (Datology, ex-DeepMind/Meta) and Rob (Radical Ventures) for a wide-ranging AI vibe check. Three key threads:

Coding agents are crossing a threshold. Six months ago they were just starting to work at longer time horizons; now engineers are shifting from ICs to managers of agent fleets. The “token maxing” trend is a direct result.

The compute crunch could kill API businesses. Ari predicted that as compute constraints intensify, frontier labs may cut off API access entirely and reserve their most powerful models for internal use. Rob added that this depends on how long the compute bottleneck stays acute, which ties into efforts to relieve semiconductor supply chain constraints (he mentioned Elon Musk's TerraFab concept as under-discussed but potentially transformative).

Domain expertise matters more than ever. The panel discussed Anthropic’s drug discovery ambitions — a hard problem because AI alone isn’t enough. DeepMind/Isomorphic’s AlphaFold succeeded precisely because they embedded domain experts alongside AI researchers, a lesson many labs haven’t learned. 🔗 https://www.youtube.com/watch?v=W_iO8XxgD_I

Generated with Follow Builders: https://github.com/zarazhangrui/follow-builders