Anthropic 官方指南:怎麼給 Agent 設計工具
整理版優先睇
設計 Agent 工具嘅核心原則:要貼合模型能力,定期審視並迭代,學會「像 Agent 一樣睇」。
呢篇文章翻譯自 Anthropic 官方博客,作者 Thariq Shihipar 係 Claude Code 團隊嘅工程師。佢分享咗喺設計 Agent 工具嗰陣嘅經驗同教訓,特別係點樣隨住模型能力提升去迭代工具。成篇文章嘅核心結論係:工具設計要貼合模型嘅能力形狀,需要反覆實驗同觀察輸出,學會「像 Agent 一樣睇」。
文章通過三個具體案例說明呢個原則:AskUserQuestion 工具嘅三次嘗試(從修改參數到獨立工具)、從 TodoWrite 到 Task 嘅迭代(線性清單變協作任務圖)、同埋用漸進式披露(Progressive disclosure)替代加新工具嘅做法。重點係要定期審視現有工具係咪仍然必要,避免工具反過來限制模型。
另外,文章強調咗一個重要概念:漸進式披露。通過比 Agent 自己構建上下文,而唔係被動接收上下文,可以大幅提升效率。Claude Code 目前有大約20個工具,團隊經常檢討係咪全部需要。加新工具嘅門檻好高,因為每多一個選項,模型就要多考慮一個可能性。
- 工具設計要貼合模型能力,要定期審視工具係咪仍然必要,避免限制模型。
- 設計工具時,獨立工具(如 AskUserQuestion)比修改現有工具或改變輸出格式更可靠、更容易控制。
- 隨住模型能力提升,曾經有用嘅工具可能變成絆腳石(如 TodoWrite 需轉為 Task)。
- 漸進式披露係一種唔加新工具就能擴展 Agent 能力嘅有效方法。
- 多實驗、觀察輸出、試新嘢,學會「像 Agent 一樣睇」係設計工具嘅關鍵心法。
設計工具嘅核心原則:像 Agent 一樣睇
構建 Agent 最難嘅部分之一,係設計佢嘅工具集。Claude 完全透過工具調用嚟行動,但喺 Claude API 入面,工具可以用 bash、skills、代碼執行等基礎原語嚟構建。
你要比 Agent 嘅工具,應該貼合佢自身嘅能力形狀
點樣先知模型嘅能力係咩?你要觀察佢、讀佢嘅輸出、反覆實驗。你學會「像 Agent 一樣睇」。呢個係一個有用嘅設計框架。
AskUserQuestion 工具:三次嘗試嘅教訓
為咗提升 Claude 向用戶提問嘅能力,團隊做咗三次嘗試。雖然 Claude 可以用純文字提問,但回答體驗好差,耗時太多。點樣降低摩擦、提升溝通帶寬?
- 1 第一次:修改 ExitPlanTool,加一個參數同時輸出計劃同問題。結果 Claude 好睏惑,唔知點處理矛盾。
- 2 第二次:改變輸出格式,叫 Claude 用特殊 Markdown 格式提問。Claude 大部分時候做到,但唔穩定,會加多餘句子或者漏選項。
- 3 第三次:做一個獨立嘅 AskUserQuestion 工具,Claude 可以隨時調用,特別喺規劃模式引導使用。工具觸發後彈 modal 顯示問題,阻塞 Agent 循環直到用戶回答。
呢個工具引導 Claude 輸出結構化內容,確保比用戶多個選項,而且 Claude 好鍾意調用,輸出質量好好。
再好嘅工具設計,如果模型唔理解點樣調用,都係白搭
從 Todo 到 Task:跟能力迭代
Claude Code 剛上線時,模型需要一個待辦清單保持專注。團隊做咗 TodoWrite 工具,開工前列好待辦,做完一項勾一項。但 Claude 成日忘記要做乜,於是每隔5輪對話插一條系統提醒。
隨着模型迭代,Todo 列表開始礙事
系統提醒令 Claude 覺得一定要跟清單行,唔敢中途調整方向。Opus 4.5 用子 Agent 能力提升,但多個子 Agent 點樣共享一個 Todo 列表?
隨着模型能力提升,曾經需要嘅工具可能反過來限制佢。定期回頭審視「呢啲工具係咪仲有必要」好重要,亦係點解建議只支援少量能力相近嘅模型。
漸進式披露:唔加工具,擴展能力
團隊做過最有影響力嘅工具,係比 Claude 自己揾上下文。早期用 RAG 預先索引代碼庫,自動檢索片段塞比 Claude,但上下文係被動接受,唔係 Claude 自己揾嘅。
如果 Claude 能搜網頁,點解唔可以搜代碼庫?
比 Claude 一個 Grep 工具,等佢自己搜檔案、自己構建上下文。Agent Skills 上線後,將呢個思路正式化為漸進式披露:讓 Agent 透過探索逐步發現相關上下文。Claude 而家可以讀 Skill 檔案,遞歸發現同加載上下文,做到多層嵌套搜索。
- 一個常見用法:用 Skills 增加搜索能力,例如教 Claude 點樣調 API 或查數據庫。
- Claude Code Guide 係另一個例子:用子 Agent 喺自己上下文搜文檔,只傳返答案,主 Agent 上下文保持乾淨。
唔加新工具,就能擴展 Agent 嘅能力範圍
Claude Code 目前有大約20個工具,團隊經常審視係咪全部需要。加新工具嘅門檻好高,因為每多一個選項,模型就要多考慮一個可能性。
BLOG
呢篇文章翻譯自 Anthropic 官方博客「Seeing like an agent: how we design tools in Claude Code」,作者 Thariq Shihipar,Claude Code 團隊工程師,今日發佈
以下係逐段中英對照翻譯
整 Agent 最難嘅部分之一:設計工具
One of the hardest parts about building an agent harness is constructing its tools.
整 Agent harness 最困難嘅部分之一,就係設計佢嘅工具集
Claude acts completely through tool calling, but there are a number of ways tools can be constructed in the Claude API with primitives like bash, skills and code execution.
Claude 完全透過 tool calling 嚟行動。喺 Claude API 入面,工具可以用 bash、skills、code execution 呢啲基本原語嚟整
So how do you design your agents' tools? Do you give it one general-purpose tool like bash or code execution? Or fifty specialized tools, one for each use case?
咁你點樣幫 Agent 設計工具?俾佢一個通用工具(好似 bash 或者 code execution)就夠?定係整五十個專用工具,每個場景一個?
To put yourself in the mind of the model, imagine being given a difficult math problem. What tools would you want in order to solve it? It would depend on your own skill set!
要企喺模型嘅角度諗呢個問題,可以想像你面前有一條好難嘅數學題。你想要乜嘢工具嚟解決佢?答案取決於你自己嘅能力
Paper would be the minimum, but you'd be limited by manual calculations. A calculator would be better, but you would need to know how to operate the more advanced options. The fastest and most powerful option would be a computer, but you would have to know how to use it to write and execute code.
一張紙係最低配置,但係你只能手算。計數機好啲,但你就要識得用進階功能。最快最勁嘅選擇係電腦,但你要識得用佢嚟寫同執行 code
This is a useful framework for designing your agent. You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.
呢個係一個好有用嘅設計框架。你要俾 Agent 嘅工具,應該要啱佢自身能力嘅形狀。但你點知佢嘅能力係乜?你觀察佢,讀佢嘅輸出,反覆實驗。你學會「好似 Agent 咁睇」
If you're building an agent, you'll face the same questions we did: when to add a tool, when to remove one, and how to tell the difference. Here's how we've answered them while building Claude Code, including where we got it wrong first.
如果你喺度做 Agent,你會面對同我哋一樣嘅問題:幾時加工具,幾時刪工具,點樣區分呢兩種情況。下面係我哋喺 Claude Code 嘅實際經驗,包括一開始做錯嘅地方
用 AskUserQuestion 工具嚟改善提問能力

三種方案嘅光譜:由無結構到過度剛性,AskUserQuestion 工具喺中間
When building the AskUserQuestion tool, our goal was to improve Claude's ability to ask questions (often called elicitation).
設計 AskUserQuestion 工具嘅時候,我哋嘅目標係提升 Claude 向用戶提問嘅能力(通常叫做 elicitation)
While Claude could just ask questions in plain text, we found answering those questions felt like they took an unnecessary amount of time. How could we lower this friction and increase the bandwidth of communication between the user and Claude?
雖然 Claude 可以用純文字提問,但我哋發現回答呢啲問題嘅體驗好差,花咗太多時間。點樣降低呢個摩擦,提升用戶同 Claude 之間嘅溝通頻寬?
第一次嘗試:修改 ExitPlanTool
The first approach we tried was adding a parameter to the ExitPlanTool to have an array of questions alongside the plan. This was the easiest fix to implement, but it confused Claude because we were simultaneously asking for a plan and a set of questions about the plan. What if the user's answers conflicted with what the plan said? Would Claude need to call the ExitPlanTool twice? We knew this tactic wouldn't work, so we went back to the drawing board.
我哋第一個方案係俾 ExitPlanTool 加一個參數,等佢喺輸出計劃嘅同時輸出一組問題。呢個係最簡單嘅改法,但搞到 Claude 好亂:我哋同時要求佢做計劃同對計劃提問。如果用戶嘅答覆同計劃有矛盾咁點?Claude 係咪要 call 兩次呢個工具?我哋知道呢個方法唔掂,於是返返去起點
第二次嘗試:改變輸出格式
Next, we tried updating Claude's output instructions to serve a slightly modified markdown format that it could use to ask questions. For example, we could ask it to output a list of bullet point questions with alternatives in brackets. We could then parse and format that question as UI for the user.
接下來,我哋嘗試修改 Claude 嘅輸出指令,等佢用一種特別嘅 Markdown 格式嚟提問。例如,叫佢用 bullet point 列出問題,每個問題後面用方括號俾選項。然後前端解析呢個格式,整成 UI 俾用戶
Claude could usually produce this format, but not reliably. It would append extra sentences, drop options, or abandon the structure altogether. Onto the next approach.
Claude 大部分時候可以生成呢個格式,但唔穩定。佢會喺尾度加多一句,漏咗選項,或者直頭唔用呢個格式。下一個方案
第三次嘗試:AskUserQuestion 工具

AskUserQuestion 工具嘅實際界面
Finally, we landed on creating a tool that Claude could call at any point, but it was particularly prompted to do so during plan mode. When the tool triggered we would show a modal to display the questions and block the agent's loop until the user answered.
最終方案係做一個獨立嘅工具,Claude 可以隨時 call,但喺規劃模式入面會特別引導佢用。工具觸發之後會彈出一個 modal 顯示問題,block 住 Agent 循環直到用戶答完
This tool allowed us to prompt Claude for a structured output and it helped us ensure that Claude gave the user multiple options. It also gave users ways to compose this functionality, for example calling it in the Agent SDK or using referring to it in skills.
呢個工具等我哋可以引導 Claude 輸出結構化內容,確保俾用戶多個選項。佢亦都俾用戶組合使用嘅空間,例如喺 Agent SDK 或者 Skills 入面 call 佢
Most importantly, Claude seemed to like calling this tool and we found its outputs worked well. After all, even the best designed tool doesn't work if Claude doesn't understand how to call it.
最緊要嘅一點:Claude 好鍾意 call 呢個工具,輸出質量亦都好好畢竟,就算工具設計得再好,如果模型唔明點 call,都係嘥氣
Is this the final form of elicitation in Claude Code? We doubt it. As Claude gets more capable, the tools that serve it have to evolve too. The next section shows a case where a tool that once helped started getting in the way.
呢個係 Claude Code 入面 elicitation 嘅最終形態嗎?應該唔係。隨住 Claude 能力提升,服務佢嘅工具都要一齊進化。下一節會展示一個曾經有用嘅工具後來開始阻頭阻勢嘅案例
跟住能力迭代:由 Todos 到 Tasks

由 Todos 到 Tasks:單 Agent 線性清單 → 多 Agent 協作任務圖
When we first launched Claude Code, we realized that the model needed a todo list to keep it on track. Todos could be written at the start and checked off as the model did work. To do this we gave Claude the TodoWrite tool, which would write or update Todos and display them to the user.
Claude Code 啱啱上線嗰時,我哋發現模型需要一個待辦清單嚟 keep 住 focus。開工前列好待辦,做完一項 tick 一項。我哋整咗 TodoWrite 工具嚟做呢個功能
But even then, we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal.
就算係咁,Claude 都成日唔記得要做乜。我哋於是每 5 輪對話就插一條系統提醒
As models improved, they found To-do lists limiting. Being sent reminders of the todo list made Claude think that it had to stick to the list instead of modifying it when it realized it needed to change course. We also saw Opus 4.5 also get much better at using subagents, but how could subagents coordinate on a shared todo list?
隨住模型迭代,Todo 列表開始礙事。系統提醒令 Claude 覺得一定要跟住清單做,而唔係喺發現要轉方向時修改。我哋仲見到 Opus 4.5 用子 Agent 嘅能力好咗好多,但多個子 Agent 點樣共享一個 Todo 列表?
Seeing this, we replaced the TodoWrite feature with the Task tool. Whereas todos are focused on keeping the model on track, tasks help agents communicate with each other. Tasks could include dependencies, share updates across subagents and the model could alter and delete them.
見到呢啲問題,我哋將 TodoWrite 換成咗 Task 工具。Todo 嘅重點係叫模型 keep 住方向,Task 嘅重點係幫 Agent 之間互相溝通。Task 支援依賴關係,可以跨子 Agent 共享狀態更新,模型可以隨時改同刪
模型能力提升之後,以前需要嘅工具可能會反過來限制佢
As model capabilities increase, the tools that your models once needed might now be constraining them. It's important to constantly revisit previous assumptions on what tools are needed. This is also why it's useful to stick to a small set of models to support that have a fairly similar capabilities profile.
隨住模型能力提升,你嘅模型以前需要嘅工具而家可能反過來限制緊佢。定期回頭檢討「呢啲工具仲有冇需要」好重要。呢個亦都係點解建議只 support 少量能力相近嘅模型,咁工具設計可以集中啲
設計搜尋界面
The most consequential tools we've built are the ones that let Claude find its own context.
我哋做過最有影響力嘅工具,係嗰啲俾 Claude 自己揾 context 嘅工具
When Claude Code was first released internally, we used RAG: a vector database would pre-index the codebase, and the harness would retrieve relevant snippets and hand them to Claude before each response. While RAG was powerful and fast, it required indexing and setup and could be fragile across a host of different environments. Most importantly, Claude was given this context instead of finding the context itself.
Claude Code 內部版本最早係用 RAG:向量數據庫預先索引 codebase,每次回覆前自動檢索相關片段塞俾 Claude。RAG 快、效果好,但需要預處理,環境兼容性脆弱。最根本嘅問題係:context 係俾人塞俾 Claude,而唔係 Claude 自己揾
But if Claude could search on the web, why couldn't it also search your codebase? By giving Claude a Grep tool, we could let it search for files and build context itself.
如果 Claude 可以 search 網頁,點解唔可以 search 你嘅 codebase?俾 Claude 一個 Grep 工具,就可以等佢自己 search 檔案、自己 build context
As Claude gets smarter, it becomes increasingly good at building its context when given the right tools.
Claude 越嚟越聰明,俾啱嘅工具之後,佢就越擅長自己 build context
When we introduced Agent Skills, we formalized the idea of progressive disclosure, which allows agents to incrementally discover relevant context through exploration.
Agent Skills 上線之後,我哋將呢個思路正式化為漸進式披露(progressive disclosure):等 Agent 透過探索逐步發現相關 context
Claude could now read skill files and those files could then reference other files that the model could read recursively. In fact, a common use of skills is to add more search capabilities to Claude like giving it instructions on how to use an API or query a database.
Claude 而家可以讀 Skill 檔案,Skill 檔案可以引用其他檔案,模型可以遞歸咁發現同 load context。一個常見嘅 Skill 用途就係俾 Claude 加 search 能力:話俾佢知點 call API、點 query 數據庫
Over the course of a year, Claude went from not really being able to build its own context to being able to do nested search across several layers of files to find the exact context it needed.
一年時間,Claude 由幾乎唔識自己 build context,到識得喺多層檔案入面 nested search,精準揾到需要嘅資訊
Progressive disclosure is now a common technique we use to add new functionality without adding a tool. In the next section, we explain why.
漸進式披露而家係我哋成日用嘅技術:唔使加工具就可以加功能。下一節解釋點解
漸進式披露:Claude Code Guide 子 Agent
Claude Code currently has ~20 tools, and our team frequently revisits if we need all of them for Claude to be most effective. The bar to add a new tool is high, because this gives the model one more option to think about.
Claude Code 而家大約有 20 個工具,團隊成日審視係咪每個都有需要。加新工具嘅門檻好高,因為每多一個工具,模型就多一個需要諗嘅選項
For example, we noticed that Claude did not know enough about how to use Claude Code. If you asked it how to add a MCP or what a slash command did, it would not be able to reply.
例如,我哋發現 Claude 唔夠瞭解 Claude Code 自身嘅功能。你問佢點加 MCP、某個斜槓命令係咩意思,佢答唔出
We could have put all of this information in the system prompt, but given that users rarely asked about this, it would have added context rot and interfered with Claude Code's main job: writing code.
我哋可以將呢啲資訊全部塞入 system prompt,但用戶好少問呢類問題,塞咗會造成 context rot,幹擾 Claude 嘅主要工作(寫 code)
Instead, we tried progressive disclosure: we gave Claude a link to its docs that it could load and search when needed. This worked, but Claude would pull large chunks of documentation into context to find an answer the user could have gotten in one sentence.
我哋嘗試漸進式披露:俾 Claude 一個指向 docs 嘅 link,需要時自己去 load 同 search。用得掂,但 Claude 會將大段 documentation 拉入 context,只係為咗答一句話就搞得掂嘅問題
So we built the Claude Code Guide — a subagent Claude calls whenever a user asks about Claude Code itself. The subagent does the doc-searching in its own context, follows detailed instructions on how to search and what to extract, and hands back only the answer. The main agent's context stays clean.
最終我哋做咗一個 Claude Code Guide 子 Agent。當用戶問 Claude Code 自身嘅問題時,主 Agent 將請求轉俾呢個子 Agent。子 Agent 喺自己嘅 context 入面 search docs、提取答案,淨係傳返答案返嚟。主 Agent 嘅 context 保持乾淨
While this isn't a perfect solution (Claude can still get confused when you ask it about how to set itself up), we were able to add things to Claude's action space without adding a new tool.
呢個方案唔完美(Claude 有時響自身配置問題上仍然會搞亂),但關鍵係:唔使加新工具,就可以擴展 Agent 嘅能力範圍
好似 Agent 咁睇,係一門手藝
Designing the tools for your models is as much an art as it is a science. It depends heavily on the model you're using, the goal of the agent and the environment it's operating in.
幫模型設計工具,同科學比起來更加似手藝。佢好取決於你用嘅模型、Agent 嘅目標、運行嘅環境
Our best advice? Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.
我哋最好嘅建議?多啲實驗,讀你嘅 output,試新嘢。最重要嘅係,學嚇好似 Agent 咁睇
Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.
原文連結
https://claude.com/blog/seeing-like-an-agent
作者:Thariq Shihipar,Anthropic 工程師,Claude Code 團隊
BLOG
本文翻譯自 Anthropic 官方博客「Seeing like an agent: how we design tools in Claude Code」,作者 Thariq Shihipar,Claude Code 團隊工程師,今天發佈
以下為逐段中英對照翻譯
構建 Agent 最難的部分之一:設計工具
One of the hardest parts about building an agent harness is constructing its tools.
構建 Agent harness 最困難的部分之一,是設計它的工具集
Claude acts completely through tool calling, but there are a number of ways tools can be constructed in the Claude API with primitives like bash, skills and code execution.
Claude 完全通過工具調用來行動。在 Claude API 中,工具可以用 bash、skills、代碼執行等基礎原語來構建
So how do you design your agents' tools? Do you give it one general-purpose tool like bash or code execution? Or fifty specialized tools, one for each use case?
那你該怎麼給 Agent 設計工具?給它一個通用工具(比如 bash 或代碼執行)就夠了?還是做五十個專用工具,每個場景一個?
To put yourself in the mind of the model, imagine being given a difficult math problem. What tools would you want in order to solve it? It would depend on your own skill set!
要站在模型的角度想這個問題,可以想象你面前有一道很難的數學題。你想要什麼工具來解決它?答案取決於你自己的能力
Paper would be the minimum, but you'd be limited by manual calculations. A calculator would be better, but you would need to know how to operate the more advanced options. The fastest and most powerful option would be a computer, but you would have to know how to use it to write and execute code.
一張紙是最低配,但你只能手算。計算器好一些,但你得知道怎麼用高級功能。最快最強的選擇是電腦,但你得會用它來寫和執行代碼
This is a useful framework for designing your agent. You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.
這是一個很有用的設計框架。你要給 Agent 的工具,應該貼合它自身的能力形狀。但你怎麼知道它的能力是什麼?你觀察它,讀它的輸出,反覆實驗。你學會「像 Agent 一樣看」
If you're building an agent, you'll face the same questions we did: when to add a tool, when to remove one, and how to tell the difference. Here's how we've answered them while building Claude Code, including where we got it wrong first.
如果你在做 Agent,你會面對和我們一樣的問題:什麼時候加工具,什麼時候刪工具,怎麼區分這兩種情況。下面是我們在 Claude Code 的實際經驗,包括一開始做錯的地方
用 AskUserQuestion 工具改善提問能力

三種方案的光譜:從無結構到過度剛性,AskUserQuestion 工具落在中間
When building the AskUserQuestion tool, our goal was to improve Claude's ability to ask questions (often called elicitation).
設計 AskUserQuestion 工具時,我們的目標是提升 Claude 向用戶提問的能力(通常稱為 elicitation)
While Claude could just ask questions in plain text, we found answering those questions felt like they took an unnecessary amount of time. How could we lower this friction and increase the bandwidth of communication between the user and Claude?
雖然 Claude 可以用純文本提問,但我們發現回答這些問題的體驗很差,耗時太多。怎麼降低這個摩擦,提升用戶和 Claude 之間的溝通帶寬?
第一次嘗試:修改 ExitPlanTool
The first approach we tried was adding a parameter to the ExitPlanTool to have an array of questions alongside the plan. This was the easiest fix to implement, but it confused Claude because we were simultaneously asking for a plan and a set of questions about the plan. What if the user's answers conflicted with what the plan said? Would Claude need to call the ExitPlanTool twice? We knew this tactic wouldn't work, so we went back to the drawing board.
我們第一個方案是給 ExitPlanTool 加一個參數,讓它在輸出計劃的同時輸出一組問題。這是最省事的改法,但它讓 Claude 很困惑:我們同時要求它做計劃和對計劃提問。如果用戶的回答和計劃矛盾怎麼辦?Claude 是不是得調兩次這個工具?我們知道這個方案行不通,於是回到原點
第二次嘗試:改變輸出格式
Next, we tried updating Claude's output instructions to serve a slightly modified markdown format that it could use to ask questions. For example, we could ask it to output a list of bullet point questions with alternatives in brackets. We could then parse and format that question as UI for the user.
接下來,我們嘗試修改 Claude 的輸出指令,讓它用一種特殊的 Markdown 格式來提問。比如用 bullet point 列出問題,每個問題後面用方括號給出選項。然後前端解析這個格式,渲染成 UI
Claude could usually produce this format, but not reliably. It would append extra sentences, drop options, or abandon the structure altogether. Onto the next approach.
Claude 大部分時候能生成這個格式,但不穩定。它會在末尾多加一句話,漏掉選項,或者乾脆不用這個格式。下一個方案
第三次嘗試:AskUserQuestion 工具

AskUserQuestion 工具的實際界面
Finally, we landed on creating a tool that Claude could call at any point, but it was particularly prompted to do so during plan mode. When the tool triggered we would show a modal to display the questions and block the agent's loop until the user answered.
最終方案是做一個獨立的工具,Claude 可以在任何時候調用,但在規劃模式中會被特別引導去使用。工具觸發後彈出一個模態框顯示問題,阻塞 Agent 循環直到用戶回答
This tool allowed us to prompt Claude for a structured output and it helped us ensure that Claude gave the user multiple options. It also gave users ways to compose this functionality, for example calling it in the Agent SDK or using referring to it in skills.
這個工具讓我們能引導 Claude 輸出結構化內容,確保給用戶多個選項。它也給了用戶組合使用的空間,比如在 Agent SDK 或 Skills 中引用它
Most importantly, Claude seemed to like calling this tool and we found its outputs worked well. After all, even the best designed tool doesn't work if Claude doesn't understand how to call it.
最關鍵的一點:Claude 喜歡調用這個工具,輸出質量也好。畢竟,再好的工具設計,如果模型不理解怎麼調用,也是白搭
Is this the final form of elicitation in Claude Code? We doubt it. As Claude gets more capable, the tools that serve it have to evolve too. The next section shows a case where a tool that once helped started getting in the way.
這是 Claude Code 中 elicitation 的最終形態嗎?大概不是。隨着 Claude 能力提升,服務它的工具也必須跟着演進。下一節會展示一個曾經有用的工具後來開始礙事的案例
跟隨能力迭代:從 Todos 到 Tasks

從 Todos 到 Tasks:單 Agent 線性清單 → 多 Agent 協作任務圖
When we first launched Claude Code, we realized that the model needed a todo list to keep it on track. Todos could be written at the start and checked off as the model did work. To do this we gave Claude the TodoWrite tool, which would write or update Todos and display them to the user.
Claude Code 剛上線時,我們發現模型需要一個待辦清單來保持專注。開工前列好待辦,做完一項勾一項。我們做了 TodoWrite 工具來實現這個功能
But even then, we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal.
即便如此,Claude 還是經常忘記該幹什麼。我們於是每隔 5 輪對話就插一條系統提醒
As models improved, they found To-do lists limiting. Being sent reminders of the todo list made Claude think that it had to stick to the list instead of modifying it when it realized it needed to change course. We also saw Opus 4.5 also get much better at using subagents, but how could subagents coordinate on a shared todo list?
隨着模型迭代,Todo 列表開始礙事。系統提醒讓 Claude 覺得必須嚴格按清單執行,不敢中途調整方向。Opus 4.5 用子 Agent 的能力大幅提升,但多個子 Agent 怎麼共享一個 Todo 列表?
Seeing this, we replaced the TodoWrite feature with the Task tool. Whereas todos are focused on keeping the model on track, tasks help agents communicate with each other. Tasks could include dependencies, share updates across subagents and the model could alter and delete them.
看到這些問題,我們把 TodoWrite 替換成了 Task 工具。Todo 的重點是讓模型保持方向,Task 的重點是讓 Agent 之間互相溝通。Task 支持依賴關係,可以跨子 Agent 共享狀態更新,模型可以隨時修改和刪除
模型能力提升之後,曾經需要的工具可能反過來限制它
As model capabilities increase, the tools that your models once needed might now be constraining them. It's important to constantly revisit previous assumptions on what tools are needed. This is also why it's useful to stick to a small set of models to support that have a fairly similar capabilities profile.
隨着模型能力提升,你的模型曾經需要的工具現在可能反過來在限制它。定期回頭審視「這些工具是否還有必要」很重要。這也是為什麼建議只支持少量能力相近的模型,這樣工具設計可以聚焦
設計搜索界面
The most consequential tools we've built are the ones that let Claude find its own context.
我們做過的最有影響力的工具,是那些讓 Claude 自己尋找上下文的工具
When Claude Code was first released internally, we used RAG: a vector database would pre-index the codebase, and the harness would retrieve relevant snippets and hand them to Claude before each response. While RAG was powerful and fast, it required indexing and setup and could be fragile across a host of different environments. Most importantly, Claude was given this context instead of finding the context itself.
Claude Code 內部版本最早用的是 RAG:向量數據庫預先索引代碼庫,每次回覆前自動檢索相關片段塞給 Claude。RAG 速度快、效果好,但需要預處理,環境兼容性脆弱。最根本的問題是:上下文是被塞給 Claude 的,不是 Claude 自己找的
But if Claude could search on the web, why couldn't it also search your codebase? By giving Claude a Grep tool, we could let it search for files and build context itself.
如果 Claude 能搜網頁,為什麼不能搜代碼庫?給 Claude 一個 Grep 工具,就能讓它自己搜文件、自己構建上下文
As Claude gets smarter, it becomes increasingly good at building its context when given the right tools.
Claude 越聰明,給它合適的工具後它就越擅長自己構建上下文
When we introduced Agent Skills, we formalized the idea of progressive disclosure, which allows agents to incrementally discover relevant context through exploration.
Agent Skills 上線後,我們把這個思路正式化為漸進式披露(progressive disclosure):讓 Agent 通過探索逐步發現相關上下文
Claude could now read skill files and those files could then reference other files that the model could read recursively. In fact, a common use of skills is to add more search capabilities to Claude like giving it instructions on how to use an API or query a database.
Claude 現在可以讀 Skill 文件,Skill 文件可以引用其他文件,模型可以遞歸地發現和加載上下文。一個常見的 Skill 用法就是給 Claude 增加搜索能力:告訴它怎麼調 API、怎麼查數據庫
Over the course of a year, Claude went from not really being able to build its own context to being able to do nested search across several layers of files to find the exact context it needed.
一年時間,Claude 從幾乎不會自己構建上下文,到能在多層文件中嵌套搜索,精確找到需要的信息
Progressive disclosure is now a common technique we use to add new functionality without adding a tool. In the next section, we explain why.
漸進式披露現在是我們常用的一種技術:不加工具就能加功能。下一節解釋具體怎麼做
漸進式披露:Claude Code Guide 子 Agent
Claude Code currently has ~20 tools, and our team frequently revisits if we need all of them for Claude to be most effective. The bar to add a new tool is high, because this gives the model one more option to think about.
Claude Code 目前有大約 20 個工具,團隊經常審視是否每個都有必要。加新工具的門檻很高,因為每多一個工具,模型就多一個需要思考的選項
For example, we noticed that Claude did not know enough about how to use Claude Code. If you asked it how to add a MCP or what a slash command did, it would not be able to reply.
比如,我們發現 Claude 不夠了解 Claude Code 自身的功能。你問它怎麼加 MCP、某個斜槓命令是什麼意思,它答不上來
We could have put all of this information in the system prompt, but given that users rarely asked about this, it would have added context rot and interfered with Claude Code's main job: writing code.
可以把這些信息全塞進 system prompt,但用戶很少問這類問題,塞進去會造成上下文腐蝕,干擾 Claude 的主要工作(寫代碼)
Instead, we tried progressive disclosure: we gave Claude a link to its docs that it could load and search when needed. This worked, but Claude would pull large chunks of documentation into context to find an answer the user could have gotten in one sentence.
我們嘗試漸進式披露:給 Claude 一個指向文檔的連結,需要時自己去查。能用,但 Claude 會把大段文檔拉進上下文,只為回答一個一句話就能搞定的問題
So we built the Claude Code Guide — a subagent Claude calls whenever a user asks about Claude Code itself. The subagent does the doc-searching in its own context, follows detailed instructions on how to search and what to extract, and hands back only the answer. The main agent's context stays clean.
最終我們做了一個 Claude Code Guide 子 Agent。當用戶問 Claude Code 自身的問題時,主 Agent 把請求轉給這個子 Agent。子 Agent 在自己的上下文裏搜索文檔、提取答案,只把答案傳回來。主 Agent 的上下文保持乾淨
While this isn't a perfect solution (Claude can still get confused when you ask it about how to set itself up), we were able to add things to Claude's action space without adding a new tool.
這個方案不完美(Claude 有時候還是會在自身配置問題上犯糊塗),但關鍵是:不用加新工具,就能擴展 Agent 的能力範圍
像 Agent 一樣看,是手藝活
Designing the tools for your models is as much an art as it is a science. It depends heavily on the model you're using, the goal of the agent and the environment it's operating in.
給模型設計工具,與其說是科學,更接近手藝。它取決於你用的模型、Agent 的目標、運行的環境
Our best advice? Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.
我們最好的建議?多實驗,讀你的輸出,試新東西。最重要的是,學會像 Agent 一樣看
Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.
原文連結
https://claude.com/blog/seeing-like-an-agent
作者:Thariq Shihipar,Anthropic 工程師,Claude Code 團隊