fix(plugin): mcp-server-dev — correct APIs against spec, add missing primitives

Corrects fabricated/deprecated APIs: ext-apps App class model (not embedded resources), real MCPB v0.4 manifest (no permissions block exists), registerTool (not server.tool), @anthropic-ai/mcpb package name, CIMD preferred over DCR. Adds missing spec coverage: resources, prompts, elicitation (with capability check + fallback), sampling, roots, tool annotations, structured output, instructions field, progress/cancellation.
2026-03-19 23:23:07 +00:00 · 2026-03-18 22:53:38 +00:00
parent f7ba55786d
commit 48a018f27a
13 changed files with 1195 additions and 409 deletions
--- a/plugins/mcp-server-dev/skills/build-mcp-server/SKILL.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/SKILL.md
@@ -38,14 +38,16 @@ This determines the tool-design pattern — see Phase 3.
 - **Under ~15 actions** → one tool per action
 - **Dozens to hundreds of actions** (e.g. wrapping a large API surface) → search + execute pattern

-### 4. Does it need interactive UI in the chat?
+### 4. Does a tool need mid-call user input or rich display?

-Forms, pickers, dashboards, confirmation dialogs rendered inline in the conversation → **MCP app** (adds UI resources on top of a standard server).
+- **Simple structured input** (pick from list, enter a value, confirm) → **Elicitation** — spec-native, zero UI code. *Host support is rolling out* (Claude Code ≥2.1.76) — always pair with a capability check and fallback. See `references/elicitation.md`.
+- **Rich/visual UI** (charts, custom pickers with search, live dashboards) → **MCP app widgets** — iframe-based, needs `@modelcontextprotocol/ext-apps`. See `build-mcp-app` skill.
+- **Neither** → plain tool returning text/JSON.

 ### 5. What auth does the upstream service use?

 - None / API key → straightforward
- OAuth 2.0 → you'll need a remote server with DCR (Dynamic Client Registration) or CIMD support; see `references/auth.md`
+- OAuth 2.0 → you'll need a remote server with CIMD (preferred) or DCR support; see `references/auth.md`

 ---

@@ -67,11 +69,19 @@ A hosted service speaking MCP over streamable HTTP. This is the **recommended pa

 → Scaffold with `references/remote-http-scaffold.md`

+### Elicitation (structured input, no UI build)
+
+If a tool just needs the user to confirm, pick an option, or fill a short form, **elicitation** does it with zero UI code. The server sends a flat JSON schema; the host renders a native form. Spec-native, no extra packages.
+
+**Caveat:** Host support is new (Claude Code shipped it in v2.1.76; Desktop unconfirmed). The SDK throws if the client doesn't advertise the capability. Always check `clientCapabilities.elicitation` first and have a fallback — see `references/elicitation.md` for the canonical pattern. This is the right spec-correct approach; host coverage will catch up.
+
+Escalate to `build-mcp-app` widgets when you need: nested/complex data, scrollable/searchable lists, visual previews, live updates.
+
 ### MCP app (remote HTTP + interactive UI)

-Same as above, plus **UI resources** — interactive widgets rendered in chat. Forms, file pickers, rich previews, confirmation dialogs. Built once, renders in Claude *and* ChatGPT.
+Same as above, plus **UI resources** — interactive widgets rendered in chat. Rich pickers with search, charts, live dashboards, visual previews. Built once, renders in Claude *and* ChatGPT.

-**Choose this when** one or more tools benefit from structured user input or rich output that plain text can't handle.
+**Choose this when** elicitation's flat-form constraints don't fit — you need custom layout, large searchable lists, visual content, or live updates.

 Usually remote, but can be shipped as MCPB if the UI needs to drive a local app.

@@ -137,7 +147,7 @@ Recommend one of these two. Others exist but these have the best MCP-spec covera
 | Framework | Language | Use when |
 |---|---|---|
 | **Official TypeScript SDK** (`@modelcontextprotocol/sdk`) | TS/JS | Default choice. Best spec coverage, first to get new features. |
-| **FastMCP 2.0** | Python | User prefers Python, or wrapping a Python library. Decorator-based, very low boilerplate. |
+| **FastMCP 3.x** (`fastmcp` on PyPI) | Python | User prefers Python, or wrapping a Python library. Decorator-based, very low boilerplate. This is jlowin's package — not the frozen FastMCP 1.0 bundled in the official `mcp` SDK. |

 If the user already has a language/stack in mind, go with it — both produce identical wire protocol.

@@ -156,6 +166,21 @@ When handing off, restate the design brief in one paragraph so the next skill do

 ---

+## Beyond tools — the other primitives
+
+Tools are one of three server primitives. Most servers start with tools and never need the others, but knowing they exist prevents reinventing wheels:
+
+| Primitive | Who triggers it | Use when |
+|---|---|---|
+| **Resources** | Host app (not Claude) | Exposing docs/files/data as browsable context |
+| **Prompts** | User (slash command) | Canned workflows ("/summarize-thread") |
+| **Elicitation** | Server, mid-tool | Asking user for input without building UI |
+| **Sampling** | Server, mid-tool | Need LLM inference in your tool logic |
+
+→ `references/resources-and-prompts.md`, `references/elicitation.md`, `references/server-capabilities.md`
+
+---
+
 ## Quick reference: decision matrix

 | Scenario | Deployment | Tool pattern |
@@ -174,4 +199,7 @@ When handing off, restate the design brief in one paragraph so the next skill do

 - `references/remote-http-scaffold.md` — minimal remote server in TS SDK and FastMCP
 - `references/tool-design.md` — writing tool descriptions and schemas Claude understands well
- `references/auth.md` — OAuth, DCR, CIMD, token storage patterns
+- `references/auth.md` — OAuth, CIMD, DCR, token storage patterns
+- `references/resources-and-prompts.md` — the two non-tool primitives
+- `references/elicitation.md` — spec-native user input mid-tool (capability check + fallback)
+- `references/server-capabilities.md` — instructions, sampling, roots, logging, progress, cancellation
--- a/plugins/mcp-server-dev/skills/build-mcp-server/references/auth.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/references/auth.md
@@ -17,32 +17,32 @@ if (!apiKey) throw new Error("UPSTREAM_API_KEY not set");

 Works for local stdio, MCPB, and remote servers alike. If this is all you need, stop here.

-### Tier 2: OAuth 2.0 via Dynamic Client Registration (DCR)
+### Tier 2: OAuth 2.0 via CIMD (preferred per spec 2025-11-25)

-The MCP host (Claude desktop, Claude Code, etc.) discovers your server's OAuth metadata, **registers itself as a client dynamically**, runs the auth-code flow, and stores the token. Your server never sees credentials — it just receives bearer tokens on each request.
+**Client ID Metadata Document.** The MCP host publishes its client metadata at an HTTPS URL and uses that URL *as* its `client_id`. Your authorization server fetches the document, validates it, and proceeds with the auth-code flow. No registration endpoint, no stored client records.

-This is the **recommended path** for any remote server wrapping an OAuth-protected API.
+Spec 2025-11-25 promoted CIMD to SHOULD (preferred). Advertise support via `client_id_metadata_document_supported: true` in your OAuth AS metadata.

 **Server responsibilities:**

-1. Serve OAuth Authorization Server Metadata (RFC 8414) at `/.well-known/oauth-authorization-server`
+1. Serve OAuth Authorization Server Metadata (RFC 8414) at `/.well-known/oauth-authorization-server` with `client_id_metadata_document_supported: true`
 2. Serve an MCP-protected-resource metadata document pointing at (1)
-3. Implement (or proxy to) a DCR endpoint that hands out client IDs
+3. At authorize time: fetch `client_id` as an HTTPS URL, validate the returned client metadata, proceed
 4. Validate bearer tokens on incoming `/mcp` requests

-Most of this is boilerplate — the SDK has helpers. The real decision is whether you **proxy** to the upstream's OAuth (if they support DCR) or run your own **shim** authorization server that exchanges your tokens for upstream tokens.
-
 ```
-┌─────────┐   DCR + auth code   ┌──────────────┐   upstream OAuth   ┌──────────┐
-│ MCP host│ ──────────────────> │ Your MCP srv │ ─────────────────> │ Upstream │
-└─────────┘ <── bearer token ── └──────────────┘ <── access token ──└──────────┘
+┌─────────┐  client_id=https://...  ┌──────────────┐   upstream OAuth   ┌──────────┐
+│ MCP host│ ──────────────────────> │ Your MCP srv │ ─────────────────> │ Upstream │
+└─────────┘ <─── bearer token ───── └──────────────┘ <── access token ──└──────────┘
 ```

-### Tier 3: CIMD (Client ID Metadata Document)
+### Tier 3: OAuth 2.0 via Dynamic Client Registration (DCR)

-An alternative to DCR for ecosystems that don't want dynamic registration. The host publishes its client metadata at a well-known URL; your server fetches it, validates it, and issues a client credential. Lower friction than DCR for the host, slightly more work for you.
+**Backward-compat fallback** — spec 2025-11-25 demoted DCR to MAY. The host discovers your `registration_endpoint`, POSTs its metadata to register itself as a client, gets back a `client_id`, then runs the auth-code flow.

-Use CIMD when targeting hosts that advertise CIMD support in their client metadata. Otherwise default to DCR — it's more broadly implemented.
+Implement DCR if you need to support hosts that haven't moved to CIMD yet. Same server responsibilities as CIMD, but instead of fetching the `client_id` URL you run a registration endpoint that stores client records.
+
+**Client priority order:** pre-registered → CIMD (if AS advertises `client_id_metadata_document_supported`) → DCR (if AS has `registration_endpoint`) → prompt user.

 ---

@@ -71,3 +71,22 @@ If OAuth is required, lean hard toward remote HTTP. If you *must* ship local + O
 | Remote, stateless | Nowhere — host sends bearer each request |
 | Remote, stateful | Session store keyed by MCP session ID (Redis, etc.) |
 | MCPB / local | OS keychain (`keytar` on Node, `keyring` on Python). **Never plaintext on disk.** |
+
+---
+
+## Token audience validation (spec MUST)
+
+Validating "is this a valid bearer token" isn't enough. The spec requires validating "was this token minted *for this server*" — RFC 8707 audience. A token issued for `api.other-service.com` must be rejected even if the signature checks out.
+
+**Token passthrough is explicitly forbidden.** Don't accept a token, then forward it upstream. If your server needs to call another service, exchange the token or use its own credentials.
+
+---
+
+## SDK helpers — don't hand-roll
+
+`@modelcontextprotocol/sdk/server/auth` ships:
+- `mcpAuthRouter()` — Express router for the full OAuth AS surface (metadata, authorize, token)
+- `bearerAuth` — middleware that validates bearer tokens against your verifier
+- `proxyProvider` — forward auth to an upstream IdP
+
+If you're wiring auth from scratch, check these first.
--- a/plugins/mcp-server-dev/skills/build-mcp-server/references/elicitation.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/references/elicitation.md
@@ -0,0 +1,129 @@
+# Elicitation — spec-native user input
+
+Elicitation lets a server pause mid-tool-call and ask the user for structured input. The client renders a native form (no iframe, no HTML). User fills it, server continues.
+
+**This is the right answer for simple input.** Widgets (`build-mcp-app`) are for when you need rich UI — charts, searchable lists, visual previews. If you just need a confirmation, a picked option, or a few form fields, elicitation is simpler, spec-native, and works in any compliant host.
+
+---
+
+## ⚠️ Check capability first — support is new
+
+Host support is very recent:
+
+| Host | Status |
+|---|---|
+| Claude Code | ✅ since v2.1.76 (both `form` and `url` modes) |
+| Claude Desktop | Unconfirmed — likely not yet or very recent |
+| claude.ai | Unknown |
+
+**The SDK throws `CapabilityNotSupported` if the client doesn't advertise elicitation.** There is no graceful degradation built in. You MUST check and have a fallback.
+
+### The canonical pattern
+
+```typescript
+server.registerTool("delete_all", {
+  description: "Delete all items after confirmation",
+  inputSchema: {},
+}, async ({}, extra) => {
+  const caps = server.getClientCapabilities();
+  if (caps?.elicitation) {
+    const r = await server.elicitInput({
+      mode: "form",
+      message: "Delete all items? This cannot be undone.",
+      requestedSchema: {
+        type: "object",
+        properties: { confirm: { type: "boolean", title: "Confirm deletion" } },
+        required: ["confirm"],
+      },
+    });
+    if (r.action === "accept" && r.content?.confirm) {
+      await deleteAll();
+      return { content: [{ type: "text", text: "Deleted." }] };
+    }
+    return { content: [{ type: "text", text: "Cancelled." }] };
+  }
+  // Fallback: return text asking Claude to relay the question
+  return { content: [{ type: "text", text: "Confirmation required. Please ask the user: 'Delete all items? This cannot be undone.' Then call this tool again with their answer." }] };
+});
+```
+
+```python
+# fastmcp
+from fastmcp import Context
+from fastmcp.exceptions import CapabilityNotSupported
+
+@mcp.tool
+async def delete_all(ctx: Context) -> str:
+    try:
+        result = await ctx.elicit("Delete all items? This cannot be undone.", response_type=bool)
+        if result.action == "accept" and result.data:
+            await do_delete()
+            return "Deleted."
+        return "Cancelled."
+    except CapabilityNotSupported:
+        return "Confirmation required. Ask the user to confirm deletion, then retry."
+```
+
+---
+
+## Schema constraints
+
+Elicitation schemas are deliberately limited — keep forms simple:
+
+- **Flat objects only** — no nesting, no arrays of objects
+- **Primitives only** — `string`, `number`, `integer`, `boolean`, `enum`
+- String formats limited to: `email`, `uri`, `date`, `date-time`
+- Use `title` and `description` on each property — they become form labels
+
+If your data doesn't fit these constraints, that's the signal to escalate to a widget.
+
+---
+
+## Three-state response
+
+| Action | Meaning | `content` present? |
+|---|---|---|
+| `accept` | User submitted the form | ✅ validated against your schema |
+| `decline` | User explicitly said no | ❌ |
+| `cancel` | User dismissed (escape, clicked away) | ❌ |
+
+Treat `decline` and `cancel` differently if it matters — `decline` is intentional, `cancel` might be accidental.
+
+The TS SDK's `server.elicitInput()` auto-validates `accept` responses against your schema via Ajv. fastmcp's `ctx.elicit()` returns a typed discriminated union (`AcceptedElicitation[T] | DeclinedElicitation | CancelledElicitation`).
+
+---
+
+## fastmcp response_type shorthand
+
+```python
+await ctx.elicit("Pick a color", response_type=["red", "green", "blue"])  # enum
+await ctx.elicit("Enter email", response_type=str)                         # string
+await ctx.elicit("Confirm?", response_type=bool)                           # boolean
+
+@dataclass
+class ContactInfo:
+    name: str
+    email: str
+await ctx.elicit("Contact details", response_type=ContactInfo)             # flat dataclass
+```
+
+Accepts: primitives, `list[str]` (becomes enum), dataclass, TypedDict, Pydantic BaseModel. All must be flat.
+
+---
+
+## Security
+
+**MUST NOT request passwords, API keys, or tokens via elicitation** — spec requirement. Those go through OAuth or `user_config` with `sensitive: true` (MCPB), not runtime forms.
+
+---
+
+## When to escalate to widgets
+
+Elicitation handles: confirm dialogs, enum pickers, short flat forms.
+
+Reach for `build-mcp-app` widgets when you need:
+- Nested or complex data structures
+- Scrollable/searchable lists (100+ items)
+- Visual preview before choosing (image thumbnails, file tree)
+- Live-updating progress or streaming content
+- Custom layouts, charts, maps
--- a/plugins/mcp-server-dev/skills/build-mcp-server/references/remote-http-scaffold.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/references/remote-http-scaffold.md
@@ -20,20 +20,24 @@ import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/
 import express from "express";
 import { z } from "zod";

-const server = new McpServer({
-  name: "my-service",
-  version: "0.1.0",
-});
+const server = new McpServer(
+  { name: "my-service", version: "0.1.0" },
+  { instructions: "Prefer search_items before calling get_item directly — IDs aren't guessable." },
+);

 // Pattern A: one tool per action
-server.tool(
+server.registerTool(
  "search_items",
-  "Search items by keyword. Returns up to `limit` matches ranked by relevance.",
  {
-    query: z.string().describe("Search keywords"),
-    limit: z.number().int().min(1).max(50).default(10),
+    description: "Search items by keyword. Returns up to `limit` matches ranked by relevance.",
+    inputSchema: {
+      query: z.string().describe("Search keywords"),
+      limit: z.number().int().min(1).max(50).default(10),
+    },
+    annotations: { readOnlyHint: true },
  },
-  async ({ query, limit }) => {
+  async ({ query, limit }, extra) => {
+    // extra.signal is an AbortSignal — check it in long loops for cancellation
    const results = await upstreamApi.search(query, limit);
    return {
      content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
@@ -41,10 +45,13 @@ server.tool(
  },
 );

-server.tool(
+server.registerTool(
  "get_item",
-  "Fetch a single item by its ID.",
-  { id: z.string() },
+  {
+    description: "Fetch a single item by its ID.",
+    inputSchema: { id: z.string() },
+    annotations: { readOnlyHint: true },
+  },
  async ({ id }) => {
    const item = await upstreamApi.get(id);
    return { content: [{ type: "text", text: JSON.stringify(item) }] };
@@ -71,7 +78,7 @@ app.listen(process.env.PORT ?? 3000);

 ---

-## FastMCP 2.0 (Python)
+## FastMCP 3.x (Python)

 ```bash
 pip install fastmcp
@@ -82,14 +89,17 @@ pip install fastmcp
 ```python
 from fastmcp import FastMCP

-mcp = FastMCP(name="my-service")
+mcp = FastMCP(
+    name="my-service",
+    instructions="Prefer search_items before calling get_item directly — IDs aren't guessable.",
+)

-@mcp.tool
+@mcp.tool(annotations={"readOnlyHint": True})
 def search_items(query: str, limit: int = 10) -> list[dict]:
    """Search items by keyword. Returns up to `limit` matches ranked by relevance."""
    return upstream_api.search(query, limit)

-@mcp.tool
+@mcp.tool(annotations={"readOnlyHint": True})
 def get_item(id: str) -> dict:
    """Fetch a single item by its ID."""
    return upstream_api.get(id)
@@ -109,22 +119,27 @@ When wrapping 50+ endpoints, don't register them all. Two tools:
 ```typescript
 const CATALOG = loadActionCatalog(); // { id, description, paramSchema }[]

-server.tool(
+server.registerTool(
  "search_actions",
-  "Find available actions matching an intent. Call this first to discover what's possible. Returns action IDs, descriptions, and parameter schemas.",
-  { intent: z.string().describe("What you want to do, in plain English") },
+  {
+    description: "Find available actions matching an intent. Call this first to discover what's possible. Returns action IDs, descriptions, and parameter schemas.",
+    inputSchema: { intent: z.string().describe("What you want to do, in plain English") },
+    annotations: { readOnlyHint: true },
+  },
  async ({ intent }) => {
    const matches = rankActions(CATALOG, intent).slice(0, 10);
    return { content: [{ type: "text", text: JSON.stringify(matches, null, 2) }] };
  },
 );

-server.tool(
+server.registerTool(
  "execute_action",
-  "Execute an action by ID. Get the ID and params schema from search_actions first.",
  {
-    action_id: z.string(),
-    params: z.record(z.unknown()),
+    description: "Execute an action by ID. Get the ID and params schema from search_actions first.",
+    inputSchema: {
+      action_id: z.string(),
+      params: z.record(z.unknown()),
+    },
  },
  async ({ action_id, params }) => {
    const action = CATALOG.find(a => a.id === action_id);
@@ -146,6 +161,9 @@ server.tool(
 - [ ] `tools/list` returns your tools with complete schemas
 - [ ] Errors return structured MCP errors, not HTTP 500s with HTML bodies
 - [ ] CORS headers set if browser clients will connect
+- [ ] `Origin` header validated on `/mcp` (spec MUST — DNS rebinding prevention)
+- [ ] `MCP-Protocol-Version` header honored (return 400 for unsupported versions)
+- [ ] `instructions` field set if tool-use needs hints
 - [ ] Health check endpoint separate from `/mcp` (hosts poll it)
 - [ ] Secrets from env vars, never hardcoded
- [ ] If OAuth: DCR endpoint implemented — see `auth.md`
+- [ ] If OAuth: CIMD or DCR endpoint implemented — see `auth.md`
--- a/plugins/mcp-server-dev/skills/build-mcp-server/references/resources-and-prompts.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/references/resources-and-prompts.md
@@ -0,0 +1,122 @@
+# Resources & Prompts — the other two primitives
+
+MCP defines three server-side primitives. Tools are model-controlled (Claude decides when to call them). The other two are different:
+
+- **Resources** are application-controlled — the host decides what to pull into context
+- **Prompts** are user-controlled — surfaced as slash commands or menu items
+
+Most servers only need tools. Reach for these when the shape of your integration doesn't fit "Claude calls a function."
+
+---
+
+## Resources
+
+A resource is data identified by a URI. Unlike a tool, it's not *called* — it's *read*. The host browses available resources and decides which to load into context.
+
+**When a resource beats a tool:**
+- Large reference data (docs, schemas, configs) that Claude should be able to browse
+- Content that changes independently of conversation (log files, live data)
+- Anything where "Claude decides to fetch" is the wrong mental model
+
+**When a tool is better:**
+- The operation has side effects
+- The result depends on parameters Claude chooses
+- You want Claude (not the host UI) to decide when to pull it in
+
+### Static resources
+
+```typescript
+// TypeScript SDK
+server.registerResource(
+  "config",
+  "config://app/settings",
+  { name: "App Settings", description: "Current configuration", mimeType: "application/json" },
+  async (uri) => ({
+    contents: [{ uri: uri.href, mimeType: "application/json", text: JSON.stringify(config) }],
+  }),
+);
+```
+
+```python
+# fastmcp
+@mcp.resource("config://app/settings")
+def get_settings() -> str:
+    """Current application configuration."""
+    return json.dumps(config)
+```
+
+### Dynamic resources (URI templates)
+
+RFC 6570 templates let one registration serve many URIs:
+
+```typescript
+import { ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js";
+
+server.registerResource(
+  "file",
+  new ResourceTemplate("file:///{path}", { list: undefined }),
+  { name: "File", description: "Read a file from the workspace" },
+  async (uri, { path }) => ({
+    contents: [{ uri: uri.href, text: await fs.readFile(path, "utf8") }],
+  }),
+);
+```
+
+```python
+@mcp.resource("file:///{path}")
+def read_file(path: str) -> str:
+    return Path(path).read_text()
+```
+
+### Subscriptions
+
+Resources can notify the client when they change. Declare `subscribe: true` in capabilities, then emit `notifications/resources/updated`. The host re-reads. Useful for log tails, live dashboards, watched files.
+
+---
+
+## Prompts
+
+A prompt is a parameterized message template. The host surfaces it as a slash command or menu item. The user picks it, fills in arguments, and the resulting messages land in the conversation.
+
+**When to use:** canned workflows users run repeatedly — `/summarize-thread`, `/draft-reply`, `/explain-error`. Near-zero code, high UX leverage.
+
+```typescript
+server.registerPrompt(
+  "summarize",
+  {
+    title: "Summarize document",
+    description: "Generate a concise summary of the given text",
+    argsSchema: { text: z.string(), max_words: z.string().optional() },
+  },
+  ({ text, max_words }) => ({
+    messages: [{
+      role: "user",
+      content: { type: "text", text: `Summarize in ${max_words ?? "100"} words:\n\n${text}` },
+    }],
+  }),
+);
+```
+
+```python
+@mcp.prompt
+def summarize(text: str, max_words: str = "100") -> str:
+    """Generate a concise summary of the given text."""
+    return f"Summarize in {max_words} words:\n\n{text}"
+```
+
+**Constraints:**
+- Arguments are **string-only** (no numbers, booleans, objects) — convert inside the handler
+- Returns a `messages[]` array — can include embedded resources/images, not just text
+- No side effects — the handler just builds a message, it doesn't *do* anything
+
+---
+
+## Quick decision table
+
+| You want to... | Use |
+|---|---|
+| Let Claude fetch something on demand, with parameters | **Tool** |
+| Expose browsable context (files, docs, schemas) | **Resource** |
+| Expose a dynamic family of things (`db://{table}`) | **Resource template** |
+| Give users a one-click workflow | **Prompt** |
+| Ask the user something mid-tool | **Elicitation** (see `elicitation.md`) |
--- a/plugins/mcp-server-dev/skills/build-mcp-server/references/server-capabilities.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/references/server-capabilities.md
@@ -0,0 +1,164 @@
+# Server capabilities — the rest of the spec
+
+Features beyond the three core primitives. Most are optional, a few are near-free wins.
+
+---
+
+## `instructions` — system prompt injection
+
+One line of config, lands directly in Claude's system prompt. Use it for tool-use hints that don't fit in individual tool descriptions.
+
+```typescript
+const server = new McpServer(
+  { name: "my-server", version: "1.0.0" },
+  { instructions: "Always call search_items before get_item — IDs aren't guessable." },
+);
+```
+
+```python
+mcp = FastMCP("my-server", instructions="Always call search_items before get_item — IDs aren't guessable.")
+```
+
+This is the highest-leverage one-liner in the spec. If Claude keeps misusing your tools, put the fix here.
+
+---
+
+## Sampling — delegate LLM calls to the host
+
+If your tool logic needs LLM inference (summarize, classify, generate), don't ship your own model client. Ask the host to do it.
+
+```typescript
+// Inside a tool handler
+const result = await extra.sendRequest({
+  method: "sampling/createMessage",
+  params: {
+    messages: [{ role: "user", content: { type: "text", text: `Summarize: ${doc}` } }],
+    maxTokens: 500,
+  },
+}, CreateMessageResultSchema);
+```
+
+```python
+# fastmcp
+response = await ctx.sample("Summarize this document", context=doc)
+```
+
+**Requires client support** — check `clientCapabilities.sampling` first. Model preference hints are substring-matched (`"claude-3-5"` matches any Claude 3.5 variant).
+
+---
+
+## Roots — query workspace boundaries
+
+Instead of hardcoding a root directory, ask the host which directories the user approved.
+
+```typescript
+const caps = server.getClientCapabilities();
+if (caps?.roots) {
+  const { roots } = await server.server.listRoots();
+  // roots: [{ uri: "file:///home/user/project", name: "My Project" }]
+}
+```
+
+```python
+roots = await ctx.list_roots()
+```
+
+Particularly relevant for MCPB local servers — see `build-mcpb/references/local-security.md`.
+
+---
+
+## Logging — structured, level-aware
+
+Better than stderr for remote servers. Client can filter by level.
+
+```typescript
+// In a tool handler
+await extra.sendNotification({
+  method: "notifications/message",
+  params: { level: "info", logger: "my-tool", data: { msg: "Processing", count: 42 } },
+});
+```
+
+```python
+await ctx.info("Processing", count=42)   # also: ctx.debug, ctx.warning, ctx.error
+```
+
+Levels follow syslog: `debug`, `info`, `notice`, `warning`, `error`, `critical`, `alert`, `emergency`. Client sets minimum via `logging/setLevel`.
+
+---
+
+## Progress — for long-running tools
+
+Client sends a `progressToken` in request `_meta`. Server emits progress notifications against it.
+
+```typescript
+async (args, extra) => {
+  const token = extra._meta?.progressToken;
+  for (let i = 0; i < 100; i++) {
+    if (token !== undefined) {
+      await extra.sendNotification({
+        method: "notifications/progress",
+        params: { progressToken: token, progress: i, total: 100, message: `Step ${i}` },
+      });
+    }
+    await doStep(i);
+  }
+  return { content: [{ type: "text", text: "Done" }] };
+}
+```
+
+```python
+async def long_task(ctx: Context) -> str:
+    for i in range(100):
+        await ctx.report_progress(progress=i, total=100, message=f"Step {i}")
+        await do_step(i)
+    return "Done"
+```
+
+---
+
+## Cancellation — honor the abort signal
+
+Long tools should check the SDK-provided `AbortSignal`:
+
+```typescript
+async (args, extra) => {
+  for (const item of items) {
+    if (extra.signal.aborted) throw new Error("Cancelled");
+    await process(item);
+  }
+}
+```
+
+fastmcp handles this via asyncio cancellation — no explicit check needed if your handler is properly async.
+
+---
+
+## Completion — autocomplete for prompt args
+
+If you've registered prompts or resource templates with arguments, you can offer autocomplete:
+
+```typescript
+server.registerPrompt("query", {
+  argsSchema: {
+    table: completable(z.string(), async (partial) => tables.filter(t => t.startsWith(partial))),
+  },
+}, ...);
+```
+
+Low priority unless your prompts have many valid values.
+
+---
+
+## Which capabilities need client support?
+
+| Feature | Server declares | Client must support | Fallback if not |
+|---|---|---|---|
+| `instructions` | implicit | — | — (always works) |
+| Logging | `logging: {}` | — | stderr |
+| Progress | — | sends `progressToken` | silently skip |
+| Sampling | — | `sampling: {}` | bring your own LLM |
+| Elicitation | — | `elicitation: {}` | return text, ask Claude to relay |
+| Roots | — | `roots: {}` | config env var |
+
+Check client caps via `server.getClientCapabilities()` (TS) or `ctx.session.client_params.capabilities` (fastmcp) before using the bottom three.
--- a/plugins/mcp-server-dev/skills/build-mcp-server/references/tool-design.md
+++ b/plugins/mcp-server-dev/skills/build-mcp-server/references/tool-design.md
@@ -110,3 +110,70 @@ if (!item) {
 ```

 The hint ("use search_items…") turns a dead end into a next step.
+
+---
+
+## Tool annotations
+
+Hints the host uses for UX — red confirm button for destructive, auto-approve for readonly. All default to unset (host assumes worst case).
+
+| Annotation | Meaning | Host behavior |
+|---|---|---|
+| `readOnlyHint: true` | No side effects | May auto-approve |
+| `destructiveHint: true` | Deletes/overwrites | Confirmation dialog |
+| `idempotentHint: true` | Safe to retry | May retry on transient error |
+| `openWorldHint: true` | Talks to external world (web, APIs) | May show network indicator |
+
+```typescript
+server.registerTool("delete_file", {
+  description: "Delete a file",
+  inputSchema: { path: z.string() },
+  annotations: { destructiveHint: true, idempotentHint: false },
+}, handler);
+```
+
+```python
+@mcp.tool(annotations={"destructiveHint": True, "idempotentHint": False})
+def delete_file(path: str) -> str:
+    ...
+```
+
+Pair with the read/write split advice in `build-mcpb/references/local-security.md` — mark every read tool `readOnlyHint: true`.
+
+---
+
+## Structured output
+
+`JSON.stringify(result)` in a text block works, but the spec has first-class typed output: `outputSchema` + `structuredContent`. Clients can validate.
+
+```typescript
+server.registerTool("get_weather", {
+  description: "Get current weather",
+  inputSchema: { city: z.string() },
+  outputSchema: { temp: z.number(), conditions: z.string() },
+}, async ({ city }) => {
+  const data = await fetchWeather(city);
+  return {
+    content: [{ type: "text", text: JSON.stringify(data) }],  // backward compat
+    structuredContent: data,                                    // typed output
+  };
+});
+```
+
+Always include the text fallback — not all hosts read `structuredContent` yet.
+
+---
+
+## Content types beyond text
+
+Tools can return more than strings:
+
+| Type | Shape | Use for |
+|---|---|---|
+| `text` | `{ type: "text", text: string }` | Default |
+| `image` | `{ type: "image", data: base64, mimeType }` | Screenshots, charts, diagrams |
+| `audio` | `{ type: "audio", data: base64, mimeType }` | TTS output, recordings |
+| `resource_link` | `{ type: "resource_link", uri, name?, description? }` | Pointer — client fetches later |
+| `resource` (embedded) | `{ type: "resource", resource: { uri, text\|blob, mimeType } }` | Inline the full content |
+
+**`resource_link` vs embedded:** link for large payloads or when the client might not need it (let them decide). Embed when it's small and always needed.