MCP Apps is an extension to the Model Context Protocol that enables MCP servers to deliver interactive user interfaces to hosts. It defines how servers declare UI resources, how hosts render them securely in iframes, and how the two communicate.
MCP allows servers to expose tools and resources to AI assistants, but responses are limited to text and structured data. Many use cases need more:
Before MCP Apps, each host implemented UI support differently. MCP-UI, OpenAI's Apps SDK, and custom implementations all solved similar problems with incompatible approaches. Server developers had to maintain separate adapters for each host, and security models varied.
MCP Apps standardizes this. Servers declare their UIs once; any compliant host can render them. Security is consistent — sandboxed iframes, auditable communication, declarative CSP. The result is write-once portability with predictable behavior.
MCP Apps is designed for graceful degradation. Hosts advertise their UI support when connecting to servers; servers check these capabilities before registering UI-enabled tools. If a host doesn't support MCP Apps, tools still work — they just return text instead of UI.
This is fundamental: UI is a progressive enhancement, not a requirement. Your server works everywhere; hosts that support UI get a richer experience.
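The capability check described above can be sketched as follows. This is a minimal illustration, not the SDK's API: the `HostCapabilities` shape, the `UI_EXTENSION_KEY` placeholder, and the `get_forecast` tool are all assumptions made for the example.

```typescript
// Hypothetical shape of the capabilities a host advertises when it connects.
interface HostCapabilities {
  extensions?: Record<string, unknown>;
}

// Placeholder key for illustration only; not the real extension identifier.
const UI_EXTENSION_KEY = "example/ui";

interface ForecastToolDefinition {
  name: string;
  description: string;
  _meta?: { ui?: { resourceUri: string } };
}

function hostSupportsUi(caps: HostCapabilities): boolean {
  return caps.extensions !== undefined && UI_EXTENSION_KEY in caps.extensions;
}

// Attach UI metadata only when the host can render it; otherwise the tool
// stays a plain text-returning tool.
function buildForecastTool(caps: HostCapabilities): ForecastToolDefinition {
  const tool: ForecastToolDefinition = {
    name: "get_forecast",
    description: "Return the weather forecast as text",
  };
  if (hostSupportsUi(caps)) {
    tool._meta = { ui: { resourceUri: "ui://weather/forecast" } };
  }
  return tool;
}
```

The tool is registered either way, so a host without UI support still gets a working text tool.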
See the Client<>Server Capability Negotiation section of the specification.
In MCP Apps, three entities work together:
flowchart LR
subgraph Server ["MCP Server"]
T[Tools]
R[UI Resources]
end
subgraph Host ["Host (Chat Client)"]
B[AppBridge]
end
subgraph View ["View (iframe)"]
A[App]
end
Server <-->|MCP Protocol| Host
Host <-->|postMessage| View
The View acts as an MCP client, the Host acts as a proxy, and the Server is a standard MCP server.
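To make the Host's proxy role concrete, here is a rough sketch of forwarding a View request to the server. The `proxyViewRequest` function, the `ServerCall` callback, and the allow-list are hypothetical; a real host would also validate the message origin and handle its own `ui/*` methods.

```typescript
interface ProxyRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface ProxyRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// Stand-in for the host's MCP client connection to the server.
type ServerCall = (method: string, params: unknown) => Promise<unknown>;

// Methods this sketch forwards to the server; everything else is rejected.
const FORWARDED_METHODS = new Set(["tools/call", "resources/read"]);

async function proxyViewRequest(
  req: ProxyRpcRequest,
  callServer: ServerCall,
): Promise<ProxyRpcResponse> {
  if (!FORWARDED_METHODS.has(req.method)) {
    // -32601 is the JSON-RPC "method not found" code.
    return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: `Method not found: ${req.method}` } };
  }
  try {
    const result = await callServer(req.method, req.params);
    return { jsonrpc: "2.0", id: req.id, result };
  } catch (err) {
    // -32603 is the JSON-RPC "internal error" code.
    const message = err instanceof Error ? err.message : String(err);
    return { jsonrpc: "2.0", id: req.id, error: { code: -32603, message } };
  }
}
```

Because every request passes through one function, the Host can log or block traffic in a single place.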
sequenceDiagram
participant H as Host
participant V as View (iframe)
participant S as MCP Server
Note over H,S: 1. Discovery
S-->>H: tools/list (with UI metadata)
Note over H,V: 2. Initialization
H->>H: Render iframe with UI resource
V->>H: ui/initialize
H-->>V: host context, capabilities
V->>H: ui/notifications/initialized
Note over H,V: 3. Data Delivery
H-->>V: ui/notifications/tool-input
H-->>V: ui/notifications/tool-result
Note over H,V: 4. Interactive Phase
loop User Interaction
V->>H: tools/call
H->>S: tools/call
S-->>H: result
H-->>V: result
end
Note over H,V: 5. Teardown
H->>V: ui/resource-teardown
V-->>H: acknowledgment
During initialization, the View sends ui/initialize and receives host context (theme, capabilities, container dimensions). This handshake ensures the View is ready before receiving data.
Tool results carry content (text for the model's context) and optionally structuredContent (data optimized for UI rendering). This separation lets servers provide rich data to the UI without bloating the model's context.
See the Lifecycle section of the specification for the complete sequence diagrams.
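One way a host might guarantee that data only arrives after the handshake is to buffer notifications until the View reports that it is initialized. The sketch below assumes a hypothetical `ViewChannel` class; `post` stands in for the iframe's `postMessage`.

```typescript
interface HostNotification {
  method: string;
  params?: unknown;
}

class ViewChannel {
  private ready = false;
  private queue: HostNotification[] = [];
  private post: (msg: HostNotification) => void;

  // `post` stands in for iframe.contentWindow.postMessage in a real host.
  constructor(post: (msg: HostNotification) => void) {
    this.post = post;
  }

  // Call this when the View sends ui/notifications/initialized.
  markInitialized(): void {
    this.ready = true;
    for (const msg of this.queue) this.post(msg);
    this.queue = [];
  }

  // Deliver immediately if the View is ready; otherwise hold the message.
  deliver(msg: HostNotification): void {
    if (this.ready) this.post(msg);
    else this.queue.push(msg);
  }
}
```

Queued messages are flushed in arrival order, so tool input still reaches the View before the tool result.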
UI resources are HTML templates that servers declare using the ui:// URI scheme. When a tool with UI metadata is called, the Host fetches the corresponding resource and renders it.
Resources are declared upfront, during tool registration. This design enables:
See the UI Resource Format section of the specification for the full schema.
Tools reference their UI templates through metadata. When a server registers a tool, it includes a _meta.ui object pointing to a ui:// resource:
"_meta": {
"ui": { "resourceUri": "ui://weather/forecast" }
}
When this tool is called, the Host:
See the Resource Discovery section of the specification for details.
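The lookup from a tool's `_meta.ui.resourceUri` to its declared resource might look like the sketch below. The `ToolDescriptor` and `UiResource` types and the `resolveUiResource` helper are illustrative assumptions, not the specification's schema.

```typescript
interface ToolDescriptor {
  name: string;
  _meta?: { ui?: { resourceUri: string } };
}

// Simplified stand-in for a declared UI resource.
interface UiResource {
  uri: string;
  html: string;
}

// Return the UI resource a tool points at, or undefined if the tool has no
// UI metadata, uses a non-ui:// URI, or references an undeclared resource.
function resolveUiResource(
  tool: ToolDescriptor,
  resources: UiResource[],
): UiResource | undefined {
  const uri = tool._meta?.ui?.resourceUri;
  if (uri === undefined || !uri.startsWith("ui://")) return undefined;
  return resources.find((r) => r.uri === uri);
}
```

A tool without UI metadata resolves to `undefined`, which a host can treat as "render the text result only".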
Views communicate with Hosts using JSON-RPC over postMessage. From a View, you can:
Interact with the server:
- Call tools (tools/call)
- Read resources (resources/read)
Interact with the chat:
- Send messages (ui/message)
- Update the model's context (ui/update-model-context)
Request host actions:
- Open links (ui/open-link)
See the App class for the View-side API and the Communication Protocol section of the specification.
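As a rough illustration of JSON-RPC over postMessage from the View's side, the sketch below matches responses to requests by `id`. The `ViewRpcClient` class is hypothetical and is not the App class's API; `send` is injected so the same code works with `window.parent.postMessage` in a browser or a stub elsewhere.

```typescript
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

interface PendingCall {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
}

class ViewRpcClient {
  private nextId = 1;
  private pending = new Map<number, PendingCall>();
  private send: (msg: RpcRequest) => void;

  constructor(send: (msg: RpcRequest) => void) {
    this.send = send;
  }

  // Send a request and return a promise settled by the matching response.
  request(method: string, params?: unknown): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.send({ jsonrpc: "2.0", id, method, params });
    });
  }

  // Feed each response received from the host into this method.
  handleResponse(msg: RpcResponse): void {
    const call = this.pending.get(msg.id);
    if (!call) return; // unknown or already-settled id
    this.pending.delete(msg.id);
    if (msg.error) call.reject(new Error(msg.error.message));
    else call.resolve(msg.result);
  }
}
```

With this in place, a call such as `client.request("ui/open-link", { url })` would use the same path as `tools/call`; the parameter shape shown here is an assumption.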
Tools can be visible to the model, the app, or both. By default, tools are visible to both (visibility: ["model", "app"]).
App-only tools (visibility: ["app"]) are useful for UI interactions that shouldn't clutter the agent's context — things like refresh buttons, pagination controls, or form submissions. The model never sees these tools; they exist purely for the View to call.
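The visibility rule can be expressed as a small filter. In this sketch the `visibility` field is assumed to live under `_meta.ui`, and the tool names are made up for illustration.

```typescript
type Audience = "model" | "app";

interface VisibleTool {
  name: string;
  _meta?: { ui?: { visibility?: Audience[] } };
}

// Tools without an explicit visibility are visible to both audiences.
const DEFAULT_VISIBILITY: Audience[] = ["model", "app"];

function toolsVisibleTo(tools: VisibleTool[], audience: Audience): VisibleTool[] {
  return tools.filter((t) =>
    (t._meta?.ui?.visibility ?? DEFAULT_VISIBILITY).includes(audience),
  );
}
```

A host would apply the `"model"` filter when building the tool list for the agent, and the `"app"` filter when deciding which `tools/call` requests from a View to accept.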
When a View initializes, the Host provides context about the environment it's running in. This includes:
Views use this context to adapt their presentation. For example, a chart might use dark colors when the host is in dark mode, or a form might adjust its layout for mobile platforms.
The Host notifies Views when context changes (e.g., the user toggles dark mode), allowing dynamic updates without reloading.
See the Host Context section of the specification for the full schema.
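A View might keep the latest host context and derive its presentation from it, roughly as sketched here. The `HostContext` fields, the `applyContextChange` helper, and the colour values are assumptions for illustration; the specification defines the actual schema.

```typescript
// Hypothetical subset of the host context.
interface HostContext {
  theme: "light" | "dark";
  platform: "desktop" | "mobile" | "web";
}

// Merge a partial change notification into the current context.
function applyContextChange(
  current: HostContext,
  change: Partial<HostContext>,
): HostContext {
  return { ...current, ...change };
}

// Pick chart colours that suit the host's theme.
function chartPalette(ctx: HostContext): { background: string; line: string } {
  return ctx.theme === "dark"
    ? { background: "#111111", line: "#8ab4f8" }
    : { background: "#ffffff", line: "#1a73e8" };
}
```

Treating change notifications as partial updates means the View re-renders from one merged object instead of tracking each field separately.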
Hosts provide CSS custom properties for colors, typography, and borders. Views use these variables with fallbacks to match the host's visual style:
.container {
background: var(--color-background-primary, #ffffff);
color: var(--color-text-primary, #000000);
}
Theme changes (light/dark toggle) are sent via notifications, allowing Views to update dynamically.
See the Theming section of the specification for the full list of CSS variables.
Views can be displayed in different modes:
Views declare which modes they support; Hosts declare which they can provide. Views can request mode changes, but the Host decides whether to honor them — the Host always has final say over its own UI.
See the Display Modes section of the specification.
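The negotiation between what a View supports and what a Host provides can be sketched as a single decision function. The mode names below are assumed for the example, and a real host may decline a request for other reasons as well.

```typescript
// Assumed mode names, for illustration only.
type DisplayMode = "inline" | "fullscreen" | "pip";

// Grant the requested mode only if both sides support it; otherwise keep
// the current mode. The Host always makes the final decision.
function resolveDisplayMode(
  requested: DisplayMode,
  current: DisplayMode,
  viewSupports: DisplayMode[],
  hostProvides: DisplayMode[],
): DisplayMode {
  const allowed =
    viewSupports.includes(requested) && hostProvides.includes(requested);
  return allowed ? requested : current;
}
```

Returning the current mode on refusal gives the View a definite answer it can compare against its request.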
All Views run in sandboxed iframes with no access to the Host's DOM, cookies, or storage. Communication happens only through postMessage, making it auditable.
Servers declare which external domains their UI needs via CSP metadata. Hosts enforce these declarations — if no domains are declared, no external connections are allowed. This "restrictive by default" approach prevents data exfiltration to undeclared servers.
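To show how "restrictive by default" could translate into an actual policy, the sketch below builds a Content-Security-Policy string from a server's declaration. The `connectDomains` and `resourceDomains` field names and the chosen directives are assumptions; a real host's policy may differ.

```typescript
// Hypothetical CSP declaration attached to a UI resource.
interface UiCspDeclaration {
  connectDomains?: string[];  // origins the UI may fetch from
  resourceDomains?: string[]; // origins for scripts, styles, and images
}

function buildCsp(decl: UiCspDeclaration = {}): string {
  const connect = decl.connectDomains ?? [];
  const resources = decl.resourceDomains ?? [];
  // Inline scripts and styles are allowed because the template is one HTML document.
  const inlinePlus = ["'unsafe-inline'", ...resources].join(" ");
  return [
    "default-src 'none'",
    `script-src ${inlinePlus}`,
    `style-src ${inlinePlus}`,
    `img-src ${["data:", ...resources].join(" ")}`,
    // With no declared domains, all network connections are blocked.
    `connect-src ${connect.length > 0 ? connect.join(" ") : "'none'"}`,
  ].join("; ");
}
```

For example, `buildCsp({ connectDomains: ["https://api.example.com"] })` permits requests to that single origin and nothing else.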
See the Security Implications section of the specification for the threat model and mitigations.