
AI agents aren't just generating text and writing code anymore. They're opening browsers, navigating web pages, clicking buttons, filling forms, and verifying results, all without a single line of traditional automation script.
The technology behind this is Playwright MCP, a server that connects AI models to real web browsers through Microsoft's Playwright automation library. Since its release in March 2025, it's quickly become the standard way for large language models (LLMs) to interact with web applications programmatically.
This guide covers everything you need to know about Playwright MCP: what it is, how it works, how to set it up, when to use it, and how it compares to alternatives like the Playwright CLI and traditional automation tools.
What Is MCP (Model Context Protocol)?
Before diving into Playwright MCP specifically, it helps to understand the protocol it's built on.
Model Context Protocol (MCP) is an open-source standard for connecting AI applications to external systems. Originally developed by Anthropic, MCP gives AI models a universal interface to communicate with tools, data sources, and services in a structured, consistent way.
Think of MCP as a USB-C port for AI applications. Just as USB-C provides a single standardized connector for charging, data transfer, and video across different devices, MCP provides a single standardized protocol for AI models to use external tools, regardless of the specific tool or AI application involved.
How MCP Works
MCP follows a client-server architecture with three main components:
Hosts are the AI applications that users interact with directly, such as Claude Desktop, VS Code with GitHub Copilot, or Cursor IDE.
Clients are components embedded within hosts that manage connections to MCP servers. They handle the protocol-level communication, sending requests and receiving responses.
Servers are external programs that expose specific capabilities (called "tools") through the MCP standard. Each server makes a defined set of actions available that any MCP client can invoke. If you want a deeper dive, our guide on understanding what an MCP server is covers the fundamentals in detail.
When an AI model needs to perform an action, the client sends a JSON-formatted request to the appropriate server. The server executes the action and returns a structured response. This happens in real time, allowing the AI to maintain context across multiple interactions.
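To make the request/response exchange concrete, here is a minimal Python sketch of what a tool call might look like on the wire. MCP messages follow JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `browser_navigate` is taken from later in this guide; the `id` value and the reply text are illustrative placeholders, not captured output.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = build_tool_call(1, "browser_navigate", {"url": "https://example.com"})
wire_message = json.dumps(request)

# Illustrative shape of a successful reply (placeholder content only).
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "(page snapshot text)"}]},
}

# The client pairs each reply with its request through the shared id.
assert response["id"] == request["id"]
print(wire_message)
```

In a real host the MCP client library builds and sends these messages for you; the sketch only shows the structure being exchanged.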
Why MCP Matters
Before MCP, connecting an AI model to an external tool meant building a custom integration for every tool-and-model combination. Five AI applications and ten tools? That's fifty separate integrations. MCP eliminates this by standardizing the communication layer. Build one MCP server for your tool, and it works with every MCP-compatible AI client.
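The arithmetic behind that claim can be sketched in a few lines. This assumes the simplest case: without a shared protocol every application needs its own adapter for every tool, while with MCP each application implements one client and each tool ships one server.

```python
def pairwise_integrations(apps: int, tools: int) -> int:
    """Custom adapters needed when every app integrates every tool directly."""
    return apps * tools


def mcp_components(apps: int, tools: int) -> int:
    """Pieces needed with a shared protocol: one client per app, one server per tool."""
    return apps + tools


print(pairwise_integrations(5, 10))  # 50
print(mcp_components(5, 10))         # 15
```

The gap widens as either number grows, because the first count is a product and the second is a sum.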
This standardization has driven broad adoption. AI assistants like Claude and ChatGPT, development tools like VS Code and Cursor, and dozens of other applications now support MCP natively. And the protocol isn't limited to browser automation. Agen.co, for example, provides an enterprise-grade MCP gateway that lets organizations safely expose their SaaS products and internal tools to AI agents while enforcing identity, access control, and data governance at the protocol level.
What Is Playwright MCP?
Playwright MCP is a specific MCP server built on top of Microsoft's Playwright browser automation library. It exposes Playwright's browser controls (opening pages, clicking elements, filling forms, taking screenshots, reading page content) as MCP tools that AI agents can call.
In practice, Playwright MCP acts as a bridge between an AI model and a real web browser. The AI doesn't control the browser directly. Instead, it sends high-level commands to the Playwright MCP server, which translates those commands into Playwright actions, executes them in a live browser instance, and returns structured results.
The Playwright MCP Pipeline
Here is the flow of a typical interaction:
AI Client (such as Claude Desktop or VS Code Copilot) sends an MCP request, for example: call the tool browser_navigate with { url: "https://example.com" }.
Playwright MCP Server receives the request and translates it into a Playwright command.
Real Browser (Chromium, Firefox, or WebKit) executes the navigation.
Structured Response is sent back to the AI client, including page title, URL, accessibility snapshot, and any requested data.
The AI reads this response and decides its next action. This loop repeats until the task is complete.
Accessibility Tree vs. Screenshots
One of the most important design decisions in Playwright MCP is how it reads web pages. Instead of relying on screenshots and computer vision (which require visually-tuned models and introduce ambiguity), Playwright MCP uses the browser's accessibility tree.
The accessibility tree is a structured, text-based representation of every element on a webpage. It includes roles (button, link, heading), names (accessible labels), states (checked, disabled), and hierarchy (parent-child relationships). This gives the AI a precise, deterministic understanding of page structure without needing to interpret pixels.
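As a rough illustration of that structure, the sketch below models a tiny accessibility tree in Python and renders it as indented text. The `AXNode` class, the `render` helper, and the sample form are hypothetical names invented for this example; they are not Playwright's internal types, and the output format only approximates what a real snapshot looks like.

```python
from dataclasses import dataclass, field


@dataclass
class AXNode:
    """One element in a simplified accessibility tree."""
    role: str                                               # e.g. button, link, heading
    name: str = ""                                          # accessible label
    states: list[str] = field(default_factory=list)         # e.g. checked, disabled
    children: list["AXNode"] = field(default_factory=list)  # hierarchy


def render(node: AXNode, depth: int = 0) -> list[str]:
    """Flatten the tree into indented text lines, one per element."""
    line = "  " * depth + f"- {node.role}"
    if node.name:
        line += f' "{node.name}"'
    for state in node.states:
        line += f" [{state}]"
    lines = [line]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


form = AXNode("form", "Sign in", children=[
    AXNode("textbox", "Email address"),
    AXNode("checkbox", "Remember me", states=["checked"]),
    AXNode("button", "Sign In", states=["disabled"]),
])
print("\n".join(render(form)))
```

Everything the model needs (what the element is, what it is called, what state it is in, and where it sits) fits in a few short lines of text.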
The advantages are significant:
Speed: Reading a text-based tree is significantly faster than processing a screenshot image.
Reliability: Element identification is based on semantic roles and labels, not visual positions that shift across screen sizes.
Token efficiency: A compact accessibility snapshot uses far fewer tokens than a base64-encoded image.
No vision model required: Any LLM can work with Playwright MCP, not just multimodal models with image understanding.
How Playwright MCP Works Under the Hood
Understanding the internals helps explain why Playwright MCP behaves the way it does and where its limitations come from.
Snapshot Mode (Default)
In its default Snapshot Mode, Playwright MCP captures accessibility snapshots of the current page after each action. These snapshots contain the full accessibility tree: every interactive element, its role, its accessible name, and a reference ID that the AI can use in subsequent commands.
For example, after navigating to a login page, the snapshot might return:
- textbox "Email address" [ref=e1]
The AI can then issue a command like browser_type with ref: "e1" and text: "user@example.com" to fill the email field. This reference-based system makes interactions precise and repeatable.
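The reference lookup can be sketched as follows. This is a simplified Python illustration, assuming snapshot lines have the shape shown above; the `Password` and `Sign In` lines, the regular expression, and the `find_ref` helper are hypothetical additions for the example, and a real snapshot may contain more detail per line.

```python
import re
from typing import Optional

# Matches lines shaped like: - textbox "Email address" [ref=e1]
SNAPSHOT_LINE = re.compile(
    r'^\s*- (?P<role>\w+) "(?P<name>[^"]*)" \[ref=(?P<ref>\w+)\]'
)


def find_ref(snapshot: str, role: str, name: str) -> Optional[str]:
    """Return the ref of the first element with the given role and name."""
    for line in snapshot.splitlines():
        match = SNAPSHOT_LINE.match(line)
        if match and match["role"] == role and match["name"] == name:
            return match["ref"]
    return None


snapshot = "\n".join([
    '- textbox "Email address" [ref=e1]',
    '- textbox "Password" [ref=e2]',   # hypothetical extra element
    '- button "Sign In" [ref=e3]',     # hypothetical extra element
])

ref = find_ref(snapshot, "textbox", "Email address")
tool_arguments = {"ref": ref, "text": "user@example.com"}
print(tool_arguments)  # {'ref': 'e1', 'text': 'user@example.com'}
```

In practice the LLM does this matching itself by reading the snapshot; the code only makes the mapping from label to reference explicit.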
Vision Mode (Optional)
For elements that are not well represented in the accessibility tree, such as canvas elements, complex SVGs, or image-based interfaces, Playwright MCP offers an optional Vision Mode (enabled with --caps=vision). This mode adds coordinate-based interaction tools (browser_mouse_click_xy, browser_mouse_move_xy) that allow the AI to click at specific pixel positions.
Vision Mode requires the AI to work with screenshots rather than accessibility data, making it slower and less deterministic. For most web automation tasks, Snapshot Mode is the better choice.
The Command-Response Loop
A typical Playwright MCP workflow follows this pattern:
The AI calls browser_navigate to open a URL.
The server returns an accessibility snapshot of the loaded page.
The AI analyzes the snapshot, identifies the target elements, and calls browser_type, browser_click, or other tools.
After each action, the server returns an updated snapshot reflecting the new page state.
The AI continues issuing commands based on the updated state until the task is complete.
This feedback loop is what makes Playwright MCP genuinely agentic. The AI doesn't blindly execute a script. It observes the page state after each step and adapts its approach accordingly, handling loading states, dynamic content, and unexpected UI changes.
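The observe-then-act loop can be simulated without a browser. In the Python sketch below, `FakeBrowserServer` is a stand-in that returns canned snapshots and `choose_action` is a toy policy playing the role of the LLM; both are hypothetical and exist only to show the control flow. The tool names come from this guide, while the refs, URL, and page contents are made up.

```python
class FakeBrowserServer:
    """Stand-in for an MCP server: tracks a page and returns a canned snapshot."""

    def __init__(self) -> None:
        self.page = "login"
        self.calls: list[str] = []

    def call_tool(self, name: str, arguments: dict) -> str:
        self.calls.append(name)
        # Clicking the sign-in button moves the fake app to the dashboard.
        if name == "browser_click" and arguments.get("ref") == "e3":
            self.page = "dashboard"
        if self.page == "login":
            return '- textbox "Email address" [ref=e1]\n- button "Sign In" [ref=e3]'
        return '- heading "Dashboard" [ref=e9]'


def choose_action(snapshot: str, email_typed: bool):
    """Toy policy standing in for the LLM; returns None once the goal is reached."""
    if 'heading "Dashboard"' in snapshot:
        return None
    if not email_typed:
        return ("browser_type", {"ref": "e1", "text": "user@example.com"})
    return ("browser_click", {"ref": "e3"})


server = FakeBrowserServer()
snapshot = server.call_tool("browser_navigate", {"url": "https://example.com/login"})
email_typed = False

# Observe the latest snapshot, pick one action, execute it, repeat.
while (action := choose_action(snapshot, email_typed)) is not None:
    name, arguments = action
    snapshot = server.call_tool(name, arguments)
    email_typed = email_typed or name == "browser_type"

print(server.calls)  # ['browser_navigate', 'browser_type', 'browser_click']
```

The important property is that every decision is made from the most recent snapshot, never from a script written before the page was seen.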
Key Features of Playwright MCP
Accessibility-First Automation
The default Snapshot Mode uses the accessibility tree for all interactions. Elements are identified by semantic roles and labels rather than CSS selectors or XPath, making automations more resilient to UI changes. If a developer renames a CSS class, the automation still works as long as the button's accessible label stays the same.
Cross-Browser Support
Playwright MCP supports Chromium, Firefox, and WebKit (Safari's rendering engine). You can specify the browser using the --browser flag:
npx @playwright/mcp@latest --browser=firefox
This allows testing the same workflows across all major browser engines.
Headless and Headed Modes
By default, Playwright MCP runs the browser in headed mode (visible window), which is useful for development and debugging. For CI/CD pipelines and server environments, headless mode runs the browser without a visible window:
npx @playwright/mcp@latest --headless
Wide IDE and Client Support
Playwright MCP works with virtually every major AI coding tool:
VS Code with GitHub Copilot
Cursor
Claude Desktop and Claude Code
Windsurf
Goose
OpenAI Codex
GitHub Copilot Coding Agent (preconfigured, no setup needed)
Amp, Kiro, LM Studio, and many others
The same server command works across all these clients because MCP standardizes the communication protocol.
Session Management
Playwright MCP offers three session modes:
Persistent profile (default): Browser data like cookies and login sessions are saved to disk between runs, similar to a normal browser.
Isolated mode (--isolated): Each session starts with a clean, in-memory profile. All state is lost when the browser closes. Ideal for testing.
Extension mode (--extension): Connects to an existing Chrome browser session using the Playwright MCP Bridge extension, leveraging your logged-in state and cookies.
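The flags introduced so far (--browser, --headless, --isolated) can be combined on one command line. The Python helper below is a small sketch for assembling that command, for example when generating client configs from a script; `build_server_command` is a hypothetical name, and only flags already mentioned in this guide are used.

```python
from typing import Optional


def build_server_command(
    browser: Optional[str] = None,
    headless: bool = False,
    isolated: bool = False,
) -> list[str]:
    """Assemble the npx invocation for the Playwright MCP server."""
    command = ["npx", "@playwright/mcp@latest"]
    if browser:
        command.append(f"--browser={browser}")
    if headless:
        command.append("--headless")
    if isolated:
        command.append("--isolated")
    return command


print(" ".join(build_server_command(browser="firefox", headless=True, isolated=True)))
# npx @playwright/mcp@latest --browser=firefox --headless --isolated
```

The first element of the returned list maps to the "command" field of an MCP client config and the rest to its "args" array.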
Network Mocking and Storage Controls
Beyond basic navigation and interaction, Playwright MCP includes tools for:
Mocking network requests (browser_route): Intercept requests matching a URL pattern and return custom responses.
Cookie management: Get, set, delete, and clear cookies.
localStorage and sessionStorage: Full read/write access for managing client-side storage.
Storage state save/restore: Capture and replay authentication states across sessions.
Video Recording and DevTools Tracing
With the --caps=devtools flag enabled, Playwright MCP supports:
On-demand video recording (--save-video=800x600): Capture full video playback of browser sessions.
Performance tracing: Record and export Playwright traces for debugging, including network activity, DOM snapshots, and console logs.
Core Web Vitals: Capture metrics like LCP, CLS, and INP during test sessions.
Docker Support
An official Docker image is available for containerized environments:
{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}
Note: The Docker image currently supports headless Chromium only. Firefox and WebKit are not yet available in containers.
Playwright MCP vs. Playwright CLI
In mid-2025, Microsoft released @playwright/cli, a companion tool that provides browser automation through standard shell commands rather than the MCP protocol. Understanding the differences helps you choose the right tool for your situation.
Architecture Differences
Aspect | Playwright MCP | Playwright CLI |
|---|---|---|
Communication | Persistent MCP server connection | Stateless shell commands |
State management | Continuous browser session | Each command is independent |
Token cost | ~114,000 tokens per typical task | ~27,000 tokens per typical task |
Tool schema | Full schema loaded into context upfront | Minimal skill description (~68 tokens) |
Output delivery | Streams into LLM context window | Saves to disk files |
Token Efficiency
The most significant difference is token consumption. When an AI agent connects to a Playwright MCP server, the entire tool schema (all available functions and their parameters) is loaded into the context window. This costs approximately 3,600 tokens before the agent even starts working.
The CLI avoids this overhead entirely. It exposes capabilities as lightweight "skills" that cost roughly 68 tokens to describe. Accessibility snapshots and screenshots are saved to disk files instead of being streamed into the context. The result is approximately a 4x reduction in total token usage for equivalent tasks.
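The ratios quoted above follow directly from the figures in this section. The short Python calculation below uses only those numbers (per-task totals of roughly 114,000 and 27,000 tokens, and upfront costs of roughly 3,600 and 68 tokens); they are the article's approximations, not measurements made here.

```python
MCP_TOKENS_PER_TASK = 114_000
CLI_TOKENS_PER_TASK = 27_000
MCP_SCHEMA_TOKENS = 3_600
CLI_SKILL_TOKENS = 68

# Total usage for an equivalent task.
task_ratio = MCP_TOKENS_PER_TASK / CLI_TOKENS_PER_TASK
# Upfront cost paid before the agent does any work.
upfront_ratio = MCP_SCHEMA_TOKENS / CLI_SKILL_TOKENS

print(f"{task_ratio:.1f}x")     # 4.2x
print(f"{upfront_ratio:.0f}x")  # 53x
```

Note that the upfront schema accounts for only about 3% of the MCP total, so most of the difference comes from snapshots and other output streamed into the context rather than from the schema itself.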
When to Use Each
Use Playwright MCP when:
Your AI agent does not have filesystem access (e.g., web-based chat interfaces)
You need persistent browser state across many interactions
You are building exploratory automation or self-healing test workflows
You need rich, iterative reasoning about page structure over time
Use Playwright CLI when:
Your agent runs in a terminal environment with filesystem access (Claude Code, Cursor, Copilot)
Token efficiency is a priority
You are running many browser automation tasks in sequence
You need to compose browser commands with other shell tools
Microsoft's Own Guidance
The official Playwright MCP repository recommends the CLI for coding agents, stating that CLI invocations are more token-efficient because they avoid loading large tool schemas and verbose accessibility trees into the model context. MCP remains recommended for specialized agentic loops that benefit from persistent state and rich introspection.
Common Use Cases for Playwright MCP
AI-Assisted Test Generation
Describe a test scenario in plain English, and the AI generates and executes Playwright test code automatically. For example, prompting "Test the login flow with valid credentials and verify the dashboard loads" produces a complete test without you writing any code by hand.
Exploratory Testing
AI agents can autonomously navigate an application, click through different features, and identify potential issues without predefined test scripts. This is particularly valuable for discovering edge cases and unexpected behaviors that static test suites miss.
End-to-End Test Automation
Playwright MCP enables full end-to-end testing workflows where the AI navigates through multi-step user journeys: logging in, filling forms, submitting data, and verifying results across multiple pages. The accessibility-tree approach ensures these tests remain stable even when CSS or layout changes occur.
Web Scraping and Data Extraction
AI agents can navigate to pages, interact with dynamic content (expanding dropdowns, paginating results, scrolling infinite feeds), and extract structured data. The structured accessibility snapshot makes it straightforward to identify and target specific data elements.
Form Automation and Task Workflows
For repetitive browser tasks like filling out forms, submitting applications, or walking through multi-step wizards, Playwright MCP allows an AI agent to handle the entire workflow based on a natural language description of what needs to happen.
CI/CD Integration and Self-Verifying Workflows
GitHub Copilot's Coding Agent has Playwright MCP built in, enabling a powerful workflow: the AI generates code, opens a browser, navigates to the running application, and visually verifies that its changes work correctly. This creates a closed loop where code generation and validation happen automatically.
Governing AI Agent Access in Production
As AI agents gain the ability to browse, interact with, and extract data from web applications, governance becomes a real concern. Who authorized the agent? What data can it access? What actions can it take on behalf of a user?
This is where agent governance platforms like Agen.co come in. While Playwright MCP handles the browser automation mechanics, Agen.co sits between your AI agents and your applications to enforce identity-aware access control for MCP connections, data masking, and tool-level permissions. If you're deploying agents that interact with production applications, pairing browser automation with a governance layer ensures agents operate within defined boundaries and every action is auditable.
How to Set Up Playwright MCP
Prerequisites
Before you start, make sure you have:
Node.js 18 or newer installed on your system
npm available on your PATH
An MCP-compatible client (VS Code, Cursor, Claude Desktop, etc.)
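If you want to check the Node.js requirement from a script, a sketch along these lines may help. It parses the string printed by node --version (for example v18.17.0) and compares the major version against 18; the function names are hypothetical, and the sample version strings are illustrative.

```python
import re


def node_major_version(version_output: str) -> int:
    """Extract the major version from output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return int(match.group(1))


def meets_requirement(version_output: str, minimum_major: int = 18) -> bool:
    """True when the reported Node.js version is at least the minimum."""
    return node_major_version(version_output) >= minimum_major


print(meets_requirement("v18.17.0"))  # True
print(meets_requirement("v16.20.2"))  # False
```

To check the local machine, pass in the output of subprocess.run(["node", "--version"], capture_output=True, text=True).stdout.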
Step 1: Install Playwright Browsers
If you do not already have Playwright set up, install the browser binaries:
npx playwright install
This downloads Chromium, Firefox, and WebKit binaries that Playwright MCP will use.
Step 2: Launch the Playwright MCP Server
You can run the server directly with npx (no permanent installation needed):
npx @playwright/mcp@latest
This starts the server and opens a browser instance ready to receive commands. If this runs without errors, the server is working.
Step 3: Configure Your AI Client
The configuration is nearly identical across all clients. Here are the most common ones:
VS Code (GitHub Copilot):
Run this command in your terminal:
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
Or add to your VS Code settings.json manually.
Cursor:
Go to Settings > MCP > Add new MCP Server. Use the name "playwright" and command npx @playwright/mcp@latest. Alternatively, create a .cursor/mcp.json file in your project root:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
Claude Desktop:
Edit the Claude Desktop config file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
Restart Claude Desktop after saving.
Claude Code:
claude mcp add playwright npx @playwright/mcp@latest
Docker:
{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}
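Because the Cursor, Claude Desktop, and Docker configs all share the same mcpServers shape, adding the entry can be scripted. The Python sketch below merges a playwright entry into an existing config without disturbing other servers; `add_playwright_server` and the `other-server` entry are hypothetical, and reading or writing the actual file is left out.

```python
import json


def add_playwright_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the playwright server added."""
    merged = dict(config)
    # Copy the server map so the caller's dict is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers["playwright"] = {
        "command": "npx",
        "args": ["@playwright/mcp@latest"],
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config that already lists another server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_playwright_server(existing)
print(json.dumps(updated, indent=2))
```

Merging rather than overwriting matters when a team already has other MCP servers configured in the same file.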
Step 4: Verify the Setup
Once configured, test the connection by asking your AI assistant something like: "Open https://example.com and tell me the page title." If the AI launches a browser, navigates to the page, and returns the title, your setup is working correctly.
Playwright MCP vs. Other Browser Automation Tools
Comparison Table
Feature | Playwright MCP | Selenium | Puppeteer | Cypress |
|---|---|---|---|---|
AI-native control | Yes (MCP protocol) | No (requires scripting) | No (requires scripting) | No (requires scripting) |
Natural language input | Yes | No | No | No |
Accessibility tree | Yes (default) | Limited | Limited | No |
Cross-browser | Chromium, Firefox, WebKit | All major browsers | Chromium only | Chromium, Firefox |
Auto-waiting | Built-in | Manual waits | Manual waits | Built-in |
Open-source | Yes (Apache 2.0) | Yes | Yes | Yes (MIT) |
Primary use case | AI-driven automation | Traditional test scripts | Headless Chrome tasks | Component/E2E testing |
Key Differences
Playwright MCP vs. Selenium: Selenium requires explicit test scripts written in a programming language. Playwright MCP accepts natural language commands from AI agents. Selenium supports more browsers (including older versions), but Playwright MCP's accessibility-tree approach produces more reliable element identification with less maintenance.
Playwright MCP vs. Puppeteer: Puppeteer is limited to Chromium-based browsers and requires JavaScript scripting. Playwright MCP supports multiple browser engines and can be driven by any MCP-compatible AI client without writing code.
Playwright MCP vs. Other MCP Browser Servers: Several third-party MCP servers exist for browser automation, including community projects like executeautomation/mcp-playwright. The official Microsoft @playwright/mcp server is the most actively maintained, has the broadest feature set, and receives regular updates aligned with the core Playwright library.
Pros and Cons of Playwright MCP
Pros
Advantage | Detail |
|---|---|
Natural language control | Describe tests in plain English without needing Playwright API knowledge |
Accessibility-tree approach | More reliable than screenshot-based tools; targets elements by roles and labels |
Cross-browser support | Test across Chromium, Firefox, and WebKit with one server |
Fast setup | One |
Universal client support | Works with VS Code, Cursor, Claude Desktop, Windsurf, and many more |
Open-source | Free under Apache 2.0 license, backed by Microsoft, and actively maintained |
Exploratory testing | AI can autonomously discover issues without predefined scripts |
Self-adapting | AI adjusts to page changes in real time rather than following brittle scripts |
Cons
Limitation | Detail |
|---|---|
High token consumption | A typical task uses ~114K tokens via MCP (the CLI alternative reduces this to ~27K) |
Non-deterministic output | AI-generated steps can vary between runs for the same prompt |
No persistent test scripts | Tests live in chat context, not as reusable |
Limited DOM-only coverage | Canvas elements, complex SVGs, or image-based UIs are difficult to interact with |
No built-in test management | No reporting, versioning, or CI/CD integration out of the box |
Context window pressure | Large pages with deep DOM trees can flood the LLM context |
Docker limitations | The Docker image currently supports headless Chromium only |
Best Practices for Using Playwright MCP
Write Specific Prompts
Vague prompts produce unreliable results. Instead of "test the login page," give explicit instructions: "Navigate to /login, enter 'testuser@example.com' in the email field, enter 'password123' in the password field, click the Sign In button, and verify the dashboard heading is visible." More detail means more deterministic results.
Use the CLI for Token-Heavy Workflows
If you are running many browser tasks in sequence or working with large pages, switch to @playwright/cli to reduce token usage by approximately 4x. Reserve MCP for agents that do not have filesystem access.
Combine AI Exploration with Deterministic Tests
Use Playwright MCP for exploratory testing and discovering new flows. Once a flow is validated, codify it into a proper Playwright test script (.spec.ts) for repeatable CI/CD execution. MCP is excellent for discovery; traditional scripts are better for regression.
Start Sessions Clean
Use ephemeral, in-memory browser profiles (the --isolated flag) to avoid state leakage between test runs. Only use persistent profiles (--user-data-dir) when you specifically need to preserve cookies or authentication state.
Scope Your Origin Access
In shared or production-adjacent environments, use --allowed-origins and --blocked-origins to restrict which domains the browser can access. Do not leave the server open to all origins without reason. For a broader look at securing MCP connections, see our guide on MCP security best practices and common risks.
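To reason about what an allowlist and blocklist should contain, it can help to model the policy yourself. The Python sketch below is a conceptual illustration only, not the server's implementation: it assumes that a blocked origin always loses, that an empty allowlist means "allow everything not blocked", and that origins are compared as scheme plus host. The domain names are made up.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to its origin (scheme://host[:port])."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def is_navigation_allowed(url: str, allowed: set[str], blocked: set[str]) -> bool:
    """Assumed policy: block list wins; an empty allow list permits the rest."""
    origin = origin_of(url)
    if origin in blocked:
        return False
    return not allowed or origin in allowed


allowed = {"https://staging.example.com"}
blocked = {"https://admin.example.com"}

print(is_navigation_allowed("https://staging.example.com/login", allowed, blocked))  # True
print(is_navigation_allowed("https://example.org/", allowed, blocked))               # False
```

Check the server's own documentation for the exact flag syntax and matching rules before relying on them in a shared environment.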
Monitor Token Usage
Deep DOM trees on complex pages can consume significant tokens per interaction. If you notice costs spiking or responses being truncated, consider switching to the CLI, using partial snapshots (targeting a specific element with a selector parameter), or simplifying the page under test.
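A quick way to spot oversized snapshots is to estimate their token cost before they reach the model. The Python sketch below uses a rough heuristic of four characters per token, which is an assumption that varies by tokenizer and content; the helper names, the synthetic snapshot, and the 10,000-token budget are all illustrative.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate using a characters-per-token heuristic (assumed)."""
    return -(-len(text) // chars_per_token)  # ceiling division


def over_budget(snapshot: str, budget_tokens: int) -> bool:
    """True when a snapshot is estimated to exceed the per-step budget."""
    return estimate_tokens(snapshot) > budget_tokens


# Synthetic snapshot: a page with 2,000 near-identical list items.
large_snapshot = '- button "Item" [ref=e1]\n' * 2000

if over_budget(large_snapshot, budget_tokens=10_000):
    print("Snapshot is large; consider a partial snapshot or the CLI.")
```

For real accounting, use the token counts reported by your LLM provider; the heuristic is only meant to flag pages that are obviously too large.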
Version Your MCP Configuration
Commit your .cursor/mcp.json, .vscode/settings.json, or equivalent MCP config files to version control. This ensures your entire team shares a consistent setup and new contributors can get started without manual configuration.
Add a Governance Layer for Production Agents
If your AI agents are interacting with production applications or sensitive data through browser automation, you'll want an agent governance layer. Agen.co provides observability over agent actions, enforces delegated permissions, and maintains audit logs across systems. For a practical look at what this involves, read how organizations are governing AI agents across enterprise applications.
Troubleshooting Common Playwright MCP Issues
Problem | Possible Cause | Solution |
|---|---|---|
Server will not start | Node.js not installed or wrong version | Ensure Node.js 18+ is installed. Run |
"Browser not found" error | Playwright browsers not installed | Run |
Client cannot connect | Server not running or port conflict | Verify the server is running in your terminal. Check for error messages on startup. |
Browser opens but tests fail immediately | Stale session or browser crash | Use |
Tests are slow or timing out | Large page DOM overwhelming the context | Use |
AI clicks wrong elements | Ambiguous or duplicate labels on the page | Be more specific in prompts (e.g., "click the Submit button in the login form" instead of "click Submit"). |
| Blocked by default security policy | Use the |
Docker-specific failures | HTTP transport issues with containers | Use stdio transport instead of streamable-http when running in Docker. |
Permission errors on macOS/Linux | npm global install permissions | Use |
What's New in Playwright MCP (2026)
Playwright MCP has evolved rapidly since its initial release. Here are the most significant updates as of early 2026:
Playwright CLI: A Token-Efficient Alternative
Microsoft released @playwright/cli as a companion tool that uses shell commands instead of the MCP protocol. The key advantage is a roughly 4x reduction in token consumption for equivalent tasks. Use CLI when your agent has filesystem access, and MCP when it does not.
Overhauled Session Management
Session management was simplified significantly. The old session commands were replaced with list, close-all, and kill-all. Browser profiles are now ephemeral by default, starting clean with no leftover state. Each workspace gets its own daemon process, preventing cross-project interference.
Security Hardening
Tighter defaults now ship out of the box. File system access is restricted to the workspace root directory. Navigation to file:// URLs is blocked by default. New origin controls (--allowed-origins and --blocked-origins) let you whitelist or blacklist specific domains.
Chrome Extension: Playwright MCP Bridge
An official Chrome extension connects MCP to pages in an existing browser session, using your default profile with all cookies and logged-in state. This eliminates the need to re-implement login flows in automation and is especially useful for testing authenticated workflows.
Video Recording and DevTools
On-demand video recording is available via --save-video=800x600. Combined with --caps=devtools, you can capture performance traces, Core Web Vitals, and full video playback of test sessions.
GitHub Copilot Coding Agent Integration
Playwright MCP is now automatically configured for GitHub Copilot's Coding Agent. No manual setup is required. The agent can read, interact with, and screenshot web pages during code generation, enabling a prompt-to-verification workflow out of the box.
Frequently Asked Questions
Is Playwright MCP free?
Yes. Playwright MCP is open-source under the Apache 2.0 license and free to use. You'll need access to an LLM to drive the MCP server though, and that LLM may have its own costs depending on the provider.
What is the difference between Playwright MCP and regular Playwright?
Regular Playwright is a Node.js library for writing browser automation scripts in JavaScript or TypeScript. You write code, and Playwright executes it. Playwright MCP wraps that same library in an MCP server, allowing AI agents to control the browser through natural language commands instead of code. The underlying browser engine and capabilities are the same.
Can Playwright MCP replace manual testing?
Partially. Playwright MCP is excellent for exploratory testing and quick smoke tests described in natural language. For release-critical regression testing, you'll still want deterministic, repeatable test scripts that produce consistent results across runs. Many teams use Playwright MCP for test discovery and then convert validated flows into traditional Playwright scripts.
What IDEs and clients support Playwright MCP?
Playwright MCP works with any MCP-compatible client. The most popular options include VS Code (via GitHub Copilot), Cursor, Claude Desktop, Claude Code, Windsurf, Goose, OpenAI Codex, GitHub Copilot Coding Agent, Amp, Kiro, and LM Studio. The list continues to grow as MCP adoption expands.
Can I use Playwright MCP in CI/CD pipelines?
Yes. You can run the server in headless mode (--headless) or use the official Docker image for containerized environments. Keep in mind you'll also need an LLM agent orchestrating the tests. For fully deterministic CI/CD execution, many teams prefer traditional Playwright scripts while using MCP for development-time exploration and test generation.
What are the system requirements?
You need Node.js 18 or later and a system capable of running Chromium (or Firefox/WebKit for non-Docker setups). The server itself is lightweight. The main resource consumption comes from the browser instances it launches and the LLM tokens consumed during interaction.
How is Playwright MCP different from screenshot-based browser tools?
Screenshot-based tools capture images of web pages and use computer vision to identify elements. This approach is slower, requires multimodal AI models, and introduces ambiguity when multiple elements look similar visually. Playwright MCP reads the page's accessibility tree instead, providing precise, text-based element identification that works with any LLM and produces more reliable results.
AI agents aren't just generating text and writing code anymore. They're opening browsers, navigating web pages, clicking buttons, filling forms, and verifying results, all without a single line of traditional automation script.
The technology behind this is Playwright MCP, a server that connects AI models to real web browsers through Microsoft's Playwright automation library. Since its release in March 2025, it's quickly become the standard way for large language models (LLMs) to interact with web applications programmatically.
This guide covers everything you need to know about Playwright MCP: what it is, how it works, how to set it up, when to use it, and how it compares to alternatives like the Playwright CLI and traditional automation tools.
What Is MCP (Model Context Protocol)?
Before diving into Playwright MCP specifically, it helps to understand the protocol it's built on.
Model Context Protocol (MCP) is an open-source standard for connecting AI applications to external systems. Originally developed by Anthropic, MCP gives AI models a universal interface to communicate with tools, data sources, and services in a structured, consistent way.
Think of MCP as a USB-C port for AI applications. Just as USB-C provides a single standardized connector for charging, data transfer, and video across different devices, MCP provides a single standardized protocol for AI models to use external tools, regardless of the specific tool or AI application involved.
How MCP Works
MCP follows a client-server architecture with three main components:
Hosts are the AI applications that users interact with directly, such as Claude Desktop, VS Code with GitHub Copilot, or Cursor IDE.
Clients are components embedded within hosts that manage connections to MCP servers. They handle the protocol-level communication, sending requests and receiving responses.
Servers are external programs that expose specific capabilities (called "tools") through the MCP standard. Each server makes a defined set of actions available that any MCP client can invoke. If you want a deeper dive, our guide on understanding what an MCP server is covers the fundamentals in detail.
When an AI model needs to perform an action, the client sends a JSON-formatted request to the appropriate server. The server executes the action and returns a structured response. This happens in real time, allowing the AI to maintain context across multiple interactions.
Why MCP Matters
Before MCP, connecting an AI model to an external tool meant building a custom integration for every tool-and-model combination. Five AI applications and ten tools? That's fifty separate integrations. MCP eliminates this by standardizing the communication layer. Build one MCP server for your tool, and it works with every MCP-compatible AI client.
This standardization has driven broad adoption. AI assistants like Claude and ChatGPT, development tools like VS Code and Cursor, and dozens of other applications now support MCP natively. And the protocol isn't limited to browser automation. Agen.co, for example, provides an enterprise-grade MCP gateway that lets organizations safely expose their SaaS products and internal tools to AI agents while enforcing identity, access control, and data governance at the protocol level.
What Is Playwright MCP?
Playwright MCP is a specific MCP server built on top of Microsoft's Playwright browser automation library. It exposes Playwright's browser controls (opening pages, clicking elements, filling forms, taking screenshots, reading page content) as MCP tools that AI agents can call.
In practice, Playwright MCP acts as a bridge between an AI model and a real web browser. The AI doesn't control the browser directly. Instead, it sends high-level commands to the Playwright MCP server, which translates those commands into Playwright actions, executes them in a live browser instance, and returns structured results.
The Playwright MCP Pipeline
Here is the flow of a typical interaction:
AI Client (such as Claude Desktop or VS Code Copilot) sends an MCP request, for example: call the tool browser_navigate with { url: "https://example.com" }.
Playwright MCP Server receives the request and translates it into a Playwright command.
Real Browser (Chromium, Firefox, or WebKit) executes the navigation.
Structured Response is sent back to the AI client, including page title, URL, accessibility snapshot, and any requested data.
The AI reads this response and decides its next action. This loop repeats until the task is complete.
Accessibility Tree vs. Screenshots
One of the most important design decisions in Playwright MCP is how it reads web pages. Instead of relying on screenshots and computer vision (which require vision-capable models and introduce ambiguity), Playwright MCP uses the browser's accessibility tree.
The accessibility tree is a structured, text-based representation of every element on a webpage. It includes roles (button, link, heading), names (accessible labels), states (checked, disabled), and hierarchy (parent-child relationships). This gives the AI a precise, deterministic understanding of page structure without needing to interpret pixels.
The advantages are significant:
Speed: Reading a text-based tree is significantly faster than processing a screenshot image.
Reliability: Element identification is based on semantic roles and labels, not visual positions that shift across screen sizes.
Token efficiency: A compact accessibility snapshot uses far fewer tokens than a base64-encoded image.
No vision model required: Any LLM can work with Playwright MCP, not just multimodal models with image understanding.
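To make the idea of a structured, text-based tree more concrete, here is a minimal sketch of such a tree and how it flattens into a compact text snapshot. The `AXNode` class, the `render` function, and the sample login page are all hypothetical; they are not Playwright's own types or output format.

```python
from dataclasses import dataclass, field


@dataclass
class AXNode:
    """One node of a simplified accessibility tree (illustrative, not Playwright's type)."""
    role: str
    name: str = ""
    states: tuple = ()
    children: list = field(default_factory=list)


def render(node: AXNode, depth: int = 0) -> list:
    """Flatten the tree into indented text lines, one line per node."""
    label = f'{node.role} "{node.name}"' if node.name else node.role
    if node.states:
        label += " [" + ", ".join(node.states) + "]"
    lines = ["  " * depth + "- " + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# A made-up login page: roles, names, states, and parent-child hierarchy.
page = AXNode("main", children=[
    AXNode("heading", "Sign in"),
    AXNode("textbox", "Email address"),
    AXNode("checkbox", "Remember me", states=("checked",)),
    AXNode("button", "Sign In", states=("disabled",)),
])
print("\n".join(render(page)))
```

Each line carries a role, a label, and any states, which is the information a model needs to pick a target without looking at pixels.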
How Playwright MCP Works Under the Hood
Understanding the internals helps explain why Playwright MCP behaves the way it does and where its limitations come from.
Snapshot Mode (Default)
In its default Snapshot Mode, Playwright MCP captures accessibility snapshots of the current page after each action. These snapshots contain the full accessibility tree: every interactive element, its role, its accessible name, and a reference ID that the AI can use in subsequent commands.
For example, after navigating to a login page, the snapshot might return:
- textbox "Email address" [ref=e1]
The AI can then issue a command like browser_type with ref: "e1" and text: "user@example.com" to fill the email field. This reference-based system makes interactions precise and repeatable.
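The reference-based lookup can be sketched as follows. The first snapshot line is the one from the example above; the other two lines, the regular expression, and the `index_snapshot` helper are assumptions made for illustration, and the real tool schema may expect additional arguments beyond `ref` and `text`.

```python
import re

# Matches lines such as: - textbox "Email address" [ref=e1]
LINE = re.compile(r'^\s*-\s*(?P<role>\w+)\s+"(?P<name>[^"]*)".*\[ref=(?P<ref>\w+)\]')


def index_snapshot(snapshot: str) -> dict:
    """Map (role, accessible name) pairs to their reference IDs."""
    refs = {}
    for line in snapshot.splitlines():
        match = LINE.match(line)
        if match:
            refs[(match["role"], match["name"])] = match["ref"]
    return refs


# Only the first line comes from this guide; the rest are invented.
snapshot = """
- textbox "Email address" [ref=e1]
- textbox "Password" [ref=e2]
- button "Sign In" [ref=e3]
"""
refs = index_snapshot(snapshot)

# Arguments for the follow-up tool call, using the ref found in the snapshot.
type_call = {
    "name": "browser_type",
    "arguments": {"ref": refs[("textbox", "Email address")], "text": "user@example.com"},
}
print(type_call)
```

In practice the model does this matching itself while reading the snapshot; the code only shows why a stable reference makes the follow-up command unambiguous.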
Vision Mode (Optional)
For elements that are not well represented in the accessibility tree, such as canvas elements, complex SVGs, or image-based interfaces, Playwright MCP offers an optional Vision Mode (enabled with --caps=vision). This mode adds coordinate-based interaction tools (browser_mouse_click_xy, browser_mouse_move_xy) that allow the AI to click at specific pixel positions.
Vision Mode requires the AI to work with screenshots rather than accessibility data, making it slower and less deterministic. For most web automation tasks, Snapshot Mode is the better choice.
The Command-Response Loop
A typical Playwright MCP workflow follows this pattern:
The AI calls browser_navigate to open a URL.
The server returns an accessibility snapshot of the loaded page.
The AI analyzes the snapshot, identifies the target elements, and calls browser_type, browser_click, or other tools.
After each action, the server returns an updated snapshot reflecting the new page state.
The AI continues issuing commands based on the updated state until the task is complete.
This feedback loop is what makes Playwright MCP genuinely agentic. The AI doesn't blindly execute a script. It observes the page state after each step and adapts its approach accordingly, handling loading states, dynamic content, and unexpected UI changes.
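The observe, decide, act cycle can be sketched with a stand-in server and a toy decision function in place of the LLM. Everything here is hypothetical: `FakeBrowserServer` only imitates a login page, the snapshot text format is simplified, and `decide` is a hard-coded policy rather than a model.

```python
class FakeBrowserServer:
    """Stand-in for an MCP server: applies a tool call and returns a text snapshot."""

    def __init__(self):
        self.url = "about:blank"
        self.email = ""

    def call(self, name: str, arguments: dict) -> str:
        if name == "browser_navigate":
            self.url = arguments["url"]
        elif name == "browser_type" and arguments["ref"] == "e1":
            self.email = arguments["text"]
        elif name == "browser_click" and arguments["ref"] == "e2" and self.email:
            self.url = "https://example.com/dashboard"
        return self.snapshot()

    def snapshot(self) -> str:
        if self.url.endswith("/dashboard"):
            return '- heading "Dashboard" [ref=e1]'
        if self.url.endswith("/login"):
            value = f": {self.email}" if self.email else ""
            return f'- textbox "Email address" [ref=e1]{value}\n- button "Sign In" [ref=e2]'
        return ""


def decide(snapshot: str):
    """Toy policy standing in for the LLM: choose the next call from the latest snapshot."""
    if 'heading "Dashboard"' in snapshot:
        return None  # goal reached, stop
    if snapshot == "":
        return "browser_navigate", {"url": "https://example.com/login"}
    if "user@example.com" not in snapshot:  # email field not filled in yet
        return "browser_type", {"ref": "e1", "text": "user@example.com"}
    return "browser_click", {"ref": "e2"}


def run(server: FakeBrowserServer, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until the policy reports completion."""
    history, snapshot = [], server.snapshot()
    for _ in range(max_steps):
        action = decide(snapshot)
        if action is None:
            break
        name, arguments = action
        snapshot = server.call(name, arguments)  # each result carries the new page state
        history.append(name)
    return history


print(run(FakeBrowserServer()))
```

The point of the design is that every decision is made from the most recent snapshot, so a changed page leads to a different next step instead of a failed script line.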
Key Features of Playwright MCP
Accessibility-First Automation
The default Snapshot Mode uses the accessibility tree for all interactions. Elements are identified by semantic roles and labels rather than CSS selectors or XPath, making automations more resilient to UI changes. If a developer renames a CSS class, the automation still works as long as the button's accessible label stays the same.
Cross-Browser Support
Playwright MCP supports Chromium, Firefox, and WebKit (Safari's rendering engine). You can specify the browser using the --browser flag:
npx @playwright/mcp@latest --browser=firefox
This allows testing the same workflows across all major browser engines.
Headless and Headed Modes
By default, Playwright MCP runs the browser in headed mode (visible window), which is useful for development and debugging. For CI/CD pipelines and server environments, headless mode runs the browser without a visible window:
npx @playwright/mcp@latest --headless
Wide IDE and Client Support
Playwright MCP works with virtually every major AI coding tool:
VS Code with GitHub Copilot
Cursor
Claude Desktop and Claude Code
Windsurf
Goose
OpenAI Codex
GitHub Copilot Coding Agent (preconfigured, no setup needed)
Amp, Kiro, LM Studio, and many others
The same server command works across all these clients because MCP standardizes the communication protocol.
Session Management
Playwright MCP offers three session modes:
Persistent profile (default): Browser data like cookies and login sessions are saved to disk between runs, similar to a normal browser.
Isolated mode (--isolated): Each session starts with a clean, in-memory profile. All state is lost when the browser closes. Ideal for testing.
Extension mode (--extension): Connects to an existing Chrome browser session using the Playwright MCP Bridge extension, leveraging your logged-in state and cookies.
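As a small sketch of how the flags covered so far combine, the helper below assembles a client configuration entry. The flags (`--browser`, `--headless`, `--isolated`) and the `mcpServers` JSON shape are taken from this guide; the function name and its parameters are hypothetical.

```python
import json


def playwright_mcp_config(browser=None, headless=False, isolated=False) -> dict:
    """Return an mcpServers entry that launches Playwright MCP with the chosen flags."""
    args = ["@playwright/mcp@latest"]
    if browser:
        args.append(f"--browser={browser}")
    if headless:
        args.append("--headless")
    if isolated:
        args.append("--isolated")
    return {"mcpServers": {"playwright": {"command": "npx", "args": args}}}


# Example: a clean, headless Firefox session, such as one used in a CI job.
print(json.dumps(playwright_mcp_config("firefox", headless=True, isolated=True), indent=2))
```

Calling the function with no arguments yields the default setup: a headed browser with a persistent profile.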
Network Mocking and Storage Controls
Beyond basic navigation and interaction, Playwright MCP includes tools for:
Mocking network requests (browser_route): Intercept requests matching a URL pattern and return custom responses.
Cookie management: Get, set, delete, and clear cookies.
localStorage and sessionStorage: Full read/write access for managing client-side storage.
Storage state save/restore: Capture and replay authentication states across sessions.
Video Recording and DevTools Tracing
With the --caps=devtools flag enabled, Playwright MCP supports:
On-demand video recording (--save-video=800x600): Capture full video playback of browser sessions.
Performance tracing: Record and export Playwright traces for debugging, including network activity, DOM snapshots, and console logs.
Core Web Vitals: Capture metrics like LCP, CLS, and INP during test sessions.
Docker Support
An official Docker image is available for containerized environments:
{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}
Note: The Docker image currently supports headless Chromium only. Firefox and WebKit are not yet available in containers.
Playwright MCP vs. Playwright CLI
In mid-2025, Microsoft released @playwright/cli, a companion tool that provides browser automation through standard shell commands rather than the MCP protocol. Understanding the differences helps you choose the right tool for your situation.
Architecture Differences
Aspect | Playwright MCP | Playwright CLI |
|---|---|---|
Communication | Persistent MCP server connection | Stateless shell commands |
State management | Continuous browser session | Each command is independent |
Token cost | ~114,000 tokens per typical task | ~27,000 tokens per typical task |
Tool schema | Full schema loaded into context upfront | Minimal skill description (~68 tokens) |
Output delivery | Streams into LLM context window | Saves to disk files |
Token Efficiency
The most significant difference is token consumption. When an AI agent connects to a Playwright MCP server, the entire tool schema (all available functions and their parameters) is loaded into the context window. This costs approximately 3,600 tokens before the agent even starts working.
The CLI avoids this overhead entirely. It exposes capabilities as lightweight "skills" that cost roughly 68 tokens to describe. Accessibility snapshots and screenshots are saved to disk files instead of being streamed into the context. The result is approximately a 4x reduction in total token usage for equivalent tasks.
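The arithmetic behind these figures is easy to check. The sketch below uses only the token counts quoted in this section; the price per million tokens is a made-up placeholder, so substitute your provider's actual rate.

```python
MCP_TOKENS_PER_TASK = 114_000   # figure quoted in this guide
CLI_TOKENS_PER_TASK = 27_000    # figure quoted in this guide
MCP_SCHEMA_TOKENS = 3_600       # upfront tool-schema cost quoted above
CLI_SKILL_TOKENS = 68           # skill description cost quoted above


def reduction_factor(before: int, after: int) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after


def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Convert a token count into dollars at a given input price."""
    return tokens / 1_000_000 * usd_per_million


PRICE = 3.0  # hypothetical price in USD per million input tokens

print(f"per-task reduction: {reduction_factor(MCP_TOKENS_PER_TASK, CLI_TOKENS_PER_TASK):.1f}x")
print(f"schema overhead ratio: {reduction_factor(MCP_SCHEMA_TOKENS, CLI_SKILL_TOKENS):.0f}x")
print(f"100 tasks via MCP: ${cost_usd(100 * MCP_TOKENS_PER_TASK, PRICE):.2f}")
print(f"100 tasks via CLI: ${cost_usd(100 * CLI_TOKENS_PER_TASK, PRICE):.2f}")
```

The per-task ratio works out to a little over 4, which is where the "approximately 4x" figure comes from. The upfront schema cost is a small share of the total; most of the difference comes from snapshots no longer flowing through the context window.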
When to Use Each
Use Playwright MCP when:
Your AI agent does not have filesystem access (e.g., web-based chat interfaces)
You need persistent browser state across many interactions
You are building exploratory automation or self-healing test workflows
You need rich, iterative reasoning about page structure over time
Use Playwright CLI when:
Your agent runs in a terminal environment with filesystem access (Claude Code, Cursor, Copilot)
Token efficiency is a priority
You are running many browser automation tasks in sequence
You need to compose browser commands with other shell tools
Microsoft's Own Guidance
The official Playwright MCP repository recommends the CLI for coding agents, stating that CLI invocations are more token-efficient because they avoid loading large tool schemas and verbose accessibility trees into the model context. MCP remains recommended for specialized agentic loops that benefit from persistent state and rich introspection.
Common Use Cases for Playwright MCP
AI-Assisted Test Generation
Describe a test scenario in plain English, and the AI generates and executes Playwright test code automatically. For example, the prompt "Test the login flow with valid credentials and verify the dashboard loads" can produce a complete test without you writing a single line of code by hand.
Exploratory Testing
AI agents can autonomously navigate an application, click through different features, and identify potential issues without predefined test scripts. This is particularly valuable for discovering edge cases and unexpected behaviors that static test suites miss.
End-to-End Test Automation
Playwright MCP enables full end-to-end testing workflows where the AI navigates through multi-step user journeys: logging in, filling forms, submitting data, and verifying results across multiple pages. The accessibility-tree approach ensures these tests remain stable even when CSS or layout changes occur.
Web Scraping and Data Extraction
AI agents can navigate to pages, interact with dynamic content (expanding dropdowns, paginating results, scrolling infinite feeds), and extract structured data. The structured accessibility snapshot makes it straightforward to identify and target specific data elements.
Form Automation and Task Workflows
For repetitive browser tasks like filling out forms, submitting applications, or walking through multi-step wizards, Playwright MCP allows an AI agent to handle the entire workflow based on a natural language description of what needs to happen.
CI/CD Integration and Self-Verifying Workflows
GitHub Copilot's Coding Agent has Playwright MCP built in, enabling a powerful workflow: the AI generates code, opens a browser, navigates to the running application, and visually verifies that its changes work correctly. This creates a closed loop where code generation and validation happen automatically.
Governing AI Agent Access in Production
As AI agents gain the ability to browse, interact with, and extract data from web applications, governance becomes a real concern. Who authorized the agent? What data can it access? What actions can it take on behalf of a user?
This is where agent governance platforms like Agen.co come in. While Playwright MCP handles the browser automation mechanics, Agen.co sits between your AI agents and your applications to enforce identity-aware access control for MCP connections, data masking, and tool-level permissions. If you're deploying agents that interact with production applications, pairing browser automation with a governance layer ensures agents operate within defined boundaries and every action is auditable.
How to Set Up Playwright MCP
Prerequisites
Before you start, make sure you have:
Node.js 18 or newer installed on your system
npm available on your PATH
An MCP-compatible client (VS Code, Cursor, Claude Desktop, etc.)
Step 1: Install Playwright Browsers
If you do not already have Playwright set up, install the browser binaries:
npx playwright install
This downloads Chromium, Firefox, and WebKit binaries that Playwright MCP will use.
Step 2: Launch the Playwright MCP Server
You can run the server directly with npx (no permanent installation needed):
npx @playwright/mcp@latest
This starts the server, which then waits for an MCP client to connect; the browser itself is launched once the client sends its first browser command. If the command runs without errors, the server is working.
Step 3: Configure Your AI Client
The configuration is nearly identical across all clients. Here are the most common ones:
VS Code (GitHub Copilot):
Run this command in your terminal:
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
Or add to your VS Code settings.json manually.
Cursor:
Go to Settings > MCP > Add new MCP Server. Use the name "playwright" and command npx @playwright/mcp@latest. Alternatively, create a .cursor/mcp.json file in your project root:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
Claude Desktop:
Edit the Claude Desktop config file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
Restart Claude Desktop after saving.
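If the config file already lists other MCP servers, pasting the snippet over it would remove them. A minimal sketch of a safer merge is shown below, assuming the file uses the `mcpServers` shape shown above. The helper names are hypothetical, and the `filesystem` entry is a made-up example of an existing server.

```python
import json
from pathlib import Path

PLAYWRIGHT_ENTRY = {"command": "npx", "args": ["@playwright/mcp@latest"]}


def add_playwright_server(config: dict) -> dict:
    """Return a copy of the config with a playwright entry, keeping other servers intact."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["playwright"] = PLAYWRIGHT_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config at `path` (if it exists), add the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_playwright_server(existing), indent=2), encoding="utf-8")


# macOS location from this guide; on Windows use %APPDATA%\Claude\claude_desktop_config.json.
# Run when ready: update_config_file(MACOS_CONFIG)
MACOS_CONFIG = Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"

# "filesystem" is a made-up existing entry, used only to show that it survives the merge.
existing = {"mcpServers": {"filesystem": {"command": "npx", "args": ["example-server"]}}}
print(json.dumps(add_playwright_server(existing), indent=2))
```

Back up the original file before running anything that rewrites it.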
Claude Code:
claude mcp add playwright npx @playwright/mcp@latest
Docker:
{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}
Step 4: Verify the Setup
Once configured, test the connection by asking your AI assistant something like: "Open https://example.com and tell me the page title." If the AI launches a browser, navigates to the page, and returns the title, your setup is working correctly.
Playwright MCP vs. Other Browser Automation Tools
Comparison Table
Feature | Playwright MCP | Selenium | Puppeteer | Cypress |
|---|---|---|---|---|
AI-native control | Yes (MCP protocol) | No (requires scripting) | No (requires scripting) | No (requires scripting) |
Natural language input | Yes | No | No | No |
Accessibility tree | Yes (default) | Limited | Limited | No |
Cross-browser | Chromium, Firefox, WebKit | All major browsers | Chrome, Firefox | Chromium, Firefox, WebKit (experimental) |
Auto-waiting | Built-in | Manual waits | Manual waits | Built-in |
Open-source | Yes (Apache 2.0) | Yes | Yes | Yes (MIT) |
Primary use case | AI-driven automation | Traditional test scripts | Headless Chrome tasks | Component/E2E testing |
Key Differences
Playwright MCP vs. Selenium: Selenium requires explicit test scripts written in a programming language. Playwright MCP accepts natural language commands from AI agents. Selenium supports more browsers (including older versions), but Playwright MCP's accessibility-tree approach produces more reliable element identification with less maintenance.
Playwright MCP vs. Puppeteer: Puppeteer is built primarily around Chrome, with Firefox support added more recently, and requires JavaScript scripting. Playwright MCP supports Chromium, Firefox, and WebKit and can be driven by any MCP-compatible AI client without writing code.
Playwright MCP vs. Other MCP Browser Servers: Several third-party MCP servers exist for browser automation, including community projects like executeautomation/mcp-playwright. The official Microsoft @playwright/mcp server is the most actively maintained, has the broadest feature set, and receives regular updates aligned with the core Playwright library.
Pros and Cons of Playwright MCP
Pros
Advantage | Detail |
|---|---|
Natural language control | Describe tests in plain English without needing Playwright API knowledge |
Accessibility-tree approach | More reliable than screenshot-based tools; targets elements by roles and labels |
Cross-browser support | Test across Chromium, Firefox, and WebKit with one server |
Fast setup | One |
Universal client support | Works with VS Code, Cursor, Claude Desktop, Windsurf, and many more |
Open-source | Free under Apache 2.0 license, backed by Microsoft, and actively maintained |
Exploratory testing | AI can autonomously discover issues without predefined scripts |
Self-adapting | AI adjusts to page changes in real time rather than following brittle scripts |
Cons
Limitation | Detail |
|---|---|
High token consumption | A typical task uses ~114K tokens via MCP (the CLI alternative reduces this to ~27K) |
Non-deterministic output | AI-generated steps can vary between runs for the same prompt |
No persistent test scripts | Tests live in chat context, not as reusable |
Limited coverage of non-DOM content | Canvas elements, complex SVGs, or image-based UIs are difficult to interact with |
No built-in test management | No reporting, versioning, or CI/CD integration out of the box |
Context window pressure | Large pages with deep DOM trees can flood the LLM context |
Docker limitations | The Docker image currently supports headless Chromium only |
Best Practices for Using Playwright MCP
Write Specific Prompts
Vague prompts produce unreliable results. Instead of "test the login page," give explicit instructions: "Navigate to /login, enter 'testuser@example.com' in the email field, enter 'password123' in the password field, click the Sign In button, and verify the dashboard heading is visible." More detail means more deterministic results.
Use the CLI for Token-Heavy Workflows
If you are running many browser tasks in sequence or working with large pages, switch to @playwright/cli to reduce token usage by approximately 4x. Reserve MCP for agents that do not have filesystem access.
Combine AI Exploration with Deterministic Tests
Use Playwright MCP for exploratory testing and discovering new flows. Once a flow is validated, codify it into a proper Playwright test script (.spec.ts) for repeatable CI/CD execution. MCP is excellent for discovery; traditional scripts are better for regression.
Start Sessions Clean
Use ephemeral, in-memory browser profiles (the --isolated flag) to avoid state leakage between test runs. Only use persistent profiles (--user-data-dir) when you specifically need to preserve cookies or authentication state.
Scope Your Origin Access
In shared or production-adjacent environments, use --allowed-origins and --blocked-origins to restrict which domains the browser can access. Do not leave the server open to all origins without reason. For a broader look at securing MCP connections, see our guide on MCP security best practices and common risks.
Monitor Token Usage
Deep DOM trees on complex pages can consume significant tokens per interaction. If you notice costs spiking or responses being truncated, consider switching to the CLI, using partial snapshots (targeting a specific element with a selector parameter), or simplifying the page under test.
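One way to notice oversized snapshots before they become a cost problem is a rough size check. The sketch below assumes about four characters per token, which is only a rule of thumb and differs between tokenizers. The function names, the budget value, and the sample snapshots are all hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real tokenizers vary by model and content."""
    return int(len(text) / chars_per_token) + 1


def check_snapshot(snapshot: str, budget_tokens: int) -> str:
    """Classify a snapshot as ok, large, or over the per-step budget."""
    used = estimate_tokens(snapshot)
    if used > budget_tokens:
        return f"over budget: ~{used} tokens (limit {budget_tokens})"
    if used > budget_tokens * 0.5:
        return f"large: ~{used} tokens"
    return f"ok: ~{used} tokens"


# Invented snapshots: a short form and a long product listing.
small_page = '- button "Sign In" [ref=e1]\n' * 20
huge_page = '- link "Product" [ref=e1]\n' * 5000

print(check_snapshot(small_page, budget_tokens=2000))
print(check_snapshot(huge_page, budget_tokens=2000))
```

A check like this could run wherever snapshots are logged, flagging pages that are candidates for the CLI or for partial snapshots.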
Version Your MCP Configuration
Commit your .cursor/mcp.json, .vscode/settings.json, or equivalent MCP config files to version control. This ensures your entire team shares a consistent setup and new contributors can get started without manual configuration.
Add a Governance Layer for Production Agents
If your AI agents are interacting with production applications or sensitive data through browser automation, you'll want an agent governance layer. Agen.co provides observability over agent actions, enforces delegated permissions, and maintains audit logs across systems. For a practical look at what this involves, read how organizations are governing AI agents across enterprise applications.
Troubleshooting Common Playwright MCP Issues
Problem | Possible Cause | Solution |
|---|---|---|
Server will not start | Node.js not installed or wrong version | Ensure Node.js 18+ is installed. Run |
"Browser not found" error | Playwright browsers not installed | Run |
Client cannot connect | Server not running or port conflict | Verify the server is running in your terminal. Check for error messages on startup. |
Browser opens but tests fail immediately | Stale session or browser crash | Use |
Tests are slow or timing out | Large page DOM overwhelming the context | Use |
AI clicks wrong elements | Ambiguous or duplicate labels on the page | Be more specific in prompts (e.g., "click the Submit button in the login form" instead of "click Submit"). |
| Blocked by default security policy | Use the |
Docker-specific failures | HTTP transport issues with containers | Use stdio transport instead of streamable-http when running in Docker. |
Permission errors on macOS/Linux | npm global install permissions | Use |
What's New in Playwright MCP (2026)
Playwright MCP has evolved rapidly since its initial release. Here are the most significant updates as of early 2026:
Playwright CLI: A Token-Efficient Alternative
Microsoft released @playwright/cli as a companion tool that uses shell commands instead of the MCP protocol. The key advantage is a roughly 4x reduction in token consumption for equivalent tasks. Use CLI when your agent has filesystem access, and MCP when it does not.
Overhauled Session Management
Session management was simplified significantly. The old session commands were replaced with list, close-all, and kill-all. Browser profiles are now ephemeral by default, starting clean with no leftover state. Each workspace gets its own daemon process, preventing cross-project interference.
Security Hardening
Tighter defaults now ship out of the box. File system access is restricted to the workspace root directory. Navigation to file:// URLs is blocked by default. New origin controls (--allowed-origins and --blocked-origins) let you allow or block specific domains.
Chrome Extension: Playwright MCP Bridge
An official Chrome extension connects MCP to pages in an existing browser session, using your default profile with all cookies and logged-in state. This eliminates the need to re-implement login flows in automation and is especially useful for testing authenticated workflows.
Video Recording and DevTools
On-demand video recording is available via --save-video=800x600. Combined with --caps=devtools, you can capture performance traces, Core Web Vitals, and full video playback of test sessions.
GitHub Copilot Coding Agent Integration
Playwright MCP is now automatically configured for GitHub Copilot's Coding Agent. No manual setup is required. The agent can read, interact with, and screenshot web pages during code generation, enabling a prompt-to-verification workflow out of the box.
Frequently Asked Questions
Is Playwright MCP free?
Yes. Playwright MCP is open-source under the Apache 2.0 license and free to use. You'll need access to an LLM to drive the MCP server though, and that LLM may have its own costs depending on the provider.
What is the difference between Playwright MCP and regular Playwright?
Regular Playwright is a Node.js library for writing browser automation scripts in JavaScript or TypeScript. You write code, and Playwright executes it. Playwright MCP wraps that same library in an MCP server, allowing AI agents to control the browser through natural language commands instead of code. The underlying browser engine and capabilities are the same.
Can Playwright MCP replace manual testing?
Partially. Playwright MCP is excellent for exploratory testing and quick smoke tests described in natural language. For release-critical regression testing, you'll still want deterministic, repeatable test scripts that produce consistent results across runs. Many teams use Playwright MCP for test discovery and then convert validated flows into traditional Playwright scripts.
What IDEs and clients support Playwright MCP?
Playwright MCP works with any MCP-compatible client. The most popular options include VS Code (via GitHub Copilot), Cursor, Claude Desktop, Claude Code, Windsurf, Goose, OpenAI Codex, GitHub Copilot Coding Agent, Amp, Kiro, and LM Studio. The list continues to grow as MCP adoption expands.
Can I use Playwright MCP in CI/CD pipelines?
Yes. You can run the server in headless mode (--headless) or use the official Docker image for containerized environments. Keep in mind you'll also need an LLM agent orchestrating the tests. For fully deterministic CI/CD execution, many teams prefer traditional Playwright scripts while using MCP for development-time exploration and test generation.
What are the system requirements?
You need Node.js 18 or later and a system capable of running Chromium (or Firefox/WebKit for non-Docker setups). The server itself is lightweight. The main resource consumption comes from the browser instances it launches and the LLM tokens consumed during interaction.
How is Playwright MCP different from screenshot-based browser tools?
Screenshot-based tools capture images of web pages and use computer vision to identify elements. This approach is slower, requires multimodal AI models, and introduces ambiguity when multiple elements look similar visually. Playwright MCP reads the page's accessibility tree instead, providing precise, text-based element identification that works with any LLM and produces more reliable results.
Empower your workforce with secure agents

© 2026 Agen™ | All rights reserved.
Deploy anywhere