AI agents aren't just generating text and writing code anymore. They're opening browsers, navigating web pages, clicking buttons, filling forms, and verifying results, all without a single line of traditional automation script.

The technology behind this is Playwright MCP, a server that connects AI models to real web browsers through Microsoft's Playwright automation library. Since its release in March 2025, it's quickly become the standard way for large language models (LLMs) to interact with web applications programmatically.

This guide covers everything you need to know about Playwright MCP: what it is, how it works, how to set it up, when to use it, and how it compares to alternatives like the Playwright CLI and traditional automation tools.

What Is MCP (Model Context Protocol)?

Before diving into Playwright MCP specifically, it helps to understand the protocol it's built on.

Model Context Protocol (MCP) is an open-source standard for connecting AI applications to external systems. Originally developed by Anthropic, MCP gives AI models a universal interface to communicate with tools, data sources, and services in a structured, consistent way.

Think of MCP as a USB-C port for AI applications. Just as USB-C provides a single standardized connector for charging, data transfer, and video across different devices, MCP provides a single standardized protocol for AI models to use external tools, regardless of the specific tool or AI application involved.

How MCP Works

MCP follows a client-server architecture with three main components:

  • Hosts are the AI applications that users interact with directly, such as Claude Desktop, VS Code with GitHub Copilot, or Cursor IDE.

  • Clients are components embedded within hosts that manage connections to MCP servers. They handle the protocol-level communication, sending requests and receiving responses.

  • Servers are external programs that expose specific capabilities (called "tools") through the MCP standard. Each server makes a defined set of actions available that any MCP client can invoke. If you want a deeper dive, our guide on understanding what an MCP server is covers the fundamentals in detail.

When an AI model needs to perform an action, the client sends a JSON-RPC 2.0 request to the appropriate server. The server executes the action and returns a structured response. Each exchange completes within the same session, so the AI can carry context from one call to the next across a multi-step task.
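To make the request shape concrete, here is a minimal sketch of the message a client might send to invoke a tool. MCP messages follow JSON-RPC 2.0, and tool invocations use the tools/call method; the ToolCallRequest interface and buildToolCall helper are illustrative names, not part of any SDK, and the tool name and URL are borrowed from the Playwright MCP example used later in this guide.

```typescript
// Shape of a JSON-RPC 2.0 request asking an MCP server to run one tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: the `id` lets the client match the reply to this request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "browser_navigate", { url: "https://example.com" });
console.log(JSON.stringify(request, null, 2));
```

In a real host the client library builds and sends this message for you; the sketch only shows what travels over the wire.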

Why MCP Matters

Before MCP, connecting an AI model to an external tool meant building a custom integration for every tool-and-model combination. Five AI applications and ten tools? That's fifty separate integrations. MCP eliminates this by standardizing the communication layer. Build one MCP server for your tool, and it works with every MCP-compatible AI client.
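The arithmetic behind that claim is easy to spell out. The sketch below assumes the simplest possible model: without a shared protocol every application needs one adapter per tool, while with a shared protocol each application implements one client and each tool one server. The function names are illustrative.

```typescript
// Point-to-point: every application needs its own adapter for every tool.
function pairwiseIntegrations(apps: number, tools: number): number {
  return apps * tools;
}

// Shared protocol: one client per application plus one server per tool.
function sharedProtocolIntegrations(apps: number, tools: number): number {
  return apps + tools;
}

console.log(pairwiseIntegrations(5, 10));       // 50
console.log(sharedProtocolIntegrations(5, 10)); // 15
```

The gap widens as either side grows, which is why a standard pays off fastest in ecosystems with many tools and many clients.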

This standardization has driven broad adoption. AI assistants like Claude and ChatGPT, development tools like VS Code and Cursor, and dozens of other applications now support MCP natively. And the protocol isn't limited to browser automation. Agen.co, for example, provides an enterprise-grade MCP gateway that lets organizations safely expose their SaaS products and internal tools to AI agents while enforcing identity, access control, and data governance at the protocol level.

What Is Playwright MCP?

Playwright MCP is a specific MCP server built on top of Microsoft's Playwright browser automation library. It exposes Playwright's browser controls (opening pages, clicking elements, filling forms, taking screenshots, reading page content) as MCP tools that AI agents can call.

In practice, Playwright MCP acts as a bridge between an AI model and a real web browser. The AI doesn't control the browser directly. Instead, it sends high-level commands to the Playwright MCP server, which translates those commands into Playwright actions, executes them in a live browser instance, and returns structured results.

The Playwright MCP Pipeline

Here is the flow of a typical interaction:

  1. AI Client (such as Claude Desktop or VS Code Copilot) sends an MCP request, for example: call the tool browser_navigate with { url: "https://example.com" }.

  2. Playwright MCP Server receives the request and translates it into a Playwright command.

  3. Real Browser (Chromium, Firefox, or WebKit) executes the navigation.

  4. Structured Response is sent back to the AI client, including page title, URL, accessibility snapshot, and any requested data.

The AI reads this response and decides its next action. This loop repeats until the task is complete.

Accessibility Tree vs. Screenshots

One of the most important design decisions in Playwright MCP is how it reads web pages. Instead of relying on screenshots and computer vision (which require vision-capable models and introduce ambiguity), Playwright MCP reads the browser's accessibility tree.

The accessibility tree is a structured, text-based representation of every element on a webpage. It includes roles (button, link, heading), names (accessible labels), states (checked, disabled), and hierarchy (parent-child relationships). This gives the AI a precise, deterministic understanding of page structure without needing to interpret pixels.

The advantages are significant:

  • Speed: Reading a text-based tree is significantly faster than processing a screenshot image.

  • Reliability: Element identification is based on semantic roles and labels, not visual positions that shift across screen sizes.

  • Token efficiency: A compact accessibility snapshot uses far fewer tokens than a base64-encoded image.

  • No vision model required: Any LLM can work with Playwright MCP, not just multimodal models with image understanding.

How Playwright MCP Works Under the Hood

Understanding the internals helps explain why Playwright MCP behaves the way it does and where its limitations come from.

Snapshot Mode (Default)

In its default Snapshot Mode, Playwright MCP captures accessibility snapshots of the current page after each action. These snapshots contain the full accessibility tree: every interactive element, its role, its accessible name, and a reference ID that the AI can use in subsequent commands.

For example, after navigating to a login page, the snapshot might return:

- textbox "Email address" [ref=e1]

The AI can then issue a command like browser_type with ref: "e1" and text: "user@example.com" to fill the email field. This reference-based system makes interactions precise and repeatable.
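As a rough illustration of how a client-side helper could turn snapshot text into a reference, the sketch below parses lines shaped like the example above and looks up a ref by role and name. The sample snapshot lines other than the email field, the SnapshotElement type, and both helper functions are invented for illustration; real snapshots carry more detail than this pattern handles.

```typescript
interface SnapshotElement {
  role: string;
  name: string;
  ref: string;
}

// Matches lines shaped like: - textbox "Email address" [ref=e1]
const SNAPSHOT_LINE = /^\s*-\s+(\w+)\s+"([^"]*)".*\[ref=([^\]]+)\]/;

function parseSnapshotLine(line: string): SnapshotElement | null {
  const match = SNAPSHOT_LINE.exec(line);
  if (match === null) return null;
  const [, role = "", name = "", ref = ""] = match;
  return { role, name, ref };
}

// Return the ref of the first element with the given role and accessible name.
function findRef(snapshot: string, role: string, name: string): string | undefined {
  for (const line of snapshot.split("\n")) {
    const el = parseSnapshotLine(line);
    if (el !== null && el.role === role && el.name === name) return el.ref;
  }
  return undefined;
}

// Hypothetical login-page snapshot modeled on the example line above.
const sampleSnapshot = [
  '- textbox "Email address" [ref=e1]',
  '- textbox "Password" [ref=e2]',
  '- button "Sign in" [ref=e3]',
].join("\n");

const emailRef = findRef(sampleSnapshot, "textbox", "Email address");
console.log({ tool: "browser_type", arguments: { ref: emailRef, text: "user@example.com" } });
```

In practice the model reads the snapshot directly and picks the ref itself; a parser like this is mainly useful for logging or for asserting on snapshots in tests.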

Vision Mode (Optional)

For elements that are not well represented in the accessibility tree, such as canvas elements, complex SVGs, or image-based interfaces, Playwright MCP offers an optional Vision Mode (enabled with --caps=vision). This mode adds coordinate-based interaction tools (browser_mouse_click_xy, browser_mouse_move_xy) that allow the AI to click at specific pixel positions.

Vision Mode requires the AI to work with screenshots rather than accessibility data, making it slower and less deterministic. For most web automation tasks, Snapshot Mode is the better choice.

The Command-Response Loop

A typical Playwright MCP workflow follows this pattern:

  1. The AI calls browser_navigate to open a URL.

  2. The server returns an accessibility snapshot of the loaded page.

  3. The AI analyzes the snapshot, identifies the target elements, and calls browser_type, browser_click, or other tools.

  4. After each action, the server returns an updated snapshot reflecting the new page state.

  5. The AI continues issuing commands based on the updated state until the task is complete.

This feedback loop is what makes Playwright MCP genuinely agentic. The AI doesn't blindly execute a script. It observes the page state after each step and adapts its approach accordingly, handling loading states, dynamic content, and unexpected UI changes.
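The loop can be sketched without a real browser or model. In the example below, FakeBrowserServer stands in for the MCP server and decide stands in for the LLM; both are invented for illustration, and only the tool names (browser_navigate, browser_type, browser_click) come from this guide.

```typescript
type ToolCall = { tool: string; args: Record<string, string> };

// Stand-in for the MCP server: every call returns a fresh text snapshot.
class FakeBrowserServer {
  private email = "";
  private submitted = false;

  call(request: ToolCall): string {
    if (request.tool === "browser_type") this.email = request.args["text"] ?? "";
    if (request.tool === "browser_click" && this.email !== "") this.submitted = true;
    return this.submitted
      ? '- heading "Dashboard" [ref=e9]'
      : '- textbox "Email address" [ref=e1]\n- button "Sign in" [ref=e2]';
  }
}

// Stand-in for the model: picks the next call from the latest snapshot plus a
// flag that remembers whether the email has already been typed.
function decide(snapshot: string, emailTyped: boolean): ToolCall | null {
  if (snapshot.indexOf('heading "Dashboard"') !== -1) return null; // goal reached
  if (!emailTyped) {
    return { tool: "browser_type", args: { ref: "e1", text: "user@example.com" } };
  }
  return { tool: "browser_click", args: { ref: "e2" } };
}

function runLoop(server: FakeBrowserServer, maxSteps = 10): string[] {
  const log = ["browser_navigate"];
  let snapshot = server.call({
    tool: "browser_navigate",
    args: { url: "https://example.com/login" },
  });
  let emailTyped = false;
  for (let step = 0; step < maxSteps; step++) {
    const next = decide(snapshot, emailTyped);
    if (next === null) break;
    snapshot = server.call(next); // observe the updated page state
    if (next.tool === "browser_type") emailTyped = true;
    log.push(next.tool);
  }
  return log;
}

const callLog = runLoop(new FakeBrowserServer());
console.log(callLog); // ["browser_navigate", "browser_type", "browser_click"]
```

The step cap matters in practice: an agent that never reaches its goal state would otherwise keep issuing calls and consuming tokens.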

Key Features of Playwright MCP

Accessibility-First Automation

The default Snapshot Mode uses the accessibility tree for all interactions. Elements are identified by semantic roles and labels rather than CSS selectors or XPath, making automations more resilient to UI changes. If a developer renames a CSS class, the automation still works as long as the button's accessible label stays the same.
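A toy model shows why this matters. The sketch below represents a page as a list of elements and compares a lookup by CSS class with a lookup by role and accessible name after a class rename. The UiElement type, the class names, and both lookup functions are hypothetical and are not how Playwright resolves elements internally.

```typescript
interface UiElement {
  role: string;
  name: string;
  cssClass: string;
}

function findByCssClass(page: UiElement[], cssClass: string): UiElement | undefined {
  for (const el of page) if (el.cssClass === cssClass) return el;
  return undefined;
}

function findByRoleAndName(page: UiElement[], role: string, name: string): UiElement | undefined {
  for (const el of page) if (el.role === role && el.name === name) return el;
  return undefined;
}

// A refactor renamed the class from "btn-primary" but left the label untouched.
const pageAfterRefactor: UiElement[] = [
  { role: "button", name: "Sign in", cssClass: "button--main" },
];

const cssHit = findByCssClass(pageAfterRefactor, "btn-primary");
const roleHit = findByRoleAndName(pageAfterRefactor, "button", "Sign in");
console.log(cssHit !== undefined);  // false: the old selector no longer matches
console.log(roleHit !== undefined); // true: role and label are unchanged
```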

Cross-Browser Support

Playwright MCP supports Chromium, Firefox, and WebKit (Safari's rendering engine). You can specify the browser using the --browser flag:

npx @playwright/mcp@latest --browser=firefox

This allows testing the same workflows across all major browser engines.

Headless and Headed Modes

By default, Playwright MCP runs the browser in headed mode (visible window), which is useful for development and debugging. For CI/CD pipelines and server environments, headless mode runs the browser without a visible window:

npx @playwright/mcp@latest --headless

Wide IDE and Client Support

Playwright MCP works with virtually every major AI coding tool:

  • VS Code with GitHub Copilot

  • Cursor

  • Claude Desktop and Claude Code

  • Windsurf

  • Goose

  • OpenAI Codex

  • GitHub Copilot Coding Agent (preconfigured, no setup needed)

  • Amp, Kiro, LM Studio, and many others

The same server command works across all these clients because MCP standardizes the communication protocol.

Session Management

Playwright MCP offers three session modes:

  • Persistent profile (default): Browser data like cookies and login sessions are saved to disk between runs, similar to a normal browser.

  • Isolated mode (--isolated): Each session starts with a clean, in-memory profile. All state is lost when the browser closes. Ideal for testing.

  • Extension mode (--extension): Connects to an existing Chrome browser session using the Playwright MCP Bridge extension, leveraging your logged-in state and cookies.
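If you generate client configuration from code, the flags covered so far can be assembled into the args array programmatically. This is a small sketch under stated assumptions: LaunchOptions and buildServerArgs are hypothetical names, the browser value is passed through unvalidated, and only flags that appear in this guide (--browser, --headless, --isolated, --extension) are used.

```typescript
type SessionMode = "persistent" | "isolated" | "extension";

interface LaunchOptions {
  browser?: string; // passed straight to --browser, e.g. "firefox"
  headless?: boolean;
  session?: SessionMode; // "persistent" adds no flag because it is the default
}

function buildServerArgs(options: LaunchOptions = {}): string[] {
  const args = ["@playwright/mcp@latest"];
  if (options.browser !== undefined) args.push(`--browser=${options.browser}`);
  if (options.headless === true) args.push("--headless");
  if (options.session === "isolated") args.push("--isolated");
  if (options.session === "extension") args.push("--extension");
  return args;
}

const serverEntry = {
  command: "npx",
  args: buildServerArgs({ browser: "firefox", headless: true, session: "isolated" }),
};
console.log(JSON.stringify(serverEntry));
```

The resulting object has the same command and args shape as the playwright entry in the client configuration files shown in the setup section.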

Network Mocking and Storage Controls

Beyond basic navigation and interaction, Playwright MCP includes tools for:

  • Mocking network requests (browser_route): Intercept requests matching a URL pattern and return custom responses.

  • Cookie management: Get, set, delete, and clear cookies.

  • localStorage and sessionStorage: Full read/write access for managing client-side storage.

  • Storage state save/restore: Capture and replay authentication states across sessions.

Video Recording and DevTools Tracing

With the --caps=devtools flag enabled, Playwright MCP supports:

  • On-demand video recording (--save-video=800x600): Capture full video playback of browser sessions.

  • Performance tracing: Record and export Playwright traces for debugging, including network activity, DOM snapshots, and console logs.

  • Core Web Vitals: Capture metrics like LCP, CLS, and INP during test sessions.

Docker Support

An official Docker image is available for containerized environments:

{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}


Note: The Docker image currently supports headless Chromium only. Firefox and WebKit are not yet available in containers.

Playwright MCP vs. Playwright CLI

In mid-2025, Microsoft released @playwright/cli, a companion tool that provides browser automation through standard shell commands rather than the MCP protocol. Understanding the differences helps you choose the right tool for your situation.

Architecture Differences

| Aspect | Playwright MCP | Playwright CLI |
| --- | --- | --- |
| Communication | Persistent MCP server connection | Stateless shell commands |
| State management | Continuous browser session | Each command is independent |
| Token cost | ~114,000 tokens per typical task | ~27,000 tokens per typical task |
| Tool schema | Full schema loaded into context upfront | Minimal skill description (~68 tokens) |
| Output delivery | Streams into LLM context window | Saves to disk files |

Token Efficiency

The most significant difference is token consumption. When an AI agent connects to a Playwright MCP server, the entire tool schema (all available functions and their parameters) is loaded into the context window. This costs approximately 3,600 tokens before the agent even starts working.

The CLI avoids this overhead entirely. It exposes capabilities as lightweight "skills" that cost roughly 68 tokens to describe. Accessibility snapshots and screenshots are saved to disk files instead of being streamed into the context. The result is approximately a 4x reduction in total token usage for equivalent tasks.
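Working through the figures quoted above shows where the savings come from. The numbers in this sketch are the ones cited in this guide, not new measurements, and they will vary by task and page.

```typescript
// Figures quoted in this guide; treat them as rough, task-dependent estimates.
const MCP_TOKENS_PER_TASK = 114000;
const CLI_TOKENS_PER_TASK = 27000;
const MCP_SCHEMA_TOKENS = 3600;
const CLI_SKILL_TOKENS = 68;

const taskRatio = MCP_TOKENS_PER_TASK / CLI_TOKENS_PER_TASK;
const schemaRatio = MCP_SCHEMA_TOKENS / CLI_SKILL_TOKENS;
const schemaShare = MCP_SCHEMA_TOKENS / MCP_TOKENS_PER_TASK;

console.log(taskRatio.toFixed(1));           // "4.2"
console.log(Math.round(schemaRatio));        // 53
console.log((schemaShare * 100).toFixed(1)); // "3.2"
```

On these numbers the upfront schema is only about 3% of a typical MCP task, which suggests that most of the roughly 4x saving comes from keeping snapshots and screenshots out of the context window rather than from the smaller tool description.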

When to Use Each

Use Playwright MCP when:

  • Your AI agent does not have filesystem access (e.g., web-based chat interfaces)

  • You need persistent browser state across many interactions

  • You are building exploratory automation or self-healing test workflows

  • You need rich, iterative reasoning about page structure over time

Use Playwright CLI when:

  • Your agent runs in a terminal environment with filesystem access (Claude Code, Cursor, Copilot)

  • Token efficiency is a priority

  • You are running many browser automation tasks in sequence

  • You need to compose browser commands with other shell tools

Microsoft's Own Guidance

The official Playwright MCP repository recommends the CLI for coding agents, stating that CLI invocations are more token-efficient because they avoid loading large tool schemas and verbose accessibility trees into the model context. MCP remains recommended for specialized agentic loops that benefit from persistent state and rich introspection.

Common Use Cases for Playwright MCP

AI-Assisted Test Generation

Describe a test scenario in plain English, and the AI generates and executes Playwright test code automatically. For example, the prompt "Test the login flow with valid credentials and verify the dashboard loads" can produce a complete test without you writing any of the code by hand.

Exploratory Testing

AI agents can autonomously navigate an application, click through different features, and identify potential issues without predefined test scripts. This is particularly valuable for discovering edge cases and unexpected behaviors that static test suites miss.

End-to-End Test Automation

Playwright MCP enables full end-to-end testing workflows where the AI navigates through multi-step user journeys: logging in, filling forms, submitting data, and verifying results across multiple pages. The accessibility-tree approach ensures these tests remain stable even when CSS or layout changes occur.

Web Scraping and Data Extraction

AI agents can navigate to pages, interact with dynamic content (expanding dropdowns, paginating results, scrolling infinite feeds), and extract structured data. The structured accessibility snapshot makes it straightforward to identify and target specific data elements.

Form Automation and Task Workflows

For repetitive browser tasks like filling out forms, submitting applications, or walking through multi-step wizards, Playwright MCP allows an AI agent to handle the entire workflow based on a natural language description of what needs to happen.

CI/CD Integration and Self-Verifying Workflows

GitHub Copilot's Coding Agent has Playwright MCP built in, enabling a powerful workflow: the AI generates code, opens a browser, navigates to the running application, and visually verifies that its changes work correctly. This creates a closed loop where code generation and validation happen automatically.

Governing AI Agent Access in Production

As AI agents gain the ability to browse, interact with, and extract data from web applications, governance becomes a real concern. Who authorized the agent? What data can it access? What actions can it take on behalf of a user?

This is where agent governance platforms like Agen.co come in. While Playwright MCP handles the browser automation mechanics, Agen.co sits between your AI agents and your applications to enforce identity-aware access control for MCP connections, data masking, and tool-level permissions. If you're deploying agents that interact with production applications, pairing browser automation with a governance layer ensures agents operate within defined boundaries and every action is auditable.

How to Set Up Playwright MCP

Prerequisites

Before you start, make sure you have:

  • Node.js 18 or newer installed on your system

  • npm available on your PATH

  • An MCP-compatible client (VS Code, Cursor, Claude Desktop, etc.)
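If you script your environment checks, the Node.js requirement can be verified from the version string that node --version prints. The helper names below are illustrative, and the minimum of 18 is the one stated in the prerequisites above.

```typescript
// Parse strings such as "v18.19.0", the format printed by `node --version`.
function parseMajor(version: string): number | null {
  const match = /^v?(\d+)(?:\.|$)/.exec(version.trim());
  if (match === null) return null;
  const [, major = ""] = match;
  return parseInt(major, 10);
}

// True when the major version is at least the required minimum.
function meetsMinimum(version: string, minimumMajor = 18): boolean {
  const major = parseMajor(version);
  return major !== null && major >= minimumMajor;
}

console.log(meetsMinimum("v18.19.0")); // true
console.log(meetsMinimum("v16.20.2")); // false
```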

Step 1: Install Playwright Browsers

If you do not already have Playwright set up, install the browser binaries:

npx playwright install

This downloads Chromium, Firefox, and WebKit binaries that Playwright MCP will use.

Step 2: Launch the Playwright MCP Server

You can run the server directly with npx (no permanent installation needed):

npx @playwright/mcp@latest

This starts the server and opens a browser instance ready to receive commands. If this runs without errors, the server is working.

Step 3: Configure Your AI Client

The configuration is nearly identical across all clients. Here are the most common ones:

VS Code (GitHub Copilot):

Run this command in your terminal:

code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'

Or add to your VS Code settings.json manually.

Cursor:

Go to Settings > MCP > Add new MCP Server. Use the name "playwright" and command npx @playwright/mcp@latest. Alternatively, create a .cursor/mcp.json file in your project root:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}


Claude Desktop:

Edit the Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

  • Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}


Restart Claude Desktop after saving.

Claude Code:

claude mcp add playwright npx @playwright/mcp@latest

Docker:

{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}


Step 4: Verify the Setup

Once configured, test the connection by asking your AI assistant something like: "Open https://example.com and tell me the page title." If the AI launches a browser, navigates to the page, and returns the title, your setup is working correctly.

Playwright MCP vs. Other Browser Automation Tools

Comparison Table

| Feature | Playwright MCP | Selenium | Puppeteer | Cypress |
| --- | --- | --- | --- | --- |
| AI-native control | Yes (MCP protocol) | No (requires scripting) | No (requires scripting) | No (requires scripting) |
| Natural language input | Yes | No | No | No |
| Accessibility tree | Yes (default) | Limited | Limited | No |
| Cross-browser | Chromium, Firefox, WebKit | All major browsers | Chromium, Firefox | Chromium, Firefox |
| Auto-waiting | Built-in | Manual waits | Manual waits | Built-in |
| Open-source | Yes (Apache 2.0) | Yes | Yes | Yes (MIT) |
| Primary use case | AI-driven automation | Traditional test scripts | Headless Chrome tasks | Component/E2E testing |

Key Differences

Playwright MCP vs. Selenium: Selenium requires explicit test scripts written in a programming language. Playwright MCP accepts natural language commands from AI agents. Selenium supports more browsers (including older versions), but Playwright MCP's accessibility-tree approach produces more reliable element identification with less maintenance.

Playwright MCP vs. Puppeteer: Puppeteer is centered on Chrome and Chromium, with Firefox support added more recently, and it requires JavaScript scripting. Playwright MCP covers Chromium, Firefox, and WebKit and can be driven by any MCP-compatible AI client without writing code.

Playwright MCP vs. Other MCP Browser Servers: Several third-party MCP servers exist for browser automation, including community projects like executeautomation/mcp-playwright. The official Microsoft @playwright/mcp server is the most actively maintained, has the broadest feature set, and receives regular updates aligned with the core Playwright library.

Pros and Cons of Playwright MCP

Pros

| Advantage | Detail |
| --- | --- |
| Natural language control | Describe tests in plain English without needing Playwright API knowledge |
| Accessibility-tree approach | More reliable than screenshot-based tools; targets elements by roles and labels |
| Cross-browser support | Test across Chromium, Firefox, and WebKit with one server |
| Fast setup | One npx command to get started, no complex configuration |
| Universal client support | Works with VS Code, Cursor, Claude Desktop, Windsurf, and many more |
| Open-source | Free under Apache 2.0 license, backed by Microsoft, and actively maintained |
| Exploratory testing | AI can autonomously discover issues without predefined scripts |
| Self-adapting | AI adjusts to page changes in real time rather than following brittle scripts |

Cons

| Limitation | Detail |
| --- | --- |
| High token consumption | A typical task uses ~114K tokens via MCP (the CLI alternative reduces this to ~27K) |
| Non-deterministic output | AI-generated steps can vary between runs for the same prompt |
| No persistent test scripts | Tests live in chat context, not as reusable .spec.ts files |
| Limited DOM-only coverage | Canvas elements, complex SVGs, or image-based UIs are difficult to interact with |
| No built-in test management | No reporting, versioning, or CI/CD integration out of the box |
| Context window pressure | Large pages with deep DOM trees can flood the LLM context |
| Docker limitations | The Docker image currently supports headless Chromium only |

Best Practices for Using Playwright MCP

Write Specific Prompts

Vague prompts produce unreliable results. Instead of "test the login page," give explicit instructions: "Navigate to /login, enter 'testuser@example.com' in the email field, enter 'password123' in the password field, click the Sign In button, and verify the dashboard heading is visible." More detail means more deterministic results.

Use the CLI for Token-Heavy Workflows

If you are running many browser tasks in sequence or working with large pages, switch to @playwright/cli to reduce token usage by approximately 4x. Reserve MCP for agents that do not have filesystem access.

Combine AI Exploration with Deterministic Tests

Use Playwright MCP for exploratory testing and discovering new flows. Once a flow is validated, codify it into a proper Playwright test script (.spec.ts) for repeatable CI/CD execution. MCP is excellent for discovery; traditional scripts are better for regression.
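One way to picture the hand-off from exploration to a deterministic test is a small generator that turns a list of validated steps into .spec.ts source text. This is a sketch, not a feature of Playwright MCP: the Step type and both functions are hypothetical, the field labels and heading name are placeholders, and the generated code assumes the standard @playwright/test API (page.goto, getByLabel, getByRole, expect(...).toBeVisible()) plus a baseURL configured for the relative /login path.

```typescript
type Step =
  | { kind: "goto"; url: string }
  | { kind: "fill"; label: string; value: string }
  | { kind: "click"; button: string }
  | { kind: "expectHeading"; name: string };

// Quote a value as a JavaScript string literal.
const quote = (value: string): string => JSON.stringify(value);

function stepToCode(step: Step): string {
  switch (step.kind) {
    case "goto":
      return `await page.goto(${quote(step.url)});`;
    case "fill":
      return `await page.getByLabel(${quote(step.label)}).fill(${quote(step.value)});`;
    case "click":
      return `await page.getByRole('button', { name: ${quote(step.button)} }).click();`;
    case "expectHeading":
      return `await expect(page.getByRole('heading', { name: ${quote(step.name)} })).toBeVisible();`;
  }
}

// Wrap the generated statements in a single Playwright test.
function toSpec(title: string, steps: Step[]): string {
  const body = steps.map((step) => `  ${stepToCode(step)}`).join("\n");
  return [
    "import { test, expect } from '@playwright/test';",
    "",
    `test(${quote(title)}, async ({ page }) => {`,
    body,
    "});",
    "",
  ].join("\n");
}

// Placeholder flow; labels and heading text are assumptions about the app.
const loginSteps: Step[] = [
  { kind: "goto", url: "/login" },
  { kind: "fill", label: "Email", value: "testuser@example.com" },
  { kind: "fill", label: "Password", value: "password123" },
  { kind: "click", button: "Sign In" },
  { kind: "expectHeading", name: "Dashboard" },
];

const specSource = toSpec("login flow", loginSteps);
console.log(specSource);
```

The generated file can then be reviewed, committed, and run with the regular Playwright test runner, independent of any LLM.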

Start Sessions Clean

Use ephemeral, in-memory browser profiles (the --isolated flag) to avoid state leakage between test runs. Only use persistent profiles (--user-data-dir) when you specifically need to preserve cookies or authentication state.

Scope Your Origin Access

In shared or production-adjacent environments, use --allowed-origins and --blocked-origins to restrict which domains the browser can access. Do not leave the server open to all origins without reason. For a broader look at securing MCP connections, see our guide on MCP security best practices and common risks.
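The general idea behind allow and block lists can be sketched as follows. This is an illustration of the concept, not the server's implementation: it assumes block rules take precedence over allow rules and that an empty allow list means "allow everything", so check the server's own documentation for the exact semantics. The hostnames are placeholders.

```typescript
// Normalize a URL to its origin (scheme, host, and port); null if unparseable.
function originOf(url: string): string | null {
  try {
    return new URL(url).origin;
  } catch {
    return null;
  }
}

// Assumed policy: blocked origins always lose; an empty allow list permits all others.
function isRequestAllowed(url: string, allowed: string[], blocked: string[]): boolean {
  const origin = originOf(url);
  if (origin === null) return false;
  if (blocked.indexOf(origin) !== -1) return false;
  return allowed.length === 0 || allowed.indexOf(origin) !== -1;
}

const allowedOrigins = ["https://staging.example.com"];
const blockedOrigins = ["https://ads.example.net"];

console.log(isRequestAllowed("https://staging.example.com/login", allowedOrigins, blockedOrigins)); // true
console.log(isRequestAllowed("https://ads.example.net/pixel", allowedOrigins, blockedOrigins));     // false
console.log(isRequestAllowed("https://other.example.org/", allowedOrigins, blockedOrigins));        // false
```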

Monitor Token Usage

Deep DOM trees on complex pages can consume significant tokens per interaction. If you notice costs spiking or responses being truncated, consider switching to the CLI, using partial snapshots (targeting a specific element with a selector parameter), or simplifying the page under test.

Version Your MCP Configuration

Commit your .cursor/mcp.json, .vscode/settings.json, or equivalent MCP config files to version control. This ensures your entire team shares a consistent setup and new contributors can get started without manual configuration.
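Once the config lives in version control, a small check in CI or a pre-commit hook can catch a broken or missing entry before it reaches teammates. The sketch below assumes the mcpServers shape shown in the Cursor and Claude Desktop examples in this guide; other clients may use a different top-level key, and checkPlaywrightEntry is a hypothetical helper.

```typescript
// Return a list of problems; an empty list means the entry looks usable.
function checkPlaywrightEntry(rawJson: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return ["file is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["top level must be an object"];
  }
  const servers = (parsed as { mcpServers?: unknown }).mcpServers;
  if (typeof servers !== "object" || servers === null) {
    return ['missing "mcpServers" object'];
  }
  const entry = (servers as Record<string, unknown>)["playwright"];
  if (typeof entry !== "object" || entry === null) {
    return ['missing "mcpServers.playwright" entry'];
  }
  const { command, args } = entry as { command?: unknown; args?: unknown };
  const problems: string[] = [];
  if (typeof command !== "string") problems.push('"command" must be a string');
  if (!Array.isArray(args) || args.length === 0) {
    problems.push('"args" must be a non-empty array');
  }
  return problems;
}

const goodConfig =
  '{"mcpServers":{"playwright":{"command":"npx","args":["@playwright/mcp@latest"]}}}';
// Same content with the final closing brace missing.
const truncatedConfig =
  '{"mcpServers":{"playwright":{"command":"npx","args":["@playwright/mcp@latest"]}}';

console.log(checkPlaywrightEntry(goodConfig));      // []
console.log(checkPlaywrightEntry(truncatedConfig)); // ["file is not valid JSON"]
```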

Add a Governance Layer for Production Agents

If your AI agents are interacting with production applications or sensitive data through browser automation, you'll want an agent governance layer. Agen.co provides observability over agent actions, enforces delegated permissions, and maintains audit logs across systems. For a practical look at what this involves, read how organizations are governing AI agents across enterprise applications.

Troubleshooting Common Playwright MCP Issues

| Problem | Possible Cause | Solution |
| --- | --- | --- |
| Server will not start | Node.js not installed or wrong version | Ensure Node.js 18+ is installed. Run node --version to check. |
| "Browser not found" error | Playwright browsers not installed | Run npx playwright install to download browser binaries. |
| Client cannot connect | Server not running or port conflict | Verify the server is running in your terminal. Check for error messages on startup. |
| Browser opens but tests fail immediately | Stale session or browser crash | Use --isolated flag or restart the server to clear old sessions. |
| Tests are slow or timing out | Large page DOM overwhelming the context | Use --caps=devtools for targeted inspection or switch to the CLI for token reduction. |
| AI clicks wrong elements | Ambiguous or duplicate labels on the page | Be more specific in prompts (e.g., "click the Submit button in the login form" instead of "click Submit"). |
| file:// URLs will not open | Blocked by default security policy | Use the --allow-unrestricted-file-access flag if you need local file access. |
| Docker-specific failures | HTTP transport issues with containers | Use stdio transport instead of streamable-http when running in Docker. |
| Permission errors on macOS/Linux | npm global install permissions | Use npx (recommended) instead of global install. |

What's New in Playwright MCP (2026)

Playwright MCP has evolved rapidly since its initial release. Here are the most significant updates as of early 2026:

Playwright CLI: A Token-Efficient Alternative

Microsoft released @playwright/cli as a companion tool that uses shell commands instead of the MCP protocol. The key advantage is a roughly 4x reduction in token consumption for equivalent tasks. Use the CLI when your agent has filesystem access, and MCP when it does not.

Overhauled Session Management

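Session management was simplified significantly. The old session commands were replaced with list, close-all, and kill-all. Sessions started through the CLI now use ephemeral browser profiles by default, starting clean with no leftover state, while the MCP server keeps the persistent-profile default described under Session Management unless you pass --isolated. Each workspace gets its own daemon process, preventing cross-project interference.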

Security Hardening

Tighter defaults now ship out of the box. File system access is restricted to the workspace root directory. Navigation to file:// URLs is blocked by default. New origin controls (--allowed-origins and --blocked-origins) let you whitelist or blacklist specific domains.

Chrome Extension: Playwright MCP Bridge

An official Chrome extension connects MCP to pages in an existing browser session, using your default profile with all cookies and logged-in state. This eliminates the need to re-implement login flows in automation and is especially useful for testing authenticated workflows.

Video Recording and DevTools

On-demand video recording is available via --save-video=800x600. Combined with --caps=devtools, you can capture performance traces, Core Web Vitals, and full video playback of test sessions.

GitHub Copilot Coding Agent Integration

Playwright MCP is now automatically configured for GitHub Copilot's Coding Agent. No manual setup is required. The agent can read, interact with, and screenshot web pages during code generation, enabling a prompt-to-verification workflow out of the box.

Frequently Asked Questions

Is Playwright MCP free?

Yes. Playwright MCP is open-source under the Apache 2.0 license and free to use. You'll need access to an LLM to drive the MCP server though, and that LLM may have its own costs depending on the provider.

What is the difference between Playwright MCP and regular Playwright?

Regular Playwright is a browser automation library, most commonly used from Node.js with JavaScript or TypeScript (bindings for Python, Java, and .NET also exist). You write code, and Playwright executes it. Playwright MCP wraps the Node.js library in an MCP server, allowing AI agents to control the browser through natural language commands instead of code. The underlying browser engines and capabilities are the same.

Can Playwright MCP replace manual testing?

Partially. Playwright MCP is excellent for exploratory testing and quick smoke tests described in natural language. For release-critical regression testing, you'll still want deterministic, repeatable test scripts that produce consistent results across runs. Many teams use Playwright MCP for test discovery and then convert validated flows into traditional Playwright scripts.

What IDEs and clients support Playwright MCP?

Playwright MCP works with any MCP-compatible client. The most popular options include VS Code (via GitHub Copilot), Cursor, Claude Desktop, Claude Code, Windsurf, Goose, OpenAI Codex, GitHub Copilot Coding Agent, Amp, Kiro, and LM Studio. The list continues to grow as MCP adoption expands.

Can I use Playwright MCP in CI/CD pipelines?

Yes. You can run the server in headless mode (--headless) or use the official Docker image for containerized environments. Keep in mind you'll also need an LLM agent orchestrating the tests. For fully deterministic CI/CD execution, many teams prefer traditional Playwright scripts while using MCP for development-time exploration and test generation.

What are the system requirements?

You need Node.js 18 or later and a system capable of running Chromium (or Firefox/WebKit for non-Docker setups). The server itself is lightweight. The main resource consumption comes from the browser instances it launches and the LLM tokens consumed during interaction.

How is Playwright MCP different from screenshot-based browser tools?

Screenshot-based tools capture images of web pages and use computer vision to identify elements. This approach is slower, requires multimodal AI models, and introduces ambiguity when multiple elements look similar visually. Playwright MCP reads the page's accessibility tree instead, providing precise, text-based element identification that works with any LLM and produces more reliable results.

AI agents aren't just generating text and writing code anymore. They're opening browsers, navigating web pages, clicking buttons, filling forms, and verifying results, all without a single line of traditional automation script.

The technology behind this is Playwright MCP, a server that connects AI models to real web browsers through Microsoft's Playwright automation library. Since its release in March 2025, it's quickly become the standard way for large language models (LLMs) to interact with web applications programmatically.

This guide covers everything you need to know about Playwright MCP: what it is, how it works, how to set it up, when to use it, and how it compares to alternatives like the Playwright CLI and traditional automation tools.

What Is MCP (Model Context Protocol)?

Before diving into Playwright MCP specifically, it helps to understand the protocol it's built on.

Model Context Protocol (MCP) is an open-source standard for connecting AI applications to external systems. Originally developed by Anthropic, MCP gives AI models a universal interface to communicate with tools, data sources, and services in a structured, consistent way.

Think of MCP as a USB-C port for AI applications. Just as USB-C provides a single standardized connector for charging, data transfer, and video across different devices, MCP provides a single standardized protocol for AI models to use external tools, regardless of the specific tool or AI application involved.

How MCP Works

MCP follows a client-server architecture with three main components:

  • Hosts are the AI applications that users interact with directly, such as Claude Desktop, VS Code with GitHub Copilot, or Cursor IDE.

  • Clients are components embedded within hosts that manage connections to MCP servers. They handle the protocol-level communication, sending requests and receiving responses.

  • Servers are external programs that expose specific capabilities (called "tools") through the MCP standard. Each server makes a defined set of actions available that any MCP client can invoke. If you want a deeper dive, our guide on understanding what an MCP server is covers the fundamentals in detail.

When an AI model needs to perform an action, the client sends a JSON-formatted request to the appropriate server. The server executes the action and returns a structured response. This happens in real time, allowing the AI to maintain context across multiple interactions.
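To make that request/response exchange concrete, here is a minimal sketch in Python. MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The helper names `build_tool_call` and `extract_text` are invented for this example, and the sample reply is hand-written rather than captured from a real server.

```python
import json
from itertools import count

_request_ids = count(1)  # each pending JSON-RPC request needs its own id


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_text(response: dict) -> str:
    """Join the text parts of a tool result, raising if the call failed."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    return "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )


request = build_tool_call("browser_navigate", {"url": "https://example.com"})
wire_message = json.dumps(request)  # what the client would send to the server

# Hypothetical reply, written by hand for illustration only.
sample_response = {
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {"content": [{"type": "text", "text": "Page title: Example Domain"}]},
}

print(wire_message)
print(extract_text(sample_response))
```

In a real host the client library builds and parses these messages for you; the sketch only shows the shape of what travels between client and server.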

Why MCP Matters

Before MCP, connecting an AI model to an external tool meant building a custom integration for every tool-and-model combination. Five AI applications and ten tools? That's fifty separate integrations. MCP eliminates this by standardizing the communication layer. Build one MCP server for your tool, and it works with every MCP-compatible AI client.
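The arithmetic behind that claim is simple enough to check in a few lines. This is a simplified model that assumes one custom adapter per app-and-tool pair without MCP, and one client per app plus one server per tool with it.

```python
def custom_integrations(apps: int, tools: int) -> int:
    """Without a shared protocol, every app needs an adapter for every tool."""
    return apps * tools


def mcp_components(apps: int, tools: int) -> int:
    """With MCP, each app ships one client and each tool ships one server."""
    return apps + tools


print(custom_integrations(5, 10))  # 50
print(mcp_components(5, 10))       # 15
```

The gap widens as either number grows, because one side scales with the product and the other with the sum.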

This standardization has driven broad adoption. AI assistants like Claude and ChatGPT, development tools like VS Code and Cursor, and dozens of other applications now support MCP natively. And the protocol isn't limited to browser automation. Agen.co, for example, provides an enterprise-grade MCP gateway that lets organizations safely expose their SaaS products and internal tools to AI agents while enforcing identity, access control, and data governance at the protocol level.

What Is Playwright MCP?

Playwright MCP is a specific MCP server built on top of Microsoft's Playwright browser automation library. It exposes Playwright's browser controls (opening pages, clicking elements, filling forms, taking screenshots, reading page content) as MCP tools that AI agents can call.

In practice, Playwright MCP acts as a bridge between an AI model and a real web browser. The AI doesn't control the browser directly. Instead, it sends high-level commands to the Playwright MCP server, which translates those commands into Playwright actions, executes them in a live browser instance, and returns structured results.

The Playwright MCP Pipeline

Here is the flow of a typical interaction:

  1. AI Client (such as Claude Desktop or VS Code Copilot) sends an MCP request, for example: call the tool browser_navigate with { url: "https://example.com" }.

  2. Playwright MCP Server receives the request and translates it into a Playwright command.

  3. Real Browser (Chromium, Firefox, or WebKit) executes the navigation.

  4. Structured Response is sent back to the AI client, including page title, URL, accessibility snapshot, and any requested data.

The AI reads this response and decides its next action. This loop repeats until the task is complete.

Accessibility Tree vs. Screenshots

One of the most important design decisions in Playwright MCP is how it reads web pages. Instead of relying on screenshots and computer vision (which require vision-capable models and introduce ambiguity), Playwright MCP uses the browser's accessibility tree.

The accessibility tree is a structured, text-based representation of every element on a webpage. It includes roles (button, link, heading), names (accessible labels), states (checked, disabled), and hierarchy (parent-child relationships). This gives the AI a precise, deterministic understanding of page structure without needing to interpret pixels.

The advantages are significant:

  • Speed: Reading a text-based tree is significantly faster than processing a screenshot image.

  • Reliability: Element identification is based on semantic roles and labels, not visual positions that shift across screen sizes.

  • Token efficiency: A compact accessibility snapshot uses far fewer tokens than a base64-encoded image.

  • No vision model required: Any LLM can work with Playwright MCP, not just multimodal models with image understanding.
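As a rough mental model of the accessibility tree described above, the sketch below represents nodes with a role, a name, states, and children, then flattens them into indented text. The `AxNode` class and the exact output format are assumptions made for illustration; they are not Playwright's internal types or its real snapshot format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AxNode:
    """One node of a simplified accessibility tree."""
    role: str
    name: str = ""
    states: List[str] = field(default_factory=list)
    children: List["AxNode"] = field(default_factory=list)


def render(node: AxNode, depth: int = 0) -> List[str]:
    """Flatten the tree into indented text lines, one line per node."""
    label = f'{node.role} "{node.name}"' if node.name else node.role
    suffix = "".join(f" [{state}]" for state in node.states)
    lines = [f'{"  " * depth}- {label}{suffix}']
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# A hypothetical login form, used only as sample data.
login_form = AxNode("form", "Sign in", children=[
    AxNode("textbox", "Email address"),
    AxNode("textbox", "Password"),
    AxNode("checkbox", "Remember me", states=["checked"]),
    AxNode("button", "Sign In", states=["disabled"]),
])

print("\n".join(render(login_form)))
```

Five short lines of text describe the whole form, which is the point of the token-efficiency argument: the model reads roles and labels directly instead of decoding an image.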

How Playwright MCP Works Under the Hood

Understanding the internals helps explain why Playwright MCP behaves the way it does and where its limitations come from.

Snapshot Mode (Default)

In its default Snapshot Mode, Playwright MCP captures accessibility snapshots of the current page after each action. These snapshots contain the full accessibility tree: every interactive element, its role, its accessible name, and a reference ID that the AI can use in subsequent commands.

For example, after navigating to a login page, the snapshot might return:

- textbox "Email address" [ref=e1]

The AI can then issue a command like browser_type with ref: "e1" and text: "user@example.com" to fill the email field. This reference-based system makes interactions precise and repeatable.
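The reference lookup can be sketched as follows. The first snapshot line is the one shown above; the other two lines, the regular expression, and the `find_ref` helper are assumptions added for this example, and real snapshots carry more detail than this pattern handles.

```python
import re

# Matches lines shaped like: - textbox "Email address" [ref=e1]
SNAPSHOT_LINE = re.compile(
    r'^\s*-\s*(?P<role>\w+)\s+"(?P<name>[^"]*)".*?\[ref=(?P<ref>\w+)\]'
)


def find_ref(snapshot: str, role: str, name: str) -> str:
    """Return the ref of the first element with this role and accessible name."""
    for line in snapshot.splitlines():
        match = SNAPSHOT_LINE.match(line)
        if match and match["role"] == role and match["name"] == name:
            return match["ref"]
    raise LookupError(f'no {role} named "{name}" in snapshot')


snapshot = """\
- textbox "Email address" [ref=e1]
- textbox "Password" [ref=e2]
- button "Sign In" [ref=e3]
"""

# Arguments follow the article's description of browser_type (ref and text).
type_call = {
    "name": "browser_type",
    "arguments": {
        "ref": find_ref(snapshot, "textbox", "Email address"),
        "text": "user@example.com",
    },
}
print(type_call)
```

In practice the LLM does this matching itself by reading the snapshot; the code only shows why a role plus a label is enough to pick out one element unambiguously.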

Vision Mode (Optional)

For elements that are not well represented in the accessibility tree, such as canvas elements, complex SVGs, or image-based interfaces, Playwright MCP offers an optional Vision Mode (enabled with --caps=vision). This mode adds coordinate-based interaction tools (browser_mouse_click_xy, browser_mouse_move_xy) that allow the AI to click at specific pixel positions.

Vision Mode requires the AI to work with screenshots rather than accessibility data, making it slower and less deterministic. For most web automation tasks, Snapshot Mode is the better choice.

The Command-Response Loop

A typical Playwright MCP workflow follows this pattern:

  1. The AI calls browser_navigate to open a URL.

  2. The server returns an accessibility snapshot of the loaded page.

  3. The AI analyzes the snapshot, identifies the target elements, and calls browser_type, browser_click, or other tools.

  4. After each action, the server returns an updated snapshot reflecting the new page state.

  5. The AI continues issuing commands based on the updated state until the task is complete.

This feedback loop is what makes Playwright MCP genuinely agentic. The AI doesn't blindly execute a script. It observes the page state after each step and adapts its approach accordingly, handling loading states, dynamic content, and unexpected UI changes.
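The observe-then-act loop can be simulated without a browser. In this sketch, `FakeBrowserServer` stands in for the MCP server and `decide` stands in for the LLM; both are toy constructs invented for the example, and the page states and refs are made up.

```python
from typing import List, Optional, Tuple


class FakeBrowserServer:
    """Stand-in for the MCP server: applies a tool call, returns a new snapshot."""

    def __init__(self) -> None:
        self.page = "blank"
        self.email = ""

    def call(self, tool: str, args: dict) -> str:
        if tool == "browser_navigate":
            self.page = "login"
        elif tool == "browser_type" and args["ref"] == "e1":
            self.email = args["text"]
        elif tool == "browser_click" and args["ref"] == "e2" and self.email:
            self.page = "dashboard"
        return self.snapshot()

    def snapshot(self) -> str:
        if self.page == "login":
            return '- textbox "Email address" [ref=e1]\n- button "Sign In" [ref=e2]'
        if self.page == "dashboard":
            return '- heading "Dashboard" [ref=e1]'
        return ""


def decide(snapshot: str, email_typed: bool) -> Optional[Tuple[str, dict]]:
    """Toy policy in place of the LLM: choose the next tool call from the snapshot."""
    if 'heading "Dashboard"' in snapshot:
        return None  # goal reached, stop
    if "textbox" not in snapshot:
        return "browser_navigate", {"url": "https://example.com/login"}
    if not email_typed:
        return "browser_type", {"ref": "e1", "text": "user@example.com"}
    return "browser_click", {"ref": "e2"}


def run(max_steps: int = 10) -> List[str]:
    server, snapshot, history = FakeBrowserServer(), "", []
    for _ in range(max_steps):
        action = decide(snapshot, "browser_type" in history)
        if action is None:
            break
        tool, args = action
        snapshot = server.call(tool, args)  # observe the new state before deciding again
        history.append(tool)
    return history


print(run())
```

The step cap (`max_steps`) is a design choice worth keeping in any real agent loop, since a model that misreads the page could otherwise retry forever.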

Key Features of Playwright MCP

Accessibility-First Automation

The default Snapshot Mode uses the accessibility tree for all interactions. Elements are identified by semantic roles and labels rather than CSS selectors or XPath, making automations more resilient to UI changes. If a developer renames a CSS class, the automation still works as long as the button's accessible label stays the same.

Cross-Browser Support

Playwright MCP supports Chromium, Firefox, and WebKit (Safari's rendering engine). You can specify the browser using the --browser flag:

npx @playwright/mcp@latest --browser=firefox

This allows testing the same workflows across all major browser engines.

Headless and Headed Modes

By default, Playwright MCP runs the browser in headed mode (visible window), which is useful for development and debugging. For CI/CD pipelines and server environments, headless mode runs the browser without a visible window:

npx @playwright/mcp@latest --headless

Wide IDE and Client Support

Playwright MCP works with virtually every major AI coding tool:

  • VS Code with GitHub Copilot

  • Cursor

  • Claude Desktop and Claude Code

  • Windsurf

  • Goose

  • OpenAI Codex

  • GitHub Copilot Coding Agent (preconfigured, no setup needed)

  • Amp, Kiro, LM Studio, and many others

The same server command works across all these clients because MCP standardizes the communication protocol.

Session Management

Playwright MCP offers three session modes:

  • Persistent profile (default): Browser data like cookies and login sessions are saved to disk between runs, similar to a normal browser.

  • Isolated mode (--isolated): Each session starts with a clean, in-memory profile. All state is lost when the browser closes. Ideal for testing.

  • Extension mode (--extension): Connects to an existing Chrome browser session using the Playwright MCP Bridge extension, leveraging your logged-in state and cookies.
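If you generate client configuration from a script, the session modes map onto command-line flags in a straightforward way. The sketch below uses only flags named in this article (`--isolated`, `--extension`, `--browser`, `--headless`); the function names and the `SESSION_FLAGS` table are hypothetical helpers, not part of Playwright MCP.

```python
import json
from typing import List, Optional

# Flags as named in this article; "persistent" is the default and needs none.
SESSION_FLAGS = {
    "persistent": [],
    "isolated": ["--isolated"],
    "extension": ["--extension"],
}


def build_server_args(session: str = "persistent",
                      browser: Optional[str] = None,
                      headless: bool = False) -> List[str]:
    """Assemble the npx arguments for one Playwright MCP server entry."""
    if session not in SESSION_FLAGS:
        raise ValueError(f"unknown session mode: {session!r}")
    args = ["@playwright/mcp@latest", *SESSION_FLAGS[session]]
    if browser:
        args.append(f"--browser={browser}")
    if headless:
        args.append("--headless")
    return args


def client_config(args: List[str]) -> dict:
    """Wrap the arguments in the mcpServers structure most clients expect."""
    return {"mcpServers": {"playwright": {"command": "npx", "args": args}}}


print(json.dumps(client_config(build_server_args("isolated", headless=True)), indent=2))
```

Keeping the mode-to-flag mapping in one table makes it harder to ship a CI configuration that accidentally reuses a persistent profile.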

Network Mocking and Storage Controls

Beyond basic navigation and interaction, Playwright MCP includes tools for:

  • Mocking network requests (browser_route): Intercept requests matching a URL pattern and return custom responses.

  • Cookie management: Get, set, delete, and clear cookies.

  • localStorage and sessionStorage: Full read/write access for managing client-side storage.

  • Storage state save/restore: Capture and replay authentication states across sessions.

Video Recording and DevTools Tracing

With the --caps=devtools flag enabled, Playwright MCP supports:

  • On-demand video recording (--save-video=800x600): Capture full video playback of browser sessions.

  • Performance tracing: Record and export Playwright traces for debugging, including network activity, DOM snapshots, and console logs.

  • Core Web Vitals: Capture metrics like LCP, CLS, and INP during test sessions.

Docker Support

An official Docker image is available for containerized environments:

{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}


Note: The Docker image currently supports headless Chromium only. Firefox and WebKit are not yet available in containers.

Playwright MCP vs. Playwright CLI

In mid-2025, Microsoft released @playwright/cli, a companion tool that provides browser automation through standard shell commands rather than the MCP protocol. Understanding the differences helps you choose the right tool for your situation.

Architecture Differences

| Aspect | Playwright MCP | Playwright CLI |
| --- | --- | --- |
| Communication | Persistent MCP server connection | Stateless shell commands |
| State management | Continuous browser session | Each command is independent |
| Token cost | ~114,000 tokens per typical task | ~27,000 tokens per typical task |
| Tool schema | Full schema loaded into context upfront | Minimal skill description (~68 tokens) |
| Output delivery | Streams into LLM context window | Saves to disk files |

Token Efficiency

The most significant difference is token consumption. When an AI agent connects to a Playwright MCP server, the entire tool schema (all available functions and their parameters) is loaded into the context window. This costs approximately 3,600 tokens before the agent even starts working.

The CLI avoids this overhead entirely. It exposes capabilities as lightweight "skills" that cost roughly 68 tokens to describe. Accessibility snapshots and screenshots are saved to disk files instead of being streamed into the context. The result is approximately a 4x reduction in total token usage for equivalent tasks.
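The ratios follow directly from the figures quoted in this section. A quick check in Python, using only those numbers:

```python
# Figures quoted in this article; they are approximate averages, not measurements made here.
MCP_TOKENS_PER_TASK = 114_000
CLI_TOKENS_PER_TASK = 27_000
MCP_SCHEMA_TOKENS = 3_600
CLI_SKILL_TOKENS = 68

task_ratio = MCP_TOKENS_PER_TASK / CLI_TOKENS_PER_TASK
schema_ratio = MCP_SCHEMA_TOKENS / CLI_SKILL_TOKENS
schema_share = MCP_SCHEMA_TOKENS / MCP_TOKENS_PER_TASK

print(f"total usage per task: {task_ratio:.1f}x")          # 4.2x
print(f"upfront description:  {schema_ratio:.0f}x")         # 53x
print(f"schema share of an MCP task: {schema_share:.1%}")   # 3.2%
```

The last line is the useful one: the tool schema accounts for only about 3% of a typical MCP task, so most of the saving comes from snapshots and screenshots going to disk rather than into the context window.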

When to Use Each

Use Playwright MCP when:

  • Your AI agent does not have filesystem access (e.g., web-based chat interfaces)

  • You need persistent browser state across many interactions

  • You are building exploratory automation or self-healing test workflows

  • You need rich, iterative reasoning about page structure over time

Use Playwright CLI when:

  • Your agent runs in a terminal environment with filesystem access (Claude Code, Cursor, Copilot)

  • Token efficiency is a priority

  • You are running many browser automation tasks in sequence

  • You need to compose browser commands with other shell tools

Microsoft's Own Guidance

The official Playwright MCP repository recommends the CLI for coding agents, stating that CLI invocations are more token-efficient because they avoid loading large tool schemas and verbose accessibility trees into the model context. MCP remains recommended for specialized agentic loops that benefit from persistent state and rich introspection.

Common Use Cases for Playwright MCP

AI-Assisted Test Generation

Describe a test scenario in plain English, and the AI generates and executes Playwright test code automatically. For example, the prompt "Test the login flow with valid credentials and verify the dashboard loads" produces a complete test without you writing any code by hand.

Exploratory Testing

AI agents can autonomously navigate an application, click through different features, and identify potential issues without predefined test scripts. This is particularly valuable for discovering edge cases and unexpected behaviors that static test suites miss.

End-to-End Test Automation

Playwright MCP enables full end-to-end testing workflows where the AI navigates through multi-step user journeys: logging in, filling forms, submitting data, and verifying results across multiple pages. The accessibility-tree approach ensures these tests remain stable even when CSS or layout changes occur.

Web Scraping and Data Extraction

AI agents can navigate to pages, interact with dynamic content (expanding dropdowns, paginating results, scrolling infinite feeds), and extract structured data. The structured accessibility snapshot makes it straightforward to identify and target specific data elements.

Form Automation and Task Workflows

For repetitive browser tasks like filling out forms, submitting applications, or walking through multi-step wizards, Playwright MCP allows an AI agent to handle the entire workflow based on a natural language description of what needs to happen.

CI/CD Integration and Self-Verifying Workflows

GitHub Copilot's Coding Agent has Playwright MCP built in, enabling a powerful workflow: the AI generates code, opens a browser, navigates to the running application, and visually verifies that its changes work correctly. This creates a closed loop where code generation and validation happen automatically.

Governing AI Agent Access in Production

As AI agents gain the ability to browse, interact with, and extract data from web applications, governance becomes a real concern. Who authorized the agent? What data can it access? What actions can it take on behalf of a user?

This is where agent governance platforms like Agen.co come in. While Playwright MCP handles the browser automation mechanics, Agen.co sits between your AI agents and your applications to enforce identity-aware access control for MCP connections, data masking, and tool-level permissions. If you're deploying agents that interact with production applications, pairing browser automation with a governance layer ensures agents operate within defined boundaries and every action is auditable.

How to Set Up Playwright MCP

Prerequisites

Before you start, make sure you have:

  • Node.js 18 or newer installed on your system

  • npm available on your PATH

  • An MCP-compatible client (VS Code, Cursor, Claude Desktop, etc.)
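If you want to check the Node.js prerequisite from a setup script, a small parser for the output of `node --version` is enough. This is a sketch; `node_major` and `meets_requirement` are hypothetical helper names, and the sample version strings are arbitrary.

```python
import re

MIN_NODE_MAJOR = 18  # minimum stated in the prerequisites above


def node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v20.11.1'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return int(match.group(1))


def meets_requirement(version_output: str) -> bool:
    """True when the reported Node.js version satisfies the minimum."""
    return node_major(version_output) >= MIN_NODE_MAJOR


print(meets_requirement("v20.11.1"))  # True
print(meets_requirement("v16.20.2"))  # False
```

To use it against a real installation, pass in the captured output of `node --version`, for example via `subprocess.run(["node", "--version"], capture_output=True, text=True).stdout`.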

Step 1: Install Playwright Browsers

If you do not already have Playwright set up, install the browser binaries:

npx playwright install

This downloads Chromium, Firefox, and WebKit binaries that Playwright MCP will use.

Step 2: Launch the Playwright MCP Server

You can run the server directly with npx (no permanent installation needed):

npx @playwright/mcp@latest

This starts the server, which then waits for an MCP client to connect and send commands. If it starts without errors, the server is working.

Step 3: Configure Your AI Client

The configuration is nearly identical across all clients. Here are the most common ones:

VS Code (GitHub Copilot):

Run this command in your terminal:

code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'

Alternatively, add the same server entry to your VS Code settings.json manually.

Cursor:

Go to Settings > MCP > Add new MCP Server. Use the name "playwright" and command npx @playwright/mcp@latest. Alternatively, create a .cursor/mcp.json file in your project root:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}


Claude Desktop:

Edit the Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

  • Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}


Restart Claude Desktop after saving.
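A malformed config file, such as one with a missing closing brace, is an easy mistake to make when editing by hand. The sketch below checks a config's text for the `mcpServers.playwright` entry used in this guide. `check_playwright_entry` is a hypothetical helper, and the checks reflect only the fields shown here, not a full schema for any particular client.

```python
import json
from typing import List


def check_playwright_entry(config_text: str) -> List[str]:
    """Return a list of problems found in the config; an empty list means none found."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    entry = servers.get("playwright") if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return ['missing "mcpServers.playwright" entry']
    problems = []
    if not isinstance(entry.get("command"), str):
        problems.append('"command" must be a string')
    args = entry.get("args", [])
    if not (isinstance(args, list) and all(isinstance(a, str) for a in args)):
        problems.append('"args" must be a list of strings')
    return problems


good = '{"mcpServers": {"playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}}}'
truncated = good[:-1]  # drop the final closing brace

print(check_playwright_entry(good))       # []
print(check_playwright_entry(truncated))  # one "invalid JSON" message
```

Running a check like this before restarting the client saves a round of guessing when the server silently fails to appear.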

Claude Code:

claude mcp add playwright npx @playwright/mcp@latest

Docker:

{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}


Step 4: Verify the Setup

Once configured, test the connection by asking your AI assistant something like: "Open https://example.com and tell me the page title." If the AI launches a browser, navigates to the page, and returns the title, your setup is working correctly.

Playwright MCP vs. Other Browser Automation Tools

Comparison Table

| Feature | Playwright MCP | Selenium | Puppeteer | Cypress |
| --- | --- | --- | --- | --- |
| AI-native control | Yes (MCP protocol) | No (requires scripting) | No (requires scripting) | No (requires scripting) |
| Natural language input | Yes | No | No | No |
| Accessibility tree | Yes (default) | Limited | Limited | No |
| Cross-browser | Chromium, Firefox, WebKit | All major browsers | Chrome, Firefox | Chromium, Firefox |
| Auto-waiting | Built-in | Manual waits | Manual waits | Built-in |
| Open-source | Yes (Apache 2.0) | Yes | Yes | Yes (MIT) |
| Primary use case | AI-driven automation | Traditional test scripts | Headless Chrome tasks | Component/E2E testing |

Key Differences

Playwright MCP vs. Selenium: Selenium requires explicit test scripts written in a programming language. Playwright MCP accepts tool calls from AI agents that you instruct in natural language. Selenium supports more browsers (including older versions), but Playwright MCP's accessibility-tree approach produces more reliable element identification with less maintenance.

Playwright MCP vs. Puppeteer: Puppeteer targets Chrome and Chromium first, with Firefox support added later, and requires JavaScript scripting. Playwright MCP supports Chromium, Firefox, and WebKit and can be driven by any MCP-compatible AI client without writing code.

Playwright MCP vs. Other MCP Browser Servers: Several third-party MCP servers exist for browser automation, including community projects like executeautomation/mcp-playwright. The official Microsoft @playwright/mcp server is the most actively maintained, has the broadest feature set, and receives regular updates aligned with the core Playwright library.

Pros and Cons of Playwright MCP

Pros

| Advantage | Detail |
| --- | --- |
| Natural language control | Describe tests in plain English without needing Playwright API knowledge |
| Accessibility-tree approach | More reliable than screenshot-based tools; targets elements by roles and labels |
| Cross-browser support | Test across Chromium, Firefox, and WebKit with one server |
| Fast setup | One npx command to get started, no complex configuration |
| Universal client support | Works with VS Code, Cursor, Claude Desktop, Windsurf, and many more |
| Open-source | Free under Apache 2.0 license, backed by Microsoft, and actively maintained |
| Exploratory testing | AI can autonomously discover issues without predefined scripts |
| Self-adapting | AI adjusts to page changes in real time rather than following brittle scripts |

Cons

| Limitation | Detail |
| --- | --- |
| High token consumption | A typical task uses ~114K tokens via MCP (the CLI alternative reduces this to ~27K) |
| Non-deterministic output | AI-generated steps can vary between runs for the same prompt |
| No persistent test scripts | Tests live in chat context, not as reusable .spec.ts files |
| Limited DOM-only coverage | Canvas elements, complex SVGs, or image-based UIs are difficult to interact with |
| No built-in test management | No reporting, versioning, or CI/CD integration out of the box |
| Context window pressure | Large pages with deep DOM trees can flood the LLM context |
| Docker limitations | The Docker image currently supports headless Chromium only |

Best Practices for Using Playwright MCP

Write Specific Prompts

Vague prompts produce unreliable results. Instead of "test the login page," give explicit instructions: "Navigate to /login, enter 'testuser@example.com' in the email field, enter 'password123' in the password field, click the Sign In button, and verify the dashboard heading is visible." More detail means more deterministic results.

Use the CLI for Token-Heavy Workflows

If you are running many browser tasks in sequence or working with large pages, switch to @playwright/cli to reduce token usage by approximately 4x. Reserve MCP for agents that do not have filesystem access.

Combine AI Exploration with Deterministic Tests

Use Playwright MCP for exploratory testing and discovering new flows. Once a flow is validated, codify it into a proper Playwright test script (.spec.ts) for repeatable CI/CD execution. MCP is excellent for discovery; traditional scripts are better for regression.

Start Sessions Clean

Use ephemeral, in-memory browser profiles (the --isolated flag) to avoid state leakage between test runs. Only use persistent profiles (--user-data-dir) when you specifically need to preserve cookies or authentication state.

Scope Your Origin Access

In shared or production-adjacent environments, use --allowed-origins and --blocked-origins to restrict which domains the browser can access. Do not leave the server open to all origins without reason. For a broader look at securing MCP connections, see our guide on MCP security best practices and common risks.
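To reason about which URLs a given pair of lists would let through, it can help to model the check yourself. The sketch below is an illustration only, not the server's actual implementation: it assumes that a blocked origin always wins, and that an empty allowlist means every origin not blocked is permitted. Confirm both assumptions against the Playwright MCP documentation before relying on them.

```python
from typing import Set
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to its origin (scheme plus host and optional port)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()


def is_allowed(url: str, allowed: Set[str], blocked: Set[str]) -> bool:
    """Assumed policy: blocked origins always lose; an empty allowlist allows the rest."""
    origin = origin_of(url)
    if origin in blocked:
        return False
    return not allowed or origin in allowed


# Hypothetical origins for a staging environment.
allowed = {"https://staging.example.com"}
blocked = {"https://ads.example.net"}

print(is_allowed("https://staging.example.com/login", allowed, blocked))  # True
print(is_allowed("https://ads.example.net/pixel.gif", allowed, blocked))  # False
print(is_allowed("https://other.example.org/", allowed, blocked))         # False
```

Writing the expected policy down as a handful of test URLs also gives you something to verify against the real server once the flags are in place.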

Monitor Token Usage

Deep DOM trees on complex pages can consume significant tokens per interaction. If you notice costs spiking or responses being truncated, consider switching to the CLI, using partial snapshots (targeting a specific element with a selector parameter), or simplifying the page under test.
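A rough way to spot oversized snapshots before they reach the model is to estimate their token count from their length. The sketch below assumes about four characters per token, which is a common rule of thumb for English text and not the behavior of any specific tokenizer. The budget value and helper names are likewise made up for illustration.

```python
CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by model and content


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def within_budget(snapshot: str, budget_tokens: int) -> bool:
    """True when the estimated size of a snapshot fits the per-step budget."""
    return estimate_tokens(snapshot) <= budget_tokens


# Synthetic snapshot: 2,000 similar rows, as a long table or feed might produce.
big_snapshot = "\n".join(f'- row "Item {i}" [ref=e{i}]' for i in range(2000))

print(estimate_tokens(big_snapshot))
print(within_budget(big_snapshot, budget_tokens=5_000))
```

An estimate like this is only good for flagging outliers; for billing-level accuracy, use the token counts reported by your model provider.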

Version Your MCP Configuration

Commit your .cursor/mcp.json, .vscode/settings.json, or equivalent MCP config files to version control. This ensures your entire team shares a consistent setup and new contributors can get started without manual configuration.

Add a Governance Layer for Production Agents

If your AI agents are interacting with production applications or sensitive data through browser automation, you'll want an agent governance layer. Agen.co provides observability over agent actions, enforces delegated permissions, and maintains audit logs across systems. For a practical look at what this involves, read how organizations are governing AI agents across enterprise applications.

Troubleshooting Common Playwright MCP Issues

| Problem | Possible Cause | Solution |
| --- | --- | --- |
| Server will not start | Node.js not installed or wrong version | Ensure Node.js 18+ is installed. Run node --version to check. |
| "Browser not found" error | Playwright browsers not installed | Run npx playwright install to download browser binaries. |
| Client cannot connect | Server not running or port conflict | Verify the server is running in your terminal. Check for error messages on startup. |
| Browser opens but tests fail immediately | Stale session or browser crash | Use --isolated flag or restart the server to clear old sessions. |
| Tests are slow or timing out | Large page DOM overwhelming the context | Use --caps=devtools for targeted inspection or switch to the CLI for token reduction. |
| AI clicks wrong elements | Ambiguous or duplicate labels on the page | Be more specific in prompts (e.g., "click the Submit button in the login form" instead of "click Submit"). |
| file:// URLs will not open | Blocked by default security policy | Use the --allow-unrestricted-file-access flag if you need local file access. |
| Docker-specific failures | HTTP transport issues with containers | Use stdio transport instead of streamable-http when running in Docker. |
| Permission errors on macOS/Linux | npm global install permissions | Use npx (recommended) instead of global install. |

What's New in Playwright MCP (2026)

Playwright MCP has evolved rapidly since its initial release. Here are the most significant updates as of early 2026:

Playwright CLI: A Token-Efficient Alternative

Microsoft released @playwright/cli as a companion tool that uses shell commands instead of the MCP protocol. The key advantage is a roughly 4x reduction in token consumption for equivalent tasks. Use CLI when your agent has filesystem access, and MCP when it does not.

Overhauled Session Management

Session management was simplified significantly. The old session commands were replaced with list, close-all, and kill-all. Browser profiles are now ephemeral by default, starting clean with no leftover state. Each workspace gets its own daemon process, preventing cross-project interference.

Security Hardening

Tighter defaults now ship out of the box. File system access is restricted to the workspace root directory. Navigation to file:// URLs is blocked by default. New origin controls (--allowed-origins and --blocked-origins) let you allow or block specific domains.

Chrome Extension: Playwright MCP Bridge

An official Chrome extension connects MCP to pages in an existing browser session, using your default profile with all cookies and logged-in state. This eliminates the need to re-implement login flows in automation and is especially useful for testing authenticated workflows.

Video Recording and DevTools

On-demand video recording is available via --save-video=800x600. Combined with --caps=devtools, you can capture performance traces, Core Web Vitals, and full video playback of test sessions.

GitHub Copilot Coding Agent Integration

Playwright MCP is now automatically configured for GitHub Copilot's Coding Agent. No manual setup is required. The agent can read, interact with, and screenshot web pages during code generation, enabling a prompt-to-verification workflow out of the box.

Frequently Asked Questions

Is Playwright MCP free?

Yes. Playwright MCP is open-source under the Apache 2.0 license and free to use. You'll need access to an LLM to drive the MCP server though, and that LLM may have its own costs depending on the provider.

What is the difference between Playwright MCP and regular Playwright?

Regular Playwright is a Node.js library for writing browser automation scripts in JavaScript or TypeScript. You write code, and Playwright executes it. Playwright MCP wraps that same library in an MCP server, allowing AI agents to control the browser through natural language commands instead of code. The underlying browser engine and capabilities are the same.

Can Playwright MCP replace manual testing?

Partially. Playwright MCP is excellent for exploratory testing and quick smoke tests described in natural language. For release-critical regression testing, you'll still want deterministic, repeatable test scripts that produce consistent results across runs. Many teams use Playwright MCP for test discovery and then convert validated flows into traditional Playwright scripts.

What IDEs and clients support Playwright MCP?

Playwright MCP works with any MCP-compatible client. The most popular options include VS Code (via GitHub Copilot), Cursor, Claude Desktop, Claude Code, Windsurf, Goose, OpenAI Codex, GitHub Copilot Coding Agent, Amp, Kiro, and LM Studio. The list continues to grow as MCP adoption expands.

Can I use Playwright MCP in CI/CD pipelines?

Yes. You can run the server in headless mode (--headless) or use the official Docker image for containerized environments. Keep in mind you'll also need an LLM agent orchestrating the tests. For fully deterministic CI/CD execution, many teams prefer traditional Playwright scripts while using MCP for development-time exploration and test generation.

What are the system requirements?

You need Node.js 18 or later and a system capable of running Chromium (or Firefox/WebKit for non-Docker setups). The server itself is lightweight. The main resource consumption comes from the browser instances it launches and the LLM tokens consumed during interaction.

How is Playwright MCP different from screenshot-based browser tools?

Screenshot-based tools capture images of web pages and use computer vision to identify elements. This approach is slower, requires multimodal AI models, and introduces ambiguity when multiple elements look similar visually. Playwright MCP reads the page's accessibility tree instead, providing precise, text-based element identification that works with any LLM and produces more reliable results.

Empower your workforce with secure agents

© 2026 Agen™ | All rights reserved.
