20 Commits

Author SHA1 Message Date
Pavel Feldman
6b15c7e422 chore: mark v0.0.10 (#138) 2025-04-05 19:14:50 -07:00
Pavel Feldman
abd56f514b chore: introduce capabilities argument (#135) 2025-04-04 17:14:30 -07:00
Pavel Feldman
707ebbf4d4 chore: group tools, prepare for capabilities (#134) 2025-04-04 15:22:00 -07:00
Pavel Feldman
fc0cccf4a5 chore: reuse the first tab when navigating (#131) 2025-04-03 22:39:55 -07:00
Pavel Feldman
e36d4ea695 chore: allow multiple tabs (#129) 2025-04-03 19:24:17 -07:00
Pavel Feldman
b358e47d71 chore: prep for multiple pages in context (#124) 2025-04-03 10:30:05 -07:00
Yury Semikhatsky
38f038a5dc chore: typo in description (#127) 2025-04-02 17:26:45 -07:00
Yury Semikhatsky
2291011dc7 feat: add slowly option for typing one character at a time (#121) 2025-04-02 14:36:30 -07:00
Pavel Feldman
89627fd23a chore: extract page snapshot, prep for multipage (#120) 2025-04-02 11:42:39 -07:00
Pavel Feldman
23f392dd91 chore: mark v0.0.9 (#114) 2025-04-01 15:45:00 -07:00
Max Schmitt
128e75b9f4 devops: fix npm publishing due to proverance (#112)
Like
[upstream](3ad5c2731a/.github/workflows/publish_release_npm.yml (L15))
and in the
[docs](https://docs.npmjs.com/generating-provenance-statements#example-github-actions-workflow).
2025-04-02 00:37:13 +02:00
Pavel Feldman
2366dbf36c chore: mark v0.0.8 (#111) 2025-04-01 15:16:28 -07:00
Pavel Feldman
0de7c0d38c chore: follow up with iframe stitch (#110) 2025-04-01 15:10:23 -07:00
Simon Knott
0a5518b252 chore: stitch together iframes into one tree (#71) 2025-04-01 14:47:53 -07:00
Pavel Feldman
4f16786432 chore: merge browser and channel settings (#100) 2025-04-01 10:26:48 -07:00
Pavel Feldman
9042c03faa chore: support channel and executable path params (#90)
Fixes https://github.com/microsoft/playwright-mcp/issues/89
2025-03-31 15:30:08 -07:00
Pavel Feldman
d316441142 chore: sanitize file path when saving (#99)
Fixes https://github.com/microsoft/playwright-mcp/issues/96
2025-03-31 15:01:58 -07:00
Yoshiki Nakagawa
aeb4cf65e9 Fixed typo in README.md (#88) 2025-03-31 09:33:38 +01:00
Pavel Feldman
a7392fc266 chore: allow passing cdp endpoint (#86)
Fixes https://github.com/microsoft/playwright-mcp/issues/84
2025-03-30 09:05:58 -07:00
Max Schmitt
88fbf50841 devops: use --provenance when publishing to NPM (#83)
Similar to how we do it upstream:
e2c8163b14/utils/publish_all_packages.sh (L97)

Reference: https://docs.npmjs.com/generating-provenance-statements
2025-03-29 19:17:54 +01:00
31 changed files with 1642 additions and 615 deletions


@@ -5,6 +5,9 @@ on:
 jobs:
   publish-npm:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      id-token: write
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-node@v4
@@ -15,6 +18,6 @@ jobs:
       - run: npm run build
       - run: npm run lint
       - run: npm run test
-      - run: npm publish
+      - run: npm publish --provenance
         env:
           NODE_AUTH_TOKEN: ${{secrets.NPM_TOKEN}}

173
README.md

@@ -59,9 +59,26 @@ code-insiders --add-mcp '{"name":"playwright","command":"npx","args":["@playwrig
After installation, the Playwright MCP server will be available for use with your GitHub Copilot agent in VS Code. After installation, the Playwright MCP server will be available for use with your GitHub Copilot agent in VS Code.
### CLI Options
The Playwright MCP server supports the following command-line options:
- `--browser <browser>`: Browser or chrome channel to use. Possible values:
- `chrome`, `firefox`, `webkit`, `msedge`
- Chrome channels: `chrome-beta`, `chrome-canary`, `chrome-dev`
- Edge channels: `msedge-beta`, `msedge-canary`, `msedge-dev`
- Default: `chrome`
- `--caps <caps>`: Comma-separated list of capabilities to enable, possible values: tabs, pdf, history, wait, files, install. Default is all.
- `--cdp-endpoint <endpoint>`: CDP endpoint to connect to
- `--executable-path <path>`: Path to the browser executable
- `--headless`: Run browser in headless mode (headed by default)
- `--port <port>`: Port to listen on for SSE transport
- `--user-data-dir <path>`: Path to the user data directory
- `--vision`: Run server that uses screenshots (Aria snapshots are used by default)
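The `--caps` option takes a comma-separated list and defaults to all capabilities. A minimal sketch of how such a value could be parsed is shown here; `parseCaps` and `KNOWN_CAPS` are hypothetical names chosen for illustration and are not taken from the package, and rejecting unknown names is an assumption about desirable behavior rather than documented behavior.

```typescript
// Capability names exactly as listed for --caps in the option text.
const KNOWN_CAPS = ['tabs', 'pdf', 'history', 'wait', 'files', 'install'] as const;
type Cap = typeof KNOWN_CAPS[number];

// Hypothetical helper, not part of @playwright/mcp.
function parseCaps(value?: string): Cap[] {
  // No value given: fall back to the documented default, which is all.
  if (value === undefined || value.trim() === '')
    return KNOWN_CAPS.slice();
  const result: Cap[] = [];
  for (const raw of value.split(',')) {
    const name = raw.trim();
    if (!name)
      continue;
    // Assumption: unknown names are an error instead of being ignored.
    if ((KNOWN_CAPS as readonly string[]).indexOf(name) === -1)
      throw new Error(`Unknown capability: ${name}`);
    // Skip duplicates so each capability appears once.
    if (result.indexOf(name as Cap) === -1)
      result.push(name as Cap);
  }
  return result;
}
```

With this sketch, `parseCaps('tabs, pdf')` yields `['tabs', 'pdf']`, and `parseCaps()` yields all six listed capabilities.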
### User data directory ### User data directory
Playwright MCP will launch Chrome browser with the new profile, located at Playwright MCP will launch the browser with the new profile, located at
``` ```
- `%USERPROFILE%\AppData\Local\ms-playwright\mcp-chrome-profile` on Windows - `%USERPROFILE%\AppData\Local\ms-playwright\mcp-chrome-profile` on Windows
@@ -69,7 +86,7 @@ Playwright MCP will launch Chrome browser with the new profile, located at
- `~/.cache/ms-playwright/mcp-chrome-profile` on Linux - `~/.cache/ms-playwright/mcp-chrome-profile` on Linux
``` ```
All the logged in information will be stored in that profile, you can delete it between sessions if you'dlike to clear the offline state. All the logged in information will be stored in that profile, you can delete it between sessions if you'd like to clear the offline state.
### Running headless browser (Browser without GUI). ### Running headless browser (Browser without GUI).
@@ -151,22 +168,7 @@ transport = new SSEServerTransport("/messages", res);
server.connect(transport); server.connect(transport);
``` ```
### Snapshot Mode ### Snapshot-based Interactions
The Playwright MCP provides a set of tools for browser automation. Here are all available tools:
- **browser_navigate**
- Description: Navigate to a URL
- Parameters:
- `url` (string): The URL to navigate to
- **browser_go_back**
- Description: Go back to the previous page
- Parameters: None
- **browser_go_forward**
- Description: Go forward to the next page
- Parameters: None
- **browser_click** - **browser_click**
- Description: Perform click on a web page - Description: Perform click on a web page
@@ -194,109 +196,121 @@ The Playwright MCP provides a set of tools for browser automation. Here are all
- `element` (string): Human-readable element description used to obtain permission to interact with the element - `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot - `ref` (string): Exact target element reference from the page snapshot
- `text` (string): Text to type into the element - `text` (string): Text to type into the element
- `submit` (boolean): Whether to submit entered text (press Enter after) - `submit` (boolean, optional): Whether to submit entered text (press Enter after)
- `slowly` (boolean, optional): Whether to type one character at a time. Useful for triggering key handlers in the page. By default entire text is filled in at once.
- **browser_select_option** - **browser_select_option**
- Description: Select option in a dropdown - Description: Select an option in a dropdown
- Parameters: - Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element - `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot - `ref` (string): Exact target element reference from the page snapshot
- `values` (array): Array of values to select in the dropdown. - `values` (array): Array of values to select in the dropdown. This can be a single value or multiple values.
- **browser_choose_file** - **browser_snapshot**
- Description: Choose one or multiple files to upload - Description: Capture accessibility snapshot of the current page, this is better than screenshot
- Parameters: None
- **browser_take_screenshot**
- Description: Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.
- Parameters: - Parameters:
- `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files. - `raw` (boolean, optional): Whether to return without compression (in PNG format). Default is false, which returns a JPEG image.
### Vision-based Interactions
- **browser_screen_move_mouse**
- Description: Move mouse to a given position
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `x` (number): X coordinate
- `y` (number): Y coordinate
- **browser_screen_capture**
- Description: Take a screenshot of the current page
- Parameters: None
- **browser_screen_click**
- Description: Click left mouse button
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `x` (number): X coordinate
- `y` (number): Y coordinate
- **browser_screen_drag**
- Description: Drag left mouse button
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `startX` (number): Start X coordinate
- `startY` (number): Start Y coordinate
- `endX` (number): End X coordinate
- `endY` (number): End Y coordinate
- **browser_screen_type**
- Description: Type text
- Parameters:
- `text` (string): Text to type
- `submit` (boolean, optional): Whether to submit entered text (press Enter after)
- **browser_press_key** - **browser_press_key**
- Description: Press a key on the keyboard - Description: Press a key on the keyboard
- Parameters: - Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a` - `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
- **browser_snapshot** ### Tab Management
- Description: Capture accessibility snapshot of the current page (better than screenshot)
- **browser_tab_list**
- Description: List browser tabs
- Parameters: None - Parameters: None
- **browser_save_as_pdf** - **browser_tab_new**
- Description: Save page as PDF - Description: Open a new tab
- Parameters: None
- **browser_take_screenshot**
- Description: Capture screenshot of the page
- Parameters: - Parameters:
- `raw` (string): Optionally returns lossless PNG screenshot. JPEG by default. - `url` (string, optional): The URL to navigate to in the new tab. If not provided, the new tab will be blank.
- **browser_wait** - **browser_tab_select**
- Description: Wait for a specified time in seconds - Description: Select a tab by index
- Parameters: - Parameters:
- `time` (number): The time to wait in seconds (capped at 10 seconds) - `index` (number): The index of the tab to select
- **browser_close** - **browser_tab_close**
- Description: Close the page - Description: Close a tab
- Parameters: None - Parameters:
- `index` (number, optional): The index of the tab to close. Closes current tab if not provided.
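The tab tools address tabs by index. In the `Context` changes later in this diff, `listTabs` prints tabs starting at 1 and `selectTab` subtracts 1 before indexing, so the indices appear to be 1-based. The sketch below models only that bookkeeping, without a browser; `TabModel` and `TabList` are hypothetical names, and the bounds checks are additions for the example.

```typescript
// Minimal in-memory model of a tab list; names are hypothetical.
type TabModel = { title: string; url: string };

class TabList {
  private tabs: TabModel[] = [];
  private current: TabModel | undefined;

  // Opening a tab makes it current, mirroring what newTab() does in the diff.
  open(tab: TabModel): void {
    this.tabs.push(tab);
    this.current = tab;
  }

  // `index` is 1-based, matching the numbering printed by list().
  select(index: number): void {
    const tab = this.tabs[index - 1];
    if (!tab)
      throw new Error(`No tab at index ${index}`);
    this.current = tab;
  }

  // Closes the tab at the 1-based `index`, or the current tab when omitted.
  close(index?: number): void {
    const tab = index === undefined ? this.current : this.tabs[index - 1];
    if (!tab)
      throw new Error('No tab to close');
    const position = this.tabs.indexOf(tab);
    this.tabs.splice(position, 1);
    // If the current tab was closed, fall back to its neighbor (or none).
    if (this.current === tab)
      this.current = this.tabs[Math.min(position, this.tabs.length - 1)];
  }

  list(): string {
    if (!this.tabs.length)
      return 'No tabs open';
    const lines = ['Open tabs:'];
    this.tabs.forEach((tab, i) => {
      const marker = tab === this.current ? ' (current)' : '';
      lines.push(`- ${i + 1}:${marker} [${tab.title}] (${tab.url})`);
    });
    return lines.join('\n');
  }
}
```

Closing the current tab moves the selection to the tab that slides into its position, or to the last remaining tab if the closed one was at the end.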
### Navigation
### Vision Mode
Vision Mode provides tools for visual-based interactions using screenshots. Here are all available tools:
- **browser_navigate** - **browser_navigate**
- Description: Navigate to a URL - Description: Navigate to a URL
- Parameters: - Parameters:
- `url` (string): The URL to navigate to - `url` (string): The URL to navigate to
- **browser_go_back** - **browser_navigate_back**
- Description: Go back to the previous page - Description: Go back to the previous page
- Parameters: None - Parameters: None
- **browser_go_forward** - **browser_navigate_forward**
- Description: Go forward to the next page - Description: Go forward to the next page
- Parameters: None - Parameters: None
- **browser_screenshot** ### Keyboard
- Description: Capture screenshot of the current page
- Parameters: None
- **browser_move_mouse**
- Description: Move mouse to specified coordinates
- Parameters:
- `x` (number): X coordinate
- `y` (number): Y coordinate
- **browser_click**
- Description: Click at specified coordinates
- Parameters:
- `x` (number): X coordinate to click at
- `y` (number): Y coordinate to click at
- **browser_drag**
- Description: Perform drag and drop operation
- Parameters:
- `startX` (number): Start X coordinate
- `startY` (number): Start Y coordinate
- `endX` (number): End X coordinate
- `endY` (number): End Y coordinate
- **browser_type**
- Description: Type text at specified coordinates
- Parameters:
- `text` (string): Text to type
- `submit` (boolean): Whether to submit entered text (press Enter after)
- **browser_press_key** - **browser_press_key**
- Description: Press a key on the keyboard - Description: Press a key on the keyboard
- Parameters: - Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a` - `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
- **browser_choose_file** ### Files and Media
- **browser_file_upload**
- Description: Choose one or multiple files to upload - Description: Choose one or multiple files to upload
- Parameters: - Parameters:
- `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files. - `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files.
- **browser_save_as_pdf** - **browser_pdf_save**
- Description: Save page as PDF - Description: Save page as PDF
- Parameters: None - Parameters: None
### Utilities
- **browser_wait** - **browser_wait**
- Description: Wait for a specified time in seconds - Description: Wait for a specified time in seconds
- Parameters: - Parameters:
@@ -305,3 +319,10 @@ Vision Mode provides tools for visual-based interactions using screenshots. Here
- **browser_close** - **browser_close**
- Description: Close the page - Description: Close the page
- Parameters: None - Parameters: None
- **browser_install**
- Description: Install the browser specified in the config. Call this if you get an error about the browser not being installed.
- Parameters: None
### Vision Mode

7
index.d.ts vendored

@@ -18,6 +18,8 @@
 import type { LaunchOptions } from 'playwright';
 import type { Server } from '@modelcontextprotocol/sdk/server/index.js';

+type ToolCapability = 'core' | 'tabs' | 'pdf' | 'history' | 'wait' | 'files' | 'install';
+
 type Options = {
   /**
    * Path to the user data directory.
@@ -35,6 +37,11 @@ type Options = {
    * @default false
    */
   vision?: boolean;
+
+  /**
+   * Capabilities to enable.
+   */
+  capabilities?: ToolCapability[];
 };

 export function createServer(options?: Options): Server;
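The new `capabilities` option suggests that tools are grouped and enabled per capability. The following is a sketch of one way such filtering could work, not the server's actual implementation: `ToolEntry`, `allTools`, and `filterTools` are hypothetical, and although the tool names come from the README, the capability assigned to each tool here is an assumption.

```typescript
type ToolCapability = 'core' | 'tabs' | 'pdf' | 'history' | 'wait' | 'files' | 'install';

// Hypothetical shape: each tool is tagged with the capability that enables it.
type ToolEntry = { name: string; capability: ToolCapability };

// Tool names are from the README; the capability tags are assumed.
const allTools: ToolEntry[] = [
  { name: 'browser_navigate', capability: 'core' },
  { name: 'browser_tab_list', capability: 'tabs' },
  { name: 'browser_pdf_save', capability: 'pdf' },
  { name: 'browser_navigate_back', capability: 'history' },
  { name: 'browser_wait', capability: 'wait' },
  { name: 'browser_file_upload', capability: 'files' },
  { name: 'browser_install', capability: 'install' },
];

// Hypothetical helper: keep every tool when no list is given (the README
// states the default is all), otherwise keep only the listed capabilities.
function filterTools(tools: ToolEntry[], capabilities?: ToolCapability[]): ToolEntry[] {
  if (!capabilities)
    return tools.slice();
  return tools.filter(tool => capabilities.indexOf(tool.capability) !== -1);
}

const enabled = filterTools(allTools, ['core', 'tabs']);
console.log(enabled.map(tool => tool.name).join(', '));
```

Typing the option as `ToolCapability[]` rather than `string[]` lets the compiler reject misspelled capability names at the call site.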

43
package-lock.json generated

@@ -1,17 +1,18 @@
{ {
"name": "@playwright/mcp", "name": "@playwright/mcp",
"version": "0.0.7", "version": "0.0.10",
"lockfileVersion": 3, "lockfileVersion": 3,
"requires": true, "requires": true,
"packages": { "packages": {
"": { "": {
"name": "@playwright/mcp", "name": "@playwright/mcp",
"version": "0.0.7", "version": "0.0.10",
"license": "Apache-2.0", "license": "Apache-2.0",
"dependencies": { "dependencies": {
"@modelcontextprotocol/sdk": "^1.6.1", "@modelcontextprotocol/sdk": "^1.6.1",
"commander": "^13.1.0", "commander": "^13.1.0",
"playwright": "1.52.0-alpha-1743011787000", "playwright": "^1.52.0-alpha-1743163434000",
"yaml": "^2.7.1",
"zod-to-json-schema": "^3.24.4" "zod-to-json-schema": "^3.24.4"
}, },
"bin": { "bin": {
@@ -20,7 +21,7 @@
"devDependencies": { "devDependencies": {
"@eslint/eslintrc": "^3.2.0", "@eslint/eslintrc": "^3.2.0",
"@eslint/js": "^9.19.0", "@eslint/js": "^9.19.0",
"@playwright/test": "1.52.0-alpha-1743011787000", "@playwright/test": "^1.52.0-alpha-1743163434000",
"@stylistic/eslint-plugin": "^3.0.1", "@stylistic/eslint-plugin": "^3.0.1",
"@types/node": "^22.13.10", "@types/node": "^22.13.10",
"@typescript-eslint/eslint-plugin": "^8.26.1", "@typescript-eslint/eslint-plugin": "^8.26.1",
@@ -285,13 +286,13 @@
} }
}, },
"node_modules/@playwright/test": { "node_modules/@playwright/test": {
"version": "1.52.0-alpha-1743011787000", "version": "1.52.0-alpha-1743163434000",
"resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.52.0-alpha-1743011787000.tgz", "resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.52.0-alpha-1743163434000.tgz",
"integrity": "sha512-ikJR8JXof5IBvErrmIsR3ixov4nKlQe/6PSYK/R6eTEe6eoT+eEXlaNY4z6mn9dF02Z1zYGxzAbb8TvSvuwh4Q==", "integrity": "sha512-4uBgNlJ6hgPtB8DrwQsgoKuVoe7j+nPqudna7CLXWCmmT3LYPMD5aOjGoBkszr+R9NejtKashq/bOi/ny9hsIA==",
"dev": true, "dev": true,
"license": "Apache-2.0", "license": "Apache-2.0",
"dependencies": { "dependencies": {
"playwright": "1.52.0-alpha-1743011787000" "playwright": "1.52.0-alpha-1743163434000"
}, },
"bin": { "bin": {
"playwright": "cli.js" "playwright": "cli.js"
@@ -3296,12 +3297,12 @@
} }
}, },
"node_modules/playwright": { "node_modules/playwright": {
"version": "1.52.0-alpha-1743011787000", "version": "1.52.0-alpha-1743163434000",
"resolved": "https://registry.npmjs.org/playwright/-/playwright-1.52.0-alpha-1743011787000.tgz", "resolved": "https://registry.npmjs.org/playwright/-/playwright-1.52.0-alpha-1743163434000.tgz",
"integrity": "sha512-wg9Tu4ZDKJWo7hBKpeuD/XLtLOQ7fCCuBfekgUrPLStA12O3224E1fbp/xGFnmi47SF71Y8F6C2Beyd3gYFWlQ==", "integrity": "sha512-4uYv49ekPjolydfFfTfFQ2z4URF9UZMVUXLy7aXam/tPxEQ5O7+jQC+yzrDMGmhcj5QkMnxjlyk7N2V9a0QLdQ==",
"license": "Apache-2.0", "license": "Apache-2.0",
"dependencies": { "dependencies": {
"playwright-core": "1.52.0-alpha-1743011787000" "playwright-core": "1.52.0-alpha-1743163434000"
}, },
"bin": { "bin": {
"playwright": "cli.js" "playwright": "cli.js"
@@ -3314,9 +3315,9 @@
} }
}, },
"node_modules/playwright-core": { "node_modules/playwright-core": {
"version": "1.52.0-alpha-1743011787000", "version": "1.52.0-alpha-1743163434000",
"resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.52.0-alpha-1743011787000.tgz", "resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.52.0-alpha-1743163434000.tgz",
"integrity": "sha512-yOpMfKxTBRqdm50b52cojvTCNttWN+Xk6LXF+KU4ufcGwcRjUud1xdHmHHvQNFFanXM1MBYnDKsMkRvjPsuYOw==", "integrity": "sha512-Tn4u3Ywwjkh847/bYWlXIrNxv5DRJRDgtb+VYMXHvNCKkrxL6yfZ1ApIAYD7IAkkKH/KLTXszGWl3a/Z/KDfQA==",
"license": "Apache-2.0", "license": "Apache-2.0",
"bin": { "bin": {
"playwright-core": "cli.js" "playwright-core": "cli.js"
@@ -4348,6 +4349,18 @@
"integrity": "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ==", "integrity": "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ==",
"license": "ISC" "license": "ISC"
}, },
"node_modules/yaml": {
"version": "2.7.1",
"resolved": "https://registry.npmjs.org/yaml/-/yaml-2.7.1.tgz",
"integrity": "sha512-10ULxpnOCQXxJvBgxsn9ptjq6uviG/htZKk9veJGhlqn3w/DxQ631zFF+nlQXLwmImeS5amR2dl2U8sg6U9jsQ==",
"license": "ISC",
"bin": {
"yaml": "bin.mjs"
},
"engines": {
"node": ">= 14"
}
},
"node_modules/yocto-queue": { "node_modules/yocto-queue": {
"version": "0.1.0", "version": "0.1.0",
"resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-0.1.0.tgz", "resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-0.1.0.tgz",


@@ -1,6 +1,6 @@
 {
   "name": "@playwright/mcp",
-  "version": "0.0.7",
+  "version": "0.0.10",
   "description": "Playwright Tools for MCP",
   "repository": {
     "type": "git",
@@ -32,18 +32,19 @@
   "dependencies": {
     "@modelcontextprotocol/sdk": "^1.6.1",
     "commander": "^13.1.0",
-    "playwright": "1.52.0-alpha-1743011787000",
+    "playwright": "^1.52.0-alpha-1743163434000",
+    "yaml": "^2.7.1",
     "zod-to-json-schema": "^3.24.4"
   },
   "devDependencies": {
     "@eslint/eslintrc": "^3.2.0",
     "@eslint/js": "^9.19.0",
-    "@playwright/test": "1.52.0-alpha-1743011787000",
+    "@playwright/test": "^1.52.0-alpha-1743163434000",
     "@stylistic/eslint-plugin": "^3.0.1",
+    "@types/node": "^22.13.10",
     "@typescript-eslint/eslint-plugin": "^8.26.1",
     "@typescript-eslint/parser": "^8.26.1",
     "@typescript-eslint/utils": "^8.26.1",
-    "@types/node": "^22.13.10",
     "eslint": "^9.19.0",
     "eslint-plugin-import": "^2.31.0",
     "eslint-plugin-notice": "^1.0.0",


@@ -15,136 +15,347 @@
*/ */
import * as playwright from 'playwright'; import * as playwright from 'playwright';
import yaml from 'yaml';
import { waitForCompletion } from './tools/utils';
import { ToolResult } from './tools/tool';
export type ContextOptions = {
browserName?: 'chromium' | 'firefox' | 'webkit';
userDataDir: string;
launchOptions?: playwright.LaunchOptions;
cdpEndpoint?: string;
remoteEndpoint?: string;
};
type PageOrFrameLocator = playwright.Page | playwright.FrameLocator;
type RunOptions = {
captureSnapshot?: boolean;
waitForCompletion?: boolean;
status?: string;
noClearFileChooser?: boolean;
};
export class Context { export class Context {
private _userDataDir: string; readonly options: ContextOptions;
private _launchOptions: playwright.LaunchOptions | undefined;
private _browser: playwright.Browser | undefined; private _browser: playwright.Browser | undefined;
private _page: playwright.Page | undefined; private _browserContext: playwright.BrowserContext | undefined;
private _console: playwright.ConsoleMessage[] = []; private _tabs: Tab[] = [];
private _createPagePromise: Promise<playwright.Page> | undefined; private _currentTab: Tab | undefined;
private _fileChooser: playwright.FileChooser | undefined;
private _lastSnapshotFrames: playwright.FrameLocator[] = [];
constructor(userDataDir: string, launchOptions?: playwright.LaunchOptions) { constructor(options: ContextOptions) {
this._userDataDir = userDataDir; this.options = options;
this._launchOptions = launchOptions;
} }
async createPage(): Promise<playwright.Page> { tabs(): Tab[] {
if (this._createPagePromise) return this._tabs;
return this._createPagePromise;
this._createPagePromise = (async () => {
const { browser, page } = await this._createPage();
page.on('console', event => this._console.push(event));
page.on('framenavigated', frame => {
if (!frame.parentFrame())
this._console.length = 0;
});
page.on('close', () => this._onPageClose());
page.on('filechooser', chooser => this._fileChooser = chooser);
page.setDefaultNavigationTimeout(60000);
page.setDefaultTimeout(5000);
this._page = page;
this._browser = browser;
return page;
})();
return this._createPagePromise;
} }
private _onPageClose() { currentTab(): Tab {
if (!this._currentTab)
throw new Error('Navigate to a location to create a tab');
return this._currentTab;
}
async newTab(): Promise<Tab> {
const browserContext = await this._ensureBrowserContext();
const page = await browserContext.newPage();
this._currentTab = this._tabs.find(t => t.page === page)!;
return this._currentTab;
}
async selectTab(index: number) {
this._currentTab = this._tabs[index - 1];
await this._currentTab.page.bringToFront();
}
async ensureTab(): Promise<Tab> {
const context = await this._ensureBrowserContext();
if (!this._currentTab)
await context.newPage();
return this._currentTab!;
}
async listTabs(): Promise<string> {
if (!this._tabs.length)
return 'No tabs open';
const lines: string[] = ['Open tabs:'];
for (let i = 0; i < this._tabs.length; i++) {
const tab = this._tabs[i];
const title = await tab.page.title();
const url = tab.page.url();
const current = tab === this._currentTab ? ' (current)' : '';
lines.push(`- ${i + 1}:${current} [${title}] (${url})`);
}
return lines.join('\n');
}
async closeTab(index: number | undefined) {
const tab = index === undefined ? this.currentTab() : this._tabs[index - 1];
await tab.page.close();
return await this.listTabs();
}
private _onPageCreated(page: playwright.Page) {
const tab = new Tab(this, page, tab => this._onPageClosed(tab));
this._tabs.push(tab);
if (!this._currentTab)
this._currentTab = tab;
}
private _onPageClosed(tab: Tab) {
const index = this._tabs.indexOf(tab);
if (index === -1)
return;
this._tabs.splice(index, 1);
if (this._currentTab === tab)
this._currentTab = this._tabs[Math.min(index, this._tabs.length - 1)];
const browser = this._browser; const browser = this._browser;
const page = this._page; if (this._browserContext && !this._tabs.length) {
void page?.context()?.close().then(() => browser?.close()).catch(() => {}); void this._browserContext.close().then(() => browser?.close()).catch(() => {});
this._browser = undefined;
this._browserContext = undefined;
}
}
this._createPagePromise = undefined; async close() {
this._browser = undefined; if (!this._browserContext)
this._page = undefined; return;
await this._browserContext.close();
}
private async _ensureBrowserContext() {
if (!this._browserContext) {
const context = await this._createBrowserContext();
this._browser = context.browser;
this._browserContext = context.browserContext;
for (const page of this._browserContext.pages())
this._onPageCreated(page);
this._browserContext.on('page', page => this._onPageCreated(page));
}
return this._browserContext;
}
private async _createBrowserContext(): Promise<{ browser?: playwright.Browser, browserContext: playwright.BrowserContext }> {
if (this.options.remoteEndpoint) {
const url = new URL(this.options.remoteEndpoint);
if (this.options.browserName)
url.searchParams.set('browser', this.options.browserName);
if (this.options.launchOptions)
url.searchParams.set('launch-options', JSON.stringify(this.options.launchOptions));
const browser = await playwright[this.options.browserName ?? 'chromium'].connect(String(url));
const browserContext = await browser.newContext();
return { browser, browserContext };
}
if (this.options.cdpEndpoint) {
const browser = await playwright.chromium.connectOverCDP(this.options.cdpEndpoint);
const browserContext = browser.contexts()[0];
return { browser, browserContext };
}
const browserContext = await this._launchPersistentContext();
return { browserContext };
}
private async _launchPersistentContext(): Promise<playwright.BrowserContext> {
try {
const browserType = this.options.browserName ? playwright[this.options.browserName] : playwright.chromium;
return await browserType.launchPersistentContext(this.options.userDataDir, this.options.launchOptions);
} catch (error: any) {
if (error.message.includes('Executable doesn\'t exist'))
throw new Error(`Browser specified in your config is not installed. Either install it (likely) or change the config.`);
throw error;
}
}
}
class Tab {
readonly context: Context;
readonly page: playwright.Page;
private _console: playwright.ConsoleMessage[] = [];
private _fileChooser: playwright.FileChooser | undefined;
private _snapshot: PageSnapshot | undefined;
private _onPageClose: (tab: Tab) => void;
constructor(context: Context, page: playwright.Page, onPageClose: (tab: Tab) => void) {
this.context = context;
this.page = page;
this._onPageClose = onPageClose;
page.on('console', event => this._console.push(event));
page.on('framenavigated', frame => {
if (!frame.parentFrame())
this._console.length = 0;
});
page.on('close', () => this._onClose());
page.on('filechooser', chooser => this._fileChooser = chooser);
page.setDefaultNavigationTimeout(60000);
page.setDefaultTimeout(5000);
}
private _onClose() {
this._fileChooser = undefined; this._fileChooser = undefined;
this._console.length = 0; this._console.length = 0;
this._onPageClose(this);
} }
existingPage(): playwright.Page { async navigate(url: string) {
if (!this._page) await this.page.goto(url, { waitUntil: 'domcontentloaded' });
throw new Error('Navigate to a location to create a page'); // Cap load event to 5 seconds, the page is operational at this point.
return this._page; await this.page.waitForLoadState('load', { timeout: 5000 }).catch(() => {});
}
async run(callback: (tab: Tab) => Promise<void>, options?: RunOptions): Promise<ToolResult> {
try {
if (!options?.noClearFileChooser)
this._fileChooser = undefined;
if (options?.waitForCompletion)
await waitForCompletion(this.page, () => callback(this));
else
await callback(this);
} finally {
if (options?.captureSnapshot)
this._snapshot = await PageSnapshot.create(this.page);
}
const tabList = this.context.tabs().length > 1 ? await this.context.listTabs() + '\n\nCurrent tab:' + '\n' : '';
const snapshot = this._snapshot?.text({ status: options?.status, hasFileChooser: !!this._fileChooser }) ?? options?.status ?? '';
return {
content: [{
type: 'text',
text: tabList + snapshot,
}],
};
}
async runAndWait(callback: (tab: Tab) => Promise<void>, options?: RunOptions): Promise<ToolResult> {
return await this.run(callback, {
waitForCompletion: true,
...options,
});
}
async runAndWaitWithSnapshot(callback: (tab: Tab) => Promise<void>, options?: RunOptions): Promise<ToolResult> {
return await this.run(callback, {
captureSnapshot: true,
waitForCompletion: true,
...options,
});
}
lastSnapshot(): PageSnapshot {
if (!this._snapshot)
throw new Error('No snapshot available');
return this._snapshot;
} }
async console(): Promise<playwright.ConsoleMessage[]> { async console(): Promise<playwright.ConsoleMessage[]> {
return this._console; return this._console;
} }
async close() {
if (!this._page)
return;
await this._page.close();
}
async submitFileChooser(paths: string[]) { async submitFileChooser(paths: string[]) {
if (!this._fileChooser) if (!this._fileChooser)
throw new Error('No file chooser visible'); throw new Error('No file chooser visible');
await this._fileChooser.setFiles(paths); await this._fileChooser.setFiles(paths);
this._fileChooser = undefined; this._fileChooser = undefined;
} }
}
hasFileChooser() { class PageSnapshot {
return !!this._fileChooser; private _frameLocators: PageOrFrameLocator[] = [];
private _text!: string;
constructor() {
} }
clearFileChooser() { static async create(page: playwright.Page): Promise<PageSnapshot> {
this._fileChooser = undefined; const snapshot = new PageSnapshot();
await snapshot._build(page);
return snapshot;
} }
private async _createPage(): Promise<{ browser?: playwright.Browser, page: playwright.Page }> { text(options?: { status?: string, hasFileChooser?: boolean }): string {
if (process.env.PLAYWRIGHT_WS_ENDPOINT) { const results: string[] = [];
const url = new URL(process.env.PLAYWRIGHT_WS_ENDPOINT); if (options?.status) {
if (this._launchOptions) results.push(options.status);
url.searchParams.set('launch-options', JSON.stringify(this._launchOptions)); results.push('');
const browser = await playwright.chromium.connect(String(url));
const page = await browser.newPage();
return { browser, page };
} }
if (options?.hasFileChooser) {
const context = await playwright.chromium.launchPersistentContext(this._userDataDir, this._launchOptions); results.push('- There is a file chooser visible that requires browser_file_upload to be called');
const [page] = context.pages(); results.push('');
return { page }; }
results.push(this._text);
return results.join('\n');
} }
async allFramesSnapshot() { private async _build(page: playwright.Page) {
const page = this.existingPage(); const yamlDocument = await this._snapshotFrame(page);
const visibleFrames = await page.locator('iframe').filter({ visible: true }).all(); const lines = [];
this._lastSnapshotFrames = visibleFrames.map(frame => frame.contentFrame()); lines.push(
`- Page URL: ${page.url()}`,
`- Page Title: ${await page.title()}`
);
lines.push(
`- Page Snapshot`,
'```yaml',
yamlDocument.toString().trim(),
'```',
''
);
this._text = lines.join('\n');
}
const snapshots = await Promise.all([ private async _snapshotFrame(frame: playwright.Page | playwright.FrameLocator) {
page.locator('html').ariaSnapshot({ ref: true }), const frameIndex = this._frameLocators.push(frame) - 1;
...this._lastSnapshotFrames.map(async (frame, index) => { const snapshotString = await frame.locator('body').ariaSnapshot({ ref: true });
const snapshot = await frame.locator('html').ariaSnapshot({ ref: true }); const snapshot = yaml.parseDocument(snapshotString);
const args = [];
const src = await frame.owner().getAttribute('src');
if (src)
args.push(`src=${src}`);
const name = await frame.owner().getAttribute('name');
if (name)
args.push(`name=${name}`);
return `\n# iframe ${args.join(' ')}\n` + snapshot.replaceAll('[ref=', `[ref=f${index}`);
})
]);
return snapshots.join('\n'); const visit = async (node: any): Promise<unknown> => {
if (yaml.isPair(node)) {
await Promise.all([
visit(node.key).then(k => node.key = k),
visit(node.value).then(v => node.value = v)
]);
} else if (yaml.isSeq(node) || yaml.isMap(node)) {
node.items = await Promise.all(node.items.map(visit));
} else if (yaml.isScalar(node)) {
if (typeof node.value === 'string') {
const value = node.value;
if (frameIndex > 0)
node.value = value.replace('[ref=', `[ref=f${frameIndex}`);
if (value.startsWith('iframe ')) {
const ref = value.match(/\[ref=(.*)\]/)?.[1];
if (ref) {
try {
const childSnapshot = await this._snapshotFrame(frame.frameLocator(`aria-ref=${ref}`));
return snapshot.createPair(node.value, childSnapshot);
} catch (error) {
return snapshot.createPair(node.value, '<could not take iframe snapshot>');
}
}
}
}
}
return node;
};
await visit(snapshot.contents);
return snapshot;
} }
refLocator(ref: string): playwright.Locator { refLocator(ref: string): playwright.Locator {
const page = this.existingPage(); let frame = this._frameLocators[0];
let frame: playwright.Frame | playwright.FrameLocator = page.mainFrame();
const match = ref.match(/^f(\d+)(.*)/); const match = ref.match(/^f(\d+)(.*)/);
if (match) { if (match) {
const frameIndex = parseInt(match[1], 10); const frameIndex = parseInt(match[1], 10);
if (!this._lastSnapshotFrames[frameIndex]) frame = this._frameLocators[frameIndex];
throw new Error(`Frame does not exist. Provide ref from the most current snapshot.`);
frame = this._lastSnapshotFrames[frameIndex];
ref = match[2]; ref = match[2];
} }
if (!frame)
throw new Error(`Frame does not exist. Provide ref from the most current snapshot.`);
return frame.locator(`aria-ref=${ref}`); return frame.locator(`aria-ref=${ref}`);
} }
} }
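
Note on the ref scheme in this hunk: refs inside child frames are rewritten to carry an `f<index>` prefix, and `refLocator` strips that prefix to pick the frame. A minimal standalone sketch of that round trip follows. The helper names `parseRef` and `prefixRefs` are hypothetical, and the `s1e5` ref shape is only an assumed example, not something this diff specifies.

```typescript
// Hypothetical helpers mirroring the prefix handling shown in the diff above.
type ParsedRef = { frameIndex: number; localRef: string };

// Split "f<index><rest>" into a frame index and the frame-local ref.
// Refs without the prefix belong to frame 0 (the page itself).
function parseRef(ref: string): ParsedRef {
  const match = ref.match(/^f(\d+)(.*)/);
  if (!match)
    return { frameIndex: 0, localRef: ref };
  return { frameIndex: parseInt(match[1], 10), localRef: match[2] };
}

// Prefix the first "[ref=" in a snapshot scalar for frames other than the top-level page.
function prefixRefs(value: string, frameIndex: number): string {
  return frameIndex > 0 ? value.replace('[ref=', `[ref=f${frameIndex}`) : value;
}
```

Under these assumptions, `prefixRefs('button "OK" [ref=s1e5]', 2)` yields a ref that `parseRef` maps back to frame index 2 and local ref `s1e5`.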

@@ -15,70 +15,70 @@
*/ */
import { createServerWithTools } from './server'; import { createServerWithTools } from './server';
import * as snapshot from './tools/snapshot'; import common from './tools/common';
import * as common from './tools/common'; import files from './tools/files';
import * as screenshot from './tools/screenshot'; import install from './tools/install';
import { console } from './resources/console'; import keyboard from './tools/keyboard';
import navigate from './tools/navigate';
import pdf from './tools/pdf';
import snapshot from './tools/snapshot';
import tabs from './tools/tabs';
import screen from './tools/screen';
import { console as consoleResource } from './resources/console';
import type { Tool } from './tools/tool'; import type { Tool, ToolCapability } from './tools/tool';
import type { Resource } from './resources/resource'; import type { Resource } from './resources/resource';
import type { Server } from '@modelcontextprotocol/sdk/server/index.js'; import type { Server } from '@modelcontextprotocol/sdk/server/index.js';
import type { LaunchOptions } from 'playwright'; import type { LaunchOptions } from 'playwright';
const commonTools: Tool[] = [
common.pressKey,
common.wait,
common.pdf,
common.close,
];
const snapshotTools: Tool[] = [ const snapshotTools: Tool[] = [
common.navigate(true), ...common,
common.goBack(true), ...files(true),
common.goForward(true), ...install,
common.chooseFile(true), ...keyboard(true),
snapshot.snapshot, ...navigate(true),
snapshot.click, ...pdf,
snapshot.hover, ...snapshot,
snapshot.type, ...tabs(true),
snapshot.selectOption,
snapshot.screenshot,
...commonTools,
]; ];
const screenshotTools: Tool[] = [ const screenshotTools: Tool[] = [
common.navigate(false), ...common,
common.goBack(false), ...files(false),
common.goForward(false), ...install,
common.chooseFile(false), ...keyboard(false),
screenshot.screenshot, ...navigate(false),
screenshot.moveMouse, ...pdf,
screenshot.click, ...screen,
screenshot.drag, ...tabs(false),
screenshot.type,
...commonTools,
]; ];
const resources: Resource[] = [ const resources: Resource[] = [
console, consoleResource,
]; ];
type Options = { type Options = {
browserName?: 'chromium' | 'firefox' | 'webkit';
userDataDir?: string; userDataDir?: string;
launchOptions?: LaunchOptions; launchOptions?: LaunchOptions;
cdpEndpoint?: string;
vision?: boolean; vision?: boolean;
capabilities?: ToolCapability[];
}; };
const packageJSON = require('../package.json'); const packageJSON = require('../package.json');
export function createServer(options?: Options): Server { export function createServer(options?: Options): Server {
const tools = options?.vision ? screenshotTools : snapshotTools; const allTools = options?.vision ? screenshotTools : snapshotTools;
const tools = allTools.filter(tool => !options?.capabilities || tool.capability === 'core' || options.capabilities.includes(tool.capability));
return createServerWithTools({ return createServerWithTools({
name: 'Playwright', name: 'Playwright',
version: packageJSON.version, version: packageJSON.version,
tools, tools,
resources, resources,
browserName: options?.browserName,
userDataDir: options?.userDataDir ?? '', userDataDir: options?.userDataDir ?? '',
launchOptions: options?.launchOptions, launchOptions: options?.launchOptions,
cdpEndpoint: options?.cdpEndpoint,
}); });
} }
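
Note on the new `capabilities` filter in `createServer`: when no list is given every tool is kept, and when a list is given `core` tools are always kept alongside the listed capabilities. The sketch below isolates that predicate. `ToolStub` and `filterTools` are hypothetical names, and the `ToolCapability` union is assumed from the `--caps` help text plus `core`, since its real definition is not shown in this diff.

```typescript
// Assumed union: values from the --caps help text, plus 'core'.
type ToolCapability = 'core' | 'tabs' | 'pdf' | 'history' | 'wait' | 'files' | 'install';

// Hypothetical stand-in for Tool, reduced to the fields the filter needs.
type ToolStub = { name: string; capability: ToolCapability };

// Same predicate as in createServer: no list means all tools, core always passes.
function filterTools(allTools: ToolStub[], capabilities?: ToolCapability[]): ToolStub[] {
  return allTools.filter(tool =>
    !capabilities || tool.capability === 'core' || capabilities.includes(tool.capability));
}

const exampleTools: ToolStub[] = [
  { name: 'browser_navigate', capability: 'core' },
  { name: 'browser_pdf_save', capability: 'pdf' },
  { name: 'browser_wait', capability: 'wait' },
];
```

One consequence worth noting: an empty list is not the same as an omitted one. An empty list leaves only the `core` tools.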

@@ -29,26 +29,65 @@ import { ServerList } from './server';
import type { LaunchOptions } from 'playwright'; import type { LaunchOptions } from 'playwright';
import assert from 'assert'; import assert from 'assert';
import { ToolCapability } from './tools/tool';
const packageJSON = require('../package.json'); const packageJSON = require('../package.json');
program program
.version('Version ' + packageJSON.version) .version('Version ' + packageJSON.version)
.name(packageJSON.name) .name(packageJSON.name)
.option('--browser <browser>', 'Browser or chrome channel to use, possible values: chrome, firefox, webkit, msedge.')
.option('--caps <caps>', 'Comma-separated list of capabilities to enable, possible values: tabs, pdf, history, wait, files, install. Default is all.')
.option('--cdp-endpoint <endpoint>', 'CDP endpoint to connect to.')
.option('--executable-path <path>', 'Path to the browser executable.')
.option('--headless', 'Run browser in headless mode, headed by default') .option('--headless', 'Run browser in headless mode, headed by default')
.option('--port <port>', 'Port to listen on for SSE transport.')
.option('--user-data-dir <path>', 'Path to the user data directory') .option('--user-data-dir <path>', 'Path to the user data directory')
.option('--vision', 'Run server that uses screenshots (Aria snapshots are used by default)') .option('--vision', 'Run server that uses screenshots (Aria snapshots are used by default)')
.option('--port <port>', 'Port to listen on for SSE transport.')
.action(async options => { .action(async options => {
let browserName: 'chromium' | 'firefox' | 'webkit';
let channel: string | undefined;
switch (options.browser) {
case 'chrome':
case 'chrome-beta':
case 'chrome-canary':
case 'chrome-dev':
case 'msedge':
case 'msedge-beta':
case 'msedge-canary':
case 'msedge-dev':
browserName = 'chromium';
channel = options.browser;
break;
case 'chromium':
browserName = 'chromium';
break;
case 'firefox':
browserName = 'firefox';
break;
case 'webkit':
browserName = 'webkit';
break;
default:
browserName = 'chromium';
channel = 'chrome';
}
const launchOptions: LaunchOptions = { const launchOptions: LaunchOptions = {
headless: !!options.headless, headless: !!options.headless,
channel: 'chrome', channel,
executablePath: options.executablePath,
}; };
const userDataDir = options.userDataDir ?? await createUserDataDir();
const userDataDir = options.userDataDir ?? await createUserDataDir(browserName);
const serverList = new ServerList(() => createServer({ const serverList = new ServerList(() => createServer({
browserName,
userDataDir, userDataDir,
launchOptions, launchOptions,
vision: !!options.vision, vision: !!options.vision,
cdpEndpoint: options.cdpEndpoint,
capabilities: options.caps?.split(',').map((c: string) => c.trim() as ToolCapability),
})); }));
setupExitWatchdog(serverList); setupExitWatchdog(serverList);
@@ -70,7 +109,7 @@ function setupExitWatchdog(serverList: ServerList) {
program.parse(process.argv); program.parse(process.argv);
async function createUserDataDir() { async function createUserDataDir(browserName: 'chromium' | 'firefox' | 'webkit') {
let cacheDirectory: string; let cacheDirectory: string;
if (process.platform === 'linux') if (process.platform === 'linux')
cacheDirectory = process.env.XDG_CACHE_HOME || path.join(os.homedir(), '.cache'); cacheDirectory = process.env.XDG_CACHE_HOME || path.join(os.homedir(), '.cache');
@@ -80,7 +119,7 @@ async function createUserDataDir() {
cacheDirectory = process.env.LOCALAPPDATA || path.join(os.homedir(), 'AppData', 'Local'); cacheDirectory = process.env.LOCALAPPDATA || path.join(os.homedir(), 'AppData', 'Local');
else else
throw new Error('Unsupported platform: ' + process.platform); throw new Error('Unsupported platform: ' + process.platform);
const result = path.join(cacheDirectory, 'ms-playwright', 'mcp-chrome-profile'); const result = path.join(cacheDirectory, 'ms-playwright', `mcp-${browserName}-profile`);
await fs.promises.mkdir(result, { recursive: true }); await fs.promises.mkdir(result, { recursive: true });
return result; return result;
} }
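
Note on the `--browser` handling in this file: the switch maps a single CLI value to a Playwright browser name plus an optional channel, and the profile directory is now keyed by browser name. The sketch below restates that mapping as pure functions. `resolveBrowser`, `profileDirName` and `parseCaps` are hypothetical names introduced here for illustration only.

```typescript
type BrowserName = 'chromium' | 'firefox' | 'webkit';
type ResolvedBrowser = { browserName: BrowserName; channel?: string };

// Channel values that the switch in the diff routes to Chromium.
const chromiumChannels = [
  'chrome', 'chrome-beta', 'chrome-canary', 'chrome-dev',
  'msedge', 'msedge-beta', 'msedge-canary', 'msedge-dev',
];

// Mirrors the switch: channels run on Chromium, plain engine names carry no channel,
// and anything else (including no value) falls back to Chromium with the chrome channel.
function resolveBrowser(browser?: string): ResolvedBrowser {
  if (browser && chromiumChannels.includes(browser))
    return { browserName: 'chromium', channel: browser };
  if (browser === 'chromium' || browser === 'firefox' || browser === 'webkit')
    return { browserName: browser };
  return { browserName: 'chromium', channel: 'chrome' };
}

// Last path segment used by createUserDataDir after this change.
function profileDirName(browserName: BrowserName): string {
  return `mcp-${browserName}-profile`;
}

// Same split-and-trim as the --caps handling; undefined when the flag is absent.
function parseCaps(caps?: string): string[] | undefined {
  return caps?.split(',').map(c => c.trim());
}
```

Because the directory name uses the browser name rather than the channel, `chrome` and `msedge` would share the `mcp-chromium-profile` directory under this mapping.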

@@ -24,7 +24,7 @@ export const console: Resource = {
   },
   read: async (context, uri) => {
-    const messages = await context.console();
+    const messages = await context.currentTab().console();
     const log = messages.map(message => `[${message.type().toUpperCase()}] ${message.text()}`).join('\n');
     return [{
       uri,

@@ -21,20 +21,18 @@ import { Context } from './context';
 import type { Tool } from './tools/tool';
 import type { Resource } from './resources/resource';
-import type { LaunchOptions } from 'playwright';
+import type { ContextOptions } from './context';
-type Options = {
+type Options = ContextOptions & {
   name: string;
   version: string;
   tools: Tool[];
   resources: Resource[],
-  userDataDir: string;
-  launchOptions?: LaunchOptions;
 };
 export function createServerWithTools(options: Options): Server {
-  const { name, version, tools, resources, userDataDir, launchOptions } = options;
-  const context = new Context(userDataDir, launchOptions);
+  const { name, version, tools, resources } = options;
+  const context = new Context(options);
   const server = new Server({ name, version }, {
     capabilities: {
       tools: {},

@@ -14,74 +14,17 @@
* limitations under the License. * limitations under the License.
*/ */
import os from 'os';
import path from 'path';
import { z } from 'zod'; import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema'; import { zodToJsonSchema } from 'zod-to-json-schema';
import { captureAriaSnapshot, runAndWait } from './utils'; import type { Tool } from './tool';
import type { ToolFactory, Tool } from './tool';
const navigateSchema = z.object({
url: z.string().describe('The URL to navigate to'),
});
export const navigate: ToolFactory = snapshot => ({
schema: {
name: 'browser_navigate',
description: 'Navigate to a URL',
inputSchema: zodToJsonSchema(navigateSchema),
},
handle: async (context, params) => {
const validatedParams = navigateSchema.parse(params);
const page = await context.createPage();
await page.goto(validatedParams.url, { waitUntil: 'domcontentloaded' });
// Cap load event to 5 seconds, the page is operational at this point.
await page.waitForLoadState('load', { timeout: 5000 }).catch(() => {});
if (snapshot)
return captureAriaSnapshot(context);
return {
content: [{
type: 'text',
text: `Navigated to ${validatedParams.url}`,
}],
};
},
});
const goBackSchema = z.object({});
export const goBack: ToolFactory = snapshot => ({
schema: {
name: 'browser_go_back',
description: 'Go back to the previous page',
inputSchema: zodToJsonSchema(goBackSchema),
},
handle: async context => {
return await runAndWait(context, 'Navigated back', async page => page.goBack(), snapshot);
},
});
const goForwardSchema = z.object({});
export const goForward: ToolFactory = snapshot => ({
schema: {
name: 'browser_go_forward',
description: 'Go forward to the next page',
inputSchema: zodToJsonSchema(goForwardSchema),
},
handle: async context => {
return await runAndWait(context, 'Navigated forward', async page => page.goForward(), snapshot);
},
});
const waitSchema = z.object({ const waitSchema = z.object({
time: z.number().describe('The time to wait in seconds'), time: z.number().describe('The time to wait in seconds'),
}); });
export const wait: Tool = { const wait: Tool = {
capability: 'wait',
schema: { schema: {
name: 'browser_wait', name: 'browser_wait',
description: 'Wait for a specified time in seconds', description: 'Wait for a specified time in seconds',
@@ -99,48 +42,10 @@ export const wait: Tool = {
}, },
}; };
const pressKeySchema = z.object({
key: z.string().describe('Name of the key to press or a character to generate, such as `ArrowLeft` or `a`'),
});
export const pressKey: Tool = {
schema: {
name: 'browser_press_key',
description: 'Press a key on the keyboard',
inputSchema: zodToJsonSchema(pressKeySchema),
},
handle: async (context, params) => {
const validatedParams = pressKeySchema.parse(params);
return await runAndWait(context, `Pressed key ${validatedParams.key}`, async page => {
await page.keyboard.press(validatedParams.key);
});
},
};
const pdfSchema = z.object({});
export const pdf: Tool = {
schema: {
name: 'browser_save_as_pdf',
description: 'Save page as PDF',
inputSchema: zodToJsonSchema(pdfSchema),
},
handle: async context => {
const page = context.existingPage();
const fileName = path.join(os.tmpdir(), `/page-${new Date().toISOString()}.pdf`);
await page.pdf({ path: fileName });
return {
content: [{
type: 'text',
text: `Saved as ${fileName}`,
}],
};
},
};
const closeSchema = z.object({}); const closeSchema = z.object({});
export const close: Tool = { const close: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_close', name: 'browser_close',
description: 'Close the page', description: 'Close the page',
@@ -157,20 +62,7 @@ export const close: Tool = {
}, },
}; };
const chooseFileSchema = z.object({ export default [
paths: z.array(z.string()).describe('The absolute paths to the files to upload. Can be a single file or multiple files.'), close,
}); wait,
];
export const chooseFile: ToolFactory = snapshot => ({
schema: {
name: 'browser_choose_file',
description: 'Choose one or multiple files to upload',
inputSchema: zodToJsonSchema(chooseFileSchema),
},
handle: async (context, params) => {
const validatedParams = chooseFileSchema.parse(params);
return await runAndWait(context, `Chose files ${validatedParams.paths.join(', ')}`, async () => {
await context.submitFileChooser(validatedParams.paths);
}, snapshot);
},
});

src/tools/files.ts Normal file

@@ -0,0 +1,48 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import type { ToolFactory } from './tool';
const uploadFileSchema = z.object({
paths: z.array(z.string()).describe('The absolute paths to the files to upload. Can be a single file or multiple files.'),
});
const uploadFile: ToolFactory = captureSnapshot => ({
capability: 'files',
schema: {
name: 'browser_file_upload',
description: 'Upload one or multiple files',
inputSchema: zodToJsonSchema(uploadFileSchema),
},
handle: async (context, params) => {
const validatedParams = uploadFileSchema.parse(params);
const tab = context.currentTab();
return await tab.runAndWait(async () => {
await tab.submitFileChooser(validatedParams.paths);
}, {
status: `Chose files ${validatedParams.paths.join(', ')}`,
captureSnapshot,
noClearFileChooser: true,
});
},
});
export default (captureSnapshot: boolean) => [
uploadFile(captureSnapshot),
];

src/tools/install.ts Normal file

@@ -0,0 +1,61 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { fork } from 'child_process';
import path from 'path';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import type { Tool } from './tool';
const install: Tool = {
capability: 'install',
schema: {
name: 'browser_install',
description: 'Install the browser specified in the config. Call this if you get an error about the browser not being installed.',
inputSchema: zodToJsonSchema(z.object({})),
},
handle: async context => {
const channel = context.options.launchOptions?.channel ?? context.options.browserName ?? 'chrome';
const cli = path.join(require.resolve('playwright/package.json'), '..', 'cli.js');
const child = fork(cli, ['install', channel], {
stdio: 'pipe',
});
const output: string[] = [];
child.stdout?.on('data', data => output.push(data.toString()));
child.stderr?.on('data', data => output.push(data.toString()));
await new Promise<void>((resolve, reject) => {
child.on('close', code => {
if (code === 0)
resolve();
else
reject(new Error(`Failed to install browser: ${output.join('')}`));
});
});
return {
content: [{
type: 'text',
text: `Browser ${channel} installed`,
}],
};
},
};
export default [
install,
];

src/tools/keyboard.ts Normal file

@@ -0,0 +1,46 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import zodToJsonSchema from 'zod-to-json-schema';
import type { ToolFactory } from './tool';
const pressKeySchema = z.object({
key: z.string().describe('Name of the key to press or a character to generate, such as `ArrowLeft` or `a`'),
});
const pressKey: ToolFactory = captureSnapshot => ({
capability: 'core',
schema: {
name: 'browser_press_key',
description: 'Press a key on the keyboard',
inputSchema: zodToJsonSchema(pressKeySchema),
},
handle: async (context, params) => {
const validatedParams = pressKeySchema.parse(params);
return await context.currentTab().runAndWait(async tab => {
await tab.page.keyboard.press(validatedParams.key);
}, {
status: `Pressed key ${validatedParams.key}`,
captureSnapshot,
});
},
});
export default (captureSnapshot: boolean) => [
pressKey(captureSnapshot),
];
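
Note on the module shape introduced here: each tool file now default-exports either a plain array or a function of `captureSnapshot` that returns an array, so the top-level lists can be built with spreads such as `...keyboard(true)`. The sketch below shows that pattern in isolation. `ToolStub` and `ToolFactoryStub` are simplified, hypothetical stand-ins for the real `Tool` and `ToolFactory` types, which are not shown in this diff.

```typescript
// Simplified stand-ins; the real types also carry a schema and a handler.
type ToolStub = { capability: string; name: string; captureSnapshot: boolean };
type ToolFactoryStub = (captureSnapshot: boolean) => ToolStub;

// A factory closes over captureSnapshot, as pressKey does in keyboard.ts.
const pressKeyStub: ToolFactoryStub = captureSnapshot => ({
  capability: 'core',
  name: 'browser_press_key',
  captureSnapshot,
});

// Module-level default export shape: a function returning the tool list.
const keyboardStub = (captureSnapshot: boolean): ToolStub[] => [
  pressKeyStub(captureSnapshot),
];

// Snapshot mode and vision mode get the same tools with different flags.
const snapshotModeTools: ToolStub[] = [...keyboardStub(true)];
const visionModeTools: ToolStub[] = [...keyboardStub(false)];
```

The apparent benefit of this shape is that both modes share one tool definition and differ only in whether an accessibility snapshot is captured after the action.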

src/tools/navigate.ts Normal file

@@ -0,0 +1,87 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import type { ToolFactory } from './tool';
const navigateSchema = z.object({
url: z.string().describe('The URL to navigate to'),
});
const navigate: ToolFactory = captureSnapshot => ({
capability: 'core',
schema: {
name: 'browser_navigate',
description: 'Navigate to a URL',
inputSchema: zodToJsonSchema(navigateSchema),
},
handle: async (context, params) => {
const validatedParams = navigateSchema.parse(params);
const currentTab = await context.ensureTab();
return await currentTab.run(async tab => {
await tab.navigate(validatedParams.url);
}, {
status: `Navigated to ${validatedParams.url}`,
captureSnapshot,
});
},
});
const goBackSchema = z.object({});
const goBack: ToolFactory = snapshot => ({
capability: 'history',
schema: {
name: 'browser_navigate_back',
description: 'Go back to the previous page',
inputSchema: zodToJsonSchema(goBackSchema),
},
handle: async context => {
return await context.currentTab().runAndWait(async tab => {
await tab.page.goBack();
}, {
status: 'Navigated back',
captureSnapshot: snapshot,
});
},
});
const goForwardSchema = z.object({});
const goForward: ToolFactory = snapshot => ({
capability: 'history',
schema: {
name: 'browser_navigate_forward',
description: 'Go forward to the next page',
inputSchema: zodToJsonSchema(goForwardSchema),
},
handle: async context => {
return await context.currentTab().runAndWait(async tab => {
await tab.page.goForward();
}, {
status: 'Navigated forward',
captureSnapshot: snapshot,
});
},
});
export default (captureSnapshot: boolean) => [
navigate(captureSnapshot),
goBack(captureSnapshot),
goForward(captureSnapshot),
];

src/tools/pdf.ts Normal file

@@ -0,0 +1,51 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import os from 'os';
import path from 'path';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { sanitizeForFilePath } from './utils';
import type { Tool } from './tool';
const pdfSchema = z.object({});
const pdf: Tool = {
capability: 'pdf',
schema: {
name: 'browser_pdf_save',
description: 'Save page as PDF',
inputSchema: zodToJsonSchema(pdfSchema),
},
handle: async context => {
const tab = context.currentTab();
const fileName = path.join(os.tmpdir(), sanitizeForFilePath(`page-${new Date().toISOString()}`)) + '.pdf';
await tab.page.pdf({ path: fileName });
return {
content: [{
type: 'text',
text: `Saved as ${fileName}`,
}],
};
},
};
export default [
pdf,
];

@@ -17,20 +17,19 @@
import { z } from 'zod'; import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema'; import { zodToJsonSchema } from 'zod-to-json-schema';
import { runAndWait } from './utils';
import type { Tool } from './tool'; import type { Tool } from './tool';
export const screenshot: Tool = { const screenshot: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_screenshot', name: 'browser_screen_capture',
description: 'Take a screenshot of the current page', description: 'Take a screenshot of the current page',
inputSchema: zodToJsonSchema(z.object({})), inputSchema: zodToJsonSchema(z.object({})),
}, },
handle: async context => { handle: async context => {
const page = context.existingPage(); const tab = context.currentTab();
const screenshot = await page.screenshot({ type: 'jpeg', quality: 50, scale: 'css' }); const screenshot = await tab.page.screenshot({ type: 'jpeg', quality: 50, scale: 'css' });
return { return {
content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' }], content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' }],
}; };
@@ -46,17 +45,18 @@ const moveMouseSchema = elementSchema.extend({
y: z.number().describe('Y coordinate'), y: z.number().describe('Y coordinate'),
}); });
export const moveMouse: Tool = { const moveMouse: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_move_mouse', name: 'browser_screen_move_mouse',
description: 'Move mouse to a given position', description: 'Move mouse to a given position',
inputSchema: zodToJsonSchema(moveMouseSchema), inputSchema: zodToJsonSchema(moveMouseSchema),
}, },
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = moveMouseSchema.parse(params); const validatedParams = moveMouseSchema.parse(params);
const page = context.existingPage(); const tab = context.currentTab();
await page.mouse.move(validatedParams.x, validatedParams.y); await tab.page.mouse.move(validatedParams.x, validatedParams.y);
return { return {
content: [{ type: 'text', text: `Moved mouse to (${validatedParams.x}, ${validatedParams.y})` }], content: [{ type: 'text', text: `Moved mouse to (${validatedParams.x}, ${validatedParams.y})` }],
}; };
@@ -68,19 +68,22 @@ const clickSchema = elementSchema.extend({
y: z.number().describe('Y coordinate'), y: z.number().describe('Y coordinate'),
}); });
export const click: Tool = { const click: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_click', name: 'browser_screen_click',
description: 'Click left mouse button', description: 'Click left mouse button',
inputSchema: zodToJsonSchema(clickSchema), inputSchema: zodToJsonSchema(clickSchema),
}, },
handle: async (context, params) => { handle: async (context, params) => {
return await runAndWait(context, 'Clicked mouse', async page => { return await context.currentTab().runAndWait(async tab => {
const validatedParams = clickSchema.parse(params); const validatedParams = clickSchema.parse(params);
await page.mouse.move(validatedParams.x, validatedParams.y); await tab.page.mouse.move(validatedParams.x, validatedParams.y);
await page.mouse.down(); await tab.page.mouse.down();
await page.mouse.up(); await tab.page.mouse.up();
}, {
status: 'Clicked mouse',
}); });
}, },
}; };
@@ -92,42 +95,56 @@ const dragSchema = elementSchema.extend({
endY: z.number().describe('End Y coordinate'), endY: z.number().describe('End Y coordinate'),
}); });
export const drag: Tool = { const drag: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_drag', name: 'browser_screen_drag',
description: 'Drag left mouse button', description: 'Drag left mouse button',
inputSchema: zodToJsonSchema(dragSchema), inputSchema: zodToJsonSchema(dragSchema),
}, },
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = dragSchema.parse(params); const validatedParams = dragSchema.parse(params);
return await runAndWait(context, `Dragged mouse from (${validatedParams.startX}, ${validatedParams.startY}) to (${validatedParams.endX}, ${validatedParams.endY})`, async page => { return await context.currentTab().runAndWait(async tab => {
await page.mouse.move(validatedParams.startX, validatedParams.startY); await tab.page.mouse.move(validatedParams.startX, validatedParams.startY);
await page.mouse.down(); await tab.page.mouse.down();
await page.mouse.move(validatedParams.endX, validatedParams.endY); await tab.page.mouse.move(validatedParams.endX, validatedParams.endY);
await page.mouse.up(); await tab.page.mouse.up();
}, {
status: `Dragged mouse from (${validatedParams.startX}, ${validatedParams.startY}) to (${validatedParams.endX}, ${validatedParams.endY})`,
}); });
}, },
}; };
const typeSchema = z.object({ const typeSchema = z.object({
text: z.string().describe('Text to type into the element'), text: z.string().describe('Text to type into the element'),
submit: z.boolean().describe('Whether to submit entered text (press Enter after)'), submit: z.boolean().optional().describe('Whether to submit entered text (press Enter after)'),
}); });
export const type: Tool = { const type: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_type', name: 'browser_screen_type',
description: 'Type text', description: 'Type text',
inputSchema: zodToJsonSchema(typeSchema), inputSchema: zodToJsonSchema(typeSchema),
}, },
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = typeSchema.parse(params); const validatedParams = typeSchema.parse(params);
return await runAndWait(context, `Typed text "${validatedParams.text}"`, async page => { return await context.currentTab().runAndWait(async tab => {
await page.keyboard.type(validatedParams.text); await tab.page.keyboard.type(validatedParams.text);
if (validatedParams.submit) if (validatedParams.submit)
await page.keyboard.press('Enter'); await tab.page.keyboard.press('Enter');
}, {
status: `Typed text "${validatedParams.text}"`,
}); });
}, },
}; };
export default [
screenshot,
moveMouse,
click,
drag,
type,
];

@@ -17,12 +17,11 @@
import { z } from 'zod'; import { z } from 'zod';
import zodToJsonSchema from 'zod-to-json-schema'; import zodToJsonSchema from 'zod-to-json-schema';
import { captureAriaSnapshot, runAndWait } from './utils';
import type * as playwright from 'playwright'; import type * as playwright from 'playwright';
import type { Tool } from './tool'; import type { Tool } from './tool';
export const snapshot: Tool = { const snapshot: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_snapshot', name: 'browser_snapshot',
description: 'Capture accessibility snapshot of the current page, this is better than screenshot', description: 'Capture accessibility snapshot of the current page, this is better than screenshot',
@@ -30,7 +29,7 @@ export const snapshot: Tool = {
}, },
handle: async context => { handle: async context => {
return await captureAriaSnapshot(context); return await context.currentTab().run(async () => {}, { captureSnapshot: true });
}, },
}; };
@@ -39,7 +38,8 @@ const elementSchema = z.object({
ref: z.string().describe('Exact target element reference from the page snapshot'), ref: z.string().describe('Exact target element reference from the page snapshot'),
}); });
export const click: Tool = { const click: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_click', name: 'browser_click',
description: 'Perform click on a web page', description: 'Perform click on a web page',
@@ -48,7 +48,12 @@ export const click: Tool = {
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = elementSchema.parse(params); const validatedParams = elementSchema.parse(params);
return runAndWait(context, `"${validatedParams.element}" clicked`, () => context.refLocator(validatedParams.ref).click(), true); return await context.currentTab().runAndWaitWithSnapshot(async tab => {
const locator = tab.lastSnapshot().refLocator(validatedParams.ref);
await locator.click();
}, {
status: `Clicked "${validatedParams.element}"`,
});
}, },
}; };
@@ -59,7 +64,8 @@ const dragSchema = z.object({
endRef: z.string().describe('Exact target element reference from the page snapshot'), endRef: z.string().describe('Exact target element reference from the page snapshot'),
}); });
export const drag: Tool = { const drag: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_drag', name: 'browser_drag',
description: 'Perform drag and drop between two elements', description: 'Perform drag and drop between two elements',
@@ -68,15 +74,18 @@ export const drag: Tool = {
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = dragSchema.parse(params); const validatedParams = dragSchema.parse(params);
return runAndWait(context, `Dragged "${validatedParams.startElement}" to "${validatedParams.endElement}"`, async () => { return await context.currentTab().runAndWaitWithSnapshot(async tab => {
const startLocator = context.refLocator(validatedParams.startRef); const startLocator = tab.lastSnapshot().refLocator(validatedParams.startRef);
const endLocator = context.refLocator(validatedParams.endRef); const endLocator = tab.lastSnapshot().refLocator(validatedParams.endRef);
await startLocator.dragTo(endLocator); await startLocator.dragTo(endLocator);
}, true); }, {
status: `Dragged "${validatedParams.startElement}" to "${validatedParams.endElement}"`,
});
}, },
}; };
export const hover: Tool = { const hover: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_hover', name: 'browser_hover',
description: 'Hover over element on page', description: 'Hover over element on page',
@@ -85,16 +94,23 @@ export const hover: Tool = {
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = elementSchema.parse(params); const validatedParams = elementSchema.parse(params);
return runAndWait(context, `Hovered over "${validatedParams.element}"`, () => context.refLocator(validatedParams.ref).hover(), true); return await context.currentTab().runAndWaitWithSnapshot(async tab => {
const locator = tab.lastSnapshot().refLocator(validatedParams.ref);
await locator.hover();
}, {
status: `Hovered over "${validatedParams.element}"`,
});
}, },
}; };
const typeSchema = elementSchema.extend({ const typeSchema = elementSchema.extend({
text: z.string().describe('Text to type into the element'), text: z.string().describe('Text to type into the element'),
submit: z.boolean().describe('Whether to submit entered text (press Enter after)'), submit: z.boolean().optional().describe('Whether to submit entered text (press Enter after)'),
slowly: z.boolean().optional().describe('Whether to type one character at a time. Useful for triggering key handlers in the page. By default entire text is filled in at once.'),
}); });
export const type: Tool = { const type: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_type', name: 'browser_type',
description: 'Type text into editable element', description: 'Type text into editable element',
@@ -103,12 +119,17 @@ export const type: Tool = {
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = typeSchema.parse(params); const validatedParams = typeSchema.parse(params);
return await runAndWait(context, `Typed "${validatedParams.text}" into "${validatedParams.element}"`, async () => { return await context.currentTab().runAndWaitWithSnapshot(async tab => {
const locator = context.refLocator(validatedParams.ref); const locator = tab.lastSnapshot().refLocator(validatedParams.ref);
await locator.fill(validatedParams.text); if (validatedParams.slowly)
await locator.pressSequentially(validatedParams.text);
else
await locator.fill(validatedParams.text);
if (validatedParams.submit) if (validatedParams.submit)
await locator.press('Enter'); await locator.press('Enter');
}, true); }, {
status: `Typed "${validatedParams.text}" into "${validatedParams.element}"`,
});
}, },
}; };
@@ -116,7 +137,8 @@ const selectOptionSchema = elementSchema.extend({
values: z.array(z.string()).describe('Array of values to select in the dropdown. This can be a single value or multiple values.'), values: z.array(z.string()).describe('Array of values to select in the dropdown. This can be a single value or multiple values.'),
}); });
export const selectOption: Tool = { const selectOption: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_select_option', name: 'browser_select_option',
description: 'Select an option in a dropdown', description: 'Select an option in a dropdown',
@@ -125,10 +147,12 @@ export const selectOption: Tool = {
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = selectOptionSchema.parse(params); const validatedParams = selectOptionSchema.parse(params);
return await runAndWait(context, `Selected option in "${validatedParams.element}"`, async () => { return await context.currentTab().runAndWaitWithSnapshot(async tab => {
const locator = context.refLocator(validatedParams.ref); const locator = tab.lastSnapshot().refLocator(validatedParams.ref);
await locator.selectOption(validatedParams.values); await locator.selectOption(validatedParams.values);
}, true); }, {
status: `Selected option in "${validatedParams.element}"`,
});
}, },
}; };
@@ -136,7 +160,8 @@ const screenshotSchema = z.object({
raw: z.boolean().optional().describe('Whether to return without compression (in PNG format). Default is false, which returns a JPEG image.'), raw: z.boolean().optional().describe('Whether to return without compression (in PNG format). Default is false, which returns a JPEG image.'),
}); });
export const screenshot: Tool = { const screenshot: Tool = {
capability: 'core',
schema: { schema: {
name: 'browser_take_screenshot', name: 'browser_take_screenshot',
description: `Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.`, description: `Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.`,
@@ -145,11 +170,21 @@ export const screenshot: Tool = {
handle: async (context, params) => { handle: async (context, params) => {
const validatedParams = screenshotSchema.parse(params); const validatedParams = screenshotSchema.parse(params);
const page = context.existingPage(); const tab = context.currentTab();
const options: playwright.PageScreenshotOptions = validatedParams.raw ? { type: 'png', scale: 'css' } : { type: 'jpeg', quality: 50, scale: 'css' }; const options: playwright.PageScreenshotOptions = validatedParams.raw ? { type: 'png', scale: 'css' } : { type: 'jpeg', quality: 50, scale: 'css' };
const screenshot = await page.screenshot(options); const screenshot = await tab.page.screenshot(options);
return { return {
content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: validatedParams.raw ? 'image/png' : 'image/jpeg' }], content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: validatedParams.raw ? 'image/png' : 'image/jpeg' }],
}; };
}, },
}; };
export default [
snapshot,
click,
drag,
hover,
type,
selectOption,
screenshot,
];
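The `slowly` flag added to `browser_type` above switches between filling the whole value at once and pressing keys one at a time. A minimal sketch of that branching follows, assuming a recording stand-in instead of a real browser: `LocatorLike`, `typeInto`, `recordingLocator` and `demo` are hypothetical names for illustration, and only `fill`, `pressSequentially` and `press` correspond to the Playwright locator methods the diff calls.

```typescript
// LocatorLike lists only the three locator methods the handler in the diff calls.
// The recording fake below is a hypothetical stand-in, not a real browser.
type LocatorLike = {
  fill(text: string): Promise<void>;
  pressSequentially(text: string): Promise<void>;
  press(key: string): Promise<void>;
};

type TypeParams = { text: string; submit?: boolean; slowly?: boolean };

// Same branching as the new browser_type handler: slowly -> key by key, else fill.
async function typeInto(locator: LocatorLike, params: TypeParams): Promise<void> {
  if (params.slowly)
    await locator.pressSequentially(params.text);
  else
    await locator.fill(params.text);
  if (params.submit)
    await locator.press('Enter');
}

// Records each call as "method:argument" so the order can be inspected.
function recordingLocator(calls: string[]): LocatorLike {
  return {
    fill: async text => { calls.push(`fill:${text}`); },
    pressSequentially: async text => { calls.push(`pressSequentially:${text}`); },
    press: async key => { calls.push(`press:${key}`); },
  };
}

async function demo(): Promise<string[][]> {
  const fast: string[] = [];
  const slow: string[] = [];
  await typeInto(recordingLocator(fast), { text: 'Hi!', submit: true });
  await typeInto(recordingLocator(slow), { text: 'Hi!', submit: true, slowly: true });
  return [fast, slow];
}
```

In a real page, key handlers such as `onkeydown` fire per character only on the `slowly` path, which is what the `browser_type (slowly)` test in this diff checks through the console log.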

src/tools/tabs.ts Normal file

@@ -0,0 +1,109 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import type { ToolFactory, Tool } from './tool';
const listTabs: Tool = {
capability: 'tabs',
schema: {
name: 'browser_tab_list',
description: 'List browser tabs',
inputSchema: zodToJsonSchema(z.object({})),
},
handle: async context => {
return {
content: [{
type: 'text',
text: await context.listTabs(),
}],
};
},
};
const selectTabSchema = z.object({
index: z.number().describe('The index of the tab to select'),
});
const selectTab: ToolFactory = captureSnapshot => ({
capability: 'tabs',
schema: {
name: 'browser_tab_select',
description: 'Select a tab by index',
inputSchema: zodToJsonSchema(selectTabSchema),
},
handle: async (context, params) => {
const validatedParams = selectTabSchema.parse(params);
await context.selectTab(validatedParams.index);
const currentTab = await context.ensureTab();
return await currentTab.run(async () => {}, { captureSnapshot });
},
});
const newTabSchema = z.object({
url: z.string().optional().describe('The URL to navigate to in the new tab. If not provided, the new tab will be blank.'),
});
const newTab: Tool = {
capability: 'tabs',
schema: {
name: 'browser_tab_new',
description: 'Open a new tab',
inputSchema: zodToJsonSchema(newTabSchema),
},
handle: async (context, params) => {
const validatedParams = newTabSchema.parse(params);
await context.newTab();
if (validatedParams.url)
await context.currentTab().navigate(validatedParams.url);
return await context.currentTab().run(async () => {}, { captureSnapshot: true });
},
};
const closeTabSchema = z.object({
index: z.number().optional().describe('The index of the tab to close. Closes current tab if not provided.'),
});
const closeTab: ToolFactory = captureSnapshot => ({
capability: 'tabs',
schema: {
name: 'browser_tab_close',
description: 'Close a tab',
inputSchema: zodToJsonSchema(closeTabSchema),
},
handle: async (context, params) => {
const validatedParams = closeTabSchema.parse(params);
await context.closeTab(validatedParams.index);
const currentTab = await context.currentTab();
if (currentTab)
return await currentTab.run(async () => {}, { captureSnapshot });
return {
content: [{
type: 'text',
text: await context.listTabs(),
}],
};
},
});
export default (captureSnapshot: boolean) => [
listTabs,
newTab,
selectTab(captureSnapshot),
closeTab(captureSnapshot),
];
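The default export of `src/tools/tabs.ts` is a function of `captureSnapshot`, so the same tool definitions can be instantiated with or without a page snapshot in their result. The sketch below illustrates that factory pattern with simplified types. `SketchTool`, `SketchToolFactory` and `createTabTools` are hypothetical names, the result strings are placeholders, and the reading that `true` and `false` correspond to the snapshot and vision modes is an assumption, since the caller is not shown in this part of the diff.

```typescript
// Simplified, hypothetical stand-ins for Tool and ToolFactory from the diff.
type SketchResult = { text: string };
type SketchTool = { name: string; handle: () => SketchResult };
type SketchToolFactory = (captureSnapshot: boolean) => SketchTool;

// A plain tool: its behavior does not depend on the mode.
const listTabs: SketchTool = {
  name: 'browser_tab_list',
  handle: () => ({ text: 'tab list' }),
};

// A factory: the flag is captured in the closure when the tool is created.
const selectTab: SketchToolFactory = captureSnapshot => ({
  name: 'browser_tab_select',
  handle: () => ({ text: captureSnapshot ? 'status + page snapshot' : 'status only' }),
});

// Mirrors the shape of the default export: plain tools next to instantiated factories.
const createTabTools = (captureSnapshot: boolean): SketchTool[] => [
  listTabs,
  selectTab(captureSnapshot),
];

const withSnapshot = createTabTools(true);   // assumed: snapshot mode
const withoutSnapshot = createTabTools(false); // assumed: vision mode
console.log(withSnapshot[1].handle().text, '/', withoutSnapshot[1].handle().text);
```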


@@ -18,6 +18,8 @@ import type { ImageContent, TextContent } from '@modelcontextprotocol/sdk/types'
import type { JsonSchema7Type } from 'zod-to-json-schema'; import type { JsonSchema7Type } from 'zod-to-json-schema';
import type { Context } from '../context'; import type { Context } from '../context';
export type ToolCapability = 'core' | 'tabs' | 'pdf' | 'history' | 'wait' | 'files' | 'install';
export type ToolSchema = { export type ToolSchema = {
name: string; name: string;
description: string; description: string;
@@ -30,6 +32,7 @@ export type ToolResult = {
}; };
export type Tool = { export type Tool = {
capability: ToolCapability;
schema: ToolSchema; schema: ToolSchema;
handle: (context: Context, params?: Record<string, any>) => Promise<ToolResult>; handle: (context: Context, params?: Record<string, any>) => Promise<ToolResult>;
}; };
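Every `Tool` now carries a `capability` tag. One plausible use, shown here as a sketch only, is to filter the tool list against the capabilities a user enabled. The union below is copied from `ToolCapability` above; `ToolLike`, `filterTools` and the sample data are hypothetical, and how the server actually consumes the `--caps` argument is not shown in this part of the diff.

```typescript
// The union mirrors ToolCapability from the diff; everything else is illustrative.
type ToolCapability = 'core' | 'tabs' | 'pdf' | 'history' | 'wait' | 'files' | 'install';

type ToolLike = {
  capability: ToolCapability;
  schema: { name: string };
};

// Keep every tool when no capability list is given; otherwise keep only
// the tools whose capability appears in the list.
function filterTools<T extends ToolLike>(tools: T[], caps?: ToolCapability[]): T[] {
  if (!caps)
    return tools;
  const enabled = new Set<ToolCapability>(caps);
  return tools.filter(tool => enabled.has(tool.capability));
}

const allTools: ToolLike[] = [
  { capability: 'core', schema: { name: 'browser_click' } },
  { capability: 'tabs', schema: { name: 'browser_tab_list' } },
  { capability: 'pdf', schema: { name: 'browser_pdf_save' } },
];

const coreOnly = filterTools(allTools, ['core']).map(t => t.schema.name);
console.log(coreOnly); // [ 'browser_click' ]
```

This matches the expectation in the `test capabilities` test further down, where starting the client with only `core` enabled leaves `browser_pdf_save` out of the tool list.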


@@ -15,10 +15,8 @@
*/ */
import type * as playwright from 'playwright'; import type * as playwright from 'playwright';
import type { ToolResult } from './tool';
import type { Context } from '../context';
async function waitForCompletion<R>(page: playwright.Page, callback: () => Promise<R>): Promise<R> { export async function waitForCompletion<R>(page: playwright.Page, callback: () => Promise<R>): Promise<R> {
const requests = new Set<playwright.Request>(); const requests = new Set<playwright.Request>();
let frameNavigated = false; let frameNavigated = false;
let waitCallback: () => void = () => {}; let waitCallback: () => void = () => {};
@@ -71,38 +69,6 @@ async function waitForCompletion<R>(page: playwright.Page, callback: () => Promi
} }
} }
export async function runAndWait(context: Context, status: string, callback: (page: playwright.Page) => Promise<any>, snapshot: boolean = false): Promise<ToolResult> { export function sanitizeForFilePath(s: string) {
const page = context.existingPage(); return s.replace(/[\x00-\x2C\x2E-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+/g, '-');
const dismissFileChooser = context.hasFileChooser();
await waitForCompletion(page, () => callback(page));
if (dismissFileChooser)
context.clearFileChooser();
const result: ToolResult = snapshot ? await captureAriaSnapshot(context, status) : {
content: [{ type: 'text', text: status }],
};
return result;
}
export async function captureAriaSnapshot(context: Context, status: string = ''): Promise<ToolResult> {
const page = context.existingPage();
const lines = [];
if (status)
lines.push(`${status}`);
lines.push(
'',
`- Page URL: ${page.url()}`,
`- Page Title: ${await page.title()}`
);
if (context.hasFileChooser())
lines.push(`- There is a file chooser visible that requires browser_choose_file to be called`);
lines.push(
`- Page Snapshot`,
'```yaml',
await context.allFramesSnapshot(),
'```',
''
);
return {
content: [{ type: 'text', text: lines.join('\n') }],
};
} }
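The new `sanitizeForFilePath` helper collapses every run of ASCII characters other than letters, digits and `-` into a single `-`. Characters above `\x7F` pass through unchanged. The function body below is copied from the new side of the diff; the sample input is an invented example.

```typescript
// Copied from the new side of the diff above.
function sanitizeForFilePath(s: string): string {
  return s.replace(/[\x00-\x2C\x2E-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+/g, '-');
}

// "://" is one run of excluded characters, so it becomes a single "-".
// ".", "/", " " and "_" are excluded too.
const sanitized = sanitizeForFilePath('https://example.com/a b_c.pdf');
console.log(sanitized); // https-example-com-a-b-c-pdf
```

Note that the dot before a file extension is replaced as well, so a caller that wants an extension would need to append it after sanitizing.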


@@ -15,75 +15,28 @@
*/ */
import fs from 'fs/promises'; import fs from 'fs/promises';
import { spawn } from 'node:child_process';
import path from 'node:path';
import { test, expect } from './fixtures'; import { test, expect } from './fixtures';
test('test tool list', async ({ client, visionClient }) => { test('browser_navigate', async ({ client }) => {
const { tools } = await client.listTools();
expect(tools.map(t => t.name)).toEqual([
'browser_navigate',
'browser_go_back',
'browser_go_forward',
'browser_choose_file',
'browser_snapshot',
'browser_click',
'browser_hover',
'browser_type',
'browser_select_option',
'browser_take_screenshot',
'browser_press_key',
'browser_wait',
'browser_save_as_pdf',
'browser_close',
]);
const { tools: visionTools } = await visionClient.listTools();
expect(visionTools.map(t => t.name)).toEqual([
'browser_navigate',
'browser_go_back',
'browser_go_forward',
'browser_choose_file',
'browser_screenshot',
'browser_move_mouse',
'browser_click',
'browser_drag',
'browser_type',
'browser_press_key',
'browser_wait',
'browser_save_as_pdf',
'browser_close',
]);
});
test('test resources list', async ({ client }) => {
const { resources } = await client.listResources();
expect(resources).toEqual([
expect.objectContaining({
uri: 'browser://console',
mimeType: 'text/plain',
}),
]);
});
test('test browser_navigate', async ({ client }) => {
expect(await client.callTool({ expect(await client.callTool({
name: 'browser_navigate', name: 'browser_navigate',
arguments: { arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>', url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
}, },
})).toHaveTextContent(` })).toHaveTextContent(`
Navigated to data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html> - Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title - Page Title: Title
- Page Snapshot - Page Snapshot
\`\`\`yaml \`\`\`yaml
- document [ref=s1e2]: Hello, world! - text: Hello, world!
\`\`\` \`\`\`
` `
); );
}); });
test('test browser_click', async ({ client }) => { test('browser_click', async ({ client }) => {
await client.callTool({ await client.callTool({
name: 'browser_navigate', name: 'browser_navigate',
arguments: { arguments: {
@@ -95,48 +48,21 @@ test('test browser_click', async ({ client }) => {
name: 'browser_click', name: 'browser_click',
arguments: { arguments: {
element: 'Submit button', element: 'Submit button',
ref: 's1e4', ref: 's1e3',
}, },
})).toHaveTextContent(`"Submit button" clicked })).toHaveTextContent(`Clicked "Submit button"
- Page URL: data:text/html,<html><title>Title</title><button>Submit</button></html> - Page URL: data:text/html,<html><title>Title</title><button>Submit</button></html>
- Page Title: Title - Page Title: Title
- Page Snapshot - Page Snapshot
\`\`\`yaml \`\`\`yaml
- document [ref=s2e2]: - button "Submit" [ref=s2e3]
- button "Submit" [ref=s2e4]
\`\`\` \`\`\`
`); `);
}); });
test('test reopen browser', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
});
expect(await client.callTool({ test('browser_select_option', async ({ client }) => {
name: 'browser_close',
})).toHaveTextContent('Page closed');
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
})).toHaveTextContent(`
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s1e2]: Hello, world!
\`\`\`
`);
});
test('single option', async ({ client }) => {
await client.callTool({ await client.callTool({
name: 'browser_navigate', name: 'browser_navigate',
arguments: { arguments: {
@@ -148,7 +74,7 @@ test('single option', async ({ client }) => {
name: 'browser_select_option', name: 'browser_select_option',
arguments: { arguments: {
element: 'Select', element: 'Select',
ref: 's1e4', ref: 's1e3',
values: ['bar'], values: ['bar'],
}, },
})).toHaveTextContent(`Selected option in "Select" })).toHaveTextContent(`Selected option in "Select"
@@ -157,15 +83,14 @@ test('single option', async ({ client }) => {
- Page Title: Title - Page Title: Title
- Page Snapshot - Page Snapshot
\`\`\`yaml \`\`\`yaml
- document [ref=s2e2]: - combobox [ref=s2e3]:
- combobox [ref=s2e4]: - option "Foo" [ref=s2e4]
- option "Foo" [ref=s2e5] - option "Bar" [selected] [ref=s2e5]
- option "Bar" [selected] [ref=s2e6]
\`\`\` \`\`\`
`); `);
}); });
test('multiple option', async ({ client }) => { test('browser_select_option (multiple)', async ({ client }) => {
await client.callTool({ await client.callTool({
name: 'browser_navigate', name: 'browser_navigate',
arguments: { arguments: {
@@ -177,7 +102,7 @@ test('multiple option', async ({ client }) => {
name: 'browser_select_option', name: 'browser_select_option',
arguments: { arguments: {
element: 'Select', element: 'Select',
ref: 's1e4', ref: 's1e3',
values: ['bar', 'baz'], values: ['bar', 'baz'],
}, },
})).toHaveTextContent(`Selected option in "Select" })).toHaveTextContent(`Selected option in "Select"
@@ -186,83 +111,43 @@ test('multiple option', async ({ client }) => {
- Page Title: Title - Page Title: Title
- Page Snapshot - Page Snapshot
\`\`\`yaml \`\`\`yaml
- document [ref=s2e2]: - listbox [ref=s2e3]:
- listbox [ref=s2e4]: - option "Foo" [ref=s2e4]
- option "Foo" [ref=s2e5] - option "Bar" [selected] [ref=s2e5]
- option "Bar" [selected] [ref=s2e6] - option "Baz" [selected] [ref=s2e6]
- option "Baz" [selected] [ref=s2e7]
\`\`\` \`\`\`
`); `);
}); });
test('browser://console', async ({ client }) => { test('browser_file_upload', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><script>console.log("Hello, world!");console.error("Error"); </script></html>',
},
});
const resource = await client.readResource({
uri: 'browser://console',
});
expect(resource.contents).toEqual([{
uri: 'browser://console',
mimeType: 'text/plain',
text: '[LOG] Hello, world!\n[ERROR] Error',
}]);
});
test('stitched aria frames', async ({ client }) => {
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<h1>Hello</h1><iframe src="data:text/html,<h1>World</h1>"></iframe><iframe src="data:text/html,<h1>Should be invisible</h1>" style="display: none;"></iframe>',
},
})).toHaveTextContent(`
- Page URL: data:text/html,<h1>Hello</h1><iframe src="data:text/html,<h1>World</h1>"></iframe><iframe src="data:text/html,<h1>Should be invisible</h1>" style="display: none;"></iframe>
- Page Title:
- Page Snapshot
\`\`\`yaml
- document [ref=s1e2]:
- heading "Hello" [level=1] [ref=s1e4]
# iframe src=data:text/html,<h1>World</h1>
- document [ref=f0s1e2]:
- heading "World" [level=1] [ref=f0s1e4]
\`\`\`
`);
});
test('browser_choose_file', async ({ client }) => {
expect(await client.callTool({ expect(await client.callTool({
name: 'browser_navigate', name: 'browser_navigate',
arguments: { arguments: {
url: 'data:text/html,<html><title>Title</title><input type="file" /><button>Button</button></html>', url: 'data:text/html,<html><title>Title</title><input type="file" /><button>Button</button></html>',
}, },
})).toContainTextContent('- textbox [ref=s1e4]'); })).toContainTextContent('- textbox [ref=s1e3]');
expect(await client.callTool({ expect(await client.callTool({
name: 'browser_click', name: 'browser_click',
arguments: { arguments: {
element: 'Textbox', element: 'Textbox',
ref: 's1e4', ref: 's1e3',
}, },
})).toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called'); })).toContainTextContent('There is a file chooser visible that requires browser_file_upload to be called');
const filePath = test.info().outputPath('test.txt'); const filePath = test.info().outputPath('test.txt');
await fs.writeFile(filePath, 'Hello, world!'); await fs.writeFile(filePath, 'Hello, world!');
{ {
const response = await client.callTool({ const response = await client.callTool({
name: 'browser_choose_file', name: 'browser_file_upload',
arguments: { arguments: {
paths: [filePath], paths: [filePath],
}, },
}); });
expect(response).not.toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called'); expect(response).not.toContainTextContent('There is a file chooser visible that requires browser_file_upload to be called');
expect(response).toContainTextContent('textbox [ref=s3e4]: C:\\fakepath\\test.txt'); expect(response).toContainTextContent('textbox [ref=s3e3]: C:\\fakepath\\test.txt');
} }
{ {
@@ -270,12 +155,12 @@ test('browser_choose_file', async ({ client }) => {
name: 'browser_click', name: 'browser_click',
arguments: { arguments: {
element: 'Textbox', element: 'Textbox',
ref: 's3e4', ref: 's3e3',
}, },
}); });
expect(response).toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called'); expect(response).toContainTextContent('There is a file chooser visible that requires browser_file_upload to be called');
expect(response).toContainTextContent('button "Button" [ref=s4e5]'); expect(response).toContainTextContent('button "Button" [ref=s4e4]');
} }
{ {
@@ -283,33 +168,68 @@ test('browser_choose_file', async ({ client }) => {
name: 'browser_click', name: 'browser_click',
arguments: { arguments: {
element: 'Button', element: 'Button',
ref: 's4e5', ref: 's4e4',
}, },
}); });
expect(response, 'not submitting browser_choose_file dismisses file chooser').not.toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called'); expect(response, 'not submitting browser_file_upload dismisses file chooser').not.toContainTextContent('There is a file chooser visible that requires browser_file_upload to be called');
} }
}); });
test('sse transport', async () => { test('browser_type', async ({ client }) => {
const cp = spawn('node', [path.join(__dirname, '../cli.js'), '--port', '0'], { stdio: 'pipe' }); await client.callTool({
try { name: 'browser_navigate',
let stdout = ''; arguments: {
const url = await new Promise<string>(resolve => cp.stdout?.on('data', data => { url: `data:text/html,<input type='keypress' onkeypress="console.log('Key pressed:', event.key, ', Text:', event.target.value)"></input>`,
stdout += data.toString(); },
const match = stdout.match(/Listening on (http:\/\/.*)/); });
if (match) await client.callTool({
resolve(match[1]); name: 'browser_type',
})); arguments: {
element: 'textbox',
// need dynamic import b/c of some ESM nonsense ref: 's1e3',
const { SSEClientTransport } = await import('@modelcontextprotocol/sdk/client/sse.js'); text: 'Hi!',
const { Client } = await import('@modelcontextprotocol/sdk/client/index.js'); submit: true,
const transport = new SSEClientTransport(new URL(url)); },
const client = new Client({ name: 'test', version: '1.0.0' }); });
await client.connect(transport); const resource = await client.readResource({
await client.ping(); uri: 'browser://console',
} finally { });
cp.kill(); expect(resource.contents).toEqual([{
} uri: 'browser://console',
mimeType: 'text/plain',
text: '[LOG] Key pressed: Enter , Text: Hi!',
}]);
});
test('browser_type (slowly)', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: `data:text/html,<input type='text' onkeydown="console.log('Key pressed:', event.key, 'Text:', event.target.value)"></input>`,
},
});
await client.callTool({
name: 'browser_type',
arguments: {
element: 'textbox',
ref: 's1e3',
text: 'Hi!',
submit: true,
slowly: true,
},
});
const resource = await client.readResource({
uri: 'browser://console',
});
expect(resource.contents).toEqual([{
uri: 'browser://console',
mimeType: 'text/plain',
text: [
'[LOG] Key pressed: H Text: ',
'[LOG] Key pressed: i Text: H',
'[LOG] Key pressed: ! Text: Hi',
'[LOG] Key pressed: Enter Text: Hi!',
].join('\n'),
}]);
}); });


@@ -0,0 +1,92 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('test snapshot tool list', async ({ client }) => {
const { tools } = await client.listTools();
expect(new Set(tools.map(t => t.name))).toEqual(new Set([
'browser_click',
'browser_drag',
'browser_file_upload',
'browser_hover',
'browser_select_option',
'browser_type',
'browser_close',
'browser_install',
'browser_navigate_back',
'browser_navigate_forward',
'browser_navigate',
'browser_pdf_save',
'browser_press_key',
'browser_snapshot',
'browser_tab_close',
'browser_tab_list',
'browser_tab_new',
'browser_tab_select',
'browser_take_screenshot',
'browser_wait',
]));
});
test('test vision tool list', async ({ visionClient }) => {
const { tools: visionTools } = await visionClient.listTools();
expect(new Set(visionTools.map(t => t.name))).toEqual(new Set([
'browser_close',
'browser_file_upload',
'browser_install',
'browser_navigate_back',
'browser_navigate_forward',
'browser_navigate',
'browser_pdf_save',
'browser_press_key',
'browser_screen_capture',
'browser_screen_click',
'browser_screen_drag',
'browser_screen_move_mouse',
'browser_screen_type',
'browser_tab_close',
'browser_tab_list',
'browser_tab_new',
'browser_tab_select',
'browser_wait',
]));
});
test('test resources list', async ({ client }) => {
const { resources } = await client.listResources();
expect(resources).toEqual([
expect.objectContaining({
uri: 'browser://console',
mimeType: 'text/plain',
}),
]);
});
test('test capabilities', async ({ startClient }) => {
const client = await startClient({
args: ['--caps="core"'],
});
const { tools } = await client.listTools();
const toolNames = tools.map(t => t.name);
expect(toolNames).not.toContain('browser_file_upload');
expect(toolNames).not.toContain('browser_pdf_save');
expect(toolNames).not.toContain('browser_screen_capture');
expect(toolNames).not.toContain('browser_screen_click');
expect(toolNames).not.toContain('browser_screen_drag');
expect(toolNames).not.toContain('browser_screen_move_mouse');
expect(toolNames).not.toContain('browser_screen_type');
});

tests/cdp.spec.ts Normal file

@@ -0,0 +1,37 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('cdp server', async ({ cdpEndpoint, startClient }) => {
const client = await startClient({ args: [`--cdp-endpoint=${cdpEndpoint}`] });
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
})).toHaveTextContent(`
Navigated to data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- text: Hello, world!
\`\`\`
`
);
});

tests/console.spec.ts Normal file

@@ -0,0 +1,35 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('browser://console', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><script>console.log("Hello, world!");console.error("Error"); </script></html>',
},
});
const resource = await client.readResource({
uri: 'browser://console',
});
expect(resource.contents).toEqual([{
uri: 'browser://console',
mimeType: 'text/plain',
text: '[LOG] Hello, world!\n[ERROR] Error',
}]);
});


@@ -24,8 +24,9 @@ import { Client } from '@modelcontextprotocol/sdk/client/index.js';
 type Fixtures = {
 client: Client;
 visionClient: Client;
-startClient: (options?: { env?: NodeJS.ProcessEnv, vision?: boolean }) => Promise<Client>;
+startClient: (options?: { args?: string[] }) => Promise<Client>;
 wsEndpoint: string;
+cdpEndpoint: string;
 };
 export const test = baseTest.extend<Fixtures>({
@@ -35,7 +36,7 @@ export const test = baseTest.extend<Fixtures>({
 },
 visionClient: async ({ startClient }, use) => {
-await use(await startClient({ vision: true }));
+await use(await startClient({ args: ['--vision'] }));
 },
 startClient: async ({ }, use, testInfo) => {
@@ -44,8 +45,8 @@ export const test = baseTest.extend<Fixtures>({
 use(async options => {
 const args = ['--headless', '--user-data-dir', userDataDir];
-if (options?.vision)
-args.push('--vision');
+if (options?.args)
+args.push(...options.args);
 const transport = new StdioClientTransport({
 command: 'node',
 args: [path.join(__dirname, '../cli.js'), ...args],
@@ -64,20 +65,36 @@ export const test = baseTest.extend<Fixtures>({
 await use(browserServer.wsEndpoint());
 await browserServer.close();
 },
+cdpEndpoint: async ({ }, use, testInfo) => {
+const port = 3200 + (+process.env.TEST_PARALLEL_INDEX!);
+const browser = await chromium.launchPersistentContext(testInfo.outputPath('user-data-dir'), {
+channel: 'chrome',
+args: [`--remote-debugging-port=${port}`],
+});
+await use(`http://localhost:${port}`);
+await browser.close();
+},
 });
 type Response = Awaited<ReturnType<Client['callTool']>>;
 export const expect = baseExpect.extend({
-toHaveTextContent(response: Response, content: string | string[]) {
+toHaveTextContent(response: Response, content: string | RegExp) {
 const isNot = this.isNot;
 try {
-content = Array.isArray(content) ? content : [content];
-const texts = (response.content as any).map(c => c.text);
+const text = (response.content as any)[0].text;
+if (typeof content === 'string') {
 if (isNot)
-baseExpect(texts).not.toEqual(content);
+baseExpect(text.trim()).not.toBe(content.trim());
 else
-baseExpect(texts).toEqual(content);
+baseExpect(text.trim()).toBe(content.trim());
+} else {
+if (isNot)
+baseExpect(text).not.toMatch(content);
+else
+baseExpect(text).toMatch(content);
+}
 } catch (e) {
 return {
 pass: isNot,

43
tests/iframes.spec.ts Normal file

@@ -0,0 +1,43 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('stitched aria frames', async ({ client }) => {
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: `data:text/html,<h1>Hello</h1><iframe src="data:text/html,<button>World</button><main><iframe src='data:text/html,<p>Nested</p>'></iframe></main>"></iframe><iframe src="data:text/html,<h1>Should be invisible</h1>" style="display: none;"></iframe>`,
},
})).toContainTextContent(`
\`\`\`yaml
- heading "Hello" [level=1] [ref=s1e3]
- iframe [ref=s1e4]:
  - button "World" [ref=f1s1e3]
  - main [ref=f1s1e4]:
    - iframe [ref=f1s1e5]:
      - paragraph [ref=f2s1e3]: Nested
\`\`\`
`);
expect(await client.callTool({
name: 'browser_click',
arguments: {
element: 'World',
ref: 'f1s1e3',
},
})).toContainTextContent('Clicked "World"');
});

57
tests/launch.spec.ts Normal file

@@ -0,0 +1,57 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('test reopen browser', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
});
expect(await client.callTool({
name: 'browser_close',
})).toHaveTextContent('Page closed');
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
})).toHaveTextContent(`
Navigated to data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- text: Hello, world!
\`\`\`
`);
});
test('executable path', async ({ startClient }) => {
const client = await startClient({ args: [`--executable-path=bogus`] });
const response = await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
});
expect(response).toContainTextContent(`executable doesn't exist`);
});

55
tests/pdf.spec.ts Normal file

@@ -0,0 +1,55 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('save as pdf unavailable', async ({ startClient }) => {
const client = await startClient({ args: ['--caps="no-pdf"'] });
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
});
expect(await client.callTool({
name: 'browser_pdf_save',
})).toHaveTextContent(/Tool "browser_pdf_save" not found/);
});
test('save as pdf', async ({ client }) => {
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
})).toHaveTextContent(`
Navigated to data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- text: Hello, world!
\`\`\`
`
);
const response = await client.callTool({
name: 'browser_pdf_save',
});
expect(response).toHaveTextContent(/^Saved as.*page-[^:]+\.pdf$/);
});

42
tests/sse.spec.ts Normal file

@@ -0,0 +1,42 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { spawn } from 'node:child_process';
import path from 'node:path';
import { test } from './fixtures';
test('sse transport', async () => {
const cp = spawn('node', [path.join(__dirname, '../cli.js'), '--port', '0'], { stdio: 'pipe' });
try {
let stdout = '';
const url = await new Promise<string>(resolve => cp.stdout?.on('data', data => {
stdout += data.toString();
const match = stdout.match(/Listening on (http:\/\/.*)/);
if (match)
resolve(match[1]);
}));
// Use dynamic imports here to work around ESM interop issues.
const { SSEClientTransport } = await import('@modelcontextprotocol/sdk/client/sse.js');
const { Client } = await import('@modelcontextprotocol/sdk/client/index.js');
const transport = new SSEClientTransport(new URL(url));
const client = new Client({ name: 'test', version: '1.0.0' });
await client.connect(transport);
await client.ping();
} finally {
cp.kill();
}
});

121
tests/tabs.spec.ts Normal file

@@ -0,0 +1,121 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { chromium } from 'playwright';
import { test, expect } from './fixtures';
import type { Client } from '@modelcontextprotocol/sdk/client/index.js';
async function createTab(client: Client, title: string, body: string) {
return await client.callTool({
name: 'browser_tab_new',
arguments: {
url: `data:text/html,<title>${title}</title><body>${body}</body>`,
},
});
}
test('create new tab', async ({ client }) => {
expect(await createTab(client, 'Tab one', 'Body one')).toHaveTextContent(`
Open tabs:
- 1: [] (about:blank)
- 2: (current) [Tab one] (data:text/html,<title>Tab one</title><body>Body one</body>)
Current tab:
- Page URL: data:text/html,<title>Tab one</title><body>Body one</body>
- Page Title: Tab one
- Page Snapshot
\`\`\`yaml
- text: Body one
\`\`\``);
expect(await createTab(client, 'Tab two', 'Body two')).toHaveTextContent(`
Open tabs:
- 1: [] (about:blank)
- 2: [Tab one] (data:text/html,<title>Tab one</title><body>Body one</body>)
- 3: (current) [Tab two] (data:text/html,<title>Tab two</title><body>Body two</body>)
Current tab:
- Page URL: data:text/html,<title>Tab two</title><body>Body two</body>
- Page Title: Tab two
- Page Snapshot
\`\`\`yaml
- text: Body two
\`\`\``);
});
test('select tab', async ({ client }) => {
await createTab(client, 'Tab one', 'Body one');
await createTab(client, 'Tab two', 'Body two');
expect(await client.callTool({
name: 'browser_tab_select',
arguments: {
index: 2,
},
})).toHaveTextContent(`
Open tabs:
- 1: [] (about:blank)
- 2: (current) [Tab one] (data:text/html,<title>Tab one</title><body>Body one</body>)
- 3: [Tab two] (data:text/html,<title>Tab two</title><body>Body two</body>)
Current tab:
- Page URL: data:text/html,<title>Tab one</title><body>Body one</body>
- Page Title: Tab one
- Page Snapshot
\`\`\`yaml
- text: Body one
\`\`\``);
});
test('close tab', async ({ client }) => {
await createTab(client, 'Tab one', 'Body one');
await createTab(client, 'Tab two', 'Body two');
expect(await client.callTool({
name: 'browser_tab_close',
arguments: {
index: 3,
},
})).toHaveTextContent(`
Open tabs:
- 1: [] (about:blank)
- 2: (current) [Tab one] (data:text/html,<title>Tab one</title><body>Body one</body>)
Current tab:
- Page URL: data:text/html,<title>Tab one</title><body>Body one</body>
- Page Title: Tab one
- Page Snapshot
\`\`\`yaml
- text: Body one
\`\`\``);
});
test('reuse first tab when navigating', async ({ startClient, cdpEndpoint }) => {
const browser = await chromium.connectOverCDP(cdpEndpoint);
const [context] = browser.contexts();
const pages = context.pages();
const client = await startClient({ args: [`--cdp-endpoint=${cdpEndpoint}`] });
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<title>Title</title><body>Body</body>',
},
});
expect(pages.length).toBe(1);
expect(await pages[0].title()).toBe('Title');
});