285 Commits

Author SHA1 Message Date
Yury Semikhatsky
d5d810f896 chore: mark 0.0.34 (#901) 2025-08-15 17:38:58 -07:00
Yury Semikhatsky
1efd3b55e5 devops: update extension manifest version (#904) 2025-08-15 16:10:49 -07:00
Yury Semikhatsky
1d1db1e287 chore: fix copyright (#903) 2025-08-15 15:36:12 -07:00
Yury Semikhatsky
25f15e7f5e devops: set-version.js script (#902) 2025-08-15 15:28:59 -07:00
Yury Semikhatsky
c559243ef6 chore(extension): connected badge while loading (#899) 2025-08-15 13:44:17 -07:00
Yury Semikhatsky
91d5d24cab chore: handle list roots in the server, with timeout (#898) 2025-08-15 11:23:59 -07:00
Yury Semikhatsky
92554abfd1 devops: extension publishing job (#888) 2025-08-15 10:25:46 -07:00
Pavel Feldman
4370f2cdf2 chore: try macos15 runners (#892) 2025-08-15 10:19:52 -07:00
Yury Semikhatsky
ba726fb44a chore(extension): connection timeout when extension not installed (#896) 2025-08-15 09:09:35 -07:00
Yury Semikhatsky
2fc4e88048 chore(extension): add readme file, recommend --extension option (#894) 2025-08-14 16:01:14 -07:00
Adam Tarantino
3f148a4005 docs: add opencode installation instructions (#895) 2025-08-14 15:41:46 -07:00
Pavel Feldman
c92aefdc12 chore: close all clients in fixture (#878) 2025-08-14 10:57:07 -07:00
Pavel Feldman
badfd82202 chore: move tool schema to mcp as it is used by all servers (#887) 2025-08-13 18:23:25 -07:00
Yury Semikhatsky
12942b81d6 fix: wait for initialization to complete before listing tools (#886) 2025-08-13 17:29:10 -07:00
Pavel Feldman
73adb0fdf0 chore: steer towards mcp types a bit (#880) 2025-08-13 14:09:37 -07:00
Yury Semikhatsky
8572ab300c chore: separate proxy client from external (#877) 2025-08-12 18:05:45 -07:00
Pavel Feldman
c091a11d76 chore: extract utils folder (#876) 2025-08-12 14:33:00 -07:00
Pavel Feldman
dbd44110f1 chore: run test server per context (#874)
Fixes https://github.com/microsoft/playwright-mcp/issues/869
2025-08-12 13:41:08 -07:00
Pavel Feldman
2f41a3f6b1 chore: roll Playwright to latest (#875) 2025-08-12 13:30:32 -07:00
Yury Semikhatsky
7c4d67b3ae chore: tool definition without zod (#873) 2025-08-12 13:19:25 -07:00
Vicente Filho
53c6b6dcb1 fix: backtick quote escaping (#871) 2025-08-12 13:19:09 -07:00
Yury Semikhatsky
1fb2878271 fix(proxy): properly forward root requests and client metadata (#865) 2025-08-12 10:17:45 +02:00
Pavel Feldman
ab0ecc4075 chore: introduce check-deps (#864) 2025-08-11 17:21:26 -07:00
Yury Semikhatsky
f010164bf1 chore: mcp backend switcher (#854) 2025-08-11 14:16:43 -07:00
Pavel Feldman
db9cfe1720 chore: bump test workers to 2 on CI (#863) 2025-08-11 12:48:54 -07:00
Pavel Feldman
24f81a7a27 fix: emit code for waitfor (#862)
Fixes https://github.com/microsoft/playwright-mcp/issues/859
2025-08-11 11:58:45 -07:00
Yury Semikhatsky
21ced701b5 chore(extension): status page (#856) 2025-08-08 18:33:10 -07:00
Pavel Feldman
d3bf2eefc6 chore: mark 0.0.33 (#851) 2025-08-08 17:22:18 -07:00
Pavel Feldman
2ca899316d chore: roll Playwright to recent (#850) 2025-08-08 09:37:07 -07:00
Pavel Feldman
16f3523317 chore: do not return fullPage screenshots to the LLM (#849) 2025-08-08 09:36:51 -07:00
Omar Bahareth
6c2dda31ad fix(docs): Invalid MCP Install Link (#846) 2025-08-07 18:39:50 -07:00
Yury Semikhatsky
3b6ecf0a43 chore(extension): connect button for each page, style tweaks (#848)
2025-08-07 17:24:48 -07:00
Yury Semikhatsky
636f1956cc chore(extension): explicitly detach from debugger when connection closes (#847) 2025-08-07 14:45:52 -07:00
Yury Semikhatsky
5aef2aafcb devops: switch to node 20 on CI (#844)
Node 18 maintenance period ended in April 2025. Running on 18 already
caused a problem in https://github.com/microsoft/playwright-mcp/pull/842
2025-08-07 10:04:43 -07:00
Yury Semikhatsky
8ecc46c905 chore(extension): add test (#842)
* On Linux, headed mode under xvfb-run fails to launch the process
properly. It works fine without xvfb-run, but we don't have an environment
for that on CI, so run on macOS instead.
* Node v18.20.8 stalls on `const uuid = crypto.randomUUID();`, so use
v20 for the extension tests.
2025-08-06 16:27:39 -07:00
Yury Semikhatsky
5dbb1504ba chore(extension): show error when connection is rejected due to inactivity (#836)
2025-08-05 15:08:57 -07:00
Yury Semikhatsky
20e1144c3b chore(extension): proper watchdog for inactive page selector (#835) 2025-08-05 14:18:04 -07:00
Yury Semikhatsky
eab20aa69e chore(extension): do not send if socket is already closed (#834)
* Remove debugger listeners if closed() is called as `ws.onclosed` is
dispatched asynchronously
* Tabs can be closed while update badge command is in flight
* Inflight CDP commands fail if the tab closes, do not try to send their
response to a closed socket
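
The guard described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: `SocketLike` and `sendIfOpen` are hypothetical names, and the only assumption about the socket is the standard WebSocket `readyState` value for an open connection.

```typescript
// Minimal shape of a WebSocket-like object (hypothetical, for illustration).
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

// Standard WebSocket readyState value for an open connection.
const OPEN = 1;

// Hypothetical helper: sends the message only if the socket is still open.
// Returns true if the message was sent, false if it was dropped.
function sendIfOpen(socket: SocketLike, message: object): boolean {
  if (socket.readyState !== OPEN)
    return false;
  socket.send(JSON.stringify(message));
  return true;
}
```

A response for an in-flight command that arrives after the tab has closed would simply be dropped here instead of throwing.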
2025-08-05 13:47:08 -07:00
Yury Semikhatsky
46ce86f97e chore(extension): terminate connection if nothing has been selected (#827) 2025-08-05 09:47:39 -07:00
Yury Semikhatsky
4890b9d509 chore(extension): create relay per context (#828) 2025-08-05 08:32:54 -07:00
Yury Semikhatsky
3f6837baa9 fix: cursor does not respond to listRoots (#826) 2025-08-04 20:52:55 -07:00
Yury Semikhatsky
6d62c173c8 chore(extension): build into dist directory (#825) 2025-08-04 11:47:25 -07:00
Pavel Feldman
3c6eac9b21 chore: follow up with win test fix (#818) 2025-08-01 18:19:03 -07:00
Yury Semikhatsky
41a44f7abc chore(extension): terminate connection on debugger detach (#816) 2025-08-01 17:56:47 -07:00
Yury Semikhatsky
372395666a chore: allow to switch between browser connection methods (#815) 2025-08-01 17:34:28 -07:00
Pavel Feldman
a60d7b8cd1 chore: slice profile dirs by root in vscode (#814) 2025-08-01 16:59:59 -07:00
Pavel Feldman
ffe0117456 chore: refactor initialize (#812) 2025-08-01 13:06:36 -07:00
Yury Semikhatsky
7c07cc86eb chore(extension): bind relay lifetime to browser context (#804) 2025-07-31 22:25:40 -07:00
Pavel Feldman
3787439fc1 chore: serialize session entries for tool calls and user actions (#803) 2025-07-31 15:16:56 -07:00
Max Schmitt
2a86ac74e3 chore: use pngs by default for screenshots (#797)
1. Use PNG by default.
2. Increase JPG quality from `50` to `90`.
2025-07-31 11:03:19 +02:00
Pavel Feldman
6dd44923da chore: make tab snapshot structured to mimic it in recorder (#799) 2025-07-30 20:57:34 -07:00
Pavel Feldman
f600234897 chore: record user actions in the session log (#798) 2025-07-30 18:26:13 -07:00
Pavel Feldman
4df162aff5 chore: parse response in tests (#796) 2025-07-30 12:47:22 -07:00
Yury Semikhatsky
65d99fe595 chore(extension): do not show chrome: tabs (#780) 2025-07-29 10:11:44 -07:00
Yury Semikhatsky
903c857f19 chore(extension): use separate package.json (#778) 2025-07-28 17:16:08 -07:00
Yury Semikhatsky
9b5f97b076 chore(extension): use react for connect dialog (#777) 2025-07-28 15:23:33 -07:00
Pavel Feldman
04988d8fac chore: mark v0.0.32 (#768) 2025-07-25 16:40:31 -07:00
Pavel Feldman
2bf57e22c6 chore: do not snapshot on fill (#767) 2025-07-25 15:54:18 -07:00
Yury Semikhatsky
dbf113d5e4 chore(extension): reject second http connection (#766) 2025-07-25 14:46:48 -07:00
Pavel Feldman
6710a78641 Revert "chore: recommend sse by default" (#765)
Reverts microsoft/playwright-mcp#758

Sounds like the stock streamable implementation is to spec, so we can
keep it.
2025-07-25 12:18:02 -07:00
Pavel Feldman
a9b9fb85da chore: ping client and disconnect on connection termination (#764) 2025-07-25 12:17:51 -07:00
Yury Semikhatsky
26a2a6fc83 chore: recommend sse by default (#758) 2025-07-25 09:51:01 -07:00
Pavel Feldman
e934d5e23e chore: retain the source code from the underlying tools (#756) 2025-07-24 17:08:35 -07:00
Pavel Feldman
ecfa10448b chore: extract loop tools into a separate folder (#755) 2025-07-24 16:22:03 -07:00
Yury Semikhatsky
e153ac3b7c chore(extension): exit gracefully when waiting for extension connection (#754) 2025-07-24 16:02:02 -07:00
Pavel Feldman
e0fb748ccc chore: wire one tool in-process (#753) 2025-07-24 15:25:32 -07:00
Pavel Feldman
c63b7823e1 chore: extract pure mcp server helpers (#751) 2025-07-24 12:57:01 -07:00
Yury Semikhatsky
bd34e9d7e9 chore(extension): page selector for MCP (#750) 2025-07-24 12:01:35 -07:00
Yury Semikhatsky
c72d0320f4 chore(extension): use free port (#735) 2025-07-24 10:25:13 -07:00
Pavel Feldman
da8a244f33 chore: one tool experiment (#746) 2025-07-24 10:09:01 -07:00
Pavel Feldman
31a4fb3d07 chore: unify loops (#745) 2025-07-23 17:42:53 -07:00
Yury Semikhatsky
bc120baa78 chore: do not double close connection (#744) 2025-07-23 17:41:15 -07:00
Pavel Feldman
2c5eac89a8 chore: add eval script (#743) 2025-07-23 10:31:37 -07:00
christian-lms
288f1b863b docs: Add LM Studio installation instructions (#688) 2025-07-23 08:22:13 -07:00
Yury Semikhatsky
53e3e37991 chore(extension): terminate all connections when tab closes (#741) 2025-07-22 22:23:00 -07:00
Pavel Feldman
b1a0f775cf chore: save session log (#740) 2025-07-22 20:06:03 -07:00
Pavel Feldman
6320b08173 chore: follow up on tab snapshot capture (#739) 2025-07-22 17:43:42 -07:00
Pavel Feldman
601a74305c chore: introduce response type (#738) 2025-07-22 16:36:21 -07:00
Yury Semikhatsky
c2b98dc70b chore(extension): handle root session id in the relay (#737) 2025-07-22 13:49:39 -07:00
Yury Semikhatsky
70862ce456 chore(extension): propagate errors to the client (#736) 2025-07-22 13:13:27 -07:00
Pavel Feldman
468c84eb8f chore: move state to tab, do not cache snapshot (#730) 2025-07-22 07:53:33 -07:00
Yury Semikhatsky
cfcca40b90 chore(extension): find installed chrome (#728) 2025-07-21 17:57:38 -07:00
Pavel Feldman
f1826b96b6 chore: align lint w/ playwright (#729) 2025-07-21 17:07:13 -07:00
Copilot
eeeab4f042 fix: browser_take_screenshot to not require snapshot unless element is specified (#725) 2025-07-21 10:52:06 -07:00
Copilot
efe3ff0c7c Add test for browser_evaluate error handling (#719) 2025-07-19 20:12:32 -07:00
Yury Semikhatsky
e3df209b96 chore(extension): support running in http mode (#717) 2025-07-19 08:30:29 -07:00
Pavel Feldman
29711d07d3 chore: use streamable http by default (#716)
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
2025-07-18 18:31:00 -07:00
Copilot
b0be1ee256 chore: add GitHub Copilot agent YAML specification (#715) 2025-07-18 18:03:23 -07:00
Yury Semikhatsky
d3867affed chore: add mcp chrome extension (#710) 2025-07-18 17:12:44 -07:00
Copilot
1eee30fd45 feat: add fullPage mode to browser_take_screenshot (#704) 2025-07-18 13:56:43 -07:00
Copilot
29ac29e6bb fix: no-sandbox flag logic to only disable sandbox when explicitly passed (#709) 2025-07-18 13:56:01 -07:00
Adam Gastineau
9f8441daa5 chore(docs): make VSCode match other README sections (#706) 2025-07-18 11:21:29 -07:00
Pavel Feldman
64f950ae42 chore: mark v0.0.31 (#691) 2025-07-17 16:04:21 -07:00
Pavel Feldman
5bfff0a059 chore: include recent console logs in results (#689) 2025-07-17 14:58:44 -07:00
Pavel Feldman
c97bc6e2ae chore: allow right click (#687)
Fixes https://github.com/microsoft/playwright-mcp/issues/467
2025-07-17 13:24:05 -07:00
Pavel Feldman
fe0c0ffffe chore: mirror cli options w/ env vars (#685)
Fixes https://github.com/microsoft/playwright-mcp/issues/639
2025-07-17 10:19:18 -07:00
Pavel Feldman
9526910864 chore: sort install sections (#682) 2025-07-17 09:06:10 -07:00
Pavel Feldman
95454735bf chore: remove image reply special case in cursor (#680) 2025-07-16 18:32:07 -07:00
Pavel Feldman
e9f6433241 chore: remove server experiment (#681) 2025-07-16 18:05:47 -07:00
Pavel Feldman
d61aa16fee chore: turn vision into capability (#679)
Fixes https://github.com/microsoft/playwright-mcp/issues/420
2025-07-16 16:40:00 -07:00
Pavel Feldman
012c906500 chore: introduce browser_evaluate (#678)
Fixes https://github.com/microsoft/playwright-mcp/issues/424
2025-07-16 15:02:47 -07:00
Pavel Feldman
825a97d66e chore: remove generate_test tool for now - it adds no value (#675) 2025-07-16 13:33:05 -07:00
Pavel Feldman
3061d9aa56 chore: resolve dialog races (#673)
Fixes https://github.com/microsoft/playwright-mcp/issues/595
2025-07-16 13:32:54 -07:00
Pavel Feldman
da818d113a chore: make tab indexes 0-based (#674)
Fixes https://github.com/microsoft/playwright-mcp/issues/570
2025-07-16 09:55:08 -07:00
Pavel Feldman
a5a57df105 chore: include page errors in console messages (#671)
Fixes https://github.com/microsoft/playwright-mcp/issues/669
2025-07-15 15:46:09 -07:00
Pavel Feldman
be8adb1866 chore: migrate to locator._resolveSelector (#670) 2025-07-15 14:50:33 -07:00
Pavel Feldman
c5a2324aaf chore: mark v0.0.30 (#666) 2025-07-14 10:53:12 -07:00
Pavel Feldman
128474b4aa chore: remove extension code (#667) 2025-07-14 10:52:38 -07:00
Pavel Feldman
7fca8f50f8 chore: roll Playwright to 1.54.1 (#665) 2025-07-14 09:51:14 -07:00
Simon Knott
841bb417d1 chore: update to 1.54.0 (#653)
Closes https://github.com/microsoft/playwright-mcp/issues/535
2025-07-14 09:53:33 +02:00
Pavel Feldman
59f1d67a4e feat(dblclick): add double click (#654)
Fixes https://github.com/microsoft/playwright-mcp/issues/652
2025-07-11 16:45:39 -07:00
おがどら
1600ba6645 docs: Update README about imageResponses option. (#646) 2025-07-09 17:40:22 -07:00
Joah Gerstenberg
127c996e86 docs: add instructions to install in Goose (#580) 2025-07-09 17:39:41 -07:00
Sandor Major
4bd39c07e9 docs: adding installation steps for Gemini CLI (#625)
I just tried it out with Gemini CLI and it works like a charm, thanks
for creating this MCP server!
2025-07-09 17:37:29 -07:00
Max Schmitt
f5b68dc590 devops(docker): enhance Docker image publishing with ORAS end-of-life tagging (#641)
This tags the images we publish as EOL immediately in order to get
excluded from the image scanning. Like we do upstream in
microsoft/playwright.
2025-07-07 23:08:12 +02:00
Mehul Raheja
875bd3b6ec fix(docs): Fix typo of windsurf in readme (#620) 2025-07-02 09:54:36 +02:00
Yury Semikhatsky
137b74750c chore(extension): wrap CDP protocol (#604) 2025-06-26 16:21:59 -07:00
Yury Semikhatsky
ded00dc422 chore(extension): convert to typescript (#603) 2025-06-26 13:52:08 -07:00
Yury Semikhatsky
5df6c2431b chore(extension): support reconnect, implement relay-extension protocol (#602) 2025-06-26 11:12:23 -07:00
Simon Knott
9066988098 chore: improve "ref not found" error message (#561)
Helps the model better understand the error cause.
2025-06-17 14:09:29 +02:00
jito(지토)
1dc4977ff9 docs: add Claude Code installation instructions (#553)
Add installation instructions for Claude Code CLI to the README.
2025-06-16 13:35:46 +02:00
Yury Semikhatsky
96e234012d chore(extension): start relay before creating MCP server (#548)
* The HTTPS server is launched and the relay server is created before the
MCP server. This way we can pass the CDP endpoint to its constructor.
* The MCP HTTP transport is added to the precreated HTTP server.
* A bunch of renames to fix style issues.
2025-06-13 16:13:40 -07:00
Max Schmitt
6c3f3b6576 feat: add MCP Chrome extension (#325)
Instructions:

1. `git clone https://github.com/mxschmitt/playwright-mcp && git
checkout extension-drafft`
2. `npm ci && npm run build`
3. `chrome://extensions` in your normal Chrome, "load unpacked" and
select the extension folder.
4. `node cli.js --port=4242 --extension` - The URL it prints at the end
you can put into the extension popup.
5. Put either this into Claude Desktop (it does not support SSE yet, hence
the wrapping), or just put the URL into Cursor/VSCode:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "bash",
      "args": [
        "-c",
        "source $HOME/.nvm/nvm.sh && nvm use --silent 22 && npx supergateway --streamableHttp http://127.0.0.1:4242/mcp"
      ]
    }
  }
}
```

Things like `Take a snapshot of my browser.` should now work in your
Prompt Chat.

----

- SSE only for now, since we already have an HTTP server with a port
there
- Upstream "page tests" can be executed over this CDP relay via
https://github.com/microsoft/playwright/pull/36286
- Limitations for now: everything that happens outside of the tab the
session is shared with, e.g. `window.open` / `target=_blank`.

---------

Co-authored-by: Yury Semikhatsky <yurys@chromium.org>
2025-06-13 13:15:17 -07:00
Dmitry Gozman
0df6d7a441 chore: roll playwright to Jun 10th, v1.53 (#542)
Co-authored-by: Simon Knott <simonknott@microsoft.com>
2025-06-11 15:53:14 +01:00
Dmitry Gozman
4ea7041ba9 chore: mark v0.0.29 (#541) 2025-06-11 12:00:52 +01:00
Dan O'Brien
7dae68de78 docs: add instructions for MCP server in Qodo Gen (#530) 2025-06-08 10:38:24 -07:00
Peter Goldstein
60495ed9b0 docs: include Cursor One-Click in README.md (#531) 2025-06-08 10:37:48 -07:00
cranemont
0aaef661b1 docs(readme): fix connection method call in programmatic usage example (#532) 2025-06-08 10:36:27 -07:00
Max Schmitt
abbe7858a2 test: add PWMCP_DEBUG env switch (#523) 2025-06-05 10:40:03 -07:00
Simon Knott
767af21e02 chore: fix Connection type (#517)
The external `Connection` type regressed in
https://github.com/microsoft/playwright-mcp/pull/490/files#diff-a6be0583428e46844273df76939f02077073da3075716fc57d291a5f2463eaf5,
where the `connect()` function was removed but not from the types. I've
changed the code so we import from there, similar to how we do it for
`config.d.ts`, so this shouldn't happen again.
2025-06-05 08:47:04 +02:00
Pavel Feldman
27c498e0e7 chore: rename browser agent to server (#521) 2025-06-04 16:43:11 -07:00
Pavel Feldman
0fb9646c4d chore: experimental agent mode (#516) 2025-06-04 09:14:50 -07:00
Simon Knott
9728527900 chore: typo (#513) 2025-06-03 11:10:47 -07:00
Pavel Feldman
675b083db3 chore: mark v0.0.28 (#503) 2025-06-01 14:30:42 -07:00
Pavel Feldman
0b74cdaaf8 chore: sort out signal handling (#506) 2025-06-01 14:11:42 -07:00
Pavel Feldman
f31ef598bc test: verify the log in close/navigate test (#505) 2025-06-01 12:49:30 -07:00
Pavel Feldman
656779531c chore: respect server settings from config (#502) 2025-05-30 18:17:51 -07:00
Pavel Feldman
eec177d3ac chore: reuse browser in server mode (#495) 2025-05-30 15:15:37 -07:00
Pavel Feldman
54ed7c3200 chore: refactor server, prepare for browser reuse (#490) 2025-05-28 16:55:47 -07:00
nabepa
3cd74a824a docs: fixed typo in README.md (#487) 2025-05-27 20:33:36 -07:00
Pavel Feldman
177b008328 chore: mark v0.0.27 (#470) 2025-05-27 16:47:54 -07:00
Pavel Feldman
9429463951 chore: roll Playwright to 5/27 (#485) 2025-05-27 16:47:22 -07:00
Simon Knott
45f493da6c chore: make library test run under older Node versions (#479) 2025-05-27 13:19:25 -07:00
Pavel Feldman
9e5ffd2ccf fix(cursor): allow enforcing images for cursor --image-responses=allow (#478)
Fixes https://github.com/microsoft/playwright-mcp/issues/449
2025-05-27 10:25:09 +02:00
Simon Knott
1051ea810a fix: import from cjs (#476)
Closes https://github.com/microsoft/playwright-mcp/issues/456
2025-05-26 14:18:03 -07:00
Pavel Feldman
f20ae22ec6 chore: roll Playwright, remove localOutputDir (#471) 2025-05-24 11:44:57 -07:00
Simon Knott
13cd1b4bd9 fix: respect browserName in config (#461)
Resolves https://github.com/microsoft/playwright-mcp/issues/458
2025-05-23 15:13:34 -07:00
Pavel Feldman
c318f13895 chore: mark v0.0.26 (#441) 2025-05-17 08:20:37 -07:00
Pavel Feldman
1318e39fac chore: fix operation over cdp (#440)
Ref https://github.com/microsoft/playwright-mcp/issues/439
2025-05-17 08:20:22 -07:00
Pavel Feldman
c2b7fb29de chore: start trace server (#427) 2025-05-14 20:15:09 -07:00
Pavel Feldman
aa6ac51f92 feat(trace): allow saving trajectory as trace (#426) 2025-05-14 18:08:44 -07:00
Pavel Feldman
fea50e6840 chore: introduce resolved config (#425) 2025-05-14 16:01:08 -07:00
Pavel Feldman
746c9fc124 chore: mark v0.0.25 (#414) 2025-05-13 16:24:04 -07:00
Pavel Feldman
ee33097abe chore: normalize --no- options (#413) 2025-05-13 16:17:45 -07:00
Pavel Feldman
ab20175826 chore: generate readme options (#411) 2025-05-13 15:52:30 -07:00
Pavel Feldman
c506027aec chore: run w/ sandbox by default (#412) 2025-05-13 15:30:02 -07:00
Pavel Feldman
7be0c8872e feat(args): allow configuring proxy, UA, viewport, https errors (#410) 2025-05-13 14:40:03 -07:00
Pavel Feldman
ce72367208 feat(storage): allow passing storage state for isolated contexts (#409)
Fixes https://github.com/microsoft/playwright-mcp/issues/403
Ref https://github.com/microsoft/playwright-mcp/issues/367
2025-05-13 13:14:04 -07:00
Pavel Feldman
949f956378 feat(ephemeral): allow for non-persistent context operation (#405)
Ref: https://github.com/microsoft/playwright-mcp/issues/367
Ref: https://github.com/microsoft/playwright-mcp/issues/393
2025-05-12 18:18:53 -07:00
Pavel Feldman
a1eee8351e chore: collapse readme (#404) 2025-05-12 16:42:47 -07:00
Pavel Feldman
fea3f26e85 chore: mark v0.0.24 (#401) 2025-05-12 09:40:59 -07:00
Pavel Feldman
dd5b41f1d8 chore: account for undefined arguments (#400) 2025-05-12 09:35:33 -07:00
Pavel Feldman
05dc5d915b chore: mark v0.0.23 (#399) 2025-05-12 09:13:48 -07:00
Taiga Mikami
65a229c79f Fix import in README from createServer to createConnection (#396)
Probably, `createServer` is not from `@playwright/mcp`.
2025-05-12 08:46:21 -07:00
Max Schmitt
84664d4b09 test: unflake 'should throw connection error and allow re-connecting' (#398)
Fixes
https://github.com/microsoft/playwright-mcp/actions/runs/14940263450/job/41976152764#step:8:315
2025-05-12 09:45:09 +02:00
Pavel Feldman
445170a76b chore: roll playwright 5/9 (#394) 2025-05-09 18:01:17 -07:00
Pavel Feldman
c28b480b51 feat(wait): allow waiting for given text (#390)
Fixes https://github.com/microsoft/playwright-mcp/issues/389
2025-05-09 15:35:28 -07:00
Max Schmitt
65716b60dd fix: createConnection() via public API (#384)
Fixes https://github.com/microsoft/playwright-mcp/issues/382
2025-05-09 21:50:38 +02:00
Max Schmitt
75f74a54bc docs: reference to new Docker image (#380) 2025-05-09 21:01:10 +02:00
Max Schmitt
ef41c626ef chore: unset skipLibCheck in tsconfig.json (#386)
Follow-up for
https://github.com/microsoft/playwright-mcp/pull/385#discussion_r2081541865.

> `skipLibCheck`: Skip type checking all .d.ts files.
2025-05-09 14:35:09 +02:00
Max Schmitt
95ca08fdb7 fix: use of wrong launchOptions type in public API (#385) 2025-05-09 14:16:04 +02:00
Max Schmitt
053c2f3d32 test: fix SSE MCP SDK imports (#383) 2025-05-09 14:08:19 +02:00
Pavel Feldman
57b3c14276 chore: only reset network log upon explicit navigation (#377)
Fixes https://github.com/microsoft/playwright-mcp/issues/376
2025-05-08 17:02:09 -07:00
おがどら
85c85bd2fb chore: support custom filename in screenshot function (#349) 2025-05-08 11:04:18 -07:00
Max Schmitt
09ba7989c3 test: run tests on MCP server inside Docker (#361)
https://github.com/microsoft/playwright-mcp/issues/346
2025-05-07 18:04:20 +02:00
Max Schmitt
a115c31953 chore: rename console to consoleMessages (#372)
Motivation: `console` is a global object in Node.js and having a method
like that confuses intellisense.
2025-05-07 16:40:08 +02:00
Max Schmitt
b5be37e5e7 chore: mark v0.0.22 (#370) 2025-05-07 12:49:11 +02:00
Simon Knott
c2255246a3 fix: don't error on navigating to a download link (#328) 2025-05-07 12:47:45 +02:00
Max Schmitt
950d0d1d34 devops: fix Docker publishing (#369) 2025-05-07 11:46:33 +02:00
Max Schmitt
cdeba454b5 chore: mark v0.0.21 (#364) 2025-05-07 11:30:11 +02:00
Max Schmitt
91ae93c167 chore: change import assert to readFile (#368) 2025-05-07 11:30:01 +02:00
Max Schmitt
35e6c49d7c devops: publish Docker image to :latest as well (#365)
We don't do that for normal Playwright because we expect the user to
mount/add/copy their own Playwright folder and there the version has to
match. In this case publishing to `:latest` seems fine since it's an
isolated product.
2025-05-07 11:14:05 +02:00
Pavel Feldman
e95b5b1dd6 chore: get rid of connection factory (#362)
Drive-by User-Agent sniffing and disabling of image type in Cursor.
2025-05-06 14:27:28 -07:00
Max Schmitt
23a2e5fee7 devops: add Docker publishing (#356) 2025-05-06 23:14:41 +02:00
Pavel Feldman
d01aa19ffa chore: annotate tools (#351)
Fixes https://github.com/microsoft/playwright-mcp/issues/215
2025-05-05 17:38:22 -07:00
kanchi
8cd7d5a753 chore(docker): optimize Dockerfile by excluding unnecessary files and using non-root user (#273) 2025-05-05 14:38:02 -07:00
Ross Wollman
42faa3ccf8 feat: add --(allowed|blocked)-origins (#319)
Useful to limit the agent when using the playwright-mcp server with an
agent in auto-invocation mode.

Not intended to be a security feature.
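
A rough sketch of how such an origin filter could behave. The function name and the precedence rule (blocklist checked first, empty allowlist meaning "allow everything") are assumptions made for illustration, not taken from this commit.

```typescript
// Hypothetical policy check: the blocklist wins over the allowlist, and an
// empty allowlist is treated as "allow all". Both rules are assumptions.
function isOriginAllowed(url: string, allowed: string[], blocked: string[]): boolean {
  const origin = new URL(url).origin;
  if (blocked.includes(origin))
    return false;
  return allowed.length === 0 || allowed.includes(origin);
}
```

For example, with `allowed = ['https://example.com']`, a request to `https://other.test/` would be rejected under these assumed rules.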
2025-05-05 11:28:14 -07:00
Pavel Feldman
4694d60fc5 fix(config): allow specifying user data dir in config (#342)
Fixes https://github.com/microsoft/playwright-mcp/issues/340
2025-05-05 08:23:24 -07:00
Max Schmitt
7dc689eee7 fix: installation tool on Windows (#345) 2025-05-04 06:56:59 -07:00
おがどら
5df011ad4b feat(cli): set outputDir via cli options (#338) 2025-05-03 20:11:17 -07:00
Pavel Feldman
200cf737bb chore: use import.meta.resolve to lookup Playwright (#337) 2025-05-03 14:38:58 -07:00
Pavel Feldman
d8a59e0d0d chore: mark v0.0.20 (#336) 2025-05-02 21:31:06 -07:00
Pavel Feldman
21533d9000 chore: installation test added (#335) 2025-05-02 21:30:55 -07:00
Ryosuke Iwanaga
49979641fa fix: require is not defined (#334)
Since it's moved to ESM, `require` isn't defined.
This hotfix just recreates `require` to work around this issue.
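
For reference, recreating `require` in an ES module is usually done with Node's `createRequire`. The sketch below is an illustration under that assumption, not the code from this commit; `makeRequire` and `localRequire` are hypothetical names.

```typescript
import { createRequire } from 'node:module';
import { pathToFileURL } from 'node:url';

// Hypothetical helper: returns a CommonJS-style `require` function that
// resolves modules relative to `from` (a file URL or an absolute path).
function makeRequire(from: string | URL) {
  return createRequire(from);
}

// Inside an ES module, `import.meta.url` is the usual anchor. A file URL built
// from the working directory is used here so the sketch also runs as a script.
const localRequire = makeRequire(pathToFileURL(process.cwd() + '/'));
const path = localRequire('node:path');
```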
2025-05-02 21:19:54 -07:00
Pavel Feldman
43aa4001b5 chore: mark v0.0.19 (#332) 2025-05-02 18:38:20 -07:00
Pavel Feldman
7e087af6a6 chore: slightly adjust gen test prompt (#333) 2025-05-02 18:38:06 -07:00
Pavel Feldman
927a1280f1 chore: allow generating tests for script (#331) 2025-05-02 17:41:58 -07:00
Pavel Feldman
292e75d464 chore: roll Playwright to remove empty generic nodes (#330) 2025-05-02 16:10:48 -07:00
Simon Knott
2c9376e50f chore: don't sanitize file extension away (#327) 2025-05-02 10:58:48 -07:00
Max Schmitt
062cdd0704 fix: sticky launch errors (#324)
This fixes an issue with sticky launch errors. When the
[following code
path](a15f0f301b/src/context.ts (L307-L339))
threw, the error was stored in the Promise and not cleared
afterwards. This meant:

- If a browser was not installed and the user tried to install it via
`browser_install`, it never worked because the error was sticky.
- If other errors appeared (e.g. CDP not being available yet), a
re-connect would not work and the MCP server would require a restart.

Test plan: Since we don't have any `browser_install` tests I added a CDP
test for now to cover this bug.
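
The general pattern behind such a fix can be sketched as follows. This is a simplified illustration of clearing a cached promise on failure, not the code in `src/context.ts`; `RetryableLazy` is a hypothetical name.

```typescript
// Hypothetical helper: caches the result of an async factory, but forgets a
// failed attempt so that the next call retries instead of rethrowing.
class RetryableLazy<T> {
  private _factory: () => Promise<T>;
  private _promise: Promise<T> | undefined;

  constructor(factory: () => Promise<T>) {
    this._factory = factory;
  }

  get(): Promise<T> {
    if (this._promise)
      return this._promise;
    const attempt = this._factory();
    this._promise = attempt;
    // On failure, clear the cache, but only if it still holds this attempt.
    attempt.catch(() => {
      if (this._promise === attempt)
        this._promise = undefined;
    });
    return attempt;
  }
}
```

With this shape, a failed launch is reported to the caller once, and a later call (for example after the browser has been installed) starts a fresh attempt.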
2025-05-02 15:32:37 +02:00
Max Schmitt
a713300c5b test: use TestOptions type in config (#326) 2025-05-02 13:50:03 +02:00
Simon Knott
a15f0f301b chore: save downloads to outputDir (#310) 2025-05-02 10:57:31 +02:00
Pavel Feldman
23ce973377 lint: ban console output (#317) 2025-04-30 14:15:32 -07:00
Max Schmitt
685dea9e19 chore: migrate to ESM (#303)
- [Why do I need `.js`
extension?](https://stackoverflow.com/a/77150985/6512681)
- [Why setting `rootDir` in the
`tsconfig.json`?](https://stackoverflow.com/a/58941798/6512681)
- [How to ensure that we add the `.js` extension via
ESLint](https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/extensions.md#importextensions)

Fixes https://github.com/microsoft/playwright-mcp/issues/302
2025-04-30 23:06:56 +02:00
Pavel Feldman
878be97668 chore: mark v0.0.18 (#315) 2025-04-30 13:07:55 -07:00
Pavel Feldman
6d6b1a384b chore: fix merge config (#311) 2025-04-30 08:41:19 -07:00
Pavel Feldman
fd22def4c5 chore: fix test harness, close the client (#312) 2025-04-30 08:07:54 -07:00
Simon Knott
1b60870f50 chore: bump to 0.0.17 (#306) 2025-04-30 12:30:03 +02:00
Simon Knott
1c760b3826 fix: default to headful (#305)
See https://github.com/microsoft/playwright-mcp/issues/304

Regressed in
69703cc882.
2025-04-30 12:23:30 +02:00
Pavel Feldman
9efaea6a1c chore: mark v0.0.16 (#298) 2025-04-29 19:51:57 -07:00
Pavel Feldman
3f72fe53ec chore: add support for device (#300)
Fixes https://github.com/microsoft/playwright-mcp/issues/294
2025-04-29 19:51:00 -07:00
Pavel Feldman
40d125f0bb docs: document configuration file (#299) 2025-04-29 15:29:56 -07:00
Pavel Feldman
21d2f80fef chore: store channel profiles separately (#297) 2025-04-29 13:34:56 -07:00
Simon Knott
6efdc90078 fix: show custom error for modal state (#240)
Calling a tool that resolves modal state, when there's no such modal
state visible, currently shows this misleading message:

```md
Tool "browser_file_upload" does not handle the modal state.
### Modal state
```

Instead, we should show the error message from the tool implementation.
2025-04-29 18:48:52 +02:00
zwmmm
ad4147da54 docs: Fix the default path to User data directory (#290)
2025-04-29 08:53:30 -07:00
Pavel Feldman
69703cc882 chore: follow up to exposing playwright config options (#289) 2025-04-29 08:53:03 -07:00
Max Schmitt
4147e21a3a chore: fix update-readme TS linting (#296) 2025-04-29 16:12:17 +02:00
Pavel Feldman
80c9b93b72 chore: allow configuring raw Playwright options (#287)
Fixes: https://github.com/microsoft/playwright-mcp/issues/272
2025-04-28 20:17:16 -07:00
Pavel Feldman
12e72a96c4 chore: allow configuring screenshot tool (#286)
Fixes: https://github.com/microsoft/playwright-mcp/issues/277
2025-04-28 17:21:23 -07:00
Pavel Feldman
697a69a8c2 chore: allow specifying output dir (#285)
Ref: https://github.com/microsoft/playwright-mcp/issues/279
2025-04-28 16:35:33 -07:00
Pavel Feldman
6e76d5e550 chore: split context.ts into files (#284) 2025-04-28 16:14:16 -07:00
Pavel Feldman
26779ceb20 chore: allow passing config file (#281) 2025-04-28 15:04:59 -07:00
Pavel Feldman
23704ace1f chore: update docs on lint (#283) 2025-04-28 14:56:00 -07:00
Pavel Feldman
b02370df2f chore: roll playwright to latest (#269) 2025-04-28 13:44:24 -07:00
Simon Knott
bf7dbabca4 feat: support streamable http transport (#243)
Adds support for the new StreamableHttp transport. I'm not aware of any
clients that implement it, but somebody's gotta make the start! Once
some clients support it, we can also advertise it in the README.
2025-04-28 11:11:31 +02:00
Zheng Xi Zhou
7256ee3701 docs(readme): Fix syntax error and improve formatting (#263)
The commit fixes a syntax error in the `npx` command by removing
an extra backtick. It also improves the formatting by adding line
breaks before code blocks to enhance readability.
2025-04-24 10:30:35 +02:00
Zheng Xi Zhou
0ed0bcd914 feat(server): add host option to SSE server configuration (#261) 2025-04-23 23:04:00 -07:00
Zheng Xi Zhou
4d95761f66 chore(gitignore): Add .idea and .DS_Store to .gitignore (#262) 2025-04-23 22:05:06 -07:00
Max Schmitt
b9dc323734 chore: enable @typescript-eslint/no-floating-promises rule (#260) 2025-04-23 16:03:30 +02:00
Pavel Feldman
586492a3f0 chore: mark v0.0.15 (#250) 2025-04-22 16:17:36 -07:00
Pavel Feldman
f7e9bae571 chore: roll playwright to 1745357020000 (#249) 2025-04-22 16:04:50 -07:00
Pavel Feldman
1bc3c761de feat(network): implement listing network requests (#247)
Fixes: https://github.com/microsoft/playwright-mcp/issues/242
2025-04-22 16:04:25 -07:00
Simon Knott
c80f7cf222 chore: infer tool params (#241)
Moves the `schema.parse` call to the calling side of the handler, so we
don't have to duplicate it everywhere.
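
The idea can be sketched like this. It is a dependency-free illustration: `Schema`, `Tool`, `navigate`, and `callTool` are hypothetical stand-ins rather than the repository's actual types, and the hand-written `parse` plays the role of a schema library.

```typescript
// Minimal stand-in for a schema library; only `parse` is needed here.
interface Schema<T> {
  parse(input: unknown): T;
}

interface Tool<T> {
  name: string;
  schema: Schema<T>;
  // The handler receives already-validated, typed params.
  handle(params: T): string;
}

const navigateSchema: Schema<{ url: string }> = {
  parse(input) {
    const url = (input as { url?: unknown } | null)?.url;
    if (typeof url !== 'string')
      throw new Error('url must be a string');
    return { url };
  },
};

const navigate: Tool<{ url: string }> = {
  name: 'navigate',
  schema: navigateSchema,
  handle: params => `Navigated to ${params.url}`,
};

// The caller validates once, so individual handlers do not repeat the parse.
function callTool<T>(tool: Tool<T>, rawArgs: unknown): string {
  return tool.handle(tool.schema.parse(rawArgs));
}
```

Because `callTool` is generic over the tool's schema type, each handler's parameter type is inferred from its schema rather than cast by hand.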
2025-04-22 13:24:38 +02:00
Pavel Feldman
9578a5b2af chore: mark v0.0.14 (#237) 2025-04-21 17:52:35 -07:00
Pavel Feldman
cd5aa344f1 docs: push docker doc down the readme (#236) 2025-04-21 17:31:18 -07:00
Cody Rigney
dc955c73a3 Add Docker support (#220) 2025-04-21 17:26:50 -07:00
Rui Figueira
d4f8f87b03 docs: fix "programmatic usage with custom transports" code snippet (#235)
Fixes: #230
2025-04-21 15:09:58 -07:00
Max Schmitt
0c3792d231 chore: auto update tools in README (#219)
Motivation: Keeping the readme up to date is a manual effort - this
keeps it automatically up to date and prevents things like
https://github.com/microsoft/playwright-mcp/pull/214 and other
consistency errors in the future.
2025-04-21 20:22:57 +02:00
Pavel Feldman
7695717546 docs: provide missing docs (#214) 2025-04-17 14:49:22 -07:00
Pavel Feldman
6a070a0dd8 chore: restore page-side timeout (#213) 2025-04-17 14:25:27 -07:00
Pavel Feldman
6481100bdf feat(dialog): handle dialogs (#212) 2025-04-17 14:03:13 -07:00
Pavel Feldman
4b261286bf chore: test list tabs (#208) 2025-04-17 09:58:02 +02:00
Pavel Feldman
7e4a964b0a chore: flatten tool calling, prep for timeout handling (#205) 2025-04-16 19:36:48 -07:00
Pavel Feldman
cea347d067 chore: introduce modal states (#204) 2025-04-16 15:21:45 -07:00
Pavel Feldman
6054290d9a chore: follow up to the element screenshot change (#199) 2025-04-16 12:53:27 -07:00
Andrei-Daniel Barzu
6d4adfe5c6 feat: add element screenshot action for snapshots (#182) 2025-04-16 10:28:44 -07:00
Simon Knott
e7c7709b33 chore: include "playwright" keyword, add examples (#196) 2025-04-16 08:18:40 -07:00
Pavel Feldman
5c2e11017d chore: convert console resource to tool (#193) 2025-04-15 18:01:59 -07:00
Pavel Feldman
e4331313f9 chore: update exported types (#192)
Fixes https://github.com/microsoft/playwright-mcp/issues/186
2025-04-15 16:39:52 -07:00
Pavel Feldman
bc48600a49 chore: mark v0.0.13 (#190) 2025-04-15 15:27:29 -07:00
Yury Semikhatsky
0d6bb2f547 devops: add bots for other browsers/platforms (#174) 2025-04-15 13:16:56 -07:00
Pavel Feldman
795a9d578a chore: generalize status & action as code (#188) 2025-04-15 12:54:45 -07:00
Simon Knott
4a19e18999 feat: respond with action and generated locator (#181)
Closes https://github.com/microsoft/playwright-mcp/issues/163
2025-04-15 10:55:20 -07:00
Simon Knott
4d59e06184 test: fix flaky test (#180)
Closes https://github.com/microsoft/playwright-mcp/issues/177

`ResizeObserver` isn't instant!
2025-04-15 16:10:49 +02:00
Pavel Feldman
6891a525b3 chore: add npx install step to the publish workflow (#178) 2025-04-14 20:09:38 -07:00
Yury Semikhatsky
0f7fd1362f chore: mark 0.0.12 (#176) 2025-04-14 19:39:10 -07:00
Yury Semikhatsky
de08c24b96 fix: consider DISPLAY only on linux (#175) 2025-04-14 19:07:39 -07:00
Pavel Feldman
71e51ea42a chore: mark v0.0.11 (#173) 2025-04-14 16:48:36 -07:00
Pavel Feldman
0c5a104e0f chore: default to headless when DISPLAY is missing (#172)
Fixes https://github.com/microsoft/playwright-mcp/issues/165
2025-04-14 16:47:32 -07:00
Pavel Feldman
606b898a71 chore: allow reusing tab over cdp (#170)
Fixes https://github.com/microsoft/playwright-mcp/issues/164
2025-04-14 16:39:58 -07:00
Simon Knott
e729494bd9 feat: browser_resize (#92) 2025-04-14 16:09:48 -07:00
Cameron
77080e8ca4 Restore package-lock.json module hashes (#151)
- Adds integrity hashes that were missing for 5 npm packages in
`package-lock.json`

<hr />

Is there a reason hashes for some of these dependencies are missing from
`package-lock.json`?

Right now these omissions prevent me from packaging a nix derivation for
this mcp server directly off this repo
(https://github.com/cameronfyfe/nix-mcp-servers/blob/main/pkgs/servers/mcp-server-playwright/default.nix#L17)
and I was wondering if they might just be missing due to a bad merge or
something odd like that at some point. `npm install` under normal use
doesn't seem to care if the hashes are missing and installs the packages
anyway, but it's a blocker for hermetic build systems like nix.

If this is intentional for some reason I'm not familiar with feel free
to ignore and close.
2025-04-10 15:24:36 +02:00
Simon Knott
31ac1ed191 fix: exit watchdog should listen for SIGINT/SIGTERM (#144) 2025-04-07 14:51:57 -07:00
Paul Irish
b8ff009b0a chore: add back stable vscode install button (#145) 2025-04-07 14:18:01 -07:00
Yoshiki Nakagawa
42167878fb chore: Update README.md (#140)
Remove old documents
2025-04-07 08:54:05 +02:00
Pavel Feldman
6b15c7e422 chore: mark v0.0.10 (#138) 2025-04-05 19:14:50 -07:00
Pavel Feldman
abd56f514b chore: introduce capabilities argument (#135) 2025-04-04 17:14:30 -07:00
Pavel Feldman
707ebbf4d4 chore: group tools, prepare for capabilities (#134) 2025-04-04 15:22:00 -07:00
Pavel Feldman
fc0cccf4a5 chore: reuse the first tab when navigating (#131) 2025-04-03 22:39:55 -07:00
Pavel Feldman
e36d4ea695 chore: allow multiple tabs (#129) 2025-04-03 19:24:17 -07:00
Pavel Feldman
b358e47d71 chore: prep for multiple pages in context (#124) 2025-04-03 10:30:05 -07:00
Yury Semikhatsky
38f038a5dc chore: typo in description (#127) 2025-04-02 17:26:45 -07:00
Yury Semikhatsky
2291011dc7 feat: add slowly option for typing one character at a time (#121) 2025-04-02 14:36:30 -07:00
Pavel Feldman
89627fd23a chore: extract page snapshot, prep for multipage (#120) 2025-04-02 11:42:39 -07:00
Pavel Feldman
23f392dd91 chore: mark v0.0.9 (#114) 2025-04-01 15:45:00 -07:00
Max Schmitt
128e75b9f4 devops: fix npm publishing due to proverance (#112)
Like
[upstream](3ad5c2731a/.github/workflows/publish_release_npm.yml (L15))
and in the
[docs](https://docs.npmjs.com/generating-provenance-statements#example-github-actions-workflow).
2025-04-02 00:37:13 +02:00
Pavel Feldman
2366dbf36c chore: mark v0.0.8 (#111) 2025-04-01 15:16:28 -07:00
Pavel Feldman
0de7c0d38c chore: follow up with iframe stitch (#110) 2025-04-01 15:10:23 -07:00
Simon Knott
0a5518b252 chore: stitch together iframes into one tree (#71) 2025-04-01 14:47:53 -07:00
Pavel Feldman
4f16786432 chore: merge browser and channel settings (#100) 2025-04-01 10:26:48 -07:00
Pavel Feldman
9042c03faa chore: support channel and executable path params (#90)
Fixes https://github.com/microsoft/playwright-mcp/issues/89
2025-03-31 15:30:08 -07:00
Pavel Feldman
d316441142 chore: sanitize file path when saving (#99)
Fixes https://github.com/microsoft/playwright-mcp/issues/96
2025-03-31 15:01:58 -07:00
Yoshiki Nakagawa
aeb4cf65e9 Fixed typo in README.md (#88) 2025-03-31 09:33:38 +01:00
Pavel Feldman
a7392fc266 chore: allow passing cdp endpoint (#86)
Fixes https://github.com/microsoft/playwright-mcp/issues/84
2025-03-30 09:05:58 -07:00
Max Schmitt
88fbf50841 devops: use --provenance when publishing to NPM (#83)
Similar to how we do it upstream:
e2c8163b14/utils/publish_all_packages.sh (L97)

Reference: https://docs.npmjs.com/generating-provenance-statements
2025-03-29 19:17:54 +01:00
136 changed files with 14486 additions and 1717 deletions


@@ -7,29 +7,117 @@ on:
branches: [ main ]
jobs:
build-and-test:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Use Node.js 18
- name: Use Node.js 20
uses: actions/setup-node@v4
with:
node-version: '18'
node-version: '20'
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Run linting
- run: npm run build
- name: Run ESLint
run: npm run lint
- name: Ensure no changes
run: git diff --exit-code
test:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-15, windows-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- name: Use Node.js 20
uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Playwright install
run: npx playwright install --with-deps
- name: Install MS Edge
# MS Edge is not preinstalled on macOS runners.
if: ${{ matrix.os == 'macos-15' }}
run: npx playwright install msedge
- name: Build
run: npm run build
- name: Install Playwright browsers
run: npx playwright install --with-deps
- name: Run tests
run: npm test
test_docker:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Use Node.js 20
uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Playwright install
run: npx playwright install --with-deps chromium
- name: Build
run: npm run build
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build and push
uses: docker/build-push-action@v6
with:
tags: playwright-mcp-dev:latest
cache-from: type=gha
cache-to: type=gha,mode=max
load: true
- name: Run tests
shell: bash
run: |
# Used for the Docker tests to share the test-results folder with the container.
umask 0000
npm run test -- --project=chromium-docker
env:
MCP_IN_DOCKER: 1
test_extension:
strategy:
fail-fast: false
runs-on: macos-latest
defaults:
run:
working-directory: ./extension
steps:
- uses: actions/checkout@v4
- name: Use Node.js 20
uses: actions/setup-node@v4
with:
node-version: '20' # crypto.randomUUID(); stalls in v18.20.8
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Build extension
run: npm run build
- name: Upload artifact
uses: actions/upload-artifact@v4
with:
name: extension
path: ./extension/dist
retention-days: 7
- name: Install and build MCP server
run: |
cd ..
npm ci
npm run build
npx playwright install chromium
- name: Run tests
run: |
if [[ "$(uname)" == "Linux" ]]; then
xvfb-run --auto-servernum --server-args="-screen 0 1280x960x24" -- npm run test
else
npm run test
fi
shell: bash


@@ -0,0 +1,44 @@
name: "Copilot Setup Steps"
# Automatically run the setup steps when they are changed to allow for easy validation, and
# allow manual testing through the repository's "Actions" tab
on:
workflow_dispatch:
push:
paths:
- .github/workflows/copilot-setup-steps.yml
pull_request:
paths:
- .github/workflows/copilot-setup-steps.yml
jobs:
# The job MUST be called `copilot-setup-steps` or it will not be picked up by Copilot.
copilot-setup-steps:
runs-on: ubuntu-latest
# Set the permissions to the lowest permissions possible needed for your steps.
# Copilot will be given its own token for its operations.
permissions:
# If you want to clone the repository as part of your setup steps, for example to install dependencies, you'll need the `contents: read` permission. If you don't clone the repository in your setup steps, Copilot will do this for you automatically after the steps complete.
contents: read
# You can define any steps you want, and they will run before the agent starts.
# If you do not check out your code, Copilot will do this for you.
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "18.19"
cache: "npm"
- name: Install JavaScript dependencies
run: npm ci
- name: Playwright install
run: npx playwright install --with-deps
- name: Build
run: npm run build


@@ -5,6 +5,9 @@ on:
jobs:
publish-npm:
runs-on: ubuntu-latest
permissions:
contents: read
id-token: write # Needed for npm provenance
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
@@ -12,9 +15,84 @@ jobs:
node-version: 18
registry-url: https://registry.npmjs.org/
- run: npm ci
- run: npx playwright install --with-deps
- run: npm run build
- run: npm run lint
- run: npm run test
- run: npm publish
- run: npm run ctest
- run: npm publish --provenance
env:
NODE_AUTH_TOKEN: ${{secrets.NPM_TOKEN}}
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
publish-docker:
runs-on: ubuntu-latest
permissions:
contents: read
id-token: write # Needed for OIDC login to Azure
environment: allow-publishing-docker-to-acr
steps:
- uses: actions/checkout@v4
- name: Set up QEMU # Needed for multi-platform builds (e.g., arm64 on amd64 runner)
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx # Needed for multi-platform builds
uses: docker/setup-buildx-action@v3
- name: Azure Login via OIDC
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_DOCKER_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_DOCKER_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_DOCKER_SUBSCRIPTION_ID }}
- name: Login to ACR
run: az acr login --name playwright
- name: Build and push Docker image
id: build-push
uses: docker/build-push-action@v6
with:
context: .
file: ./Dockerfile # Adjust path if your Dockerfile is elsewhere
platforms: linux/amd64,linux/arm64
push: true
tags: |
playwright.azurecr.io/public/playwright/mcp:${{ github.event.release.tag_name }}
playwright.azurecr.io/public/playwright/mcp:latest
- uses: oras-project/setup-oras@v1
- name: Set oras tags
run: |
attach_eol_manifest() {
local image="$1"
local today=$(date -u +'%Y-%m-%d')
# oras is re-using Docker credentials, so we don't need to login.
# Following the advice in https://portal.microsofticm.com/imp/v3/incidents/incident/476783820/summary
oras attach --artifact-type application/vnd.microsoft.artifact.lifecycle --annotation "vnd.microsoft.artifact.lifecycle.end-of-life.date=$today" $image
}
# for each tag, attach the eol manifest
for tag in $(echo ${{ steps.build-push.outputs.metadata['image.name'] }} | tr ',' '\n'); do
attach_eol_manifest $tag
done
package-extension:
runs-on: ubuntu-latest
permissions:
contents: write # Needed to upload release assets
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
cache: 'npm'
- name: Install extension dependencies
working-directory: ./extension
run: npm ci
- name: Build extension
working-directory: ./extension
run: npm run build
- name: Package extension
working-directory: ./extension
run: |
cd dist
zip -r ../playwright-mcp-extension-${{ github.event.release.tag_name }}.zip .
cd ..
- name: Upload extension to release
env:
GITHUB_TOKEN: ${{ github.token }}
run: |
gh release upload ${{github.event.release.tag_name}} ./extension/playwright-mcp-extension-${{ github.event.release.tag_name }}.zip

7
.gitignore vendored

@@ -1,3 +1,10 @@
lib/
dist/
node_modules/
test-results/
playwright-report/
.vscode/mcp.json
.idea
.DS_Store
.env
sessions/


@@ -4,3 +4,4 @@ LICENSE
!lib/**/*.js
!cli.js
!index.*
!config.d.ts

69
Dockerfile Normal file

@@ -0,0 +1,69 @@
ARG PLAYWRIGHT_BROWSERS_PATH=/ms-playwright
# ------------------------------
# Base
# ------------------------------
# Base stage: Contains only the minimal dependencies required for runtime
# (node_modules and Playwright system dependencies)
FROM node:22-bookworm-slim AS base
ARG PLAYWRIGHT_BROWSERS_PATH
ENV PLAYWRIGHT_BROWSERS_PATH=${PLAYWRIGHT_BROWSERS_PATH}
# Set the working directory
WORKDIR /app
RUN --mount=type=cache,target=/root/.npm,sharing=locked,id=npm-cache \
--mount=type=bind,source=package.json,target=package.json \
--mount=type=bind,source=package-lock.json,target=package-lock.json \
npm ci --omit=dev && \
# Install system dependencies for playwright
npx -y playwright-core install-deps chromium
# ------------------------------
# Builder
# ------------------------------
FROM base AS builder
RUN --mount=type=cache,target=/root/.npm,sharing=locked,id=npm-cache \
--mount=type=bind,source=package.json,target=package.json \
--mount=type=bind,source=package-lock.json,target=package-lock.json \
npm ci
# Copy the rest of the app
COPY *.json *.js *.ts .
COPY src src/
# Build the app
RUN npm run build
# ------------------------------
# Browser
# ------------------------------
# Cache optimization:
# - Browser is downloaded only when node_modules or Playwright system dependencies change
# - Cache is reused when only source code changes
FROM base AS browser
RUN npx -y playwright-core install --no-shell chromium
# ------------------------------
# Runtime
# ------------------------------
FROM base
ARG PLAYWRIGHT_BROWSERS_PATH
ARG USERNAME=node
ENV NODE_ENV=production
# Set the correct ownership for the runtime user on production `node_modules`
RUN chown -R ${USERNAME}:${USERNAME} node_modules
USER ${USERNAME}
COPY --from=browser --chown=${USERNAME}:${USERNAME} ${PLAYWRIGHT_BROWSERS_PATH} ${PLAYWRIGHT_BROWSERS_PATH}
COPY --chown=${USERNAME}:${USERNAME} cli.js package.json ./
COPY --from=builder --chown=${USERNAME}:${USERNAME} /app/lib /app/lib
# Run in headless and only with chromium (other browsers need more dependencies not included in this image)
ENTRYPOINT ["node", "cli.js", "--headless", "--browser", "chromium", "--no-sandbox"]

749
README.md

@@ -4,18 +4,24 @@ A Model Context Protocol (MCP) server that provides browser automation capabilit
### Key Features
- **Fast and lightweight**: Uses Playwright's accessibility tree, not pixel-based input.
- **LLM-friendly**: No vision models needed, operates purely on structured data.
- **Deterministic tool application**: Avoids ambiguity common with screenshot-based approaches.
- **Fast and lightweight**. Uses Playwright's accessibility tree, not pixel-based input.
- **LLM-friendly**. No vision models needed, operates purely on structured data.
- **Deterministic tool application**. Avoids ambiguity common with screenshot-based approaches.
### Use Cases
### Requirements
- Node.js 18 or newer
- VS Code, Cursor, Windsurf, Claude Desktop, Goose or any other MCP client
- Web navigation and form-filling
- Data extraction from structured content
- Automated testing driven by LLMs
- General-purpose browser interaction for agents
<!--
// Generate using:
node utils/generate-links.js
-->
### Example config
### Getting started
First, install the Playwright MCP server with your client.
**Standard config** works in most of the tools:
```js
{
@@ -30,51 +36,211 @@ A Model Context Protocol (MCP) server that provides browser automation capabilit
}
```
[<img src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF" alt="Install in VS Code">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540playwright%252Fmcp%2540latest%2522%255D%257D) [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540playwright%252Fmcp%2540latest%2522%255D%257D)
#### Installation in VS Code
Install the Playwright MCP server in VS Code using one of these buttons:
<details>
<summary>Claude Code</summary>
<!--
// Generate using?:
const config = JSON.stringify({ name: 'playwright', command: 'npx', args: ["-y", "@playwright/mcp@latest"] });
const urlForWebsites = `vscode:mcp/install?${encodeURIComponent(config)}`;
// Github markdown does not allow linking to `vscode:` directly, so you can use our redirect:
const urlForGithub = `https://insiders.vscode.dev/redirect?url=${encodeURIComponent(urlForWebsites)}`;
-->
Use the Claude Code CLI to add the Playwright MCP server:
[<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522%2540playwright%252Fmcp%2540latest%2522%255D%257D)
```bash
claude mcp add playwright npx @playwright/mcp@latest
```
</details>
Alternatively, you can install the Playwright MCP server using the VS Code CLI:
<details>
<summary>Claude Desktop</summary>
Follow the MCP install [guide](https://modelcontextprotocol.io/quickstart/user), use the standard config above.
</details>
<details>
<summary>Cursor</summary>
#### Click the button to install:
[![Install MCP Server](https://cursor.com/deeplink/mcp-install-dark.svg)](cursor://anysphere.cursor-deeplink/mcp/install?name=Playwright&config=eyJjb21tYW5kIjoibnB4IEBwbGF5d3JpZ2h0L21jcEBsYXRlc3QifQ%3D%3D)
#### Or install manually:
Go to `Cursor Settings` -> `MCP` -> `Add new MCP Server`. Name to your liking, use `command` type with the command `npx @playwright/mcp`. You can also verify the config or add command-line arguments by clicking `Edit`.
</details>
<details>
<summary>Gemini CLI</summary>
Follow the MCP install [guide](https://github.com/google-gemini/gemini-cli/blob/main/docs/tools/mcp-server.md#configure-the-mcp-server-in-settingsjson), use the standard config above.
</details>
<details>
<summary>Goose</summary>
#### Click the button to install:
[![Install in Goose](https://block.github.io/goose/img/extension-install-dark.svg)](https://block.github.io/goose/extension?cmd=npx&arg=%40playwright%2Fmcp%40latest&id=playwright&name=Playwright&description=Interact%20with%20web%20pages%20through%20structured%20accessibility%20snapshots%20using%20Playwright)
#### Or install manually:
Go to `Advanced settings` -> `Extensions` -> `Add custom extension`. Name to your liking, use type `STDIO`, and set the `command` to `npx @playwright/mcp`. Click "Add Extension".
</details>
<details>
<summary>LM Studio</summary>
#### Click the button to install:
[![Add MCP Server playwright to LM Studio](https://files.lmstudio.ai/deeplink/mcp-install-light.svg)](https://lmstudio.ai/install-mcp?name=playwright&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyJAcGxheXdyaWdodC9tY3BAbGF0ZXN0Il19)
#### Or install manually:
Go to `Program` in the right sidebar -> `Install` -> `Edit mcp.json`. Use the standard config above.
</details>
<details>
<summary>opencode</summary>
Follow the MCP Servers [documentation](https://opencode.ai/docs/mcp-servers/). For example in `~/.config/opencode/opencode.json`:
```json
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"playwright": {
"type": "local",
"command": [
"npx",
"@playwright/mcp@latest"
],
"enabled": true
}
}
}
```
</details>
<details>
<summary>Qodo Gen</summary>
Open [Qodo Gen](https://docs.qodo.ai/qodo-documentation/qodo-gen) chat panel in VSCode or IntelliJ → Connect more tools → + Add new MCP → Paste the standard config above.
Click <code>Save</code>.
</details>
<details>
<summary>VS Code</summary>
#### Click the button to install:
[<img src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF" alt="Install in VS Code">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540playwright%252Fmcp%2540latest%2522%255D%257D) [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540playwright%252Fmcp%2540latest%2522%255D%257D)
#### Or install manually:
Follow the MCP install [guide](https://code.visualstudio.com/docs/copilot/chat/mcp-servers#_add-an-mcp-server), use the standard config above. You can also install the Playwright MCP server using the VS Code CLI:
```bash
# For VS Code
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
```
```bash
# For VS Code Insiders
code-insiders --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
```
After installation, the Playwright MCP server will be available for use with your GitHub Copilot agent in VS Code.
</details>
### User data directory
<details>
<summary>Windsurf</summary>
Playwright MCP will launch Chrome browser with the new profile, located at
Follow Windsurf MCP [documentation](https://docs.windsurf.com/windsurf/cascade/mcp). Use the standard config above.
</details>
### Configuration
The Playwright MCP server supports the following arguments. They can be provided in the JSON configuration above, as part of the `"args"` list:
<!--- Options generated by update-readme.js -->
```
- `%USERPROFILE%\AppData\Local\ms-playwright\mcp-chrome-profile` on Windows
- `~/Library/Caches/ms-playwright/mcp-chrome-profile` on macOS
- `~/.cache/ms-playwright/mcp-chrome-profile` on Linux
> npx @playwright/mcp@latest --help
--allowed-origins <origins> semicolon-separated list of origins to allow the
browser to request. Default is to allow all.
--blocked-origins <origins> semicolon-separated list of origins to block the
browser from requesting. Blocklist is evaluated
before allowlist. If used without the allowlist,
requests not matching the blocklist are still
allowed.
--block-service-workers block service workers
--browser <browser> browser or chrome channel to use, possible
values: chrome, firefox, webkit, msedge.
--caps <caps> comma-separated list of additional capabilities
to enable, possible values: vision, pdf.
--cdp-endpoint <endpoint> CDP endpoint to connect to.
--config <path> path to the configuration file.
--device <device> device to emulate, for example: "iPhone 15"
--executable-path <path> path to the browser executable.
--extension Connect to a running browser instance
(Edge/Chrome only). Requires the "Playwright MCP
Bridge" browser extension to be installed.
--headless run browser in headless mode, headed by default
--host <host> host to bind server to. Default is localhost. Use
0.0.0.0 to bind to all interfaces.
--ignore-https-errors ignore https errors
--isolated keep the browser profile in memory, do not save
it to disk.
--image-responses <mode> whether to send image responses to the client.
Can be "allow" or "omit", Defaults to "allow".
--no-sandbox disable the sandbox for all process types that
are normally sandboxed.
--output-dir <path> path to the directory for output files.
--port <port> port to listen on for SSE transport.
--proxy-bypass <bypass> comma-separated domains to bypass proxy, for
example ".com,chromium.org,.domain.com"
--proxy-server <proxy> specify proxy server, for example
"http://myproxy:3128" or "socks5://myproxy:8080"
--save-session Whether to save the Playwright MCP session into
the output directory.
--save-trace Whether to save the Playwright Trace of the
session into the output directory.
--storage-state <path> path to the storage state file for isolated
sessions.
--user-agent <ua string> specify user agent string
--user-data-dir <path> path to the user data directory. If not
specified, a temporary directory will be created.
--viewport-size <size> specify browser viewport size in pixels, for
example "1280, 720"
```
All the logged-in information will be stored in that profile; you can delete it between sessions if you'd like to clear the offline state.
<!--- End of options generated section -->
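The interaction between `--allowed-origins` and `--blocked-origins` described in the help text above can be sketched as follows. This is a minimal illustration, not the server's actual implementation: the helper names `parseOrigins` and `isRequestAllowed` are hypothetical, and matching by exact string equality is an assumption.

```typescript
// Hypothetical helper: split a semicolon-separated option value into origins.
function parseOrigins(value: string | undefined): string[] {
  if (!value)
    return [];
  return value.split(';').map(s => s.trim()).filter(s => s.length > 0);
}

// Hypothetical helper: decide whether a request to `origin` may proceed.
// Assumption: origins are compared by exact string equality.
function isRequestAllowed(origin: string, allowed: string[], blocked: string[]): boolean {
  // The blocklist is evaluated first, so it wins over the allowlist.
  if (blocked.indexOf(origin) !== -1)
    return false;
  // With an allowlist present, only listed origins pass.
  if (allowed.length > 0)
    return allowed.indexOf(origin) !== -1;
  // No allowlist: everything not blocked is allowed.
  return true;
}

const blockedOrigins = parseOrigins('https://ads.example.com;https://tracker.example.com');
const allowedWhenUnlisted = isRequestAllowed('https://playwright.dev', [], blockedOrigins);
```

Evaluating the blocklist first means an origin present in both lists is always blocked.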
### User profile
### Running headless browser (Browser without GUI).
You can run Playwright MCP with persistent profile like a regular browser (default), in isolated contexts for testing sessions, or connect to your existing browser using the browser extension.
This mode is useful for background or batch operations.
**Persistent profile**
All the logged-in information will be stored in the persistent profile; you can delete it between sessions if you'd like to clear the offline state.
The persistent profile is stored at the following locations, and you can override it with the `--user-data-dir` argument.
```bash
# Windows
%USERPROFILE%\AppData\Local\ms-playwright\mcp-{channel}-profile
# macOS
- ~/Library/Caches/ms-playwright/mcp-{channel}-profile
# Linux
- ~/.cache/ms-playwright/mcp-{channel}-profile
```
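The per-platform locations listed above can be expressed as a small function. This is only a sketch of the documented layout: `defaultProfileDir` is a hypothetical name, and the home directory is passed in explicitly (standing in for `%USERPROFILE%` or `~`) rather than read from the environment.

```typescript
type Platform = 'win32' | 'darwin' | 'linux';

// Hypothetical helper mirroring the documented default profile locations.
// `home` stands in for %USERPROFILE% on Windows and ~ elsewhere.
function defaultProfileDir(channel: string, platform: Platform, home: string): string {
  const leaf = `mcp-${channel}-profile`;
  if (platform === 'win32')
    return [home, 'AppData', 'Local', 'ms-playwright', leaf].join('\\');
  if (platform === 'darwin')
    return [home, 'Library', 'Caches', 'ms-playwright', leaf].join('/');
  return [home, '.cache', 'ms-playwright', leaf].join('/');
}

// '/home/me/.cache/ms-playwright/mcp-chrome-profile'
const linuxProfile = defaultProfileDir('chrome', 'linux', '/home/me');
```

Passing `--user-data-dir` replaces this default entirely.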
**Isolated**
In the isolated mode, each session is started in the isolated profile. Every time you ask MCP to close the browser,
the session is closed and all the storage state for this session is lost. You can provide initial storage state
to the browser via the config's `contextOptions` or via the `--storage-state` argument. Learn more about the storage
state [here](https://playwright.dev/docs/auth).
```js
{
@@ -83,225 +249,470 @@ This mode is useful for background or batch operations.
"command": "npx",
"args": [
"@playwright/mcp@latest",
"--headless"
"--isolated",
"--storage-state={path/to/storage.json}"
]
}
}
}
```
### Running headed browser on Linux w/o DISPLAY
**Browser Extension**
The Playwright MCP Chrome Extension allows you to connect to existing browser tabs and leverage your logged-in sessions and browser state. See [extension/README.md](extension/README.md) for installation and setup instructions.
### Configuration file
The Playwright MCP server can be configured using a JSON configuration file. You can specify the configuration file
using the `--config` command line option:
```bash
npx @playwright/mcp@latest --config path/to/config.json
```
<details>
<summary>Configuration file schema</summary>
```typescript
{
// Browser configuration
browser?: {
// Browser type to use (chromium, firefox, or webkit)
browserName?: 'chromium' | 'firefox' | 'webkit';
// Keep the browser profile in memory, do not save it to disk.
isolated?: boolean;
// Path to user data directory for browser profile persistence
userDataDir?: string;
// Browser launch options (see Playwright docs)
// @see https://playwright.dev/docs/api/class-browsertype#browser-type-launch
launchOptions?: {
channel?: string; // Browser channel (e.g. 'chrome')
headless?: boolean; // Run in headless mode
executablePath?: string; // Path to browser executable
// ... other Playwright launch options
};
// Browser context options
// @see https://playwright.dev/docs/api/class-browser#browser-new-context
contextOptions?: {
viewport?: { width: number, height: number };
// ... other Playwright context options
};
// CDP endpoint for connecting to existing browser
cdpEndpoint?: string;
// Remote Playwright server endpoint
remoteEndpoint?: string;
},
// Server configuration
server?: {
port?: number; // Port to listen on
host?: string; // Host to bind to (default: localhost)
},
// List of additional capabilities
capabilities?: Array<
'tabs' | // Tab management
'install' | // Browser installation
'pdf' | // PDF generation
'vision' | // Coordinate-based interactions
>;
// Directory for output files
outputDir?: string;
// Network configuration
network?: {
// List of origins to allow the browser to request. Default is to allow all. Origins matching both `allowedOrigins` and `blockedOrigins` will be blocked.
allowedOrigins?: string[];
// List of origins to block the browser from requesting. Origins matching both `allowedOrigins` and `blockedOrigins` will be blocked.
blockedOrigins?: string[];
};
/**
* Whether to send image responses to the client. Can be "allow" or "omit".
* Defaults to "allow".
*/
imageResponses?: 'allow' | 'omit';
}
```
</details>
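As an illustration of the schema above, a configuration object might look like the sketch below. All values are made-up examples (the blocked origin and output directory are placeholders); only the field names come from the schema. Serializing the object gives content that could be saved as the file passed to `--config`.

```typescript
// Example values only; field names follow the configuration schema above.
const exampleConfig = {
  browser: {
    browserName: 'chromium',
    isolated: true,
    launchOptions: { headless: true },
    contextOptions: { viewport: { width: 1280, height: 720 } },
  },
  server: { port: 8931, host: 'localhost' },
  capabilities: ['pdf'],
  outputDir: './output',
  network: { blockedOrigins: ['https://ads.example.com'] },
  imageResponses: 'omit',
};

// JSON text suitable for writing to a config file such as path/to/config.json.
const configJson = JSON.stringify(exampleConfig, null, 2);
```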
### Standalone MCP server
When running a headed browser on a system without a display, or from the worker processes of an IDE,
run the MCP server from an environment with DISPLAY set and pass the `--port` flag to enable SSE transport.
run the MCP server from an environment with DISPLAY set and pass the `--port` flag to enable HTTP transport.
```bash
npx @playwright/mcp@latest --port 8931
```
Then, in the MCP client config, set the `url` to the SSE endpoint:
Then, in the MCP client config, set the `url` to the HTTP endpoint:
```js
{
"mcpServers": {
"playwright": {
"url": "http://localhost:8931/sse"
"url": "http://localhost:8931/mcp"
}
}
}
```
### Tool Modes
<details>
<summary><b>Docker</b></summary>
The tools are available in two modes:
1. **Snapshot Mode** (default): Uses accessibility snapshots for better performance and reliability
2. **Vision Mode**: Uses screenshots for visual-based interactions
To use Vision Mode, add the `--vision` flag when starting the server:
**NOTE:** The Docker implementation only supports headless chromium at the moment.
```js
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp@latest",
"--vision"
]
"command": "docker",
"args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
}
}
}
```
Vision Mode works best with the computer use models that are able to interact with elements using
X Y coordinate space, based on the provided screenshot.
You can build the Docker image yourself.
### Programmatic usage with custom transports
```
docker build -t mcr.microsoft.com/playwright/mcp .
```
</details>
<details>
<summary><b>Programmatic usage</b></summary>
```js
import { createServer } from '@playwright/mcp';
import http from 'http';
// ...
import { createConnection } from '@playwright/mcp';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';
const server = createServer({
launchOptions: { headless: true }
http.createServer(async (req, res) => {
// ...
// Creates a headless Playwright MCP server with SSE transport
const connection = await createConnection({ browser: { launchOptions: { headless: true } } });
const transport = new SSEServerTransport('/messages', res);
await connection.server.connect(transport);
// ...
});
transport = new SSEServerTransport("/messages", res);
server.connect(transport);
```
</details>
### Snapshot Mode
### Tools
The Playwright MCP provides a set of tools for browser automation. Here are all available tools:
<!--- Tools generated by update-readme.js -->
- **browser_navigate**
- Description: Navigate to a URL
- Parameters:
- `url` (string): The URL to navigate to
<details>
<summary><b>Core automation</b></summary>
- **browser_go_back**
- Description: Go back to the previous page
- Parameters: None
- **browser_go_forward**
- Description: Go forward to the next page
- Parameters: None
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_click**
- Title: Click
- Description: Perform click on a web page
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- `doubleClick` (boolean, optional): Whether to perform a double click instead of a single click
- `button` (string, optional): Button to click, defaults to left
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_close**
- Title: Close browser
- Description: Close the page
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_console_messages**
- Title: Get console messages
- Description: Returns all console messages
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_drag**
- Title: Drag mouse
- Description: Perform drag and drop between two elements
- Parameters:
- `startElement` (string): Human-readable source element description used to obtain the permission to interact with the element
- `startRef` (string): Exact source element reference from the page snapshot
- `endElement` (string): Human-readable target element description used to obtain the permission to interact with the element
- `endRef` (string): Exact target element reference from the page snapshot
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_evaluate**
- Title: Evaluate JavaScript
- Description: Evaluate JavaScript expression on page or element
- Parameters:
- `function` (string): () => { /* code */ } or (element) => { /* code */ } when element is provided
- `element` (string, optional): Human-readable element description used to obtain permission to interact with the element
- `ref` (string, optional): Exact target element reference from the page snapshot
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_file_upload**
- Title: Upload files
- Description: Upload one or multiple files
- Parameters:
- `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files.
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_handle_dialog**
- Title: Handle a dialog
- Description: Handle a dialog
- Parameters:
- `accept` (boolean): Whether to accept the dialog.
- `promptText` (string, optional): The text of the prompt in case of a prompt dialog.
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_hover**
- Title: Hover mouse
- Description: Hover over element on page
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- Read-only: **true**
- **browser_drag**
- Description: Perform drag and drop between two elements
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_navigate**
- Title: Navigate to a URL
- Description: Navigate to a URL
- Parameters:
- `startElement` (string): Human-readable source element description used to obtain permission to interact with the element
- `startRef` (string): Exact source element reference from the page snapshot
- `endElement` (string): Human-readable target element description used to obtain permission to interact with the element
- `endRef` (string): Exact target element reference from the page snapshot
- `url` (string): The URL to navigate to
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_navigate_back**
- Title: Go back
- Description: Go back to the previous page
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_navigate_forward**
- Title: Go forward
- Description: Go forward to the next page
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_network_requests**
- Title: List network requests
- Description: Returns all network requests since loading the page
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_press_key**
- Title: Press a key
- Description: Press a key on the keyboard
- Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_resize**
- Title: Resize browser window
- Description: Resize the browser window
- Parameters:
- `width` (number): Width of the browser window
- `height` (number): Height of the browser window
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_select_option**
- Title: Select option
- Description: Select an option in a dropdown
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- `values` (array): Array of values to select in the dropdown. This can be a single value or multiple values.
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_snapshot**
- Title: Page snapshot
- Description: Capture accessibility snapshot of the current page, this is better than screenshot
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_take_screenshot**
- Title: Take a screenshot
- Description: Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.
- Parameters:
- `type` (string, optional): Image format for the screenshot. Default is png.
- `filename` (string, optional): File name to save the screenshot to. Defaults to `page-{timestamp}.{png|jpeg}` if not specified.
- `element` (string, optional): Human-readable element description used to obtain permission to screenshot the element. If not provided, the screenshot will be taken of viewport. If element is provided, ref must be provided too.
- `ref` (string, optional): Exact target element reference from the page snapshot. If not provided, the screenshot will be taken of viewport. If ref is provided, element must be provided too.
- `fullPage` (boolean, optional): When true, takes a screenshot of the full scrollable page, instead of the currently visible viewport. Cannot be used with element screenshots.
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_type**
- Title: Type text
- Description: Type text into editable element
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- `text` (string): Text to type into the element
- `submit` (boolean): Whether to submit entered text (press Enter after)
- `submit` (boolean, optional): Whether to submit entered text (press Enter after)
- `slowly` (boolean, optional): Whether to type one character at a time. Useful for triggering key handlers in the page. By default entire text is filled in at once.
- Read-only: **false**
- **browser_select_option**
- Description: Select option in a dropdown
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_wait_for**
- Title: Wait for
- Description: Wait for text to appear or disappear or a specified time to pass
- Parameters:
- `time` (number, optional): The time to wait in seconds
- `text` (string, optional): The text to wait for
- `textGone` (string, optional): The text to wait for to disappear
- Read-only: **true**
</details>
<details>
<summary><b>Tab management</b></summary>
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_tab_close**
- Title: Close a tab
- Description: Close a tab
- Parameters:
- `index` (number, optional): The index of the tab to close. Closes current tab if not provided.
- Read-only: **false**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_tab_list**
- Title: List tabs
- Description: List browser tabs
- Parameters: None
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_tab_new**
- Title: Open a new tab
- Description: Open a new tab
- Parameters:
- `url` (string, optional): The URL to navigate to in the new tab. If not provided, the new tab will be blank.
- Read-only: **true**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_tab_select**
- Title: Select a tab
- Description: Select a tab by index
- Parameters:
- `index` (number): The index of the tab to select
- Read-only: **true**
</details>
<details>
<summary><b>Browser installation</b></summary>
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_install**
- Title: Install the browser specified in the config
- Description: Install the browser specified in the config. Call this if you get an error about the browser not being installed.
- Parameters: None
- Read-only: **false**
</details>
<details>
<summary><b>Coordinate-based (opt-in via --caps=vision)</b></summary>
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_mouse_click_xy**
- Title: Click
- Description: Click left mouse button at a given position
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- `values` (array): Array of values to select in the dropdown.
- **browser_choose_file**
- Description: Choose one or multiple files to upload
- Parameters:
- `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files.
- **browser_press_key**
- Description: Press a key on the keyboard
- Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
- **browser_snapshot**
- Description: Capture accessibility snapshot of the current page (better than screenshot)
- Parameters: None
- **browser_save_as_pdf**
- Description: Save page as PDF
- Parameters: None
- **browser_take_screenshot**
- Description: Capture screenshot of the page
- Parameters:
- `raw` (string): Optionally returns lossless PNG screenshot. JPEG by default.
- **browser_wait**
- Description: Wait for a specified time in seconds
- Parameters:
- `time` (number): The time to wait in seconds (capped at 10 seconds)
- **browser_close**
- Description: Close the page
- Parameters: None
### Vision Mode
Vision Mode provides tools for visual-based interactions using screenshots. Here are all available tools:
- **browser_navigate**
- Description: Navigate to a URL
- Parameters:
- `url` (string): The URL to navigate to
- **browser_go_back**
- Description: Go back to the previous page
- Parameters: None
- **browser_go_forward**
- Description: Go forward to the next page
- Parameters: None
- **browser_screenshot**
- Description: Capture screenshot of the current page
- Parameters: None
- **browser_move_mouse**
- Description: Move mouse to specified coordinates
- Parameters:
- `x` (number): X coordinate
- `y` (number): Y coordinate
- Read-only: **false**
- **browser_click**
- Description: Click at specified coordinates
- Parameters:
- `x` (number): X coordinate to click at
- `y` (number): Y coordinate to click at
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_drag**
- Description: Perform drag and drop operation
- **browser_mouse_drag_xy**
- Title: Drag mouse
- Description: Drag left mouse button to a given position
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `startX` (number): Start X coordinate
- `startY` (number): Start Y coordinate
- `endX` (number): End X coordinate
- `endY` (number): End Y coordinate
- Read-only: **false**
- **browser_type**
- Description: Type text at specified coordinates
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_mouse_move_xy**
- Title: Move mouse
- Description: Move mouse to a given position
- Parameters:
- `text` (string): Text to type
- `submit` (boolean): Whether to submit entered text (press Enter after)
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `x` (number): X coordinate
- `y` (number): Y coordinate
- Read-only: **true**
- **browser_press_key**
- Description: Press a key on the keyboard
- Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
</details>
- **browser_choose_file**
- Description: Choose one or multiple files to upload
- Parameters:
- `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files.
<details>
<summary><b>PDF generation (opt-in via --caps=pdf)</b></summary>
- **browser_save_as_pdf**
<!-- NOTE: This has been generated via update-readme.js -->
- **browser_pdf_save**
- Title: Save as PDF
- Description: Save page as PDF
- Parameters: None
- **browser_wait**
- Description: Wait for a specified time in seconds
- Parameters:
- `time` (number): The time to wait in seconds (capped at 10 seconds)
- `filename` (string, optional): File name to save the pdf to. Defaults to `page-{timestamp}.pdf` if not specified.
- Read-only: **true**
- **browser_close**
- Description: Close the page
- Parameters: None
</details>
<!--- End of tools generated section -->
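To make the parameter lists above more concrete, here is a minimal sketch of how a client might wrap a tool invocation in the JSON-RPC `tools/call` request that MCP uses. The `toolCall` helper, the id counter, and the `e12` ref are assumptions made for illustration; real `ref` values come from an earlier `browser_snapshot` result, and an MCP client SDK would normally build this envelope for you.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
type ToolCallRequest = {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
};

let nextRequestId = 1;

// Hypothetical helper: wraps a tool name and its parameters in a request object.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// `element` is the human-readable description, `ref` the snapshot reference.
// 'e12' is a made-up ref used only for this example.
const clickRequest = toolCall('browser_click', { element: 'Submit button', ref: 'e12' });

// All browser_wait_for parameters are optional; here only `text` is passed.
const waitRequest = toolCall('browser_wait_for', { text: 'Thank you' });

console.log(JSON.stringify([clickRequest, waitRequest], null, 2));
```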

2
cli.js

@@ -15,4 +15,4 @@
* limitations under the License.
*/
require('./lib/program');
import './lib/program.js';

119
config.d.ts vendored Normal file

@@ -0,0 +1,119 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type * as playwright from 'playwright';
export type ToolCapability = 'core' | 'core-tabs' | 'core-install' | 'vision' | 'pdf';
export type Config = {
/**
* The browser to use.
*/
browser?: {
/**
* The type of browser to use.
*/
browserName?: 'chromium' | 'firefox' | 'webkit';
/**
* Keep the browser profile in memory, do not save it to disk.
*/
isolated?: boolean;
/**
* Path to a user data directory for browser profile persistence.
* Temporary directory is created by default.
*/
userDataDir?: string;
/**
* Launch options passed to
* @see https://playwright.dev/docs/api/class-browsertype#browser-type-launch-persistent-context
*
* This is useful for setting options like `channel`, `headless`, `executablePath`, etc.
*/
launchOptions?: playwright.LaunchOptions;
/**
* Context options for the browser context.
*
* This is useful for setting options like `viewport`.
*/
contextOptions?: playwright.BrowserContextOptions;
/**
* Chrome DevTools Protocol endpoint to connect to an existing browser instance in case of Chromium family browsers.
*/
cdpEndpoint?: string;
/**
* Remote endpoint to connect to an existing Playwright server.
*/
remoteEndpoint?: string;
},
server?: {
/**
* The port to listen on for SSE or MCP transport.
*/
port?: number;
/**
* The host to bind the server to. Default is localhost. Use 0.0.0.0 to bind to all interfaces.
*/
host?: string;
},
/**
* List of enabled tool capabilities. Possible values:
* - 'core': Core browser automation features.
* - 'pdf': PDF generation and manipulation.
* - 'vision': Coordinate-based interactions.
*/
capabilities?: ToolCapability[];
/**
* Whether to save the Playwright session into the output directory.
*/
saveSession?: boolean;
/**
* Whether to save the Playwright trace of the session into the output directory.
*/
saveTrace?: boolean;
/**
* The directory to save output files.
*/
outputDir?: string;
network?: {
/**
* List of origins to allow the browser to request. Default is to allow all. Origins matching both `allowedOrigins` and `blockedOrigins` will be blocked.
*/
allowedOrigins?: string[];
/**
* List of origins the browser is blocked from requesting. Origins matching both `allowedOrigins` and `blockedOrigins` will be blocked.
*/
blockedOrigins?: string[];
};
/**
* Whether to send image responses to the client. Can be "allow", "omit", or "auto". Defaults to "auto", which sends images if the client can display them.
*/
imageResponses?: 'allow' | 'omit';
};
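The precedence rule documented for `allowedOrigins` and `blockedOrigins` can be sketched as a small predicate. This is only an illustration under stated assumptions: `isOriginAllowed` is a hypothetical helper, it compares exact origins only, and it treats an empty or missing `allowedOrigins` list as "allow all". The server's actual matching logic is not shown in this file and may differ.

```typescript
// Local copy of the `network` part of Config, repeated so the sketch is self-contained.
type NetworkConfig = { allowedOrigins?: string[]; blockedOrigins?: string[] };

// Hypothetical helper: exact origin comparison, no wildcard support.
function isOriginAllowed(url: string, network: NetworkConfig = {}): boolean {
  const origin = new URL(url).origin;
  // An origin listed in both lists is blocked, as the doc comments state.
  if (network.blockedOrigins?.includes(origin))
    return false;
  // Assumption: no allow list (or an empty one) means everything else is allowed.
  if (!network.allowedOrigins?.length)
    return true;
  return network.allowedOrigins.includes(origin);
}

const exampleNetwork: NetworkConfig = {
  allowedOrigins: ['https://example.com', 'https://ads.example.com'],
  blockedOrigins: ['https://ads.example.com'],
};

for (const url of ['https://example.com/page', 'https://ads.example.com/banner', 'https://other.org/'])
  console.log(url, isOriginAllowed(url, exampleNetwork));
```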


@@ -33,6 +33,8 @@ const plugins = {
};
export const baseRules = {
"import/extensions": ["error", "ignorePackages", {ts: "always"}],
"@typescript-eslint/no-floating-promises": "error",
"@typescript-eslint/no-unused-vars": [
2,
{ args: "none", caughtErrors: "none" },
@@ -178,12 +180,41 @@ export const baseRules = {
// react
"react/react-in-jsx-scope": 0,
"no-console": 2,
};
const languageOptions = {
parser: tsParser,
ecmaVersion: 9,
sourceType: "module",
parserOptions: {
project: path.join(fileURLToPath(import.meta.url), "..", "tsconfig.all.json"),
}
};
const importOrderRules = {
"import/order": [
2,
{
groups: [
"builtin",
"external",
"internal",
["parent", "sibling"],
"index",
"type",
],
},
],
"import/consistent-type-specifier-style": [2, "prefer-top-level"],
};
const noFloatingPromisesRules = {
"@typescript-eslint/no-floating-promises": "error",
};
const noBooleanCompareRules = {
"@typescript-eslint/no-unnecessary-boolean-literal-compare": 2,
};
export default [
@@ -194,6 +225,11 @@ export default [
files: ["**/*.ts", "**/*.tsx"],
plugins,
languageOptions,
rules: baseRules,
rules: {
...baseRules,
...importOrderRules,
...noFloatingPromisesRules,
...noBooleanCompareRules,
},
},
];

10
examples/generate-test.md Normal file

@@ -0,0 +1,10 @@
Use Playwright tools to generate test for scenario:
## GitHub PR Checks Navigation Checklist
1. Open the [Microsoft Playwright GitHub repository](https://github.com/microsoft/playwright).
2. Click on the **Pull requests** tab.
3. Find and open the pull request titled **"chore: make noWaitAfter a default"**.
4. Switch to the **Checks** tab for that pull request.
5. Expand the **infra** check suite to view its jobs.
6. Click on the **docs & lint** job to view its details.

48
extension/README.md Normal file

@@ -0,0 +1,48 @@
# Playwright MCP Chrome Extension
## Introduction
The Playwright MCP Chrome Extension allows you to connect to pages in your existing browser and leverage the state of your default user profile. This means the AI assistant can interact with websites where you're already logged in, using your existing cookies, sessions, and browser state, providing a seamless experience without requiring separate authentication or setup.
## Prerequisites
- Chrome/Edge/Chromium browser
## Installation Steps
### Download the Extension
Download the latest Chrome extension from GitHub:
- **Download link**: https://github.com/microsoft/playwright-mcp/releases
### Load Chrome Extension
1. Open Chrome and navigate to `chrome://extensions/`
2. Enable "Developer mode" (toggle in the top right corner)
3. Click "Load unpacked" and select the extension directory
### Configure Playwright MCP server
Configure Playwright MCP server to connect to the browser using the extension by passing the `--extension` option when running the MCP server:
```json
{
"mcpServers": {
"playwright-extension": {
"command": "npx",
"args": [
"@playwright/mcp@latest",
"--extension"
]
}
}
}
```
## Usage
### Browser Tab Selection
When the LLM interacts with the browser for the first time, it will load a page where you can select which browser tab the LLM will connect to. This allows you to control which specific page the AI assistant will interact with during the session.

Binary file not shown. Size: 6.2 KiB

BIN
extension/icons/icon-16.png Normal file

Binary file not shown. Size: 571 B

BIN
extension/icons/icon-32.png Normal file

Binary file not shown. Size: 1.2 KiB

BIN
extension/icons/icon-48.png Normal file

Binary file not shown. Size: 2.0 KiB

40
extension/manifest.json Normal file

@@ -0,0 +1,40 @@
{
"manifest_version": 3,
"name": "Playwright MCP Bridge",
"version": "0.0.34",
"description": "Share browser tabs with Playwright MCP server",
"key": "MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA9nMS2b0WCohjVHPGb8D9qAdkbIngDqoAjTeSccHJijgcONejge+OJxOQOMLu7b0ovt1c9BiEJa5JcpM+EHFVGL1vluBxK71zmBy1m2f9vZF3HG0LSCp7YRkum9rAIEthDwbkxx6XTvpmAY5rjFa/NON6b9Hlbo+8peUSkoOK7HTwYnnI36asZ9eUTiveIf+DMPLojW2UX33vDWG2UKvMVDewzclb4+uLxAYshY7Mx8we/b44xu+Anb/EBLKjOPk9Yh541xJ5Ozc8EiP/5yxOp9c/lRiYUHaRW+4r0HKZyFt0eZ52ti2iM4Nfk7jRXR7an3JPsUIf5deC/1cVM/+1ZQIDAQAB",
"permissions": [
"debugger",
"activeTab",
"tabs",
"storage"
],
"host_permissions": [
"<all_urls>"
],
"background": {
"service_worker": "lib/background.js",
"type": "module"
},
"action": {
"default_title": "Playwright MCP Bridge",
"default_icon": {
"16": "icons/icon-16.png",
"32": "icons/icon-32.png",
"48": "icons/icon-48.png",
"128": "icons/icon-128.png"
}
},
"icons": {
"16": "icons/icon-16.png",
"32": "icons/icon-32.png",
"48": "icons/icon-48.png",
"128": "icons/icon-128.png"
}
}

1884
extension/package-lock.json generated Normal file

File diff suppressed because it is too large

36
extension/package.json Normal file

@@ -0,0 +1,36 @@
{
"name": "@playwright/mcp-extension",
"version": "0.0.34",
"description": "Playwright MCP Browser Extension",
"type": "module",
"private": true,
"repository": {
"type": "git",
"url": "git+https://github.com/microsoft/playwright-mcp.git"
},
"homepage": "https://playwright.dev",
"engines": {
"node": ">=18"
},
"author": {
"name": "Microsoft Corporation"
},
"license": "Apache-2.0",
"scripts": {
"build": "tsc --project . && tsc --project tsconfig.ui.json && vite build",
"watch": "tsc --watch --project . & tsc --watch --project tsconfig.ui.json & vite build --watch",
"test": "playwright test",
"clean": "rm -rf dist"
},
"devDependencies": {
"@types/chrome": "^0.0.315",
"@types/react": "^18.2.66",
"@types/react-dom": "^18.2.22",
"@vitejs/plugin-react": "^4.0.0",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"typescript": "^5.8.2",
"vite": "^5.0.0",
"vite-plugin-static-copy": "^3.1.1"
}
}


@@ -0,0 +1,31 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { defineConfig } from '@playwright/test';
import type { TestOptions } from '../tests/fixtures.js';
export default defineConfig<TestOptions>({
testDir: './tests',
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
reporter: 'list',
projects: [
{ name: 'chromium', use: { mcpBrowser: 'chromium' } },
],
});

219
extension/src/background.ts Normal file

@@ -0,0 +1,219 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { RelayConnection, debugLog } from './relayConnection.js';
type PageMessage = {
type: 'connectToMCPRelay';
mcpRelayUrl: string;
} | {
type: 'getTabs';
} | {
type: 'connectToTab';
tabId: number;
windowId: number;
mcpRelayUrl: string;
} | {
type: 'getConnectionStatus';
} | {
type: 'disconnect';
};
class TabShareExtension {
private _activeConnection: RelayConnection | undefined;
private _connectedTabId: number | null = null;
private _pendingTabSelection = new Map<number, { connection: RelayConnection, timerId?: number }>();
constructor() {
chrome.tabs.onRemoved.addListener(this._onTabRemoved.bind(this));
chrome.tabs.onUpdated.addListener(this._onTabUpdated.bind(this));
chrome.tabs.onActivated.addListener(this._onTabActivated.bind(this));
chrome.runtime.onMessage.addListener(this._onMessage.bind(this));
chrome.action.onClicked.addListener(this._onActionClicked.bind(this));
}
// Promise-based message handling is not supported in Chrome: https://issues.chromium.org/issues/40753031
private _onMessage(message: PageMessage, sender: chrome.runtime.MessageSender, sendResponse: (response: any) => void) {
switch (message.type) {
case 'connectToMCPRelay':
this._connectToRelay(sender.tab!.id!, message.mcpRelayUrl!).then(
() => sendResponse({ success: true }),
(error: any) => sendResponse({ success: false, error: error.message }));
return true;
case 'getTabs':
this._getTabs().then(
tabs => sendResponse({ success: true, tabs, currentTabId: sender.tab?.id }),
(error: any) => sendResponse({ success: false, error: error.message }));
return true;
case 'connectToTab':
this._connectTab(sender.tab!.id!, message.tabId, message.windowId, message.mcpRelayUrl!).then(
() => sendResponse({ success: true }),
(error: any) => sendResponse({ success: false, error: error.message }));
return true; // Return true to indicate that the response will be sent asynchronously
case 'getConnectionStatus':
sendResponse({
connectedTabId: this._connectedTabId
});
return false;
case 'disconnect':
this._disconnect().then(
() => sendResponse({ success: true }),
(error: any) => sendResponse({ success: false, error: error.message }));
return true;
}
return false;
}
private async _connectToRelay(selectorTabId: number, mcpRelayUrl: string): Promise<void> {
try {
debugLog(`Connecting to relay at ${mcpRelayUrl}`);
const socket = new WebSocket(mcpRelayUrl);
await new Promise<void>((resolve, reject) => {
socket.onopen = () => resolve();
socket.onerror = () => reject(new Error('WebSocket error'));
setTimeout(() => reject(new Error('Connection timeout')), 5000);
});
const connection = new RelayConnection(socket);
connection.onclose = () => {
debugLog('Connection closed');
this._pendingTabSelection.delete(selectorTabId);
// TODO: show error in the selector tab?
};
this._pendingTabSelection.set(selectorTabId, { connection });
debugLog(`Connected to MCP relay`);
} catch (error: any) {
debugLog(`Failed to connect to MCP relay:`, error.message);
throw error;
}
}
private async _connectTab(selectorTabId: number, tabId: number, windowId: number, mcpRelayUrl: string): Promise<void> {
try {
debugLog(`Connecting tab ${tabId} to relay at ${mcpRelayUrl}`);
try {
this._activeConnection?.close('Another connection is requested');
} catch (error: any) {
debugLog(`Error closing active connection:`, error);
}
await this._setConnectedTabId(null);
this._activeConnection = this._pendingTabSelection.get(selectorTabId)?.connection;
if (!this._activeConnection)
throw new Error('No active MCP relay connection');
this._pendingTabSelection.delete(selectorTabId);
this._activeConnection.setTabId(tabId);
this._activeConnection.onclose = () => {
debugLog('MCP connection closed');
this._activeConnection = undefined;
void this._setConnectedTabId(null);
};
await Promise.all([
this._setConnectedTabId(tabId),
chrome.tabs.update(tabId, { active: true }),
chrome.windows.update(windowId, { focused: true }),
]);
debugLog(`Connected to MCP bridge`);
} catch (error: any) {
await this._setConnectedTabId(null);
debugLog(`Failed to connect tab ${tabId}:`, error.message);
throw error;
}
}
private async _setConnectedTabId(tabId: number | null): Promise<void> {
const oldTabId = this._connectedTabId;
this._connectedTabId = tabId;
if (oldTabId && oldTabId !== tabId)
await this._updateBadge(oldTabId, { text: '' });
if (tabId)
await this._updateBadge(tabId, { text: '✓', color: '#4CAF50', title: 'Connected to MCP client' });
}
private async _updateBadge(tabId: number, { text, color, title }: { text: string; color?: string, title?: string }): Promise<void> {
try {
await chrome.action.setBadgeText({ tabId, text });
await chrome.action.setTitle({ tabId, title: title || '' });
if (color)
await chrome.action.setBadgeBackgroundColor({ tabId, color });
} catch (error: any) {
// Ignore errors as the tab may be closed already.
}
}
private async _onTabRemoved(tabId: number): Promise<void> {
const pendingConnection = this._pendingTabSelection.get(tabId)?.connection;
if (pendingConnection) {
this._pendingTabSelection.delete(tabId);
pendingConnection.close('Browser tab closed');
return;
}
if (this._connectedTabId !== tabId)
return;
this._activeConnection?.close('Browser tab closed');
this._activeConnection = undefined;
this._connectedTabId = null;
}
private _onTabActivated(activeInfo: chrome.tabs.TabActiveInfo) {
for (const [tabId, pending] of this._pendingTabSelection) {
if (tabId === activeInfo.tabId) {
if (pending.timerId) {
clearTimeout(pending.timerId);
pending.timerId = undefined;
}
continue;
}
if (!pending.timerId) {
pending.timerId = setTimeout(() => {
const existed = this._pendingTabSelection.delete(tabId);
if (existed) {
pending.connection.close('Tab has been inactive for 5 seconds');
chrome.tabs.sendMessage(tabId, { type: 'connectionTimeout' });
}
}, 5000);
return;
}
}
}
private _onTabUpdated(tabId: number, changeInfo: chrome.tabs.TabChangeInfo, tab: chrome.tabs.Tab) {
if (this._connectedTabId === tabId)
void this._setConnectedTabId(tabId);
}
private async _getTabs(): Promise<chrome.tabs.Tab[]> {
const tabs = await chrome.tabs.query({});
return tabs.filter(tab => tab.url && !['chrome:', 'edge:', 'devtools:'].some(scheme => tab.url!.startsWith(scheme)));
}
private async _onActionClicked(): Promise<void> {
await chrome.tabs.create({
url: chrome.runtime.getURL('status.html'),
active: true
});
}
private async _disconnect(): Promise<void> {
this._activeConnection?.close('User disconnected');
this._activeConnection = undefined;
await this._setConnectedTabId(null);
}
}
new TabShareExtension();
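The `_onMessage` handler above repeats one pattern in several branches: start asynchronous work, forward either the result or the error message through `sendResponse`, and return `true` so the channel stays open. A minimal sketch of that pattern as a reusable helper might look like the following. `respondAsync` and the `Reply` type are hypothetical names that do not exist in the extension, and the reply puts the result under a single `value` field, whereas the handler above spreads fields such as `tabs` into the response.

```typescript
type Reply<T> = { success: true; value: T } | { success: false; error: string };

// Hypothetical helper: settles `work`, reports the outcome through `sendResponse`,
// and returns true to signal that the response will be sent asynchronously.
function respondAsync<T>(work: Promise<T>, sendResponse: (reply: Reply<T>) => void): boolean {
  void work.then(
      value => sendResponse({ success: true, value }),
      (error: unknown) => sendResponse({
        success: false,
        error: error instanceof Error ? error.message : String(error),
      }));
  return true;
}

// Usage with a stand-in for the `sendResponse` callback passed to onMessage listeners.
const replies: Reply<number>[] = [];
const keepChannelOpen = respondAsync(Promise.resolve(42), reply => { replies.push(reply); });
const keepOpenOnFailure = respondAsync<number>(
    Promise.reject(new Error('Connection timeout')),
    reply => { replies.push(reply); });
```

Returning `true` synchronously matters because, as the comment in `_onMessage` notes, the listener cannot simply return a promise; the replies are pushed only after the promises settle.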


@@ -0,0 +1,178 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
export function debugLog(...args: unknown[]): void {
const enabled = true;
if (enabled) {
// eslint-disable-next-line no-console
console.log('[Extension]', ...args);
}
}
type ProtocolCommand = {
id: number;
method: string;
params?: any;
};
type ProtocolResponse = {
id?: number;
method?: string;
params?: any;
result?: any;
error?: string;
};
export class RelayConnection {
private _debuggee: chrome.debugger.Debuggee;
private _ws: WebSocket;
private _eventListener: (source: chrome.debugger.DebuggerSession, method: string, params: any) => void;
private _detachListener: (source: chrome.debugger.Debuggee, reason: string) => void;
private _tabPromise: Promise<void>;
private _tabPromiseResolve!: () => void;
private _closed = false;
onclose?: () => void;
constructor(ws: WebSocket) {
this._debuggee = { };
this._tabPromise = new Promise(resolve => this._tabPromiseResolve = resolve);
this._ws = ws;
this._ws.onmessage = this._onMessage.bind(this);
this._ws.onclose = () => this._onClose();
// Store listeners for cleanup
this._eventListener = this._onDebuggerEvent.bind(this);
this._detachListener = this._onDebuggerDetach.bind(this);
chrome.debugger.onEvent.addListener(this._eventListener);
chrome.debugger.onDetach.addListener(this._detachListener);
}
// Either setTabId or close is called after creating the connection.
setTabId(tabId: number): void {
this._debuggee = { tabId };
this._tabPromiseResolve();
}
close(message: string): void {
this._ws.close(1000, message);
// ws.onclose is called asynchronously, so we call it here to avoid forwarding
// CDP events to the closed connection.
this._onClose();
}
private _onClose() {
if (this._closed)
return;
this._closed = true;
chrome.debugger.onEvent.removeListener(this._eventListener);
chrome.debugger.onDetach.removeListener(this._detachListener);
chrome.debugger.detach(this._debuggee).catch(() => {});
this.onclose?.();
}
private _onDebuggerEvent(source: chrome.debugger.DebuggerSession, method: string, params: any): void {
if (source.tabId !== this._debuggee.tabId)
return;
debugLog('Forwarding CDP event:', method, params);
const sessionId = source.sessionId;
this._sendMessage({
method: 'forwardCDPEvent',
params: {
sessionId,
method,
params,
},
});
}
private _onDebuggerDetach(source: chrome.debugger.Debuggee, reason: string): void {
if (source.tabId !== this._debuggee.tabId)
return;
this.close(`Debugger detached: ${reason}`);
this._debuggee = { };
}
private _onMessage(event: MessageEvent): void {
this._onMessageAsync(event).catch(e => debugLog('Error handling message:', e));
}
private async _onMessageAsync(event: MessageEvent): Promise<void> {
let message: ProtocolCommand;
try {
message = JSON.parse(event.data);
} catch (error: any) {
debugLog('Error parsing message:', error);
this._sendError(-32700, `Error parsing message: ${error.message}`);
return;
}
debugLog('Received message:', message);
const response: ProtocolResponse = {
id: message.id,
};
try {
response.result = await this._handleCommand(message);
} catch (error: any) {
debugLog('Error handling command:', error);
response.error = error.message;
}
debugLog('Sending response:', response);
this._sendMessage(response);
}
private async _handleCommand(message: ProtocolCommand): Promise<any> {
if (message.method === 'attachToTab') {
await this._tabPromise;
debugLog('Attaching debugger to tab:', this._debuggee);
await chrome.debugger.attach(this._debuggee, '1.3');
const result: any = await chrome.debugger.sendCommand(this._debuggee, 'Target.getTargetInfo');
return {
targetInfo: result?.targetInfo,
};
}
if (!this._debuggee.tabId)
throw new Error('No tab is connected. Please go to the Playwright MCP extension and select the tab you want to connect to.');
if (message.method === 'forwardCDPCommand') {
const { sessionId, method, params } = message.params;
debugLog('CDP command:', method, params);
const debuggerSession: chrome.debugger.DebuggerSession = {
...this._debuggee,
sessionId,
};
// Forward CDP command to chrome.debugger
return await chrome.debugger.sendCommand(
debuggerSession,
method,
params
);
}
}
private _sendError(code: number, message: string): void {
this._sendMessage({
error: {
code,
message,
},
});
}
private _sendMessage(message: any): void {
if (this._ws.readyState === WebSocket.OPEN)
this._ws.send(JSON.stringify(message));
}
}
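
Note on `close()` and `_onClose()` above: the comment in `close()` explains that `ws.onclose` fires asynchronously, so teardown runs immediately and a `_closed` flag keeps the later callback from repeating it. A minimal sketch of that idempotent-close pattern follows. `SocketLike` and `GuardedConnection` are hypothetical stand-ins, and the `cleanup` callback represents the listener removal and debugger detach done in the diff.

```typescript
// Illustrative sketch only; SocketLike and GuardedConnection are hypothetical names.
interface SocketLike {
  onclose?: () => void;
  close(code: number, reason: string): void;
}

class GuardedConnection {
  private closed = false;
  private socket: SocketLike;
  private cleanup: () => void;

  constructor(socket: SocketLike, cleanup: () => void) {
    this.socket = socket;
    this.cleanup = cleanup;
    // The socket may report closure later, or on its own after a remote close.
    this.socket.onclose = () => this.finish();
  }

  close(reason: string): void {
    this.socket.close(1000, reason);
    // Do not wait for the asynchronous onclose callback: tear down right away.
    this.finish();
  }

  private finish(): void {
    // The flag makes teardown safe to reach from both paths.
    if (this.closed)
      return;
    this.closed = true;
    this.cleanup();
  }
}
```

Whichever path runs first performs the cleanup; the other becomes a no-op.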


@@ -0,0 +1,195 @@
/*
Copyright (c) Microsoft Corporation.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
body {
margin: 0;
padding: 0;
}
/* Base styles */
.app-container {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Noto Sans", Helvetica, Arial, sans-serif;
background-color: #ffffff;
color: #1f2328;
margin: 0;
padding: 16px;
min-height: 100vh;
font-size: 14px;
}
.content-wrapper {
max-width: 600px;
margin: 0 auto;
}
/* Status Banner */
.status-container {
display: flex;
align-items: center;
justify-content: space-between;
margin-bottom: 16px;
padding-right: 12px;
}
.status-banner {
padding: 12px;
font-size: 14px;
font-weight: 500;
display: flex;
align-items: center;
gap: 8px;
flex: 1;
}
.status-banner.connected {
color: #1f2328;
}
.status-banner.connected::before {
content: "\2705";
margin-right: 8px;
}
.status-banner.error {
color: #1f2328;
}
.status-banner.error::before {
content: "\274C";
margin-right: 8px;
}
/* Buttons */
.button-container {
margin-bottom: 16px;
display: flex;
justify-content: flex-end;
padding-right: 12px;
}
.button {
padding: 8px 16px;
border-radius: 6px;
border: none;
font-size: 14px;
font-weight: 500;
cursor: pointer;
display: inline-flex;
align-items: center;
justify-content: center;
text-decoration: none;
margin-right: 8px;
min-width: 90px;
}
.button.primary {
background-color: #f8f9fa;
color: #3c4043;
border: 1px solid #dadce0;
}
.button.primary:hover {
background-color: #f1f3f4;
border-color: #dadce0;
box-shadow: 0 1px 2px 0 rgba(60,64,67,.1);
}
.button.default {
background-color: #f6f8fa;
color: #24292f;
}
.button.default:hover {
background-color: #f3f4f6;
}
.button.reject {
background-color: #da3633;
color: #ffffff;
border: 1px solid #da3633;
}
.button.reject:hover {
background-color: #c73836;
border-color: #c73836;
}
/* Tab selection */
.tab-section-title {
padding-left: 12px;
font-size: 12px;
font-weight: 400;
margin-bottom: 12px;
color: #656d76;
}
.tab-item {
display: flex;
align-items: center;
padding: 12px;
margin-bottom: 8px;
background-color: #ffffff;
cursor: pointer;
border-radius: 6px;
transition: background-color 0.2s ease;
}
.tab-item:hover {
background-color: #f8f9fa;
}
.tab-item.selected {
background-color: #f6f8fa;
}
.tab-item.disabled {
cursor: not-allowed;
opacity: 0.5;
}
.tab-radio {
margin-right: 12px;
flex-shrink: 0;
}
.tab-favicon {
width: 16px;
height: 16px;
margin-right: 8px;
flex-shrink: 0;
}
.tab-content {
flex: 1;
min-width: 0;
}
.tab-title {
font-weight: 500;
color: #1f2328;
margin-bottom: 2px;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.tab-url {
font-size: 12px;
color: #656d76;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}


@@ -0,0 +1,29 @@
<!--
Copyright (c) Microsoft Corporation.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!DOCTYPE html>
<html>
<head>
<title>Playwright MCP extension</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="icon" type="image/png" sizes="32x32" href="../../icons/icon-32.png">
<link rel="icon" type="image/png" sizes="16x16" href="../../icons/icon-16.png">
<link rel="stylesheet" href="connect.css">
</head>
<body>
<div id="root"></div>
<script type="module" src="connect.tsx"></script>
</body>
</html>


@@ -0,0 +1,168 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import React, { useState, useEffect, useCallback } from 'react';
import { createRoot } from 'react-dom/client';
import { Button, TabItem } from './tabItem.js';
import type { TabInfo } from './tabItem.js';
type StatusType = 'connected' | 'error' | 'connecting';
const ConnectApp: React.FC = () => {
const [tabs, setTabs] = useState<TabInfo[]>([]);
const [status, setStatus] = useState<{ type: StatusType; message: string } | null>(null);
const [showButtons, setShowButtons] = useState(true);
const [showTabList, setShowTabList] = useState(true);
const [clientInfo, setClientInfo] = useState('unknown');
const [mcpRelayUrl, setMcpRelayUrl] = useState('');
useEffect(() => {
const params = new URLSearchParams(window.location.search);
const relayUrl = params.get('mcpRelayUrl');
if (!relayUrl) {
setShowButtons(false);
setStatus({ type: 'error', message: 'Missing mcpRelayUrl parameter in URL.' });
return;
}
setMcpRelayUrl(relayUrl);
try {
const client = JSON.parse(params.get('client') || '{}');
const info = `${client.name}/${client.version}`;
setClientInfo(info);
setStatus({
type: 'connecting',
message: `🎭 Playwright MCP started from "${info}" is trying to connect. Do you want to continue?`
});
} catch (e) {
setStatus({ type: 'error', message: 'Failed to parse client version.' });
return;
}
void connectToMCPRelay(relayUrl);
void loadTabs();
}, []);
const connectToMCPRelay = useCallback(async (mcpRelayUrl: string) => {
const response = await chrome.runtime.sendMessage({ type: 'connectToMCPRelay', mcpRelayUrl });
if (!response.success)
setStatus({ type: 'error', message: 'Failed to connect to MCP relay: ' + response.error });
}, []);
const loadTabs = useCallback(async () => {
const response = await chrome.runtime.sendMessage({ type: 'getTabs' });
if (response.success)
setTabs(response.tabs);
else
setStatus({ type: 'error', message: 'Failed to load tabs: ' + response.error });
}, []);
const handleConnectToTab = useCallback(async (tab: TabInfo) => {
setShowButtons(false);
setShowTabList(false);
try {
const response = await chrome.runtime.sendMessage({
type: 'connectToTab',
mcpRelayUrl,
tabId: tab.id,
windowId: tab.windowId,
});
if (response?.success) {
setStatus({ type: 'connected', message: `MCP client "${clientInfo}" connected.` });
} else {
setStatus({
type: 'error',
message: response?.error || `MCP client "${clientInfo}" failed to connect.`
});
}
} catch (e) {
setStatus({
type: 'error',
message: `MCP client "${clientInfo}" failed to connect: ${e}`
});
}
}, [clientInfo, mcpRelayUrl]);
const handleReject = useCallback(() => {
setShowButtons(false);
setShowTabList(false);
setStatus({ type: 'error', message: 'Connection rejected. This tab can be closed.' });
}, []);
useEffect(() => {
const listener = (message: any) => {
if (message.type === 'connectionTimeout')
handleReject();
};
chrome.runtime.onMessage.addListener(listener);
return () => {
chrome.runtime.onMessage.removeListener(listener);
};
}, []);
return (
<div className='app-container'>
<div className='content-wrapper'>
{status && (
<div className='status-container'>
<StatusBanner type={status.type} message={status.message} />
{showButtons && (
<Button variant='reject' onClick={handleReject}>
Reject
</Button>
)}
</div>
)}
{showTabList && (
<div>
<div className='tab-section-title'>
Select page to expose to MCP server:
</div>
<div>
{tabs.map(tab => (
<TabItem
key={tab.id}
tab={tab}
button={
<Button variant='primary' onClick={() => handleConnectToTab(tab)}>
Connect
</Button>
}
/>
))}
</div>
</div>
)}
</div>
</div>
);
};
const StatusBanner: React.FC<{ type: StatusType; message: string }> = ({ type, message }) => {
return <div className={`status-banner ${type}`}>{message}</div>;
};
// Initialize the React app
const container = document.getElementById('root');
if (container) {
const root = createRoot(container);
root.render(<ConnectApp />);
}
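
Note on the first `useEffect` in `ConnectApp` above: it reads `mcpRelayUrl` and a JSON-encoded `client` from the page URL and derives the status message from them. The sketch below pulls that parsing into a pure function so the error cases are easy to see. `parseConnectParams` and `ConnectParams` are hypothetical names, and the fallback to `unknown` for a missing name or version is an assumption; the component above interpolates whatever the parsed object contains.

```typescript
// Illustrative sketch only; parseConnectParams and ConnectParams are hypothetical names.
type ConnectParams =
  | { ok: true; relayUrl: string; clientInfo: string }
  | { ok: false; error: string };

function parseConnectParams(search: string): ConnectParams {
  const params = new URLSearchParams(search);
  const relayUrl = params.get('mcpRelayUrl');
  if (!relayUrl)
    return { ok: false, error: 'Missing mcpRelayUrl parameter in URL.' };

  let parsed: unknown;
  try {
    parsed = JSON.parse(params.get('client') || '{}');
  } catch {
    return { ok: false, error: 'Failed to parse client version.' };
  }

  // Treat anything that is not a JSON object as "no client info".
  const client = (parsed && typeof parsed === 'object' ? parsed : {}) as { name?: unknown; version?: unknown };
  // Assumption: fall back to 'unknown' instead of rendering "undefined/undefined".
  const name = typeof client.name === 'string' ? client.name : 'unknown';
  const version = typeof client.version === 'string' ? client.version : 'unknown';
  return { ok: true, relayUrl, clientInfo: `${name}/${version}` };
}
```

A component could call this with `window.location.search` and map the result onto its status state.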


@@ -0,0 +1,13 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Playwright MCP Bridge Status</title>
<link rel="stylesheet" href="connect.css">
</head>
<body>
<div id="root"></div>
<script src="status.tsx" type="module"></script>
</body>
</html>

110
extension/src/ui/status.tsx Normal file

@@ -0,0 +1,110 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import React, { useState, useEffect } from 'react';
import { createRoot } from 'react-dom/client';
import { Button, TabItem } from './tabItem.js';
import type { TabInfo } from './tabItem.js';
interface ConnectionStatus {
isConnected: boolean;
connectedTabId: number | null;
connectedTab?: TabInfo;
}
const StatusApp: React.FC = () => {
const [status, setStatus] = useState<ConnectionStatus>({
isConnected: false,
connectedTabId: null
});
useEffect(() => {
void loadStatus();
}, []);
const loadStatus = async () => {
// Get current connection status from background script
const { connectedTabId } = await chrome.runtime.sendMessage({ type: 'getConnectionStatus' });
if (connectedTabId) {
const tab = await chrome.tabs.get(connectedTabId);
setStatus({
isConnected: true,
connectedTabId,
connectedTab: {
id: tab.id!,
windowId: tab.windowId!,
title: tab.title!,
url: tab.url!,
favIconUrl: tab.favIconUrl
}
});
} else {
setStatus({
isConnected: false,
connectedTabId: null
});
}
};
const openConnectedTab = async () => {
if (!status.connectedTabId)
return;
await chrome.tabs.update(status.connectedTabId, { active: true });
window.close();
};
const disconnect = async () => {
await chrome.runtime.sendMessage({ type: 'disconnect' });
window.close();
};
return (
<div className='app-container'>
<div className='content-wrapper'>
{status.isConnected && status.connectedTab ? (
<div>
<div className='tab-section-title'>
Page with connected MCP client:
</div>
<div>
<TabItem
tab={status.connectedTab}
button={
<Button variant='primary' onClick={disconnect}>
Disconnect
</Button>
}
onClick={openConnectedTab}
/>
</div>
</div>
) : (
<div className='status-banner'>
No MCP clients are currently connected.
</div>
)}
</div>
</div>
);
};
// Initialize the React app
const container = document.getElementById('root');
if (container) {
const root = createRoot(container);
root.render(<StatusApp />);
}


@@ -0,0 +1,67 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import React from 'react';
export interface TabInfo {
id: number;
windowId: number;
title: string;
url: string;
favIconUrl?: string;
}
export const Button: React.FC<{ variant: 'primary' | 'default' | 'reject'; onClick: () => void; children: React.ReactNode }> = ({
variant,
onClick,
children
}) => {
return (
<button className={`button ${variant}`} onClick={onClick}>
{children}
</button>
);
};
export interface TabItemProps {
tab: TabInfo;
onClick?: () => void;
button?: React.ReactNode;
}
export const TabItem: React.FC<TabItemProps> = ({
tab,
onClick,
button
}) => {
return (
<div className='tab-item' onClick={onClick} style={onClick ? { cursor: 'pointer' } : undefined}>
<img
src={tab.favIconUrl || 'data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" viewBox="0 0 16 16"><rect width="16" height="16" fill="%23f6f8fa"/></svg>'}
alt=''
className='tab-favicon'
/>
<div className='tab-content'>
<div className='tab-title'>
{tab.title || 'Untitled'}
</div>
<div className='tab-url'>{tab.url}</div>
</div>
{button}
</div>
);
};


@@ -0,0 +1,4 @@
// Helps VS Code find the right tsconfig file.
{
"extends": "../../tsconfig.ui.json"
}


@@ -0,0 +1,187 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { fileURLToPath } from 'url';
import { chromium } from 'playwright';
import { test as base, expect } from '../../tests/fixtures.js';
import type { BrowserContext } from 'playwright';
import type { Client } from '@modelcontextprotocol/sdk/client/index.js';
import type { StartClient } from '../../tests/fixtures.js';
type BrowserWithExtension = {
userDataDir: string;
launch: (mode?: 'disable-extension') => Promise<BrowserContext>;
};
const test = base.extend<{ browserWithExtension: BrowserWithExtension }>({
browserWithExtension: async ({ mcpBrowser }, use, testInfo) => {
// The flags no longer work in Chrome since
// https://chromium.googlesource.com/chromium/src/+/290ed8046692651ce76088914750cb659b65fb17%5E%21/chrome/browser/extensions/extension_service.cc?pli=1#
test.skip('chromium' !== mcpBrowser, '--load-extension is not supported for official builds of Chromium');
const pathToExtension = fileURLToPath(new URL('../dist', import.meta.url));
let browserContext: BrowserContext | undefined;
const userDataDir = testInfo.outputPath('extension-user-data-dir');
await use({
userDataDir,
launch: async (mode?: 'disable-extension') => {
browserContext = await chromium.launchPersistentContext(userDataDir, {
channel: mcpBrowser,
// Opening the browser singleton only works in headed mode.
headless: false,
// Automation disables singleton browser process behavior, which is necessary for the extension.
ignoreDefaultArgs: ['--enable-automation'],
args: mode === 'disable-extension' ? [] : [
`--disable-extensions-except=${pathToExtension}`,
`--load-extension=${pathToExtension}`,
],
});
// for manifest v3:
let [serviceWorker] = browserContext.serviceWorkers();
if (!serviceWorker)
serviceWorker = await browserContext.waitForEvent('serviceworker');
return browserContext;
}
});
await browserContext?.close();
},
});
async function startAndCallConnectTool(browserWithExtension: BrowserWithExtension, startClient: StartClient): Promise<Client> {
const { client } = await startClient({
args: [`--connect-tool`],
config: {
browser: {
userDataDir: browserWithExtension.userDataDir,
}
},
});
expect(await client.callTool({
name: 'browser_connect',
arguments: {
name: 'extension'
}
})).toHaveResponse({
result: 'Successfully changed connection method.',
});
return client;
}
async function startWithExtensionFlag(browserWithExtension: BrowserWithExtension, startClient: StartClient): Promise<Client> {
const { client } = await startClient({
args: [`--extension`],
config: {
browser: {
userDataDir: browserWithExtension.userDataDir,
}
},
});
return client;
}
for (const [mode, startClientMethod] of [
['connect-tool', startAndCallConnectTool],
['extension-flag', startWithExtensionFlag],
] as const) {
test(`navigate with extension (${mode})`, async ({ browserWithExtension, startClient, server }) => {
const browserContext = await browserWithExtension.launch();
const client = await startClientMethod(browserWithExtension, startClient);
const confirmationPagePromise = browserContext.waitForEvent('page', page => {
return page.url().startsWith('chrome-extension://jakfalbnbhgkpmoaakfflhflbfpkailf/connect.html');
});
const navigateResponse = client.callTool({
name: 'browser_navigate',
arguments: { url: server.HELLO_WORLD },
});
const selectorPage = await confirmationPagePromise;
await selectorPage.locator('.tab-item', { hasText: 'Playwright MCP Extension' }).getByRole('button', { name: 'Connect' }).click();
expect(await navigateResponse).toHaveResponse({
pageState: expect.stringContaining(`- generic [active] [ref=e1]: Hello, world!`),
});
});
test(`snapshot of an existing page (${mode})`, async ({ browserWithExtension, startClient, server }) => {
const browserContext = await browserWithExtension.launch();
const page = await browserContext.newPage();
await page.goto(server.HELLO_WORLD);
// Another empty page.
await browserContext.newPage();
expect(browserContext.pages()).toHaveLength(3);
const client = await startClientMethod(browserWithExtension, startClient);
expect(browserContext.pages()).toHaveLength(3);
const confirmationPagePromise = browserContext.waitForEvent('page', page => {
return page.url().startsWith('chrome-extension://jakfalbnbhgkpmoaakfflhflbfpkailf/connect.html');
});
const navigateResponse = client.callTool({
name: 'browser_snapshot',
arguments: { },
});
const selectorPage = await confirmationPagePromise;
expect(browserContext.pages()).toHaveLength(4);
await selectorPage.locator('.tab-item', { hasText: 'Title' }).getByRole('button', { name: 'Connect' }).click();
expect(await navigateResponse).toHaveResponse({
pageState: expect.stringContaining(`- generic [active] [ref=e1]: Hello, world!`),
});
expect(browserContext.pages()).toHaveLength(4);
});
test(`extension not installed timeout (${mode})`, async ({ browserWithExtension, startClient, server }) => {
process.env.PWMCP_TEST_CONNECTION_TIMEOUT = '100';
const browserContext = await browserWithExtension.launch();
const client = await startClientMethod(browserWithExtension, startClient);
const confirmationPagePromise = browserContext.waitForEvent('page', page => {
return page.url().startsWith('chrome-extension://jakfalbnbhgkpmoaakfflhflbfpkailf/connect.html');
});
expect(await client.callTool({
name: 'browser_navigate',
arguments: { url: server.HELLO_WORLD },
})).toHaveResponse({
result: expect.stringContaining('Extension connection timeout. Make sure the "Playwright MCP Bridge" extension is installed.'),
isError: true,
});
await confirmationPagePromise;
// Assigning undefined would store the string 'undefined', so remove the key instead.
delete process.env.PWMCP_TEST_CONNECTION_TIMEOUT;
});
}
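
Note on the timeout test above: it overrides `PWMCP_TEST_CONNECTION_TIMEOUT` for one test and resets it on the last line, which is skipped if an assertion throws first. In Node.js, values assigned to `process.env` are coerced to strings, so a key is only really removed with `delete`. One possible way to make the override self-restoring is sketched below. `setEnv` and `Env` are hypothetical names, not part of the test fixtures.

```typescript
// Illustrative sketch only; setEnv and Env are hypothetical names.
type Env = Record<string, string | undefined>;

// Sets env[key] and returns a function that restores the previous state.
function setEnv(env: Env, key: string, value: string): () => void {
  const had = Object.prototype.hasOwnProperty.call(env, key);
  const previous = env[key];
  env[key] = value;
  return () => {
    if (had)
      env[key] = previous;
    else
      delete env[key]; // remove the key rather than assigning undefined
  };
}
```

In a test this could be used as `const restore = setEnv(process.env, 'PWMCP_TEST_CONNECTION_TIMEOUT', '100');` followed by a `try { ... } finally { restore(); }` around the assertions.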

21
extension/tsconfig.json Normal file

@@ -0,0 +1,21 @@
{
"compilerOptions": {
"target": "ESNext",
"esModuleInterop": true,
"moduleResolution": "node",
"strict": true,
"module": "ESNext",
"rootDir": "src",
"outDir": "./dist/lib",
"resolveJsonModule": true,
"types": ["chrome"],
"jsx": "react-jsx",
"jsxImportSource": "react"
},
"include": [
"src",
],
"exclude": [
"src/ui",
]
}


@@ -0,0 +1,19 @@
{
"compilerOptions": {
"target": "ESNext",
"esModuleInterop": true,
"moduleResolution": "node",
"strict": true,
"module": "ESNext",
"rootDir": "src",
"outDir": "./lib",
"resolveJsonModule": true,
"types": ["chrome"],
"jsx": "react-jsx",
"jsxImportSource": "react",
"noEmit": true,
},
"include": [
"src/ui",
],
}

54
extension/vite.config.ts Normal file

@@ -0,0 +1,54 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { resolve } from 'path';
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import { viteStaticCopy } from 'vite-plugin-static-copy';
// https://vitejs.dev/config/
export default defineConfig({
plugins: [
react(),
viteStaticCopy({
targets: [
{
src: '../../icons/*',
dest: 'icons'
},
{
src: '../../manifest.json',
dest: '.'
}
]
})
],
root: resolve(__dirname, 'src/ui'),
build: {
outDir: resolve(__dirname, 'dist/'),
emptyOutDir: false,
minify: false,
rollupOptions: {
input: ['src/ui/connect.html', 'src/ui/status.html'],
output: {
manualChunks: undefined,
entryFileNames: 'lib/ui/[name].js',
chunkFileNames: 'lib/ui/[name].js',
assetFileNames: 'lib/ui/[name].[ext]'
}
}
}
});

25
index.d.ts vendored

@@ -15,26 +15,9 @@
* limitations under the License.
*/
import type { LaunchOptions } from 'playwright';
import type { Server } from '@modelcontextprotocol/sdk/server/index.js';
import type { Config } from './config.js';
import type { BrowserContext } from 'playwright';
type Options = {
/**
* Path to the user data directory.
*/
userDataDir?: string;
/**
* Launch options for the browser.
*/
launchOptions?: LaunchOptions;
/**
* Use screenshots instead of snapshots. Less accurate, reliable and overall
* slower, but contains visual representation of the page.
* @default false
*/
vision?: boolean;
};
export function createServer(options?: Options): Server;
export declare function createConnection(config?: Config, contextGetter?: () => Promise<BrowserContext>): Promise<Server>;
export {};


@@ -15,5 +15,5 @@
* limitations under the License.
*/
const { createServer } = require('./lib/index');
module.exports = { createServer };
import { createConnection } from './lib/index.js';
export { createConnection };

1009
package-lock.json generated

File diff suppressed because it is too large


@@ -1,7 +1,8 @@
{
"name": "@playwright/mcp",
"version": "0.0.7",
"version": "0.0.34",
"description": "Playwright Tools for MCP",
"type": "module",
"repository": {
"type": "git",
"url": "git+https://github.com/microsoft/playwright-mcp.git"
@@ -16,9 +17,16 @@
"license": "Apache-2.0",
"scripts": {
"build": "tsc",
"lint": "eslint .",
"lint": "npm run update-readme && npm run check-deps && eslint . && tsc --noEmit",
"lint-fix": "eslint . --fix",
"check-deps": "node utils/check-deps.js",
"update-readme": "node utils/update-readme.js",
"watch": "tsc --watch",
"test": "playwright test",
"ctest": "playwright test --project=chrome",
"ftest": "playwright test --project=firefox",
"wtest": "playwright test --project=webkit",
"run-server": "node lib/browserServer.js",
"clean": "rm -rf lib",
"npm-publish": "npm run clean && npm run build && npm run test && npm publish"
},
@@ -30,23 +38,34 @@
}
},
"dependencies": {
"@modelcontextprotocol/sdk": "^1.6.1",
"@modelcontextprotocol/sdk": "^1.16.0",
"commander": "^13.1.0",
"playwright": "1.52.0-alpha-1743011787000",
"debug": "^4.4.1",
"dotenv": "^17.2.0",
"mime": "^4.0.7",
"playwright": "1.55.0-alpha-2025-08-12",
"playwright-core": "1.55.0-alpha-2025-08-12",
"ws": "^8.18.1",
"zod": "^3.24.1",
"zod-to-json-schema": "^3.24.4"
},
"devDependencies": {
"@anthropic-ai/sdk": "^0.57.0",
"@eslint/eslintrc": "^3.2.0",
"@eslint/js": "^9.19.0",
"@playwright/test": "1.52.0-alpha-1743011787000",
"@playwright/test": "1.55.0-alpha-2025-08-12",
"@stylistic/eslint-plugin": "^3.0.1",
"@types/debug": "^4.1.12",
"@types/node": "^22.13.10",
"@types/ws": "^8.18.1",
"@typescript-eslint/eslint-plugin": "^8.26.1",
"@typescript-eslint/parser": "^8.26.1",
"@typescript-eslint/utils": "^8.26.1",
"@types/node": "^22.13.10",
"esbuild": "^0.20.1",
"eslint": "^9.19.0",
"eslint-plugin-import": "^2.31.0",
"eslint-plugin-notice": "^1.0.0",
"openai": "^5.10.2",
"typescript": "^5.8.2"
},
"bin": {


@@ -16,12 +16,27 @@
import { defineConfig } from '@playwright/test';
export default defineConfig({
import type { TestOptions } from './tests/fixtures.js';
export default defineConfig<TestOptions>({
testDir: './tests',
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
workers: process.env.CI ? 2 : undefined,
reporter: 'list',
projects: [{ name: 'default' }],
projects: [
{ name: 'chrome' },
{ name: 'chromium', use: { mcpBrowser: 'chromium' } },
...process.env.MCP_IN_DOCKER ? [{
name: 'chromium-docker',
grep: /browser_navigate|browser_click/,
use: {
mcpBrowser: 'chromium',
mcpMode: 'docker' as const
}
}] : [],
{ name: 'firefox', use: { mcpBrowser: 'firefox' } },
{ name: 'webkit', use: { mcpBrowser: 'webkit' } },
... process.platform === 'win32' ? [{ name: 'msedge', use: { mcpBrowser: 'msedge' } }] : [],
],
});
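
Note on the `projects` array above: optional projects are added with the conditional spread idiom `...condition ? [project] : []`, so the array literal stays flat whether or not the Docker and Edge projects apply. The sketch below shows the same idiom as a plain function with the conditions passed in as arguments. `buildProjects` and `ProjectSpec` are hypothetical names, and the `grep` filter from the config is omitted for brevity.

```typescript
// Illustrative sketch only; buildProjects and ProjectSpec are hypothetical names.
type ProjectSpec = {
  name: string;
  use?: { mcpBrowser?: string; mcpMode?: 'docker' };
};

function buildProjects(options: { inDocker: boolean; platform: string }): ProjectSpec[] {
  // Each optional group is either a one-element array or an empty one.
  const dockerProjects: ProjectSpec[] = options.inDocker
    ? [{ name: 'chromium-docker', use: { mcpBrowser: 'chromium', mcpMode: 'docker' } }]
    : [];
  const edgeProjects: ProjectSpec[] = options.platform === 'win32'
    ? [{ name: 'msedge', use: { mcpBrowser: 'msedge' } }]
    : [];
  return [
    { name: 'chrome' },
    { name: 'chromium', use: { mcpBrowser: 'chromium' } },
    ...dockerProjects,
    { name: 'firefox', use: { mcpBrowser: 'firefox' } },
    { name: 'webkit', use: { mcpBrowser: 'webkit' } },
    ...edgeProjects,
  ];
}
```

Spreading an empty array contributes nothing, which is what lets the optional entries sit inline at their intended position in the list.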

7
src/DEPS.list Normal file

@@ -0,0 +1,7 @@
[*]
./tools/
./mcp/
./utils/
[program.ts]
***

172
src/actions.d.ts vendored Normal file

@@ -0,0 +1,172 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
type Point = { x: number, y: number };
export type ActionName =
'check' |
'click' |
'closePage' |
'fill' |
'navigate' |
'openPage' |
'press' |
'select' |
'uncheck' |
'setInputFiles' |
'assertText' |
'assertValue' |
'assertChecked' |
'assertVisible' |
'assertSnapshot';
export type ActionBase = {
name: ActionName,
signals: Signal[],
ariaSnapshot?: string,
};
export type ActionWithSelector = ActionBase & {
selector: string,
ref?: string,
};
export type ClickAction = ActionWithSelector & {
name: 'click',
button: 'left' | 'middle' | 'right',
modifiers: number,
clickCount: number,
position?: Point,
};
export type CheckAction = ActionWithSelector & {
name: 'check',
};
export type UncheckAction = ActionWithSelector & {
name: 'uncheck',
};
export type FillAction = ActionWithSelector & {
name: 'fill',
text: string,
};
export type NavigateAction = ActionBase & {
name: 'navigate',
url: string,
};
export type OpenPageAction = ActionBase & {
name: 'openPage',
url: string,
};
export type ClosesPageAction = ActionBase & {
name: 'closePage',
};
export type PressAction = ActionWithSelector & {
name: 'press',
key: string,
modifiers: number,
};
export type SelectAction = ActionWithSelector & {
name: 'select',
options: string[],
};
export type SetInputFilesAction = ActionWithSelector & {
name: 'setInputFiles',
files: string[],
};
export type AssertTextAction = ActionWithSelector & {
name: 'assertText',
text: string,
substring: boolean,
};
export type AssertValueAction = ActionWithSelector & {
name: 'assertValue',
value: string,
};
export type AssertCheckedAction = ActionWithSelector & {
name: 'assertChecked',
checked: boolean,
};
export type AssertVisibleAction = ActionWithSelector & {
name: 'assertVisible',
};
export type AssertSnapshotAction = ActionWithSelector & {
name: 'assertSnapshot',
ariaSnapshot: string,
};
export type Action = ClickAction | CheckAction | ClosesPageAction | OpenPageAction | UncheckAction | FillAction | NavigateAction | PressAction | SelectAction | SetInputFilesAction | AssertTextAction | AssertValueAction | AssertCheckedAction | AssertVisibleAction | AssertSnapshotAction;
export type AssertAction = AssertCheckedAction | AssertValueAction | AssertTextAction | AssertVisibleAction | AssertSnapshotAction;
export type PerformOnRecordAction = ClickAction | CheckAction | UncheckAction | PressAction | SelectAction;
// Signals.
export type BaseSignal = {
};
export type NavigationSignal = BaseSignal & {
name: 'navigation',
url: string,
};
export type PopupSignal = BaseSignal & {
name: 'popup',
popupAlias: string,
};
export type DownloadSignal = BaseSignal & {
name: 'download',
downloadAlias: string,
};
export type DialogSignal = BaseSignal & {
name: 'dialog',
dialogAlias: string,
};
export type Signal = NavigationSignal | PopupSignal | DownloadSignal | DialogSignal;
export type FrameDescription = {
pageGuid: string;
pageAlias: string;
framePath: string[];
};
export type ActionInContext = {
frame: FrameDescription;
description?: string;
action: Action;
startTime: number;
endTime?: number;
};
export type SignalInContext = {
frame: FrameDescription;
signal: Signal;
timestamp: number;
};
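
The action types above form a discriminated union keyed on `name`, so a consumer can narrow to a concrete action with a plain `switch`. The following is a minimal sketch of that idea, not code from this commit: `ToyClick`, `ToyFill`, `ToyNavigate`, `ToyAction` and `describeAction` are hypothetical, trimmed-down stand-ins for the real `ClickAction`, `FillAction`, `NavigateAction` and `Action` types.

```typescript
// Hypothetical, reduced versions of the action shapes declared in actions.d.ts.
type ToyClick = { name: 'click'; selector: string; clickCount: number };
type ToyFill = { name: 'fill'; selector: string; text: string };
type ToyNavigate = { name: 'navigate'; url: string };
type ToyAction = ToyClick | ToyFill | ToyNavigate;

// The `name` literal acts as the discriminant: inside each case the compiler
// knows which fields exist, so `action.text` is only reachable for 'fill'.
function describeAction(action: ToyAction): string {
  switch (action.name) {
    case 'click':
      return `click ${action.selector} x${action.clickCount}`;
    case 'fill':
      return `fill ${action.selector} with "${action.text}"`;
    case 'navigate':
      return `navigate to ${action.url}`;
  }
}
```

Adding a new member to the union without a matching `case` would make the function fail to type-check under strict settings, which is the main reason to prefer this shape over optional fields on a single type.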


@@ -0,0 +1,253 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import fs from 'fs';
import net from 'net';
import path from 'path';
import * as playwright from 'playwright';
// @ts-ignore
import { registryDirectory } from 'playwright-core/lib/server/registry/index';
// @ts-ignore
import { startTraceViewerServer } from 'playwright-core/lib/server';
import { logUnhandledError, testDebug } from './utils/log.js';
import { createHash } from './utils/guid.js';
import { outputFile } from './config.js';
import type { FullConfig } from './config.js';
export function contextFactory(config: FullConfig): BrowserContextFactory {
if (config.browser.remoteEndpoint)
return new RemoteContextFactory(config);
if (config.browser.cdpEndpoint)
return new CdpContextFactory(config);
if (config.browser.isolated)
return new IsolatedContextFactory(config);
return new PersistentContextFactory(config);
}
export type ClientInfo = { name?: string, version?: string, rootPath?: string };
export interface BrowserContextFactory {
readonly name: string;
readonly description: string;
createContext(clientInfo: ClientInfo, abortSignal: AbortSignal): Promise<{ browserContext: playwright.BrowserContext, close: () => Promise<void> }>;
}
class BaseContextFactory implements BrowserContextFactory {
readonly name: string;
readonly description: string;
readonly config: FullConfig;
protected _browserPromise: Promise<playwright.Browser> | undefined;
constructor(name: string, description: string, config: FullConfig) {
this.name = name;
this.description = description;
this.config = config;
}
protected async _obtainBrowser(clientInfo: ClientInfo): Promise<playwright.Browser> {
if (this._browserPromise)
return this._browserPromise;
testDebug(`obtain browser (${this.name})`);
this._browserPromise = this._doObtainBrowser(clientInfo);
void this._browserPromise.then(browser => {
browser.on('disconnected', () => {
this._browserPromise = undefined;
});
}).catch(() => {
this._browserPromise = undefined;
});
return this._browserPromise;
}
protected async _doObtainBrowser(clientInfo: ClientInfo): Promise<playwright.Browser> {
throw new Error('Not implemented');
}
async createContext(clientInfo: ClientInfo): Promise<{ browserContext: playwright.BrowserContext, close: () => Promise<void> }> {
testDebug(`create browser context (${this.name})`);
const browser = await this._obtainBrowser(clientInfo);
const browserContext = await this._doCreateContext(browser);
return { browserContext, close: () => this._closeBrowserContext(browserContext, browser) };
}
protected async _doCreateContext(browser: playwright.Browser): Promise<playwright.BrowserContext> {
throw new Error('Not implemented');
}
private async _closeBrowserContext(browserContext: playwright.BrowserContext, browser: playwright.Browser) {
testDebug(`close browser context (${this.name})`);
if (browser.contexts().length === 1)
this._browserPromise = undefined;
await browserContext.close().catch(logUnhandledError);
if (browser.contexts().length === 0) {
testDebug(`close browser (${this.name})`);
await browser.close().catch(logUnhandledError);
}
}
}
class IsolatedContextFactory extends BaseContextFactory {
constructor(config: FullConfig) {
super('isolated', 'Create a new isolated browser context', config);
}
protected override async _doObtainBrowser(clientInfo: ClientInfo): Promise<playwright.Browser> {
await injectCdpPort(this.config.browser);
const browserType = playwright[this.config.browser.browserName];
return browserType.launch({
tracesDir: await startTraceServer(this.config, clientInfo.rootPath),
...this.config.browser.launchOptions,
handleSIGINT: false,
handleSIGTERM: false,
}).catch(error => {
if (error.message.includes('Executable doesn\'t exist'))
throw new Error(`Browser specified in your config is not installed. Either install it (likely) or change the config.`);
throw error;
});
}
protected override async _doCreateContext(browser: playwright.Browser): Promise<playwright.BrowserContext> {
return browser.newContext(this.config.browser.contextOptions);
}
}
class CdpContextFactory extends BaseContextFactory {
constructor(config: FullConfig) {
super('cdp', 'Connect to a browser over CDP', config);
}
protected override async _doObtainBrowser(): Promise<playwright.Browser> {
return playwright.chromium.connectOverCDP(this.config.browser.cdpEndpoint!);
}
protected override async _doCreateContext(browser: playwright.Browser): Promise<playwright.BrowserContext> {
return this.config.browser.isolated ? await browser.newContext() : browser.contexts()[0];
}
}
class RemoteContextFactory extends BaseContextFactory {
constructor(config: FullConfig) {
super('remote', 'Connect to a browser using a remote endpoint', config);
}
protected override async _doObtainBrowser(): Promise<playwright.Browser> {
const url = new URL(this.config.browser.remoteEndpoint!);
url.searchParams.set('browser', this.config.browser.browserName);
if (this.config.browser.launchOptions)
url.searchParams.set('launch-options', JSON.stringify(this.config.browser.launchOptions));
return playwright[this.config.browser.browserName].connect(String(url));
}
protected override async _doCreateContext(browser: playwright.Browser): Promise<playwright.BrowserContext> {
return browser.newContext();
}
}
class PersistentContextFactory implements BrowserContextFactory {
readonly config: FullConfig;
readonly name = 'persistent';
readonly description = 'Create a new persistent browser context';
private _userDataDirs = new Set<string>();
constructor(config: FullConfig) {
this.config = config;
}
async createContext(clientInfo: ClientInfo): Promise<{ browserContext: playwright.BrowserContext, close: () => Promise<void> }> {
await injectCdpPort(this.config.browser);
testDebug('create browser context (persistent)');
const userDataDir = this.config.browser.userDataDir ?? await this._createUserDataDir(clientInfo.rootPath);
const tracesDir = await startTraceServer(this.config, clientInfo.rootPath);
this._userDataDirs.add(userDataDir);
testDebug('lock user data dir', userDataDir);
const browserType = playwright[this.config.browser.browserName];
for (let i = 0; i < 5; i++) {
try {
const browserContext = await browserType.launchPersistentContext(userDataDir, {
tracesDir,
...this.config.browser.launchOptions,
...this.config.browser.contextOptions,
handleSIGINT: false,
handleSIGTERM: false,
});
const close = () => this._closeBrowserContext(browserContext, userDataDir);
return { browserContext, close };
} catch (error: any) {
if (error.message.includes('Executable doesn\'t exist'))
throw new Error(`Browser specified in your config is not installed. Either install it (likely) or change the config.`);
if (error.message.includes('ProcessSingleton') || error.message.includes('Invalid URL')) {
// User data directory is already in use, try again.
await new Promise(resolve => setTimeout(resolve, 1000));
continue;
}
throw error;
}
}
throw new Error(`Browser is already in use for ${userDataDir}, use --isolated to run multiple instances of the same browser`);
}
private async _closeBrowserContext(browserContext: playwright.BrowserContext, userDataDir: string) {
testDebug('close browser context (persistent)');
testDebug('release user data dir', userDataDir);
await browserContext.close().catch(() => {});
this._userDataDirs.delete(userDataDir);
testDebug('close browser context complete (persistent)');
}
private async _createUserDataDir(rootPath: string | undefined) {
const dir = process.env.PWMCP_PROFILES_DIR_FOR_TEST ?? registryDirectory;
const browserToken = this.config.browser.launchOptions?.channel ?? this.config.browser?.browserName;
// Avoid putting hundreds of profile files into the user's workspace: the root path is only hashed into the directory name.
const rootPathToken = rootPath ? `-${createHash(rootPath)}` : '';
const result = path.join(dir, `mcp-${browserToken}${rootPathToken}`);
await fs.promises.mkdir(result, { recursive: true });
return result;
}
}
async function injectCdpPort(browserConfig: FullConfig['browser']) {
if (browserConfig.browserName === 'chromium')
(browserConfig.launchOptions as any).cdpPort = await findFreePort();
}
async function findFreePort(): Promise<number> {
return new Promise((resolve, reject) => {
const server = net.createServer();
server.listen(0, () => {
const { port } = server.address() as net.AddressInfo;
server.close(() => resolve(port));
});
server.on('error', reject);
});
}
async function startTraceServer(config: FullConfig, rootPath: string | undefined): Promise<string | undefined> {
if (!config.saveTrace)
return undefined;
const tracesDir = await outputFile(config, rootPath, `traces-${Date.now()}`);
const server = await startTraceViewerServer();
const urlPrefix = server.urlPrefix('human-readable');
const url = urlPrefix + '/trace/index.html?trace=' + tracesDir + '/trace.json';
// eslint-disable-next-line no-console
console.error('\nTrace viewer listening on ' + url);
return tracesDir;
}
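
The `contextFactory` function in this file picks a factory by checking the browser config in a fixed order: remote endpoint first, then CDP endpoint, then the isolated flag, and finally the persistent default. A minimal sketch of that precedence follows, assuming a reduced config shape; `ToyBrowserConfig`, `FactoryKind` and `pickFactoryKind` are hypothetical names used only for illustration, and the returned strings mirror the `name` values the factories pass to their constructors.

```typescript
// Hypothetical subset of FullConfig['browser'] that drives the selection.
type ToyBrowserConfig = {
  remoteEndpoint?: string;
  cdpEndpoint?: string;
  isolated?: boolean;
};

type FactoryKind = 'remote' | 'cdp' | 'isolated' | 'persistent';

// Same order of checks as contextFactory(): the first matching option wins,
// so a remote endpoint takes priority even when `isolated` is also set.
function pickFactoryKind(browser: ToyBrowserConfig): FactoryKind {
  if (browser.remoteEndpoint)
    return 'remote';
  if (browser.cdpEndpoint)
    return 'cdp';
  if (browser.isolated)
    return 'isolated';
  return 'persistent';
}
```

Note that in the source the CDP factory still consults `isolated` when creating a context, so the flag is not ignored in that branch; it only stops selecting the isolated factory.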


@@ -0,0 +1,92 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { fileURLToPath } from 'url';
import { FullConfig } from './config.js';
import { Context } from './context.js';
import { logUnhandledError } from './utils/log.js';
import { Response } from './response.js';
import { SessionLog } from './sessionLog.js';
import { filteredTools } from './tools.js';
import { packageJSON } from './utils/package.js';
import { toMcpTool } from './mcp/tool.js';
import type { Tool } from './tools/tool.js';
import type { BrowserContextFactory } from './browserContextFactory.js';
import type * as mcpServer from './mcp/server.js';
import type { ServerBackend } from './mcp/server.js';
export class BrowserServerBackend implements ServerBackend {
name = 'Playwright';
version = packageJSON.version;
private _tools: Tool[];
private _context: Context | undefined;
private _sessionLog: SessionLog | undefined;
private _config: FullConfig;
private _browserContextFactory: BrowserContextFactory;
constructor(config: FullConfig, factory: BrowserContextFactory) {
this._config = config;
this._browserContextFactory = factory;
this._tools = filteredTools(config);
}
async initialize(clientVersion: mcpServer.ClientVersion, roots: mcpServer.Root[]): Promise<void> {
let rootPath: string | undefined;
if (roots.length > 0) {
const firstRootUri = roots[0]?.uri;
const url = firstRootUri ? new URL(firstRootUri) : undefined;
rootPath = url ? fileURLToPath(url) : undefined;
}
this._sessionLog = this._config.saveSession ? await SessionLog.create(this._config, rootPath) : undefined;
this._context = new Context({
tools: this._tools,
config: this._config,
browserContextFactory: this._browserContextFactory,
sessionLog: this._sessionLog,
clientInfo: { ...clientVersion, rootPath },
});
}
async listTools(): Promise<mcpServer.Tool[]> {
return this._tools.map(tool => toMcpTool(tool.schema));
}
async callTool(name: string, rawArguments: mcpServer.CallToolRequest['params']['arguments']) {
const tool = this._tools.find(tool => tool.schema.name === name);
if (!tool)
throw new Error(`Tool "${name}" not found`);
const parsedArguments = tool.schema.inputSchema.parse(rawArguments || {});
const context = this._context!;
const response = new Response(context, name, parsedArguments);
context.setRunningTool(true);
try {
await tool.handle(context, parsedArguments, response);
await response.finish();
this._sessionLog?.logResponse(response);
} catch (error: any) {
response.addError(String(error));
} finally {
context.setRunningTool(false);
}
return response.serialize();
}
serverClosed() {
void this._context?.dispose().catch(logUnhandledError);
}
}
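
In `callTool` above, an unknown tool name and invalid arguments throw to the caller, while errors raised by the tool handler are captured in the response. The sketch below illustrates that split under simplified assumptions: `ToyTool`, `ToyResult` and `callToyTool` are hypothetical, synchronous stand-ins, not the real `Tool`, `Response` or MCP types.

```typescript
// Hypothetical, synchronous stand-ins for a tool and its serialized response.
type ToyTool = {
  name: string;
  parse: (raw: unknown) => number;   // plays the role of inputSchema.parse
  handle: (arg: number) => string;   // plays the role of tool.handle
};
type ToyResult = { content: string; isError: boolean };

function callToyTool(tools: ToyTool[], name: string, raw: unknown): ToyResult {
  const tool = tools.find(t => t.name === name);
  if (!tool)
    throw new Error(`Tool "${name}" not found`);
  // Validation happens outside the try block, so bad input propagates as a throw.
  const arg = tool.parse(raw);
  try {
    return { content: tool.handle(arg), isError: false };
  } catch (error) {
    // Handler failures become part of the result instead of rejecting the call.
    return { content: String(error), isError: true };
  }
}

const parseNumber = (raw: unknown): number => {
  if (typeof raw !== 'number')
    throw new TypeError('expected a number');
  return raw;
};

const toyTools: ToyTool[] = [
  { name: 'double', parse: parseNumber, handle: n => String(n * 2) },
  { name: 'fail', parse: parseNumber, handle: () => { throw new Error('boom'); } },
];
```

Reporting handler failures as data lets a client see what went wrong in the tool output, while a missing tool or malformed request still surfaces as a protocol-level error.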

320
src/config.ts Normal file

@@ -0,0 +1,320 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import fs from 'fs';
import os from 'os';
import path from 'path';
import { devices } from 'playwright';
import { sanitizeForFilePath } from './utils/fileUtils.js';
import type { Config, ToolCapability } from '../config.js';
import type { BrowserContextOptions, LaunchOptions } from 'playwright';
export type CLIOptions = {
allowedOrigins?: string[];
blockedOrigins?: string[];
blockServiceWorkers?: boolean;
browser?: string;
caps?: string[];
cdpEndpoint?: string;
config?: string;
device?: string;
executablePath?: string;
headless?: boolean;
host?: string;
ignoreHttpsErrors?: boolean;
isolated?: boolean;
imageResponses?: 'allow' | 'omit';
sandbox?: boolean;
outputDir?: string;
port?: number;
proxyBypass?: string;
proxyServer?: string;
saveSession?: boolean;
saveTrace?: boolean;
storageState?: string;
userAgent?: string;
userDataDir?: string;
viewportSize?: string;
};
const defaultConfig: FullConfig = {
browser: {
browserName: 'chromium',
launchOptions: {
channel: 'chrome',
headless: os.platform() === 'linux' && !process.env.DISPLAY,
chromiumSandbox: true,
},
contextOptions: {
viewport: null,
},
},
network: {
allowedOrigins: undefined,
blockedOrigins: undefined,
},
server: {},
saveTrace: false,
};
type BrowserUserConfig = NonNullable<Config['browser']>;
export type FullConfig = Config & {
browser: Omit<BrowserUserConfig, 'browserName'> & {
browserName: 'chromium' | 'firefox' | 'webkit';
launchOptions: NonNullable<BrowserUserConfig['launchOptions']>;
contextOptions: NonNullable<BrowserUserConfig['contextOptions']>;
},
network: NonNullable<Config['network']>,
saveTrace: boolean;
server: NonNullable<Config['server']>,
};
export async function resolveConfig(config: Config): Promise<FullConfig> {
return mergeConfig(defaultConfig, config);
}
export async function resolveCLIConfig(cliOptions: CLIOptions): Promise<FullConfig> {
const configInFile = await loadConfig(cliOptions.config);
const envOverrides = configFromEnv();
const cliOverrides = configFromCLIOptions(cliOptions);
let result = defaultConfig;
result = mergeConfig(result, configInFile);
result = mergeConfig(result, envOverrides);
result = mergeConfig(result, cliOverrides);
return result;
}
export function configFromCLIOptions(cliOptions: CLIOptions): Config {
let browserName: 'chromium' | 'firefox' | 'webkit' | undefined;
let channel: string | undefined;
switch (cliOptions.browser) {
case 'chrome':
case 'chrome-beta':
case 'chrome-canary':
case 'chrome-dev':
case 'chromium':
case 'msedge':
case 'msedge-beta':
case 'msedge-canary':
case 'msedge-dev':
browserName = 'chromium';
channel = cliOptions.browser;
break;
case 'firefox':
browserName = 'firefox';
break;
case 'webkit':
browserName = 'webkit';
break;
}
// Launch options
const launchOptions: LaunchOptions = {
channel,
executablePath: cliOptions.executablePath,
headless: cliOptions.headless,
};
// --no-sandbox was passed, disable the sandbox
if (cliOptions.sandbox === false)
launchOptions.chromiumSandbox = false;
if (cliOptions.proxyServer) {
launchOptions.proxy = {
server: cliOptions.proxyServer
};
if (cliOptions.proxyBypass)
launchOptions.proxy.bypass = cliOptions.proxyBypass;
}
if (cliOptions.device && cliOptions.cdpEndpoint)
throw new Error('Device emulation is not supported with cdpEndpoint.');
// Context options
const contextOptions: BrowserContextOptions = cliOptions.device ? devices[cliOptions.device] : {};
if (cliOptions.storageState)
contextOptions.storageState = cliOptions.storageState;
if (cliOptions.userAgent)
contextOptions.userAgent = cliOptions.userAgent;
if (cliOptions.viewportSize) {
try {
const [width, height] = cliOptions.viewportSize.split(',').map(n => +n);
if (isNaN(width) || isNaN(height))
throw new Error('bad values');
contextOptions.viewport = { width, height };
} catch (e) {
throw new Error('Invalid viewport size format: use "width,height", for example --viewport-size="800,600"');
}
}
if (cliOptions.ignoreHttpsErrors)
contextOptions.ignoreHTTPSErrors = true;
if (cliOptions.blockServiceWorkers)
contextOptions.serviceWorkers = 'block';
const result: Config = {
browser: {
browserName,
isolated: cliOptions.isolated,
userDataDir: cliOptions.userDataDir,
launchOptions,
contextOptions,
cdpEndpoint: cliOptions.cdpEndpoint,
},
server: {
port: cliOptions.port,
host: cliOptions.host,
},
capabilities: cliOptions.caps as ToolCapability[],
network: {
allowedOrigins: cliOptions.allowedOrigins,
blockedOrigins: cliOptions.blockedOrigins,
},
saveSession: cliOptions.saveSession,
saveTrace: cliOptions.saveTrace,
outputDir: cliOptions.outputDir,
imageResponses: cliOptions.imageResponses,
};
return result;
}
function configFromEnv(): Config {
const options: CLIOptions = {};
options.allowedOrigins = semicolonSeparatedList(process.env.PLAYWRIGHT_MCP_ALLOWED_ORIGINS);
options.blockedOrigins = semicolonSeparatedList(process.env.PLAYWRIGHT_MCP_BLOCKED_ORIGINS);
options.blockServiceWorkers = envToBoolean(process.env.PLAYWRIGHT_MCP_BLOCK_SERVICE_WORKERS);
options.browser = envToString(process.env.PLAYWRIGHT_MCP_BROWSER);
options.caps = commaSeparatedList(process.env.PLAYWRIGHT_MCP_CAPS);
options.cdpEndpoint = envToString(process.env.PLAYWRIGHT_MCP_CDP_ENDPOINT);
options.config = envToString(process.env.PLAYWRIGHT_MCP_CONFIG);
options.device = envToString(process.env.PLAYWRIGHT_MCP_DEVICE);
options.executablePath = envToString(process.env.PLAYWRIGHT_MCP_EXECUTABLE_PATH);
options.headless = envToBoolean(process.env.PLAYWRIGHT_MCP_HEADLESS);
options.host = envToString(process.env.PLAYWRIGHT_MCP_HOST);
options.ignoreHttpsErrors = envToBoolean(process.env.PLAYWRIGHT_MCP_IGNORE_HTTPS_ERRORS);
options.isolated = envToBoolean(process.env.PLAYWRIGHT_MCP_ISOLATED);
if (process.env.PLAYWRIGHT_MCP_IMAGE_RESPONSES === 'omit')
options.imageResponses = 'omit';
options.sandbox = envToBoolean(process.env.PLAYWRIGHT_MCP_SANDBOX);
options.outputDir = envToString(process.env.PLAYWRIGHT_MCP_OUTPUT_DIR);
options.port = envToNumber(process.env.PLAYWRIGHT_MCP_PORT);
options.proxyBypass = envToString(process.env.PLAYWRIGHT_MCP_PROXY_BYPASS);
options.proxyServer = envToString(process.env.PLAYWRIGHT_MCP_PROXY_SERVER);
options.saveTrace = envToBoolean(process.env.PLAYWRIGHT_MCP_SAVE_TRACE);
options.storageState = envToString(process.env.PLAYWRIGHT_MCP_STORAGE_STATE);
options.userAgent = envToString(process.env.PLAYWRIGHT_MCP_USER_AGENT);
options.userDataDir = envToString(process.env.PLAYWRIGHT_MCP_USER_DATA_DIR);
options.viewportSize = envToString(process.env.PLAYWRIGHT_MCP_VIEWPORT_SIZE);
return configFromCLIOptions(options);
}
async function loadConfig(configFile: string | undefined): Promise<Config> {
if (!configFile)
return {};
try {
return JSON.parse(await fs.promises.readFile(configFile, 'utf8'));
} catch (error) {
throw new Error(`Failed to load config file: ${configFile}, ${error}`);
}
}
export async function outputFile(config: FullConfig, rootPath: string | undefined, name: string): Promise<string> {
const outputDir = config.outputDir
?? (rootPath ? path.join(rootPath, '.playwright-mcp') : undefined)
?? path.join(os.tmpdir(), 'playwright-mcp-output', sanitizeForFilePath(new Date().toISOString()));
await fs.promises.mkdir(outputDir, { recursive: true });
const fileName = sanitizeForFilePath(name);
return path.join(outputDir, fileName);
}
function pickDefined<T extends object>(obj: T | undefined): Partial<T> {
return Object.fromEntries(
Object.entries(obj ?? {}).filter(([_, v]) => v !== undefined)
) as Partial<T>;
}
function mergeConfig(base: FullConfig, overrides: Config): FullConfig {
const browser: FullConfig['browser'] = {
...pickDefined(base.browser),
...pickDefined(overrides.browser),
browserName: overrides.browser?.browserName ?? base.browser?.browserName ?? 'chromium',
isolated: overrides.browser?.isolated ?? base.browser?.isolated ?? false,
launchOptions: {
...pickDefined(base.browser?.launchOptions),
...pickDefined(overrides.browser?.launchOptions),
...{ assistantMode: true },
},
contextOptions: {
...pickDefined(base.browser?.contextOptions),
...pickDefined(overrides.browser?.contextOptions),
},
};
if (browser.browserName !== 'chromium' && browser.launchOptions)
delete browser.launchOptions.channel;
return {
...pickDefined(base),
...pickDefined(overrides),
browser,
network: {
...pickDefined(base.network),
...pickDefined(overrides.network),
},
server: {
...pickDefined(base.server),
...pickDefined(overrides.server),
},
} as FullConfig;
}
export function semicolonSeparatedList(value: string | undefined): string[] | undefined {
if (!value)
return undefined;
return value.split(';').map(v => v.trim());
}
export function commaSeparatedList(value: string | undefined): string[] | undefined {
if (!value)
return undefined;
return value.split(',').map(v => v.trim());
}
function envToNumber(value: string | undefined): number | undefined {
if (!value)
return undefined;
return +value;
}
function envToBoolean(value: string | undefined): boolean | undefined {
if (value === 'true' || value === '1')
return true;
if (value === 'false' || value === '0')
return false;
return undefined;
}
function envToString(value: string | undefined): string | undefined {
return value ? value.trim() : undefined;
}
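
`resolveCLIConfig` layers defaults, the config file, environment variables and CLI options in that order, and `mergeConfig` runs each layer through `pickDefined` so that an unset option does not erase a value from an earlier layer. The sketch below shows why that filter matters. `pickDefined` is copied from the source; `ToyOptions`, `layer` and the sample layers are hypothetical and much smaller than the real `Config`.

```typescript
// Copied from config.ts: drops keys whose value is undefined.
function pickDefined<T extends object>(obj: T | undefined): Partial<T> {
  return Object.fromEntries(
      Object.entries(obj ?? {}).filter(([, v]) => v !== undefined)
  ) as Partial<T>;
}

// Hypothetical two-field stand-in for the launch options.
type ToyOptions = { headless?: boolean; channel?: string };

// Later layers win, but only for keys they actually define.
function layer(...layers: (ToyOptions | undefined)[]): ToyOptions {
  return layers.reduce<ToyOptions>((acc, l) => ({ ...acc, ...pickDefined(l) }), {});
}

const defaults: ToyOptions = { headless: false, channel: 'chrome' };
const fromFile: ToyOptions = { headless: true, channel: undefined };
const fromEnv: ToyOptions = { headless: undefined, channel: undefined };
const fromCli: ToyOptions = { channel: 'msedge' };

const merged = layer(defaults, fromFile, fromEnv, fromCli);
// A plain spread, for contrast: explicit undefined values overwrite earlier layers.
const naive: ToyOptions = { ...defaults, ...fromFile, ...fromEnv, ...fromCli };
```

Here `merged` keeps `headless: true` from the file layer and takes `channel: 'msedge'` from the CLI layer, whereas `naive.headless` ends up `undefined` because the environment layer spreads an explicit `undefined` over it. This matters in the source because `configFromCLIOptions` always emits every key, set or not.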


@@ -14,137 +14,263 @@
* limitations under the License.
*/
import debug from 'debug';
import * as playwright from 'playwright';
import { logUnhandledError } from './utils/log.js';
import { Tab } from './tab.js';
import { outputFile } from './config.js';
import type { FullConfig } from './config.js';
import type { Tool } from './tools/tool.js';
import type { BrowserContextFactory, ClientInfo } from './browserContextFactory.js';
import type * as actions from './actions.js';
import type { SessionLog } from './sessionLog.js';
const testDebug = debug('pw:mcp:test');
type ContextOptions = {
tools: Tool[];
config: FullConfig;
browserContextFactory: BrowserContextFactory;
sessionLog: SessionLog | undefined;
clientInfo: ClientInfo;
};
export class Context {
private _userDataDir: string;
private _launchOptions: playwright.LaunchOptions | undefined;
private _browser: playwright.Browser | undefined;
private _page: playwright.Page | undefined;
private _console: playwright.ConsoleMessage[] = [];
private _createPagePromise: Promise<playwright.Page> | undefined;
private _fileChooser: playwright.FileChooser | undefined;
private _lastSnapshotFrames: playwright.FrameLocator[] = [];
readonly tools: Tool[];
readonly config: FullConfig;
readonly sessionLog: SessionLog | undefined;
readonly options: ContextOptions;
private _browserContextPromise: Promise<{ browserContext: playwright.BrowserContext, close: () => Promise<void> }> | undefined;
private _browserContextFactory: BrowserContextFactory;
private _tabs: Tab[] = [];
private _currentTab: Tab | undefined;
private _clientInfo: ClientInfo;
constructor(userDataDir: string, launchOptions?: playwright.LaunchOptions) {
this._userDataDir = userDataDir;
this._launchOptions = launchOptions;
private static _allContexts: Set<Context> = new Set();
private _closeBrowserContextPromise: Promise<void> | undefined;
private _isRunningTool: boolean = false;
private _abortController = new AbortController();
constructor(options: ContextOptions) {
this.tools = options.tools;
this.config = options.config;
this.sessionLog = options.sessionLog;
this.options = options;
this._browserContextFactory = options.browserContextFactory;
this._clientInfo = options.clientInfo;
testDebug('create context');
Context._allContexts.add(this);
}
async createPage(): Promise<playwright.Page> {
if (this._createPagePromise)
return this._createPagePromise;
this._createPagePromise = (async () => {
const { browser, page } = await this._createPage();
page.on('console', event => this._console.push(event));
page.on('framenavigated', frame => {
if (!frame.parentFrame())
this._console.length = 0;
});
page.on('close', () => this._onPageClose());
page.on('filechooser', chooser => this._fileChooser = chooser);
page.setDefaultNavigationTimeout(60000);
page.setDefaultTimeout(5000);
this._page = page;
this._browser = browser;
return page;
})();
return this._createPagePromise;
static async disposeAll() {
await Promise.all([...Context._allContexts].map(context => context.dispose()));
}
private _onPageClose() {
const browser = this._browser;
const page = this._page;
void page?.context()?.close().then(() => browser?.close()).catch(() => {});
this._createPagePromise = undefined;
this._browser = undefined;
this._page = undefined;
this._fileChooser = undefined;
this._console.length = 0;
tabs(): Tab[] {
return this._tabs;
}
existingPage(): playwright.Page {
if (!this._page)
throw new Error('Navigate to a location to create a page');
return this._page;
currentTab(): Tab | undefined {
return this._currentTab;
}
async console(): Promise<playwright.ConsoleMessage[]> {
return this._console;
currentTabOrDie(): Tab {
if (!this._currentTab)
throw new Error('No open pages available. Use the "browser_navigate" tool to navigate to a page first.');
return this._currentTab;
}
async close() {
if (!this._page)
async newTab(): Promise<Tab> {
const { browserContext } = await this._ensureBrowserContext();
const page = await browserContext.newPage();
this._currentTab = this._tabs.find(t => t.page === page)!;
return this._currentTab;
}
async selectTab(index: number) {
const tab = this._tabs[index];
if (!tab)
throw new Error(`Tab ${index} not found`);
await tab.page.bringToFront();
this._currentTab = tab;
return tab;
}
async ensureTab(): Promise<Tab> {
const { browserContext } = await this._ensureBrowserContext();
if (!this._currentTab)
await browserContext.newPage();
return this._currentTab!;
}
async closeTab(index: number | undefined): Promise<string> {
const tab = index === undefined ? this._currentTab : this._tabs[index];
if (!tab)
throw new Error(`Tab ${index} not found`);
const url = tab.page.url();
await tab.page.close();
return url;
}
async outputFile(name: string): Promise<string> {
return outputFile(this.config, this._clientInfo.rootPath, name);
}
private _onPageCreated(page: playwright.Page) {
const tab = new Tab(this, page, tab => this._onPageClosed(tab));
this._tabs.push(tab);
if (!this._currentTab)
this._currentTab = tab;
}
private _onPageClosed(tab: Tab) {
const index = this._tabs.indexOf(tab);
if (index === -1)
return;
await this._page.close();
this._tabs.splice(index, 1);
if (this._currentTab === tab)
this._currentTab = this._tabs[Math.min(index, this._tabs.length - 1)];
if (!this._tabs.length)
void this.closeBrowserContext();
}
async submitFileChooser(paths: string[]) {
if (!this._fileChooser)
throw new Error('No file chooser visible');
await this._fileChooser.setFiles(paths);
this._fileChooser = undefined;
async closeBrowserContext() {
if (!this._closeBrowserContextPromise)
this._closeBrowserContextPromise = this._closeBrowserContextImpl().catch(logUnhandledError);
await this._closeBrowserContextPromise;
this._closeBrowserContextPromise = undefined;
}
hasFileChooser() {
return !!this._fileChooser;
isRunningTool() {
return this._isRunningTool;
}
clearFileChooser() {
this._fileChooser = undefined;
setRunningTool(isRunningTool: boolean) {
this._isRunningTool = isRunningTool;
}
private async _createPage(): Promise<{ browser?: playwright.Browser, page: playwright.Page }> {
if (process.env.PLAYWRIGHT_WS_ENDPOINT) {
const url = new URL(process.env.PLAYWRIGHT_WS_ENDPOINT);
if (this._launchOptions)
url.searchParams.set('launch-options', JSON.stringify(this._launchOptions));
const browser = await playwright.chromium.connect(String(url));
const page = await browser.newPage();
return { browser, page };
private async _closeBrowserContextImpl() {
if (!this._browserContextPromise)
return;
testDebug('close context');
const promise = this._browserContextPromise;
this._browserContextPromise = undefined;
await promise.then(async ({ browserContext, close }) => {
if (this.config.saveTrace)
await browserContext.tracing.stop();
await close();
});
}
const context = await playwright.chromium.launchPersistentContext(this._userDataDir, this._launchOptions);
const [page] = context.pages();
return { page };
async dispose() {
this._abortController.abort('MCP context disposed');
await this.closeBrowserContext();
Context._allContexts.delete(this);
}
async allFramesSnapshot() {
const page = this.existingPage();
const visibleFrames = await page.locator('iframe').filter({ visible: true }).all();
this._lastSnapshotFrames = visibleFrames.map(frame => frame.contentFrame());
private async _setupRequestInterception(context: playwright.BrowserContext) {
if (this.config.network?.allowedOrigins?.length) {
await context.route('**', route => route.abort('blockedbyclient'));
const snapshots = await Promise.all([
page.locator('html').ariaSnapshot({ ref: true }),
...this._lastSnapshotFrames.map(async (frame, index) => {
const snapshot = await frame.locator('html').ariaSnapshot({ ref: true });
const args = [];
const src = await frame.owner().getAttribute('src');
if (src)
args.push(`src=${src}`);
const name = await frame.owner().getAttribute('name');
if (name)
args.push(`name=${name}`);
return `\n# iframe ${args.join(' ')}\n` + snapshot.replaceAll('[ref=', `[ref=f${index}`);
})
]);
return snapshots.join('\n');
for (const origin of this.config.network.allowedOrigins)
await context.route(`*://${origin}/**`, route => route.continue());
}
refLocator(ref: string): playwright.Locator {
const page = this.existingPage();
let frame: playwright.Frame | playwright.FrameLocator = page.mainFrame();
const match = ref.match(/^f(\d+)(.*)/);
if (match) {
const frameIndex = parseInt(match[1], 10);
if (!this._lastSnapshotFrames[frameIndex])
throw new Error(`Frame does not exist. Provide ref from the most current snapshot.`);
frame = this._lastSnapshotFrames[frameIndex];
ref = match[2];
if (this.config.network?.blockedOrigins?.length) {
for (const origin of this.config.network.blockedOrigins)
await context.route(`*://${origin}/**`, route => route.abort('blockedbyclient'));
}
}
return frame.locator(`aria-ref=${ref}`);
private _ensureBrowserContext() {
if (!this._browserContextPromise) {
this._browserContextPromise = this._setupBrowserContext();
this._browserContextPromise.catch(() => {
this._browserContextPromise = undefined;
});
}
return this._browserContextPromise;
}
private async _setupBrowserContext(): Promise<{ browserContext: playwright.BrowserContext, close: () => Promise<void> }> {
if (this._closeBrowserContextPromise)
throw new Error('Another browser context is being closed.');
// TODO: move to the browser context factory to make it based on isolation mode.
const result = await this._browserContextFactory.createContext(this._clientInfo, this._abortController.signal);
const { browserContext } = result;
await this._setupRequestInterception(browserContext);
if (this.sessionLog)
await InputRecorder.create(this, browserContext);
for (const page of browserContext.pages())
this._onPageCreated(page);
browserContext.on('page', page => this._onPageCreated(page));
if (this.config.saveTrace) {
await browserContext.tracing.start({
name: 'trace',
screenshots: false,
snapshots: true,
sources: false,
});
}
return result;
}
}
export class InputRecorder {
private _context: Context;
private _browserContext: playwright.BrowserContext;
private constructor(context: Context, browserContext: playwright.BrowserContext) {
this._context = context;
this._browserContext = browserContext;
}
static async create(context: Context, browserContext: playwright.BrowserContext) {
const recorder = new InputRecorder(context, browserContext);
await recorder._initialize();
return recorder;
}
private async _initialize() {
const sessionLog = this._context.sessionLog!;
await (this._browserContext as any)._enableRecorder({
mode: 'recording',
recorderMode: 'api',
}, {
actionAdded: (page: playwright.Page, data: actions.ActionInContext, code: string) => {
if (this._context.isRunningTool())
return;
const tab = Tab.forPage(page);
if (tab)
sessionLog.logUserAction(data.action, tab, code, false);
},
actionUpdated: (page: playwright.Page, data: actions.ActionInContext, code: string) => {
if (this._context.isRunningTool())
return;
const tab = Tab.forPage(page);
if (tab)
sessionLog.logUserAction(data.action, tab, code, true);
},
signalAdded: (page: playwright.Page, data: actions.SignalInContext) => {
if (this._context.isRunningTool())
return;
if (data.signal.name !== 'navigation')
return;
const tab = Tab.forPage(page);
const navigateAction: actions.Action = {
name: 'navigate',
url: data.signal.url,
signals: [],
};
if (tab)
sessionLog.logUserAction(navigateAction, tab, `await page.goto('${data.signal.url}');`, false);
},
});
}
}
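
Note on `_ensureBrowserContext` above: it caches the in-flight setup promise and clears the cache when setup rejects, so a later call can retry instead of receiving the same failure forever. A minimal standalone sketch of that pattern follows. `LazyResource` and its factory callback are hypothetical names used only for illustration; they are not part of this change.

```typescript
// Hypothetical helper illustrating the cache-and-reset-on-failure pattern.
class LazyResource<T> {
  private _promise: Promise<T> | undefined;
  private _factory: () => Promise<T>;

  constructor(factory: () => Promise<T>) {
    this._factory = factory;
  }

  // Returns the cached promise, creating it on first use.
  ensure(): Promise<T> {
    if (this._promise)
      return this._promise;
    const promise = this._factory();
    this._promise = promise;
    // If creation fails, drop the cached promise so the next call retries.
    promise.catch(() => {
      if (this._promise === promise)
        this._promise = undefined;
    });
    return promise;
  }
}
```

Callers that overlap in time share one promise; only a failed attempt is forgotten.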

3
src/extension/DEPS.list Normal file

@@ -0,0 +1,3 @@
[*]
../mcp/
../utils/

408
src/extension/cdpRelay.ts Normal file

@@ -0,0 +1,408 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* WebSocket server that bridges Playwright MCP and the Chrome extension.
*
* Endpoints (<guid> is a random UUID generated per relay instance):
* - /cdp/<guid> - Full CDP interface for Playwright MCP
* - /extension/<guid> - Extension connection for chrome.debugger forwarding
*/
import { spawn } from 'child_process';
import http from 'http';
import debug from 'debug';
import { WebSocket, WebSocketServer } from 'ws';
import { httpAddressToString } from '../utils/httpServer.js';
import { logUnhandledError } from '../utils/log.js';
import { ManualPromise } from '../utils/manualPromise.js';
import type websocket from 'ws';
import type { ClientInfo } from '../browserContextFactory.js';
// @ts-ignore
const { registry } = await import('playwright-core/lib/server/registry/index');
const debugLogger = debug('pw:mcp:relay');
type CDPCommand = {
id: number;
sessionId?: string;
method: string;
params?: any;
};
type CDPResponse = {
id?: number;
sessionId?: string;
method?: string;
params?: any;
result?: any;
error?: { code?: number; message: string };
};
export class CDPRelayServer {
private _wsHost: string;
private _browserChannel: string;
private _userDataDir?: string;
private _cdpPath: string;
private _extensionPath: string;
private _wss: WebSocketServer;
private _playwrightConnection: WebSocket | null = null;
private _extensionConnection: ExtensionConnection | null = null;
private _connectedTabInfo: {
targetInfo: any;
// Page sessionId that should be used by this connection.
sessionId: string;
} | undefined;
private _nextSessionId: number = 1;
private _extensionConnectionPromise!: ManualPromise<void>;
constructor(server: http.Server, browserChannel: string, userDataDir?: string) {
this._wsHost = httpAddressToString(server.address()).replace(/^http/, 'ws');
this._browserChannel = browserChannel;
this._userDataDir = userDataDir;
const uuid = crypto.randomUUID();
this._cdpPath = `/cdp/${uuid}`;
this._extensionPath = `/extension/${uuid}`;
this._resetExtensionConnection();
this._wss = new WebSocketServer({ server });
this._wss.on('connection', this._onConnection.bind(this));
}
cdpEndpoint() {
return `${this._wsHost}${this._cdpPath}`;
}
extensionEndpoint() {
return `${this._wsHost}${this._extensionPath}`;
}
async ensureExtensionConnectionForMCPContext(clientInfo: ClientInfo, abortSignal: AbortSignal) {
debugLogger('Ensuring extension connection for MCP context');
if (this._extensionConnection)
return;
this._connectBrowser(clientInfo);
debugLogger('Waiting for incoming extension connection');
await Promise.race([
this._extensionConnectionPromise,
new Promise((_, reject) => setTimeout(() => {
reject(new Error(`Extension connection timeout. Make sure the "Playwright MCP Bridge" extension is installed. See https://github.com/microsoft/playwright-mcp/blob/main/extension/README.md for installation instructions.`));
}, process.env.PWMCP_TEST_CONNECTION_TIMEOUT ? parseInt(process.env.PWMCP_TEST_CONNECTION_TIMEOUT, 10) : 5_000)),
new Promise((_, reject) => abortSignal.addEventListener('abort', reject))
]);
debugLogger('Extension connection established');
}
private _connectBrowser(clientInfo: ClientInfo) {
const mcpRelayEndpoint = `${this._wsHost}${this._extensionPath}`;
// Need to specify "key" in the manifest.json to make the id stable when loading from file.
const url = new URL('chrome-extension://jakfalbnbhgkpmoaakfflhflbfpkailf/connect.html');
url.searchParams.set('mcpRelayUrl', mcpRelayEndpoint);
url.searchParams.set('client', JSON.stringify(clientInfo));
const href = url.toString();
const executableInfo = registry.findExecutable(this._browserChannel);
if (!executableInfo)
throw new Error(`Unsupported channel: "${this._browserChannel}"`);
const executablePath = executableInfo.executablePath();
if (!executablePath)
throw new Error(`"${this._browserChannel}" executable not found. Make sure it is installed at a standard location.`);
const args: string[] = [];
if (this._userDataDir)
args.push(`--user-data-dir=${this._userDataDir}`);
args.push(href);
spawn(executablePath, args, {
windowsHide: true,
detached: true,
shell: false,
stdio: 'ignore',
});
}
stop(): void {
this.closeConnections('Server stopped');
this._wss.close();
}
closeConnections(reason: string) {
this._closePlaywrightConnection(reason);
this._closeExtensionConnection(reason);
}
private _onConnection(ws: WebSocket, request: http.IncomingMessage): void {
const url = new URL(`http://localhost${request.url}`);
debugLogger(`New connection to ${url.pathname}`);
if (url.pathname === this._cdpPath) {
this._handlePlaywrightConnection(ws);
} else if (url.pathname === this._extensionPath) {
this._handleExtensionConnection(ws);
} else {
debugLogger(`Invalid path: ${url.pathname}`);
ws.close(4004, 'Invalid path');
}
}
private _handlePlaywrightConnection(ws: WebSocket): void {
if (this._playwrightConnection) {
debugLogger('Rejecting second Playwright connection');
ws.close(1000, 'Another CDP client already connected');
return;
}
this._playwrightConnection = ws;
ws.on('message', async data => {
try {
const message = JSON.parse(data.toString());
await this._handlePlaywrightMessage(message);
} catch (error: any) {
debugLogger(`Error while handling Playwright message\n${data.toString()}\n`, error);
}
});
ws.on('close', () => {
if (this._playwrightConnection !== ws)
return;
this._playwrightConnection = null;
this._closeExtensionConnection('Playwright client disconnected');
debugLogger('Playwright WebSocket closed');
});
ws.on('error', error => {
debugLogger('Playwright WebSocket error:', error);
});
debugLogger('Playwright MCP connected');
}
private _closeExtensionConnection(reason: string) {
this._extensionConnection?.close(reason);
this._extensionConnectionPromise.reject(new Error(reason));
this._resetExtensionConnection();
}
private _resetExtensionConnection() {
this._connectedTabInfo = undefined;
this._extensionConnection = null;
this._extensionConnectionPromise = new ManualPromise();
void this._extensionConnectionPromise.catch(logUnhandledError);
}
private _closePlaywrightConnection(reason: string) {
if (this._playwrightConnection?.readyState === WebSocket.OPEN)
this._playwrightConnection.close(1000, reason);
this._playwrightConnection = null;
}
private _handleExtensionConnection(ws: WebSocket): void {
if (this._extensionConnection) {
ws.close(1000, 'Another extension connection already established');
return;
}
this._extensionConnection = new ExtensionConnection(ws);
this._extensionConnection.onclose = (c, reason) => {
debugLogger('Extension WebSocket closed:', reason, c === this._extensionConnection);
if (this._extensionConnection !== c)
return;
this._resetExtensionConnection();
this._closePlaywrightConnection(`Extension disconnected: ${reason}`);
};
this._extensionConnection.onmessage = this._handleExtensionMessage.bind(this);
this._extensionConnectionPromise.resolve();
}
private _handleExtensionMessage(method: string, params: any) {
switch (method) {
case 'forwardCDPEvent':
const sessionId = params.sessionId || this._connectedTabInfo?.sessionId;
this._sendToPlaywright({
sessionId,
method: params.method,
params: params.params
});
break;
case 'detachedFromTab':
debugLogger('← Debugger detached from tab:', params);
this._connectedTabInfo = undefined;
break;
}
}
private async _handlePlaywrightMessage(message: CDPCommand): Promise<void> {
debugLogger('← Playwright:', `${message.method} (id=${message.id})`);
const { id, sessionId, method, params } = message;
try {
const result = await this._handleCDPCommand(method, params, sessionId);
this._sendToPlaywright({ id, sessionId, result });
} catch (e) {
debugLogger('Error in the extension:', e);
this._sendToPlaywright({
id,
sessionId,
error: { message: (e as Error).message }
});
}
}
private async _handleCDPCommand(method: string, params: any, sessionId: string | undefined): Promise<any> {
switch (method) {
case 'Browser.getVersion': {
return {
protocolVersion: '1.3',
product: 'Chrome/Extension-Bridge',
userAgent: 'CDP-Bridge-Server/1.0.0',
};
}
case 'Browser.setDownloadBehavior': {
return { };
}
case 'Target.setAutoAttach': {
// Forward child session handling.
if (sessionId)
break;
// Simulate auto-attach behavior with real target info
const { targetInfo } = await this._extensionConnection!.send('attachToTab');
this._connectedTabInfo = {
targetInfo,
sessionId: `pw-tab-${this._nextSessionId++}`,
};
debugLogger('Simulating auto-attach');
this._sendToPlaywright({
method: 'Target.attachedToTarget',
params: {
sessionId: this._connectedTabInfo.sessionId,
targetInfo: {
...this._connectedTabInfo.targetInfo,
attached: true,
},
waitingForDebugger: false
}
});
return { };
}
case 'Target.getTargetInfo': {
return this._connectedTabInfo?.targetInfo;
}
}
return await this._forwardToExtension(method, params, sessionId);
}
private async _forwardToExtension(method: string, params: any, sessionId: string | undefined): Promise<any> {
if (!this._extensionConnection)
throw new Error('Extension not connected');
// Top level sessionId is only passed between the relay and the client.
if (this._connectedTabInfo?.sessionId === sessionId)
sessionId = undefined;
return await this._extensionConnection.send('forwardCDPCommand', { sessionId, method, params });
}
private _sendToPlaywright(message: CDPResponse): void {
debugLogger('→ Playwright:', `${message.method ?? `response(id=${message.id})`}`);
this._playwrightConnection?.send(JSON.stringify(message));
}
}
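
Note on `ensureExtensionConnectionForMCPContext` above: it races the extension connection promise against a timeout and the abort signal. The sketch below shows the same idea as a reusable function. `raceWithTimeout` is a hypothetical name, and unlike the diff this version also clears the timer and removes the abort listener once the race settles; treat it as one possible variant rather than the code in this change.

```typescript
// Hypothetical helper: settle with `work`, or reject on timeout or abort, whichever comes first.
function raceWithTimeout<T>(work: Promise<T>, timeoutMs: number, signal: AbortSignal, timeoutMessage: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error('Aborted'));
      return;
    }
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(timeoutMessage));
    }, timeoutMs);
    function onAbort() {
      cleanup();
      reject(new Error('Aborted'));
    }
    // Release the timer and the listener so nothing lingers after settling.
    function cleanup() {
      clearTimeout(timer);
      signal.removeEventListener('abort', onAbort);
    }
    signal.addEventListener('abort', onAbort);
    work.then(
      value => { cleanup(); resolve(value); },
      error => { cleanup(); reject(error); },
    );
  });
}
```

Clearing the timer means a successful connection does not keep the process alive for the rest of the timeout window.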
type ExtensionResponse = {
id?: number;
method?: string;
params?: any;
result?: any;
error?: string;
};
class ExtensionConnection {
private readonly _ws: WebSocket;
private readonly _callbacks = new Map<number, { resolve: (o: any) => void, reject: (e: Error) => void, error: Error }>();
private _lastId = 0;
onmessage?: (method: string, params: any) => void;
onclose?: (self: ExtensionConnection, reason: string) => void;
constructor(ws: WebSocket) {
this._ws = ws;
this._ws.on('message', this._onMessage.bind(this));
this._ws.on('close', this._onClose.bind(this));
this._ws.on('error', this._onError.bind(this));
}
async send(method: string, params?: any, sessionId?: string): Promise<any> {
if (this._ws.readyState !== WebSocket.OPEN)
throw new Error(`Unexpected WebSocket state: ${this._ws.readyState}`);
const id = ++this._lastId;
this._ws.send(JSON.stringify({ id, method, params, sessionId }));
const error = new Error(`Protocol error: ${method}`);
return new Promise((resolve, reject) => {
this._callbacks.set(id, { resolve, reject, error });
});
}
close(message: string) {
debugLogger('closing extension connection:', message);
if (this._ws.readyState === WebSocket.OPEN)
this._ws.close(1000, message);
}
private _onMessage(event: websocket.RawData) {
const eventData = event.toString();
let parsedJson;
try {
parsedJson = JSON.parse(eventData);
} catch (e: any) {
debugLogger(`<closing ws> Closing websocket due to malformed JSON. eventData=${eventData} e=${e?.message}`);
this._ws.close();
return;
}
try {
this._handleParsedMessage(parsedJson);
} catch (e: any) {
debugLogger(`<closing ws> Closing websocket due to failed onmessage callback. eventData=${eventData} e=${e?.message}`);
this._ws.close();
}
}
private _handleParsedMessage(object: ExtensionResponse) {
if (object.id && this._callbacks.has(object.id)) {
const callback = this._callbacks.get(object.id)!;
this._callbacks.delete(object.id);
if (object.error) {
const error = callback.error;
error.message = object.error;
callback.reject(error);
} else {
callback.resolve(object.result);
}
} else if (object.id) {
debugLogger('← Extension: unexpected response', object);
} else {
this.onmessage?.(object.method!, object.params);
}
}
private _onClose(event: websocket.CloseEvent) {
debugLogger(`<ws closed> code=${event.code} reason=${event.reason}`);
this._dispose();
this.onclose?.(this, event.reason);
}
private _onError(event: websocket.ErrorEvent) {
debugLogger(`<ws error> message=${event.message} type=${event.type} target=${event.target}`);
this._dispose();
}
private _dispose() {
for (const callback of this._callbacks.values())
callback.reject(new Error('WebSocket closed'));
this._callbacks.clear();
}
}
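
Note on `ExtensionConnection` above: each outgoing command gets an incrementing `id`, its `resolve`/`reject` pair is stored in a map, and an incoming message either settles the matching entry (when it carries an `id`) or is dispatched as an event (when it does not). The following sketch reproduces that correlation logic over an in-memory transport. `RpcChannel`, `Wire`, and the `transmit` callback are hypothetical stand-ins for the WebSocket plumbing, and the callback is registered before transmitting so that a synchronous transport cannot answer too early.

```typescript
type Wire = { id?: number; method?: string; params?: unknown; result?: unknown; error?: string };

// Hypothetical in-memory stand-in for the WebSocket-backed connection.
class RpcChannel {
  private _lastId = 0;
  private _callbacks = new Map<number, { resolve: (value: unknown) => void; reject: (error: Error) => void }>();
  private _transmit: (message: Wire) => void;
  onevent?: (method: string, params: unknown) => void;

  constructor(transmit: (message: Wire) => void) {
    this._transmit = transmit;
  }

  // Sends a command and returns a promise settled by the response with the same id.
  send(method: string, params?: unknown): Promise<unknown> {
    const id = ++this._lastId;
    const promise = new Promise<unknown>((resolve, reject) => {
      this._callbacks.set(id, { resolve, reject });
    });
    this._transmit({ id, method, params });
    return promise;
  }

  // Routes an incoming message: responses by id, everything else as an event.
  receive(message: Wire): void {
    if (message.id === undefined) {
      if (message.method)
        this.onevent?.(message.method, message.params);
      return;
    }
    const callback = this._callbacks.get(message.id);
    if (!callback)
      return; // Unexpected response: no pending command with this id.
    this._callbacks.delete(message.id);
    if (message.error)
      callback.reject(new Error(message.error));
    else
      callback.resolve(message.result);
  }
}
```

Responses may arrive in any order; the map lookup is what ties each one back to its caller.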


@@ -0,0 +1,66 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import debug from 'debug';
import * as playwright from 'playwright';
import { startHttpServer } from '../utils/httpServer.js';
import { CDPRelayServer } from './cdpRelay.js';
import type { BrowserContextFactory, ClientInfo } from '../browserContextFactory.js';
const debugLogger = debug('pw:mcp:relay');
export class ExtensionContextFactory implements BrowserContextFactory {
name = 'extension';
description = 'Connect to a browser using the Playwright MCP extension';
private _browserChannel: string;
private _userDataDir?: string;
constructor(browserChannel: string, userDataDir: string | undefined) {
this._browserChannel = browserChannel;
this._userDataDir = userDataDir;
}
async createContext(clientInfo: ClientInfo, abortSignal: AbortSignal): Promise<{ browserContext: playwright.BrowserContext, close: () => Promise<void> }> {
const browser = await this._obtainBrowser(clientInfo, abortSignal);
return {
browserContext: browser.contexts()[0],
close: async () => {
debugLogger('close() called for browser context');
await browser.close();
}
};
}
private async _obtainBrowser(clientInfo: ClientInfo, abortSignal: AbortSignal): Promise<playwright.Browser> {
const relay = await this._startRelay(abortSignal);
await relay.ensureExtensionConnectionForMCPContext(clientInfo, abortSignal);
return await playwright.chromium.connectOverCDP(relay.cdpEndpoint());
}
private async _startRelay(abortSignal: AbortSignal) {
const httpServer = await startHttpServer({});
if (abortSignal.aborted) {
httpServer.close();
throw new Error(abortSignal.reason);
}
const cdpRelayServer = new CDPRelayServer(httpServer, this._browserChannel, this._userDataDir);
abortSignal.addEventListener('abort', () => cdpRelayServer.stop());
debugLogger(`CDP relay server started, extension endpoint: ${cdpRelayServer.extensionEndpoint()}.`);
return cdpRelayServer;
}
}
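
Note on `_startRelay` above: after the asynchronous server start it re-checks `abortSignal.aborted`, tears down the server if the context was disposed in the meantime, and otherwise registers a stop handler for a later abort. A generic sketch of that sequence, assuming a resource with a synchronous `stop()` method (`startWithAbort` and the `Stoppable` type are hypothetical names):

```typescript
type Stoppable = { stop(): void };

// Hypothetical helper: start a resource and tie its lifetime to an AbortSignal.
async function startWithAbort<T extends Stoppable>(start: () => Promise<T>, signal: AbortSignal): Promise<T> {
  const resource = await start();
  // The signal may have fired while start() was pending; clean up and bail out.
  if (signal.aborted) {
    resource.stop();
    throw new Error(String(signal.reason ?? 'Aborted'));
  }
  // Otherwise stop the resource whenever the signal fires later.
  signal.addEventListener('abort', () => resource.stop(), { once: true });
  return resource;
}
```

The check after the `await` matters because an abort listener added too late would never run for a signal that has already fired.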


@@ -14,71 +14,37 @@
* limitations under the License.
*/
import { createServerWithTools } from './server';
import * as snapshot from './tools/snapshot';
import * as common from './tools/common';
import * as screenshot from './tools/screenshot';
import { console } from './resources/console';
import { BrowserServerBackend } from './browserServerBackend.js';
import { resolveConfig } from './config.js';
import { contextFactory } from './browserContextFactory.js';
import * as mcpServer from './mcp/server.js';
import type { Tool } from './tools/tool';
import type { Resource } from './resources/resource';
import type { Config } from '../config.js';
import type { BrowserContext } from 'playwright';
import type { BrowserContextFactory } from './browserContextFactory.js';
import type { Server } from '@modelcontextprotocol/sdk/server/index.js';
import type { LaunchOptions } from 'playwright';
const commonTools: Tool[] = [
common.pressKey,
common.wait,
common.pdf,
common.close,
];
const snapshotTools: Tool[] = [
common.navigate(true),
common.goBack(true),
common.goForward(true),
common.chooseFile(true),
snapshot.snapshot,
snapshot.click,
snapshot.hover,
snapshot.type,
snapshot.selectOption,
snapshot.screenshot,
...commonTools,
];
const screenshotTools: Tool[] = [
common.navigate(false),
common.goBack(false),
common.goForward(false),
common.chooseFile(false),
screenshot.screenshot,
screenshot.moveMouse,
screenshot.click,
screenshot.drag,
screenshot.type,
...commonTools,
];
const resources: Resource[] = [
console,
];
type Options = {
userDataDir?: string;
launchOptions?: LaunchOptions;
vision?: boolean;
};
const packageJSON = require('../package.json');
export function createServer(options?: Options): Server {
const tools = options?.vision ? screenshotTools : snapshotTools;
return createServerWithTools({
name: 'Playwright',
version: packageJSON.version,
tools,
resources,
userDataDir: options?.userDataDir ?? '',
launchOptions: options?.launchOptions,
});
export async function createConnection(userConfig: Config = {}, contextGetter?: () => Promise<BrowserContext>): Promise<Server> {
const config = await resolveConfig(userConfig);
const factory = contextGetter ? new SimpleBrowserContextFactory(contextGetter) : contextFactory(config);
return mcpServer.createServer(new BrowserServerBackend(config, factory), false);
}
class SimpleBrowserContextFactory implements BrowserContextFactory {
name = 'custom';
description = 'Connect to a browser using a custom context getter';
private readonly _contextGetter: () => Promise<BrowserContext>;
constructor(contextGetter: () => Promise<BrowserContext>) {
this._contextGetter = contextGetter;
}
async createContext(): Promise<{ browserContext: BrowserContext, close: () => Promise<void> }> {
const browserContext = await this._contextGetter();
return {
browserContext,
close: () => browserContext.close()
};
}
}

108
src/loop/loop.ts Normal file

@@ -0,0 +1,108 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import debug from 'debug';
import type { Tool, ImageContent, TextContent } from '@modelcontextprotocol/sdk/types.js';
import type { Client } from '@modelcontextprotocol/sdk/client/index.js';
export type LLMToolCall = {
name: string;
arguments: any;
id: string;
};
export type LLMTool = {
name: string;
description: string;
inputSchema: any;
};
export type LLMMessage =
| { role: 'user'; content: string }
| { role: 'assistant'; content: string; toolCalls?: LLMToolCall[] }
| { role: 'tool'; toolCallId: string; content: string; isError?: boolean };
export type LLMConversation = {
messages: LLMMessage[];
tools: LLMTool[];
};
export interface LLMDelegate {
createConversation(task: string, tools: Tool[], oneShot: boolean): LLMConversation;
makeApiCall(conversation: LLMConversation): Promise<LLMToolCall[]>;
addToolResults(conversation: LLMConversation, results: Array<{ toolCallId: string; content: string; isError?: boolean }>): void;
checkDoneToolCall(toolCall: LLMToolCall): string | null;
}
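// Drives the tool-calling loop: asks the delegate for tool calls, executes them through the MCP client,
// and feeds the results back until the "done" tool is called, a one-shot run completes, or 5 iterations elapse.
export async function runTask(delegate: LLMDelegate, client: Client, task: string, oneShot: boolean = false): Promise<LLMMessage[]> {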
const { tools } = await client.listTools();
const taskContent = oneShot ? `Perform the following task: ${task}.` : `Perform the following task: ${task}. Once the task is complete, call the "done" tool.`;
const conversation = delegate.createConversation(taskContent, tools, oneShot);
for (let iteration = 0; iteration < 5; ++iteration) {
debug('history')('Making API call for iteration', iteration);
const toolCalls = await delegate.makeApiCall(conversation);
if (toolCalls.length === 0)
throw new Error('Call the "done" tool when the task is complete.');
const toolResults: Array<{ toolCallId: string; content: string; isError?: boolean }> = [];
for (const toolCall of toolCalls) {
const doneResult = delegate.checkDoneToolCall(toolCall);
if (doneResult !== null)
return conversation.messages;
const { name, arguments: args, id } = toolCall;
try {
debug('tool')(name, args);
const response = await client.callTool({
name,
arguments: args,
});
const responseContent = (response.content || []) as (TextContent | ImageContent)[];
debug('tool')(responseContent);
const text = responseContent.filter(part => part.type === 'text').map(part => part.text).join('\n');
toolResults.push({
toolCallId: id,
content: text,
});
} catch (error) {
debug('tool')(error);
toolResults.push({
toolCallId: id,
content: `Error while executing tool "${name}": ${error instanceof Error ? error.message : String(error)}\n\nPlease try to recover and complete the task.`,
isError: true,
});
// Skip remaining tool calls for this iteration
for (const remainingToolCall of toolCalls.slice(toolCalls.indexOf(toolCall) + 1)) {
toolResults.push({
toolCallId: remainingToolCall.id,
content: `This tool call is skipped due to previous error.`,
isError: true,
});
}
break;
}
}
delegate.addToolResults(conversation, toolResults);
if (oneShot)
return conversation.messages;
}
throw new Error('Failed to perform step, max attempts reached');
}
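
Note on `runTask` above: because the loop talks to the model only through the delegate, it can be exercised without any LLM by replaying a fixed script of tool calls. The sketch below is one way to do that. It redeclares simplified local copies of the types so it runs standalone; `ScriptedDelegate`, `runScripted`, and the `callTool` callback are hypothetical, and the driver is a reduced version of the loop (no error recovery, no one-shot mode), not the function in this change.

```typescript
// Simplified local copies of the loop.ts types so the sketch is self-contained.
type LLMToolCall = { name: string; arguments: any; id: string };
type LLMMessage =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls?: LLMToolCall[] }
  | { role: 'tool'; toolCallId: string; content: string; isError?: boolean };
type LLMConversation = { messages: LLMMessage[] };
type ToolResult = { toolCallId: string; content: string; isError?: boolean };

// Hypothetical delegate that replays a fixed script instead of calling a model.
class ScriptedDelegate {
  private _script: LLMToolCall[][];

  constructor(script: LLMToolCall[][]) {
    this._script = [...script];
  }

  createConversation(task: string): LLMConversation {
    return { messages: [{ role: 'user', content: task }] };
  }

  async makeApiCall(conversation: LLMConversation): Promise<LLMToolCall[]> {
    const toolCalls = this._script.shift() ?? [];
    conversation.messages.push({ role: 'assistant', content: '', toolCalls });
    return toolCalls;
  }

  addToolResults(conversation: LLMConversation, results: ToolResult[]): void {
    for (const result of results)
      conversation.messages.push({ role: 'tool', ...result });
  }

  checkDoneToolCall(toolCall: LLMToolCall): string | null {
    return toolCall.name === 'done' ? 'done' : null;
  }
}

// Reduced driver mirroring the control flow of runTask with a fake tool executor.
async function runScripted(delegate: ScriptedDelegate, task: string, callTool: (name: string, args: any) => Promise<string>, maxIterations = 5): Promise<LLMMessage[]> {
  const conversation = delegate.createConversation(task);
  for (let iteration = 0; iteration < maxIterations; ++iteration) {
    const toolCalls = await delegate.makeApiCall(conversation);
    if (toolCalls.length === 0)
      throw new Error('No tool calls returned');
    const results: ToolResult[] = [];
    for (const toolCall of toolCalls) {
      if (delegate.checkDoneToolCall(toolCall) !== null)
        return conversation.messages;
      results.push({ toolCallId: toolCall.id, content: await callTool(toolCall.name, toolCall.arguments) });
    }
    delegate.addToolResults(conversation, results);
  }
  throw new Error('Max iterations reached');
}
```

A script of one navigation call followed by `done` yields the message sequence user, assistant, tool, assistant.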

177
src/loop/loopClaude.ts Normal file

@@ -0,0 +1,177 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type Anthropic from '@anthropic-ai/sdk';
import type { LLMDelegate, LLMConversation, LLMToolCall, LLMTool } from './loop.js';
import type { Tool } from '@modelcontextprotocol/sdk/types.js';
const model = 'claude-sonnet-4-20250514';
export class ClaudeDelegate implements LLMDelegate {
private _anthropic: Anthropic | undefined;
async anthropic(): Promise<Anthropic> {
if (!this._anthropic) {
const anthropic = await import('@anthropic-ai/sdk');
this._anthropic = new anthropic.Anthropic();
}
return this._anthropic;
}
createConversation(task: string, tools: Tool[], oneShot: boolean): LLMConversation {
const llmTools: LLMTool[] = tools.map(tool => ({
name: tool.name,
description: tool.description || '',
inputSchema: tool.inputSchema,
}));
if (!oneShot) {
llmTools.push({
name: 'done',
description: 'Call this tool when the task is complete.',
inputSchema: {
type: 'object',
properties: {},
},
});
}
return {
messages: [{
role: 'user',
content: task
}],
tools: llmTools,
};
}
async makeApiCall(conversation: LLMConversation): Promise<LLMToolCall[]> {
// Convert generic messages to Claude format
const claudeMessages: Anthropic.Messages.MessageParam[] = [];
for (const message of conversation.messages) {
if (message.role === 'user') {
claudeMessages.push({
role: 'user',
content: message.content
});
} else if (message.role === 'assistant') {
const content: Anthropic.Messages.ContentBlock[] = [];
// Add text content
if (message.content) {
content.push({
type: 'text',
text: message.content,
citations: []
});
}
// Add tool calls
if (message.toolCalls) {
for (const toolCall of message.toolCalls) {
content.push({
type: 'tool_use',
id: toolCall.id,
name: toolCall.name,
input: toolCall.arguments
});
}
}
claudeMessages.push({
role: 'assistant',
content
});
} else if (message.role === 'tool') {
// Tool results are added differently - we need to find if there's already a user message with tool results
const lastMessage = claudeMessages[claudeMessages.length - 1];
const toolResult: Anthropic.Messages.ToolResultBlockParam = {
type: 'tool_result',
tool_use_id: message.toolCallId,
content: message.content,
is_error: message.isError,
};
if (lastMessage && lastMessage.role === 'user' && Array.isArray(lastMessage.content)) {
// Add to existing tool results message
(lastMessage.content as Anthropic.Messages.ToolResultBlockParam[]).push(toolResult);
} else {
// Create new tool results message
claudeMessages.push({
role: 'user',
content: [toolResult]
});
}
}
}
// Convert generic tools to Claude format
const claudeTools: Anthropic.Messages.Tool[] = conversation.tools.map(tool => ({
name: tool.name,
description: tool.description,
input_schema: tool.inputSchema,
}));
const anthropic = await this.anthropic();
const response = await anthropic.messages.create({
model,
max_tokens: 10000,
messages: claudeMessages,
tools: claudeTools,
});
// Extract tool calls and add assistant message to generic conversation
const toolCalls = response.content.filter(block => block.type === 'tool_use') as Anthropic.Messages.ToolUseBlock[];
const textContent = response.content.filter(block => block.type === 'text').map(block => (block as Anthropic.Messages.TextBlock).text).join('');
const llmToolCalls: LLMToolCall[] = toolCalls.map(toolCall => ({
name: toolCall.name,
arguments: toolCall.input as any,
id: toolCall.id,
}));
// Add assistant message to generic conversation
conversation.messages.push({
role: 'assistant',
content: textContent,
toolCalls: llmToolCalls.length > 0 ? llmToolCalls : undefined
});
return llmToolCalls;
}
addToolResults(
conversation: LLMConversation,
results: Array<{ toolCallId: string; content: string; isError?: boolean }>
): void {
for (const result of results) {
conversation.messages.push({
role: 'tool',
toolCallId: result.toolCallId,
content: result.content,
isError: result.isError,
});
}
}
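checkDoneToolCall(toolCall: LLMToolCall): string | null {
if (toolCall.name === 'done') {
// The "done" tool schema declares no properties, so `result` may be absent; fall back to an empty string
// to honor the declared `string | null` return type.
return (toolCall.arguments as { result?: string } | undefined)?.result ?? '';
}
return null;
}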
}
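
Note on `ClaudeDelegate.makeApiCall` above: consecutive generic `tool` messages are folded into a single `user` message whose content is an array of `tool_result` blocks, rather than being emitted as one message each. The sketch below isolates that grouping step. The `GenericMessage`, `ToolResultBlock`, and `ProviderMessage` types are simplified local shapes assumed for illustration, not the Anthropic SDK types, and `groupToolResults` is a hypothetical name.

```typescript
type GenericMessage =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string }
  | { role: 'tool'; toolCallId: string; content: string; isError?: boolean };

// Simplified stand-ins for the provider-side message shapes.
type ToolResultBlock = { type: 'tool_result'; tool_use_id: string; content: string; is_error?: boolean };
type ProviderMessage = { role: 'user' | 'assistant'; content: string | ToolResultBlock[] };

// Folds runs of tool messages into one user message carrying all result blocks.
function groupToolResults(messages: GenericMessage[]): ProviderMessage[] {
  const out: ProviderMessage[] = [];
  for (const message of messages) {
    if (message.role !== 'tool') {
      out.push({ role: message.role, content: message.content });
      continue;
    }
    const block: ToolResultBlock = {
      type: 'tool_result',
      tool_use_id: message.toolCallId,
      content: message.content,
      is_error: message.isError,
    };
    const last = out[out.length - 1];
    // Append to the previous message only if it is already a tool-results message.
    if (last && last.role === 'user' && Array.isArray(last.content))
      last.content.push(block);
    else
      out.push({ role: 'user', content: [block] });
  }
  return out;
}
```

A plain-text user message is never merged into, because its content is a string rather than an array.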

168
src/loop/loopOpenAI.ts Normal file

@@ -0,0 +1,168 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type OpenAI from 'openai';
import type { LLMDelegate, LLMConversation, LLMToolCall, LLMTool } from './loop.js';
import type { Tool } from '@modelcontextprotocol/sdk/types.js';
const model = 'gpt-4.1';
export class OpenAIDelegate implements LLMDelegate {
private _openai: OpenAI | undefined;
async openai(): Promise<OpenAI> {
if (!this._openai) {
const oai = await import('openai');
this._openai = new oai.OpenAI();
}
return this._openai;
}
createConversation(task: string, tools: Tool[], oneShot: boolean): LLMConversation {
const genericTools: LLMTool[] = tools.map(tool => ({
name: tool.name,
description: tool.description || '',
inputSchema: tool.inputSchema,
}));
if (!oneShot) {
genericTools.push({
name: 'done',
description: 'Call this tool when the task is complete.',
inputSchema: {
type: 'object',
properties: {},
},
});
}
return {
messages: [{
role: 'user',
content: task
}],
tools: genericTools,
};
}
async makeApiCall(conversation: LLMConversation): Promise<LLMToolCall[]> {
// Convert generic messages to OpenAI format
const openaiMessages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [];
for (const message of conversation.messages) {
if (message.role === 'user') {
openaiMessages.push({
role: 'user',
content: message.content
});
} else if (message.role === 'assistant') {
const toolCalls: OpenAI.Chat.Completions.ChatCompletionMessageToolCall[] = [];
if (message.toolCalls) {
for (const toolCall of message.toolCalls) {
toolCalls.push({
id: toolCall.id,
type: 'function',
function: {
name: toolCall.name,
arguments: JSON.stringify(toolCall.arguments)
}
});
}
}
const assistantMessage: OpenAI.Chat.Completions.ChatCompletionAssistantMessageParam = {
role: 'assistant'
};
if (message.content)
assistantMessage.content = message.content;
if (toolCalls.length > 0)
assistantMessage.tool_calls = toolCalls;
openaiMessages.push(assistantMessage);
} else if (message.role === 'tool') {
openaiMessages.push({
role: 'tool',
tool_call_id: message.toolCallId,
content: message.content,
});
}
}
// Convert generic tools to OpenAI format
const openaiTools: OpenAI.Chat.Completions.ChatCompletionTool[] = conversation.tools.map(tool => ({
type: 'function',
function: {
name: tool.name,
description: tool.description,
parameters: tool.inputSchema,
},
}));
const openai = await this.openai();
const response = await openai.chat.completions.create({
model,
messages: openaiMessages,
tools: openaiTools,
tool_choice: 'auto'
});
const message = response.choices[0].message;
// Extract tool calls and add assistant message to generic conversation
const toolCalls = message.tool_calls || [];
const genericToolCalls: LLMToolCall[] = toolCalls.map(toolCall => {
const functionCall = toolCall.function;
return {
name: functionCall.name,
arguments: JSON.parse(functionCall.arguments),
id: toolCall.id,
};
});
// Add assistant message to generic conversation
conversation.messages.push({
role: 'assistant',
content: message.content || '',
toolCalls: genericToolCalls.length > 0 ? genericToolCalls : undefined
});
return genericToolCalls;
}
addToolResults(
conversation: LLMConversation,
results: Array<{ toolCallId: string; content: string; isError?: boolean }>
): void {
for (const result of results) {
conversation.messages.push({
role: 'tool',
toolCallId: result.toolCallId,
content: result.content,
isError: result.isError,
});
}
}
checkDoneToolCall(toolCall: LLMToolCall): string | null {
if (toolCall.name === 'done')
return toolCall.arguments.result;
return null;
}
}
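
The message conversion inside `makeApiCall` above can be read in isolation as a pure function. The following is a minimal sketch under stated assumptions: the type names (`GenericMessage`, `WireMessage`, `toWireMessages`) are hypothetical local stand-ins, not the types exported by `./loop.js` or by the `openai` package, and the wire shape is simplified to the fields the diff actually sets.

```typescript
// Hypothetical local stand-ins for the generic conversation types used in the diff.
type ToolCall = { id: string; name: string; arguments: Record<string, unknown> };
type GenericMessage =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls?: ToolCall[] }
  | { role: 'tool'; toolCallId: string; content: string };

// Simplified chat-completions wire shape (assumption: only the fields set in the diff).
type WireToolCall = { id: string; type: 'function'; function: { name: string; arguments: string } };
type AssistantWire = { role: 'assistant'; content?: string; tool_calls?: WireToolCall[] };
type WireMessage =
  | { role: 'user'; content: string }
  | AssistantWire
  | { role: 'tool'; tool_call_id: string; content: string };

function toWireMessages(messages: GenericMessage[]): WireMessage[] {
  return messages.map((message): WireMessage => {
    if (message.role === 'user')
      return { role: 'user', content: message.content };
    if (message.role === 'tool')
      return { role: 'tool', tool_call_id: message.toolCallId, content: message.content };
    // Assistant message: omit empty content and empty tool call lists, as the diff does.
    const wire: AssistantWire = { role: 'assistant' };
    if (message.content)
      wire.content = message.content;
    const calls = message.toolCalls ?? [];
    if (calls.length > 0) {
      wire.tool_calls = calls.map((call): WireToolCall => ({
        id: call.id,
        type: 'function',
        // Tool arguments travel as JSON text on the wire and are parsed again on the way back.
        function: { name: call.name, arguments: JSON.stringify(call.arguments) },
      }));
    }
    return wire;
  });
}
```

Keeping the conversion free of network calls makes the round trip between object arguments and JSON text easy to check on its own.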

72
src/loop/main.ts Normal file

@@ -0,0 +1,72 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* eslint-disable no-console */
import path from 'path';
import url from 'url';
import dotenv from 'dotenv';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { program } from 'commander';
import { OpenAIDelegate } from './loopOpenAI.js';
import { ClaudeDelegate } from './loopClaude.js';
import { runTask } from './loop.js';
import type { LLMDelegate } from './loop.js';
dotenv.config();
const __filename = url.fileURLToPath(import.meta.url);
async function run(delegate: LLMDelegate) {
const transport = new StdioClientTransport({
command: 'node',
args: [
path.resolve(__filename, '../../../cli.js'),
'--save-session',
'--output-dir', path.resolve(__filename, '../../../sessions')
],
stderr: 'inherit',
env: process.env as Record<string, string>,
});
const client = new Client({ name: 'test', version: '1.0.0' });
await client.connect(transport);
await client.ping();
for (const task of tasks) {
const messages = await runTask(delegate, client, task);
for (const message of messages)
console.log(`${message.role}: ${message.content}`);
}
await client.close();
}
const tasks = [
'Open https://playwright.dev/',
];
program
.option('--model <model>', 'model to use')
.action(async options => {
if (options.model === 'claude')
await run(new ClaudeDelegate());
else
await run(new OpenAIDelegate());
});
void program.parseAsync(process.argv);

5
src/loopTools/DEPS.list Normal file

@@ -0,0 +1,5 @@
[*]
../
../loop/
../mcp/
../utils/

77
src/loopTools/context.ts Normal file

@@ -0,0 +1,77 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { contextFactory } from '../browserContextFactory.js';
import { BrowserServerBackend } from '../browserServerBackend.js';
import { Context as BrowserContext } from '../context.js';
import { runTask } from '../loop/loop.js';
import { OpenAIDelegate } from '../loop/loopOpenAI.js';
import { ClaudeDelegate } from '../loop/loopClaude.js';
import { InProcessTransport } from '../mcp/inProcessTransport.js';
import * as mcpServer from '../mcp/server.js';
import type { LLMDelegate } from '../loop/loop.js';
import type { FullConfig } from '../config.js';
export class Context {
readonly config: FullConfig;
private _client: Client;
private _delegate: LLMDelegate;
constructor(config: FullConfig, client: Client) {
this.config = config;
this._client = client;
if (process.env.OPENAI_API_KEY)
this._delegate = new OpenAIDelegate();
else if (process.env.ANTHROPIC_API_KEY)
this._delegate = new ClaudeDelegate();
else
throw new Error('No LLM API key found. Please set OPENAI_API_KEY or ANTHROPIC_API_KEY environment variable.');
}
static async create(config: FullConfig) {
const client = new Client({ name: 'Playwright Proxy', version: '1.0.0' });
const browserContextFactory = contextFactory(config);
const server = mcpServer.createServer(new BrowserServerBackend(config, browserContextFactory), false);
await client.connect(new InProcessTransport(server));
await client.ping();
return new Context(config, client);
}
async runTask(task: string, oneShot: boolean = false): Promise<mcpServer.CallToolResult> {
const messages = await runTask(this._delegate, this._client!, task, oneShot);
const lines: string[] = [];
// Skip the first message, which is the user's task.
for (const message of messages.slice(1)) {
// Trim out all page snapshots.
if (!message.content.trim())
continue;
const index = oneShot ? -1 : message.content.indexOf('### Page state');
const trimmedContent = index === -1 ? message.content : message.content.substring(0, index);
lines.push(`[${message.role}]:`, trimmedContent);
}
return {
content: [{ type: 'text', text: lines.join('\n') }],
};
}
async close() {
await BrowserContext.disposeAll();
}
}
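
The transcript trimming in `Context.runTask` above can be sketched as a standalone function. This is an illustration only: `TranscriptMessage` and `formatTranscript` are hypothetical names, and the message shape is reduced to the two fields the loop reads.

```typescript
// Hypothetical minimal message shape; the real type comes from '../loop/loop.js'.
type TranscriptMessage = { role: string; content: string };

const PAGE_STATE_MARKER = '### Page state';

function formatTranscript(messages: TranscriptMessage[], oneShot: boolean): string {
  const lines: string[] = [];
  // Skip the first message, which is the user's task.
  for (const message of messages.slice(1)) {
    if (!message.content.trim())
      continue;
    // In one-shot mode the page snapshot is the payload, so nothing is cut.
    const index = oneShot ? -1 : message.content.indexOf(PAGE_STATE_MARKER);
    const trimmed = index === -1 ? message.content : message.content.substring(0, index);
    lines.push(`[${message.role}]:`, trimmed);
  }
  return lines.join('\n');
}
```

With `oneShot` set to `false`, everything from the `### Page state` marker onward is dropped from each message; with `true`, the full content is kept, which matches how the snapshot tool calls `runTask`.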

65
src/loopTools/main.ts Normal file
View File

@@ -0,0 +1,65 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import dotenv from 'dotenv';
import * as mcpServer from '../mcp/server.js';
import * as mcpTransport from '../mcp/transport.js';
import { packageJSON } from '../utils/package.js';
import { Context } from './context.js';
import { perform } from './perform.js';
import { snapshot } from './snapshot.js';
import { toMcpTool } from '../mcp/tool.js';
import type { FullConfig } from '../config.js';
import type { ServerBackend } from '../mcp/server.js';
import type { Tool } from './tool.js';
export async function runLoopTools(config: FullConfig) {
dotenv.config();
const serverBackendFactory = () => new LoopToolsServerBackend(config);
await mcpTransport.start(serverBackendFactory, config.server);
}
class LoopToolsServerBackend implements ServerBackend {
readonly name = 'Playwright';
readonly version = packageJSON.version;
private _config: FullConfig;
private _context: Context | undefined;
private _tools: Tool<any>[] = [perform, snapshot];
constructor(config: FullConfig) {
this._config = config;
}
async initialize() {
this._context = await Context.create(this._config);
}
async listTools(): Promise<mcpServer.Tool[]> {
return this._tools.map(tool => toMcpTool(tool.schema));
}
async callTool(name: string, args: mcpServer.CallToolRequest['params']['arguments']): Promise<mcpServer.CallToolResult> {
const tool = this._tools.find(tool => tool.schema.name === name)!;
const parsedArguments = tool.schema.inputSchema.parse(args || {});
return await tool.handle(this._context!, parsedArguments);
}
serverClosed() {
void this._context!.close();
}
}
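
`callTool` above looks the tool up with a non-null assertion, so an unknown name would surface as a generic `TypeError` when `tool.schema` is read. A hedged sketch of an explicit lookup follows; `SimpleTool` and `callToolByName` are hypothetical names, and the handler is synchronous and schema-free to keep the example self-contained (the diff uses zod parsing and async handlers).

```typescript
// Hypothetical simplified tool shape: no zod schema, synchronous handler.
type SimpleTool = {
  name: string;
  handle: (args: Record<string, unknown>) => string;
};

function callToolByName(tools: SimpleTool[], name: string, args?: Record<string, unknown>): string {
  const tool = tools.find(candidate => candidate.name === name);
  // Fail with a descriptive message instead of relying on a non-null assertion.
  if (!tool)
    throw new Error(`Unknown tool: ${name}`);
  return tool.handle(args ?? {});
}
```

In the diff, errors thrown from `callTool` are caught in `createServer` and returned as an `isError` result, so a descriptive message here would reach the client as text.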

36
src/loopTools/perform.ts Normal file
View File

@@ -0,0 +1,36 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTool } from './tool.js';
const performSchema = z.object({
task: z.string().describe('The task to perform with the browser'),
});
export const perform = defineTool({
schema: {
name: 'browser_perform',
title: 'Perform a task with the browser',
description: 'Perform a task with the browser. It can click, type, export, capture screenshot, drag, hover, select options, etc.',
inputSchema: performSchema,
type: 'destructive',
},
handle: async (context, params) => {
return await context.runTask(params.task);
},
});


@@ -14,22 +14,19 @@
 * limitations under the License.
 */
-import type { Resource } from './resource';
+import { z } from 'zod';
+import { defineTool } from './tool.js';
-export const console: Resource = {
+export const snapshot = defineTool({
 schema: {
-uri: 'browser://console',
-name: 'Page console',
-mimeType: 'text/plain',
+name: 'browser_snapshot',
+title: 'Take a snapshot of the browser',
+description: 'Take a snapshot of the browser to read what is on the page.',
+inputSchema: z.object({}),
+type: 'readOnly',
 },
-read: async (context, uri) => {
-const messages = await context.console();
-const log = messages.map(message => `[${message.type().toUpperCase()}] ${message.text()}`).join('\n');
-return [{
-uri,
-mimeType: 'text/plain',
-text: log
-}];
+handle: async (context, params) => {
+return await context.runTask('Capture browser snapshot', true);
 },
-};
+});

30
src/loopTools/tool.ts Normal file

@@ -0,0 +1,30 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type { z } from 'zod';
import type * as mcpServer from '../mcp/server.js';
import type { Context } from './context.js';
import type { ToolSchema } from '../mcp/tool.js';
export type Tool<Input extends z.Schema = z.Schema> = {
schema: ToolSchema<Input>;
handle: (context: Context, params: z.output<Input>) => Promise<mcpServer.CallToolResult>;
};
export function defineTool<Input extends z.Schema>(tool: Tool<Input>): Tool<Input> {
return tool;
}

2
src/mcp/DEPS.list Normal file

@@ -0,0 +1,2 @@
[*]
../utils/

1
src/mcp/README.md Normal file

@@ -0,0 +1 @@
- Generic MCP utils, no dependencies on Playwright here.


@@ -0,0 +1,92 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type { Server } from '@modelcontextprotocol/sdk/server/index.js';
import type { Transport, TransportSendOptions } from '@modelcontextprotocol/sdk/shared/transport.js';
import type { JSONRPCMessage, MessageExtraInfo } from '@modelcontextprotocol/sdk/types.js';
export class InProcessTransport implements Transport {
private _server: Server;
private _serverTransport: InProcessServerTransport;
private _connected: boolean = false;
constructor(server: Server) {
this._server = server;
this._serverTransport = new InProcessServerTransport(this);
}
async start(): Promise<void> {
if (this._connected)
throw new Error('InProcessTransport already started!');
await this._server.connect(this._serverTransport);
this._connected = true;
}
async send(message: JSONRPCMessage, options?: TransportSendOptions): Promise<void> {
if (!this._connected)
throw new Error('Transport not connected');
this._serverTransport._receiveFromClient(message);
}
async close(): Promise<void> {
if (this._connected) {
this._connected = false;
this.onclose?.();
this._serverTransport.onclose?.();
}
}
onclose?: (() => void) | undefined;
onerror?: ((error: Error) => void) | undefined;
onmessage?: ((message: JSONRPCMessage, extra?: MessageExtraInfo) => void) | undefined;
sessionId?: string | undefined;
setProtocolVersion?: ((version: string) => void) | undefined;
_receiveFromServer(message: JSONRPCMessage, extra?: MessageExtraInfo): void {
this.onmessage?.(message, extra);
}
}
class InProcessServerTransport implements Transport {
private _clientTransport: InProcessTransport;
constructor(clientTransport: InProcessTransport) {
this._clientTransport = clientTransport;
}
async start(): Promise<void> {
}
async send(message: JSONRPCMessage, options?: TransportSendOptions): Promise<void> {
this._clientTransport._receiveFromServer(message);
}
async close(): Promise<void> {
this.onclose?.();
}
onclose?: (() => void) | undefined;
onerror?: ((error: Error) => void) | undefined;
onmessage?: ((message: JSONRPCMessage, extra?: MessageExtraInfo) => void) | undefined;
sessionId?: string | undefined;
setProtocolVersion?: ((version: string) => void) | undefined;
_receiveFromClient(message: JSONRPCMessage): void {
this.onmessage?.(message);
}
}
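
The two transport classes above form a linked pair: whatever one side sends is delivered synchronously to the other side's `onmessage` callback. A minimal sketch of that pairing, stripped of the MCP SDK types, might look as follows. `Endpoint` and `Message` are hypothetical names, and the lifecycle methods (`start`, `close`) are left out.

```typescript
// Hypothetical message shape, loosely modelled on a JSON-RPC request/response.
type Message = { id: number; method?: string; result?: unknown };

class Endpoint {
  onmessage?: (message: Message) => void;
  private _peer: Endpoint | undefined;

  // Creates two endpoints that deliver to each other.
  static pair(): [Endpoint, Endpoint] {
    const client = new Endpoint();
    const server = new Endpoint();
    client._peer = server;
    server._peer = client;
    return [client, server];
  }

  send(message: Message): void {
    if (!this._peer)
      throw new Error('Endpoint is not connected');
    // Delivery is a direct function call: no serialization, no I/O.
    this._peer.onmessage?.(message);
  }
}
```

Because delivery is a plain function call, a reply sent from inside the peer's handler arrives before the original `send` returns. Code that depends on asynchronous ordering should keep that in mind.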

131
src/mcp/proxyBackend.ts Normal file

@@ -0,0 +1,131 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { ListRootsRequestSchema, PingRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { logUnhandledError } from '../utils/log.js';
import { packageJSON } from '../utils/package.js';
import type { ServerBackend, ClientVersion, Root } from './server.js';
import type { Transport } from '@modelcontextprotocol/sdk/shared/transport.js';
import type { Tool, CallToolResult, CallToolRequest } from '@modelcontextprotocol/sdk/types.js';
export type MCPProvider = {
name: string;
description: string;
connect(): Promise<Transport>;
};
export class ProxyBackend implements ServerBackend {
name = 'Playwright MCP Client Switcher';
version = packageJSON.version;
private _mcpProviders: MCPProvider[];
private _currentClient: Client | undefined;
private _contextSwitchTool: Tool;
private _roots: Root[] = [];
constructor(mcpProviders: MCPProvider[]) {
this._mcpProviders = mcpProviders;
this._contextSwitchTool = this._defineContextSwitchTool();
}
async initialize(clientVersion: ClientVersion, roots: Root[]): Promise<void> {
this._roots = roots;
await this._setCurrentClient(this._mcpProviders[0]);
}
async listTools(): Promise<Tool[]> {
const response = await this._currentClient!.listTools();
if (this._mcpProviders.length === 1)
return response.tools;
return [
...response.tools,
this._contextSwitchTool,
];
}
async callTool(name: string, args: CallToolRequest['params']['arguments']): Promise<CallToolResult> {
if (name === this._contextSwitchTool.name)
return this._callContextSwitchTool(args);
return await this._currentClient!.callTool({
name,
arguments: args,
}) as CallToolResult;
}
serverClosed?(): void {
void this._currentClient?.close().catch(logUnhandledError);
}
private async _callContextSwitchTool(params: any): Promise<CallToolResult> {
try {
const factory = this._mcpProviders.find(factory => factory.name === params.name);
if (!factory)
throw new Error('Unknown connection method: ' + params.name);
await this._setCurrentClient(factory);
return {
content: [{ type: 'text', text: '### Result\nSuccessfully changed connection method.\n' }],
};
} catch (error) {
return {
content: [{ type: 'text', text: `### Result\nError: ${error}\n` }],
isError: true,
};
}
}
private _defineContextSwitchTool(): Tool {
return {
name: 'browser_connect',
description: [
'Connect to a browser using one of the available methods:',
...this._mcpProviders.map(factory => `- "${factory.name}": ${factory.description}`),
].join('\n'),
inputSchema: zodToJsonSchema(z.object({
name: z.enum(this._mcpProviders.map(factory => factory.name) as [string, ...string[]]).default(this._mcpProviders[0].name).describe('The method to use to connect to the browser'),
}), { strictUnions: true }) as Tool['inputSchema'],
annotations: {
title: 'Connect to a browser context',
readOnlyHint: true,
openWorldHint: false,
},
};
}
private async _setCurrentClient(factory: MCPProvider) {
await this._currentClient?.close();
this._currentClient = undefined;
const client = new Client({ name: 'Playwright MCP Proxy', version: packageJSON.version });
client.registerCapabilities({
roots: {
listRoots: true,
},
});
client.setRequestHandler(ListRootsRequestSchema, () => ({ roots: this._roots }));
client.setRequestHandler(PingRequestSchema, () => ({}));
const transport = await factory.connect();
await client.connect(transport);
this._currentClient = client;
}
}
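
`_defineContextSwitchTool` above builds the `browser_connect` description from the provider list. The string assembly alone can be sketched like this; `Provider` and `describeProviders` are hypothetical names, and the provider values in any usage are made up for illustration.

```typescript
// Hypothetical subset of MCPProvider: only the fields used for the description.
type Provider = { name: string; description: string };

function describeProviders(providers: Provider[]): string {
  return [
    'Connect to a browser using one of the available methods:',
    // One bullet per provider, with the name quoted so it can be passed back verbatim.
    ...providers.map(provider => `- "${provider.name}": ${provider.description}`),
  ].join('\n');
}
```

The quoted names in the description line up with the `z.enum` values in the tool's input schema, so a model can pick one directly from the text.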

122
src/mcp/server.ts Normal file

@@ -0,0 +1,122 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import debug from 'debug';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { ManualPromise } from '../utils/manualPromise.js';
import { logUnhandledError } from '../utils/log.js';
import type { Tool, CallToolResult, CallToolRequest, Root } from '@modelcontextprotocol/sdk/types.js';
import type { Transport } from '@modelcontextprotocol/sdk/shared/transport.js';
export type { Server } from '@modelcontextprotocol/sdk/server/index.js';
export type { Tool, CallToolResult, CallToolRequest, Root } from '@modelcontextprotocol/sdk/types.js';
const serverDebug = debug('pw:mcp:server');
export type ClientVersion = { name: string, version: string };
export interface ServerBackend {
name: string;
version: string;
initialize?(clientVersion: ClientVersion, roots: Root[]): Promise<void>;
listTools(): Promise<Tool[]>;
callTool(name: string, args: CallToolRequest['params']['arguments']): Promise<CallToolResult>;
serverClosed?(): void;
}
export type ServerBackendFactory = () => ServerBackend;
export async function connect(serverBackendFactory: ServerBackendFactory, transport: Transport, runHeartbeat: boolean) {
const backend = serverBackendFactory();
const server = createServer(backend, runHeartbeat);
await server.connect(transport);
}
export function createServer(backend: ServerBackend, runHeartbeat: boolean): Server {
const initializedPromise = new ManualPromise<void>();
const server = new Server({ name: backend.name, version: backend.version }, {
capabilities: {
tools: {},
}
});
server.setRequestHandler(ListToolsRequestSchema, async () => {
serverDebug('listTools');
await initializedPromise;
const tools = await backend.listTools();
return { tools };
});
let heartbeatRunning = false;
server.setRequestHandler(CallToolRequestSchema, async request => {
serverDebug('callTool', request);
await initializedPromise;
if (runHeartbeat && !heartbeatRunning) {
heartbeatRunning = true;
startHeartbeat(server);
}
try {
return await backend.callTool(request.params.name, request.params.arguments || {});
} catch (error) {
return {
content: [{ type: 'text', text: '### Result\n' + String(error) }],
isError: true,
};
}
});
addServerListener(server, 'initialized', async () => {
try {
const capabilities = server.getClientCapabilities();
let clientRoots: Root[] = [];
if (capabilities?.roots) {
const { roots } = await server.listRoots(undefined, { timeout: 2_000 }).catch(() => ({ roots: [] }));
clientRoots = roots;
}
const clientVersion = server.getClientVersion() ?? { name: 'unknown', version: 'unknown' };
await backend.initialize?.(clientVersion, clientRoots);
initializedPromise.resolve();
} catch (e) {
logUnhandledError(e);
}
});
addServerListener(server, 'close', () => backend.serverClosed?.());
return server;
}
const startHeartbeat = (server: Server) => {
const beat = () => {
Promise.race([
server.ping(),
new Promise((_, reject) => setTimeout(() => reject(new Error('ping timeout')), 5000)),
]).then(() => {
setTimeout(beat, 3000);
}).catch(() => {
void server.close();
});
};
beat();
};
function addServerListener(server: Server, event: 'close' | 'initialized', listener: () => void) {
const oldListener = server[`on${event}`];
server[`on${event}`] = () => {
oldListener?.();
listener();
};
}
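
`addServerListener` above chains a new listener behind whatever callback is already installed, rather than overwriting it. A standalone sketch of the same pattern, assuming a plain object with optional `onclose` and `oninitialized` hooks (the `Hooks` type and `addListener` name are hypothetical):

```typescript
// Hypothetical stand-in for the two callback slots used on the SDK Server object.
type Hooks = { onclose?: () => void; oninitialized?: () => void };

function addListener(target: Hooks, event: 'close' | 'initialized', listener: () => void): void {
  const key = event === 'close' ? 'onclose' : 'oninitialized';
  const previous = target[key];
  // Wrap rather than replace, so earlier listeners keep firing first.
  target[key] = () => {
    previous?.();
    listener();
  };
}
```

Each call adds one wrapper layer, so listeners run in registration order. There is no way to remove a listener in this sketch, which seems acceptable for callbacks that live as long as the server.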

42
src/mcp/tool.ts Normal file

@@ -0,0 +1,42 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { zodToJsonSchema } from 'zod-to-json-schema';
import type { z } from 'zod';
import type * as mcpServer from './server.js';
export type ToolSchema<Input extends z.Schema> = {
name: string;
title: string;
description: string;
inputSchema: Input;
type: 'readOnly' | 'destructive';
};
export function toMcpTool(tool: ToolSchema<any>): mcpServer.Tool {
return {
name: tool.name,
description: tool.description,
inputSchema: zodToJsonSchema(tool.inputSchema, { strictUnions: true }) as mcpServer.Tool['inputSchema'],
annotations: {
title: tool.title,
readOnlyHint: tool.type === 'readOnly',
destructiveHint: tool.type === 'destructive',
openWorldHint: true,
},
};
}
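
The annotation mapping in `toMcpTool` above derives two boolean hints from a single `type` field. A minimal sketch of just that mapping, without the zod-to-JSON-schema step (`ToolType`, `Annotations` and `toAnnotations` are hypothetical names):

```typescript
type ToolType = 'readOnly' | 'destructive';

// Hypothetical subset of the MCP tool annotations set in the diff.
type Annotations = {
  title: string;
  readOnlyHint: boolean;
  destructiveHint: boolean;
  openWorldHint: boolean;
};

function toAnnotations(title: string, type: ToolType): Annotations {
  return {
    title,
    // Exactly one of the two hints is true for any given tool type.
    readOnlyHint: type === 'readOnly',
    destructiveHint: type === 'destructive',
    openWorldHint: true,
  };
}
```

For example, `toAnnotations('Take a snapshot of the browser', 'readOnly')` sets `readOnlyHint` to `true` and `destructiveHint` to `false`.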

137
src/mcp/transport.ts Normal file

@@ -0,0 +1,137 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import http from 'http';
import crypto from 'crypto';
import debug from 'debug';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { httpAddressToString, startHttpServer } from '../utils/httpServer.js';
import * as mcpServer from './server.js';
import type { ServerBackendFactory } from './server.js';
export async function start(serverBackendFactory: ServerBackendFactory, options: { host?: string; port?: number }) {
if (options.port !== undefined) {
const httpServer = await startHttpServer(options);
startHttpTransport(httpServer, serverBackendFactory);
} else {
await startStdioTransport(serverBackendFactory);
}
}
async function startStdioTransport(serverBackendFactory: ServerBackendFactory) {
await mcpServer.connect(serverBackendFactory, new StdioServerTransport(), false);
}
const testDebug = debug('pw:mcp:test');
async function handleSSE(serverBackendFactory: ServerBackendFactory, req: http.IncomingMessage, res: http.ServerResponse, url: URL, sessions: Map<string, SSEServerTransport>) {
if (req.method === 'POST') {
const sessionId = url.searchParams.get('sessionId');
if (!sessionId) {
res.statusCode = 400;
return res.end('Missing sessionId');
}
const transport = sessions.get(sessionId);
if (!transport) {
res.statusCode = 404;
return res.end('Session not found');
}
return await transport.handlePostMessage(req, res);
} else if (req.method === 'GET') {
const transport = new SSEServerTransport('/sse', res);
sessions.set(transport.sessionId, transport);
testDebug(`create SSE session: ${transport.sessionId}`);
await mcpServer.connect(serverBackendFactory, transport, false);
res.on('close', () => {
testDebug(`delete SSE session: ${transport.sessionId}`);
sessions.delete(transport.sessionId);
});
return;
}
res.statusCode = 405;
res.end('Method not allowed');
}
async function handleStreamable(serverBackendFactory: ServerBackendFactory, req: http.IncomingMessage, res: http.ServerResponse, sessions: Map<string, StreamableHTTPServerTransport>) {
const sessionId = req.headers['mcp-session-id'] as string | undefined;
if (sessionId) {
const transport = sessions.get(sessionId);
if (!transport) {
res.statusCode = 404;
res.end('Session not found');
return;
}
return await transport.handleRequest(req, res);
}
if (req.method === 'POST') {
const transport = new StreamableHTTPServerTransport({
sessionIdGenerator: () => crypto.randomUUID(),
onsessioninitialized: async sessionId => {
testDebug(`create http session: ${transport.sessionId}`);
await mcpServer.connect(serverBackendFactory, transport, true);
sessions.set(sessionId, transport);
}
});
transport.onclose = () => {
if (!transport.sessionId)
return;
sessions.delete(transport.sessionId);
testDebug(`delete http session: ${transport.sessionId}`);
};
await transport.handleRequest(req, res);
return;
}
res.statusCode = 400;
res.end('Invalid request');
}
function startHttpTransport(httpServer: http.Server, serverBackendFactory: ServerBackendFactory) {
const sseSessions = new Map();
const streamableSessions = new Map();
httpServer.on('request', async (req, res) => {
const url = new URL(`http://localhost${req.url}`);
if (url.pathname.startsWith('/sse'))
await handleSSE(serverBackendFactory, req, res, url, sseSessions);
else
await handleStreamable(serverBackendFactory, req, res, streamableSessions);
});
const url = httpAddressToString(httpServer.address());
const message = [
`Listening on ${url}`,
'Put this in your client config:',
JSON.stringify({
'mcpServers': {
'playwright': {
'url': `${url}/mcp`
}
}
}, undefined, 2),
'For legacy SSE transport support, you can use the /sse endpoint instead.',
].join('\n');
// eslint-disable-next-line no-console
console.error(message);
}
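
The POST branch of `handleSSE` above resolves a session in two steps: a missing `sessionId` yields 400 and an unknown one yields 404. A hedged sketch of that decision as a pure function, detached from `http` request and response objects (`RouteResult` and `resolveSsePost` are hypothetical names):

```typescript
// Hypothetical result type: either the transport to forward to, or an error status and body.
type RouteResult<T> =
  | { status: 200; transport: T }
  | { status: 400 | 404; body: string };

function resolveSsePost<T>(sessionId: string | null, sessions: Map<string, T>): RouteResult<T> {
  if (!sessionId)
    return { status: 400, body: 'Missing sessionId' };
  const transport = sessions.get(sessionId);
  if (transport === undefined)
    return { status: 404, body: 'Session not found' };
  return { status: 200, transport };
}
```

Separating the lookup from the response writing keeps the status-code rules checkable without starting an HTTP server. The caller would still forward the request to the returned transport.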


@@ -14,134 +14,113 @@
 * limitations under the License.
 */
-import http from 'http';
-import fs from 'fs';
-import os from 'os';
-import path from 'path';
+import { program, Option } from 'commander';
+import * as mcpServer from './mcp/server.js';
+import * as mcpTransport from './mcp/transport.js';
+import { commaSeparatedList, resolveCLIConfig, semicolonSeparatedList } from './config.js';
+import { packageJSON } from './utils/package.js';
+import { Context } from './context.js';
+import { contextFactory } from './browserContextFactory.js';
+import { runLoopTools } from './loopTools/main.js';
+import { ProxyBackend } from './mcp/proxyBackend.js';
+import { BrowserServerBackend } from './browserServerBackend.js';
+import { ExtensionContextFactory } from './extension/extensionContextFactory.js';
+import { InProcessTransport } from './mcp/inProcessTransport.js';
-import { program } from 'commander';
-import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
-import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';
-import { createServer } from './index';
-import { ServerList } from './server';
-import type { LaunchOptions } from 'playwright';
-import assert from 'assert';
-const packageJSON = require('../package.json');
+import type { MCPProvider } from './mcp/proxyBackend.js';
+import type { FullConfig } from './config.js';
+import type { BrowserContextFactory } from './browserContextFactory.js';
 program
 .version('Version ' + packageJSON.version)
 .name(packageJSON.name)
-.option('--headless', 'Run browser in headless mode, headed by default')
-.option('--user-data-dir <path>', 'Path to the user data directory')
-.option('--vision', 'Run server that uses screenshots (Aria snapshots are used by default)')
-.option('--port <port>', 'Port to listen on for SSE transport.')
+.option('--allowed-origins <origins>', 'semicolon-separated list of origins to allow the browser to request. Default is to allow all.', semicolonSeparatedList)
+.option('--blocked-origins <origins>', 'semicolon-separated list of origins to block the browser from requesting. Blocklist is evaluated before allowlist. If used without the allowlist, requests not matching the blocklist are still allowed.', semicolonSeparatedList)
+.option('--block-service-workers', 'block service workers')
+.option('--browser <browser>', 'browser or chrome channel to use, possible values: chrome, firefox, webkit, msedge.')
+.option('--caps <caps>', 'comma-separated list of additional capabilities to enable, possible values: vision, pdf.', commaSeparatedList)
+.option('--cdp-endpoint <endpoint>', 'CDP endpoint to connect to.')
+.option('--config <path>', 'path to the configuration file.')
+.option('--device <device>', 'device to emulate, for example: "iPhone 15"')
+.option('--executable-path <path>', 'path to the browser executable.')
+.option('--extension', 'Connect to a running browser instance (Edge/Chrome only). Requires the "Playwright MCP Bridge" browser extension to be installed.')
+.option('--headless', 'run browser in headless mode, headed by default')
+.option('--host <host>', 'host to bind server to. Default is localhost. Use 0.0.0.0 to bind to all interfaces.')
+.option('--ignore-https-errors', 'ignore https errors')
+.option('--isolated', 'keep the browser profile in memory, do not save it to disk.')
+.option('--image-responses <mode>', 'whether to send image responses to the client. Can be "allow" or "omit", Defaults to "allow".')
+.option('--no-sandbox', 'disable the sandbox for all process types that are normally sandboxed.')
+.option('--output-dir <path>', 'path to the directory for output files.')
+.option('--port <port>', 'port to listen on for SSE transport.')
+.option('--proxy-bypass <bypass>', 'comma-separated domains to bypass proxy, for example ".com,chromium.org,.domain.com"')
+.option('--proxy-server <proxy>', 'specify proxy server, for example "http://myproxy:3128" or "socks5://myproxy:8080"')
+.option('--save-session', 'Whether to save the Playwright MCP session into the output directory.')
+.option('--save-trace', 'Whether to save the Playwright Trace of the session into the output directory.')
+.option('--storage-state <path>', 'path to the storage state file for isolated sessions.')
+.option('--user-agent <ua string>', 'specify user agent string')
+.option('--user-data-dir <path>', 'path to the user data directory. If not specified, a temporary directory will be created.')
.option('--viewport-size <size>', 'specify browser viewport size in pixels, for example "1280, 720"')
.addOption(new Option('--connect-tool', 'Allow to switch between different browser connection methods.').hideHelp())
.addOption(new Option('--loop-tools', 'Run loop tools').hideHelp())
.addOption(new Option('--vision', 'Legacy option, use --caps=vision instead').hideHelp())
.action(async options => {
const launchOptions: LaunchOptions = {
headless: !!options.headless,
channel: 'chrome',
};
const userDataDir = options.userDataDir ?? await createUserDataDir();
const serverList = new ServerList(() => createServer({
userDataDir,
launchOptions,
vision: !!options.vision,
}));
setupExitWatchdog(serverList);
setupExitWatchdog();
if (options.port) {
startSSEServer(+options.port, serverList);
} else {
const server = await serverList.create();
await server.connect(new StdioServerTransport());
if (options.vision) {
// eslint-disable-next-line no-console
console.error('The --vision option is deprecated, use --caps=vision instead');
options.caps = 'vision';
}
const config = await resolveCLIConfig(options);
if (options.extension) {
const contextFactory = createExtensionContextFactory(config);
const serverBackendFactory = () => new BrowserServerBackend(config, contextFactory);
await mcpTransport.start(serverBackendFactory, config.server);
return;
}
if (options.loopTools) {
await runLoopTools(config);
return;
}
const browserContextFactory = contextFactory(config);
const providers: MCPProvider[] = [mcpProviderForBrowserContextFactory(config, browserContextFactory)];
if (options.connectTool)
providers.push(mcpProviderForBrowserContextFactory(config, createExtensionContextFactory(config)));
await mcpTransport.start(() => new ProxyBackend(providers), config.server);
});
function setupExitWatchdog(serverList: ServerList) {
process.stdin.on('close', async () => {
function setupExitWatchdog() {
let isExiting = false;
const handleExit = async () => {
if (isExiting)
return;
isExiting = true;
setTimeout(() => process.exit(0), 15000);
await serverList.closeAll();
await Context.disposeAll();
process.exit(0);
});
};
process.stdin.on('close', handleExit);
process.on('SIGINT', handleExit);
process.on('SIGTERM', handleExit);
}
program.parse(process.argv);
async function createUserDataDir() {
let cacheDirectory: string;
if (process.platform === 'linux')
cacheDirectory = process.env.XDG_CACHE_HOME || path.join(os.homedir(), '.cache');
else if (process.platform === 'darwin')
cacheDirectory = path.join(os.homedir(), 'Library', 'Caches');
else if (process.platform === 'win32')
cacheDirectory = process.env.LOCALAPPDATA || path.join(os.homedir(), 'AppData', 'Local');
else
throw new Error('Unsupported platform: ' + process.platform);
const result = path.join(cacheDirectory, 'ms-playwright', 'mcp-chrome-profile');
await fs.promises.mkdir(result, { recursive: true });
return result;
function createExtensionContextFactory(config: FullConfig) {
return new ExtensionContextFactory(config.browser.launchOptions.channel || 'chrome', config.browser.userDataDir);
}
async function startSSEServer(port: number, serverList: ServerList) {
const sessions = new Map<string, SSEServerTransport>();
const httpServer = http.createServer(async (req, res) => {
if (req.method === 'POST') {
const searchParams = new URL(`http://localhost${req.url}`).searchParams;
const sessionId = searchParams.get('sessionId');
if (!sessionId) {
res.statusCode = 400;
res.end('Missing sessionId');
return;
}
const transport = sessions.get(sessionId);
if (!transport) {
res.statusCode = 404;
res.end('Session not found');
return;
}
await transport.handlePostMessage(req, res);
return;
} else if (req.method === 'GET') {
const transport = new SSEServerTransport('/sse', res);
sessions.set(transport.sessionId, transport);
const server = await serverList.create();
res.on('close', () => {
sessions.delete(transport.sessionId);
serverList.close(server).catch(e => console.error(e));
});
await server.connect(transport);
return;
} else {
res.statusCode = 405;
res.end('Method not allowed');
}
});
httpServer.listen(port, () => {
const address = httpServer.address();
assert(address, 'Could not bind server socket');
let url: string;
if (typeof address === 'string') {
url = address;
} else {
const resolvedPort = address.port;
let resolvedHost = address.family === 'IPv4' ? address.address : `[${address.address}]`;
if (resolvedHost === '0.0.0.0' || resolvedHost === '[::]')
resolvedHost = 'localhost';
url = `http://${resolvedHost}:${resolvedPort}`;
}
console.log(`Listening on ${url}`);
console.log('Put this in your client config:');
console.log(JSON.stringify({
'mcpServers': {
'playwright': {
'url': `${url}/sse`
}
}
}, undefined, 2));
});
function mcpProviderForBrowserContextFactory(config: FullConfig, browserContextFactory: BrowserContextFactory) {
return {
name: browserContextFactory.name,
description: browserContextFactory.description,
connect: async () => {
const server = mcpServer.createServer(new BrowserServerBackend(config, browserContextFactory), false);
return new InProcessTransport(server);
},
};
}
void program.parseAsync(process.argv);
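
Note on the list-valued options above: `semicolonSeparatedList` and `commaSeparatedList` are passed to commander as argument parsers, but their bodies live in `./config.js` and are not part of this diff. The following is only a sketch of what such a parser might look like, assuming it splits on the separator, trims whitespace and drops empty items. The names `separatedList`, `semicolonList` and `commaList` are hypothetical and the real helpers may behave differently.

```typescript
// Hypothetical helper: builds a parser that splits a raw CLI value on `separator`.
// The real implementations in ./config.js are not shown in this diff and may differ.
function separatedList(separator: string): (value: string | undefined) => string[] | undefined {
  return value => {
    // An absent or empty option stays undefined so defaults can apply later.
    if (!value)
      return undefined;
    return value
        .split(separator)
        .map(item => item.trim())
        .filter(item => item.length > 0);
  };
}

const semicolonList = separatedList(';');
const commaList = separatedList(',');

// Example input in the shape accepted by --allowed-origins.
const parsedOrigins = semicolonList('https://a.test; https://b.test;');
```

Keeping the parser a plain function makes it usable both as the third argument to `.option(...)` and when reading the same setting from another source.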

201
src/response.ts Normal file

@@ -0,0 +1,201 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { renderModalStates } from './tab.js';
import type { Tab, TabSnapshot } from './tab.js';
import type { ImageContent, TextContent } from '@modelcontextprotocol/sdk/types.js';
import type { Context } from './context.js';
export class Response {
private _result: string[] = [];
private _code: string[] = [];
private _images: { contentType: string, data: Buffer }[] = [];
private _context: Context;
private _includeSnapshot = false;
private _includeTabs = false;
private _tabSnapshot: TabSnapshot | undefined;
readonly toolName: string;
readonly toolArgs: Record<string, any>;
private _isError: boolean | undefined;
constructor(context: Context, toolName: string, toolArgs: Record<string, any>) {
this._context = context;
this.toolName = toolName;
this.toolArgs = toolArgs;
}
addResult(result: string) {
this._result.push(result);
}
addError(error: string) {
this._result.push(error);
this._isError = true;
}
isError() {
return this._isError;
}
result() {
return this._result.join('\n');
}
addCode(code: string) {
this._code.push(code);
}
code() {
return this._code.join('\n');
}
addImage(image: { contentType: string, data: Buffer }) {
this._images.push(image);
}
images() {
return this._images;
}
setIncludeSnapshot() {
this._includeSnapshot = true;
}
setIncludeTabs() {
this._includeTabs = true;
}
async finish() {
// All the async snapshotting post-action is happening here.
// Everything below should race against modal states.
if (this._includeSnapshot && this._context.currentTab())
this._tabSnapshot = await this._context.currentTabOrDie().captureSnapshot();
for (const tab of this._context.tabs())
await tab.updateTitle();
}
tabSnapshot(): TabSnapshot | undefined {
return this._tabSnapshot;
}
serialize(): { content: (TextContent | ImageContent)[], isError?: boolean } {
const response: string[] = [];
// Start with command result.
if (this._result.length) {
response.push('### Result');
response.push(this._result.join('\n'));
response.push('');
}
// Add code if it exists.
if (this._code.length) {
response.push(`### Ran Playwright code
\`\`\`js
${this._code.join('\n')}
\`\`\``);
response.push('');
}
// List browser tabs.
if (this._includeSnapshot || this._includeTabs)
response.push(...renderTabsMarkdown(this._context.tabs(), this._includeTabs));
// Add snapshot if provided.
if (this._tabSnapshot?.modalStates.length) {
response.push(...renderModalStates(this._context, this._tabSnapshot.modalStates));
response.push('');
} else if (this._tabSnapshot) {
response.push(renderTabSnapshot(this._tabSnapshot));
response.push('');
}
// Main response part
const content: (TextContent | ImageContent)[] = [
{ type: 'text', text: response.join('\n') },
];
// Image attachments.
if (this._context.config.imageResponses !== 'omit') {
for (const image of this._images)
content.push({ type: 'image', data: image.data.toString('base64'), mimeType: image.contentType });
}
return { content, isError: this._isError };
}
}
function renderTabSnapshot(tabSnapshot: TabSnapshot): string {
const lines: string[] = [];
if (tabSnapshot.consoleMessages.length) {
lines.push(`### New console messages`);
for (const message of tabSnapshot.consoleMessages)
lines.push(`- ${trim(message.toString(), 100)}`);
lines.push('');
}
if (tabSnapshot.downloads.length) {
lines.push(`### Downloads`);
for (const entry of tabSnapshot.downloads) {
if (entry.finished)
lines.push(`- Downloaded file ${entry.download.suggestedFilename()} to ${entry.outputFile}`);
else
lines.push(`- Downloading file ${entry.download.suggestedFilename()} ...`);
}
lines.push('');
}
lines.push(`### Page state`);
lines.push(`- Page URL: ${tabSnapshot.url}`);
lines.push(`- Page Title: ${tabSnapshot.title}`);
lines.push(`- Page Snapshot:`);
lines.push('```yaml');
lines.push(tabSnapshot.ariaSnapshot);
lines.push('```');
return lines.join('\n');
}
function renderTabsMarkdown(tabs: Tab[], force: boolean = false): string[] {
if (tabs.length === 1 && !force)
return [];
if (!tabs.length) {
return [
'### Open tabs',
'No open tabs. Use the "browser_navigate" tool to navigate to a page first.',
'',
];
}
const lines: string[] = ['### Open tabs'];
for (let i = 0; i < tabs.length; i++) {
const tab = tabs[i];
const current = tab.isCurrentTab() ? ' (current)' : '';
lines.push(`- ${i}:${current} [${tab.lastTitle()}] (${tab.page.url()})`);
}
lines.push('');
return lines;
}
function trim(text: string, maxLength: number) {
if (text.length <= maxLength)
return text;
return text.slice(0, maxLength) + '...';
}
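
The tab listing and the `trim()` helper above can be exercised in isolation. The sketch below reproduces their logic against a simplified stand-in for `Tab`. The `TabInfo` type and the names `trimText`, `renderTabList`, `exampleTabs` and `renderedTabs` are assumptions made for illustration, since the real class needs a live Playwright page.

```typescript
// Simplified stand-in for the Tab class: only the fields the renderer reads.
type TabInfo = { title: string; url: string; current: boolean };

// Mirrors trim(): keep at most maxLength characters, then append '...'.
function trimText(text: string, maxLength: number): string {
  if (text.length <= maxLength)
    return text;
  return text.slice(0, maxLength) + '...';
}

// Mirrors renderTabsMarkdown(): a single open tab is omitted unless `force` is set.
function renderTabList(tabs: TabInfo[], force: boolean = false): string[] {
  if (tabs.length === 1 && !force)
    return [];
  if (!tabs.length) {
    return [
      '### Open tabs',
      'No open tabs. Use the "browser_navigate" tool to navigate to a page first.',
      '',
    ];
  }
  const lines: string[] = ['### Open tabs'];
  tabs.forEach((tab, i) => {
    const current = tab.current ? ' (current)' : '';
    lines.push(`- ${i}:${current} [${tab.title}] (${tab.url})`);
  });
  lines.push('');
  return lines;
}

const exampleTabs: TabInfo[] = [
  { title: 'Home', url: 'https://example.com/', current: true },
  { title: 'Docs', url: 'https://example.com/docs', current: false },
];
const renderedTabs = renderTabList(exampleTabs);
```

With one open tab and `force` left at `false`, the function returns an empty array, which is why `serialize()` passes `this._includeTabs` as the second argument when tabs were requested explicitly.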


@@ -1,116 +0,0 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { CallToolRequestSchema, ListResourcesRequestSchema, ListToolsRequestSchema, ReadResourceRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { Context } from './context';
import type { Tool } from './tools/tool';
import type { Resource } from './resources/resource';
import type { LaunchOptions } from 'playwright';
type Options = {
name: string;
version: string;
tools: Tool[];
resources: Resource[],
userDataDir: string;
launchOptions?: LaunchOptions;
};
export function createServerWithTools(options: Options): Server {
const { name, version, tools, resources, userDataDir, launchOptions } = options;
const context = new Context(userDataDir, launchOptions);
const server = new Server({ name, version }, {
capabilities: {
tools: {},
resources: {},
}
});
server.setRequestHandler(ListToolsRequestSchema, async () => {
return { tools: tools.map(tool => tool.schema) };
});
server.setRequestHandler(ListResourcesRequestSchema, async () => {
return { resources: resources.map(resource => resource.schema) };
});
server.setRequestHandler(CallToolRequestSchema, async request => {
const tool = tools.find(tool => tool.schema.name === request.params.name);
if (!tool) {
return {
content: [{ type: 'text', text: `Tool "${request.params.name}" not found` }],
isError: true,
};
}
try {
const result = await tool.handle(context, request.params.arguments);
return result;
} catch (error) {
return {
content: [{ type: 'text', text: String(error) }],
isError: true,
};
}
});
server.setRequestHandler(ReadResourceRequestSchema, async request => {
const resource = resources.find(resource => resource.schema.uri === request.params.uri);
if (!resource)
return { contents: [] };
const contents = await resource.read(context, request.params.uri);
return { contents };
});
const oldClose = server.close.bind(server);
server.close = async () => {
await oldClose();
await context.close();
};
return server;
}
export class ServerList {
private _servers: Server[] = [];
private _serverFactory: () => Server;
constructor(serverFactory: () => Server) {
this._serverFactory = serverFactory;
}
async create() {
const server = this._serverFactory();
this._servers.push(server);
return server;
}
async close(server: Server) {
const index = this._servers.indexOf(server);
if (index !== -1)
this._servers.splice(index, 1);
await server.close();
}
async closeAll() {
await Promise.all(this._servers.map(server => server.close()));
}
}

176
src/sessionLog.ts Normal file

@@ -0,0 +1,176 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import fs from 'fs';
import path from 'path';
import { Response } from './response.js';
import { logUnhandledError } from './utils/log.js';
import { outputFile } from './config.js';
import type { FullConfig } from './config.js';
import type * as actions from './actions.js';
import type { Tab, TabSnapshot } from './tab.js';
type LogEntry = {
timestamp: number;
toolCall?: {
toolName: string;
toolArgs: Record<string, any>;
result: string;
isError?: boolean;
};
userAction?: actions.Action;
code: string;
tabSnapshot?: TabSnapshot;
};
export class SessionLog {
private _folder: string;
private _file: string;
private _ordinal = 0;
private _pendingEntries: LogEntry[] = [];
private _sessionFileQueue = Promise.resolve();
private _flushEntriesTimeout: NodeJS.Timeout | undefined;
constructor(sessionFolder: string) {
this._folder = sessionFolder;
this._file = path.join(this._folder, 'session.md');
}
static async create(config: FullConfig, rootPath: string | undefined): Promise<SessionLog> {
const sessionFolder = await outputFile(config, rootPath, `session-${Date.now()}`);
await fs.promises.mkdir(sessionFolder, { recursive: true });
// eslint-disable-next-line no-console
console.error(`Session: ${sessionFolder}`);
return new SessionLog(sessionFolder);
}
logResponse(response: Response) {
const entry: LogEntry = {
timestamp: performance.now(),
toolCall: {
toolName: response.toolName,
toolArgs: response.toolArgs,
result: response.result(),
isError: response.isError(),
},
code: response.code(),
tabSnapshot: response.tabSnapshot(),
};
this._appendEntry(entry);
}
logUserAction(action: actions.Action, tab: Tab, code: string, isUpdate: boolean) {
code = code.trim();
if (isUpdate) {
const lastEntry = this._pendingEntries[this._pendingEntries.length - 1];
// The pending queue may be empty (for example, right after a flush), so guard the lookup.
if (lastEntry?.userAction?.name === action.name) {
lastEntry.userAction = action;
lastEntry.code = code;
return;
}
}
if (action.name === 'navigate') {
// Already logged at this location.
const lastEntry = this._pendingEntries[this._pendingEntries.length - 1];
if (lastEntry?.tabSnapshot?.url === action.url)
return;
}
const entry: LogEntry = {
timestamp: performance.now(),
userAction: action,
code,
tabSnapshot: {
url: tab.page.url(),
title: '',
ariaSnapshot: action.ariaSnapshot || '',
modalStates: [],
consoleMessages: [],
downloads: [],
},
};
this._appendEntry(entry);
}
private _appendEntry(entry: LogEntry) {
this._pendingEntries.push(entry);
if (this._flushEntriesTimeout)
clearTimeout(this._flushEntriesTimeout);
this._flushEntriesTimeout = setTimeout(() => this._flushEntries(), 1000);
}
private async _flushEntries() {
clearTimeout(this._flushEntriesTimeout);
const entries = this._pendingEntries;
this._pendingEntries = [];
const lines: string[] = [''];
for (const entry of entries) {
const ordinal = (++this._ordinal).toString().padStart(3, '0');
if (entry.toolCall) {
lines.push(
`### Tool call: ${entry.toolCall.toolName}`,
`- Args`,
'```json',
JSON.stringify(entry.toolCall.toolArgs, null, 2),
'```',
);
if (entry.toolCall.result) {
lines.push(
entry.toolCall.isError ? `- Error` : `- Result`,
'```',
entry.toolCall.result,
'```',
);
}
}
if (entry.userAction) {
const actionData = { ...entry.userAction } as any;
delete actionData.ariaSnapshot;
delete actionData.selector;
delete actionData.signals;
lines.push(
`### User action: ${entry.userAction.name}`,
`- Args`,
'```json',
JSON.stringify(actionData, null, 2),
'```',
);
}
if (entry.code) {
lines.push(
`- Code`,
'```js',
entry.code,
'```');
}
if (entry.tabSnapshot) {
const fileName = `${ordinal}.snapshot.yml`;
fs.promises.writeFile(path.join(this._folder, fileName), entry.tabSnapshot.ariaSnapshot).catch(logUnhandledError);
lines.push(`- Snapshot: ${fileName}`);
}
lines.push('', '');
}
this._sessionFileQueue = this._sessionFileQueue.then(() => fs.promises.appendFile(this._file, lines.join('\n')));
}
}
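
The update handling in `logUserAction()` can be illustrated without the file system. This is a minimal sketch under stated assumptions: `PendingAction` and `PendingActionQueue` are hypothetical names, entries are reduced to a name and a code string, and the one-second flush timer is replaced by an explicit `drain()` call.

```typescript
// Simplified entry: the real LogEntry also carries a timestamp, tool call data and a snapshot.
type PendingAction = { name: string; code: string };

class PendingActionQueue {
  private _pending: PendingAction[] = [];

  // When isUpdate is set and the last pending entry has the same action name,
  // overwrite that entry in place instead of appending a new one.
  log(action: PendingAction, isUpdate: boolean): void {
    const last = this._pending[this._pending.length - 1];
    if (isUpdate && last?.name === action.name) {
      last.code = action.code;
      return;
    }
    this._pending.push(action);
  }

  // Hands over everything collected so far and starts a fresh batch,
  // similar to _flushEntries() swapping out _pendingEntries.
  drain(): PendingAction[] {
    const entries = this._pending;
    this._pending = [];
    return entries;
  }
}

const actionQueue = new PendingActionQueue();
actionQueue.log({ name: 'fill', code: "await page.fill('#q', 'p')" }, false);
actionQueue.log({ name: 'fill', code: "await page.fill('#q', 'pl')" }, true);
actionQueue.log({ name: 'click', code: "await page.click('#go')" }, true);
const drainedActions = actionQueue.drain();
```

An update only merges into the entry directly before it. The `click` above is flagged as an update but follows a `fill`, so it is appended as a new entry.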

313
src/tab.ts Normal file

@@ -0,0 +1,313 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { EventEmitter } from 'events';
import * as playwright from 'playwright';
import { callOnPageNoTrace, waitForCompletion } from './tools/utils.js';
import { logUnhandledError } from './utils/log.js';
import { ManualPromise } from './utils/manualPromise.js';
import { ModalState } from './tools/tool.js';
import type { Context } from './context.js';
type PageEx = playwright.Page & {
_snapshotForAI: () => Promise<string>;
};
export const TabEvents = {
modalState: 'modalState'
};
export type TabEventsInterface = {
[TabEvents.modalState]: [modalState: ModalState];
};
export type TabSnapshot = {
url: string;
title: string;
ariaSnapshot: string;
modalStates: ModalState[];
consoleMessages: ConsoleMessage[];
downloads: { download: playwright.Download, finished: boolean, outputFile: string }[];
};
export class Tab extends EventEmitter<TabEventsInterface> {
readonly context: Context;
readonly page: playwright.Page;
private _lastTitle = 'about:blank';
private _consoleMessages: ConsoleMessage[] = [];
private _recentConsoleMessages: ConsoleMessage[] = [];
private _requests: Map<playwright.Request, playwright.Response | null> = new Map();
private _onPageClose: (tab: Tab) => void;
private _modalStates: ModalState[] = [];
private _downloads: { download: playwright.Download, finished: boolean, outputFile: string }[] = [];
constructor(context: Context, page: playwright.Page, onPageClose: (tab: Tab) => void) {
super();
this.context = context;
this.page = page;
this._onPageClose = onPageClose;
page.on('console', event => this._handleConsoleMessage(messageToConsoleMessage(event)));
page.on('pageerror', error => this._handleConsoleMessage(pageErrorToConsoleMessage(error)));
page.on('request', request => this._requests.set(request, null));
page.on('response', response => this._requests.set(response.request(), response));
page.on('close', () => this._onClose());
page.on('filechooser', chooser => {
this.setModalState({
type: 'fileChooser',
description: 'File chooser',
fileChooser: chooser,
});
});
page.on('dialog', dialog => this._dialogShown(dialog));
page.on('download', download => {
void this._downloadStarted(download);
});
page.setDefaultNavigationTimeout(60000);
page.setDefaultTimeout(5000);
(page as any)[tabSymbol] = this;
}
static forPage(page: playwright.Page): Tab | undefined {
return (page as any)[tabSymbol];
}
modalStates(): ModalState[] {
return this._modalStates;
}
setModalState(modalState: ModalState) {
this._modalStates.push(modalState);
this.emit(TabEvents.modalState, modalState);
}
clearModalState(modalState: ModalState) {
this._modalStates = this._modalStates.filter(state => state !== modalState);
}
modalStatesMarkdown(): string[] {
return renderModalStates(this.context, this.modalStates());
}
private _dialogShown(dialog: playwright.Dialog) {
this.setModalState({
type: 'dialog',
description: `"${dialog.type()}" dialog with message "${dialog.message()}"`,
dialog,
});
}
private async _downloadStarted(download: playwright.Download) {
const entry = {
download,
finished: false,
outputFile: await this.context.outputFile(download.suggestedFilename())
};
this._downloads.push(entry);
await download.saveAs(entry.outputFile);
entry.finished = true;
}
private _clearCollectedArtifacts() {
this._consoleMessages.length = 0;
this._recentConsoleMessages.length = 0;
this._requests.clear();
}
private _handleConsoleMessage(message: ConsoleMessage) {
this._consoleMessages.push(message);
this._recentConsoleMessages.push(message);
}
private _onClose() {
this._clearCollectedArtifacts();
this._onPageClose(this);
}
async updateTitle() {
await this._raceAgainstModalStates(async () => {
this._lastTitle = await callOnPageNoTrace(this.page, page => page.title());
});
}
lastTitle(): string {
return this._lastTitle;
}
isCurrentTab(): boolean {
return this === this.context.currentTab();
}
async waitForLoadState(state: 'load', options?: { timeout?: number }): Promise<void> {
await callOnPageNoTrace(this.page, page => page.waitForLoadState(state, options).catch(logUnhandledError));
}
async navigate(url: string) {
this._clearCollectedArtifacts();
const downloadEvent = callOnPageNoTrace(this.page, page => page.waitForEvent('download').catch(logUnhandledError));
try {
await this.page.goto(url, { waitUntil: 'domcontentloaded' });
} catch (_e: unknown) {
const e = _e as Error;
const mightBeDownload =
e.message.includes('net::ERR_ABORTED') // chromium
|| e.message.includes('Download is starting'); // firefox + webkit
if (!mightBeDownload)
throw e;
// On Chromium, the download event is fired *after* page.goto rejects, so wait up to 3 seconds for it.
const download = await Promise.race([
downloadEvent,
new Promise(resolve => setTimeout(resolve, 3000)),
]);
if (!download)
throw e;
// Make sure other "download" listeners are notified first.
await new Promise(resolve => setTimeout(resolve, 500));
return;
}
// Cap load event to 5 seconds, the page is operational at this point.
await this.waitForLoadState('load', { timeout: 5000 });
}
consoleMessages(): ConsoleMessage[] {
return this._consoleMessages;
}
requests(): Map<playwright.Request, playwright.Response | null> {
return this._requests;
}
async captureSnapshot(): Promise<TabSnapshot> {
let tabSnapshot: TabSnapshot | undefined;
const modalStates = await this._raceAgainstModalStates(async () => {
const snapshot = await (this.page as PageEx)._snapshotForAI();
tabSnapshot = {
url: this.page.url(),
title: await this.page.title(),
ariaSnapshot: snapshot,
modalStates: [],
consoleMessages: [],
downloads: this._downloads,
};
});
if (tabSnapshot) {
// Assign console messages late so that we do not lose any to a modal state.
tabSnapshot.consoleMessages = this._recentConsoleMessages;
this._recentConsoleMessages = [];
}
return tabSnapshot ?? {
url: this.page.url(),
title: '',
ariaSnapshot: '',
modalStates,
consoleMessages: [],
downloads: [],
};
}
private _javaScriptBlocked(): boolean {
return this._modalStates.some(state => state.type === 'dialog');
}
private async _raceAgainstModalStates(action: () => Promise<void>): Promise<ModalState[]> {
if (this.modalStates().length)
return this.modalStates();
const promise = new ManualPromise<ModalState[]>();
const listener = (modalState: ModalState) => promise.resolve([modalState]);
this.once(TabEvents.modalState, listener);
return await Promise.race([
action().then(() => {
this.off(TabEvents.modalState, listener);
return [];
}),
promise,
]);
}
async waitForCompletion(callback: () => Promise<void>) {
await this._raceAgainstModalStates(() => waitForCompletion(this, callback));
}
async refLocator(params: { element: string, ref: string }): Promise<playwright.Locator> {
return (await this.refLocators([params]))[0];
}
async refLocators(params: { element: string, ref: string }[]): Promise<playwright.Locator[]> {
const snapshot = await (this.page as PageEx)._snapshotForAI();
return params.map(param => {
if (!snapshot.includes(`[ref=${param.ref}]`))
throw new Error(`Ref ${param.ref} not found in the current page snapshot. Try capturing new snapshot.`);
return this.page.locator(`aria-ref=${param.ref}`).describe(param.element);
});
}
async waitForTimeout(time: number) {
if (this._javaScriptBlocked()) {
await new Promise(f => setTimeout(f, time));
return;
}
await callOnPageNoTrace(this.page, page => {
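// Pass the requested delay into the page; the callback runs in the browser and cannot close over `time`.
return page.evaluate(ms => new Promise(f => setTimeout(f, ms)), time);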
});
}
}
export type ConsoleMessage = {
type: ReturnType<playwright.ConsoleMessage['type']> | undefined;
text: string;
toString(): string;
};
function messageToConsoleMessage(message: playwright.ConsoleMessage): ConsoleMessage {
return {
type: message.type(),
text: message.text(),
toString: () => `[${message.type().toUpperCase()}] ${message.text()} @ ${message.location().url}:${message.location().lineNumber}`,
};
}
function pageErrorToConsoleMessage(errorOrValue: Error | any): ConsoleMessage {
if (errorOrValue instanceof Error) {
return {
type: undefined,
text: errorOrValue.message,
toString: () => errorOrValue.stack || errorOrValue.message,
};
}
return {
type: undefined,
text: String(errorOrValue),
toString: () => String(errorOrValue),
};
}
export function renderModalStates(context: Context, modalStates: ModalState[]): string[] {
const result: string[] = ['### Modal state'];
if (modalStates.length === 0)
result.push('- There is no modal state present');
for (const state of modalStates) {
const tool = context.tools.filter(tool => 'clearsModalState' in tool).find(tool => tool.clearsModalState === state.type);
result.push(`- [${state.description}]: can be handled by the "${tool?.schema.name}" tool`);
}
return result;
}
const tabSymbol = Symbol('tabSymbol');
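
The pattern behind `_raceAgainstModalStates()` is to run an action while listening for a modal state, and to return whichever settles first. The sketch below shows that pattern in a self-contained form. `ModalStateSketch`, `ModalSource` and `raceAgainstModal` are hypothetical names, and a tiny listener set stands in for the `EventEmitter` and `ManualPromise` used by the real class.

```typescript
type ModalStateSketch = { type: 'dialog' | 'fileChooser'; description: string };
type ModalListener = (state: ModalStateSketch) => void;

// Tiny stand-in for the Tab's event emitter; listeners are one-shot.
class ModalSource {
  private _listeners = new Set<ModalListener>();

  once(listener: ModalListener): void {
    this._listeners.add(listener);
  }

  off(listener: ModalListener): void {
    this._listeners.delete(listener);
  }

  emit(state: ModalStateSketch): void {
    const listeners = Array.from(this._listeners);
    this._listeners.clear();
    for (const listener of listeners)
      listener(state);
  }

  listenerCount(): number {
    return this._listeners.size;
  }
}

// Resolves with [] when the action finishes first,
// or with the modal state that interrupted it.
async function raceAgainstModal(source: ModalSource, action: () => Promise<void>): Promise<ModalStateSketch[]> {
  let listener: ModalListener = () => {};
  // The executor runs synchronously, so `listener` is assigned before it is registered.
  const interrupted = new Promise<ModalStateSketch[]>(resolve => {
    listener = state => resolve([state]);
  });
  source.once(listener);
  return await Promise.race([
    action().then(() => {
      // The action won: stop listening so the listener does not leak.
      source.off(listener);
      return [] as ModalStateSketch[];
    }),
    interrupted,
  ]);
}
```

One consequence of this design is that an action blocked by a dialog (for example a page call that cannot run while JavaScript is paused) does not hang the caller: the modal branch wins the race and the caller can report the modal state instead.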

56
src/tools.ts Normal file

@@ -0,0 +1,56 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import common from './tools/common.js';
import console from './tools/console.js';
import dialogs from './tools/dialogs.js';
import evaluate from './tools/evaluate.js';
import files from './tools/files.js';
import install from './tools/install.js';
import keyboard from './tools/keyboard.js';
import navigate from './tools/navigate.js';
import network from './tools/network.js';
import pdf from './tools/pdf.js';
import snapshot from './tools/snapshot.js';
import tabs from './tools/tabs.js';
import screenshot from './tools/screenshot.js';
import wait from './tools/wait.js';
import mouse from './tools/mouse.js';
import type { Tool } from './tools/tool.js';
import type { FullConfig } from './config.js';
export const allTools: Tool<any>[] = [
...common,
...console,
...dialogs,
...evaluate,
...files,
...install,
...keyboard,
...navigate,
...network,
...mouse,
...pdf,
...screenshot,
...snapshot,
...tabs,
...wait,
];
export function filteredTools(config: FullConfig) {
return allTools.filter(tool => tool.capability.startsWith('core') || config.capabilities?.includes(tool.capability));
}
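
The predicate in `filteredTools()` keeps every tool whose capability starts with `core` and admits the rest only when their capability is listed in the config. A minimal sketch of that rule follows. The `ToolSketch` type, the function name `filterByCapability` and the sample tool names and capabilities are illustrative assumptions, not the project's actual tool list.

```typescript
// Reduced tool shape: the real Tool type also carries a schema and a handler.
type ToolSketch = { name: string; capability: string };

// Same rule as filteredTools(): core tools are always enabled,
// everything else must be listed in the configured capabilities.
function filterByCapability(tools: ToolSketch[], capabilities?: string[]): ToolSketch[] {
  return tools.filter(tool => tool.capability.startsWith('core') || !!capabilities?.includes(tool.capability));
}

// Illustrative data only.
const sampleTools: ToolSketch[] = [
  { name: 'browser_navigate', capability: 'core' },
  { name: 'browser_tabs', capability: 'core-tabs' },
  { name: 'browser_pdf_save', capability: 'pdf' },
  { name: 'browser_mouse_click_xy', capability: 'vision' },
];
const enabledNames = filterByCapability(sampleTools, ['pdf']).map(tool => tool.name);
```

Because the check uses `startsWith('core')`, any capability with that prefix is treated as always on, and omitting the capabilities list leaves only those tools enabled.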

2
src/tools/DEPS.list Normal file

@@ -0,0 +1,2 @@
[*]
../utils/


@@ -14,163 +14,50 @@
* limitations under the License.
*/
import os from 'os';
import path from 'path';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { defineTabTool, defineTool } from './tool.js';
import { captureAriaSnapshot, runAndWait } from './utils';
const close = defineTool({
capability: 'core',
import type { ToolFactory, Tool } from './tool';
const navigateSchema = z.object({
url: z.string().describe('The URL to navigate to'),
});
export const navigate: ToolFactory = snapshot => ({
schema: {
name: 'browser_navigate',
description: 'Navigate to a URL',
inputSchema: zodToJsonSchema(navigateSchema),
},
handle: async (context, params) => {
const validatedParams = navigateSchema.parse(params);
const page = await context.createPage();
await page.goto(validatedParams.url, { waitUntil: 'domcontentloaded' });
// Cap load event to 5 seconds, the page is operational at this point.
await page.waitForLoadState('load', { timeout: 5000 }).catch(() => {});
if (snapshot)
return captureAriaSnapshot(context);
return {
content: [{
type: 'text',
text: `Navigated to ${validatedParams.url}`,
}],
};
},
});
const goBackSchema = z.object({});
export const goBack: ToolFactory = snapshot => ({
schema: {
name: 'browser_go_back',
description: 'Go back to the previous page',
inputSchema: zodToJsonSchema(goBackSchema),
},
handle: async context => {
return await runAndWait(context, 'Navigated back', async page => page.goBack(), snapshot);
},
});
const goForwardSchema = z.object({});
export const goForward: ToolFactory = snapshot => ({
schema: {
name: 'browser_go_forward',
description: 'Go forward to the next page',
inputSchema: zodToJsonSchema(goForwardSchema),
},
handle: async context => {
return await runAndWait(context, 'Navigated forward', async page => page.goForward(), snapshot);
},
});
const waitSchema = z.object({
time: z.number().describe('The time to wait in seconds'),
});
export const wait: Tool = {
schema: {
name: 'browser_wait',
description: 'Wait for a specified time in seconds',
inputSchema: zodToJsonSchema(waitSchema),
},
handle: async (context, params) => {
const validatedParams = waitSchema.parse(params);
await new Promise(f => setTimeout(f, Math.min(10000, validatedParams.time * 1000)));
return {
content: [{
type: 'text',
text: `Waited for ${validatedParams.time} seconds`,
}],
};
},
};
const pressKeySchema = z.object({
key: z.string().describe('Name of the key to press or a character to generate, such as `ArrowLeft` or `a`'),
});
export const pressKey: Tool = {
schema: {
name: 'browser_press_key',
description: 'Press a key on the keyboard',
inputSchema: zodToJsonSchema(pressKeySchema),
},
handle: async (context, params) => {
const validatedParams = pressKeySchema.parse(params);
return await runAndWait(context, `Pressed key ${validatedParams.key}`, async page => {
await page.keyboard.press(validatedParams.key);
});
},
};
const pdfSchema = z.object({});
export const pdf: Tool = {
schema: {
name: 'browser_save_as_pdf',
description: 'Save page as PDF',
inputSchema: zodToJsonSchema(pdfSchema),
},
handle: async context => {
const page = context.existingPage();
const fileName = path.join(os.tmpdir(), `/page-${new Date().toISOString()}.pdf`);
await page.pdf({ path: fileName });
return {
content: [{
type: 'text',
text: `Saved as ${fileName}`,
}],
};
},
};
const closeSchema = z.object({});
export const close: Tool = {
schema: {
name: 'browser_close',
title: 'Close browser',
description: 'Close the page',
inputSchema: zodToJsonSchema(closeSchema),
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async context => {
await context.close();
return {
content: [{
type: 'text',
text: `Page closed`,
}],
};
},
};
const chooseFileSchema = z.object({
paths: z.array(z.string()).describe('The absolute paths to the files to upload. Can be a single file or multiple files.'),
handle: async (context, params, response) => {
await context.closeBrowserContext();
response.setIncludeTabs();
response.addCode(`await page.close()`);
},
});
export const chooseFile: ToolFactory = snapshot => ({
const resize = defineTabTool({
capability: 'core',
schema: {
name: 'browser_choose_file',
description: 'Choose one or multiple files to upload',
inputSchema: zodToJsonSchema(chooseFileSchema),
name: 'browser_resize',
title: 'Resize browser window',
description: 'Resize the browser window',
inputSchema: z.object({
width: z.number().describe('Width of the browser window'),
height: z.number().describe('Height of the browser window'),
}),
type: 'readOnly',
},
handle: async (context, params) => {
const validatedParams = chooseFileSchema.parse(params);
return await runAndWait(context, `Chose files ${validatedParams.paths.join(', ')}`, async () => {
await context.submitFileChooser(validatedParams.paths);
}, snapshot);
handle: async (tab, params, response) => {
response.addCode(`await page.setViewportSize({ width: ${params.width}, height: ${params.height} });`);
await tab.waitForCompletion(async () => {
await tab.page.setViewportSize({ width: params.width, height: params.height });
});
},
});
export default [
close,
resize
];

36
src/tools/console.ts Normal file
@@ -0,0 +1,36 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
const console = defineTabTool({
capability: 'core',
schema: {
name: 'browser_console_messages',
title: 'Get console messages',
description: 'Returns all console messages',
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async (tab, params, response) => {
tab.consoleMessages().map(message => response.addResult(message.toString()));
},
});
export default [
console,
];

55
src/tools/dialogs.ts Normal file
@@ -0,0 +1,55 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
const handleDialog = defineTabTool({
capability: 'core',
schema: {
name: 'browser_handle_dialog',
title: 'Handle a dialog',
description: 'Handle a dialog',
inputSchema: z.object({
accept: z.boolean().describe('Whether to accept the dialog.'),
promptText: z.string().optional().describe('The text of the prompt in case of a prompt dialog.'),
}),
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
const dialogState = tab.modalStates().find(state => state.type === 'dialog');
if (!dialogState)
throw new Error('No dialog visible');
tab.clearModalState(dialogState);
await tab.waitForCompletion(async () => {
if (params.accept)
await dialogState.dialog.accept(params.promptText);
else
await dialogState.dialog.dismiss();
});
},
clearsModalState: 'dialog',
});
export default [
handleDialog,
];

62
src/tools/evaluate.ts Normal file
@@ -0,0 +1,62 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
import * as javascript from '../utils/codegen.js';
import { generateLocator } from './utils.js';
import type * as playwright from 'playwright';
const evaluateSchema = z.object({
function: z.string().describe('() => { /* code */ } or (element) => { /* code */ } when element is provided'),
element: z.string().optional().describe('Human-readable element description used to obtain permission to interact with the element'),
ref: z.string().optional().describe('Exact target element reference from the page snapshot'),
});
const evaluate = defineTabTool({
capability: 'core',
schema: {
name: 'browser_evaluate',
title: 'Evaluate JavaScript',
description: 'Evaluate JavaScript expression on page or element',
inputSchema: evaluateSchema,
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
let locator: playwright.Locator | undefined;
if (params.ref && params.element) {
locator = await tab.refLocator({ ref: params.ref, element: params.element });
response.addCode(`await page.${await generateLocator(locator)}.evaluate(${javascript.quote(params.function)});`);
} else {
response.addCode(`await page.evaluate(${javascript.quote(params.function)});`);
}
await tab.waitForCompletion(async () => {
const receiver = locator ?? tab.page as any;
const result = await receiver._evaluateFunction(params.function);
response.addResult(JSON.stringify(result, null, 2) || 'undefined');
});
},
});
export default [
evaluate,
];

52
src/tools/files.ts Normal file
@@ -0,0 +1,52 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
const uploadFile = defineTabTool({
capability: 'core',
schema: {
name: 'browser_file_upload',
title: 'Upload files',
description: 'Upload one or multiple files',
inputSchema: z.object({
paths: z.array(z.string()).describe('The absolute paths to the files to upload. Can be a single file or multiple files.'),
}),
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
const modalState = tab.modalStates().find(state => state.type === 'fileChooser');
if (!modalState)
throw new Error('No file chooser visible');
response.addCode(`await fileChooser.setFiles(${JSON.stringify(params.paths)})`);
tab.clearModalState(modalState);
await tab.waitForCompletion(async () => {
await modalState.fileChooser.setFiles(params.paths);
});
},
clearsModalState: 'fileChooser',
});
export default [
uploadFile,
];

58
src/tools/install.ts Normal file
@@ -0,0 +1,58 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { fork } from 'child_process';
import path from 'path';
import { fileURLToPath } from 'url';
import { z } from 'zod';
import { defineTool } from './tool.js';
const install = defineTool({
capability: 'core-install',
schema: {
name: 'browser_install',
title: 'Install the browser specified in the config',
description: 'Install the browser specified in the config. Call this if you get an error about the browser not being installed.',
inputSchema: z.object({}),
type: 'destructive',
},
handle: async (context, params, response) => {
const channel = context.config.browser?.launchOptions?.channel ?? context.config.browser?.browserName ?? 'chrome';
const cliUrl = import.meta.resolve('playwright/package.json');
const cliPath = path.join(fileURLToPath(cliUrl), '..', 'cli.js');
const child = fork(cliPath, ['install', channel], {
stdio: 'pipe',
});
const output: string[] = [];
child.stdout?.on('data', data => output.push(data.toString()));
child.stderr?.on('data', data => output.push(data.toString()));
await new Promise<void>((resolve, reject) => {
child.on('close', code => {
if (code === 0)
resolve();
else
reject(new Error(`Failed to install browser: ${output.join('')}`));
});
});
response.setIncludeTabs();
},
});
export default [
install,
];
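
The install handler above picks the browser to install with a chain of `??` fallbacks: an explicit launch channel wins, then the configured browser name, then `'chrome'`. A small sketch of just that resolution step, assuming a reduced config shape (`BrowserConfigLike` and `resolveInstallTarget` are hypothetical names, not part of the repository):

```typescript
// Hypothetical subset of the config; only the fields the fallback chain reads.
type BrowserConfigLike = {
  browser?: {
    browserName?: string;
    launchOptions?: { channel?: string };
  };
};

// Mirrors the expression in the handler: channel, then browserName, then 'chrome'.
function resolveInstallTarget(config: BrowserConfigLike): string {
  return config.browser?.launchOptions?.channel ?? config.browser?.browserName ?? 'chrome';
}

const targetForEmptyConfig = resolveInstallTarget({});
const targetForBrowserName = resolveInstallTarget({ browser: { browserName: 'firefox' } });
const targetForChannel = resolveInstallTarget({
  browser: { browserName: 'chromium', launchOptions: { channel: 'msedge' } },
});

console.log(targetForEmptyConfig, targetForBrowserName, targetForChannel);
```

Because `??` only falls through on `null` or `undefined`, an empty-string channel would be passed along as-is rather than replaced by the next fallback.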

89
src/tools/keyboard.ts Normal file
@@ -0,0 +1,89 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
import { elementSchema } from './snapshot.js';
import { generateLocator } from './utils.js';
import * as javascript from '../utils/codegen.js';
const pressKey = defineTabTool({
capability: 'core',
schema: {
name: 'browser_press_key',
title: 'Press a key',
description: 'Press a key on the keyboard',
inputSchema: z.object({
key: z.string().describe('Name of the key to press or a character to generate, such as `ArrowLeft` or `a`'),
}),
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
response.addCode(`// Press ${params.key}`);
response.addCode(`await page.keyboard.press('${params.key}');`);
await tab.waitForCompletion(async () => {
await tab.page.keyboard.press(params.key);
});
},
});
const typeSchema = elementSchema.extend({
text: z.string().describe('Text to type into the element'),
submit: z.boolean().optional().describe('Whether to submit entered text (press Enter after)'),
slowly: z.boolean().optional().describe('Whether to type one character at a time. Useful for triggering key handlers in the page. By default entire text is filled in at once.'),
});
const type = defineTabTool({
capability: 'core',
schema: {
name: 'browser_type',
title: 'Type text',
description: 'Type text into editable element',
inputSchema: typeSchema,
type: 'destructive',
},
handle: async (tab, params, response) => {
const locator = await tab.refLocator(params);
await tab.waitForCompletion(async () => {
if (params.slowly) {
response.setIncludeSnapshot();
response.addCode(`await page.${await generateLocator(locator)}.pressSequentially(${javascript.quote(params.text)});`);
await locator.pressSequentially(params.text);
} else {
response.addCode(`await page.${await generateLocator(locator)}.fill(${javascript.quote(params.text)});`);
await locator.fill(params.text);
}
if (params.submit) {
response.setIncludeSnapshot();
response.addCode(`await page.${await generateLocator(locator)}.press('Enter');`);
await locator.press('Enter');
}
});
},
});
export default [
pressKey,
type,
];
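
The `browser_type` handler above branches on two optional flags: `slowly` switches from `fill` to `pressSequentially`, and `submit` appends an `Enter` press. A snapshot is requested only when at least one of the two flags is set. The sketch below models that decision table without a browser; `TypeParamsLike`, `TypePlan`, and `planTypeAction` are hypothetical names introduced for illustration:

```typescript
// Hypothetical reduction of the tool's params to the fields that drive branching.
type TypeParamsLike = { text: string; submit?: boolean; slowly?: boolean };

// What the handler would do, expressed as data instead of Playwright calls.
type TypePlan = { locatorCalls: string[]; includeSnapshot: boolean };

function planTypeAction(params: TypeParamsLike): TypePlan {
  const locatorCalls: string[] = [];
  let includeSnapshot = false;
  if (params.slowly) {
    // Character-by-character typing can trigger key handlers, so a snapshot is included.
    includeSnapshot = true;
    locatorCalls.push('pressSequentially');
  } else {
    // Default path: the whole text is filled in at once.
    locatorCalls.push('fill');
  }
  if (params.submit) {
    // Submitting presses Enter afterwards and also requests a snapshot.
    includeSnapshot = true;
    locatorCalls.push('press Enter');
  }
  return { locatorCalls, includeSnapshot };
}

const plainPlan = planTypeAction({ text: 'hello' });
const slowSubmitPlan = planTypeAction({ text: 'hello', slowly: true, submit: true });

console.log(plainPlan, slowSubmitPlan);
```

Read this way, a plain `fill` without `submit` is the only combination that returns no page snapshot.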

113
src/tools/mouse.ts Normal file
@@ -0,0 +1,113 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
const elementSchema = z.object({
element: z.string().describe('Human-readable element description used to obtain permission to interact with the element'),
});
const mouseMove = defineTabTool({
capability: 'vision',
schema: {
name: 'browser_mouse_move_xy',
title: 'Move mouse',
description: 'Move mouse to a given position',
inputSchema: elementSchema.extend({
x: z.number().describe('X coordinate'),
y: z.number().describe('Y coordinate'),
}),
type: 'readOnly',
},
handle: async (tab, params, response) => {
response.addCode(`// Move mouse to (${params.x}, ${params.y})`);
response.addCode(`await page.mouse.move(${params.x}, ${params.y});`);
await tab.waitForCompletion(async () => {
await tab.page.mouse.move(params.x, params.y);
});
},
});
const mouseClick = defineTabTool({
capability: 'vision',
schema: {
name: 'browser_mouse_click_xy',
title: 'Click',
description: 'Click left mouse button at a given position',
inputSchema: elementSchema.extend({
x: z.number().describe('X coordinate'),
y: z.number().describe('Y coordinate'),
}),
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
response.addCode(`// Click mouse at coordinates (${params.x}, ${params.y})`);
response.addCode(`await page.mouse.move(${params.x}, ${params.y});`);
response.addCode(`await page.mouse.down();`);
response.addCode(`await page.mouse.up();`);
await tab.waitForCompletion(async () => {
await tab.page.mouse.move(params.x, params.y);
await tab.page.mouse.down();
await tab.page.mouse.up();
});
},
});
const mouseDrag = defineTabTool({
capability: 'vision',
schema: {
name: 'browser_mouse_drag_xy',
title: 'Drag mouse',
description: 'Drag left mouse button to a given position',
inputSchema: elementSchema.extend({
startX: z.number().describe('Start X coordinate'),
startY: z.number().describe('Start Y coordinate'),
endX: z.number().describe('End X coordinate'),
endY: z.number().describe('End Y coordinate'),
}),
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
response.addCode(`// Drag mouse from (${params.startX}, ${params.startY}) to (${params.endX}, ${params.endY})`);
response.addCode(`await page.mouse.move(${params.startX}, ${params.startY});`);
response.addCode(`await page.mouse.down();`);
response.addCode(`await page.mouse.move(${params.endX}, ${params.endY});`);
response.addCode(`await page.mouse.up();`);
await tab.waitForCompletion(async () => {
await tab.page.mouse.move(params.startX, params.startY);
await tab.page.mouse.down();
await tab.page.mouse.move(params.endX, params.endY);
await tab.page.mouse.up();
});
},
});
export default [
mouseMove,
mouseClick,
mouseDrag,
];

79
src/tools/navigate.ts Normal file
@@ -0,0 +1,79 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTool, defineTabTool } from './tool.js';
const navigate = defineTool({
capability: 'core',
schema: {
name: 'browser_navigate',
title: 'Navigate to a URL',
description: 'Navigate to a URL',
inputSchema: z.object({
url: z.string().describe('The URL to navigate to'),
}),
type: 'destructive',
},
handle: async (context, params, response) => {
const tab = await context.ensureTab();
await tab.navigate(params.url);
response.setIncludeSnapshot();
response.addCode(`await page.goto('${params.url}');`);
},
});
const goBack = defineTabTool({
capability: 'core',
schema: {
name: 'browser_navigate_back',
title: 'Go back',
description: 'Go back to the previous page',
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async (tab, params, response) => {
await tab.page.goBack();
response.setIncludeSnapshot();
response.addCode(`await page.goBack();`);
},
});
const goForward = defineTabTool({
capability: 'core',
schema: {
name: 'browser_navigate_forward',
title: 'Go forward',
description: 'Go forward to the next page',
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async (tab, params, response) => {
await tab.page.goForward();
response.setIncludeSnapshot();
response.addCode(`await page.goForward();`);
},
});
export default [
navigate,
goBack,
goForward,
];

49
src/tools/network.ts Normal file
@@ -0,0 +1,49 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
import type * as playwright from 'playwright';
const requests = defineTabTool({
capability: 'core',
schema: {
name: 'browser_network_requests',
title: 'List network requests',
description: 'Returns all network requests since loading the page',
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async (tab, params, response) => {
const requests = tab.requests();
[...requests.entries()].forEach(([req, res]) => response.addResult(renderRequest(req, res)));
},
});
function renderRequest(request: playwright.Request, response: playwright.Response | null) {
const result: string[] = [];
result.push(`[${request.method().toUpperCase()}] ${request.url()}`);
if (response)
result.push(`=> [${response.status()}] ${response.statusText()}`);
return result.join(' ');
}
export default [
requests,
];
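
To see what `renderRequest` above produces, the same formatting logic can be run against plain stub objects. The sketch below assumes structural stand-ins for `playwright.Request` and `playwright.Response` (`RequestLike`, `ResponseLike`, and `renderRequestLine` are hypothetical names); only the four accessor methods the function calls are modeled:

```typescript
// Hypothetical structural stand-ins for the Playwright request/response objects.
type RequestLike = { method(): string; url(): string };
type ResponseLike = { status(): number; statusText(): string };

// Same formatting as renderRequest: "[METHOD] url", optionally followed by "=> [status] text".
function renderRequestLine(request: RequestLike, response: ResponseLike | null): string {
  const parts: string[] = [];
  parts.push(`[${request.method().toUpperCase()}] ${request.url()}`);
  if (response)
    parts.push(`=> [${response.status()}] ${response.statusText()}`);
  return parts.join(' ');
}

const stubRequest: RequestLike = { method: () => 'get', url: () => 'https://example.com/' };
const stubResponse: ResponseLike = { status: () => 200, statusText: () => 'OK' };

// A request that already has a response, and one that is still pending.
const answeredLine = renderRequestLine(stubRequest, stubResponse);
const pendingLine = renderRequestLine(stubRequest, null);

console.log(answeredLine); // [GET] https://example.com/ => [200] OK
console.log(pendingLine);  // [GET] https://example.com/
```

A request without a response is still listed, just without the status suffix.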

47
src/tools/pdf.ts Normal file
@@ -0,0 +1,47 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTabTool } from './tool.js';
import * as javascript from '../utils/codegen.js';
const pdfSchema = z.object({
filename: z.string().optional().describe('File name to save the pdf to. Defaults to `page-{timestamp}.pdf` if not specified.'),
});
const pdf = defineTabTool({
capability: 'pdf',
schema: {
name: 'browser_pdf_save',
title: 'Save as PDF',
description: 'Save page as PDF',
inputSchema: pdfSchema,
type: 'readOnly',
},
handle: async (tab, params, response) => {
const fileName = await tab.context.outputFile(params.filename ?? `page-${new Date().toISOString()}.pdf`);
response.addCode(`await page.pdf(${javascript.formatObject({ path: fileName })});`);
response.addResult(`Saved page as ${fileName}`);
await tab.page.pdf({ path: fileName });
},
});
export default [
pdf,
];

@@ -15,119 +15,78 @@
*/
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { runAndWait } from './utils';
import { defineTabTool } from './tool.js';
import * as javascript from '../utils/codegen.js';
import { generateLocator } from './utils.js';
import type { Tool } from './tool';
import type * as playwright from 'playwright';
export const screenshot: Tool = {
const screenshotSchema = z.object({
type: z.enum(['png', 'jpeg']).default('png').describe('Image format for the screenshot. Default is png.'),
filename: z.string().optional().describe('File name to save the screenshot to. Defaults to `page-{timestamp}.{png|jpeg}` if not specified.'),
element: z.string().optional().describe('Human-readable element description used to obtain permission to screenshot the element. If not provided, the screenshot will be taken of viewport. If element is provided, ref must be provided too.'),
ref: z.string().optional().describe('Exact target element reference from the page snapshot. If not provided, the screenshot will be taken of viewport. If ref is provided, element must be provided too.'),
fullPage: z.boolean().optional().describe('When true, takes a screenshot of the full scrollable page, instead of the currently visible viewport. Cannot be used with element screenshots.'),
}).refine(data => {
return !!data.element === !!data.ref;
}, {
message: 'Both element and ref must be provided or neither.',
path: ['ref', 'element']
}).refine(data => {
return !(data.fullPage && (data.element || data.ref));
}, {
message: 'fullPage cannot be used with element screenshots.',
path: ['fullPage']
});
const screenshot = defineTabTool({
capability: 'core',
schema: {
name: 'browser_screenshot',
description: 'Take a screenshot of the current page',
inputSchema: zodToJsonSchema(z.object({})),
name: 'browser_take_screenshot',
title: 'Take a screenshot',
description: `Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.`,
inputSchema: screenshotSchema,
type: 'readOnly',
},
handle: async context => {
const page = context.existingPage();
const screenshot = await page.screenshot({ type: 'jpeg', quality: 50, scale: 'css' });
return {
content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' }],
handle: async (tab, params, response) => {
const fileType = params.type || 'png';
const fileName = await tab.context.outputFile(params.filename ?? `page-${new Date().toISOString()}.${fileType}`);
const options: playwright.PageScreenshotOptions = {
type: fileType,
quality: fileType === 'png' ? undefined : 90,
scale: 'css',
path: fileName,
...(params.fullPage !== undefined && { fullPage: params.fullPage })
};
},
};
const isElementScreenshot = params.element && params.ref;
const elementSchema = z.object({
element: z.string().describe('Human-readable element description used to obtain permission to interact with the element'),
});
const screenshotTarget = isElementScreenshot ? params.element : (params.fullPage ? 'full page' : 'viewport');
response.addCode(`// Screenshot ${screenshotTarget} and save it as ${fileName}`);
const moveMouseSchema = elementSchema.extend({
x: z.number().describe('X coordinate'),
y: z.number().describe('Y coordinate'),
});
// Only get snapshot when element screenshot is needed
const locator = params.ref ? await tab.refLocator({ element: params.element || '', ref: params.ref }) : null;
export const moveMouse: Tool = {
schema: {
name: 'browser_move_mouse',
description: 'Move mouse to a given position',
inputSchema: zodToJsonSchema(moveMouseSchema),
},
if (locator)
response.addCode(`await page.${await generateLocator(locator)}.screenshot(${javascript.formatObject(options)});`);
else
response.addCode(`await page.screenshot(${javascript.formatObject(options)});`);
handle: async (context, params) => {
const validatedParams = moveMouseSchema.parse(params);
const page = context.existingPage();
await page.mouse.move(validatedParams.x, validatedParams.y);
return {
content: [{ type: 'text', text: `Moved mouse to (${validatedParams.x}, ${validatedParams.y})` }],
};
},
};
const buffer = locator ? await locator.screenshot(options) : await tab.page.screenshot(options);
response.addResult(`Took the ${screenshotTarget} screenshot and saved it as ${fileName}`);
const clickSchema = elementSchema.extend({
x: z.number().describe('X coordinate'),
y: z.number().describe('Y coordinate'),
});
export const click: Tool = {
schema: {
name: 'browser_click',
description: 'Click left mouse button',
inputSchema: zodToJsonSchema(clickSchema),
},
handle: async (context, params) => {
return await runAndWait(context, 'Clicked mouse', async page => {
const validatedParams = clickSchema.parse(params);
await page.mouse.move(validatedParams.x, validatedParams.y);
await page.mouse.down();
await page.mouse.up();
// https://github.com/microsoft/playwright-mcp/issues/817
// Never return large images to LLM, saving them to the file system is enough.
if (!params.fullPage) {
response.addImage({
contentType: fileType === 'png' ? 'image/png' : 'image/jpeg',
data: buffer
});
},
};
const dragSchema = elementSchema.extend({
startX: z.number().describe('Start X coordinate'),
startY: z.number().describe('Start Y coordinate'),
endX: z.number().describe('End X coordinate'),
endY: z.number().describe('End Y coordinate'),
}
}
});
export const drag: Tool = {
schema: {
name: 'browser_drag',
description: 'Drag left mouse button',
inputSchema: zodToJsonSchema(dragSchema),
},
handle: async (context, params) => {
const validatedParams = dragSchema.parse(params);
return await runAndWait(context, `Dragged mouse from (${validatedParams.startX}, ${validatedParams.startY}) to (${validatedParams.endX}, ${validatedParams.endY})`, async page => {
await page.mouse.move(validatedParams.startX, validatedParams.startY);
await page.mouse.down();
await page.mouse.move(validatedParams.endX, validatedParams.endY);
await page.mouse.up();
});
},
};
const typeSchema = z.object({
text: z.string().describe('Text to type into the element'),
submit: z.boolean().describe('Whether to submit entered text (press Enter after)'),
});
export const type: Tool = {
schema: {
name: 'browser_type',
description: 'Type text',
inputSchema: zodToJsonSchema(typeSchema),
},
handle: async (context, params) => {
const validatedParams = typeSchema.parse(params);
return await runAndWait(context, `Typed text "${validatedParams.text}"`, async page => {
await page.keyboard.type(validatedParams.text);
if (validatedParams.submit)
await page.keyboard.press('Enter');
});
},
};
export default [
screenshot,
];
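
The new `screenshotSchema` above attaches two `.refine` checks: `element` and `ref` must be given together or not at all, and `fullPage` cannot be combined with an element screenshot. The sketch below restates those two rules as a plain function so they can be exercised without zod; `ScreenshotParamsLike` and `screenshotParamErrors` are hypothetical names, and returning a list of messages is an illustrative choice rather than how zod reports issues:

```typescript
// Hypothetical subset of the screenshot params that the two refinements inspect.
type ScreenshotParamsLike = { element?: string; ref?: string; fullPage?: boolean };

// Returns the messages of every violated rule; an empty array means the params pass.
function screenshotParamErrors(params: ScreenshotParamsLike): string[] {
  const errors: string[] = [];
  // Rule 1: element and ref are either both present or both absent.
  if (!!params.element !== !!params.ref)
    errors.push('Both element and ref must be provided or neither.');
  // Rule 2: a full-page screenshot cannot target a single element.
  if (params.fullPage && (params.element || params.ref))
    errors.push('fullPage cannot be used with element screenshots.');
  return errors;
}

const viewportErrors = screenshotParamErrors({});
const missingRefErrors = screenshotParamErrors({ element: 'Submit button' });
const fullPageElementErrors = screenshotParamErrors({ element: 'Submit button', ref: 'e12', fullPage: true });

console.log(viewportErrors, missingRefErrors, fullPageElementErrors);
```

Note that passing `fullPage: true` with only `element` set would trip both rules at once, since the second check looks at either field.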

@@ -15,141 +15,152 @@
*/
import { z } from 'zod';
import zodToJsonSchema from 'zod-to-json-schema';
import { captureAriaSnapshot, runAndWait } from './utils';
import { defineTabTool, defineTool } from './tool.js';
import * as javascript from '../utils/codegen.js';
import { generateLocator } from './utils.js';
import type * as playwright from 'playwright';
import type { Tool } from './tool';
export const snapshot: Tool = {
const snapshot = defineTool({
capability: 'core',
schema: {
name: 'browser_snapshot',
title: 'Page snapshot',
description: 'Capture accessibility snapshot of the current page, this is better than screenshot',
inputSchema: zodToJsonSchema(z.object({})),
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async context => {
return await captureAriaSnapshot(context);
handle: async (context, params, response) => {
await context.ensureTab();
response.setIncludeSnapshot();
},
};
});
const elementSchema = z.object({
export const elementSchema = z.object({
element: z.string().describe('Human-readable element description used to obtain permission to interact with the element'),
ref: z.string().describe('Exact target element reference from the page snapshot'),
});
export const click: Tool = {
const clickSchema = elementSchema.extend({
doubleClick: z.boolean().optional().describe('Whether to perform a double click instead of a single click'),
button: z.enum(['left', 'right', 'middle']).optional().describe('Button to click, defaults to left'),
});
const click = defineTabTool({
capability: 'core',
schema: {
name: 'browser_click',
title: 'Click',
description: 'Perform click on a web page',
inputSchema: zodToJsonSchema(elementSchema),
inputSchema: clickSchema,
type: 'destructive',
},
handle: async (context, params) => {
const validatedParams = elementSchema.parse(params);
return runAndWait(context, `"${validatedParams.element}" clicked`, () => context.refLocator(validatedParams.ref).click(), true);
},
};
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
const dragSchema = z.object({
const locator = await tab.refLocator(params);
const button = params.button;
const buttonAttr = button ? `{ button: '${button}' }` : '';
if (params.doubleClick)
response.addCode(`await page.${await generateLocator(locator)}.dblclick(${buttonAttr});`);
else
response.addCode(`await page.${await generateLocator(locator)}.click(${buttonAttr});`);
await tab.waitForCompletion(async () => {
if (params.doubleClick)
await locator.dblclick({ button });
else
await locator.click({ button });
});
},
});
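The new click handler above records a Playwright snippet through `response.addCode()` before it performs the action. A minimal sketch of how that snippet is assembled follows. `clickSnippet` and `MouseButton` are hypothetical names used only for illustration, and the locator expression is passed in as a plain string instead of coming from `generateLocator`.

```typescript
// Hypothetical helper, not part of the diff: mirrors the string assembly
// done inside the click handler before it calls response.addCode().
type MouseButton = 'left' | 'right' | 'middle';

function clickSnippet(locatorExpr: string, button?: MouseButton, doubleClick = false): string {
  // An omitted button produces an empty argument list, so Playwright's default applies.
  const buttonAttr = button ? `{ button: '${button}' }` : '';
  const method = doubleClick ? 'dblclick' : 'click';
  return `await page.${locatorExpr}.${method}(${buttonAttr});`;
}

console.log(clickSnippet(`getByRole('button', { name: 'Submit' })`));
console.log(clickSnippet(`getByRole('button', { name: 'Submit' })`, 'right', true));
```

Note that the generated snippet omits the options object when no button is given, while the handler itself always passes `{ button }` to the locator call.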
const drag = defineTabTool({
capability: 'core',
schema: {
name: 'browser_drag',
title: 'Drag mouse',
description: 'Perform drag and drop between two elements',
inputSchema: z.object({
startElement: z.string().describe('Human-readable source element description used to obtain the permission to interact with the element'),
startRef: z.string().describe('Exact source element reference from the page snapshot'),
endElement: z.string().describe('Human-readable target element description used to obtain the permission to interact with the element'),
endRef: z.string().describe('Exact target element reference from the page snapshot'),
}),
type: 'destructive',
},
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
const [startLocator, endLocator] = await tab.refLocators([
{ ref: params.startRef, element: params.startElement },
{ ref: params.endRef, element: params.endElement },
]);
await tab.waitForCompletion(async () => {
await startLocator.dragTo(endLocator);
});
response.addCode(`await page.${await generateLocator(startLocator)}.dragTo(page.${await generateLocator(endLocator)});`);
},
});
export const drag: Tool = {
schema: {
name: 'browser_drag',
description: 'Perform drag and drop between two elements',
inputSchema: zodToJsonSchema(dragSchema),
},
handle: async (context, params) => {
const validatedParams = dragSchema.parse(params);
return runAndWait(context, `Dragged "${validatedParams.startElement}" to "${validatedParams.endElement}"`, async () => {
const startLocator = context.refLocator(validatedParams.startRef);
const endLocator = context.refLocator(validatedParams.endRef);
await startLocator.dragTo(endLocator);
}, true);
},
};
export const hover: Tool = {
const hover = defineTabTool({
capability: 'core',
schema: {
name: 'browser_hover',
title: 'Hover mouse',
description: 'Hover over element on page',
inputSchema: zodToJsonSchema(elementSchema),
inputSchema: elementSchema,
type: 'readOnly',
},
handle: async (context, params) => {
const validatedParams = elementSchema.parse(params);
return runAndWait(context, `Hovered over "${validatedParams.element}"`, () => context.refLocator(validatedParams.ref).hover(), true);
},
};
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
const typeSchema = elementSchema.extend({
text: z.string().describe('Text to type into the element'),
submit: z.boolean().describe('Whether to submit entered text (press Enter after)'),
const locator = await tab.refLocator(params);
response.addCode(`await page.${await generateLocator(locator)}.hover();`);
await tab.waitForCompletion(async () => {
await locator.hover();
});
},
});
export const type: Tool = {
schema: {
name: 'browser_type',
description: 'Type text into editable element',
inputSchema: zodToJsonSchema(typeSchema),
},
handle: async (context, params) => {
const validatedParams = typeSchema.parse(params);
return await runAndWait(context, `Typed "${validatedParams.text}" into "${validatedParams.element}"`, async () => {
const locator = context.refLocator(validatedParams.ref);
await locator.fill(validatedParams.text);
if (validatedParams.submit)
await locator.press('Enter');
}, true);
},
};
const selectOptionSchema = elementSchema.extend({
values: z.array(z.string()).describe('Array of values to select in the dropdown. This can be a single value or multiple values.'),
});
export const selectOption: Tool = {
const selectOption = defineTabTool({
capability: 'core',
schema: {
name: 'browser_select_option',
title: 'Select option',
description: 'Select an option in a dropdown',
inputSchema: zodToJsonSchema(selectOptionSchema),
inputSchema: selectOptionSchema,
type: 'destructive',
},
handle: async (context, params) => {
const validatedParams = selectOptionSchema.parse(params);
return await runAndWait(context, `Selected option in "${validatedParams.element}"`, async () => {
const locator = context.refLocator(validatedParams.ref);
await locator.selectOption(validatedParams.values);
}, true);
},
};
handle: async (tab, params, response) => {
response.setIncludeSnapshot();
const screenshotSchema = z.object({
raw: z.boolean().optional().describe('Whether to return without compression (in PNG format). Default is false, which returns a JPEG image.'),
const locator = await tab.refLocator(params);
response.addCode(`await page.${await generateLocator(locator)}.selectOption(${javascript.formatObject(params.values)});`);
await tab.waitForCompletion(async () => {
await locator.selectOption(params.values);
});
},
});
export const screenshot: Tool = {
schema: {
name: 'browser_take_screenshot',
description: `Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.`,
inputSchema: zodToJsonSchema(screenshotSchema),
},
handle: async (context, params) => {
const validatedParams = screenshotSchema.parse(params);
const page = context.existingPage();
const options: playwright.PageScreenshotOptions = validatedParams.raw ? { type: 'png', scale: 'css' } : { type: 'jpeg', quality: 50, scale: 'css' };
const screenshot = await page.screenshot(options);
return {
content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: validatedParams.raw ? 'image/png' : 'image/jpeg' }],
};
},
};
export default [
snapshot,
click,
drag,
hover,
selectOption,
];

src/tools/tabs.ts (new file, 101 lines)

@@ -0,0 +1,101 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTool } from './tool.js';
const listTabs = defineTool({
capability: 'core-tabs',
schema: {
name: 'browser_tab_list',
title: 'List tabs',
description: 'List browser tabs',
inputSchema: z.object({}),
type: 'readOnly',
},
handle: async (context, params, response) => {
await context.ensureTab();
response.setIncludeTabs();
},
});
const selectTab = defineTool({
capability: 'core-tabs',
schema: {
name: 'browser_tab_select',
title: 'Select a tab',
description: 'Select a tab by index',
inputSchema: z.object({
index: z.number().describe('The index of the tab to select'),
}),
type: 'readOnly',
},
handle: async (context, params, response) => {
await context.selectTab(params.index);
response.setIncludeSnapshot();
},
});
const newTab = defineTool({
capability: 'core-tabs',
schema: {
name: 'browser_tab_new',
title: 'Open a new tab',
description: 'Open a new tab',
inputSchema: z.object({
url: z.string().optional().describe('The URL to navigate to in the new tab. If not provided, the new tab will be blank.'),
}),
type: 'readOnly',
},
handle: async (context, params, response) => {
const tab = await context.newTab();
if (params.url)
await tab.navigate(params.url);
response.setIncludeSnapshot();
},
});
const closeTab = defineTool({
capability: 'core-tabs',
schema: {
name: 'browser_tab_close',
title: 'Close a tab',
description: 'Close a tab',
inputSchema: z.object({
index: z.number().optional().describe('The index of the tab to close. Closes current tab if not provided.'),
}),
type: 'destructive',
},
handle: async (context, params, response) => {
await context.closeTab(params.index);
response.setIncludeSnapshot();
},
});
export default [
listTabs,
newTab,
selectTab,
closeTab,
];


@@ -14,24 +14,57 @@
* limitations under the License.
*/
import type { ImageContent, TextContent } from '@modelcontextprotocol/sdk/types';
import type { JsonSchema7Type } from 'zod-to-json-schema';
import type { Context } from '../context';
import type { z } from 'zod';
import type { Context } from '../context.js';
import type * as playwright from 'playwright';
import type { ToolCapability } from '../../config.js';
import type { Tab } from '../tab.js';
import type { Response } from '../response.js';
import type { ToolSchema } from '../mcp/tool.js';
export type ToolSchema = {
name: string;
export type FileUploadModalState = {
type: 'fileChooser';
description: string;
inputSchema: JsonSchema7Type;
fileChooser: playwright.FileChooser;
};
export type ToolResult = {
content: (ImageContent | TextContent)[];
isError?: boolean;
export type DialogModalState = {
type: 'dialog';
description: string;
dialog: playwright.Dialog;
};
export type Tool = {
schema: ToolSchema;
handle: (context: Context, params?: Record<string, any>) => Promise<ToolResult>;
export type ModalState = FileUploadModalState | DialogModalState;
export type Tool<Input extends z.Schema = z.Schema> = {
capability: ToolCapability;
schema: ToolSchema<Input>;
handle: (context: Context, params: z.output<Input>, response: Response) => Promise<void>;
};
export type ToolFactory = (snapshot: boolean) => Tool;
export function defineTool<Input extends z.Schema>(tool: Tool<Input>): Tool<Input> {
return tool;
}
export type TabTool<Input extends z.Schema = z.Schema> = {
capability: ToolCapability;
schema: ToolSchema<Input>;
clearsModalState?: ModalState['type'];
handle: (tab: Tab, params: z.output<Input>, response: Response) => Promise<void>;
};
export function defineTabTool<Input extends z.Schema>(tool: TabTool<Input>): Tool<Input> {
return {
...tool,
handle: async (context, params, response) => {
const tab = context.currentTabOrDie();
const modalStates = tab.modalStates().map(state => state.type);
if (tool.clearsModalState && !modalStates.includes(tool.clearsModalState))
response.addError(`Error: The tool "${tool.schema.name}" can only be used when there is related modal state present.\n` + tab.modalStatesMarkdown().join('\n'));
else if (!tool.clearsModalState && modalStates.length)
response.addError(`Error: Tool "${tool.schema.name}" does not handle the modal state.\n` + tab.modalStatesMarkdown().join('\n'));
else
return tool.handle(tab, params, response);
},
};
}
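The wrapper returned by `defineTabTool` decides whether to run a tool based on the modal states present on the current tab. The decision can be sketched in isolation as below. `gateTabTool`, `ModalType` and `GateDecision` are hypothetical names, and the three string results stand in for the two `response.addError(...)` branches and the `tool.handle(...)` call in the diff.

```typescript
// Hypothetical, simplified model of the branch logic in defineTabTool.
type ModalType = 'fileChooser' | 'dialog';
type GateDecision = 'run' | 'modal-missing' | 'modal-unhandled';

function gateTabTool(clearsModalState: ModalType | undefined, present: ModalType[]): GateDecision {
  // A tool that clears a modal state may only run while that state is present.
  if (clearsModalState && present.indexOf(clearsModalState) === -1)
    return 'modal-missing';
  // Any other tool is blocked while some modal state is pending.
  if (!clearsModalState && present.length)
    return 'modal-unhandled';
  return 'run';
}

console.log(gateTabTool(undefined, []));
console.log(gateTabTool(undefined, ['dialog']));
console.log(gateTabTool('dialog', ['dialog']));
```

One consequence of this ordering is that a tool clearing one modal type still runs when a different modal type is also present, as long as its own type is in the list.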


@@ -14,11 +14,13 @@
* limitations under the License.
*/
import type * as playwright from 'playwright';
import type { ToolResult } from './tool';
import type { Context } from '../context';
// @ts-ignore
import { asLocator } from 'playwright-core/lib/utils';
async function waitForCompletion<R>(page: playwright.Page, callback: () => Promise<R>): Promise<R> {
import type * as playwright from 'playwright';
import type { Tab } from '../tab.js';
export async function waitForCompletion<R>(tab: Tab, callback: () => Promise<R>): Promise<R> {
const requests = new Set<playwright.Request>();
let frameNavigated = false;
let waitCallback: () => void = () => {};
@@ -37,9 +39,7 @@ async function waitForCompletion<R>(page: playwright.Page, callback: () => Promi
frameNavigated = true;
dispose();
clearTimeout(timeout);
void frame.waitForLoadState('load').then(() => {
waitCallback();
});
void tab.waitForLoadState('load').then(waitCallback);
};
const onTimeout = () => {
@@ -47,15 +47,15 @@ async function waitForCompletion<R>(page: playwright.Page, callback: () => Promi
waitCallback();
};
page.on('request', requestListener);
page.on('requestfinished', requestFinishedListener);
page.on('framenavigated', frameNavigateListener);
tab.page.on('request', requestListener);
tab.page.on('requestfinished', requestFinishedListener);
tab.page.on('framenavigated', frameNavigateListener);
const timeout = setTimeout(onTimeout, 10000);
const dispose = () => {
page.off('request', requestListener);
page.off('requestfinished', requestFinishedListener);
page.off('framenavigated', frameNavigateListener);
tab.page.off('request', requestListener);
tab.page.off('requestfinished', requestFinishedListener);
tab.page.off('framenavigated', frameNavigateListener);
clearTimeout(timeout);
};
@@ -64,45 +64,22 @@ async function waitForCompletion<R>(page: playwright.Page, callback: () => Promi
if (!requests.size && !frameNavigated)
waitCallback();
await waitBarrier;
await page.evaluate(() => new Promise(f => setTimeout(f, 1000)));
await tab.waitForTimeout(1000);
return result;
} finally {
dispose();
}
}
export async function runAndWait(context: Context, status: string, callback: (page: playwright.Page) => Promise<any>, snapshot: boolean = false): Promise<ToolResult> {
const page = context.existingPage();
const dismissFileChooser = context.hasFileChooser();
await waitForCompletion(page, () => callback(page));
if (dismissFileChooser)
context.clearFileChooser();
const result: ToolResult = snapshot ? await captureAriaSnapshot(context, status) : {
content: [{ type: 'text', text: status }],
};
return result;
export async function generateLocator(locator: playwright.Locator): Promise<string> {
try {
const { resolvedSelector } = await (locator as any)._resolveSelector();
return asLocator('javascript', resolvedSelector);
} catch (e) {
throw new Error('Ref not found, likely because element was removed. Use browser_snapshot to see what elements are currently on the page.');
}
}
export async function captureAriaSnapshot(context: Context, status: string = ''): Promise<ToolResult> {
const page = context.existingPage();
const lines = [];
if (status)
lines.push(`${status}`);
lines.push(
'',
`- Page URL: ${page.url()}`,
`- Page Title: ${await page.title()}`
);
if (context.hasFileChooser())
lines.push(`- There is a file chooser visible that requires browser_choose_file to be called`);
lines.push(
`- Page Snapshot`,
'```yaml',
await context.allFramesSnapshot(),
'```',
''
);
return {
content: [{ type: 'text', text: lines.join('\n') }],
};
export async function callOnPageNoTrace<T>(page: playwright.Page, callback: (page: playwright.Page) => Promise<T>): Promise<T> {
return await (page as any)._wrapApiCall(() => callback(page), { internal: true });
}

src/tools/wait.ts (new file, 65 lines)

@@ -0,0 +1,65 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { defineTool } from './tool.js';
const wait = defineTool({
capability: 'core',
schema: {
name: 'browser_wait_for',
title: 'Wait for',
description: 'Wait for text to appear or disappear or a specified time to pass',
inputSchema: z.object({
time: z.number().optional().describe('The time to wait in seconds'),
text: z.string().optional().describe('The text to wait for'),
textGone: z.string().optional().describe('The text to wait for to disappear'),
}),
type: 'readOnly',
},
handle: async (context, params, response) => {
if (!params.text && !params.textGone && !params.time)
throw new Error('Either time, text or textGone must be provided');
if (params.time) {
response.addCode(`await new Promise(f => setTimeout(f, ${params.time!} * 1000));`);
await new Promise(f => setTimeout(f, Math.min(30000, params.time! * 1000)));
}
const tab = context.currentTabOrDie();
const locator = params.text ? tab.page.getByText(params.text).first() : undefined;
const goneLocator = params.textGone ? tab.page.getByText(params.textGone).first() : undefined;
if (goneLocator) {
response.addCode(`await page.getByText(${JSON.stringify(params.textGone)}).first().waitFor({ state: 'hidden' });`);
await goneLocator.waitFor({ state: 'hidden' });
}
if (locator) {
response.addCode(`await page.getByText(${JSON.stringify(params.text)}).first().waitFor({ state: 'visible' });`);
await locator.waitFor({ state: 'visible' });
}
response.addResult(`Waited for ${params.text || params.textGone || params.time}`);
response.setIncludeSnapshot();
},
});
export default [
wait,
];
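In `browser_wait_for`, the delay that actually runs is capped at 30 seconds, while the snippet passed to `response.addCode()` uses the requested time unchanged. A minimal sketch of that difference, using the hypothetical names `plannedWait` and `MAX_WAIT_MS`:

```typescript
// Hypothetical helper: separates the reported code from the delay that runs.
const MAX_WAIT_MS = 30000; // same cap as Math.min(30000, ...) in the handler

function plannedWait(seconds: number): { generatedCode: string; actualDelayMs: number } {
  return {
    // The generated snippet keeps the caller's value.
    generatedCode: `await new Promise(f => setTimeout(f, ${seconds} * 1000));`,
    // The handler's own timer never exceeds the cap.
    actualDelayMs: Math.min(MAX_WAIT_MS, seconds * 1000),
  };
}

console.log(plannedWait(2));
console.log(plannedWait(120));
```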

src/utils/codegen.ts (new file, 53 lines)

@@ -0,0 +1,53 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
// adapted from:
// - https://github.com/microsoft/playwright/blob/76ee48dc9d4034536e3ec5b2c7ce8be3b79418a8/packages/playwright-core/src/utils/isomorphic/stringUtils.ts
// - https://github.com/microsoft/playwright/blob/76ee48dc9d4034536e3ec5b2c7ce8be3b79418a8/packages/playwright-core/src/server/codegen/javascript.ts
// NOTE: this function should not be used to escape any selectors.
export function escapeWithQuotes(text: string, char: string = '\'') {
const stringified = JSON.stringify(text);
const escapedText = stringified.substring(1, stringified.length - 1).replace(/\\"/g, '"');
if (char === '\'')
return char + escapedText.replace(/[']/g, '\\\'') + char;
if (char === '"')
return char + escapedText.replace(/["]/g, '\\"') + char;
if (char === '`')
return char + escapedText.replace(/[`]/g, '\\`') + char;
throw new Error('Invalid escape char');
}
export function quote(text: string) {
return escapeWithQuotes(text, '\'');
}
export function formatObject(value: any, indent = ' '): string {
if (typeof value === 'string')
return quote(value);
if (Array.isArray(value))
return `[${value.map(o => formatObject(o)).join(', ')}]`;
if (typeof value === 'object') {
const keys = Object.keys(value).filter(key => value[key] !== undefined).sort();
if (!keys.length)
return '{}';
const tokens: string[] = [];
for (const key of keys)
tokens.push(`${key}: ${formatObject(value[key])}`);
return `{\n${indent}${tokens.join(`,\n${indent}`)}\n}`;
}
return String(value);
}
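As a rough illustration of what these helpers produce for the `selectOption` code generation, here is a reduced sketch that handles only single-quoted strings and flat string arrays. `quoteSingle` and `formatList` are hypothetical stand-ins for `quote` and the array branch of `formatObject`, not the functions from the diff.

```typescript
// Hypothetical reduced versions of quote() and formatObject() for string arrays.
function quoteSingle(text: string): string {
  // JSON.stringify handles control characters; strip its outer double quotes
  // and undo the escaping of inner double quotes.
  const stringified = JSON.stringify(text);
  const inner = stringified.substring(1, stringified.length - 1).replace(/\\"/g, '"');
  // Escape single quotes, since the result is wrapped in them.
  return `'${inner.replace(/'/g, "\\'")}'`;
}

function formatList(values: string[]): string {
  return `[${values.map(v => quoteSingle(v)).join(', ')}]`;
}

console.log(quoteSingle(`it's "fine"`));
console.log(formatList(['bar', 'baz']));
```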

src/utils/fileUtils.ts (new file, 39 lines)

@@ -0,0 +1,39 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import os from 'node:os';
import path from 'node:path';
export function cacheDir() {
let cacheDirectory: string;
if (process.platform === 'linux')
cacheDirectory = process.env.XDG_CACHE_HOME || path.join(os.homedir(), '.cache');
else if (process.platform === 'darwin')
cacheDirectory = path.join(os.homedir(), 'Library', 'Caches');
else if (process.platform === 'win32')
cacheDirectory = process.env.LOCALAPPDATA || path.join(os.homedir(), 'AppData', 'Local');
else
throw new Error('Unsupported platform: ' + process.platform);
return path.join(cacheDirectory, 'ms-playwright');
}
export function sanitizeForFilePath(s: string) {
const sanitize = (s: string) => s.replace(/[\x00-\x2C\x2E-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+/g, '-');
const separator = s.lastIndexOf('.');
if (separator === -1)
return sanitize(s);
return sanitize(s.substring(0, separator)) + '.' + sanitize(s.substring(separator + 1));
}
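To make the behavior of `sanitizeForFilePath` concrete, the sketch below repeats the function from the diff so that it runs on its own and applies it to a few sample names. The sample inputs are illustrative and do not come from the repository.

```typescript
// Copied from the diff above so the example is self-contained.
function sanitizeForFilePath(s: string): string {
  // Replace each run of ASCII characters other than letters, digits and '-' with '-'.
  const sanitize = (part: string) => part.replace(/[\x00-\x2C\x2E-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+/g, '-');
  const separator = s.lastIndexOf('.');
  if (separator === -1)
    return sanitize(s);
  // Only the last dot survives, so it keeps acting as the extension separator.
  return sanitize(s.substring(0, separator)) + '.' + sanitize(s.substring(separator + 1));
}

for (const name of ['my report (final).pdf', 'a/b:c', 'snake_case.v2.png'])
  console.log(name, '->', sanitizeForFilePath(name));
```

Underscores and earlier dots are replaced as well, because both fall inside the excluded ranges.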

src/utils/guid.ts (new file, 25 lines)

@@ -0,0 +1,25 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import crypto from 'crypto';
export function createGuid(): string {
return crypto.randomBytes(16).toString('hex');
}
export function createHash(data: string): string {
return crypto.createHash('sha256').update(data).digest('hex').slice(0, 7);
}
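The two helpers differ in purpose: `createGuid` returns 16 random bytes as 32 hex characters, while `createHash` returns a deterministic 7-character prefix of a SHA-256 digest. A small sketch under the assumption of a Node.js runtime follows; `shortHash` is a renamed copy of `createHash`, used here only to avoid clashing with the `node:crypto` import of the same name.

```typescript
import { createHash, randomBytes } from 'node:crypto';

// Same body as createGuid in the diff: random, different on every call.
function createGuid(): string {
  return randomBytes(16).toString('hex');
}

// Renamed copy of the diff's createHash: stable for a given input.
function shortHash(data: string): string {
  return createHash('sha256').update(data).digest('hex').slice(0, 7);
}

console.log(createGuid().length);
console.log(shortHash('example') === shortHash('example'));
```

A 7-character prefix is compact but not collision-resistant, so it suits labels better than unique keys.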

src/utils/httpServer.ts (new file, 44 lines)

@@ -0,0 +1,44 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import assert from 'assert';
import http from 'http';
import type * as net from 'net';
export async function startHttpServer(config: { host?: string, port?: number }): Promise<http.Server> {
const { host, port } = config;
const httpServer = http.createServer();
await new Promise<void>((resolve, reject) => {
httpServer.on('error', reject);
httpServer.listen(port, host, () => {
resolve();
httpServer.removeListener('error', reject);
});
});
return httpServer;
}
export function httpAddressToString(address: string | net.AddressInfo | null): string {
assert(address, 'Could not bind server socket');
if (typeof address === 'string')
return address;
const resolvedPort = address.port;
let resolvedHost = address.family === 'IPv4' ? address.address : `[${address.address}]`;
if (resolvedHost === '0.0.0.0' || resolvedHost === '[::]')
resolvedHost = 'localhost';
return `http://${resolvedHost}:${resolvedPort}`;
}
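The address formatting in `httpAddressToString` can be tried without opening a socket. The sketch below keeps only the object branch of that function. `addressToUrl` and `AddressInfoLike` are hypothetical names, with the latter standing in for `net.AddressInfo`.

```typescript
// Hypothetical local type mirroring the fields read from net.AddressInfo.
type AddressInfoLike = { address: string; family: string; port: number };

function addressToUrl(address: AddressInfoLike): string {
  // IPv6 literals need brackets inside a URL.
  let host = address.family === 'IPv4' ? address.address : `[${address.address}]`;
  // Wildcard bind addresses are not connectable, so report localhost instead.
  if (host === '0.0.0.0' || host === '[::]')
    host = 'localhost';
  return `http://${host}:${address.port}`;
}

console.log(addressToUrl({ address: '0.0.0.0', family: 'IPv4', port: 8931 }));
console.log(addressToUrl({ address: '::1', family: 'IPv6', port: 3000 }));
```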


@@ -14,23 +14,12 @@
* limitations under the License.
*/
import type { Context } from '../context';
import debug from 'debug';
export type ResourceSchema = {
uri: string;
name: string;
description?: string;
mimeType?: string;
};
const errorsDebug = debug('pw:mcp:errors');
export type ResourceResult = {
uri: string;
mimeType?: string;
text?: string;
blob?: string;
};
export function logUnhandledError(error: unknown) {
errorsDebug(error);
}
export type Resource = {
schema: ResourceSchema;
read: (context: Context, uri: string) => Promise<ResourceResult[]>;
};
export const testDebug = debug('pw:mcp:test');

src/utils/manualPromise.ts (new file, 127 lines)

@@ -0,0 +1,127 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
export class ManualPromise<T = void> extends Promise<T> {
private _resolve!: (t: T) => void;
private _reject!: (e: Error) => void;
private _isDone: boolean;
constructor() {
let resolve: (t: T) => void;
let reject: (e: Error) => void;
super((f, r) => {
resolve = f;
reject = r;
});
this._isDone = false;
this._resolve = resolve!;
this._reject = reject!;
}
isDone() {
return this._isDone;
}
resolve(t: T) {
this._isDone = true;
this._resolve(t);
}
reject(e: Error) {
this._isDone = true;
this._reject(e);
}
static override get [Symbol.species]() {
return Promise;
}
override get [Symbol.toStringTag]() {
return 'ManualPromise';
}
}
export class LongStandingScope {
private _terminateError: Error | undefined;
private _closeError: Error | undefined;
private _terminatePromises = new Map<ManualPromise<Error>, string[]>();
private _isClosed = false;
reject(error: Error) {
this._isClosed = true;
this._terminateError = error;
for (const p of this._terminatePromises.keys())
p.resolve(error);
}
close(error: Error) {
this._isClosed = true;
this._closeError = error;
for (const [p, frames] of this._terminatePromises)
p.resolve(cloneError(error, frames));
}
isClosed() {
return this._isClosed;
}
static async raceMultiple<T>(scopes: LongStandingScope[], promise: Promise<T>): Promise<T> {
return Promise.race(scopes.map(s => s.race(promise)));
}
async race<T>(promise: Promise<T> | Promise<T>[]): Promise<T> {
return this._race(Array.isArray(promise) ? promise : [promise], false) as Promise<T>;
}
async safeRace<T>(promise: Promise<T>, defaultValue?: T): Promise<T> {
return this._race([promise], true, defaultValue);
}
private async _race(promises: Promise<any>[], safe: boolean, defaultValue?: any): Promise<any> {
const terminatePromise = new ManualPromise<Error>();
const frames = captureRawStack();
if (this._terminateError)
terminatePromise.resolve(this._terminateError);
if (this._closeError)
terminatePromise.resolve(cloneError(this._closeError, frames));
this._terminatePromises.set(terminatePromise, frames);
try {
return await Promise.race([
terminatePromise.then(e => safe ? defaultValue : Promise.reject(e)),
...promises
]);
} finally {
this._terminatePromises.delete(terminatePromise);
}
}
}
function cloneError(error: Error, frames: string[]) {
const clone = new Error();
clone.name = error.name;
clone.message = error.message;
clone.stack = [error.name + ':' + error.message, ...frames].join('\n');
return clone;
}
function captureRawStack(): string[] {
const stackTraceLimit = Error.stackTraceLimit;
Error.stackTraceLimit = 50;
const error = new Error();
const stack = error.stack || '';
Error.stackTraceLimit = stackTraceLimit;
return stack.split('\n');
}
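A `ManualPromise` is a promise that is settled from outside its executor. The sketch below is a trimmed copy of the class from the diff (no `reject`, no `toStringTag`) together with a usage example. It assumes a compile target of ES2015 or later, since subclassing `Promise` does not work when downleveled to ES5. The `'connected'` value and the 10 ms timer are illustrative.

```typescript
// Trimmed copy of ManualPromise from the diff, kept runnable on its own.
class ManualPromise<T = void> extends Promise<T> {
  private _resolve!: (t: T) => void;
  private _isDone: boolean;

  constructor() {
    let resolve: (t: T) => void;
    super(f => {
      resolve = f;
    });
    this._isDone = false;
    this._resolve = resolve!;
  }

  isDone() {
    return this._isDone;
  }

  resolve(t: T) {
    this._isDone = true;
    this._resolve(t);
  }

  // then() and await build derived promises via the species constructor;
  // returning plain Promise keeps them from calling the zero-argument constructor.
  static get [Symbol.species]() {
    return Promise;
  }
}

const ready = new ManualPromise<string>();
console.log(ready.isDone());
setTimeout(() => ready.resolve('connected'), 10);
void ready.then(value => console.log(value, ready.isDone()));
```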

src/utils/package.ts (new file, 22 lines)

@@ -0,0 +1,22 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import fs from 'fs';
import path from 'path';
import url from 'url';
const __filename = url.fileURLToPath(import.meta.url);
export const packageJSON = JSON.parse(fs.readFileSync(path.join(path.dirname(__filename), '..', '..', 'package.json'), 'utf8'));


@@ -1,315 +0,0 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import fs from 'fs/promises';
import { spawn } from 'node:child_process';
import path from 'node:path';
import { test, expect } from './fixtures';
test('test tool list', async ({ client, visionClient }) => {
const { tools } = await client.listTools();
expect(tools.map(t => t.name)).toEqual([
'browser_navigate',
'browser_go_back',
'browser_go_forward',
'browser_choose_file',
'browser_snapshot',
'browser_click',
'browser_hover',
'browser_type',
'browser_select_option',
'browser_take_screenshot',
'browser_press_key',
'browser_wait',
'browser_save_as_pdf',
'browser_close',
]);
const { tools: visionTools } = await visionClient.listTools();
expect(visionTools.map(t => t.name)).toEqual([
'browser_navigate',
'browser_go_back',
'browser_go_forward',
'browser_choose_file',
'browser_screenshot',
'browser_move_mouse',
'browser_click',
'browser_drag',
'browser_type',
'browser_press_key',
'browser_wait',
'browser_save_as_pdf',
'browser_close',
]);
});
test('test resources list', async ({ client }) => {
const { resources } = await client.listResources();
expect(resources).toEqual([
expect.objectContaining({
uri: 'browser://console',
mimeType: 'text/plain',
}),
]);
});
test('test browser_navigate', async ({ client }) => {
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
})).toHaveTextContent(`
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s1e2]: Hello, world!
\`\`\`
`
);
});
test('test browser_click', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><button>Submit</button></html>',
},
});
expect(await client.callTool({
name: 'browser_click',
arguments: {
element: 'Submit button',
ref: 's1e4',
},
})).toHaveTextContent(`"Submit button" clicked
- Page URL: data:text/html,<html><title>Title</title><button>Submit</button></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s2e2]:
- button "Submit" [ref=s2e4]
\`\`\`
`);
});
test('test reopen browser', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
});
expect(await client.callTool({
name: 'browser_close',
})).toHaveTextContent('Page closed');
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
})).toHaveTextContent(`
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s1e2]: Hello, world!
\`\`\`
`);
});
test('single option', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><select><option value="foo">Foo</option><option value="bar">Bar</option></select></html>',
},
});
expect(await client.callTool({
name: 'browser_select_option',
arguments: {
element: 'Select',
ref: 's1e4',
values: ['bar'],
},
})).toHaveTextContent(`Selected option in "Select"
- Page URL: data:text/html,<html><title>Title</title><select><option value="foo">Foo</option><option value="bar">Bar</option></select></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s2e2]:
- combobox [ref=s2e4]:
- option "Foo" [ref=s2e5]
- option "Bar" [selected] [ref=s2e6]
\`\`\`
`);
});
test('multiple option', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><select multiple><option value="foo">Foo</option><option value="bar">Bar</option><option value="baz">Baz</option></select></html>',
},
});
expect(await client.callTool({
name: 'browser_select_option',
arguments: {
element: 'Select',
ref: 's1e4',
values: ['bar', 'baz'],
},
})).toHaveTextContent(`Selected option in "Select"
- Page URL: data:text/html,<html><title>Title</title><select multiple><option value="foo">Foo</option><option value="bar">Bar</option><option value="baz">Baz</option></select></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s2e2]:
- listbox [ref=s2e4]:
- option "Foo" [ref=s2e5]
- option "Bar" [selected] [ref=s2e6]
- option "Baz" [selected] [ref=s2e7]
\`\`\`
`);
});
test('browser://console', async ({ client }) => {
await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><script>console.log("Hello, world!");console.error("Error"); </script></html>',
},
});
const resource = await client.readResource({
uri: 'browser://console',
});
expect(resource.contents).toEqual([{
uri: 'browser://console',
mimeType: 'text/plain',
text: '[LOG] Hello, world!\n[ERROR] Error',
}]);
});
test('stitched aria frames', async ({ client }) => {
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<h1>Hello</h1><iframe src="data:text/html,<h1>World</h1>"></iframe><iframe src="data:text/html,<h1>Should be invisible</h1>" style="display: none;"></iframe>',
},
})).toHaveTextContent(`
- Page URL: data:text/html,<h1>Hello</h1><iframe src="data:text/html,<h1>World</h1>"></iframe><iframe src="data:text/html,<h1>Should be invisible</h1>" style="display: none;"></iframe>
- Page Title:
- Page Snapshot
\`\`\`yaml
- document [ref=s1e2]:
- heading "Hello" [level=1] [ref=s1e4]
# iframe src=data:text/html,<h1>World</h1>
- document [ref=f0s1e2]:
- heading "World" [level=1] [ref=f0s1e4]
\`\`\`
`);
});
test('browser_choose_file', async ({ client }) => {
expect(await client.callTool({
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><input type="file" /><button>Button</button></html>',
},
})).toContainTextContent('- textbox [ref=s1e4]');
expect(await client.callTool({
name: 'browser_click',
arguments: {
element: 'Textbox',
ref: 's1e4',
},
})).toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called');
const filePath = test.info().outputPath('test.txt');
await fs.writeFile(filePath, 'Hello, world!');
{
const response = await client.callTool({
name: 'browser_choose_file',
arguments: {
paths: [filePath],
},
});
expect(response).not.toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called');
expect(response).toContainTextContent('textbox [ref=s3e4]: C:\\fakepath\\test.txt');
}
{
const response = await client.callTool({
name: 'browser_click',
arguments: {
element: 'Textbox',
ref: 's3e4',
},
});
expect(response).toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called');
expect(response).toContainTextContent('button "Button" [ref=s4e5]');
}
{
const response = await client.callTool({
name: 'browser_click',
arguments: {
element: 'Button',
ref: 's4e5',
},
});
expect(response, 'not submitting browser_choose_file dismisses file chooser').not.toContainTextContent('There is a file chooser visible that requires browser_choose_file to be called');
}
});
test('sse transport', async () => {
const cp = spawn('node', [path.join(__dirname, '../cli.js'), '--port', '0'], { stdio: 'pipe' });
try {
let stdout = '';
const url = await new Promise<string>(resolve => cp.stdout?.on('data', data => {
stdout += data.toString();
const match = stdout.match(/Listening on (http:\/\/.*)/);
if (match)
resolve(match[1]);
}));
// Dynamic import is needed here because of ESM/CommonJS interop issues.
const { SSEClientTransport } = await import('@modelcontextprotocol/sdk/client/sse.js');
const { Client } = await import('@modelcontextprotocol/sdk/client/index.js');
const transport = new SSEClientTransport(new URL(url));
const client = new Client({ name: 'test', version: '1.0.0' });
await client.connect(transport);
await client.ping();
} finally {
cp.kill();
}
});
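The `sse transport` test above accumulates stdout chunks and matches the listening URL as soon as the prefix appears. If a chunk boundary falls inside the URL, that match could return a truncated address. A minimal sketch of a stricter matcher follows, assuming the CLI ends the `Listening on ...` line with a line break. The helper name `extractListeningUrl` is hypothetical and not part of the repository.

```typescript
// Hypothetical helper for illustration; not part of the repository.
// Returns the URL only once the whole line has arrived, so a URL that is
// split across two stdout chunks is not returned early.
function extractListeningUrl(buffered: string): string | undefined {
  // Assumption: the server prints "Listening on <url>" followed by a line break.
  const match = buffered.match(/Listening on (http:\/\/\S+)\r?\n/);
  return match ? match[1] : undefined;
}

// Usage: feed the accumulated stdout after each 'data' event.
let stdout = '';
for (const chunk of ['Listening on http://local', 'host:8931\n']) {
  stdout += chunk;
  const url = extractListeningUrl(stdout);
  if (url)
    console.log(url);
}
```

In the test, the promise would resolve only when the helper returns a value, instead of on the first partial match.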

112
tests/capabilities.spec.ts Normal file

@@ -0,0 +1,112 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures.js';
test('test snapshot tool list', async ({ client }) => {
const { tools } = await client.listTools();
expect(new Set(tools.map(t => t.name))).toEqual(new Set([
'browser_click',
'browser_console_messages',
'browser_drag',
'browser_evaluate',
'browser_file_upload',
'browser_handle_dialog',
'browser_hover',
'browser_select_option',
'browser_type',
'browser_close',
'browser_install',
'browser_navigate_back',
'browser_navigate_forward',
'browser_navigate',
'browser_network_requests',
'browser_press_key',
'browser_resize',
'browser_snapshot',
'browser_tab_close',
'browser_tab_list',
'browser_tab_new',
'browser_tab_select',
'browser_take_screenshot',
'browser_wait_for',
]));
});
test('test tool list proxy mode', async ({ startClient }) => {
const { client } = await startClient({
args: ['--connect-tool'],
});
const { tools } = await client.listTools();
expect(new Set(tools.map(t => t.name))).toEqual(new Set([
'browser_click',
'browser_connect', // the extra tool
'browser_console_messages',
'browser_drag',
'browser_evaluate',
'browser_file_upload',
'browser_handle_dialog',
'browser_hover',
'browser_select_option',
'browser_type',
'browser_close',
'browser_install',
'browser_navigate_back',
'browser_navigate_forward',
'browser_navigate',
'browser_network_requests',
'browser_press_key',
'browser_resize',
'browser_snapshot',
'browser_tab_close',
'browser_tab_list',
'browser_tab_new',
'browser_tab_select',
'browser_take_screenshot',
'browser_wait_for',
]));
});
test('test capabilities (pdf)', async ({ startClient }) => {
const { client } = await startClient({
args: ['--caps=pdf'],
});
const { tools } = await client.listTools();
const toolNames = tools.map(t => t.name);
expect(toolNames).toContain('browser_pdf_save');
});
test('test capabilities (vision)', async ({ startClient }) => {
const { client } = await startClient({
args: ['--caps=vision'],
});
const { tools } = await client.listTools();
const toolNames = tools.map(t => t.name);
expect(toolNames).toContain('browser_mouse_move_xy');
expect(toolNames).toContain('browser_mouse_click_xy');
expect(toolNames).toContain('browser_mouse_drag_xy');
});
test('support for legacy --vision option', async ({ startClient }) => {
const { client } = await startClient({
args: ['--vision'],
});
const { tools } = await client.listTools();
const toolNames = tools.map(t => t.name);
expect(toolNames).toContain('browser_mouse_move_xy');
expect(toolNames).toContain('browser_mouse_click_xy');
expect(toolNames).toContain('browser_mouse_drag_xy');
});
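The tool-list tests above compare names as sets, so ordering does not matter. When such a comparison fails, it can help to see exactly which names are missing or unexpected. A minimal sketch, assuming a hypothetical helper `diffToolNames` that is not part of the test fixtures:

```typescript
// Hypothetical helper for illustration; not part of the test fixtures.
function diffToolNames(actual: string[], expected: string[]): { missing: string[]; extra: string[] } {
  const actualSet = new Set(actual);
  const expectedSet = new Set(expected);
  return {
    // Expected names that the server did not report.
    missing: expected.filter(name => !actualSet.has(name)),
    // Reported names that were not expected, e.g. 'browser_connect' in proxy mode.
    extra: actual.filter(name => !expectedSet.has(name)),
  };
}

// Usage with names taken from the lists above.
const toolDiff = diffToolNames(
    ['browser_click', 'browser_connect'],
    ['browser_click', 'browser_close'],
);
console.log(toolDiff);
```

Asserting that both `missing` and `extra` are empty is equivalent to the set equality used in the tests, but a failure then names the offending tools directly.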

97
tests/cdp.spec.ts Normal file

@@ -0,0 +1,97 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import url from 'node:url';
import path from 'node:path';
import { spawnSync } from 'node:child_process';
import { test, expect } from './fixtures.js';
test('cdp server', async ({ cdpServer, startClient, server }) => {
await cdpServer.start();
const { client } = await startClient({ args: [`--cdp-endpoint=${cdpServer.endpoint}`] });
expect(await client.callTool({
name: 'browser_navigate',
arguments: { url: server.HELLO_WORLD },
})).toHaveResponse({
pageState: expect.stringContaining(`- generic [active] [ref=e1]: Hello, world!`),
});
});
test('cdp server reuse tab', async ({ cdpServer, startClient, server }) => {
const browserContext = await cdpServer.start();
const { client } = await startClient({ args: [`--cdp-endpoint=${cdpServer.endpoint}`] });
const [page] = browserContext.pages();
await page.goto(server.HELLO_WORLD);
expect(await client.callTool({
name: 'browser_click',
arguments: {
element: 'Hello, world!',
ref: 'f0',
},
})).toHaveResponse({
result: `Error: No open pages available. Use the "browser_navigate" tool to navigate to a page first.`,
isError: true,
});
expect(await client.callTool({
name: 'browser_snapshot',
})).toHaveResponse({
pageState: expect.stringContaining(`- Page URL: ${server.HELLO_WORLD}
- Page Title: Title
- Page Snapshot:
\`\`\`yaml
- generic [active] [ref=e1]: Hello, world!
\`\`\``),
});
});
test('should throw connection error and allow re-connecting', async ({ cdpServer, startClient, server }) => {
const { client } = await startClient({ args: [`--cdp-endpoint=${cdpServer.endpoint}`] });
server.setContent('/', `
<title>Title</title>
<body>Hello, world!</body>
`, 'text/html');
expect(await client.callTool({
name: 'browser_navigate',
arguments: { url: server.PREFIX },
})).toHaveResponse({
result: expect.stringContaining(`Error: browserType.connectOverCDP: connect ECONNREFUSED`),
isError: true,
});
await cdpServer.start();
expect(await client.callTool({
name: 'browser_navigate',
arguments: { url: server.PREFIX },
})).toHaveResponse({
pageState: expect.stringContaining(`- generic [active] [ref=e1]: Hello, world!`),
});
});
// NOTE: This can be replaced with import.meta.filename once Node.js 18 support is dropped.
const __filename = url.fileURLToPath(import.meta.url);
test('does not support --device', async () => {
const result = spawnSync('node', [
path.join(__filename, '../../cli.js'), '--device=Pixel 5', '--cdp-endpoint=http://localhost:1234',
]);
expect(result.error).toBeUndefined();
expect(result.status).toBe(1);
expect(result.stderr.toString()).toContain('Device emulation is not supported with cdpEndpoint.');
});

99
tests/click.spec.ts Normal file

@@ -0,0 +1,99 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures.js';
test('browser_click', async ({ client, server, mcpBrowser }) => {
server.setContent('/', `
<title>Title</title>
<button>Submit</button>
`, 'text/html');
await client.callTool({
name: 'browser_navigate',
arguments: { url: server.PREFIX },
});
expect(await client.callTool({
name: 'browser_click',
arguments: {
element: 'Submit button',
ref: 'e2',
},
})).toHaveResponse({
code: `await page.getByRole('button', { name: 'Submit' }).click();`,
pageState: expect.stringContaining(`- button "Submit" ${mcpBrowser !== 'webkit' || process.platform === 'linux' ? '[active] ' : ''}[ref=e2]`),
});
});
test('browser_click (double)', async ({ client, server }) => {
server.setContent('/', `
<title>Title</title>
<script>
function handle() {
document.querySelector('h1').textContent = 'Double clicked';
}
</script>
<h1 ondblclick="handle()">Click me</h1>
`, 'text/html');
await client.callTool({
name: 'browser_navigate',
arguments: { url: server.PREFIX },
});
expect(await client.callTool({
name: 'browser_click',
arguments: {
element: 'Click me',
ref: 'e2',
doubleClick: true,
},
})).toHaveResponse({
code: `await page.getByRole('heading', { name: 'Click me' }).dblclick();`,
pageState: expect.stringContaining(`- heading "Double clicked" [level=1] [ref=e3]`),
});
});
test('browser_click (right)', async ({ client, server }) => {
server.setContent('/', `
<button>Menu</button>
<script>
document.addEventListener('contextmenu', event => {
event.preventDefault();
document.querySelector('button').textContent = 'Right clicked';
});
</script>
`, 'text/html');
await client.callTool({
name: 'browser_navigate',
arguments: { url: server.PREFIX },
});
const result = await client.callTool({
name: 'browser_click',
arguments: {
element: 'Menu',
ref: 'e2',
button: 'right',
},
});
expect(result).toHaveResponse({
code: `await page.getByRole('button', { name: 'Menu' }).click({ button: 'right' });`,
pageState: expect.stringContaining(`- button "Right clicked"`),
});
});
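The three click tests above expect different generated code depending on the tool arguments: a plain `click()`, a `dblclick()` when `doubleClick` is set, and `click({ button: 'right' })` for a right click. The mapping can be sketched as follows. The function `clickCode` and the `ClickOptions` type are hypothetical, written only to mirror the strings the tests expect. The server's real code generation may differ.

```typescript
// Hypothetical sketch; it mirrors the expected strings in the tests above
// and is not the server's actual implementation.
type ClickOptions = { doubleClick?: boolean; button?: 'left' | 'right' | 'middle' };

function clickCode(locator: string, options: ClickOptions = {}): string {
  // A double click maps to dblclick(); otherwise click() is used.
  const method = options.doubleClick ? 'dblclick' : 'click';
  // Only a non-default button is spelled out in the generated call.
  const args = options.button && options.button !== 'left' ? `{ button: '${options.button}' }` : '';
  return `await page.${locator}.${method}(${args});`;
}

// Usage with the locator from the right-click test.
console.log(clickCode(`getByRole('button', { name: 'Menu' })`, { button: 'right' }));
```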

Some files were not shown because too many files have changed in this diff.