gsd-browser
Native Rust browser automation CLI for Chrome/Chromium via CDP. gsd-browser keeps a persistent background daemon, auto-starts on first use, and exposes 90+ top-level commands for navigation, interaction, authenticated live viewing, annotations, recording bundles, snapshots with versioned refs, assertions, structured extraction, network control, visual diffing, tracing, and stateful auth flows.
Built for AI agents, CI pipelines, and developers who want deterministic browser control without adopting a full browser test framework.
MCP Server — Massively Expanded (The Primary Path for AI Agents)
gsd-browser mcp is now a first-class, extremely powerful browser automation platform for agents. It exposes 50+ tools, live resources (real snapshot/refs/state/timeline data), and executable prompts over stdio (Model Context Protocol).
Completed advancements include:
- Full coverage of the rich surface: versioned refs + multiple snapshot modes, semantic act + find_best, advanced forms, robust assertions + waits, browser_batch for atomic flows, live viewer + full human collaboration (takeover, annotations, goal banners, step/abort/pause/resume, sensitive mode), first-class recording & evidence bundles, visual regression, HAR/trace/PDF export, network mocking & blocking, device emulation, encrypted auth vault + state save/restore, structured extraction, prompt injection scanning, action cache for long-term self-healing, multi-tab/frame management, rich diagnostics (debug_bundle etc.), and more.
- Resources that actually query the daemon for live context (gsd-browser://latest-snapshot, current-state, active-recordings, timeline, etc.).
- Executable prompts encoding best-practice multi-step workflows (robust_login_flow, full_page_audit, autonomous_research_task, evidence_creation_workflow, debug_stuck_agent_flow, etc.).
- Standardized high-value response envelopes on every tool call: summary, structured_data, suggested_next_actions, evidence_refs.
- Seamless reuse of the proven daemon client (auto-start, named sessions for isolation + persistent cache/state, robust error handling).
This is designed to be the high-end browser backend for serious agent platforms.
Get started in seconds:
gsd-browser mcp
Point Cursor, Claude Desktop, VS Code + Copilot, or any MCP client at it.
Tailored setup + config snippets:
./scripts/mcp-quickstart.sh cursor # claude | vscode | generic
Key documentation:
- docs/mcp.md — Full capabilities, architecture, client configs, quickstart script.
- docs/AGENT-BEST-PRACTICES.md — Golden rules, workflow patterns, "When to Use What" table, self-healing, response envelopes, prompt/resource usage (essential reading for agents).
- docs/examples/mcp-client-config.json — Ready-to-paste example.
- Root SKILL.md and the gsd-browser-skill/ pack: complete underlying command semantics and curated workflows (the MCP tools are a direct mapping).
Run gsd-browser mcp and unleash one of the most powerful browser surfaces available for agentic work.
Install
npm
npm install -g @opengsd/gsd-browser
Pre-built binaries
Download from GitHub Releases:
| Platform | Asset |
|---|---|
| macOS (Apple Silicon) | gsd-browser-darwin-arm64 |
| macOS (Intel) | gsd-browser-darwin-x64 |
| Linux (ARM64) | gsd-browser-linux-arm64 |
| Linux (x64) | gsd-browser-linux-x64 |
| Windows (x64) | gsd-browser-windows-x64.exe |
Build from source
git clone https://github.com/open-gsd/gsd-browser.git
cd gsd-browser
cargo install --path cli
Codex Plugin
Install the CLI and register the Codex Plugin in one pass:
curl -fsSL https://raw.githubusercontent.com/open-gsd/gsd-browser/main/install.sh | bash -s -- --codex-plugin
The installer writes the plugin to ~/plugins/gsd-browser, updates the personal Codex marketplace at ~/.agents/plugins/marketplace.json, and runs codex plugin add gsd-browser@<marketplace> when the Codex CLI is available. Without --codex-plugin, the interactive installer offers OpenAI Codex Plugin alongside the agent skill options.
crates.io
The crates.io package (gsd-browser) is not published yet. Use GitHub release assets or a source build.
The one-line installer (curl -fsSL https://raw.githubusercontent.com/open-gsd/gsd-browser/main/install.sh | bash) also sets up the gsd-browser-skill/ pack for coding agents and documents the MCP path in its header.
Quick Start
The daemon starts automatically on first use.
# Navigate to a page
gsd-browser navigate https://example.com
# Snapshot interactive elements and assign refs like @v1:e1
gsd-browser snapshot
# On example.com the only interactive element is the "More information..." link
gsd-browser click-ref @v1:e1
# Wait for navigation and assert the result
gsd-browser wait-for --condition network_idle
gsd-browser assert --checks '[{"kind":"url_contains","text":"iana.org"}]'
# Capture a PNG
gsd-browser screenshot --output page.png --format png
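The steps above can be chained into one fail-fast function. This is a minimal sketch, not part of gsd-browser: BROWSER_CLI, url_contains_check, and quick_start_flow are hypothetical names introduced here, and the check builder assumes the text contains no double quotes or backslashes. Only the subcommands and flags already shown in this section are used.

```shell
# Hypothetical helper variable (not a gsd-browser setting): lets the binary
# be swapped out, for example with `echo` for a dry run.
BROWSER_CLI="${BROWSER_CLI:-gsd-browser}"

# Build the JSON passed to `assert --checks` for a url_contains check.
# Assumes the text contains no double quotes or backslashes.
url_contains_check() {
  printf '[{"kind":"url_contains","text":"%s"}]' "$1"
}

# Run the quick-start steps in order and stop at the first failing command.
# Arguments: start URL, ref to click, expected URL fragment, screenshot path.
quick_start_flow() {
  url="$1"; ref="$2"; expected="$3"; shot="$4"
  "$BROWSER_CLI" navigate "$url" &&
  "$BROWSER_CLI" snapshot &&
  "$BROWSER_CLI" click-ref "$ref" &&
  "$BROWSER_CLI" wait-for --condition network_idle &&
  "$BROWSER_CLI" assert --checks "$(url_contains_check "$expected")" &&
  "$BROWSER_CLI" screenshot --output "$shot" --format png
}
```

For the page used above, the call would be quick_start_flow https://example.com @v1:e1 iana.org page.png. Setting BROWSER_CLI=echo first prints the six commands instead of running them.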
For the modern agent experience, prefer the MCP server (see top of this README).
Interactive Workbench
gsd-browser view starts an authenticated localhost workbench for the active session. The URL is bound to the session, viewer id, loopback origin, expiry, and viewer capabilities. Use view --print-only when another tool needs the URL.
gsd-browser view
gsd-browser view --print-only
gsd-browser control-state
gsd-browser takeover
gsd-browser release-control
gsd-browser sensitive-on
gsd-browser sensitive-off
The viewer streams the real Chrome page, forwards pointer, wheel, keyboard, text, and paste input while in Control mode, creates annotations in Annotate mode, and starts/stops local recording bundles in Record mode. Sensitive mode keeps local human control available while cloud frame capture and evidence surfaces use redaction policy.
Annotations and recordings stay local to the daemon state directory:
gsd-browser annotations
gsd-browser annotation-get <id>
gsd-browser annotation-clear <id>
gsd-browser annotation-resolve <id>
gsd-browser annotation-export --output annotations.json
gsd-browser record-start --name checkout-bug
gsd-browser record-stop
gsd-browser recordings
gsd-browser recording-get <id>
gsd-browser recording-export <id> --output <path>
gsd-browser recording-discard <id>
gsd-browser recording-validate <id-or-path> --json
(MCP equivalents: browser_view, browser_annotation_request, browser_record_start, etc. — among the highest-leverage features for collaborative agent work.)
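A recording is easiest to keep bounded when start and stop are paired around the work they capture. The sketch below shows one way to do that with the record-start and record-stop commands listed above. BROWSER_CLI and with_recording are hypothetical names introduced for illustration, and the wrapper does not handle interruption by signals.

```shell
# Hypothetical helper variable (not a gsd-browser setting).
BROWSER_CLI="${BROWSER_CLI:-gsd-browser}"

# Start a named recording, run the given command, always stop the recording
# afterwards, and return the wrapped command's own exit status.
with_recording() {
  name="$1"; shift
  "$BROWSER_CLI" record-start --name "$name" || return 1
  "$@"
  rc=$?
  "$BROWSER_CLI" record-stop
  return "$rc"
}
```

A call such as with_recording checkout-bug gsd-browser navigate https://example.com would record a single navigation; the resulting bundle can then be inspected with recordings and recording-get <id>.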
Command Surface
gsd-browser currently exposes 90+ top-level commands (the MCP server exposes the most valuable subset as 50+ discoverable tools with agent-optimized descriptions and envelopes):
| Area | Commands |
|---|---|
| Navigation | navigate, back, forward, reload |
| Logs & JavaScript | console, network, dialog, eval |
| Interaction | click, type, press, hover, scroll, select-option, set-checked, drag, set-viewport, upload-file |
| Inspection | accessibility-tree, find, page-source |
| Waits | wait-for |
| Snapshots & refs | snapshot, get-ref, click-ref, hover-ref, fill-ref |
| Assertions & batching | assert, diff, batch |
| Pages & frames | list-pages, switch-page, close-page, list-frames, select-frame |
| Forms & semantic actions | analyze-form, fill-form, find-best, act |
| Live workbench | goal, view, control-state, takeover, release-control, pause, resume, step, abort, sensitive-on, sensitive-off |
| Annotations | annotations, annotation-get, annotation-clear, annotation-resolve, annotation-export, annotation-request |
| Recording bundles | record-start, record-stop, record-pause, record-resume, recordings, recording-get, recording-export, recording-discard, recording-validate |
| Diagnostics | timeline, session-summary, debug-bundle |
| Screenshots & document output | screenshot, zoom-region, save-pdf |
| Visual regression | visual-diff |
| Structured extraction | extract |
| Network control | mock-route, block-urls, clear-routes |
| Device & browser state | emulate-device, save-state, restore-state |
| Auth vault | vault-save, vault-login, vault-list |
| Recording & traces | generate-test, har-export, trace-start, trace-stop |
| Safety, caching & daemon management | action-cache, check-injection, daemon |
| MCP, cloud & updates | mcp, cloud-methods, update |
Highlights
- Persistent daemon with automatic startup for fast repeated commands
- Durable named sessions with explicit health reporting and no silent session replacement
- Versioned refs from snapshot for deterministic interaction (@v1:e1, @v2:e3)
- Explicit assertions with assert and multi-step automation with batch
- Shared inspection semantics across snapshot, find, wait-for, assert, and ref-driven actions
- Semantic find-best and act flows covering 15 built-in intents
- Named sessions via --session for isolated parallel browser workers
- Authenticated local viewer with human takeover, pause/step/abort, annotations, sensitive mode, and bounded recording bundles
- Structured JSON output on every command via --json
- Visual diffing, HAR export, PDF generation, and CDP tracing in the same tool
- Saved browser state plus encrypted credential replay through the auth vault
- Prompt injection scanning for agent-facing browsing workflows
- Action cache for self-healing intent mappings across sessions (especially powerful with named MCP sessions)
- Full MCP server with resources, prompts, and agent-optimized envelopes
Configuration
gsd-browser merges configuration in this order:
- Built-in defaults
- User config: ~/.gsd-browser/config.toml
- Project config: ./gsd-browser.toml
- Environment variables: GSD_BROWSER_*
- CLI flags
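The merge order can be pictured as "the last source that sets a value wins". That reading is an assumption about how the list above is meant (it is the usual convention for layered config), and first_set is a hypothetical helper written for this illustration, not something gsd-browser ships.

```shell
# Return the first non-empty argument. Callers pass candidates from highest
# to lowest precedence: CLI flag, environment variable, project config,
# user config, built-in default.
first_set() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

With no flag, an environment value of 9333, and a default of 9222, first_set "" 9333 "" "" 9222 prints 9333. With every layer empty except the default, it prints 9222.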
Example gsd-browser.toml:
[browser]
path = "/usr/bin/chromium"
cdp_url = "http://localhost:9222" # attach to existing Chrome instead of launching
headless = true
[daemon]
port = 9222
host = "127.0.0.1"
[screenshot]
quality = 90
format = "png"
full_page = false
[settle]
timeout_ms = 500
poll_ms = 40
quiet_window_ms = 100
[logs]
max_buffer_size = 1000
[artifacts]
dir = "./browser-artifacts"
[timeline]
enabled = true
max_entries = 500
Supported environment variable overrides use GSD_BROWSER_<SECTION>_<FIELD> naming:
export GSD_BROWSER_BROWSER_PATH=/usr/bin/chromium
export GSD_BROWSER_BROWSER_CDP_URL=http://localhost:9222
export GSD_BROWSER_BROWSER_HEADLESS=true
export GSD_BROWSER_DAEMON_PORT=9333
export GSD_BROWSER_SCREENSHOT_QUALITY=90
export GSD_BROWSER_SETTLE_TIMEOUT_MS=1000
export GSD_BROWSER_ARTIFACTS_DIR=./browser-artifacts
export GSD_BROWSER_VAULT_KEY=your-encryption-key
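The naming pattern is mechanical enough to derive in a script. The following sketch maps a TOML section and field to the matching variable name. env_name is a hypothetical helper, and a name it produces is only meaningful if gsd-browser actually supports an override for that field.

```shell
# Map a config section and field to the override variable name, following
# the GSD_BROWSER_<SECTION>_<FIELD> pattern by upper-casing both parts.
env_name() {
  printf 'GSD_BROWSER_%s_%s\n' "$1" "$2" | tr '[:lower:]' '[:upper:]'
}
```

For instance, env_name browser cdp_url yields GSD_BROWSER_BROWSER_CDP_URL and env_name settle timeout_ms yields GSD_BROWSER_SETTLE_TIMEOUT_MS, matching the exports listed above.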
For MCP usage, place the relevant GSD_BROWSER_* variables in your MCP client's server env configuration.
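As a rough illustration, the function below prints a client entry with an env block. It assumes the mcpServers layout that Claude Desktop and Cursor read; other clients may use a different shape, and docs/examples/mcp-client-config.json remains the project's own reference. mcp_config is a hypothetical helper, and the port argument is assumed to be a plain number.

```shell
# Print an MCP client entry that launches `gsd-browser mcp` over stdio and
# passes two GSD_BROWSER_* overrides through the client's env block.
# The "mcpServers" layout is an assumption about the target client.
mcp_config() {
  port="$1"
  cat <<EOF
{
  "mcpServers": {
    "gsd-browser": {
      "command": "gsd-browser",
      "args": ["mcp"],
      "env": {
        "GSD_BROWSER_DAEMON_PORT": "$port",
        "GSD_BROWSER_BROWSER_HEADLESS": "true"
      }
    }
  }
}
EOF
}
```

Running mcp_config 9333 prints JSON that can be pasted into the client's configuration file and adjusted from there.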
Stealth Mode & Alternative Backends (Experimental)
gsd-browser defaults to the stable chromiumoxide CDP client for maximum compatibility.
--stealth / backend = "stealth"
gsd-browser --stealth navigate https://bot.sannysoft.com
# or
gsd-browser --backend stealth navigate ...
# config.toml
[browser]
stealth = true
backend = "chaser-oxide" # or "stealth", "chromey"
Effects when enabled:
- Anti-detection Chrome flags (--disable-blink-features=AutomationControlled, IsolateOrigins, etc.)
- Realistic UA + hardware (cores, memory, platform) spoofing
- CDP signal patches (webdriver, cdc_ markers, chrome object, permissions, WebGL)
- Client Hints and locale/language consistency
- (Future) human-like mouse curves via input_dispatch when chaser-oxide backend active
Feature-Gated Backends
The following require explicit cargo features (the published binary always ships the stable default):
- chromiumoxide-backend (default): current stable
- chromey-backend: fresher CDP definitions, adblock, fingerprint crate (drop-in, same use chromiumoxide)
- chaser-backend / stealth feature: protocol-level stealth, ChaserPage human input (compile with --features stealth)
- ferrous-backend: ergonomic locator/wait API (launch path experimental)
To build with an alternative:
cargo install --path cli --no-default-features --features chromey-backend
# or for full stealth
cargo install --path cli --no-default-features --features stealth
See also the audit and superpowers plans for the "dependency/stealth refresh" item.
Trade-off: stealth backends may lag the main chromiumoxide feature surface or have different perf characteristics. The daemon handlers/refs/viewer remain unchanged regardless of backend.
How It Works
- The CLI parses commands and sends them to a local daemon over a loopback HTTP channel.
- The daemon maintains the browser lifecycle, page/frame routing, network hooks, action timeline, and session manifest state.
- --session <name> creates isolated daemon and browser instances for parallel workflows.
- The MCP stdio server (gsd-browser mcp) is a thin, high-fidelity adapter over the exact same daemon client used by the CLI.
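Session isolation is what makes parallel workers practical. A small wrapper keeps the session name attached to every command. This sketch assumes --session is accepted before the subcommand, as the --stealth examples suggest for global flags. BROWSER_CLI and in_session are hypothetical names.

```shell
# Hypothetical helper variable (not a gsd-browser setting).
BROWSER_CLI="${BROWSER_CLI:-gsd-browser}"

# Run any gsd-browser command inside the named session.
# Placing --session before the subcommand is an assumption.
in_session() {
  session="$1"; shift
  "$BROWSER_CLI" --session "$session" "$@"
}
```

Two workers could then run side by side with in_session worker-a navigate https://example.com & followed by in_session worker-b navigate https://example.org & and a final wait.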
For AI Agents
Recommended 2026+ path: Connect via the MCP server (gsd-browser mcp). It gives you automatic discovery of 50+ tools, resources, and prompts with rich envelopes and best-practice guidance. See the dedicated sections at the top of this README, plus docs/mcp.md and especially docs/AGENT-BEST-PRACTICES.md.
When using the CLI directly (or for reference):
- The daemon auto-starts. You almost never need gsd-browser daemon start.
- gsd-browser daemon health reports the current session state and does not auto-start the daemon.
- Use --json when you need structured output.
- Prefer snapshot then click-ref or fill-ref for stable interaction, and re-snapshot after page changes. (MCP: read the latest-snapshot resource.)
- Use assert and batch when you need deterministic pass/fail automation.
- find-best and act cover 15 built-in semantic intents for common navigation, form, dialog, auth, and pagination actions.
- The live viewer + annotations + recordings + human takeover are first-class superpowers for collaborative or auditable work.
- Read SKILL.md for the full command reference and workflow patterns (this is the source of truth for MCP tool semantics).
- Install the curated gsd-browser-skill/ pack (via the main installer) for coding agents.
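The advice to re-snapshot after page changes can be backed by a cheap guard in agent scripts. The helpers below split a ref such as @v2:e3 with plain parameter expansion. Reading the v part as the snapshot version is an inference from the ref examples in this README, not a documented format, and the function names are hypothetical.

```shell
# Extract the version number from a ref: "@v2:e3" -> "2".
# Treating "v2" as the snapshot version is an assumption.
ref_version() {
  rest="${1#@v}"
  printf '%s\n' "${rest%%:*}"
}

# Extract the element id from a ref: "@v2:e3" -> "e3".
ref_element() {
  printf '%s\n' "${1#*:}"
}

# Succeed only when the ref carries the given snapshot version.
ref_is_current() {
  [ "$(ref_version "$1")" = "$2" ]
}
```

A script that tracks the latest snapshot version can call ref_is_current @v1:e1 2 before click-ref and take a fresh snapshot when the check fails.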
License
Licensed under either of:
at your option.
source: gsd-browser/README.md