Browser automation for Go v1.2.0

Scout

A single-binary MCP server that gives AI agents a browser. 66 tools for navigation, form filling, data extraction, screenshots, and DOM diffing — built on pure Chrome DevTools Protocol.

brew install felixgeelhaar/tap/scout

Built different.

Scout is not another wrapper around DevTools Protocol libraries. It is the protocol library.

Pure CDP. Zero wrappers.

Direct WebSocket connection to Chrome DevTools Protocol. No rod, no chromedp, no Playwright dependency tree. One binary, one connection, full control.
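
To make the "direct connection" idea concrete, here is a minimal sketch of the JSON envelope that the Chrome DevTools Protocol carries over its WebSocket. It only builds and parses frames and opens no socket. The type and function names (Command, Reply, EncodeCommand, DecodeReply) are illustrative assumptions, not Scout's internal API. Page.navigate and its frameId result come from the CDP Page domain.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Command is the request envelope CDP expects: an id, a method, and params.
type Command struct {
	ID     int            `json:"id"`
	Method string         `json:"method"`
	Params map[string]any `json:"params,omitempty"`
}

// Reply covers both responses (id set) and events (method set, no id).
type Reply struct {
	ID     int             `json:"id"`
	Method string          `json:"method"`
	Result json.RawMessage `json:"result"`
	Error  *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// EncodeCommand serializes one command, ready to send as a WebSocket text frame.
func EncodeCommand(id int, method string, params map[string]any) ([]byte, error) {
	return json.Marshal(Command{ID: id, Method: method, Params: params})
}

// DecodeReply parses one incoming frame.
func DecodeReply(frame []byte) (Reply, error) {
	var r Reply
	err := json.Unmarshal(frame, &r)
	return r, err
}

// IsEvent reports whether the frame is an unsolicited event rather than a response.
func (r Reply) IsEvent() bool { return r.ID == 0 && r.Method != "" }

func main() {
	frame, err := EncodeCommand(1, "Page.navigate", map[string]any{"url": "https://example.com"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(frame))

	reply, err := DecodeReply([]byte(`{"id":1,"result":{"frameId":"F1"}}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(reply.ID, string(reply.Result), reply.IsEvent())
}
```

A client matches each response to its request by id. Frames that arrive without an id are events and are dispatched by method name.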

Agent-first design.

Structured JSON output, DOM diffing that saves 50-80% tokens, content distillation at 5 levels, token budgets, and semantic form filling. Built for LLMs, not humans.

66-tool MCP server.

Single Go binary. No Node.js runtime, no Python interpreter. Run scout mcp serve and your AI agent has a browser.

Gin-like middleware.

Compose retry, timeout, circuit breaker, rate limit, and bulkhead patterns exactly like Gin HTTP middleware. If you know Gin, you know Scout.

Four ways to use Scout.

A Go library, an agent package, a CLI, and an MCP server. Same engine, different interfaces.

For Go developers building automation scripts and pipelines

Use the core API when you need full control: named tasks, middleware composition (retry, timeout, circuit breaker, stealth), grouped workflows, and the familiar Gin-style Engine → Context → HandlerFunc pattern. Ideal for scrapers, testing tools, monitoring systems, and CI pipelines.

// Gin-like engine with middleware composition
engine := browse.Default(browse.WithHeadless(true))
engine.MustLaunch()
defer engine.Close()

// Add resilience middleware
engine.Use(middleware.Timeout(30 * time.Second))
engine.Use(middleware.Retry(middleware.RetryConfig{MaxAttempts: 3}))
engine.Use(middleware.Stealth())

// Define and run tasks
engine.Task("scrape-prices", func(c *browse.Context) {
    c.MustNavigate("https://shop.example.com")
    c.El("input[name=q]").MustInput("mechanical keyboard")
    c.El("button[type=submit]").MustClick()

    prices := c.ElAll(".product .price").MustTexts()
    c.Set("prices", prices)
})

engine.Run("scrape-prices")

For AI agent developers who need structured browser interaction

Use the agent API when building autonomous agents that browse the web. Every method returns JSON-serializable structs, content is auto-truncated for LLM context windows, all operations auto-wait, and the session is goroutine-safe. DOM diffing, semantic form filling, annotated screenshots, network capture, and token budgets are built in — your agent thinks less and acts more.

// Session-based API optimized for AI agents
session, err := agent.NewSession(agent.SessionConfig{Headless: true})
if err != nil {
    log.Fatal(err) // without a session nothing below can run
}
defer session.Close()

// Navigate and observe — structured JSON, not HTML soup
session.Navigate("https://app.example.com")
obs, _ := session.Observe()        // links, inputs, buttons, text

// Semantic form filling — no CSS selectors needed
session.FillFormSemantic(map[string]string{
    "Email":    "user@example.com",
    "Password": "secret123",
})

// DOM diff — only what changed (saves 50-80% tokens)
session.Click("#login")
_, diff, _ := session.ObserveDiff()
// diff.Added: [{Tag:"div", ID:"dashboard", Text:"Welcome!"}]

// Visual grounding — click by label number, not selector
result, _ := session.AnnotatedScreenshot()
session.ClickLabel(7)  // click element labeled [7]

// Network capture — read API responses directly
session.EnableNetworkCapture("/api/")
captured := session.CapturedRequests("/api/users")
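
The token savings from DOM diffing come from sending only the elements that changed between two observations. The sketch below shows one simple way to compute such a diff over flat element lists. It is an illustration of the idea under stated assumptions, not Scout's algorithm: the Element fields follow the diff.Added sample above, while Diff and DiffElements are hypothetical names.

```go
package main

import "fmt"

// Element is a flattened view of one DOM node, as in the diff.Added sample.
type Element struct {
	Tag, ID, Text string
}

// Diff holds what appeared and what disappeared between two observations.
type Diff struct {
	Added, Removed []Element
}

// DiffElements compares two observations as multisets of elements,
// preserving the order in which added and removed elements were seen.
func DiffElements(before, after []Element) Diff {
	remaining := make(map[Element]int, len(before))
	for _, e := range before {
		remaining[e]++
	}

	var d Diff
	for _, e := range after {
		if remaining[e] > 0 {
			remaining[e]-- // present in both snapshots: unchanged
			continue
		}
		d.Added = append(d.Added, e)
	}
	// Anything still counted in remaining was not matched in after.
	for _, e := range before {
		if remaining[e] > 0 {
			remaining[e]--
			d.Removed = append(d.Removed, e)
		}
	}
	return d
}

func main() {
	before := []Element{
		{Tag: "input", ID: "email"},
		{Tag: "button", ID: "login", Text: "Log in"},
	}
	after := []Element{
		{Tag: "div", ID: "dashboard", Text: "Welcome!"},
	}
	d := DiffElements(before, after)
	fmt.Printf("added=%d removed=%d\n", len(d.Added), len(d.Removed))
}
```

An element whose text changes shows up here as one removal plus one addition. A production diff would more likely key on a stable node identity and report the change as a modification.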

For quick tasks, shell scripts, and CI pipelines

Use the CLI for one-shot operations without writing Go code. Each command launches a browser, navigates, performs one action, outputs the result, and exits. Pipe JSON output to jq, save screenshots in CI, extract data in shell scripts, or quickly inspect a page before writing automation code.

# Navigate and observe
$ scout observe https://example.com
{
  "title": "Example Domain",
  "links": ["More information..."],
  "inputs": [],
  "buttons": []
}

# Get page as markdown
$ scout markdown https://news.ycombinator.com

# Screenshot
$ scout screenshot https://github.com --output gh.png
Saved screenshot to gh.png (284519 bytes)

# Extract data
$ scout extract https://example.com h1
Example Domain

# Discover form fields
$ scout form discover https://login.example.com

# Detect frontend frameworks
$ scout frameworks https://react.dev
react
nextjs
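
Because the CLI prints JSON, its output can be consumed from Go as easily as from jq. The sketch below decodes the scout observe sample shown above. The struct is inferred from that sample only: the element type of inputs and buttons is not shown there, so they are kept as raw JSON, and the names Observation and ParseObservation are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Observation mirrors the keys in the sample output of `scout observe`.
// Inputs and Buttons stay as raw JSON because the sample shows them empty.
type Observation struct {
	Title   string            `json:"title"`
	Links   []string          `json:"links"`
	Inputs  []json.RawMessage `json:"inputs"`
	Buttons []json.RawMessage `json:"buttons"`
}

// ParseObservation decodes one observation document.
func ParseObservation(data []byte) (Observation, error) {
	var o Observation
	err := json.Unmarshal(data, &o)
	return o, err
}

// sample is the output shown above; in a pipeline it would arrive on stdin.
const sample = `{
  "title": "Example Domain",
  "links": ["More information..."],
  "inputs": [],
  "buttons": []
}`

func main() {
	o, err := ParseObservation([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s: %d links, %d inputs, %d buttons\n",
		o.Title, len(o.Links), len(o.Inputs), len(o.Buttons))
	// prints "Example Domain: 1 links, 0 inputs, 0 buttons"
}
```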

For giving AI assistants (Claude, Cursor, etc.) browser superpowers

Use the MCP server to let your AI assistant browse the web, fill forms, extract data, and take screenshots, all via the standard Model Context Protocol. A single Go binary with 66 tools and zero runtime dependencies. Install it and add one entry to your MCP config. The AI decides which tools to call; Scout handles the browser.

# Add to Claude Code
$ claude mcp add scout -- scout mcp serve

# Or configure in claude_desktop_config.json / mcp.json
{
  "mcpServers": {
    "scout": {
      "command": "scout",
      "args": ["mcp", "serve"]
    }
  }
}

# 66 tools available, including:
#
# Navigation:  navigate, observe, observe_diff, observe_with_budget,
#              hybrid_observe
# Interaction: click, click_label, type, hover, double_click,
#              right_click, select_option, scroll_to, scroll_by,
#              focus, drag_drop, dispatch_event,
#              select_by_prompt, batch, find_by_coordinates
# Forms:       fill_form, fill_form_semantic, discover_form
# Extraction:  extract, extract_all, extract_table,
#              markdown, readable_text, accessibility_tree
# Capture:     screenshot, annotated_screenshot, pdf
# Network:     enable_network_capture, network_requests
# Tabs:        open_tab, switch_tab, close_tab, list_tabs
# Frames:      switch_to_frame, switch_to_main_frame
# Frameworks:  wait_spa, detect_frameworks,
#              component_state, app_state
# Playback:    start_recording, stop_recording,
#              save_playbook, replay_playbook
# Tracing:     start_trace, stop_trace
# Performance: web_vitals
# Utility:     has_element, wait_for, configure
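
Under the hood, an MCP client invokes any of these tools with a JSON-RPC 2.0 tools/call request. The sketch below builds such a request for the navigate tool. The envelope follows the Model Context Protocol, but the argument key "url" is an assumption about Scout's tool schema, and ToolCall and NewToolCall are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall is the JSON-RPC 2.0 envelope for an MCP tools/call request.
type ToolCall struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  ToolCallParams `json:"params"`
}

// ToolCallParams names the tool and carries its arguments.
type ToolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// NewToolCall serializes a request that asks the server to run one tool.
func NewToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(ToolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  ToolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	// "url" is an assumed argument name; check the tool's published schema.
	req, err := NewToolCall(1, "navigate", map[string]any{"url": "https://example.com"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(req))
}
```

In practice the assistant's MCP client builds these requests for you. The sketch is mainly useful when testing the server by hand over stdio.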

Five levels of content distillation.

Pages return megabytes of HTML. Scout gives your agent exactly the right amount of information.

Method               Output size   Best for
Observe()            ~2-5 KB       Deciding what to click or fill
ObserveDiff()        ~0.5-2 KB     Seeing only what changed after an action
Markdown()           ~2-8 KB       Reading page content in a compact format
ReadableText()       ~1-4 KB       Main article or body text only
AccessibilityTree()  ~1-4 KB       Compact semantic element tree
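
One way to use the table is to check which methods are guaranteed to fit a given token budget. The sketch below does that with the upper bounds listed above. It is a back-of-the-envelope illustration, not Scout's token budget feature: the 4 bytes per token ratio is a rough rule of thumb for English text, and Level and FitsBudget are hypothetical names.

```go
package main

import "fmt"

// Level pairs a method with the upper bound of its output size from the table.
type Level struct {
	Method string
	MaxKB  int
}

// Levels copies the upper bounds from the table above.
var Levels = []Level{
	{"Observe()", 5},
	{"ObserveDiff()", 2},
	{"Markdown()", 8},
	{"ReadableText()", 4},
	{"AccessibilityTree()", 4},
}

// bytesPerToken is an assumed average; real tokenizers vary by content.
const bytesPerToken = 4

// FitsBudget lists the methods whose worst-case output fits the token budget.
func FitsBudget(budgetTokens int) []string {
	budgetBytes := budgetTokens * bytesPerToken
	var fits []string
	for _, l := range Levels {
		if l.MaxKB*1024 <= budgetBytes {
			fits = append(fits, l.Method)
		}
	}
	return fits
}

func main() {
	for _, budget := range []int{512, 1024, 2048} {
		fmt.Printf("%d tokens: %v\n", budget, FitsBudget(budget))
	}
}
```

Under these assumptions a 512-token budget only admits ObserveDiff(), while 2048 tokens admits all five methods.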

Everything, composed.

Organized by layer. Each feature is a building block that composes with the rest.

Core Engine

Pure CDP over WebSocket
Gin-like Engine / Context / Group
Auto-wait on navigation and elements
CSS selectors with chaining
Page pool for concurrent tasks
Remote CDP for cloud browsers
Table extraction to structured data
PDF generation and screenshots
Shadow DOM piercing with flattened cache
Iframe switching for nested content

Agent Package

Structured JSON output
DOM diffing (50-80% token savings)
Token budgets for observation
Semantic form fill by field name
Annotated screenshots with labels
Visual grounding via click_label
Network capture for XHR/fetch
Persistent profiles (cookies + storage)
Screenshot auto-compress for LLMs
5-level content distillation
NL selectors by prompt text
Vision hybrid mode with bounding boxes
Batch operations for multi-action
Trace export to zip files
Web vitals LCP/CLS/INP extraction

Middleware

Retry with backoff
Timeout per task
Circuit breaker (fortify)
Rate limit requests
Bulkhead isolation
Stealth mode anti-detection
Auth (Bearer, Basic, Cookie)
Resource blocking (images, fonts)
Viewport and slow motion
Screenshot on error
Stealth v2 canvas/audio/WebRTC noise
UA rotation with 27 realistic agents
Human delay randomized timing

Framework Support

React + Next.js + Gatsby
Vue 2/3 + Nuxt
Angular
Svelte + SvelteKit
Solid + Preact + Lit
Alpine + HTMX + Stimulus
Qwik + Astro + Remix
Ember

Layer stack.

Four layers from user-facing interfaces down to the wire protocol.

Install Scout.

Install the binary, connect to your AI agent, and start browsing. Go library also available.

Homebrew (recommended)

$ brew install felixgeelhaar/tap/scout

Claude Code MCP

$ claude mcp add scout -- scout mcp serve

Claude Desktop / Cursor MCP

{
  "mcpServers": {
    "scout": {
      "command": "scout",
      "args": ["mcp", "serve"]
    }
  }
}

Binary (any platform)

$ curl -fsSL https://raw.githubusercontent.com/felixgeelhaar/scout/main/install.sh | bash

Go library (for developers)

$ go get github.com/felixgeelhaar/scout

Go install (CLI)

$ go install github.com/felixgeelhaar/scout/cmd/scout@latest