Snapshots
The snapshot command returns a compact accessibility tree with refs for element interaction.
Options
Filter output to reduce size:
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -i -C # Include cursor-interactive elements
agent-browser snapshot -c # Compact (remove empty elements)
agent-browser snapshot -d 3 # Limit depth to 3 levels
agent-browser snapshot -s "#main" # Scope to CSS selector
agent-browser snapshot -i -c -d 5 # Combine options
| Option | Description |
|---|---|
| -i, --interactive | Only interactive elements (buttons, links, inputs) |
| -C, --cursor | Include cursor-interactive elements (cursor:pointer, onclick, tabindex) |
| -c, --compact | Remove empty structural elements |
| -d, --depth | Limit tree depth |
| -s, --selector | Scope to CSS selector |
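When the command is driven from a script, the flags in the table can be assembled programmatically. The following is a minimal Python sketch: the function name snapshot_args and its keyword parameters are hypothetical, and they map only onto the flags documented above.

```python
from typing import List, Optional


def snapshot_args(
    interactive: bool = False,
    cursor: bool = False,
    compact: bool = False,
    depth: Optional[int] = None,
    selector: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `agent-browser snapshot` (hypothetical helper)."""
    args = ["agent-browser", "snapshot"]
    if interactive:
        args.append("-i")  # interactive elements only
    if cursor:
        args.append("-C")  # include cursor-interactive elements
    if compact:
        args.append("-c")  # remove empty structural elements
    if depth is not None:
        args += ["-d", str(depth)]  # limit tree depth
    if selector is not None:
        args += ["-s", selector]  # scope to a CSS selector
    return args


print(snapshot_args(interactive=True, compact=True, depth=5))
# prints ['agent-browser', 'snapshot', '-i', '-c', '-d', '5']
```

Passing the resulting list to subprocess.run without a shell avoids quoting problems with selectors such as #main.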
Cursor-interactive elements
Many modern web apps use custom clickable elements (divs, spans) instead of standard buttons or links.
The -C flag detects these by looking for:
- cursor: pointer CSS style
- onclick attribute or handler
- tabindex attribute (keyboard focusable)
agent-browser snapshot -i -C
# Output includes:
# @e1 [button] "Submit"
# @e2 [link] "Learn more"
# Cursor-interactive elements:
# @e3 [clickable] "Menu Item" [cursor:pointer, onclick]
# @e4 [clickable] "Card" [cursor:pointer]
Output format
The default text output is compact and AI-friendly:
agent-browser snapshot -i
# Output:
# @e1 [heading] "Example Domain" [level=1]
# @e2 [button] "Submit"
# @e3 [input type="email"] placeholder="Email"
# @e4 [link] "Learn more"
Using refs
Refs from the snapshot map directly to commands:
agent-browser click @e2 # Click the Submit button
agent-browser fill @e3 "a@b.com" # Fill the email input
agent-browser get text @e1 # Get heading text
Ref lifecycle
Refs are invalidated when the page changes. Always re-snapshot after navigation or DOM updates:
agent-browser click @e4 # Navigates to new page
agent-browser snapshot -i # Get fresh refs
agent-browser click @e1 # Use new refs
Annotated screenshots
For visual context alongside text snapshots, use screenshot --annotate to overlay numbered labels on interactive elements. Each label [N] maps to ref @eN:
agent-browser screenshot --annotate ./page.png
# -> Screenshot saved to ./page.png
# [1] @e1 button "Submit"
# [2] @e2 link "Home"
# [3] @e3 textbox "Email"
agent-browser click @e2
Annotated screenshots also cache refs, so you can interact with elements immediately. This is useful when the text snapshot is insufficient, for example for unlabeled icons, canvas content, or visual layout verification.
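The legend printed by screenshot --annotate can be turned into a lookup table from label number to ref. This Python sketch assumes every legend line has exactly the shape shown in the example above ([N] @eN role "name"); real output may differ, and the names parse_legend and LegendEntry are hypothetical.

```python
import re
from typing import Dict, NamedTuple


class LegendEntry(NamedTuple):
    ref: str   # e.g. "@e2"
    role: str  # e.g. "link"
    name: str  # e.g. "Home"


# Assumed line shape, taken from the example output: [2] @e2 link "Home"
LEGEND_LINE = re.compile(r'^\s*\[(\d+)\]\s+(@e\d+)\s+(\S+)\s+"(.*)"\s*$')


def parse_legend(output: str) -> Dict[int, LegendEntry]:
    """Map each numeric label to its ref, role, and name."""
    legend: Dict[int, LegendEntry] = {}
    for line in output.splitlines():
        match = LEGEND_LINE.match(line)
        if match:  # lines that are not legend entries are skipped
            label, ref, role, name = match.groups()
            legend[int(label)] = LegendEntry(ref, role, name)
    return legend


# Sample copied from the example output above, without the comment prefix.
sample = """-> Screenshot saved to ./page.png
[1] @e1 button "Submit"
[2] @e2 link "Home"
[3] @e3 textbox "Email"
"""

legend = parse_legend(sample)
print(legend[2].ref)  # prints @e2
```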
Best practices
- Use -i to reduce output to actionable elements
- Re-snapshot after page changes to get updated refs
- Scope with -s for specific page sections
- Use -d to limit depth on complex pages
- Use screenshot --annotate when visual context is needed alongside refs
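The first two practices (snapshot with -i, then re-snapshot after a page change) can be combined in a small wrapper. This is a Python sketch under assumptions: text output lines are taken to look like the Output format example (@e2 [button] "Submit"), and the helper names parse_refs, run_cli, and click_by_name, as well as the injectable run parameter, are hypothetical. Only the snapshot -i and click commands from this page are invoked.

```python
import re
import subprocess
from typing import Callable, Dict, List

# Assumed line shape from the example output: @e2 [button] "Submit"
REF_LINE = re.compile(r'(@e\d+)\s+\[([^\]]+)\]\s+"([^"]*)"')


def parse_refs(snapshot_text: str) -> Dict[str, str]:
    """Return {name: ref} for snapshot lines that carry a quoted name."""
    refs: Dict[str, str] = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.search(line)
        if match:
            ref, _role, name = match.groups()
            refs.setdefault(name, ref)  # keep the first ref if a name repeats
    return refs


def run_cli(args: List[str]) -> str:
    """Run agent-browser with the given arguments and return its stdout."""
    result = subprocess.run(
        ["agent-browser", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def click_by_name(
    name: str, run: Callable[[List[str]], str] = run_cli
) -> Dict[str, str]:
    """Click the element with the given name, then return fresh refs."""
    refs = parse_refs(run(["snapshot", "-i"]))
    if name not in refs:
        raise KeyError(f"no element named {name!r} in snapshot")
    run(["click", refs[name]])
    # Refs from before the click may be stale, so take a new snapshot.
    return parse_refs(run(["snapshot", "-i"]))
```

Because run is a parameter, the flow can be exercised with a stub that returns canned snapshot text, without launching a browser.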
JSON output
For programmatic parsing in scripts:
agent-browser snapshot --json
# {"success":true,"data":{"snapshot":"...","refs":{"e1":{"role":"heading","name":"Title"},...}}}
Note: JSON uses more tokens than text output. The default text format is preferred for AI agents.
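A script can consume the JSON form with a standard JSON parser. This Python sketch assumes the payload has the shape shown above (success, data.snapshot, and data.refs keyed by ref id with role and name fields). The elided parts of the example are not filled in, the sample is hand-written rather than real tool output, and the function name refs_by_role is hypothetical.

```python
import json
from typing import Dict, List


def refs_by_role(payload: str) -> Dict[str, List[str]]:
    """Group refs (as "@eN" strings) by role from snapshot --json output."""
    doc = json.loads(payload)
    if not doc.get("success"):
        raise RuntimeError("snapshot did not report success")
    grouped: Dict[str, List[str]] = {}
    for ref_id, info in doc["data"]["refs"].items():
        role = info.get("role", "unknown")
        grouped.setdefault(role, []).append(f"@{ref_id}")
    return grouped


# Hand-written sample in the documented shape, not real tool output.
sample = json.dumps({
    "success": True,
    "data": {
        "snapshot": "...",
        "refs": {
            "e1": {"role": "heading", "name": "Title"},
            "e2": {"role": "button", "name": "Submit"},
        },
    },
})

print(refs_by_role(sample))
# prints {'heading': ['@e1'], 'button': ['@e2']}
```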