Browser

Deploy a Browser instance

BrowserInstance

BrowserInstance is a lightweight Chromium instance that supports interactive streaming, computer actions, Playwright control over CDP, and saving and loading auth states. We recommend this instance type when your task is confined to the browser.

  • Fastest startup time
  • 1x compute cost

Start a browser instance

Python
# Assumes `client` is an already initialized Scrapybara client
instance = client.start_browser()

Available actions

get_cdp_url

Get the Playwright CDP URL

Python
cdp_url = instance.get_cdp_url().cdp_url
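One way to use the CDP URL is to attach Playwright to the running browser. The sketch below assumes the Playwright Python package and its `connect_over_cdp` method. `open_page_over_cdp` is a hypothetical helper, not part of the Scrapybara SDK. It takes an already started Playwright object, so nothing connects until you call it.

```python
def open_page_over_cdp(playwright, cdp_url, url):
    """Attach to a running browser over CDP, open `url`, and return the page title.

    `playwright` is the object yielded by `sync_playwright()`.
    """
    browser = playwright.chromium.connect_over_cdp(cdp_url)
    # Reuse the existing browser context when there is one, so cookies and
    # other session state already in the instance are visible to Playwright.
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.new_page()
    page.goto(url)
    return page.title()


# Usage (requires `pip install playwright` and a running instance):
#
#   from playwright.sync_api import sync_playwright
#
#   cdp_url = instance.get_cdp_url().cdp_url
#   with sync_playwright() as p:
#       print(open_page_over_cdp(p, cdp_url, "https://example.com"))
```

Reusing the first existing context is a design choice for this sketch; create a new context instead if you want an isolated session.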

save_auth

Save the browser auth state

Python
auth_state_id = instance.browser.save_auth(name="default").auth_state_id

authenticate

Authenticate the browser using a saved auth state

Python
instance.browser.authenticate(auth_state_id=auth_state_id)
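Taken together, save_auth and authenticate let a login performed in one session be reused in a later one. A minimal sketch using only the calls shown above: `save_login` and `start_authenticated_browser` are hypothetical helper names, and `client` is assumed to be an initialized Scrapybara client.

```python
def save_login(instance, name="default"):
    """Save the instance's current browser auth state and return its ID."""
    return instance.browser.save_auth(name=name).auth_state_id


def start_authenticated_browser(client, auth_state_id):
    """Start a fresh browser instance and load a previously saved auth state."""
    instance = client.start_browser()
    instance.browser.authenticate(auth_state_id=auth_state_id)
    return instance


# Usage:
#   auth_state_id = save_login(instance)          # after logging in once
#   later = start_authenticated_browser(client, auth_state_id)
```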

screenshot

Capture the current desktop as a base64-encoded image

Python
base_64_image = instance.screenshot().base_64_image
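Because the screenshot is returned as a base64 string, it has to be decoded before it can be written to disk or opened in an image viewer. A minimal sketch using only the standard library: `save_base64_image` is a hypothetical helper, and the `.png` extension in the usage comment is an assumption about the image format.

```python
import base64
from pathlib import Path


def save_base64_image(base_64_image, path):
    """Decode a base64 string, write the raw bytes to `path`, and return the byte count."""
    data = base64.b64decode(base_64_image)
    Path(path).write_bytes(data)
    return len(data)


# Usage with a live instance (file extension is an assumption):
#   save_base64_image(instance.screenshot().base_64_image, "screenshot.png")
```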

get_stream_url

Get the interactive stream URL

Python
stream_url = instance.get_stream_url().stream_url

computer

Perform computer actions with the mouse and keyboard

move_mouse

Move mouse cursor to specific coordinates

  • coordinates (array, required): [x, y] coordinates to move to
  • hold_keys (array): List of modifier keys to hold during the action

Python

Move mouse:
instance.computer(action="move_mouse", coordinates=[100, 200])

Move mouse while holding shift:
instance.computer(action="move_mouse", coordinates=[100, 200], hold_keys=["shift"])

click_mouse

Perform a mouse click at current position or specified coordinates

  • button (string, required): Mouse button to click (“left”, “right”, “middle”, “back”, “forward”)
  • click_type (string, defaults to click): Type of click action (“down”, “up”, “click”)
  • coordinates (array): [x, y] coordinates to click at
  • num_clicks (number, defaults to 1): Number of clicks
  • hold_keys (array): List of modifier keys to hold during the action

Python

Left click at current position:
instance.computer(action="click_mouse", button="left")

Right click at coordinates:
instance.computer(action="click_mouse", button="right", coordinates=[300, 400])

Mouse down:
instance.computer(action="click_mouse", button="left", click_type="down")

Double click at coordinates:
instance.computer(action="click_mouse", button="left", num_clicks=2, coordinates=[500, 300])
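Since button and click_type accept only a fixed set of strings, it can help to validate them locally before sending the action. The sketch below is one way to do that. `click_args` is a hypothetical helper, and the allowed values are copied from the parameter descriptions above.

```python
VALID_BUTTONS = {"left", "right", "middle", "back", "forward"}
VALID_CLICK_TYPES = {"down", "up", "click"}


def click_args(button, coordinates=None, click_type="click", num_clicks=1, hold_keys=None):
    """Build keyword arguments for a click_mouse action, rejecting unknown values."""
    if button not in VALID_BUTTONS:
        raise ValueError(f"unknown button: {button!r}")
    if click_type not in VALID_CLICK_TYPES:
        raise ValueError(f"unknown click_type: {click_type!r}")
    args = {
        "action": "click_mouse",
        "button": button,
        "click_type": click_type,
        "num_clicks": num_clicks,
    }
    # coordinates and hold_keys are only included when given, so a click
    # without coordinates still happens at the current cursor position.
    if coordinates is not None:
        args["coordinates"] = list(coordinates)
    if hold_keys:
        args["hold_keys"] = list(hold_keys)
    return args


# Usage:
#   instance.computer(**click_args("left", coordinates=(500, 300), num_clicks=2))
```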

drag_mouse

Click and drag from current position to specified coordinates

  • path (array, required): List of [x, y] coordinate pairs defining the drag path
  • hold_keys (array): List of modifier keys to hold during the action

Python

Drag to coordinates:
instance.computer(action="drag_mouse", path=[[100, 200], [300, 400]])
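Because path is a list of [x, y] pairs, a drag can pass through intermediate points as well as the two endpoints. The sketch below generates an evenly spaced straight-line path. `linear_path` is a hypothetical helper, and whether intermediate points change how the drag is performed is not stated here, so treat it as an illustration of the parameter's shape.

```python
def linear_path(start, end, steps=10):
    """Return steps + 1 evenly spaced [x, y] points from start to end, inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        [round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps)]
        for i in range(steps + 1)
    ]


# Usage:
#   instance.computer(action="drag_mouse", path=linear_path((100, 200), (300, 400)))
```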

scroll

Scroll horizontally and/or vertically

  • coordinates (array): [x, y] coordinates to scroll at
  • delta_x (number, defaults to 0): Horizontal scroll amount
  • delta_y (number, defaults to 0): Vertical scroll amount
  • hold_keys (array): List of modifier keys to hold during the action

Python

Scroll down:
instance.computer(action="scroll", coordinates=[100, 100], delta_x=0, delta_y=200)

Scroll right:
instance.computer(action="scroll", coordinates=[100, 100], delta_x=200, delta_y=0)
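For long pages it can be useful to scroll in several smaller steps, for example to take a screenshot between steps. A minimal sketch: `scroll_in_steps` is a hypothetical helper built on the scroll action above. It assumes a negative delta_y scrolls up, which the examples imply but do not state.

```python
def scroll_in_steps(instance, coordinates, total_delta_y, step=200):
    """Scroll vertically by total_delta_y in chunks of at most `step`.

    Returns the number of scroll actions issued.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # Assumption: positive delta_y scrolls down, negative scrolls up.
    direction = 1 if total_delta_y >= 0 else -1
    remaining = abs(total_delta_y)
    calls = 0
    while remaining > 0:
        chunk = min(step, remaining)
        instance.computer(
            action="scroll",
            coordinates=list(coordinates),
            delta_x=0,
            delta_y=direction * chunk,
        )
        remaining -= chunk
        calls += 1
    return calls


# Usage:
#   scroll_in_steps(instance, (100, 100), total_delta_y=1000)
```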

press_key

Press a key or combination of keys. Scrapybara supports keys defined by X keysyms. Common aliases are also supported:

  • alt → Alt_L
  • ctrl, control → Control_L
  • meta → Meta_L
  • super → Super_L
  • shift → Shift_L
  • enter, return → Return

  • keys (array, required): List of keys to press
  • duration (number): Time to hold keys in seconds

Python

Press ctrl+c:
instance.computer(action="press_key", keys=["ctrl", "c"])

Hold shift for 2 seconds:
instance.computer(action="press_key", keys=["shift"], duration=2)

Press enter/return:
instance.computer(action="press_key", keys=["Return"])
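The alias list above can be expressed as a lookup table. The sketch below is purely illustrative, since the aliases are already accepted by press_key and you do not need to resolve them yourself. `resolve_keys` is a hypothetical helper, and matching aliases case-insensitively is an assumption.

```python
KEY_ALIASES = {
    "alt": "Alt_L",
    "ctrl": "Control_L",
    "control": "Control_L",
    "meta": "Meta_L",
    "super": "Super_L",
    "shift": "Shift_L",
    "enter": "Return",
    "return": "Return",
}


def resolve_keys(keys):
    """Replace known aliases with their X keysym names; leave other keys untouched."""
    # Assumption: aliases match regardless of case.
    return [KEY_ALIASES.get(key.lower(), key) for key in keys]
```

For example, `resolve_keys(["ctrl", "c"])` yields `["Control_L", "c"]`.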

type_text

Type text into the active window

  • text (string, required): Text to type
  • hold_keys (array): List of modifier keys to hold while typing

Python

Type text:
instance.computer(action="type_text", text="Hello world")
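Since type_text types into whatever currently has focus, a common pattern is to click the target field first and then type. A minimal sketch combining the actions documented on this page: `fill_field` is a hypothetical helper, and the coordinates in the usage comment are placeholders.

```python
def fill_field(instance, coordinates, text, submit=False):
    """Click a field to focus it, type into it, and optionally press Return."""
    instance.computer(action="click_mouse", button="left", coordinates=list(coordinates))
    instance.computer(action="type_text", text=text)
    if submit:
        instance.computer(action="press_key", keys=["Return"])


# Usage (placeholder coordinates):
#   fill_field(instance, (400, 300), "Hello world", submit=True)
```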

wait

Wait for a specified duration

  • duration (number, required): Time to wait in seconds

Python

Wait for 3 seconds:
instance.computer(action="wait", duration=3)

take_screenshot

Take a screenshot of the desktop

Python
screenshot = instance.computer(action="take_screenshot").base64_image

get_cursor_position

Get current mouse cursor coordinates

Python
cursor_position = instance.computer(action="get_cursor_position").output

stop

Stop the instance

Python
instance.stop()
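To make sure an instance is stopped even when a task raises an exception, the start and stop calls can be wrapped in a context manager. A minimal sketch using the standard library: `browser_session` is a hypothetical helper, and `client` is assumed to be an initialized Scrapybara client.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(client, **start_kwargs):
    """Start a browser instance and always stop it when the block exits."""
    instance = client.start_browser(**start_kwargs)
    try:
        yield instance
    finally:
        instance.stop()


# Usage:
#   with browser_session(client) as instance:
#       instance.computer(action="wait", duration=1)
```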

pause

Pause the instance

Python
instance.pause()

resume

Resume the instance

Python

Resume with default timeout:
instance.resume()

Resume with custom timeout:
instance.resume(timeout_hours=2.5)

Compatible tools

  • ComputerTool

Screen resolution

By default, the Browser instance runs at 1024x768 resolution. You can specify a custom resolution when starting the instance:

Python
instance = client.start_browser(resolution=[1920, 1080])
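Because mouse actions take pixel coordinates, the resolution determines which coordinates are on screen. The sketch below keeps computed coordinates inside the visible area. `clamp_to_screen` and `screen_center` are hypothetical helpers, and they assume zero-based pixel coordinates with the origin at the top-left corner, which this page does not state.

```python
DEFAULT_RESOLUTION = (1024, 768)  # default stated above


def clamp_to_screen(x, y, resolution=DEFAULT_RESOLUTION):
    """Clamp (x, y) into the visible area for the given [width, height]."""
    # Assumption: valid coordinates run from 0 to width - 1 and 0 to height - 1.
    width, height = resolution
    return [min(max(x, 0), width - 1), min(max(y, 0), height - 1)]


def screen_center(resolution=DEFAULT_RESOLUTION):
    """Return the [x, y] point at the middle of the screen."""
    width, height = resolution
    return [width // 2, height // 2]


# Usage:
#   instance.computer(action="move_mouse", coordinates=screen_center([1920, 1080]))
```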