OpenClaw X11 tools

xdotool / X11 Access

Real mouse/keyboard automation via xdotool. Useful for bypassing bot detection that blocks CDP (Chrome DevTools Protocol) clicks.

Setup:

Browser container runs an X server on display :1, with socat forwarding it to TCP port 6001
Agent and browser containers share desk-net Docker network
Browser hostname: desk-browser (IP fallback: 172.18.0.2)
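
Display :1 ends up on TCP 6001 because X11 over TCP conventionally listens on port 6000 plus the display number. A minimal sketch of a reachability check built on that rule follows. The function names `x11_port` and `x11_reachable` are hypothetical helpers, not part of OpenClaw, and the check assumes bash (built with `/dev/tcp` support) and the coreutils `timeout` command are available in the agent container.

```shell
# Hypothetical helpers (not part of OpenClaw).

# Print the TCP port for a DISPLAY value such as "desk-browser:1" or "host:1.0".
x11_port() {
  local display
  display=${1##*:}        # drop the host part      -> "1" or "1.0"
  display=${display%%.*}  # drop an optional screen -> "1"
  echo $((6000 + $display))
}

# Return 0 if the X server's TCP port accepts a connection within 2 seconds.
# Assumes the DISPLAY value has a host part (e.g. "desk-browser:1").
x11_reachable() {
  local host port
  host=${1%%:*}
  port=$(x11_port "$1")
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

For example, `x11_reachable desk-browser:1 || echo "X server not reachable"` gives a quick answer before any xdotool call is attempted.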

Usage:

export DISPLAY=desk-browser:1
xdotool getmouselocation
xdotool mousemove 500 300 click 1
xdotool key Return
xdotool type "hello world"

If DNS fails: the container may have been disconnected from desk-net. A timer reconnects it every 5 minutes; otherwise, ask Brektimus to run /data/openclaw/agents/desk/setup-x11.sh

IP fallback: DISPLAY=172.18.0.2:1
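
The hostname-then-IP choice can be automated. The sketch below is one possible approach, not an existing OpenClaw script: `pick_display` is a hypothetical helper, and it assumes `getent` is installed in the agent container to test name resolution.

```shell
# Hypothetical helper: print a DISPLAY value, preferring the hostname and
# falling back to the fixed IP when the name does not resolve.
pick_display() {
  local host fallback_ip
  host=${1:-desk-browser}
  fallback_ip=${2:-172.18.0.2}
  if getent hosts "$host" >/dev/null 2>&1; then
    echo "$host:1"
  else
    echo "$fallback_ip:1"
  fi
}
```

Typical use would be `export DISPLAY=$(pick_display)` at the start of a session. The IP is only valid while the browser container keeps that address on desk-net, so the hostname remains the preferred form.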

X11 Screenshots

Take screenshots of the browser using ImageMagick's import command over X11.

Take a screenshot:

DISPLAY=desk-browser:1 import -window root /tmp/screen.png
cp /tmp/screen.png ./screen.png  # Copy to workspace for reading
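
The two steps above can be wrapped so that a failed capture is not followed by copying a stale file. This is a sketch only: `x11_shot` is a hypothetical helper name, and it defaults DISPLAY to desk-browser:1 when none is set.

```shell
# Hypothetical helper: capture the root window, then copy it into the workspace.
# Usage: x11_shot [name]   (name defaults to "screen")
x11_shot() {
  local name tmp
  name=${1:-screen}
  tmp="/tmp/$name.png"
  # Stop here if the capture fails, so an old file is never copied.
  DISPLAY=${DISPLAY:-desk-browser:1} import -window root "$tmp" || return 1
  # Print the workspace path on success so the caller can pass it to read().
  cp "$tmp" "./$name.png" && echo "./$name.png"
}
```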

Read the screenshot:

read("screen.png")  # Returns the image

Why X11 screenshots?

CDP screenshots miss native browser popups (file dialogs, permission prompts)
X11 captures the actual screen as seen in noVNC

Window Management (xdotool)

Find browser window:

DISPLAY=desk-browser:1 xdotool search --class "chromium"
# Returns one window ID per line; use the one with the larger geometry (usually the second)

Get window geometry:

DISPLAY=desk-browser:1 xdotool getwindowgeometry <window_id>
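
Picking "the larger geometry one" by hand can be replaced with a small loop over the search results. The sketch below assumes DISPLAY is already exported and relies on `getwindowgeometry --shell`, which prints `WIDTH=` and `HEIGHT=` lines. `largest_window` is a hypothetical helper name.

```shell
# Hypothetical helper: print the ID of the largest window of a given class.
largest_window() {
  local class best_id best_area id geom width height area
  class=${1:-chromium}
  best_id=""
  best_area=0
  for id in $(xdotool search --class "$class"); do
    # --shell prints WINDOW=, X=, Y=, WIDTH=, HEIGHT=, SCREEN= lines
    geom=$(xdotool getwindowgeometry --shell "$id") || continue
    width=$(printf '%s\n' "$geom" | sed -n 's/^WIDTH=//p')
    height=$(printf '%s\n' "$geom" | sed -n 's/^HEIGHT=//p')
    # Skip windows whose geometry could not be parsed.
    [ -n "$width" ] && [ -n "$height" ] || continue
    area=$(($width * $height))
    if [ "$area" -gt "$best_area" ]; then
      best_area=$area
      best_id=$id
    fi
  done
  # Print nothing and return non-zero when no window matched.
  [ -n "$best_id" ] && echo "$best_id"
}
```

The result can then be used where a window ID is needed, for example `xdotool windowsize "$(largest_window chromium)" 1050 780`.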

Resize window:

DISPLAY=desk-browser:1 xdotool windowsize <window_id> 1050 780

Note: getactivewindow doesn't work here because the minimal WM doesn't report an active window. Use search --class instead.
