HuGR  /  Pilot

Your phone.
Operated from your desk.

Connect your Android or iOS device and let AI agents operate any mobile app through the native accessibility layer. No APIs, no screenshots, no root. Real apps, real actions, real results.

Download for macOS Download for Windows

Read the screen. Act on it.

Pilot reads the native UI tree of any app — the same structure screen readers use — and operates it with surgical precision.

screen_read — iFood
# AI reads the screen as structured data
app: iFood (br.com.brainweb.ifood)
keyboard: false
overlay: none

[toolbar]
  btn "Voltar"
  btn "Buscar em Bob's Shakes"
[list: 20 items, scrollable]
  [item 0] "Milk Shake Tropical" heading
           "R$ 21,90"
           "A nossa baunilha gelada batida..."
  [item 1] "Bobs Max Tropical"
           "R$ 16,90"
agent action
# AI selects the item
select_item("Milk Shake Tropical")
→ Found in list at index 0
→ Tapped item
→ Waiting for screen change... OK
Screen changed → product detail

press("Adicionar")
→ Found button "Adicionar ao carrinho"
→ Tapped
OK Item added to cart
Native UI tree

No screenshots.
No computer vision.

Other tools take screenshots and guess where to click. Pilot reads the actual accessibility tree — the same semantic structure that screen readers use. Every button, every list item, every text field is a structured node with role, label, and coordinates. No guessing. No pixel matching. No hallucinated clicks.

screen_elements
# Every interactive element, with role and coordinates
 i  role      text                    x    y
─────────────────────────────────────────────
 0  button    "Voltar"                52   103
 1  button    "Buscar"                653  96
 2  listitem  "Milk Shake Tropical"   360  779
 3  edittext  "Buscar..."             360  125
 4  tab       "Pedidos"               540  1540
 5  tab       "Perfil"                680  1540
open_app + search
# Open any app by name — no drawer, no scrolling
open_app("Uber")
OK Launched com.ubercab

screen_status()
app: Uber
keyboard: false
overlay: none

search("Aeroporto de Guarulhos")
→ Found search field
→ Typed query
→ Submitted
OK 3 results found

select_item("GRU Airport")
OK Destination set
Universal

Any app. Any action.
Zero integration.

iFood, Uber, WhatsApp, banking apps, government services — if it runs on your phone, Pilot can operate it. No APIs to configure. No OAuth flows. No developer access. The app doesn't know it's being automated. It just works.

24 MCP tools

Surgical tools,
not raw commands.

Each tool does exactly one thing well. press finds a button by text and taps it. toggle sets a switch to the desired state idempotently. fill_form fills multiple fields at once. wait_for polls until a condition is met. The AI agent orchestrates them — you describe the goal.

available tools
Perception:   screen_read · screen_elements · screen_status · read_list · read_form
Interaction:  press · type_in · select_item · toggle · dismiss · scroll_down · go_back · open_app
Task:         search · fill_form · wait_for · assert_screen
System:       take_screenshot · clipboard · device_info
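As a sketch of how an agent might chain these tools for the iFood example above (the tool names come from the list; the `call(tool, **args)` client interface is a stand-in, not Pilot's actual MCP client):

```python
def order_milkshake(call):
    """Chain Pilot tools to add an item to the cart.
    `call(tool, **args)` stands in for whatever MCP client invokes the tools."""
    call("open_app", name="iFood")
    call("wait_for", condition="screen_settled", timeout_s=10)
    call("search", query="Milk Shake Tropical")
    call("select_item", text="Milk Shake Tropical")
    call("press", text="Adicionar")
    call("assert_screen", contains="carrinho")

# Record the tool sequence with a dummy client:
log = []
order_milkshake(lambda tool, **args: log.append(tool))
```

The agent decides this sequence at runtime from the goal and the screen state; the code above only illustrates the shape of one run.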

Everything in Pilot.

Device routing

Connect your Android or iOS device over your local network. Push-based routing — no polling, no port forwarding required.

Accessibility tree

Reads the native UI tree via Android AccessibilityService. Every node is a structured object with role, text, bounds, and actions.

Smart search

Find the search field, type the query, submit, and return results — all in a single tool call. Works across any app.
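One plausible way such a single call could work internally is a hint-based scan of editable fields. This is an illustrative sketch, not Pilot's implementation, and the hint list is an assumption:

```python
SEARCH_HINTS = ("search", "buscar", "pesquisar")  # assumed heuristic

def smart_search(elements, query, actions):
    """Pick the first editable field whose hint looks like a search box,
    type the query, and submit. Records the steps it would perform."""
    for el in elements:
        if el["role"] == "edittext" and any(h in el["text"].lower() for h in SEARCH_HINTS):
            actions.append(("type_in", el["text"], query))
            actions.append(("submit",))
            return True
    return False

elements = [
    {"role": "button", "text": "Voltar"},
    {"role": "edittext", "text": "Buscar..."},
]
actions = []
ok = smart_search(elements, "Aeroporto de Guarulhos", actions)
```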

Form filling

Pass a dictionary of label-value pairs and fill multiple form fields at once. Labels resolved by hint text, accessibility label, or position.
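A minimal sketch of that label resolution, assuming hypothetical field records with `hint` and `a11y_label` attributes (position-based fallback omitted for brevity):

```python
def fill_form(fields, values):
    """Resolve each label to a field by hint text, then accessibility
    label. Unmatched labels are reported back rather than guessed."""
    filled, missing = {}, []
    for label, value in values.items():
        match = next(
            (f for f in fields
             if label.lower() in (f["hint"] or "").lower()
             or label.lower() in (f["a11y_label"] or "").lower()),
            None,
        )
        if match is None:
            missing.append(label)
        else:
            filled[match["id"]] = value
    return filled, missing

fields = [
    {"id": "f0", "hint": "Nome completo", "a11y_label": ""},
    {"id": "f1", "hint": "", "a11y_label": "E-mail"},
]
filled, missing = fill_form(
    fields, {"nome": "Ana", "e-mail": "ana@example.com", "cpf": "000"}
)
```

Reporting unmatched labels instead of silently skipping them lets the agent retry with a different label or ask the user.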

Idempotent toggle

Set a checkbox or switch to the desired state. Calling it twice does the same thing. No accidental toggles.
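The idempotency is simple to state in code: read the current state, tap only if it differs from the desired state. A sketch with an in-memory switch standing in for the real UI node:

```python
def toggle(switch, desired: bool) -> bool:
    """Set a switch to `desired`; tap only when the state differs.
    Returns True if a tap was actually performed."""
    if switch["checked"] == desired:
        return False  # already in the desired state, do nothing
    switch["checked"] = desired  # in reality: tap the node
    return True

wifi = {"label": "Wi-Fi", "checked": False}
first = toggle(wifi, True)    # taps: off -> on
second = toggle(wifi, True)   # no-op: already on
```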

Auto-dismiss

Detects and closes overlays, popups, and dialogs automatically. Finds the dismiss button in the topmost window layer.

Screenshot capture

Take a screenshot when you need visual context. Returned as base64 for LLM analysis alongside the structured tree data.

Wait conditions

Poll until a condition is met — element appears, loading finishes, or screen changes. Configurable timeout.
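The polling loop behind such a tool can be sketched in a few lines; the interval and timeout defaults here are illustrative:

```python
import time

def wait_for(condition, timeout_s=10.0, interval_s=0.25):
    """Poll `condition()` until it returns truthy or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval_s)
    return False

# Simulated screen that finishes loading on the third poll:
polls = {"n": 0}
def loaded():
    polls["n"] += 1
    return polls["n"] >= 3

ok = wait_for(loaded, timeout_s=2.0, interval_s=0.01)
```

Returning a boolean instead of raising lets the agent branch on the outcome, e.g. dismiss an overlay and retry.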

Screen assertions

Verify the current screen matches expectations before acting. Check app name, visible text, or absence of elements.
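A sketch of what those three checks could look like, with hypothetical parameter names and a simplified screen state:

```python
def assert_screen(state, app=None, contains=None, absent=None):
    """Check app name, required text, and forbidden text before acting.
    Returns a list of failure messages; an empty list means the screen matches."""
    failures = []
    if app is not None and state["app"] != app:
        failures.append(f"expected app {app!r}, got {state['app']!r}")
    texts = state["texts"]
    if contains is not None and not any(contains in t for t in texts):
        failures.append(f"missing text {contains!r}")
    if absent is not None and any(absent in t for t in texts):
        failures.append(f"unexpected text {absent!r}")
    return failures

state = {"app": "iFood", "texts": ["Milk Shake Tropical", "R$ 21,90"]}
ok = assert_screen(state, app="iFood", contains="Milk Shake", absent="Erro")
```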

Clipboard access

Read or write the device clipboard. Copy confirmation codes, share text between desktop and phone.

Device info

Check battery, WiFi status, screen state, and device model. Know the state of the device before acting.

Encrypted relay

Commands travel through an encrypted WebSocket relay. Token rotation, keepalive, and bounded executors. Audited and hardened.

What you get on each tier.

Feature                  Free       Budget — $9/mo    Pro — $29/mo
Device routing           Yes
Connected devices        Up to 3
MCP tools                All 24
Screen read / elements   Unlimited
Screenshot capture       Yes
Encrypted relay          Yes

Your phone has 200 apps.
Now your AI has them too.

Download HuGR Wallet. Connect your device. Let agents handle the rest.