Open Source · Remote Compute Agent

Your Desktop, From Anywhere

A remote compute agent that gives you full AI desktop control from your phone — real-time screen streaming, text or voice input, and autonomous task execution.


How It Works

You speak, your phone understands, your desktop executes — all in one seamless flow.

Your Phone · Voice or Text

Voice Input

Open Chrome and search for flights to Tokyo

Take a screenshot and summarize what you see

Speak naturally or type — your phone understands what you want to do on your desktop.

On-Device AI · Understands Intent

An AI model running on your phone interprets what you said and figures out if it's a simple question or a task that needs your desktop.

1. Transcribes your voice to text

2. Classifies: chat reply or desktop action?

3. Sends command to your desktop

Simple questions are answered instantly on the phone — only desktop tasks get sent over.

Encrypted Tunnel · Phone → Desktop

End-to-end encrypted

Connected

Same Wi-Fi: Fastest, direct connection

Tailscale VPN: Secure mesh, any network

Cloud Tunnel: Works from anywhere

Automatically picks the fastest available path. Your data never passes through third-party servers.

Safety Check · Every Action Reviewed

Before any action runs on your desktop, it's automatically evaluated for safety.

Open VS Code: safe

Run npm test: safe

Delete system files: blocked

Run unknown script: sandboxed

Risky commands run in an isolated sandbox — no access to your files, network, or system.

AI Agent · Autonomous Execution

An AI agent on your desktop breaks down your request into steps and executes them autonomously.

Opens Chrome

Searches "flights to Tokyo"

Reads the screen to find results

Clicks on the best option

Sending progress updates to your phone

Your Desktop · Windows · macOS · Linux

Terminal: Run any command

GUI Control: Click, type, scroll

Browser: Navigate & interact

Accessibility: Native UI controls

File System: Read, edit, find

Sandbox: Isolated environment

$ Opening Chrome...
$ Searching "flights to Tokyo"
$ Reading screen... 14 elements found

Your Phone

Speak naturally, AI understands

Speak or type naturally on your phone. The on-device AI figures out what you need and whether it requires your desktop.

The Bridge

Encrypted, direct, no cloud

Your commands travel directly from phone to desktop through an encrypted tunnel. Nothing is stored or routed through third-party servers.

Your Desktop

AI does the work for you

An autonomous agent takes over your desktop — opening apps, clicking buttons, running commands — while keeping you updated every step of the way.


Connectivity

Three connection paths, one seamless experience — your phone always finds the fastest route to your desktop.

LAN · Fastest · Tried First

Same Wi-Fi Network

Direct connection, lowest latency

Discovery: Auto-detects desktop on your network

Speed: Connects in under 1.5 seconds

Privacy: Traffic never leaves your router

Best for home & office use

Tailscale · Secure · 2nd Path

Zero-Trust VPN Mesh

WireGuard-based encrypted tunnel

Reach: Works from anywhere; both devices need Tailscale

Speed: Connects in under 3 seconds

Setup: Auto-detected if installed

Best for remote access

Cloudflare Tunnel · Universal · 3rd Path

Public Internet Tunnel

No port forwarding, no firewall config

Reach: Works from anywhere in the world

Speed: Connects in under 5 seconds

Security: No open ports on your machine

Best for quick sessions anywhere

Peer-to-Peer · Direct · Encrypted

Once a signaling path is found, a direct WebRTC connection is established between your phone and desktop.

1. Scan QR code

2. Verify with Face ID / fingerprint

3. Establish encrypted channel

4. Start sending commands

30s: Keepalive interval

5x: Auto-reconnect attempts
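The keepalive interval and reconnect limit above can be pictured as a small retry schedule. This is a minimal sketch, not Contop's implementation: the two constants come from the page, while the exponential shape, the 1-second base, the 30-second cap, and the `backoff_delays` name are assumptions made for illustration.

```python
import random

KEEPALIVE_INTERVAL_S = 30     # from the page: 30s keepalive interval
MAX_RECONNECT_ATTEMPTS = 5    # from the page: 5 auto-reconnect attempts


def backoff_delays(base_s=1.0, cap_s=30.0, jitter=0.0, rng=None):
    """Return the wait in seconds before each reconnect attempt.

    base_s, cap_s, and jitter are hypothetical tuning knobs; the page only
    says that reconnection uses "smart backoff".
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(MAX_RECONNECT_ATTEMPTS):
        # Double the wait on every attempt, but never exceed the cap.
        delay = min(cap_s, base_s * (2 ** attempt))
        # Optional jitter spreads out simultaneous reconnects.
        delay += rng.uniform(0, jitter * delay)
        delays.append(delay)
    return delays


schedule = backoff_delays()
```

With jitter left at zero the schedule is deterministic, which makes it easy to reason about the worst-case time before the app gives up.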

Pairing · QR Code · Biometric

Two pairing modes to fit your workflow — quick one-time sessions or persistent always-on setups.

Quick Session

Expires after 4 hours or when you disconnect

No data persisted on device

Persistent

Stays active for 30 days with auto-reconnect

Reconnects automatically

End-to-end encrypted with DTLS

Biometric verification on every pair

Old tokens auto-revoked on re-pair

CONNECTIVITY

Automatic Path Discovery

Contop tries the fastest path first and falls back automatically. If your connection drops, it reconnects with smart backoff — no manual intervention needed.

ENCRYPTION

End-to-End Encrypted

All data flows directly between your phone and desktop — encrypted with DTLS and verified with certificate fingerprints. Nothing passes through third-party servers.

AUTHENTICATION

Flexible Pairing

Quick sessions for one-time use, persistent connections for your daily setup. Each device gets one active token — re-pairing automatically revokes the old one.

Features

A powerful mobile interface designed for every workflow

Adaptive Layouts

Split View · Default Mode

Screen

Chat

See your desktop screen and conversation side by side. Drag the separator to resize — anywhere from 30% to 70%.

Best for monitoring tasks as they run

Video Focus · Watch Mode

Full Screen

chat overlay

Maximize your desktop view. The chat floats on top as a transparent overlay — tap through it to keep watching.

Best for watching the agent work

Thread Focus · Read Mode

mini screen

Full Chat

Focus on the conversation. A small video preview stays pinned at the top so you never lose sight of your desktop.

Best for reading detailed results

Side-by-Side · Landscape Mode

Screen

Chat

Rotate your phone and get a widescreen view. Desktop screen on the left, conversation on the right — plus a fullscreen video option for dedicated monitoring.

Best for extended work sessions

Fullscreen Video · Landscape Mode

Full Screen

Dedicate your entire screen to watching your desktop. Minimal floating controls stay out of the way. Perfect for long-running tasks where you just need to keep an eye on things.

Best for dedicated desktop monitoring

LAYOUTS

5 Modes for Every Use Case

Split View for balanced monitoring, Video Focus for watching the agent work, Thread Focus for reading results, Side-by-Side for landscape multitasking, and Fullscreen Video for dedicated desktop viewing.

ORIENTATION

Smart Rotation

Rotate your phone and the layout adapts instantly. Set your preferred mode for portrait and landscape — Contop remembers your choices across sessions.

INTERACTION

Drag to Resize

Resize the screen and chat panels by dragging the separator. Works horizontally in portrait and vertically in landscape. Constrained so neither panel gets too small.
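The resize constraint amounts to clamping the separator position. A minimal sketch, assuming the 30% to 70% range quoted for Split View applies to the screen panel; `clamp_split` and its parameters are hypothetical names.

```python
MIN_RATIO = 0.30  # from the page: panels resize between 30% and 70%
MAX_RATIO = 0.70


def clamp_split(drag_px, container_px):
    """Convert a separator position into a panel ratio within the allowed range."""
    if container_px <= 0:
        raise ValueError("container_px must be positive")
    return max(MIN_RATIO, min(MAX_RATIO, drag_px / container_px))
```

Dragging past either limit simply pins the separator at that limit, so neither panel can collapse.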

Intelligent Model Configuration

AI ROLES · Independent · Per-Task

Conversation

Powers chat and voice

Flash

Execution

Drives agent decisions

Flash

Screen Interaction

Sees and controls screen

Local

Configure each AI role independently from your phone — no server restart needed

Best for tailoring AI behavior to your specific workflow

SCREEN INTERACTION · 9 Backends · Your Choice

Local Vision · OmniParser

Runs on your machine · Works offline

Cloud Vision · UI-TARS + 5 more

Kimi · Qwen · Phi · Molmo · Holotron

Native AI Vision · Gemini CU

Google's built-in · Autonomous multi-step

Keyboard First · Accessibility

Text-based · No screenshots needed

Nine vision backends from local to cloud — choose by privacy, speed, or model preference

Best for matching your privacy and performance priorities

MODELS · Multi-Provider · Your Keys

Gemini

3.1 Pro · 3 Flash · 2.5 Pro · 2.5 Flash

OpenAI

GPT-5.4 · GPT-4.1 · o3 · o4 Mini

Anthropic

Claude Opus 4.6 · Sonnet 4.6 · Haiku 4.5

OpenRouter

Grok · Devstral · Qwen · Nemotron · 300+

Bring your own API keys — use any provider for conversation or execution

Best for choosing the right model per task and budget

CONFIGURATION

Per-Role Model Selection

Three independent AI roles — conversation, execution, and screen interaction — each configurable with 25+ models from Gemini, OpenAI, Anthropic, or OpenRouter. Change from your phone anytime.

BACKENDS

Nine Screen Strategies

Nine ways for the agent to see your screen — from local OmniParser to six cloud vision models, Google's native vision, or keyboard-first with no screenshots.

RUNTIME

Switch Without Restarting

Change models and backends on the fly from mobile settings. The desktop agent picks up your new configuration on the next command — zero downtime.

Everyday Experience

LIVE THREAD · Real-Time Updates

Check if the API server is running

I'll check the process list...

Running command...

ps aux | grep server

Server is running on port 3000

See every step the agent takes in real time — messages, tool calls, and results stream into a live thread

Best for following the agent's reasoning step by step

SESSIONS · Auto-Saved · Resumable

All · Today · This Week

Deploy fix for API

Mar 15

12 · Continue

Debug login timeout

Mar 14

Setup CI pipeline

Mar 12

Pick up where you left off — sessions persist across app restarts with full conversation history

Best for resuming complex multi-step tasks

CUSTOM PROMPT · Your Rules · Your Way

Always use PowerShell

My project is at C:\Dev\myapp

Respond in Spanish


Tell the agent how you want it to behave — set language, project paths, or preferred tools

Best for personalizing the agent to your workflow

DEVICE CONTROL · Remote · Instant

Keep Screen Awake

Lock Desktop

Control your desktop state from anywhere

Lock your screen or keep it awake during long tasks — all from your phone

Best for hands-free desktop management

VOICE INPUT · Speak · Send · Execute

0:12

Cancel · Send

Speak or type — your intent becomes the command. Record voice, review, and send, or type directly for quick instructions

Best for quick commands while multitasking

MANUAL CONTROL · Joystick · Hybrid Mode

Esc · Tab · Del · Ctrl

Take direct control — move the cursor with a joystick, click, scroll, and send key combos from your phone

Best for precision tasks the agent can't handle alone

EXECUTION

See Every Step in Real Time

Watch the agent work through your request step by step. User messages, AI responses, tool calls, and results stream into a live thread — with progress indicators and expandable details.

SESSIONS

Pick Up Where You Left Off

Every session is saved automatically with full conversation history. Browse by date, filter by tool or result, rename sessions, and continue any past session with one tap.

CONTROL

Your Desktop, Your Rules

Lock your screen, keep it awake, set custom instructions, use voice input, or take direct control with a joystick overlay for cursor, clicks, and keyboard shortcuts. Switch between AI and manual mode seamlessly.

Agent & Automation

33 built-in tools that let the agent run commands, control your screen, manage files, and automate entire workflows.

Terminal · Run Any Command

$ pip install requests
✓ Installed in 1.2s

Run shell commands on your desktop just like you would in a terminal — install packages, run scripts, manage files.

Safety: Dangerous commands run in a Docker sandbox

Privacy: Sensitive env vars like API keys are hidden

Limits: Auto-stops stalled commands after 5 seconds
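The privacy and limit rules above can be illustrated with a small wrapper around `subprocess.run`. This is a sketch under stated assumptions: the marker list in `SENSITIVE_MARKERS` is hypothetical, and a flat 5-second timeout is a simplification of "stalled" detection, which presumably looks at output activity rather than total runtime.

```python
import os
import subprocess

SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")  # hypothetical list
STALL_TIMEOUT_S = 5  # from the page: stalled commands stop after 5 seconds


def scrubbed_env(env=None):
    """Copy the environment without variables whose names look sensitive."""
    source = dict(os.environ if env is None else env)
    return {
        name: value
        for name, value in source.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


def run_command(argv, env=None):
    """Run argv with a scrubbed environment and a hard time limit."""
    try:
        done = subprocess.run(
            argv,
            env=scrubbed_env(env),
            capture_output=True,
            text=True,
            timeout=STALL_TIMEOUT_S,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "exit_code": None, "stdout": ""}
    return {"status": "ok", "exit_code": done.returncode, "stdout": done.stdout}
```

Filtering by name is deliberately conservative: a variable such as `OPENAI_API_KEY` never reaches the child process, while `PATH` and similar entries pass through untouched.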

Screen Control · See & Interact

The agent sees your screen, identifies every button and element, then clicks, types, and scrolls — just like a person would.

1. Takes a screenshot of your desktop

2. AI identifies all clickable elements

3. Performs the right action at the right spot

Click

Type

Scroll

Drag

Hotkeys

Select

Also supports keyboard-only mode via accessibility tree

Browser · Navigate & Extract

Controls Chrome directly — no screenshots needed. Navigates pages, fills forms, clicks buttons, and reads content efficiently.

Navigate to any URL

Click buttons and links

Fill in forms and fields

Read page text content

Take page snapshots

Manage multiple tabs

Reads page text directly instead of taking screenshots — 10x more efficient for the AI.

Files · Read · Edit · Search

Works with any file on your machine — text, code, PDFs, images, and Excel spreadsheets.

Text & Code: Read and edit with precision

PDFs: Extract content as readable text

Images: View and analyze screenshots

Excel: Read sheets, write cells, merge ranges

Search: Find files by name or content

7 file tools available

Windows · Cross-Platform

Manage windows, read the clipboard, monitor processes, and download files — works the same on every platform.

Window Focus: Switch between apps

Resize: Arrange your workspace

Clipboard: Read and write content

Downloads: Fetch files from URLs

Windows: Native adapters

macOS: Native adapters

Linux: Native adapters

Apps & Skills · Launch · Automate · Extend

Launch and close apps, handle Save As and Open dialogs, and create reusable skills to automate repetitive workflows.

Launch any application: Waits until ready

Close apps gracefully: Auto-saves if needed

Handle file dialogs: Save As, Open, export

Custom Skills

Teach the agent new abilities by creating reusable skills — chain multiple steps into one command.

Prompt · Workflow · Python · Mixed

EXECUTION

Three ways to control your desktop

Run terminal commands, automate GUI interactions by seeing your screen, or control Chrome directly — the agent picks the best approach for each task.

OPERATIONS

Works with any file, any platform

Read and edit code, PDFs, images, and spreadsheets. Manage windows and monitor your system. Same experience on Windows, macOS, and Linux.

EXTENSIBILITY

Teach it new tricks

Create custom skills to automate your unique workflows. Chain actions together, save them once, and reuse them forever — no coding required.

Model Providers

Use API keys or your existing subscriptions — choose from 4 providers and 20+ models, and configure any combination for any task.

Providers · Keys or Subscriptions

Choose from 4 providers and 20+ models. Use any combination for different tasks — switch anytime from your phone.

Google Gemini: Flash, Pro, and Flash Lite models for conversation, execution, screen control, and speech-to-text.

OpenAI: GPT and o-series models with multimodal capabilities. Whisper for alternative speech-to-text.

Anthropic: Claude Opus, Sonnet, and Haiku, with optional extended thinking for deeper reasoning.

OpenRouter: Universal gateway to 300+ models, including Grok, Qwen, Mistral, and Nemotron, via one API key.

Three AI Roles · Mix & Match

The app uses three independent AI roles — assign any provider to any role, and change them at runtime from mobile settings.

Conversation: Understands what you want and classifies your intent

Execution: Runs tools and carries out tasks on your desktop

Vision: Picks how the AI reads your screen, with multiple backends available

Any provider can fill any role — use Gemini for conversation and Claude for execution, or any other combination.

Authentication · QR Pairing

Set up API keys or enable subscription mode on the desktop app. Configuration travels to your phone securely through QR pairing — no manual copying.

1. Configure API keys or enable subscription mode in desktop settings

2. Scan the QR code with your phone to pair

3. Auth config is encrypted and stored in your phone's secure enclave

Gemini

OpenAI

Anthropic

OpenRouter

Models configurable per-role at runtime

CHOICE

Pick the best model for the job

Different tasks benefit from different models. Use a fast model for quick actions and a powerful one for complex reasoning — all from the same app.

CONTROL

Configure from your phone

Switch models and providers at any time from mobile settings. Each AI role can be independently assigned to any supported model.

SECURITY

Your credentials, your control

API keys and subscription preferences never leave your devices. They're configured on your desktop, transferred securely via QR, and stored encrypted on your phone.

Skills

Extensible agent capabilities via the SKILL.md standard — built-in skills included, custom skills easy to create.

Advanced Workflows · v1.0.0 · python

async def fill_form(fields) -> dict

async def extract_text(region, element_name) -> dict

async def copy_between_apps(source, target) -> dict

async def set_env_var(name, value, scope) -> dict

async def change_setting(setting_path, value) -> dict

async def app_menu(app_name, menu_path) -> dict

async def install_app(name, method) -> dict

async def find_and_replace_in_files(path, pattern, old, new) -> dict

8 Python tools · python

IDE Chat · v2.0.0 · workflow

# vscode-claude-send
- action: hotkey
  keys: [ctrl, shift, p]
- action: type_text
  text: "Claude: {prompt}"
- action: press_key
  key: enter

VS Code Claude

VS Code Copilot

Cursor

24 deterministic workflows · workflow

Prompt Skills · SKILL.md Standard

---
name: skill-authoring
description: Guide for creating...
version: "1.0.0"
---
# Skill Instructions
Markdown body with agent...

skill-authoring (v1.0.0): Guide for creating and editing custom skills

web-research (v1.0.0): Browser automation + Electron + CDP strategy

cli-command-patterns (v1.1.0): Cross-platform bash/PowerShell patterns

Agent instructions loaded on demand · prompt

Skill Types · 4 Execution Models

prompt: Agent instructions loaded on demand

workflow: Executed deterministically by workflow engine

python: Async Python functions registered as agent tools

mixed: All mechanisms available

~/.contop/skills/{skill-name}/
├── SKILL.md      # YAML frontmatter + markdown
└── scripts/      # Optional
    ├── *.yaml
    └── *.py

Skill Lifecycle · 5-Stage Pipeline

Discovery: discover_skills() scans ~/.contop/skills/

Registration: enabled_skills in settings.json

Disclosure: Metadata at startup, full on load_skill

Conflicts: Tool name check against 33 CORE_TOOL_NAMES

Agent Tools: execute · load · create · edit
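The discovery and registration stages can be sketched as a directory scan that reads only SKILL.md frontmatter. This is an illustration, not the project's `discover_skills()`: `scan_skills`, `parse_frontmatter`, and the `enabled` parameter are hypothetical names, and the parser handles only flat `key: value` pairs rather than full YAML.

```python
from pathlib import Path


def parse_frontmatter(text):
    """Parse flat `key: value` pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return {}  # no closing fence: treat the file as having no metadata


def scan_skills(root, enabled):
    """Return metadata for every enabled skill found under root."""
    skills = {}
    if not root.is_dir():
        return skills
    for skill_md in sorted(root.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        name = meta.get("name")
        # Only skills listed as enabled are registered; the markdown body
        # stays on disk until the skill is actually loaded.
        if name and name in enabled:
            skills[name] = meta
    return skills
```

Reading only the frontmatter mirrors the disclosure stage: metadata is cheap to keep in memory, and the full instructions are deferred until needed.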


EXTENSIBILITY

Extensible Agent

Add new capabilities by dropping a SKILL.md file into the skills directory. The agent discovers and loads it automatically — no code changes needed.

AUTOMATION

Deterministic Workflows

Define YAML step sequences for repetitive tasks — keyboard shortcuts, menu navigation, form filling. Runs the same way every time, no AI guesswork.
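A deterministic workflow of this kind reduces to a fixed list of steps dispatched to handlers. The sketch below uses the three steps from the vscode-claude-send example, written as Python dicts to stand in for parsed YAML; `run_workflow` and the recording handlers are hypothetical and are not the project's workflow engine.

```python
# The vscode-claude-send steps from the page, as they might look once parsed.
STEPS = [
    {"action": "hotkey", "keys": ["ctrl", "shift", "p"]},
    {"action": "type_text", "text": "Claude: {prompt}"},
    {"action": "press_key", "key": "enter"},
]


def run_workflow(steps, params, handlers):
    """Run steps strictly in order, filling {placeholders} from params."""
    for index, step in enumerate(steps):
        action = step["action"]
        if action not in handlers:
            raise ValueError(f"step {index}: unknown action {action!r}")
        args = {
            key: value.format(**params) if isinstance(value, str) else value
            for key, value in step.items()
            if key != "action"
        }
        handlers[action](**args)


# Recording handlers stand in for real keyboard automation.
log = []
handlers = {
    "hotkey": lambda keys: log.append(("hotkey", tuple(keys))),
    "type_text": lambda text: log.append(("type_text", text)),
    "press_key": lambda key: log.append(("press_key", key)),
}
run_workflow(STEPS, {"prompt": "explain this file"}, handlers)
```

Because no model is consulted between steps, the same input always produces the same sequence of actions.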

CUSTOMIZATION

Create Your Own

Build custom skills as prompt instructions, YAML workflows, Python tools, or any combination. Manage them from the desktop GUI — discover, enable, edit.

Security

Every layer verified against the real codebase — from physical machine protection to encrypted peer-to-peer connections.

Away Mode · away_mode.rs · Physical Security

Bystander sees

Dark PIN overlay

WDA_EXCLUDEFROMCAPTURE = 0x00000011

Invisible to screen capture

Owner sees

Live WebRTC video feed

streaming

Desktop automation running

Away Mode Features · 3 layers

PIN-locked overlay: Fullscreen topmost window · low-level keyboard hook blocks everything except digit keys (0–9), numpad (0–9), backspace, and enter

Auto-engage on idle: Activates after 5 minutes of no mouse or keyboard input · polls every 30 seconds via GetLastInputInfo()

3 unlock methods: Screen PIN (4–12 digits, bcrypt cost 10) · phone command (away_mode_disengage) · emergency recovery PIN (6–12 digits)

Command Classifier · dual_tool_evaluator.py

Every command the agent wants to run goes through this 7-step check — top to bottom, first match wins:

1. User forced host execution → run on host

2. Needs the screen (GUI, browser, observe) → run on host

3. Unknown or unrecognized tool → sandbox it

4. Forbidden command (rm -rf /, format C:, mkfs…) → block entirely

5. Touches protected path (/root, C:\Windows…) → sandbox it

6. Destructive (rm, kill, DROP TABLE, taskkill…) → ask user first

7. Everything else (safe by default) → run on host

Also blocks encoded PowerShell commands · detects dangerous cmdlets like remove-item, stop-process, invoke-expression
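The first-match-wins ordering can be sketched as a chain of early returns. This is a simplified illustration rather than the contents of dual_tool_evaluator.py: the tool names, the `classify` signature, and the plain substring matching are assumptions, and the pattern lists are abbreviated from the ones shown on this page. Substring matching over-blocks (for example, `rm -rf /tmp/cache` also contains `rm -rf /`), so a real evaluator would need token-aware matching.

```python
from enum import Enum


class Verdict(Enum):
    HOST = "run on host"
    SANDBOX = "sandbox it"
    BLOCK = "block entirely"
    CONFIRM = "ask user first"


SCREEN_TOOLS = {"gui", "browser", "observe"}  # hypothetical tool names
KNOWN_TOOLS = SCREEN_TOOLS | {"cli"}
FORBIDDEN = ("rm -rf /", "mkfs", "dd if=", "format c:")
RESTRICTED_PATHS = ("/root", "/etc/shadow", "/etc/passwd", "c:\\windows")
DESTRUCTIVE_WORDS = {"rm", "rmdir", "del", "kill", "taskkill", "shutdown",
                     "remove-item", "stop-process"}


def classify(tool, command="", force_host=False):
    """Apply the seven checks top to bottom; the first match decides."""
    text = command.lower()
    if force_host:                                   # 1. user forced host
        return Verdict.HOST
    if tool in SCREEN_TOOLS:                         # 2. needs the screen
        return Verdict.HOST
    if tool not in KNOWN_TOOLS:                      # 3. unknown tool
        return Verdict.SANDBOX
    if any(bad in text for bad in FORBIDDEN):        # 4. forbidden command
        return Verdict.BLOCK
    if any(p in text for p in RESTRICTED_PATHS):     # 5. protected path
        return Verdict.SANDBOX
    if DESTRUCTIVE_WORDS & set(text.split()) or "drop table" in text:
        return Verdict.CONFIRM                       # 6. destructive
    return Verdict.HOST                              # 7. safe by default
```

Ordering matters: a forbidden command is blocked before the destructive check can downgrade it to a confirmation prompt.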

Docker Sandbox · docker_sandbox.py

Risky commands run inside a locked-down Docker container:

No network access: Container can't reach the internet

256 MB memory limit: Prevents resource exhaustion

50% CPU cap: Won't slow down your machine

100 process limit: No fork bombs

Read-only filesystem: Can't modify the container image

64 MB temp storage: Only /tmp is writable, and it's tiny

No privilege escalation: Runs as 'nobody' with zero capabilities

Auto-starts Docker Desktop: Detects if Docker is installed but not running and starts it automatically
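Each restriction above maps onto a standard `docker run` option. The sketch below builds such a command line; whether docker_sandbox.py uses the CLI or the Docker SDK is not stated on the page, so treat the CLI form, the `sandbox_argv` helper, the `--rm` flag, and the `alpine:3` image in the usage line as assumptions.

```python
def sandbox_argv(image, command):
    """Build a `docker run` command line that applies the listed limits."""
    return [
        "docker", "run", "--rm",                 # --rm is an assumption
        "--network", "none",                     # no network access
        "--memory", "256m",                      # 256 MB memory limit
        "--cpus", "0.5",                         # 50% CPU cap
        "--pids-limit", "100",                   # 100 process limit
        "--read-only",                           # read-only root filesystem
        "--tmpfs", "/tmp:rw,size=64m",           # 64 MB writable /tmp
        "--user", "nobody",                      # unprivileged user
        "--cap-drop", "ALL",                     # zero capabilities
        "--security-opt", "no-new-privileges",   # no privilege escalation
        image, "sh", "-c", command,
    ]


argv = sandbox_argv("alpine:3", "echo hi")
```

Returning a list rather than a shell string avoids quoting problems when the result is handed to `subprocess.run`.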

Mobile Approval · webrtc_peer.py

When the agent wants to do something destructive, it asks your phone for permission first:

1. Desktop agent pauses

"I want to delete 3 files — approve?"

2. Your phone shows a prompt

Approve or Deny with one tap

3. Agent gets your answer

Proceeds only if you approved

Sent over encrypted WebRTC data channel:

{
  "type": "agent_confirmation_response",
  "payload": { "approved": true }
}
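On the desktop side, handling this message comes down to a strict check that fails closed. A minimal sketch, assuming the message arrives as a JSON string; `is_approved` is a hypothetical helper, not a function from webrtc_peer.py.

```python
import json


def is_approved(raw):
    """Return True only for a well-formed approval; anything else is a denial."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(message, dict):
        return False
    if message.get("type") != "agent_confirmation_response":
        return False
    payload = message.get("payload")
    # Require the literal boolean true, not merely a truthy value.
    return isinstance(payload, dict) and payload.get("approved") is True
```

Treating malformed or unexpected messages as a denial means a garbled packet can never approve a destructive action by accident.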

Audit Logging · audit_logger.py

~/.contop/logs/session-{YYYY-MM-DD}.jsonl

timestamp: UTC ISO 8601

session_id: str

user_prompt: str

classified_command: str

tool_used: str

execution_result: str

voice_message: str (default "")

duration_ms: int (default 0)

async def log(*, session_id, user_prompt, classified_command, tool_used, execution_result, voice_message, duration_ms)

Fire-and-forget · asyncio.to_thread(self._write_line, path, line)
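The record layout and the `asyncio.to_thread` call above suggest a logger along the following lines. This is a sketch under assumptions: the class name, constructor, and JSON key order are illustrative, and the example awaits the write so it is easy to test, whereas a fire-and-forget caller would schedule it with `asyncio.create_task` instead.

```python
import asyncio
import json
from datetime import datetime, timezone
from pathlib import Path


class AuditLogger:
    def __init__(self, log_dir):
        self.log_dir = Path(log_dir)  # e.g. ~/.contop/logs

    @staticmethod
    def _write_line(path, line):
        # Blocking file append; runs in a worker thread via to_thread.
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            handle.write(line + "\n")

    async def log(self, *, session_id, user_prompt, classified_command,
                  tool_used, execution_result, voice_message="", duration_ms=0):
        now = datetime.now(timezone.utc)
        entry = {
            "timestamp": now.isoformat(),  # UTC ISO 8601
            "session_id": session_id,
            "user_prompt": user_prompt,
            "classified_command": classified_command,
            "tool_used": tool_used,
            "execution_result": execution_result,
            "voice_message": voice_message,
            "duration_ms": duration_ms,
        }
        path = self.log_dir / f"session-{now:%Y-%m-%d}.jsonl"
        await asyncio.to_thread(self._write_line, path, json.dumps(entry))
```

One JSON object per line keeps the log appendable and lets each day's file be processed with ordinary line-oriented tools.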

Auth & Encryption · pairing.py

@dataclass
PairingToken:
    token
    dtls_fingerprint
    stun_config
    created_at
    expires_at
    device_id
    connection_type = "permanent"

TOKEN_TTL_DAYS = 30

TEMP_TOKEN_TTL_HOURS = 4

DTLS fingerprint: SHA-256

STUN: stun.l.google.com:19302

~/.contop/tokens.json · atomic write (.tmp → rename) · validate_token() auto-removes expired
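The token fields, the two TTL constants, and the atomic write can be combined into a short runnable sketch. The field names and constants come from the page; the field types, the `"temp"` connection type, and the helpers `issue_token`, `is_expired`, and `save_tokens` are assumptions made for illustration.

```python
import json
import os
import secrets
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone
from pathlib import Path

TOKEN_TTL_DAYS = 30
TEMP_TOKEN_TTL_HOURS = 4


@dataclass
class PairingToken:
    token: str
    dtls_fingerprint: str
    stun_config: dict
    created_at: str
    expires_at: str
    device_id: str
    connection_type: str = "permanent"


def issue_token(device_id, dtls_fingerprint, temporary=False, now=None):
    """Create a token that expires after 4 hours (temporary) or 30 days."""
    now = now or datetime.now(timezone.utc)
    ttl = (timedelta(hours=TEMP_TOKEN_TTL_HOURS) if temporary
           else timedelta(days=TOKEN_TTL_DAYS))
    return PairingToken(
        token=secrets.token_urlsafe(32),
        dtls_fingerprint=dtls_fingerprint,
        stun_config={"urls": ["stun:stun.l.google.com:19302"]},
        created_at=now.isoformat(),
        expires_at=(now + ttl).isoformat(),
        device_id=device_id,
        connection_type="temp" if temporary else "permanent",  # "temp" is assumed
    )


def is_expired(tok, now=None):
    now = now or datetime.now(timezone.utc)
    return now >= datetime.fromisoformat(tok.expires_at)


def save_tokens(path, tokens):
    """Write to a .tmp sibling, then rename, so readers never see a partial file."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps([asdict(t) for t in tokens]), encoding="utf-8")
    os.replace(tmp, path)
```

`os.replace` swaps the file in a single step on the same filesystem, which is what makes the write-then-rename pattern safe against crashes mid-write.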

Configurable Rules · settings.py

restricted_paths[]

/root · /etc/shadow · /etc/passwd · C:\Windows · C:\Windows\System32 · C:\Windows\SysWOW64

forbidden_commands[]

rm -rf / · mkfs · dd if= · format C: · del /f /s /q C:\

destructive_patterns[]

rm · rmdir · del · kill · taskkill · shutdown · DROP TABLE · remove-item · stop-process…

away_mode:
  enabled: false
  pin_hash: ""
  emergency_pin_hash: ""
  auto_engage_minutes: 5
  idle_timeout_enabled: true

Hot-reload via mtime caching · get_settings() checks _cached_mtime · ~/.contop/settings.json
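Hot-reload through mtime caching means re-reading the settings file only when its modification time changes. A minimal sketch of that idea: `get_settings` and `_cached_mtime` are named on the page, but the signature taking an explicit path and the `_cached_settings` variable are assumptions.

```python
import json
from pathlib import Path

_cached_settings = None
_cached_mtime = None


def get_settings(path):
    """Return parsed settings, reloading only when the file's mtime changes."""
    global _cached_settings, _cached_mtime
    mtime = Path(path).stat().st_mtime_ns
    if _cached_settings is None or mtime != _cached_mtime:
        _cached_settings = json.loads(Path(path).read_text(encoding="utf-8"))
        _cached_mtime = mtime
    return _cached_settings
```

A `stat` call is far cheaper than parsing JSON, so the function can be called on every command while still picking up edits without a restart.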

Paired Devices · pairing.py · Desktop UI

See every device that can access your computer — live status, location, and connection path:

Alex's iPhone · Connected

via Local Network

Just now

Work iPad · Disconnected

via Tunnel

from San Francisco, US

2 hours ago

One-click revoke — instantly disconnects and blocks the device

Alerts & Smart Pairing · geo.py · OS Notifications

Native OS notifications fire in real time — even when the app is minimized:

Device Connected

Alex's iPhone via Local Network

Token Replaced

New pairing replaced existing access

Connection path auto-classified:

Private IP (192.168.x, 10.x) → Local Network

Tailscale IP (100.64.0.0/10) → Tailscale VPN

Public IP → Tunnel — geo-located
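The three classification rules map directly onto Python's standard `ipaddress` module. A minimal sketch: `classify_path` is a hypothetical name, and treating loopback addresses as local is an added assumption.

```python
import ipaddress

# Tailscale assigns addresses from the shared CGNAT range listed on the page.
TAILSCALE_NET = ipaddress.ip_network("100.64.0.0/10")


def classify_path(ip):
    """Label a peer address as Local Network, Tailscale VPN, or Tunnel."""
    addr = ipaddress.ip_address(ip)
    # Check the Tailscale range first so it is never mistaken for anything else.
    if addr in TAILSCALE_NET:
        return "Tailscale VPN"
    if addr.is_private or addr.is_loopback:
        return "Local Network"
    return "Tunnel"
```

Only addresses that fall through to the last branch are public, which is why geo-location applies to the Tunnel case alone.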

PHYSICAL SECURITY

Away Mode

Away Mode protects your machine when you're not at the keyboard. PIN overlay, keyboard lock, idle auto-engage, encrypted secrets.

EXECUTION SAFETY

Command Classification

Every command is classified before it runs. Dangerous actions are sandboxed or blocked. You approve what matters.

CONNECTION TRUST

End-to-End Encrypted

End-to-end encrypted. Peer-to-peer. No cloud relay. Biometric pairing. Your data never leaves the tunnel.

DEVICE VISIBILITY

Paired Device Management

See every connected device, where it's connecting from, and revoke access instantly. OS alerts for every connection event.

Use Cases

Real people, real problems, solved in seconds.

Production Outage · Voice → CLI

Voice Input

“Check the production logs for user-auth. If it's stalled, restart it.”

Processing

$ docker logs user-auth --tail 20

ERROR: health check timeout

$ docker restart user-auth

Up 3 seconds (healthy)

Resolved in 47s

Alex — Backend Engineer

On a train · PagerDuty alert firing

PagerDuty fires while Alex is on the train. He opens Contop, speaks one command, and the agent checks the logs, finds the stalled container, and restarts it. Outage resolved in under a minute — no laptop needed.

WebRTC Tunnel · Voice Input · CLI Execution · Real-Time Video

Security Boundary · Command Gate

“Run render-final.sh, but first delete old temp files in root directory”

Security Gate

BLOCKED: Root directory deletion

“Proceed with render only?”

Yes

Sarah — Motion Designer

Coffee shop · 4K render deadline

Sarah asks Contop to run her render script and casually adds “delete temp files in root.” The security gate blocks the dangerous part, asks her phone to confirm, and kicks off just the render. System stays safe.

DualToolEvaluator · Sandbox · User Confirmation · Restricted Paths

GUI Automation · Visual Navigation

Blender 4.1

Render

72%

GPU Memory Error

CUDA out of memory — tile size too large

Retry · Cancel

Voice Command

“Lower the tile size in render settings and hit retry”

Marcus — 3D Artist

Dinner out · Blender render running at home

Marcus left a Blender render running on his workstation. At dinner, his phone shows a GPU memory error dialog blocking the process. He tells Contop to lower the tile size and hit retry — the agent navigates Blender's UI visually, clicks through the settings, and the render resumes.

GUI Automation · Visual Stream · Desktop Apps · No CLI Needed

RESPONSE TIME

Instant Response

Resolve critical issues in seconds, not minutes. Voice-to-action from any location.

SAFETY

Safety by Design

Dangerous actions are caught, sandboxed, and confirmed before execution.

SUPPORT

Zero Walkthrough

Remote support without asking users to follow complex steps.

Download

Get Contop running on your machine in minutes.

Windows

Installer (.exe)

Download

Requires: Python 3.12+

macOS

Disk Image (.dmg)

Download

Requires: Python 3.12+

Linux

AppImage / .deb

Download

Requires: Python 3.12+

.deb also available

Or install via a package manager, with no security warnings

macOS: brew install slopedrop/contop/contop

Windows: scoop bucket add contop https://github.com/slopedrop/scoop-contop, then scoop install contop

Android

Google Play Store · Coming Soon

iOS

Coming Soon

Developer Documentation

Setup guides, API reference, skill authoring, and configuration.

Read the Docs

DESKTOP

Desktop Agent

Control your computer with AI from anywhere. Windows, macOS, and Linux.

MOBILE

Mobile Commander

Voice and text control from your phone. Android (iOS coming soon).

DOCS

Documentation

Setup guides, API reference, skill authoring, and configuration.