Agent Click
Agent click refers to the capability of an AI agent to click, type, scroll, and read native desktop applications in the background without stealing the user’s cursor, window focus, or active Space. The term captures a new class of OS-level interaction primitives built specifically for LLM-driven automation.
The concept crystallized on April 28, 2026, when Cua (trycua) launched Cua Driver on Hacker News, describing it as “a background computer-use driver for macOS that lets an agent click, type, scroll, and read native apps while your cursor, frontmost app, and Space stay where they are.” The driver uses Apple’s private SLEventPostToPid SkyLight API to route agent events directly to a target process.
Think of it as a second invisible hand that drives apps you’re not looking at.
Search Interest
- Nascent: 0–7 days
- Emergent: 8–30 days
- Validating (now): 31–90 days
- Rising: 91–180 days
- Established: 180 days +
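The stage bands above amount to a lookup on the term's age in days. The following is a minimal sketch of that lookup, not the tracker's actual pipeline: the function name `stage_for` is hypothetical, and it assumes day 180 counts as Rising, since the listed bands overlap at that boundary. The dates are the emergence date and the "as of" date quoted in this report's FAQ.

```python
from datetime import date

# Stage bands as listed above: (name, first day, last day); None means open-ended.
# Assumption: day 180 is treated as "Rising" because the listed bands overlap there.
STAGES = [
    ("Nascent", 0, 7),
    ("Emergent", 8, 30),
    ("Validating", 31, 90),
    ("Rising", 91, 180),
    ("Established", 181, None),
]


def stage_for(emerged: date, as_of: date) -> tuple[int, str]:
    """Return (age in days, stage name) for a term on a given date."""
    age = (as_of - emerged).days
    if age < 0:
        raise ValueError("as_of is earlier than the emergence date")
    for name, first, last in STAGES:
        if age >= first and (last is None or age <= last):
            return age, name
    raise AssertionError("unreachable: bands cover every non-negative age")


age, stage = stage_for(date(2026, 4, 28), date(2026, 6, 16))
print(age, stage)  # 49 Validating
```

With those two dates the age is 49 days, which falls in the 31–90 day band and matches the "now" marker on Validating.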
Why is it emerging now?
LLMs can now reason about GUIs reliably enough to drive real apps — but every existing solution hijacks your cursor. Cua Driver’s April 28 Show HN launch introduced background agent-control as a first-class primitive, letting Claude Code, Codex, or any MCP-capable loop click native macOS apps while you keep working.
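To make "MCP-capable loop" concrete: MCP is a JSON-RPC 2.0 based protocol, and a tool invocation is sent as a `tools/call` request. The sketch below only builds such a message. The tool name `click` and its `pid`, `x`, and `y` arguments are hypothetical placeholders, not Cua Driver's actual tool schema, which this report does not describe.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique within this process


def tools_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical tool name and arguments, for illustration only.
request = tools_call("click", {"pid": 4242, "x": 120, "y": 340})
print(request)
```

An agent loop would send a message like this to the driver's MCP server and read the result back, so the click is addressed to a target process rather than to wherever the user's cursor happens to be.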
Outlook
6-month signal projection and commercial timeline.
Background agent-control is a genuine gap; momentum depends on Apple formalizing or blocking the private API.
Risk · Apple could revoke SLEventPostToPid access in a macOS update, forcing a rewrite or killing the approach.
Analogs · computer-use · browser-use · desktop automation
- now: OSS tool, MIT licensed. No direct revenue; early adopters build internal automation on top of the free driver.
- 3-6 mo: Hosted background agents. SaaS wrapper charging per agent-minute of background desktop control emerges.
- 6-12 mo: Enterprise audit + compliance. Paid tiers adding decision logs satisfy compliance teams asking why the agent clicked.
Ideas for term “Agent Click”
Buildable pitches — turn this term into an article, site, product, post, newsletter, video, or course. Steal any card and run with it.
High-intent comparison search. Browser-use is well-covered; native-app agent control via tools like Cua Driver is the content gap. Comparison posts in this space monetize via affiliate links to hosted agent platforms.
Evergreen tutorial targeting the ‘agent macOS automation’ long-tail. Walk through Cua Driver setup with Claude Code or Codex. Monetize via sponsored cloud-desktop sandbox affiliate.
HN top comment flagged the compliance gap: when an agent clicks through an ERP, how do you explain the ‘why’ to a compliance team? A lightweight log-replay tool recording agent intent alongside click coordinates addresses a real enterprise pain.
Multiple HN comments asked about Windows. Codex Computer Use plans Windows support but hasn’t shipped. A cross-platform background-click library filling that gap could become a category leader.
Shareable demo showing background agent-click in action; demonstrates the no-cursor-steal UX that text can’t convey. High YouTube thumbnail potential for a live split-screen.
Opinion post for dev Twitter / LinkedIn asserting that foreground computer-use agents are a dead end and background agent-click is the only viable production model.
Cua Driver relies on SLEventPostToPid, an undocumented SkyLight function that Apple can remove in any OS update, and a repository with roughly 15,000 GitHub stars now depends on it.
Every impressive computer-use demo you’ve seen steals your cursor for the duration — which is fine for a 90-second video and useless in production.
An HN commenter on the Cua Driver launch called it: if Apple keeps locking down native agent APIs, Linux (and Android) could become the default platform for agent-click workloads within two years.
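The log-replay idea above, recording agent intent alongside click coordinates, could start as small as an append-only JSON Lines log. This is a sketch under stated assumptions, not an existing tool: `ClickRecord`, its field names, and the helper functions are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ClickRecord:
    # All field names are illustrative assumptions, not a real log schema.
    step: int     # position of the action within the agent run
    app: str      # target application the click was routed to
    x: int        # click coordinates inside the target window
    y: int
    intent: str   # the agent's stated reason for the click


def to_jsonl(records):
    """Serialize records as JSON Lines, one action per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def from_jsonl(text):
    """Parse JSON Lines back into ClickRecord objects, skipping blank lines."""
    return [ClickRecord(**json.loads(line)) for line in text.splitlines() if line.strip()]


def explain(records):
    """Render a replay that pairs each click with its recorded reason."""
    return [
        f"step {r.step}: clicked ({r.x}, {r.y}) in {r.app} because {r.intent}"
        for r in records
    ]


log = [
    ClickRecord(1, "ERP", 412, 88, "open the invoices tab"),
    ClickRecord(2, "ERP", 640, 310, "select the overdue invoice"),
]
replayed = from_jsonl(to_jsonl(log))
for line in explain(replayed):
    print(line)
```

Keeping the intent in the same record as the coordinates is what lets a reviewer answer "why did the agent click here?" without reconstructing the model's context after the fact.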
FAQ
What is Agent Click?
Agent click refers to the capability of an AI agent to click, type, scroll, and read native desktop applications in the background without stealing the user’s cursor, window focus, or active Space.
Why is Agent Click emerging now?
LLMs can now reason about GUIs reliably enough to drive real apps — but every existing solution hijacks your cursor. Cua Driver’s April 28 Show HN launch introduced background agent-control as a first-class primitive, letting Claude Code, Codex, or any MCP-capable loop click native macOS apps while you keep working.
When did Agent Click emerge?
Publicly emerged around 2026-04-28 (about 49 days ago as of 2026-06-16). EarlyTerms first recorded a pipeline signal on 2026-04-29.
Related Terms
Other terms in the same space — aliases, subtypes, competitors, and neighbors to explore next.
- Part of: agent-loop. An agent loop is the control-flow pattern at the center of every autonomous LLM agent: the model observes its context, reasons about…
- Related: browser-use. Browser Use is an open-source Python library (browser-use/browser-use) that lets an LLM drive a real Chrome instance to complete web…
- Related: Browser Harness. Browser Harness is a 592-line Python project from browser-use that gives an LLM direct, unmediated control of Chrome via the DevTools…
- Related: agent-harness. An agent harness is the middleware between a large language model and the real world: code that runs the agent loop, calls tools,…
- Related: managed-agents. Managed Agents is an infrastructure paradigm where cloud platforms host, orchestrate, and operate AI agents as a service.
- Related: model-context-protocol. Model Context Protocol (MCP) is an open, JSON-RPC-2.0-based standard that defines how AI applications talk to external tools, data, and…
- Related: agentic-coding. Agentic coding is the software-development pattern where an autonomous AI agent plans, writes, tests, and iterates on code against a…
- Related: coding-agents. Coding Agents is the category name for AI developer tools that act on code autonomously: reading a repo, planning a change, editing…
- Related: dev-agents. Dev Agents is a loose umbrella label for AI agents that write, review, and ship code on a developer's behalf, a near-synonym of the…
- Related: claude-code. Claude Code is Anthropic's official command-line coding agent, a terminal tool that reads your codebase, edits files, runs commands,…
Sources
Primary URLs this report cites — open any to verify the claim yourself.
- 01 Show HN: Drive any macOS app in the background without stealing the cursor (165 pts) news.ycombinator.com ↗
- 02 trycua/cua — Open-source infrastructure for Computer-Use Agents (15.1k stars) github.com ↗
- 03 Inside macOS Window Internals — Cua Engineering Blog (Apr 23, 2026) github.com ↗
- 04 Cua Driver: comparison vs Codex Computer Use, Claude Cowork, Lume cua.ai ↗
- 05 Cua — The Computer Use Agent Platform (official site) cua.ai ↗