Long-running Agents
Long-running agents are AI agents designed to sustain work across multiple context windows. They persist state through structured artifacts (progress files, git commits, feature specs) so each new session resumes where the last ended. The pattern addresses a hard constraint: each new context window starts with no memory of the previous one.
Anthropic engineer Justin Young formalized the concept on November 26, 2025 with a two-agent harness (initializer + coding agent) enabling Claude to build production web apps across sessions. By February 2026, Cursor launched a research preview with documented 36–52 hour autonomous coding runs; Anthropic followed with a three-agent planner-generator-evaluator architecture in March 2026.
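The initializer + coding agent split can be sketched roughly as follows. This is a minimal illustration, not Anthropic's actual harness: the file name `feature_spec.json`, the `passes` field, and the `implement` callback (a stand-in for the model doing the work) are all hypothetical.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Callable, Optional

SPEC_NAME = "feature_spec.json"  # hypothetical file name, chosen for this sketch


def initialize(workdir: Path, features: list[str]) -> None:
    """Initializer session: record every feature as not yet passing."""
    spec = [{"name": name, "passes": False} for name in features]
    (workdir / SPEC_NAME).write_text(json.dumps(spec, indent=2), encoding="utf-8")


def coding_session(workdir: Path, implement: Callable[[str], bool]) -> Optional[str]:
    """One coding session: attempt the first unfinished feature in the spec.

    Returns the name of the feature attempted, or None when nothing is left.
    """
    path = workdir / SPEC_NAME
    spec = json.loads(path.read_text(encoding="utf-8"))
    todo = next((f for f in spec if not f["passes"]), None)
    if todo is None:
        return None
    # `implement` stands in for the model's work and reports success or failure.
    todo["passes"] = implement(todo["name"])
    path.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    return todo["name"]


def run_demo() -> list[str]:
    """Run sessions until the spec is complete; each call shares only the file."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        initialize(workdir, ["login", "chat", "search"])
        attempted = []
        while (name := coding_session(workdir, lambda _: True)) is not None:
            attempted.append(name)
        return attempted
```

Each `coding_session` call reads everything it needs from disk, so no state has to survive in memory between sessions.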
Cursor's long-running agent completed an all-new chat platform integration in 36 hours and a mobile app port in 30 hours, producing PRs with merge rates comparable to short-session agents. The runs used multi-agent plan-and-verify loops to prevent context drift on tasks spanning thousands of files.
Think of it as a relay race for AI: each runner picks up the baton from a structured handoff note, not from memory.
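A structured handoff note could look something like the sketch below. The file name `progress.jsonl` and the `session`, `done`, and `next` fields are assumptions made for illustration; the source only says that state is persisted in progress files.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Optional

PROGRESS_NAME = "progress.jsonl"  # hypothetical file name, one JSON entry per line


def write_handoff(workdir: Path, session_id: int, done: str, next_step: str) -> None:
    """Append a handoff note at the end of a session."""
    entry = {"session": session_id, "done": done, "next": next_step}
    with (workdir / PROGRESS_NAME).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def read_handoff(workdir: Path) -> Optional[dict]:
    """Return the most recent handoff note, or None on the very first session."""
    path = workdir / PROGRESS_NAME
    if not path.exists():
        return None
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return json.loads(lines[-1]) if lines else None


def demo() -> tuple[Optional[dict], Optional[dict]]:
    """Simulate two sessions that communicate only through the progress file."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        first = read_handoff(workdir)  # nothing has been written yet
        write_handoff(workdir, 1, "scaffolded app", "implement login")
        write_handoff(workdir, 2, "implemented login", "add chat view")
        return first, read_handoff(workdir)
```

Appending rather than overwriting keeps the full history available, while a new session only needs the last entry to know where to pick up.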
Search Interest
- Nascent: 0–7 days
- Emergent: 8–30 days
- Validating: 31–90 days
- Rising: 91–180 days
- Established: 180+ days (current stage)
Why is it emerging now?
Frontier models can now sustain 25–52 hour autonomous coding sessions with proper harnesses. Anthropic's November 2025 engineering post established the canonical two-agent pattern; Cursor's February 2026 research preview demonstrated it at production scale (151k-line PRs). Three forces converge: models capable enough to stay coherent, harness patterns proven at scale, and token costs low enough to run for days.
Outlook
6-month signal projection and commercial timeline.
Anthropic and Cursor have both shipped production-grade harnesses, which points to rapid platform standardization over the next 6 months.
Risk · Longer context windows (>2M tokens) could shrink the 'cross-session' problem before harness patterns solidify.
Analogs · serverless · containers · CI/CD pipelines
- Now: Harness tooling gap open. Harness frameworks, context-management SDKs, and observability tools serve teams deploying agents now.
- 3–6 months: Platform and SaaS layer. Hosted harness-as-a-service products and per-session pricing models emerge as the category matures.
- 6–12 months: Enterprise evaluation and audit. Compliance, cost-control, and audit tools become essential as enterprises run 24h+ autonomous agent sessions.
Ideas for term “Long-running Agents”
Buildable pitches — turn this term into an article, site, product, post, newsletter, video, or course. Steal any card and run with it.
High-intent comparison query with clear task-decomposition angle. Ranks for developers evaluating multi-session architecture.
Tutorial targeting Anthropic's engineering blog pattern; strong long-tail on 'claude progress.txt' and 'feature spec JSON' sub-queries.
Comparison article capturing 'what framework should I use' intent from teams starting new agentic projects.
Targets the exact gap Anthropic's blog exposed: teams hand-rolling progress files, feature specs, and git-commit logic.
Every 30h+ run incurs hundreds of dollars; teams need cost visibility and early warning on context drift before sessions fail.
Genuine niche with no existing newsletter: Anthropic + Cursor publishing technical harness posts weekly, builders hungry for synthesis.
First-person experiment format; shareable for its specific runtime claim and honest failure/recovery narrative.
Teachable skill: harness architecture isn't obvious from docs alone. $149–299 workshop on Maven or Gumroad targets builders already deploying agents.
A 52-hour Claude coding session costs north of $500. Most teams shipping long-running agents in 2026 know this and deploy anyway.
In November 2025 Anthropic published a 1,200-word engineering post about a 'claude-progress.txt' file. By April 2026 that file pattern had spawned three GitHub stars, two SaaS products, and an InfoQ feature.
The core problem of long-running agents is brutal in its simplicity: the model forgets everything after every session. We ran 15 experiments to find what actually works.
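The cost-visibility idea in the cards above can be sketched as a simple budget monitor. Everything here is an assumption for illustration: the class name, the warning threshold, and especially the per-token prices, which are placeholders and not any vendor's real rates.

```python
from dataclasses import dataclass


@dataclass
class BudgetMonitor:
    """Tracks cumulative spend for one long-running session."""

    budget_usd: float
    usd_per_million_input: float   # assumed price, not a real rate card
    usd_per_million_output: float  # assumed price, not a real rate card
    warn_fraction: float = 0.8     # warn once 80% of the budget is used
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> str:
        """Add one model call's usage and return 'ok', 'warn', or 'stop'."""
        self.spent_usd += (
            input_tokens * self.usd_per_million_input
            + output_tokens * self.usd_per_million_output
        ) / 1_000_000
        if self.spent_usd >= self.budget_usd:
            return "stop"
        if self.spent_usd >= self.warn_fraction * self.budget_usd:
            return "warn"
        return "ok"
```

A harness could call `record` after every model response and end the session cleanly on `"stop"`, instead of discovering the bill after a multi-day run.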
FAQ
What are long-running agents?
Long-running agents are AI agents designed to sustain work across multiple context windows, persisting state through structured artifacts — progress files, git commits, feature specs — so each new session resumes where the last ended.
Why are long-running agents emerging now?
Frontier models can now sustain 25–52 hour autonomous coding sessions with proper harnesses. Anthropic's November 2025 engineering post established the canonical two-agent pattern; Cursor's February 2026 research preview demonstrated it at production scale (151k-line PRs). Three forces converge: models capable enough to stay coherent, harness patterns proven at scale, and token costs low enough to run for days.
When did long-running agents emerge?
Publicly emerged around 2025-11-26 (about 202 days ago as of 2026-06-16). EarlyTerms first recorded a pipeline signal on 2026-04-30.
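The day count above can be checked with a few lines of date arithmetic. This only reproduces the figure from the dates stated in the answer.

```python
from datetime import date

emerged = date(2025, 11, 26)       # public emergence date from the answer above
first_signal = date(2026, 4, 30)   # first recorded pipeline signal
as_of = date(2026, 6, 16)          # report date

days_since_emergence = (as_of - emerged).days
days_until_signal = (first_signal - emerged).days

print(days_since_emergence)  # 202
print(days_until_signal)     # 155
```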
Related Terms
Other terms in the same space — aliases, subtypes, competitors, and neighbors to explore next.
- Part of agent-harness An agent harness is the middleware between a large language model and the real world — code that runs the agent loop, calls tools,… →
- Part of coding-agents Coding Agents is the category name for AI developer tools that act on code autonomously — reading a repo, planning a change, editing… →
- Related managed-agents Managed Agents is an infrastructure paradigm where cloud platforms host, orchestrate, and operate AI agents as a service. →
- Related context-engineering Context engineering is the discipline of curating every token that enters an LLM's context window — system prompt, tools, retrieved… →
- Related context-rot Context rot is the measurable degradation in large-language-model output quality as input length grows, even when the prompt stays well… →
- Related agent-loop An agent loop is the control-flow pattern at the center of every autonomous LLM agent: the model observes its context, reasons about… →
- Related parallel-agents Parallel Agents is the pattern of running multiple AI coding sessions at the same time against isolated copies of a codebase, with a… →
- Related agentic-coding Agentic coding is the software-development pattern where an autonomous AI agent plans, writes, tests, and iterates on code against a… →
- Related cloud-coding-agents Cloud coding agents are AI software engineering systems that execute development tasks inside remote, sandboxed cloud environments… →
- Related context-window A context window is the span of tokens an LLM reads and reasons over in a single forward pass. →
Sources
Primary URLs this report cites — open any to verify the claim yourself.
- 01 Anthropic Engineering — Effective Harnesses for Long-Running Agents (Nov 26, 2025) anthropic.com ↗
- 02 Anthropic Engineering — Harness Design for Long-Running Application Development (Mar 24, 2026) anthropic.com ↗
- 03 Cursor — Expanding our Long-Running Agents Research Preview (Feb 12, 2026) cursor.com ↗
- 04 Cursor — Scaling Long-Running Autonomous Coding (Jan 14, 2026) cursor.com ↗
- 05 Hacker News — Effective harnesses for long-running agents (125 points) news.ycombinator.com ↗
- 06 Amplify Partners — How Hightouch Built Their Long-Running Agent Harness (Jan 20, 2026) amplifypartners.com ↗
- 07 Addy Osmani — Long-running Agents (Apr 28, 2026) addyosmani.com ↗