Your Agent's Biggest User Is a Cron Job

We have started to notice something about AI agents in production: the number of requests that do not originate from humans keeps increasing and will probably overtake human requests soon.

By QuantumFabrics
agents, production, ambient, observability

We have started to notice something interesting about the AI agents we run in production. Over time, the number of requests that do not originate from humans keeps increasing, and it will probably overtake human requests soon. Where do those requests come from? Event triggers, sync jobs, automations, subagents, webhook receivers. Agents are increasingly becoming what people now call ambient, or background, agents.

Most thinking about agents still assumes a human at a keyboard. The interfaces we ship reinforce it — chat bubbles, streaming responses, prompt engineering as the dominant skill. Production looks different.

The Inventory

A typical production AI deployment we ship includes things like:

  • Cron-triggered cloud functions for cleanup, garbage collection, expired checkpoints, scheduled health probes
  • Container jobs on schedules — weekly digests, batch reprocessing, recovery sweeps, long-running cleanups that don't fit in a 15-minute serverless budget
  • Webhook receivers — inbound email through Microsoft Graph or Gmail, Slack events, calendar invites, payment events, source-control hooks
  • System-callable HTTP endpoints (/recover, /resume, /cleanup, /replay) invoked by ambient pollers, monitoring systems, or other services, not browsers
  • Agent-to-agent calls — a parent agent spawning a specialist subagent (evaluator, reviewer, summarizer) with a narrower toolset. The subagent's "user" is the parent agent. No human in the conversation at all.
  • Background workers pulling from queues — SQS, Service Bus, Pub/Sub. Throughput pattern is constant, not bursty.
  • Auto-triggered flows — reminders, follow-ups, stale-state pollers. These typically POST to the agent's chat endpoint as a system user, simulating a conversation no one is having.

A production agent with all of these is normal, not exotic. The chat route is one path. The ambient paths usually outnumber it by an order of magnitude.

Why This Is Happening

The trend lines all point the same direction:

  • Anthropic Economic Index, March 2026: 99.9th-percentile Claude Code turn duration nearly doubled — from under 25 minutes to over 45 minutes — in three months. "Experienced users auto-approve more and interrupt less."
  • Gartner (Aug 2025): 40% of enterprise apps will embed task-specific AI agents by end of 2026, up from under 5% in 2025. Most of those won't be chat — they'll be triggered by events inside the app.
  • HUMAN Security 2026 State of AI Traffic: "Agentic bots performing autonomous tasks" appeared as a new category at 1.7% of AI-driven traffic in 2025. The slope from zero to one is the point.
  • LangChain coined the term "ambient agents" in January 2025 — agents that listen to event streams and act on multiple events at a time.
  • a16z: AI agents already outnumber humans 100-to-1 in financial systems. Framing has shifted from "human in the loop" to "human on the loop."

Building for Headless

When you build for ambient, every assumption that was made for chat breaks. The agent has to be prepared for headless interaction — special instructions, guardrails, context engineering, the works.

Latency. Chat tolerates 5 seconds. Ambient runs operate on hour-long timescales. Replica timeouts in the hour range are normal. Streaming responses are pointless if no one's reading. Checkpoint persistence becomes mandatory, not optional — the process running the turn may not be the process that finishes it.
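
The checkpoint requirement can be sketched as follows. This is a minimal illustration, assuming a JSON file as the store and hypothetical step names; a real deployment would more likely use a database or the checkpointer that ships with its agent framework.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # atomic rename within one filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if not os.path.exists(path):
        return {"done": []}
    with open(path) as f:
        return json.load(f)


def run_steps(path, steps, handlers):
    """Run each step once, skipping steps already recorded as done."""
    state = load_checkpoint(path)
    for name in steps:
        if name in state["done"]:
            continue  # finished by an earlier process
        handlers[name]()
        state["done"].append(name)
        save_checkpoint(path, state)  # persist after every step
    return state
```

If the process dies partway through, a later process calling run_steps with the same path picks up at the first unfinished step. A step that crashed before its checkpoint was saved will run again, so individual steps still need to be safe to repeat.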

Auth. A chat user authenticates as themselves. An ambient caller is a workload — a service principal, OAuth client credentials, a managed identity, a signed AgentCard in the emerging A2A pattern. The "user" object the agent sees is a system identity. Authorization rules built around end-user permissions need a separate code path, or they break in surprising ways.
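
One way to keep the two authorization paths apart is sketched below. The Caller type, the scope names, and the ownership rule are all hypothetical; the point is only that workload callers are checked against explicit scopes instead of being pushed through end-user ownership rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Caller:
    kind: str      # "user" or "workload"
    subject: str   # user id or service-principal id
    scopes: frozenset = field(default_factory=frozenset)


# Hypothetical mapping from action to the scope a workload must hold.
WORKLOAD_SCOPES = {"cleanup": "agent.cleanup", "replay": "agent.replay"}


def authorize(caller, action, resource_owner=None):
    """Return True if the caller may perform the action."""
    if caller.kind == "user":
        # End-user path: users act only on resources they own.
        return resource_owner == caller.subject
    if caller.kind == "workload":
        # Workload path: no ownership check, an explicit scope is required.
        required = WORKLOAD_SCOPES.get(action)
        return required is not None and required in caller.scopes
    return False  # unknown caller kinds are denied
```

Denying unknown caller kinds by default keeps a new ambient path from silently inheriting user-style permissions.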

Rate limiting and idempotency. A human types one message at a time. A cron loop will hammer the agent if you let it. MAX_CONCURRENT, MIN_INTERVAL_MS, cooldown windows: survival code, not best practices. Idempotency stops being an optimization: if a job fires twice (network blip, container restart, retry), the combined effect must be the same as if it had fired once.
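
A minimal in-process sketch of the throttling side, assuming the MAX_CONCURRENT and MIN_INTERVAL_MS names from the text with made-up values. The Throttle class is hypothetical, and a multi-replica deployment would need a shared store (a database row or a distributed lock) rather than process memory.

```python
import threading
import time

MAX_CONCURRENT = 2      # assumed value, for illustration only
MIN_INTERVAL_MS = 500   # assumed minimum gap between starts of one job


class Throttle:
    def __init__(self, max_concurrent=MAX_CONCURRENT,
                 min_interval_ms=MIN_INTERVAL_MS, clock=time.monotonic):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._min_interval = min_interval_ms / 1000.0
        self._last_start = {}
        self._lock = threading.Lock()
        self._clock = clock

    def try_acquire(self, job_name):
        """Return True if the job may start now; never blocks."""
        with self._lock:
            now = self._clock()
            last = self._last_start.get(job_name)
            if last is not None and now - last < self._min_interval:
                return False  # same job started too recently
            if not self._slots.acquire(blocking=False):
                return False  # all concurrency slots are busy
            self._last_start[job_name] = now
            return True

    def release(self):
        """Free the slot taken by a successful try_acquire."""
        self._slots.release()
```

A caller that gets False should skip the run or reschedule it instead of spinning; release belongs in a finally block so a failed run does not hold its slot forever.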

Error recovery. A chat user clarifies or retries. An ambient run has no human to do either. The agent has to self-correct in the same run, escalate to a channel a human watches (Slack notification, email, dashboard alert), or fail closed with enough context for an on-call engineer to diagnose the problem without the original "user" who triggered it. LangChain's Notify / Question / Review patterns are exactly this: structured re-entry points where the agent is the primary actor.
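
The three options can be combined in one wrapper, sketched here under assumptions: run_unattended, the shape of the report, and the notify callback are all hypothetical, and notify stands in for whatever posts to Slack, email, or a dashboard.

```python
import traceback


def run_unattended(task, notify, trigger, max_attempts=3):
    """Retry within the run; on final failure, escalate and fail closed.

    task receives the list of earlier errors so it can adjust its approach.
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            return {"status": "ok", "result": task(errors)}
        except Exception as exc:
            errors.append({
                "attempt": attempt,
                "error": repr(exc),
                "trace": traceback.format_exc(),
            })
    # No side effects are attempted past this point: fail closed.
    report = {"status": "failed", "trigger": trigger, "errors": errors}
    notify(report)  # escalate with enough context to diagnose later
    return report
```

Recording the trigger in the report matters because the engineer reading the alert has no conversation to scroll back through.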

Observability. No user to file a ticket. Telemetry is the only way you find out something broke. OpenTelemetry's GenAI semantic conventions are the emerging standard. Custom run hierarchy trackers, per-tool resilience metrics, structured event streams — not optional. They're the entire visibility surface.
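
As a rough illustration of a structured event stream with a run hierarchy, the sketch below writes JSON lines using only the standard library rather than the OpenTelemetry SDK. The gen_ai.* and error.type keys follow attribute names from the OpenTelemetry semantic conventions as we understand them; those conventions are still evolving, so check the current specification. The run.id, run.parent_id, and trigger.kind keys are our own hypothetical additions.

```python
import json
import time
import uuid


def make_event(operation, model, trigger_kind, parent_run_id=None,
               input_tokens=0, output_tokens=0, error=None):
    """Build one structured event describing a model or agent call."""
    return {
        "timestamp": time.time(),
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "error.type": error,
        # Hypothetical custom attributes for the run hierarchy:
        "run.id": uuid.uuid4().hex,
        "run.parent_id": parent_run_id,
        "trigger.kind": trigger_kind,
    }


def emit(event, stream):
    """Write the event as one JSON line to any file-like object."""
    stream.write(json.dumps(event) + "\n")
```

Passing the parent's run.id into each subagent event is what lets a dashboard reconstruct which cron job or webhook ultimately caused a given model call.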

Cost. Bursty becomes constant. Ambient runs always finish — that's what they were scheduled to do. Daily cleanups, every-minute syncs, weekly digests add up to a steady baseline. Budgeting for ambient looks more like infrastructure capacity planning than usage forecasting.
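
The capacity-planning arithmetic is simple enough to write down. In this sketch the schedules match the examples above, but the tokens-per-run figures are invented purely for illustration.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 runs per week for an every-minute job

# (job name, runs per week, assumed tokens per run); token figures are made up
JOBS = [
    ("every-minute sync", MINUTES_PER_WEEK, 2_000),
    ("daily cleanup", 7, 50_000),
    ("weekly digest", 1, 400_000),
]


def weekly_token_baseline(jobs):
    """Sum the tokens consumed per week by all scheduled jobs."""
    return sum(runs * tokens for _, runs, tokens in jobs)


def share_of_baseline(jobs, name):
    """Fraction of the weekly baseline attributable to one job."""
    total = weekly_token_baseline(jobs)
    own = sum(runs * tokens for job, runs, tokens in jobs if job == name)
    return own / total
```

With these made-up figures the baseline is 20,910,000 tokens per week, and the every-minute sync accounts for over 96% of it even though it is the cheapest job per run. Frequency, not per-run size, tends to dominate the bill.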

Output. No UI. Outputs go to Slack channels, email, internal dashboards, webhook payloads to downstream services, or another agent's input. Markdown that renders in chat isn't necessarily what Slack needs or what a dashboard ingests. Output formatting becomes infrastructure.
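
One way to treat formatting as infrastructure is to have the agent produce a structured result and render it per destination. The sketch below assumes a hypothetical result shape with a title and a list of items, and three hypothetical sinks.

```python
import json


def render(result, sink):
    """Render one structured result for a specific destination."""
    title, items = result["title"], result["items"]
    if sink == "slack":
        # Slack's mrkdwn uses single asterisks for bold, unlike Markdown.
        return "\n".join([f"*{title}*"] + [f"• {item}" for item in items])
    if sink == "email":
        # Plain text with an underlined heading.
        lines = [title, "=" * len(title)] + [f"- {item}" for item in items]
        return "\n".join(lines)
    if sink == "webhook":
        # Machine-readable payload for a downstream service.
        return json.dumps({"title": title, "items": items}, sort_keys=True)
    raise ValueError(f"unknown sink: {sink}")
```

Keeping the model's output structured and doing the rendering in code means a new destination is a new branch, not a new prompt.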

Security. Every ambient path is a non-human identity (NHI) holding credentials — a service principal, an OAuth client, a managed identity. Sophos State of Identity Security 2026: only 15% of organizations are confident they can prevent NHI-based attacks. Industry sources estimate 97% of NHIs have excessive privileges, 71% are un-rotated, and 40.6% of breaches in 2026 tie to weak NHI management. Chat security is mostly about user-input attacks. Ambient security is about credentialed callers — rotation, scope, audit, anomaly detection on identity behavior.

What to Build In

When no human is present, you have to prepare the agent for headless interaction explicitly. A few patterns that consistently matter:

Special instructions for the system user. A cron-triggered "conversation" needs different prompting than a chat-driven one. The agent needs to know it's running unattended — no clarifying questions, no "let me know if you want me to proceed", no streaming preambles.
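
A minimal sketch of selecting the prompt by trigger type. The prompt wording, the BASE_PROMPT text, and the trigger_kind values are hypothetical; the exact instructions would depend on the agent.

```python
BASE_PROMPT = "You are an operations agent."  # hypothetical base prompt

UNATTENDED_RULES = (
    "You are running unattended; no human will read this conversation. "
    "Do not ask clarifying questions. "
    "Do not ask for confirmation before proceeding. "
    "If required information is missing, stop and report what is missing."
)


def build_system_prompt(trigger_kind):
    """Return the system prompt for a chat turn or an unattended run."""
    if trigger_kind == "chat":
        return BASE_PROMPT
    # Anything that is not interactive chat is treated as unattended.
    return BASE_PROMPT + "\n\n" + UNATTENDED_RULES
```

Defaulting every non-chat trigger to the unattended rules means a newly added cron job or webhook gets the safer behavior without anyone remembering to opt in.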

Guardrails by default. A chat user clarifies if the agent is about to do something risky. Ambient callers can't. Build hard guardrails into the prompts, tools, and runtime — not just into the UI.
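
At the runtime layer, a guardrail can be a check that runs before every tool call, regardless of what the prompt says. The policy table, tool names, and batch limit below are hypothetical.

```python
class GuardrailViolation(Exception):
    """Raised when a tool call is not permitted for this trigger."""


# Hypothetical policy: which tools each trigger kind may call.
ALLOWED_TOOLS = {
    "chat": {"search", "send_email", "delete_records"},
    "cron": {"search", "delete_records"},
    "webhook": {"search"},
}
MAX_DELETES_PER_CALL = 100  # assumed cap on one destructive call


def check_tool_call(trigger_kind, tool, args):
    """Raise GuardrailViolation unless the call is within policy."""
    allowed = ALLOWED_TOOLS.get(trigger_kind, set())  # unknown kinds get nothing
    if tool not in allowed:
        raise GuardrailViolation(f"{tool} not allowed for {trigger_kind}")
    if tool == "delete_records" and len(args.get("ids", [])) > MAX_DELETES_PER_CALL:
        raise GuardrailViolation("delete batch exceeds limit")
```

Because the check lives in the tool dispatch path, it applies equally to chat, cron, and subagent callers, which is the difference between a runtime guardrail and a confirmation dialog in the UI.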

Context engineering. Streaming text is wasted on a queue consumer. State persistence is mandatory. Checkpointing isn't optional.

Workload identity from day one. Don't bolt service-principal auth on later. The auth ladder for ambient callers is different from human-user auth and should be a separate code path.

Idempotency keys, dedupe windows, transactional outbox. Cron loops retry. Webhooks deliver twice. Pick one before you ship.
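
A sketch of how an idempotency key and a transactional outbox fit together, using SQLite from the standard library. The table names and the send callback are hypothetical, and a production system would use its primary database so the dedupe record and the outbox row share a transaction with the business data.

```python
import sqlite3


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS processed (event_id TEXT PRIMARY KEY)")
    db.execute(
        "CREATE TABLE IF NOT EXISTS outbox "
        "(id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)"
    )
    db.commit()
    return db


def handle_event(db, event_id, payload):
    """Record the event and queue its side effect in one transaction.

    Returns False if this event_id was already handled.
    """
    try:
        with db:  # commits on success, rolls back on exception
            db.execute("INSERT INTO processed (event_id) VALUES (?)", (event_id,))
            db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: primary key already exists


def drain_outbox(db, send):
    """Send every unsent outbox row, marking each as sent afterwards."""
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        send(payload)
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

The drain step gives at-least-once delivery: a crash between send and the UPDATE causes a resend, so the receiving side should deduplicate as well.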

Audit your NHIs. Every workload that calls the agent is an identity. Rotate, scope, monitor.

Why It Matters

The headline isn't a specific ratio. It's that the ratio shifts before anyone planned for it. Each ambient path gets added in response to a real need — a cron job here, a webhook there, a subagent for a specialized task. By the time someone counts, the agent is already running mostly headless.

We designed agents for chat. Production runs them headless. The faster you stop designing for the human at the keyboard, the closer your architecture gets to the one your traffic actually needs.


Sources: