The operating system for
autonomous agents.
LLMs are stateless. Real work requires process lifecycles, trigger scheduling, persistent state, crash recovery, governance enforcement, and distributed coordination. Not another wrapper with a loop — an actual operating system.
async with AgentRuntime() as runtime:
    await runtime.add_process(
        "pipeline-monitor",
        ProcessConfig(
            model="openai:gpt-5-mini",
            triggers=[
                CronTrigger("*/5 * * * *"),
                WebhookTrigger("/alert", verify_hmac=True),
                FileWatchTrigger("/data/*.csv"),
            ],
            restart_policy="on_failure",
            budget=AutonomyBudget(
                max_tool_calls_per_invocation=10,
                max_daily_cost=50.0,
            ),
        ),
    )
    await runtime.start_all()

What is the Agent Runtime?
Your agent is a function.
The runtime makes it a service.
Without a runtime, an agent only runs when you call it. It has no memory between runs, no schedule, no crash recovery, no budget limits. The Agent Runtime wraps any Promptise agent in a persistent process that runs autonomously — triggered by events, governed by rules, recoverable from failures.
Without runtime
You call agent.ainvoke() manually. It runs once and stops. No state between calls. No triggers. No governance. If it crashes, it's gone.
With runtime
The agent becomes a process — a persistent container with a lifecycle, trigger queue, state store, conversation buffer, and governance engine. It runs until you tell it to stop.
What it gives you
Cron schedules, webhook listeners, crash recovery journals, budget enforcement, health monitoring, mission tracking, secret scoping, and a 37-endpoint REST API to manage everything.
AgentProcess = PromptiseAgent + triggers + state + governance + journal
RUNTIME ARCHITECTURE
Watch the system come alive.
19 actors across 5 layers — triggers, process lifecycle, agent execution, governance, and management. Data flows along connections as the cycle progresses.
Real work means an agent that wakes up at 6am, checks your data pipelines, discovers an anomaly, cross-references it against yesterday's deployment, writes a report, posts it to Slack, and goes back to sleep — without anyone watching.
Agents that wake when the world needs them.
A stateless agent only runs when you call it. A runtime agent runs when the world needs it to. Five trigger types compose with OR logic — any trigger can wake the process. Events queue in arrival order.
Cron Schedules
Every 5 minutes, every Monday at 9am, first of the month. Standard cron syntax.
CronTrigger("0 9 * * MON")

Webhooks
GitHub pushes, Stripe payments, PagerDuty alerts. Optional HMAC signature verification.
WebhookTrigger("/github", verify_hmac=True)

File Watchers
Glob patterns on directories. Filter by event type. Process new data as it arrives.
FileWatchTrigger("/data/*.csv")

Event Subscriptions
React to other agents in the same runtime. Topic-based with wildcard support.
EventTrigger("pipeline.*")

Inter-Agent Messages
Decoupled coordination between processes. Type-safe message passing.
MessageTrigger(from_agent="coordinator")

One process, three triggers. Health checks + webhooks + file watchers.
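Event subscriptions support wildcard topics like "pipeline.*". A minimal plain-Python sketch of how that kind of topic filtering could work, using glob-style matching via fnmatch — the runtime's actual matching semantics are an assumption here:

```python
from fnmatch import fnmatch

def topic_matches(pattern: str, topic: str) -> bool:
    """Glob-style topic matching: '*' matches any suffix segment."""
    return fnmatch(topic, pattern)

# A trigger subscribed to "pipeline.*" wakes on any pipeline event
print(topic_matches("pipeline.*", "pipeline.failed"))    # matches
print(topic_matches("pipeline.*", "deploy.finished"))    # does not match
```

An event bus would run each published topic through every subscriber's pattern and enqueue the event for whichever processes match.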
State that survives crashes, restarts, and days of downtime.
Every agent process maintains persistent state — key-value data with full mutation history, long-term memory, environment variables, and file mounts. The state is injected into every invocation.
Two backends: InMemoryJournal (testing) and FileJournal (production — JSONL, append-only, checkpoint files). Every state transition, trigger event, and invocation result recorded. ReplayEngine reconstructs from last checkpoint.
Stop an agent. Restart days later. It resumes exactly where it left off — context intact, counters accurate, missed triggers replayed.
# Every process gets an AgentContext
ctx = process.context
# Read state
last_run = ctx.state.get("last_run_time")
error_count = ctx.state.get("error_count", 0)
# Write state (timestamped, attributed)
ctx.state.set("last_run_time", datetime.now())
ctx.state.set("error_count", error_count + 1)
# Access long-term memory
memories = ctx.memory.search(
"previous database migrations"
)
# Full mutation history
history = ctx.state.history("error_count")
# → [(timestamp, value, source), ...]

Agents with a budget they cannot exceed.
Without limits, an autonomous agent can loop indefinitely, call expensive APIs all night, or take irreversible actions while everyone sleeps. The autonomy budget defines the envelope within which the agent operates freely.
Per-Invocation
Daily Limits
On Limit Exceeded
pause: Suspend for human review
stop: Terminate immediately
escalate: Webhook to your team
budget = AutonomyBudget(
    max_tool_calls_per_invocation=10,
    max_daily_cost=50.0,
    tool_costs={
        "search_database": 0.01,  # Cheap
        "send_email": 0.10,       # Moderate
        "process_payment": 5.00,  # Expensive - 5x weight
    },
    on_budget_exceeded="escalate",
    warning_threshold=0.8,
)

Agents that detect their own problems.
System monitoring watches CPU and memory. Nobody watches whether the agent is actually doing useful work. Behavioral health monitoring catches the problems that infrastructure monitoring misses — without making any LLM calls.
Stuck Detection
The agent called the same function with identical arguments five times in a row — it is stuck.
Loop Detection
The agent is cycling through check → fix → check → fix endlessly — it is in a loop.
Empty Response
The agent returned three consecutive responses below 10 characters — something is broken.
Error Rate Spike
Six out of the last ten tool calls failed — a dependency is degrading.
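The four checks above need nothing more than a sliding window over recent tool calls. A plain-Python sketch of stuck detection and error-rate spikes — thresholds, names, and signal strings here are illustrative assumptions, not the runtime's internals:

```python
from collections import deque

class HealthMonitor:
    """Behavioral checks over call history; no LLM calls involved."""

    def __init__(self, stuck_threshold=5, error_window=10, error_rate=0.5):
        self.calls = deque(maxlen=stuck_threshold)    # recent (tool, args)
        self.results = deque(maxlen=error_window)     # recent success flags
        self.error_rate = error_rate

    def record(self, tool: str, args: tuple, ok: bool) -> list[str]:
        self.calls.append((tool, args))
        self.results.append(ok)
        signals = []
        # Stuck: same tool with identical arguments N times in a row
        if len(self.calls) == self.calls.maxlen and len(set(self.calls)) == 1:
            signals.append("stuck")
        # Error spike: failure rate over the full sliding window
        if len(self.results) == self.results.maxlen:
            if self.results.count(False) / len(self.results) >= self.error_rate:
                signals.append("error_rate_spike")
        return signals
```

Loop detection would extend the same idea: instead of one repeated call, look for a short repeating subsequence (check → fix → check → fix) in the call window.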
Agents with a mission they complete.
Standard agents run on a trigger, do something, stop. They have no concept of progress, no definition of done. A mission-oriented agent runs until a goal is achieved.
mission = MissionConfig(
    objective="Migrate all database tables to v2",
    success_criteria="All tables pass v2 validation",
    # Progress evaluation
    check_interval=5,          # Every 5 invocations
    confidence_threshold=0.8,  # Escalate if < 80%
    # Limits
    max_invocations=100,
    max_duration="4h",
    # Outcomes
    on_complete="notify",
    on_stuck="escalate",
    on_timeout="fail_gracefully",
)

# Mission agent accumulates context
# Never forgets what it's working toward
# Knows when it's done

Credentials that self-destruct.
Environment variables are shared across all agents. Secret scoping gives each process its own isolated credential context.
Isolated Scope
Each process gets its own credential context via ${ENV_VAR} resolution
TTL Expiry
get() returns None when TTL expires. Expired secrets pruned lazily on next access
Zero-Fill Revocation
All secret values overwritten with zeros on process stop. Never serialized to journal
Audit Trail
Every access logged to journal (name only, never value). Rotation without restart
Agents that evolve at runtime.
Most agent frameworks are static — define it, deploy it, it does exactly what you told it forever. With Open Mode, agents adapt within guardrails. Fourteen meta-tools for self-modification, coordination, and introspection.
rewrite_instructions: Update own instructions based on experience
create_tool: Create new Python tools at runtime
connect_mcp: Connect to additional MCP servers
modify_triggers: Add or remove own triggers
spawn_process: Create new agent processes in runtime
send_message: Message other agents directly
broadcast_event: Publish to event bus
request_approval: Pause for human approval
store_memory: Save to long-term memory
search_memory: Semantic search over past experiences
manage_memory: Update or delete memories
inspect_state: Read own context and history
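As a rough illustration of what create_tool-style runtime tool creation could look like — the registry, validation, and names here are assumptions, not the runtime's actual mechanism:

```python
class ToolRegistry:
    """Illustrative registry: an agent submits Python source, the runtime
    compiles it and exposes the resulting function as a callable tool."""

    def __init__(self):
        self.tools = {}

    def create_tool(self, name: str, source: str):
        namespace = {}
        exec(compile(source, f"<tool:{name}>", "exec"), namespace)
        fn = namespace.get(name)
        if not callable(fn):
            raise ValueError(f"source must define a function named {name!r}")
        self.tools[name] = fn

registry = ToolRegistry()
registry.create_tool("row_count", "def row_count(rows):\n    return len(rows)")
```

In a real system this is exactly the step guardrails would wrap: source validation, sandboxing, and a request_approval gate before the new tool goes live.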
Talk to your agents while they work.
Autonomous agents today are fire-and-forget. You deploy them, they run, you watch from the outside. With the message inbox, you communicate without interrupting operation.
"Ignore staging alerts — we're deploying."
"What have you found so far?"
"The error spike is expected — marketing campaign."
# Send message to running agent
await runtime.send_message(
    process_id="incident-responder",
    message="Ignore staging alerts, we're deploying",
    priority="high",
    ttl=timedelta(hours=2),
)

# Ask a question
response = await runtime.ask(
    process_id="data-analyst",
    question="What anomalies did you find today?",
    timeout=timedelta(minutes=5),
)
print(response.answer)

# Agent checks inbox between invocations
# Incorporates messages as context
# No restart. No code change. No interruption.

Manage your entire fleet without touching code.
37 REST endpoints. Deploy, start, stop, update, monitor. Bearer token auth. Full Python SDK with method-level docs. Scale to multi-node with RuntimeCoordinator — StaticDiscovery or RegistryDiscovery for node topology, HTTP health checks, no etcd or Consul required.
Lifecycle
Config
Observe
Interact
Define everything in YAML.
One manifest file defines the entire agent. Validate before deployment. Version control alongside your code. Deploy from the command line. No Python required for operations.
$ promptise deploy pipeline-monitor.agent

name: pipeline-monitor
model: openai:gpt-5-mini
instructions: |
  Monitor data pipelines. Alert on anomalies.
  Cross-reference with recent deployments.
triggers:
  - cron: "*/5 * * * *"
  - webhook: /alert
    verify_hmac: true
budget:
  max_tool_calls_per_invocation: 10
  max_daily_cost: 50.0
  on_exceeded: escalate
health:
  stuck_threshold: 5
  loop_detection: true
  error_rate_window: 10
restart_policy: on_failure
max_consecutive_failures: 3

Positioning
This isn't Kubernetes.
It's what runs inside it.
Kubernetes schedules containers. Airflow orchestrates data pipelines. Celery dispatches background jobs. None of them understand agents — processes that think, use tools, make decisions, and need governance. That's what the Runtime is for.
Cron + scripts
No state between runs. No error recovery. No cost control. No way to know if the agent is stuck or done.
Works for: one-off tasks
Celery / Airflow
Designed for data pipelines, not reasoning. No trigger types for agent events. No LLM cost governance. No behavioral health monitoring.
Works for: ETL, batch jobs
Promptise Runtime
Purpose-built for AI agents. Trigger types that agents need (cron, webhooks, file watch, event bus, inter-agent messages). Journals for crash recovery. Budgets for cost control. Health monitoring for stuck agents. Missions for goal-driven execution.
Works for: autonomous AI agents
Adoption path
Start with one cron agent.
Scale to a fleet.
One cron agent
Single AgentProcess with CronTrigger. Runs every hour, checks something, reports back. InMemoryJournal for dev.
Add persistence
Switch to FileJournal. Agent survives restarts. Add AgentContext for state across invocations.
Multiple agents
AgentRuntime manages 5 processes. EventBus for inter-agent communication. Budget enforcement prevents surprise costs.
Governance
Behavioral health monitoring catches stuck agents. Missions define success criteria. Secrets manager handles API keys with TTL.
Production fleet
Multi-node RuntimeCoordinator. Distributed health checks. Agent manifests in YAML. Dashboard monitors everything. Self-modifying agents with guardrails.
Resilience
Server crashes at 3am.
Agent picks up where it left off.
Every state transition, every trigger event, every invocation result is recorded in the journal. When the server restarts, the ReplayEngine reconstructs the exact state from the last checkpoint. No data loss. No duplicate work. No manual intervention.
[03:14:07] Server crashed (OOM kill)
[03:14:07] All processes → STOPPED
... server restarts ...
[03:14:42] RuntimeCoordinator starting
[03:14:42] Found 3 journals with checkpoints
[03:14:42] Process "monitor"
           → Last checkpoint: 03:12:55
           → Replaying 4 events...
           → State restored. Resuming.
[03:14:43] Process "analyst"
           → Last checkpoint: 03:13:10
           → Replaying 2 events...
           → State restored. Resuming.
[03:14:43] Process "reporter"
           → Last checkpoint: 03:14:01
           → Replaying 1 event...
           → State restored. Resuming.
[03:14:44] All 3 processes RUNNING
[03:14:44] Zero data loss. Zero duplicates.
Build agents that run forever.
And recover from anything.
Open source. Apache 2.0. Production-grade from day one.
$ pip install promptise

Next: Prompting