What nanny guarantees
When you run an agent under nanny, these three things are true:
- It will not take more steps than you allow.
- It will not spend more than your cost budget.
- It will not run longer than your timeout.
When any of these limits is reached, an ExecutionStopped event is emitted with the exact reason, and Nanny exits with a non-zero status code.
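The three guarantees correspond to three settings in nanny.toml. The sketch below is illustrative only: the key names (max_steps, max_cost_usd, timeout_seconds) are assumptions, not the confirmed schema — see the nanny.toml reference for the real field names.

```toml
# Hypothetical nanny.toml fragment; key names are assumptions,
# not the confirmed schema. See the nanny.toml reference.
[limits]
max_steps = 50          # stop after 50 agent steps
max_cost_usd = 2.00     # stop once spend reaches $2.00
timeout_seconds = 300   # stop after 5 minutes of wall-clock time
```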
What nanny is not
Nanny is not intelligent. It does not understand what your agent is doing, why it is doing it, or whether the result is good. It does not suggest better limits, summarise results, adapt based on context, or retry on failure. It is a primitive — a hard boundary you configure once and trust completely. This is intentional. The value comes from the guarantee: if you set a limit, it holds.
Who it is for
Nanny is for developers and teams running agents in production — or preparing to. It is a good fit if you:
- Are building multi-agent systems where different agents have different roles, tool access, and budget ceilings — and you need enforcement that fires per-role, not just globally
- Are running autonomous agents that call external tools, browse the web, or write to APIs
- Need hard guarantees that an agent cannot exceed a cost budget or run indefinitely
- Want a structured audit trail of every tool call and stop reason for every execution
- Want enforcement that is not tied to any agent framework — use CrewAI, LangChain, LangGraph, or any Python or Rust framework without lock-in
The multi-agent scenario
A fintech team builds a system where a manager agent spawns 12 specialists: one checks regulations, one pulls market data, one drafts reports. They deploy on Friday. One agent gets stuck looping on a market data API call over the weekend. The team has no per-role kill switch and no audit trail of which agent made which call. With Nanny, each specialist has its own named limit set in nanny.toml: the analysis agent activates [limits.analysis] when it runs, and the reporter activates [limits.reporter]. Each has its own tool allowlist — the analysis agent cannot call write_report, the reporter cannot call compute_stats. The moment any agent exceeds its ceiling or reaches for the wrong tool, Nanny stops it. The event log shows exactly which agent, which tool, which limit, and when.
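Concretely, the per-role configuration could look something like this. The section names and the write_report and compute_stats tool identifiers come from the scenario above; the keys (max_cost_usd, allowed_tools) and the fetch_market_data tool are illustrative assumptions, not the confirmed schema.

```toml
# Hypothetical per-role limits; key names and fetch_market_data
# are assumptions, not the confirmed schema.
[limits.analysis]
max_cost_usd = 1.50
allowed_tools = ["compute_stats", "fetch_market_data"]  # no write_report

[limits.reporter]
max_cost_usd = 0.50
allowed_tools = ["write_report"]                        # no compute_stats
```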
Scope today: This works for any multi-agent framework that runs agents within a single process — CrewAI, LangGraph, AutoGen, plain Python. See examples/python/metrics_crew for the complete working example. Cross-process and cross-machine fleet enforcement is the v0.1.6 cloud layer.
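The allowlist check at the heart of this scenario is simple enough to sketch in plain Python. This is a conceptual illustration of the enforcement model, not nanny's actual implementation; the role and tool names come from the scenario above.

```python
# Conceptual sketch of per-role tool allowlist enforcement.
# This is NOT nanny's implementation, only the model it describes:
# each role has an allowlist, and any call outside it is stopped.

class ToolNotAllowed(Exception):
    """Raised when an agent reaches for a tool outside its allowlist."""

ALLOWLISTS = {
    "analysis": {"compute_stats"},   # cannot call write_report
    "reporter": {"write_report"},    # cannot call compute_stats
}

def check_tool_call(role: str, tool: str) -> None:
    """Stop the call unconditionally if the tool is not allowlisted."""
    if tool not in ALLOWLISTS[role]:
        raise ToolNotAllowed(f"{role} attempted disallowed tool {tool}")

check_tool_call("analysis", "compute_stats")     # allowed, returns None

try:
    check_tool_call("analysis", "write_report")  # outside the allowlist
    blocked = False
except ToolNotAllowed:
    blocked = True                               # the call was stopped

print(blocked)  # True
```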
The nanny ecosystem
Nanny is designed to meet you where you are and grow with you.
Nanny CLI — The enforcement entry point. Governs any agent process in any language as its parent process supervisor. Install it once as a system tool and use nanny run from any project that has a nanny.toml with a [start] command configured.
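For example, a project could wire up the CLI like this. The [start] table is named in the text above; the command key inside it is an assumption about the schema, and agent.py is a hypothetical entry point.

```toml
# Hypothetical [start] section; the exact key name is an assumption.
[start]
command = "python agent.py"   # hypothetical agent entry point
```

With something like this in place, running nanny run from the project root would launch the agent under supervision.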
Rust SDK — Per-function governance inside Rust agents. Annotate functions with #[nanny::tool], #[nanny::rule], and #[nanny::agent] to get per-function cost accounting, allowlist enforcement, and custom rules. See the Rust SDK guide.
Python SDK — The same model as the Rust SDK, as Python decorators. @tool, @rule, @agent. Each agent in your fleet gets its own budget ceiling, tool allowlist, and custom rules. Works with LangChain, CrewAI, or any Python agent framework. See Python SDK.
Nanny Cloud (coming soon) — Durable audit logs, team dashboards, org-level budget aggregation, and managed enforcement across all your agents. The OSS runtime stays unchanged — Cloud is the observability and coordination layer above it.
Open source
The Nanny runtime is fully open source under the Apache 2.0 licence. Source code, issues, and contributions live at github.com/nanny-run/nanny. Cloud is the managed layer above the OSS primitive — not a replacement for it.
Next steps
Quickstart
Install nanny and run your first governed agent in under five minutes.
How it works
Understand the enforcement model and passthrough mode.
nanny.toml reference
Full schema for the configuration file.
Rust SDK guide
Per-function governance with
#[nanny::tool], #[nanny::rule],
#[nanny::agent].
Python SDK guide
Per-function governance with
@tool, @rule, @agent decorators. Works
with LangChain, CrewAI, and any Python agent framework.