The Python SDK brings the same enforcement model as the Rust SDK to Python — @tool, @rule, and @agent decorators that enforce limits per function call.
pip install nanny-sdk

Passthrough mode

When running outside nanny run, every decorator is a no-op. The function executes normally with no enforcement overhead:
# Governed — enforcement active (reads [start].cmd from nanny.toml)
nanny run

# Not governed — decorators silent, agent runs normally
python agent.py
uv run agent.py
This is safe to ship to production. The instrumentation only activates when nanny run is present.
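Conceptually, passthrough mode means each decorator hands back the original function untouched when no governor is present. A minimal sketch of the idea under assumptions: `sketch_tool` and the `NANNY_BRIDGE` environment variable are hypothetical names for illustration, not the SDK's actual implementation or detection mechanism.

```python
import functools
import os

def sketch_tool(cost: int):
    """Conceptual sketch of a passthrough-capable decorator (not the SDK source)."""
    def decorate(fn):
        # Hypothetical detection: a real SDK would check for its bridge,
        # e.g. state injected by `nanny run`. NANNY_BRIDGE is made up here.
        governed = os.environ.get("NANNY_BRIDGE") is not None
        if not governed:
            # Passthrough: return the original function, zero overhead.
            return fn
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # In governed mode the wrapper would consult the bridge here.
            return fn(*args, **kwargs)
        return wrapper
    return decorate

@sketch_tool(cost=10)
def fetch_page(url: str) -> str:
    return f"<html>{url}</html>"
```

Run outside the governor and `fetch_page` is the undecorated function itself, which is why shipping instrumented code to production costs nothing.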

@tool — declare a governed tool

Mark a function as a tool that Nanny should track and charge against the budget:
from nanny_sdk import tool

@tool(cost=10)
def fetch_page(url: str) -> str:
    import httpx
    return httpx.get(url).text
When the agent calls fetch_page:
  1. Nanny checks: is fetch_page in the [tools] allowed list?
  2. Nanny checks: has fetch_page exceeded [tools.fetch_page] max_calls?
  3. Nanny charges 10 cost units against the budget.
  4. If any check fails, a NannyStop exception is raised — the function body never runs.
Works identically for async functions:
@tool(cost=10)
async def fetch_page(url: str) -> str:
    import httpx
    async with httpx.AsyncClient() as client:
        return (await client.get(url)).text
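The check sequence above can be modeled in plain Python. This is a conceptual sketch of the enforcement order, not the SDK internals: `check_and_charge` and `NannyStopSketch` are hypothetical names, and `allowed`, `max_calls`, and the budget stand in for values read from nanny.toml.

```python
class NannyStopSketch(Exception):
    """Stand-in for the SDK's NannyStop."""

def check_and_charge(name, *, allowed, max_calls, cost, state):
    # 1. Allowlist check: is the tool in [tools] allowed?
    if name not in allowed:
        raise NannyStopSketch(f"tool denied: {name}")
    # 2. Per-tool call-count check against max_calls
    if state["calls"].get(name, 0) >= max_calls.get(name, float("inf")):
        raise NannyStopSketch(f"max_calls exceeded: {name}")
    # 3. Charge the tool's cost against the remaining budget
    if state["budget"] < cost:
        raise NannyStopSketch("budget exhausted")
    state["budget"] -= cost
    state["calls"][name] = state["calls"].get(name, 0) + 1

state = {"budget": 25, "calls": {}}
check_and_charge("fetch_page", allowed={"fetch_page"},
                 max_calls={"fetch_page": 2}, cost=10, state=state)
# state["budget"] is now 15; a second call leaves 5; a third raises.
```

If any check fails the exception is raised before the function body runs, matching step 4 above.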

Cost

The cost argument is required. Set it to 0 for tools you want tracked but not charged:
@tool(cost=0)
def log_step(msg: str) -> None: ...

Matching the tool allowlist

The tool name used for allowlist checks is the function name as declared in Python:
# nanny.toml
[tools]
allowed = ["fetch_page", "read_file"]

[tools.fetch_page]
max_calls     = 20
cost_per_call = 10   # nanny.toml cost overrides the decorator default
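The precedence rule can be sketched as a lookup: a per-tool `cost_per_call` in nanny.toml wins, otherwise the decorator's `cost` applies. `effective_cost` is a hypothetical helper for illustration, not an SDK function.

```python
def effective_cost(tool_name: str, decorator_cost: int, toml_tools: dict) -> int:
    """Sketch of the precedence described above: nanny.toml's
    cost_per_call, when present, overrides the decorator's cost."""
    per_tool = toml_tools.get(tool_name, {})
    return per_tool.get("cost_per_call", decorator_cost)

toml_tools = {"fetch_page": {"max_calls": 20, "cost_per_call": 10}}

effective_cost("fetch_page", 3, toml_tools)  # 10: nanny.toml override wins
effective_cost("read_file", 5, toml_tools)   # 5: decorator default, no override
```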

@rule — declare an enforcement rule

A rule is a function that returns a verdict on whether execution should continue. Return True to allow, False to deny:
from nanny_sdk import rule

@rule("no_spiral")
def check_spiral(ctx) -> bool:
    h = ctx.tool_call_history
    # Deny if the last three tool calls were all the same
    return not (len(h) >= 3 and len(set(h[-3:])) == 1)
Rules are evaluated client-side on every tool call, before the bridge is contacted. When a rule returns False, Nanny raises:
RuleDenied("no_spiral")
The denied tool never runs and no cost is charged.
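Because a rule is an ordinary function of its context, its logic is easy to exercise in isolation. A sketch using `SimpleNamespace` as a minimal stand-in for the context object (it is not the SDK's PolicyContext):

```python
from types import SimpleNamespace

def check_spiral(ctx) -> bool:
    # Same predicate as the no_spiral rule above
    h = ctx.tool_call_history
    return not (len(h) >= 3 and len(set(h[-3:])) == 1)

check_spiral(SimpleNamespace(tool_call_history=["a", "b", "a"]))       # True: allowed
check_spiral(SimpleNamespace(tool_call_history=["b", "a", "a", "a"]))  # False: last three identical
```

Keeping rules as pure predicates over the context makes them unit-testable without running the agent at all.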

PolicyContext fields

The ctx parameter gives you a snapshot of the current execution state:
| Field             | Type           | Description                               |
| ----------------- | -------------- | ----------------------------------------- |
| step_count        | int            | Steps completed so far                    |
| elapsed_ms        | int            | Wall-clock time elapsed                   |
| cost_units_spent  | int            | Total cost units spent                    |
| tool_call_counts  | dict[str, int] | Per-tool call counts                      |
| tool_call_history | list[str]      | Ordered log of tool names called          |
| requested_tool    | str \| None    | The tool being evaluated right now        |
| last_tool_args    | dict[str, str] | Arguments of the tool call being evaluated |
Rules are evaluated before the tool runs — requested_tool is set to the tool name being checked. Use last_tool_args for content-based enforcement:
@rule("no_sensitive_files")
def block_sensitive(ctx) -> bool:
    path = ctx.last_tool_args.get("path", "")
    return ".env" not in path and "secret" not in path
requested_tool and last_tool_args are always populated. The counter fields (step_count, tool_call_counts, tool_call_history, elapsed_ms) are coming in a future release.
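Content-based rules are just as testable in isolation. A sketch exercising the predicate above with `SimpleNamespace` standing in for the context (not the SDK's PolicyContext):

```python
from types import SimpleNamespace

def block_sensitive(ctx) -> bool:
    # Same predicate as the no_sensitive_files rule above
    path = ctx.last_tool_args.get("path", "")
    return ".env" not in path and "secret" not in path

block_sensitive(SimpleNamespace(last_tool_args={"path": "src/main.py"}))  # True: allowed
block_sensitive(SimpleNamespace(last_tool_args={"path": "config/.env"}))  # False: denied
block_sensitive(SimpleNamespace(last_tool_args={}))                       # True: no path argument
```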

@agent — activate named limits for a scope

In a multi-agent system, each agent has a different role and a different risk profile. The analysis agent makes expensive API calls and deserves a tight cost ceiling. The reporter just writes a file and barely needs a budget at all. @agent activates the right named limit set when each role runs, then reverts automatically when it’s done.
from nanny_sdk import agent

@agent("researcher")
def run_research(topic: str) -> list[str]:
    # Runs under [limits.researcher] from nanny.toml
    pages = [fetch_page(f"https://en.wikipedia.org/wiki/{topic}")]
    return pages
# nanny.toml
[limits.researcher]
steps   = 200
cost    = 5000
timeout = 120000
The named set inherits from [limits] and overrides only the declared fields. Works identically for async functions. Limits revert on exit whether the function returns normally or raises.
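The revert-on-exit behavior follows the usual scoped-activation pattern: activate on entry, restore in a finally block. A conceptual sketch, not the SDK internals; `sketch_agent` and `active_limits` are hypothetical names, with `active_limits` standing in for the bridge's state.

```python
import functools

active_limits = ["default"]  # stand-in for the bridge's active limit set

def sketch_agent(name: str):
    """Conceptual sketch of @agent's scope handling (not the SDK source)."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            active_limits.append(name)   # activate [limits.<name>]
            try:
                return fn(*args, **kwargs)
            finally:
                active_limits.pop()      # revert even if fn raises
        return wrapper
    return decorate

@sketch_agent("researcher")
def boom():
    raise RuntimeError("agent failed")

try:
    boom()
except RuntimeError:
    pass
active_limits  # back to ["default"]: reverted despite the exception
```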

What happens on stop

When Nanny stops execution, it raises a NannyStop exception. All stop reasons are distinct subclasses:
from nanny_sdk import (
    NannyStop,
    MaxStepsReached,
    BudgetExhausted,
    TimeoutExpired,
    ToolDenied,
    RuleDenied,
    AgentCompleted,
    AgentNotFound,
)
Catch them by category or individually:
from nanny_sdk import NannyStop, BudgetExhausted, ToolDenied

try:
    run_research("Alan Turing")
except BudgetExhausted:
    print("Hit the cost ceiling")
except ToolDenied as e:
    print(f"Blocked tool: {e.tool_name}")
except NannyStop as e:
    print(f"Stopped: {type(e).__name__}")
You do not need to handle stop reasons in most agent code. They propagate up the call stack, and nanny run terminates the process. Catching them is useful in test code and at the CLI entry point.

Complete example

from nanny_sdk import tool, rule, agent

@tool(cost=10)
def fetch_page(url: str) -> str:
    import httpx
    return httpx.get(url).text

@tool(cost=5)
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

@rule("no_spiral")
def check_spiral(ctx) -> bool:
    h = ctx.tool_call_history
    return not (len(h) >= 3 and len(set(h[-3:])) == 1)

@agent("researcher")
def research(topic: str) -> list[str]:
    results = []
    page = fetch_page(f"https://en.wikipedia.org/wiki/{topic}")
    results.append(page)
    return results

if __name__ == "__main__":
    pages = research("Alan Turing")
    print(f"Collected {len(pages)} pages")
Run it under Nanny:
nanny run
Run it without Nanny (decorators silent, agent runs normally):
python agent.py

Multi-agent pattern

The canonical use case: a pipeline where each agent has a specific role, a specific budget, and access to only the tools it needs. This is the metrics_crew pattern — four specialized agents, each governed independently.
from nanny_sdk import tool, rule, agent
from collections import deque

# Each tool declares its cost. The decorator fires on every call
# regardless of which agent invoked it.
@tool(cost=10)
def compute_stats(metric: str, path: str) -> dict: ...

@tool(cost=10)
def detect_anomalies(metric: str, path: str) -> list: ...

@tool(cost=5)
def write_report(content: str, output_path: str) -> str: ...

# A rule that prevents the analysis agent from looping on the same computation.
_recent: deque[str] = deque(maxlen=5)

@rule("no_analysis_loop")
def check_loop(ctx) -> bool:
    tool = ctx.requested_tool or ""
    _recent.append(tool)
    return not (len(_recent) == 5 and all(t == "compute_stats" for t in _recent))

# Each agent activates its own limit scope.
@agent("analysis")
def run_analysis(path: str):
    # Governed by [limits.analysis]: steps=60, cost=200, timeout=60000
    # Tool allowlist: ["compute_stats", "detect_anomalies"] only
    # write_report() here would raise ToolDenied immediately
    stats = compute_stats("cpu_usage", path)
    anomalies = detect_anomalies("cpu_usage", path)
    return anomalies

@agent("reporter")
def run_reporter(findings: list, output_dir: str):
    # Governed by [limits.reporter]: steps=20, cost=50, timeout=30000
    # Tool allowlist: ["write_report"] only
    # compute_stats() here would raise ToolDenied immediately
    return write_report(str(findings), f"{output_dir}/report.md")
# nanny.toml
[limits]
steps   = 200
cost    = 500
timeout = 120000

[limits.analysis]
steps   = 60
cost    = 200
timeout = 60000

[limits.reporter]
steps   = 20
cost    = 50
timeout = 30000

[tools]
allowed = ["compute_stats", "detect_anomalies", "write_report"]
The key properties this gives you:
  • Per-role budget: hitting the analysis budget doesn’t kill the reporter
  • Least-privilege tool access: each agent only receives the tools it needs; calling outside its role raises ToolDenied immediately
  • Loop detection: the @rule fires client-side before the bridge is contacted — the denied tool never runs and no cost is charged
  • Full audit trail: every tool call, every limit activation, every stop reason logged to NDJSON
Scope: All agents in this pattern run within a single Python process. This covers CrewAI, LangGraph, AutoGen, and any framework that orchestrates agents in a single runtime. Cross-process fleet enforcement is coming in v0.1.6.
See examples/python/metrics_crew for the complete working implementation of this pattern with four agents, Plotly chart generation, and a full incident report output.

Framework integration

LangChain

Stack @lc_tool (outer) and @nanny_tool (inner). LangChain registers the function for dispatch; Nanny intercepts every call regardless of which model or API style invoked it:
from langchain_core.tools import tool as lc_tool
from nanny_sdk import tool as nanny_tool

@lc_tool                    # outer — LangChain registers for tool dispatch
@nanny_tool(cost=5)         # inner — Nanny intercepts before file is opened
def read_file(path: str) -> str:
    """Read a source file from disk."""
    with open(path) as f:
        return f.read()
Execution order: your code calls tool.run(args) → LangChain validates args → Nanny wrapper intercepts → bridge check → if allowed, file is read.
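The stacking order matters because decorators apply bottom-up: the decorator listed last wraps the function first, so the outer decorator's wrapper runs first at call time. A framework-free sketch of the same layering, with hypothetical `outer_dispatch` and `inner_govern` standing in for LangChain dispatch and the Nanny check:

```python
import functools

calls = []  # records execution order

def outer_dispatch(fn):              # stands in for @lc_tool
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        calls.append("framework dispatch")
        return fn(*args, **kwargs)
    return wrapper

def inner_govern(fn):                # stands in for @nanny_tool(cost=5)
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        calls.append("nanny check")
        return fn(*args, **kwargs)
    return wrapper

@outer_dispatch
@inner_govern
def read_file(path: str) -> str:
    calls.append("function body")
    return f"contents of {path}"

read_file("a.txt")
calls  # ["framework dispatch", "nanny check", "function body"]
```

Reversing the stack would let the governance check run before the framework sees the call at all, which is why the documented order puts the framework decorator on the outside.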

CrewAI

Same stacking pattern. CrewAI’s @tool decorator and Nanny’s @tool decorator both wrap the function — Nanny’s wrapper fires on every tool.run() call inside the crew:
from crewai.tools import tool as crew_tool
from nanny_sdk import tool as nanny_tool

@crew_tool                  # outer — CrewAI registers for agent dispatch
@nanny_tool(cost=15)        # inner — Nanny intercepts before function runs
def generate_chart(metric: str, output_dir: str) -> str:
    """Generate an interactive Plotly chart for a metric."""
    # ... chart generation ...
    return output_path
See examples/python/dev_assist for a complete LangChain integration and examples/python/metrics_crew for the canonical multi-agent governance example with four specialized agents, per-role limits, and per-role tool allowlists.