Pricing

Pricing principles for agent deployments: cost drivers (tokens, retrieval, tools) and ways to control spend with routing and budgets.

Overview

Agent costs are driven by model usage, retrieval operations, and downstream tool calls. Pricing should reflect that operational reality by supporting predictable spend, safe scaling, and clear cost attribution by workflow and tenant.

Primary cost drivers

Model tokens: prompts, retrieved context, and generated outputs.
Retrieval: indexing, query volume, reranking, and cache behavior.
Tooling: API calls, compute side effects, and external dependency costs.
Orchestration: long-running workflows, retries, and human approval steps.
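
As a rough illustration of how these drivers combine, the sketch below estimates the cost of one workflow run and splits it by driver for attribution. All names (`UnitPrices`, `RunUsage`, `estimate_run_cost`) and all rates are hypothetical placeholders, not actual pricing; it also assumes a retry repeats the whole run, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitPrices:
    # Hypothetical placeholder rates in USD; substitute your own.
    per_1k_input_tokens: float
    per_1k_output_tokens: float
    per_retrieval_query: float
    per_tool_call: float


@dataclass(frozen=True)
class RunUsage:
    input_tokens: int       # prompts plus retrieved context
    output_tokens: int      # generated outputs
    retrieval_queries: int  # queries issued against the index
    tool_calls: int         # downstream API or tool invocations
    attempts: int = 1       # 1 plus the number of full-run retries


def estimate_run_cost(usage: RunUsage, prices: UnitPrices) -> dict:
    """Return estimated cost per driver and in total for one workflow run."""
    model = (
        usage.input_tokens / 1000 * prices.per_1k_input_tokens
        + usage.output_tokens / 1000 * prices.per_1k_output_tokens
    )
    retrieval = usage.retrieval_queries * prices.per_retrieval_query
    tools = usage.tool_calls * prices.per_tool_call
    # Assumption: every retry repeats the full run, so each driver scales
    # linearly with the number of attempts.
    breakdown = {
        "model": model * usage.attempts,
        "retrieval": retrieval * usage.attempts,
        "tools": tools * usage.attempts,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown
```

Tagging each `RunUsage` record with a workflow and tenant identifier before aggregating would give the per-workflow and per-tenant attribution described in the overview.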

Cost governance mechanisms

Per-workflow budgets (token and tool-call caps) with safe fallbacks.
Model routing that tries cheaper models first and escalates to more expensive ones only when needed, plus caching for repeated queries.
Context minimization and higher-precision retrieval to reduce token load.
Operational dashboards for cost attribution and anomaly detection.
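
The first two mechanisms can be sketched together: a per-workflow budget with token and tool-call caps, a cheap-to-expensive routing loop, a cache for repeated queries, and a safe fallback when the budget runs out. This is a minimal sketch under stated assumptions, not a reference implementation: `WorkflowBudget`, `ModelTier`, `answer_with_routing`, and `FALLBACK_MESSAGE` are hypothetical names, each tier's token cost is a fixed estimate, and each model is a plain callable that returns its answer together with a flag saying whether the answer is good enough.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical fallback returned when no tier can run within budget.
FALLBACK_MESSAGE = "Budget exhausted; request queued for review."


@dataclass
class WorkflowBudget:
    max_tokens: int
    max_tool_calls: int
    tokens_used: int = 0
    tool_calls_used: int = 0

    def can_spend(self, tokens: int = 0, tool_calls: int = 0) -> bool:
        """True if the spend would stay within both caps."""
        return (
            self.tokens_used + tokens <= self.max_tokens
            and self.tool_calls_used + tool_calls <= self.max_tool_calls
        )

    def record(self, tokens: int = 0, tool_calls: int = 0) -> None:
        self.tokens_used += tokens
        self.tool_calls_used += tool_calls


@dataclass
class ModelTier:
    name: str
    estimated_tokens: int  # assumed fixed cost per call
    # Returns (answer, good_enough); stands in for a real model call.
    call: Callable[[str], Tuple[str, bool]]


def answer_with_routing(
    query: str,
    tiers: List[ModelTier],  # ordered cheapest first
    budget: WorkflowBudget,
    cache: Dict[str, str],
) -> str:
    # Repeated queries are served from the cache at no token cost.
    if query in cache:
        return cache[query]
    for tier in tiers:
        # Stop escalating as soon as the next tier would exceed the cap.
        if not budget.can_spend(tokens=tier.estimated_tokens):
            break
        answer, good_enough = tier.call(query)
        budget.record(tokens=tier.estimated_tokens)
        if good_enough:
            cache[query] = answer
            return answer
    # Safe fallback: never exceed the budget to force an answer.
    return FALLBACK_MESSAGE
```

Checking the cap before each call rather than after it keeps a workflow from overshooting its budget, at the price of sometimes stopping one tier early when the estimate is pessimistic.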

For commercial terms and packaging aligned to your deployment model, use /contact.
