Primary cost drivers
- Model tokens: prompts, retrieved context, and generated outputs.
- Retrieval: indexing, query volume, reranking, and cache behavior.
- Tooling: API calls, compute triggered as a side effect of tool use, and external dependency costs.
- Orchestration: long-running workflows, retries, and human approval steps.
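
As a rough illustration of how these drivers combine, the sketch below estimates the cost of a single request. Every name and unit price in it is hypothetical (`UnitPrices`, `RequestUsage`, `estimate_cost`, and the dollar figures are placeholders, not real rates), and it makes the simplifying assumption that each retry repeats the full model, retrieval, and tooling spend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitPrices:
    """Hypothetical unit prices in USD; replace with your actual rates."""
    per_1k_input_tokens: float = 0.002
    per_1k_output_tokens: float = 0.006
    per_retrieval_query: float = 0.0005
    per_tool_call: float = 0.01


@dataclass(frozen=True)
class RequestUsage:
    prompt_tokens: int
    context_tokens: int      # retrieved context added to the prompt
    output_tokens: int
    retrieval_queries: int
    tool_calls: int
    attempts: int = 1        # 1 + number of retries (orchestration overhead)


def estimate_cost(usage: RequestUsage, prices: UnitPrices = UnitPrices()) -> dict:
    """Return estimated USD cost per driver, scaled by the number of attempts."""
    input_tokens = usage.prompt_tokens + usage.context_tokens
    model = (input_tokens / 1000 * prices.per_1k_input_tokens
             + usage.output_tokens / 1000 * prices.per_1k_output_tokens)
    retrieval = usage.retrieval_queries * prices.per_retrieval_query
    tooling = usage.tool_calls * prices.per_tool_call
    per_attempt = {"model": model, "retrieval": retrieval, "tooling": tooling}
    # Assumption: every retry repeats the whole attempt.
    return {driver: cost * usage.attempts for driver, cost in per_attempt.items()}


if __name__ == "__main__":
    usage = RequestUsage(prompt_tokens=500, context_tokens=1500, output_tokens=1000,
                         retrieval_queries=2, tool_calls=1, attempts=2)
    for driver, cost in estimate_cost(usage).items():
        print(f"{driver}: ${cost:.4f}")
```

Keeping the breakdown per driver, rather than returning a single total, makes it easier to see whether retrieved context, tool calls, or retries dominate a given workflow.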
Cost governance mechanisms
- Per-workflow budgets (token and tool-call caps) with safe fallbacks.
- Model routing (try a cheaper model first and escalate to a more expensive one only when needed) and caching for repeated queries.
- Context minimization and higher-precision retrieval to reduce token load.
- Operational dashboards for cost attribution and anomaly detection.
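
The first two mechanisms can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: `WorkflowBudget`, `Model`, `run`, the `accept` quality check, and the fallback message are all hypothetical names, models are plain callables, and the cache is an in-memory dict keyed on a normalized query string.

```python
from dataclasses import dataclass
from typing import Callable

FALLBACK = "Budget reached; request queued for human review."  # hypothetical safe fallback


class BudgetExceeded(Exception):
    """Raised when a reservation would exceed a workflow cap."""


@dataclass
class WorkflowBudget:
    max_tokens: int
    max_tool_calls: int
    tokens_used: int = 0
    tool_calls_used: int = 0

    def reserve(self, tokens: int = 0, tool_calls: int = 0) -> None:
        """Reserve spend before it happens; raise if a cap would be exceeded."""
        if (self.tokens_used + tokens > self.max_tokens
                or self.tool_calls_used + tool_calls > self.max_tool_calls):
            raise BudgetExceeded
        self.tokens_used += tokens
        self.tool_calls_used += tool_calls


@dataclass
class Model:
    name: str
    max_tokens_per_call: int      # worst-case tokens reserved per call
    call: Callable[[str], str]    # stand-in for a real model client


def run(query: str, models: list, budget: WorkflowBudget,
        cache: dict, accept: Callable[[str], bool]) -> str:
    """Try models from cheapest to most expensive, within budget, with caching."""
    key = " ".join(query.lower().split())  # normalize case and whitespace
    if key in cache:
        return cache[key]                  # repeated query: no new spend
    answer = FALLBACK
    for model in models:                   # assumed ordered cheap to expensive
        try:
            budget.reserve(tokens=model.max_tokens_per_call)
        except BudgetExceeded:
            break                          # keep best answer so far, or the fallback
        answer = model.call(query)
        if accept(answer):
            cache[key] = answer            # cache only accepted answers
            break
    return answer
```

Reserving the worst-case token count before each call means the cap is enforced before money is spent rather than after. Tool-call caps would be enforced the same way, by calling `budget.reserve(tool_calls=1)` before each tool invocation.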
For commercial terms and packaging aligned to your deployment model, use /contact.