Limits and Quotas
Reference for all platform limits in TARX. Understanding these limits helps you design workflows that stay within bounds and handle edge cases gracefully.
API Rate Limits
| Limit | Value | Scope | Behavior When Exceeded |
|---|---|---|---|
| Requests per minute | 200 | Per IP address | 429 Too Many Requests |
| Burst limit | 30 requests in 5 seconds | Per IP address | 429 Too Many Requests |
| Workflow executions per hour | 100 | Per project | 429 Too Many Requests |
Rate limits degrade gracefully — if the rate-limit cache is unavailable, limiting is bypassed so the API stays up.
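Clients that send many requests can stay under both per-IP limits by pacing themselves before the server has to answer with a 429. The following is a minimal sketch, assuming the server counts requests in sliding windows (the exact algorithm is not documented here). `ClientThrottle` and its methods are hypothetical names, not part of any TARX SDK.

```python
from collections import deque

# Per-IP limits from the table above: (max requests, window in seconds).
WINDOWS = [(200, 60.0), (30, 5.0)]


class ClientThrottle:
    """Hypothetical client-side pacing helper (illustration only)."""

    def __init__(self, windows=WINDOWS):
        self.windows = list(windows)
        self.longest = max(window for _, window in self.windows)
        self.sent = deque()  # timestamps of requests already sent, oldest first

    def delay_before_next(self, now):
        """Seconds to wait so that a request sent after the delay fits every window."""
        # Drop timestamps that are older than the longest window.
        while self.sent and now - self.sent[0] >= self.longest:
            self.sent.popleft()
        delay = 0.0
        for limit, window in self.windows:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= limit:
                # This request must age out before the count drops below the limit.
                blocking = recent[len(recent) - limit]
                delay = max(delay, blocking + window - now)
        return delay

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self.sent.append(now)
```

In use, call `delay_before_next(time.monotonic())`, sleep for the returned number of seconds, send the request, then call `record(...)` with the send time. Passing the clock in as an argument keeps the helper easy to test.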
Every response includes rate limit headers:
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 147
X-RateLimit-Reset: 1705312200 # Unix timestamp when window resets
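These headers are enough for a client to decide how long to pause. The sketch below is one possible way to read them. It assumes the headers arrive as a plain dict with exactly the casing shown above, and `seconds_until_reset` is a hypothetical helper name.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to pause before the next request, in seconds.

    Assumes header names are cased exactly as in the example above; many HTTP
    clients expose case-insensitive header mappings instead.
    """
    if now is None:
        now = time.time()
    # A missing Remaining header is treated as "quota still available".
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is a Unix timestamp marking the end of the window.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

A caller would typically sleep for the returned value after a 429 response, or whenever the remaining count reaches zero, and then retry.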
Execution Limits
| Limit | Value | Notes |
|---|---|---|
| Agent tool call rounds | 10 per agent execution | Prevents infinite tool loops |
| Agent execution timeout | 120 seconds (default) | Configurable per node, up to 600 seconds |
| Loop max iterations | 1000 (default) | Configurable per loop node |
| Workflow max nodes | No hard limit | Practical limit ~200 nodes for performance |
| Concurrent executions per project | 10 | Queued if exceeded |
| Maximum execution duration | 3600 seconds (1 hour) | Long workflows should use async patterns |
| Human-in-Loop wait time | No limit (default) | Configurable timeout per HiL node |
Bridge Limits
| Limit | Value | Notes |
|---|---|---|
| Messages per conversation | 30 | Older messages summarized when approaching limit |
| Max file attachment size | 100 KB | Per attached file |
| Max agents in conversation | 5 | Active agents in one Bridge conversation |
| Actions per conversation | 100 | Total create/update/delete actions |
| Max workflow nodes Bridge can create | 20 per request | For safety |
Agent Limits
| Limit | Value | Notes |
|---|---|---|
| Agents per project | No hard limit | Practical limit ~500 for UI performance |
| System prompt max length | 32,000 characters | ~8,000 tokens |
| Max RAG sources per agent | 10 | More rarely needed |
| Max MCP servers per agent | 10 | More rarely needed |
| Agent name max length | 40 characters | Lowercase, hyphens only |
Workflow Limits
| Limit | Value | Notes |
|---|---|---|
| Workflows per project | No hard limit | |
| Nodes per workflow | No hard limit | UI performance degrades above ~200 |
| Execution history per workflow | Unlimited | Records kept indefinitely; full input/output logs kept 30 days (see Data Retention) |
| Export file size | 10 MB | Practical limit for workflow JSON |
RAG Source Limits
| Limit | Value | Notes |
|---|---|---|
| RAG sources per project | No hard limit | |
| RAG sources per agent | 10 | |
| Top-K max value | 20 | Higher values exceed practical usefulness |
| Retrieved chunk token budget | ~4,000 tokens | Approximate; depends on chunk size |
| Query embedding timeout | 5 seconds | |
| Vector DB query timeout | 10 seconds | |
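The Top-K cap and the chunk token budget in the table above interact: larger chunks leave room for fewer of them. A rough sketch of that arithmetic follows, assuming the budget is applied as a simple sum of chunk sizes (how TARX trims or drops chunks that exceed the budget is not stated here). `max_top_k` is a hypothetical helper.

```python
TOP_K_MAX = 20               # Top-K max value from the table above
CHUNK_BUDGET_TOKENS = 4000   # approximate retrieved chunk token budget


def max_top_k(chunk_tokens: int) -> int:
    """Largest top_k whose chunks fit the budget, capped at the Top-K maximum."""
    if chunk_tokens <= 0:
        raise ValueError("chunk_tokens must be positive")
    return min(TOP_K_MAX, CHUNK_BUDGET_TOKENS // chunk_tokens)
```

With 400-token chunks this gives a top_k of 10. With 100-token chunks the Top-K maximum of 20 becomes the binding limit instead.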
MCP Server Limits
| Limit | Value | Notes |
|---|---|---|
| MCP servers per project | No hard limit | |
| MCP servers per agent | 10 | |
| MCP tool discovery timeout | 10 seconds | Per server per execution |
| MCP tool call timeout | 30 seconds | Per individual tool call |
| Max tools from one MCP server | 100 | More tools increase context overhead |
Data Retention
| Data | Retention Period | Storage |
|---|---|---|
| Execution records (metadata, status, node statuses) | Indefinite | TARX database |
| Execution logs (full input/output text) | 30 days | Blob storage |
| Bridge conversations | Indefinite | TARX database |
| Agent configs | Until deleted | TARX database |
| Workflow definitions | Until deleted | TARX database |
Document Size
Individual records have a 2 MB size limit. TARX stays within it as follows:
- Agent system prompts: stored inline (at most 32,000 characters, about 64 KB, well within the limit)
- Workflow node graphs: stored as a nested object in the workflow document. For very large workflows (200+ nodes), export the workflow as JSON to check its size.
- Execution results: large node outputs (above ~500 KB) are stored in secure cloud storage, with a reference kept in the record
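To check an exported workflow or a node output against these thresholds, you can measure its serialized size. The sketch below is an approximation: it assumes UTF-8 encoded JSON and binary units (1 KB = 1,024 bytes), and the size TARX actually stores may differ. The function names are hypothetical.

```python
import json

RECORD_LIMIT_BYTES = 2 * 1024 * 1024   # 2 MB record limit (binary units assumed)
OFFLOAD_THRESHOLD_BYTES = 500 * 1024   # ~500 KB threshold for large node outputs


def serialized_size(obj) -> int:
    """Size in bytes of obj when serialized as UTF-8 JSON."""
    return len(json.dumps(obj, ensure_ascii=False).encode("utf-8"))


def fits_in_record(obj) -> bool:
    """True if the serialized object is within the 2 MB record limit."""
    return serialized_size(obj) <= RECORD_LIMIT_BYTES


def exceeds_offload_threshold(output) -> bool:
    """True if a node output is above the ~500 KB threshold."""
    return serialized_size(output) > OFFLOAD_THRESHOLD_BYTES
```

For an exported workflow file, load it with `json.load` and pass the result to `fits_in_record`.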
Token Limits (LLM Context Windows)
Context-window and output limits are set by your model provider, not by TARX. They vary by model and change over time — check your provider's documentation for the model you're using.
TARX doesn't enforce these limits — if your agent's total context (system prompt + RAG context + conversation history + tool results) exceeds the model's window, the provider returns an error. Design agents with context budgets in mind.
Estimating context usage:
- System prompt: ~1 token per 4 characters
- RAG context (top_k=3, 400 token chunks): ~1,200 tokens
- Agent input (average): ~200-500 tokens
- Tool definitions: a few hundred tokens per enabled tool
- Conversation history (10 turns): ~2,000 tokens
- Typical total: 4,000-6,000 tokens — well within most model windows
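The estimates above can be combined into a small budget calculator. This is a sketch under stated assumptions: 300 tokens per tool stands in for "a few hundred", 200 tokens per turn is derived from the 10-turn figure, and the context window is a value you supply from your provider's documentation. All names are hypothetical.

```python
import math

CHARS_PER_TOKEN = 4      # ~1 token per 4 characters
TOKENS_PER_TOOL = 300    # assumption: stands in for "a few hundred" per tool
TOKENS_PER_TURN = 200    # derived from ~2,000 tokens per 10 turns


def estimate_context_tokens(system_prompt_chars, top_k, chunk_tokens,
                            input_tokens, tool_count, history_turns):
    """Return a per-component token estimate plus the total."""
    parts = {
        "system_prompt": math.ceil(system_prompt_chars / CHARS_PER_TOKEN),
        "rag_context": top_k * chunk_tokens,
        "input": input_tokens,
        "tools": tool_count * TOKENS_PER_TOOL,
        "history": history_turns * TOKENS_PER_TURN,
    }
    parts["total"] = sum(parts.values())
    return parts


def fits_window(total_tokens, context_window, reserved_output_tokens=0):
    """True if the estimate plus reserved output space fits the model window."""
    return total_tokens + reserved_output_tokens <= context_window


estimate = estimate_context_tokens(
    system_prompt_chars=4000, top_k=3, chunk_tokens=400,
    input_tokens=350, tool_count=2, history_turns=10,
)
print(estimate["total"])  # 5150
```

The example inputs land inside the typical 4,000-6,000 token range. Real token counts depend on the model's tokenizer, so treat the result as a planning figure only.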
Practical Recommendations
Large Loop Jobs
For loops over 100+ items:
- Use cheaper, faster models (Claude Haiku, GPT-4o-mini) inside the loop
- Set max_iterations conservatively
- Plan for the workflow to take minutes to hours
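Before launching a large loop, it helps to check the job against the 3,600-second maximum execution duration and the loop's iteration cap. The sketch below assumes iterations run one after another and that the loop dominates the workflow's runtime. Neither assumption is confirmed here, and `plan_loop` is a hypothetical helper.

```python
import math

MAX_EXECUTION_SECONDS = 3600    # maximum execution duration
DEFAULT_MAX_ITERATIONS = 1000   # default loop iteration cap


def plan_loop(total_items, seconds_per_item, max_iterations=DEFAULT_MAX_ITERATIONS):
    """Estimate how many items fit in one run and how many runs are needed."""
    if total_items < 0 or seconds_per_item <= 0 or max_iterations <= 0:
        raise ValueError("inputs must be positive (total_items may be zero)")
    # Items that fit in one run by time alone, assuming sequential iterations.
    by_time = int(MAX_EXECUTION_SECONDS // seconds_per_item)
    per_run = min(max_iterations, by_time)
    if per_run == 0:
        raise ValueError("a single item exceeds the maximum execution duration")
    return {
        "items_per_run": per_run,
        "runs": math.ceil(total_items / per_run),
        "estimated_seconds": total_items * seconds_per_item,
    }


print(plan_loop(2500, 6))
# {'items_per_run': 600, 'runs': 5, 'estimated_seconds': 15000}
```

The plan leaves no headroom for other nodes in the workflow, so in practice use a smaller batch size than `items_per_run`.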
High-Frequency Schedules
A schedule that runs every 5 minutes triggers 12 executions per hour, so a project with several such schedules can approach the 100 executions/hour project limit. Monitor your execution count on the dashboard.
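A quick way to see how close a set of schedules comes to the hourly limit is to sum their trigger rates. This sketch assumes each trigger counts as exactly one execution against the project limit. The function names are hypothetical.

```python
EXECUTIONS_PER_HOUR_LIMIT = 100  # workflow executions per hour, per project


def scheduled_executions_per_hour(intervals_minutes):
    """Total executions per hour for schedules given by their intervals in minutes."""
    return sum(60 / minutes for minutes in intervals_minutes)


def headroom(intervals_minutes):
    """Executions per hour left for manual or API-triggered runs (negative if over)."""
    return EXECUTIONS_PER_HOUR_LIMIT - scheduled_executions_per_hour(intervals_minutes)


print(scheduled_executions_per_hour([5] * 8))  # 96.0
print(headroom([5] * 9))                       # -8.0
```

Eight 5-minute schedules leave room for only four more executions per hour, and a ninth pushes the project over the limit.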
Long Agent Outputs
For agents producing very long outputs (code, full articles, large JSON):
- Set max_tokens appropriately (2048-8192)
- Downstream nodes receive the full output
- Very long outputs are stored in blob storage automatically
Multi-RAG, Multi-MCP Agents
Each RAG source adds ~100-200 ms of latency, and each MCP server adds ~50-200 ms. If these lookups run one after another, an agent with 5 RAG sources and 3 MCP servers adds roughly 650 ms to 1.6 s of overhead (5 × 100-200 ms plus 3 × 50-200 ms) before the LLM call starts. Design latency-sensitive workflows with this in mind.
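A straight sum of the per-source figures gives a quick estimate for other agent configurations. The sketch assumes lookups run sequentially. Treat the result as a rough envelope, since measured overhead may differ, for example if lookups overlap. `pre_llm_overhead_ms` is a hypothetical helper.

```python
RAG_MS = (100, 200)   # per RAG source: (low, high) latency in milliseconds
MCP_MS = (50, 200)    # per MCP server: (low, high) latency in milliseconds


def pre_llm_overhead_ms(rag_sources, mcp_servers):
    """Return (low, high) added latency in ms, assuming sequential lookups."""
    low = rag_sources * RAG_MS[0] + mcp_servers * MCP_MS[0]
    high = rag_sources * RAG_MS[1] + mcp_servers * MCP_MS[1]
    return low, high


print(pre_llm_overhead_ms(5, 3))  # (650, 1600)
```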