Limits and Quotas

Reference for all platform limits in TARX. Understanding these limits helps you design workflows that stay within bounds and handle edge cases gracefully.

API Rate Limits

Limit	Value	Scope	Behavior When Exceeded
Requests per minute	200	Per IP address	`429 Too Many Requests`
Burst limit	30 requests in 5 seconds	Per IP address	`429 Too Many Requests`
Workflow executions per hour	100	Per project	`429 Too Many Requests`

Rate limiting fails open: if the rate-limit cache is unavailable, limiting is bypassed so the API stays up. Do not rely on a `429` as your only safeguard against runaway clients.
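One way to stay under both the per-minute and burst limits is to throttle on the client side. The following is a minimal sketch, not part of TARX: the class name `SlidingWindowThrottle` and its methods are hypothetical, and it assumes all requests from your IP address go through one single-threaded client.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side throttle enforcing several (max_requests, window_seconds) rules."""

    def __init__(self, rules, clock=time.monotonic, sleep=time.sleep):
        self.rules = list(rules)
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of recent requests, oldest first
        self.longest = max(window for _, window in self.rules)

    def wait_time(self, now):
        """Seconds to wait until one more request satisfies every rule."""
        delay = 0.0
        for max_requests, window in self.rules:
            recent = [t for t in self.sent if t > now - window]
            if len(recent) >= max_requests:
                # This request must leave the window before a slot opens.
                blocker = recent[len(recent) - max_requests]
                delay = max(delay, blocker + window - now)
        return delay

    def acquire(self):
        """Block until a request is allowed, then record it."""
        now = self.clock()
        delay = self.wait_time(now)
        if delay > 0:
            self.sleep(delay)
            now = self.clock()
        # Drop timestamps that no longer matter to any rule.
        while self.sent and self.sent[0] <= now - self.longest:
            self.sent.popleft()
        self.sent.append(now)


# Limits from the table above: 30 requests per 5 seconds, 200 per minute.
throttle = SlidingWindowThrottle([(30, 5.0), (200, 60.0)])
```

Call `throttle.acquire()` immediately before each API request. The `clock` and `sleep` parameters exist only so the behavior can be tested without real waiting.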

Every response includes rate limit headers:

X-RateLimit-Limit: 200
X-RateLimit-Remaining: 147
X-RateLimit-Reset: 1705312200  # Unix timestamp when window resets
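A client can use these headers to decide when to pause. The sketch below assumes the headers arrive as a plain dictionary with the exact casing shown above; the function names are illustrative, and many HTTP libraries expose case-insensitive header mappings instead.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before retrying, based on X-RateLimit-Reset."""
    now = time.time() if now is None else now
    # Fall back to "no wait" if the header is missing.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


def should_pause(status_code, headers):
    """Pause on a 429, or pre-emptively when no requests remain in the window."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    return status_code == 429 or remaining <= 0
```

A typical loop would check `should_pause` after each response and, if it returns `True`, sleep for `seconds_until_reset(headers)` before sending the next request.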

Execution Limits

Limit	Value	Notes
Agent tool call rounds	10 per agent execution	Prevents infinite tool loops
Agent execution timeout	120 seconds	Configurable per node (max 600s)
Loop max iterations	1000 (default)	Configurable per loop node
Workflow max nodes	No hard limit	Practical limit ~200 nodes for performance
Concurrent executions per project	10	Queued if exceeded
Maximum execution duration	3600 seconds (1 hour)	Long workflows should use async patterns
Human-in-Loop wait time	No limit (default)	Configurable timeout per HiL node

Bridge Limits

Limit	Value	Notes
Messages per conversation	30	Older messages summarized when approaching limit
Max file attachment size	100 KB	Per attached file
Max agents in conversation	5	Active agents in one Bridge conversation
Actions per conversation	100	Total create/update/delete actions
Max workflow nodes Bridge can create	20 per request	For safety

Agent Limits

Limit	Value	Notes
Agents per project	No hard limit	Practical limit ~500 for UI performance
System prompt max length	32,000 characters	~8,000 tokens
Max RAG sources per agent	10	More are rarely needed
Max MCP servers per agent	10	More are rarely needed
Agent name max length	40 characters	Lowercase, hyphens only
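These limits can be checked before an agent is created. The sketch below is a hypothetical pre-flight validator, not a TARX API. It reads "lowercase, hyphens only" strictly as lowercase letters separated by single hyphens; whether digits or leading hyphens are accepted is an assumption you should verify.

```python
import re

# Values from the Agent Limits table.
AGENT_NAME_MAX = 40
SYSTEM_PROMPT_MAX_CHARS = 32_000
MAX_RAG_SOURCES = 10
MAX_MCP_SERVERS = 10

# Assumed reading of the naming rule: lowercase words joined by hyphens.
NAME_PATTERN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def validate_agent(name, system_prompt, rag_sources=(), mcp_servers=()):
    """Return a list of problems; an empty list means the config fits the limits."""
    problems = []
    if len(name) > AGENT_NAME_MAX:
        problems.append(f"name is {len(name)} characters (max {AGENT_NAME_MAX})")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must use lowercase letters and hyphens only")
    if len(system_prompt) > SYSTEM_PROMPT_MAX_CHARS:
        problems.append(
            f"system prompt is {len(system_prompt)} characters "
            f"(max {SYSTEM_PROMPT_MAX_CHARS})"
        )
    if len(rag_sources) > MAX_RAG_SOURCES:
        problems.append(f"{len(rag_sources)} RAG sources (max {MAX_RAG_SOURCES})")
    if len(mcp_servers) > MAX_MCP_SERVERS:
        problems.append(f"{len(mcp_servers)} MCP servers (max {MAX_MCP_SERVERS})")
    return problems
```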

Workflow Limits

Limit	Value	Notes
Workflows per project	No hard limit
Nodes per workflow	No hard limit	UI performance degrades above ~200
Execution history per workflow	Unlimited	Records kept indefinitely; full input/output logs kept 30 days (see Data Retention)
Export file size	10 MB	Practical limit for workflow JSON

RAG Source Limits

Limit	Value	Notes
RAG sources per project	No hard limit
RAG sources per agent	10
Top-K max value	20	Higher values exceed practical usefulness
Retrieved chunk token budget	~4,000 tokens	Approximate; depends on chunk size
Query embedding timeout	5 seconds
Vector DB query timeout	10 seconds

MCP Server Limits

Limit	Value	Notes
MCP servers per project	No hard limit
MCP servers per agent	10
MCP tool discovery timeout	10 seconds	Per server per execution
MCP tool call timeout	30 seconds	Per individual tool call
Max tools from one MCP server	100	More tools increase context overhead

Data Retention

Data	Retention Period	Storage
Execution records (metadata, status, node statuses)	Indefinite	TARX database
Execution logs (full input/output text)	30 days	Blob storage
Bridge conversations	Indefinite	TARX database
Agent configs	Until deleted	TARX database
Workflow definitions	Until deleted	TARX database

Document Size

Individual records have a 2 MB size limit. TARX stays within it as follows:

Agent system prompts: stored inline (at most 32,000 characters, about 64 KB, well within the limit)
Workflow node graphs: stored as a nested object in the workflow document. For very large workflows (200+ nodes), export the workflow as JSON to check its size.
Execution results: large node outputs (above ~500 KB) are stored in secure cloud storage, with a reference kept in the record
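To check an exported workflow or a node output against these figures, you can measure its serialized size. This is a rough sketch: it assumes the stored size is close to the UTF-8 JSON size and that "2 MB" and "500 KB" are binary units, neither of which the platform guarantees. The function names are illustrative.

```python
import json

RECORD_LIMIT_BYTES = 2 * 1024 * 1024  # 2 MB record limit (assumed binary units)
OFFLOAD_THRESHOLD_BYTES = 500 * 1024  # ~500 KB approximate offload threshold


def serialized_size(obj):
    """Size in bytes of obj when serialized as UTF-8 JSON."""
    return len(json.dumps(obj, ensure_ascii=False).encode("utf-8"))


def size_report(obj):
    """Compare a JSON-serializable object against the record size figures."""
    size = serialized_size(obj)
    return {
        "bytes": size,
        "fits_in_record": size <= RECORD_LIMIT_BYTES,
        "above_offload_threshold": size > OFFLOAD_THRESHOLD_BYTES,
    }
```

For an exported workflow file, load it with `json.load` and pass the result to `size_report`.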

Token Limits (LLM Context Windows)

Context-window and output limits are set by your model provider, not by TARX. They vary by model and change over time — check your provider's documentation for the model you're using.

TARX doesn't enforce these limits — if your agent's total context (system prompt + RAG context + conversation history + tool results) exceeds the model's window, the provider returns an error. Design agents with context budgets in mind.

Estimating context usage:

System prompt: ~1 token per 4 characters
RAG context (top_k=3, 400 token chunks): ~1,200 tokens
Agent input (average): ~200-500 tokens
Tool definitions: a few hundred tokens per enabled tool
Conversation history (10 turns): ~2,000 tokens
Typical total: 4,000-6,000 tokens — well within most model windows
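The estimates above can be combined into a quick budget check. In this sketch the per-tool cost (300 tokens) and per-turn cost (200 tokens) are assumed values chosen to match "a few hundred tokens per tool" and "10 turns: ~2,000 tokens"; the function names are illustrative, and real token counts depend on your provider's tokenizer.

```python
def estimate_context_tokens(
    system_prompt_chars,
    top_k=3,
    chunk_tokens=400,
    input_tokens=350,
    enabled_tools=2,
    tokens_per_tool=300,   # assumption: "a few hundred" per tool
    history_turns=10,
    tokens_per_turn=200,   # assumption: 10 turns is about 2,000 tokens
):
    """Rough token estimate for one agent call, broken down by component."""
    parts = {
        "system_prompt": system_prompt_chars // 4,  # ~1 token per 4 characters
        "rag_context": top_k * chunk_tokens,
        "agent_input": input_tokens,
        "tool_definitions": enabled_tools * tokens_per_tool,
        "history": history_turns * tokens_per_turn,
    }
    parts["total"] = sum(parts.values())
    return parts


def fits_window(parts, context_window, max_output_tokens):
    """True if the estimated input plus reserved output fits the model window."""
    return parts["total"] + max_output_tokens <= context_window
```

Whether output tokens count against the same window as input varies by provider, so treat `fits_window` as a conservative check rather than an exact one.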

Practical Recommendations

Large Loop Jobs

For loops over 100+ items:

Use cheaper, faster models (Claude Haiku, GPT-4o-mini) inside the loop
Set max_iterations conservatively
Plan for the workflow to take minutes to hours
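A back-of-the-envelope plan helps decide whether a loop fits in one execution. The sketch below assumes iterations run one after another and that you can measure an average time per item; `plan_loop` is a hypothetical helper, and the constants come from the Execution Limits table.

```python
import math

LOOP_MAX_ITERATIONS = 1000    # default loop limit
MAX_EXECUTION_SECONDS = 3600  # maximum execution duration


def plan_loop(item_count, seconds_per_item, max_iterations=LOOP_MAX_ITERATIONS):
    """Estimate total time and how many executions keep each run under both limits."""
    # How many items fit in one execution before the duration limit is hit.
    items_by_time = max(1, int(MAX_EXECUTION_SECONDS // seconds_per_item))
    items_per_run = min(max_iterations, items_by_time)
    return {
        "items_per_run": items_per_run,
        "runs": math.ceil(item_count / items_per_run),
        "total_seconds": item_count * seconds_per_item,
    }
```

For example, 5,000 items at about 6 seconds each do not fit in a single one-hour execution, so the plan splits them into batches of 600.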

High-Frequency Schedules

Each schedule that runs every 5 minutes triggers 12 executions per hour, so several such schedules on complex workflows in one project can approach the 100 executions/hour project limit. Monitor your execution count on the dashboard.
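The arithmetic is simple enough to check before adding a new schedule. This sketch assumes every schedule trigger counts as exactly one execution against the project limit, which the limits table does not spell out; the function names are illustrative.

```python
PROJECT_EXECUTIONS_PER_HOUR = 100  # from the API Rate Limits table


def scheduled_executions_per_hour(interval_minutes_per_schedule):
    """Total hourly executions for a list of schedule intervals in minutes."""
    return sum(60 / minutes for minutes in interval_minutes_per_schedule)


def headroom(interval_minutes_per_schedule, limit=PROJECT_EXECUTIONS_PER_HOUR):
    """Executions per hour left for manual and API-triggered runs."""
    return limit - scheduled_executions_per_hour(interval_minutes_per_schedule)
```

With these assumptions, eight 5-minute schedules leave room for only 4 other executions per hour, and a ninth pushes the project over the limit.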

Long Agent Outputs

For agents producing very long outputs (code, full articles, large JSON):

Set max_tokens appropriately (2048-8192)
Downstream nodes receive the full output
Very long outputs are stored in blob storage automatically

Multi-RAG, Multi-MCP Agents

Each RAG source adds ~100-200 ms of latency, and each MCP server adds ~50-200 ms. By straight addition, an agent with 5 RAG sources and 3 MCP servers adds roughly 650 ms to 1.6 s of overhead before the LLM call starts. Design latency-sensitive workflows with this in mind.
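A straight sum of the per-source ranges gives a rough bound for any agent configuration. The sketch assumes sources and servers are queried one after another; if the platform queries them in parallel, the real overhead would be lower. `pre_llm_overhead_ms` is a hypothetical helper.

```python
def pre_llm_overhead_ms(rag_sources, mcp_servers, rag_ms=(100, 200), mcp_ms=(50, 200)):
    """Return (low, high) added latency in milliseconds before the LLM call."""
    low = rag_sources * rag_ms[0] + mcp_servers * mcp_ms[0]
    high = rag_sources * rag_ms[1] + mcp_servers * mcp_ms[1]
    return low, high


low, high = pre_llm_overhead_ms(rag_sources=5, mcp_servers=3)
print(f"Estimated overhead: {low}-{high} ms")
```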
