Limits and Quotas

Reference for all platform limits in TARX. Understanding these limits helps you design workflows that stay within bounds and handle edge cases gracefully.


API Rate Limits

Limit | Value | Scope | Behavior When Exceeded
Requests per minute | 200 | Per IP address | 429 Too Many Requests
Burst limit | 30 requests in 5 seconds | Per IP address | 429 Too Many Requests
Workflow executions per hour | 100 | Per project | 429 Too Many Requests

Rate limits degrade gracefully — if the rate-limit cache is unavailable, limiting is bypassed so the API stays up.

Rate limit headers on every response:

X-RateLimit-Limit: 200
X-RateLimit-Remaining: 147
X-RateLimit-Reset: 1705312200 # Unix timestamp when window resets
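
A client can use these headers to decide how long to pause before retrying. The following is a minimal sketch, not an official TARX client: `seconds_until_reset` is a hypothetical helper, it takes response headers as a plain (case-sensitive) dict, and it assumes `X-RateLimit-Remaining` reads 0 once the window is exhausted.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before the next request.

    Returns 0.0 when requests remain in the current window or when the
    reset timestamp has already passed.
    """
    now = time.time() if now is None else now
    # Missing header: assume at least one request is still allowed.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is a Unix timestamp (seconds).
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

On a 429 response, sleeping for the returned number of seconds before retrying avoids burning further requests against an exhausted window.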

Execution Limits

Limit | Value | Notes
Agent tool call rounds | 10 per agent execution | Prevents infinite tool loops
Agent execution timeout | 120 seconds | Configurable per node (max 600s)
Loop max iterations | 1000 (default) | Configurable per loop node
Workflow max nodes | No hard limit | Practical limit ~200 nodes for performance
Concurrent executions per project | 10 | Queued if exceeded
Maximum execution duration | 3600 seconds (1 hour) | Long workflows should use async patterns
Human-in-Loop wait time | No limit (default) | Configurable timeout per HiL node
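
The timeout figures above can be checked before a workflow is deployed. This is a rough pre-flight sketch under stated assumptions: `check_agent_timeouts` is a hypothetical helper (not a TARX API), `None` stands for "use the 120-second default", and the duration check assumes agent nodes run one after another, which is the worst case.

```python
AGENT_TIMEOUT_DEFAULT_S = 120    # default agent execution timeout
AGENT_TIMEOUT_MAX_S = 600        # maximum configurable per node
MAX_EXECUTION_DURATION_S = 3600  # maximum execution duration


def check_agent_timeouts(timeouts_s):
    """Return a list of warnings for a sequence of per-node agent timeouts."""
    warnings = []
    effective = []
    for index, timeout in enumerate(timeouts_s):
        value = AGENT_TIMEOUT_DEFAULT_S if timeout is None else timeout
        if value > AGENT_TIMEOUT_MAX_S:
            warnings.append(
                f"node {index}: timeout {value}s exceeds max {AGENT_TIMEOUT_MAX_S}s"
            )
        effective.append(value)
    total = sum(effective)
    # Worst case: every agent node runs sequentially and uses its full timeout.
    if total > MAX_EXECUTION_DURATION_S:
        warnings.append(
            f"worst-case total {total}s exceeds {MAX_EXECUTION_DURATION_S}s"
        )
    return warnings
```

An empty list means the configured timeouts fit within both the per-node maximum and the one-hour execution ceiling.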

Bridge Limits

Limit | Value | Notes
Messages per conversation | 30 | Older messages summarized when approaching limit
Max file attachment size | 100 KB | Per attached file
Max agents in conversation | 5 | Active agents in one Bridge conversation
Actions per conversation | 100 | Total create/update/delete actions
Max workflow nodes Bridge can create | 20 per request | For safety

Agent Limits

Limit | Value | Notes
Agents per project | No hard limit | Practical limit ~500 for UI performance
System prompt max length | 32,000 characters | ~8,000 tokens
Max RAG sources per agent | 10 | More rarely needed
Max MCP servers per agent | 10 | More rarely needed
Agent name max length | 40 characters | Lowercase, hyphens only
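
The name and prompt limits are easy to validate client-side before creating an agent. A minimal sketch, assuming "lowercase, hyphens only" means the letters a-z plus `-` with no digits (the exact character set is not stated here); `validate_agent` is a hypothetical helper, not part of TARX.

```python
import re

# Assumption: only a-z and "-" are allowed; digits are treated as invalid.
AGENT_NAME_RE = re.compile(r"[a-z-]+")
AGENT_NAME_MAX = 40
SYSTEM_PROMPT_MAX = 32_000


def validate_agent(name, system_prompt):
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    if len(name) > AGENT_NAME_MAX:
        errors.append(f"name is {len(name)} characters (max {AGENT_NAME_MAX})")
    if not AGENT_NAME_RE.fullmatch(name):
        errors.append("name must use lowercase letters and hyphens only")
    if len(system_prompt) > SYSTEM_PROMPT_MAX:
        errors.append(
            f"system prompt is {len(system_prompt)} characters "
            f"(max {SYSTEM_PROMPT_MAX})"
        )
    return errors
```

For example, `validate_agent("support-triage", "You are a triage agent.")` returns an empty list, while a name such as `Support_Triage` is reported as invalid.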

Workflow Limits

Limit | Value | Notes
Workflows per project | No hard limit |
Nodes per workflow | No hard limit | UI performance degrades above ~200
Execution history per workflow | Unlimited | Stored securely
Export file size | 10 MB | Practical limit for workflow JSON

RAG Source Limits

Limit | Value | Notes
RAG sources per project | No hard limit |
RAG sources per agent | 10 |
Top-K max value | 20 | Higher values exceed practical usefulness
Retrieved chunk token budget | ~4,000 tokens | Approximate; depends on chunk size
Query embedding timeout | 5 seconds |
Vector DB query timeout | 10 seconds |

MCP Server Limits

Limit | Value | Notes
MCP servers per project | No hard limit |
MCP servers per agent | 10 |
MCP tool discovery timeout | 10 seconds | Per server per execution
MCP tool call timeout | 30 seconds | Per individual tool call
Max tools from one MCP server | 100 | More tools increase context overhead

Data Retention

Data | Retention Period | Storage
Execution records (metadata, status, node statuses) | Indefinite | TARX database
Execution logs (full input/output text) | 30 days | Blob storage
Bridge conversations | Indefinite | TARX database
Agent configs | Until deleted | TARX database
Workflow definitions | Until deleted | TARX database

Document Size

Individual records have a 2 MB size limit. TARX stays within it as follows:

  • Agent system prompts: Stored inline (< 32K chars = ~64 KB, well within limit)
  • Workflow node graphs: Stored as a nested object in the workflow document. For very large workflows (200+ nodes), export/import as JSON to check size.
  • Execution results: Large node outputs (above ~500 KB) are stored in secure cloud storage, with a reference kept in the record
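
To check an exported workflow against the record limit, measuring its serialized size is usually enough. A minimal sketch, assuming the export loads as ordinary JSON and that 2 MB means 2 × 1024 × 1024 bytes; the stored size inside TARX may differ somewhat from the JSON size, so treat the result as an estimate. Both function names are hypothetical.

```python
import json

RECORD_LIMIT_BYTES = 2 * 1024 * 1024  # 2 MB, assuming binary megabytes


def record_size_bytes(document):
    """Size of the document when serialized as UTF-8 JSON."""
    return len(json.dumps(document, ensure_ascii=False).encode("utf-8"))


def fits_in_record(document, limit=RECORD_LIMIT_BYTES):
    """True when the serialized document is at or under the limit."""
    return record_size_bytes(document) <= limit
```

Load the exported file with `json.load` and pass the result to `fits_in_record`; leaving some headroom below the limit is sensible because the estimate is approximate.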

Token Limits (LLM Context Windows)

Context-window and output limits are set by your model provider, not by TARX. They vary by model and change over time — check your provider's documentation for the model you're using.

TARX doesn't enforce these limits — if your agent's total context (system prompt + RAG context + conversation history + tool results) exceeds the model's window, the provider returns an error. Design agents with context budgets in mind.

Estimating context usage:

  • System prompt: ~1 token per 4 characters
  • RAG context (top_k=3, 400 token chunks): ~1,200 tokens
  • Agent input (average): ~200-500 tokens
  • Tool definitions: a few hundred tokens per enabled tool
  • Conversation history (10 turns): ~2,000 tokens
  • Typical total: 4,000-6,000 tokens — well within most model windows
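
The estimates above can be combined into a rough budget calculator. This is a sketch only: `estimate_context_tokens` is a hypothetical helper, the 300 tokens per tool and 200 tokens per conversation turn are assumed midpoints derived from the figures in the list, and real token counts depend on the model's tokenizer.

```python
def estimate_context_tokens(
    system_prompt_chars,
    top_k=3,
    chunk_tokens=400,
    input_tokens=350,      # midpoint of the 200-500 range
    tools=0,
    tokens_per_tool=300,   # assumption for "a few hundred"
    history_turns=10,
    tokens_per_turn=200,   # 10 turns ~ 2,000 tokens
):
    """Rough total context size in tokens for one agent call."""
    system = system_prompt_chars // 4  # ~1 token per 4 characters
    rag = top_k * chunk_tokens
    tool_defs = tools * tokens_per_tool
    history = history_turns * tokens_per_turn
    return system + rag + input_tokens + tool_defs + history
```

With a 4,000-character system prompt and three enabled tools, `estimate_context_tokens(4000, tools=3)` comes to 5,450 tokens, inside the typical range quoted above.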

Practical Recommendations

Large Loop Jobs

For loops over 100+ items:

  • Use cheaper, faster models (Claude Haiku, GPT-4o-mini) inside the loop
  • Set max_iterations conservatively
  • Plan for the workflow to take minutes to hours
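
One way to plan a large loop is to estimate its run time against the 3600-second maximum execution duration and split the input into batches that each fit. A minimal sketch under stated assumptions: iterations run sequentially, `seconds_per_item` is your own measured average, the 0.8 safety factor is arbitrary, and `plan_batches` is a hypothetical helper.

```python
import math

MAX_EXECUTION_DURATION_S = 3600  # maximum execution duration (1 hour)


def plan_batches(item_count, seconds_per_item,
                 budget_s=MAX_EXECUTION_DURATION_S, safety=0.8):
    """Return (batch_size, batch_count) so each batch fits in the budget."""
    if item_count <= 0:
        return 0, 0
    # Largest number of items that fits in the budget with some headroom.
    per_batch = max(1, math.floor(budget_s * safety / seconds_per_item))
    batch_size = min(item_count, per_batch)
    return batch_size, math.ceil(item_count / batch_size)
```

For 1,000 items at about 6 seconds each, this suggests batches of 480 items across 3 executions rather than one run that would exceed the hour.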

High-Frequency Schedules

A schedule that runs every 5 minutes triggers 12 executions per hour. With several scheduled workflows in one project, the total can approach the 100 executions/hour project limit. Monitor your execution count on the dashboard.
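
The hourly total is simple to estimate from the schedule intervals. A small sketch, assuming every schedule in the project fires on a fixed interval in minutes and that manual or API-triggered runs are counted separately; `executions_per_hour` is a hypothetical helper.

```python
import math

PROJECT_LIMIT_PER_HOUR = 100  # workflow executions per hour, per project


def executions_per_hour(interval_minutes):
    """Upper bound on scheduled executions per hour for a list of intervals."""
    return sum(math.ceil(60 / minutes) for minutes in interval_minutes)


def within_limit(interval_minutes, limit=PROJECT_LIMIT_PER_HOUR):
    """True when the scheduled load alone stays at or under the limit."""
    return executions_per_hour(interval_minutes) <= limit
```

Eight workflows on a 5-minute schedule stay under the limit at 96 executions per hour; a ninth brings the total to 108.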

Long Agent Outputs

For agents producing very long outputs (code, full articles, large JSON):

  • Set max_tokens appropriately (2048-8192)
  • Downstream nodes receive the full output
  • Very long outputs are stored in blob storage automatically

Multi-RAG, Multi-MCP Agents

Each RAG source adds ~100-200 ms of latency, and each MCP server adds ~50-200 ms. By these figures, an agent with 5 RAG sources and 3 MCP servers adds roughly 650 ms to 1.6 s of overhead before the LLM call starts (5 × 100-200 ms plus 3 × 50-200 ms). Design latency-sensitive workflows with this in mind.
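
The per-source figures can be turned into a quick range estimate. A sketch that assumes the lookups run one after another, so the ranges add linearly; if TARX runs any of them in parallel, the real overhead would be lower. `pre_llm_overhead_ms` is a hypothetical helper.

```python
RAG_MS = (100, 200)  # per RAG source, low and high estimate
MCP_MS = (50, 200)   # per MCP server, low and high estimate


def pre_llm_overhead_ms(rag_sources, mcp_servers):
    """Return (low, high) estimated overhead in ms before the LLM call."""
    low = rag_sources * RAG_MS[0] + mcp_servers * MCP_MS[0]
    high = rag_sources * RAG_MS[1] + mcp_servers * MCP_MS[1]
    return low, high
```

Comparing the high end of the range against a workflow's latency target shows quickly whether trimming a RAG source or an MCP server is worthwhile.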