
RAG Sources Overview

RAG (Retrieval-Augmented Generation) enrichment connects your agents to external knowledge bases. Before each LLM call, TARX queries your vector databases, retrieves the most semantically relevant text chunks, and injects them as context.

This expands what your agents can draw on from "what the LLM was trained on" to "what the LLM was trained on plus your curated knowledge."


The Problem RAG Solves

LLMs have training data cutoffs and don't know about your internal:

  • Product documentation
  • Company policies
  • Internal processes
  • Domain-specific knowledge bases
  • Research papers specific to your field

Without RAG, agents answer these questions from general training data alone, so the answers are generic and often wrong.

With RAG, agents retrieve the actual relevant sections from your indexed knowledge and answer with specificity and accuracy.


How Retrieval Works

  1. The agent's input text is embedded into a 1536-dimension vector
  2. The vector is compared against the indexed content using cosine similarity
  3. The top-K most similar chunks are returned
  4. Those chunks are prepended to the LLM's system prompt
  5. The LLM answers using both its training knowledge and the retrieved context
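
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not TARX's implementation: the function names, the toy 3-dimension vectors (standing in for 1536-dimension embeddings), the sample chunk texts, and the "Context:" prompt layout are all assumptions made for the example. A real vector database would also use an index such as HNSW instead of scoring every chunk.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vec, index, k):
    """Return the text of the k chunks most similar to query_vec.

    index is a list of (text, vector) pairs (hypothetical in-memory store).
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

def build_system_prompt(base_prompt, chunks):
    """Prepend retrieved chunks to the system prompt (layout is an assumption)."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\n{base_prompt}"

# Toy data: made-up chunks with made-up 3-dimension vectors.
index = [
    ("Refunds are processed within 5 days.", [0.9, 0.1, 0.0]),
    ("The API rate limit is 100 requests per minute.", [0.0, 0.2, 0.9]),
    ("Refund requests require an order number.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stands in for the embedded agent input

chunks = top_k_chunks(query_vec, index, k=2)
prompt = build_system_prompt("You are a support agent.", chunks)
print(prompt)
```

With this toy data the two refund-related chunks score highest, so they are the ones placed ahead of the base prompt.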

What RAG Sources Are

A RAG Source in TARX is a configuration object that tells TARX:

  • Which vector database to query
  • How to authenticate
  • Which index/collection/namespace to use
  • How to interpret the results

You create RAG sources in the RAG Sources section of your project, then assign them to agents in the Agent Editor.
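
To make the four parts of a RAG source concrete, here is a hypothetical model of one as a Python dataclass. TARX's actual schema is not shown on this page, so every field name below (`provider`, `api_key_env`, `index`, `top_k`, `text_field`) is an assumption chosen to mirror the list above, not the real configuration format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RagSourceConfig:
    """Hypothetical shape of a RAG source; not TARX's real schema."""
    name: str
    provider: str              # which vector database to query
    api_key_env: str           # how to authenticate: env var holding the credential
    index: str                 # which index/collection/namespace to use
    top_k: int = 3             # how many chunks to return per query
    text_field: str = "text"   # how to interpret results: field holding the chunk text

# Example instance with made-up values.
product_docs = RagSourceConfig(
    name="product-docs",
    provider="pinecone",
    api_key_env="PINECONE_API_KEY",
    index="product-docs",
)
print(product_docs)
```

Storing the name of an environment variable instead of the credential itself is a design choice for this sketch; it keeps secrets out of the configuration object.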


Supported Providers

Provider | Type | Notes
Azure AI Search | Cloud | Native Azure integration, recommended for Azure deployments
Pinecone | Cloud | Popular vector DB, serverless tier available, easy setup
Weaviate | Cloud or self-hosted | Open-source, strong filtering support
Qdrant | Cloud or self-hosted | Open-source, efficient for large indexes
Supabase Vector | Cloud | PostgreSQL-based (pgvector), great if you already use Supabase
Custom REST | Any | Any vector DB with a REST search API

See Providers for detailed configuration for each.


Embedding Strategy

TARX provides free embeddings for RAG. You don't pay per query:

Property | Value
Model | text-embedding-3-small (OpenAI)
Dimensions | 1536
Index type | HNSW (Hierarchical Navigable Small World)
Search | Semantic similarity (cosine distance)
Cost to you | Free (TARX covers the embedding API cost)

You don't need an OpenAI key for RAG.


Data Flow for RAG


Document Indexing

TARX does not index your documents for you. You maintain your own vector database and TARX queries it. Your indexing pipeline is separate:

  1. Ingest documents (your own pipeline or vector DB tooling)
  2. Chunk them (typically 200-500 token chunks with 50-token overlap)
  3. Embed them (using text-embedding-3-small, the model TARX uses for queries, since query and document vectors are only comparable when they come from the same model)
  4. Store in your vector DB with metadata (source URL, title, date, etc.)
  5. Configure the RAG source in TARX to query your indexed collection

Use your vector DB provider's own SDK or tooling to ingest and embed your documents. TARX only connects to the already-populated vector store and queries it at runtime.
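
As an illustration of the chunking step, the sketch below splits a token list into fixed-size chunks with a 50-token overlap. It assumes you already have tokens from whichever tokenizer your pipeline uses; the function name `chunk_tokens` and the default chunk size of 300 (picked from within the 200-500 range mentioned above) are choices made for this example.

```python
def chunk_tokens(tokens, chunk_size=300, overlap=50):
    """Split tokens into chunks of up to chunk_size, each sharing
    `overlap` tokens with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reached the end of the document
    return chunks

# Integers stand in for real tokens so the overlap is easy to see.
tokens = list(range(700))
doc_chunks = chunk_tokens(tokens, chunk_size=300, overlap=50)
print([len(c) for c in doc_chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some tokens twice.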


Multiple RAG Sources per Agent

An agent can have multiple RAG sources. TARX queries all of them in parallel:

Agent: customer-support
RAG Sources:
- product-docs (Pinecone, top_k=3)
- company-policies (Azure AI Search, top_k=2)
- faq-database (Weaviate, top_k=3)

Before each LLM call, all 3 sources are queried simultaneously. The top-k results from each are combined and injected as context. Total chunks: up to 3+2+3 = 8 chunks.
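
The fan-out can be sketched with a thread pool. This is an assumption-laden illustration rather than TARX's code: the `(name, search_fn, top_k)` tuple layout, the fake search functions, and the choice to concatenate results in source order are all hypothetical, since this page does not specify how TARX orders the combined chunks.

```python
from concurrent.futures import ThreadPoolExecutor

def query_all_sources(sources, query):
    """Query every source concurrently and combine the results.

    sources is a list of (name, search_fn, top_k) tuples, where
    search_fn(query, top_k) returns a list of chunk strings.
    """
    with ThreadPoolExecutor(max_workers=len(sources) or 1) as pool:
        futures = [
            pool.submit(search_fn, query, top_k)
            for _, search_fn, top_k in sources
        ]
        results = [future.result() for future in futures]

    combined = []
    for (name, _, top_k), chunks in zip(sources, results):
        # Enforce each source's top_k even if it returned more.
        combined.extend((name, chunk) for chunk in chunks[:top_k])
    return combined

def make_fake_source(name, available):
    """Stand-in for a real vector DB client (hypothetical)."""
    def search(query, top_k):
        return [f"{name} chunk {i}" for i in range(min(top_k, available))]
    return search

sources = [
    ("product-docs", make_fake_source("product-docs", 10), 3),
    ("company-policies", make_fake_source("company-policies", 10), 2),
    ("faq-database", make_fake_source("faq-database", 10), 3),
]
combined = query_all_sources(sources, "How do refunds work?")
print(len(combined))
```

The total is "up to" 8 because a source with fewer matching chunks than its top_k simply contributes fewer.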


RAG Source Scoping

RAG sources are project-scoped:

  • Visible to all project members with appropriate roles
  • Shareable across agents within the same project
  • Not visible to other projects
