
/rag/ingest

All-in-one document ingestion pipeline: Upload → Chunk → Embed → Vector Store

| Feature | Supported |
|---------|-----------|
| Logging | ✅ |
| Supported Providers | openai, bedrock, vertex_ai, gemini |

Quick Start

OpenAI

Ingest to OpenAI vector store
```bash
curl -X POST "http://localhost:4000/v1/rag/ingest" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d "{
    \"file\": {
      \"filename\": \"document.txt\",
      \"content\": \"$(base64 -i document.txt)\",
      \"content_type\": \"text/plain\"
    },
    \"ingest_options\": {
      \"vector_store\": {
        \"custom_llm_provider\": \"openai\"
      }
    }
  }"
```
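
The same request body can be assembled in Python with only the standard library. This is a sketch mirroring the curl example above; the `requests.post` call is left as a comment so it runs without a live proxy, and `sk-1234` / `localhost:4000` are the placeholder values from this page, not fixed defaults:

```python
import base64
import json

def build_ingest_body(filename: str, raw: bytes, content_type: str = "text/plain") -> str:
    """Assemble the JSON body for POST /v1/rag/ingest using base64 file input."""
    return json.dumps({
        "file": {
            "filename": filename,
            "content": base64.b64encode(raw).decode("ascii"),
            "content_type": content_type,
        },
        "ingest_options": {"vector_store": {"custom_llm_provider": "openai"}},
    })

body = build_ingest_body("document.txt", b"LiteLLM unifies vector store APIs.")
# Send with any HTTP client, e.g.:
# requests.post("http://localhost:4000/v1/rag/ingest",
#               headers={"Authorization": "Bearer sk-1234",
#                        "Content-Type": "application/json"},
#               data=body)
```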

Bedrock

Ingest to Bedrock Knowledge Base
```bash
curl -X POST "http://localhost:4000/v1/rag/ingest" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d "{
    \"file\": {
      \"filename\": \"document.txt\",
      \"content\": \"$(base64 -i document.txt)\",
      \"content_type\": \"text/plain\"
    },
    \"ingest_options\": {
      \"vector_store\": {
        \"custom_llm_provider\": \"bedrock\"
      }
    }
  }"
```

Vertex AI RAG Engine

Ingest to Vertex AI RAG Corpus
```bash
curl -X POST "http://localhost:4000/v1/rag/ingest" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d "{
    \"file\": {
      \"filename\": \"document.txt\",
      \"content\": \"$(base64 -i document.txt)\",
      \"content_type\": \"text/plain\"
    },
    \"ingest_options\": {
      \"vector_store\": {
        \"custom_llm_provider\": \"vertex_ai\",
        \"vector_store_id\": \"your-corpus-id\",
        \"gcs_bucket\": \"your-gcs-bucket\"
      }
    }
  }"
```

Response

```json
{
  "id": "ingest_abc123",
  "status": "completed",
  "vector_store_id": "vs_xyz789",
  "file_id": "file_123"
}
```

Query the Vector Store

After ingestion, query the store with `/vector_stores/{vector_store_id}/search`:

Search the vector store
```bash
curl -X POST "http://localhost:4000/v1/vector_stores/vs_xyz789/search" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main topic?",
    "max_num_results": 5
  }'
```

End-to-End Example

OpenAI

1. Ingest Document

Step 1: Ingest
```bash
curl -X POST "http://localhost:4000/v1/rag/ingest" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d "{
    \"file\": {
      \"filename\": \"test_document.txt\",
      \"content\": \"$(base64 -i test_document.txt)\",
      \"content_type\": \"text/plain\"
    },
    \"ingest_options\": {
      \"name\": \"test-basic-ingest\",
      \"vector_store\": {
        \"custom_llm_provider\": \"openai\"
      }
    }
  }"
```

Response:

```json
{
  "id": "ingest_d834f544-fc5e-4751-902d-fb0bcc183b85",
  "status": "completed",
  "vector_store_id": "vs_692658d337c4819183f2ad8488d12fc9",
  "file_id": "file-M2pJJiWH56cfUP4Fe7rJay"
}
```

2. Query

Step 2: Query
```bash
curl -X POST "http://localhost:4000/v1/vector_stores/vs_692658d337c4819183f2ad8488d12fc9/search" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is LiteLLM?",
    "custom_llm_provider": "openai"
  }'
```

Response:

```json
{
  "object": "vector_store.search_results.page",
  "search_query": ["What is LiteLLM?"],
  "data": [
    {
      "file_id": "file-M2pJJiWH56cfUP4Fe7rJay",
      "filename": "test_document.txt",
      "score": 0.4004629778869299,
      "attributes": {},
      "content": [
        {
          "type": "text",
          "text": "Test document abc123 for RAG ingestion.\nThis is a sample document to test the RAG ingest API.\nLiteLLM provides a unified interface for vector stores."
        }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}
```
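
In Python, the matched text can be flattened out of a search response with a small helper (`extract_texts` is illustrative, not part of LiteLLM; the response shape follows the sample above):

```python
def extract_texts(search_response: dict) -> list[str]:
    """Collect every text chunk from a vector store search response page."""
    return [
        part["text"]
        for result in search_response.get("data", [])
        for part in result.get("content", [])
        if part.get("type") == "text"
    ]

# Abbreviated response in the shape shown above.
sample = {
    "object": "vector_store.search_results.page",
    "data": [
        {
            "file_id": "file-M2pJJiWH56cfUP4Fe7rJay",
            "filename": "test_document.txt",
            "score": 0.4,
            "content": [
                {"type": "text",
                 "text": "LiteLLM provides a unified interface for vector stores."}
            ],
        }
    ],
    "has_more": False,
}

texts = extract_texts(sample)
```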

Request Parameters

Top-Level

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file` | object | One of `file`/`file_url`/`file_id` required | Base64-encoded file |
| `file.filename` | string | Yes | Filename with extension |
| `file.content` | string | Yes | Base64-encoded content |
| `file.content_type` | string | Yes | MIME type (e.g., `text/plain`) |
| `file_url` | string | One of `file`/`file_url`/`file_id` required | URL to fetch the file from |
| `file_id` | string | One of `file`/`file_url`/`file_id` required | Existing file ID |
| `ingest_options` | object | Yes | Pipeline configuration |

ingest_options

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `vector_store` | object | Yes | Vector store configuration |
| `name` | string | No | Pipeline name for logging |

vector_store (OpenAI)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `custom_llm_provider` | string | - | `"openai"` |
| `vector_store_id` | string | auto-create | Existing vector store ID |

vector_store (Bedrock)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `custom_llm_provider` | string | - | `"bedrock"` |
| `vector_store_id` | string | auto-create | Existing Knowledge Base ID |
| `wait_for_ingestion` | boolean | false | Wait for indexing to complete |
| `ingestion_timeout` | integer | 300 | Timeout in seconds (if waiting) |
| `s3_bucket` | string | auto-create | S3 bucket for documents |
| `s3_prefix` | string | "data/" | S3 key prefix |
| `embedding_model` | string | amazon.titan-embed-text-v2:0 | Bedrock embedding model |
| `aws_region_name` | string | us-west-2 | AWS region |

Bedrock Auto-Creation

When vector_store_id is omitted, LiteLLM automatically creates:

  • S3 bucket for document storage
  • OpenSearch Serverless collection
  • IAM role with required permissions
  • Bedrock Knowledge Base
  • Data Source
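
A sketch of the two ways these parameters combine, as plain Python dicts mirroring the table above (`my-kb-id` and `my-docs-bucket` are placeholder names, not real resources):

```python
# 1. Reuse an existing Knowledge Base and S3 bucket, blocking
#    until indexing finishes (here up to 10 minutes).
existing_kb = {
    "custom_llm_provider": "bedrock",
    "vector_store_id": "my-kb-id",
    "s3_bucket": "my-docs-bucket",
    "s3_prefix": "data/",
    "wait_for_ingestion": True,
    "ingestion_timeout": 600,
}

# 2. Omit vector_store_id so LiteLLM auto-creates the S3 bucket,
#    OpenSearch Serverless collection, IAM role, Knowledge Base,
#    and Data Source; only the provider is strictly needed.
auto_created = {
    "custom_llm_provider": "bedrock",
    "embedding_model": "amazon.titan-embed-text-v2:0",
    "aws_region_name": "us-west-2",
}
```

Either dict goes under `ingest_options["vector_store"]` in the request body.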

vector_store (Vertex AI)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `custom_llm_provider` | string | - | `"vertex_ai"` |
| `vector_store_id` | string | required | RAG corpus ID |
| `gcs_bucket` | string | required | GCS bucket for file uploads |
| `vertex_project` | string | env `VERTEXAI_PROJECT` | GCP project ID |
| `vertex_location` | string | us-central1 | GCP region |
| `vertex_credentials` | string | ADC | Path to credentials JSON |
| `wait_for_import` | boolean | true | Wait for import to complete |
| `import_timeout` | integer | 600 | Timeout in seconds (if waiting) |

Vertex AI Prerequisites
  1. Create a RAG corpus in Vertex AI console or via API
  2. Create a GCS bucket for file uploads
  3. Authenticate via gcloud auth application-default login
  4. Install: pip install 'google-cloud-aiplatform>=1.60.0'

Input Examples

File (Base64)

Request body
```json
{
  "file": {
    "filename": "document.txt",
    "content": "<base64-encoded-content>",
    "content_type": "text/plain"
  },
  "ingest_options": {
    "vector_store": {"custom_llm_provider": "openai"}
  }
}
```

File URL

Ingest from URL
```bash
curl -X POST "http://localhost:4000/v1/rag/ingest" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "file_url": "https://example.com/document.pdf",
    "ingest_options": {"vector_store": {"custom_llm_provider": "openai"}}
  }'
```

Chunking Strategy

Control how documents are split into chunks before embedding by setting `chunking_strategy` in `ingest_options`.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `chunk_size` | integer | 1000 | Maximum size of each chunk |
| `chunk_overlap` | integer | 200 | Overlap between consecutive chunks |

Vertex AI RAG Engine

Vertex AI RAG Engine supports custom chunking via the chunking_strategy parameter. Chunks are processed server-side during import.

Vertex AI with custom chunking
```bash
curl -X POST "http://localhost:4000/v1/rag/ingest" \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d "{
    \"file\": {
      \"filename\": \"document.txt\",
      \"content\": \"$(base64 -i document.txt)\",
      \"content_type\": \"text/plain\"
    },
    \"ingest_options\": {
      \"chunking_strategy\": {
        \"chunk_size\": 500,
        \"chunk_overlap\": 100
      },
      \"vector_store\": {
        \"custom_llm_provider\": \"vertex_ai\",
        \"vector_store_id\": \"your-corpus-id\",
        \"gcs_bucket\": \"your-gcs-bucket\"
      }
    }
  }"
```