Pipeline Designer

Test your RAG system before writing a single line of code. Configure and export ready-to-use pipelines.

Pipeline Flow

Document Loader → Text Splitter → Embedding Model → Vector Store → Retriever

Generated Python Code

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Load documents
loader = PyPDFLoader("your_file_path")
documents = loader.load()

# Split documents
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)

# Create embeddings
embeddings = OpenAIEmbeddings()

# Create vector store
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings
)

# Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Query example
query = "Your question here"
relevant_docs = retriever.get_relevant_documents(query)

print(f"Found {len(relevant_docs)} relevant documents")
for i, doc in enumerate(relevant_docs):
    print(f"\nDocument {i+1}:")
    print(doc.page_content[:200] + "...")
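
The exported snippet stops at retrieval. To sanity-check answers end to end, the retriever can be wired into a question-answering chain. The following is a minimal sketch using the same legacy langchain imports as the generated code; the model name and chain type are illustrative choices, not part of the exported pipeline.

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# Wire the retriever into a simple question-answering chain.
# "stuff" packs all retrieved chunks into a single prompt.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
)

answer = qa_chain.run("Your question here")
print(answer)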

Pipeline Design Tips

  • Start with smaller chunk sizes (500-1000) and increase if context is lost
  • Use 10-20% overlap to maintain context between chunks
  • MMR retrieval reduces redundancy in results (see the sketch after this list)
  • Consider your embedding model's token limit when setting chunk size
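
For the MMR and token-limit tips, here is a minimal sketch of both adjustments, reusing the vectorstore and documents from the generated code. The fetch_k, lambda_mult, chunk size, and encoding values are illustrative defaults, not exported settings; the token-aware splitter assumes the tiktoken package is installed.

from langchain.text_splitter import RecursiveCharacterTextSplitter

# MMR retrieval: fetch a wider candidate pool, then keep k diverse chunks.
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20, "lambda_mult": 0.5},
)

# Token-aware splitting: measure chunk size in tokens rather than characters,
# so chunks stay within the embedding model's context limit.
token_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base",  # tokenizer used by OpenAI embedding models
    chunk_size=500,
    chunk_overlap=50,
)
token_chunks = token_splitter.split_documents(documents)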