Synthesizer: A Framework for LLM-Powered Data
Synthesizer[ΨΦ]: A multi-purpose LLM framework 💡
With Synthesizer, users can:
- Custom Data Creation: Generate datasets via LLMs that are tailored to your needs, for LLM training, RAG, and more.
  - Supported providers: Anthropic, OpenAI, vLLM, and HuggingFace.
- Retrieval-Augmented Generation (RAG) on Demand: Built-in RAG Provider Interface to anchor generated data to real-world sources.
  - Turnkey integration with Agent Search API.
Fast Setup
pip install sciphi-synthesizer
Using Synthesizer
- Generate synthetic question-answer pairs
export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
python -m synthesizer.scripts.data_augmenter run --dataset="wiki_qa"
tail augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl
{ "formatted_prompt": "... ### Question:\nwhat country did wine originate in\n\n### Input:\n1. URL: https://en.wikipedia.org/wiki/History%20of%20wine (Score: 0.85)\nTitle:History of wine....", "completion": "Wine originated in the South Caucasus, which is now part of modern-day Armenia ..." }
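The augmented output is a JSON Lines file, so it can be inspected from Python as well as with `tail`. The following is a minimal sketch, assuming each line is a complete JSON object with `formatted_prompt` and `completion` keys, as the sample suggests. `load_records` is a hypothetical helper, not part of Synthesizer, and the record written to the temporary file is a shortened stand-in for real output.

```python
import json
import tempfile
from pathlib import Path


def load_records(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Stand-in record using the two keys seen in the sample output (assumed schema).
sample = {
    "formatted_prompt": "### Question:\nwhat country did wine originate in\n",
    "completion": "Wine originated in the South Caucasus ...",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo.jsonl"
    demo_path.write_text(json.dumps(sample) + "\n", encoding="utf-8")
    records = load_records(demo_path)

for record in records:
    print(record["completion"])
```

To read real output, pass the path under `augmented_output/` to `load_records` instead of the temporary demo file.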
- Evaluate RAG pipeline performance
export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
python -m synthesizer.scripts.rag_harness --rag_provider="agent-search" --llm_provider_name="sciphi" --n_samples=25
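When the harness is launched from another Python program, it may help to build the command and check for the API key first. This is a rough sketch that uses only the module path and flags shown in the command above. `build_rag_harness_command` and `has_api_key` are hypothetical helpers, not part of Synthesizer, and the sketch only prints the command rather than running it.

```python
import os
import sys


def build_rag_harness_command(rag_provider="agent-search",
                              llm_provider_name="sciphi",
                              n_samples=25):
    """Return the argv list for the rag_harness invocation shown above."""
    return [
        sys.executable, "-m", "synthesizer.scripts.rag_harness",
        f"--rag_provider={rag_provider}",
        f"--llm_provider_name={llm_provider_name}",
        f"--n_samples={n_samples}",
    ]


def has_api_key(env=None):
    """Return True when SCIPHI_API_KEY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("SCIPHI_API_KEY"))


command = build_rag_harness_command(n_samples=10)
print(" ".join(command[1:]))
if not has_api_key():
    print("SCIPHI_API_KEY is not set; export it before running the harness.")
# To launch for real: subprocess.run(command, check=True)
```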
Documentation
For more detailed information, tutorials, and API references, please visit the official Synthesizer Documentation.
Community & Support
Developing with Synthesizer
Quickly set up retrieval-augmented generation with your choice of provider: OpenAI, Anthropic, vLLM, or SciPhi:
# Requires SCIPHI_API_KEY in env
# Note: query, raw_prompt, and the rag_* / llm_* values below are placeholders
# that the caller must define before running this snippet.
from synthesizer.core import LLMProviderName, RAGProviderName
from synthesizer.interface import LLMInterfaceManager, RAGInterfaceManager
from synthesizer.llm import GenerationConfig

# RAG Provider Settings
rag_interface = RAGInterfaceManager.get_interface_from_args(
    RAGProviderName("agent-search"),
    limit_hierarchical_url_results=rag_limit_hierarchical_url_results,
    limit_final_pagerank_results=rag_limit_final_pagerank_results,
)
# Retrieve context for the query from the RAG provider
rag_context = rag_interface.get_rag_context(query)

# LLM Provider Settings
llm_interface = LLMInterfaceManager.get_interface_from_args(
    LLMProviderName("openai"),
)
generation_config = GenerationConfig(
    model_name=llm_model_name,
    max_tokens_to_sample=llm_max_tokens_to_sample,
    temperature=llm_temperature,
    top_p=llm_top_p,
    # other generation params here ...
)

# Insert the retrieved context into the prompt template, then generate
formatted_prompt = raw_prompt.format(rag_context=rag_context)
completion = llm_interface.get_completion(formatted_prompt, generation_config)
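The snippet above leaves `raw_prompt` undefined. As a rough illustration of how such a template might look, the sketch below mirrors the "### Question" / "### Input" layout visible in the sample output earlier. The template wording, the `### Response:` section, the `{query}` placeholder, and the `build_prompt` helper are all assumptions made for illustration, not Synthesizer's actual prompt. The context string is copied from the sample output and stands in for a real `get_rag_context` result.

```python
# Hypothetical template; only the "### Question" and "### Input" headers are
# taken from the sample output, the rest is assumed.
RAW_PROMPT = (
    "### Question:\n{query}\n\n"
    "### Input:\n{rag_context}\n\n"
    "### Response:\n"
)


def build_prompt(query, rag_context):
    """Fill the template with the user query and the retrieved context."""
    return RAW_PROMPT.format(query=query, rag_context=rag_context)


# Stand-in for the string returned by the RAG provider.
rag_context = (
    "1. URL: https://en.wikipedia.org/wiki/History%20of%20wine (Score: 0.85)\n"
    "Title:History of wine"
)
prompt = build_prompt("what country did wine originate in", rag_context)
print(prompt)
```

`str.format` only interprets braces in the template itself, so curly braces inside the retrieved context are inserted verbatim and do not need escaping.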