Synthesizer: A Framework for LLM Powered Data.

Synthesizer[ΨΦ]: A multi-purpose LLM framework 💡

With Synthesizer, users can:

  • Custom Data Creation: Generate datasets via LLMs that are tailored to your needs, for LLM training, RAG, and more.
    • Supported LLM providers: Anthropic, OpenAI, vLLM, and HuggingFace.
  • Retrieval-Augmented Generation (RAG) on Demand: Built-in RAG Provider Interface to anchor generated data to real-world sources.
    • Turnkey integration with the Agent Search API.

Fast Setup

pip install sciphi-synthesizer
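
To confirm the install succeeded, a minimal sketch using only the Python standard library is shown here. The helper name `installed_version` is hypothetical and not part of Synthesizer; it simply looks up the distribution name used in the `pip install` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "sciphi-synthesizer") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "sciphi-synthesizer is not installed")
```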

Using Synthesizer

  1. Generate synthetic question-answer pairs

    export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
    python -m synthesizer.scripts.data_augmenter run --dataset="wiki_qa"
    
    tail augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl
    { "formatted_prompt": "... ### Question:\nwhat country did wine originate in\n\n### Input:\n1. URL: https://en.wikipedia.org/wiki/History%20of%20wine (Score: 0.85)\nTitle:History of wine....",
    { "completion": "Wine originated in the South Caucasus, which is now part of modern-day Armenia ..."
    
  2. Evaluate RAG pipeline performance

    export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
    python -m synthesizer.scripts.rag_harness --rag_provider="agent-search" --llm_provider_name="sciphi" --n_samples=25
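
The JSONL file written in step 1 can be inspected from Python rather than with `tail`. The sketch below is standard-library only and assumes nothing about the file beyond one JSON object per line; the function names `iter_jsonl` and `summarize` are hypothetical helpers, and the `formatted_prompt` and `completion` keys are taken from the sample output above.

```python
import json
from pathlib import Path
from typing import Dict, Iterator, List


def iter_jsonl(path) -> Iterator[Dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def summarize(path, width: int = 80) -> List[Dict[str, str]]:
    """Return each record with every value truncated to `width` characters."""
    return [
        {key: str(value)[:width] for key, value in record.items()}
        for record in iter_jsonl(path)
    ]
```

For example, `summarize("augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl")` would return a shortened view of each record, assuming the file exists at that path.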
    

Documentation

For more detailed information, tutorials, and API references, please visit the official Synthesizer Documentation.

Community & Support

  • Engage with our vibrant community on Discord.
  • For tailored inquiries or feedback, please email us.

Developing with Synthesizer

Quickly set up retrieval-augmented generation with your choice of provider (OpenAI, Anthropic, vLLM, or SciPhi):

# Requires SCIPHI_API_KEY in env

from synthesizer.core import LLMProviderName, RAGProviderName
from synthesizer.interface import LLMInterfaceManager, RAGInterfaceManager
from synthesizer.llm import GenerationConfig

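# Illustrative placeholder values (assumptions, not library defaults); adjust as needed
query = "What country did wine originate in?"
raw_prompt = "### Question:\n" + query + "\n\n### Input:\n{rag_context}\n\n### Response:\n"
rag_limit_hierarchical_url_results = 50
rag_limit_final_pagerank_results = 20
llm_model_name = "gpt-3.5-turbo"
llm_max_tokens_to_sample = 256
llm_temperature = 0.1
llm_top_p = 1.0

# RAG Provider Settings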
rag_interface = RAGInterfaceManager.get_interface_from_args(
    RAGProviderName("agent-search"),
    limit_hierarchical_url_results=rag_limit_hierarchical_url_results,
    limit_final_pagerank_results=rag_limit_final_pagerank_results,
)
rag_context = rag_interface.get_rag_context(query)

# LLM Provider Settings
llm_interface = LLMInterfaceManager.get_interface_from_args(
    LLMProviderName("openai"),
)

generation_config = GenerationConfig(
    model_name=llm_model_name,
    max_tokens_to_sample=llm_max_tokens_to_sample,
    temperature=llm_temperature,
    top_p=llm_top_p,
    # other generation params here ...
)

formatted_prompt = raw_prompt.format(rag_context=rag_context)
completion = llm_interface.get_completion(formatted_prompt, generation_config)
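
The prompt-assembly step in the snippet above (`raw_prompt.format(rag_context=rag_context)`) can be tried in isolation, without any provider or API key. The following is a minimal sketch: the `### Question:` and `### Input:` section markers mirror the sample output shown earlier, while the `### Response:` marker, the `RAW_PROMPT` template, and the `build_prompt` helper are assumptions made for illustration and are not taken from the library.

```python
# Hypothetical template; only the Question/Input markers appear in the sample output.
RAW_PROMPT = (
    "### Question:\n{question}\n\n"
    "### Input:\n{rag_context}\n\n"
    "### Response:\n"
)


def build_prompt(question: str, rag_context: str, template: str = RAW_PROMPT) -> str:
    """Fill the template with the user question and the retrieved context."""
    return template.format(question=question, rag_context=rag_context)


if __name__ == "__main__":
    print(build_prompt(
        "what country did wine originate in",
        "1. URL: https://en.wikipedia.org/wiki/History%20of%20wine",
    ))
```

Keeping the template separate from the retrieval and completion calls makes it easy to swap prompts while leaving the provider setup untouched.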

