If you are using RAG (Retrieval Augmented Generation), you should be using RAFT!
| Paper | MSFT Blog | Meta Blog | Berkeley Blog |
One of the most significant uses of generative AI in the business sector is the development of natural language interfaces that tap into existing data repositories, for example to answer questions about specialized domains such as finance, law, and healthcare. Two methods are commonly used for this scenario: Domain-Specific Fine-tuning (DSF) and Retrieval Augmented Generation (RAG). Retrieval Augmented Fine-Tuning (RAFT) combines the two approaches, training the model for a domain-specific open-book exam.
RAFT makes it easy to:
- Synthetically generate training datasets for domain-specific RAG
- Clean up datasets and prepare them for fine-tuning
- Plug into fine-tuning frameworks from OpenAI, Azure, AWS, Llama-recipes, ...
- Serve fine-tuned RAG models with side-by-side comparisons
This repo contains the code for:
raft
├── inference
│ ├── README.md
│ ├── config
│ │ ├── conversation_example.yaml
│ │ ├── qa_example.yaml
│ ├── document
│ ├── evaluation
│ │ ├── evaluation.py # execution script for `raft eval`
│ │ ├── llm_judge.py # judge llm used during evaluation
│ ├── rag
│ │ ├── base_rag.py # the base rag template that is RAFT compatible
│ │ ├── compare_rag.py # execution script for `raft compare`
│ │ ├── constant.py
│ │ ├── directory_loader.py # a collection of chunking tools by file type
│ │ ├── serve_rag.py # host a rag server via FastAPI; execution script for `raft serve_rag`
│ │ ├── test.py
│ ├── train
│ │ ├── train_openai.py # execution script for `raft train`; support openai fine-tuning
│ ├── utils
│ ├── cli.py
│ ├── constant.py
│ ├── generate.py
RAFT Finetuning Data Generation
Overview
This script generates RAFT (Retrieval Augmented Fine-Tuning) data by first pre-processing documents with customizable chunking strategies (e.g. by_title for PDF files, by_html_tag for HTML files, or semantic for embedding-based partitioning of any file type), then generating question-answer pairs or conversations in RAFT format, and saving the results in the specified format (e.g. .json). The script supports various input formats, including PDF, TXT, JSON, HTML, and CSV files.
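The semantic strategy is easiest to picture as splitting a document wherever the embedding similarity between adjacent pieces drops. The sketch below is purely illustrative and is not the package's implementation: the stand-in `embed` uses character bigrams in place of a real embedding model such as text-embedding-ada-002, and `semantic_chunks` is a hypothetical helper name.

```python
import math

def embed(sentence):
    # Stand-in for a real embedding model: normalized character-bigram
    # counts, just so the similarity math below has something to work on.
    counts = {}
    for a, b in zip(sentence, sentence[1:]):
        counts[a + b] = counts.get(a + b, 0) + 1
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {k: v / norm for k, v in counts.items()}

def cosine(u, v):
    # Cosine similarity between two sparse (dict) unit vectors.
    return sum(u[k] * v.get(k, 0.0) for k in u)

def semantic_chunks(sentences, threshold=0.3):
    # Start a new chunk whenever the similarity between consecutive
    # sentence embeddings falls below the threshold.
    chunks, current = [], [sentences[0]]
    prev = embed(sentences[0])
    for s in sentences[1:]:
        cur = embed(s)
        if cosine(prev, cur) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(s)
        prev = cur
    chunks.append(" ".join(current))
    return chunks
```

With a real embedding provider, `embed` would call the configured model; the thresholding idea stays the same.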
Arguments
| Section | Argument | Type | Default | Example Value | Description |
|---|---|---|---|---|---|
| Input | `--datapath` | str | `""` | `/path/to/your/data.txt` | The path at which the document is located. |
| Output | `--output-dir` | str | `"./"` | `./output` | The path at which to save the dataset. |
| Output | `--output-format` | str | `"chat"` | `chat` | Format to convert the dataset to (`hf`, `chat`, `completion`). |
| Output | `--output-type` | str | `"jsonl"` | `jsonl` | File type to export the dataset to (`jsonl`). |
| Output | `--output-chat-system-prompt` | str | None | `"You are a helpful assistant."` | The system prompt to use when the output format is `chat`. |
| Generation | `--style` | str | `"qa"` | `qa` | Style of the generated dataset (`qa`, `conversation`). |
| Generation | `--questions` | int | 5 | 5 | The number of questions to generate per document chunk. |
| Generation | `--distractors` | int | 3 | 3 | The number of distractor documents to include per data point/triplet. |
| Generation | `--p` | float | 1.0 | 0.8 | The probability that the oracle document is included in the context. |
| Generation | `--chunk-size` | int | 512 | 1000 | The size of each chunk, in tokens. |
| Models | `--models-embedding-provider` | str | `"openai"` | `openai` | Provider for the embedding model. |
| Models | `--models-embedding-name` | str | `"text-embedding-ada-002"` | `text-embedding-ada-002` | The embedding model used to encode document chunks. |
| Models | `--models-generation-provider` | str | `"openai"` | `openai` | Provider for the generation model. |
| Models | `--models-generation-name` | str | `"gpt-4"` | `gpt-3.5-turbo` | The model used to generate questions and answers. |
| Execution | `--fast` | bool | False | True | Run the script in fast mode (no recovery implemented). |
| Config | `--config` | str | None | `config.yaml` | Path to the YAML configuration file. |
| Chunking - PDF | `--chunking-pdf-strategy` | str | None | `by_title` | Chunking strategy for PDF files. |
| Chunking - PDF | `--chunking-pdf-chunk-size` | int | None | 1000 | Chunk size for PDF files. |
| Chunking - PDF | `--chunking-pdf-max-characters` | int | None | 2000 | Max characters for PDF chunking. |
| Chunking - TXT | `--chunking-txt-strategy` | str | None | `basic` | Chunking strategy for TXT files. |
| Chunking - TXT | `--chunking-txt-chunk-size` | int | None | 500 | Chunk size for TXT files. |
| Chunking - JSON | `--chunking-json-strategy` | str | None | `recursive` | Chunking strategy for JSON files. |
| Chunking - JSON | `--chunking-json-chunk-size` | int | None | 800 | Chunk size for JSON files. |
| Chunking - HTML | `--chunking-html-strategy` | str | None | `by_html_tag` | Chunking strategy for HTML files. |
| Chunking - HTML | `--chunking-html-max-characters` | int | None | 1500 | Max characters for HTML chunking. |
| Chunking - CSV | `--chunking-csv-strategy` | str | None | `by_csv_row` | Chunking strategy for CSV files. |
| Chunking - CSV | `--chunking-csv-chunk-size` | int | None | 10 | Chunk size for CSV files. |
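As an illustration of how `--p` and `--distractors` interact, here is a minimal sketch (with hypothetical helper and field names, not the package's actual code) of assembling one training data point: with probability p the oracle chunk is kept in the context alongside the distractors; otherwise only distractors remain, which teaches the model to answer robustly when retrieval is imperfect.

```python
import random

def build_datapoint(question, answer, oracle_chunk, distractor_chunks,
                    p=1.0, rng=random):
    # With probability p, include the oracle (gold) chunk; otherwise the
    # context contains only distractors, so the model also learns cases
    # where the retrieved documents do not contain the answer.
    context = list(distractor_chunks)
    if rng.random() < p:
        context.append(oracle_chunk)
    rng.shuffle(context)
    return {"question": question, "context": context, "answer": answer}
```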
Usage
Generating RAFT Data
[Recommended] Method 1: `config.yaml` file
To generate RAFT data, we recommend drafting a config.yaml file to specify chunking strategies, model providers, and other parameters.
You can use our config template, obtained via raft get-configs, as a starting point for your use case; raft get-configs copies the template file directory into your current working directory.
After defining config.yaml, you can start generating RAFT fine-tuning data with raft generate --config config.yaml
A sample configuration file (config.yaml) could look like this:
input:
  datapath: "./data"
  doctype: "pdf"
output:
  dir: "./output"
  format: "json"
  type: "chat"
generation:
  questions: 5
  conversation_turns: 3
  style: "qa"
models:
  embedding:
    provider: "openai"
    name: "text-embedding-ada-002"
  generation:
    provider: "openai"
    name: "gpt-3.5-turbo"
execution:
  fast: false
chunking:
  pdf:
    strategy: "by_title"
    chunk_size: 1000
    max_characters: 2000
  txt:
    strategy: "basic"
    chunk_size: 500
  json:
    strategy: "recursive"
    chunk_size: 800
  html:
    strategy: "by_html_tag"
    max_characters: 1500
  csv:
    strategy: "by_csv_row"
    chunk_size: 10
chat_system_prompt: "You are a helpful assistant."
In plain words, this config file defines a RAFT data generation run that produces a JSON output file in chat style, with 5 questions per document chunk, the default 3 distractors, and 3 conversation turns. PDF files are chunked by title, TXT files with the basic strategy, JSON files recursively, HTML files by tag, and CSV files by row. The embedding model text-embedding-ada-002 encodes the document chunks, the generation model gpt-3.5-turbo generates the questions and answers, and the chat system prompt is set to "You are a helpful assistant."
[Recommended] Method 2: `config.yaml` file + CLI commands
If you want to use the template values as a starting point with minor changes, you can combine the config file with CLI flags; any value given on the command line overrides the corresponding value in the config YAML.
For example, you can override the datapath with your own data path without touching the other config parameters:
raft generate --config config.yaml --datapath /path/to/your/data
[Not Recommended] Method 3: Pure CLI commands
Alternatively, you can define your RAFT generation parameters entirely via CLI flags.
raft generate \
  --datapath /path/to/your/data.txt \
  --output-dir ./output \
  --output-format chat \
  --output-type jsonl \
  --output-chat-system-prompt "You are a helpful assistant." \
  --style qa \
  --questions 5 \
  --distractors 3 \
  --p 0.8 \
  --chunk-size 1000 \
  --models-embedding-provider openai \
  --models-embedding-name text-embedding-ada-002 \
  --models-generation-provider openai \
  --models-generation-name gpt-3.5-turbo \
  --fast True \
  --chunking-pdf-strategy by_title \
  --chunking-pdf-chunk-size 1000 \
  --chunking-pdf-max-characters 2000 \
  --chunking-txt-strategy basic \
  --chunking-txt-chunk-size 500 \
  --chunking-json-strategy recursive \
  --chunking-json-chunk-size 800 \
  --chunking-html-strategy by_html_tag \
  --chunking-html-max-characters 1500 \
  --chunking-csv-strategy by_csv_row \
  --chunking-csv-chunk-size 10
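For orientation, when `--output-format chat` and `--output-type jsonl` are combined, each output line is plausibly a chat-style record along the lines of OpenAI's fine-tuning format, as sketched below. This is an assumption based on the train_openai.py workflow; the exact schema raft emits may differ.

```python
import json

# Hypothetical shape of one chat-format JSONL record; placeholders ("...")
# stand in for the generated question, retrieved chunks, and answer.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Context: ...\n\nQuestion: ..."},
        {"role": "assistant", "content": "..."},
    ]
}
line = json.dumps(record)  # one line of the .jsonl output file
```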
RAG Server README
Overview
This README provides instructions for setting up and running a Retrieval-Augmented Generation (RAG) server using the provided arguments and commands. The RAG server integrates a retrieval mechanism with a generation model to provide enhanced responses based on the provided documents.
Arguments
| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| `--model_name` | str | Yes | N/A | Path to the base model for serving RAG |
| `--metadata_storage_path` | str | Yes | N/A | Path to metadata storage |
| `--document_storage_path` | str | Yes | N/A | Path to document storage |
| `--k` | int | No | 5 | Number of documents to retrieve |
| `--host` | str | No | 0.0.0.0 | Host for the RAG server |
| `--port` | int | No | 8000 | Port for the RAG server |
Usage
Starting the RAG Server
To start the RAG server, use the raft serve_rag command with the required arguments. Below is an example command:
raft serve_rag \
  --model_name {fine-tuned model name} \
  --metadata_storage_path ./artifact \
  --document_storage_path ./document
- Use the {fine-tuned model name} model available after OpenAI fine-tuning
- Store metadata in the ./artifact directory
- Store documents in the ./document directory
If ./artifact does not exist, raft will take all the supported documents (refer to rag/directory_loader.py) and build a FAISS vector database. If ./artifact exists, raft will load it as a FAISS storage directory and skip document ingestion.
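Once the server is up, it can be queried over HTTP. The client sketch below is illustrative only: the endpoint path and payload field are assumptions, so check the FastAPI-generated docs at http://<host>:<port>/docs for the routes serve_rag actually exposes.

```python
import json
from urllib import request

def build_rag_request(question, host="0.0.0.0", port=8000, endpoint="/chat"):
    # NOTE: endpoint and the "question" field are assumed names for
    # illustration, not the server's confirmed API.
    url = f"http://{host}:{port}{endpoint}"
    payload = json.dumps({"question": question}).encode()
    return url, payload

def query_rag(question, **kw):
    url, payload = build_rag_request(question, **kw)
    req = request.Request(url, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:  # blocks until the server responds
        return json.loads(resp.read())
```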
Project Roadmap
In the immediate future, we plan to release the following:
README
- Add an easier entry point so users can start using RAFT with minimal setup.
- Add cost estimations with examples (calculated using OpenAI token counts, etc.); of course, costs will vary by prompt.
Generate
- Add vLLM support for open-source LLM generation models
- Input Chunking: Add support for local embedding models
- Input: Option to take chunked documents as input.
- Refactor: Place prompts in the config file as well (?).
- Generate distractor documents using RAG
- Refusal @tianjunz
RAG
- Use refactored utils.data_preprocess to load data
- @Fanjia-Yan
Train (finetune)
- llama-recipes support
Evaluation
Propose a new task you would like to work on :star_struck:
Citation
If you use RAFT, please cite our paper:
@article{zhang2024raft,
title={Raft: Adapting language model to domain specific rag},
author={Zhang, Tianjun and Patil, Shishir G and Jain, Naman and Shen, Sheng and Zaharia, Matei and Stoica, Ion and Gonzalez, Joseph E},
journal={arXiv preprint arXiv:2403.10131},
year={2024}
}
File details
Details for the file raft_llm-0.1.6.tar.gz.
File metadata
- Download URL: raft_llm-0.1.6.tar.gz
- Upload date:
- Size: 35.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `48c250e36dfa128a667414f60ff466aba07fc2bb1fc327b05f44ef060484a88f` |
| MD5 | `c458ca5e246504d79d29c7434229d0b8` |
| BLAKE2b-256 | `74dd3026114bb05043e949ffa550ff9c60514da8e1b18ff579eb9382efc3b77a` |
File details
Details for the file raft_llm-0.1.6-py3-none-any.whl.
File metadata
- Download URL: raft_llm-0.1.6-py3-none-any.whl
- Upload date:
- Size: 37.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `90e7cf8469f7a8ed0b14d6d70fec8bc893bb0d07f739c8457990c72855c12945` |
| MD5 | `57f9cc4652e055e628685da84b9182c7` |
| BLAKE2b-256 | `fbf0e09f7c083995a0294c8659f3b22bcf446970511645b6a98accea266f5e24` |