OpenAI-Parallel-Toolkit is a Python library for handling multiple OpenAI API keys and parallel tasks. It provides API key rotation, multithreading for faster task execution, and utility functions to boost your OpenAI integration. Ideal for efficient large-scale OpenAI usage.

These details have not been verified by PyPI

Project links

Homepage

Project description

OpenAI Parallel Toolkit

This project uses API keys from OpenAI $5 accounts. Each $5 account is limited to 3 requests per minute. By purchasing a large number of $5 keys and combining key management with multithreaded parallel processing, the toolkit works around that limit when processing large amounts of data.

Parallel throughput is roughly the number of keys divided by 20: 20 keys give about 1 it/s, 40 keys give about 2 it/s, and so on. Note that each account can only make 200 requests per day.
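The figures above follow from the per-key limit: 3 requests per minute is one request every 20 seconds, so each key contributes 1/20 it/s. A minimal sketch of that arithmetic, where the helper names are illustrative and not part of the toolkit:

```python
# Illustrative arithmetic only; these helpers are not part of the toolkit.
REQUESTS_PER_MINUTE_PER_KEY = 3   # per-minute limit of a $5 account, as stated above
REQUESTS_PER_DAY_PER_KEY = 200    # daily cap per account, as stated above


def estimated_throughput(num_keys: int) -> float:
    """Return the aggregate request rate in iterations per second."""
    return num_keys * REQUESTS_PER_MINUTE_PER_KEY / 60


def daily_capacity(num_keys: int) -> int:
    """Return the total number of requests all keys can make in one day."""
    return num_keys * REQUESTS_PER_DAY_PER_KEY


if __name__ == "__main__":
    for keys in (20, 40):
        print(keys, "keys:", estimated_throughput(keys), "it/s,",
              daily_capacity(keys), "requests/day")
```

Under these limits, a key used continuously reaches its 200 daily requests after 200 × 20 s = 4000 s (about 67 minutes), so for long runs the daily cap rather than the per-minute limit bounds the total work.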

Features

✅ Automatically rotates OpenAI API keys when a key reaches its usage limit, with built-in error handling and automatic retries.
✅ Provides a solution for accessing OpenAI services from China through a proxy.
✅ Supports parallel processing of API calls and file operations to improve throughput, and can resume after an interruption.
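To illustrate the general idea behind key rotation with retries, here is a generic round-robin sketch. It is not the toolkit's implementation: `RateLimited`, `call_with_rotation`, and the `send` callback are hypothetical names introduced for this example, and `send` stands in for a real API call.

```python
import itertools


class RateLimited(Exception):
    """Hypothetical error raised by `send` when a key has hit its limit."""


def call_with_rotation(keys, send, max_retries=5):
    """Call `send(key)` with successive keys until one call succeeds.

    `send` is a caller-supplied function standing in for a real API request.
    """
    if not keys:
        raise ValueError("at least one key is required")
    key_cycle = itertools.cycle(keys)  # round-robin over the available keys
    last_error = None
    for _ in range(max_retries):
        key = next(key_cycle)
        try:
            return send(key)
        except RateLimited as error:
            last_error = error  # this key is exhausted; try the next one
    raise RuntimeError("all retries failed") from last_error
```

A real implementation would likely also wait between attempts and remember which keys are exhausted; those details are omitted here.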

Installation

pip install openai-parallel-toolkit

Usage

Three usage methods are currently provided:

Process a dataset in parallel, with support for resuming after an interruption.
Process multiple items simultaneously in code.
Process a single item in code.

1. Dataset parallel processing

Both the input and the output use the JSONL format, with one JSON object per line.

Example of the input file input.jsonl:

{"index": 0, "instruction": "Translate this sentence into English", "input": "Today the weather is really good"}
{"index": 1, "instruction": "Write a sentence", "input": ""}
{"index": 2, "instruction": "Translate this sentence into English", "input": "How old are you"}
{"index": 3, "instruction": "Write a joke", "input": ""}
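An input file in this shape can be generated with the standard `json` module. The sketch below is one way to do it; only the field names (`index`, `instruction`, `input`) come from the example above, while the helper names and sample rows are assumptions made for illustration.

```python
import json


def build_lines(rows):
    """Turn (instruction, input) pairs into JSONL lines with a running index."""
    lines = []
    for index, (instruction, text) in enumerate(rows):
        record = {"index": index, "instruction": instruction, "input": text}
        # ensure_ascii=False keeps non-ASCII prompts readable in the file
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines


def write_jsonl(path, rows):
    """Write the generated lines to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in build_lines(rows):
            f.write(line + "\n")


if __name__ == "__main__":
    sample = [("Translate this sentence into English", "How old are you"),
              ("Write a joke", "")]
    for line in build_lines(sample):
        print(line)
```

Calling `write_jsonl("input.jsonl", sample)` would then produce a file that matches the layout shown above.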

Example of the output file output.jsonl, where each line maps an input index to its answer:

{"0": "The weather is really nice today."}
{"1": "I am trying hard to think about how to answer your question."}
{"2": "How old are you?"}
{"3": "Why pull the cow to the church? \n\n Because it is a “pastor”!"}
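Because each output line is keyed by the input index, the results can be merged back into a single mapping, and any indexes still missing can be listed (for example, to check progress after an interrupted run). This is a sketch under the assumption that every output line is a one-entry object as in the example; how the toolkit itself tracks progress is not described here, and the function names are hypothetical.

```python
import json


def parse_output_lines(lines):
    """Merge lines of the form {"<index>": "<answer>"} into {int index: answer}."""
    results = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for key, answer in json.loads(line).items():
            results[int(key)] = answer
    return results


def pending_indexes(input_indexes, results):
    """Return the input indexes that have no answer yet, in input order."""
    return [i for i in input_indexes if i not in results]


if __name__ == "__main__":
    done = parse_output_lines(['{"0": "The weather is really nice today."}',
                               '{"2": "How old are you?"}'])
    print(pending_indexes(range(4), done))
```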

Python code for processing datasets:

from openai_parallel_toolkit import ParallelToolkit

if __name__ == '__main__':
    # Read prompts from input.jsonl and write the answers to output.jsonl.
    ParallelToolkit(config_path="config.json",
                    input_path="input.jsonl",
                    output_path="output.jsonl").run()

ParallelToolkit parameters:

config_path: Configuration file path.
input_path: Input file path.
output_path: Output file path.
max_retries: Maximum number of retries, default is 5.
threads: Number of threads, default is 20. The number of threads actually used is the minimum of half the number of keys and the number of records in the dataset.
name: Progress bar name, default is "ParallelToolkit Progress".
openai_model: Model name, default is gpt-3.5-turbo-0613. Note that the $5 account cannot use gpt-4.
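The thread-count rule in the threads parameter can be sketched as follows. This is an interpretation of the description, not the toolkit's code: it assumes that "half the number of keys" is rounded down and that at least one thread is always used, and the function name is hypothetical.

```python
def effective_threads(num_keys: int, num_records: int) -> int:
    """Return min(half the keys, number of records), never less than 1.

    Assumptions (not confirmed by the documentation): integer division
    for "half the number of keys" and a floor of one thread.
    """
    return max(1, min(num_keys // 2, num_records))


if __name__ == "__main__":
    print(effective_threads(40, 1000))  # limited by the keys
    print(effective_threads(40, 5))     # limited by the dataset size
```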

2. Processing multiple items simultaneously in code

Build a dict of Prompt named tuples and pass it to the parallel_api method.

from openai_parallel_toolkit import ParallelToolkit, Prompt

if __name__ == '__main__':
    # Map each key to a Prompt; all prompts are processed in parallel.
    data = {i: Prompt(instruction="Please write a sentence about the following topic: ", input="china") for i in range(10)}
    ans = ParallelToolkit(config_path="config.json").parallel_api(data=data)
    print(ans)
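For readers unfamiliar with the pattern, the general technique of applying a function to a keyed dict in parallel can be sketched with the standard library's thread pool. This is not the toolkit's implementation, and the return type of parallel_api is not documented here; `parallel_map` and `worker` are hypothetical names, with `worker` standing in for a single API call.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(data, worker, threads=20):
    """Apply `worker` to every value of `data` concurrently.

    Returns a dict with the same keys, mapping each key to its result.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Submit one task per item, remembering which key it belongs to.
        futures = {key: pool.submit(worker, value) for key, value in data.items()}
        # result() blocks until the task finishes and re-raises any error.
        return {key: future.result() for key, future in futures.items()}


if __name__ == "__main__":
    print(parallel_map({i: "topic %d" % i for i in range(3)}, str.upper, threads=2))
```

Threads suit this workload because the time is spent waiting on network responses rather than on computation.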

3. Processing a single item in code

from openai_parallel_toolkit import ParallelToolkit, Prompt

if __name__ == '__main__':
    # Send one prompt and print the answer.
    prompt = Prompt(instruction="Please write a sentence about the following topic: ", input="flowers")
    ans = ParallelToolkit(config_path="config.json").api(prompt=prompt)
    print(ans)

`config.json`

The config.json file contains your OpenAI API keys and api_base.

You can create the config.json file as follows:

{
  "api_keys": [
    "Your api key 1",
    "Your api key 2",
    "Your api key 3"
  ],
  "api_base": "Your api_base"
}

In this JSON, api_keys is an array that contains your OpenAI API keys. Please replace "Your api key 1", "Your api key 2", "Your api key 3" with your actual API keys. If you only have one API key, then this array only needs to contain one element.

"api_base" is the base URL used to access the OpenAI API.
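If the keys are kept in a list, the config.json structure shown above can be generated rather than written by hand. The following is a small sketch; `build_config` is a hypothetical helper, and only the `api_keys` and `api_base` field names come from the format above.

```python
import json


def build_config(api_keys, api_base=None):
    """Build the config.json structure from a list of keys."""
    keys = [key.strip() for key in api_keys if key.strip()]  # drop blank entries
    if not keys:
        raise ValueError("at least one API key is required")
    config = {"api_keys": keys}
    if api_base:
        config["api_base"] = api_base  # only written when provided
    return config


if __name__ == "__main__":
    config = build_config(["Your api key 1", "Your api key 2"],
                          api_base="Your api_base")
    print(json.dumps(config, indent=2))
```

Writing the printed JSON to config.json gives a file in the expected layout.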

Accessing OpenAI services from China through a proxy

If you're running the program and the progress bar isn't showing any progress, it's possible that you're experiencing connectivity issues, particularly if you're in China or another region where accessing OpenAI is challenging.

To resolve this issue, we recommend deploying your own proxy and passing in api_base. You can refer to the OpenAI Proxy project for more details.

The OpenAI Proxy project provides a method based on Cloudflare's proxy, which allows up to 100,000 free calls per day. This can help work around the connectivity issue and keep your program running smoothly.

Remember to replace the project link with the actual URL for your specific situation.

If you don't need the api_base field, you can omit it from the config.json file.
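When reading such a file in your own scripts, an optional api_base can be handled with a fallback. The sketch below assumes a fallback to OpenAI's public endpoint, `https://api.openai.com/v1`; whether the toolkit uses the same fallback is not stated here, and `parse_config` is a hypothetical helper.

```python
import json

# OpenAI's public endpoint, used here as an assumed fallback.
DEFAULT_API_BASE = "https://api.openai.com/v1"


def parse_config(text):
    """Parse config.json content and return (api_keys, api_base)."""
    config = json.loads(text)
    keys = config.get("api_keys", [])
    if not isinstance(keys, list) or not keys:
        raise ValueError("api_keys must be a non-empty list")
    # api_base is optional; fall back to the default when it is absent or empty.
    return keys, config.get("api_base") or DEFAULT_API_BASE


if __name__ == "__main__":
    print(parse_config('{"api_keys": ["Your api key 1"]}'))
```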

Project details

Release history

1.1.3 (Aug 24, 2023)
1.1.2 (Aug 1, 2023)
1.1.1 (Aug 1, 2023)
1.1.0 (Jul 31, 2023)
1.0.0 (Jul 28, 2023), this version
0.6.0 (Jul 24, 2023)
0.5.2 (Jun 26, 2023)
0.5.1 (Jun 26, 2023)
0.5.0 (Jun 19, 2023)
0.4.1 (May 25, 2023)
0.4 (May 23, 2023)
0.3 (May 23, 2023)
0.2 (May 18, 2023)
0.1 (May 17, 2023)

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

openai_parallel_toolkit-1.0.0-py3-none-any.whl (5.6 kB)

Uploaded Jul 28, 2023 Python 3

Hashes for openai_parallel_toolkit-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`29a1a7fbeaec8047508401b88836264d4dec4172e24d5d6f038693691e1b9770`
MD5	`f32afd5d0aefed3a36bc20b2a4bbf012`
BLAKE2b-256	`c616925ba97aab77ff338a4dafa13f0c16b70b4449da45222eea03aceb61e2af`