Library for synthetic data generation
Code Generation using main_script.py
This script generates code snippets according to a set of parameters. You can set the parameters directly in the script or by editing the JSON file in the config folder. The number of generated snippets can be lower than the number requested, because the algorithm drops duplicates. In the prompts folder, prompt_list_and_probabilities contains the probabilities with which the four prompt templates are selected; edit that file to change the proportions.
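The two behaviours described above, weighted template selection and duplicate removal, can be illustrated with a minimal sketch. It is not the package's actual code: the template names, the probabilities, and the helpers `pick_template` and `drop_duplicates` are hypothetical, and treating snippets as duplicates when they match after stripping whitespace is an assumption.

```python
import random

def pick_template(templates, probabilities, rng):
    """Pick one template according to its selection probability."""
    return rng.choices(templates, weights=probabilities, k=1)[0]

def drop_duplicates(snippets):
    """Keep the first occurrence of each snippet, preserving order."""
    seen = set()
    unique = []
    for snippet in snippets:
        key = snippet.strip()  # assumption: compare after stripping whitespace
        if key not in seen:
            seen.add(key)
            unique.append(snippet)
    return unique

# Hypothetical template names and probabilities (four templates, as in the README).
templates = ["template_a", "template_b", "template_c", "template_d"]
probabilities = [0.4, 0.3, 0.2, 0.1]

rng = random.Random(0)  # seeded so repeated runs pick the same templates
chosen = [pick_template(templates, probabilities, rng) for _ in range(10)]

# Four generated snippets, two of which are repeats.
generated = ["print('hi')", "print('hi')", "x = 1", "x = 1 "]
unique = drop_duplicates(generated)
print(len(generated), "generated ->", len(unique), "kept")
```

This is why a run that requests a given number of snippets can return fewer: repeats are discarded after generation.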
Parameters
- `api_key`: Your API key for accessing the code generation service.
- `share`: A dictionary specifying the share of snippets for each language.
- `Total_number`: Total number of snippets to extract.
- `batch_size`: Batch size for parallel processing.
- `n_jobs`: Number of parallel jobs to run (null for automatic detection).
- `model`: The model to use for code generation.
- `temperature_problem`: Temperature for creating the problem text.
- `temperature_solution`: Temperature for creating the solution text.
- `test`: Whether to run in test mode (produces only 6 example generated samples).
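As an illustration of how `share` and `Total_number` might interact, the sketch below turns per-language shares into per-language snippet counts. The helper `snippets_per_language`, the share values, and the rounding rule are assumptions made for this example; the package may split the total differently.

```python
def snippets_per_language(share, total_number):
    """Convert per-language shares into snippet counts.

    Assumption: each count is the share times the total, rounded to the
    nearest integer. The real script may allocate differently.
    """
    if any(fraction < 0 for fraction in share.values()):
        raise ValueError("shares must be non-negative")
    return {lang: round(total_number * fraction) for lang, fraction in share.items()}

# Hypothetical shares chosen for this example only.
example_share = {"Python": 0.5, "C++": 0.25, "Java": 0.25}
counts = snippets_per_language(example_share, 100)
print(counts)
```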
Default Values
Here are the default values for each parameter:
- `api_key`: "aaa"
- `share`: {"Python" : 0.40, "C++" : 0.05, ... }
- `Total_number`: 100
- `batch_size`: 3
- `n_jobs`: null
- `model`: "meta-llama/Llama-3-70b-chat-hf"
- `temperature_problem`: 0.7
- `temperature_solution`: 0.5
- `test`: false
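The following sketch shows one way a JSON config could be merged over these defaults. It is not the package's loader: the function `load_config` and the file name `config.json` are hypothetical, and the `share` dictionary lists only the two entries the README shows (the rest are elided there). JSON `null` and `false` map to Python `None` and `False`.

```python
import json
import tempfile
from pathlib import Path

# Defaults as listed in the README.
DEFAULTS = {
    "api_key": "aaa",
    "share": {"Python": 0.40, "C++": 0.05},  # remaining languages are elided in the README
    "Total_number": 100,
    "batch_size": 3,
    "n_jobs": None,  # JSON null: automatic detection
    "model": "meta-llama/Llama-3-70b-chat-hf",
    "temperature_problem": 0.7,
    "temperature_solution": 0.5,
    "test": False,
}

def load_config(path):
    """Read a JSON file and overlay its values on the defaults (hypothetical helper)."""
    overrides = json.loads(Path(path).read_text(encoding="utf-8"))
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}

# Write a small override file to a temporary folder, then load it.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"  # hypothetical file name
    cfg_path.write_text(json.dumps({"Total_number": 20, "test": True}), encoding="utf-8")
    config = load_config(cfg_path)

print(config["Total_number"], config["test"], config["n_jobs"])
```

Rejecting unknown keys is a design choice in this sketch: it catches misspelled parameter names such as `total_number` instead of silently ignoring them.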
Running the Script
To run the script with the default parameters, use the following command:
python app/main_coder.py
## Authors
- Onur Alp Güvercin
- Cesare Bidini