Library for synthetic data generation

Project description

Code Generation using main_script.py

This script generates code snippets according to a set of configurable parameters. You can set the parameters directly in the script or by editing the JSON file in the config folder. The script may return fewer snippets than requested because the algorithm drops duplicates. The file prompts folder/prompt_list_and_probabilities holds the probabilities with which the four prompt templates are selected; edit this file to change the proportions.
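The two behaviours described above, weighted template selection and duplicate removal, can be sketched as follows. This is a minimal illustration and not the package's actual code: the template names, the probability values, and the helpers `pick_template` and `drop_duplicates` are all hypothetical, and the real duplicate check may use a different criterion than exact text equality.

```python
import random

# Hypothetical template names and probabilities; the real values live in
# the prompts folder and are not reproduced here.
TEMPLATE_PROBABILITIES = {
    "template_a": 0.4,
    "template_b": 0.3,
    "template_c": 0.2,
    "template_d": 0.1,
}


def pick_template(probabilities, rng):
    """Pick one template name, weighted by its probability."""
    names = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(names, weights=weights, k=1)[0]


def drop_duplicates(snippets):
    """Remove exact duplicates (ignoring surrounding whitespace).

    The first occurrence of each snippet is kept, in its original order.
    """
    seen = set()
    unique = []
    for snippet in snippets:
        key = snippet.strip()
        if key not in seen:
            seen.add(key)
            unique.append(snippet)
    return unique


rng = random.Random(0)  # fixed seed so the sketch is reproducible
chosen = [pick_template(TEMPLATE_PROBABILITIES, rng) for _ in range(5)]

generated = ["print('a')", "print('b')", "print('a')  "]
unique = drop_duplicates(generated)

print(chosen)
print(len(generated), "generated ->", len(unique), "kept")
```

Because duplicates are removed after generation, requesting a few more snippets than you need is one way to end up with the desired number.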

Parameters

  • api_key: Your API key for accessing the code generation service.
  • share: A dictionary specifying the share of snippets for each language.
  • Total_number: Total number of snippets to generate.
  • batch_size: Batch size for parallel processing.
  • n_jobs: Number of parallel jobs to run (null for automatic detection).
  • model: The model to use for code generation.
  • temperature_problem: Temperature for creating the problem text.
  • temperature_solution: Temperature for creating the solution text.
  • test: Whether to run in test mode (produces only 6 example generated samples).
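To make the interplay of share, Total_number and batch_size concrete, here is a rough sketch of how fractional shares could be turned into per-language counts and then split into batches. The helpers `allocate_counts` and `make_batches` and the share values are hypothetical; the script itself may round or schedule work differently.

```python
def allocate_counts(share, total_number):
    """Turn fractional shares into integer counts that sum to total_number."""
    total_share = sum(share.values())
    # Scale each share to a raw (possibly fractional) count.
    raw = {lang: total_number * s / total_share for lang, s in share.items()}
    counts = {lang: int(value) for lang, value in raw.items()}
    # Give the snippets lost to truncation to the largest remainders.
    leftover = total_number - sum(counts.values())
    by_remainder = sorted(
        raw, key=lambda lang: raw[lang] - counts[lang], reverse=True
    )
    for lang in by_remainder[:leftover]:
        counts[lang] += 1
    return counts


def make_batches(items, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Hypothetical share values, chosen only for illustration.
share = {"Python": 0.5, "C++": 0.25, "Java": 0.25}
counts = allocate_counts(share, total_number=10)

# One task per requested snippet, labelled with its language.
tasks = [lang for lang, n in counts.items() for _ in range(n)]
batches = make_batches(tasks, batch_size=3)

print(counts)
print(len(batches), "batches")
```

Dividing by the sum of the shares means the sketch still works if the shares do not add up to exactly 1.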

Default Values

Here are the default values for each parameter:

  • api_key: "aaa"
  • share: {"Python" : 0.40, "C++" : 0.05, ... }
  • Total_number: 100
  • batch_size: 3
  • n_jobs: null
  • model: "meta-llama/Llama-3-70b-chat-hf"
  • temperature_problem: 0.7
  • temperature_solution: 0.5
  • test: false
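As a sketch of how a JSON config could override these defaults, the snippet below merges a JSON document over a defaults dictionary. The function `load_config` and the example JSON are hypothetical, and the share dictionary lists only the two languages shown above because the rest is elided in the source.

```python
import json

# Defaults as listed above. JSON null/false map to Python None/False.
# Only the two share entries given in the source are included here.
DEFAULTS = {
    "api_key": "aaa",
    "share": {"Python": 0.40, "C++": 0.05},
    "Total_number": 100,
    "batch_size": 3,
    "n_jobs": None,
    "model": "meta-llama/Llama-3-70b-chat-hf",
    "temperature_problem": 0.7,
    "temperature_solution": 0.5,
    "test": False,
}


def load_config(json_text):
    """Parse JSON overrides and merge them over the defaults."""
    overrides = json.loads(json_text)
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Fail early on misspelled parameter names.
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


# Hypothetical config content; in practice this would be read from the
# JSON file in the config folder.
example = '{"Total_number": 20, "n_jobs": null, "test": true}'
config = load_config(example)

print(config["Total_number"], config["batch_size"], config["test"])
```

Rejecting unknown keys is a design choice of this sketch: parameter names are case-sensitive (note the capital T in Total_number), so a typo would otherwise be silently ignored.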

Running the Script

To run the script with the default parameters, use the following command:

python app/main_coder.py

## Authors
- Onur Alp Güvercin
- Cesare Bidini

Download files

Source Distribution

data_generation_hyper-0.0.3.tar.gz (4.6 kB)

Built Distribution

data_generation_hyper-0.0.3-py3-none-any.whl (5.1 kB, Python 3)
