
A set of PyTorch modules and utilities to train the code2seq model

Project description

code2seq


PyTorch implementation of the code2seq model.

Configuration

Use YAML files from the config directory to configure all processes. The model option defines the model; for now, the repository supports:

  • code2seq
  • typed-code2seq
  • code2class

data_folder is the path to the folder with the dataset. For checkpoints with a predefined config, users can specify the data folder via an argument in the corresponding script.
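The option names below are illustrative assumptions, not this package's exact schema; the authoritative keys live in the repository's config directory. A minimal sketch of what such a config entry could look like:

```yaml
# Hypothetical sketch of a config entry; actual keys may differ.
model: code2seq          # one of: code2seq, typed-code2seq, code2class
data_folder: ./data/java-small
hyper_parameters:
  embedding_size: 128
  decoder_size: 320
  batch_size: 512
```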

Data

The code2seq implementation supports the same data format as the original model. The only difference is how the vocabulary is stored. To recollect the vocabulary, use:

PYTHONPATH='.' python preprocessing/build_vocabulary.py
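The original code2seq data format stores one example per line: a target label followed by space-separated path contexts, each a comma-separated triple of start token, AST path, and end token, with sub-tokens joined by `|`. The exact separators here are assumptions based on the original code2seq preprocessing, not this package's parser; the sketch below shows, with the stdlib only, the kind of sub-token counting a vocabulary rebuild performs:

```python
from collections import Counter


def parse_example(line):
    """Split one raw line into (label sub-tokens, list of context triples).

    Assumed layout: "<label> <from>,<path>,<to> ...", with '|' joining
    sub-tokens inside each part.
    """
    label, *raw = line.strip().split(" ")
    contexts = [
        tuple(part.split("|") for part in ctx.split(","))
        for ctx in raw
    ]
    return label.split("|"), contexts


def count_tokens(lines):
    """Accumulate token, path-node, and target frequencies over a dataset."""
    tokens, nodes, targets = Counter(), Counter(), Counter()
    for line in lines:
        label, contexts = parse_example(line)
        targets.update(label)
        for start, path, end in contexts:
            tokens.update(start)
            tokens.update(end)
            nodes.update(path)
    return tokens, nodes, targets


tokens, nodes, targets = count_tokens(
    ["get|name my|field,Nm0|Fld|Mth,get|name"]
)
```

The resulting counters map each sub-token to its frequency, which is what a vocabulary builder would then truncate and serialize.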

Train model

To train a model, use the train.py script:

python train.py model

Use main.yaml to set up hyper-parameters. Use the corresponding configuration from configs/model to set up the dataset.

To resume training from a saved checkpoint, use the --resume argument:

python train.py model --resume checkpoint.ckpt

Evaluate model

To evaluate a trained model, use the test.py script:

python test.py checkpoint.ckpt

To specify the folder with data (in case evaluation runs on a different machine than training), use the --data-folder argument:

python test.py checkpoint.ckpt --data-folder path
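Models in the code2seq family are conventionally scored with sub-token precision, recall, and F1 over predicted names, as in the original code2seq paper. The function below is an illustrative stdlib sketch of that metric, not this package's evaluation code:

```python
from collections import Counter


def subtoken_f1(predicted, target):
    """Precision/recall/F1 over multisets of name sub-tokens."""
    pred, true = Counter(predicted), Counter(target)
    tp = sum((pred & true).values())  # size of the multiset intersection
    if tp == 0:
        return 0.0, 0.0, 0.0
    precision = tp / sum(pred.values())
    recall = tp / sum(true.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Predicting "get name" for the true name "get file name" hits 2 of 2
# predicted sub-tokens (precision 1.0) and 2 of 3 true ones (recall 2/3).
p, r, f = subtoken_f1(["get", "name"], ["get", "file", "name"])
```

Because the score ignores sub-token order, a prediction like "name get" would score the same as "get name".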
