Multi-language inference and training for code2seq

pycode2seq

Multi-language training and inference with the PyTorch implementation of the code2seq model.

Installation

python setup.py install

Inference

Minimal code example:

import sys
from pycode2seq import DefaultModelRunner


def main(argv):
    runner = DefaultModelRunner(save_path="./tmp")

    # List of embeddings, one per method in the file.
    # "kt" is the language identifier (here, Kotlin).
    method_embeddings = runner.run_embeddings_on_file(argv[1], "kt")

    # code2seq predictions, one per method
    predictions = runner.run_on_file(argv[1], "kt")

    # Predicted method names
    names = [runner.prediction_to_text(prediction) for prediction in predictions]
    print(names)


if __name__ == "__main__":
    main(sys.argv)
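
One possible use of the per-method embeddings is comparing methods by cosine similarity. The sketch below assumes each embedding can be converted to a flat sequence of floats (for example with `.tolist()` if it is a tensor); the return type of `run_embeddings_on_file` is not documented here. `cosine_similarity` and `most_similar_pair` are illustrative helpers, not part of pycode2seq, and the toy vectors are not real model output.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length float sequences."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def most_similar_pair(embeddings):
    """Return (i, j, similarity) for the most similar pair of embeddings."""
    if len(embeddings) < 2:
        raise ValueError("need at least two embeddings")
    return max(
        (
            (i, j, cosine_similarity(embeddings[i], embeddings[j]))
            for i, j in combinations(range(len(embeddings)), 2)
        ),
        key=lambda triple: triple[2],
    )


# Toy vectors standing in for method embeddings (assumption, not model output).
toy = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.1]]
i, j, sim = most_similar_pair(toy)
print(i, j, round(sim, 3))
```

With real output, the indices returned by `most_similar_pair` could be matched against the predicted names, assuming embeddings and predictions are returned in the same method order.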

Training

Download astminer and build it:

./gradlew shadowJar

Mine projects for paths:

python training/mine_projects.py <data folder> <output folder> <path to astminer's cli.sh>

Combine mined paths:

python training/astminer_to_code2seq.py <data folder/holdout> <output folder> <holdout>

Build the vocabulary with build_vocabulary.py from the code2seq module.

Combine vocabularies:

python training/combine_vocabularies.py

Expand weights:

python training/expand_weights.py
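
The mining and combining steps above can be scripted. The sketch below only builds the argument lists for the two commands whose arguments are spelled out, and prints them unless `dry_run` is turned off. The helper names (`mine_command`, `combine_command`, `run_all`) and the example paths and holdout name are hypothetical; the vocabulary and weight-expansion steps are left out because their arguments are not documented here. It assumes the script is run from the repository root, so that the `training/` paths resolve.

```python
import subprocess
import sys


def mine_command(data_folder, output_folder, astminer_cli):
    """argv for: python training/mine_projects.py <data> <output> <cli.sh>"""
    return [
        sys.executable, "training/mine_projects.py",
        str(data_folder), str(output_folder), str(astminer_cli),
    ]


def combine_command(holdout_folder, output_folder, holdout):
    """argv for: python training/astminer_to_code2seq.py <data/holdout> <output> <holdout>"""
    return [
        sys.executable, "training/astminer_to_code2seq.py",
        str(holdout_folder), str(output_folder), str(holdout),
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


# Placeholder paths and holdout name (assumptions for illustration only).
commands = [
    mine_command("data", "mined", "astminer/cli.sh"),
    combine_command("mined/train", "dataset", "train"),
]
run_all(commands, dry_run=True)
```

Keeping the commands as lists rather than shell strings avoids quoting problems when folder names contain spaces.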

