Inference and training for multiple languages of code2seq
pycode2seq
Training and inference for multiple languages with the PyTorch implementation of the code2seq model.
Installation
python setup.py install
Inference
Minimal code example:
import sys

from pycode2seq import DefaultModelRunner


def main(argv):
    runner = DefaultModelRunner(
        save_path="./tmp",
    )

    # List of embeddings for each method
    method_embeddings = runner.run_embeddings_on_file(argv[1], "kt")

    # Code2seq predictions
    predictions = runner.run_on_file(argv[1], "kt")

    # Predicted method names
    names = [runner.prediction_to_text(prediction) for prediction in predictions]


if __name__ == "__main__":
    main(sys.argv)
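To run the same prediction over a whole project rather than a single file, a small helper along these lines may be enough. This is a minimal sketch, not part of pycode2seq: the function name predict_method_names is hypothetical, and it assumes that the language tag passed to run_on_file ("kt" above) matches the file extension. It only calls run_on_file and prediction_to_text, as in the example above.

```python
from pathlib import Path


def predict_method_names(runner, source_dir, language="kt"):
    """Map each source file under source_dir to its predicted method names.

    `runner` is any object exposing run_on_file and prediction_to_text,
    as DefaultModelRunner does in the minimal example.
    """
    results = {}
    # Assumption: the language tag doubles as the file extension ("kt" -> *.kt).
    for path in sorted(Path(source_dir).rglob(f"*.{language}")):
        predictions = runner.run_on_file(str(path), language)
        results[str(path)] = [
            runner.prediction_to_text(prediction) for prediction in predictions
        ]
    return results
```

Called as predict_method_names(runner, "path/to/project"), it returns a dictionary keyed by file path. Because the runner is passed in, the helper can be exercised with a stub object before any model is loaded.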
Training
Download astminer and run:
./gradlew shadowJar
Mine projects for paths:
python training/mine_projects.py <data folder> <output folder> <path to astminer's cli.sh>
Combine mined paths:
python training/astminer_to_code2seq.py <data folder/holdout> <output folder> <holdout>
Build the vocabulary with build_vocabulary.py from the code2seq module.
Combine vocabularies:
python training/combine_vocabularies.py
Expand weights:
python training/expand_weights.py
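The training steps above can be chained from a small driver script. The following is a sketch under stated assumptions, not a script shipped with the project: the function names are hypothetical, the holdout names ("train", "val", "test") are assumed, and `<data folder/holdout>` is read as a subfolder of the data folder. The build_vocabulary.py step is left out because its invocation is not shown here; it would run between the two command groups.

```python
import shlex
import subprocess
import sys


def path_mining_commands(data_dir, output_dir, astminer_cli,
                         holdouts=("train", "val", "test")):
    """Commands for mining paths and combining them per holdout.

    The default holdout names are an assumption, not taken from the project.
    """
    python = sys.executable
    commands = [
        [python, "training/mine_projects.py", data_dir, output_dir, astminer_cli],
    ]
    for holdout in holdouts:
        # Assumption: <data folder/holdout> is a subfolder of the data folder.
        commands.append([
            python, "training/astminer_to_code2seq.py",
            f"{data_dir}/{holdout}", output_dir, holdout,
        ])
    return commands


def vocabulary_commands():
    """Commands to run after the vocabulary has been built."""
    python = sys.executable
    return [
        [python, "training/combine_vocabularies.py"],
        [python, "training/expand_weights.py"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)
```

With dry_run=True (the default) run_commands only prints the commands, which makes it easy to check the arguments before starting a long mining job.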