
This library contains utilities for parsing GitHub repositories into pairs of function definitions and docstrings. It uses tree-sitter to parse code into ASTs and applies heuristics to extract more detailed metadata. It currently supports six languages: Python, Java, Go, PHP, Ruby, and JavaScript. For Python, it also parses function calls and links them to their definitions.

Project description

function_parser


Install

pip install function-parser

How to use

To use the library, you must first download and build the tree-sitter language grammars used to parse source code. The library includes a CLI tool that sets this up.

To download and build grammars: build_grammars

This command downloads and builds the grammars in the directory where pip installed this Python library.
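Before running the parser, it can help to confirm that the grammars were actually built. The sketch below is a minimal check, not part of the library: `find_grammar_library` is a hypothetical helper, and it assumes the compiled grammars live in a file named `tree-sitter-languages.so` inside the installed package directory, which is the path this README's usage example loads.

```python
import importlib.util
import os

# Assumption: this is the file name used in the usage example of this README.
GRAMMAR_FILENAME = "tree-sitter-languages.so"


def find_grammar_library(package_name="function_parser"):
    """Return the path to the compiled grammar file, or None if it is missing.

    Hypothetical helper: it locates the package directory without importing
    the package, then looks for the grammar file inside it.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        # Package is not installed, or the name refers to a plain module.
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    candidate = os.path.join(package_dir, GRAMMAR_FILENAME)
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    path = find_grammar_library()
    if path is None:
        print("Grammars not found; try running build_grammars first.")
    else:
        print("Grammars found at", path)
```

If the helper returns None, running build_grammars again is the likely remedy.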

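import function_parser
import os

import pandas as pd

from function_parser.language_data import LANGUAGE_METADATA
from function_parser.process import DataProcessor
from tree_sitter import Language

# Point the shared parser at the grammar file produced by build_grammars.
# Note: Language(path, name) is the constructor form used by older
# py-tree-sitter releases.
language = "python"
DataProcessor.PARSER.set_language(
    Language(os.path.join(function_parser.__path__[0], "tree-sitter-languages.so"), language)
)

# Create a processor that uses the language-specific parser.
processor = DataProcessor(
    language=language, language_parser=LANGUAGE_METADATA[language]["language_parser"]
)

# Extract function definitions from a GitHub repository given as "owner/name".
dependee = "keras-team/keras"
definitions = processor.process_dee(dependee, ext=LANGUAGE_METADATA[language]["ext"])

# Show the first five definitions as a table.
pd.DataFrame(definitions).head()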
| | nwo | sha | path | language | identifier | parameters | argument_list | return_statement | docstring | docstring_summary | docstring_tokens | function | function_tokens | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | backend | () | | return 'tensorflow' | Publicly accessible method for determining the... | Publicly accessible method for determining the... | [Publicly, accessible, method, for, determinin... | def backend():\n """Publicly accessible metho... | [def, backend, (, ), :, return, 'tensorflow'] | https://github.com/keras-team/keras/blob/e43af... |
| 1 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | cast_to_floatx | (x) | | return np.asarray(x, dtype=floatx()) | Cast a Numpy array to the default Keras float ... | Cast a Numpy array to the default Keras float ... | [Cast, a, Numpy, array, to, the, default, Kera... | def cast_to_floatx(x):\n """Cast a Numpy arra... | [def, cast_to_floatx, (, x, ), :, if, isinstan... | https://github.com/keras-team/keras/blob/e43af... |
| 2 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | get_uid | (prefix='') | | return layer_name_uids[prefix] | Associates a string prefix with an integer cou... | Associates a string prefix with an integer cou... | [Associates, a, string, prefix, with, an, inte... | def get_uid(prefix=''):\n """Associates a str... | [def, get_uid, (, prefix, =, '', ), :, graph, ... | https://github.com/keras-team/keras/blob/e43af... |
| 3 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | reset_uids | () | | | Resets graph identifiers. | Resets graph identifiers. | [Resets, graph, identifiers, .] | def reset_uids():\n """Resets graph identifie... | [def, reset_uids, (, ), :, PER_GRAPH_OBJECT_NA... | https://github.com/keras-team/keras/blob/e43af... |
| 4 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | clear_session | () | | | Resets all state generated by Keras.\n\n Kera... | Resets all state generated by Keras. | [Resets, all, state, generated, by, Keras, .] | def clear_session():\n """Resets all state ge... | [def, clear_session, (, ), :, global, _SESSION... | https://github.com/keras-team/keras/blob/e43af... |
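The columns above suggest one way to turn the output into function and docstring pairs. The sketch below is only illustrative: it assumes each definition is a dict keyed by the column names shown above, `docstring_pairs` is a hypothetical helper rather than part of the library, and the sample records are made up to mirror those columns.

```python
def docstring_pairs(definitions):
    """Collect (identifier, docstring_summary) pairs for documented functions.

    Assumes each record is a dict with "identifier" and "docstring_summary"
    keys, mirroring the columns in the table above.
    """
    pairs = []
    for record in definitions:
        summary = record.get("docstring_summary", "").strip()
        if summary:  # skip functions without a docstring
            pairs.append((record["identifier"], summary))
    return pairs


# Hypothetical sample records; real ones would come from the processor.
sample_definitions = [
    {"identifier": "reset_uids", "docstring_summary": "Resets graph identifiers."},
    {"identifier": "undocumented_helper", "docstring_summary": ""},
]

for identifier, summary in docstring_pairs(sample_definitions):
    print(f"{identifier}: {summary}")
```

Filtering on the summary rather than the full docstring keeps the check cheap, but either field would work under the same assumption.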
