Project description

function_parser

This library provides utilities for parsing GitHub repositories into pairs of function definitions and docstrings. It uses tree-sitter to parse code into ASTs and applies heuristics to extract more detailed metadata. It currently supports six languages: Python, Java, Go, PHP, Ruby, and JavaScript. For Python, it also parses function calls and links them to their definitions.
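
To make "function definition and docstring pairs" concrete, here is a minimal sketch of the idea for Python source only. It deliberately uses Python's standard `ast` module rather than tree-sitter, so it is not how this library is implemented; `extract_pairs` and `SAMPLE` are hypothetical names chosen for illustration.

```python
import ast


def extract_pairs(source: str):
    """Return (function name, docstring) pairs found in Python source code."""
    tree = ast.parse(source)
    pairs = []
    for node in ast.walk(tree):
        # Collect both regular and async function definitions.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_docstring returns None when there is no docstring.
            pairs.append((node.name, ast.get_docstring(node) or ""))
    return pairs


# Hypothetical input used only for this illustration.
SAMPLE = '''
def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name

def undocumented():
    pass
'''

for name, docstring in extract_pairs(SAMPLE):
    print(name, "->", repr(docstring))
```

The library itself records far more per function (tokens, parameters, return statements, source URL) and works across several languages, which is where a language-agnostic parser such as tree-sitter becomes useful.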

Install

pip install function-parser

How to use

To use the library, you must first download and build the tree-sitter language grammars that are used to parse source code. The library includes a CLI tool that sets this up.

To download and build the grammars, run: build_grammars

This command downloads the grammars and builds them in the directory where pip installed this library.
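
Before running the usage example, it can help to confirm that the grammars were actually built. The sketch below assumes the build produces a file named `tree-sitter-languages.so` inside the installed package directory, which is the path the usage example loads; `grammar_library_path` and `grammars_are_built` are hypothetical helpers, not part of the library. The demo uses a temporary directory as a stand-in for the package directory so it runs without the library installed.

```python
import os
import tempfile

# File name taken from the usage example; assumed to be what build_grammars creates.
GRAMMAR_FILENAME = "tree-sitter-languages.so"


def grammar_library_path(package_dir: str) -> str:
    """Return the expected location of the compiled grammar library."""
    return os.path.join(package_dir, GRAMMAR_FILENAME)


def grammars_are_built(package_dir: str) -> bool:
    """Report whether the compiled grammar library exists in package_dir."""
    return os.path.isfile(grammar_library_path(package_dir))


# Demo against a temporary directory standing in for the package directory.
with tempfile.TemporaryDirectory() as fake_package_dir:
    before = grammars_are_built(fake_package_dir)
    # Simulate a finished build by creating an empty placeholder file.
    open(grammar_library_path(fake_package_dir), "wb").close()
    after = grammars_are_built(fake_package_dir)

print(before, after)  # prints: False True
```

With the library installed, the same check would be called as `grammars_are_built(function_parser.__path__[0])`.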

import os

import pandas as pd
from tree_sitter import Language

import function_parser
from function_parser.language_data import LANGUAGE_METADATA
from function_parser.process import DataProcessor

# Point the shared parser at the grammar library built by build_grammars.
language = "python"
DataProcessor.PARSER.set_language(
    Language(os.path.join(function_parser.__path__[0], "tree-sitter-languages.so"), language)
)
processor = DataProcessor(
    language=language, language_parser=LANGUAGE_METADATA[language]["language_parser"]
)

# Parse a GitHub repository, given as "owner/name", and preview the first rows.
dependee = "keras-team/keras"
definitions = processor.process_dee(dependee, ext=LANGUAGE_METADATA[language]["ext"])
pd.DataFrame(definitions).head()
| | nwo | sha | path | language | identifier | parameters | argument_list | return_statement | docstring | docstring_summary | docstring_tokens | function | function_tokens | url |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | backend | () | | return 'tensorflow' | Publicly accessible method for determining the... | Publicly accessible method for determining the... | [Publicly, accessible, method, for, determinin... | def backend():\n """Publicly accessible metho... | [def, backend, (, ), :, return, 'tensorflow'] | https://github.com/keras-team/keras/blob/e43af... |
| 1 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | cast_to_floatx | (x) | | return np.asarray(x, dtype=floatx()) | Cast a Numpy array to the default Keras float ... | Cast a Numpy array to the default Keras float ... | [Cast, a, Numpy, array, to, the, default, Kera... | def cast_to_floatx(x):\n """Cast a Numpy arra... | [def, cast_to_floatx, (, x, ), :, if, isinstan... | https://github.com/keras-team/keras/blob/e43af... |
| 2 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | get_uid | (prefix='') | | return layer_name_uids[prefix] | Associates a string prefix with an integer cou... | Associates a string prefix with an integer cou... | [Associates, a, string, prefix, with, an, inte... | def get_uid(prefix=''):\n """Associates a str... | [def, get_uid, (, prefix, =, '', ), :, graph, ... | https://github.com/keras-team/keras/blob/e43af... |
| 3 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | reset_uids | () | | | Resets graph identifiers. | Resets graph identifiers. | [Resets, graph, identifiers, .] | def reset_uids():\n """Resets graph identifie... | [def, reset_uids, (, ), :, PER_GRAPH_OBJECT_NA... | https://github.com/keras-team/keras/blob/e43af... |
| 4 | keras-team/keras | e43af6c89cd6c4adecc21ad5fc05b21e7fa9477b | keras/backend.py | python | clear_session | () | | | Resets all state generated by Keras.\n\n Kera... | Resets all state generated by Keras. | [Resets, all, state, generated, by, Keras, .] | def clear_session():\n """Resets all state ge... | [def, clear_session, (, ), :, global, _SESSION... | https://github.com/keras-team/keras/blob/e43af... |
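
Once the definitions are available, a common next step is to keep only the documented functions. The sketch below assumes the result is a list of dictionaries keyed by the column names shown above (suggested by the `pd.DataFrame(definitions)` call, but not confirmed here); `documented_pairs` and the sample records are hypothetical and are not real output from the library.

```python
def documented_pairs(definitions):
    """Return (identifier, docstring_summary) for records that have a docstring."""
    pairs = []
    for record in definitions:
        summary = record.get("docstring_summary", "").strip()
        if summary:  # skip functions without a docstring summary
            pairs.append((record["identifier"], summary))
    return pairs


# Hypothetical records that only mimic the column names in the preview.
sample_definitions = [
    {"identifier": "add", "docstring_summary": "Add two numbers.", "path": "math_utils.py"},
    {"identifier": "helper", "docstring_summary": "", "path": "math_utils.py"},
    {"identifier": "scale", "docstring_summary": "Scale a value.", "path": "math_utils.py"},
]

for identifier, summary in documented_pairs(sample_definitions):
    print(f"{identifier}: {summary}")
```

Plain dictionaries are used here so the example runs without pandas; with a DataFrame, the same filter could be expressed as a boolean mask on the `docstring_summary` column.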

Download files

Source Distribution

function_parser-0.0.4.tar.gz (1.3 MB)

Uploaded Source

Built Distribution

function_parser-0.0.4-py3-none-any.whl (22.8 kB)

Uploaded Python 3

File details

Details for the file function_parser-0.0.4.tar.gz.

File metadata

  • Download URL: function_parser-0.0.4.tar.gz
  • Upload date:
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.22.0 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.10

File hashes

Hashes for function_parser-0.0.4.tar.gz
Algorithm Hash digest
SHA256 ad615fda394ceb3c0f46863f91f22bdfacc119090da77d7aa0c81756ecb0c54e
MD5 795dc0ed29d3b44fd4f2eafe4b4fef75
BLAKE2b-256 477d8ad2cfc8a3049a608b2e548367eb179113e9325baa5f44e9d3d058fd8699

File details

Details for the file function_parser-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: function_parser-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 22.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.22.0 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.10

File hashes

Hashes for function_parser-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 69e2ceb1f7997a9a61138e805134db56d9fd2167bbad42a647b645eff20b2fdc
MD5 24764ca0ebcce1d09c640ce9f3e4add9
BLAKE2b-256 0bf499de2a5511227f7de5879c8dbe54c393b4444a1cadf99dd8a0d5e081180e
