Analyse source code repositories for language feature and library usage.
Project description
CodeSurvey is a framework and tool to survey code repositories for language feature usage, library usage, and more:
- Survey a specific set of repositories, or randomly sample repositories from services like GitHub
- Built-in support for analyzing Python code; extensible to support any language
- Write simple Python functions to define the code features you want to survey; record arbitrary details of feature occurrences
- Supports parallelization of repository downloading and analysis across multiple processes
- Logging and progress tracking to monitor your survey as it runs
- Inspect the results as Python objects, or in an sqlite database
Installation
pip install codesurvey
Usage
The CodeSurvey class can easily be configured to run a survey, such
as to measure how often the math module is used in a random set of
recently updated Python repositories from GitHub:
from codesurvey import CodeSurvey
from codesurvey.sources import GithubSampleSource
from codesurvey.analyzers.python import PythonAstAnalyzer
from codesurvey.analyzers.python.features import py_module_feature_finder

# Define a FeatureFinder to look for the `math` module in Python code
has_math = py_module_feature_finder('math', modules=['math'])

# Configure the survey
survey = CodeSurvey(
    db_filepath='math_survey.sqlite3',
    sources=[
        GithubSampleSource(language='python'),
    ],
    analyzers=[
        PythonAstAnalyzer(
            feature_finders=[
                has_math,
            ],
        ),
    ],
    max_workers=5,
)

# Run the survey on 10 repositories
survey.run(max_repos=10)

# Report on the results
repo_features = survey.get_repo_features(feature_names=['math'])
repo_count_with_math = sum([
    1 for repo_feature in repo_features if
    repo_feature.occurrence_count > 0
])
print(f'{repo_count_with_math} out of {len(repo_features)} repos use math')
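To give a feel for the kind of check a module feature finder performs, here is a minimal standalone sketch that uses only the standard library `ast` module. It is not CodeSurvey's implementation and does not use its API; the function name `find_module_imports` is hypothetical and exists only for this illustration.

```python
import ast


def find_module_imports(source: str, module: str) -> list[int]:
    """Return the line numbers where `module` (or a submodule) is imported.

    Hypothetical helper for illustration; not part of CodeSurvey.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import math` or `import math.something as alias`
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # `from math import sqrt` (absolute imports only)
            names = [node.module or ""]
        else:
            continue
        if any(name == module or name.startswith(module + ".") for name in names):
            lines.append(node.lineno)
    return sorted(lines)


sample = "import math\nfrom math import sqrt\nimport os\n"
print(find_module_imports(sample, "math"))  # prints [1, 2]
```

Matching on the parsed syntax tree rather than on raw text avoids false positives from comments, strings, or similarly named modules such as `mathutils`.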
- For more Sources of repositories, see Source docs
- For more Analyzers and FeatureFinders, see Analyzer docs
- For more options and methods for inspecting results, see CodeSurvey docs
- For details on directly inspecting the sqlite database of survey results, see Database docs
- More examples can be found in examples
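Before consulting the Database docs, it can help to look at what a results file actually contains. The sketch below uses only the standard library `sqlite3` module and makes no assumptions about CodeSurvey's schema: it reads the table names from `sqlite_master` and counts the rows in each. The function name `summarize_tables` is hypothetical.

```python
import sqlite3


def summarize_tables(db_filepath: str) -> dict[str, int]:
    """Map each table name in an SQLite file to its row count.

    Hypothetical helper for illustration; not part of CodeSurvey.
    """
    conn = sqlite3.connect(db_filepath)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names are read from the database itself, so they are only quoted here
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        conn.close()
```

After running the survey above, calling `summarize_tables('math_survey.sqlite3')` would list whatever tables the survey created. Note that `sqlite3.connect` creates an empty file if the path does not exist, in which case the result is an empty dict.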
Contributing
- Install Poetry dependencies with make deps
- Documentation:
  - Run local server: make docs-serve
  - Build docs: make docs-build
  - Deploy docs to GitHub Pages: make docs-github
- Docstring style follows the Google style guide
TODO
- Add unit tests
File details
Details for the file codesurvey-0.1.5.tar.gz.
File metadata
- Download URL: codesurvey-0.1.5.tar.gz
- Upload date:
- Size: 37.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.10.12 Linux/6.5.0-44-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | de307090684361f11c48541c13aafbac4a02039a568741966e1a893c8e5e6aa3 |
| MD5 | 380ee7a8d9641537dc95f5cea70dfa2e |
| BLAKE2b-256 | 3a5d9cc8ecde2873cfd1d7ce65c918683163f032f32ac19e8b538d6a43f548c7 |
File details
Details for the file codesurvey-0.1.5-py3-none-any.whl.
File metadata
- Download URL: codesurvey-0.1.5-py3-none-any.whl
- Upload date:
- Size: 40.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.10.12 Linux/6.5.0-44-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01d7bcce489ee07fe298a0a0bc18a0f3ad90b518e94ef5053711a387d3f5094d |
| MD5 | 3314333c85604561559693560f46231b |
| BLAKE2b-256 | 1fb1887de9870659c81e0c0c09283858d213bf780b422d96addd4683f5532244 |