Analyse source code repositories for language feature and library usage.
CodeSurvey is a framework and tool to survey code repositories for language feature usage, library usage, and more:
- Survey a specific set of repositories, or randomly sample repositories from services like GitHub
- Built-in support for analyzing Python code; extensible to support any language
- Write simple Python functions to define the code features you want to survey; record arbitrary details of feature occurrences
- Supports parallelization of repository downloading and analysis across multiple processes
- Logging and progress tracking to monitor your survey as it runs
- Inspect the results as Python objects, or in an SQLite database
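To give a feel for the kind of check a feature-finding function performs, here is a minimal standalone sketch that counts imports of a module using only Python's standard `ast` module. It does not use CodeSurvey's own API: the function name `count_module_imports` and the sample source string are assumptions made for illustration.

```python
import ast


def count_module_imports(source: str, module: str) -> int:
    """Count import statements in `source` that import `module` or one of its submodules."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers `import math` and `import math as m`
            count += sum(
                1 for alias in node.names
                if alias.name == module or alias.name.startswith(module + '.')
            )
        elif isinstance(node, ast.ImportFrom):
            # Covers `from math import sqrt`; relative imports (level > 0) are skipped
            if node.level == 0 and node.module is not None and (
                node.module == module or node.module.startswith(module + '.')
            ):
                count += 1
    return count


# Hypothetical sample source, for illustration only
sample = "import math\nfrom math import sqrt\nimport os\n"
print(count_module_imports(sample, 'math'))
```

A survey would apply a check like this to every file in every repository and record each occurrence, which is the bookkeeping the framework takes care of.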
Installation
pip install codesurvey
Usage
The CodeSurvey class can be configured to run a survey, for example to measure how often the math module is used in a random sample of recently updated Python repositories from GitHub:
from codesurvey import CodeSurvey
from codesurvey.sources import GithubSampleSource
from codesurvey.analyzers.python import PythonAstAnalyzer
from codesurvey.analyzers.python.features import py_module_feature_finder

# Define a FeatureFinder to look for the `math` module in Python code
has_math = py_module_feature_finder('math', modules=['math'])

# Configure the survey
survey = CodeSurvey(
    db_filepath='math_survey.sqlite3',
    sources=[
        GithubSampleSource(language='python'),
    ],
    analyzers=[
        PythonAstAnalyzer(
            feature_finders=[
                has_math,
            ],
        ),
    ],
    max_workers=5,
)

# Run the survey on 10 repositories
survey.run(max_repos=10)

# Report on the results
repo_features = survey.get_repo_features(feature_names=['math'])
repo_count_with_math = sum([
    1 for repo_feature in repo_features if
    repo_feature.occurrence_count > 0
])
print(f'{repo_count_with_math} out of {len(repo_features)} repos use math')
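The reporting step above can be wrapped in a small helper if you want a share rather than a raw count. This is only a sketch: `usage_share` is a hypothetical name, and it assumes nothing about the result objects beyond the `occurrence_count` attribute used in the example. The demo data are stand-in objects, not real survey results.

```python
from types import SimpleNamespace


def usage_share(repo_features) -> float:
    """Return the fraction of repos in which the feature occurred at least once."""
    repo_features = list(repo_features)
    if not repo_features:
        # Avoid dividing by zero when the survey returned no repos
        return 0.0
    repos_with_feature = sum(
        1 for repo_feature in repo_features
        if repo_feature.occurrence_count > 0
    )
    return repos_with_feature / len(repo_features)


# Stand-in objects for illustration only; real results come from the survey
demo = [SimpleNamespace(occurrence_count=n) for n in (3, 0, 1, 0)]
print(f'{usage_share(demo):.0%} of repos use the feature')
```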
- For more Sources of repositories, see Source docs
- For more Analyzers and FeatureFinders, see Analyzer docs
- For more options and methods for inspecting results, see CodeSurvey docs
- For details on directly inspecting the SQLite database of survey results, see Database docs
- More examples can be found in examples
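Before consulting the Database docs, it can help to see what a survey database file actually contains. The sketch below uses only Python's standard `sqlite3` module and SQLite's built-in `sqlite_master` catalog, so it makes no assumptions about CodeSurvey's table or column names. The helper name `list_tables` is hypothetical.

```python
import sqlite3
from typing import Dict, List


def list_tables(db_filepath: str) -> Dict[str, List[str]]:
    """Map each table name in an SQLite file to the names of its columns."""
    connection = sqlite3.connect(db_filepath)
    try:
        table_names = [
            row[0] for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk)
            name: [
                column[1]
                for column in connection.execute(f'PRAGMA table_info("{name}")')
            ]
            for name in table_names
        }
    finally:
        connection.close()
```

Calling `list_tables('math_survey.sqlite3')` after running the survey above should show the tables to query. Note that `sqlite3.connect` creates an empty file if the path does not exist yet, so run the survey first.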
Contributing
- Install Poetry dependencies with make deps
- Documentation:
  - Run local server: make docs-serve
  - Build docs: make docs-build
  - Deploy docs to GitHub Pages: make docs-github
  - Docstring style follows the Google style guide
TODO
- Add unit tests