Analyse source code repositories for language feature and library usage.
Project description
CodeSurvey is a framework and tool to survey code repositories for language feature usage, library usage, and more:
- Survey a specific set of repositories, or randomly sample repositories from services like GitHub
- Built-in support for analyzing Python code; extensible to support any language
- Write simple Python functions to define the code features you want to survey; record arbitrary details of feature occurrences
- Supports parallelization of repository downloading and analysis across multiple processes
- Logging and progress tracking to monitor your survey as it runs
- Inspect the results as Python objects, or in an sqlite database
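To illustrate the kind of check a feature-finding function performs, here is a minimal sketch that uses only the standard library `ast` module to record where a module is imported. It does not use CodeSurvey's API; the function name `find_module_imports` and the shape of the returned records are assumptions made for illustration.

```python
import ast


def find_module_imports(source: str, module: str) -> list[dict]:
    """Return one record per import of `module` found in `source`.

    Hypothetical helper for illustration; not part of CodeSurvey.
    """
    occurrences = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import math` or `import math.something as m`
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # `from math import sqrt` (absolute imports only)
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Match the module itself or one of its submodules
            if name == module or name.startswith(module + "."):
                occurrences.append({"module": name, "lineno": node.lineno})
    return occurrences


example = "import os\nimport math\nfrom math import sqrt\n"
print(find_module_imports(example, "math"))
```

Recording the line number alongside each match mirrors the idea of keeping arbitrary details of feature occurrences, rather than only a yes/no answer per file.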
Installation
pip install codesurvey
Usage
The CodeSurvey class can easily be configured to run a survey. For example, the following measures how often the math module is used in a random set of recently updated Python repositories from GitHub:
from codesurvey import CodeSurvey
from codesurvey.sources import GithubSampleSource
from codesurvey.analyzers.python import PythonAstAnalyzer
from codesurvey.analyzers.python.features import py_module_feature_finder

# Define a FeatureFinder to look for the `math` module in Python code
has_math = py_module_feature_finder('math', modules=['math'])

# Configure the survey
survey = CodeSurvey(
    db_filepath='math_survey.sqlite3',
    sources=[
        GithubSampleSource(language='python'),
    ],
    analyzers=[
        PythonAstAnalyzer(
            feature_finders=[
                has_math,
            ],
        ),
    ],
    max_workers=5,
)

# Run the survey on 10 repositories
survey.run(max_repos=10)

# Report on the results
repo_features = survey.get_repo_features(feature_names=['math'])
repo_count_with_math = sum([
    1 for repo_feature in repo_features if
    repo_feature.occurrence_count > 0
])
print(f'{repo_count_with_math} out of {len(repo_features)} repos use math')
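The counting step at the end of the example can be wrapped in a small helper when several features need the same report. This is a sketch that assumes only what the example shows, namely that each item has an `occurrence_count` attribute; the function name `usage_share` is hypothetical and not part of CodeSurvey.

```python
from types import SimpleNamespace


def usage_share(repo_features) -> float:
    """Fraction of repos with at least one occurrence (0.0 if empty)."""
    repo_features = list(repo_features)
    if not repo_features:
        # Avoid dividing by zero when no repos were surveyed
        return 0.0
    used = sum(1 for rf in repo_features if rf.occurrence_count > 0)
    return used / len(repo_features)


# Stand-in objects so the sketch runs without a survey database
sample = [SimpleNamespace(occurrence_count=n) for n in (0, 3, 1, 0)]
print(f"{usage_share(sample):.0%} of repos use math")
```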
- For more Sources of repositories, see Source docs
- For more Analyzers and FeatureFinders, see Analyzer docs
- For more options and methods for inspecting results, see CodeSurvey docs
- For details on directly inspecting the sqlite database of survey results, see Database docs
- More examples can be found in examples
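Because the results are stored in an sqlite database, Python's built-in `sqlite3` module can open the file directly. The sketch below lists the tables and their columns without assuming anything about CodeSurvey's schema (the Database docs remain the reference for that); the helper name `describe_tables` is hypothetical.

```python
import sqlite3


def describe_tables(db_filepath: str) -> dict[str, list[str]]:
    """Map each table name in the sqlite file to its column names."""
    conn = sqlite3.connect(db_filepath)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        return {
            # PRAGMA table_info yields one row per column; index 1 is the name
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()
```

Calling `describe_tables('math_survey.sqlite3')` after the survey above has run should show which tables and columns are available to query. Note that `sqlite3.connect` creates an empty file if the path does not exist, so an empty result may simply mean the path is wrong.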
Contributing
- Install Poetry dependencies with make deps
- Documentation:
  - Run local server: make docs-serve
  - Build docs: make docs-build
  - Deploy docs to GitHub Pages: make docs-github
  - Docstring style follows the Google style guide
TODO
- Add unit tests
Download files
Source Distribution
codesurvey-0.1.4.tar.gz (37.1 kB)
Built Distribution
codesurvey-0.1.4-py3-none-any.whl (40.1 kB)