Analyse source code repositories for language feature and library usage.

CodeSurvey

CodeSurvey is a framework and tool to survey code repositories for language feature usage, library usage, and more:

  • Survey a specific set of repositories, or randomly sample repositories from services like GitHub
  • Built-in support for analyzing Python code; extensible to support any language
  • Write simple Python functions to define the code features you want to survey; record arbitrary details of feature occurrences
  • Supports parallelization of repository downloading and analysis across multiple processes
  • Logging and progress tracking to monitor your survey as it runs
  • Inspect the results as Python objects or in a SQLite database
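
To give a feel for what "surveying a code feature" can involve, here is a minimal sketch that counts imports of a module using Python's standard `ast` module. It is an illustration only: `count_module_imports` is a hypothetical helper, not part of CodeSurvey, and it makes no claim about how CodeSurvey's own analyzers or FeatureFinders are implemented.

```python
import ast


def count_module_imports(source: str, module: str) -> int:
    """Count import statements in `source` that refer to `module`.

    Hypothetical helper for illustration; not part of CodeSurvey.
    """
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Matches `import math` and `import math.sub as alias`
            count += sum(
                1 for alias in node.names
                if alias.name == module or alias.name.startswith(module + '.')
            )
        elif isinstance(node, ast.ImportFrom):
            # Matches `from math import sqrt`; relative imports (level > 0) are skipped
            name = node.module or ''
            if node.level == 0 and (name == module or name.startswith(module + '.')):
                count += 1
    return count


example = 'import math\nfrom math import sqrt\nimport os\n'
print(count_module_imports(example, 'math'))
```

A real survey would also need to handle files that fail to parse and decide whether usage means imports, attribute access, or calls; this sketch only counts import statements.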

Installation

pip install codesurvey

Usage

Configure the CodeSurvey class to run a survey. The following example measures how often the math module is used in a random sample of recently updated Python repositories from GitHub:

from codesurvey import CodeSurvey
from codesurvey.sources import GithubSampleSource
from codesurvey.analyzers.python import PythonAstAnalyzer
from codesurvey.analyzers.python.features import py_module_feature_finder

# Define a FeatureFinder to look for the `math` module in Python code
has_math = py_module_feature_finder('math', modules=['math'])

# Configure the survey
survey = CodeSurvey(
    db_filepath='math_survey.sqlite3',
    sources=[
        GithubSampleSource(language='python'),
    ],
    analyzers=[
        PythonAstAnalyzer(
            feature_finders=[
                has_math,
            ],
        ),
    ],
    max_workers=5,
)

# Run the survey on 10 repositories
survey.run(max_repos=10)

# Report on the results
repo_features = survey.get_repo_features(feature_names=['math'])
repo_count_with_math = sum(
    1 for repo_feature in repo_features
    if repo_feature.occurrence_count > 0
)
print(f'{repo_count_with_math} out of {len(repo_features)} repos use math')
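
The same report can be expressed as a share rather than a raw count. The sketch below assumes only what the example above shows, namely that each result object exposes an `occurrence_count` attribute. `usage_share` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the snippet runs without a survey.

```python
from types import SimpleNamespace


def usage_share(repo_features) -> float:
    """Return the fraction of repos with at least one feature occurrence."""
    repo_features = list(repo_features)
    if not repo_features:
        # Avoid dividing by zero when the survey returned no repos
        return 0.0
    used = sum(1 for rf in repo_features if rf.occurrence_count > 0)
    return used / len(repo_features)


# Stand-in objects; in a real survey these would come from
# survey.get_repo_features(feature_names=['math'])
sample = [SimpleNamespace(occurrence_count=n) for n in (0, 3, 1, 0)]
print(f'{usage_share(sample):.0%} of repos use math')
```

With only 10 repositories, as in the run above, such a percentage is a rough indication at best; a larger `max_repos` would be needed for a meaningful estimate.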

  • For more Sources of repositories, see Source docs
  • For more Analyzers and FeatureFinders, see Analyzer docs
  • For more options and methods for inspecting results, see CodeSurvey docs
  • For details on directly inspecting the SQLite database of survey results, see Database docs
  • More examples can be found in examples
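
Before consulting the Database docs, you can get a first look at the results file with Python's standard `sqlite3` module. The table layout is not described in this README, so the sketch below assumes nothing about it and only lists whatever tables exist. `list_tables` is a hypothetical helper, not a CodeSurvey function.

```python
import sqlite3


def list_tables(db_filepath: str) -> list[str]:
    """Return the names of all tables in the SQLite file at db_filepath."""
    conn = sqlite3.connect(db_filepath)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

After running the survey above, `list_tables('math_survey.sqlite3')` would show which tables to query next. Note that `sqlite3.connect` creates an empty file if the path does not exist, in which case the result is an empty list.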

Contributing

  • Install Poetry dependencies with make deps
  • Documentation:
    • Run local server: make docs-serve
    • Build docs: make docs-build
    • Deploy docs to GitHub Pages: make docs-github
    • Docstring style follows the Google style guide

TODO

  • Add unit tests
