Analyse source code repositories for language feature and library usage.

CodeSurvey

CodeSurvey is a framework and tool to survey code repositories for language feature usage, library usage, and more:

  • Survey a specific set of repositories, or randomly sample repositories from services like GitHub
  • Built-in support for analyzing Python code; extensible to support any language
  • Write simple Python functions to define the code features you want to survey; record arbitrary details of feature occurrences
  • Supports parallelization of repository downloading and analysis across multiple processes
  • Logging and progress tracking to monitor your survey as it runs
  • Inspect the results as Python objects, or in an SQLite database

Installation

pip install codesurvey

Usage

The CodeSurvey class is configured to run a survey. For example, to measure how often the math module is used in a random sample of recently updated Python repositories from GitHub:

from codesurvey import CodeSurvey
from codesurvey.sources import GithubSampleSource
from codesurvey.analyzers.python import PythonAstAnalyzer
from codesurvey.analyzers.python.features import py_module_feature_finder

# Define a FeatureFinder to look for the `math` module in Python code
has_math = py_module_feature_finder('math', modules=['math'])

# Configure the survey
survey = CodeSurvey(
    db_filepath='math_survey.sqlite3',
    sources=[
        GithubSampleSource(language='python'),
    ],
    analyzers=[
        PythonAstAnalyzer(
            feature_finders=[
                has_math,
            ],
        ),
    ],
    max_workers=5,
)

# Run the survey on 10 repositories
survey.run(max_repos=10)

# Report on the results
repo_features = survey.get_repo_features(feature_names=['math'])
repo_count_with_math = sum(
    1 for repo_feature in repo_features
    if repo_feature.occurrence_count > 0
)
print(f'{repo_count_with_math} out of {len(repo_features)} repos use math')
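
To give a rough sense of what looking for a module in Python code involves, here is a standalone sketch built only on the standard library ast module. It is not CodeSurvey's implementation and does not use its API; the helper name find_module_imports is hypothetical, and the sketch only considers import statements.

```python
import ast


def find_module_imports(source: str, module: str) -> list[int]:
    """Return the line numbers of import statements that reference `module`.

    Hypothetical helper for illustration only; not part of CodeSurvey.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. `import math` or `import os.path`
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # e.g. `from math import sqrt`; relative imports (level > 0)
            # refer to local packages, so they are skipped
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        # Match the module itself or any of its submodules
        if any(name == module or name.startswith(module + '.') for name in names):
            lines.append(node.lineno)
    return sorted(lines)


sample = "import os\nimport math\nfrom math import sqrt\nimport mathutils\n"
print(find_module_imports(sample, 'math'))  # → [2, 3]
```

Note that `import mathutils` is not counted, because the match is on the full dotted module name rather than on a string prefix.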

[Figure: animated GIF of a CodeSurvey demo on the command line]

  • For more Sources of repositories, see Source docs
  • For more Analyzers and FeatureFinders, see Analyzer docs
  • For more options and methods for inspecting results, see CodeSurvey docs
  • For details on directly inspecting the SQLite database of survey results, see Database docs
  • More examples can be found in examples
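
As a starting point for inspecting the results database directly, the following sketch lists the tables and columns in an SQLite file using only the standard library sqlite3 module. It makes no assumptions about CodeSurvey's schema (see the Database docs for that); the helper name list_tables is hypothetical.

```python
import sqlite3


def list_tables(db_filepath: str) -> dict[str, list[str]]:
    """Map each table name in an SQLite database to its column names.

    Hypothetical helper for illustration only; not part of CodeSurvey.
    """
    conn = sqlite3.connect(db_filepath)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        # PRAGMA table_info returns one row per column; index 1 is the name
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()
```

After running the survey above, calling list_tables('math_survey.sqlite3') should show which tables are available to query. Be aware that sqlite3.connect creates an empty database file if the path does not exist yet.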

Contributing

  • Install Poetry dependencies with make deps
  • Documentation:
    • Run local server: make docs-serve
    • Build docs: make docs-build
    • Deploy docs to GitHub Pages: make docs-github
    • Docstring style follows the Google style guide

TODO

  • Add unit tests
