Command-line filter for GitHub repositories that contain samples rather than a real project, framework, or library

samples-filter

Samples-filter is a command-line filter for GitHub repositories that contain samples rather than a real project, framework, or library. Examples of such repositories: leeowenowen/rxjava-examples, streaming-with-flink/examples-java, redisson/redisson-examples.

Motivation. While working on the CaM project, where we build datasets of open-source Java programs, we found that we needed to filter out repositories that contain samples, tutorials, or examples rather than real code. This repository is a portable command-line tool that filters those sample repositories.

How to use

First, install it from PyPI:

pip install samples-filter

Then run:

samples-filter filter --repositories=repos.csv --out=filtered.csv

For --repositories, provide the name of an existing CSV dataset of GitHub repositories; for --out, provide the name of the output file (it will be created automatically). If you get lost, try --help, and the tool will explain what to do.

Optionally, choose which model to use for filtering with --model. It accepts either transformer (the default) or rf.
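
If you call the tool from a Python script, the options above can be assembled as in the following sketch. It is only an illustration: build_command and run_filter are hypothetical helper names, not part of samples-filter, and the --model=rf spelling with an equals sign is assumed by analogy with --repositories and --out.

```python
import shutil
import subprocess
from pathlib import Path

# Model names as listed in this README; "transformer" is the documented default.
MODELS = ("transformer", "rf")


def build_command(repositories, out, model="transformer"):
    """Return the argument list for one `samples-filter filter` run.

    Hypothetical helper: it only assembles the options described above.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}, expected one of {MODELS}")
    return [
        "samples-filter",
        "filter",
        f"--repositories={repositories}",
        f"--out={out}",
        f"--model={model}",  # assumed to accept the same --name=value form
    ]


def run_filter(repositories, out, model="transformer"):
    """Run the tool; assumes `samples-filter` is installed and on PATH."""
    if not Path(repositories).is_file():
        raise FileNotFoundError(repositories)
    if shutil.which("samples-filter") is None:
        raise RuntimeError("samples-filter not found; try `pip install samples-filter`")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(repositories, out, model), check=True)
```

For example, run_filter("repos.csv", "filtered.csv", model="rf") would invoke the same command as in the section above, with the rf model selected.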

How to contribute

Fork the repository, make changes, and send us a pull request. We will review your changes and apply them to the master branch shortly, provided they don't violate our quality standards. To avoid frustration, please run the full build before sending your pull request:

make install cov check

To set up a virtual environment, use these commands:

python3 -m venv venv
source $(pwd)/venv/bin/activate

You will need Python 3.11+ installed.
