Guess programming language from a string or file.

whats_that_code

This is a programming language detection library.

It detects the programming language of source code in pure Python, using an ensemble of classifiers. Use it when a quick and dirty first approximation is good enough. whats_that_code can currently identify 60%+ of samples without knowing the extension or tag.

I created this because I wanted:

  • a pure python programming language detector
  • no machine learning dependencies

Tested on Python 3.6 through 3.9.

Usage

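> # import path is an assumption based on the package layout; check the Docs
> from whats_that_code.election import guess_language_all_methods
> code = "def yo():\n   print('hello')"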
> guess_language_all_methods(code, file_name="yo.py")
["python"]

How it Works

  1. Inspects the file extension, if available.
  2. Inspects the shebang.
  3. Looks for keywords.
  4. Counts regex matches for common patterns.
  5. Attempts to parse the source as Python, JSON, or YAML.
  6. Inspects tags, if available.
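
As a rough illustration of three of these steps (extension, shebang, and parse attempts), here is a minimal standard-library sketch. It is not the library's code: the function names, the `EXTENSIONS` table, and the regex are all hypothetical, and YAML is left out because parsing it needs a third-party package.

```python
import ast
import json
import re
from pathlib import PurePath
from typing import Optional

# Hypothetical lookup table; the library's real mapping is not shown here.
EXTENSIONS = {".py": "python", ".js": "javascript", ".rb": "ruby", ".sh": "bash"}

# Matches "#!/bin/bash" as well as "#!/usr/bin/env python3".
SHEBANG = re.compile(r"^#!\s*(?:/usr/bin/env\s+)?(\S+)")


def by_extension(file_name: Optional[str]) -> Optional[str]:
    """Map a file extension to a language, or None if unknown or missing."""
    if not file_name:
        return None
    return EXTENSIONS.get(PurePath(file_name).suffix.lower())


def by_shebang(code: str) -> Optional[str]:
    """Read the interpreter name from a first-line shebang, if there is one."""
    first_line = code.split("\n", 1)[0]
    match = SHEBANG.match(first_line)
    if not match:
        return None
    interpreter = PurePath(match.group(1)).name  # "/bin/bash" -> "bash"
    # Drop a trailing version number: "python3" -> "python"
    return re.sub(r"[\d.]+$", "", interpreter) or None


def by_parsing(code: str) -> Optional[str]:
    """Try real parsers. JSON goes first because many JSON documents
    are also valid Python expressions."""
    try:
        json.loads(code)
        return "json"
    except ValueError:
        pass
    try:
        ast.parse(code)
        return "python"
    except (SyntaxError, ValueError):
        return None
```

Each function returns `None` when it has no opinion, which makes it easy to ignore detectors that cannot say anything about a given sample.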

Each method is imperfect and can get it wrong. The classifier then combines their results using a voting algorithm.
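
The README does not say which voting algorithm is used. As one possible reading, the sketch below applies a simple plurality vote; the `vote` function and its tie handling are assumptions made for illustration, not the library's implementation.

```python
from collections import Counter
from typing import Iterable, List, Optional


def vote(guesses: Iterable[Optional[str]]) -> List[str]:
    """Return the language(s) named by the most detectors.

    Detectors with no opinion pass None and are ignored.
    Ties return every tied name, sorted alphabetically.
    """
    counts = Counter(g for g in guesses if g)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(name for name, n in counts.items() if n == top)
```

For example, `vote(["python", None, "python", "ruby"])` gives `["python"]`, because two detectors agree on Python and only one names Ruby.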

This works best as a fallback, e.g. for classifying code that can't already be classified by extension or tag, or when the tag is ambiguous.
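
A fallback arrangement might look like the sketch below. The `classify` function and the `TRUSTED_EXTENSIONS` table are hypothetical; the detector is passed in as a callable so that any guesser, including this library, can fill that role.

```python
from pathlib import PurePath
from typing import Callable, Dict, Optional

# Hypothetical table of extensions trusted without inspecting the code.
TRUSTED_EXTENSIONS: Dict[str, str] = {".py": "python", ".go": "go", ".rs": "rust"}


def classify(
    code: str,
    file_name: Optional[str],
    detector: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Trust a known extension; call the fuzzier detector only
    when the extension is missing or unrecognized."""
    if file_name:
        known = TRUSTED_EXTENSIONS.get(PurePath(file_name).suffix.lower())
        if known:
            return known
    return detector(code)
```

The cheap, reliable signal is checked first, so content-based guessing is only used where nothing better is available.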

It began as part of so_pip, a StackOverflow code extraction tool I wrote, and outgrew it.

Docs

Notable Similar Tools

  • Guesslang - a Python and TensorFlow driven solution. Reasonable results, but slow startup and not pure Python.
  • pygments - pure Python, but sometimes lousy identification rates.

Download files

Source Distribution

whats_that_code-0.1.11.tar.gz (70.4 kB)

Uploaded Source

Built Distribution

whats_that_code-0.1.11-py3-none-any.whl (73.7 kB)

Uploaded Python 3

File details

Details for the file whats_that_code-0.1.11.tar.gz.

File metadata

  • Download URL: whats_that_code-0.1.11.tar.gz
  • Size: 70.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.1

File hashes

Hashes for whats_that_code-0.1.11.tar.gz
Algorithm    Hash digest
SHA256       99a328a1403f1a4e54935c165645af9f7d4d2709f00d0b17be794fd3382e970d
MD5          6bb5ea13fec17c0eabcabb5bdc5322d8
BLAKE2b-256  42057a3d6c29423d3c6654b29ba6efbd0cda177b6654ed978da0bf6563b5952e

File details

Details for the file whats_that_code-0.1.11-py3-none-any.whl.

File metadata

  • Download URL: whats_that_code-0.1.11-py3-none-any.whl
  • Size: 73.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.1

File hashes

Hashes for whats_that_code-0.1.11-py3-none-any.whl
Algorithm    Hash digest
SHA256       bfc930691d45cbe6bab8f1188411fc0e67b8bdfa1ef2b47597f7843830a63a4d
MD5          46f5f38ac95a2254bd46d252ba8431dc
BLAKE2b-256  6934087c039f87c78f8b4344c846e326053c45822c35efe54ac0af7d4ce19999
