Python module for detecting passwords, API keys, hashes, and any other string that resembles a randomly generated character sequence.

stringlifier

Stringlifier (string classifier) is a Python module for detecting random strings and hashes in text or code.

Typical usage scenarios include:

  • Sanitizing application or security logs
  • Detecting accidentally exposed credentials (complex passwords or API keys)

Quick start guide

Install stringlifier with pip:

$ pip install stringlifier

API example:

from stringlifier.api import Stringlifier

stringlifier = Stringlifier()

s = stringlifier('/System/Library/DriverExtensions/AppleUserHIDDrivers.dext/AppleUserHIDDrivers com.apple.driverkit.AppleUserUSBHostHIDDevice0 0x10000992d')

After this, s should be:

'/System/Library/DriverExtensions/AppleUserHIDDrivers.dext/AppleUserHIDDrivers com.apple.driverkit.AppleUserUSBHostHIDDevice0 <RANDOM_STRING>'
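
For the log-sanitizing scenario, the same call can be applied line by line. The following is a minimal sketch rather than part of stringlifier: sanitize_lines is a hypothetical helper, and it assumes the callable returns one string per input string, as in the example above.

```python
from typing import Callable, Iterable, List


def sanitize_lines(lines: Iterable[str], redact: Callable[[str], str]) -> List[str]:
    """Apply a redaction callable to each non-empty log line.

    `redact` is assumed to behave like the Stringlifier instance above:
    it takes one string and returns it with detected tokens replaced by
    placeholders. This helper is hypothetical, not library API.
    """
    cleaned = []
    for line in lines:
        text = line.rstrip("\n")
        # Blank lines carry nothing to classify, so pass them through.
        cleaned.append(redact(text) if text.strip() else text)
    return cleaned
```

With the package installed, passing a Stringlifier instance as redact would be the intended use. Any other callable with the same shape works for testing the helper without loading the model.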

You can also choose to see the full tokenization and classification output:

s, tokens = stringlifier('/System/Library/DriverExtensions/AppleUserHIDDrivers.dext/AppleUserHIDDrivers com.apple.driverkit.AppleUserUSBHostHIDDevice0 0x10000992d', return_tokens=True)

s will be the same as before and tokens will contain the following data:

[{'token': '/', 'type': 'SYMBOL'},
 {'token': 'System', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'Library', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'DriverExtensions', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'AppleUserHIDDrivers', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'dext', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'AppleUserHIDDrivers', 'type': 'STRING'},
 {'token': ' ', 'type': 'SYMBOL'},
 {'token': 'com', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'apple', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'driverkit', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'AppleUserUSBHostHIDDevice0', 'type': 'STRING'},
 {'token': ' ', 'type': 'SYMBOL'},
 {'token': '0x10000992d', 'type': 'HASH'}]
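
The token list makes it possible to report what was detected instead of only masking it. The sketch below filters the list down to the flagged entries. The helper name flagged_tokens is hypothetical, and treating STRING and SYMBOL as the only unflagged types is an assumption based on the output above.

```python
from typing import Dict, Iterable, List


def flagged_tokens(tokens: Iterable[Dict[str, str]],
                   plain_types: Iterable[str] = ("STRING", "SYMBOL")) -> List[Dict[str, str]]:
    """Return the tokens whose type is not one of the plain types.

    Assumption: STRING and SYMBOL mark ordinary text; anything else
    (such as HASH in the output above) is treated as a detection.
    """
    plain = set(plain_types)
    return [t for t in tokens if t["type"] not in plain]


# A shortened copy of the token list shown above.
tokens = [
    {"token": "AppleUserUSBHostHIDDevice0", "type": "STRING"},
    {"token": " ", "type": "SYMBOL"},
    {"token": "0x10000992d", "type": "HASH"},
]
hits = flagged_tokens(tokens)
print(hits)  # [{'token': '0x10000992d', 'type': 'HASH'}]
```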

Building your own classifier

To detect other types of strings, you can train your own model using the string classifier's command-line interface:

$ python3 stringlifier/modules/stringc.py --help

Usage: stringc.py [options]

Options:
  -h, --help            show this help message and exit
  --interactive
  --train
  --resume
  --train-file=TRAIN_FILE
  --dev-file=DEV_FILE
  --store=OUTPUT_BASE
  --patience=PATIENCE   (default=20)
  --batch-size=BATCH_SIZE
                        (default=32)
  --device=DEVICE
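
As a rough illustration of how these options might be combined for a training run, the sketch below assembles a command line from the flags listed above. The helper build_train_command and the file paths are hypothetical, and which flags must be passed together is an assumption, so check the help output before relying on it. The command is only printed, not executed.

```python
import shlex
from typing import List


def build_train_command(train_file: str, dev_file: str, store: str,
                        patience: int = 20, batch_size: int = 32) -> List[str]:
    """Assemble an argv list for a training run from the flags listed above.

    The defaults mirror the values shown in the help text. The combination
    of flags is an assumption, not documented behavior.
    """
    return [
        "python3", "stringlifier/modules/stringc.py",
        "--train",
        f"--train-file={train_file}",
        f"--dev-file={dev_file}",
        f"--store={store}",
        f"--patience={patience}",
        f"--batch-size={batch_size}",
    ]


# Hypothetical paths, used only to show the resulting command.
cmd = build_train_command("data/train.txt", "data/dev.txt", "models/custom")
print(shlex.join(cmd))
```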

For instructions on how to generate your training data, use this link.

Important note: This model may not perform well when detecting a type of string depends on the surrounding tokens. In that case, consider a more advanced sequence-processing tool such as NLP-Cube.
