Python module for detecting passwords, API keys, hashes, and any other string that resembles a randomly generated character sequence.

stringlifier

String-classifier is a Python module for detecting random strings and hashes in text and code.

Typical usage scenarios include:

  • Sanitizing application or security logs
  • Detecting accidentally exposed credentials (complex passwords or API keys)

Quick start guide

You can install stringlifier with pip:

$ pip install stringlifier

API example:

from stringlifier.api import Stringlifier

stringlifier = Stringlifier()

s = stringlifier('/System/Library/DriverExtensions/AppleUserHIDDrivers.dext/AppleUserHIDDrivers com.apple.driverkit.AppleUserUSBHostHIDDevice0 0x10000992d')

After this, s should be:

'/System/Library/DriverExtensions/AppleUserHIDDrivers.dext/AppleUserHIDDrivers com.apple.driverkit.AppleUserUSBHostHIDDevice0 <RANDOM_STRING>'
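For the log-sanitizing scenario, the same call can be applied line by line. The following is a minimal sketch, not part of the package: `sanitize_lines` and `load_stringlifier` are hypothetical helper names, and the file name `app.log` is a placeholder. It assumes the call returns the sanitized string as shown above; in case an installed version returns a list instead (an assumption, not something stated here), the helper accepts either.

```python
from typing import Callable, Iterable, List, Union


def sanitize_lines(
    lines: Iterable[str],
    classifier: Callable[[str], Union[str, List[str]]],
) -> List[str]:
    """Run every line through the classifier and collect the sanitized text."""
    cleaned = []
    for line in lines:
        result = classifier(line.rstrip("\n"))
        # Defensive assumption: accept a plain string or a list of strings.
        if isinstance(result, list):
            result = result[0] if result else ""
        cleaned.append(result)
    return cleaned


def load_stringlifier():
    """Import lazily so sanitize_lines can be used or tested without the package."""
    from stringlifier.api import Stringlifier
    return Stringlifier()


# Example usage (requires `pip install stringlifier`; 'app.log' is a placeholder):
# with open("app.log") as handle:
#     for line in sanitize_lines(handle, load_stringlifier()):
#         print(line)
```

Passing the classifier in as an argument keeps the loop independent of the model, so a stub can stand in for it when the package is not installed.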

You can also choose to see the full tokenization and classification output:

s, tokens = stringlifier('/System/Library/DriverExtensions/AppleUserHIDDrivers.dext/AppleUserHIDDrivers com.apple.driverkit.AppleUserUSBHostHIDDevice0 0x10000992d', return_tokens=True)

s will be the same as before and tokens will contain the following data:

[{'token': '/', 'type': 'SYMBOL'},
 {'token': 'System', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'Library', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'DriverExtensions', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'AppleUserHIDDrivers', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'dext', 'type': 'STRING'},
 {'token': '/', 'type': 'SYMBOL'},
 {'token': 'AppleUserHIDDrivers', 'type': 'STRING'},
 {'token': ' ', 'type': 'SYMBOL'},
 {'token': 'com', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'apple', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'driverkit', 'type': 'STRING'},
 {'token': '.', 'type': 'SYMBOL'},
 {'token': 'AppleUserUSBHostHIDDevice0', 'type': 'STRING'},
 {'token': ' ', 'type': 'SYMBOL'},
 {'token': '0x10000992d', 'type': 'HASH'}]
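The token list makes it easy to pull out only the flagged values, for example to report them separately. The sketch below uses a hypothetical helper named `flagged_tokens` and treats `STRING` and `SYMBOL` as the only ordinary types. That is an assumption based on the output above; a model may emit other labels.

```python
from typing import Dict, List

# Assumption: these are the only "ordinary" token types, as seen in the sample output.
ORDINARY_TYPES = {"STRING", "SYMBOL"}


def flagged_tokens(tokens: List[Dict[str, str]]) -> List[str]:
    """Return the text of every token whose type is not an ordinary one."""
    return [t["token"] for t in tokens if t["type"] not in ORDINARY_TYPES]


# The last three entries of the token list shown above.
sample = [
    {"token": "AppleUserUSBHostHIDDevice0", "type": "STRING"},
    {"token": " ", "type": "SYMBOL"},
    {"token": "0x10000992d", "type": "HASH"},
]

print(flagged_tokens(sample))  # ['0x10000992d']
```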

Building your own classifier

You can also train your own model if you want to detect different types of strings. To do so, use the command-line interface of the string classifier:

$ python3 stringlifier/modules/stringc.py --help

Usage: stringc.py [options]

Options:
  -h, --help            show this help message and exit
  --interactive
  --train
  --resume
  --train-file=TRAIN_FILE
  --dev-file=DEV_FILE
  --store=OUTPUT_BASE
  --patience=PATIENCE   (default=20)
  --batch-size=BATCH_SIZE
                        (default=32)
  --device=DEVICE

For instructions on how to generate your training data, use this link.
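The options listed in the help output can be combined into a training run. The following sketch only assembles the command line from those flags; `build_train_command` is a hypothetical helper, and the file names and output base are placeholders. How each flag behaves beyond what the help text shows is not documented here, so treat the combination as an assumption to check against the script itself.

```python
import subprocess
import sys
from typing import List


def build_train_command(
    train_file: str,
    dev_file: str,
    store: str,
    patience: int = 20,    # default taken from the help output
    batch_size: int = 32,  # default taken from the help output
) -> List[str]:
    """Assemble a stringc.py training command using only flags from --help."""
    return [
        sys.executable,
        "stringlifier/modules/stringc.py",
        "--train",
        f"--train-file={train_file}",
        f"--dev-file={dev_file}",
        f"--store={store}",
        f"--patience={patience}",
        f"--batch-size={batch_size}",
    ]


# Example usage (placeholder paths; run from the repository root):
# cmd = build_train_command("train.txt", "dev.txt", "models/custom")
# subprocess.run(cmd, check=True)
```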

Important note: This model may not work well when detecting a type of string depends on the surrounding tokens. In that case, consider a more advanced sequence-processing tool such as NLP-Cube.
