malwi - AI Python Malware Scanner

Detect Python malware fast - no internet, no expensive hardware, no fees.

malwi specializes in detecting zero-day vulnerabilities by classifying code as safe or harmful.

Open-source software made in Europe. Based on open research, open code, open data. 🇪🇺🤘🕊️

# Install
pip install --user malwi

Run:

malwi ./examples

Output:

## examples/__init__.py
- Object: runcommand
- Maliciousness: 👹 0.9620079398155212

### Code
def runcommand(value):
    output = subprocess.run(value, shell=True, capture_output=True)
    return [output.stdout, output.stderr]

### Tokens
TARGETED_FILE resume load_global subprocess load_attr run load_fast value load_const INTEGER load_const INTEGER kw_names capture_output shell call store_fast output load_fast output load_attr stdout load_fast output load_attr stderr build_list return_value
...
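
The report above is plain text, so it can be post-processed with a few lines of Python. The following is a minimal sketch that assumes the report always follows the layout shown in the sample (a `## <path>` line, then `- Object:` and `- Maliciousness:` lines). The `Finding` class, the `parse_report` function, and the 0.5 threshold are hypothetical names and values chosen for illustration; they are not part of malwi.

```python
from dataclasses import dataclass
from typing import List, Optional

# Sample text copied from the output shown above (Code and Tokens sections shortened).
SAMPLE_REPORT = """## examples/__init__.py
- Object: runcommand
- Maliciousness: 👹 0.9620079398155212

### Code
def runcommand(value):
    ...
"""


@dataclass
class Finding:
    """One scanned object, as read from the text report (hypothetical structure)."""
    path: str
    obj: Optional[str] = None
    score: Optional[float] = None


def parse_report(text: str) -> List[Finding]:
    """Collect path, object name, and score for every '## <path>' section."""
    findings: List[Finding] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):  # '### Code' and '### Tokens' do not match
            findings.append(Finding(path=line[3:].strip()))
        elif findings and line.startswith("- Object:"):
            findings[-1].obj = line.split(":", 1)[1].strip()
        elif findings and line.startswith("- Maliciousness:"):
            # The score is the last whitespace-separated field; an emoji may precede it.
            findings[-1].score = float(line.split()[-1])
    return findings


# Keep only objects above an arbitrary, illustrative threshold.
flagged = [f for f in parse_report(SAMPLE_REPORT) if f.score is not None and f.score > 0.5]
```

A scanned source line that itself starts with `## ` inside the Code section would be misread as a new section, so a stricter parser would be needed for anything beyond quick triage.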

Why malwi?

The number of malicious open-source packages is growing. This is not just a threat to your business but also to the open-source community.

Typical malware behaviors include:

  • Exfiltration of data: Stealing credentials, API keys, or sensitive user data.
  • Backdoors: Allowing remote attackers to gain unauthorized access to your system.
  • Destructive actions: Deleting files, corrupting databases, or sabotaging applications.

Attention: Malicious packages might execute code during installation (e.g. through setup.py). Do NOT download or install malicious packages from the dataset with commands such as uv add, pip install, or poetry add.

What's next?

The first iteration focuses on detecting malicious Python source code.

Future iterations will cover malware scanning for more languages (JavaScript, Rust, Go) and more formats (binaries, logs).

How does it work?

malwi applies DistilBert and Support Vector Machines (SVM), following the design described in "Zero Day Malware Detection with Alpha: Fast DBI with Transformer Models for Real World Application" (2025). pypi_malregistry is used as a source of malicious samples.

  1. malwi compiles Python files to bytecode:
def runcommand(value):
    output = subprocess.run(value, shell=True, capture_output=True)
    return [output.stdout, output.stderr]
  0           RESUME                   0

  1           LOAD_CONST               0 (<code object runcommand at 0x5b4f60ae7540, file "example.py", line 1>)
              MAKE_FUNCTION
              STORE_NAME               0 (runcommand)
              RETURN_CONST             1 (None)
  ...
  2. Bytecode operators are mapped to tokens:
TARGETED_FILE resume load_global subprocess load_attr run load_fast value load_const INTEGER load_const INTEGER kw_names capture_output shell call store_fast output load_fast output load_attr stdout load_fast output load_attr stderr build_list return_value
  3. Tokens are used as input for a pre-trained DistilBert:
Maliciousness: 0.9620079398155212
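
The first two steps can be approximated with the standard library alone. This is a rough sketch, not malwi's actual tokenizer: the function names (`iter_code_objects`, `arg_tokens`, `tokenize`) and the mapping rules (integers become `INTEGER`, non-identifier strings become `STRING`) are assumptions made for illustration. Opcode names also differ between Python versions, so the token stream will not match the one above exactly.

```python
import dis
import types
from typing import Dict, Iterator, List

SOURCE = '''
def runcommand(value):
    output = subprocess.run(value, shell=True, capture_output=True)
    return [output.stdout, output.stderr]
'''


def iter_code_objects(code: types.CodeType) -> Iterator[types.CodeType]:
    """Yield a code object and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def arg_tokens(value) -> List[str]:
    """Map an instruction argument to coarse tokens (hypothetical scheme)."""
    if value is None:  # no argument, or the constant None
        return []
    if isinstance(value, int):  # bool is a subclass of int, so True maps here too
        return ["INTEGER"]
    if isinstance(value, float):
        return ["FLOAT"]
    if isinstance(value, str):
        # Keep names such as 'subprocess'; collapse other strings to a placeholder.
        return [value] if value.isidentifier() else ["STRING"]
    if isinstance(value, tuple):  # e.g. keyword-argument names
        return [tok for item in value for tok in arg_tokens(item)]
    return [type(value).__name__.upper()]


def tokenize(source: str, filename: str = "example.py") -> Dict[str, List[str]]:
    """Compile source (without executing it) and return tokens per code object."""
    module_code = compile(source, filename, "exec")
    result: Dict[str, List[str]] = {}
    for code in iter_code_objects(module_code):
        tokens: List[str] = []
        for instr in dis.get_instructions(code):
            tokens.append(instr.opname.lower())
            tokens.extend(arg_tokens(instr.argval))
        result[code.co_name] = tokens
    return result


if __name__ == "__main__":
    print(" ".join(tokenize(SOURCE)["runcommand"]))
```

Because `compile` only builds code objects and never runs them, the scanned file is not executed at any point in this sketch.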

Support

Do you have access to malicious packages for Rust, Go, or other languages? Contact me.

Develop

Prerequisites: uv

# Download and process data
cmds/download_and_preprocess.sh

# Only process data
cmds/preprocess.sh

# Preprocess then start training
cmds/preprocess_and_train.sh

# Only start training
cmds/train.sh

Download files

Source Distribution

malwi-0.0.11.tar.gz (65.0 kB)

Uploaded Source

Built Distribution

malwi-0.0.11-py3-none-any.whl (56.6 kB)

Uploaded Python 3

File details

Details for the file malwi-0.0.11.tar.gz.

File metadata

  • Download URL: malwi-0.0.11.tar.gz
  • Upload date:
  • Size: 65.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.7.8

File hashes

Hashes for malwi-0.0.11.tar.gz
Algorithm Hash digest
SHA256 43c9ef6cdac36eb74c5b1c30f17a0ae03c3843bd97ffa1feb7c408a692bd05d6
MD5 96ddb9f0625d02900a57873a12212bbb
BLAKE2b-256 70a7581bd9c8f9c17f5afd847b92ee1b3469a6d71b8cd64b36428820444e4228

File details

Details for the file malwi-0.0.11-py3-none-any.whl.

File metadata

  • Download URL: malwi-0.0.11-py3-none-any.whl
  • Upload date:
  • Size: 56.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.7.8

File hashes

Hashes for malwi-0.0.11-py3-none-any.whl
Algorithm Hash digest
SHA256 8f3d0681bece43383abca76933813106cc362c2200ac84f5c8059232745f4c2e
MD5 3beafe6d05da0c28e7e32157e5878ecc
BLAKE2b-256 c9739e3de7b630aefd25328bf87c605705b5e8588213377088d3cb6c30ecc6cc
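
The SHA256 digests listed above can be compared against a downloaded file before it is installed. A minimal sketch using only `hashlib` follows; the helper names `sha256_of` and `matches` are hypothetical, and the file path in the usage comment assumes the wheel was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the wheel is in the current directory):
# matches(
#     "malwi-0.0.11-py3-none-any.whl",
#     "8f3d0681bece43383abca76933813106cc362c2200ac84f5c8059232745f4c2e",
# )
```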
