A high-performance CLI and library for detecting open source components in binaries through semantic signature matching
BinarySniffer - Binary Component Detection and Security Analysis
A high-performance CLI tool and Python library for detecting open source components and security threats in binaries through semantic signature matching. It specializes in analyzing mobile apps (APK/IPA), Java archives, ML models, and source code to identify OSS components, their licenses, and potential security risks.
Features
- Binary Component Detection: Identify 188+ OSS components in compiled binaries using semantic signatures
- ML Model Security Analysis: Comprehensive security scanning with MITRE ATT&CK mapping
- Multi-Format Support: APK/IPA, JAR/WAR, ELF/PE/Mach-O, ML models (pickle, ONNX, SafeTensors)
- SEMCL.ONE Integration: Works seamlessly with osslili, purl2notices, and other ecosystem tools
Installation
pip install binarysniffer
For development:
git clone https://github.com/SemClone/binarysniffer.git
cd binarysniffer
pip install -e .
With performance extras:
pip install binarysniffer[fast]
Quick Start
# Analyze a binary file
binarysniffer analyze /path/to/binary
# ML model security scan
binarysniffer ml-scan model.pkl --deep
# Generate SBOM
binarysniffer analyze app.apk --format cyclonedx -o sbom.json
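Once an SBOM has been written by the command above, it can be inspected with a few lines of Python. The sketch below assumes the output follows the standard CycloneDX JSON layout (a top-level `components` array whose entries carry `name`, `version`, and `licenses`). The helper names `list_components` and `load_sbom` and the sample data are illustrative only and are not part of BinarySniffer.

```python
import json
from pathlib import Path


def list_components(sbom: dict) -> list[tuple[str, str, str]]:
    """Return (name, version, license) tuples from a CycloneDX JSON document."""
    rows = []
    for comp in sbom.get("components", []):
        licenses = []
        for entry in comp.get("licenses", []):
            # CycloneDX allows either a license object or an SPDX expression.
            lic = entry.get("license", {})
            licenses.append(
                lic.get("id") or lic.get("name") or entry.get("expression", "")
            )
        rows.append((
            comp.get("name", "unknown"),
            comp.get("version", ""),
            ", ".join(item for item in licenses if item) or "unknown",
        ))
    return rows


def load_sbom(path: str) -> dict:
    """Read an SBOM file such as the sbom.json written above."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical sample standing in for real tool output.
sample_sbom = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "zlib", "version": "1.3",
         "licenses": [{"license": {"id": "Zlib"}}]},
        {"type": "library", "name": "example-lib"},
    ],
}

for name, version, license_name in list_components(sample_sbom):
    print(f"{name} {version} [{license_name}]")
```

With a real file, `list_components(load_sbom("sbom.json"))` would produce the same kind of rows.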
Usage
CLI Usage
# Basic analysis
binarysniffer analyze app.apk
# ML model security analysis
binarysniffer ml-scan model.pkl --risk-threshold 0.5
# Directory scanning with recursion
binarysniffer analyze /path/to/project -r
# Generate CycloneDX SBOM
binarysniffer analyze app.jar --format sbom -o app-sbom.json
# Extract package inventory
binarysniffer inventory app.apk --with-hashes -o inventory.json
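For build scripts it can be convenient to drive the CLI from Python. The following is a minimal sketch that only assembles the `analyze` invocations shown above (`-r`, `--format`, `-o`) and runs them with `subprocess`. The function names `build_analyze_command` and `run_analysis` are hypothetical wrappers, not part of the BinarySniffer package.

```python
import subprocess
from typing import Optional


def build_analyze_command(
    target: str,
    fmt: Optional[str] = None,
    output: Optional[str] = None,
    recursive: bool = False,
) -> list[str]:
    """Assemble a `binarysniffer analyze` command line as an argument list."""
    cmd = ["binarysniffer", "analyze", target]
    if recursive:
        cmd.append("-r")
    if fmt:
        cmd += ["--format", fmt]
    if output:
        cmd += ["-o", output]
    return cmd


def run_analysis(target: str, **options) -> subprocess.CompletedProcess:
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(
        build_analyze_command(target, **options),
        check=True,
        capture_output=True,
        text=True,
    )


# Only builds and prints the command; nothing is executed here.
print(build_analyze_command("app.apk", fmt="cyclonedx", output="sbom.json"))
```

Passing an argument list instead of a shell string avoids quoting problems with paths that contain spaces.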
Python API
from binarysniffer import EnhancedBinarySniffer
# Initialize analyzer
sniffer = EnhancedBinarySniffer()
# Analyze a file
result = sniffer.analyze_file("app.apk")
for match in result.matches:
    print(f"{match.component} - {match.confidence:.2%}")
    print(f"License: {match.license}")
# ML security analysis
from binarysniffer.ml_security import MLSecurityAnalyzer
analyzer = MLSecurityAnalyzer()
risks = analyzer.analyze_model("model.pkl")
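A common follow-up is to keep only confident matches and group them by license. The sketch below relies solely on the three attributes used above (`component`, `confidence`, `license`); the helper `group_by_license`, the 0.8 threshold, and the stand-in objects are assumptions made for illustration, so the example runs without BinarySniffer installed.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_license(matches, min_confidence=0.8):
    """Map license -> sorted component names for matches above the threshold."""
    grouped = defaultdict(list)
    for match in matches:
        if match.confidence >= min_confidence:
            grouped[match.license or "unknown"].append(match.component)
    return {lic: sorted(names) for lic, names in grouped.items()}


# Stand-in objects; real code would pass result.matches from analyze_file().
fake_matches = [
    SimpleNamespace(component="zlib", confidence=0.95, license="Zlib"),
    SimpleNamespace(component="libpng", confidence=0.90, license="Libpng"),
    SimpleNamespace(component="openssl", confidence=0.40, license="Apache-2.0"),
    SimpleNamespace(component="sqlite", confidence=0.85, license=None),
]

print(group_by_license(fake_matches))
```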
Core Capabilities
Binary Analysis
- Advanced format support (ELF, PE, Mach-O) via LIEF
- Android DEX bytecode analysis
- Static library (.a) support
- Symbol and import extraction
Archive Support
- Mobile apps (APK, IPA)
- Java archives (JAR, WAR)
- Python packages (wheel, egg)
- Linux packages (DEB, RPM)
- Extended formats (7z, RAR, Zstandard)
ML Model Security (v1.10.0+)
- Safe pickle file analysis
- ONNX and SafeTensors validation
- PyTorch/TensorFlow native formats
- 100% detection rate on known exploits
- SARIF output for CI/CD integration
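To illustrate what "safe" pickle analysis can mean, the sketch below walks the pickle opcode stream with the standard library's `pickletools.genops` and lists the callables a file would import, without ever unpickling it. This is not BinarySniffer's implementation; the function names and the `SUSPICIOUS_MODULES` set are assumptions chosen for the example, and the scan misses names that are fetched from the pickle memo rather than pushed as fresh strings.

```python
import os
import pickle
import pickletools

# Illustrative deny-list; a real scanner would use a far richer rule set.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys"}


def find_imports(data: bytes) -> list[tuple[str, str]]:
    """List (module, name) pairs referenced by a pickle, without loading it."""
    imports = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" directly in the argument.
            module, _, name = arg.partition(" ")
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as the two preceding strings.
            if len(recent_strings) >= 2:
                imports.append((recent_strings[-2], recent_strings[-1]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


def risky_imports(data: bytes) -> list[tuple[str, str]]:
    """Keep only imports whose top-level module is on the deny-list."""
    return [
        (module, name)
        for module, name in find_imports(data)
        if module.split(".")[0] in SUSPICIOUS_MODULES
    ]


class Payload:
    """Test object whose pickle would call os.system when loaded."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


malicious = pickle.dumps(Payload())  # serializing does not run the command
benign = pickle.dumps({"weights": [0.1, 0.2]})

print(risky_imports(malicious))
print(risky_imports(benign))
```

The module reported for `os.system` depends on the platform (`posix` on Linux and macOS, `nt` on Windows), which is why both appear in the deny-list.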
Signature Database
- 188 OSS components covered
- 1,400+ high-quality signatures
- Automatic license detection
- Security severity classification
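The general idea behind signature-based detection can be shown with a toy example: extract printable strings from a binary, then score each component by the fraction of its signature patterns that appear. This is a deliberately simplified sketch, not the matching algorithm or signature format BinarySniffer uses; the signature sets, function names, and byte blob below are made up for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> set[str]:
    """Collect runs of printable ASCII, similar to the Unix `strings` tool."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {m.decode("ascii") for m in re.findall(pattern, data)}


def match_signatures(strings, signatures, min_confidence=0.5):
    """Score each component by the share of its patterns found in the strings."""
    results = {}
    for component, patterns in signatures.items():
        hits = sum(1 for p in patterns if any(p in s for s in strings))
        confidence = hits / len(patterns) if patterns else 0.0
        if confidence >= min_confidence:
            results[component] = confidence
    return results


# Hypothetical signatures and binary content.
toy_signatures = {
    "zlib": ["inflate", "deflate", "adler32"],
    "libpng": ["png_create_read_struct", "png_read_info"],
    "sqlite": ["sqlite3_open", "sqlite3_exec"],
}
blob = b"\x00\x01inflate_fast\x00deflateInit_\x00\x02adler32\x00png_read_info\x00"

print(match_signatures(extract_strings(blob), toy_signatures))
```

Here all three zlib patterns are present, one of two libpng patterns is present, and no sqlite pattern is, so only the first two components clear the 0.5 threshold.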
Integration with SEMCL.ONE
BinarySniffer is a core component of the SEMCL.ONE ecosystem:
- Complements osslili for source code license detection
- Works with purl2notices for comprehensive attribution
- Integrates with ospac for policy evaluation
- Supports upmex for package metadata extraction
Configuration
# ~/.binarysniffer/config.json
{
  "signature_sources": [
    "https://signatures.binarysniffer.io/core.xmdb"
  ],
  "min_confidence": 0.5,
  "parallel_workers": 4,
  "auto_update": true
}
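A configuration file like the one above can be read and sanity-checked from Python. The sketch below is an assumption about how such a loader might look, not BinarySniffer's own code: the fallback values simply mirror the example, and the validation rules are guesses at sensible bounds. Note that the file itself must contain only the JSON object, since JSON has no comment syntax and the `# ~/.binarysniffer/config.json` line is just a label for the path.

```python
import json
from pathlib import Path

# Fallback values copied from the example above; the tool's real defaults may differ.
DEFAULTS = {
    "signature_sources": [],
    "min_confidence": 0.5,
    "parallel_workers": 4,
    "auto_update": True,
}


def load_config(path: str = "~/.binarysniffer/config.json") -> dict:
    """Merge the JSON file at `path` over DEFAULTS and validate basic ranges."""
    config = dict(DEFAULTS)
    config_path = Path(path).expanduser()
    if config_path.exists():
        config.update(json.loads(config_path.read_text(encoding="utf-8")))
    if not 0.0 <= config["min_confidence"] <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    if config["parallel_workers"] < 1:
        raise ValueError("parallel_workers must be at least 1")
    return config
```

Calling `load_config()` with no argument falls back to the defaults when the file does not exist.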
Documentation
- User Guide - Comprehensive usage examples
- API Reference - Python API documentation
- ML Security - ML model security analysis
- Signature Management - Creating and managing signatures
- Architecture - System design and internals
Advanced Topics
- TLSH Fuzzy Matching - Detecting modified components
- Creating Signatures - Contributing new signatures
- Installation Guide - Platform-specific setup
- Package Verification - Archive analysis
Contributing
We welcome contributions! Please see CONTRIBUTING.md for details on:
- Code of conduct
- Development setup
- Submitting pull requests
- Signature contributions
Support
For support and questions:
- GitHub Issues - Bug reports and feature requests
- Documentation - Complete project documentation
- SEMCL.ONE Community - Ecosystem support and discussions
License
Apache License 2.0 - see LICENSE file for details.
Authors
See AUTHORS.md for a list of contributors.
Part of the SEMCL.ONE ecosystem for comprehensive OSS compliance and code analysis.
File details
Details for the file binarysniffer-1.11.3.tar.gz.
File metadata
- Download URL: binarysniffer-1.11.3.tar.gz
- Upload date:
- Size: 247.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e8a4f471f83f2910ec4009096804fa537a8ed2fd9dcdd9f201f21e0c0cc75fe8 |
| MD5 | 044bd814cb843dda0263e9078ca7e110 |
| BLAKE2b-256 | f46eb16325a9bfd134b0f8029c126213dcdb58a809e4ccd97a284440749fe970 |
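A downloaded archive can be checked against the published SHA256 digest before installation. The sketch below uses only the standard library; the function names are illustrative, and the expected value is the SHA256 digest listed for binarysniffer-1.11.3.tar.gz.

```python
import hashlib
import hmac

# SHA256 digest published for binarysniffer-1.11.3.tar.gz.
EXPECTED_SHA256 = "e8a4f471f83f2910ec4009096804fa537a8ed2fd9dcdd9f201f21e0c0cc75fe8"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

For example, `verify_download("binarysniffer-1.11.3.tar.gz")` should return `True` for an unmodified copy of the source distribution.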
Provenance
The following attestation bundles were made for binarysniffer-1.11.3.tar.gz:
Publisher: python-publish.yml on SemClone/binarysniffer
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: binarysniffer-1.11.3.tar.gz
- Subject digest: e8a4f471f83f2910ec4009096804fa537a8ed2fd9dcdd9f201f21e0c0cc75fe8
- Sigstore transparency entry: 674317935
- Sigstore integration time:
- Permalink: SemClone/binarysniffer@7eae72c938fe20e5a862d1992fe4d472e475183c
- Branch / Tag: refs/tags/v1.11.3
- Owner: https://github.com/SemClone
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@7eae72c938fe20e5a862d1992fe4d472e475183c
- Trigger Event: release
File details
Details for the file binarysniffer-1.11.3-py3-none-any.whl.
File metadata
- Download URL: binarysniffer-1.11.3-py3-none-any.whl
- Upload date:
- Size: 341.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2ce13d33d2524d6c848d9473fce049d2eb66a039a5bcac667cdf9ffeebd049f5 |
| MD5 | 9e03ed9d4ec1526a56a16f137e9e8927 |
| BLAKE2b-256 | f79c37071737fb40e0888697543547af69bf8ad664341b1d28ac152ec1172f2f |
Provenance
The following attestation bundles were made for binarysniffer-1.11.3-py3-none-any.whl:
Publisher: python-publish.yml on SemClone/binarysniffer
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: binarysniffer-1.11.3-py3-none-any.whl
- Subject digest: 2ce13d33d2524d6c848d9473fce049d2eb66a039a5bcac667cdf9ffeebd049f5
- Sigstore transparency entry: 674317942
- Sigstore integration time:
- Permalink: SemClone/binarysniffer@7eae72c938fe20e5a862d1992fe4d472e475183c
- Branch / Tag: refs/tags/v1.11.3
- Owner: https://github.com/SemClone
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@7eae72c938fe20e5a862d1992fe4d472e475183c
- Trigger Event: release