Extractor of various archive formats for Karton framework
Project description
Extractor karton service
Performs extraction of known archive types and e-mail attachments. Produces "raw" artifacts for further classification.
Author: CERT.pl
Maintainers: psrok1, nazywam
Consumes:
{
"type": "sample",
"stage": "recognized",
"kind": "archive"
"payload": {
"sample": <Resource>,
"extraction_level": <int, default: 0>,
"password": <archive password>,
}
}
Produces:
{
"type": "sample",
"kind": "raw",
"payload": {
"sample": <Resource>,
"parent": <Resource>,
"extraction_level": <int++>
}
}
Usage
First of all, make sure you have setup the core system: https://github.com/CERT-Polska/karton
In order to unpack all available formats you'll also need a few native dependencies that sflock relies on, the installation method recommended by sflock is:
RUN sed -i 's/ main/ main non-free/' /etc/apt/sources.list \
&& apt-get update && apt-get install -y \
p7zip-full \
rar \
unace \
cabextract \
lzip
Then install karton-archive-extractor from PyPi:
$ pip install karton-archive-extractor
$ karton-archive-extractor
Configuration
There are several configuration options you can tweak up to your liking.
[archive-extractor]
# Maximum levels of nested extraction
max_depth = 5
# Maximum unpacked child filesize, larger files are not reported
max_size = 26214400
# Maximum number of children files for further analysis
max_children = 1000
To learn more about configuring your karton services, take a look at karton configuration docs
Running in Docker
Sflock uses ZipJail as a usermode syscall filtering mechanism. As a result, in our experience, container running the karton service has to have the SYS_PTRACE
capability in order for the ptrace to execute correctly. Make sure it's enabled if you run into problems extracting certain archive types.
Supported archive/compression formats*
.7z
.ace
.bup
.cab
.daa
.eml
.gz
.gzip
.iso
.lha
.lz
.lzh
.msg
.mso
.pdf
.rar
.tar
.tar.bz2
.tar.gz
.udf
.vhd
.vhdx
.xz
.zip
* Assuming you are running Linux, please see the sflock's readme for more information
PE files debloating
Some malicious PE files contain intentionally added junk to make them too big for processing. Starting from v1.4.0, archive extractor supports optional debloating of these files, using debloat tool made by Squiblydoo.
certpl/karton-archive-extractor
Docker image debloats PE files by default. To enable debloating in
karton-archive-extractor installed from PyPI, you need to install additional extra dependencies:
pip install karton-archive-extractor[debloat]
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
File details
Details for the file karton_archive_extractor-1.4.0-py3-none-any.whl
.
File metadata
- Download URL: karton_archive_extractor-1.4.0-py3-none-any.whl
- Upload date:
- Size: 19.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.16
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | c00f88537878265997465655a487cbc231397392665b184f582e493afb9d0a92 |
|
MD5 | e6dc84eb867baf41b113e47d5493c85f |
|
BLAKE2b-256 | 95fb3bcff61f0ac8408c790e3721a1bccc4880fd7fbc27fff28c3def65e54e62 |