File Crawler indexes files and searches for hard-coded credentials.
FileCrawler officially supports Python 3.8+.
Main features
- List all file contents
- Index file contents in Elasticsearch
- Perform OCR on several file types (via the Tika library)
- Look for hard-coded credentials
- Much more...
Parsers:
- PDF files
- Microsoft Office files (Word, Excel, etc.)
- X.509 certificate files
- Image files (JPG, PNG, GIF, etc.)
- Java packages (JAR and WAR)
- APK files (disassembled with APKTool)
- Compressed files (zip, tar, gzip, etc.)
- SQLite3 databases
- Containers (Docker images saved as tar.gz)
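A crawler like this has to decide which parser handles each file. The following is an illustrative sketch only, not FileCrawler's actual code (the project relies on libmagic and Tika for type detection); it shows content-based routing for a few of the formats above using just the standard library:

```python
import tarfile
import zipfile
from pathlib import Path

def pick_parser(path: str) -> str:
    """Route a file to a parser name by sniffing its content.

    Illustrative only; FileCrawler itself uses libmagic and Tika,
    and this sketch covers just a few of the formats listed above.
    """
    if zipfile.is_zipfile(path):
        # JAR/WAR packages and modern Office files are ZIP containers
        return "zip"
    if tarfile.is_tarfile(path):
        # Saved Docker containers are tar archives
        return "tar"
    head = Path(path).read_bytes()[:16]
    if head.startswith(b"%PDF"):
        return "pdf"
    if head.startswith(b"SQLite format 3\x00"):
        return "sqlite3"
    return "raw"
```

Note that Office and Java packages are detected as ZIP containers first; a real dispatcher would then inspect the archive's contents to pick the specific parser.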
Indexers:
- Elasticsearch
- Stand-alone local files
Extractors:
- AWS credentials
- GitHub and GitLab credentials
- URL credentials
- Authorization header credentials
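Extractors of this kind are typically regex-driven. A hedged sketch of how such credentials can be matched; the patterns below are illustrative, not FileCrawler's exact rules (the AWS access key ID format `AKIA` + 16 characters is AWS's documented convention):

```python
import re

# Illustrative patterns; FileCrawler's actual rules may differ.
PATTERNS = {
    # AWS access key IDs follow the documented AKIA... format
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Credentials embedded in URLs: scheme://user:pass@host
    "url_credentials": re.compile(r"[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s:@]+@"),
    # HTTP Basic Authorization headers
    "authorization_header": re.compile(r"Authorization:\s*Basic\s+[A-Za-z0-9+/=]+"),
}

def extract_credentials(text: str) -> dict:
    """Return all matches, keyed by pattern name."""
    return {name: rx.findall(text)
            for name, rx in PATTERNS.items() if rx.search(text)}
```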
Alert:
- Send found credentials via Telegram
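Telegram alerts generally go through the Bot API's `sendMessage` endpoint. A minimal stdlib sketch of how such an alert could be built (the token, chat ID, and message format here are placeholders, not FileCrawler's own configuration):

```python
import json
import urllib.request

def build_telegram_alert(token: str, chat_id: str,
                         finding: str) -> urllib.request.Request:
    """Build (but do not send) a request to Telegram's documented
    sendMessage Bot API endpoint. Token and chat_id are placeholders."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    payload = json.dumps({
        "chat_id": chat_id,
        "text": f"Credential found: {finding}",
    }).encode()
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})

# To actually send the alert:
# urllib.request.urlopen(build_telegram_alert(token, chat_id, "AKIA..."))
```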
Installing
Dependencies
apt install default-jre default-jdk libmagic-dev git
Installing FileCrawler
Installing from the latest release
pip install -U filecrawler
Installing development package
pip install -i https://test.pypi.org/simple/ FileCrawler
Running
Config file
Create a sample config file with default parameters
filecrawler --create-config -v
Edit the configuration file config.yml with your desired parameters.
Note: You must adjust the Elasticsearch URL parameter before continuing.
Run
# Integrate with ELK
filecrawler --index-name filecrawler --path /mnt/client_files -T 30 -v --elastic
# Just save leaks locally
filecrawler --index-name filecrawler --path /mnt/client_files -T 30 -v --local -o /home/out_test
Help
$ filecrawler -h
File Crawler v0.1.3 by Helvio Junior
File Crawler index files and search hard-coded credentials.
https://github.com/helviojunior/filecrawler
usage:
filecrawler module [flags]
Available Integration Modules:
--elastic Integrate to elasticsearch
--local Save leaks locally
Global Flags:
--index-name [index name] Crawler name
--path [folder path] Folder path to be indexed
--config [config file] Configuration file. (default: ./fileindex.yml)
--db [sqlite file] Filename to save status of indexed files. (default: ~/.filecrawler/{index_name}/indexer.db)
-T [tasks] number of connects in parallel (per host, default: 16)
--create-config Create config sample
--clear-session Clear old file status and reindex all files
-h, --help show help message and exit
-v Specify verbosity level (default: 0). Example: -v, -vv, -vvv
Use "filecrawler [module] --help" for more information about a command.
How-to install ELK from scratch
Credits
This project was inspired by:
Note: Some parts of the code were ported from these two projects.
To do
Project details
Release history
Download files
Source Distribution
FileCrawler-0.1.4.tar.gz (23.4 MB)
File details
Details for the file FileCrawler-0.1.4.tar.gz.
File metadata
- Download URL: FileCrawler-0.1.4.tar.gz
- Upload date:
- Size: 23.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db88b5c614e0cafce75a2caa9cef60352df291067df0a1f08a786e969b4fb8f7 |
| MD5 | b9f196e0a52e2e3cfc465545e9d1b4ac |
| BLAKE2b-256 | ef7582ff8163fe9d358620bdef46c43bc61f51a5c3f06772b908094d59da54dd |
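The digests above can be used to verify a download. A small stdlib sketch (the filename assumes the source distribution listed above):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the published digest from the table above:
# sha256_of("FileCrawler-0.1.4.tar.gz") == \
#     "db88b5c614e0cafce75a2caa9cef60352df291067df0a1f08a786e969b4fb8f7"
```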