File Crawler indexes files and searches them for hard-coded credentials.

File Crawler

FileCrawler officially supports Python 3.8+.

Main features

List all file contents
Index file contents in Elasticsearch
Run OCR on several file types (using the Tika library)
Look for hard-coded credentials
Much more...

Parsers:

PDF files
Microsoft Office files (Word, Excel etc)
X509 Certificate files
Image files (Jpg, Png, Gif etc)
Java packages (Jar and war)
APK files (disassembled with APKTool)
Compressed files (zip, tar, gzip etc)
SQLite3 database
Containers (Docker images saved as tar.gz)
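
As an illustration of how a crawler might route each file to one of the parsers listed above, here is a minimal sketch that picks a parser label from the file extension. The mapping and the `pick_parser` function are hypothetical and are not taken from FileCrawler's code; a real implementation would likely also inspect file contents (for example with libmagic, which is listed as a dependency).

```python
from pathlib import Path

# Hypothetical mapping from file extension to a parser label. The real
# FileCrawler parser registry may be organized differently.
PARSER_BY_EXTENSION = {
    ".pdf": "pdf",
    ".docx": "office",
    ".xlsx": "office",
    ".pem": "x509",
    ".crt": "x509",
    ".jpg": "image",
    ".png": "image",
    ".gif": "image",
    ".jar": "java",
    ".war": "java",
    ".apk": "apk",
    ".zip": "compressed",
    ".tar": "compressed",
    ".gz": "compressed",
    ".sqlite": "sqlite3",
}


def pick_parser(path: str) -> str:
    """Return the parser label for a path, or "default" when unknown."""
    # Only the last suffix is considered, so "image.tar.gz" resolves via ".gz".
    suffix = Path(path).suffix.lower()
    return PARSER_BY_EXTENSION.get(suffix, "default")
```

Extension-based dispatch is cheap but easy to fool with renamed files, which is why content-based detection is usually layered on top of it.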

Indexers:

Elasticsearch
Stand-alone local files

Extractors:

AWS credentials
GitHub and GitLab credentials
URL credentials
Authorization header credentials
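
To give an idea of what credential extraction can look like, the sketch below scans text with a few regular expressions for three of the categories listed above. The patterns and the `find_credentials` function are illustrative assumptions, not FileCrawler's actual rules, and they will produce both false positives and false negatives on real data.

```python
import re

# Illustrative patterns only; these are NOT FileCrawler's actual rules.
PATTERNS = {
    # AWS access key IDs commonly start with AKIA (long-term) or ASIA (temporary).
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # scheme://user:password@host
    "url_credentials": re.compile(
        r"\b[a-zA-Z][a-zA-Z0-9+.-]*://[^\s:/@]+:[^\s/@]+@[^\s/]+"
    ),
    # HTTP Authorization header with a Basic or Bearer value.
    "authorization_header": re.compile(
        r"Authorization:\s*(?:Basic|Bearer)\s+[A-Za-z0-9._~+/=-]+",
        re.IGNORECASE,
    ),
}


def find_credentials(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs found in the text."""
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((name, match.group(0)))
    return findings


if __name__ == "__main__":
    sample = "remote = https://user:s3cret@example.com/repo.git\n"
    for name, value in find_credentials(sample):
        print(name, value)
```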

Alert:

Send found credentials via Telegram
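
For readers unfamiliar with Telegram alerts, the sketch below shows one way to post a message through the Telegram Bot API `sendMessage` method using only the standard library. The function names, the message text, and the way the token and chat ID are passed are assumptions for illustration; FileCrawler's own alert code and configuration keys may differ.

```python
import json
import urllib.request


def build_telegram_request(bot_token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Bot API sendMessage request."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(bot_token: str, chat_id: str, text: str) -> dict:
    """Send the alert and return the decoded JSON response from Telegram."""
    request = build_telegram_request(bot_token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the alert easy to test without network access.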

Installing

Dependencies

apt install default-jre default-jdk libmagic-dev git

Installing FileCrawler

Installing the latest release

pip install -U filecrawler

Installing development package

pip install -i https://test.pypi.org/simple/ FileCrawler

Running

Config file

Create a sample config file with default parameters

filecrawler --create-config -v

Edit the configuration file config.yml with your desired parameters

Note: You must adjust the Elasticsearch URL parameter before continuing

Run

# Integrate with ELK
filecrawler --index-name filecrawler --path /mnt/client_files -T 30 -v --elastic

# Just save leaks locally
filecrawler --index-name filecrawler --path /mnt/client_files -T 30 -v --local -o /home/out_test
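
The `-T 30` option in the commands above sets the number of parallel tasks. As a rough picture of what processing a folder in parallel can look like, the sketch below walks a directory tree and handles files with a thread pool. This is an assumption about the general pattern, not FileCrawler's implementation; `iter_files`, `file_size`, and `crawl` are hypothetical names, and the per-file work here is just reading the file size.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def iter_files(root: str):
    """Yield the path of every regular file found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def file_size(path: str) -> tuple[str, int]:
    """Stand-in for real per-file work such as parsing and indexing."""
    return path, os.path.getsize(path)


def crawl(root: str, tasks: int = 16) -> dict[str, int]:
    """Process every file under root using up to `tasks` worker threads."""
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        return dict(pool.map(file_size, iter_files(root)))
```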

Help

$ filecrawler -h

File Crawler v0.1.3 by Helvio Junior
File Crawler index files and search hard-coded credentials.
https://github.com/helviojunior/filecrawler
    
usage: 
    filecrawler module [flags]

Available Integration Modules:
  --elastic                  Integrate to elasticsearch
  --local                    Save leaks locally

Global Flags:
  --index-name [index name]  Crawler name
  --path [folder path]       Folder path to be indexed
  --config [config file]     Configuration file. (default: ./fileindex.yml)
  --db [sqlite file]         Filename to save status of indexed files. (default: ~/.filecrawler/{index_name}/indexer.db)
  -T [tasks]                 number of connects in parallel (per host, default: 16)
  --create-config            Create config sample
  --clear-session            Clear old file status and reindex all files
  -h, --help                 show help message and exit
  -v                         Specify verbosity level (default: 0). Example: -v, -vv, -vvv

Use "filecrawler [module] --help" for more information about a command.
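
The `--db` and `--clear-session` flags in the help output suggest that the status of indexed files is kept in a SQLite file so that unchanged files can be skipped on later runs. The sketch below shows one way such tracking could work. The table layout, the use of a SHA-256 fingerprint, and all function names are assumptions; the actual schema of `indexer.db` is not described in this document.

```python
import hashlib
import sqlite3


def open_status_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a status database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_status ("
        "path TEXT PRIMARY KEY, fingerprint TEXT NOT NULL)"
    )
    return conn


def needs_indexing(conn: sqlite3.Connection, path: str, data: bytes) -> bool:
    """Return True and record the file if it is new or its content changed."""
    fingerprint = hashlib.sha256(data).hexdigest()
    row = conn.execute(
        "SELECT fingerprint FROM file_status WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == fingerprint:
        return False  # already indexed with identical content
    conn.execute(
        "INSERT OR REPLACE INTO file_status (path, fingerprint) VALUES (?, ?)",
        (path, fingerprint),
    )
    conn.commit()
    return True


def clear_session(conn: sqlite3.Connection) -> None:
    """Forget all recorded files so that everything is indexed again."""
    conn.execute("DELETE FROM file_status")
    conn.commit()
```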

How-to install ELK from scratch

Installing Elasticsearch

Credits

This project was inspired by:

Note: Some parts of the code were ported from these two projects.

To do

Check the TODO file

Release history

0.1.21 (Feb 8, 2025)
0.1.20 (Feb 8, 2025)
0.1.19 (Jan 13, 2025)
0.1.18 (Jan 13, 2025)
0.1.17 (Jan 9, 2025)
0.1.14 (Aug 2, 2024)
0.1.13 (Jul 26, 2024)
0.1.12 (Jul 26, 2024)
0.1.11 (Jun 26, 2024)
0.1.10 (Jun 25, 2024)
0.1.8 (May 12, 2023)
0.1.7 (May 12, 2023)
0.1.6 (Apr 10, 2023)
0.1.5 (Mar 30, 2023)
0.1.4 (Mar 29, 2023), this version
0.1.3 (Mar 25, 2023)
0.1.2 (Mar 24, 2023)
0.1.1 (Mar 24, 2023)

Download files

Source Distribution

FileCrawler-0.1.4.tar.gz (23.4 MB)

Uploaded Mar 29, 2023 Source

File details

Details for the file FileCrawler-0.1.4.tar.gz.

File metadata

Download URL: FileCrawler-0.1.4.tar.gz
Upload date: Mar 29, 2023
Size: 23.4 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for FileCrawler-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`db88b5c614e0cafce75a2caa9cef60352df291067df0a1f08a786e969b4fb8f7`
MD5	`b9f196e0a52e2e3cfc465545e9d1b4ac`
BLAKE2b-256	`ef7582ff8163fe9d358620bdef46c43bc61f51a5c3f06772b908094d59da54dd`
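
The digests above can be used to check a downloaded archive before installing it. The sketch below computes the SHA256 of a local file and compares it with the value listed for FileCrawler-0.1.4.tar.gz. The function names and the local file path are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for FileCrawler-0.1.4.tar.gz.
EXPECTED_SHA256 = "db88b5c614e0cafce75a2caa9cef60352df291067df0a1f08a786e969b4fb8f7"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumes the archive was downloaded to the current directory.
    print(matches_expected("FileCrawler-0.1.4.tar.gz"))
```

Reading in chunks keeps memory use flat, which matters for a 23.4 MB archive only slightly but is a good habit for larger files.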
