Automated web form harvesting and submission tool
Project description
FormHarvester is an AI-assisted form intelligence engine. It navigates the open web autonomously: executing searches, parsing page structure, extracting contact signals, and interacting with forms at the browser level. It is built on an asynchronous browser with stealth fingerprinting, proxy rotation, and a pluggable captcha solver interface.
Adjusting config.txt
mode
The CSV that will be used by the bot (e.g. mode = lawn will use lawn.csv)
max_google_pages
The maximum number of Google result pages FormHarvester will go through for each search.
skip_ads
FormHarvester will skip any ads on Google Search.
start_page
FormHarvester will start from Google results page X.
send_form
FormHarvester will send the form inside the website. It can be disabled to save time.
generate_email_sources
Generate an extra file showing the source URL where the email was extracted.
hide_browser
Runs the browser in headless mode, so no browser window is shown.
max_time
Max time FormHarvester can spend on a single website.
min_delay and max_delay
A random delay between min_delay and max_delay is applied to Google requests.
captcha_sleep
Sleep for X minutes after a Google captcha is found. Set to 0 to disable.
search_timer
The waiting time (in minutes) between one Google search and the next.
[captcha]
Here you can enter DeathByCaptcha credentials to solve captchas automatically.
[dev]
These settings are for development only; disable them in production. debug_form may be useful, as it prevents the form from being submitted.
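The section names [captcha] and [dev] suggest that config.txt follows an INI-style layout. As a minimal sketch under that assumption, the snippet below reads a few of the options described above with Python's standard configparser. The [settings] section name, every value shown, and the helper names load_config and pick_delay are hypothetical; only the option names come from this page.

```python
import configparser
import random

# Hypothetical config.txt contents. The [settings] section name and all
# values are assumptions for illustration; only the option names are taken
# from the descriptions above.
SAMPLE = """
[settings]
mode = lawn
max_google_pages = 3
skip_ads = true
start_page = 1
send_form = false
generate_email_sources = true
hide_browser = true
min_delay = 2
max_delay = 5
captcha_sleep = 10
search_timer = 1

[dev]
debug_form = true
"""


def load_config(text):
    """Parse INI-style text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def pick_delay(cfg, rng=random):
    """Return a random delay between min_delay and max_delay."""
    low = cfg.getfloat("settings", "min_delay")
    high = cfg.getfloat("settings", "max_delay")
    if low > high:
        raise ValueError("min_delay must not exceed max_delay")
    return rng.uniform(low, high)


cfg = load_config(SAMPLE)
print("mode:", cfg.get("settings", "mode"))
print("send_form:", cfg.getboolean("settings", "send_form"))
print("debug_form:", cfg.getboolean("dev", "debug_form"))
print("delay:", pick_delay(cfg))
```

With send_form = false or debug_form = true, a configuration like this one would leave forms unsubmitted, which is the safer choice while checking that a setup behaves as expected.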
How to run
Executable
Run formharvester.exe
Python
pip install -r requirements.txt
python3 bot.py
Folder structure
data
Where scraped emails and logs are dumped.
drivers
Browser drivers used by Selenium.
input
Input CSV files go here.
log
Error reports for individual websites are written here, which is very useful for improving the bot.
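The mode option and the input folder are described separately, so a short sketch may help show how they could fit together. It assumes that the CSV named by mode is looked up inside input; the function name input_csv_for and the validation rule are hypothetical and not taken from the project.

```python
from pathlib import Path


def input_csv_for(mode, input_dir="input"):
    """Map a mode value such as "lawn" to its CSV path, e.g. input/lawn.csv."""
    name = mode.strip()
    # Reject empty values and anything containing a path separator, so the
    # result always stays directly inside the input folder.
    if not name or Path(name).name != name:
        raise ValueError(f"invalid mode: {mode!r}")
    return Path(input_dir) / f"{name}.csv"


print(input_csv_for("lawn"))
```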
This project is released under the MIT License. You are free to use, modify, and distribute this software, provided that the original copyright notice and license terms are included in all copies or substantial portions of the software.
File details
Details for the file formharvester-2.2.2.tar.gz.
File metadata
- Download URL: formharvester-2.2.2.tar.gz
- Upload date:
- Size: 1.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 300f8a304bf82b7c268be8b78fda8ba468a159a3dd9292616f3d5dc710fbda32 |
| MD5 | 02217f95c27f7c978872236a7de5eb4d |
| BLAKE2b-256 | 706a5f139d20bdd0e16bee71d7db366c57ebeabd5e4e6f355fa501b57289a55f |
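The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard hashlib; the helper names sha256_of and verify are hypothetical, and the file is assumed to sit in the current directory under the name listed above.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for formharvester-2.2.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "300f8a304bf82b7c268be8b78fda8ba468a159a3dd9292616f3d5dc710fbda32"
)


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()


sdist = Path("formharvester-2.2.2.tar.gz")
if sdist.exists():
    # Only runs when the archive has actually been downloaded.
    print("digest matches:", verify(sdist, EXPECTED_SDIST_SHA256))
```

The same helpers work for the wheel by passing its file name and its own SHA256 digest.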
Provenance
The following attestation bundles were made for formharvester-2.2.2.tar.gz:
Publisher: publish.yml on dariomory/formharvester
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: formharvester-2.2.2.tar.gz
  - Subject digest: 300f8a304bf82b7c268be8b78fda8ba468a159a3dd9292616f3d5dc710fbda32
- Sigstore transparency entry: 1371304069
- Sigstore integration time:
- Permalink: dariomory/formharvester@d147ef6c0ad31ccca849c1852ff57d5d434a1744
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/dariomory
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d147ef6c0ad31ccca849c1852ff57d5d434a1744
- Trigger Event: push
File details
Details for the file formharvester-2.2.2-py3-none-any.whl.
File metadata
- Download URL: formharvester-2.2.2-py3-none-any.whl
- Upload date:
- Size: 262.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8bd7aabc41bd4014ca999ee9602e89e08c9f7c07e1c8d5bbd093d73f85206e7b |
| MD5 | 057978f847a2384e9ad42a242896e7a7 |
| BLAKE2b-256 | 62f9b36e7ff3a615f36b8d693049063ac19cb4ab1015ef5f9cbab5d27e288adc |
Provenance
The following attestation bundles were made for formharvester-2.2.2-py3-none-any.whl:
Publisher: publish.yml on dariomory/formharvester
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: formharvester-2.2.2-py3-none-any.whl
  - Subject digest: 8bd7aabc41bd4014ca999ee9602e89e08c9f7c07e1c8d5bbd093d73f85206e7b
- Sigstore transparency entry: 1371304111
- Sigstore integration time:
- Permalink: dariomory/formharvester@d147ef6c0ad31ccca849c1852ff57d5d434a1744
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/dariomory
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d147ef6c0ad31ccca849c1852ff57d5d434a1744
- Trigger Event: push