An HTTP Response Fuzzy Hashing Package
Usage
pip install hrfh
from hrfh.utils.parser import create_http_response_from_bytes
response = create_http_response_from_bytes(b"""HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n""")
print(response)
print(response.masked)
print(response.fuzzy_hash())
>>> from hrfh.utils.parser import create_http_response_from_bytes
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data] Unzipping corpora/words.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
>>> response = create_http_response_from_bytes(b"""HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n""")
>>> print(response)
<HTTPResponse 1.1.1.1:80 200 OK>
>>> print(response.masked)
HTTP/1.0 200 OK
ETag: [MASK]
Server: apache
Server: nginx
>>> print(response.fuzzy_hash())
ba15cc1f9ad3ef632d0ce7798f7fa44718f1e7fcc2c0f94c1a702f647b79923b
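The session above suggests the general idea: header lines are put in a canonical order, volatile values such as the ETag are replaced with [MASK], and the masked text is hashed. The sketch below illustrates that idea with the standard library only. It is not hrfh's implementation: the rule "any run of 16 or more hex digits is volatile", the use of SHA-256, and the names mask_response and toy_hash are all assumptions made for illustration, and the digests it produces are not expected to match hrfh's.

```python
import hashlib
import re

# Hypothetical masking rule (an assumption, not hrfh's actual rule):
# treat any run of 16 or more hex digits as a volatile token.
HEX_TOKEN = re.compile(r"\b[0-9a-fA-F]{16,}\b")


def mask_response(raw: bytes) -> str:
    """Return a canonical, masked text form of a raw HTTP response head."""
    # Keep only the status line and headers; drop anything after the blank line.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    status_line, *header_lines = head.split("\r\n")
    # Mask volatile tokens, then sort so that header order does not matter.
    masked = [HEX_TOKEN.sub("[MASK]", line) for line in header_lines if line]
    return "\n".join([status_line] + sorted(masked))


def toy_hash(raw: bytes) -> str:
    """Hash the masked form; SHA-256 is an assumption chosen for the sketch."""
    return hashlib.sha256(mask_response(raw).encode("utf-8")).hexdigest()


raw = (
    b"HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\n"
    b"ETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n"
)
print(mask_response(raw))
print(toy_hash(raw))
```

Because the ETag value is masked before hashing, two responses that differ only in their ETag yield the same digest in this sketch, which is the property that makes the hash useful for grouping similar servers.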
Usage from Source
- Install requirements
sudo apt install python3-pip
pip install poetry
poetry install
poetry run python main.py
- Prepare the HTTP response data as JSON, one data/${cdn}/${ip}.json file per response
$ tree data/
data
├── akamai
│   ├── 104.103.147.116.json
│   └── 104.81.222.211.json
├── alibaba-cdn
└── wangsu
$ cat data/akamai/104.103.147.116.json
{
  "ip": "104.103.147.116",
  "timestamp": 1717146116,
  "status_code": 400,
  "status_reason": "Bad Request",
  "headers": {
    "Server": "AkamaiGHost",
    "Mime-Version": "1.0",
    "Content-Type": "text/html",
    "Content-Length": "312",
    "Expires": "Fri, 31 May 2024 09:01:56 GMT",
    "Date": "Fri, 31 May 2024 09:01:56 GMT",
    "Connection": "close"
  },
  "body": "<HTML><HEAD>\n<TITLE>Invalid URL</TITLE>\n</HEAD><BODY>\n<H1>Invalid URL</H1>\nThe requested URL \"[no URL]\", is invalid.<p>\nReference #9.8be83217.1717146116.2661874a\n<P>https://errors.edgesuite.net/9.8be83217.1717146116.2661874a</P>\n</BODY></HTML>\n"
}
- Run the script to generate the hash
poetry run python main.py
01c7da5c66ffab8b54a <HTTPResponse 45.64.21.148:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 103.151.139.204:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 199.91.74.213:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 156.59.207.6:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 23.90.149.102:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 58.57.102.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 60.188.66.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 117.68.34.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 124.225.184.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 58.42.14.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 101.206.106.41:80 403 Forbidden>
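The steps above (read every data/${cdn}/${ip}.json file, hash each response, and see which IPs share a digest) can be sketched with the standard library alone. The source of main.py is not shown here, so this is an assumption about the overall flow rather than a copy of it; iter_records, group_ips_by_hash, and the hash_record callback are hypothetical names. In practice hash_record would build an HTTPResponse from the record and return its fuzzy_hash().

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Tuple


def iter_records(data_dir: str = "data") -> Iterator[Tuple[str, dict]]:
    """Yield (cdn, record) for every <data_dir>/<cdn>/<ip>.json file."""
    for path in sorted(Path(data_dir).glob("*/*.json")):
        with path.open(encoding="utf-8") as fh:
            # The directory name is taken as the CDN label.
            yield path.parent.name, json.load(fh)


def group_ips_by_hash(
    data_dir: str, hash_record: Callable[[dict], str]
) -> Dict[str, List[str]]:
    """Map each digest to the IPs whose record produced it.

    hash_record is a caller-supplied function (hypothetical here) that
    turns one JSON record into a digest string.
    """
    groups = defaultdict(list)
    for _cdn, record in iter_records(data_dir):
        groups[hash_record(record)].append(record["ip"])
    return dict(groups)
```

Passing the hashing step in as a callback keeps the file walking and grouping logic independent of how the digest is computed, so the same loop works with any hash function.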
Customize
Load from another source
- Implement your own loader that returns an HTTPResponse object.
- Call HTTPResponse.fuzzy_hash() to get the hash of the HTTP response.
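One possible way to write such a loader is to serialize each record from your own source into raw response bytes and hand them to create_http_response_from_bytes, the only constructor shown in this README. The helper below is a minimal sketch under stated assumptions: record_to_bytes is a hypothetical name, the record layout follows the JSON example earlier in this README, and the HTTP version is a guess because the JSON does not store it. Whether the parser accepts a body after the headers is not shown here, so check that before relying on it.

```python
from typing import Any, Dict


def record_to_bytes(record: Dict[str, Any], http_version: str = "HTTP/1.1") -> bytes:
    """Serialize a JSON-style record into raw HTTP response bytes.

    http_version is an assumption: the record format does not store it.
    """
    lines = [f"{http_version} {record['status_code']} {record['status_reason']}"]
    lines += [f"{name}: {value}" for name, value in record.get("headers", {}).items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    # Headers are encoded as latin-1; the body, if any, as UTF-8.
    return head.encode("latin-1") + record.get("body", "").encode("utf-8")


# Usage with hrfh installed (not executed here):
# from hrfh.utils.parser import create_http_response_from_bytes
# response = create_http_response_from_bytes(record_to_bytes(record))
# print(response.fuzzy_hash())
```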
Python 3.7 Support
$ docker run -i -t python:3.7 /bin/bash
root@aa0241a5a2f5:/# python --version
Python 3.7.12
root@aa0241a5a2f5:/# pip --version
pip 24.0 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)
root@aa0241a5a2f5:/# pip install --upgrade -q ipython hrfh==0.1.3
root@aa0241a5a2f5:/# ipython
Python 3.7.12 (default, Dec 21 2021, 11:25:13)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.34.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from hrfh.utils.parser import create_http_response_from_bytes
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data] Unzipping corpora/words.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
In [2]: response = create_http_response_from_bytes(b"""HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n""")
In [3]: response.masked
Out[3]: 'HTTP/1.0 200 OK\nETag: [MASK]\nServer: apache\nServer: nginx'
In [4]: response.fuzzy_hash()
Out[4]: 'ba15cc1f9ad3ef632d0ce7798f7fa44718f1e7fcc2c0f94c1a702f647b79923b'