Useragent_classifier
Installation
pip install useragent_classifier
Basic Usage
Text
useragent_classifier -f /tmp/mylist_of_User_agent.csv
Here, mylist_of_User_agent.csv contains one user agent per row, with no header:
| Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko |
| Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0 |
| Opera/6.11 (Linux 2.4.18-bf2.4 i686; U) [en] |
The command produces two files:
- a file with the cluster number assigned to each user agent
- a file that helps explain each cluster by listing its most important words or groups of words
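An input file in the format described above can be generated from Python. The following is a minimal sketch using only the standard library; the helper name `write_user_agent_file` and the output location are hypothetical, and it assumes the tool accepts standard CSV quoting for user agents that contain commas.

```python
import csv
import tempfile
from pathlib import Path

# Sample user agents taken from the format example above.
user_agents = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0",
    "Opera/6.11 (Linux 2.4.18-bf2.4 i686; U) [en]",
]


def write_user_agent_file(path, agents):
    """Write one user agent per row, with no header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for agent in agents:
            # Each row holds a single field: the raw user agent string.
            writer.writerow([agent])


# Hypothetical output location; any writable path works.
output_path = Path(tempfile.gettempdir()) / "mylist_of_User_agent.csv"
write_user_agent_file(output_path, user_agents)
```

The resulting path can then be passed to the command with the -f option.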
Graphical analysis of clusters
useragent_classifier -f /tmp/mylist_of_User_agent.csv --graphical-explanation
This launches a graphical analysis of the clusters on localhost, port 8050.
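The two command lines shown so far can also be driven from a script. This is a rough sketch, assuming the useragent_classifier executable is on the PATH; the helper names `build_command` and `run_classifier` are hypothetical, and only the -f and --graphical-explanation options shown above are used.

```python
import subprocess


def build_command(csv_path, graphical=False):
    """Assemble the useragent_classifier command line as an argument list."""
    command = ["useragent_classifier", "-f", csv_path]
    if graphical:
        # Option taken from the graphical analysis command above.
        command.append("--graphical-explanation")
    return command


def run_classifier(csv_path, graphical=False):
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(csv_path, graphical), check=True)


# Only prints the argument list; call run_classifier() to execute it.
print(build_command("/tmp/mylist_of_User_agent.csv", graphical=True))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.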
Usage in a Python program
import pandas as pd
# UserAgentClassifier must also be imported from the useragent_classifier package
# (the exact import path is not shown here).

# One user agent per row; the commas between the strings are required,
# otherwise Python concatenates them into a single string.
df = pd.DataFrame([
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11.1 (KHTML, like Gecko) Safari/419.3",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/521.32.1 (KHTML, like Gecko) Safari/521.32.1",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit (KHTML, like Gecko)",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; es-es) AppleWebKit/531.22.7 (KHTML, like Gecko)",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_6; en-us) AppleWebKit/528.16 (KHTML, like Gecko)",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_5; it-it) AppleWebKit/525.18 (KHTML, like Gecko)",
])
df.columns = ["ua"]  # a column named 'ua' is mandatory when used from a Python script
# Try 2 or 3 clusters; explain each cluster with at most 10 words or groups of words
classifier = UserAgentClassifier(n_clusters=[2, 3], n_top_words=10)
cluster = classifier.get_cluster(df)
feature_importances = classifier._features_importances
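To give an intuition for what "the most important words in a cluster" can mean, here is a small standard-library sketch that lists the most frequent tokens per cluster. It is only an illustration based on plain frequency counting, not the method the package uses to compute its feature importances; the function `top_words_per_cluster`, the token pattern, and the cluster labels are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Tokens start with a letter and may contain letters, digits, '_' and '.'.
TOKEN_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_.]*")


def top_words_per_cluster(user_agents, labels, n_top_words=10):
    """Return the most frequent lowercase tokens for each cluster label."""
    counts = defaultdict(Counter)
    for agent, label in zip(user_agents, labels):
        counts[label].update(
            token.lower() for token in TOKEN_PATTERN.findall(agent)
        )
    return {
        label: [word for word, _ in counter.most_common(n_top_words)]
        for label, counter in counts.items()
    }


agents = [
    "Opera/6.11 (Linux 2.4.18-bf2.4 i686; U) [en]",
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0",
]
labels = [0, 1]  # hypothetical cluster assignments, one per user agent
summary = top_words_per_cluster(agents, labels, n_top_words=3)
print(summary)
```

In this sketch, n_top_words simply caps the number of tokens reported per cluster, mirroring the role of the parameter of the same name above.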
More advanced Usage
To display the help
useragent_classifier --help