Language-Agnostic Website Embedding and Classification
Homepage2Vec - Beta
Getting started
Setup:
Install the library with pip:
pip install homepage2vec
Usage:
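import logging
from homepage2vec.model import WebsiteClassifier

# Show debug-level log messages while the example runs.
logging.getLogger().setLevel(logging.DEBUG)
# Create the classifier.
model = WebsiteClassifier()
# Fetch the homepage to classify.
website = model.fetch_website('epfl.ch')
# predict() returns the per-class scores and the embedding of the website.
scores, embeddings = model.predict(website)
print("Classes probabilities:", scores)
print("Embedding:", embeddings)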
Result:
Classes probabilities: {'Arts': 0.3674524128437042, 'Business': 0.0720655769109726,
'Computers': 0.03488553315401077, 'Games': 7.529282356699696e-06,
'Health': 0.02021787129342556, 'Home': 0.0005890956381335855,
'Kids_and_Teens': 0.3113572597503662, 'News': 0.0079914266243577,
'Recreation': 0.00835705827921629, 'Reference': 0.931416392326355,
'Science': 0.959597110748291, 'Shopping': 0.0010162043618038297,
'Society': 0.23374591767787933, 'Sports': 0.00014659571752417833}
Embedding: [-4.596550941467285, 1.0690114498138428, 2.1633379459381104,
0.1665923148393631, -4.605356216430664, -2.894961357116699, 0.5615459084510803,
1.6420538425445557, -1.918184757232666, 1.227172613143921, 0.4358430504798889,
...]
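In the result above the scores do not sum to one (both 'Reference' and 'Science' exceed 0.9), so a website can plausibly carry several labels at once. The sketch below shows one way to turn such a score dictionary into a list of labels. The helper name `labels_above` and the 0.5 threshold are assumptions made for illustration and are not part of the homepage2vec API. The scores are copied from the example output, rounded for readability.

```python
# Scores copied from the example result above, rounded to four decimals.
scores = {
    "Arts": 0.3675, "Business": 0.0721, "Computers": 0.0349,
    "Games": 7.53e-06, "Health": 0.0202, "Home": 0.0006,
    "Kids_and_Teens": 0.3114, "News": 0.0080, "Recreation": 0.0084,
    "Reference": 0.9314, "Science": 0.9596, "Shopping": 0.0010,
    "Society": 0.2337, "Sports": 0.0001,
}


def labels_above(scores, threshold=0.5):
    """Return (label, score) pairs with score >= threshold, highest first.

    Hypothetical helper: the threshold is an illustrative choice,
    not a value recommended by the library.
    """
    selected = [(label, s) for label, s in scores.items() if s >= threshold]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


print(labels_above(scores))
# → [('Science', 0.9596), ('Reference', 0.9314)]
```

With real predictions, the dictionary returned by `model.predict(website)` can be passed in place of the hard-coded `scores`. A lower threshold yields more labels per website, a higher one fewer.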