High-dimensional embedding generation library
HiDi is a library for high-dimensional embedding generation for collaborative filtering applications.
Read the full documentation.
How Do I Use It?
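This example will get you started. It reads a CSV file of link_id/item_id pairs, builds item embeddings, and writes them to a CSV file.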
from hidi import inout, clean, matrix, pipeline

# CSV file with link_id and item_id columns
in_files = ['hidi/examples/data/user-item.csv']

# File to write output data to
outfile = 'embeddings.csv'

transforms = [
    inout.ReadTransform(in_files),      # Read data from disk
    clean.DedupeTransform(),            # Dedupe it
    matrix.SparseTransform(),           # Make a sparse user*item matrix
    matrix.SimilarityTransform(),       # To item*item similarity matrix
    matrix.SVDTransform(),              # Perform SVD dimensionality reduction
    matrix.ItemsMatrixToDFTransform(),  # Make a DataFrame with an index
    inout.WriteTransform(outfile)       # Write results to csv
]

pl = pipeline.Pipeline(transforms)
pl.run()
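To make the first few pipeline steps concrete, here is a minimal standard-library sketch of deduplication, grouping users by item, and item*item similarity. It is not HiDi's implementation: the function name `item_similarity` and the toy data are hypothetical, Jaccard overlap is an assumed similarity measure chosen for illustration, and the SVD step is left out.

```python
from __future__ import division  # true division on Python 2.7 as well

from collections import defaultdict
from itertools import combinations


def item_similarity(pairs):
    """Return Jaccard similarity for each pair of items that share a user.

    `pairs` is an iterable of (link_id, item_id) tuples.
    Hypothetical helper for illustration; not part of HiDi.
    """
    # Dedupe: a set keeps each (link_id, item_id) pair once.
    unique_pairs = set(pairs)

    # Sparse user*item structure: map each item to the users linked to it.
    users_by_item = defaultdict(set)
    for link_id, item_id in unique_pairs:
        users_by_item[item_id].add(link_id)

    # Item*item similarity (assumed measure): Jaccard overlap of user sets.
    similarity = {}
    for a, b in combinations(sorted(users_by_item), 2):
        shared = users_by_item[a] & users_by_item[b]
        if shared:
            union = users_by_item[a] | users_by_item[b]
            similarity[(a, b)] = len(shared) / len(union)
    return similarity


# Made-up toy data, including one duplicate row.
pairs = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"), ("u2", "c"),
    ("u1", "a"),
]
sims = item_similarity(pairs)
```

Items "a" and "b" are linked to the same two users, so they score higher than either does against "c", which only one user is linked to. A dimensionality reduction step such as SVD would then compress each item's row of similarities into a shorter embedding vector.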
Setup
Requirements
HiDi is tested against CPython 2.7, 3.4, 3.5, and 3.6. It may work with other versions of CPython.
Installation
To install HiDi, run:
$ pip install hidi
Run the Tests
$ pip install tox
$ tox