# HiDi: Pipelines for Embeddings
HiDi is a library for high-dimensional embedding generation for collaborative
filtering applications.
## How Do I Use It?
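The pipeline below reads user-item data from a CSV file, builds an item-item
similarity matrix, reduces it with SVD, and writes the resulting embeddings to
a CSV file.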
```python
from hidi import inout, clean, matrix, pipeline
# CSV file with link_id and item_id columns
in_files = ['hidi/examples/data/user-item.csv']
# File to write output data to
outfile = 'embeddings.csv'
transforms = [
    inout.ReadTransform(in_files),      # Read data from disk
    clean.DedupeTransform(),            # Dedupe it
    matrix.SparseTransform(),           # Make a sparse user*item matrix
    matrix.SimilarityTransform(),       # To item*item similarity matrix
    matrix.SVDTransform(),              # Perform SVD dimensionality reduction
    matrix.ItemsMatrixToDFTransform(),  # Make a DataFrame with an index
    inout.WriteTransform(outfile)       # Write results to csv
]
pl = pipeline.Pipeline(transforms)
pl.run()
```
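To make the chaining idea concrete, here is a minimal sketch of the first few stages in plain Python 3. It is not HiDi's implementation: the function names (`read_rows`, `dedupe`, `item_cooccurrence`, `run_pipeline`) are hypothetical, raw co-occurrence counts stand in for whatever similarity measure `SimilarityTransform` actually computes, and the SVD step is left out.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def read_rows(text):
    """Parse CSV text with link_id and item_id columns into pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["link_id"], row["item_id"]) for row in reader]


def dedupe(pairs):
    """Drop repeated (link_id, item_id) pairs, keeping first-seen order."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


def item_cooccurrence(pairs):
    """Count, for each ordered pair of items, how many links contain both."""
    items_by_link = defaultdict(set)
    for link_id, item_id in pairs:
        items_by_link[link_id].add(item_id)
    counts = defaultdict(int)
    for items in items_by_link.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the result symmetric
    return dict(counts)


def run_pipeline(data, steps):
    """Feed the output of each step into the next, in order."""
    for step in steps:
        data = step(data)
    return data


# Hypothetical sample data: two links, with one duplicated row.
SAMPLE = "link_id,item_id\nu1,a\nu1,b\nu1,a\nu2,a\nu2,b\nu2,c\n"
similarity = run_pipeline(SAMPLE, [read_rows, dedupe, item_cooccurrence])
```

Each step takes the previous step's output as its only input, which is why the order of the list matters: deduplication has to happen before counting, or the repeated `u1,a` row would be parsed twice for no benefit.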
## Setup
### Requirements
HiDi is tested against CPython 2.7, 3.4, 3.5, and 3.6. It may also work with
other versions of CPython.
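If you want to know whether your interpreter is one of the tested versions, a small check like the following could help. This is only a sketch: `TESTED_VERSIONS` and `is_tested_version` are hypothetical names, not part of HiDi, and the version list is copied from the paragraph above.

```python
import sys

# (major, minor) pairs listed as tested in the Requirements section.
TESTED_VERSIONS = {(2, 7), (3, 4), (3, 5), (3, 6)}


def is_tested_version(version_info=None):
    """Return True if the (major, minor) version is in the tested set."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in TESTED_VERSIONS


if not is_tested_version():
    print("HiDi is not tested against Python %d.%d" % sys.version_info[:2])
```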
### Installation
To install HiDi, run:
```sh
$ pip install hidi
```
## Run the Tests
```sh
$ pip install tox
$ tox
```