A framework for representing sequences as embeddings.

Project description

Skip-Grammar

Models

Skip-gram Negative Sampling (SGNS)

Popular natural language processing models such as word2vec and BERT can be repurposed to learn relationships from arbitrary sequences of items. Skip-gram Negative Sampling is one such algorithm and is part of the models module. It is implemented as PyTorch components and can also be composed as a PyTorch Lightning module. Both are available under the relevant namespaces skipgrammar.models.sgns and skipgrammar.models.lighting.sgns.
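
To make the idea concrete, here is a minimal sketch of the two ingredients of skip-gram negative sampling: building (center, context) pairs from an item sequence, and scoring one positive pair against sampled negatives. It uses only the Python standard library and is not the package's API. The function names (skipgram_pairs, sample_negatives, sgns_loss) are hypothetical, and the uniform negative sampler is a simplifying assumption (word2vec draws negatives from a smoothed frequency distribution instead).

```python
import math
import random


def skipgram_pairs(sequence, window=2):
    """Yield (center, context) pairs for items within `window` positions."""
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, sequence[j]


def sample_negatives(vocab, exclude, k, rng):
    """Draw k items uniformly from vocab, skipping items in `exclude`.

    Assumption: uniform sampling, chosen here only for brevity.
    """
    candidates = [item for item in vocab if item not in exclude]
    return [rng.choice(candidates) for _ in range(k)]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def sgns_loss(center_vec, context_vec, negative_vecs):
    """Negative-sampling loss for one positive pair and its negatives."""
    # Pull the true context toward the center item.
    loss = -log_sigmoid(dot(center_vec, context_vec))
    # Push each sampled negative away from the center item.
    for neg in negative_vecs:
        loss -= log_sigmoid(-dot(center_vec, neg))
    return loss


# Example: one listening session treated as a "sentence" of items.
session = ["track_a", "track_b", "track_c", "track_d"]
pairs = list(skipgram_pairs(session, window=1))
rng = random.Random(0)
negatives = sample_negatives(session, exclude={"track_a", "track_b"}, k=2, rng=rng)
```

In a trained model the vectors passed to sgns_loss would be rows of two learned embedding tables, and the loss would be minimized by gradient descent over many such pairs.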

Datasets

Last.FM

The Last.FM Dataset-1K dataset comprises the listening history of approximately 1,000 users of the music service Last.FM. The dataset is available from the project's main site, and a preprocessed copy is also provided for ease of use. The variants in the dataset module use the preprocessed copy.

MovieLens

The popular recommendation system dataset MovieLens is available in three variants via the dataset module.

Download files

Source Distribution

skipgrammar-0.1.3.tar.gz (13.2 kB)

Uploaded Source

Built Distribution

skipgrammar-0.1.3-py3-none-any.whl (15.7 kB)

Uploaded Python 3
