Pretrained word embeddings in Python.
# embeddings
This Python package contains utilities to download pretrained word embeddings and make them available locally.
Embeddings are stored in the `$EMBEDDINGS_ROOT` directory (defaults to `~/.embeddings`) in a SQLite 3 database for minimal load time and fast retrieval.
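For example, to redirect the cache to another location before launching Python (the path below is illustrative):

```shell
# Override the default ~/.embeddings cache directory
export EMBEDDINGS_ROOT=/data/embeddings
```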
Because lookups query the database rather than loading a large embedding file into memory, `embeddings` is fast:
```python
In [1]: %timeit GloveEmbedding('common_crawl_840', d_emb=300)
100 loops, best of 3: 12.7 ms per loop
In [2]: %timeit GloveEmbedding('common_crawl_840', d_emb=300).emb('canada')
100 loops, best of 3: 12.9 ms per loop
In [3]: g = GloveEmbedding('common_crawl_840', d_emb=300)
In [4]: %timeit -n1 g.emb('canada')
1 loop, best of 3: 38.2 µs per loop
```
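The idea behind these numbers can be sketched with the standard library alone: store each vector as a packed binary blob keyed by word, so a lookup is a single indexed query instead of a scan over a multi-gigabyte text file. (This is an illustration of the approach, not the package's actual schema.)

```python
import sqlite3
import struct

# In-memory database for the sketch; the package persists to disk instead.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE embeddings (word TEXT PRIMARY KEY, vec BLOB)")

def put(word, vec):
    # Pack the float vector into a compact binary blob (4 bytes per float).
    db.execute("INSERT INTO embeddings VALUES (?, ?)",
               (word, struct.pack(f"{len(vec)}f", *vec)))

def emb(word):
    # A single indexed lookup; returns None for out-of-vocabulary words.
    row = db.execute("SELECT vec FROM embeddings WHERE word = ?",
                     (word,)).fetchone()
    if row is None:
        return None
    blob = row[0]
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

put("canada", [0.1, 0.2, 0.3])
print(emb("canada"))
```

Because the primary key is indexed, retrieval cost stays roughly constant as the vocabulary grows, which is why the per-query time above is in microseconds.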
## Installation
```bash
pip install embeddings  # from PyPI
pip install git+https://github.com/vzhong/embeddings.git  # from GitHub
```
## Usage
Note that on first usage, the embeddings will be downloaded. This may take a long time for large embeddings such as GloVe.
```python
from embeddings import GloveEmbedding, FastTextEmbedding, KazumaCharEmbedding
g = GloveEmbedding('common_crawl_840', d_emb=300, show_progress=True)
f = FastTextEmbedding()
k = KazumaCharEmbedding()
for w in ['canada', 'vancouver', 'toronto']:
    print('embedding {}'.format(w))
    print(g.emb(w))
    print(f.emb(w))
    print(k.emb(w))
```
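A common next step is to average word vectors into a fixed-size sentence vector. The sketch below assumes, as in the example above, that `.emb()` returns a list of floats; the dictionary-backed `emb` here is a hypothetical stand-in for a real embedding object's `.emb` method so the snippet runs without downloading anything.

```python
def sentence_emb(emb, words):
    """Average per-word vectors into one sentence vector."""
    vecs = [emb(w) for w in words]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

# Toy 2-dimensional vectors standing in for e.g. GloveEmbedding().emb.
toy = {'canada': [1.0, 0.0], 'vancouver': [0.0, 1.0]}
emb = toy.get

print(sentence_emb(emb, ['canada', 'vancouver']))  # → [0.5, 0.5]
```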
## Contribution
Pull requests welcome!