Pretrained word embeddings in Python.
# embeddings
This Python package provides utilities to download pretrained word embeddings and make them available locally.
Embeddings are stored in the `$EMBEDDINGS_ROOT` directory (defaults to `~/.embeddings`) in a SQLite 3 database for minimal load time and fast retrieval.
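The directory rule described above can be sketched as follows. This is a minimal illustration, not the package's own code: `resolve_embeddings_root` is a hypothetical helper, and the exact point at which the package reads `$EMBEDDINGS_ROOT` is an assumption.

```python
import os


def resolve_embeddings_root(environ=os.environ):
    """Return the directory where embedding databases would be stored.

    Hypothetical helper: mirrors the documented rule that
    $EMBEDDINGS_ROOT is used when set, with ~/.embeddings as the default.
    """
    default_root = os.path.join(os.path.expanduser('~'), '.embeddings')
    root = environ.get('EMBEDDINGS_ROOT') or default_root
    return os.path.abspath(root)


print(resolve_embeddings_root())
```

To keep large downloads off the home directory, the safest approach is to set `EMBEDDINGS_ROOT` in the shell before starting Python.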
Because each query goes to the database rather than loading a large embedding file into memory, both construction and lookup are fast:
```python
In [1]: %timeit GloveEmbedding('common_crawl_840', d_emb=300)
100 loops, best of 3: 12.7 ms per loop
In [2]: %timeit GloveEmbedding('common_crawl_840', d_emb=300).emb('canada')
100 loops, best of 3: 12.9 ms per loop
In [3]: g = GloveEmbedding('common_crawl_840', d_emb=300)
In [4]: %timeit -n1 g.emb('canada')
1 loop, best of 3: 38.2 µs per loop
```
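One reason lookups can stay this cheap is that a keyed database fetches a single row through an index instead of parsing a whole embedding file. The following is a minimal sketch of that idea using only the standard library. The table layout, the `build_store` and `lookup` helpers, and the float32 packing are illustrative assumptions, not the package's actual schema.

```python
import sqlite3
from array import array


def build_store(path, vectors):
    """Create a table mapping each word to its vector packed as float32 bytes."""
    db = sqlite3.connect(path)
    db.execute(
        'CREATE TABLE IF NOT EXISTS embeddings (word TEXT PRIMARY KEY, emb BLOB)'
    )
    rows = [(word, array('f', vec).tobytes()) for word, vec in vectors.items()]
    db.executemany('INSERT OR REPLACE INTO embeddings VALUES (?, ?)', rows)
    db.commit()
    return db


def lookup(db, word, default=None):
    """Fetch one vector via the primary-key index; return `default` if absent."""
    row = db.execute(
        'SELECT emb FROM embeddings WHERE word = ?', (word,)
    ).fetchone()
    if row is None:
        return default
    vec = array('f')
    vec.frombytes(row[0])  # unpack the stored float32 bytes
    return vec.tolist()


# Toy data standing in for real pretrained vectors.
db = build_store(':memory:', {
    'canada': [0.5, -1.0, 2.0],
    'toronto': [1.5, 0.25, 0.0],
})
print(lookup(db, 'canada'))  # [0.5, -1.0, 2.0]
print(lookup(db, 'narnia'))  # None
```

Opening the connection is the only fixed cost, which is consistent with the timings above: constructing the object dominates, and each later query is a single indexed read.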
## Installation
```bash
pip install embeddings  # from PyPI
pip install git+https://github.com/vzhong/embeddings.git  # or, from GitHub
```
## Usage
On first use, the requested embeddings are downloaded. This may take a long time for large embeddings such as GloVe.
```python
from embeddings import GloveEmbedding, FastTextEmbedding, KazumaCharEmbedding
g = GloveEmbedding('common_crawl_840', d_emb=300, show_progress=True)
f = FastTextEmbedding()
k = KazumaCharEmbedding()
# Look up each word in all three embedding sources.
for w in ['canada', 'vancouver', 'toronto']:
    print('embedding {}'.format(w))
    print(g.emb(w))
    print(f.emb(w))
    print(k.emb(w))
```
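A common next step is to compare two looked-up vectors. The sketch below assumes that `emb` returns a flat sequence of floats of equal length for both words. `cosine_similarity` is a hypothetical helper that is not part of the package, and the short vectors are stand-ins for real lookups such as `g.emb('toronto')`.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError('vectors must have the same dimension')
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Stand-in vectors; in practice these would come from an embedding lookup.
toronto = [3.0, 4.0]
vancouver = [4.0, 3.0]
print(cosine_similarity(toronto, vancouver))
```

If a word is missing from the vocabulary, the lookup may not return usable floats, so check the result before passing it to a helper like this.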
## Contribution
Pull requests welcome!