Fast and accurate language identifier
heliport
A language identification tool that aims to be both fast and accurate. It started as a Rust port of HeLI-OTS.
Installation
From PyPI
Install it in your environment
pip install heliport
then download the model
heliport-download
From source
Install the requirements:
Clone the repo, build the package and compile the model
git clone https://github.com/ZJaume/heliport
cd heliport
pip install .
heliport-convert
Usage
CLI
Run the heliport command, which reads lines from stdin and prints one language label per input line:
cat sentences.txt | heliport
eng_latn
cat_latn
rus_cyrl
...
Python package
>>> from heliport import Identifier
>>> i = Identifier()
>>> i.identify("L'aigua clara")
'cat_latn'
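To label many lines at once, the call above can be wrapped in a small helper. The following is a minimal sketch, not part of heliport: the function name count_languages is hypothetical, and it only assumes that identify takes a string and returns a label string, as in the example above.

```python
from collections import Counter
from typing import Callable, Iterable


def count_languages(lines: Iterable[str], identify: Callable[[str], str]) -> Counter:
    """Tally the predicted label of every non-empty line.

    `identify` is any callable mapping text to a label; with heliport
    this would be the `identify` method of an `Identifier` instance.
    """
    counts: Counter = Counter()
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines instead of labelling them
            counts[identify(text)] += 1
    return counts


# Usage with heliport (assumes the package and its model are installed):
#   from heliport import Identifier
#   with open("sentences.txt", encoding="utf-8") as f:
#       print(count_languages(f, Identifier().identify))
```

Passing the identifier in as a callable keeps the helper independent of how the model is loaded, so the same Identifier instance can be reused across files.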
Rust crate
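use std::sync::Arc;
use heliport::identifier::Identifier;
use heliport::lang::Lang;
use heliport::load_models;

fn main() {
    // Load the character and word models from the model directory.
    let (charmodel, wordmodel) = load_models("/dir/to/models");
    // The identifier takes shared (Arc) handles to both models.
    let identifier = Identifier::new(
        Arc::new(charmodel),
        Arc::new(wordmodel),
    );
    // identify returns the predicted language together with its score.
    let (lang, _score) = identifier.identify("L'aigua clara");
    assert_eq!(lang, Lang::cat_Latn);
}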
Benchmarks
Speed benchmarks on 100k random sentences from OpenLID, with all tools running single-threaded:
| Tool | Time (s) |
|---|---|
| CLD2 | 1.12 |
| HeLI-OTS | 60.37 |
| lingua all high preloaded | 56.29 |
| lingua all low preloaded | 23.34 |
| fasttext openlid193 | 8.44 |
| heliport | 2.33 |
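To make the table easier to compare, the timings can be converted into throughput and into a ratio against heliport. This is a small arithmetic sketch using only the numbers above; the helper names throughput and relative_to are illustrative, and the sentence count is taken as exactly 100,000.

```python
# Wall-clock times in seconds, copied from the benchmark table.
timings = {
    "CLD2": 1.12,
    "HeLI-OTS": 60.37,
    "lingua all high preloaded": 56.29,
    "lingua all low preloaded": 23.34,
    "fasttext openlid193": 8.44,
    "heliport": 2.33,
}
SENTENCES = 100_000  # assumed exact size of the benchmark set


def throughput(seconds: float, n: int = SENTENCES) -> float:
    """Sentences processed per second."""
    return n / seconds


def relative_to(tool: str, baseline: str = "heliport") -> float:
    """How many times longer `tool` took than `baseline`."""
    return timings[tool] / timings[baseline]


# Print the tools from fastest to slowest.
for tool, secs in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{tool:28s} {throughput(secs):>10.0f} sent/s  {relative_to(tool):6.2f}x heliport time")
```

On these figures heliport processes roughly 43k sentences per second, HeLI-OTS takes about 26 times as long, and only CLD2 is faster.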