JAI - Trust your data
Jai SDK - Trust your data
Installation
The source code is currently hosted on GitHub at: https://github.com/jquant/jai-sdk
Install jai-sdk using pip:
pip install jai-sdk
Get your Auth Key
First, you'll need an authorization key (Auth Key) to use the backend API.
To request a trial Auth Key through the SDK, fill in the values with your information:
from jai import Jai
r = Jai.get_auth_key(email=EMAIL, firstName=FIRSTNAME, lastName=LASTNAME)
If the response code is 201, you should receive an email with your Auth Key.
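A minimal sketch of checking that response code. It assumes `r` exposes the HTTP status as a `status_code` attribute, as a `requests.Response` would; the text above does not state the return type. The helper name `auth_key_requested` is hypothetical and not part of the SDK, and stand-in objects are used so the example runs without calling the backend.

```python
from types import SimpleNamespace

CREATED = 201  # response code the text above associates with a successful request


def auth_key_requested(response) -> bool:
    """Return True when the response reports 201.

    Assumes `response` has a `status_code` attribute; this is an assumption
    about what Jai.get_auth_key returns, not documented behaviour.
    """
    return getattr(response, "status_code", None) == CREATED


# Stand-in objects so the sketch runs without contacting the backend.
ok = SimpleNamespace(status_code=201)
failed = SimpleNamespace(status_code=400)

print(auth_key_requested(ok))      # True
print(auth_key_requested(failed))  # False
```

With the real SDK, the same check would be applied to the `r` returned by `Jai.get_auth_key(...)`, provided the attribute assumption holds.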
Get Started
If you already have an Auth Key, then you can use the SDK:
from jai import Jai
j = Jai(AUTH_KEY)
Setting up your databases
All data should be in pandas.DataFrame or pandas.Series format.
Application using the NLP FastText model
### fasttext implementation
# save this if you want to work in the same database later
name = 'text_data'
### Insert data and train the FastText model
# data can be a list of texts, pandas Series or DataFrame.
# if data is a list, then the ids will be set with range(len(data_list))
# if data is a pandas type, then the ids will be the index values.
# heads-up: index values must not contain duplicates.
j.setup(name, data, db_type='FastText')
# wait for the training to finish
j.wait_setup(name, 10)
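The id rules in the comments above can be checked locally before calling `j.setup`. The sketch below is a hedged illustration: `derive_ids` and `duplicated_ids` are hypothetical helpers, not SDK functions, and the non-list branch simply assumes an object with an `index` attribute, such as a pandas Series or DataFrame.

```python
from collections import Counter


def derive_ids(data):
    """Mirror the id rule described in the comments above.

    A list gets positional ids 0..len-1; anything else is assumed to be a
    pandas object whose ids are its index values.
    """
    if isinstance(data, list):
        return list(range(len(data)))
    return list(data.index)


def duplicated_ids(ids):
    """Return the ids that occur more than once, in first-seen order."""
    counts = Counter(ids)
    return [i for i, n in counts.items() if n > 1]


texts = ["first text", "second text", "third text"]
print(derive_ids(texts))                 # [0, 1, 2]
print(duplicated_ids([10, 11, 10, 12]))  # [10]
```

If `duplicated_ids(derive_ids(data))` returns a non-empty list, the index needs to be made unique before the data is inserted.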
Application using the NLP BERT model
### BERT implementation
# generate a random name for identification of the base; it can be a user input
name = j.generate_name(20, prefix='sdk_', suffix='_text')
# this time we choose db_type="Text", applying the pre-trained BERT model
j.setup(name, data, db_type='Text', batch_size=1024)
j.wait_setup(name, 10)
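The text does not explain what `batch_size` does. Assuming it caps how many records are processed per chunk, the splitting could look like the sketch below. `iter_batches` is a hypothetical helper for illustration only and says nothing about how the SDK is implemented.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"text {i}" for i in range(2500)]
sizes = [len(batch) for batch in iter_batches(texts, 1024)]
print(sizes)  # [1024, 1024, 452]
```

Under that assumption, 2500 texts with `batch_size=1024` would be handled in three chunks, the last one partial.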
Checking database
Here are some methods to check your databases.
The name of your database should appear in:
>>> j.names
['jai_database', 'jai_unsupervised', 'jai_supervised']
or you can check if a given database name is valid:
>>> j.is_valid(name)
True
You can also check the types for each of your databases with:
>>> j.info
            db_name       db_type
0      jai_database          Text
1  jai_unsupervised  Unsupervised
2    jai_supervised    Supervised
If you want to check which ids are in your database:
>>> j.ids(name)
['1000 items from 0 to 999']
Similarity
After you're done setting up your database, you can perform similarity searches:
- Using the indexes of the input data
# Find the 5 most similar values for ids 0 and 1
results = j.similar(name, [0, 1], top_k=5)
# Find the 20 most similar values for every id from [0, 99]
ids = list(range(100))
results = j.similar(name, ids, top_k=20)
# Find the 100 most similar values for every input value
results = j.similar(name, data.index, top_k=100, batch_size=1024)
- Using new data to be processed. All data should be in pandas.DataFrame or pandas.Series format.
# Find the 100 most similar values for every new_data
results = j.similar(name, new_data, top_k=100, batch_size=1024)
The output is a list of dictionaries. In each one, "query_id" is the id of the value whose similar items were searched for, and "results" is a list of top_k dictionaries, each holding an "id" and the "distance" between "query_id" and that "id".
[
    {
        'query_id': 0,
        'results':
        [
            {'id': 0, 'distance': 0.0},
            {'id': 3836, 'distance': 2.298321008682251},
            {'id': 9193, 'distance': 2.545339584350586},
            {'id': 832, 'distance': 2.5819168090820312},
            {'id': 6162, 'distance': 2.638622283935547},
            ...
        ]
    },
    ...,
    {
        'query_id': 9,
        'results':
        [
            {'id': 9, 'distance': 0.0},
            {'id': 54, 'distance': 5.262974262237549},
            {'id': 101, 'distance': 5.634262561798096},
            ...
        ]
    },
    ...
]
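The nested output described above is often easier to work with as flat rows. The following sketch flattens it into `(query_id, id, distance)` tuples and, by default, skips the row where an item matches itself at distance 0.0. `flatten_results` is a hypothetical helper, not part of the SDK; it relies only on the dictionary keys shown in the sample output.

```python
def flatten_results(results, drop_self=True):
    """Turn the nested output into (query_id, id, distance) tuples.

    With drop_self=True, rows where the match is the query itself are skipped.
    """
    rows = []
    for entry in results:
        query_id = entry["query_id"]
        for match in entry["results"]:
            if drop_self and match["id"] == query_id:
                continue
            rows.append((query_id, match["id"], match["distance"]))
    return rows


# Shortened copy of the sample output shown above.
sample = [
    {"query_id": 0, "results": [
        {"id": 0, "distance": 0.0},
        {"id": 3836, "distance": 2.298321008682251},
    ]},
    {"query_id": 9, "results": [
        {"id": 9, "distance": 0.0},
        {"id": 54, "distance": 5.262974262237549},
    ]},
]

for row in flatten_results(sample):
    print(row)
# (0, 3836, 2.298321008682251)
# (9, 54, 5.262974262237549)
```

The resulting list of tuples can be passed straight to `pandas.DataFrame(rows, columns=["query_id", "id", "distance"])` if a tabular view is preferred.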
Removing data
After you're done with the model setup, you can delete your raw data:
# Delete the raw data you inserted, as it won't be needed anymore
j.delete_raw_data(name)
If you no longer need the model or anything else related to your database:
j.delete_database(name)