Client for the BusinessOptics API
Project description
BusinessOptics Client
Easy access to the BusinessOptics API, based on the python requests library.
For example:
from businessoptics import Client
print Client(auth=('user@example.com', 'apikey')).workspace(123).query('ideaname').tuples()
Installation
pip install businessoptics
Authentication
Construct a new Client.
Authentication details can either be passed in directly:
client = Client(auth=('user@example.com', 'apikey'))
or be extracted from environment variables:
- BUSINESSOPTICS_EMAIL
- BUSINESSOPTICS_APIKEY
so you can go
client = Client()
or from a ~/.businessoptics_client.config
JSON file of the form
{
"user@example.com": "<apikey>",
"other@example.com": "<apikey>"
}
so you can easily switch between multiple users and create a client as below
client = Client(auth="user@example.com")
Usage
The client uses logging to show what it's doing, so make sure you have logging configured. A quick way to do this is as follows:
from businessoptics import setup_quick_console_logging
setup_quick_console_logging()
Run a query and download tuples
client = Client()
workspace = client.workspace(123)
# For a single idea:
tuples = workspace.query('idea_name1').tuples()
# tuples is now a list of dictionaries
# For multiple ideas:
query = workspace.query(['idea_name1', 'idea_name2'])
tuples1 = query.tuples('idea_name1')
tuples2 = query.tuples('idea_name2')
# For large numbers of tuples:
for tup in query.tuple_generator('idea_name1'):
process(tup)
# Get a (possibly cached) pandas dataframe of tuples:
df = query.to_df('idea_name1')
# Quick queries have a similar API, e.g.
tuples = workspace.quick('idea_name1').tuples()
Upload tuples to a dataset
dataset = client.dataset(456)
dataset.upload_tuples(
[{'id': 1, 'name': 'alice'}],
[{'id': 2, 'name': 'bob'}],
)
# Or use a generator for large amounts of tuples, e.g.
def tuples():
for i, tup in enumerate(large_database_query()):
tup['id'] = i
yield tup
dataset.upload_tuples(tuples())
# Or upload a pandas DataFrame
df = pd.read_csv(...)
dataset.upload_df(df) # Uploads all the columns, but not the indexes
dataset.upload_df(df.reset_index()) # Uploads all the indexes as well
Download a file from Google Drive
from businessoptics import gdrive_file
# Get URL by clicking on a file and then 'Get shareable link' in Google Drive
gfile = gdrive_file('https://drive.google.com/open?id=ABCDEF123')
gfile.path() # The path to the downloaded file
gfile.open() # Open the file
gfile.unzip() # Unzip the file, returning a similar object. The zip must only contain one file
gfile.untar() # Untar the file. Similar to the above, but use for '.tar.gz'.
gfile.unzip('the_only_file_you_need.csv') # extract a specific file when there are many
# Read a zipped CSV into Pandas
df = pd.read_csv(gfile.unzip().path())
Upload a file to Google Drive
from businessoptics import upload_to_google_drive
# File will be called 'a_local_file.csv' on Google Drive
upload_to_google_drive('path/to/a_local_file.csv')
# File will be called 'name_on_drive.csv' on Google Drive
upload_to_google_drive('path/to/a_local_file.csv', 'name_on_drive.csv')
# File will be zipped before upload and will be called 'a_local_file.csv.zip' on Google Drive
upload_to_google_drive('path/to/a_local_file.csv', zipit=True)
Controlling the download cache
If the /global_cache
folder is present (e.g. it is on jupyter.businessoptics.net) then by default cached files (from gdrive_file()
or .to_df()
) are stored there and shared by everyone. This may lead to strange behaviour if multiple people are downloading the same behaviour. You can avoid this and only use the cache in your home folder as follows:
from businessoptics import isolate_cache
isolate_cache()
Generic API usage
Every Client instance has a base URL. All requests made from it start from that base, and you can optionally add more to the URL for the request. For example:
client = Client() # client.base_url is ''
workspace = client.workspace(123) # workspace.base_url is '/api/v2/workspace/123'
# sends a GET request to /api/v2/workspace/123, returning metadata about the workspace
workspace.get()
# sends a GET request to /api/v2/workspace/123/query, returning the query history
# of the workspace
workspace.get('query')
The API responds with JSON which is automatically parsed into Python data structures, with a dictionary at the top.
If you want to send a POST, PUT, or DELETE request, use the post/put/delete
method. You will probably need to specify the json
keyword argument with some dictionary for the body of the request.
If there was an error, an APIError
exception will be raised.
Resource classes
Client
has several subclasses, each representing different resources in the app and having different methods. Here is how you would create instances of these classes and a brief overview of what you can do with them. For more details see the source code and docstrings.
All of these classes have base URLs which accept a plain get()
to get metadata about the resource.
from businessoptics import Client, Workspace, DataCollection, Dataset, Query, IdeaResult, Dashboard
client = Client()
# Workspace
workspace = client.workspace(123)
workspace = client.workspace('workspace name')
workspace = Workspace.at(client, '/api/v2/workspace/123/')
## Initialise a training run
training_run = workspace.train(['idea1', 'idea2'])
## Wait for it to complete
training_run.await()
# Query
## Get an existing, previously initiated query:
query = client.query(456)
query = Query.at(client, '/api/v2/query/456')
## Run a new query:
query = workspace.query(['idea_name1', 'idea_name2'])
### Pass knowledge parameters:
query = workspace.query('idea_name', parameters={'param1': 1, 'param2': 2})
### Run using hadoop:
query = workspace.query('idea_name', execution_mode='hadoop')
## To get tuples, use the tuples(), tuple_generator(), or to_df() methods that
## exist in IdeaResult. You don't have to separately get the result, just pass
## the idea name as the first argument, e.g. you can do:
tuples = query.tuples('idea_name')
## which is equivalent to:
tuples = query.result('idea_name').tuples()
## You can also run quick queries by replacing workspace.query with workspace.quick
# IdeaResult
result = query.result('idea_name1')
result = IdeaResult.at(client, '/api/v2/query/456/result/idea_name1')
tuples = result.tuples()
## For large numbers of tuples:
for tup in result.tuple_generator():
process(tup)
## Get a dataframe:
df = result.to_df()
## Reingest into a dataset
data_update = result.reingest_into_existing_dataset(456)
## Wait for the reingestion to finish
data_update.await()
# DataCollection
collection = client.datacollection(123)
collection = client.datacollection('collection name')
collection = DataCollection.at(client, '/api/v2/datacollection/123')
collection = client.dataset(456).collection # NOT the datacollection method
# Dataset
dataset = client.dataset(456)
dataset = collection.dataset('dataset name')
dataset = Dataset.at(client, '/api/v2/dataset/456')
## For uploading tuples, see section above
## Downloading tuples is similar to IdeaResult:
## use the methods tuples(), tuple_generator(), and to_df()
## You can also specify filters for the first two methds - see the docstring for tuple_generator
## Create a new dataset:
dataset = collection.create_tablestore_dataset(
name='test',
dimensions=[
dict(name='col1', type='integer', default='-1', key=False),
dict(name='col2', type='integer', default='-1', key=False),
]
)
## or from a dataframe (see docstring):
dataset = collection.create_tablestore_dataset_from_df('df test', df)
## Duplicate a dataset
dataset_name = dataset.get()['name']
new_dataset_name = 'new.' + dataset_name
new_dataset = dataset.duplicate(new_dataset_name) # see docstring for more parameters
## Rename a dataset:
new_dataset.rename(dataset_name)
## Delete a dataset:
dataset.delete()
## Delete tuples:
dataset.delete_tuples() # see docstring for how to specify filter
# Dashboard
dashboard = client.dashboard(456)
dashboard = workspace.dashboard('dashboard name')
dashboard = Dashboard.at('/api/v2/dashboard/456')
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file businessoptics-0.8.0.tar.gz
.
File metadata
- Download URL: businessoptics-0.8.0.tar.gz
- Upload date:
- Size: 24.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.8.0 tqdm/4.43.0 CPython/3.8.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1d173472311493cd2bbc00e4adb1f6be3532bce72f1d4a62eb297383031878f0 |
|
MD5 | 4a131b40394425b6e5f012dd67171db4 |
|
BLAKE2b-256 | 05ff3d5fe473c1654afa2f5f1eae9cc6ce79306f22e69d0123ec5359bf7455fd |