
Alphacast Python Library

Introduction

The Alphacast Python library allows you to interact with the assets hosted on Alphacast without needing to know the details of the API integration. In the current version you can interact with your repositories and datasets.

Quick Start

Begin by installing the Alphacast SDK from your console:


    pip install alphacast

Then import the module in your script and initialize the session with your API key. To get an API key you need an Alphacast account (https://www.alphacast.io/api/auth/login). Once the account is created, find your key in your settings (click on your user in the left menu).


    from alphacast import Alphacast

    alphacast = Alphacast(YOUR_API_KEY)
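
To avoid hard-coding the key, you can read it from an environment variable instead (the variable name ALPHACAST_API_KEY is just an illustrative choice, not an SDK convention):

    import os
    from alphacast import Alphacast

    # Read the key from the environment instead of hard-coding it
    alphacast = Alphacast(os.environ["ALPHACAST_API_KEY"])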

Repositories

All interactions with your repositories are handled with the "repository" class. To read the metadata of all your repositories, plus those where you have write permissions, use repository.read_all():


    alphacast.repository.read_all()

To read the metadata of a single repository by id or name, use read_by_id or read_by_name. You need owner, admin or write permissions to access the repo.


    alphacast.repository.read_by_id(repo_id)

    alphacast.repository.read_by_name(repo_name)

To create a repository you need to define its name; the slug, privacy and description are optional. The parameter returnIdIfExists (True/False) controls what happens when the repository already exists: if True, the call returns the existing repository's id.


    alphacast.repository.create("my first Test Repo", repo_description="This is my first Repo", slug="test-repo", privacy="Public", returnIdIfExists=True)
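
Since returnIdIfExists=True returns the id when the repository already exists, it can be handy to keep the result for later calls (a sketch; the return value for a freshly created repository is not documented here):

    # Capture the repository id so it can be reused when creating datasets
    repo_id = alphacast.repository.create("my first Test Repo", repo_description="This is my first Repo", slug="test-repo", privacy="Public", returnIdIfExists=True)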

Finding datasets

Finding and downloading data is at the core of Alphacast and is done with the "datasets" class.

To access the metadata of all your datasets (that is, those where you have owner, admin or write permission) use read_all() or read_by_name().


    alphacast.datasets.read_all()

    alphacast.datasets.read_by_name("dataset_name")

NOTE: Only your own datasets can be accessed with these methods. If you want to use public data you have to use the id, which can be found in the url of the dataset on the alphacast.io website.

With the id you can also access information about datasets where you have read permission (either because they are public or because you have been granted access).

Use the dataset(dataset_id).metadata(), datestats() and get_column_definitions() methods to access more information.

metadata() retrieves the values of id, name, createdAt, updatedAt, repositoryId and the permission levels.

datestats() retrieves the inferred frequency and the first and last dates available. get_column_definitions() returns the dataset's column definitions.


    alphacast.datasets.dataset(5565).metadata()

    alphacast.datasets.dataset(5208).datestats()

    alphacast.datasets.dataset(5208).get_column_definitions()

Downloading data

The download_data() method of the dataset() class is used to retrieve the data from a dataset. You need read permission (or above) to access the data.


    # for json/xlsx/csv data use format = "json"  / "xlsx" / "csv"

    json_data = alphacast.datasets.dataset(6755).download_data(format = "json")

    excel_file = alphacast.datasets.dataset(6755).download_data("xlsx")

    csv_data = alphacast.datasets.dataset(6755).download_data("csv")

    

    # To load this into a Pandas dataframe 

    import pandas as pd

    import io

    df = pd.read_csv( io.StringIO(alphacast.datasets.dataset(6755).download_data("csv").decode("UTF-8")))



    # or directly

    df = alphacast.datasets.dataset(6755).download_data("pandas")

Filtering Dates, variables and entities

Dates, variables and entities can be filtered when downloading by following this guide:


    from datetime import datetime

    fEntities = {"Entity_Name": ["Entity_value_1", "Entity_value_2"]}

    fVariables = ["Variable_1", "Variable_2", "Variable_3"]

    # with `from datetime import datetime`, the constructor is called directly
    d1 = datetime(2019, 2, 27)

    d2 = datetime(2022, 3, 2)

    alphacast.datasets.dataset(dataset_id).download_data("pandas", startDate=d1, endDate=d2, filterEntities=fEntities, filterVariables=fVariables)



Creating datasets

Creating datasets and uploading information is a two-step process. First you need to create the dataset and "initialize" its columns. We need to know which is the "Date" column and which are the Entity column or columns.

The Entity can be defined as one or more columns as long as the Date / Entity pairs are unique. Basically, think of Date / Entity as a unique index.

Important Note: If you want to create Alphacast charts with your data, the Entity needs to be a single column (a Date / Entity pair). Our chart engine accepts, for the moment, only single-entity datasets.
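
For illustration, a minimal single-entity DataFrame with unique Date / Entity pairs could look like this (column names and values are hypothetical):

    import pandas as pd

    # Each (Date, country) pair appears exactly once,
    # so the pair works as a unique index
    df = pd.DataFrame({
        "Date": ["2021-01-01", "2021-01-01", "2021-02-01", "2021-02-01"],
        "country": ["Argentina", "Brazil", "Argentina", "Brazil"],
        "cpi_yoy": [38.5, 4.6, 40.7, 5.2],
    })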

So first, let's create a dataset:


    alphacast.datasets.create(dataset_name, repo_id, description)

The process, if successful, will provide you with an id. You can check whether your dataset has been created by visiting alphacast.io/datasets/{dataset_id}

Uploading data

Now let's insert some data into that dataset. We will use the pandas dataframe loaded before. Uploading with Pandas DataFrames is an easy way to do it, but plain csv can also be uploaded.


    # keep some variables from the dataset
    df = df[['Date', 'country', 'CPI - All Urban Wage Earners and Clerical Workers - current_prices_yoy']]

    # initialize the columns. We will use "Date" as the date column and "country" as the entity.
    alphacast.datasets.dataset(dataset_id).initialize_columns(dateColumnName="Date", entitiesColumnNames=["country"], dateFormat="%Y-%m-%d")

    # sample response:
    b'{"id": {dataset_id}, "columnDefinitions": [{"sourceName": "Date", "dataType": "Date", "dateFormat": "%Y-%m-%d", "isEntity": "True"}, {"sourceName": "country", "isEntity": "True"}], "updateAt": "2021-10-06T16:51:35.418493"}'

Next step: upload the data. Four parameters are needed. "df" is the data, and uploadIndex defines whether the DataFrame index should also be uploaded.

deleteMissingFromDB and onConflictUpdateDB decide what happens when there is already data in the dataset. If deleteMissingFromDB is True, everything that is not sent in the current upload will be deleted. If onConflictUpdateDB is True, the conflicting values of matching Date / Entity pairs will be updated.


    alphacast.datasets.dataset(7938).upload_data_from_df(df, deleteMissingFromDB = True, onConflictUpdateDB = True, uploadIndex=False)

    

    #upload_data_from_csv() is also available
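
Following the parameter description above, an append-only upload that keeps existing rows and leaves conflicting values untouched would simply turn both flags off (a sketch):

    # Append-only upload: keep existing rows, don't overwrite conflicts
    alphacast.datasets.dataset(7938).upload_data_from_df(df, deleteMissingFromDB=False, onConflictUpdateDB=False, uploadIndex=False)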

Now head to https://www.alphacast.io/datasets/{dataset_id} to see the result.

Checking the status of the Process

Your request creates an upload process in Alphacast, which may take some time. You will get the id of that process when submitting the upload. It will look like this:

{"id": 45141, "status": "Requested", "createdAt": "2021-10-06T16:58:18.999786", "datasetId": 7938}'

To check the status of all your processes for that dataset use


    alphacast.datasets.dataset(7938).processes()



    b'[{"id": 45141, "datasetId": 7938, "status": "Processed", "statusDescription": "1292 values added to database./n", "deleteMissingFromDB": 0, "onConflictUpdateDB": 0, "createdAt": "2021-10-06T16:58:18", "processedAt": "2021-10-05T15:40:52"}]'



    #or alternatively



    alphacast.datasets.dataset(7938).process(45141)    
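
If you want to block until the process finishes, a simple polling loop will do. This is a sketch that assumes process() returns JSON bytes shaped like the sample output above:

    import json
    import time

    # Poll until the process leaves the "Requested" state
    # (assumes process() returns JSON bytes, like processes() above)
    while True:
        response = alphacast.datasets.dataset(7938).process(45141)
        if json.loads(response)["status"] != "Requested":
            break
        time.sleep(5)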

Ok! We are done. Good job!

Many more features are coming down the road. Stay tuned! We would love to hear your feedback at hello@alphacast.io.
