Python client for the Datacore API (https://datacore.vn). Imported as `datacore`.

Datacore Python Client

Python client library for the Datacore API — supports two access modes:

  • Demo: Preview datasets without an API key
  • Paid: Full access with an API key

Distribution name vs. import name: install as datacore-vn, import as datacore. (Same pattern as pip install scikit-learn followed by import sklearn.) The shorter datacore distribution name on PyPI belongs to an unrelated, abandoned 2022 project; datacore-vn is our permanent distribution name.
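
Because the two names differ, a version check has to use the distribution name rather than the import name. The sketch below uses only the standard library; the helper name installed_version is made up for illustration and is not part of the datacore package.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of the datacore package.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Look the package up by its PyPI distribution name, not by `datacore`.
print(installed_version("datacore-vn"))
```

This prints None when datacore-vn is not installed in the active environment.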

Installation

From PyPI:

pip install datacore-vn                # core (pandas-only)
pip install "datacore-vn[polars]"      # also installs polars
pip install "datacore-vn[all]"         # all optional extras

Or with uv:

uv add datacore-vn
uv add "datacore-vn[polars]"

From GitHub directly:

pip install git+https://github.com/DataCore-VietNam/DataCore.git
pip install "datacore-vn[polars] @ git+https://github.com/DataCore-VietNam/DataCore.git"

For contributors only (you do not need this to use the library — the commands above are all an end user needs):

git clone https://github.com/DataCore-VietNam/DataCore.git
cd DataCore
pip install -e ".[dev]"

Configuration (.env, optional)

Create a .env file in your working directory (see .env.example):

X_API_KEY=your-api-key-here

Never commit your real API key. .env is git-ignored. Use .env.example as a template.
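
If you prefer to resolve the key yourself and pass it as api_key=, something like the following may be enough. It is a standard-library sketch, not the library's own loader (which is not shown here): the function name read_api_key and the simple KEY=value parsing are assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def read_api_key(env_path: str = ".env", name: str = "X_API_KEY") -> Optional[str]:
    """Return the key from the process environment, else from a .env file.

    Hypothetical helper: it handles plain KEY=value lines only, which is an
    assumption about the file layout rather than a documented format.
    """
    value = os.environ.get(name)
    if value:
        return value
    path = Path(env_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            return val.strip().strip('"').strip("'") or None
    return None


print(read_api_key() is not None)  # True only if a key is configured
```

The result can then be passed on explicitly, for example Datacore(api_key=read_api_key()).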


Usage

1. Initialize the client

from datacore import Datacore

# Demo mode (no API key required)
client = Datacore()

# Paid mode — pass the key explicitly...
client = Datacore(api_key="your-api-key")

# ...or rely on X_API_KEY from .env / environment
client = Datacore()

# Enable request/response debug logging
client = Datacore(api_key="your-api-key", debug=True)

2. Preview a dataset (Demo mode)

Preview data without an API key.

# All columns
df = client.preview("dataset_historical_price")
print(df.head())

# Filter specific columns
df = client.preview("dataset_historical_price", columns=["symbol", "date", "close_price"])
print(df.head())

3. Fetch data (Paid mode)

# All columns — returns {"data": DataFrame, "info": str}
result = client.get_data("dataset_historical_price")
print(result["data"].head())
print(result["info"])
# num: 3760607, totalPage: 37607, currentPage: 1, queried_rows: 100

# Filter specific columns
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
)
print(result["data"].head())

Full parameters:

result = client.get_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],   # client-side column filter (optional)
    conditions=None,         # EXPERIMENTAL server-side row filter -- see note below
    select_fields=None,      # server-side field selection (optional)
    page=1,
    limit=100,               # max 100 server-side (HTTP 400 if higher)
    return_type="dataframe", # "dataframe" | "polars" | "json" | "dict"
    include_info=True,       # True: returns {"data": ..., "info": ...} | False: data only
)

Page size: the gateway currently caps limit at 100 rows per request. Passing a larger value returns HTTP 400: Invalid request content. For larger downloads, paginate with download_data or paginate.
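
To estimate how many requests a dataset needs under the 100-row cap, the info string can be parsed and the page count recomputed. This is a sketch: parse_info and pages_needed are hypothetical helpers, the "key: value, key: value" layout is inferred from the sample output above, and reading num as the total row count is an assumption.

```python
from typing import Dict


def parse_info(info: str) -> Dict[str, int]:
    """Parse a 'key: value, key: value' string into a dict of ints.

    Assumes the layout of the sample info string; not a documented format.
    """
    fields: Dict[str, int] = {}
    for part in info.split(","):
        key, _, value = part.partition(":")
        fields[key.strip()] = int(value.strip())
    return fields


def pages_needed(total_rows: int, limit: int = 100) -> int:
    """Requests needed to read total_rows at `limit` rows per page."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Integer ceiling division avoids float rounding on large counts.
    return (total_rows + limit - 1) // limit


info = parse_info("num: 3760607, totalPage: 37607, currentPage: 1, queried_rows: 100")
print(info["totalPage"])          # 37607
print(pages_needed(info["num"]))  # 37607
```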

⚠️ conditions is experimental. The server-side conditions row filter is forwarded to the gateway verbatim, but the accepted JSON shape is not yet finalised — every shape tried so far is rejected by gateway.datacore.vn/data/ds/search with HTTP 400. Do not rely on conditions in production yet. Until the gateway schema is confirmed, fetch unfiltered data and filter the returned DataFrame client-side. This parameter may change in a future release.
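
A client-side filter on the returned DataFrame could look like the sketch below. The rows are made up so the example runs offline; in real use the frame would come from result["data"]. The column dtypes (dates as strings, prices as floats) are assumptions.

```python
import pandas as pd

# Made-up rows standing in for result["data"]; real values come from the API.
df = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "AAA"],
        "date": ["2024-01-02", "2024-01-02", "2024-01-03"],
        "close_price": [10.0, 20.0, 11.0],
    }
)

# Row filter applied locally instead of through `conditions`.
mask = (df["symbol"] == "AAA") & (df["close_price"] > 10.5)
filtered = df.loc[mask].reset_index(drop=True)
print(filtered)
```

Filtering locally means every page still travels over the network, so narrowing columns with columns= first can reduce memory use.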

A convenience wrapper that returns the DataFrame directly (no info dict):

df = client.get_dataframe("dataset_historical_price", limit=100)

3b. Polars output (optional)

If you installed with pip install "datacore-vn[polars]", you can ask for a polars DataFrame instead of pandas:

# Via return_type
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    limit=100,
    return_type="polars",
)
print(type(result["data"]))     # <class 'polars.DataFrame'>
print(result["data"].head())

# Convenience method (no info dict, just the polars frame)
df_pl = client.get_polars("dataset_historical_price", limit=100)

# Preview supports polars too
df_pl = client.preview("dataset_historical_price", return_type="polars")

Pandas is the default for backwards compatibility; polars is purely opt-in. The same columns= filter works for both backends.


4. Iterate all pages (Paid mode)

for page_df in client.paginate("dataset_historical_price", limit=100, max_pages=5):
    print(page_df.shape)
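
When the pages are small enough to hold in memory, they can be concatenated into one frame. The sketch below assumes only the paginate(dataset_code, limit=..., max_pages=...) call shown above; collect_pages and the stub client are hypothetical and exist so the example runs without network access.

```python
import pandas as pd


def collect_pages(client, dataset_code: str, max_pages: int, limit: int = 100) -> pd.DataFrame:
    """Concatenate up to max_pages page frames into a single DataFrame."""
    frames = list(client.paginate(dataset_code, limit=limit, max_pages=max_pages))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


class _StubClient:
    """Offline stand-in for a Datacore client; yields two made-up rows per page."""

    def paginate(self, dataset_code, limit=100, max_pages=None):
        for page in range(1, (max_pages or 1) + 1):
            yield pd.DataFrame({"symbol": ["AAA"] * 2, "page": [page] * 2})


combined = collect_pages(_StubClient(), "dataset_historical_price", max_pages=3)
print(combined.shape)  # (6, 2)
```

With a real client, pass it in place of _StubClient() and keep max_pages modest for large datasets.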

5. Download data to file

# Download all pages of a small dataset (76 pages, ~7.5k rows)
download_result = client.download_data(
    dataset_code="gross_domestic_product_dataset_ds",
    output_path="data.csv",
    file_format="csv",     # "csv" or "json"
    start_page=1,
    end_page=None,         # None = download until the last page
    limit=100,             # max per-request page size (see note above)
    show_progress=True,
)
print(download_result)
# {"output_path": "data.csv", "pages_downloaded": 76, "rows_downloaded": 7551, ...}

# `dataset_historical_price` is large (~3.7M rows / 37k pages at limit=100);
# expect a full download to take a long time and use a lot of network traffic.

# Download only first 3 pages, filtered to specific columns (CSV only)
download_result = client.download_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    output_path="data_page1_3.csv",
    file_format="csv",
    start_page=1,
    end_page=3,
    show_progress=True,
)

columns filtering only applies to file_format="csv". JSON output preserves the full raw API response.
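
For a large dataset it may be more practical to download in several page ranges, one file per range, so that a failed run only repeats one chunk. The helper below is a hypothetical sketch that computes inclusive (start_page, end_page) pairs matching the start_page and end_page parameters above; the output file naming is also just an illustration.

```python
from typing import Iterator, Tuple


def page_ranges(total_pages: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start_page, end_page) pairs covering 1..total_pages."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be >= 1")
    for start in range(1, total_pages + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_pages)


for start, end in page_ranges(total_pages=76, chunk_size=25):
    # Hypothetical file naming; each pair maps onto start_page= / end_page=.
    print(f"data_page{start}_{end}.csv")
```

Each pair can be passed to download_data as start_page=start, end_page=end with its own output_path.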


Method Summary

| Method | Description | Requires API key |
|---|---|---|
| preview(dataset_code, columns, return_type) | Preview a dataset (pandas or polars) | No |
| preview_raw(dataset_code) | Preview a dataset, raw dict response | No |
| get_data(dataset_code, ...) | Fetch data, returns {"data", "info"} by default | Yes |
| get_dataframe(dataset_code, ...) | Fetch data, returns pandas DataFrame directly | Yes |
| get_polars(dataset_code, ...) | Fetch data, returns polars DataFrame directly (needs [polars] extra) | Yes |
| get_data_info(dataset_code, ...) | Get dataset metadata summary | Yes |
| paginate(dataset_code, ...) | Generator yielding one pandas DataFrame per page | Yes |
| download_data(dataset_code, output_path, ...) | Download data to CSV/JSON file | Yes |
| set_api_key(api_key) | Set / replace the API key on an existing client | |
| is_authenticated() | Returns True if an API key is configured | |

Error Handling

| Error | Cause | Solution |
|---|---|---|
| AuthenticationError | Missing or invalid API key (HTTP 401 or httpCode:401 in body) | Pass api_key= or set X_API_KEY in .env |
| PermissionDeniedError | No access to dataset (HTTP 403) | Check your subscription plan |
| APIRequestError | Server error, invalid request, or unknown dataset | Check dataset_code and conditions |
| ValueError | Bad argument (e.g. unknown column, page < 1, bad file_format) | Check the error message |

All exceptions inherit from DatacoreError, so you can catch them generically:

from datacore import Datacore, DatacoreError

# Picks up X_API_KEY from .env / the environment if it is set
client = Datacore()

try:
    client.get_data("dataset_historical_price")
except DatacoreError as e:
    print(f"Datacore call failed: {e}")
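
If transient server errors are a concern, a small retry wrapper is one option. The sketch below is generic and not part of the library: call_with_retries is a hypothetical helper, and which exception types are worth retrying is your call. Retrying will not fix an invalid request or an unknown dataset, both of which the table above also lists under APIRequestError.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay: float = 0.0,
) -> T:
    """Call fn(), retrying on the given exception types up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            # Re-raise on the final attempt so the caller sees the real error.
            if attempt == attempts:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")


calls = {"n": 0}


def flaky() -> str:
    """Made-up function that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


print(call_with_retries(flaky, retry_on=(RuntimeError,)))  # ok
```

With a real client, fn could be lambda: client.get_data("dataset_historical_price") and retry_on could name the exception class you consider transient.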

License

MIT — see LICENSE.

