Python client for the Datacore API (https://datacore.vn). Imported as `datacore`.
Datacore Python Client
Python client library for the Datacore API — supports two access modes:
- Demo: Preview datasets without an API key
- Paid: Full access with an API key
Distribution name vs. import name: install as `datacore-vn`, import as `datacore`. (Same pattern as `pip install scikit-learn` → `import sklearn`.) The shorter `datacore` distribution name on PyPI belongs to an unrelated abandoned 2022 project; `datacore-vn` is our permanent distribution name.
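Because the two names differ, a version check has to query the distribution name rather than the import name. A minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of the client:

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "datacore-vn") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        # metadata.version() expects the *distribution* name (datacore-vn),
        # not the import name (datacore).
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())  # a version string if installed, otherwise None
```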
Installation
From PyPI:
pip install datacore-vn # core (pandas-only)
pip install "datacore-vn[polars]" # also installs polars
pip install "datacore-vn[all]" # all optional extras
Or with uv:
uv add datacore-vn
uv add "datacore-vn[polars]"
From GitHub directly:
pip install git+https://github.com/DataCore-VietNam/DataCore.git
pip install "datacore-vn[polars] @ git+https://github.com/DataCore-VietNam/DataCore.git"
For contributors only (you do not need this to use the library — the commands above are all an end user needs):
git clone https://github.com/DataCore-VietNam/DataCore.git
cd DataCore
pip install -e ".[dev]"
Configuration (.env, optional)
Create a .env file in your working directory (see .env.example):
X_API_KEY=your-api-key-here
Never commit your real API key.
`.env` is git-ignored. Use `.env.example` as a template.
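The idea of "explicit key, else `X_API_KEY`, else demo mode" can be sketched as below. This illustrates the lookup order only and is not the client's actual implementation: `resolve_api_key` is a hypothetical helper, the precedence of an explicit key over the environment is an assumption, and reading the `.env` file itself is left out.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick an API key: explicit argument first, then X_API_KEY, else None.

    Assumption: an explicit key wins over the environment.
    A return value of None means no key is available (demo mode).
    """
    env = os.environ if env is None else env
    for candidate in (explicit, env.get("X_API_KEY")):
        # Skip missing, empty, or whitespace-only values.
        if candidate and candidate.strip():
            return candidate.strip()
    return None


key = resolve_api_key()
print("paid mode" if key else "demo mode")
```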
Usage
1. Initialize the client
from datacore import Datacore
# Demo mode (no API key required)
client = Datacore()
# Paid mode — pass the key explicitly...
client = Datacore(api_key="your-api-key")
# ...or rely on X_API_KEY from .env / environment
client = Datacore()
# Enable request/response debug logging
client = Datacore(api_key="your-api-key", debug=True)
2. Preview a dataset (Demo mode)
Preview data without an API key.
# All columns
df = client.preview("dataset_historical_price")
print(df.head())
# Filter specific columns
df = client.preview("dataset_historical_price", columns=["symbol", "date", "close_price"])
print(df.head())
3. Fetch data (Paid mode)
# All columns — returns {"data": DataFrame, "info": str}
result = client.get_data("dataset_historical_price")
print(result["data"].head())
print(result["info"])
# num: 3760607, totalPage: 37607, currentPage: 1, queried_rows: 100
# Filter specific columns
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
)
print(result["data"].head())
Full parameters:
result = client.get_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],  # client-side column filter (optional)
    conditions=None,          # EXPERIMENTAL server-side row filter -- see note below
    select_fields=None,       # server-side field selection (optional)
    page=1,
    limit=100,                # max 100 server-side (HTTP 400 if higher)
    return_type="dataframe",  # "dataframe" | "polars" | "json" | "dict"
    include_info=True,        # True: returns {"data": ..., "info": ...} | False: data only
)
Page size: the gateway currently caps `limit` at 100 rows per request. Passing a larger value returns `HTTP 400: Invalid request content`. For larger downloads, paginate with `download_data` or `paginate`.
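With the 100-row cap, the number of requests needed for a dataset is a ceiling division of its row count by `limit`. A small sketch of that arithmetic; `pages_needed` is a hypothetical helper, and the row counts are the ones quoted in this README:

```python
MAX_LIMIT = 100  # per-request cap stated above


def pages_needed(total_rows: int, limit: int = MAX_LIMIT) -> int:
    """Number of pages required to fetch total_rows at the given page size."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    # Integer ceiling division, avoiding floating-point rounding.
    return (total_rows + limit - 1) // limit


print(pages_needed(3_760_607))  # 37607
print(pages_needed(7_551))      # 76
```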
⚠️ `conditions` is experimental. The server-side `conditions` row filter is forwarded to the gateway verbatim, but the accepted JSON shape is not yet finalised: every shape tried so far is rejected by `gateway.datacore.vn/data/ds/search` with `HTTP 400`. Do not rely on `conditions` in production yet. Until the gateway schema is confirmed, fetch unfiltered data and filter the returned DataFrame client-side. This parameter may change in a future release.
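Client-side filtering of a pandas DataFrame might look like the sketch below. The frame is a stand-in for `result["data"]`: the column names follow the examples in this README, but the values are made up, and treating `date` as an ISO-formatted string is an assumption about the real data.

```python
import pandas as pd

# Stand-in for result["data"]; values are illustrative only.
df = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "AAA"],
        "date": ["2024-01-02", "2024-01-02", "2024-01-03"],
        "close_price": [10.5, 20.0, 11.0],
    }
)

# Row filter applied locally instead of via `conditions`.
# ISO date strings compare correctly as plain strings.
mask = (df["symbol"] == "AAA") & (df["date"] >= "2024-01-03")
filtered = df.loc[mask].reset_index(drop=True)
print(filtered)
```

Since each request returns at most one page, a filter like this only sees the rows that were actually fetched.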
A convenience wrapper that returns the DataFrame directly (no info dict):
df = client.get_dataframe("dataset_historical_price", limit=100)
3b. Polars output (optional)
If you installed with pip install "datacore-vn[polars]", you can ask for a
polars DataFrame instead of pandas:
# Via return_type
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    limit=100,
    return_type="polars",
)
print(type(result["data"])) # <class 'polars.DataFrame'>
print(result["data"].head())
# Convenience method (no info dict, just the polars frame)
df_pl = client.get_polars("dataset_historical_price", limit=100)
# Preview supports polars too
df_pl = client.preview("dataset_historical_price", return_type="polars")
Pandas is the default for backwards compatibility; polars is purely opt-in.
The same columns= filter works for both backends.
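Code that should run with or without the `[polars]` extra can check whether polars is importable before asking for it. A minimal sketch; `pick_return_type` is a hypothetical helper, and the two strings are the `return_type` values listed above:

```python
import importlib.util


def pick_return_type(prefer_polars: bool = True) -> str:
    """Return "polars" if the polars package is importable, else "dataframe"."""
    if prefer_polars and importlib.util.find_spec("polars") is not None:
        return "polars"
    return "dataframe"


print(pick_return_type())
```

The result can then be passed along, for example as `client.get_data(..., return_type=pick_return_type())`.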
4. Iterate all pages (Paid mode)
for page_df in client.paginate("dataset_historical_price", limit=100, max_pages=5):
    print(page_df.shape)
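When only a rough number of rows is needed, the loop can stop as soon as enough pages have arrived. The sketch below works on any iterable of objects that support `len()`, which includes the per-page DataFrames from `paginate`; `take_until` is a hypothetical helper, and `fake_pages` is a stand-in generator so the example runs without network access.

```python
from typing import Iterable, Iterator, List, Sized, TypeVar

Page = TypeVar("Page", bound=Sized)


def take_until(pages: Iterable[Page], target_rows: int) -> List[Page]:
    """Collect pages until at least target_rows rows have been gathered."""
    collected: List[Page] = []
    if target_rows <= 0:
        return collected
    seen = 0
    for page in pages:
        collected.append(page)
        seen += len(page)
        if seen >= target_rows:
            break  # stop before requesting another page
    return collected


def fake_pages() -> Iterator[list]:
    """Stand-in for client.paginate(...): ten pages of 100 rows each."""
    for start in range(0, 1000, 100):
        yield list(range(start, start + 100))


pages = take_until(fake_pages(), target_rows=250)
print(len(pages), sum(len(p) for p in pages))  # 3 300
```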
5. Download data to file
# Download all pages of a small dataset (76 pages, ~7.5k rows)
download_result = client.download_data(
    dataset_code="gross_domestic_product_dataset_ds",
    output_path="data.csv",
    file_format="csv",   # "csv" or "json"
    start_page=1,
    end_page=None,       # None = download until the last page
    limit=100,           # max per-request page size (see note above)
    show_progress=True,
)
print(download_result)
# {"output_path": "data.csv", "pages_downloaded": 76, "rows_downloaded": 7551, ...}
# `dataset_historical_price` is large (~3.7M rows / 37k pages at limit=100);
# expect it to take a long time and a lot of network.
# Download only first 3 pages, filtered to specific columns (CSV only)
download_result = client.download_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    output_path="data_page1_3.csv",
    file_format="csv",
    start_page=1,
    end_page=3,
    show_progress=True,
)
`columns` filtering only applies to `file_format="csv"`. JSON output preserves the full raw API response.
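After a CSV download, one possible sanity check is to count the rows in the file and compare the count with `rows_downloaded` from the printed result. A sketch under two assumptions that this README does not confirm: the CSV starts with exactly one header row and is UTF-8 encoded. `count_csv_rows` is a hypothetical helper.

```python
import csv


def count_csv_rows(path: str) -> int:
    """Count data rows in a CSV file, assuming exactly one header row."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # skip the header row if the file is not empty
        return sum(1 for _ in reader)
```

With a real download this could be used as `count_csv_rows(download_result["output_path"]) == download_result["rows_downloaded"]`.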
Method Summary
| Method | Description | Requires API key |
|---|---|---|
| `preview(dataset_code, columns, return_type)` | Preview a dataset (pandas or polars) | No |
| `preview_raw(dataset_code)` | Preview a dataset, raw dict response | No |
| `get_data(dataset_code, ...)` | Fetch data, returns `{"data", "info"}` by default | Yes |
| `get_dataframe(dataset_code, ...)` | Fetch data, returns pandas DataFrame directly | Yes |
| `get_polars(dataset_code, ...)` | Fetch data, returns polars DataFrame directly (needs `[polars]` extra) | Yes |
| `get_data_info(dataset_code, ...)` | Get dataset metadata summary | Yes |
| `paginate(dataset_code, ...)` | Generator yielding one pandas DataFrame per page | Yes |
| `download_data(dataset_code, output_path, ...)` | Download data to CSV/JSON file | Yes |
| `set_api_key(api_key)` | Set / replace the API key on an existing client | N/A |
| `is_authenticated()` | Returns `True` if an API key is configured | N/A |
Error Handling
| Error | Cause | Solution |
|---|---|---|
| `AuthenticationError` | Missing or invalid API key (HTTP 401 or `httpCode:401` in body) | Pass `api_key=` or set `X_API_KEY` in `.env` |
| `PermissionDeniedError` | No access to dataset (HTTP 403) | Check your subscription plan |
| `APIRequestError` | Server error, invalid request, or unknown dataset | Check `dataset_code` and `conditions` |
| `ValueError` | Bad argument (e.g. unknown column, `page < 1`, bad `file_format`) | Check the error message |
All exceptions inherit from DatacoreError, so you can catch them generically:
from datacore import Datacore, DatacoreError

client = Datacore()  # picks up X_API_KEY from .env / environment if set
try:
    client.get_data("dataset_historical_price")
except DatacoreError as e:
    print(f"Datacore call failed: {e}")
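To react differently to each failure, the more specific exceptions can be caught before the generic base class. The sketch below defines stand-in classes with the names from the table so that it runs without the package installed; with the real client these would be imported from `datacore` (assuming they are exported at the top level, as `DatacoreError` is). `describe_failure` is a hypothetical helper.

```python
from typing import Callable


# Stand-ins mirroring the documented hierarchy; not the library's own classes.
class DatacoreError(Exception):
    pass


class AuthenticationError(DatacoreError):
    pass


class PermissionDeniedError(DatacoreError):
    pass


class APIRequestError(DatacoreError):
    pass


def describe_failure(call: Callable[[], object]) -> str:
    """Run call() and map each documented failure to a short hint."""
    try:
        call()
    except AuthenticationError:
        return "check api_key / X_API_KEY"
    except PermissionDeniedError:
        return "check your subscription plan"
    except DatacoreError as exc:  # APIRequestError and any other subclass
        return f"request failed: {exc}"
    return "ok"
```

The order matters: placing `except DatacoreError` first would swallow the specific cases, because they all inherit from it.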
License
MIT — see LICENSE.