A namespace package for data files

Project description

What's a Python Namespace Package, and What's It For?

DataRepos - A Namespace Package for Data Files

This package provides a namespace intended for data files.

It lets you read data files from a common package, but with the actual data distributed over multiple packages.

For example, if you have several very large data files (hundreds of megabytes or more), you can split them across different packages and still access them all through the same namespace in your Python code.
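The mechanism behind this can be illustrated with a small, self-contained sketch. It does not use DataRepos itself: demo_repos, dist_a, and dist_b are hypothetical names made up for this example. Two directories on sys.path each contain a demo_repos folder without an __init__.py, so Python treats them as portions of one namespace package:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two separate "distributions", each in its own directory on sys.path.
# demo_repos, dist_a, and dist_b are hypothetical names for this sketch.
root = Path(tempfile.mkdtemp())
for dist, module in [("dist_a", "iris"), ("dist_b", "cars")]:
    package_dir = root / dist / "demo_repos"
    package_dir.mkdir(parents=True)
    # No __init__.py is created: that makes demo_repos a namespace package.
    (package_dir / f"{module}.py").write_text(f"ORIGIN = {dist!r}\n")
    sys.path.append(str(root / dist))

importlib.invalidate_caches()

import demo_repos
from demo_repos import cars, iris  # found in two different directories

print(iris.ORIGIN, cars.ORIGIN)
print(list(demo_repos.__path__))  # one entry per contributing directory
```

Both modules are imported from demo_repos even though they live in different directories, which is the property that lets large data files ship in separate packages.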

To use DataRepos, you should install it into your virtual environment:

(venv) $ python -m pip install data-repos

You can now import DataRepos as data_repos.

Read Data Files

DataRepos provides a read module whose data() function reads a data file by name. The data is returned as a pandas DataFrame:

>>> from data_repos import read
>>> read.data("countries")
              country   population
0             Austria      8840521
1              Canada     37057765
2                Cuba     11338138
3  Dominican Republic     10627165
4             Germany     82905782
5              Norway      5311916
...

Install Data Files

You can install additional data files from PyPI with pip. Cooperating DataRepos packages plug into the same namespace once they're installed:

(venv) $ python -m pip install data-repos-cars

You can then read the cars dataset with the exact same syntax:

>>> from data_repos import read
>>> read.data("cars")
...

Because DataRepos is a namespace package, it can be extended on the fly.
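What "on the fly" means for a namespace package can be sketched as follows. This is an illustration with made-up names (demo_ext, base, extra), not DataRepos code, and the install() helper only simulates what pip does by creating a directory and putting it on sys.path. The point is that a portion added after the namespace has already been imported is still picked up:

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())


def install(dist, module):
    """Simulate installing a distribution that contributes to demo_ext."""
    package_dir = root / dist / "demo_ext"
    package_dir.mkdir(parents=True)
    (package_dir / f"{module}.py").write_text(f"NAME = {module!r}\n")
    sys.path.append(str(root / dist))
    importlib.invalidate_caches()


install("base", "countries")
import demo_ext.countries  # the namespace is imported here

install("extra", "cars")  # a new portion appears afterwards
import demo_ext.cars  # still found, because __path__ is recomputed

portions = list(demo_ext.__path__)
print(demo_ext.cars.NAME, len(portions))
```

A regular package with an __init__.py has a fixed __path__, so a second directory with the same package name would not be merged in this way.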

Available Data Files

Two datasets are included as examples:

  • iris: The classical Iris dataset, originally published by Ronald Fisher in 1936
  • countries: Countries and their population, collected by Samayo

You can read these files with read.data("iris") and read.data("countries"), respectively.

Add Your Own Data Files

You can also add your own data files by storing them in a folder named data_repos whose parent directory is on Python's import path (sys.path).

See examples of how to do this and learn more about namespace packages in What's a Python Namespace Package, and What's It For?
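As a rough sketch of why a local folder is enough, the example below builds such a folder and looks a file up by name across every portion of the namespace. Everything here is an assumption made for illustration: demo_data stands in for data_repos, find_data_file() is a hypothetical helper, and the lookup strategy is not taken from the DataRepos source. To stay dependency-free it parses the file with the csv module, whereas DataRepos returns a pandas DataFrame.

```python
import csv
import importlib
import sys
import tempfile
from pathlib import Path

NAMESPACE = "demo_data"  # hypothetical stand-in for data_repos


def find_data_file(name):
    """Return the first file named <name>.* in any portion of the namespace."""
    namespace = importlib.import_module(NAMESPACE)
    for portion in namespace.__path__:
        for candidate in sorted(Path(portion).glob(f"{name}.*")):
            return candidate
    raise FileNotFoundError(f"{name} not found in {NAMESPACE}")


# Create a local folder named after the namespace and put one data file in it.
local_root = Path(tempfile.mkdtemp())
(local_root / NAMESPACE).mkdir()
(local_root / NAMESPACE / "my_countries.csv").write_text(
    "country,population\nNorway,5311916\n"
)

# The parent directory, not the namespace folder itself, goes on sys.path.
sys.path.append(str(local_root))
importlib.invalidate_caches()

with find_data_file("my_countries").open(newline="") as file:
    rows = list(csv.DictReader(file))

print(rows)
```

Note that the folder contains no Python files at all; a directory with the right name on the import path is sufficient to become part of a namespace package.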

Download files

Source Distribution

data-repos-1.0.0.tar.gz (11.2 kB)

Uploaded Source

Built Distribution

data_repos-1.0.0-py3-none-any.whl (8.9 kB)

Uploaded Python 3

File details

Details for the file data-repos-1.0.0.tar.gz.

File metadata

  • Download URL: data-repos-1.0.0.tar.gz
  • Upload date:
  • Size: 11.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for data-repos-1.0.0.tar.gz

Algorithm    Hash digest
SHA256       d0dc420b3c8e4cef9b86915cacc1e60df2d2e0ce35564bca4da77f020336121e
MD5          3240c7bf9b64835d3dd372fbf6bc8b49
BLAKE2b-256  d535991c97fa78bec5fff27d6fcb08d058e406e8f1f53fcfd8f8fce096f56fe0

File details

Details for the file data_repos-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: data_repos-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for data_repos-1.0.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       4c5c7dbd6a8f374a0f5a7abd79e8278ab3b1a32c1fdb27af11b8151fd18de2a3
MD5          aa6c7bf910531cd806f67764bb8ecbc6
BLAKE2b-256  31a41bbafbd1a7c8b57983bbbcd16aac914c1c63b069ff3267411bd4d5663eda
