Skip to main content

No project description provided

Project description

The BMS Lake API

A FastAPI Plugin that allows you to expose your Data Lake as an API, allowing multiple output formats, such as Parquet, Csv, Json, Excel, ...

The lake API also contains a minimal security layer for convenience (Basic Auth), but you can also bring your own.

It constrast to roapi, we intentionally do not want to expose most SQL by default, but we limit possible queries using a config. This makes it easy for you to control what happens on your data. If you want the sql endpoint, you can enable this.

To run the app with default config, just do:

app = FastAPI()
bmsdna.lakeapi.init_lakeapi(app)

To adjust the config, you can do like this:

import dataclasses
import bmsdna.lakeapi

def_cfg = bmsdna.lakeapi.get_default_config() # Get default startup config
cfg = dataclasses.replace(def_cfg, enable_sql_endpoint=True, data_path="tests/data") # Use dataclasses.replace to set the properties you want
sti = bmsdna.lakeapi.init_lakeapi(app, cfg, "config_test.yml") # Enable it. The first parameter is the FastAPI instance, the 2nd one is the basic config and the third one the config of the tables

Installation

PyPI version

Pypi Package bmsdna-lakeapi can be installed like any python package : pip install bmsdna-lakeapi

OpenApi

Of course, everything works with Open API and FastAPI. Meaning you can add other FastAPI routes, you can use the /docs and /redoc endpoint.

Default Security

By Default, Basic Authentication is enabled. To add a user, simply run add_lakeapi_user YOURUSERNAME --yaml-file config.yml. This will add the user to your config yaml (argon2 encrypted). The generated Password is printed. If you do not want this logic, you can overwrite the username_retriver of the Default Config

Standalone Mode

If you just want to run this thing, you can run it with a webserver:

Uvicorn: uvicorn bmsdna.lakeapi.standalone:app --host 0.0.0.0 --port 8080

Gunicorn: gunicorn bmsdna.lakeapi.standalone:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:80

Of course you need to adjust your http options as needed. Also, you need to pip install uvicorn/gunicorn

You can still use environment variables for configuration

Environment Variables

  • CONFIG_PATH: The path of the config file, defaults to config.yml. If you want to split the config, you can specify a folder, too
  • DATA_PATH: The path of the data files, defaults to data. Paths in config.yml are relative to DATA_PATH
  • ENABLE_SQL_ENDPOINT: Set this to 1 to enable the SQL Endpoint

Config File

The application by default relies on a Config file beeing present at the root of your project that's call config.yml.

The config file looks something like this, see also our test yaml:

tables:
  - name: fruits
    tag: test
    version: 1
    api_method:
      - get
      - post
    params:
      - name: cars
        operators:
          - "="
          - in
      - name: fruits
        operators:
          - "="
          - in
    dataframe:
      uri: delta/fruits
      file_type: delta

  - name: fruits_partition
    tag: test
    version: 1
    api_method:
      - get
      - post
    params:
      - name: cars
        operators:
          - "="
          - in
      - name: fruits
        operators:
          - "="
          - in
      - name: pk
        combi:
          - fruits
          - cars
      - name: combi
        combi:
          - fruits
          - cars
    dataframe:
      uri: delta/fruits_partition
      file_type: delta
      select:
        - name: A
        - name: fruits
        - name: B
        - name: cars

  - name: fake_delta
    tag: test
    version: 1
    allow_get_all_pages: true
    api_method:
      - get
      - post
    params:
      - name: name
        operators:
          - "="
      - name: name1
        operators:
          - "="
    dataframe:
      uri: delta/fake
      file_type: delta

  - name: fake_delta_partition
    tag: test
    version: 1
    allow_get_all_pages: true
    api_method:
      - get
      - post
    params:
      - name: name
        operators:
          - "="
      - name: name1
        operators:
          - "="
    dataframe:
      uri: delta/fake
      file_type: delta

  - name: fruits_csv
    tag: test
    version: 1
    allow_get_all_pages: true
    api_method:
      - get
      - post
    params:
      - name: fruits
        operators:
          - "="
      - name: cars
        operators:
          - "="
    dataframe:
      uri: csv/fruits.csv
      file_type: csv

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bmsdna_lakeapi-0.2.3.tar.gz (27.4 kB view details)

Uploaded Source

Built Distribution

bmsdna_lakeapi-0.2.3-py3-none-any.whl (35.1 kB view details)

Uploaded Python 3

File details

Details for the file bmsdna_lakeapi-0.2.3.tar.gz.

File metadata

  • Download URL: bmsdna_lakeapi-0.2.3.tar.gz
  • Upload date:
  • Size: 27.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/4.0.1 CPython/3.11.3

File hashes

Hashes for bmsdna_lakeapi-0.2.3.tar.gz
Algorithm Hash digest
SHA256 f54d9342e9e930938f7a5152769bc1665d7e1e0ab5e02c8f1fdbcb25350ea73d
MD5 2d888737c6d8724216ec62e26400dce2
BLAKE2b-256 f81df42701cc48d33bfbeaf3db0537cbebc82e7432aaef82349b2eddcc975343

See more details on using hashes here.

File details

Details for the file bmsdna_lakeapi-0.2.3-py3-none-any.whl.

File metadata

File hashes

Hashes for bmsdna_lakeapi-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 3f1b6081fd24562b97261df5550d18db5247d5d2e363532652edd668d5fd918d
MD5 9ee8ec0456ef459f2acb7844e0c78ae3
BLAKE2b-256 e3b61553b6d8bedb2708d2202bcd892b818a81fb4259a5fa09c96985e2cf355d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page