Convert data structure to schema.

derek

Latest release MIT license

Tools for converting data into schema.

(Still very much pre-alpha!)

Implemented in multiple languages.

Language | Index | Coverage | Supported versions | Downloads
Python | pypi | Python code coverage | Python versions | -
JavaScript (Node.js) | npm | JavaScript code coverage | node version | npm downloads
Rust (coming soon!) | - | - | - | -
Nim (coming soon!) | - | - | - | -

  1. Installation
  2. What is Derek?
    1. Document data structures
    2. Extract schemas from APIs
    3. Really lightweight
    4. Extensible
    5. KISS
  3. Documentation
    1. Features
    2. Specification
    3. API

Installation

Python

You can install the Python implementation from PyPI, where it is published as the derek-py package.

Simple example with pip (poetry is recommended):

pip install derek-py

Complete set of supported installation methods:

Package manager | PyPI | git
pip | pip install derek-py | pip install git+https://github.com/benjaminwoods/derek@main
poetry | poetry add derek-py | poetry add git+https://github.com/benjaminwoods/derek#main
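
To confirm that the install worked, you can query the installed distribution by name. This is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and is not part of derek.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "derek-py" is the distribution name used in the install commands above.
version = installed_version("derek-py")
print(version if version is not None else "derek-py is not installed")
```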

JavaScript (Node.js)

You can install this from the npm index. It's available as the derek-ts package.

Simple example with yarn:

yarn add derek-ts

Complete set of supported installation methods:

Package manager | npm | git
npm | npm i derek-ts | npm i git+https://github.com/benjaminwoods/derek#main
yarn | yarn add derek-ts | yarn add git+https://github.com/benjaminwoods/derek#main

What is Derek?

Here's a quick guide showing what you can do with derek. These examples use the Python implementation.

Derek documents data structures.

Load some data into a tree of nodes:

# Import the main class
from derek import Derek

# Suppose that you have some JSON-compatible data
obj = [
  {
    'some': [1.0, 3, "4.5"],
    'data': [3.4, 4.5]
  },
  {
    'some': [2, "4.0", 1.5],
    'data': [1.4]
  }
]

# Feed this data into Derek.tree
root_node = Derek.tree(obj, name='MyDataStructure')

You can use .example() to see a simple example item of data:

>>> root_node.example()
[{'some': [1.0], 'data': [3.4]}]
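
To make the shape of that example concrete, here is a minimal sketch of one reduction that reproduces the output above: keep the first element of every list and every key of every dict. The function first_example is hypothetical and only illustrates the idea; Derek's own implementation may work differently.

```python
def first_example(value):
    """Reduce JSON-compatible data to a single representative item."""
    if isinstance(value, list):
        # Keep only the first element of each list (empty lists stay empty).
        return [first_example(value[0])] if value else []
    if isinstance(value, dict):
        # Keep every key, reducing each value in turn.
        return {key: first_example(item) for key, item in value.items()}
    # Scalars (numbers, strings, booleans, None) are returned unchanged.
    return value


obj = [
    {'some': [1.0, 3, "4.5"], 'data': [3.4, 4.5]},
    {'some': [2, "4.0", 1.5], 'data': [1.4]},
]
print(first_example(obj))  # [{'some': [1.0], 'data': [3.4]}]
```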

You can produce an OAS2/OAS3 JSON schema from this data, too:

import json

j = root_node.parse(format='oas3')
print(json.dumps(j, indent=2))

This prints:

{
  "MyDataStructure": {
    "type": "array",
    "items": {
      "type": "object",
      "additionalProperties": {
        "oneOf": [
          {
            "type": "array",
            "items": {
              "oneOf": [
                {
                  "type": "string"
                },
                {
                  "type": "integer"
                },
                {
                  "type": "number"
                }
              ]
            }
          },
          {
            "type": "array",
            "items": {
              "type": "number"
            }
          }
        ]
      }
    },
    "example": [
      {
        "some": [1.0],
        "data": [3.4]
      }
    ]
  }
}
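
For intuition about how a schema of this shape can be derived, the following is a minimal sketch of type inference over JSON-compatible data. It is not Derek's implementation: the functions infer and merge are hypothetical, None values are not handled, and alternatives inside oneOf are sorted for determinism, so their order can differ from the output above.

```python
import json


def _key(schema):
    """Canonical string form of a schema, used to deduplicate and sort."""
    return json.dumps(schema, sort_keys=True)


def _alternatives(schema):
    """Flatten a schema into its list of oneOf alternatives."""
    if not schema:
        return []
    return schema["oneOf"] if "oneOf" in schema else [schema]


def merge(schemas):
    """Combine several schemas into one, using oneOf where they differ."""
    alts = [alt for schema in schemas for alt in _alternatives(schema)]
    objects = [alt for alt in alts if alt.get("type") == "object"]
    others = [alt for alt in alts if alt.get("type") != "object"]
    if objects:
        # All object alternatives collapse into one object whose values
        # may match any of the value schemas seen so far.
        values = merge([o["additionalProperties"] for o in objects])
        others.append({"type": "object", "additionalProperties": values})
    unique = {_key(alt): alt for alt in others}
    ordered = [unique[k] for k in sorted(unique)]
    if not ordered:
        return {}
    return ordered[0] if len(ordered) == 1 else {"oneOf": ordered}


def infer(value):
    """Infer a schema fragment from a JSON-compatible value."""
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        return {"type": "array", "items": merge([infer(v) for v in value])}
    if isinstance(value, dict):
        values = merge([infer(v) for v in value.values()])
        return {"type": "object", "additionalProperties": values}
    raise TypeError(f"unsupported value: {value!r}")


obj = [
    {'some': [1.0, 3, "4.5"], 'data': [3.4, 4.5]},
    {'some': [2, "4.0", 1.5], 'data': [1.4]},
]
schema = infer(obj)
print(json.dumps({"MyDataStructure": schema}, indent=2))
```

Both objects in the list collapse into a single object schema because their value schemas are identical once the oneOf alternatives are put in a canonical order.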

Install PyYAML (imported as yaml) to dump this structure as an OAS3-compliant YAML schema.

import yaml

print(yaml.dump(j))

This prints:

MyDataStructure:
  example:
    - data:
        - 3.4
      some:
        - 1.0
  items:
    additionalProperties:
      oneOf:
        - items:
            type: number
          type: array
        - items:
            oneOf:
              - type: number
              - type: integer
              - type: string
          type: array
    type: object
  type: array
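
A named schema like this would typically sit under components/schemas in an OpenAPI 3 document. The sketch below wraps the mapping in a minimal document skeleton using only the standard library. The helper build_openapi_document, the title and the version string are hypothetical placeholders; the schema literal is copied from the output above.

```python
import json

# Schema mapping as printed above.
schemas = {
    "MyDataStructure": {
        "type": "array",
        "items": {
            "type": "object",
            "additionalProperties": {
                "oneOf": [
                    {
                        "type": "array",
                        "items": {
                            "oneOf": [
                                {"type": "string"},
                                {"type": "integer"},
                                {"type": "number"},
                            ]
                        },
                    },
                    {"type": "array", "items": {"type": "number"}},
                ]
            },
        },
        "example": [{"some": [1.0], "data": [3.4]}],
    }
}


def build_openapi_document(schemas, title, version):
    """Wrap named schemas in a minimal OpenAPI 3 document skeleton."""
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": {},  # no endpoints are described in this sketch
        "components": {"schemas": schemas},
    }


# Placeholder title and version, chosen only for illustration.
document = build_openapi_document(schemas, "My API", "0.1.0")
print(json.dumps(document, indent=2))
```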

Derek extracts schemas from APIs.

Quickly extract schemas from APIs, by feeding the returned JSON into Derek.

from derek import Derek

from pycoingecko import CoinGeckoAPI
cg = CoinGeckoAPI()

# Get all coins from CoinGecko
root_node = Derek.tree(cg.get_coins_list(), name='GetCoins')

Parse to get your schema:

import json

j = root_node.parse(format='oas3')
print(json.dumps(j, indent=2))

This prints:

{
  "GetCoins": {
    "type": "array",
    "items": {
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "example": [
      {
        "id": "01coin",
        "symbol": "zoc",
        "name": "01coin"
      }
    ]
  }
}
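
Once you have a schema like this, you can check later responses against it. The following is a minimal sketch of such a check, covering only the keywords that appear in the GetCoins schema (type, items, additionalProperties). The function conforms is hypothetical and is not a full OpenAPI or JSON Schema validator; for example, it does not handle oneOf.

```python
_SCALARS = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def conforms(schema, value):
    """Return True if value matches the given schema fragment."""
    kind = schema.get("type")
    if kind == "array":
        items = schema.get("items", {})
        return isinstance(value, list) and all(
            conforms(items, item) for item in value
        )
    if kind == "object":
        values = schema.get("additionalProperties", {})
        return isinstance(value, dict) and all(
            isinstance(key, str) and conforms(values, item)
            for key, item in value.items()
        )
    if kind in _SCALARS:
        if isinstance(value, bool):
            # bool is a subclass of int, so treat it separately.
            return kind == "boolean"
        return isinstance(value, _SCALARS[kind])
    # A fragment without a type accepts anything; unknown types are rejected.
    return kind is None


get_coins = {
    "type": "array",
    "items": {
        "type": "object",
        "additionalProperties": {"type": "string"},
    },
}

good = [{"id": "01coin", "symbol": "zoc", "name": "01coin"}]
bad = [{"id": 1, "symbol": "zoc", "name": "01coin"}]
print(conforms(get_coins, good))  # True
print(conforms(get_coins, bad))   # False
```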

Derek is really lightweight.

No required dependencies. Always.

Derek is extensible.

Use libraries like pywhat and yaml to quickly extend Derek:

import json, yaml

from derek import Derek, Parser

from pywhat import Identifier

class PywhatDerek(Derek):
    @property
    def parser(self):
        return PywhatParser()

    def get_oas3_yaml(self):
        return yaml.dump(
            self.parse(format="oas3")
        )

class PywhatParser(Parser):
    @classmethod
    def oas2(cls, node):
        # Build the base schema with the superclass parser. Because the
        # superclass method still receives this class as cls, its recursive
        # oas2 calls for child nodes are routed back to this method.
        j = super().oas2(node)

        # Only leaf values (neither list nor dict) are passed to pywhat.
        if isinstance(node.value, (list, dict)):
            return j

        # The rest of this function patches in results from a call
        # to the pywhat API.
        result = Identifier().identify(str(node.value))
        if result['Regexes'] is None:
            return j

        matches = result['Regexes']['text']

        # Select the match with the longest matched string
        best = max(matches, key=lambda d: len(d['Matched']))

        return {
            **j,
            'description': best['Regex Pattern']['Name']
        }

This allows for functionality like:

root_node = PywhatDerek.tree(
    {'data': ['17VZNX1SN5NtKa8UQFxwQbFeFc3iqRYhem']},
    name='Addresses'
)
print(root_node.get_oas3_yaml())

which prints:

Addresses:
  additionalProperties:
    items:
      description: "Bitcoin (\u20BF) Wallet Address"
      type: string
    type: array
  example:
    data:
      - 17VZNX1SN5NtKa8UQFxwQbFeFc3iqRYhem
  type: object
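
The extension above relies on class-method dispatch: when a parent class method recurses through cls, an override in a subclass is applied at every level of the tree. The toy sketch below shows that mechanism in isolation. Node, BaseParser and DescribingParser are hypothetical stand-ins, not Derek's classes, and the output format is deliberately not OAS.

```python
class Node:
    """Hypothetical stand-in for a tree node: a value plus child nodes."""

    def __init__(self, value):
        self.value = value
        self.children = (
            [Node(item) for item in value] if isinstance(value, list) else []
        )


class BaseParser:
    @classmethod
    def parse(cls, node):
        if isinstance(node.value, list):
            # Recurse through cls, not BaseParser, so overrides are honoured.
            return {
                "kind": "list",
                "children": [cls.parse(child) for child in node.children],
            }
        return {"kind": type(node.value).__name__}


class DescribingParser(BaseParser):
    @classmethod
    def parse(cls, node):
        # super() keeps cls bound to DescribingParser, so the recursion in
        # BaseParser.parse comes back to this method for every child.
        result = super().parse(node)
        if not isinstance(node.value, list):
            result = {**result, "description": f"leaf {node.value!r}"}
        return result


tree = Node([1, ["a"]])
print(DescribingParser.parse(tree))
```

Every leaf, including the nested one, gains a description, even though the recursion itself lives in BaseParser.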

Derek is straightforward.

Derek is designed for ease of use. If you're using Derek in a workflow and it feels like getting your desired result should be easier, please open an issue.
