
AI utility to extract data from any JSON and reformat it into a new JSON with repeatable queries.

Project description

Jaiqu

Natural language to DSL agent for JSON querying


🔗 Main site  •  🐦 Twitter  •  📢 Discord  •  🖇️ AgentOps


Replicable, AI-generated JSON transformation queries. Transform any JSON into any schema automatically.

Jaiqu is an AI agent for creating repeatable JSON transforms using jq query language syntax. Jaiqu translates any arbitrary JSON inputs into any desired schema.

Building AI agents? Check out AgentOps

Live Demo

Video Overview


Features

  • Translate any schema to any schema: An AI agent automatically maps data from a source schema to a desired format by iteratively prompting GPT-4 to create valid jq query syntax.
  • Schema validation: Given a requirement schema, automatically validates whether the required data is present in the input JSON.
  • Fuzzy term matching: Infers keys based on semantic similarity (e.g. datetime vs. date_time). GPT-4 automatically maps and translates input keys to desired output keys.
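
To make the idea of fuzzy key matching concrete, here is a rough sketch that uses plain string similarity from Python's standard library. It is not how Jaiqu works (Jaiqu prompts GPT-4, as described above); `normalize`, `closest_key`, and the 0.8 cutoff are hypothetical choices made only for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def normalize(key: str) -> str:
    """Lowercase a key and drop separators so 'date_time' and 'datetime' compare equal."""
    return "".join(ch for ch in key.lower() if ch.isalnum())


def closest_key(target: str, candidates: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the candidate most similar to target, or None if nothing clears the cutoff."""
    best_key, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, normalize(target), normalize(candidate)).ratio()
        if score > best_score:
            best_key, best_score = candidate, score
    return best_key if best_score >= cutoff else None


print(closest_key("date_time", ["datetime", "timestamp", "Address"]))  # → datetime
```

String similarity alone would not map a key such as `call.id` to `id`; that kind of semantic match is what the LLM-based approach is meant to cover.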

Example usage:

from jaiqu import validate_schema, translate_schema

# Desired data format 
schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "id": {
            "type": ["string", "null"],
            "description": "A unique identifier for the record."
        },
        "date": {
            "type": "string",
            "description": "A string describing the date."
        },
        "model": {
            "type": "string",
            "description": "A text field representing the model used."
        }
    },
    "required": [
        "id",
        "date"
    ]
}

# Provided data
input_json = {
    "call.id": "123",
    "datetime": "2022-01-01",
    "timestamp": 1640995200,
    "Address": "123 Main St",
    "user": {
        "name": "John Doe",
        "age": 30,
        "contact": "john@email.com"
    }
}

# (Optional) Create hints so the agent knows what to look for in the input
key_hints = "We are processing outputs containing an id, a date, and a model. All the required fields should be present in this input, but the names might be different."

Validating that an input JSON contains all the information required by a schema

schema_properties, valid = validate_schema(input_json, schema, key_hints)

print(schema_properties)

>>> {
      "id": {
          "identified": true,
          "key": "call.id",
          "message": "123",
          "type": [
              "string",
              "null"
          ],
          "description": "A unique identifier for the record.",
          "required": true
      },
      "date": {
          "identified": true,
          "key": "datetime",
          "message": "2022-01-01",
          "type": "string",
          "description": "A string describing the date.",
          "required": true
      }
    }
print(valid)
>>> True
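
Assuming `schema_properties` has the shape printed above (one entry per schema key, with `identified` and `required` flags), a small helper can list which required keys were not found. `missing_required` is a hypothetical helper, not part of Jaiqu, and the sample dict is hand-written to mirror the printed output, with one made-up optional key added.

```python
def missing_required(schema_properties: dict) -> list:
    """Return the names of required keys that were not identified in the input."""
    return [
        name
        for name, info in schema_properties.items()
        if info.get("required") and not info.get("identified")
    ]


# Hand-written sample mirroring the printed output, plus a hypothetical
# optional key ("model") that was not found in the input.
sample = {
    "id": {"identified": True, "key": "call.id", "required": True},
    "date": {"identified": True, "key": "datetime", "required": True},
    "model": {"identified": False, "key": None, "required": False},
}

print(missing_required(sample))  # → []
```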

Creating a repeatable jq query for extracting data from identically formatted input JSONs

jq_query = translate_schema(input_json, schema, key_hints, max_retries=30)
>>>'{"id": .attributes["call.id"], "date": .datetime}'
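
Because the result is a plain jq filter, it can be reapplied to any input with the same layout without calling the model again. As a sketch of what a filter of this shape computes, here is a plain-Python equivalent of the projection. `KEY_MAP` and `apply_mapping` are hypothetical names, the mapping reads `call.id` from the top level as in `input_json`, and the second record is made-up sample data.

```python
# Output key -> input key, mirroring a filter like {"id": .["call.id"], "date": .datetime}.
KEY_MAP = {"id": "call.id", "date": "datetime"}


def apply_mapping(record: dict, key_map: dict) -> dict:
    """Project a record onto the output keys; missing inputs become None (jq's null)."""
    return {out_key: record.get(in_key) for out_key, in_key in key_map.items()}


records = [
    {"call.id": "123", "datetime": "2022-01-01", "timestamp": 1640995200},
    {"call.id": "456", "datetime": "2022-01-02", "timestamp": 1641081600},  # made-up record
]

for record in records:
    print(apply_mapping(record, KEY_MAP))
# → {'id': '123', 'date': '2022-01-01'}
# → {'id': '456', 'date': '2022-01-02'}
```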

CLI Usage

git clone https://github.com/AgentOps-AI/Jaiqu.git
cd Jaiqu/samples/

jaiqu -s schema.json -d data.json
# Validating schema: 100%|███████████████████████████| 3/3 [00:11<00:00,  3.73s/it, Key: model]
# Translating schema: 100%|███████████████████████████| 2/2 [00:02<00:00,  1.46s/it, Key: date]
# Retry attempts:  20%|███████████████████▌                     | 2/10 [00:02<00:11,  1.46s/it]
# Validation attempts:  10%|█████████▎                          | 1/10 [00:00<00:08,  1.02it/s]

jq '{ "id": (if .["call.id"] then .["call.id"] else null end), "date": (if has("datetime") then .datetime else "None" end) }' data.json
# Run command?
# [E]xecute, [A]bort: e
# {
#   "id": "123",
#   "date": "2022-01-01"
# }

Note: usage is currently limited to Python 3.9 and 3.10.

Installation

Recommended: PyPI:

pip install jaiqu

Architecture

Unraveling the Jaiqu agentic workflow pattern

flowchart TD
    A[Start translate_schema] --> B{Validate input schema}
    B -- Valid --> C[For each key, create a jq filter query]
    B -- Invalid --> D[Throw RuntimeError]
    C --> E[Compile and Test jq Filter]
    E -- Success --> F[Validate JSON]
    E -- Fail --> G[Retry Create jq Filter]
    G -- Success --> E
    G -- Fail n times--> H[Throw RuntimeError]
    F -- Success --> I[Return jq query string]
    F -- Fail --> J[Retry Validate JSON]
    J -- Success --> I
    J -- Fail n times --> K[Throw RuntimeError]

Running tests

  1. Install pytest if you don't have it already
pip install pytest
  2. Run the tests/ folder while in the parent directory
pytest tests

This repo also supports tox; simply run python -m tox.

Contributing

Contributions to Jaiqu are welcome! Feel free to create an issue for any bug reports, complaints, or feature suggestions.

License

Jaiqu is released under the MIT License.

