
Crux command line tool - cruxctl

This repository contains the source code for the cruxctl command line tool, which is used to submit jobs, validate YAML files, and manage deadlines for Crux.

The related repositories are cruxctl and crux-odin (a library used by cruxctl).

Installation

You install cruxctl via PyPI and pip in any Python environment you wish: a venv, a pipenv environment, a poetry environment, or even the system level. The installation doesn't vary from any other Python package; just do pip install cruxctl (or pip install cruxctl==<version>) and you're good to go. It does require that you can authenticate with Google Cloud and have the proper permissions to access the Crux API, and that you have a login on the Crux system so you can get a token. Type cruxctl --help for command help and cruxctl <command> --help for subcommand help.
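
For example, a minimal install into a fresh venv might look like this (assuming a POSIX shell; substitute whatever version pin you need):

python -m venv .venv
source .venv/bin/activate
pip install cruxctl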

Usage

Authorize with cruxctl auth; cruxctl auth --help will get you started.
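
For example, to log in and obtain a token (the auth login subcommand is also referenced in the dataset init section below):

cruxctl auth login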

Examples for AI Schedule

Get calculated delivery deadline:

cruxctl ai-schedule get-delivery-deadline -d AQKwpurp8B-G848Qqs7JthWOog -bm 60

Example for AI curation

Onboard data through Crux: run it through the profiler and upload the vendor doc. These steps trigger curation to run in an event-driven fashion. After profiling is done, you can download the Odin YAML and check the Odin file against the curation output using a cruxctl command:

cruxctl dataset update -f [ODIN_YAML_FILE] --profile [ENVIRONMENT] --from-docs

Examples for Deadline Management

See available commands and help:

cruxctl deadlines --help

Get all deadlines:

cruxctl deadlines get-all

Get a specific deadline:

cruxctl deadlines get dataset-id-abc

Insert a deadline:

cruxctl deadlines insert dataset-id-abc 0 23 '3W' '*' '*' '*'

Delete deadlines matching dataset ID:

cruxctl deadlines delete dataset-id-abc

Delete all deadlines:

cruxctl deadlines delete-all dataset-id-abc

Import deadlines from CSV:

cruxctl deadlines import /path/to/file/deadlines.csv

Export deadlines to GCS bucket as CSV file:

cruxctl deadlines export gs://my-bucket/deadlines.csv

Get all notification snoozes:

cruxctl deadlines get-all-notification-snooze

Get a specific notification snooze:

cruxctl deadlines get-notification-snooze dataset-id-abc

Create a notification snooze:

cruxctl deadlines create-notification-snooze dataset-id-abc 72 hours

Delete a notification snooze:

cruxctl deadlines delete-notification-snooze dataset-id-abc

Delete expired notification snooze(s):

cruxctl deadlines delete-expired-notification-snooze

Example for YAML Validation

Validate YAML files, which may point to a parent YAML file. There are two forms: in one you just give the YAML file names; in the other you give a start directory and then the YAML file names. The second form exists because the data engineers normally put the YAML files below a directory named after the company, and they often put a parent YAML file there too that a bunch of child YAML files refer to. Therefore, we allow you to pass this directory as the first argument and the child or parent files as the subsequent arguments. If you modify a child file and there is a parent, the combined parent/child YAML is validated. If you pass a parent file, ALL THE CHILDREN of that parent file are validated.

You can also pass a parent file and a child file with the first form, where you just give YAML paths. In this case, pass the parent and the child YAML file as a single argument separated by a comma. For example,

cruxctl dataset validate a.yaml b.yaml,c.yaml

validates a.yaml by itself and the combined b.yaml/c.yaml. This supposes that b.yaml is the "parent" of c.yaml.
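
With the second form, you'd pass a start directory first (acme is a hypothetical vendor directory here) followed by the same file arguments:

cruxctl dataset validate acme a.yaml b.yaml,c.yaml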

The full usage syntax is:

cruxctl dataset validate [--profile local|dev|staging|prod] [--quiet] file_or_dir yaml_file...

Normally cruxctl dataset validate prints out the progress as it goes. --quiet turns this off.

Example for creating a new YAML file and dataset

When we create a new YAML file, we create a new dataset and data product based on the file name of the YAML output file. This dataset and data product are written to the catalog through our Crux API. The usage of the command is as follows:

cruxctl dataset init [--dataset-name dataset_name] [--data-product-name data_product_name] [--environment local|dev|staging|prod] yaml_output_file

By default the dataset_name and data_product_name are the same as the output file name (minus the .yaml extension), and the environment is prod. When the command runs, it prints out what it is doing, like this:

CRUX_API_TOKEN loaded.
Using org ID "OrEC0NbO"
Checking if data product "sample10" exists
It doesn't. Creating it.
Created data product "sample10" with ID "Prb8CPw0FAkt"
Created dataset "sample10" with ID "Dspmm40k"
Mapped dataset ID "Dspmm40k" to data product ID "Prb8CPw0FAkt"
Created /tmp/sample10.yaml
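
A transcript like the one above would come from an invocation along these lines (relying on the prod default and taking the output path from the last line above):

cruxctl dataset init /tmp/sample10.yaml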

The org ID is looked up via the access token you stored with cruxctl auth login.

If you don't want the dataset_name or data_product_name to match the file name, give the --dataset-name or --data-product-name options. For the data product, you can also give an existing data product ID, and it will use that ID rather than creating a new data product. It always creates a new dataset ID.

To verify your dataset ID was created, go here and use the filter name.EQ.yourname. To see that the data product was created, go here and use the same filter.

Example for deploying an Odin dataset to the control plane

To deploy an Odin dataset YAML file to the control plane, give one or more arguments to the dataset apply command; it can deploy multiple YAML files in one invocation if you give it multiple files to apply. Like the dataset validate command, you can give a directory as the first argument followed by the YAML files to apply, or just give the YAML files (or combined parent/child files separated by commas; see the dataset validate command for the syntax).

Usage:

cruxctl dataset apply [--profile local|dev|staging|prod] [--quiet] file_or_dir yaml_file...

Applying starts the processing runs for the YAML files. Normally the command prints progress as it applies each file; use --quiet to turn this off.
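
For example, applying one standalone file and one combined parent/child pair in a single invocation (file names borrowed from the validate example above):

cruxctl dataset apply a.yaml b.yaml,c.yaml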

Example for deleting a dataset in the control plane

If you'd like to delete one or more existing datasets in the control plane, give the following command:

cruxctl dataset delete [--profile local|dev|staging|prod] [--quiet] dataset_id...
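
For example, deleting the placeholder dataset used in the deadline examples:

cruxctl dataset delete dataset-id-abc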

Example for getting the events from a deployed dataset

To see the events from a deployed dataset, give the following command:

cruxctl dataset events [--watch] [--environment local|dev|staging|prod] dataset_id
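
For example, to watch events for the dataset ID that appears in the sample output below:

cruxctl dataset events --watch DssgxkJB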

This prints out the events for that dataset ID. If you give the --watch option, the command checks for new output every three seconds. The output looks like this:

{'specversion': '1.0', 'type': 'com.crux.cp.dataset.ingest.apply.v1', 'source': '/apilayer', 'subject': '', 'id': 'e0e1936d-e70b-4351-95e4-66fbefbbdf8b', 'time': '2024-09-10T22:39:57.077029Z', 'data': {'id': 0, 'datasetId': 'DssgxkJB', 'orgId': 'test', 'eventId': 'e0e1936d-e70b-4351-95e4-66fbefbbdf8b', 'eventSource': '/apilayer', 'eventType': 'com.crux.cp.dataset.ingest.apply.v1', 'message': 'validation pass', 'statusType': 'Apply'}}
{'specversion': '1.0', 'type': 'com.crux.cp.dataset.ingest.apply.v1', 'source': '/apilayer', 'subject': '', 'id': 'e69b7204-2cff-4702-a65e-885bb7f77d7d', 'time': '2024-09-10T21:31:01.843591Z', 'data': {'id': 0, 'datasetId': 'DssgxkJB', 'orgId': 'test', 'eventId': 'e69b7204-2cff-4702-a65e-885bb7f77d7d', 'eventSource': '/apilayer', 'eventType': 'com.crux.cp.dataset.ingest.apply.v1', 'message': 'validation pass', 'statusType': 'Apply'}}
{'specversion': '1.0', 'type': 'com.crux.cp.dataset.ingest.apply.v1', 'source': '/apilayer', 'subject': '', 'id': 'f41f54e3-b9d2-4638-a766-69662c75fbc4', 'time': '2024-09-10T21:59:00.67005Z', 'data': {'id': 0, 'datasetId': 'DssgxkJB', 'orgId': 'test', 'eventId': 'f41f54e3-b9d2-4638-a766-69662c75fbc4', 'eventSource': '/apilayer', 'eventType': 'com.crux.cp.dataset.ingest.apply.v1', 'message': 'validation pass', 'statusType': 'Apply'}}

Example for retrieving a dataset's PDK logs

If you'd like to retrieve the PDK logs of an existing dataset in the control plane, run the following command:

cruxctl dataset pdk-logs [DATASET_ID] --delivery-id [DELIVERY_ID]

Example for retrieving a dataset's dispatch logs

If you'd like to retrieve the dispatch logs of an existing dataset in the control plane, run the following command:

cruxctl dataset dispatch-logs [DATASET_ID] --export-id [EXPORT_ID]

Thanks to all the contributors!
