A framework for writing Unstract Tools/Apps

Unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

Unstract SDK

The unstract-sdk package helps with developing tools that are meant to run on the Unstract platform. It includes modules for tool development and execution, caching, and making calls to LLMs, vector DBs, embeddings, etc. It also contains helper methods and classes for other tasks such as indexing and auditing LLM calls.

Installation

  • The following libraries must be installed to run the SDK:
    • Linux

      sudo apt install build-essential pkg-config libmagic-dev
      
    • Mac

      brew install pkg-config libmagic pandoc tesseract-ocr
      

Tools

Create a scaffolding for a new tool

Example

unstract-tool-gen --command NEW --tool-name <name of tool> \
 --location ~/path_to_repository/unstract-tools/ --overwrite false

Supported commands:

  • NEW - Create a new tool
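
When scaffolding is driven from a script, the same invocation can be assembled programmatically. The sketch below only builds the argument list shown in the example above; the helper name build_tool_gen_command is hypothetical, and passing "true" to --overwrite is an assumption, since only "false" appears in the example.

```python
import os
import shlex


def build_tool_gen_command(tool_name, location, overwrite=False):
    """Build the argv list for the unstract-tool-gen example shown above.

    Hypothetical helper: it does not run anything, it only assembles arguments.
    """
    return [
        "unstract-tool-gen",
        "--command", "NEW",
        "--tool-name", tool_name,
        # Expand "~" here because a quoted "~" is not expanded by the shell.
        "--location", os.path.expanduser(location),
        # Assumption: "true" is accepted; the README only shows "false".
        "--overwrite", "true" if overwrite else "false",
    ]


cmd = build_tool_gen_command("my_tool", "~/path_to_repository/unstract-tools/")
# Print a shell-safe version of the command for review before running it.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the printed command looks right.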

Environment variables required for all Tools

Variable                  Description
PLATFORM_SERVICE_HOST     The host on which the platform service is running
PLATFORM_SERVICE_PORT     The port on which the platform service is listening
PLATFORM_SERVICE_API_KEY  The API key for the platform
TOOL_DATA_DIR             The directory in the filesystem that holds the contents for tool execution
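
A tool can fail fast when one of these variables is missing. The following is a minimal sketch using only the standard library; the helper name missing_tool_env_vars is hypothetical and is not part of the SDK.

```python
import os

# Variable names taken from the table above.
REQUIRED_TOOL_ENV_VARS = (
    "PLATFORM_SERVICE_HOST",
    "PLATFORM_SERVICE_PORT",
    "PLATFORM_SERVICE_API_KEY",
    "TOOL_DATA_DIR",
)


def missing_tool_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_TOOL_ENV_VARS if not env.get(name)]
```

Calling missing_tool_env_vars() at startup and aborting when the returned list is non-empty gives a clearer error than a failed platform call later on.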

Llama Index support

As of January 14th, 2024, Unstract SDK 0.3.2 uses Llama Index version 0.9.28.

Developing with the SDK

Ensure that you have all the required dependencies and pre-commit hooks installed:

pdm install
pre-commit install

Once the changes have been made, they can be tested with Unstract in the following ways.

With PDM

Specify the SDK as a dependency of a project that uses a tool like pdm by adding the following to your pyproject.toml:

[tool.pdm.dev-dependencies]
local_copies = [
    "-e unstract-adapters @ file:///${UNSTRACT_ADAPTERS_PATH}",
    "-e unstract-sdk @ file:///${UNSTRACT_SDK_PATH}",
]

Alternatively, run the following command:

pdm add -e /path/to/unstract-sdk --dev

With pip

  • If the project is using pip, it might be possible to add the SDK as an editable dependency in requirements.txt:

    -e /path/to/unstract-sdk

NOTE: Building locally might require replacing the [build-system] section in unstract-sdk's build system configuration with the following:

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

  • Another option is to provide a git URL in requirements.txt, which can come in handy while building tool Docker images. Remember to install git within the Dockerfile (for example, with apt install git) for this to work:

    unstract-sdk @ git+https://github.com/Zipstack/unstract-sdk@feature-branch

  • Or set up a local PyPI server, upload your package to it, and install the package from that server.
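
Whichever install route is used, it can help to confirm which version of the SDK ended up in the environment. This is a small sketch based on the standard library's importlib.metadata; the helper name installed_version is hypothetical, and the distribution name unstract-sdk is taken from the requirement lines above.

```python
from importlib import metadata


def installed_version(dist_name="unstract-sdk"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("unstract-sdk is not installed in this environment")
else:
    print(f"unstract-sdk {version} is installed")
```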

Additional dependencies for tools

Tools may need to be backed by file storage. unstract.sdk.file_storage contains the required interfaces for this, which are implemented on top of fsspec, so any file system supported by fsspec can be used. The dependencies for the chosen file system must, however, be added to the tool's own dependencies. For example, if the tool uses MinIO as the underlying file storage, add s3fs; similarly, for Google Cloud Storage, add gcsfs. The following versions of these packages are covered by the SDK's unit tests:

gcsfs==2024.10.0
s3fs==2024.10.0
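
A tool can check at startup that the fsspec backend package for its storage is importable. The sketch below uses only the standard library and does not touch unstract.sdk.file_storage; the mapping and the helper name backend_package_status are illustrative assumptions covering just the two backends mentioned above.

```python
from importlib import util

# Assumed mapping from fsspec protocol to the package that provides it.
# MinIO is S3-compatible, so it is reached through the "s3" protocol.
FSSPEC_BACKEND_PACKAGES = {
    "s3": "s3fs",
    "gcs": "gcsfs",
}


def backend_package_status(protocol):
    """Return (package_name, is_importable) for a known storage protocol."""
    package = FSSPEC_BACKEND_PACKAGES.get(protocol)
    if package is None:
        raise ValueError(f"No backend package recorded for protocol {protocol!r}")
    # find_spec returns None when the top-level package cannot be found.
    return package, util.find_spec(package) is not None


for protocol in FSSPEC_BACKEND_PACKAGES:
    package, available = backend_package_status(protocol)
    state = "available" if available else "missing, add it to the tool's dependencies"
    print(f"{protocol}: {package} is {state}")
```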

Documentation generation

Follow this README.md for generating documentation.
