
cortecs-py

A lightweight wrapper for cortecs.ai enabling instant provisioning.

⚡ Instant provisioning

Dynamic provisioning allows you to run LLM workflows on dedicated compute. The LLM and its underlying resources are provisioned automatically for the duration of use, providing maximum cost efficiency. Once the workflow is complete, the infrastructure is automatically shut down.

This library starts and stops your resources; the workflow logic itself can be implemented with popular frameworks such as langchain or crewAI:

  1. Load (vast amounts of) data
  2. Start your LLM
  3. Execute your (batch) jobs
  4. Shutdown your LLM
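
With langchain, the whole lifecycle fits in a few lines: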
from cortecs.client import Cortecs
from cortecs.langchain.dedicated_llm import DedicatedLLM

cortecs = Cortecs()

with DedicatedLLM(client=cortecs, model_name='neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8') as llm:
    essay = llm.invoke('Write an essay about dynamic provisioning')
    print(essay.content)
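
The with-block covers steps 2-4: DedicatedLLM provisions the model on entry and shuts it down on exit, so no instance is left running once the workflow completes.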

Getting started

Install

pip install cortecs-py

Summarizing documents

First, set your credentials from cortecs.ai as environment variables:

export CORTECS_CLIENT_ID="<YOUR_ID>"
export CORTECS_CLIENT_SECRET="<YOUR_SECRET>"
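
If you prefer configuring credentials in code, the same variables can be set from Python before the client is created (a minimal sketch; the variable names are the two shown above):

import os

# Equivalent to the shell exports above; set these before instantiating Cortecs().
os.environ["CORTECS_CLIENT_ID"] = "<YOUR_ID>"
os.environ["CORTECS_CLIENT_SECRET"] = "<YOUR_SECRET>"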

This example shows how to use langchain to configure a simple summarization chain. The LLM is provisioned dynamically and the chain is executed in parallel.

from langchain_community.document_loaders import ArxivLoader
from langchain_core.prompts import ChatPromptTemplate

from cortecs.client import Cortecs
from cortecs.langchain.dedicated_llm import DedicatedLLM

cortecs = Cortecs(api_base_url='https://develop.cortecs.ai/api/v1')
loader = ArxivLoader(
    query="reasoning",
    load_max_docs=20,
    get_full_documents=True,
    doc_content_chars_max=25000,  # ~6.25k tokens; make sure the model supports this context length
    load_all_available_meta=False
)

prompt = ChatPromptTemplate.from_template("{text}\n\nExplain it to me like I'm five:")
docs = loader.load()

with DedicatedLLM(client=cortecs, model_name='neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8') as llm:
    chain = prompt | llm

    print("Processing data batch-wise ...")
    summaries = chain.batch([{"text": doc.page_content} for doc in docs])
    for summary in summaries:
        print(summary.content + '-------\n\n\n')

This simple example showcases the power of dynamic provisioning. We summarized X input tokens into Y output tokens in Z minutes. The LLM is fully utilized for those Z minutes, enabling better cost efficiency. Comparing the execution with cloud APIs from OpenAI and Meta shows the cost advantage.
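
To reproduce such numbers yourself, you can time the batch call (a minimal sketch reusing the chain and docs from the example above):

import time

# Wrap the batch call from the summarization example in a timer.
start = time.perf_counter()
summaries = chain.batch([{"text": doc.page_content} for doc in docs])
minutes = (time.perf_counter() - start) / 60
print(f"Summarized {len(docs)} documents in {minutes:.2f} minutes")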

TODO insert bar chart

Use Cases

For more information, see our docs or join our Discord.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cortecs_py-0.0.1.tar.gz (11.3 kB)

Uploaded Source

Built Distribution

cortecs_py-0.0.1-py3-none-any.whl (11.5 kB)

Uploaded Python 3

File details

Details for the file cortecs_py-0.0.1.tar.gz.

File metadata

  • Download URL: cortecs_py-0.0.1.tar.gz
  • Size: 11.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for cortecs_py-0.0.1.tar.gz

  • SHA256: b4080eb3b874ab09423554240298bc276bbec608150e91b2d68da0ed7248c163
  • MD5: 192c424392548bd42d0069644b5e6e24
  • BLAKE2b-256: d28ab63da0318afaf08a563d7043bef2c22118702dac4e1e7019616559c54d27
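
To check a downloaded file against the digest above, the standard library is enough (a minimal sketch):

import hashlib

# Compare the local file's SHA256 with the digest listed above.
with open("cortecs_py-0.0.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == "b4080eb3b874ab09423554240298bc276bbec608150e91b2d68da0ed7248c163"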


File details

Details for the file cortecs_py-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: cortecs_py-0.0.1-py3-none-any.whl
  • Size: 11.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for cortecs_py-0.0.1-py3-none-any.whl

  • SHA256: 8d6b1a4fd20dcefb1b28514772c717ca786d7f9ded1076b77c62376d40993adb
  • MD5: 3f6476c10849f681076bec68d5791b3a
  • BLAKE2b-256: a2c8a108a31f52f31024299632fd5eceb6a82e614d89c0e8f5e0cb43d7a4dab5

