Run AI locally with as little friction as possible.

cria

Cria: use Python to run LLMs with as little friction as possible.

Cria is a library for programmatically running Large Language Models through Python. Cria is built so you need as little configuration as possible — even with more advanced features.

  • Easy: No configuration is required out of the box. Getting started takes just five lines of code.
  • Concise: Write less code to save time and avoid duplication.
  • Local: Free and unobstructed by rate limits. Running LLMs requires no internet connection.
  • Efficient: Use advanced features with your own ollama instance, or a subprocess.

Guide

Quickstart

Running Cria is easy. After installation, you need just five lines of code — no configurations, no manual downloads, no API keys, and no servers to worry about.

import cria

ai = cria.Cria()

prompt = "Who is the CEO of OpenAI?"
for chunk in ai.chat(prompt):
    print(chunk, end="")

>>> The CEO of OpenAI is Sam Altman!

Or, you can run this more configurable example.

import cria

with cria.Model() as ai:
  prompt = "Who is the CEO of OpenAI?"
  response = ai.chat(prompt, stream=False)
  print(response)

>>> The CEO of OpenAI is Sam Altman!

[!WARNING] If no model is configured, Cria automatically installs and runs the default model: llama3.1:8b (4.7GB).

Installation

  1. Cria uses ollama. To install it, run the following.

    Windows

    Download

    Mac

    Download

    Linux

    curl -fsSL https://ollama.com/install.sh | sh
    
  2. Install Cria with pip.

    pip install cria
    

Advanced Usage

Custom Models

To run other LLMs, pass the model name when creating your ai variable.

import cria

ai = cria.Cria("llama2")

prompt = "Who is the CEO of OpenAI?"
for chunk in ai.chat(prompt):
    print(chunk, end="") # The CEO of OpenAI is Sam Altman. He co-founded OpenAI in 2015 with...

You can find available models here.

Streams

Cria streams responses by default, but you can turn streaming off by passing stream=False.

prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman!
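
When streaming is left on, the reply arrives piece by piece, so it can help to keep the pieces if you want the full text afterwards. The sketch below shows one way to do that. Note that fake_chat is a hypothetical stand-in for ai.chat(prompt) so the snippet runs without a model, and its chunk boundaries are made up.

```python
from typing import Iterator


def fake_chat(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for ai.chat(prompt); yields made-up chunks."""
    yield from ["The CEO ", "of OpenAI ", "is Sam Altman!"]


# Print each chunk as it arrives while also keeping the full reply.
chunks = []
for chunk in fake_chat("Who is the CEO of OpenAI?"):
    print(chunk, end="", flush=True)
    chunks.append(chunk)
print()

full_reply = "".join(chunks)
```

With a real model, the same loop should work by swapping fake_chat(...) for ai.chat(prompt).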

Closing

By default, models are closed when you exit the Python program, but closing them manually is a best practice.

ai.close()

You can also use with statements to close models automatically (recommended).
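
In Cria itself, this is the with cria.Model() as ai: form shown in the Quickstart. The sketch below only illustrates what a with statement guarantees in general: close runs even if the body raises. DummyModel is a hypothetical stand-in written for this example, not Cria's implementation.

```python
class DummyModel:
    """Hypothetical stand-in for a model object with a close() method."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "DummyModel":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Always close; returning False lets any exception propagate.
        self.close()
        return False


model = DummyModel()
try:
    with model as ai:
        raise RuntimeError("something went wrong mid-conversation")
except RuntimeError:
    pass

print(model.closed)  # True
```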

Message History

Follow-Up

Message history is automatically saved in Cria, so asking follow-up questions is easy.

prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

prompt = "Tell me more about him."
response = ai.chat(prompt, stream=False)
print(response) # Sam Altman is an American entrepreneur and technologist who serves as the CEO of OpenAI...

Clear Message History

You can reset message history by running the clear method.

prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # Sam Altman is an American entrepreneur and technologist who serves as the CEO of OpenAI...

ai.clear()

prompt = "Tell me more about him."
response = ai.chat(prompt, stream=False)
print(response) # I apologize, but I don't have any information about "him" because the conversation just started...

Passing In Custom Context

You can also create a custom message history, and pass in your own context.

context = "Our AI system employed a hybrid approach combining reinforcement learning and generative adversarial networks (GANs) to optimize the decision-making..."
messages = [
    {"role": "system", "content": "You are a technical documentation writer"},
    {"role": "user", "content": context},
]

prompt = "Write some documentation using the text I gave you."
for chunk in ai.chat(messages=messages, prompt=prompt):
    print(chunk, end="") # AI System Optimization: Hybrid Approach Combining Reinforcement Learning and...

In the example, instructions are given to the LLM as the system. Then, extra context is given as the user. Finally, the prompt is entered (as a user). You can use any mixture of roles to tailor the LLM's behaviour to your liking.

The available roles for messages are:

  • user - Pass prompts as the user.
  • system - Give instructions as the system.
  • assistant - Act as the AI assistant yourself, and give the LLM lines.

The prompt parameter is always appended to messages under the user role. To avoid this, pass in nothing for prompt.
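
To make the resulting message list concrete, here is a minimal sketch of the behaviour described above. build_messages is a hypothetical helper written for illustration; it is not part of Cria, and Cria's internals may differ.

```python
from typing import Optional


def build_messages(messages: list[dict], prompt: Optional[str] = None) -> list[dict]:
    """Hypothetical helper: append the prompt under the user role, if one is given."""
    combined = [dict(m) for m in messages]  # copy so the caller's list is untouched
    if prompt:
        combined.append({"role": "user", "content": prompt})
    return combined


history = [
    {"role": "system", "content": "You are a technical documentation writer"},
    {"role": "user", "content": "Some context..."},
]

with_prompt = build_messages(history, "Write some documentation using the text I gave you.")
without_prompt = build_messages(history)

print([m["role"] for m in with_prompt])     # ['system', 'user', 'user']
print([m["role"] for m in without_prompt])  # ['system', 'user']
```

When no prompt is passed, the last entry in messages is what the model ends up responding to.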

Interrupting

With Message History

If you are streaming messages with Cria, you can interrupt the response midway.

response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):
  if i >= max_token_length:
    ai.stop()
  response += chunk

print(response) # The CEO of OpenAI is

response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.generate(prompt)):
  if i >= max_token_length:
    ai.stop()
  response += chunk

print(response) # The CEO of OpenAI is

In the examples, after the AI generates five tokens (units of text that are usually a couple of characters long), text generation is stopped via the stop method. After stop is called, you can safely break out of the for loop.
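
The stop-then-break pattern can be tried without a model. In the sketch below, FakeStream is a hypothetical stand-in that mimics a streaming chat method and a stop method; the token boundaries are invented for the example.

```python
from typing import Iterator


class FakeStream:
    """Hypothetical stand-in for a streaming model with a stop() method."""

    def __init__(self, tokens: list[str]) -> None:
        self._tokens = tokens
        self.stopped = False

    def chat(self, prompt: str) -> Iterator[str]:
        for token in self._tokens:
            if self.stopped:
                return
            yield token

    def stop(self) -> None:
        self.stopped = True


ai = FakeStream(["The", " CEO", " of", " OpenAI", " is", " Sam", " Altman", "!"])

response = ""
max_token_length = 5

for i, chunk in enumerate(ai.chat("Who is the CEO of OpenAI?")):
    if i >= max_token_length:
        ai.stop()
        break  # leave the loop so the sixth chunk is never appended
    response += chunk

print(response)  # The CEO of OpenAI is
```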

Without Message History

By default, Cria saves responses in message history even if the stream is interrupted. To prevent this behaviour, pass allow_interruption=False when creating the model.

ai = cria.Cria(allow_interruption=False)

response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):

  if i >= max_token_length:
    ai.stop()
    break

  print(chunk, end="") # The CEO of OpenAI is

prompt = "Tell me more about him."
for chunk in ai.chat(prompt):
  print(chunk, end="") # I apologize, but I don't have any information about "him" because the conversation just started...

Multiple Models and Parallel Conversations

Models

If you are running multiple models or parallel conversations, the Model class is also available. This is recommended for most use cases.

import cria

ai = cria.Model()

prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

All methods that apply to the Cria class also apply to Model.

With Model

Multiple models can be run through a with statement. This automatically closes them after use.

import cria

prompt = "Who is the CEO of OpenAI?"

with cria.Model("llama3") as ai:
  response = ai.chat(prompt, stream=False)
  print(response) # OpenAI's CEO is Sam Altman, who also...

with cria.Model("llama2") as ai:
  response = ai.chat(prompt, stream=False)
  print(response) # The CEO of OpenAI is Sam Altman.

Standalone Model

Or, models can be created and closed manually.

import cria


prompt = "Who is the CEO of OpenAI?"

llama3 = cria.Model("llama3")
response = llama3.chat(prompt, stream=False)
print(response) # OpenAI's CEO is Sam Altman, who also...

llama2 = cria.Model("llama2")
response = llama2.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

# Not required, but best practice.
llama3.close()
llama2.close()

Generate

Cria also has a generate method.

prompt = "Who is the CEO of OpenAI?"
for chunk in ai.generate(prompt):
    print(chunk, end="") # The CEO of OpenAI (Open-source Artificial Intelligence) is Sam Altman.

prompt = "Tell me more about him."
response = ai.generate(prompt, stream=False)
print(response) # I apologize, but I think there may have been some confusion earlier. As this...

Running Standalone

When you run cria.Cria(), an ollama instance will start up if one is not already running. When the program exits, this instance will terminate.

However, if you want to save resources by not exiting ollama, either run your own ollama instance in another terminal, or run a managed subprocess.
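
Before choosing between the two options, it can be useful to check whether something is already listening on the ollama port. The sketch below assumes ollama's default address of 127.0.0.1:11434, which will not hold if you have changed it. is_listening is a hypothetical helper, not part of Cria, and it only confirms that some process accepts connections there, not that the process is ollama.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 11434, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections and timeouts.
        return False


if is_listening():
    print("Something is already listening on the ollama port.")
else:
    print("Nothing is listening; start ollama or let Cria start it.")
```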

Running Your Own Ollama Instance

ollama serve

import cria

prompt = "Who is the CEO of OpenAI?"
with cria.Model() as ai:
    response = ai.generate(prompt, stream=False)
    print(response)

Running A Managed Subprocess (Recommended)

# The first time you run the program, ollama starts automatically.
# On later runs, ollama will already be running.

import cria

ai = cria.Cria(standalone=True, close_on_exit=False)
prompt = "Who is the CEO of OpenAI?"

with cria.Model("llama2") as llama2:
    response = llama2.generate(prompt, stream=False)
    print(response)

with cria.Model("llama3") as llama3:
    response = llama3.generate(prompt, stream=False)
    print(response)

quit()
# Despite exiting, ollama will keep running and be used the next time this program starts.

Formatting

To format the output of the LLM, pass in the format keyword.

ai = cria.Cria()

prompt = "Return a JSON array of AI companies."
response = ai.chat(prompt, stream=False, format="json")
print(response) # ["OpenAI", "Anthropic", "Meta", "Google", "Cohere", ...].

The currently supported formats are:

  • JSON
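
A JSON-formatted reply still has to be parsed before you can work with it as data. The sketch below assumes the non-streamed response is a string containing JSON text, as the print(response) example suggests. The response value here is a hypothetical sample, not real model output.

```python
import json

# Hypothetical response text; a real model's reply will differ.
response = '["OpenAI", "Anthropic", "Meta", "Google", "Cohere"]'

try:
    companies = json.loads(response)
except json.JSONDecodeError:
    companies = []  # fall back if the reply was not valid JSON

print(companies[0])  # OpenAI
```

Keeping the try/except is a cautious choice for cases where the reply does not parse.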

Contributing

If you have a feature request, feel free to open an issue!

Contributions are highly appreciated.

License

MIT
