# langchain-maritaca
An integration package connecting Maritaca AI and LangChain for Brazilian Portuguese language models.
Author: Anderson Henrique da Silva · Location: Minas Gerais, Brasil · GitHub: anderson-ufrj
## Overview
Maritaca AI provides state-of-the-art Brazilian Portuguese language models, including the Sabiá family of models. This integration allows you to use Maritaca's models seamlessly within the LangChain ecosystem.
## Available Models

| Model | Description | Pricing (per 1M tokens) |
|---|---|---|
| sabia-3.1.1 | Most capable model, best for complex tasks | Check Maritaca AI for pricing |
| sabiazinho-3.1 | Fast and economical, great for simple tasks | Check Maritaca AI for pricing |
## Installation

```bash
pip install langchain-maritaca
```
## Setup

Set your Maritaca API key as an environment variable:

```bash
export MARITACA_API_KEY="your-api-key"
```

Or pass it directly to the model:

```python
from langchain_maritaca import ChatMaritaca

model = ChatMaritaca(api_key="your-api-key")
```
## Usage

### Basic Usage

```python
from langchain_maritaca import ChatMaritaca

model = ChatMaritaca(
    model="sabia-3.1",
    temperature=0.7,
)

messages = [
    ("system", "Você é um assistente prestativo especializado em cultura brasileira."),
    ("human", "Quais são as principais festas populares do Brasil?"),
]

response = model.invoke(messages)
print(response.content)
```
### Streaming

```python
from langchain_maritaca import ChatMaritaca

model = ChatMaritaca(model="sabia-3.1", streaming=True)

for chunk in model.stream("Conte uma história sobre o folclore brasileiro"):
    print(chunk.content, end="", flush=True)
```
### Async Usage

```python
import asyncio

from langchain_maritaca import ChatMaritaca

async def main():
    model = ChatMaritaca(model="sabia-3.1")
    response = await model.ainvoke("Qual é a receita de pão de queijo?")
    print(response.content)

asyncio.run(main())
```
### With LangChain Expression Language (LCEL)

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_maritaca import ChatMaritaca

model = ChatMaritaca(model="sabia-3.1")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Você é um especialista em {topic}."),
    ("human", "{question}"),
])

chain = prompt | model
response = chain.invoke({
    "topic": "história do Brasil",
    "question": "Quem foi Tiradentes?",
})
print(response.content)
```
### With Tool Calling (Function Calling)

```python
from langchain_core.tools import tool
from langchain_maritaca import ChatMaritaca

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"O clima em {city} está ensolarado, 25°C"

model = ChatMaritaca(model="sabia-3.1")
model_with_tools = model.bind_tools([get_weather])

response = model_with_tools.invoke("Como está o tempo em São Paulo?")
print(response)
```
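The model only decides *which* tool to call; executing it is up to your code. A minimal dispatch sketch, assuming the response exposes tool calls in LangChain's standard `tool_calls` shape (a list of dicts with `name`, `args`, and `id`) — the payload and the toy `get_weather` below are illustrative stand-ins, not real API output:

```python
# Hypothetical tool-call payload, shaped like LangChain's AIMessage.tool_calls
tool_calls = [{"name": "get_weather", "args": {"city": "São Paulo"}, "id": "call_1"}]

def get_weather(city: str) -> str:
    """Toy implementation standing in for the @tool-decorated function."""
    return f"O clima em {city} está ensolarado, 25°C"

# Map tool names to callables, then execute each requested call
registry = {"get_weather": get_weather}

results = []
for call in tool_calls:
    fn = registry[call["name"]]
    results.append({"tool_call_id": call["id"], "output": fn(**call["args"])})

print(results[0]["output"])
```

In a real agent loop, each result would be sent back to the model as a tool message so it can compose a final answer.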
### With Caching

```python
from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache
from langchain_maritaca import ChatMaritaca

# Enable caching globally
set_llm_cache(InMemoryCache())

model = ChatMaritaca(model="sabia-3.1")

# First call - hits the API
response1 = model.invoke("Qual é a capital do Brasil?")

# Second call - uses cache (instant, no API cost!)
response2 = model.invoke("Qual é a capital do Brasil?")
```
### With Callbacks for Observability

```python
from langchain_maritaca import ChatMaritaca, CostTrackingCallback, LatencyTrackingCallback

# Create callbacks for monitoring
cost_cb = CostTrackingCallback()
latency_cb = LatencyTrackingCallback()

model = ChatMaritaca(callbacks=[cost_cb, latency_cb])

# Make some calls
model.invoke("Hello!")
model.invoke("How are you?")

# Check metrics
print(f"Total cost: ${cost_cb.total_cost:.6f}")
print(f"Total tokens: {cost_cb.total_tokens}")
print(f"Average latency: {latency_cb.average_latency:.2f}s")
print(f"P95 latency: {latency_cb.p95_latency:.2f}s")
```
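For reference, `p95_latency` is the 95th percentile of recorded per-call latencies: the value below which 95% of calls complete, which surfaces tail latency that an average hides. A self-contained sketch of that computation using the nearest-rank method (the `samples` values are made up; the real callback records them per request and its exact percentile method may differ):

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# Made-up latencies in seconds for 20 calls; note the single slow outlier
samples = [0.8, 1.2, 0.9, 1.1, 0.7, 1.0, 0.95, 1.3, 0.85, 1.05,
           0.9, 1.15, 0.75, 1.0, 0.88, 1.2, 0.92, 1.1, 0.8, 3.0]

print(f"Average: {sum(samples) / len(samples):.2f}s")
print(f"P95: {percentile(samples, 95):.2f}s")
```

With one 3.0 s outlier in 20 calls, the average stays near 1.1 s while the P95 jumps to 1.3 s, which is why both metrics are worth tracking.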
### Token Counting & Cost Estimation

```python
from langchain_core.messages import HumanMessage
from langchain_maritaca import ChatMaritaca

model = ChatMaritaca(model="sabia-3.1")

# Count tokens in text
tokens = model.get_num_tokens("Olá, como você está?")
print(f"Tokens: {tokens}")

# Estimate cost before making a request
messages = [HumanMessage(content="Tell me about Brazil")]
estimate = model.estimate_cost(messages, max_output_tokens=1000)
print(f"Estimated cost: ${estimate['total_cost']:.6f}")
```

Tip: install with `pip install "langchain-maritaca[tokenizer]"` for accurate token counting using tiktoken.
## Why Maritaca AI?
Maritaca AI models are specifically trained for Brazilian Portuguese, offering:
- Native Portuguese Understanding: Better comprehension of Brazilian idioms, expressions, and cultural context
- Local Data Training: Trained on diverse Brazilian Portuguese data sources
- Cost-Effective: Competitive pricing for Portuguese language tasks
- Low Latency: Servers located in Brazil for faster response times
## Used in Production
Cidadão.AI - Brazilian government transparency platform powered by AI agents, handling 331K+ requests/month.
- Frontend: github.com/anderson-ufrj/cidadao.ai-frontend
- Backend: github.com/anderson-ufrj/cidadao.ai-backend
Using this package in production? Open an issue to get featured!
## API Reference

### ChatMaritaca

Main class for interacting with Maritaca AI models.

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | "sabia-3.1" | Model name to use |
| temperature | float | 0.7 | Sampling temperature (0.0-2.0) |
| max_tokens | int | None | Maximum tokens to generate |
| top_p | float | 0.9 | Top-p sampling parameter |
| api_key | str | None | Maritaca API key (or use env var) |
| base_url | str | "https://chat.maritaca.ai/api" | API base URL |
| timeout | float | 60.0 | Request timeout in seconds |
| max_retries | int | 2 | Maximum retry attempts |
| retry_if_rate_limited | bool | True | Auto-retry on rate limit (HTTP 429) |
| retry_delay | float | 1.0 | Initial delay between retries (seconds) |
| retry_max_delay | float | 60.0 | Maximum delay between retries (seconds) |
| retry_multiplier | float | 2.0 | Multiplier for exponential backoff |
| streaming | bool | False | Enable streaming responses |
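The retry parameters combine as standard exponential backoff: the delay before retry *i* is `retry_delay * retry_multiplier**i`, capped at `retry_max_delay`. A sketch of the resulting schedule under the documented defaults (the actual client may also add jitter or other refinements):

```python
def backoff_schedule(max_retries: int = 2, retry_delay: float = 1.0,
                     retry_multiplier: float = 2.0,
                     retry_max_delay: float = 60.0) -> list[float]:
    """Delay in seconds before each retry attempt, with an upper cap."""
    return [min(retry_delay * retry_multiplier ** i, retry_max_delay)
            for i in range(max_retries)]

print(backoff_schedule())               # defaults: [1.0, 2.0]
print(backoff_schedule(max_retries=8))  # later attempts are capped at 60.0
```

Raising `max_retries` therefore increases total worst-case wait roughly geometrically until the cap dominates.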
## Development

### Setup

```bash
# Clone the repository
git clone https://github.com/anderson-ufrj/langchain-maritaca.git
cd langchain-maritaca

# Install dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .
ruff format .

# Run type checking
mypy langchain_maritaca
```

### Running Tests

```bash
# Unit tests only
pytest tests/unit_tests/

# Integration tests (requires MARITACA_API_KEY)
pytest tests/integration_tests/

# With coverage
pytest --cov=langchain_maritaca --cov-report=html
```
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## Changelog

See CHANGELOG.md for a list of changes.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Related Projects

- LangChain - Building applications with LLMs through composability
- Maritaca AI - Brazilian Portuguese language models